
Lecture Notes in Electrical Engineering 666

Zainah Md Zain · Hamzah Ahmad · Dwi Pebrianti · Mahfuzah Mustafa · Nor Rul Hasma Abdullah · Rosdiyana Samad · Maziyah Mat Noh
Editors

Proceedings of the
11th National
Technical Seminar
on Unmanned System
Technology 2019
NUSYS’19
Lecture Notes in Electrical Engineering

Volume 666

Series Editors
Leopoldo Angrisani, Department of Electrical and Information Technologies Engineering, University of Napoli
Federico II, Naples, Italy
Marco Arteaga, Departament de Control y Robótica, Universidad Nacional Autónoma de México, Coyoacán,
Mexico
Bijaya Ketan Panigrahi, Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, Delhi, India
Samarjit Chakraborty, Fakultät für Elektrotechnik und Informationstechnik, TU München, Munich, Germany
Jiming Chen, Zhejiang University, Hangzhou, Zhejiang, China
Shanben Chen, Materials Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Tan Kay Chen, Department of Electrical and Computer Engineering, National University of Singapore,
Singapore, Singapore
Rüdiger Dillmann, Humanoids and Intelligent Systems Laboratory, Karlsruhe Institute for Technology,
Karlsruhe, Germany
Haibin Duan, Beijing University of Aeronautics and Astronautics, Beijing, China
Gianluigi Ferrari, Università di Parma, Parma, Italy
Manuel Ferre, Centre for Automation and Robotics CAR (UPM-CSIC), Universidad Politécnica de Madrid,
Madrid, Spain
Sandra Hirche, Department of Electrical Engineering and Information Science, Technische Universität
München, Munich, Germany
Faryar Jabbari, Department of Mechanical and Aerospace Engineering, University of California, Irvine, CA,
USA
Limin Jia, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Alaa Khamis, German University in Egypt El Tagamoa El Khames, New Cairo City, Egypt
Torsten Kroeger, Stanford University, Stanford, CA, USA
Qilian Liang, Department of Electrical Engineering, University of Texas at Arlington, Arlington, TX, USA
Ferran Martín, Departament d’Enginyeria Electrònica, Universitat Autònoma de Barcelona, Bellaterra,
Barcelona, Spain
Tan Cher Ming, College of Engineering, Nanyang Technological University, Singapore, Singapore
Wolfgang Minker, Institute of Information Technology, University of Ulm, Ulm, Germany
Pradeep Misra, Department of Electrical Engineering, Wright State University, Dayton, OH, USA
Sebastian Möller, Quality and Usability Laboratory, TU Berlin, Berlin, Germany
Subhas Mukhopadhyay, School of Engineering & Advanced Technology, Massey University,
Palmerston North, Manawatu-Wanganui, New Zealand
Cun-Zheng Ning, Electrical Engineering, Arizona State University, Tempe, AZ, USA
Toyoaki Nishida, Graduate School of Informatics, Kyoto University, Kyoto, Japan
Federica Pascucci, Dipartimento di Ingegneria, Università degli Studi “Roma Tre”, Rome, Italy
Yong Qin, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China
Gan Woon Seng, School of Electrical & Electronic Engineering, Nanyang Technological University,
Singapore, Singapore
Joachim Speidel, Institute of Telecommunications, Universität Stuttgart, Stuttgart, Germany
Germano Veiga, Campus da FEUP, INESC Porto, Porto, Portugal
Haitao Wu, Academy of Opto-electronics, Chinese Academy of Sciences, Beijing, China
Junjie James Zhang, Charlotte, NC, USA
The book series Lecture Notes in Electrical Engineering (LNEE) publishes the latest developments
in Electrical Engineering - quickly, informally and in high quality. While original research
reported in proceedings and monographs has traditionally formed the core of LNEE, we also
encourage authors to submit books devoted to supporting student education and professional
training in the various fields and application areas of electrical engineering. The series covers
classical and emerging topics concerning:
• Communication Engineering, Information Theory and Networks
• Electronics Engineering and Microelectronics
• Signal, Image and Speech Processing
• Wireless and Mobile Communication
• Circuits and Systems
• Energy Systems, Power Electronics and Electrical Machines
• Electro-optical Engineering
• Instrumentation Engineering
• Avionics Engineering
• Control Systems
• Internet-of-Things and Cybersecurity
• Biomedical Devices, MEMS and NEMS

For general information about this book series, comments or suggestions, please contact leontina.
[email protected].
To submit a proposal or request further information, please contact the Publishing Editor in
your country:
China
Jasmine Dou, Associate Editor ([email protected])
India, Japan, Rest of Asia
Swati Meherishi, Executive Editor ([email protected])
Southeast Asia, Australia, New Zealand
Ramesh Nath Premnath, Editor ([email protected])
USA, Canada:
Michael Luby, Senior Editor ([email protected])
All other Countries:
Leontina Di Cecco, Senior Editor ([email protected])

** Indexing: The books of this series are submitted to ISI Proceedings, EI-Compendex,
SCOPUS, MetaPress, Web of Science and Springerlink **

More information about this series at http://www.springer.com/series/7818


Zainah Md Zain · Hamzah Ahmad · Dwi Pebrianti · Mahfuzah Mustafa · Nor Rul Hasma Abdullah · Rosdiyana Samad · Maziyah Mat Noh
Editors

Proceedings of the 11th National Technical Seminar on Unmanned System Technology 2019
NUSYS’19
Editors

Zainah Md Zain, Hamzah Ahmad, Dwi Pebrianti, Mahfuzah Mustafa, Nor Rul Hasma Abdullah, Rosdiyana Samad and Maziyah Mat Noh
Faculty of Electrical & Electronics Engineering, Universiti Malaysia Pahang, Pekan, Pahang, Malaysia
ISSN 1876-1100    ISSN 1876-1119 (electronic)
Lecture Notes in Electrical Engineering
ISBN 978-981-15-5280-9    ISBN 978-981-15-5281-6 (eBook)
https://doi.org/10.1007/978-981-15-5281-6
© Springer Nature Singapore Pte Ltd. 2021
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Preface

The National Technical Seminar on Unmanned System Technology 2019


(NUSYS’19) was organized by the IEEE Oceanic Engineering Society (OES)
Malaysia Chapter and Malaysian Society for Automatic Control Engineers (MACE)
IFAC NMO. NUSYS’19 was held during December 2–3, 2019, at Universiti
Malaysia Pahang, Gambang Campus, Kuantan, Pahang, Malaysia, with a confer-
ence theme “Unmanned System Technology and AI Applications”. The event was
the eleventh in the series of conferences held since 2008.
NUSYS’19 focused on both theory and application, primarily covering the topics of
intelligent unmanned technologies, robotics and autonomous vehicles. We invited
four keynote speakers who dealt with related state-of-the-art technologies including
unmanned aerial vehicles (UAVs), underwater vehicles (UVs), autonomous
vehicles, humanoid robots and intelligent systems, among others. They are
Mr. Kamarulzaman Muhamed (Founder and CEO Aerodyne Group, “CEO of Top
10 hottest start-up company by Nikkei Japan, May 2019”), Assoc. Prof.
Dr. Hanafiah Yussof (Founder, Board of Director and Group Chief Officer of
Robopreneur Sdn. Bhd.), Assoc. Prof. Dr. Hairi Zamsuri (General Manager
eMoovit Technology Sdn. Bhd.) and Mr. Mohd Fairuz Nor Azmi (Project Manager,
Fugro Malaysia Marine Sdn. Bhd. formerly known as Fugro Geodetic Malaysia
Sdn. Bhd.). The objectives of the conference are threefold: to provide a medium for
discussing a wide range of unmanned system technology between universities and
industries, to disseminate the latest technology in the field of unmanned system
technology, and to provide an opportunity for researchers to present their research
papers in the unmanned system technology area.
Despite focusing on a rather specialized area of research concerning unmanned
system technology and electrical and electronics engineering technology,
NUSYS’19 has successfully attracted 87 papers locally from 12 universities and
one internationally from Institute Technology Surabaya, Indonesia. This volume of
proceedings from the conference provides an opportunity for readers to engage with
a selection of refereed papers that were presented during the NUSYS’19 confer-
ence. The book is organized into four parts, which reflect the research topics of the
conference themes:


Part 1: Unmanned System Technology, Underwater Technology and Marine


Part 2: Applied Electronics and Computer Engineering
Part 3: Control, Instrumentations and Artificial Intelligent Systems
Part 4: Sustainable Energy and Power Electronics.
One aim of this book is to stimulate interactions among researchers in the areas
pertinent to intelligent unmanned systems of AUV, UAV and AGV, namely
autonomous control systems and vehicles. Another aim is to share new ideas, new
challenges and the author’s expertise on critical and emerging technologies. The
book covers multifaceted aspects of unmanned system technology.
The editors hope that readers will find this book not only stimulating but also
useful and usable in whatever aspect of unmanned system design in which they may
be involved or interested. The editors would like to express their sincere appreci-
ation to all the contributors for their cooperation in producing this book.
We wish to take the opportunity to thank all individuals and organizations who
have contributed in some way to making NUSYS’19 a success and a memorable
gathering. Also, we wish to extend our gratitude to the members of the IEEE OES
Malaysia Chapter Committee and the Organizing Committee for their tireless effort.
Finally, we thank the publisher, Springer, and most importantly Mr. Karthik Raj Selvaraj,
for his support and encouragement in undertaking this publication.

Editors
Contents

Unmanned System Technology, Underwater Technology and Marine
Tracking Control Design for Underactuated Micro Autonomous
Underwater Vehicle in Horizontal Plane Using Robust Filter
Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Muhammad Azri Bin Abdul Wahed and Mohd Rizal Arshad
Design and Development of Remotely Operated Pipeline
Inspection Robot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Mohd Shahrieel Mohd Aras, Zainah Md Zain, Aliff Farhan Kamaruzaman,
Mohd Zamzuri Ab Rashid, Azhar Ahmad, Hairol Nizam Mohd Shah,
Mohd Zaidi Mohd Tumari, Alias Khamis, Fadilah Ab Azis,
and Fariz Ali@Ibrahim
Vision Optimization for Altitude Control and Object Tracking
Control of an Autonomous Underwater Vehicle (AUV) . . . . . . . . . . . . . 25
Joe Siang Keek, Mohd Shahrieel Mohd Aras, Zainah Md. Zain,
Mohd Bazli Bahar, Ser Lee Loh, and Shin Horng Chong
Development of Autonomous Underwater Vehicle Equipped
with Object Recognition and Tracking System . . . . . . . . . . . . . . . . . . . . 37
Muhammad Haniff Abu Mangshor, Radzi Ambar,
Herdawatie Abdul Kadir, Khalid Isa, Inani Yusra Amran,
Abdul Aziz Abd Kadir, Nurul Syila Ibrahim, Chew Chang Choon,
and Shinichi Sagara
Dual Image Fusion Technique for Underwater Image
Contrast Enhancement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Chern How Chong, Ahmad Shahrizan Abdul Ghani,
and Kamil Zakwan Mohd Azmi


Red and Blue Channels Correction Based on Green Channel
and Median-Based Dual-Intensity Images Fusion for Turbid
Underwater Image Quality Enhancement . . . . . . . . . . . . . . . . . . . . . . 73
Kamil Zakwan Mohd Azmi, Ahmad Shahrizan Abdul Ghani,
and Zulkifli Md Yusof
Analysis of Pruned Neural Networks (MobileNetV2-YOLO v2)
for Underwater Object Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
A. F. Ayob, K. Khairuddin, Y. M. Mustafah, A. R. Salisa, and K. Kadir
Different Cell Decomposition Path Planning Methods for Unmanned
Air Vehicles-A Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
Sanjoy Kumar Debnath, Rosli Omar, Susama Bagchi, Elia Nadira Sabudin,
Mohd Haris Asyraf Shee Kandar, Khan Foysol,
and Tapan Kumar Chakraborty
Improved Potential Field Method for Robot Path Planning
with Path Pruning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
Elia Nadira Sabudin, Rosli Omar, Ariffudin Joret, Asmarashid Ponniran,
Muhammad Suhaimi Sulong, Herdawatie Abdul Kadir,
and Sanjoy Kumar Debnath
Development of DugongBot Underwater Drones Using Open-Source
Robotic Platform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Ahmad Anas Yusof, Mohd Khairi Mohamed Nor,
Mohd Shahrieel Mohd Aras, Hamdan Sulaiman, and Abdul Talib Din
Development of Autonomous Underwater Vehicle for Water Quality
Measurement Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
Inani Yusra Amran, Khalid Isa, Herdawatie Abdul Kadir, Radzi Ambar,
Nurul Syila Ibrahim, Abdul Aziz Abd Kadir,
and Muhammad Haniff Abu Mangshor
Discrete Sliding Mode Controller on Autonomous Underwater Vehicle
in Steering Motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
Nira Mawangi Sarif, Rafidah Ngadengon, Herdawatie Abdul Kadir,
and Mohd Hafiz A. Jalil
Impact of Acoustic Signal on Optical Signal and Vice Versa
in Optoacoustic Based Underwater Localization . . . . . . . . . . . . . . . . . . 177
M. R. Arshad and M. H. A. Majid
Design and Development of Mini Autonomous Surface Vessel
for Bathymetric Survey . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
Muhammad Ammar Mohd Adam, Zulkifli Zainal Abidin,
Ahmad Imran Ibrahim, Ahmad Shahril Mohd Ghani,
and Al Jawharah Anchumukkil

Control, Instrumentation and Artificial Intelligent Systems


Optimal Power Flow Solutions for Power System Operations
Using Moth-Flame Optimization Algorithm . . . . . . . . . . . . . . . . . . . . . . 207
Salman Alabd, Mohd Herwan Sulaiman,
and Muhammad Ikram Mohd Rashid
A Pilot Study on Pipeline Wall Inspection Technology Tomography . . . 221
Muhammad Nuriffat Roslee, Siti Zarina Mohd. Muji,
Jaysuman Pusppanathan, and Mohd. Fadzli Abd. Shaib
Weighted-Sum Extended Bat Algorithm Based PD Controller
Design for Wheeled Mobile Robot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
Nur Aisyah Syafinaz Suarin, Dwi Pebrianti, Nurnajmin Qasrina Ann,
and Luhur Bayuaji
An Analysis of State Covariance of Mobile Robot Navigation
in Unstructured Environment Based on ROS . . . . . . . . . . . . . . . . . . . . . 259
Hamzah Ahmad, Lim Zhi Xian, Nur Aqilah Othman,
Mohd Syakirin Ramli, and Mohd Mawardi Saari
Control Strategy for Differential Drive Wheel Mobile Robot . . . . . . . . . 271
Nor Akmal Alias and Herdawatie Abdul Kadir
Adaptive Observer for DC Motor Fault Detection
Dynamical System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
Janet Lee, Rosmiwati Mohd-Mokhtar,
and Muhammad Nasiruddin Mahyuddin
Water Level Classification for Flood Monitoring System
Using Convolutional Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
J. L. Gan and W. Zailah
Evaluation of Back-Side Slits with Sub-millimeter Resolution
Using a Differential AMR Probe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
M. A. H. P. Zaini, M. M. Saari, N. A. Nadzri, A. M. Halil,
A. J. S. Hanifah, and K. Tsukada
Model-Free Tuning of Laguerre Network for Impedance Matching
in Bilateral Teleoperation System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
Mohd Syakirin Ramli, Hamzah Ahmad, Addie Irawan,
and Nur Liyana Ibrahim
Identification of Liquid Slosh Behavior Using Continuous-Time
Hammerstein Model Based Sine Cosine Algorithm . . . . . . . . . . . . . . . . 345
Julakha Jahan Jui, Mohd Helmi Suid, Zulkifli Musa,
and Mohd Ashraf Ahmad

Cardiotocogram Data Classification Using Random Forest Based
Machine Learning Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
M. M. Imran Molla, Julakha Jahan Jui, Bifta Sama Bari, Mamunur Rashid,
and Md Jahid Hasan
FPGA Implementation of Sensor Data Acquisition for Real-Time
Human Body Motion Measurement System . . . . . . . . . . . . . . . . . . . . . . 371
Zarina Tukiran, Afandi Ahmad, Herdawatie Abd. Kadir,
and Ariffudin Joret
Pulse Modulation (PM) Ground Penetrating Radar (GPR) System
Development by Using Envelope Detector Technique . . . . . . . . . . . . . . . 381
Maryanti Razali, Ariffuddin Joret, M. F. L. Abdullah,
Elfarizanis Baharudin, Asmarashid Ponniran,
Muhammad Suhaimi Sulong, Che Ku Nor Azie Hailma Che Ku Melor,
and Noor Azwan Shairi
An Overview of Modeling and Control of a Through-the-Road
Hybrid Electric Vehicle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
M. F. M. Sabri, M. H. Husin, M. I. Jobli, and A. M. N. A. Kamaruddin
Euler-Lagrange Based Dynamic Model of Double Rotary
Inverted Pendulum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419
Mukhtar Fatihu Hamza, Jamilu Kamilu Adamu,
and Abdulbasid Ismail Isa
Network-Based Cooperative Synchronization Control
of 3 Articulated Robotic Arms for Industry 4.0 Application . . . . . . . . . 435
Kam Wah Chan, Muhammad Nasiruddin Mahyuddin, and Bee Ee Khoo
EEG Signal Denoising Using Hybridizing Method Between Wavelet
Transform with Genetic Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
Zaid Abdi Alkareem Alyasseri, Ahamad Tajudin Khader,
Mohammed Azmi Al-Betar, Ammar Kamal Abasi,
and Sharif Naser Makhadmeh
Neural Network Ammonia-Based Aeration Control for Activated
Sludge Process Wastewater Treatment Plant . . . . . . . . . . . . . . . . . . . . . 471
M. H. Husin, M. F. Rahmat, N. A. Wahab, and M. F. M. Sabri
A Min-conflict Algorithm for Power Scheduling Problem
in a Smart Home Using Battery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 489
Sharif Naser Makhadmeh, Ahamad Tajudin Khader,
Mohammed Azmi Al-Betar, Syibrah Naim,
Zaid Abdi Alkareem Alyasseri, and Ammar Kamal Abasi
An Improved Text Feature Selection for Clustering Using Binary
Grey Wolf Optimizer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503
Ammar Kamal Abasi, Ahamad Tajudin Khader,
Mohammed Azmi Al-Betar, Syibrah Naim, Sharif Naser Makhadmeh,
and Zaid Abdi Alkareem Alyasseri

Applied Electronics and Computer Engineering


Metamaterial Antenna for Biomedical Application . . . . . . . . . . . . . . . . . 519
Mohd Aminudin Jamlos, Nur Amirah Othman, Wan Azani Mustafa,
and Maswani Khairi Marzuki
Refraction Method of Metamaterial for Antenna . . . . . . . . . . . . . . . . . . 529
Maswani Khairi Marzuki, Mohd Aminudin Jamlos, Wan Azani Mustafa,
and Khairul Najmy Abdul Rani
Circular Polarized 5.8 GHz Directional Antenna Design
for Base Station Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535
Mohd Aminudin Jamlos, Nurasma Husna Mohd Sabri,
Wan Azani Mustafa, and Maswani Khairi Marzuki
Medical Image Enhancement and Deblurring . . . . . . . . . . . . . . . . . . . . 543
Reza Amini Gougeh, Tohid Yousefi Rezaii, and Ali Farzamnia
A Fast and Efficient Segmentation of Soil-Transmitted Helminths
Through Various Color Models and k-Means Clustering . . . . . . . . . . . . 555
Norhanis Ayunie Ahmad Khairudin, Aimi Salihah Abdul Nasir,
Lim Chee Chin, Haryati Jaafar, and Zeehaida Mohamed
Machine Learning Calibration for Near Infrared
Spectroscopy Data: A Visual Programming Approach . . . . . . . . . . . . . . 577
Mahmud Iwan Solihin, Zheng Zekui, Chun Kit Ang, Fahri Heltha,
and Mohamed Rizon
Real Time Android-Based Integrated System for Luggage
Check-in Process at the Airport . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 591
Xin Yee Lee and Rosmiwati Mohd-Mokhtar
Antenna Calibration in EMC Semi-anechoic Chamber
Using Standard Antenna Method (SAM) and Standard
Site Method (SSM) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 605
Abdulrahman Ahmed Ghaleb Amer, Syarfa Zahirah Sapuan,
Nur Atikah Zulkefli, Nasimuddin Nasimuddin, Nabiah Binti Zinal,
and Shipun Anuar Hamzah
An Automatic Driver Assistant Based on Intention Detecting
Using EEG Signal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617
Reza Amini Gougeh, Tohid Yousefi Rezaii, and Ali Farzamnia
Hybrid Skull Stripping Method for Brain CT Images . . . . . . . . . . . . . . 629
Fakhrul Razan Rahmad, Wan Nurshazwani Wan Zakaria, Ain Nazari,
Mohd Razali Md Tomari, Nik Farhan Nik Fuad,
and Anis Azwani Muhd Suberi

Improvising Non-uniform Illumination and Low Contrast Images
of Soil Transmitted Helminths Image Using Contrast
Enhancement Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 641
Norhanis Ayunie Ahmad Khairudin, Aimi Salihah Abdul Nasir,
Lim Chee Chin, Haryati Jaafar, and Zeehaida Mohamed
Signal Processing Technique for Pulse Modulation (PM) Ground
Penetrating Radar (GPR) System Based on Phase and Envelope
Detector Technique . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 659
Che Ku Nor Azie Hailma Che Ku Melor, Ariffuddin Joret,
Asmarashid Ponniran, Muhammad Suhaimi Sulong, Rosli Omar,
and Maryanti Razali
Evaluation of Leap Motion Controller Usability in Development
of Hand Gesture Recognition for Hemiplegia Patients . . . . . . . . . . . . . . 671
Wan Norliyana Wan Azlan, Wan Nurshazwani Wan Zakaria,
Nurmiza Othman, Mohd Norzali Haji Mohd,
and Muhammad Nurfirdaus Abd Ghani
Using Convolution Neural Networks Pattern for Classification
of Motor Imagery in BCI System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 683
Sepideh Zolfaghari, Tohid Yousefi Rezaii, Saeed Meshgini,
and Ali Farzamnia
Metasurface with Wide-Angle Reception for Electromagnetic
Energy Harvesting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 693
Abdulrahman A. G. Amer, Syarfa Zahirah Sapuan, Nasimuddin,
and Nabiah Binti Zinal
Integrated Soil Monitoring System for Internet
of Thing (IOT) Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 701
Xin Yi Lau, Chun Heng Soo, Yusmeeraz Yusof, and Suhaila Isaak
Contrast Enhancement Approaches on Medical
Microscopic Images: A Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 715
Nadzirah Nahrawi, Wan Azani Mustafa,
Siti Nurul Aqmariah Mohd Kanafiah, Mohd Aminudin Jamlos,
and Wan Khairunizam
Effect of Different Filtering Techniques on Medical
and Document Image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 727
Wan Azani Mustafa, Syafiq Sam, Mohd Aminudin Jamlos,
and Wan Khairunizam
Implementation of Seat Belt Monitoring and Alert System
for Car Safety . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 737
Zainah Md Zain, Mohd Hairuddin Abu Bakar, Aman Zaki Mamat,
Wan Nor Rafidah Wan Abdullah, Norsuryani Zainal Abidin,
and Haris Faisal Shaharuddin

Electroporation Study: Pulse Electric Field Effect
on Breast Cancer Cell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 751
Nur Adilah Abd Rahman, Muhammad Mahadi Abdul Jamil,
Mohamad Nazib Adon, Chew Chang Choon, and Radzi Ambar
Influence of Electroporation on HT29 Cell Proliferation, Spreading
and Adhesion Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 761
Hassan Buhari Mamman, Muhammad Mahadi Abdul Jamil,
Nur Adilah Abd Rahman, Radzi Ambar, and Chew Chang Choon
Wound Healing and Electrofusion Application via Pulse Electric
Field Exposure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 775
Muhammad Mahadi Abdul Jamil, Mohamad Nazib Adon,
Hassan Buhari Mamman, Nur Adilah Abd Rahman, Radzi Ambar,
and Chew Chang Choon
Color Constancy Analysis Approach for Color Standardization
on Malaria Thick and Thin Blood Smear Images . . . . . . . . . . . . . . . . . 785
Thaqifah Ahmad Aris, Aimi Salihah Abdul Nasir, Haryati Jaafar,
Lim Chee Chin, and Zeehaida Mohamed
Stochastic Analysis of ANN Statistical Features for CT Brain
Posterior Fossa Image Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . 805
Anis Azwani Muhd Suberi, Wan Nurshazwani Wan Zakaria,
Razali Tomari, Ain Nazari, Nik Farhan Nik Fuad,
Fakhrul Razan Rahmad, and Salsabella Mohd Fizol
Improvement of Magnetic Field Induction for MPI Application Using
Maxwell Coils Paired-Sub-coils System Arrangement . . . . . . . . . . . . . . 819
Muhamad Fikri Shahkhirin Birahim, Nurmiza Othman,
Syarfa’ Zahirah Sapuan, Mohd Razali Md Tomari,
Wan Nurshazwani Wan Zakaria, and Chua King Lee
DCT Image Compression Implemented on Raspberry Pi
to Compress Image Captured by CMOS Image Sensor . . . . . . . . . . . . . 831
Ibrahim Saad Mohsin, Muhammad Imran Ahmad, Saad M. Salman,
Mustafa Zuhaer Nayef Al-Dabagh, Mohd Nazrin Md Isa,
and Raja Abdullah Raja Ahmad
A Racial Recognition Method Based on Facial Color and Texture
for Improving Demographic Classification . . . . . . . . . . . . . . . . . . . . . . . 843
Amer A. Sallam, Muhammad Nomani Kabir, Athmar N. M. Shamhan,
Heba K. Nasser, and Jing Wang
Automatic Passengers Counting System Using Images Processing
Based on YCbCr and HSV Colour Spaces Analysis . . . . . . . . . . . . . . . . 853
Muhammad Shahid Che Husin and Aimi Salihah Abdul Nasir

Face Recognition Using PCA Implemented on Raspberry Pi . . . . . . . . . 873


Ibrahim Majid Mohammed, Mustafa Zuhaer Nayef Al-Dabagh,
Muhammad Imran Ahmad, and Mohd Nazrin Md Isa
Comparability of Edge Detection Techniques for Automatic Vehicle
License Plate Detection and Recognition . . . . . . . . . . . . . . . . . . . . . . . . 891
Fatin Norazima Mohamad Ariff, Aimi Salihah Abdul Nasir,
Haryati Jaafar, and Abdul Nasir Zulkifli
Classification of Facial Part Movement Acquired from Kinect V1
and Kinect V2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 911
Sheng Guang Heng, Rosdiyana Samad, Mahfuzah Mustafa,
Zainah Md Zain, Nor Rul Hasma Abdullah, and Dwi Pebrianti
Hurst Exponent Based Brain Behavior Analysis of Stroke Patients
Using EEG Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 925
Wen Yean Choong, Wan Khairunizam, Murugappan Murugappan,
Mohammad Iqbal Omar, Siao Zheng Bong, Ahmad Kadri Junoh,
Zuradzman Mohamad Razlan, A. B. Shahriman,
and Wan Azani Wan Mustafa
Examination Rain and Fog Attenuation for Path Loss Prediction
in Millimeter Wave Range . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 935
Imadeldin Elsayed Elmutasim and Izzeldin I. Mohd
Introduction of Static and Dynamic Features to Facial Nerve
Paralysis Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 947
Wan Syahirah W Samsudin, Rosdiyana Samad, Kenneth Sundaraj,
and Mohd Zaki Ahmad
Offline EEG-Based DC Motor Control for Wheelchair Application . . . . 965
Norizam Sulaiman, Nawfan Mohammed Mohammed Ahmed Al-Fakih,
Mamunur Rashid, Mohd Shawal Jadin, Mahfuzah Mustafa,
and Fahmi Samsuri
Automated Cells Counting for Leukaemia and Malaria Detection
Based on RGB and HSV Colour Spaces Analysis . . . . . . . . . . . . . . . . . 981
Amer Fazryl Din and Aimi Salihah Abdul Nasir
Simulation Studies of the Hybrid Human-Fuzzy Controller
for Path Tracking of an Autonomous Vehicle . . . . . . . . . . . . . . . . . . . . 997
Hafiz Halin, Wan Khairunizam, Hasri Haris, Z. M. Razlan, S. A. Bakar,
I. Zunaidi, and Wan Azani Mustafa
A New Approach in Energy Consumption Based on Genetic
Algorithm and Fuzzy Logic for WSN . . . . . . . . . . . . . . . . . . . . . . . . . . . 1007
Ali Adnan Wahbi Alwafi, Javad Rahebi, and Ali Farzamnia

Sustainable Energy and Power Engineering


Comparison of Buck-Boost Derived Non-isolated DC-DC
Converters in a Photovoltaic System . . . . . . . . . . . . . . . . . . . . . . . . . . . 1023
Jotham Jeremy Lourdes, Chia Ai Ooi, and Jiashen Teh
Fault Localization and Detection in Medium Voltage Distribution
Network Using Adaptive Neuro-Fuzzy Inference System (ANFIS) . . . . . 1039
N. S. B. Jamili, Mohd Rafi Adzman, Wan Syaza Ainaa Wan Salman,
M. H. Idris, and M. Amirruddin
Flashover Voltage Prediction on Polluted Cup-Pin Insulators
Under Polluted Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1053
Ali. A. Salem, R. Abd-Rahman, M. S. Kamarudin, N. A. Othman,
N. A. M. Jamail, N. Hussin, H. A. Hamid, and I. M. Rawi
Effect of Distributed Generation to the Faults in Medium Voltage
Network Using ATP-EMTP Simulation . . . . . . . . . . . . . . . . . . . . . . . . . 1067
Wan Syaza Ainaa Wan Salman, Mohd Rafi Adzman, Muzamir Isa,
N. S. B. Jamili, M. H. Idris, and M. Amirruddin
Optimal Reactive Power Dispatch Solution by Loss Minimisation
Using Dragonfly Optimization Algorithm . . . . . . . . . . . . . . . . . . . . . . . . 1083
Ibrahim Haruna Shanono, Masni Ainina Mahmud,
Nor Rul Hasma Abdullah, Mahfuzah Mustafa, Rosdiyana Samad,
Dwi Pebrianti, and Aisha Muhammad
Analysis of Pedal Power Energy Harvesting for Alternative
Power Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1105
Sheikh-Muhammad Haziq Sah-Azmi and Zuraini Dahari
An Application of Barnacles Mating Optimizer Algorithm
for Combined Economic and Emission Dispatch Solution . . . . . . . . . . . 1115
Mohd Herwan Sulaiman, Zuriani Mustaffa, Mohd Mawardi Saari,
and Amir Izzani Mohamed
Development of Microcontroller Based Portable Solar Irradiance
Meter Using Mini Solar Cell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1125
Lee Woan Jun, Mohd Shawal Jadin, and Norizam Sulaiman
Performance of Graphite and Activated Carbon as Electrical
Grounding Enhancement Material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1139
Mohd Yuhyi Mohd Tadza, Tengku Hafidatul Husna Tengku Anuar,
Fadzil Mat Yahaya, and Rahisham Abd Rahman
Design on Real Time Control for Dual Axis Solar Tracker
for Mobile Robot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1155
Muhammad Hanzolah Shahul Hameed, Mohd Zamri Hasan,
and Junaidah Ali Mohd Jobran

Modified Particle Swarm Optimization for Robust Anti-swing
Gantry Crane Controller Tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1173
Mahmud Iwan Solihin, Wei Hong Lim, Sew Sun Tiang,
and Chun Kit Ang
Feasibility Analysis of a Hybrid System for a Health Clinic
in a Rural Area South-Eastern Iraq . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1193
Zaidoon W. J. AL-Shammari, M. M. Azizan, and A. S. F. Rahman
Optimal Sizing of PV/Wind/Battery Hybrid System for Rural
School in South Iraq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1203
Zaidoon W. J. AL-Shammari, M. M. Azizan, and A. S. F. Rahman
The Use of Gypsum and Waste Gypsum for Electrical
Grounding Backfill . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1213
Amizatulhani Abdullah, Nurmazuria Mazelan,
Mohd Yuhyi Mohd Tadza, and Rahisham Abd Rahman
Energy-Efficient Superframe Scheduling in Industrial Wireless
Networked Control System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1227
Duc Chung Tran, Rosdiazli Ibrahim, Fawnizu Azmadi Hussin,
and Madiah Omar
Design of Two Axis Solar Tracker Based on Optoelectrical Tracking
Using Hybrid FuGA Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1243
Imam Abadi, Erma Hakim Setyawan, and D. R. Pramesrani
Unmanned System Technology,
Underwater Technology and Marine
Tracking Control Design
for Underactuated Micro Autonomous
Underwater Vehicle in Horizontal Plane
Using Robust Filter Approach

Muhammad Azri Bin Abdul Wahed and Mohd Rizal Arshad

Abstract The micro autonomous underwater vehicle (µAUV) designed and developed at
the Underwater, Control and Robotics Group (UCRG) is a torpedo-shaped vehicle
measuring only 0.72 m in length and 0.11 m in diameter, with a mass of approximately
6 kg. This paper proposes a time-invariant tracking control method for the underactuated
micro AUV in the horizontal plane using a robust filter approach to track a predefined
trajectory. A tracking error is introduced which can then be converged by using only the
force in the surge direction and the moment in the yaw direction. A robust control
minimizes the effects of external disturbance and parameter uncertainties on the AUV
performance. With only the rigid-body system inertia matrix information of the micro
AUV, robustness against parameter uncertainties, model nonlinearities, and unexpected
external disturbance is achievable with the proposed controller. The performance of the
proposed robust tracking control is demonstrated in simulation results.


Keywords Underactuated system · Micro autonomous underwater vehicles · Robust control · Trajectory tracking

M. A. B. A. Wahed · M. R. Arshad (✉)
Underwater, Control and Robotics Group, School of Electrical and Electronic Engineering, Universiti Sains Malaysia, 14300 Nibong Tebal, Pulau Pinang, Malaysia
e-mail: [email protected]

1 Introduction

The micro autonomous underwater vehicle [1] developed by the Underwater, Control
and Robotics Group (UCRG) is a torpedo-shaped vehicle designed for use in shallow-water
inspection such as coral reef inspection. It measures 0.72 m in length and 0.11 m in
diameter, and weighs 6 kg in its most basic configuration.
Underwater missions require the µAUV to be very stable so that it can follow a
predefined trajectory with high accuracy. However, this µAUV is underactuated, which
makes it more difficult for the AUV to follow a predefined trajectory. Therefore, a
tracking control system is required to allow the AUV to overcome the limitations of
its propulsion system. Furthermore, the performance of the µAUV is adversely affected
by unpredictable disturbances in the underwater environment.
A precise mathematical representation of an autonomous underwater vehicle
(AUV) is very hard to obtain, and this makes the control problem of the underwater
robot even more challenging. The hydrodynamic parameters arising from the
interaction between the vehicle and the fluid are difficult to obtain with reasonable
accuracy because they vary with the maneuvering conditions.
Therefore, a robust control technique that does not require the complete
mathematical representation is needed to reduce the effects of external
disturbance on the system behavior of the AUV.
Sliding mode control (SMC) has been used by many researchers due to its
robustness and is one of the most powerful robust control techniques. The SMC technique
alters the dynamics of the underwater vehicle by applying a discontinuous control signal. The
control signal guides and maintains the trajectory of the system state error toward a
specified surface called the sliding surface [2].
However, because of the frequent switching, a chattering phenomenon occurs in
the control input of SMC. Chattering has to be avoided because it causes high
thruster wear and degrades the system performance. To avoid chattering, the dynamics
in a small vicinity of the discontinuity surface need to be altered by using a smoothing
function such as a saturation function or a hyperbolic tangent function [3, 4].
Unfortunately, accuracy and robustness are partially lost because convergence is only
ensured to a boundary layer of the sliding surface.
To overcome the chattering effect, second-order SMC controllers have been
proposed [5, 6]. No smoothing function is required by a second-order SMC
controller to produce a continuous control signal, and this allows finite-time
convergence to zero of the first time derivative of the sliding surfaces. However, a
second-order SMC controller takes a longer time for its error to converge to zero.
Another robust control technique used in the underwater environment is time delay
control (TDC), which is a relatively new technique. It assumes that, over a
sufficiently short time interval, a continuous signal remains approximately the same. Therefore, past
observations of uncertainties and disturbances can be used directly in the controller.
Even in the presence of sensor noise and ocean current disturbance, good
performance is achievable with a TDC controller [7, 8].
In general, a TDC controller consists of a time delay estimator and a linear controller.
However, the introduced delay means the TDC controller is unable to eliminate
the estimation error that arises. To avoid critically affecting the stability and
performance of the system, the feedback data acquisition rate has to be fast in order to
shorten the delay time.
In this paper, the position of the AUV is controlled by using a time-invariant tracking
control method with a robust filter approach. First proposed by [9], this approach achieves
robustness against parameter uncertainties, model nonlinearities, and unexpected external
disturbances with only the inertia matrix information. The controller [10,
11] is designed to consist of a nominal controller and a robust compensator.

This paper contains six sections. Section 1 introduces the research background,
Sect. 2 presents the µAUV dynamic model, and Sect. 3 presents the control
objectives. Section 4 presents the design of the proposed robust tracking control,
Sect. 5 discusses the simulation results, and finally Sect. 6 concludes this
paper.

2 Mathematical Modeling of µAUV

Before defining the model, the reference frames need to be defined. An AUV is best
described as a nonlinear system, and thus two reference frames are considered: the
Earth-fixed frame and the body-fixed frame. The standard notation from the Society of Naval
Architects and Marine Engineers (SNAME) is used for easier understanding in this
paper. Figure 1 shows the defined reference frames. The Earth-fixed frame has its x-axis
and y-axis pointing towards North and East respectively, while the z-axis points
downwards, normal to the surface of the Earth. On the other hand, the body-fixed frame has
its origin coinciding with the center of gravity of the AUV.
In this paper, the AUV is assumed to move only at a certain depth and to be
passively stable in the roll direction. Therefore, all corresponding elements are
neglected during derivation of the dynamic equations.
Fig. 1 Defined Earth-fixed frame and Body-fixed frame

The nonlinear equations of motion in the body-fixed frame are expressed in a
vectorial setting as shown in (1)–(6), where v is the vector of linear and angular
velocities expressed in the body-fixed frame, M_RB is the rigid-body system inertia
matrix, M_A is the added-mass system inertia matrix, D_L and D_Q are the linear and
quadratic hydrodynamic damping matrices respectively, L is the lift matrix, and τ is
the vector of body-fixed forces and moments from the actuators. For simplicity, the
lift term is treated as part of the input.

(M_{RB} + M_A)\dot{v} + (D_L + D_Q|v|)v = \tau + L|v|v \qquad (1)

v = [u \;\; v \;\; r]^T \qquad (2)

M_{RB} = \mathrm{diag}[m \;\; m \;\; I_z] \qquad (3)

M_A = \mathrm{diag}[M_{A_u} \;\; M_{A_v} \;\; M_{A_r}] \qquad (4)

D_L = \mathrm{diag}[D_{L_u} \;\; D_{L_v} \;\; D_{L_r}] \qquad (5)

D_Q = \mathrm{diag}[D_{Q_u} \;\; D_{Q_v} \;\; D_{Q_r}] \qquad (6)

The body-fixed linear and angular velocities can be expressed in the Earth-fixed frame
using the Euler angle transformation shown in (7)–(9), where η is the vector of
position and attitude expressed in the Earth-fixed frame and J(ψ) is the Jacobian
matrix.

\dot{\eta} = J(\psi)v \qquad (7)

\eta = [x \;\; y \;\; \psi]^T \qquad (8)

J(\psi) = \begin{bmatrix} \cos\psi & -\sin\psi & 0 \\ \sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (9)
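To make the model concrete, the following is a minimal Python sketch of (1)–(9). Only M_RB follows from the reported mass; the yaw inertia, added-mass and damping values used here are illustrative placeholders (the identified parameters come from [1] and are not reproduced in this chapter), and the lift term is folded into the input as the paper assumes.

```python
import numpy as np

# Rigid-body inertia matrix (3); m is the reported ~6 kg mass, Iz is a placeholder guess.
m, Iz = 6.0, 0.05
M_RB = np.diag([m, m, Iz])

# Added mass (4) and damping matrices (5)-(6): illustrative values only, not those of [1].
M_A = np.diag([2.0, 4.0, 0.03])
D_L = np.diag([3.0, 6.0, 0.05])
D_Q = np.diag([10.0, 20.0, 0.10])

def body_dynamics(v, tau):
    """Equation (1) solved for the body-frame acceleration, with the lift term
    L|v|v treated as part of the input tau (as stated in the paper)."""
    D = D_L + D_Q @ np.diag(np.abs(v))          # linear + quadratic damping
    return np.linalg.solve(M_RB + M_A, tau - D @ v)

def J(psi):
    """Jacobian (9) mapping body-fixed velocities to Earth-fixed rates."""
    c, s = np.cos(psi), np.sin(psi)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def kinematics(eta, v):
    """Equation (7): eta_dot = J(psi) v, with eta = [x, y, psi] and v = [u, v, r]."""
    return J(eta[2]) @ v
```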

3 Control Objectives

Before designing the trajectory tracking controller, we first define the tracking
error as shown in (10), where e is the vector of tracking errors in the Earth-fixed
frame and η_d is the vector of desired position and orientation. Because the AUV is
underactuated in the sway direction, the desired velocities in the x and y directions
have to depend on the desired yaw angle as in (12).

e = \eta_d - \eta \qquad (10)

e = [e_x \;\; e_y \;\; e_\psi]^T \qquad (11)

\psi_d = \tan^{-1}\!\left(\dfrac{\dot{y}_d}{\dot{x}_d}\right) \qquad (12)
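As a quick illustration of (10)–(12), the sketch below forms the Earth-fixed tracking error and the desired yaw from the desired velocity components. Using atan2 and wrapping the heading error to [-π, π] are implementation choices added here, not steps stated in the paper.

```python
import numpy as np

def desired_yaw(xd_dot, yd_dot):
    """Equation (12), written with atan2 so the correct quadrant is retained."""
    return np.arctan2(yd_dot, xd_dot)

def tracking_error(eta_d, eta):
    """Equations (10)-(11): e = eta_d - eta = [e_x, e_y, e_psi]."""
    e = np.asarray(eta_d, dtype=float) - np.asarray(eta, dtype=float)
    e[2] = np.arctan2(np.sin(e[2]), np.cos(e[2]))   # keep heading error in [-pi, pi]
    return e

# Example: with xd_dot = 0.2 and yd_dot = 0.05 (the initial desired velocities of the
# sinusoidal case in Table 2), psi_d = atan(0.25) ~ 0.245 rad.
print(desired_yaw(0.2, 0.05))
```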

The first objective of this research is to design a controller for an underactuated
AUV to track a predefined, time-varying trajectory in the horizontal plane.
Using only the force in the surge direction and the moment in the yaw direction, the
proposed controller should be able to converge to zero the tracking errors of the
underactuated AUV in the x, y and ψ directions.
The second objective of this research is to design a robust filter to compensate
for the effect of the unknown hydrodynamic parameters on the AUV, since the
complete mathematical representation of the AUV is not available.

4 Robust Tracking Control Design

This section presents the design of the proposed tracking control of the underactuated
AUV in the horizontal plane using the robust filter approach. Figure 2 shows the block
diagram of the proposed controller.
There are three steps in designing the proposed controller. Firstly, the tracking error
has to be transformed so that it can be converged by using only the force in the surge
direction and the moment in the yaw direction. The Earth-fixed tracking error vector
defined in (10) is transformed into the introduced error vector in the
body-fixed frame as shown in (13).

\eta_e = [x_e \;\; y_e \;\; \psi_e]^T \qquad (13)

x_e = \cos(\psi)e_x + \sin(\psi)e_y \qquad (14)

y_e = -\sin(\psi)e_x + \cos(\psi)e_y \qquad (15)

\psi_e = e_\psi + a y_e \qquad (16)
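A small sketch of the error transformation (13)–(16); a is the positive constant governing the convergence rate of y_e that later appears in the simulation tables.

```python
import numpy as np

def introduced_error(e, psi, a):
    """Equations (13)-(16): rotate the Earth-fixed error into the body frame and
    couple the sway error y_e into the heading error through the constant a."""
    e_x, e_y, e_psi = e
    x_e   =  np.cos(psi) * e_x + np.sin(psi) * e_y     # (14)
    y_e   = -np.sin(psi) * e_x + np.cos(psi) * e_y     # (15)
    psi_e = e_psi + a * y_e                            # (16)
    return np.array([x_e, y_e, psi_e])                 # eta_e of (13)
```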

The second step is to design a robust filter to compensate for the effect of the added-mass
and hydrodynamic damping forces on the AUV system, as used by [12]. Since the
complete mathematical representation of the AUV is unknown, an artificial
equivalent-disturbance signal q, shown in (17), which represents the effect of the added-mass
and damping forces on the AUV system, is introduced. This equivalent signal is then
compensated by the compensating signal shown in (18), produced by a unity-gain
low-pass filter. F_LP denotes the low-pass filter, with f_s and f_l the two
positive constants related to the undamped natural frequency of the filter.

Fig. 2 Block diagram of the proposed controller

M_{RB}\dot{v} + q = \tau \qquad (17)

u_R = F_{LP}\,q \qquad (18)

q = \tau - M_{RB}\dot{v} \qquad (19)

F_{LP}(s) = \mathrm{diag}\!\left[\dfrac{f_l f_s}{(s+f_l)(s+f_s)} \;\; 0 \;\; \dfrac{f_l f_s}{(s+f_l)(s+f_s)}\right] \qquad (20)
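The filter (20) can be realized in discrete time as a cascade of two unity-gain first-order lags with poles at −f_l and −f_s, applied to the equivalent disturbance q in the surge and yaw channels only (the middle entry of F_LP is zero). The Euler discretization, the sample time and the use of an externally supplied estimate of the acceleration v̇ are assumptions made for this sketch, not details given in the paper.

```python
import numpy as np

class RobustCompensator:
    """Discrete-time sketch of (17)-(20): u_R = F_LP q with q = tau - M_RB * v_dot."""

    def __init__(self, M_RB, fl=8.0, fs=2.0, dt=0.01):
        self.M_RB, self.fl, self.fs, self.dt = M_RB, fl, fs, dt
        self.x1 = np.zeros(3)    # state of the first first-order stage, pole at -fl
        self.x2 = np.zeros(3)    # state of the second stage, pole at -fs (filter output)

    def update(self, tau, v_dot):
        q = tau - self.M_RB @ v_dot                         # equivalent disturbance, (19)
        self.x1 += self.dt * self.fl * (q - self.x1)        # fl/(s+fl) stage (Euler step)
        self.x2 += self.dt * self.fs * (self.x1 - self.x2)  # fs/(s+fs) stage
        u_R = self.x2.copy()
        u_R[1] = 0.0                                        # zero sway entry of F_LP in (20)
        return u_R
```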

The final step is to design a nominal controller to introduce the desired error dynamics
into the AUV system. The nominal control signal, which is similar to a PD controller, is
shown in (21). K_D and K_P are the derivative and proportional gain matrices
respectively. The predefined error dynamics shown in (22) converge the
introduced tracking error to zero for suitable derivative and proportional gains.

u_N = M_{RB}(K_D\dot{\eta}_e + K_P\eta_e) \qquad (21)

\ddot{\eta}_e + K_D\dot{\eta}_e + K_P\eta_e = 0 \qquad (22)

In the proposed controller, the two inputs from the robust compensator and the
nominal controller are combined as shown in (23), where u_R is the robust
compensating signal and u_N is the nominal control signal.

\tau = u_R + u_N \qquad (23)
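The nominal controller (21) and the combined control law (23) then amount to a few lines; note that with the gain matrices used later in (24)–(25), whose middle entries are zero, neither term commands a sway force, consistent with the underactuation.

```python
import numpy as np

def nominal_control(M_RB, K_D, K_P, eta_e, eta_e_dot):
    """Equation (21): PD-like nominal control that imposes the error dynamics (22)."""
    return M_RB @ (K_D @ eta_e_dot + K_P @ eta_e)

def total_control(u_R, u_N):
    """Equation (23): body-fixed surge force and yaw moment applied to the vehicle."""
    return u_R + u_N
```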

5 Simulations

For the simulations, Simulink™ is used to verify the performance of the proposed
controller. The AUV parameters of the model (1), based on the parameters presented
in [1], are used, while the control parameter values are given in (24)–(27).

K_P = \mathrm{diag}[0.2 \;\; 0 \;\; 0.89] \qquad (24)

K_D = \mathrm{diag}[0.2 \;\; 0 \;\; 0.89] \qquad (25)

f_l = 8 \qquad (26)

f_s = 2 \qquad (27)

Simulation 1 is performed to test the performance of the proposed controller on a
straight-line trajectory with constant velocity. The parameter values used are
shown in Table 1 and the results are shown in Figs. 3, 4 and 5.

Table 1 Straight-line trajectory with constant velocity simulation parameters

Desired trajectory: \eta_d = [0.2t \;\; 0.5 \;\; 0]^T
Initial position error in y direction: e(0) = [0 \;\; 0.5 \;\; 0]^T
Initial velocity error in x direction: \dot{e}(0) = [0.2 \;\; 0 \;\; 0]^T
Positive constant related to converging rate of y_e: a = 1
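For completeness, below is a rough, self-contained closed-loop sketch of this first simulation, wiring the model, the robust compensator and the nominal controller together with the gains (24)–(27) and the Table 1 setup. The hydrodynamic coefficients, Euler integration, sample time and finite-difference estimates of v̇ and η̇_e are all assumptions made for illustration, so its output is not expected to reproduce Figs. 3, 4 and 5.

```python
import numpy as np

# "True" plant: only M_RB is assumed known to the controller; the rest are illustrative.
m, Iz = 6.0, 0.05
M_RB = np.diag([m, m, Iz])
M_A  = np.diag([2.0, 4.0, 0.03])
D_L  = np.diag([3.0, 6.0, 0.05])
D_Q  = np.diag([10.0, 20.0, 0.10])

def J(psi):
    c, s = np.cos(psi), np.sin(psi)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def plant_accel(v, tau):                      # equation (1), lift folded into the input
    return np.linalg.solve(M_RB + M_A, tau - (D_L + D_Q @ np.diag(np.abs(v))) @ v)

# Controller settings: gains (24)-(25), filter constants (26)-(27), Table 1 parameters.
K_P = np.diag([0.2, 0.0, 0.89]); K_D = np.diag([0.2, 0.0, 0.89])
fl, fs, a = 8.0, 2.0, 1.0
dt, T = 0.01, 60.0
eta_d = lambda t: np.array([0.2 * t, 0.5, 0.0])   # straight line at constant velocity

eta = np.zeros(3); v = np.zeros(3)            # vehicle starts at the origin, at rest
tau = np.zeros(3); v_prev = v.copy()
x1 = np.zeros(3); x2 = np.zeros(3)            # robust-filter states
eta_e_prev = None

for k in range(int(T / dt)):
    t = k * dt
    # tracking error (10) and introduced body-fixed error (13)-(16)
    e = eta_d(t) - eta
    c, s = np.cos(eta[2]), np.sin(eta[2])
    eta_e = np.array([c * e[0] + s * e[1], -s * e[0] + c * e[1], 0.0])
    eta_e[2] = e[2] + a * eta_e[1]
    eta_e_dot = np.zeros(3) if eta_e_prev is None else (eta_e - eta_e_prev) / dt
    eta_e_prev = eta_e.copy()

    # robust compensator (17)-(20) with a finite-difference estimate of v_dot
    q = tau - M_RB @ ((v - v_prev) / dt)
    x1 += dt * fl * (q - x1)
    x2 += dt * fs * (x1 - x2)
    u_R = np.array([x2[0], 0.0, x2[2]])       # zero sway entry of F_LP

    # nominal control (21) and total control (23); sway stays unactuated
    u_N = M_RB @ (K_D @ eta_e_dot + K_P @ eta_e)
    tau = u_R + u_N
    tau[1] = 0.0

    # propagate the plant one Euler step
    v_prev = v.copy()
    v = v + dt * plant_accel(v, tau)
    eta = eta + dt * (J(eta[2]) @ v)

print("tracking error at t = %.0f s:" % T, eta_d(T) - eta)
```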

Fig. 3 Position response of straight-line trajectory tracking

At a constant velocity, the controller is able to track the straight-line trajectory and
converge the initial error in the y direction to zero within 30 s.
Next, simulation 2 is performed to show the capabilities of the proposed controller on a
sinusoidal desired trajectory against the Model Free High Order Sliding Mode
Control (MFHOSMC) controller designed by [6]. The parameter values used
are shown in Table 2. From Fig. 6, both controllers are able to achieve a path similar to
the desired path. In Fig. 7, the tracking error reaches steady state in 22 s for the proposed
controller, while the MFHOSMC controller requires 25 s. Finally, Fig. 8 shows the
comparison of the controllers reaching steady state in the y direction, with the proposed
controller's tracking error bounded within 2 × 10⁻³ while the MFHOSMC controller's is
bounded within 20 × 10⁻³. The tracking error is larger in the y direction because there is no
actuator in that direction.

Fig. 4 Tracking error in x direction of straight-line trajectory tracking

Fig. 5 Tracking error in y direction of straight-line trajectory tracking



Table 2 Sinusoidal trajectory tracking simulation parameters

Desired trajectory: \eta_d = [0.2t \;\; \sin(0.05t) \;\; 0.25\cos(0.05t)]^T
Initial position error: e(0) = [0 \;\; 0 \;\; 0.25]^T
Initial velocity error: \dot{e}(0) = [0.2 \;\; 0.05 \;\; 0]^T
Positive constant related to converging rate of y_e: a = 4

Fig. 6 Position response of sinusoidal trajectory tracking

Fig. 7 Tracking error in x direction of sinusoidal trajectory tracking



Fig. 8 Tracking error in y direction of sinusoidal trajectory tracking

6 Conclusions

This paper proposed an underwater tracking control method using a robust filter
approach. With the proposed controller, the effects of external influences on the
AUV's system behavior, subject to the constraint of not having a complete
representation of the AUV system, have been minimized. Simulation results show
that the proposed controller is able to track straight-line and sinusoidal trajectories
with excellent performance.

Acknowledgements The authors would like to thank RUI grant (Grant no.: 1001/PELECT/
8014088) and Universiti Sains Malaysia for supporting the research.

References

1. Wahed MA, Arshad MR (2019) Modeling of torpedo-shaped micro autonomous underwater vehicle. Springer, Singapore
2. Shtessel Y, Edwards C, Fridman L, Levant A (2014) Sliding Mode Control and Observation.
Springer, New York
3. Guo J, Chiu FC, Huang CC (2003) Design of a sliding mode fuzzy controller for the guidance
and control of an autonomous underwater vehicle. Ocean Eng 30(16):2137–2155
4. Hoang NQ, Kreuzer E (2008) A robust adaptive sliding mode controller for remotely operated
vehicles. Tech Mech 28(3–4):185–193
5. Deng CN, Ge T (2013) Depth and heading control of a two DOF underwater system using a
model-free high order sliding controller with transient process. In: Proceedings of 2013 5th
International Conference on Measuring Technology and Mechatronics Automation,
ICMTMA 2013, pp 423–426
6. García-Valdovinos LG, Salgado-Jiménez T, Bandala-Sánchez M, Nava-Balanzar L,
Hernández-Alvarado R, Cruz-Ledesma JA (2014) Modelling, design and robust control of
a remotely operated underwater vehicle. Int J Adv Robot Syst 11(1):1–16
7. Prasanth Kumar R, Dasgupta A, Kumar CS (2007) Robust trajectory control of underwater
vehicles using time delay control law. Ocean Eng 34(5–6):842–849
8. Park JY, Cho BH, Lee JK (2009) Trajectory-tracking control of underwater inspection robot
for nuclear reactor internals using Time Delay Control. Nucl Eng Des 239(11):2543–2550
9. Zhong YS (2002) Robust output tracking control of SISO plants with multiple operating
points and with parametric and unstructured uncertainties. Int J Control 75(4):219–241
10. Gilbert S, Varghese E (2017) Design and simulation of robust filter for tracking control of
quadcopter system. In: 2017 International Conference on Circuit, Power and Computing
Technologies, ICCPCT, Kollam, pp 1–7
11. Yu Y, Zhong YS (2008) Robust tracking control for a 3DOF helicopter with multi-operation
points. In: Proceedings 27th Chinese Control Conference, CCC, pp 733–737
12. Song YS, Arshad MR (2016) Tracking control design for autonomous underwater
vehicle using robust filter approach. In: Autonomous Underwater Vehicles 2016, AUV
2016, pp 374–380
Design and Development of Remotely
Operated Pipeline Inspection Robot

Mohd Shahrieel Mohd Aras, Zainah Md Zain,


Aliff Farhan Kamaruzaman, Mohd Zamzuri Ab Rashid,
Azhar Ahmad, Hairol Nizam Mohd Shah, Mohd Zaidi Mohd Tumari,
Alias Khamis, Fadilah Ab Azis, and Fariz Ali@Ibrahim

Abstract A Pipeline Inspection Robot (PIR) is a type of mobile robot operated
remotely or autonomously with little to no human intervention, inspecting various
parts of a pipeline system and even cleaning the inner walls of the pipelines by using
integrated programs. The development and application of PIRs specifically used
for monitoring pipeline systems are still not widely studied and applied, although
Malaysia is a nation that is developing rapidly in the industrial fields. The proposed PIR
can help in monitoring and inspecting pipes with diameters ranging from 215 to 280 mm
that are impossible to reach and hazardous to human life. In addition, the PIR is needed to
make the inspection operation easier and to save work time. This project focuses on the
design and development of a suitable PIR for pipeline system monitoring. The PIR is
designed using the SolidWorks software, and several simulations such as stress and strain
analyses are conducted in the software. The PIR is fabricated using aluminium and uses an
adaptive mechanism structure which allows the robot to adapt to changing pipe diameters.
Moreover, the PIR is controlled by a microcontroller. Experiments are performed to verify
the robot's performance, such as the ability of the robot to adapt in the pipeline. The
results show that the PIR has an average speed of 0.0096 m/s and can move accurately
straight in the pipeline.

Keywords Pipeline Inspection Robot · SolidWorks design · Performance analysis

M. S. Mohd Aras (✉) · A. F. Kamaruzaman · M. Z. Ab Rashid · H. N. Mohd Shah · A. Khamis · F. Ab Azis · F. Ali@Ibrahim
Underwater Technology Research Group (UTeRG), Centre for Robotics and Industrial Automation (CERIA), Fakulti Kejuruteraan Elektrik, Universiti Teknikal Malaysia Melaka, 76100 Durian Tunggal, Melaka, Malaysia
e-mail: [email protected]

Z. Md Zain
Robotics & Unmanned Systems (RUS) Research Group, Faculty of Electrical and Electronics Engineering, Universiti Malaysia Pahang, 26600 Pekan, Pahang, Malaysia

A. Ahmad · M. Z. Mohd Tumari
Fakulti Teknologi Kejuruteraan Elektrik dan Elektronik, Universiti Teknikal Malaysia Melaka, 76100 Durian Tunggal, Melaka, Malaysia


1 Introduction

The Pipeline Inspection Robot is a mobile robot that is equipped with a camera and
used specifically to inspect various parts of pipeline systems. PIRs are used widely
in water supply, petrochemical and other fluid-transportation industries [1–3].
Pipelines, for their part, are crucial equipment for transporting fuel oil and gas,
delivering drinking water and transferring pollutants [4]. Piping networks suffer
many problems such as corrosion, aging, cracks and mechanical abrasion, so
constant inspection, maintenance and repair are needed [5]. Pipeline inspection
robots are utilized to investigate internal disintegration, fractures and defects, which
arise mainly from causes such as corrosion, degradation and overheating [6]. With
decades of enormous development in the robotics field, pipeline robots now come in
numerous designs such as the wheel type, caterpillar type, wall-press type, legged
type, inchworm type and screw type [2]. In this project, a PIR is designed and
developed using the SolidWorks software; the design is intended specifically for a
straight pipeline system and can adapt to various pipeline diameters. The PIR is
programmed with a microcontroller, the Arduino Mega2560. The performance of
the PIR is assessed on its ability to move in pipelines of various diameters and its
ability to inspect the pipelines.
The aim of this project is to design and develop the PIR using the SolidWorks
software, fabricate the robot and analyze its performance. The goal is a PIR that is
not too complex, low cost, able to adapt to various pipelines and multifunctional;
the performance of other, more complex types of robot is also detailed in this
project. Pipelines are generally used for fluid transportation from place to place, and
their usage and application across Malaysian industries are growing massively [7].
Several organizations are very well known in the pipeline industry, namely
Lembaga Air Sarawak, Telekom Malaysia, Petronas and Indah Water. As an
example, Petronas alone is responsible for operating some 2500 km of gas
transmission pipeline in the country [8]. Nowadays, modern housing and town
planning in Malaysia mostly use centralized sewage systems. With these new
sewage systems, all houses' pipelines will be connected to one station in each
district, and an even larger network of pipelines will eventually be constructed.
These pipelines will require constant maintenance, and pipeline repair technology
has become ever more vital [9]. There has been a series of accidents involving
pipelines throughout the years. As claimed by Carl Weimer [10], the executive
director of the Pipeline Safety Trust, 135 excavation accidents involving pipelines
carrying dangerous chemicals such as crude oil and petroleum have occurred over
the last 10 years, which amounts to roughly one incident every month. Apart from
that, on the 31st of July 2014, a series of gas explosions occurred in the Cianjhen
and Lingya districts of Kaohsiung, Taiwan. Earlier that evening there had been
reports of gas leaks and, unfortunately, after the blasts thirty-two people did not
survive and 321 others were wounded [11]. More recently, on the 1st of August this
year, another natural gas pipeline explosion occurred in Midland County, Texas,
and five people were sent to hospital with critical burn injuries; the cause of the
explosion was unknown, officials said [12].

2 Methodology

The whole system has been constructed as shown in the Fig. 1. The control module
consists of the controller that is wired to connect with the Arduino Mega2560. The
inspection module consists of the pan-and-tilt CCD camera that is attached with
servo motor and the computer that is used to get real-time image or video recording
for pipe inspection. Next, the moving part module consists of the motor driver,
12 V DC motor, gear and the wheel’s movement. The whole module is powered by
a power supply that is connected externally from the robot.
Pipeline Inspection Robot is shown in the different planes of view as shown in
Fig. 2(a)–(d). The robot that have been designed can fit a pipe diameter ranging
from 90 to 130 mm. This robot applies the adaptive mechanism in which the spring
tension acts as a passive support which enable the robot to keep intact to the pipe
inner walls. The designed robot has a length of 15 cm and the arms of the robot
have a maximum reach of 130 mm. The most contracted and expanded state of the
robot arm as shown in the Fig. 2(e) and (f), respectively. The body tube of the
designed robot which act as the main body is used to store the electrical compo-
nents. The designed robot uses stainless steel as its main materials that composed
most of its parts. Stainless steel has been chosen mainly due to its ability to
withstand corrosion and oxidation as this robot is going to be used to inspect
pipelines which have various conditions. In addition, the front and the rear of the
robot is attached with a transparent acrylic plastic respectively to protect the
electrical components inside the body tube especially the camera that is used for
inspecting the pipelines.

3 Results and Discussions

The stress and strain analysis results on the certain parts of the robot that have been
done in the SolidWorks software as shown in Fig. 3. All the parts are given the
same amount of force which is 100 N and are given the same type of materials
which is the Annealed Stainless Steel. The Annealed Stainless Steel has a yield
strength of 2.750e8 N/m2. The maximum stress given by the 100 N force to the
Body Tube is 2.656e5 N/m2 which is lower than the yield strength of the material.
Therefore, the body tube is operating within safe limits because the maximum stress
18 M. S. Mohd Aras et al.

Fig. 1 The block diagram of the pipeline inspection robot

is below the amount of the yield strength. As mentioned earlier, all the parts are
given the same amount of force and materials which is 100 N and Annealed
Stainless Steel. The robot part as shown in the Fig. 3 has the yield strength of
2.750e8 N/m2 and the maximum stress given by the 100 N force is 4.325e68 N/m2
which is lower than the yield strength. Therefore, this part of the robot operates
within the safe limit.
Same goes to the two robot parts in the Fig. 3, they are operating within the safe
limits because the maximum stress given is below the yield strength of the parts.
The specifications and the measurements of the fabricated robot is shown in the
Table 1.
The differences between the designed and the fabricated Pipeline Inspection
Robot are mainly on the adaptive mechanism linkage, which connect to the wheels
of the robot. The changes are made because the measurements of the adaptive
Design and Development of Remotely Operated Pipeline … 19

Fig. 2 A view of the designed pipeline inspection robot using SolidWorks software

mechanism parts of the designed robot are too small and thus, it was impossible to
be fabricated. The changes in the measurements led to the increase of the maximum
extended state diameter and the minimum extended diameter of the Pipeline
Inspection Robot. Hence, pipes with bigger diameter are needed to analyze the
performance of the fabricated Pipeline Inspection Robot. On the other hand, the
changes in measurements also led to the increase of the robot’s weight. The robot is
20 M. S. Mohd Aras et al.

Fig. 3 The stress and strain analysis results on the certain parts of the PIR using SolidWorks
software

Table 1 The specifications Items Specifications


and measurements of the
fabricated pipeline inspection Length (mm) 150
robot Weight (kg) 2.2
Maximum adaptive diameter (mm) 280
Minimum adaptive diameter (mm) 215
Diameter without spring attached (mm) 200
Wheels diameter (mm) 30
Average speed 0.0096

quite heavy with the weight of 2.2 kg. The robot’s weight was not expected to be
heavier than we thought after the fabrications and thus the DC motors that are used
to move the robot did not have enough power to move the robot sufficiently. The
speed of the robot is rather slow with an average speed of 0.0096 m/s. Thus, further
modifications of the fabricated Pipeline Inspection Robot and recommendations
will be made and stated for future works to improve the robot’s driving speed. The
materials that are used to make the Pipeline Inspection Robot parts are entirely
aluminiums. Aluminiums have a very low specific weight of about 1/3 of iron.
Hence, this can decrease the robot’s weight than using common metals to fabricate
the robot. Furthermore, aluminium has a very high resistance against corrosion and
oxidation, which best to be used for the Pipeline Inspection Robot as the robot will
be used and travel inside a pipeline with various conditions. Despite the beneficial
properties of the aluminium, the fabricated Pipeline Inspection Robot turns out
quiet heavy and thus, further research and development will be made to the robot
for future works and studies. Next, the transparent body covers for the front and
backside of the Pipeline Inspection Robot were not be able to completed because of
time constraint. The fabrications, modifications and the assembly of the fabricated
Pipeline Inspection Robot took a tremendous amount of time. The designed body
covers that are made up of acrylic plastic are used to protect the electronic parts
inside the body of the robot. It also protects the camera that will be placed inside the
robot’s body for inspection utilizations (Fig. 4).
The experiment is prepared to analyze and observe the robot’s average speed in a
320 mm long pipe with the diameter of 266 mm. A number of 10 trials were done
to test the robot’s speed inside the pipe and the time for the robot to move inside the
Design and Development of Remotely Operated Pipeline … 21

Fig. 4 A view of the fabricate Pipeline Inspection Robot

Table 2 The results of the pipeline inspection robot speed test


Trials Time taken to move inside the pipeline (320 mm length 266 diameter) s
1 31
2 33
3 35
4 31
5 34
6 33
7 32
8 35
9 36
10 32
Average time 33.2
Average speed 0.0096 m/s
22 M. S. Mohd Aras et al.

pipe and the average speed is records in the Table 2. The robot took an average of
33.2 s to move to the end of the 320 mm long pipe and gain an average speed of
0.0096 m/s. The performance of the robot’s speed can be further improved with
proper modifications and future works.

4 Conclusion

The design of the Pipeline Inspection Robot with the specifications and features has
been done successfully. Next, the fabrications of the robot are also a success,
although there were a few modifications that have been made to the measurements
and specifications of the PIR. The performance of the PIR in terms of flexibility can
be further analyze with proper modifications to the Pipeline Inspection Robot.
Throughout the fabrication process, a few changes in measurements were made to
the parts of the robot because some parts are too small to be fabricated. These
changes were carefully made and the robot is fabricated successfully. There was the
unexpected result made after the fabrications of the robot. The weight of the robot
was unexpectedly heavy and it affected the speed of the robot. There are many ways
to improve the Pipeline Inspection Robot in terms of its performance and design. To
increase and improves the performance of the robot, these future works are needed
and further develop this Pipeline Inspection Robot.

Acknowledgements The authors would like to thank Universiti Malaysia Pahang for the pro-
vision of PJP grant (RDU170366) and Special appreciation and gratitude to especially for Centre
of Research and Innovation Management (CRIM), Centre for Robotics and Industrial Automation
(CERIA) for supporting this research and to Faculty of Electrical Engineering from UTeM for
supporting this research under PJP (PJP/2019/FKE(3C)/S01667).

References

1. Harish P, Venkateswarlu V (2013) Design and motion planning of indoor pipeline inspection
robot. Int J Innov Technol Explor Eng 3(7):41–47
2. Bhadoriya AVS, Gupta VK, Mukherjee S (2018) Development of in-pipe inspection robot.
Mater Today Proc 5(9):20769–20776
3. Nayak A, Pradhan SK (2014) Design of a new in-pipe inspection robot. Procedia Eng
97:2081–2091
4. Lee D, Park J, Hyun D, Yook G, Yang HS (2012) Novel mechanisms and simple locomotion
strategies for an in-pipe robot that can inspect various pipe types. Mech Mach Theory 56:52–
68
5. Roh SG, Choi HR (2005) Differential-drive in-pipe robot for moving inside urban gas
pipelines. IEEE Trans Robot 21(1):1–17
6. Roslin NS, Anuar A, Jalal MFA, Sahari KSM (2012) A review: Hybrid locomotion of in-pipe
inspection robot. Procedia Eng 41:1456–1462
7. Abidin ASZ (2015) Development of track wheel for in-pipe robot application. Procedia
Comput Sci 76:500–505
Design and Development of Remotely Operated Pipeline … 23

8. Bujang AS, Bern CJ, Brumm TJ (2016) Summary of energy demand and renewable energy
policies in Malaysia. Renew Sustain Energy Rev 53:1459–1467
9. Enner F, Rollinson D, Choset H (2013) Motion estimation of snake robots in straight pipes.
In: Proceedings of IEEE International Conference on Robotics and Automation, Germany,
pp 5168–5173. IEEE
10. How often do pipelines blow up? https://money.cnn.com/2016/11/01/news/pipelinefatalities/
index.html. Accessed 25 May 2019
11. Multiple gas explosions rock Kaohsiung streets. http://focustaiwan.tw/news/asoc/
201408010001.aspx. Accessed 25 May 2019
12. Natural Gas Pipeline Explosions in Texas Critically Injure 5 Workers. https://www.huffpost.
com/entry/natural-gas-pipeline-explosionstexas_n_5b62964be4b0fd5c73d62c97. Accessed
25 May 2019
Vision Optimization for Altitude Control
and Object Tracking Control
of an Autonomous Underwater
Vehicle (AUV)

Joe Siang Keek, Mohd Shahrieel Mohd Aras, Zainah Md. Zain,
Mohd Bazli Bahar, Ser Lee Loh, and Shin Horng Chong

Abstract Underwater vision is very different with atmospheric vision, in which the
former is subjected to a dynamic and visually noisy environment. Absorption of
light by the water and rippling waves caused by atmospheric wind are resulting
uncertain refraction of light in the underwater environment, thus continuously
causing disturbance towards the visual data collected. Therefore, it is always a
challenging task to obtain reliable visual data for the control of autonomous
underwater vehicle (AUV). In this paper, an AUV was developed and is tasked to
perform altitude control and object (poles) tracking control in a swimming pool by
merely using a forward-viewing vision camera and a convex mirror. Prior to design
and development of control system for the AUV, this paper only focuses on uti-
lizing and optimizing the visual data acquired. The processing process involves
only gray-scaled image and without any common color restoration or image
enhancement techniques. In fact, the image processing technique implemented for
the object tracking control in this paper contains a self-optimizing algorithm, which
results improvement on the object detection. The result shows that under similar
challenging and dynamic underwater environment, the detection with optimization
is 80% more successful than without the optimization.


Keywords Vision optimization Altitude control  Object tracking control 
Autonomous underwater vehicle

J. S. Keek  M. S. Mohd Aras (&)  M. B. Bahar  S. L. Loh  S. H. Chong


Faculty of Electrical Engineering, Universiti Teknikal Malaysia Melaka, Jalan Hang Tuah
Jaya, 76100 Durian Tunggal, Melaka, Malaysia
e-mail: [email protected]
Z. Md. Zain
Robotics and Unmanned Systems (RUS) Research Group, Faculty of Electrical and
Electronics Engineering, Universiti Malaysia Pahang, 26600 Pekan, Pahang, Malaysia

© Springer Nature Singapore Pte Ltd. 2021 25


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_3
26 J. S. Keek et al.

1 Introduction

Besides of the universe up in the sky and beyond, underwater world is another
universe that is always in the to-explore-list of mankind throughout the past dec-
ades. While the mankind has already reached millions of light years up into the
universe, but still yet to complete the exploration of underwater world even though
it is just a few hundreds of kilometers of deepness. The main reason for this
circumstance is because of the medium of the underwater environment—water not
only hinders the transmission of radio frequency (RF) signal, it refracts and absorbs
the penetration of visible light and thus causing the exploration of underwater world
to encounter various difficulties, even for shallow water environment as well. As
vision is one of the most informative source of feedback sensing, losing such
capability means a ‘handicapped’ autonomous underwater vehicle (AUV).
Therefore, the exploration of underwater world without vision is not preferable.
In underwater environment, visible light is refracted. What even worse is, a gust
of wind can easily create waves of ripple, causing the refraction to be varying and
uncertain. Therefore, the light reflected from underwater object may has dynamic
light reflection and patching over time. Moreover, water tends to absorb red and
green lights, thus leaving multicolor object left with only blue color. Therefore, the
image taken under water is very different with the image taken on ground, addi-
tional image processing techniques are mandatory.
Existing conventional image processing techniques for ground image are
matured and common, however, when it comes to the application of underwater
images, these techniques may be inadequate. Therefore, various additional image
processing technique for underwater image is developed and formulated from time
to time. As mentioned earlier, the rippling water waves cause the underwater image
to contain noise and disturbance. Image transformation technique such as wavelet,
curvelet and contourlet are promising in overcoming such circumstance [1].
Meanwhile, as water tends to absorb all spectrum of visible light except the blue
one, therefore, effort such as color restoration and correction was proposed for
acoustic underwater image with heuristic algorithm [2, 3]. Occasionally, working
with colorful image can be easier for feature extraction and object recognition, but it
is three times more computational power hungrier than gray-scaled image. Zhang
et al. proposed an implementation of Particle Swarm Optimization (PSO) in opti-
mizing the gray-scaled tuning parameter, with the objective of achieving lesser
computational power yet retaining decent accuracy of object recognition and
detection [4].
As working with color restoration or correction techniques may add complexity
to the image processing, and colorful image involves higher computational power
as well, therefore in this project, gray-scaled underwater image is adopted but
unlike [4–7], a more complicated object is used for detection and a simple
self-tuning algorithm is implemented to cope with the dynamic environment of
under water. The final result displays a more robust detection of the object assigned
and deployed. This paper is organized as follow. Section 2 describes the hardware
Vision Optimization for Altitude Control and Object Tracking … 27

and experimental setups of the AUV developed. Section 3 presents the image
processing techniques used in this paper. Section 4 presents and discusses exper-
imental result and finally in Sect. 5, this paper is concluded.

2 Hardware and Experimental Setups

The autonomous underwater vehicle (AUV) developed in this paper is equipped


with a looking-forward Raspberry Pi camera module and is tasked to acquire
altitude and object location data. In order to fulfill these criteria concurrently and
instead of using two cameras (one looking-forward camera and one
looking-downward camera), a convex mirror is used. The outcome of the
looking-forward raw image data is as shown in Fig. 1.
The convex mirror is actually a blind-spot mirror for the rear mirrors of car. The
advantage of such mirror is that it produces zoomed and wider field of view. Based
on Fig. 1, the areas (size) of the tiles spotted in the mirror are computed and used to
determine the immediate altitude of the AUV. The benefit of such approach or
hardware setup is, both altitude and object detecting data can be acquired con-
currently by using merely one camera. Moreover, the image can be segmented into
two smaller regions of interest (ROI) for simultaneous processing, thus saving
abundant of computational power and time. Next, the detail of the poles is illus-
trated in Fig. 2.

Fig. 1 Forward view from the perspective of the AUV in a swimming pool
28 J. S. Keek et al.

Fig. 2 Illustration and detail of the object (poles) used

Overall, the frame captured by the camera has resolution of 640  480 pixels
and with frame rate of 10 frames per second (fps). Although the poles are colored
with bright orange color, however in Fig. 1, the poles appeared to have dark colored
surface and the overall image is blueish. Such properties vary from time to time and
from position to position. Therefore, a self-tuning image processing technique is
implemented to cope with such dynamicity, which will be presented in upcoming
section.

3 Image Processing Technique

3.1 Data for Altitude Control

To efficiently acquire altitude data, the raw image or frame is first cropped based on
region of interest (ROI), that is where the mirror locates in the image. Since the
mirror moves along with the AUV, the position of the mirror is constant and thus
the parameters for the ROI can be pre-defined. Figure 3 depicts the cropped image
of the raw image in Fig. 1.
To ease the computation, the segmented or cropped image is converted into
gray-scaled image, whereby the intensity of each pixel is then ranged between 0 and
255. Next, Gaussian blur is applied with 5  5 pixels of kernel to smoothen edges,
then followed by edge detection by using built-in Canny function from Python
OpenCV. To enhance edges, morphological transformations is applied, whereby the
Vision Optimization for Altitude Control and Object Tracking … 29

Fig. 3 ROI for altitude


control

Fig. 4 Morphological
transformed image

image is first dilated and then followed by erosion and the result is as shown in
Fig. 4. At this stage, contours of the image can be easily obtained. The shape of
each contour can be approximated by using Douglas-Peucker algorithm. Polygon
with four vertices is detected as a quadrilateral, which denotes the tile of the
swimming pool. Finally, the areas of each detected quadrilateral (tiles) are com-
puted and collected and the altitude of the AUV can be determined by using the
average value of these tile areas.
30 J. S. Keek et al.

Fig. 5 ROI for object tracking control

3.2 Data for Object Tracking Control

In this subsection, the image processing technique on locating the targeted object
i.e. poles in the vision of the autonomous underwater vehicle (AUV) is presented.
As mentioned earlier, due to the dynamic and noisy environment of underwater
environment, detecting the poles in the swimming pool requires certain extent of
adaptability. Therefore, a self-tuning algorithm is discussed in this subsection,
whereby a parameter will be optimized heuristically based on the fitness function
designed and developed. First of all, and as previous, to minimize computational
power as much as possible, only region of interest (ROI) is extracted or cropped out
for processing. The cropped image with the ROI is as shown in Fig. 5.
Then, the image is converted into gray-scaled image to further lighten the
computation. Based on the image in Fig. 5, the poles straightforwardly outstand
from the environment based on our perspective. Therefore, there is certainly a
boundary value that can capture and detect the poles. Since the image is in
gray-scaled, the lower boundary value is 0 whereas the upper boundary value, Uop
is the parameter to be optimized. Since the optimization does not involve multi-
dimensional search space and multivariable, a simple optimization process is
implemented, that is by just increasing the value of Uop with step value of 1 at each
iteration. During each iteration, contours are computed, and all polygons with four
vertices (quadrilaterals) are collected. The key point of a successful and accurate
detection of the poles depends on the reliability of the fitness function designed.
The algorithm of the fitness function in Python programming language is presented
in Algorithm 1.
Vision Optimization for Altitude Control and Object Tracking … 31

Algorithm 1: Fitness function for optimizing Uop.



1 if angles is not None and len(angles) == 2 and abs(angles[0]) < 45 and
2 abs(angles[1]) < 45:
3 angleDiff = abs(round(angles[0]) - round(angles[1]))
4 else:
5 angleDiff = 90
6 if len(widths) == 2 and len(areas) == 2 and angleDiff < 45:
7 widthAreaRa o = []
8 for i in range(2):
9 widthAreaRa o.append(widths[i]/areas[i])
10 fitnessFunc on = abs(widthAreaRa o[0]-widthAreaRa o[1])
11 else:
12 fitnessFunc on = float(‘inf’)

13 costs.append(fitnessFunc on)

14 minimumCostLoca on = costs.index(min(costs))
15 minimumCost = costs[minimumCostLoca on]
16 op malParameter = parameters[minimumCostLoca on]

Intuitively, the characteristics of the object (poles) are used as the criteria to
design the fitness function. Based on Fig. 5, the object is made up of two poles and
therefore in line 1 of Algorithm 1, the number of detected quadrilaterals allowed is
equals to 2. Moreover, the poles are in upright position and never in horizontal
position. Therefore, ‘abs(angles[0])’ and ‘abs(angles [1])’ only accept quadrilaterals
that are angled in less than 45° and −45°. Next, since these two poles are parallel to
each other, their angle difference should not have large difference; only angle
difference of less than 45° is allowed.
Next, the width-area ratio is introduced, and the value returned by the fitness
function is exactly the absolute difference of the width-area ratio of these two
quadrilateral (poles) as shown in line 10 of Algorithm 1. Intuitively, these two poles
are identical and therefore have very similar width. However, due to the cropped
region as shown in Fig. 5, the pole or poles may be partially blocked occasionally,
resulting the area of the poles obtained via the image processing technique to have
significant difference. Therefore, their widths are normalized by their respective
areas for reasonable detection. Finally, all the fitness function values are compiled.
The value of Uop with minimum cost value is selected as the optimal parameter.
32 J. S. Keek et al.

4 Experimental Result and Discussion

In this section, the result of the methods implemented for the image processing is
presented and discussed. The autonomous underwater vehicle (AUV) was manually
moved from one position to another to acquire raw image data. 15 frames of images
are selected to evaluate the performance of the proposed method. Table 1 presents
the altitude data obtained experimentally.
Based on Table 1, all 15 frames have successful detection of the tiles, even
though the underwater environment is dynamic and is sensitive to external dis-
turbance. This is because, unlike the detection of the poles, detection of the tiles is
simply easier. Moreover, the tiles are beneath the AUV and therefore, noisy light
refraction caused by the rippling water waves does not affect the image signifi-
cantly. Overall, the tile areas of each frame have coefficient of variation (COV) of
not more than 0.27, which denotes that the detection is reliable and consistent.
Next, the result for the detection of the poles is presented in Table 2.
In Table 2, experiments without and with optimization is compared. Without the
optimization, parameter Uop is fixed at value of 98 throughout all frames. Whereas
with the optimization, the value of Uop is dynamic and varies according to
immediate state and environment. The overall result shows that without the
self-tuning algorithm, only three frames i.e. Frames 1, 5 and 7 successfully detect
the poles whereas with the self-tuning algorithm, all 15 frames attain successful
detection. Take note that the values of Uop varies without an incremental or
decremental pattern, which indicates the uncertain dynamic environment of under
water. Meanwhile, the error, which is also the input for system controller, denotes
the horizontal distance between center point of the frame (white dot) and the center
point between the poles (black dot).

Table 1 Altitude data

Frame Outcome Areas (pixel2) Coefficient Mean


No. of Varia on (pixel2)

208.0, 126.0, 210.0,


1 0.22 188.5
210.0

2 126.0, 154.0 0.14 140

3 150.0, 224.0, 180.0 0.20 184.7


Vision Optimization for Altitude Control and Object Tracking … 33

Table 1 (continued)

126.0, 264.0, 176.0,


4 0.27 203.1
164.5, 256.0, 180.0, 255.5

224.0, 224.2, 196.0,


5 0.26 224.6
250.7, 154.0, 335.8, 188.0

6 225.0, 225.0 0 225.0

296.1, 223.4, 255.0,


7 0.22 204.5
176.0, 176.0, 192.0

289.0, 256.0, 255.0,


8 180.0, 180.0, 221.0, 0.18 232.9
272.0, 210.0

126.0, 150.0, 150.0,


9 196.0, 130.0, 225.0, 0.24 173.1
165.0, 154.0, 255.0, 180.0

225.0, 165.0, 203.4,


10 0.17 213.8
255.4, 180.0
34 J. S. Keek et al.

Table 2 Object detection data without and with the self-tuning algorithm

Frame Outcome without Error Outcome with Self- Error Uop


No. Self-tuning (pixels) tuning Algorithm (pixels)
Algorithm, Uop = 98

1 -15.66 98
-15.66

2 nil 4.54 95

3 nil 7.38 89

4 nil 52.60 83

5 nil 40.43 76
Vision Optimization for Altitude Control and Object Tracking … 35

Table 2 (continued)

6 -4.24 -4.24 79

7 22.01 23.06 69

8 nil 91
-29.32

9 nil 83
-74.97

10 nil 65
-18.56

5 Conclusion and Future Work

The proposed method has successfully achieved robust data extraction for the
purposes of altitude control and object tracking control in the future. A conclusion
that can be drawn is, self-tuning or self-optimizing algorithm is a mandatory for
dynamic circumstance such as the environment of under water. In future work,
optimization technique with better convergence time can be implemented to
improve the proposed image processing technique. Moreover, more tuning
36 J. S. Keek et al.

parameters can be introduced to improve the robustness and reliability of the


detection.

Acknowledgements The authors would like to thank Universiti Malaysia Pahang for the pro-
vision of PJP grant (RDU170366) and Ministry of Higher Education of Malaysia for the provision
of FRGS grant (FRGS/2018/FKE-CeRIA/F00352).

References

1. Sharumathi K, Priyadharsini R (2016) A survey on various image enhancement techniques for


underwater acoustic images. In: International Conference on Electrical, Electronics, and
Optimization Techniques, pp 2930–2933
2. Pramunendar R, Shidik AGF, Supriyanto CP, Andono N, Hariadi M (2018) Auto level color
correction for underwater image matching optimization. Int J Comput Sci Netw Secur 13
(1):18–23
3. Trucco E, Olmos-Antillon AT (2016) Self-tuning underwater image restoration. IEEE J
Oceanic Eng 31(2):511–519
4. Zhang R, Liu J (2006) Underwater image segmentation with maximum entropy based on
particle swarm optimization (PSO). In: Proceedings of the First International
Multi-symposiums on Computer and Computational Sciences
5. Silpa-Anan C, Brinsmead T, Abdallah S, Zelinsky A (2001) Preliminary experiments in visual
servo control for autonomous underwater vehicle. In: Proceedings 2001 IEEE/RSJ
International Conference on Intelligent Robots and Systems. Expanding the Societal Role of
Robotics in the Next Millennium, vol 4, pp 1824–1829
6. Lee P-M, Hong S-W, Lim Y-K, Lee C-M, Jeon B-H, Park J-W (1999) Discrete-time
quasi-sliding mode control of an autonomous underwater vehicle. IEEE J Oceanic Eng 24
(3):388–395
7. Shojaei K, Dolatshahi M (2017) Line-of-sight target tracking control of underactuated
autonomous underwater vehicles. Ocean Eng 133:244–252
Development of Autonomous
Underwater Vehicle Equipped
with Object Recognition and Tracking
System

Muhammad Haniff Abu Mangshor, Radzi Ambar,


Herdawatie Abdul Kadir, Khalid Isa, Inani Yusra Amran,
Abdul Aziz Abd Kadir, Nurul Syila Ibrahim, Chew Chang Choon,
and Shinichi Sagara

Abstract The development and design of autonomous underwater vehicle (AUVs)


provides unmanned, self-propelled vehicles that are typically deployed from a
surface vessel, and can operate independently for periods of a few hours to several
days. This project discusses the development of an AUV equipped with object
recognition and tracking system. In this project, the motion of AUV is controlled by
two thrusters for horizontal motions and two thrusters for vertical motions. A Pixy
CMUcam5 is used as a vision sensor for the AUV that is utilized to recognize an
object through its specific color signatures. The camera recognizes an object
through colour-based filtering algorithm by calculating the colour (hue) and satu-
rations of each red, green and blue (RGB) pixel derived from built-in image sensor.
When the camera recognizes an object, the AUV will automatically track the object
without any operator. Preliminary underwater experiments have been carried out to
test its ability to stay submerge underwater as well as its functionality to navigate
and recognize object underwater. Experiments also have been carried out to verify
the effectiveness of Pixy CMUcam5 to recognize a single and multiple objects
underwater, then tracks the recognize object. This work reports the findings that
demonstrate the usefulness of PixyCMUcam5 in the development of the AUV.

Keywords Autonomous underwater vehicle  Pixy CMUcam5  Object



recognition Object tracking

M. H. Abu Mangshor  R. Ambar (&)  H. A. Kadir  K. Isa  I. Y. Amran  A. A. A. Kadir 


N. S. Ibrahim  C. C. Choon
Department of Electronic Engineering, Faculty of Electrical and Electronic Engineering,
Universiti Tun Hussein Onn Malaysia, 86400 Parit Raja, Batu Pahat, Johor, Malaysia
e-mail: [email protected]
S. Sagara
Department of Mechanical and Control Engineering, Kyushu Institute of Technology,
Tobata, Kitakyushu 804-8550, Japan

© Springer Nature Singapore Pte Ltd. 2021 37


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_4
38 M. H. Abu Mangshor et al.

1 Introduction

An underwater vehicle is a robotic vehicle that travels underwater that can be


classified into manned and unmanned vehicles. The manned variants include sub-
marines and submersible. A submarine is a ship that can be submerged and navigated
underwater with a streamlined hull intended for lengthy periods of operation in the
ocean, fitted with a periscope and typically fitted with torpedoes or rockets. Military
submarines are typically used to protect aircraft carriers on the water surface, to
attack other submarines and watercraft, to supply ships for other submarines, to
launch torpedoes and rockets, and to provide surveillance and protection against
prospective attackers. It differs from a submersible which has limited underwater
capability. Submersible is used for various purpose, including deep-sea surveys,
marine ecological assessment, natural marine resource harvesting, deep-sea explo-
ration and marine exploration [1].
Unmanned underwater vehicle (UUV) or more often referred to as autonomous
underwater vehicles (AUV) are robots that travels underwater independently
without requiring no physical connection to their input from an operator [2, 3].
AUVs are programmed at the surface, and then navigate through the water on their
own, collecting data as they go. AUVs can be preprogramed with an assignment
and location. Once their assignment is complete, the robot will return to its location.
On the other hand, remotely operated vehicles (ROV) are any vehicles that are
able to operate underwater where the vehicles are controlled by humans from a
remote location using remote control devices [4–6]. A series of wires running on
land or in the air connect the vehicles to a surface ship. These wires convey control
and control signals between the operator and the ROV, enabling the vehicles to be
remotely navigated. A ROV can include a video camera, lights, sonar systems and
robotic arms. The roles of UUV such as ROVs and AUVs are for example to map
the seabed for oil and gas industry, underwater observation, seabed exploration,
underwater building and subsea project maintenance and underwater inspection and
ship hull cleaning. ROVs involve in collecting samples or manipulating the
environment while AUVs will help to create detailed maps or measure water
properties.
Vision system is a technology that enables a computer to recognize and evaluate
images. A vision system usually comprises of hardware and software for digital
cameras and back-end image processing. The front camera of a robotic vehi-
cle captures pictures from the setting or a centered object and sends them to the
processing scheme [7]. The vision system has the ability to recognize objects,
places, people, writing and actions in images. Computers can use machine vision
technologies in combination with a camera and artificial intelligence software to
achieve image recognition. Image recognition is utilized to play out an enormous
number of machine-based visual errands, for example, naming the substance of
images with meta-tags, self-driving vehicles and mishap evasion frameworks,
performing images content inquiry and controlling self-governing robots. Robotic
vehicles are expected to simultaneously detect obstacles and recognize an object.
Development of Autonomous Underwater Vehicle Equipped … 39

The technology is even capable of following the objects. By applying a vision


system to a robotic vehicle means that you give it eyes to recognize an object. In
this project, an autonomous underwater vehicle equipped with vision system has
been developed. The project proposes the design and development of an AUV that
can navigate based on object recognition and tracking system using a single camera.
A Pixy CMUcam5 camera is used to recognize a target object and track its
movements in underwater environment.
This paper is organized as follows. Section 2 describes the detail design of the
propose AUV including 3D model design and actual AUV prototype design which
consist of the vision sensor. Section 3 introduces object recognition and tracking
method algorithm that is used in this work, followed by a brief conclusion and
future recommendation in Sect. 4.

2 Methodology

2.1 AUV Design Process

Figure 1 shows the AUV design process. It can be classified into several stages.
The main stage focuses on the design concept of the AUV which covers mechanical

Testing and assessing AUV


Review of previous AUV concepts Object
and designs recognition
Propose design Tracking system
Analysis of design concept
Choose final design
concept

yes
Require Adjustment to
Designing Designing upgrades? AUV design
mechanical electrical
system system
no

Integrate mechanical and electrical


system
Final AUV design

Construction process

Fig. 1 Process of design and construction of the propose AUV


40 M. H. Abu Mangshor et al.

and electrical design. The next stages can be described in two sections; the first
section is the development of the mechanical parts. Computer-aided software such
as the Sketch Up software is used to draw and animate the proposed AUV. Other
subsections discuss on the development of the internal and external electrical design
of the AUV. The last stages are testing, fine tuning and minor upgrading tasks.

2.2 AUV Structure 3D Modelling

This subsection discusses the 3D design of the AUV. The actual structure of the
AUV is developed based on the 3D design. Figure 2 shows the 3D design of the
proposed AUV modelled using Sketch Up software based on the actual size,
dimensions and the entire component that has been used. Figure 3 shows various
views of the 3D design. Figure 4 shows the main components of the proposed
AUV.

2.3 AUV Structure 3D Modelling

Figure 5 shows various views of the completed AUV structure. The structure is
composed of aluminium alloy struts which is extremely tough, light-weight,

Fig. 2 3D design of the AUV structure


Development of Autonomous Underwater Vehicle Equipped … 41

Fig. 3 Various view of the AUV’s 3D design

PVC Pipe
Left thruster

Compartment
Aluminium Alloy Strut
Arduino Mega

Pixy CMUCam5

Right thruster Bottom thruster


Bottom thruster

Fig. 4 AUV main components


42 M. H. Abu Mangshor et al.

Fig. 5 AUV body structure


with dimensions

corrosive resistant, and anti-rusting. The aluminium alloy struts are easy to be
installed and modified making it very flexible in order to fitting with other com-
ponent into the AUV. The dimension of the AUV is 65 cm length, 24 cm width and
24 cm height as shown in the figure.
The process of cutting the metal must be precise to avoid difficulty during
buoyancy test. Each aluminium alloy strut is joined using aluminium corner 90° L
shape joint bracket tightened using button head and ball nut. The joint parts need to
be completely tightened so that the AUV structure is strong enough to face
underwater external forces.
After all the installation and testing completed, all the system were integrated
and uploaded into the Arduino Mega microcontroller. All the electronic compo-
nents were placed into the underwater compartment and the thrusters were mounted
onto the AUV in order to test its overall system functionality. Figure 6 shows the
completed installation of AUV including all peripherals such as thrusters and
electronic circuitry.

2.4 Pixy CMUcam5 Installation

The Pixy CMUcam5 is placed inside a waterproofed underwater compartment as


shown in Fig. 7. The underwater compartment has a dome end cap design. This
dome end cap helps to improve vision underwater environment clearly. The
Development of Autonomous Underwater Vehicle Equipped … 43

Fig. 6 Various viewpoints of the completed AUV

position of Pixy CMUcam5 is inside the compartment and at the dome end
cap. A mounting bracket has been designed using 3D printer in order to hold the
camera inside the compartment Fig. 8a shows the mounting bracket for Pixy
CMUcam5. The dimension of the mounting bracket is 8 cm in diameter with a
thickness of 1 cm. Figure 8b shows the Pixy CMUcam5 is attached inside the
compartment using the mounting bracket.
44 M. H. Abu Mangshor et al.

Fig. 7 AUV’s waterproofed underwater compartment

Fig. 8 a Mounting bracket for Pixy CMUcam5, b Pixy CMUcam5 is attached inside the
compartment using the mounting bracket

2.5 Object Recognition and Tracking System Using Single


Camera

Object recognition using Pixy CMUcam5. In this work, a Pixy CMUcam5 is


used as a vision sensor. Figure 9 shows an image of a Pixy CMUcam5 connected to
an Arduino Mega microcontroller. This Pixy CMUcam5 uses a colour-based fil-
tering algorithm to recognize object. Pixy calculates the hue and saturation of each
RGB pixel from the image sensor and uses these as the primary filtering parameters.
Development of Autonomous Underwater Vehicle Equipped … 45

Fig. 9 Pixy CMUcam5 connected to Arduino. As can be seen the Pixy CMUcam5 is connected to
Arduino at ICSP pin

The hue of an object remains largely unchanged with changes in lighting and
exposure. The changes in lighting and exposure can have a frustrating effect on
color filtering algorithms. It can also recognize seven different color signatures; find
hundreds of objects at the same time, and processing at 50 fps. Pixy processes an
entire 640  400 image frame every 1/50th of a second (20 ms). This means that
you get a complete update of all detected objects’ positions every 20 ms. Pixy
CMUcam5 addresses these problems by pairing a powerful dedicated processor
with the image sensor. Pixy processes images from the image sensor and only sends
the useful information to the microcontroller. Pixy can easily connect to lots of
different controllers because it supports several interface options (UART serial, SPI,
I2C, USB, or digital/analog output).
Object Tracking using Pixy CMUcam5. The Pixy CMUcam5 is connected to an
Arduino microcontroller to recognize and track object. Figure 10 shows the flow-
chart of object tracking. The Pixy CMUcam5 will find the set signature colour by
using object colour-based filtering algorithm. Once the Pixy CMUcam5 succeed in
recognizing the object, the AUV will take action to achieve the goal. Otherwise the
AUV will keep acquiring image to recognize target object. As the AUV near to the
recognized object, the AUV will stop moving.
Initially, the Pixy CMUcam5 was ‘taught’ to track an object. PixyMon software is
used to teach the AUV to recognize the objects. This was done by holding the
object in front of its lens while holding down the button located on top. While doing
this, the RGB LED under the lens provides feedback regarding which object it is
looking at directly. When tracking an object using PixyMon, the Pixy CMUcam5
will determine some object image resolutions that have same assumption when
trying to detect an object. Object tracking is implemented in the TrackBlock
function where the function is to keep following the object in a set area. It analyzes
the image and identifies objects matching the colour characteristics of the object
being tracked. It then reports the position size and colors of all the detected objects
back to the Arduino.
46 M. H. Abu Mangshor et al.

Fig. 10 Flowchart of Object


Tracking START

Image acquisition

Object colour-based
filtering algorithm

No
Object recognized?

Yes

Object tracking

AUV moves towards object


(Forward, Reverse, Left, Right)

No No
Object distance
=10cm? Object lost?
Yes
Yes
AUV stop

2.6 AUV Circuit Design

Figure 11 shows the circuit design for the AUV illustrated using Fritzing software.
As shown in the figure, the AUV utilizes an Arduino Mega microcontroller to
control all peripherals. The circuit consists of one (1) input and four (4) outputs.
The input is only Pixy CMUcam5 that connects at Arduino’s ICSP pin. The output
are consisting of four (4) T100 thrusters from BlueRobotics that perform up, bot-
tom, right and left movements. To operate the thrusters, 11 V power supplies are
needed. The thrusters are connected to electronic speed controllers (ESC) and then
to the Arduino Mega. The ESC is used to control the speed of thrusters and the
forward or reverse rotation for forward or reverse thrust. A Pixy CMUCam5 is used
to give instructions to the AUV to recognize and track the object in underwater
based on a colour set signature and sends the data to the control system. The control
Development of Autonomous Underwater Vehicle Equipped … 47

5V

9V
Power jack

ESC
Thruster A

ESC
Thruster B
Arduino Mega ESC
Thruster C

ESC Thruster D

Pixy CMUcam5

Fig. 11 AUV circuit design using Fritzing

9V LiPo Battery

Electronic Speed Controller


5V Power Arduino Mega
bank

Thrusters
Pixy CMUCam5

Fig. 12 Actual circuit for the proposed AUV

system will give instruction to the thrusters whether to move forward or reverse,
submerge deeper or rise depending the location of the object. Figure 12 shows the
actual circuit of the proposed AUV.
48 M. H. Abu Mangshor et al.

3 Preliminary Experiments

3.1 Water Leakage and Submerging Experiment

Before placing the electronic devices inside the underwater compartment, it is


necessary to perform water leak test. Figure 13a shows the water leakage test
condition. To detect an air leaks, the underwater compartment was submerged for
an hour inside a water container. If there is any present of bubbles means there is an
air leak. This test helps to prevent short circuit for electronic components inside the
underwater compartment and keeps of the underwater compartment dry while
submerged underwater. The underwater compartment has been tested three times
submerged underwater where each test was done for an hour. Before submerging,
the compartment was tested to make sure it is watertight and reliable in preventing
the electronic devices from damage due to water leakage.
After the AUV was completely assembled, a submerging test was carried out in a
lake to test whether the AUV ready to remain completely submerged for a period of
time. The experiment also carried out to verify the waterproofing of the component
storage compartment. Figure 13b shows the submerging experiment condition. As
shown in the figure, the yellow coloured PVC pipes were added to the sides of the
AUV to act as floating mechanism for the AUV to reduce the buoyant force acted
upon the AUV. Additional loads were added to the AUV in order to the AUV
submerged. Based on the experiment, the compartment was waterproofed reliably.
Furthermore, the right amount of loads required for the AUV to stay submerged
were verified successfully.

PVC Pipe

AUV

(a) (b)

Fig. 13 a Compartment water leakage test condition, b AUV submerging experiment condition
performed in a lake
Development of Autonomous Underwater Vehicle Equipped … 49

3.2 Underwater Experiment on Single Object Recognition


Using Pixy CMUcam5

This experiment has been carried out to investigate the effectiveness on Pixy
CMUcam5 to recognize a single object in underwater. The object used in the
experiment test is a pink colour dinosaur toy named as Spinosaurus (pink).
Underwater experiments have been carried out in a water container with the size of
80 cm (width)  58 cm (depth)  50 cm (height). The container was chosen since
there was no large water tank to test long distance recognition capabilities.
Therefore, the maximum distance between camera position and the object was
30 cm.
Experimental Steps. The steps for this experiment are as follows:
1. Connect Pixy CMUcam5 to Arduino Mega.
2. Use 5 V power supply to Arduino Mega.
3. Upload a source code to Arduino Mega.
4. The electronic components are placed inside underwater compartment.
5. The object is placed in a water container as shown in Fig. 14.
6. Initially, the camera is located with a distance 30 cm to the object position.
Then, it is moved near to the object at 25, 20, 15, and 10 cm positions.
7. Repeat step 4 to 6 with different type of water which is clear water, and mud
water.
8. The video images captured by camera are recorded.

Fig. 14 Clear underwater single object recognition by Pixy CMUcam5


50 M. H. Abu Mangshor et al.

(a) 30cm (b) 25cm (c) 20cm (d) 15cm (e) 10cm

Fig. 15 Camera views of a single object in clear water with varying distances

(a) 30cm (b) 25cm (c) 20cm (d) 15cm (e) 10cm

Fig. 16 Camera views of a single object in muddy water with varying distances

Experimental Results. From the experiment, the pixy CMUcam5 was able to
recognize a single object in clear water condition with the distances of camera to
object set as 30, 25, 20, 15 and 10 cm for the clear water as shown in Figs. 15a–e.
In muddy water condition, the pixy CMUcam5 was able to only recognize object
located 10 cm from the camera position as shown in Figs. 16a–e.

3.3 Underwater Experiment on Multiple Objects


Recognition Using Pixy CMUcam5

This experiment has been carried out to investigate the effectiveness of Pixy
CMUcam5 to recognize on multiple objects in underwater. The objects used in the
experiment were Spinosaurus (pink), Stegosaurus (green), Pteranodon (yellow),
Triceratops (orange) and Tyrannosaurus (purple) in colours.
Experimental Steps. The steps for this experiment are as follows:
1. Connect Pix CMUcam5 to Arduino Mega.
2. Use 5 V power supply to Arduino Mega.
3. Upload a source code to Arduino Mega.
4. The electronic components are placed inside underwater compartment.
5. The object is placed in a water container as shown in Fig. 17a.
Development of Autonomous Underwater Vehicle Equipped … 51

(a) Clear water (b) Muddy water

Fig. 17 Camera views of a multiple objects in a clear water, b muddy water


52 M. H. Abu Mangshor et al.

6. Initially, the camera is located with a distance 30 cm to the object position.


Then, it is moved near to the object at 25, 20, 15, and 10 cm positions.
7. Repeat step 4 to 6 with different type of water which is clear water and mud
water.
8. The video images captured by camera are recorded.
Experimental Results. From the experiment, it was found that the pixy
CMUcam5 was able to recognize a certain multiple objects in clear underwater at
certain distances as shown in Fig. 17a. At the camera distance to object at 30 cm,
the camera was able to recognize Spinosaurus and Pterandon. The camera able to
recognize Stegosaurus at the distances 25 cm. Next is Tyrannosaurus, where the
camera recognizes at distance 20 cm. The camera started to recognize the orange
coloured Triceratop at the distance of 15 cm.
On the other hand, the camera was able to recognize multiple objects in muddy
water at certain distances. At distance of 20 cm, the camera could only recognized
Stegosaurus. The camera started to recognize all objects at a distance of 15 cm,
but only the Tyrannosaurus was undetected in muddy underwater. Figure 17b
shows the results. Light is comprised of wavelengths of light, and every wavelength
is a specific colour. As results, the pixy CMUcam5 recognize longest wavelength
and then follow by the lowest wavelength from the light’s visible spectrums.

3.4 Underwater Experiment on Recognizing and Tracking


a Single Object

This experiment has been carried out to investigate the effectiveness of Pixy
CMUcam5 to recognition an object in underwater and track the object. The object
used to recognize and track was a pink coloured Spinosaurus.
Experimental Steps. The steps for this experiment are as follows:
1. Supply 9 V LiPo battery to ECS for thrusters.
2. Connect Pixy CMUcam5 to Arduino Mega.
3. Use 5 V power supply to Arduino Mega.
4. Upload a source code to Arduino Mega.
5. The electronic components are placed inside underwater compartment.
6. The object is place in 10 m underwater depth.
7. The camera from object distance is 20 cm and continuous move an object from
left to right.
8. The distances for object to recognize and track are recorded.
Experimental Results. Figures 18, 19, 20 and 21 show the experimental results.
From the experiments, the system is able to perform the desire tasks where the pixy
CMUcam5 able to recognise Spinosaurus in a clear underwater and tracking the
Development of Autonomous Underwater Vehicle Equipped … 53

Left thruster

Right thruster

Fig. 18 The direction of thrusters moving to the right. As can be seen thrusters on the left is
rotating based on the produced bubbles

Left thruster

Right thruster

Fig. 19 The direction of thrusters moving to the left. As can be seen, the thruster on the right is
rotating
54 M. H. Abu Mangshor et al.

Left thruster

Right thruster

Fig. 20 The direction of thrusters are moving forward. As can be seen both thrusters are rotating

Left thruster

Right thruster

Fig. 21 All thrusters stopped. As can be seen both thrusters are not rotating

Spinosaurus. When the Spinosaurus is moved to the left, the thruster A stopped and
the thruster B stopped hence it turned to left. Then, the thrusters A was activated
and the thruster B is stopping hence it turned to right. When Spinosaurus was
moved backwards, the thruster A and thruster B were activated to move forward to
track the object. Lastly, when the distance between the Spinosaurus and the camera
is 10 cm, the thrusters stopped.
Development of Autonomous Underwater Vehicle Equipped … 55

3.5 Summary

Every step that has been taken plays an essential role in order to successfully
develop a fully functional AUV. From sketching up the structure of the AUV by
using computer software until assembling the AUV, each procedure was very
crucial in the process of developing the AUV. Since the AUV will remain sub-
merged, it is imperative to guarantee all the electronic components is water proof
and would not leak to water. The experimental results show that the camera was
able to recognize a single and multiple objects underwater especially for clear
water. The thrusters have been operated as desired where the direction of thrusters
follow the position of object.

4 Conclusion

This paper describes the development of an autonomous underwater vehicle


equipped with object recognition and tracking system. In this paper, the hardware
and software designs of the AUV has been described. The AUV is installed with a
Pixy CMUcam5 camera for object recognition and tracking system. Based on
preliminary object recognizing experiments, the Pixy CMUcam5 is capable to
recognize single and multiple objects underwater. It has been observed that the Pixy
CMUcam5 starts recognizing objects at a distance of 30 cm for clear water. While
in muddy water condition, it was difficult for the Pixy CMUcam5 to recognize
objects. This is maybe due to the fact that CMUcam5 utilizes colour-based algo-
rithm. Furthermore, experiments related to thrusters showed that the thruster rotated
based on input from the image captured from the Pixy CMUcam5.
In conclusion, the objective of the project is to design and develop an AUV
equipped with object recognition and tracking system is successfully independently.
Lastly, improvement to be considered in future projects include using high-end
vision system which can monitor a real-time underwater. As a camera that can
perform in multiple types of water so that the AUV not limited to clear water only
but also muddy waters.

Acknowledgements The authors would like to thank the Research Management Center (RMC),
UTHM and Ministry of Higher Education for sponsoring the research under Tier 1 Research
Grants (Vot H161).

References

1. Levin LA et al (2019) Global observing needs in the deep ocean. Front Mar Sci 6(241):1–32
2. Spears A et al (2016) Under Ice in Antarctica: the icefin unmanned underwater vehicle
development and deployment. IEEE Robot Autom Mag 23(4):30–41
56 M. H. Abu Mangshor et al.

3. Ribas D et al (2015) I-AUV mechatronics integration for the TRIDENT FP7 project. IEEE/
ASME Trans Mechatron 20(5):2583–2592
4. Ambar RB, Sagara S (2015) Development of a master controller for a 3-link dual-arm
underwater robot. Artif Life Robotics 20:327–335
5. Yuh J (2000) Design and control of autonomous underwater robots: a survey. Auton Robots 8
(1):7–24
6. Khatib O et al (2016) Ocean one: a robotic avatar for oceanic discovery. IEEE Robot Autom
Mag 23(4):20–29
7. Techopedia: Machine Vision System (MVS). https://www.techopedia.com/definition/30414/
machine-vision-system-mvs. Accessed 21 Feb 2019
Dual Image Fusion Technique
for Underwater Image Contrast
Enhancement

Chern How Chong, Ahmad Shahrizan Abdul Ghani,


and Kamil Zakwan Mohd Azmi

Abstract Underwater imaging has received increasing attention in recent years. Attenuation of light causes underwater images to have poor contrast and deteriorated color, and these images usually appear foggy and hazy. In this paper, a new approach to enhance underwater images is proposed, which integrates a dehazing method, homomorphic filtering and image fusion. The dehazing method consists of a multi-scale fusion technique, which applies weight maps in the pre-processing step. Homomorphic filtering and image fusion are then applied to the resultant image for contrast and color enhancement. Qualitative and quantitative evaluations are performed to analyze the performance of the proposed method. The results show the superiority of the proposed method in terms of contrast, image details, colors, and entropy. Moreover, the implementation of a Raspberry Pi with a Picamera as a standalone underwater image processing device is also demonstrated.

Keywords Underwater image · Contrast · Color · Multi-scale fusion · Standalone prototype device

1 Introduction

The physical features of an object are captured and stored as an image by a capturing device such as a camera, a telescope, or the built-in camera module of a computing device. Images can therefore exist in varied forms. In the digital domain, a digital image is represented as a two-dimensional (2D) rectangular matrix of sampled values, and the quantized sample values are referred to as picture elements, or pixels.

C. H. Chong  A. S. Abdul Ghani (&)  K. Z. Mohd Azmi


Faculty of Manufacturing and Mechatronic Engineering Technology, Universiti Malaysia
Pahang, 26600 Pekan, Malaysia
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2021 57


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_5
58 C. H. Chong et al.

The properties of the image itself can be quantified and processed for further analysis in order to illustrate the characteristics and properties of an image.

Fig. 1 Different wavelengths of light are attenuated at different rates in water
As reported by Abdul Ghani [1], most images captured in a water medium have qualities (e.g. color and contrast) that are affected by the intrinsic properties of the surrounding medium. An object captured underwater is overshadowed by a blue-green color cast. This creates an undesirable condition in which the genuine characteristics and natural color of an underwater object are falsely interpreted. Moreover, the capturing device (i.e. the camera) can also degrade the underwater image; an inadequate camera specification may induce various types of noise in the output image. Therefore, these issues need to be resolved in order to obtain better-quality underwater images.
Nowadays, underwater image processing is gradually becoming one of the more challenging fields of study for researchers. The fundamentals of image formation in a water medium are described briefly here to aid understanding of the underwater imaging process.
Light attenuation, illustrated in Fig. 1, causes underwater images to suffer from low quality and poor contrast [2]. In some experiments the light source is supplemented with artificial light to rectify the illumination underwater, yet this introduces other lighting issues; an image captured with an artificial light source tends to have a bright spot in the center of the frame. Moreover, absorption and scattering effects further degrade the contrast of underwater images.
Many techniques have been introduced by researchers to enhance underwater image quality, and advances in underwater image processing can help to ease the overall progress of marine exploration. For instance, Chiang and Chen [3] developed underwater image enhancement by wavelength compensation and dehazing to compensate for the attenuation discrepancy along the light propagation path. In 2017, Abdul Ghani and Mat Isa [4] introduced a new underwater image enhancement method that modifies the image histograms column-wise in accordance with the Rayleigh distribution. In another report, Mohd Azmi et al. [5] proposed a method that focuses on enhancing deep underwater images. They have also successfully integrated a swarm-intelligence algorithm to further enhance the effectiveness of their image enhancement method [6, 7].
In 2017, Peng and Cosman [8] proposed a depth estimation method for underwater scenes based on image blurriness and light absorption for underwater image enhancement. The visibility of the output image can be improved through this method; however, the blue-green color cast problem is not significantly reduced. In 2018, Ancuti et al. [9] offered a single-image approach that builds on the blending of two images directly derived from a color-compensated and white-balanced version of the original image. This method is proven effective in improving turbid underwater images; however, for deep underwater images, it tends to produce a reddish effect. Recently, Kareem et al. [10] applied an integrated color model with Rayleigh distribution (ICMRD) in their proposed method. The ICMRD approach operates in the YCbCr color space for image enhancement. The blue-green color cast is successfully reduced through this method; however, the image contrast remains low.
In this paper, the image enhancement technique is presented together with a Graphical User Interface (GUI) application that displays the comparison between the raw input image and the processed output image. Moreover, the proposed method is extended by using a Raspberry Pi [11, 12] as the computing platform to run the underwater image processing. A visual aid with the GUI is developed to compare the results, and a standalone prototype device is also designed for underwater image acquisition applications.

2 New Approach for Underwater Image Contrast Enhancement

In this work, the homomorphic filtering and image fusion with dehazing (HFIFD) technique is introduced for underwater image enhancement. First, the input image is subjected to a dehazing process in order to reduce the haziness in the underwater image. The dehazing method is a pre-processing procedure that splits the input image into two separate images, which are improved through white balancing and contrast enhancement, respectively. Luminance, chromatic and saliency weight maps are then computed for both images, and all the outputs are fused together to produce the dehazed image, as shown in Fig. 2.
This dehazing technique is necessary to eliminate unwanted distortion elements in the image. The white balancing process aims to remove unreal colors and chromatic casts introduced by the atmospheric color; the shades-of-gray color constancy technique is applied in this process for better computational efficiency. Meanwhile, contrast enhancement is applied to the second image by using an adaptive histogram equalization technique, which enhances the contrast of each RGB channel by applying histogram equalization to the intensity of the whole frame.
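As a concrete illustration, the sketch below shows one way the two fusion inputs could be produced with OpenCV and NumPy: a shades-of-gray white balance and a per-channel adaptive histogram equalization. The function names, the Minkowski norm p = 6 and the CLAHE settings are illustrative assumptions, not the authors' exact implementation.

```python
import cv2
import numpy as np

def shades_of_gray_white_balance(img_bgr, p=6):
    """Shades-of-gray color constancy: scale each channel so that its
    Minkowski p-norm matches the overall gray level."""
    img = img_bgr.astype(np.float32)
    norms = np.power(np.mean(np.power(img, p), axis=(0, 1)), 1.0 / p)
    gain = norms.mean() / (norms + 1e-6)
    return np.clip(img * gain, 0, 255).astype(np.uint8)

def clahe_contrast_enhance(img_bgr, clip=2.0, tiles=(8, 8)):
    """Adaptive histogram equalization applied to each color channel."""
    clahe = cv2.createCLAHE(clipLimit=clip, tileGridSize=tiles)
    return cv2.merge([clahe.apply(c) for c in cv2.split(img_bgr)])

img = cv2.imread("underwater.jpg")            # raw underwater image (assumed path)
input_1 = shades_of_gray_white_balance(img)   # first fusion input
input_2 = clahe_contrast_enhance(img)         # second fusion input
```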

Fig. 2 Block diagram of the dehazing method with fusion technique: the underwater image is split into a white-balanced image and a contrast-enhanced image, luminance, chromatic and saliency weight maps are computed, and the images are fused to produce the dehazed image

Then, weight maps are applied to the white-balanced image and the contrast-enhanced image, as the previous enhancement alone is insufficient to restore the quality of the underwater image. Luminance, chromatic and saliency weight maps are introduced and applied to the resultant images to improve the visibility and the color of the underwater image. The luminance weight map is applied because some color reduction occurs after the white balancing step; it assigns a higher saturation value to regions with better visibility and a lower saturation value to other regions. The chromatic weight map is then introduced by working on the saturation gain of the input image. Object edges in certain regions are considered the informative parts of the image and should be distinguished from their surroundings, as they possess important features; therefore, the saliency weight map is applied to emphasize those regions so that they can be easily seen. These three weight maps employed in the dehazing process play critical roles in enhancing the image quality and reducing haziness.
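The snippet below sketches one common formulation of these weight maps and a simple single-scale fusion, loosely following the fusion scheme of Ancuti et al. [2]. The exact weight definitions and the multi-scale (pyramid) blending used by the authors may differ; everything here is an assumption for illustration, and input_1/input_2 are the two images from the previous sketch.

```python
import cv2
import numpy as np

def weight_maps(img_bgr):
    img = img_bgr.astype(np.float32) / 255.0
    b, g, r = cv2.split(img)
    lum = (b + g + r) / 3.0
    # Luminance weight: how strongly the channels deviate from the luminance
    w_lum = np.sqrt(((r - lum) ** 2 + (g - lum) ** 2 + (b - lum) ** 2) / 3.0)
    # Chromatic weight: favor well-saturated regions
    sat = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)[:, :, 1].astype(np.float32) / 255.0
    w_chr = np.exp(-((sat - 1.0) ** 2) / (2 * 0.25 ** 2))
    # Saliency weight: distance from the mean color in the Lab space
    lab = cv2.cvtColor(cv2.GaussianBlur(img_bgr, (5, 5), 0), cv2.COLOR_BGR2LAB).astype(np.float32)
    w_sal = np.linalg.norm(lab - lab.mean(axis=(0, 1)), axis=2)
    w_sal /= (w_sal.max() + 1e-6)
    return w_lum + w_chr + w_sal

def fuse(img1, img2):
    w1, w2 = weight_maps(img1), weight_maps(img2)
    total = w1 + w2 + 1e-6
    fused = (w1 / total)[..., None] * img1 + (w2 / total)[..., None] * img2
    return np.clip(fused, 0, 255).astype(np.uint8)

dehazed = fuse(input_1, input_2)
```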

Fig. 3 Block diagram of homomorphic filtering and image fusion with dehazing (HFIFD): dehazed image → homomorphic filtering → histogram matching → dual-image global stretching → local stretching as post-processing → image sharpening for the final output image

After the pre-processing (dehazing) step is done, homomorphic filtering is applied to the resultant image to enhance and restore the natural colors of the underwater image, as shown in Fig. 3. A Butterworth filter is applied within the homomorphic filtering to suppress the low-frequency component of the image.
However, homomorphic filtering alone is inadequate to improve the underwater image, as the bluish or greenish illumination tends to remain in the background. Therefore, a histogram matching method is utilized in the filtering process to boost the inferior and intermediate color channels. In this step, the inferior and intermediate color channels are matched to the dominant color channel. This process automatically increases the influence of the inferior and intermediate color channels while the dominance of the dominant channel is reduced. Then, dual-image global stretching, local stretching, and image sharpening are applied to further enhance the image contrast.
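A minimal sketch of the homomorphic filtering step is given below, assuming a Butterworth high-pass emphasis filter as described above. The cut-off frequency d0, the filter order n and the gain terms gamma_l/gamma_h are illustrative assumptions rather than the authors' settings.

```python
import cv2
import numpy as np

def homomorphic_filter(channel, d0=30.0, n=2, gamma_l=0.5, gamma_h=1.5):
    """Attenuate the low-frequency illumination and boost high-frequency
    detail of a single channel in the log domain."""
    rows, cols = channel.shape
    log_img = np.log1p(channel.astype(np.float32))
    spectrum = np.fft.fftshift(np.fft.fft2(log_img))
    u = np.arange(rows) - rows / 2.0
    v = np.arange(cols) - cols / 2.0
    dist = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)
    butter_hp = 1.0 / (1.0 + (d0 / (dist + 1e-6)) ** (2 * n))   # Butterworth high-pass
    emphasis = gamma_l + (gamma_h - gamma_l) * butter_hp
    filtered = np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * emphasis)))
    out = np.expm1(filtered)
    return cv2.normalize(out, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

dehazed = cv2.imread("dehazed.jpg")   # output of the dehazing step (assumed path)
filtered = cv2.merge([homomorphic_filter(c) for c in cv2.split(dehazed)])
```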

3 GUI Application on Underwater Image Acquisition

MATLAB is used as the development platform in this work. In addition, a GUI application is designed and developed through MATLAB GUIDE to display the input and output of the processed underwater image. The GUI is developed to help users clearly see the difference between the raw underwater image and the processed image.
As shown in Fig. 4, axes and a push button have been placed on the GUI. The axes are divided into two and labeled to display the input and output images, respectively. The "Pick and Process" button is clicked to select the input image through the file selector function in MATLAB; the corresponding function is uigetfile(), which returns the filename and pathname. The input image type is defined as .jpg, which is a common image format, so the user can choose any underwater image in .jpg format as the input. Figure 5 shows the flowchart of the GUI application with the implementation of the proposed image enhancement technique.

Fig. 4 GUI for underwater image acquisition using MATLAB

Fig. 5 Flowchart for GUI application

4 Standalone Prototype Device for Underwater Image Acquisition Application

In the proposed method, a Raspberry Pi is used as the computing device for underwater image enhancement. The Raspberry Pi is a basic embedded system on a low-cost single-board computer that is commonly used to reduce the complexity of a system in real-time applications. Using the Raspberry Pi offers better opportunities than only observing simulation results. The interaction between the Raspberry Pi and the PC is handled by MATLAB and Simulink, where Simulink makes it possible to port MATLAB software to a variety of devices and platforms. MATLAB on the Raspberry Pi can operate both in a simulation mode, where the board is connected to a PC, and in a standalone mode, where the software is downloaded onto the board and runs independently of a PC.

The Raspberry Pi operates on special derivatives of the Linux operating system (OS). Six OS variants can be installed on the Raspberry Pi, including Raspbian, Pidora, OpenELEC, RaspBMC, RISC OS and Arch Linux. Raspbian is the most frequently used OS and is specifically developed for the Raspberry Pi. For the underwater imaging field, the Raspberry Pi is supported by programming environments (i.e. MATLAB and Simulink) integrated by MathWorks. The MATLAB support package enables the development of algorithms that can run on the Raspberry Pi. It also allows peripheral devices connected to the board to be controlled through its GPIO interfaces, namely serial, I2C and SPI, as well as a camera module, via command functions in the MATLAB command window. The performance of the Raspberry Pi as the computing platform helps researchers to study and analyze the phenomena that exist in the underwater environment.
To capture live images from the underwater environment, the Raspberry Pi Camera Module is utilized. The Picamera is used because its support is built into Raspbian and it connects easily to the Raspberry Pi board via a short ribbon cable. Live still images can be captured through the Picamera module, and its 8-megapixel sensor is capable of capturing images of good quality. Moreover, a 5 in. TFT display with a mini panel-mountable HDMI monitor is used to display the Raspbian operating system, since the Raspberry Pi board does not come with a display; the display is a common 800 × 400 HDMI panel made for the Raspberry Pi. For the power source, a portable PINENG power bank with 20,000 mAh capacity is used to power the Raspberry Pi board.
OpenCV is an open-source computer vision and machine learning software library that is also aimed at real-time computer vision. In the proposed method, OpenCV is used, with the code written in Python, to implement the homomorphic filtering process for underwater image enhancement. Python 2 IDLE is used as the programming environment to write the algorithm for the homomorphic filtering method. The libraries for both the Picamera and OpenCV are imported into the programming environment to fully utilize their features (Fig. 6).
The Picamera captures the input image, and the image is saved into a prepared folder for storage. The captured image is read using the OpenCV function cv2.imread, which reads an RGB image into BGR channel order; another function, cv2.imwrite, writes the final enhanced output image back as an RGB image. The input image is then processed with adaptive histogram equalization: the image is divided into small blocks using a 2 × 2 tile grid size, and the histogram of each tile is equalized. A contrast-limiting parameter is also applied to prevent noise amplification if noise is present in the blocks; the pixels in the input image are clipped and distributed evenly to other bins before the adaptive histogram equalization is applied.
The next step is to split the processed image, after adaptive histogram equalization, into its B, G, and R color channels. Contrast adjustment is applied by normalizing all three B, G, and R color channels in order to adjust the color and contrast of the image. The function processes each color band (BGR) and determines the minimum and maximum values in each of the three bands. Each color channel has the same minimum value but a different maximum value, and the minimum and maximum values lie in the range 0–255 since the input image is 8-bit. The normalized BGR color channels are then merged together and adaptive histogram equalization is performed again. The output of the merging is then subjected to homomorphic filtering for the final enhancement; a Gaussian high-pass filter is used in this homomorphic filtering step.

Fig. 6 Block diagram of the process flow of the homomorphic filtering method on the Raspberry Pi microprocessor module
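A condensed sketch of this processing chain is shown below: CLAHE on a 2 × 2 tile grid, per-channel min–max normalization, a second CLAHE pass, and a homomorphic filter with a Gaussian high-pass, mirroring the Butterworth sketch given earlier. File paths, the clip limit and the Gaussian cut-off sigma are assumptions; capturing the frame itself would be done with the picamera library on the board.

```python
import cv2
import numpy as np

def clahe_bgr(img, clip=2.0, tiles=(2, 2)):
    clahe = cv2.createCLAHE(clipLimit=clip, tileGridSize=tiles)
    return cv2.merge([clahe.apply(c) for c in cv2.split(img)])

def normalize_channels(img):
    # Stretch each of the B, G and R bands to the full 0-255 range
    return cv2.merge([cv2.normalize(c, None, 0, 255, cv2.NORM_MINMAX) for c in cv2.split(img)])

def homomorphic_gaussian(channel, sigma=30.0, gamma_l=0.5, gamma_h=1.5):
    rows, cols = channel.shape
    spectrum = np.fft.fftshift(np.fft.fft2(np.log1p(channel.astype(np.float32))))
    u = np.arange(rows) - rows / 2.0
    v = np.arange(cols) - cols / 2.0
    gauss_hp = 1.0 - np.exp(-(u[:, None] ** 2 + v[None, :] ** 2) / (2.0 * sigma ** 2))
    emphasis = gamma_l + (gamma_h - gamma_l) * gauss_hp
    out = np.expm1(np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * emphasis))))
    return cv2.normalize(out, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

img = cv2.imread("capture.jpg")                # frame saved by the Picamera (assumed path)
step1 = clahe_bgr(img)                         # adaptive equalization on 2x2 tiles
step2 = clahe_bgr(normalize_channels(step1))   # normalize each channel, equalize again
enhanced = cv2.merge([homomorphic_gaussian(c) for c in cv2.split(step2)])
cv2.imwrite("enhanced.jpg", enhanced)
```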
Figure 7 shows the interface generated to display the comparison of the raw input image and the enhanced output image on the Raspberry Pi. The left side of the window shows the raw input image loaded from the database of sample images, and the right side shows the output image that has been enhanced through the proposed method. Both windows are generated using Python IDLE.

Fig. 7 Raw image and enhanced image displays shown on the Raspberry Pi with an HDMI display

5 Results and Discussion

Five sample images are used to test the effectiveness of the proposed method,
namely fish 1, coral 1, stone, fish 2, and coral 2. The performance of the proposed
method is compared with homomorphic filtering, gray world [13], CLAHE, and
contrast adjustment. The resultant images produced by all methods are shown in
Figs. 8, 9, 10, 11 and 12.

Fig. 8 Comparison of fish 1 images, a Original image; b Homomorphic filtering; c Gray world;
d CLAHE; e Contrast adjustment; f Proposed HFIFD method

Fig. 9 Comparison of coral l images, a Original image; b Homomorphic filtering; c Gray world;
d CLAHE; e Contrast adjustment; f Proposed HFIFD method

Fig. 10 Comparison of stone images, a Original image; b Homomorphic filtering; c Gray world;
d CLAHE; e Contrast adjustment; f Proposed HFIFD method

The original image of fish 1 is affected by a bluish color cast and the objects are hardly visible. The homomorphic filtering method shows a promising result, as the bluish color cast is significantly reduced. Meanwhile, gray world tends to generate a reddish output image. CLAHE inadequately improves the original image, as the bluish color cast remains in the image. The contrast adjustment method is able to reduce the bluish color cast in the foreground; however, this effect is retained in the background.

Fig. 11 Comparison of fish 2 images, a Original image; b Homomorphic filtering; c Gray world;
d CLAHE; e Contrast adjustment; f Proposed HFIFD method

Fig. 12 Comparison of coral 2 images, a Original image; b Homomorphic filtering; c Gray world; d CLAHE; e Contrast adjustment; f Proposed HFIFD method

On the other hand, the proposed method is able to reduce the bluish color cast significantly. The image contrast is also well improved, as the fishes can be seen clearly.
The original image of coral 1 has poor contrast and the real color of the object is overshadowed by the bluish color cast. The homomorphic filtering method is able to reduce the bluish color cast; however, the image contrast is insufficiently enhanced. The gray world method over-enhances the foreground color, as a reddish color cast dominates that region. There is no significant improvement made by CLAHE, as the bluish color cast remains in the output image. Similar to gray world, the contrast adjustment method tends to produce a reddish color cast in the foreground. On the other hand, the proposed method is able to improve the image contrast adequately, and the bluish color cast is also significantly reduced. A similar trend can be seen in the other tested images, where the proposed method successfully recovers the image contrast and the visibility of objects is improved.
To support the visual observation, quantitative evaluation metrics are used, namely entropy [14], MSE [15], and PSNR [15]. Entropy measures the image information content; a high entropy is preferred, as it indicates that the resultant image contains more information. Meanwhile, MSE and PSNR are quantitative metrics used to compare the original image with the improved image; a high noise level in an image is indicated by a high MSE value and a low PSNR value.
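For reference, the standard definitions of these metrics, assuming an 8-bit original image I and enhanced image K of size M × N with gray-level probabilities p_i, are:

H = −Σ_{i=0}^{255} p_i log₂ p_i

MSE = (1/MN) Σ_{x=1}^{M} Σ_{y=1}^{N} [I(x,y) − K(x,y)]²

PSNR = 10 log₁₀(255² / MSE)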
As shown in Table 1, for all tested images, the proposed method obtains the highest entropy value, indicating that it is able to produce output images with more details and information. For the MSE and PSNR evaluations, the proposed method is in fourth place for images fish 1, coral 1, fish 2, and coral 2, and in fifth place for image stone. Nevertheless, this does not necessarily mean that the proposed method is inferior to the other methods: the quantitative evaluation metrics used are subjective and have difficulty correctly measuring the enhancements made by an image enhancement technique [16]. In some cases, some performance metrics fail to produce a result that agrees with the human perception of image quality [7].
For example, for image fish 1, the gray world method obtains a better MSE score (3.521) than the proposed method (6.802). However, according to visual observation, the output image produced by gray world looks reddish and the image contrast is inadequately improved, while the proposed method adequately reduces the bluish color cast and significantly improves the image contrast, as the fish can be seen clearly. Therefore, in terms of image quality comparison, qualitative visual evaluation by the human visual system is taken as the first priority for the overall image quality evaluation [4].
On the other hand, the GUI has been successfully developed with MATLAB. The performance of this application in enhancing underwater images is promising, since the required computational time is short: each image requires 2–3 s to be processed and enhanced. Compared to the GUI, the Raspberry Pi requires a longer computational time to process an underwater image; on average, this application takes about 21 s to improve an underwater image.

Table 1 Quantitative results in terms of entropy, MSE, and PSNR

Image    Method                  Entropy   MSE       PSNR
fish 1   Original                7.463     –         –
         Homomorphic filtering   7.865     5.920     40.408
         Gray world              6.404     3.521     42.664
         CLAHE                   6.940     11.078    37.686
         Contrast adjustment     7.419     2.948     43.436
         HFIFD                   7.870     6.802     39.804
coral 1  Original                7.591     –         –
         Homomorphic filtering   7.779     55.592    30.681
         Gray world              7.141     143.191   26.572
         CLAHE                   7.466     20.487    35.016
         Contrast adjustment     7.491     32.673    32.989
         HFIFD                   7.878     60.119    30.341
stone    Original                7.557     –         –
         Homomorphic filtering   7.886     31.451    33.154
         Gray world              7.494     22.906    34.531
         CLAHE                   7.653     8.298     38.941
         Contrast adjustment     7.608     9.549     38.331
         HFIFD                   7.888     40.308    32.077
fish 2   Original                7.529     –         –
         Homomorphic filtering   7.863     5.635     40.622
         Gray world              6.647     3.010     43.345
         CLAHE                   7.254     8.093     39.049
         Contrast adjustment     7.441     3.102     43.214
         HFIFD                   7.889     7.589     39.329
coral 2  Original                7.181     –         –
         Homomorphic filtering   7.553     38.944    32.226
         Gray world              6.734     266.114   23.880
         CLAHE                   7.047     28.447    33.590
         Contrast adjustment     7.087     19.058    35.330
         HFIFD                   7.735     40.022    32.108

6 Conclusion

The proposed image enhancement method has proven to be effective in enhancing underwater images in terms of color, contrast and image details. Qualitative and quantitative evaluations have been performed to evaluate and justify the performance of the proposed method, and the sample images tested show its effectiveness. In addition, a GUI application has been successfully developed for processing underwater images, and it displays the comparison between the input image (raw image) and the output image (enhanced image). The implementation of the Raspberry Pi device in an underwater image acquisition application has also been successfully produced: an image is taken with the Picamera and its quality is then improved through the proposed method. The image quality produced through the Raspberry Pi also shows satisfactory results.

Acknowledgements The research is supported by Universiti Malaysia Pahang (UMP) research grant RDU1803131 entitled "Development of Multi-Vision Guided Obstacle Avoidance System for Ground Vehicle". The sample images and some related references are taken from the database https://sites.google.com/ump.edu.my/shahrizan/database-publication.

References

1. Abdul Ghani AS (2015) Improvement of underwater image contrast enhancement technique based on histogram modification. Thesis, Universiti Sains Malaysia. Accessed Jan 2019
2. Ancuti C, Ancuti CO, Haber T, Bekaert P (2012) Enhancing underwater images and videos
by fusion. In: Proceedings of the IEEE computer society conference on computer vision and
pattern recognition, pp 81–88
3. Chiang JY, Chen YC (2012) Underwater image enhancement by wavelength compensation
and dehazing. IEEE Trans Image Process 21(4):1756–1769
4. Abdul Ghani AS, Mat Isa NA (2017) Automatic system for improving underwater image
contrast and color through recursive adaptive histogram modification. Comput Electron Agric
141:181–195
5. Mohd Azmi KZ, Abdul Ghani AS, Md Yusof Z, Ibrahim Z (2019) Deep underwater image
enhancement through integration of red color correction based on blue color channel and
global contrast stretching. In: Md Zain Z et al (eds) Proceedings of the 10th national technical
seminar on underwater system technology 2018, LNEE, vol 538, pp 35–44. Springer,
Singapore
6. Mohd Azmi KZ, Abdul Ghani AS, Md Yusof Z, Ibrahim Z (2019) Deep underwater image
enhancement through colour cast removal and optimization algorithm. Imag Sci J 67(6):330–
342
7. Mohd Azmi KZ, Abdul Ghani AS, Md Yusof Z, Ibrahim Z (2019) Natural-based underwater
image color enhancement through fusion of swarm-intelligence algorithm. Appl Soft
Comput J 85:1–19
8. Peng Y, Cosman PC (2017) Underwater image restoration based on image blurriness and light
absorption. IEEE Trans Image Process 26(4):1579–1594
9. Ancuti CO, Ancuti C, De Vleeschouwer C, Bekaert P (2018) Color balance and fusion for
underwater image enhancement. IEEE Trans Image Process 27(1):379–393

10. Kareem HH, Daway HG, Daway EG (2019) Underwater image enhancement using colour
restoration based on YCbCr colour model. In: IOP conference series: materials science and
engineering, vol 571, pp 1–7
11. Horak K, Zalud L (2015) Image processing on raspberry pi in Matlab. Adv Intell Syst Comput
4:1–7
12. Patil VP, Gohatre UB, Singla CR (2018) Design and development of raspberry pi based
wireless system for monitoring underwater environmental parameters and image enhance-
ment. Int J Electron Electr Comput Syst 7(5):133–138
13. Buchsbaum G (1980) A spatial processor model for object colour perception. J Franklin Inst
310(1):1–26
14. Ye Z (2009) Objective assessment of nonlinear segmentation approaches to gray level
underwater images. ICGST J Graph Vis Image Process 9(II):39–46
15. Hitam MS, Awalludin EA, Wan Yussof WNJ, Bachok Z (2013) Mixture contrast limited
adaptive histogram equalization for underwater image enhancement. In: Proceeding of the
IEEE international conference on computer applications technology (ICCAT), pp 1–5
16. Rao SP, Rajendran R, Panetta K, Agaian SS (2017) Combined transform and spatial domain
based “no reference” measure for underwater images. In: Proceedings of the IEEE
international symposium on technologies for homeland security (HST), pp 1–7
Red and Blue Channels Correction
Based on Green Channel
and Median-Based Dual-Intensity
Images Fusion for Turbid Underwater
Image Quality Enhancement

Kamil Zakwan Mohd Azmi, Ahmad Shahrizan Abdul Ghani, and Zulkifli Md Yusof

Abstract One of the main problems encountered in processing turbid underwater images is the effect of a greenish color cast that overshadows the actual color of an object. This paper introduces a new technique that focuses on the enhancement of turbid underwater images. The proposed method integrates two major steps. The first step is specially designed to reduce the greenish color cast problem: the blue and red channels are improved according to the difference between these channels and the reference channel in terms of total pixel values. Then, the median-based dual-intensity images fusion approach is applied to all color channels to improve the image contrast. Qualitative and quantitative evaluations are used to test the effectiveness of the proposed method. The results show that the proposed method is very effective in improving the visibility of turbid underwater images.

Keywords Image processing · Turbid underwater image · Contrast stretching

1 Introduction

The features of turbid underwater images differ from those of deep underwater images: not only the red channel but also the blue channel is problematic, due to absorption by organic matter [1]. As a result, a greenish color cast dominates these images and makes the actual color of an object difficult to determine accurately. In addition, turbid underwater images also suffer from low contrast, resulting in poor image quality.
Based on the aforementioned issues, it is crucial for underwater researchers to focus on improving turbid underwater images. In this paper, an idea to


improve the visibility of turbid underwater images is presented. The proposed method involves two major steps: red and blue channels correction based on green channel, and median-based dual-intensity images fusion (RBCG-MDIF). The capability of the proposed method is validated through qualitative and quantitative evaluation results.
This paper is organized as follows: literature review is described in Sect. 2.
Section 3 discusses the motivation of this research. Section 4 provides a detail
explanation of the proposed method. In Sect. 5, the capability of the proposed
method is confirmed through qualitative and quantitative evaluation results. This
paper ends with a conclusion.

2 Related Works

The gray world (GW) assumption [2] is a famous method, which has been
employed to improve underwater images. This method assumes that all color
channels have the same mean value before attenuation. However, this method
inadequately enhances underwater images that are highly affected by a strong greenish effect, such as those captured in turbid underwater scenes.
Another well-known method which is frequently being used to compare the
effectiveness of a method is unsupervised color correction method (UCM) [3]. This
method is able to increase the image contrast. However, for turbid underwater
images, it tends to produce a yellowish output image.
In 2016, Abdul Ghani and Mat Isa [4] proposed an integrated-intensity
stretched-Rayleigh histograms method (IISR). In this method, each color channel is
multiplied by a gain factor in order to balance all the color channels. Based on
visual observation, for turbid underwater images, IISR over-enhances the greenish
effect, thus reducing the visibility of the objects.
Recently, Mohd Azmi et al. (2019) [5] proposed a method for deep underwater
image enhancement. It incorporates two main steps, which are red color correction
based on blue color channel (RCCB) and global contrast stretching (GCS). This
method is very effective in enhancing the attributes of deep underwater images, as it is able to reduce the bluish color cast significantly. However, it is less effective in improving the quality of turbid underwater images. In the next section, we explain how this method is modified and adapted for turbid underwater image enhancement.

3 Motivation

The RCCB step has shown excellent results in improving the features of deep underwater images [5]. This step works by modifying the red channel with regard to the difference between this channel and the blue channel in terms of the total pixel value.
However, this step is less effective in improving the quality of turbid underwater images. As mentioned earlier, the features of turbid underwater images differ from those of deep underwater images: not only the red channel but also the blue channel is problematic, due to absorption by organic matter [1].
The diver image in Table 1(a) is used to show the output image produced by the
RCCB step. The original image is entirely disguised by the greenish color cast
while the objects are hardly seen. According to the image histograms, the green
channel is dominant over the other color channels.
No changes can be seen in the output image generated by the RCCB step, and the image histograms also show no adjustment or improvement. This is because the RCCB step only improves the red channel by referring to the blue channel [5], whereas in a turbid scene the red and blue channels generally do not differ significantly, as shown in the histograms of the original image.
Therefore, this paper introduces a new idea to improve the RCCB step, con-
sidering the enhancements that need to be made to both red and blue channels. The
reference channel should be changed to the green channel, instead of the blue
channel as proposed in the RCCB step [5]. This is because the green channel is
usually superior to the other color channels in turbid underwater images.

Table 1 Resultant image and image histograms produced by the RCCB step: (a) original image; (b) RCCB [5]

4 Methodology: Red and Blue Channels Correction Based on Green Channel and Median-Based Dual-Intensity Images Fusion (RBCG-MDIF)

This section provides a detailed explanation of the proposed method. Figure 1 shows the flowchart of the proposed method, while Table 2 shows the resultant images and image histograms of each step of the proposed RBCG-MDIF method.

4.1 Red and Blue Channels Correction Based on Green Channel (RBCG)

To begin with, the image is decomposed into its red, green, and blue channels. Then, the total pixel values of the red channel, R_sum, the green channel, G_sum, and the blue channel, B_sum, are calculated. The green channel is chosen as the reference channel for the enhancement of the red and blue channels, as this color channel is usually dominant in turbid underwater scenes. Two gain factors, Y and Z, are obtained as follows:

Fig. 1 Flowchart of the proposed RBCG-MDIF method: input image → red and blue channels correction based on green channel (RBCG) → median-based dual-intensity images fusion (MDIF) → unsharp masking → output image

Table 2 Resultant images and image histograms of each step of the proposed RBCG-MDIF: (a) input image; (b) RBCG; (c) MDIF; (d) unsharp masking

Y = (G_sum − B_sum) / (G_sum + B_sum)    (1)

Z = (G_sum − R_sum) / (G_sum + R_sum)    (2)

The gain factor Y contains information concerning the difference between the green and blue channels in terms of total pixel value, while the gain factor Z contains the corresponding information for the green and red channels. This information is crucial to control the appropriate amount of pixel value that has to be added to the blue and red channels in order to reduce the greenish color cast: the larger the pixel value difference between the green channel and the other color channels, the larger the value that is added to improve the blue and red channels.

Fig. 2 Images and their respective histograms before and after the RBCG step (the green channel serves as the reference channel)

Then, the blue and red channels are improved through Eqs. (3) and (4), respectively. As shown in Fig. 2, the proposed RBCG is able to enhance the blue and red channels appropriately, thus significantly reducing the effect of the greenish color cast.

P_blue = P_blue + Y × P_green    (3)

P_red = P_red + Z × P_green    (4)

where P_red, P_green and P_blue are the pixel values of the red, green and blue channels, respectively.
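A direct NumPy sketch of the RBCG correction in Eqs. (1)–(4) is given below. Variable names follow the text; the input path and the clipping to the 8-bit range are assumptions added for illustration.

```python
import cv2
import numpy as np

img = cv2.imread("turbid.jpg").astype(np.float32)
b, g, r = cv2.split(img)                     # OpenCV loads images in BGR order

r_sum, g_sum, b_sum = r.sum(), g.sum(), b.sum()
Y = (g_sum - b_sum) / (g_sum + b_sum)        # Eq. (1): gain factor for the blue channel
Z = (g_sum - r_sum) / (g_sum + r_sum)        # Eq. (2): gain factor for the red channel

b_corr = np.clip(b + Y * g, 0, 255)          # Eq. (3)
r_corr = np.clip(r + Z * g, 0, 255)          # Eq. (4)

rbcg = cv2.merge([b_corr, g, r_corr]).astype(np.uint8)
```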

4.2 Median-Based Dual-Intensity Images Fusion (MDIF)

Then, the median-based dual-intensity images fusion approach is employed on all color channels to improve the image contrast. This phase starts with the determination of the minimum, median, and maximum intensity values of each image histogram.

Fig. 3 Illustration of histogram division at the median point and the stretching process: the original histogram, bounded by its minimum and maximum values, is split at the median into lower and upper stretched-regions

As shown in Fig. 3, based on the median point, each image histogram is separated into two regions, namely the upper and lower stretched-regions. Each region is then stretched according to Eq. (5), where P_in and P_out are the input and output pixels, respectively, and i_min and i_max represent the minimum and maximum intensity level values for the input image, respectively.
P_out = 255 × (P_in − i_min) / (i_max − i_min)    (5)

For each color channel, the separation at the median point and the global stretching processes produce two types of histograms: upper-stretched and lower-stretched histograms. All upper-stretched histograms are integrated to generate a new resultant image, and a similar process is performed on all lower-stretched histograms. These two images are then composed by averaging, as illustrated in Fig. 4.
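The sketch below gives one reading of the MDIF step: each channel is stretched twice with Eq. (5), once over the [minimum, median] range and once over the [median, maximum] range, and the resulting over-enhanced and under-enhanced images are averaged. The exact region handling in the authors' implementation may differ; this is an illustrative interpretation only.

```python
import numpy as np

def stretch(channel, lo, hi):
    # Eq. (5) applied with region bounds (lo, hi); values outside saturate at 0 or 255
    out = 255.0 * (channel - lo) / max(hi - lo, 1e-6)
    return np.clip(out, 0, 255)

def mdif(img_bgr):
    img = img_bgr.astype(np.float32)
    lower_img = np.zeros_like(img)   # stretched over [min, median] -> over-enhanced
    upper_img = np.zeros_like(img)   # stretched over [median, max] -> under-enhanced
    for c in range(3):
        ch = img[:, :, c]
        i_min, i_med, i_max = float(ch.min()), float(np.median(ch)), float(ch.max())
        lower_img[:, :, c] = stretch(ch, i_min, i_med)
        upper_img[:, :, c] = stretch(ch, i_med, i_max)
    fused = (lower_img + upper_img) / 2.0        # compose the two images by averaging
    return np.clip(fused, 0, 255).astype(np.uint8)

contrast_enhanced = mdif(rbcg)   # 'rbcg' from the previous sketch
```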

4.3 Unsharp Masking

The unsharp masking technique [6] is applied in the last step to improve the overall image sharpness. The fundamental idea of this method is to blur the original image first and then subtract the blurred image from the original; the resulting difference is then added back to the original image.

Fig. 4 Composition of the under-enhanced and over-enhanced images into the enhanced-contrast output image

This technique has proven effective in improving the quality of underwater images [7, 8]. Through this method, the blurred appearance of underwater objects can be sharpened, which can assist underwater researchers in better detecting objects such as plants or animals under the sea.
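A short sketch of the unsharp masking step: blur, subtract, and add back a scaled difference. The Gaussian kernel size and the gain amount are assumptions.

```python
import cv2
import numpy as np

def unsharp_mask(img, ksize=(5, 5), amount=1.0):
    blurred = cv2.GaussianBlur(img, ksize, 0)
    detail = img.astype(np.float32) - blurred.astype(np.float32)   # high-frequency detail
    sharpened = img.astype(np.float32) + amount * detail
    return np.clip(sharpened, 0, 255).astype(np.uint8)

final_output = unsharp_mask(contrast_enhanced)   # 'contrast_enhanced' from the MDIF sketch
```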

5 Results and Discussion

In this experiment, 300 underwater images are used to evaluate the performance of
the proposed RBCG-MDIF method. The proposed method is compared with gray
world (GW) [2], unsupervised color correction method (UCM) [3],
integrated-intensity stretched-Rayleigh (IISR) [4], and red channel correction based
on blue channel and global contrast stretching (RCCB-GCS) [5].
Besides visual observation, three quantitative performance metrics are used to
support the qualitative assessment, which are entropy [9], patch-based contrast
quality index (PCQI) [10], and natural image quality evaluator (NIQE) [11]. A high
entropy value indicates that a method is able to generate an output image with more
information, while a high PCQI value corresponds to high quality of image contrast.
On the other hand, a low NIQE value indicates a high degree of image naturalness
of the output image. Five sample underwater images are selected for comparison, as shown in Figs. 5, 6, 7, 8 and 9, while Table 3 shows the quantitative results for these sample images.
The original image of turbid image 1 has low contrast and the greenish color
cast overshadows the actual color of objects. Through comparison, GW produces a

Fig. 5 Processed images of turbid image 1 based on different methods: a Original image; b GW; c UCM; d IISR; e RCCB-GCS; f Proposed RBCG-MDIF

Fig. 6 Processed images of turbid image 2 based on different methods: a Original image; b GW; c UCM; d IISR; e RCCB-GCS; f Proposed RBCG-MDIF



Fig. 7 Processed images of turbid image 3 based on different methods: a Original image; b GW; c UCM; d IISR; e RCCB-GCS; f Proposed RBCG-MDIF

Fig. 8 Processed images of turbid image 4 based on different methods: a Original image; b GW; c UCM; d IISR; e RCCB-GCS; f Proposed RBCG-MDIF



Fig. 9 Processed images of turbid image 5 based on different methods: a Original image; b GW; c UCM; d IISR; e RCCB-GCS; f Proposed RBCG-MDIF

Table 3 Quantitative results in terms of entropy, PCQI, and NIQE

Image              Method               Entropy   PCQI    NIQE
(a) Turbid image 1 Original             7.556     1.000   3.822
                   GW                   7.030     0.943   3.769
                   UCM                  7.665     1.196   3.849
                   IISR                 7.113     1.107   4.026
                   RCCB-GCS             7.559     1.209   3.700
                   Proposed RBCG-MDIF   7.917     1.256   3.747
(b) Turbid image 2 Original             7.600     1.000   7.112
                   GW                   6.987     0.858   6.578
                   UCM                  7.762     1.101   4.828
                   IISR                 5.431     0.698   4.725
                   RCCB-GCS             7.490     1.141   5.112
                   Proposed RBCG-MDIF   7.942     1.166   3.959
(c) Turbid image 3 Original             7.266     1.000   7.767
                   GW                   6.639     0.846   6.310
                   UCM                  7.391     1.131   4.696
                   IISR                 4.779     0.756   4.619
                   RCCB-GCS             7.180     1.179   4.888
                   Proposed RBCG-MDIF   7.858     1.221   4.359
(d) Turbid image 4 Original             6.713     1.000   4.996
                   GW                   6.075     0.992   4.344
                   UCM                  7.301     1.209   6.947
                   IISR                 4.856     0.973   4.615
                   RCCB-GCS             6.630     1.421   4.783
                   Proposed RBCG-MDIF   7.719     1.442   4.774
(e) Turbid image 5 Original             7.674     1.000   5.999
                   GW                   7.033     0.940   5.279
                   UCM                  7.863     1.155   4.711
                   IISR                 6.796     1.033   4.943
                   RCCB-GCS             7.691     1.132   4.975
                   Proposed RBCG-MDIF   7.951     1.202   4.445

reddish output image that seems unnatural to the human visual system. Furthermore, this method insufficiently enhances the image contrast, as it produces the lowest values of entropy (7.030) and PCQI (0.943). UCM is able to reduce the greenish color cast; however, the bright region is occupied by a yellowish appearance. No major enhancement can be observed in the resultant image delivered by IISR, as this method further intensifies the greenish color cast; the high NIQE score (4.026) obtained by this method shows that the quality of its output image is worse than that of the original image. RCCB-GCS is able to lessen the greenish color cast; however, based on the quantitative analysis, this method obtains a low entropy value (7.559) that is almost the same as that of the original image (7.556). Meanwhile, the proposed RBCG-MDIF produces the best image quality, as the greenish color cast is extensively reduced. This better performance is also verified by the quantitative assessment in Table 3(a), where the proposed method obtains the highest scores for entropy and PCQI. For NIQE, the proposed method ranks second after the RCCB-GCS method; however, visual observation shows that the output image produced by the proposed method is better than that of RCCB-GCS, in which the greenish color cast remains in the background, as shown in Fig. 5(e).
In contrast to the previous tested image, the original image of turbid image 2 is affected by a strong greenish color cast that causes the actual color of the objects to be implicated with this effect. Instead of reducing the greenish color cast, GW introduces a reddish color cast in the output image, which causes the true color of the objects to be associated with this effect. UCM is able to improve the image contrast; however, this method produces a yellowish effect, especially in the background. Compared to the original image, the resultant image processed by IISR is worse: this method over-enhances the greenish effect, thus reducing the visibility of the objects. This outcome is supported by the quantitative analysis, where this method produces the lowest values of entropy (5.431) and PCQI (0.698). RCCB-GCS is able to improve the image contrast and reduces the greenish color cast, as the objects can be differentiated from the background; however, this method produces a large NIQE value (5.112), indicating poor image naturalness. On the other hand, the proposed RBCG-MDIF effectively reduces the greenish color cast and the image contrast is also well improved. This notable accomplishment is verified by the quantitative assessment in Table 3(b), as the proposed RBCG-MDIF obtains the best values of entropy, PCQI, and NIQE, with 7.942, 1.166, and 3.959, respectively.
Meanwhile, the original image of turbid image 3 is occupied by an intense greenish color cast, so the appearance of the objects is very limited. Through comparison, GW darkens the original image; this method also produces a high NIQE value (6.310), indicating poor naturalness of the processed image. UCM produces a yellowish effect in the output image while the greenish color cast persists in the background. IISR further degrades the original image, as the greenish color cast excessively overshadows the output image. RCCB-GCS successfully reduces the greenish color cast to some extent; however, this effect is retained in the background. On the other hand, the proposed RBCG-MDIF produces better image features than the other methods, as the greenish color cast is significantly reduced and the objects can be seen clearly. This prominent performance is confirmed by the quantitative assessment in Table 3(c), as the proposed method obtains the best scores for all performance metrics.
A similar trend can be observed in the other tested images, where the proposed RBCG-MDIF successfully reduces the greenish color cast and improves the image contrast. Table 4 reports the average quantitative scores over the 300 tested underwater images. The superior performance of the proposed method is further supported by this quantitative evaluation, as it attains the best rank for all performance metrics.

Table 4 Average quantitative results of 300 tested underwater images

Method               Entropy   PCQI    NIQE
Original             7.064     1.000   4.244
GW                   6.607     0.976   4.801
UCM                  7.571     1.194   4.615
IISR                 7.258     1.148   3.959
RCCB-GCS             7.287     1.192   3.836
Proposed RBCG-MDIF   7.775     1.279   3.808

Note The values in bold typeface represent the best result obtained in the comparison

6 Conclusion

The RBCG-MDIF method is specifically designed to solve turbid underwater image problems, in particular to reduce the greenish color cast effect and to improve the overall image contrast. This paper introduces a new idea to improve the RCCB step by considering the enhancements that need to be made to both the red and blue channels: the reference channel has been changed to the green channel, instead of the blue channel, for turbid underwater image enhancement. The capability of the proposed method in enhancing turbid underwater images is verified through qualitative and quantitative evaluation results.

Acknowledgements We would like to thank all reviewers for the comments and suggestions to
improve this paper. This study is supported by Universiti Malaysia Pahang (UMP) through
Postgraduate Research Grant Scheme (PGRS1903184) entitled “Development of Underwater
Image Contrast and Color through Optimization Algorithm”.

References

1. Lu H, Li Y, Xu X, Li J, Liu Z, Li X, Yang J, Serikawa S (2016) Underwater image enhancement method using weighted guided trigonometric filtering and artificial light correction. J Vis Commun Image Represent 38:504–516
2. Buchsbaum G (1980) A spatial processor model for object colour perception. J Franklin Inst
310(1):1–26
3. Iqbal K, Odetayo M, James A, Salam RA, Talib AZH (2010) Enhancing the low quality
images using unsupervised colour correction method. In: Proceedings of the IEEE
international conference on systems, man and cybernetics pp. 1703–1709
4. Abdul Ghani AS, Raja Aris RSNA, Muhd Zain ML (2016) Unsupervised contrast correction
for underwater image quality enhancement through integrated-intensity stretched-Rayleigh
histograms. J Telecommun Electron Comput Eng 8(3):1–7
5. Azmi KZM, Ghani, ASA, Md Yusof Z, Ibrahim Z (2019) Deep underwater image
enhancement through integration of red color correction based on blue color channel and
global contrast stretching. In: Md Zain Z, et al (eds) Proceedings of the 10th national technical
seminar on underwater system technology 2018. LNEE, vol 538. Springer, Singapore,
pp 35–44
6. Jain AK (1989) Fundamentals of digital image processing. Prentice Hall, Englewood Cliffs
7. Mohd Azmi KZ, Abdul Ghani AS, Md Yusof Z, Ibrahim Z (2019) Deep underwater image
enhancement through colour cast removal and optimization algorithm. Imaging Sci J 67
(6):330–342
8. Mohd Azmi KZ, Abdul Ghani AS, Md Yusof Z, Ibrahim Z (2019) Natural-based underwater
image color enhancement through fusion of swarm-intelligence algorithm. Appl Soft
Comput J 85:1–19
9. Ye Z (2009) Objective assessment of nonlinear segmentation approaches to gray level
underwater images. ICGST J Graph Vis Image Process 9(2):39–46
10. Wang S, Ma K, Yeganeh H, Wang Z, Lin W (2015) A patch-structure representation method for
quality assessment of contrast changed images. IEEE Signal Process Lett 22(12):2387–2390
11. Mittal A, Soundararajan R, Bovik AC (2013) Making a “completely blind” image quality
analyzer. IEEE Signal Process Lett 20(3):209–212
Analysis of Pruned Neural Networks
(MobileNetV2-YOLO v2)
for Underwater Object Detection

A. F. Ayob, K. Khairuddin, Y. M. Mustafah, A. R. Salisa, and K. Kadir

Abstract Underwater object detection involves identifying multiple objects within a dynamic and noisy environment. Such a task is challenging due to the inconsistency of moving shapes underwater (i.e. goldfish) within very dynamic surroundings (e.g. bubbles, miscellaneous objects). The application of pre-trained deep learning classifiers (e.g. AlexNet, ResNet, GoogLeNet and so on) as the backbone of several object detection algorithms (e.g. YOLO, Faster R-CNN and so on) has gained popularity in recent years; however, there is a lack of systematic study on reducing the size of the pre-trained neural networks and hence speeding up the object detection process in real-world applications. In this work, we investigate the effect of reducing the size of the pre-trained MobileNetV2 backbone of the YOLOv2 object detection framework to construct a fast, accurate and small neural network model for goldfish breed identification in real time.

Keywords Artificial neural network · Object detection · Underwater engineering · Ocean technology


1 Introduction

Deep learning is a branch of artificial neural networks concerned with developing a model that acts as a universal function approximator based on training data. In the field of underwater object detection, such a function approximator/model can be constructed without prior knowledge such as the depth of the water, a map of the surroundings, underwater occlusion or the temperature of the surroundings.
The underwater object detection approach presented in [1] utilized a combination of colour contrast, intensity and transmission information to identify the region of interest (ROI) in underwater images; however, unstable performance was reported in artificially illuminated environments. Sung et al. [2] presented the utilization of the You Only Look Once (YOLO) algorithm for underwater fish detection via transfer learning, adopting the original framework and training it on their custom dataset, but reported a very low frame rate (16.7 frames per second, FPS) on a GeForce Pascal Titan GPU. Xu and Matzner [3] presented the utilization of the third version of YOLO (YOLOv3) to perform underwater fish detection via transfer learning, but with a moderate mean average precision of mAP = 0.5392.
This paper addresses two questions concerning the effectiveness of deep learning frameworks in real-life applications:
1. The effect of utilizing many deep learning layers to solve for several classes within a dynamic underwater environment, with respect to detection time and model size.
2. Whether there is a need to utilize all the layers of a pre-trained deep learning model when it is used in a different situation.

2 Proposed Approach

2.1 You Only Look Once (YOLO) and YOLOv2

YOLO [4] is a single convolutional network that predicts object bounding boxes and class probabilities directly from full images in just one evaluation [5]. YOLO comes with its own benefits, one of which is that it is exceptionally fast: it does not need a complex pipeline, as it models detection as a regression problem [4]. YOLO uses regression as its final detection layer, which maps the output of the last fully connected layer to the final bounding boxes and class assignments [6]. The YOLO network consists of 24 convolutional layers followed by 2 fully connected layers [7], as shown in Fig. 1. Furthermore, YOLO reasons globally about

Fig. 1 Original YOLO architecture [4]

background. In addition, YOLO also learns the object general representation, means
that YOLO are able to detect the object in natural images and also in other domains
like artwork.
YOLOv2 [8], also known as YOLO9000, is an improved version of YOLO that is able to detect over 9000 object categories. When compared to Fast R-CNN, YOLO tends to make a significant number of localization errors [8], and it also suffers from low recall when compared to region proposal-based methods. In YOLOv2, anchor boxes are added to predict the bounding boxes [9]. Anchor boxes prove to be effective, as they allow the detection of multiple objects with varying aspect ratios within a single grid cell. Furthermore, YOLOv2 introduces dimension clustering, a clustering-based (k-means) bounding box parameterization that improves the mean Average Precision (mAP) of the detection.

2.2 MobileNet and MobileNetV2 Algorithms

MobileNet is built from depth-wise separable convolutions, each of which consists of two layers [10]: a depth-wise convolution and a 1 × 1 point-wise convolution, as shown in Fig. 2. Essentially, it performs a single convolution on each colour channel rather than combining all three and flattening the result. MobileNet models have been shown to exhibit a large accuracy gap against their floating-point counterparts despite successfully reducing parameter size and computation latency with separable convolutions [11].
In MobileNetV2, bottleneck convolutions are utilized [12]. The ratio between the size of the input and the inner size is referred to as the expansion ratio. Each bottleneck block contains an input followed by several bottlenecks, and shortcuts are used directly between the bottlenecks because the bottlenecks contain all the

Fig. 2 MobileNet
architecture [12]

Fig. 3 Two types of


bottleneck blocks
incorporated in MobileNetV2
[12]

necessary information while an expansion layer only acts as an implementation


detail that accompanies a non-linear transformation of the tensor, as shown in
Fig. 3. Instead of using classical residual block, where it connects the layers with
high number of channels, the inverted residuals are used where it connects the
bottlenecks. The inverted design is used as it is considerably more memory efficient
and works slightly better. Within the pre-trained MobileNetV2, a 16-blocks
architecture were incorporated. The16-blocks pre-trained MobileNetV2 model can
be obtained from [13].
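As an illustration of the inverted residual design, the following is a minimal PyTorch sketch of one bottleneck block (expansion, depth-wise convolution, linear projection, and a shortcut between bottlenecks of equal size); it is only a sketch under these assumptions, not the pre-trained MATLAB model [13] used in this work, and the channel sizes are illustrative.

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """Expand (1x1) -> depthwise (3x3) -> linear project (1x1), with a shortcut between bottlenecks."""
    def __init__(self, c_in, c_out, stride=1, expansion=6):
        super().__init__()
        hidden = c_in * expansion
        self.use_shortcut = (stride == 1 and c_in == c_out)
        self.block = nn.Sequential(
            nn.Conv2d(c_in, hidden, 1, bias=False), nn.BatchNorm2d(hidden), nn.ReLU6(),
            nn.Conv2d(hidden, hidden, 3, stride=stride, padding=1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU6(),
            nn.Conv2d(hidden, c_out, 1, bias=False), nn.BatchNorm2d(c_out),  # linear projection
        )

    def forward(self, x):
        y = self.block(x)
        return x + y if self.use_shortcut else y

x = torch.randn(1, 32, 56, 56)
print(InvertedResidual(32, 32)(x).shape)   # torch.Size([1, 32, 56, 56])
```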

2.3 Evaluation of Models

In order to evaluate the models, several evaluation metrics have been utilized,
namely Precision, Recall, Average Precision (AP) and the mean Average Precision
(mAP). From a human perspective, these metrics evaluate the skill of the model with
respect to its capability to mimic human performance in the detection task.
Given a number of queries:
a) Precision is defined as the ratio of true positive detections to the sum of all items
detected as positive, as shown in Eq. (1).

precision = true positives / (true positives + false positives)      (1)

b) Recall is defined as the ratio of true positive detections to the sum of the true
positives and false negatives, i.e. relative to all positive objects in the ground truth
data, as shown in Eq. (2).

recall = true positives / (true positives + false negatives)      (2)

c) Average Precision (AP) is defined as the area under the precision-recall curve
computed across the given queries.
In this work, the mean Average Precision (mAP) is calculated for each model
across a number of classes, as shown in Eq. (3).

mAP = ( Σ_{class=1}^{nClass} AP_class ) / nClass      (3)
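A minimal Python sketch of Eqs. (1)–(3) is given below; the detection confidences, true/false positive flags and ground-truth count are hypothetical.

```python
import numpy as np

def average_precision(scores, is_true_positive, num_ground_truth):
    """AP as the area under the precision-recall curve for one class."""
    order = np.argsort(-np.asarray(scores))              # rank detections by confidence
    tp = np.asarray(is_true_positive, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    precision = cum_tp / (np.arange(len(tp)) + 1)        # TP / (TP + FP) at each rank
    recall = cum_tp / num_ground_truth                   # TP / (TP + FN) at each rank
    return np.trapz(precision, recall)                   # area under the PR curve

def mean_average_precision(per_class_ap):
    return sum(per_class_ap) / len(per_class_ap)         # Eq. (3)

# Hypothetical detections for one class: confidences, TP/FP flags, 5 ground-truth objects
ap = average_precision([0.9, 0.8, 0.7, 0.6], [1, 1, 0, 1], num_ground_truth=5)
print(ap, mean_average_precision([ap, 0.95, 0.90]))
```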

2.4 Data Preparation

A four-minute video of free-swimming goldfish was recorded in the lab under a
controlled lighting setup, as shown in Fig. 4. This setup is adequate to simulate a
real-world application, where bubbles and other uncontrolled movements are
tolerated. The frame-by-frame images of the four-minute video were extracted,
which resulted in 11,444 images. A 60%–40% split of the images into training and
validation sets was applied. The training set was annotated/labelled with respect to
the goldfish breeds prior to the training of the YOLOv2 deep learning model.
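A minimal Python/OpenCV sketch of this preparation step is given below; the file names, frame naming scheme and the shuffling used for the 60%–40% split are assumptions and not the exact procedure used in this work.

```python
import cv2, os, random

def extract_frames(video_path, out_dir):
    """Dump every frame of the video as a JPEG image."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    count = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(os.path.join(out_dir, f"frame_{count:06d}.jpg"), frame)
        count += 1
    cap.release()
    return count

def split_train_val(image_files, train_ratio=0.6, seed=0):
    """Shuffle and split the frames into 60% training and 40% validation sets."""
    files = list(image_files)
    random.Random(seed).shuffle(files)
    cut = int(len(files) * train_ratio)
    return files[:cut], files[cut:]

# Hypothetical usage:
# n = extract_frames("goldfish.mp4", "frames")
# train, val = split_train_val(sorted(os.listdir("frames")))
```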

Fig. 4 QR-code link to the video results of the 6-classes Goldfish Breeds detection/identification
[14]

3 Results and Discussions

The experiments were conducted using the pre-trained MobileNetV2 model as the
backbone of the YOLOv2 detection framework. The initial pre-trained model
consists of 16 building blocks; in each experiment, the number of blocks was
systematically reduced by one (n−1) for each new training session. Each training
was conducted across 30 epochs with a mini-batch size of 16, using stochastic
gradient descent as the optimizer. The specification of the machine is an Intel i7
(8th generation) CPU, 16 GB of RAM and an RTX 2060 GPU with 6 GB of VRAM.
The deep learning models were trained using a dataset of 5,833 annotated goldfish
breed images, which consists of 6 classes of goldfish breeds: Calico Goldfish,
Blackmoor Goldfish, Common Goldfish, Lionhead Goldfish, Ryukin Goldfish and
Pearlscale Goldfish. The time taken for each experiment to complete was
approximately 4 h. Each newly trained model was analyzed qualitatively (via
videos) and quantitatively (Table 1) to measure its effectiveness.
The first step in evaluating the models is to observe the precision-recall (PR) curves,
which are presented in Figs. 8, 9 and 10. Across all the models (Block 1 to Block 16),
the shape of the PR curves is very similar, which indicates the consistency of the
training. In this work, Block 1, Block 8 and Block 16 were selected as indicative,
typical representatives. It can be observed that the models are able to perform with
high precision, even at a recall threshold of 0.5.
The video representation of the results can be accessed via the link provided in
the QR as shown in Fig. 4. The effectiveness of the detection model can be
observed qualitatively in the web-based demonstration and further elaborated in this
section.
Shown in Figs. 5, 6 and 7 are the results of the snapshot at time t = 1:28 min (named
the 'checkpoint') for the three models that were trained on the respective feature
depths Block 1, Block 8 and Block 16. Considering the whole 16 blocks that build
the pre-trained MobileNetV2, Block 1 represents 1/16 (6%) of the original
pre-trained model, while Block 8 represents 8/16 (50%) and Block 16 represents the
whole (100%) original pre-trained model.

Fig. 5 Video snapshot at t = 1:28 min of the detection using the Block 1 model

Fig. 6 Video snapshot at t = 1:28 min of the detection using the Block 8 model
Referring to the figures, it can be observed that at the checkpoint of t = 1:28 min,
Block 1 was able to detect 8 out of 11 goldfish in the aquarium, Block 8 was able to
detect all of the goldfish, and Block 16 again detected 8 out of 11. This qualitative
observation is closely related to the mAP of each model as reported in Table 1,
where Block 8 has the highest mAP compared with Block 1 and Block 16.

Fig. 7 Video snapshot of the t = 1:28 min of the detection using Block 16 model

Further inspection of Table 1 indicates that Block 16, with approximately 3.6 million
parameters, incurred the longest detection time, averaging 12.53 frames per second,
compared with Block 1 (17,328 parameters), which performed the fastest detection
at a rate of 56.64 frames per second. A more reasonable FPS (~24–25 frames per
second) for this case, with mAP close to ~97%, can be attributed to Block 8, Block 9
and Block 10, as shown in Table 1. In terms of possible extensions or future work,
for a non-critical, non-life-threatening application, such a reduction of model size
and parameter count is beneficial for mobile-based high-speed detection tasks such
as the one presented in this paper.

Table 1 Quantitative observation of the trained models across different evaluation metrics.
Highlighted are the most reasonable models with respect to mAP, FPS and size

Model name | Total number of parameters (×10^5) | Mean average precision (mAP) (%) | Mean frames per second (FPS) | Size of model (MB, decimal)
Block 16 | 36.0284 | 95.05 | 12.53 | 13.594
Block 15 | 17.478 | 94.66 | 12.61 | 6.787
Block 14 | 14.3692 | 94.41 | 12.75 | 5.596
Block 13 | 11.2604 | 94.44 | 12.74 | 4.405
Block 12 | 6.782 | 97.39 | 13.80 | 2.738
Block 11 | 5.654 | 97.08 | 15.04 | 2.29
Block 10 | 4.526 | 96.89 | 23.21 | 1.843
Block 9 | 2.95896 | 96.79 | 24.13 | 1.246
Block 8 | 2.45272 | 96.43 | 25.06 | 1.035
Block 7 | 1.94648 | 96.65 | 30.33 | 0.825
Block 6 | 1.44024 | 95.99 | 32.35 | 0.615
Block 5 | 0.67928 | 95.83 | 33.57 | 0.322
Block 4 | 0.54904 | 95.46 | 35.47 | 0.26
Block 3 | 0.4188 | 94.86 | 37.37 | 0.198
Block 2 | 0.24792 | 91.77 | 50.90 | 0.123
Block 1 | 0.17328 | 89.53 | 56.64 | 0.084

(Per-class precision-recall curves for the Block 1 model; panel titles: Calico Goldfish AP = 0.8, Black Moor Goldfish AP = 0.9, Common Goldfish AP = 0.9, Lionhead Goldfish AP = 0.9, Ryukin Goldfish AP = 1.0, Pearlscale Goldfish AP = 0.9. Each panel plots Precision against Recall.)

Fig. 8 Precision-recall graph for Block 1



(Per-class precision-recall curves for the Block 8 model; panel titles: Calico Goldfish AP = 1.0, Black Moor Goldfish AP = 1.0, Common Goldfish AP = 1.0, Lionhead Goldfish AP = 0.9, Ryukin Goldfish AP = 1.0, Pearlscale Goldfish AP = 1.0. Each panel plots Precision against Recall.)

Fig. 9 Precision-recall graph for Block 8

4 Conclusions

In this work, we have presented a case study that investigates the effect of reducing
the neural network layers of the original MobileNetV2 from a '16-Block' to a
'1-Block' architecture. The decrease in the number of layers reduces the number of
learnable parameters in the deep learning network from approximately 3.6 million
(Block 16) to 17,328 (Block 1). Important observations regarding the effect of
reducing the number of layers include the significant speed-up in the detection
process, from ~12 fps to ~56 fps (equivalently, a reduction of about 78% in
per-frame detection time). The mean Average Precision (mAP) was observed to be
89% when utilizing only 'Block 1', compared with 95% mAP when utilizing the
whole 16 blocks of MobileNetV2. Furthermore, a 99% model size shrinkage has
been achieved between 'Block 16' (13.594 MB) and 'Block 1' (0.084 MB),
asserting that reducing the number of layers is also beneficial for real-world
mobile-based model architectures while maintaining satisfactory accuracy.

(Per-class precision-recall curves for the Block 16 model; panel titles: Calico Goldfish AP = 1.0, Black Moor Goldfish AP = 1.0, Common Goldfish AP = 0.9, Lionhead Goldfish AP = 0.9, Ryukin Goldfish AP = 1.0, Pearlscale Goldfish AP = 1.0. Each panel plots Precision against Recall.)

Fig. 10 Precision-recall graph for Block 16

Acknowledgements Parts of this research were sponsored under Fundamental Research Grant
Scheme (FRGS) 59361 awarded by Ministry of Education Malaysia, and Research Intensified
Grant Scheme (RIGS) 55192/12 awarded by Universiti Malaysia Terengganu.

References

1. Chen Z, Zhang Z, Dai F, Bu Y, Wang H (2017) Monocular vision-based underwater object detection. Sensors (Basel) 17(8):1784
2. Sung M, Yu S, Girdhar Y (2017) Vision based real-time fish detection using convolutional
neural network. In: OCEANS 2017—Aberdeen, pp 1–6
3. Xu W, Matzner S (2018) Underwater fish detection using deep learning for water power
applications. arXiv preprint arXiv:1811.01494
4. Redmon J, Divvala S, Girshick R, Farhadi, A (2016) You only look once: unified, real-time
object detection. In: IEEE conference on computer vision and pattern recognition, pp 779–788
5. Jing L, Yang X, Tian Y (2018) Video you only look once: overall temporal convolutions for action recognition. J Visual Commun Image Rep 58–65
6. Putra MH, Yussof ZM, Lim KC, Salim SI (2018) Convolutional neural network for person
and car detection using YOLO framework. J Telecommun Electron Comput Eng 10:1–7

7. Du J (2018) Understanding of object detection based on CNN family and YOLO. J Phys Conf
Series, 12–29
8. Redmon J, Farhadi A (2017) YOLO9000: better, faster, stronger. In: Proceedings of the IEEE
conference on computer vision and pattern recognition, pp 7263–7271
9. Shafiee MJ, Chywl B, Li F, Wong A (2017) Fast YOLO: a fast you only look once system for
real-time embedded object detection in video, arXiv preprint arXiv:1709.05943
10. Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H
(2017) Mobilenets: efficient convolutional neural networks for mobile vision applications.
arXiv preprint arXiv:1704.04861
11. Sheng T, Feng C, Zhuo S, Zhang X, Shen L, Aleksic M (2018) A quantization-friendly
separable convolution for MobileNets. In: 1st workshop on energy efficient machine learning
and cognitive computing for embedded applications (EMC2), pp 14–18
12. Sandler M, Howard A, Zhu M, Zhmoginov A, Chen L (2018) Mobilenetv2: inverted residuals
and linear bottlenecks. In: Proceedings of the IEEE conference on computer vision and pattern
recognition, pp 4510–4520
13. Mathworks Inc.: Pretrained MobileNet-v2 convolutional neural network. Mathworks Inc.
(2019). https://www.mathworks.com/help/deeplearning/ref/mobilenetv2.html. Accessed 14
Nov 2019
14. Ayob AF.: MobileNet(v2)-YOLOv2 Goldfish Detection (2019). https://www.youtube.com/
playlist?list=PLyM-KBafTfgicwqAhpa9a8HSv2TSHV3fZ. Accessed 21 July 2019
Different Cell Decomposition Path
Planning Methods for Unmanned Air
Vehicles-A Review

Sanjoy Kumar Debnath, Rosli Omar, Susama Bagchi, Elia Nadira Sabudin, Mohd Haris Asyraf Shee Kandar, Khan Foysol, and Tapan Kumar Chakraborty

Abstract An Unmanned Aerial Vehicle (UAV) or robot is guided towards its goal
through path planning that helps it in avoiding obstacles. Path planning generates a
path between a given start and an end point for the safe and secure reach of the
robot subject to the required criteria. A number of path planning methods are available,
such as bio-inspired, sampling-based, and combinatorial methods. The cell
decomposition technique, which is one of the combinatorial methods, is built on a
configuration-space representation. The aim of this paper is to study the results
obtained in earlier research where the cell decomposition technique has been used,
against criteria such as shortest travelled path, minimum computation time,
memory usage, safety, completeness, and optimality. The studied methods are
classified based on the classical taxonomy.

Keywords Path planning · Cell decomposition · Regular grid · UAV

1 Introduction

The use of unmanned air vehicle or autonomous robot in place of human beings to
carry out dangerous missions in adverse environments has been gradually increased
since last decades. Path planning is one of the vital aspects in developing an
autonomous vehicle that should traverse the shortest distance from a starting point
to a target point while in a given mission for saving its resources and minimizing

S. K. Debnath · R. Omar (&) · S. Bagchi · E. N. Sabudin · M. H. A. Shee Kandar
Faculty of Electrical and Electronic Engineering, Universiti Tun Hussein Onn Malaysia,
Johor, Malaysia
e-mail: [email protected]
K. Foysol
Department of Allied Engineering, Bangladesh University of Textiles, Dhaka, Bangladesh
T. K. Chakraborty
Department of Electrical and Electronics Engineering, University of Asia Pacific, Dhaka,
Bangladesh


(Figure 1 shows a tree of path planning approaches: Combinatorial methods, comprising C-space representation/roadmap techniques (visibility graph, Voronoi diagram, cell decomposition, potential field) and graph search algorithms (depth-first, breadth-first search, Dijkstra's, best-first, A*, D*, M*); Sampling-based methods (RRT, probabilistic roadmap); and Biologically Inspired methods, comprising evolutionary algorithms (genetic algorithm, differential evolution), swarm intelligence (particle swarm optimization, ant colony optimization), simulated annealing and ecology-based approaches.)
Fig. 1 Classification of path planning approach [8]

the potential risks. Therefore, it is crucial for a path planning algorithm to produce
an optimal path. The path planning algorithm should also satisfy the completeness
criterion, which means that a path can be found if one exists. Moreover, the robot's
safety, the memory used for computation and real-time capability are also
significant [1–7]. Figure 1 illustrates the classification of path planning approaches.
The bio-inspired methods are nature-motivated/biologically inspired algorithms.
Instances of bio-inspired approaches are the Genetic Algorithm (GA), Simulated
Annealing (SA), Particle Swarm Optimization (PSO) and Ant Colony Optimization
(ACO). GA mimics the natural selection process of biological evolution, repeatedly
evolving a population of individual solutions. Nonetheless, it cannot guarantee an
optimal path. Local minima may occur in narrow environments; thus it offers less
safety and suffers from the narrow corridor problem. GA is computationally costly
and, ultimately, it is not complete [8].
The SA algorithm is developed based on the heating and cooling process used to
regulate the internal configuration of metals. Apart from being very slow and having
a very high cost of function evaluations, SA is not able to guarantee the optimal path
[9–15]. PSO is a meta-heuristic, population-based approach with real-time
capability, but it falls into local optima easily in many optimization problems.
Additionally, there is no general convergence theory applicable to PSO in practice,
and its convergence time is largely unknown for multidimensional problems [16].
On the other hand, ACO emulates ants marking a path once a food source has been
found: an ant marks its route towards the food source with pheromones for tracing
purposes. In ACO, the path between the initial point and the target point is
generated stochastically. ACO performs a blind exploration and is therefore not
suitable for efficient path planning due to the lack of an optimal result [13, 17].
In sampling-based path planning, the Rapidly-exploring Random Tree (RRT)
method does not require an explicit construction of the configuration space. In RRT,
the first step is to define the starting and target points. The starting point is then
taken as the root of the tree, from which new branches are grown until the tree
reaches the target point [10, 11]. RRT is a simple and easy way to handle problems
with obstacles and different constraints for autonomous/unmanned robotic motion
planning. The computation time escalates with the size of the generated tree, and
the resulting RRT path is not always optimal. Nonetheless, it remains fairly easy to
find a path for a vehicle with dynamic and physical constraints, and RRT also
creates the least number of edges [18, 19]. The Probabilistic Roadmap (PRM)
method is a path-planning algorithm that takes random samples from the
configuration space, examining the accessible free space and avoiding collisions in
order to find a way. A local planner is used to join these configurations with
nearby configurations. PRM is costly, and there is no guarantee of acquiring a
path [18, 19].
Combinatorial path planning consists of two main elements, i.e. a C-space
representation technique and a graph search algorithm. The first step is to create the
configuration space of the environment. Then, a graph search algorithm, for
example Dijkstra's or A-star (A*), is applied to search for a path [7, 20].
Depth-first search (DFS) is good for picking one path among many possibilities
without caring about which exact one; it may be less appropriate when there is only
one solution. DFS is attractive because a solution can be found without computing
all nodes [7]. Breadth-first search, which is suitable when a limited number of
solutions is available, uses a comparatively small number of steps. Its notable
property is that it finds the shortest path from the source node to any node it visits
first, provided all the graph's edges are un-weighted or have the same weight.
Breadth-first search is complete, finding a path if one exists, and it does not get
trapped in dead ends [21]. Best-first search, in contrast, does not guarantee discovery
of the shortest path because it bypasses some branches in the search tree; it is a
greedy search which is neither complete nor optimal. Dijkstra's algorithm is a
systematic search algorithm and gives the shortest path between two nodes. Where
there is no prior knowledge of the graph, it cannot estimate the distance between
each node and the target. Usually, a large area of the graph is covered by Dijkstra's
algorithm due to its selection of the minimum-cost edge at every step; it is therefore
significant for situations with multiple target nodes when there is no prior
knowledge of the closest one [22]. A* is less efficient for multiple targets because it
needs to be executed once for each target node to find them all. A* expands a node
only if it seems promising: it only aims to reach the target from the current node at
the earliest and does not attempt to reach any other node. A* is complete because it
always finds a path if one exists. By modifying the heuristics and node evaluation
tactics of A*, other path-finding algorithms can be developed [23].
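For illustration, the following minimal Python sketch implements A* on a 4-connected occupancy grid with a Manhattan-distance heuristic; the grid, heuristic and unit edge cost are illustrative assumptions rather than a specific planner from the cited works.

```python
import heapq

def astar(grid, start, goal):
    """A* over a 4-connected occupancy grid (0 = free, 1 = obstacle) with a Manhattan heuristic."""
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    open_set = [(h(start), start)]
    came_from = {start: None}
    g_cost = {start: 0}
    closed = set()
    while open_set:
        _, node = heapq.heappop(open_set)
        if node in closed:
            continue
        if node == goal:                      # reconstruct the path back to the start
            path = [node]
            while came_from[path[-1]] is not None:
                path.append(came_from[path[-1]])
            return path[::-1]
        closed.add(node)
        r, c = node
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nb
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] == 0:
                ng = g_cost[node] + 1
                if ng < g_cost.get(nb, float("inf")):
                    g_cost[nb] = ng
                    came_from[nb] = node
                    heapq.heappush(open_set, (ng + h(nb), nb))
    return None                               # the goal is unreachable

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(astar(grid, (0, 0), (2, 0)))
```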
The configuration space (C-space) gives complete information about the location of
all points in the coordinate system; it is the space of all configurations, i.e. the real
free-space area for the motion of the autonomous vehicle, and it guarantees that the
vehicle does not crash into obstacles. An illustration of a C-space for a circular
vehicle is shown in Fig. 2. The robot is treated as a point, and the area of the
obstacles is enlarged accordingly so that planning can be completed in a more
tractable way. The C-space is obtained by adding the vehicle radius while sliding it
along the edges of the obstacles and the border of the search space. In Fig. 2(a), the
obstacle-free area is represented by the white region inside the closed area.

Fig. 2 A scenario represented in a original form b configuration space. Note that the darker
rectangles in a are those with actual dimensions while in b are those enlarged according to the size
of robot A. The white areas represent free space

The robot in Fig. 2(a) is represented by A. On the other hand, when the workspace
is considered as the C-space, as shown in Fig. 2(b), the free space has been
condensed while the obstacles' area has been inflated. Hence, the C-space indicates
the real free-space region for the motion of the autonomous or unmanned vehicle
and assures that the vehicle does not collide with the obstacles.
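On a grid, the C-space construction described above can be approximated by dilating the obstacle cells with a disk of the robot's radius. The following Python sketch, using SciPy's binary dilation, is only an illustration; the map and radius are hypothetical.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def inflate_obstacles(occupancy, robot_radius_cells):
    """Grow every obstacle cell by the robot radius so the robot can be treated as a point."""
    r = robot_radius_cells
    y, x = np.ogrid[-r:r + 1, -r:r + 1]
    disk = (x * x + y * y) <= r * r            # circular structuring element
    return binary_dilation(occupancy.astype(bool), structure=disk)

# Illustrative 6 x 6 map with a single obstacle cell, inflated by a radius of 1 cell
occupancy = np.zeros((6, 6), dtype=int)
occupancy[3, 3] = 1
print(inflate_obstacles(occupancy, 1).astype(int))
```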

2 Cell Decomposition (CD) Method

Cell decomposition (CD) is a very useful method, especially in outdoor
environments. In CD, the C-space is first divided into simple, connected regions
called cells. The cells may be of rectangular or polygonal shape and are discrete,
non-overlapping and contiguous to each other. If a cell contains an obstacle, it is
identified as occupied; otherwise it is obstacle-free. A connectivity graph is then
constructed to link the adjacent cells [42]. There are several variations of CD,
including Regular Grid (RG), Adaptive Cell Decomposition (ACD) and Exact Cell
Decomposition (ECD) [22]. The classification of CD is shown in Fig. 3.

Fig. 3 Classification of cell decomposition method: Regular Grid, Adaptive Cell Decomposition
and Exact Cell Decomposition

2.1 Regular Grid (RG)

Regular grid (RG) technique was introduced by Brooks and Lozano-Perez [24] to
find a collision-free path for an object moving through cluttered obstacles. In
general, RG can be constructed by laying a regular grid over the configuration
space. As the shape and size of the cells in the grid are predefined, RG is easy to
apply. RG essentially samples the domain and subsequently marks each cell as
occupied, unoccupied or partially occupied. A cell is marked as an obstacle if an
object or part of it occupies the cell; otherwise it remains free space. A node is
located in the middle of every free-space cell within the C-space, and a connectivity
graph is then constructed from all the nodes. Path planning using RG is illustrated
in Fig. 4. The path connecting the starting point and the target point is shown by a
solid yellow line.
The RG method is popular because it is very easy to apply to a C-space and is also
flexible. The computation time can be reduced by increasing the cell size. On the
other hand, the cell size can be made smaller to provide more detailed information
and completeness.
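As a concrete illustration of the RG construction, the Python sketch below marks cells as free or occupied and builds the 8-neighbour connectivity graph over the free cells; the occupancy data are hypothetical.

```python
def build_connectivity_graph(occupancy):
    """Connect every free cell to its free 8-neighbours; occupied cells are excluded."""
    rows, cols = len(occupancy), len(occupancy[0])
    graph = {}
    for r in range(rows):
        for c in range(cols):
            if occupancy[r][c]:          # cell contains (part of) an obstacle
                continue
            neighbours = []
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols and not occupancy[nr][nc]:
                        neighbours.append((nr, nc))
            graph[(r, c)] = neighbours
    return graph

# Illustrative 3 x 3 map: the centre cell is occupied
grid = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(build_connectivity_graph(grid)[(0, 0)])   # [(0, 1), (1, 0)]
```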
Although RG is easy to apply, there are some drawbacks to this method. Firstly, it
suffers from digitization bias: an obstacle much smaller than the cell dimension still
results in the whole grid square being marked as filled or occupied. Consequently,
a traversable space may be considered impenetrable by the planner. This scenario is
illustrated in Fig. 4(b). Furthermore, if the cells are too big (hence the grid
resolution is too coarse), the planner may not be complete.

Fig. 4 a Configuration Space obstacles b Obstacles represented by Regular Grid techniques. Note
that the drivable area is considered impenetrable

2.2 Adaptive Cell Decomposition (ACD)

Adaptive cell decomposition (ACD), unlike RG, is built using a quad-tree. The cells
of a quad-tree are identified either as free cells, which contain no obstacles, as
obstacle cells, which are fully occupied, or as mixed cells, which contain both free
space and obstacles. The mixed cells are recursively sub-divided into four identical
sub-cells until the resulting smaller cells contain no obstacle region or the smallest
allowed cells are produced [25].
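The recursive subdivision can be sketched in a few lines of Python; the quad-tree below stores leaves as (status, row, column, size) tuples, which is an illustrative representation rather than the data structure used in the cited works.

```python
def quadtree(occupancy, r0, c0, size, min_size=1):
    """Recursively split mixed cells into four sub-cells until they are pure or minimal in size."""
    cells = [occupancy[r][c] for r in range(r0, r0 + size) for c in range(c0, c0 + size)]
    if not any(cells):
        return [("free", r0, c0, size)]
    if all(cells):
        return [("obstacle", r0, c0, size)]
    if size <= min_size:
        return [("mixed", r0, c0, size)]
    half = size // 2
    leaves = []
    for dr, dc in ((0, 0), (0, half), (half, 0), (half, half)):
        leaves += quadtree(occupancy, r0 + dr, c0 + dc, half, min_size)
    return leaves

# Illustrative 4 x 4 map with obstacles in the upper-left quadrant only
grid = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
print(quadtree(grid, 0, 0, 4))   # one 2x2 obstacle leaf and three 2x2 free leaves
```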
ACD maintains as much detail as possible while regular shape of the cells is
maintained. It also removes the digitization bias of RG. An ACD representation
employed for path planning is depicted in Fig. 5. The collision-free path that
connects starting point (Start) and target point (Goal) is depicted via solid yellow
line.

Fig. 5 Path planning using quad-tree

2.3 Exact Cell Decomposition

Another variant of CD is the Exact Cell Decomposition (ECD) method, which uses
two-dimensional cells to resolve certain problems associated with regular grids. The
sizes of the cells are not pre-determined; instead, they are decided based on the
location and shape of the obstacles in the C-space [26]. The cell boundaries are
determined exactly by the boundaries of the C-space, and the union of the cells
constitutes the free space. Therefore, ECD is complete: it always finds a path if one
exists. ECD is shown in Fig. 6. The path connecting the starting (Start) and target
(Goal) points is shown as a solid yellow line.
An Opposed Angle-Based Exact Cell Decomposition has been suggested for the
mobile robot path-planning problem with curvilinear obstacles, yielding a more
natural, collision-free and efficient path [27].


Fig. 6 Path planning using exact cell decomposition

To date, many researchers have used cell decomposition-based methods to solve
path planning problems. In [28], the researchers recommended three innovative
formulations to construct a piecewise linear path for an unmanned/autonomous
vehicle when a cell decomposition planning method is used. Different trajectories
were obtained via path planning algorithms by varying the cell decomposition
involved, the graph weights, and the technique used to calculate the waypoints [29].
A combined algorithm was developed from cell decomposition and a fuzzy
algorithm to create a map of the robot's path [30]. Another technique suggested an
optimal route generation framework in which the global obstacle-avoidance
problem was decomposed into simpler sub-problems corresponding to distinct path
homotopies, which informed a method for using existing cell-decomposition
techniques to enumerate and represent local trajectory generation problems for
efficient and autonomous resolution [31].
Parsons and Canny [32] used a cell decomposition-based algorithm for path
planning of multiple mobile robots sharing the same workspace. The algorithm
computed a path for each robot and was capable of avoiding obstacles and other
robots. The cell decomposition algorithm was based on the idea of a product
operation defined on the cells of a decomposition of a 2D free space. However, the
developed algorithm was only useful when infrequent changes occurred in the
obstacle set. Chen et al. [8] introduced the framed quad-tree to create a map in order
to solve the problem of finding a conditional shortest path through a new
environment in real time. The conditional shortest path is the path that is shortest
among all possible paths based on the known environmental information. The path
was found using a propagated circular path planning wave based on a graph search
algorithm [33]. Jun and D'Andrea [34] used an approximate cell
decomposition-based method to accomplish a robot path planning task. The
proposed approach used initial information on the locations and shapes of the
obstacles. The method decomposed the region into uniform cells and changed the
probability values when unexpected changes were detected during the mission.
A search algorithm was used to find the shortest path. One drawback of this method
is that if a penalty is imposed for accelerations and decelerations, the graph becomes
a tree that expands exponentially with the number of cells, making it very slow.
Lingelbach [35] applied the so-called Probabilistic Cell Decomposition (PCD)
method for path planning in a high-dimensional static C-space owing to its easy
scalability. Experimental results showed that the performance of PCD was
acceptable in numerous circumstances for path planning of rigid-body motion, such
as maze-like problems and chain-like robotic platforms. However, PCD showed
degraded performance when the free space was small compared to the area of the
C-space. Zhang et al. [36] utilised ACD for robot path planning to subdivide the
C-space into cells. Localised roadmaps were then computed by generating samples
within these cells. Since the complexity of ACD increases with the number of
degrees of freedom (DOF) of the robot, it is not practical for higher-DOF robots.
Arney [37] implemented an ACD path planning approach in which efficiency was
attained by using a method from Geographic Information Systems (GIS) known as
tesseral addressing. Each cell was labelled during the decomposition process with
an address that defined the cell size, position and neighbour addresses. The planner
had a priori information about the environment, and the generated path had an
optimal distance from the unmanned/autonomous vehicle's present location to the
target location. It is suitable for real-time path planning applications.

3 Discussion on Different Cell Decomposition Methods

The benefit of CD is that it provides an assurance of finding a collision-free path if
one exists, and it is controllable. Therefore, it is a complete algorithm for an
unmanned or autonomous vehicle, which can travel the path without the risk of a
local minima incident [38]. Yet, the shortcoming of CD is that if the formed cells are
too coarse, it will not be feasible to achieve the smallest path distance or length.
Conversely, if the cells are too small, the computation is more time-consuming
[1, 39, 40].

Table 1 Comparison of different cell decomposition methods

Method | Optimal path | Computational time | Real time | Memory | Safety | Completeness
CD RG | ✗ | ✓ | ✓ | ✗ | ✗ | ✗
CD ACD | ✗ | ✓ | ✓ | ✓ | ✗ | ✓
CD ECD | ✗ | ✗ | ✗ | ✗ | ✓ | ✓

The CD approach also does not provide acceptable performance in a dynamic state
and in real-time circumstances [10, 38, 39]. CD is required to be fine-tuned to the
situation as necessary; e.g. in exact CD, the cells are not predefined, but are selected
based on the location and shape of the obstacles inside the C-space [41].
Although RG is easy to apply, the planner may not be complete if the cells are too
big, i.e. finding a path where one exists is not guaranteed. If an obstacle's size is
significantly smaller than the cell size, the outcome is still that the entire grid square
is marked as occupied rather than obstacle-free. One more drawback of RG is that it
represents the C-space inefficiently: in sparse areas, many equally sized cells are
required to fill the empty space. As a result, planning is costly because more cells
are handled than are actually required.
The outcome of ACD is a map that holds grid cells of different sizes and
concentrates cells so that the cell boundaries match the obstacles' boundaries
closely. It produces a smaller number of cells so that the C-space is represented
more efficiently; hence, less memory and processing time are required. ACD
maintains maximum detail while a regular cell shape is preserved.
ECD is complete. Still, the paths generated via ECD are not optimal in path length,
and there is no simple rule for decomposing a space into cells. This method is not
suitable for outdoor environments where obstacles are often poorly defined and of
irregular shape (Table 1).

4 Conclusion

The results from earlier research on several cell decomposition path planning
algorithms are compared in this study, where the nature of motion was given
importance and the algorithms were discussed in terms of their advantages and
drawbacks. An algorithm can be regarded as an efficient path planning algorithm
when it computes an optimal, energy-efficient, collision-free path, is complete, and
has the lowest computation time. Since none of the algorithms covers all the
criteria, the optimization of energy-efficient path planning depends on the criteria of
the algorithm used, such as completeness and computation time, as well as on the
significant requirements of the vehicle's mission and its objective. For example, RG
path planning is expensive but easy to apply, ACD has the adaptive quality, and
ECD is complete but not suitable for outdoor environments.

Acknowledgements Authors like to give appreciations to Universiti Tun Hussein Onn Malaysia
(UTHM) and Research Management Center (RMC) for supporting fund under TIER-1 VOT H131.

References

1. Omar R (2012) Path planning for unmanned aerial vehicles using visibility line based
methods. PhD diss., University of Leicester
2. Debnath SK, Omar R, Latip NBA (2019) A review on energy efficient path planning
algorithms for unmanned air vehicles. Computational science and technology. Springer,
Singapore, pp 523–532
3. Ganeshmurthy MS, Suresh GR (2015) Path planning algorithm for autonomous mobile robot
in dynamic environment. In: 2015 3rd international conference on signal processing,
communication and networking (ICSCN). IEEE
4. Nguyet T, Duy-Tung N, Duc-Lung V, Nguyen-Vu T (2013) Global path planning for
autonomous robots using modified visibility graph, vol 13. IEEE, pp 317–321
5. Latip NBA, Omar R, Debnath SK (2017) Optimal path planning using equilateral spaces
oriented visibility graph method. Int J Electr Comput Eng 7(6):3046
6. Chen P, et al (2013) Research of path planning method based on the improved Voronoi
diagram. In: 2013 25th Chinese Control and Decision Conference (CCDC). IEEE
7. Omar R, Da-Wei G (2009) Visibility line based methods for UAV path planning. In:
ICCAS-SICE, 2009. IEEE
8. Cho K, et al (2017) Cost-aware path planning under co-safe temporal logic specifications.
IEEE Robotics and Automation Letters 2(4)
9. Li G, et al (2012) “An efficient improved artificial potential field based regression search
method for robot path planning. In: 2012 International Conference on Mechatronics and
Automation (ICMA). IEEE
10. Abbadi A, Matousek R (2014) Path planning implementation using MATLAB in technical
computing bratislava, pp 1–5
11. Adiyatov O, Varol HA (2013) Rapidly-exploring random tree based memory efficient motion
planning. In: 2013 IEEE international conference on mechatronics and automation (ICMA).
IEEE
12. Achour N, Chaalal M (2011) Mobile robots path planning using genetic algorithms. In: the
seventh international conference on autonomic and autonomous systems, Baker (ICAS 2011),
pp 111–115
13. Hsu, C-C, Wang W-Y, Chien Y-H, Hou R-Y, Tao C-W (2016) FPGA implementation of
improved ant colony optimization algorithm for path planning. In: 2016 IEEE Congress on
Evolutionary Computation (CEC). IEEE, pp 4516–4521
14. Goyal JK, Nagla KS (2014) A new approach of path planning for mobile robots. In:
international conference on advances in computing, communications and informatics
(ICACCI 2014). IEEE, pp 863–867
15. Gomez EJ, Santa FM, Sarmiento FH (2013) A comparative study of geometric path planning
methods for a mobile robot: potential field and Voronoi diagrams. In: 2013 II International
Congress of Engineering Mechatronics and Automation (CIIMA), 23 October. IEEE, pp 1–6
16. Eberhart R, Kennedy J (1995) A new optimizer using particle swarm theory. In: proceedings
of the sixth international symposium on micro machine and human science (MHS 1995), 4
October 1995. IEEE, pp 39–43

17. Shaogang Z, Ming L (2010) Path planning of inspection robot based on ant colony
optimization algorithm. In: 2010 International Conference Electrical and Control Engineering
(ICECE). IEEE, pp 1474–1477
18. Latombe JC (1999) Motion planning: a journey of robots, molecules, digital actors, and other
artifacts. Int J Robot Res 18(11):1119–1128
19. Marble JD, Bekris KE (2013) Asymptotically near-optimal planning with probabilistic
roadmap spanners. IEEE Trans Rob 29(2):432–444
20. LaValle SM (2006) Planning algorithms. Cambridge University Press, Cambridge
21. Dudek G, Jenkin M (2000) Computational principles of mobile robotics. Cambridge
University Press, Cambridge
22. Mehlhorn K, Sanders P (2008) Algorithms and data structures: the basic toolbox (PDF).
Springer
23. Debnath SK, Omar R, Latip NBA, Shelyna S, Nadira E, Melor CKNCK, Chakraborty TK,
Natarajan E (2019) A review on graph search algorithms for optimal energy efficient path
planning for an unmanned air vehicle. Indonesian J Electr Eng Comput Sci 15(2):743–749
24. Brooks RA, Lozano-Perez T (1985) A subdivision algorithm in configuration space for
findpath with rotation. IEEE Trans Syst Man Cybern 2:224–233
25. Chen DZ, Szczerba RJ, Uhran JJ (1995) Planning conditional shortest paths through an
unknown environment: A framed-quadtree approach. In: Proceedings 1995 IEEE/RSJ
international conference on intelligent robots and systems. Human Robot Interaction and
Cooperative Robots. vol 3. IEEE
26. Debnath SK, Omar R, Latip NBA (2019) Comparison of different configuration space
representations for path planning under combinatorial method. Indonesian J Electr Eng
Comput Sci 1(1):401–408
27. Jung J-W et al (2019) Expanded douglas–peucker polygonal approximation and opposite
angle-based exact cell decomposition for path planning with curvilinear obstacles. Appl Sci 9
(4):638
28. Kloetzer M, Mahulea C, Gonzalez R (2015) Optimizing cell decomposition path planning for
mobile robots using different metrics. In: 2015 19th international conference on system
theory, control and computing (ICSTCC), IEEE pp 565–570
29. Gonzalez R, Kloetzer M, Mahulea C (2017) Comparative study of trajectories resulted from
cell decomposition path planning approaches. In: 2017 21st international conference on
system theory, control and computing (ICSTCC), IEEE, pp 49–54
30. Tunggal TP, Supriyanto A, Faishal I, Pambudi I (2016) Pursuit algorithm for robot trash can
based on fuzzy-cell decomposition. Int J Electr Comput Eng 6(6):2088–8708
31. Park J, Karumanchi S, Iagnemma K (2015) Homotopy-based divide-and-conquer strategy for
optimal trajectory planning via mixed-integer programming. IEEE Trans Rob 31(5):1101–
1115
32. Parsons D, Canny J (1990) A motion planner for multiple mobile robots. In: Proceedings,
IEEE international conference on robotics and automation. IEEE, pp 8–13
33. Chen DZ, Szczerba RJ, Uhran JJ (1995) Planning conditional shortest paths through an
unknown environment: A framed-quadtree approach. In: Proceedings 1995 IEEE/RSJ
international conference on intelligent robots and systems. Human Robot Interaction and
Cooperative Robots, vol 3. IEEE, pp 33–38
34. Jun M, D’Andrea R (2003) Path planning for unmanned aerial vehicles in uncertain and
adversarial environments. In: Cooperative control: models, applications and algorithms.
Springer, Boston, pp 95–110
35. Lingelbach F (2004) Path planning using probabilistic cell decomposition. In: IEEE
international conference on robotics and automation, 2004. Proceedings. ICRA 2004, vol 1.
IEEE, pp 467–472
36. Zhang X (1994) Cell decomposition in the affine weyl group wA ([Btilde] 4). Commun
Algebra 22(6):1955–1974

37. Timothy A (2007) An efficient solution to autonomous path planning by approximate cell
decomposition. In: 2007 third international conference on information and automation for
sustainability, IEEE, pp 88–93
38. Glavaški D, Volf M, Bonkovic M (2009) Robot motion planning using exact cell
decomposition and potential field methods. In: Proceedings of the 9th WSEAS international
conference on simulation, modelling and optimization. World Scientific and Engineering
Academy and Society (WSEAS)
39. Gonzalez R, Mahulea C, Kloetzer M (2015) A Matlab-based interactive simulator for mobile
robotics. In: 2015 IEEE international conference on automation science and engineering
(CASE). IEEE, pp 310–315
40. Hoang VD, Hernandez DC, Hariyono J, Jo KH (2014) Global path planning for unmanned
ground vehicle based on road map images. In: 2014 7th international conference human
system interactions (HSI), IEEE, pp 82–87
41. Giesbrecht J (2004) Global path planning for unmanned ground vehicles.
No. DRDC-TM-2004-272. Defence Reserch And Development Suffield Alberta
42. Omar R, Melor CK, Hailma CKNA (2015) Performance comparison of path planning
methods
Improved Potential Field Method for Robot Path Planning with Path Pruning

Elia Nadira Sabudin, Rosli Omar, Ariffudin Joret, Asmarashid Ponniran, Muhammad Suhaimi Sulong, Herdawatie Abdul Kadir, and Sanjoy Kumar Debnath

Abstract Path planning is vital for a robot deployed in a mission in a challenging
environment with obstacles around. The robot needs to ensure that the mission is
accomplished without colliding with any obstacles and must find an optimal path to
reach the goal. Three important criteria, i.e., path length, computational complexity,
and completeness, need to be taken into account when designing a path planning
method. The Artificial Potential Field (APF) is one of the best methods for path
planning as it is fast, simple, and elegant. However, APF has a major problem called
local minima, which causes the robot to fail to reach the goal. This paper proposes
an Improved Potential Field method to overcome this APF limitation. Despite that,
the path length produced by the Improved APF is not optimal. Therefore, a path
pruning technique is proposed in order to shorten the path generated by the
Improved APF. This paper also compares the performance, in terms of path length
and computational time, of the Improved APF with and without path pruning.
Through simulation, it is proven that the proposed technique overcomes the local
minima problem and produces a relatively shorter path with fast computation time.

Keywords Path planning · Artificial Potential Field

E. N. Sabudin · R. Omar (&) · A. Joret · A. Ponniran · H. A. Kadir · S. K. Debnath
Faculty of Electrical and Electronic Engineering, Universiti Tun Hussein Onn Malaysia,
Johor, Malaysia
e-mail: [email protected]
A. Ponniran
Power Electronic Converters (PECs) Focus Group, Universiti Tun Hussein Onn Malaysia,
Johor, Malaysia
A. Joret · M. S. Sulong
Faculty of Technical and Vocational Education, Universiti Tun Hussein Onn Malaysia, Johor,
Malaysia
M. S. Sulong
Internet of Things (IOT) Focus Group, Universiti Tun Hussein Onn Malaysia, Johor,
Malaysia


1 Introduction

Path planning is one of the most critical issues to be considered in robot research.
In robotics, path planning is the act of computing a valid and feasible solution that
allows the robot to traverse from a start point to a goal point through a sequence of
collision-free and safe motions to achieve a certain task in a given environment. The
path taken must be free of any collisions with surrounding obstacles and must also
meet kinematic or dynamic conditions [1, 2]. In the path planning problem, the
workspace of the robot and the obstacle geometry are described in 2D or 3D, while
the motion is represented as a path in configuration space [3]. The structure of the
environment is an aspect that needs to be taken into account to ensure the robot can
achieve a defined mission. There are two types of environment for path planning,
namely known and unknown.
As its name implies, the known environment has all the information of obstacles
and goal point. The robot moves based on the prescribed information. On the other
hand, in an unknown environment, there is no previous knowledge or only partial
information of the environment is available. The robot needs to plan a path based on
current information. The unknown environment may contain obstacles which move
continuously, and dynamic obstacles also appear spontaneously and randomly
while the robot is performing its mission.
As previously mentioned, the aspects that need to be addressed in path planning
are the computation time, path length, and completeness. In a dynamic or uncertain
environment, the path planning algorithm must be able to produce a low compu-
tational time for real-time applications. Apart from that, the robot should take the
optimal path during the mission to save fuel and energy. Completeness criterion is
satisfied if the path planning algorithm could find a path if one exists.
There are few common techniques used in path planning problems such as Cell
Decomposition (CD), Visibility Graph (VG), Voronoi Diagram (VD), Probability
Roadmap (PRM) and Artificial Potential Field (APF). APF is a path planning
method which is simple, highly safe, and elegant [4–6]. It uses simple mathematical
equations that are ideal for real-time environments [7]. APF produces two types of
forces, i.e., attractive force and repulsive force. The goal point generates the
attractive force to pull the robot towards it; meanwhile, the obstacles produce a
repulsive force to repel the robot from it. In that way, the robot movement depends
on the resultant of the forces. However, local minima are the major drawback of
APF: the robot will be trapped in a local minimum if the resultant force is zero. The
problem of Goal Non-Reachable with Obstacle Nearby (GNRON) also happens if
the robot plunges into a local minimum. In order to solve the above-mentioned
problems, this paper proposes an Improved Artificial Potential Field method. This
technique reduces the limitations of the APF method and is also computationally
tractable. To reduce the path length, path pruning is applied to the planned path.

2 Potential Field Method

The potential field (PF) is one of the most popular techniques for the path planning
problem. The Artificial Potential Field (APF) method has been used by many
researchers because of properties such as simplicity, elegance, and high safety [3].
Khatib was the first to suggest this idea, in which the robot is regarded as a point
under the influence of fields generated by the goal and obstacles in the search space
[8]. The APF generates the planned path based on two types of force, namely an
attractive force and a repulsive force. The attractive force is produced by the goal,
and the repulsive force is generated by the obstacles. This method can be applied in
known scenarios and also works in unknown environments, subject to changes and
modifications. The APF method has several advantages: path planning can be
implemented in a real-time environment due to (1) its fast computation time and
(2) its ability to generate a smooth path without any collision with obstacles.
However, this method has major drawbacks, namely local minima, the goal
non-reachable problem, and narrow passages [9, 10].
To address these problems, researchers have improved the potential field
method. Mei and Arshad used a Balance-Artificial Potential Field Method to solve
the local minima and narrow passage besides achieving heading and speed control
of ASV (Autonomous Surface Vessel) in a riverine environment [11]. An efficient
Improved Artificial Potential Field based Regression Search Method for robot path
planning and also Effective Improved Artificial Potential Field- Based Regression
Search Method for Autonomous Mobile Robot Planning developed by Li et al.
could generate a global sub-optimal/optimal path effectively and could reduce the
local minima and oscillation problems in a known environment without complete
information [12, 13]. Sfeir et al. presented the real-time mobile robot navigation in
an unknown environment using Improved APF approach to create a smoother
trajectory around the obstacles by developing an integrate of rotational force [14].
This method successfully prevented the limitation in APF due to Goal
Non-Reachable when Obstacles are Nearby (GNRON) problem. Besides that, Park
et al. proposed potential field method (PFM) and vector field histogram (VFH) to
overcome the PF limitations by developing a new obstacle avoidance method for
mobile robots based on advanced fuzzy PFM (AFPFM) [15].

3 Path Planning Method

3.1 Field Function Based on Traditional APF

The attractive potential field, Vg, at the goal is represented as

Vg = Kg rg      (1)

rg = dist(X, Xg)      (2)

where Kg is a variable constant greater than zero, X = (x, y) is the current position,
Xg = (xg, yg) is the goal position, and rg is the distance between the current robot
position and the goal. Figure 1 shows an attractive potential field at the target. The
attractive force will pull the robot towards the target [16].
The repulsive potential field, Vo, at an obstacle can be defined as

Vo = Ko / ro      (3)

ro = dist(X, Xo)      (4)

where Ko is a variable constant greater than zero, Xo = (xo, yo) is an obstacle
position, and Ko and ro are equivalent to the gain and the distance from the robot,
respectively.
The repulsive potential field, Vr, at the starting point can be written as

Vr = Kr / rr      (5)

rr = dist(X, Xr)      (6)

Fig. 1 The form of the general attractive potential field

Fig. 2 General repulsive potential field (the gradients pointed away from the obstacles)

Fig. 3 Negative gradient between target and obstacles

where Kr is a variable constant equal to or greater than zero, X = (x, y) is the current
position and Xr = (xr, yr) is the starting position.
Figure 2 illustrates a general repulsive potential field [16]. The repulsive force
pushes the robot away from the obstacles.

Fig. 4 a The attractive potential without obstacles b The repulsive potential, which takes its
highest value at the obstacle c The whole potential, showing the combination of the two forces to
obtain the final potential field result

Therefore, the total potential field can be represented as in (7):

Vtotal = Vg + Vr + Vo      (7)

Figure 3 illustrates the total force of the potential field [16]. The resultant force of
the fields is used to determine the direction of motion of the robot. In Fig. 4, the
resultant potential is shown in a 3D view [17].
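A minimal Python sketch of Eqs. (1)–(7) is given below; summing the repulsive term over all obstacles and the small epsilon added to avoid division by zero are assumptions, and the gains and coordinates are illustrative.

```python
import math

def total_potential(p, goal, start, obstacles, Kg, Kr, Ko, eps=1e-6):
    """Vtotal = Vg + Vr + Vo as in Eqs. (1)-(7): attraction to the goal, repulsion from start/obstacles."""
    dist = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
    Vg = Kg * dist(p, goal)                                   # Eq. (1): attractive term
    Vr = Kr / (dist(p, start) + eps)                          # Eq. (5): repulsion from the start point
    Vo = sum(Ko / (dist(p, o) + eps) for o in obstacles)      # Eq. (3), summed over all obstacles
    return Vg + Vr + Vo                                       # Eq. (7)

# Illustrative values; the paper derives Kg and Ko from the search-space size, Eqs. (8)-(9)
print(total_potential(p=(10, 10), goal=(90, 90), start=(0, 0),
                      obstacles=[(30, 40), (60, 20)], Kg=282.843, Kr=1.0, Ko=15.687))
```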

3.2 Algorithm for Traditional Artificial Potential Field (APF)

In APF, there are two forces involved, namely the attractive force and the repulsive
force. The traditional APF is unable to overcome the local minima problem, which
arises where the total sum of the potential field forces is zero. Figure 5 shows the
flowchart of APF for robot path planning.
In particular, the APF algorithm starts with the initialization of the setting
variables, such as the number of obstacles and the environment range. The current
waypoint is assigned as the starting point, and the target point is set. Subsequently,
the total potential field is calculated. The robot then moves from the starting point
towards decreasing values of the potential field surrounding it until it reaches the
target point. If local minima occur while the robot is carrying out its mission
towards the target point, the robot will collide with obstacles or oscillation will
happen. The robot cannot reach the goal successfully unless no local minima
problems are encountered while the robot carries out the mission.

Fig. 5 The traditional APF process for path planning

3.3 Improved APF Method

3.3.1 Background

The attractive gain at the goal, Kg, is determined by the diagonal distance of the
search space:

Kg = √((distx)² + (disty)²)      (8)

where distx represents the distance of the search space along the x-axis, while disty
is that of the search space along the y-axis.
On the other hand, the repulsive gain at the obstacle, Ko, is written as:

Ko = √((distx)² + (disty)²) / (ax + b)      (9)

where a, x and b are the parameters of the line segment in (9). Ko is defined based
on the environmental factor (the diagonal distance) and the number of obstacles.

3.3.2 Algorithm of Improved APF Method

The proposed Improved APF algorithm is shown in Fig. 6. From its initial point,
the next position of the robot is selected by identifying the lowest-potential point
among the eight surrounding points generated by the potential field. Once the
lowest point has been selected, the robot moves to that point. If the selected point is
a local minimum, the robot identifies and selects the point with the second-lowest
potential field value instead; it moves to that point and removes the point where the
local minimum occurred. This process continues until the robot reaches the target.
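The neighbour-selection step can be sketched as follows in Python; the integer grid, the quadratic test potential and the blacklist implementation are illustrative assumptions rather than the MATLAB implementation used in this work.

```python
def improved_apf_step(current, potential, visited, blocked):
    """Pick the lowest-potential 8-neighbour; if it was already visited (a local-minimum loop),
    take the next lowest instead and blacklist the offending cell."""
    x, y = current
    neighbours = [(x + dx, y + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                  if (dx, dy) != (0, 0) and (x + dx, y + dy) not in blocked]
    ranked = sorted(neighbours, key=potential)
    best = ranked[0]
    if best in visited:                 # local minimum: the robot would revisit an old waypoint
        blocked.add(best)               # remove the point where the local minimum happened
        best = ranked[1]
    return best

def improved_apf_path(start, goal, potential, max_steps=10000):
    path, visited, blocked = [start], {start}, set()
    while path[-1] != goal and len(path) < max_steps:
        nxt = improved_apf_step(path[-1], potential, visited, blocked)
        path.append(nxt)
        visited.add(nxt)
    return path

# Illustrative quadratic potential pulling towards (5, 5) on an integer grid
potential = lambda p: (p[0] - 5) ** 2 + (p[1] - 5) ** 2
print(improved_apf_path((0, 0), (5, 5), potential))
```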

3.4 APF with Pruning Path

The main aim of the Improved APF is to solve the local minima, oscillation, and
goal non-reachable problems. However, the path length generated by the APF is
non-optimal. In addition, to ensure that the mission of the robot can be carried out
successfully, other factors such as energy saving need to be taken into account. This
could be realized if the path can be shortened. Therefore, an alternative technique
known as path pruning has been applied to address this issue.
Debnath et al. have mentioned that APF is effective in finding a shorter path [18].
Omar et al. proposed path pruning for the path planning problem using the
probabilistic roadmap (PRM) to produce a path with a shorter length [19]. Li et al.
came out with an Efficient Improved Artificial Potential Field Based Simultaneous
Forward Search (Improved APF-based SIFORS) method for robot path planning,
which redefined the potential function to calculate a valid path and consequently
shortened the length of the planned path [20]. Lifen et al. improved the APF by
changing the repulsive potential function, which helps the UAV avoid collision with
obstacles effectively and find the optimal path [6].

Fig. 6 Proposed Improved APF method that solves the limitations of the potential field method

3.5 Algorithm for Improved APF Method with Path Pruning

In this paper, a path pruning technique is used to shorten the existing path. The
flowchart shown in Fig. 7 illustrates the process of pathfinding using Improved APF
with path pruning.
Let the path W consist of waypoints {Pi, Pi+1, Pi+2, ..., Pn}, where Pi is the starting point and
Pn is the target point. The path pruning process starts by checking whether there are any
obstacles between waypoints Pi and Pi+1. Pi+1 is eliminated if no obstacle is detected between
Pi and Pi+1, and the obstacle check then proceeds between Pi and Pi+2. Otherwise, Pi+1 is
maintained as one of the waypoints of W, and the above process continues from Pi+1. The
process proceeds until Pi = Pn.
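A minimal sketch of this pruning loop is given below, assuming a collision_free(p, q) predicate that reports whether the straight segment between two waypoints is free of obstacles; how that check is implemented (e.g. grid ray casting) is not specified in the paper.

def prune_path(waypoints, collision_free):
    # Remove intermediate waypoints that can be bypassed by a straight segment
    pruned = [waypoints[0]]
    i, j = 0, 1
    while j < len(waypoints):
        if j + 1 < len(waypoints) and collision_free(waypoints[i], waypoints[j + 1]):
            j += 1                       # waypoint j can be skipped
        else:
            pruned.append(waypoints[j])  # keep waypoint j and continue from it
            i, j = j, j + 1
    return pruned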

Fig. 7 Algorithm of path pruning based on improved APF



4 Simulation Results and Discussion

Simulation of the proposed algorithm has been carried out using MATLAB R2016a on a PC
with an Intel i5-4200U 1.6 GHz CPU and Windows 10 OS. The range of the environment R is
set to 100 units, with the number of obstacles O varied from 25 to 125. The coefficients Kg and
Ko for calculating the attractive and repulsive forces are set based on Eqs. (8) and (9) to
282.843 and 15.687, respectively. The performance of the proposed algorithm is evaluated in
terms of the following (a sketch of how these metrics can be measured is given after the list):
i- Local minima
ii- Path length
iii- Computational time
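For reference, the path length metric can be taken as the cumulative Euclidean distance along the waypoints, and the computational time as the wall-clock time of one planning call; the helpers below are only a sketch of that measurement, not the authors' MATLAB code.

import math, time

def path_length(waypoints):
    # Sum of Euclidean distances between consecutive waypoints (grid units)
    return sum(math.dist(p, q) for p, q in zip(waypoints, waypoints[1:]))

def timed(planner, *args):
    # Wall-clock computational time of a single planning call
    t0 = time.perf_counter()
    path = planner(*args)
    return path, time.perf_counter() - t0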
Figure 8 shows the comparison of the simulation results of the traditional APF (blue line) and
the Improved APF (magenta line). As can be seen from the scenario in Fig. 8(a), the Improved
APF manages to overcome the local minima problem, and the robot reaches the goal. The red
dots refer to the areas of local minima that have been addressed successfully. Figure 8(b)
illustrates the 3D representation of the scenario. The subplot of the altitude of the waypoints is
depicted in Fig. 8(c), where the robot moves from the highest value (initial point) to the lowest
value (target point).
With different numbers of obstacles, i.e., 25, 50, 75, 100, and 125, the resulting paths are
shown in Fig. 9(a)–(e), respectively. Referring to the subplots of each scenario, the magenta
lines show the paths planned based on the Improved APF, and the blue lines represent the
pruned paths. It is clearly shown that the algorithm manages to address the local minima,
oscillation, GNRON, and narrow passage problems. Besides that, the resulting paths are
shorter due to the application of the path pruning technique.

Fig. 8 Comparison between the traditional APF (blue line) and improved APF (magenta line)
simulation results, a Improved APF overcomes the local minima problem, b 3D representation and
c Robot movement waypoints

Fig. 9 Paths generated by the Improved APF (magenta lines); the pruned paths (blue lines) with a
number of obstacles, a 25 Obstacles, b 50 Obstacles, c 75 Obstacles, d 100 Obstacles and e 125
Obstacles


The computational time and path length of the proposed algorithm are summarized in Table 1.
The overall simulation results show the path length and computational time of the Improved
APF with path pruning in each scenario; the computational time is longer when local minima
occur.
Referring to the Improved APF performance, for 25 and 50 obstacles there are no local
minima. For 75 obstacles in the environment, the generated path is relatively long due to the
local minima problems (red dots). The robot removes the previous waypoints to avoid
repeating the local minimum point, and then needs to move to the lowest point from the
midpoint. It can be seen that the robot struggles to exit from the local minima. As a result, the
computation time increases dramatically.

Table 1 The performance of Improved APF and pruning path

Number of    Path length of        Pruned path      Computation time of    Computation time of
obstacles    Improved APF (unit)   length (unit)    Improved APF (s)       pruned path (s)
25           193.807               153.270          14.127                 0.323
50           208.056               143.532          17.811                 0.431
75           431.686               187.355          48.695                 1.674
100          257.863               160.987          27.771                 0.971
125          274.967               172.536          32.419                 0.892

On the other hand, the paths generated in the environments with 100 and 125 obstacles are
considered moderate. In these cases, the local minima problems still occur, but the robot
manages to address them.

5 Conclusion and Future Work

The Improved APF with path pruning has been proposed for robot path planning in a known
environment. The proposed method finds a valid, feasible, and shorter solution for the robot
mission and consumes low computation time, which is vital for real-time path planning
applications. The Improved APF has also been shown to address the problems faced by the
APF method. With the proposed algorithm, the criteria for the path planning problem have
been fulfilled. In future work, the Improved APF with path pruning will be enhanced by
considering a specific region to improve the algorithm's speed. This research will also focus on
cooperative techniques for multi-robot path planning.

Acknowledgements The authors would like to express their appreciation to Universiti Tun Hussein Onn
Malaysia (UTHM) and the Research Management Centre (RMC) for the supporting fund under TIER-1 VOT H131.

References

1. Hasircioglu I, Topcuoglu HR, Ermis M (2008) 3-D path planning for the navigation of
unmanned aerial vehicles by using evolutionary algorithms. In: Proceedings of the conference
on genetic and evolutionary computation, pp 1499–1506
2. Omar RB (2011) Path planning for unmanned aerial vehicles using visibility line-based
methods. Control and Instrumentation Research Group, Department of Engineering, University
of Leicester, March 2011
3. Sabudin EN, Omar R, Che Ku Melor CKANH (2016) Potential field methods and their
inherent approaches for path planning. ARPN J Eng Appl Sci 11(18):10801–10805
4. Borenstein J, Koren Y (1991) Potential field methods and their inherent limitations for mobile
robot navigation, April 1991, pp 1398–1404

5. Cen Y, Wang L, Zhang H (2007) Real-time obstacle avoidance strategy for mobile robot
based on improved coordinating potential field with genetic algorithm. In: IEEE international
conference on control applications, October 2007
6. Lifen AL, Rouxin BS, Shuandao CL, Jiang DW (2016) Path planning for UAVS based on
improved artificial potential field method through changing the repulsive potential function.
In: IEEE Chinese guidance, navigation and control conference (CGNCC), 12–14 August
2016
7. Liu Y, Zhao Y (2016) A virtual-waypoint based artificial potential field method for UAV path
planning. In: Proceedings of 2016 IEEE Chinese guidance, navigation and control conference,
12–14 August 2016
8. Khatib O (1985) Real-time obstacle avoidance for manipulators and mobile robots. In:
Proceedings of the IEEE international conference on robotics and automation, pp 500–505
9. Mei W, Su Z, Tu D, Lu X (2013) A hybrid algorithm based on artificial potential field and
BUG for path planning of mobile robot. In: 2nd international conference on measurement,
information and control
10. Wang S, Min H (2013) Experience mixed the modified artificial potential field method. In:
IEEE/RSJ international conference on intelligent robots and systems (IROS), 3–7 November
2013
11. Mei JH, Arshad MR (2015) A balance-artificial potential field method for autonomous surface
vessel navigation in unstructured riverine environment. In: IEEE international symposium on
robotics and intelligent sensors (IRIS)
12. Li G, Tamura Y, Yamashita A, Asama H (2012) Effective improved artificial potential
field-based regression search method for robot planning. In: IEEE international conference on
mechatronic and automation, 5–8 August 2012
13. Li G, Tamura Y, Yamashita A, Asama H (2013) Effective improved artificial potential
field-based regression search method for autonomous mobile robot path planning. Int J
Mechatron Autom 3(3):141–170
14. Sfeir J, Saad M, Saliah-Hasane H (2011) An improved potential field approach to real-time
mobile robot path planning in an unknown environment. In: IEEE international symposium
on robotic and sensors environments (ROSE)
15. Park JW, Kwak HJ, Kang YC, Kim DW (2016) Advanced fuzzy potential field method for
mobile robot obstacle avoidance. J Comput Intell Neurosci 2016. Article No. 10
16. Godrich MA. Potential Field Tutorial. https://pdfs.semanticscholar.org/725e/fa1af22f41dcbe
cd8bd445ea82679a6eb7c6.pdf. Accessed 29 Aug 2019
17. Robot Motion Planning and Control. Potential Field. https://sebastian-hoeffner.de/uni/
ceng786/index.php?number=2. Accessed 29 Aug 2019
18. Debnath SK, Omar RB, Abdul Latip NB (2019) A review on energy efficient path planning
algorithms for unmanned air vehicles. In: Computational science and technology. Springer,
Singapore
19. Omar RB, Che Ku Melor CKNAH, Sabudin EN (2015) Performance comparison of path
planning methods. ARPN J Eng Appl Sci
20. Li G, Tong S, Lv G, Xiao R, Cong F, Tong Z, Yamashita A, Asama H (2015) An improved
artificial potential field-based simultaneous forward search (improved APF-based SIFORS)
method for robot path planning. In: The 12th international conference on ubiquitous robots
and ambient intelligence (URAI), 28–30 October 2015
Development of DugongBot Underwater
Drones Using Open-Source Robotic
Platform

Ahmad Anas Yusof, Mohd Khairi Mohamed Nor,


Mohd Shahrieel Mohd Aras, Hamdan Sulaiman, and Abdul Talib Din

Abstract This paper presents the development and fabrication of an open-source,
do-it-yourself underwater drone called DugongBot, which is developed in collaboration with
the Underwater Technology Research Group (UTeRG), Universiti Teknikal Malaysia Melaka.
Research institutes and hobbyists have shown a growing interest in the development of micro
observation-class remotely operated vehicles (micro-ROVs) using open-source platforms.
Currently, OpenROV and ArduSub are the low-cost open-source solutions available for such
ROVs. These open-source hardware and software platforms are being used worldwide for the
development of a small range of electrically powered ROV system architectures, with support
from the literature on the internet and the extensive experience acquired with the development
of robotic exploration systems. This paper presents the development of DugongBot, which
uses the OpenROV open-source platform. Weighing approximately 3 kg and designed for
100 m depth, the drone uses a single 18 cm long, 10 cm diameter watertight tube to
accommodate the main electronics compartment, which can be tilted up and down with a
servo for CMOS-sensor HD webcam alignment. Two horizontal thrusters for forward, reverse
and rotational movement and a vertical thruster for depth control are used for manoeuvrability.

Keywords Micro-ROV · OpenROV · Underwater drones · Open-source

A. A. Yusof (&) · M. K. M. Nor · M. S. M. Aras


Faculty of Electrical Engineering, Universiti Teknikal Malaysia Melaka, Hang Tuah Jaya,
76100 Durian Tunggal, Melaka, Malaysia
e-mail: [email protected]
A. A. Yusof · M. K. M. Nor · M. S. M. Aras
Centre for Robotics and Industrial Automation, Universiti Teknikal Malaysia Melaka,
Hang Tuah Jaya, 76100 Durian Tunggal, Melaka, Malaysia
H. Sulaiman  A. T. Din
Faculty of Mechanical Engineering, Universiti Teknikal Malaysia Melaka, Hang Tuah Jaya,
76100 Durian Tunggal, Melaka, Malaysia

© Springer Nature Singapore Pte Ltd. 2021 129


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_10

1 Introduction

Open-source robotic platforms for underwater robotics have provided a high return on
investment for the scientific community. There is now significant evidence that this sharing
concept has created a scenario in which underwater technology can be studied, modified,
created, and distributed by anyone. Thus, micro-ROVs, or underwater drones, are increasingly
popular due to the growing interest of researchers who use the open-source platform [1, 2].
The platform has led to the development of various low-cost underwater drones for hobbyists,
such as the OpenROV Trident, Gladius Mini and Geneinno Poseidon, that serve a wide variety
of purposes in capturing footage of the underwater environment for scientific exploration,
industrial inspection and military surveillance [3–9]. The availability of open-source platforms
also gives students the opportunity to develop underwater robots for underwater vehicle
competitions around the globe [10–14]. These electrically powered vehicles can weigh as little
as 2 kg and are generally smaller in size, which is suitable for backpack storage. They are
generally limited to depth ratings of less than 100 m due to limitations in underwater pressure
tolerance and power-to-weight ratios. They can be easily hand-launched from the surface, use
a simple tether system, and can sometimes be connected wirelessly through a floating buoy at
the surface. This ensures a continuous live video feed from the drones and, more importantly,
avoids losing the drones in the deep ocean. Most of them are also equipped with powerful
headlamps, providing visibility in dark and murky underwater conditions. They also use 4K
cameras for high-quality image capture, FPV goggles for a first-person-view experience and a
simple robotic arm for underwater sampling. Figure 1 shows a price comparison of selected
small ROVs and underwater drones in Malaysian ringgit.

Fig. 1 Price comparison of small ROVs [15]



Thus, in this paper, the review and development of an underwater drone using open-source
platforms and solutions are presented and evaluated. Named DugongBot, the underwater drone
serves as the first generation of low-cost drones developed in-house at UTeRG.

2 DugongBot Development

Dugong, as shown in Fig. 2, is a species of sea cow found throughout the warm latitudes of the
Indian and western Pacific Oceans. It can be found in the coastal areas of Malaysia, and has
been categorized as decreasing in numbers in the International Union for Conservation of
Nature's Red List of Threatened Species [16]. In support of dugong protection throughout the
world, the underwater drone in this project is called DugongBot, as shown in the CAD design
in Fig. 3.

2.1 Hardware Development

The DugongBot uses a BeagleBone Black single-board computer as its processor, integrated
with an Arduino Mega microcontroller for sensor detection and thruster control. It can be
tele-operated using either a gamepad or a keyboard to control the vehicle's movement, and it
works with any Windows-compatible gamepad. The DugongBot uses an inertial measurement
unit (IMU) and a pressure sensor for movement and depth calibration; the IMU uses a
single-axis rate gyroscope to measure the yaw rate and a two-axis accelerometer to measure
the roll and the pitch. The system has a maximum operational pressure of 30 bar for depth
capability and a magnetometer compass. A 1080p high-definition webcam with a 120-degree
field of view is used in the telemetry system through I2C protocols for laptop display. Three
thrusters are used for forward, upward and downward movement. The topside control
hardware contains the electronic equipment needed to communicate with the drone. The
controller board, which is designed based on the Arduino Mega configuration, manages the
low-level input commands from the IMU and pressure sensors and the output commands to
the motors/thrusters and lights.

Fig. 2 Dugong

Meanwhile, the BeagleBone Black processes the underwater footage using mjpg-streamer.
The topside interface board provides an Ethernet connection between the drone and the
laptop. The drone uses a micro USB power supply that can provide at least 500 mA to the
topside interface board. It has been documented in the OpenROV support group forum that the
topside interface board can be connected wirelessly by implementing a small modification
[17]. Table 1 shows the specification of the DugongBot 1.0.

Fig. 3 DugongBot CAD design

Table 1 DugongBot specification


Name DugongBot 1.0
Dimension 25 (H) × 30 (W) × 45 (L) cm
Weight 3 kg
Hull Poly(methyl methacrylate) (Acrylic)
Frame Polyvinyl chloride (PVC) pipe
Thrusters 3 thrusters
ESCs Afro ESC 12amp
Controller Arduino Mega–based OpenROV microcontroller
Processor Beaglebone Black
Software OpenROV Cockpit, Node.JS, mjpg-streamer, Socket.IO,
Batteries 2500mAh, 9.6 V, 26650, LiFePO4
Sensors OpenROV IMU (add-on)
Tether Ethernet 2 wire
Ballast Lead
Camera HD Camera on tilt servo

2.2 Software Development

OpenROV itself is a company that produces underwater exploration devices, located in
Berkeley, California, and founded in 2011. In 2019, ocean data startup Spoondrift and
OpenROV announced their merger into a new company known as Sofar Ocean Technologies.
Since then, support for OpenROV 2.8 has been unavailable from the OpenROV website due to
the merger. However, despite the fact that OpenROV has merged into the new company,
SOFAR, whose current focus is on marketing the OpenROV Trident and the intelligent
Spotter buoy, the support and documentation for OpenROV 2.8 and older versions can still be
downloaded from GitHub and Dozuki. GitHub is a hosting platform for software development,
which offers distributed version control and source code management for many software
developers, including OpenROV. The GitHub OpenROV community is managed by a DIY
community centred on underwater robots for exploration and adventure. The community is a
group of amateur and professional ROV builders and operators from over 50 countries who
have a passion for underwater robotics. Dozuki is a cloud-based platform that provides access
to step-by-step manuals for repair, process tracking, training and work instructions. Both
platforms provide a good community and support group for OpenROV documentation.
Almost 30 guides are available for the step-by-step development of the OpenROV in Dozuki
alone. Figure 4 shows some of the open-source support for the project.

Fig. 4 Open Source support



3 Drone Testing

3.1 Camera Function with Software

DugongBot uses an ultra-wide-angle full HD webcam. This camera enables the user to
experience live video streaming to explore the underwater environment and capture photos.
The camera can also detect objects and be remotely tilted from 25 to 30° upward and 60°
downward, and it provides a 120° wide view. The battery enables the camera to function for
up to 3 h. The movement of the camera is controlled by the keyboard, whereby the Q key
controls the downward movement, the T key controls the upward movement and the I key
controls the lights. The visual interface for the OpenROV platform is known as the Cockpit, as
shown in Fig. 5, which provides information on depth, heading, battery voltage and
consumption, and flight time to the operator. It also provides the graphical user interface to the
operator. The cross-platform JavaScript run-time environment Node.js is used to send
commands through the keyboard by using an HTML5 one-page application in a supported
browser. ROV connection is possible by setting the topside computer to a static IP address on
the same subnet as the ROV's built-in static IP address. The ROV's static IP address is
192.168.254.1; the last number of the topside address must be set to a value other than 1, and
the subnet mask needs to be set to 255.255.255.0.
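As a simple illustration of the topside connection, the snippet below opens the camera feed from a laptop configured on the same subnet; the stream URL follows the common mjpg-streamer convention and should be treated as an assumption, since the exact port and path depend on the OpenROV software version.

import cv2  # pip install opencv-python

ROV_IP = "192.168.254.1"                                  # the ROV's built-in static address
STREAM_URL = f"http://{ROV_IP}:8080/?action=stream"       # assumed mjpg-streamer endpoint

cap = cv2.VideoCapture(STREAM_URL)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imshow("DugongBot camera", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):                 # press q to quit
        break
cap.release()
cv2.destroyAllWindows()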

Fig. 5 Camera function using OpenROV cockpit platform



The drone is connected via an Ethernet tether to transfer data, and does not require any
downloaded software or an internet connection to operate. The Ethernet protocol is used to
connect the DugongBot to a computer via the tether. The BeagleBone Black in the drone runs
the web server, the browser on the computer connects to it, and the two communicate using
Socket.IO, a JavaScript library that enables bidirectional, real-time, event-based
communication. The DugongBot's controller board, which is designed based on the Arduino
Mega configuration, manages the low-level input commands from the IMU and pressure
sensors and the output commands to the motors/thrusters and lights, while the BeagleBone
Black processes the underwater footage using mjpg-streamer. The DugongBot's topside
interface board provides an Ethernet connection between the ROV and the laptop, as shown in
Fig. 6.

Topside connection layout (Fig. 6): topside computer — Ethernet (RJ45) — Tenda adapter and
topside interface board — two-wire Ethernet tether — drone, with an optional gamepad controller.

Fig. 6 DugongBot version 1.0



Fig. 7 DugongBot thrusters

3.2 Thrusters Functions

The low-cost brushless motors are a good choice for the thrusters, but the motors may have a
limited life when used in salt-water environments. Nevertheless, proper maintenance will
enhance their life expectancy. All the thrusters are wired to the input power and controlled by
the keyboard, which enables the user to control the movement from the topside. The input
power is provided by 2500 mAh, 9.6 V, 26650 LiFePO4 batteries; the system can also be
tested with a 12 V power supply. The rotation and movement effect of each thruster needed to
be identified in order to align them together. A thruster is connected to the left Shift key on the
keyboard for an anticlockwise rotation, which is used for forward drone movement. The right
Shift key commands a clockwise rotation on the same thruster, which produces a backward
movement. In general, the Up, Down, Left, Right, Shift and Ctrl keys can be used to
manoeuvre the DugongBot. Figure 7 shows the thrusters used in the drone.
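For illustration only, the key assignments described in this section can be summarised as a lookup table; the command descriptions below are placeholders rather than OpenROV Cockpit's actual message format, and the arrow/Ctrl keys are listed without details because the text does not specify their exact mapping.

# Hypothetical summary of the keyboard mapping described above
KEY_COMMANDS = {
    "Shift_Left":  "thruster anticlockwise (forward movement)",
    "Shift_Right": "thruster clockwise (backward movement)",
    "T":           "tilt camera up",
    "Q":           "tilt camera down",
    "I":           "toggle lights",
    # Up/Down/Left/Right/Ctrl: additional manoeuvring keys (mapping not detailed here)
}

def describe(key):
    return KEY_COMMANDS.get(key, "no command assigned")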

3.3 Buoyancy

An underwater drone that is stable and does not tip over is very important. The DugongBot
must be buoyant enough that it can be manoeuvred up or down easily without using too much
energy. The objective of the development is also to produce a well-balanced underwater drone
structure that is naturally buoyant below the water surface.

Fig. 8 DugongBot in action

During the first trial, the underwater drone was partially submerged in the water, but not in a
stable condition; the left side was heavier than the right side. Later on, some weight was
introduced as ballast, with one piece at the front and two at the sides. The result is a naturally
buoyant DugongBot, as shown in Fig. 8.

4 Conclusion

The development of the DugongBot underwater drone using a low-cost open-source robotic
platform has been successfully implemented. The underwater drone has been designed for
manoeuvrability, performance and underwater footage capability. This project will benefit
related underwater industries by demonstrating small underwater drone features with
minimum-cost implementation. In this paper, an open-source prototype for building low-cost
underwater drones and for customizing their thruster and ballast configurations has been
successfully tested using a three-propeller underwater drone based on open-source hardware
and software solutions. Nonetheless, further tests in deeper waters and under different frame
configurations will be undertaken in the near future.

Acknowledgements The authors wish to thank Ministry of Education (MOE) and Universiti
Teknikal Malaysia Melaka for their support.

References

1. Aristizábal LM, Rúa S, Gaviria CE, Osorio SP, Zuluaga CA, Posada NL, Vásquez RE (2016)
Design of an open source-based control platform for an underwater remotely operated vehicle.
DYNA 83(195):198–205
2. Schillaci G, Schillaci F, Hafner VV (2017) A customisable underwater robot. arXiv abs/
1707.06564
3. OpenROV Trident. https://www.sofarocean.com/products/trident. Accessed 10 Oct 2019
4. Fathom One. https://www.kickstarter.com/projects/1359605477/fathom-one-the-affordable-
modular-hd-underwater-dr. Accessed 10 Oct 2019
5. Geneinno Poseidon. https://www.geneinno.com/poseidon.html. Accessed 10 Oct 2019
6. BlueROV2. https://www.bluerobotics.com/store/rov/bluerov2/. Accessed 10 Oct 2019
7. Aras MSM, Azis FA, Othman MN, Abdullah SS (2012) A low cost 4 DOF remotely operated
underwater vehicle integrated with IMU and pressure sensor. In: 2012 4th international
conference on underwater system technology: theory and applications (USYS 2012), Shah
Alam, Malaysia
8. Zain ZMd, Noh, MM, Ab Rahim KA, Harun N (2016) Design and development of an
X4-ROV. In: IEEE 6th international conference on underwater system technology: theory &
applications, Penang, Malaysia
9. Mainong AI, Ayob AF, Arshad MR (2017) Investigating pectoral shapes and locomotive
strategies for conceptual designing bio-inspired robotic fish. J Eng Sci Technol 12(1):001–014
10. Singapore Autonomous Underwater Vehicle Challenge (2017). https://sauvc.org/. Accessed
10 Oct 2019
11. Malaysia Autonomous Underwater Vehicle Challenge (2018). http://oes.ieeemy.org/.
Accessed 10 Oct 2019
12. Yusof AA, Nor MKM, Shamsudin SA, Alkahari MR, Mohd Aras MS, Nawawi MRM (2018)
Facing the autonomous underwater vehicle competition challenge: the TUAH AUV
experience. In: Hassan M (eds) Intelligent manufacturing & mechatronics. Lecture notes in
mechanical engineering. Springer, Singapore
13. Yusof AA, Nor MKM, Shamsudin SA, Alkahari MR, Musa M (2018) The development of
PANTHER AUV for autonomous underwater vehicle competition challenge 2017/2018. In:
Hassan M (eds) Intelligent manufacturing & mechatronics. Lecture notes in mechanical
engineering. Springer, Singapore
14. Yusof A, Kawamura T, Yamada H (2012) Evaluation of construction robot telegrasping force
perception using visual, auditory and force feedback integration. J Robot Mechatron
24(6):949–957
15. Sulaiman H, Nor MKM, Yusof AA, Aras MSM, Mohamad Ayob AF (2019) Low cost
observation class remotely operated underwater vehicle using open-source platform: a
practical evaluation between Openrov And Bluerov. In: International conference on ocean,
engineering technology and environmental sustainability (I-OCEANS 2019), Kuala
Terengganu, Malaysia
16. IUCN Red List of Threatened Species. https://www.iucn.org/ur/node/24442. Accessed 10 Oct
2019
17. Jakobi N. Guide ID 59. How to build a WiFi enabled Tether ManagementSystem. https://
openrov.dozuki.com/Guide/How+to+build+a+WiFi+enabled+Tether+Management+System/
59. Accessed 10 Oct 2019
Development of Autonomous
Underwater Vehicle for Water Quality
Measurement Application

Inani Yusra Amran, Khalid Isa, Herdawatie Abdul Kadir,


Radzi Ambar, Nurul Syila Ibrahim, Abdul Aziz Abd Kadir,
and Muhammad Haniff Abu Mangshor

Abstract Autonomous Underwater Vehicles (AUVs) are unmanned, self-propelled vehicles
typically deployed from a surface vessel and capable of operating independently of that vessel
for periods of several hours to several days. This project presents the development of an AUV
with a pH sensor, temperature sensor, and turbidity sensor to measure water quality. The
existing method is a conventional approach, where a scientist has to go to the site and collect a
water sample to measure the quality. It requires more time to gather the data and lacks the
capability for real-time data capture. Thus, through the innovation and idea of this project, a
scientist can measure the water quality in real time, autonomously and more easily than with
the conventional method. In this project, two thrusters placed on the sides of the AUV control
its horizontal motion, with the guidance of a digital magnetic compass to control the direction
of the AUV. The vertical movement of the AUV is controlled by two thrusters located at the
bottom of the AUV with the help of a depth sensor to ensure that the AUV remains
submerged. A pH sensor is used to detect whether the water is closer to acidic, alkaline or
normal values. The temperature sensor is used to sense the water temperature. The turbidity
sensor is used to detect the cloudiness of the water, whether murky or clear. These three
sensors start operating when the microcontroller powers up. The AUV was tested in the G3
lake at UTHM to test its ability to stay submerged and its functionality to measure the water
quality parameters. The AUV successfully carried out the given task without requiring the
intervention of an operator. Future researchers can improve the AUV's design to make the
AUV work more efficiently.

Keywords Autonomous Underwater Vehicle · Water quality measurement · Water quality sensor

I. Y. Amran · K. Isa (&) · H. A. Kadir · R. Ambar · N. S. Ibrahim · A. A. A. Kadir ·
M. H. A. Mangshor
Faculty of Electrical and Electronic Engineering, Universiti Tun Hussein Onn Malaysia,
86400 Parit Raja, Batu Pahat, Johor, Malaysia
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2021 139


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_11

1 Introduction

1.1 Project Background

An Autonomous Underwater Vehicle (AUV), also known as an unmanned underwater vehicle,
is a robot that operates underwater without requiring commands from an operator. An AUV is
different from a Remotely Operated Vehicle (ROV); the difference lies in how the two robots
are operated. An AUV works independently of humans, while an ROV is an unoccupied
underwater robot linked to a vessel by a series of cables [1]. An AUV submerges with its
mission requirements programmed in its code by the user and returns after it finishes and
completes the mission, whereas an ROV transmits all data to the operator through cables that
also convey power and allow the ROV to be controlled by the operator.
AUVs have been used for more and more tasks, with roles and missions continually evolving,
for example in the oil and gas industry. This industry uses AUVs to make detailed seafloor
maps before building subsea infrastructure. Scientists use AUVs for research such as ocean
floor mapping and finding the wreckage of missing aeroplanes, and AUVs can also be used as
a hobby.
Water is a vital resource for every living thing to survive. However, when humans pollute
the water, it becomes unclean, and water problems become widespread. Water contamination
is a primary cause of human disease [2]. Thus, measuring and monitoring water quality is very
crucial. People assess the water quality of contaminated water using conventional methods,
which lack the capability for real-time data capture. Traditional techniques of collecting,
testing, and analysing water samples in water laboratories are not expensive but lack the
capacity to collect, analyse, and rapidly disseminate information in real time [3]. Several
procedures need to be completed before the data come out. Many scientists collect water
samples only from lake banks and the surface of the water, and they also wait for good
weather to go out and collect the water samples. The collected water is tested and analysed in
the laboratory, which takes time to produce a result. Through this process, the result is not
real-time data because the conventional process takes time to analyse.
For traditional tools, scientists use litmus paper (pH strip paper) or a membrane-based kit.
Litmus paper is made of a lichen-based dye, turning purple in acid (pH < 6.0) and green in a
base (pH > 8.0) [4]. The litmus paper only needs to be dipped into the collected water, and the
paper changes colour according to the pH indicator, which covers a specific range of pH
values. A membrane-based kit is also a type of strip paper, which contains tetrazolium dye and
a carbon source. The kit only requires the water sample to be kept on it, and the colour
development is observed [5]. The traditional pH tools need time to produce a result, because
the litmus paper changes colour only after it is dipped into the collected water, and the colour
then needs to be matched against the pH indicator to determine whether the water is acidic,
alkaline or normal.
The objective of this project is to develop a functional prototype of an AUV for the water
quality measurement application, where an AUV that consists of a pH sensor, turbidity sensor,
and temperature sensor is a new idea and innovation that makes it easier for scientists to carry
out measurement tasks. With this innovation, the AUV can collect water quality data both on
the surface of the water and underwater. The data are recorded and stored in a data logger, and
the recorded data can be retrieved by removing the memory card inside the data logger. With
this kind of innovation, the data produced are close to real-time data capture; the performance
of the AUV and the effectiveness of the water quality measurement are also analysed.
Overall, this paper is structured into five parts. Section 1 presents the introduction of this
project; the problem statements and goals are discussed, including a review of related prior
work. Section 2 introduces the project methodology, including the system layout and a few
project trials. Section 3 addresses the outcomes and analyses; the gathered information is
discussed in detail in this section. Towards the completion of this project, Sect. 4 discusses the
project limitations, and Sect. 5 presents the future work to enhance this project.

1.2 Previous AUV with Water Quality Sensor

This section addresses relevant and important previous research that offers a detailed and
systematic perspective for the literature review on underwater vehicles.
Komaki [6] was concerned with the design and creation of an AUV specifically designed
for entry into hydrothermal settings over complicated, wide-depth seabed topography. Such
vehicles can approach very close to the vent fields and carry various types of chemical
sensors. Okamura developed MINIMONE (Mini Monitoring Equipment) for collecting water
samples. MINIMONE analyses various water characteristics such as water density, pH,
dissolved inorganic carbon, nutrients, iron and manganese. This AUV operates underwater.
The advantage of the AUV Urashima is that the information is logged every second;
meanwhile, the disadvantage is that the AUV is 10 m in length, as shown in Fig. 1.
Takeuchi [7] described the design and implementation of a Solar-Powered Autonomous
Surface Vehicle (SASV), as shown in Fig. 2. The SASV measured depth, temperature,
turbidity, conductivity, dissolved oxygen, and chlorophyll. The ultimate objective of that study
is to create an index of ocean ecosystem soundness and to suggest preventive steps to avoid
collisions between fast passenger vessels and large whales. The SASV operates on the sea
surface. The advantage of this project is that it is solar-powered, and the disadvantage is that
the data are collected only on the water surface.

Fig. 1 AUV Urashima [6]

Fig. 2 Solar-powered ASV [7]

An innovative project was created by Helmi [8] to monitor water quality in continental, coastal
and lake regions. The parameters for this project are pH, Oxidation Reduction Potential
(ORP) and water temperature, where these water quality sensors are attached to a buoy. This
project operates on the water surface, as shown in Fig. 3. The benefit of the portable buoy is
that information is obtained from the buoy in real time, and the disadvantage is that the data
are collected only on the water surface.
Prasad [9] stated that Internet of Things (IoT) and Remote Sensing (RS) methods are
commonly used to monitor, collect and analyse information from remote places. The
researchers developed a Smart Water Quality Monitoring system to analyse the water
parameters shown in Fig. 4. That project aims to develop a technique for monitoring the
quality of seawater, surface water, tap water and polluted stream water in an attempt to help
manage water pollution using IoT and RS technologies.

Fig. 3 Mobile buoy [8]

Fig. 4 Smart water quality monitoring system [9]

The benefit of the Smart Water Quality Monitoring system is that the information is stored
onboard on an SD card or sent to a File Transfer Protocol (FTP) or cloud server, and the
disadvantage is that the data can only be taken at one point at a time.
Kafli [10] mentioned that environmental monitoring is the process of characterising and
monitoring environmental quality, such as air quality and water quality. Furthermore,
environmental monitoring is used to prepare environmental impact assessments and in many
cases where human activities pose a risk of damaging the natural environment. The authors
developed a floating platform to observe the air and the water, as shown in Fig. 5. This device
monitors parameters such as temperature, humidity, latitude and longitude, water pH, date and
time, and carbon monoxide. The benefit of this project is that the information is saved every
10 min to an SD card in .txt format [11], and the weakness is that the water quality data are
collected only at the water surface.
Niswar [12] studied soft-shell crab farming in south-east Asia, for example in Indonesia. Poor
water quality in crab farming raises the mortality rate in the crab ponds. The authors proposed
the design and implementation of a water quality monitoring system for crab farming using
IoT technology to raise awareness among farmers about maintaining acceptable water quality
levels in the pond.

Fig. 5 Floating platform for environment monitoring [10]

Fig. 6 IoT-based water quality monitoring system for soft-shell crab farming [12]

The parameters used in this project are temperature, salinity, and pH. The sensors are deployed
at the bottom of the pond, as shown in Fig. 6. The advantage of this project is that the sensed
data are transmitted via a ZigBee network and stored in a cloud database, and the disadvantage
is that the data are collected only on the water surface.

2 Methodology

2.1 Project Design

In order to attain its goals, this project is divided into several stages to ensure that the design of
the project can be carried out smoothly. The phases can be described in three sections: the first
section is modelling, the second section is design and development, and the third section is
testing and analysis. Figure 7 shows the sequence plan of the AUV project.
The first phase of this project is modelling, where the AUV system architecture and
mechanical assembly drawing are designed. Computer-aided software, SolidWorks, is used to
draw the 3D model and design the proposed AUV structure. Phase 2 is to design and develop
the AUV, which consists of hardware development, software development, and integration,
covering the internal and external mechanical design and the electrical design. Phase 3 is to
test and analyse the components of the AUV. Three tests have been focused on: a lake test, a
buoyancy test, and a leaking test.
Figure 8 demonstrates the sensor flowchart of the AUV for the water quality measurement
application. The sensors are switched on with the connected parts and sense the surroundings.
The information gathered from the pH sensor, temperature sensor, and turbidity sensor is
stored in the data logger every 1 s. If the data are not collected or the result is not accurate, all
sensor connections need to be troubleshot.
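A minimal sketch of this 1 s logging loop is shown below; the real firmware runs on the Arduino Mega in C/C++, so this Python version only illustrates the logic, and the three read_* helpers are placeholders for the actual sensor drivers.

import time

def read_ph():          return 7.0    # placeholder for the analogue pH sensor driver
def read_temperature(): return 25.0   # placeholder for the TSYS01 temperature driver
def read_turbidity():   return 2.5    # placeholder for the turbidity sensor driver (V)

def log_water_quality(logfile="wq_log.txt"):
    # Append one timestamped sample per second, as in the sensor flowchart
    with open(logfile, "a") as log:
        while True:
            row = (time.strftime("%H:%M:%S"), read_ph(),
                   read_temperature(), read_turbidity())
            log.write(",".join(str(v) for v in row) + "\n")
            log.flush()
            time.sleep(1)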

Fig. 7 Sequence plan of project



Fig. 8 Sensor flowchart

Fig. 9 System flowchart

Figure 9 shows the system flowchart for the operation of the AUV. After it is fully in the
water, the AUV switches on automatically. The compass navigates the AUV underwater while
the depth sensor helps to keep the AUV submerged. When the direction of the AUV changes,
the horizontal thrusters correct the AUV so that it returns to its commanded heading. At the
same time, the vertical thrusters push the AUV back down to remain submerged if it resurfaces.
The pH sensor, temperature sensor, and turbidity sensor start operating as soon as the AUV is
switched on.
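The depth- and heading-keeping behaviour described above can be summarised by the sketch below; the tolerance bands and command names are illustrative placeholders, and the actual controller runs on the Arduino Mega.

def control_step(depth, heading, target_depth, target_heading,
                 depth_tol=0.2, heading_tol=5.0):
    # One iteration of the depth/heading-keeping loop of Fig. 9
    if depth < target_depth - depth_tol:
        vertical_cmd = "submerge"        # bottom thrusters clockwise (go deeper)
    elif depth > target_depth + depth_tol:
        vertical_cmd = "ascend"          # bottom thrusters counter-clockwise (float up)
    else:
        vertical_cmd = "hold"

    error = (target_heading - heading + 180) % 360 - 180   # signed heading error in degrees
    if abs(error) <= heading_tol:
        horizontal_cmd = "hold"
    elif error > 0:
        horizontal_cmd = "yaw_left"      # which thruster pair this maps to is a placeholder
    else:
        horizontal_cmd = "yaw_right"
    return vertical_cmd, horizontal_cmd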

2.2 System Design

Figure 10 shows the project's operational block diagram, which consists of the input, process,
and output parts of the project. The input part comprises several sensors, with a battery as the
primary power supply. The processing takes place in the Arduino microcontroller, and the data
logger stores the output. The outcomes of the method are discussed in the results and analyses
in Sect. 3. Several hardware experiments, namely endurance testing, buoyancy testing, and
leakage testing, were performed after the model was constructed. The AUV was tested to
evaluate buoyancy, endurance, and leakage at the Universiti Tun Hussein Onn Malaysia
(UTHM) G3 Lake.
In Fig. 10, the sensors enable the AUV to perceive its surroundings. The sensors in the
input section play a key role in providing the AUV with accurate and detailed environmental
information. The sensors include a pH sensor, turbidity sensor, and temperature sensor. The
pH sensor is used to evaluate the water quality, the turbidity sensor is used to sense the water's
cloudiness, and the temperature sensor is used to detect the water temperature. These three
sensors operate simultaneously when the AUV is switched on. On the other hand, the output
section consists of a memory card and four thrusters; the memory card is used to store all the
data collected from the water quality sensors, and the thrusters are used to stabilise the AUV
and to control its movement.

Fig. 10 Block diagram of system



2.3 Hardware Requirements

The hardware requirements for the AUV project are the actuators and sensors. Figure 11
shows the T100 thruster and its Electronic Speed Controller (ESC). Four thrusters with ESCs
were used in this project. The T100 thruster is a patented underwater marine robotic propeller
with high performance, producing more than 5 lb of thrust, and durable enough to be used at
great depths in the open ocean.
The T100 is made of high-strength, UV-resistant, injection-moulded polycarbonate plastic.
The core of the motor is sealed and protected with an epoxy coating, and it uses
high-performance plastic bearings rather than steel bearings, which rust in salt water.
Everything that is not plastic is high-quality, non-corroding aluminium or stainless steel. The
propeller and nozzle designed for the T100 deliver a reliable and effective thrust, while active
water-cooling helps cool the motor. This model is composed of an electric brushless motor,
ranging from 300 to 4200 rpm, with up to 130 W of output power and 2.36 kgf of nominal
thrust [15]. The T100 can produce thrust in both directions, rotating clockwise (CW) and
counter-clockwise (CCW).
Figure 12 shows the microcontroller used to control the AUV. This board has 54 digital
pins and 16 analogue input pins [16]. The Arduino Mega uses an 8-bit Atmel microcontroller,
the ATmega2560, with 256 KB of flash memory, 8 KB of SRAM, 4 KB of EEPROM, and a
16 MHz clock frequency [17]. The Arduino Mega can be powered with an external power
supply or via a USB connection; the power source is selected automatically. This
microcontroller controls the four thrusters, digital magnetic compass, depth sensor,
temperature sensor, pH sensor, turbidity sensor, IMU module, and data logger.
Figure 13 shows the analogue pH sensor that senses the pH level of the water. This sensor
operates at 5 V, and its measuring range is pH 0 to pH 14. The pH sensor is an alternative to
litmus paper or a pH testing kit, whose colours need to be matched against a pH indicator to
obtain the water quality result. The electrode is made of a sensitive glass membrane with low
impedance, and the pH calibration has a fast response. The pH is a significant parameter for
water quality measurement, and it impacts aquatic animal development and reproduction [18].

Fig. 11 T100 Thruster [13] and ESC [14]

Fig. 12 Arduino Mega 2560 microcontroller

Fig. 13 pH sensor

Figure 14 shows the turbidity sensor used to evaluate the turbidity of the water. Its operation is
based on the principle that the light intensity scattered by the suspended matter is proportional
to its concentration [19]. The turbidity sensor operates at 5 V and 40 mA.
Figure 15 shows the Celsius temperature sensor, also known as the TSYS01. It is a
fast-response, high-precision temperature sensor, sealed from the water, protected by an
aluminium cage and ready to be installed in a waterproof enclosure [20]. The TSYS01 sensor
has a rapid response time, and the entire package is designed to maintain that speed to enable
accurate measurement of the temperature profile even when it drops and rises rapidly.

Fig. 14 Turbidity sensor

Fig. 15 Temperature sensor



3 Results and Analysis

3.1 3D AUV Modeling

This subtopic discusses the tool used for the 3D AUV modelling, namely SolidWorks 2016.
Figure 16 shows the AUV, which is designed with a box shape based on the features required
for the AUV stabilisation system. The AUV mechanical system is designed so that the centre
of buoyancy (COB) is above the centre of gravity (COG). The distance between the COB and
COG is referred to as the metacentric height. The restoring moment returning the vehicle to its
stable orientation is proportional to the metacentric height; as the metacentric height increases,
the hydrostatic stability increases. In addition, the COB and COG locations must be aligned in
the vertical direction so that the vehicle experiences no moment when the vehicle's pitch and
roll angles are equal to zero.
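For background (standard small-angle hydrostatics, not stated in this paper), the restoring moment for a heel angle \varphi can be written as

M_r = \rho \, g \, \nabla \, \overline{GM} \, \sin\varphi

where \rho is the water density, \nabla the displaced volume and \overline{GM} the metacentric height, which is why a larger metacentric height gives a larger righting moment and better hydrostatic stability.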
Figure 17 shows the isometric 3D design of the Autonomous Underwater Vehicle. The
isometric view consists of three principal axes, where the x-axis represents the front view, the
y-axis represents the left view, and the z-axis represents the top view of the AUV 3D model.

Fig. 16 3D AUV Modeling



Fig. 17 Isometric 3D AUV design

3.2 Control System

All thrusters and sensors were calibrated and tested for functionality before installation on the
AUV, as shown in Fig. 18.
The thrusters are connected to the AUV control system and powered by an external 11 V
power supply, which controls their speed and direction. The thrusters are mounted precisely
about the centre of the vehicle to prevent the AUV from becoming imbalanced when
submerged. A depth sensor is used to give the AUV instructions for submerging or floating
underwater; it detects the depth of the water via its pressure sensor and transmits the data to
the control system.

Fig. 18 Thruster calibration and testing



Fig. 19 Thrusters tested on the AUV structure

The control system then instructs the thrusters to submerge deeper or to rise, depending on the
preset value.
A digital magnetic compass is used as the AUV navigation system. The compass provides
the microcontroller with directional data, and the AUV moves in the preset direction. The
AUV's orientation system uses an Inertial Measurement Unit (IMU) module. The IMU
sensors help to position the object to which they are attached in three-dimensional space;
usually, these values are expressed as angles to determine the orientation.
Figure 19 shows the thruster testing process. All four thrusters are attached to the AUV
open structure: two thrusters are attached on the left and right sides of the AUV structure for
horizontal movement, and two thrusters are attached at the bottom of the AUV open structure
for vertical movement. The two horizontal thrusters provide back-and-forth movement, which
means each thruster rotates clockwise for forward movement or counter-clockwise for
backward movement. The two thrusters at the bottom of the AUV structure are for submerging
and floating; these thrusters rotate clockwise for submerging or counter-clockwise for floating.

3.3 AUV Prototype

Before the prototype was constructed, several experiments were performed to check each
sensor's functionality. A few experiments were also carried out on the prototype by placing it
in the lake: a buoyancy test, a leakage test, and an endurance test. After all the experiments
were completed, all the assembled parts were mounted on the AUV body structure, as shown
in Fig. 20.
The AUV consists of four thrusters: two for horizontal movement and two for vertical
movement. The AUV has two compartments used to store all its electronic components to
prevent them from coming into contact with water. All AUV sensors, such as the compass,
IMU module, data logger, depth sensor, turbidity sensor, temperature sensor, and pH sensor,
are stored in the upper compartment. The thruster speed controllers and power supply are
stored in the lower compartment. Floats and weights were used to provide sufficient buoyancy
force for the AUV while it is fully submerged.
Data collection, as shown in Fig. 21, was conducted at the UTHM G3 Lake. All sensors
begin to collect data when the power supply is switched on, and the data are sent to the
Arduino microcontroller for storage on the memory card. The underwater compartments of the
AUV are reinforced with white tape, epoxy and silicone grease to ensure that no water can
enter the compartments, which would bring water into contact with the components and
short-circuit the entire circuit. Plasticine was also used as additional reinforcement to seal the
entire opening of the compartments.

Fig. 20 The AUV prototype



Fig. 21 AUV field test at the G3 Lake, UTHM

The endurance test shows that the AUV was able to cope with turbulent streams of water. For
example, when the water flow is turbulent, the AUV can swim in a stable and balanced
manner with the help of assistive sensors such as the IMU and the actuators, which keep the
AUV swimming in position.

3.4 AUV Submerging and Leaking Test

Following the complete assembly of the AUV, the AUV was submerged at the G3 Lake in
UTHM to test whether it could remain fully submerged underwater for a period of time, as
shown in Fig. 22. Floats are added to the sides of the AUV to act as a floating mechanism and
increase the buoyant force acting on the AUV.

Fig. 22 AUV submerging and leaking test

Additional weights are added to the AUV to act as a sinking mechanism and prevent the AUV
from surfacing back to the water surface. Both mechanisms work together to keep the AUV
floating underwater. The AUV's underwater compartments play a major role, as they are used
to store the AUV control system. Since the AUV control system is not waterproof, it is very
important to ensure that it does not come into contact with water. At the same time, a leakage
test is carried out to ensure that no water can enter the AUV's underwater compartments.

3.5 Experimental Results

The project goal, which was to develop an AUV for the water quality measurement
application, was effectively accomplished. The system effectively gathered the water turbidity,
temperature, and pH data and saved them every 1 s to the SD card in .txt format, as shown in
Fig. 23.
The UTHM G3 Lake is the suggested place for the AUV field test because the lake contains
a thermocline, a transition layer between deep water and surface water. Each layer of water,
namely the mixed (surface) layer, the thermocline layer, and the deep water, has a different
temperature, as shown in Fig. 24. Water close to the surface and warmed by the sun is less
dense than the water close to the bottom, because water density changes as the water
temperature changes; the lower the water temperature, the higher the water density, until
around 4 °C [21]. In a thermocline, the temperature decreases rapidly with small increases in
depth. These three layers also differ in the cloudiness of the water and the pH value of the
water.

Fig. 23 Data is saved in SD card with .txt format

Based on the results in Fig. 25, during the field test at the G3 Lake in UTHM, the water
temperature starts to decrease rapidly to below 15 °C from 12:57:00 until 12:57:20. This is
because the AUV is submerged underwater at the centre of the lake, within the thermocline
layer; the closer the layer is to the thermocline, the lower the water temperature. In the early
minutes of AUV operation, the pH sensor begins with unstable data because the voltage reads
incorrectly, and the pH value derived from that voltage is therefore discarded [23].

Fig. 24 Thermocline of water [22]

The sensitivity of the pH sensor's glass means that it takes time to calibrate to the correct pH
reading. After several seconds, once the AUV started to swim at the centre of the lake, the pH
sensor measured pH values between pH 7 and pH 10; the pH value changes with the layer and
location in which the AUV dives.
Finally, the turbidity sensor senses the cloudiness of the water. From the result shown, the
turbidity reading changed to 5 V from 12:56:18 until 12:56:36. This is because the water flow
was unsteady, in other words turbulent, and the turbulence makes the water murkier.

Fig. 25 Data analysis for temperature, turbidity, and pH sensors

4 Conclusion

After testing out the AUV in a G3 lake at Universiti Tun Hussien Onn Malaysia, it
can be summed up that the AUV can perform the given task without requiring the
interface of an operator. The AUV switched on automatically after it is entirely in
the water, all sensor in the control system power-up including water quality mea-
surement sensor. The digital magnetic compass navigated the AUV swam under-
water while the depth sensor helps to keep the AUV remain submerged. At the
time, the water quality measurement sensors such as pH sensor, temperature sensor
and turbidity started calibrating the data of the water and record the data into the
data logger.
There were a few problems that were present before reaching the final phase,
which is the problem of leakage at the second compartment that consists of power
supply (batteries) and four ESCs. The problem of leakage could be solved by
applying a sealing tape with a layer of silicon grease around the thruster wire to
prevent the passage of water. The second problem that was the power supply
problem could be solved by adding a charging port to the power supply com-
partment so that the power supply could be recharged directly within the AUV
instead of replacing the old batteries with new ones. The third problem, uploading code
to the microcontroller, was solved by adding a Universal Serial Bus (USB) port to the
primary compartment (the microcontroller compartment) so that the user can upload code
through the USB port connected to the microcontroller instead of opening the hull.
In conclusion, the project aim of designing and developing a functional Autonomous
Underwater Vehicle for water quality measurement applications has been achieved. The
last objective, to analyse the performance of the AUV and the effectiveness of the water
quality measurement, is also successfully achieved, as the AUV is able to operate fully.

5 Recommendation

There are a few improvements that can be implemented in future work. One
recommendation is to decrease the length of the AUV, because a smaller AUV has better
manoeuvrability. Based on Newton's First Law of Motion, also known as the law of
inertia, an object at rest remains at rest and an object in motion remains in motion unless
an unbalanced force acts on it. As the mass of the AUV increases, its inertia also
increases. In terms of manoeuvrability, a smaller AUV will have lower inertia, which is
beneficial to the AUV.
Another improvement that can be implemented in future projects is the use of
waterproof electronic components. This plays an essential part in the development of an
AUV, as the AUV is used explicitly for underwater missions, in particular mapping
seafloors, detecting wreckage and measuring water quality at the seafloor. With
waterproof electronic components, the components will not malfunction when in contact
with water, and at the same time the cost of replacing malfunctioning components with
new ones is reduced.

References

1. National Oceanic and Atmospheric Administration (2018) What is the difference between an
AUV and an ROV? US Department of Commerce
2. Zhou B, Bian C, Tong J, Xia S (2017) Fabrication of a miniature multi-parameter sensor chip
for water quality assessment. Sensors 17(12):157
3. Faustine A, Mvuma AN, Mongi HJ, Gabriel MC, Tenge AJ, Kucel SB (2014) Wireless sensor
networks for water quality monitoring and control within lake victoria basin: prototype
development. Wirel Sens Netw 6:281–290
4. Gunda NSK, Dasgupta S, Mitra SK (2017) DipTest: a litmus test for E. coli detection in
water. PLoS ONE 12(9):1–13
5. Kumar SB, Shinde AH, Mehta R, Bhattacharya A, Haldar S (2018) Simple, one-step
dye-based kit for bacterial contamination detection in a range of water sources. Sens
Actuators B Chem 276:121–127

6. Komaki K, Hatta M, Okamura K, Noguchi T (2015) Development and application of


chemical sensors mounting on underwater vehicles to detect hydrothermal plumes. In: 2015
IEEE underwater technology, UT
7. Arima M, Takeuchi A (2016) Development of an autonomous surface station for underwater
passive acoustic observation of marine mammals. In: Ocean 2016, Shanghai, no. 26289339,
pp 1–4
8. Helmi AHMA, Hafiz MM, Rizam MSBS (2014) Mobile buoy for real-time monitoring and
assessment of water quality. In: Proceedings of the 2014 IEEE conference on systems, process
and control, ICSPC 2014, December, pp 19–23
9. Prasad AN, Mamun KA, Islam FR, Haqva H (2016) Smart water quality monitoring system.
In: 2015 2nd Asia-Pacific world congress on computer science and engineering, APWC CSE
2015, pp 1–6
10. Kafli N, Othman MZ, Isa K (2017) Unsupervised floating platform for environmental
monitoring. In: Proceedings of the 2016 IEEE international conference on automatic control
and intelligent systems, I2CACIS 2016, October, pp 84–89
11. Kafli N, Othman MZ, Isa K (2016) Development of a floating platform for measuring air and
water quality. In: 2016 IEEE 6th international conference on underwater system technology:
theory and applications, USYS 2016, pp 177–182
12. Niswar M et al (2018) IoT-based water quality monitoring system for soft-shell crab farming.
In: Proceedings of the 2018 IEEE international conference on internet of things and
intelligence system, IOTAIS 2018, pp 6–9
13. T100 Thruster - Blue Robotics. https://www.bluerobotics.com/store/thrusters/t100-t200-
thrusters/t100-thruster/. Accessed 18 May 2019
14. Speed Controllers (ESCs) Archives - Blue Robotics. https://www.bluerobotics.com/product-
category/thrusters/speed-controllers/. Accessed 18 May 2019
15. Nascimento S, Valdenegro-Toro M (2018) Modeling and soft-fault diagnosis of underwater
thrusters with recurrent neural networks. IFAC-PapersOnLine 51(29):80–85
16. Introduction to Arduino Mega 2560 - The Engineering Projects. https://www.theengineer
ingprojects.com/2018/06/introduction-to-arduino-mega-2560.html. Accessed 18 May 2019
17. RobotShop (2015) Arduino Mega 2560 Datasheet. Power, pp 1–7
18. Wei Y, Hu X, An D (2018) Design of an intelligent pH sensor based on IEEE1451.2.
IFAC-PapersOnLine 51(17):191–198
19. Lambrou TP, Anastasiou CC, Panayiotou CG (2010) A nephelometric turbidity system for
monitoring residential drinking water quality. Springer, Berlin, Heidelberg, pp 43–55
20. Fast-Response, High Accuracy (± 0.1 °C) Temperature Sensor. https://www.bluerobotics.
com/store/sensors-sonars-cameras/sensors/celsius-sensor-r1/. Accessed 18 May 2019
21. About Water Temperature. https://staff.concord.org/~btinker/GL/web/water/water_temperat
ures.html. Accessed 27 May 2019
22. US Department of Commerce, N. N. W. S. Thermocline - Temperature Fluctuations at Erie,
PA
23. Top 10 Mistakes in pH Measurement. https://blog.hannainst.com/top-10-mistakes-in-ph-
measurement. Accessed 21 May 2019
Discrete Sliding Mode Controller
on Autonomous Underwater Vehicle
in Steering Motion

Nira Mawangi Sarif, Rafidah Ngadengon, Herdawatie Abdul Kadir,


and Mohd Hafiz A. Jalil

Abstract The purpose of this study is to implement sliding mode control in the discrete
time domain for an Autonomous Underwater Vehicle (AUV). A six Degree of Freedom
(DOF) model was established for the Naval Postgraduate School (NPS) AUV II, followed
by linearization of the nonlinear surge and sway Equations of Motion (EoM) in the
horizontal plane to simplify the control system design. The discrete sliding mode
controller was designed based on Gao's reaching law. A discrete Proportional Integral
Derivative (PID) controller was used for comparative performance analysis, together with
a brief discussion on the existence of the chattering phenomenon in the control input.
Computer simulations on the NPS AUV II showed that the proposed controller has zero
overshoot and a faster settling time than the discrete PID controller.

Keywords AUV · Chattering reduction · Discrete time sliding mode

1 Introduction

The Autonomous Underwater Vehicle (AUV) has been popular for three decades due to
its versatility and excellent performance, and is increasingly being used in many
industries [1]. Its compact size, self-contained propulsion system and capability of
carrying sensors such as depth sensors, video cameras, side-scan sonar and other
oceanographic measuring devices make the AUV well suited to dangerous missions.
These features extend its advantages into much wider areas such as surveillance,
environmental monitoring, underwater inspection of harbours and pipelines, geological
and biological surveys and mine countermeasures. However, highly unpredictable ocean
behaviour creates challenges for AUV navigation and motion performance: it produces
high-frequency oscillating motion that degrades sensor performance, especially for
acoustical and optical sensors, and causes the dynamic system to be highly nonlinear and
time-varying with uncertainties in hydrodynamic parameters such as added mass, lift
forces, gravity and buoyancy forces [2]. Additionally, most AUVs are operated in
under-actuated mode, hence tracking and stabilization control become demanding tasks,
owing to the possession of Degrees Of Freedom (DOF) beyond those directly controlled
[3]. This restriction is imposed in real-life applications, as inverting or pointing vertically
can cause equipment damage or a dangerous control response [4]. As a result, AUV
motion control is restricted to only one non-interacting subsystem at a time [5]. Due to
the aforementioned challenges, many advanced control techniques have been
implemented in the existing literature, mostly including robust control techniques [6–8],
intelligent control methods [9] and adaptive control approaches [10–12]. Among the
robust controller types, SMC is evidently a promising strategy [13] to overcome these
obstacles due to its simpler computation and robustness to external disturbances and
parameter variations [14].
The work reported in the literature shows that the majority of SMC applications on
AUVs are formulated from a continuous-time point of view, but this is no longer
efficient in real situations due to the current trend toward digital rather than analog
control of dynamic systems [15]. In other words, controllers nowadays are implemented
almost exclusively on digital computers or microprocessors. This is mainly due to the
availability of low-cost digital computers and the advantages of digital signals over
continuous-time signals [16]. For this reason, researchers have shown significant interest
over recent years [13, 17, 18] in solving the problems caused by the discretization of
continuous-time controllers. It started in 1997, when Lee et al. [19] adopted self-tuning
discrete sliding mode control on the AUV ARMA based on an equivalent discrete
variable structure control method; this was continued with research on quasi-sliding
mode control in the presence of uncertainties and long sampling intervals in [20] on an
AUV named VORAM (Vehicle for Ocean Research and Monitoring). This was followed
by Zhang [21], who proposed a discrete-time quasi-sliding mode controller for the
multiple-input multiple-output AUV REMUS. In addition, Wu et al. [22] implemented
adaptive sliding mode control in a discrete-time system and applied a time-varying
sliding surface obtained via a parameter estimation method. The work developed by
Bibuli et al. in [23] described a hybrid guidance and control system based on the
integration of neural dynamics and quasi-sliding mode on the Shark USV. Verma et al.
[24] worked on controlling the speed of a carangiform robotic fish using a Discrete
Terminal Sliding Mode Controller.
Research on discrete-time sliding mode controllers was started by Milosavljevic in [25].
Later, Gao et al. introduced the quasi-sliding mode band in [26]. Soon after that,
Bartoszewicz in [27] proposed a non-switching condition for DSMC. Although Gao's
reaching law method was introduced more than two decades ago, it is still used in many
significant studies such as [28–30].
The objective of this research is to implement the discrete-time sliding mode control law
proposed by Gao et al. in [31] for steering motion control. This is to ensure that the
designed control law keeps pace with technological advancement and minimizes the
vehicle heading error so that the vehicle steering motion follows the desired heading
angle as closely as possible. A discrete Proportional Integral Derivative (PID) controller
and Discrete Sliding Mode Control (DSMC) are tested on the NPS AUV II via
simulation, and the discrete PID controller is used for comparative performance analysis.
The paper is organized as follows: the dynamic model of the NPS AUV II in the
Body-Fixed Reference Frame (BFF) and the DSMC structure design are presented in
Sects. 2 and 3 respectively. Results from numerical simulation are illustrated in Sect. 4
and a discussion of the advantages and drawbacks of the control methods is provided in
Sect. 5.

2 Mathematical Modelling of NPS AUV II

2.1 Nonlinear Equation of Motion

The AUV dynamic system is highly nonlinear, coupled and time varying, owing to many
parameters such as hydrodynamic drag, damping and lift forces, Coriolis and centripetal
forces, gravity, buoyancy forces and thrust [32]. The general nonlinear equation of
motion is presented as

$M\dot{\nu} + C(\nu)\nu + D(\nu)\nu + G(\eta) = \tau$   (1)

$\dot{\eta} = J(\eta)\nu$   (2)

where $M \in \mathbb{R}^{6\times6}$ is the inertia matrix, $C(\nu) \in \mathbb{R}^{6\times6}$ is the Coriolis and centripetal matrix,
$D(\nu) \in \mathbb{R}^{6\times6}$ is the damping matrix, $G(\eta) \in \mathbb{R}^{6\times1}$ is the vector of buoyancy/gravitational
forces and moments, and $\tau \in \mathbb{R}^{6\times1}$ is the vector of control inputs relating to the forces
and moments acting on the vehicle.
The kinematics and dynamics of the AUV are established using an earth-fixed reference
frame and a body-fixed reference frame, as illustrated in Fig. 1. The earth coordinate
system of the vehicle is defined by three orthogonal axes originating from an arbitrary
point, with the x-axis, y-axis and z-axis corresponding to east, west and increasing depth
respectively. The motion element is expressed as

$\nu = [\nu_1 \;\; \nu_2]^T$

$\nu_1 = [u \;\; v \;\; w]^T$ (linear velocities)   (3)

$\nu_2 = [p \;\; q \;\; r]^T$ (angular velocities)

The position and attitude of the body-fixed reference frame with respect to the
earth-fixed frame are expressed in the following vectors

Fig. 1 The six Degree of Freedom of NPS AUV II [33]

$\eta = [\eta_1 \;\; \eta_2]^T$

$\eta_1 = [x \;\; y \;\; z]^T$ (position of origin)   (4)

$\eta_2 = [\phi \;\; \theta \;\; \psi]^T$ (orientation angles of roll ($\phi$), pitch ($\theta$) and yaw ($\psi$))
The control input vector $\tau$ has three components as stated in (5)

$\tau = [\delta_r, \; \delta_e, \; n]$   (5)

where $\delta_e$ is the elevator deflection, $\delta_r$ is the rudder deflection and $n$ is the propeller revolutions.


The 6 DOF kinematic equation is expressed in vector form as
$$\begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{z} \\ \dot{\phi} \\ \dot{\theta} \\ \dot{\psi} \end{bmatrix} =
\begin{bmatrix}
u\cos\psi\cos\theta + v(\cos\psi\sin\theta\sin\phi - \sin\psi\cos\phi) + w(\cos\psi\sin\theta\cos\phi + \sin\psi\sin\phi) \\
u\sin\psi\cos\theta + v(\sin\psi\sin\theta\sin\phi + \cos\psi\cos\phi) + w(\sin\psi\sin\theta\cos\phi - \cos\psi\sin\phi) \\
-u\sin\theta + v\cos\theta\sin\phi + w\cos\theta\cos\phi \\
p + q\sin\phi\tan\theta + r\cos\phi\tan\theta \\
q\cos\phi - r\sin\phi \\
(q\sin\phi + r\cos\phi)/\cos\theta
\end{bmatrix} \qquad (6)$$
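For readers who want to evaluate the kinematic transformation in (6) numerically, the following NumPy sketch implements the standard Euler-angle mapping from body-fixed velocities to earth-fixed rates (Fossen [32]); the function name and interface are our own and are not part of the original paper.

```python
import numpy as np

def body_to_earth_rates(nu, phi, theta, psi):
    """Map body-fixed velocities nu = [u, v, w, p, q, r] to earth-fixed
    rates [x_dot, y_dot, z_dot, phi_dot, theta_dot, psi_dot], as in Eq. (6)."""
    u, v, w, p, q, r = nu
    cphi, sphi = np.cos(phi), np.sin(phi)
    cth, sth, tth = np.cos(theta), np.sin(theta), np.tan(theta)
    cpsi, spsi = np.cos(psi), np.sin(psi)

    # Rotation from the body frame to the earth frame (linear velocities)
    R = np.array([
        [cpsi*cth, cpsi*sth*sphi - spsi*cphi, cpsi*sth*cphi + spsi*sphi],
        [spsi*cth, spsi*sth*sphi + cpsi*cphi, spsi*sth*cphi - cpsi*sphi],
        [-sth,     cth*sphi,                  cth*cphi],
    ])
    # Euler angle rate transformation (angular velocities)
    T_euler = np.array([
        [1.0, sphi*tth,  cphi*tth],
        [0.0, cphi,     -sphi],
        [0.0, sphi/cth,  cphi/cth],
    ])
    return np.concatenate((R @ np.array([u, v, w]), T_euler @ np.array([p, q, r])))
```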

Table 1 Position and velocities of AUV [32]

Motion direction | Forces & moments | Body-fixed frame (velocity) | Earth-fixed frame (position)
Surge            | X                | u                           | x
Sway             | Y                | v                           | y
Heave            | Z                | w                           | z
Roll             | K                | p                           | φ
Pitch            | M                | q                           | θ
Yaw              | N                | r                           | ψ

Six different motion components are conveniently defined as surge, sway, heave,
roll, pitch and yaw as summarized in Table 1 according to Fossen in [32].
The six DOF rigid-body equations of motion (EoM) in (1), (2), (3), (4) and (5) are
expanded as [32]
   
$m[\dot{u} - vr + wq - x_G(q^2 + r^2) + y_G(pq - \dot{r}) + z_G(pr + \dot{q})] = X$   (7)

$m[\dot{v} - wp + ur + x_G(pq + \dot{r}) - y_G(p^2 + r^2) + z_G(qr - \dot{p})] = Y$   (8)

$m[\dot{w} - uq + vp - x_G(pr - \dot{q}) + y_G(qr + \dot{p}) - z_G(p^2 + q^2)] = Z$   (9)

$I_x\dot{p} + qr(I_z - I_y) + I_{xy}(pr - \dot{q}) - I_{yz}(q^2 - r^2) - I_{xz}(\dot{r} + pq) + m[y_G(\dot{w} - uq + vp) - z_G(\dot{v} - wp + ur)] = K$   (10)

$I_y\dot{q} + rp(I_x - I_z) - I_{xy}(qr - \dot{p}) + I_{yz}(qp - \dot{r}) + I_{xz}(p^2 - r^2) + m[z_G(\dot{w} - uq + vp) - x_G(\dot{u} - vr + wq)] = M$   (11)

$I_z\dot{r} + pq(I_y - I_x) - I_{xy}(p^2 - q^2) - I_{yz}(\dot{q} + rp) + I_{xz}(rq - \dot{p}) + m[x_G(\dot{v} + ur - wp) - y_G(\dot{u} - vr + wq)] = N$   (12)

where, m is the AUV mass, xG; yG ; zG are locations of the vehicle center of AUV
mass, Ix; Iy ; Iz are rotational inertia of AUV mass, u; v; w are AUV linear velocities in
x-axis, y-axis and z-axis. p; q; r are AUV angular velocities of roll, pitch and yaw
respectively. u; _ w;
_ v_ ; p;
_ q;
_ r_ are linear and angular acceleration and X; Y; Z; K; M; N
is external force and moment.
The total forces and moments acting on the vehicle, from [32], are expressed as

$X = (W - B)\sin\theta + X_{u|u|}u|u| + X_{\dot{u}}\dot{u} + X_{wq}wq + X_{qq}qq + X_{vr}vr + X_{rr}rr + X_{prop}$   (13)
$Y = (W - B)\cos\theta\sin\phi + Y_{v|v|}v|v| + Y_{r|r|}r|r| + Y_{\dot{v}}\dot{v} + Y_{\dot{r}}\dot{r} + Y_{ur}ur + Y_{wp}wp + Y_{pq}pq + Y_{uv}uv + Y_{uu\delta_r}u^2\delta_r$   (14)

$Z = -(W - B)\cos\theta\cos\phi + Z_{w|w|}w|w| + Z_{q|q|}q|q| + Z_{\dot{w}}\dot{w} + Z_{\dot{q}}\dot{q} + Z_{uq}uq + Z_{vp}vp + Z_{rp}rp + Z_{uw}uw + Z_{uu\delta_s}u^2\delta_s$   (15)

$K = -(y_G W - y_B B)\cos\theta\cos\phi + (z_G W - z_B B)\cos\theta\sin\phi + K_{p|p|}p|p| + K_{\dot{p}}\dot{p} + K_{prop}$   (16)

$M = (z_G W - z_B B)\sin\theta + (x_G W - x_B B)\cos\theta\cos\phi + M_{w|w|}w|w| + M_{q|q|}q|q| + M_{\dot{q}}\dot{q} + M_{uq}uq + M_{vp}vp + M_{rp}rp + M_{uw}uw + M_{uu\delta_s}u^2\delta_s$   (17)

$N = -(x_G W - x_B B)\cos\theta\sin\phi - (y_G W - y_B B)\sin\theta + N_{v|v|}v|v| + N_{r|r|}r|r| + N_{\dot{v}}\dot{v} + N_{\dot{r}}\dot{r} + N_{ur}ur + N_{wp}wp + N_{pq}pq + N_{uv}uv + N_{uu\delta_r}u^2\delta_r$   (18)

where $X_{u|u|}$, $Y_{v|v|}$ and $N_{v|v|}$ are cross-flow drag coefficients, $X_{wq}, X_{vr}, X_{qq}, Y_{ur}, Y_{wp}, Y_{pq}$
are added-mass cross-force coefficient terms, $X_{prop}$ and $K_{prop}$ are the propeller force and
torque respectively, $M_{uq}, M_{vp}, M_{rp}, M_{uw}, N_{ur}, N_{wp}, N_{pq}, N_{uv}$ are added-mass cross-moment
coefficient terms and $Y_{uu\delta_r}, Z_{uu\delta_s}, M_{uu\delta_s}, N_{uu\delta_r}$ are fin lift coefficients.

2.2 Linearization of Horizontal Plane Equation of Motion

According to Healey and Marco in [5], the complete dynamic Equation of Motion
(EoM) is divided into three non-interacting subsystems. In order to reduce the complexity
of the control law design, the scope here is limited to steering motion, with the vertical
motion control parameters set to zero. The steering control system is responsible for
controlling heading errors, and the automatic steering control is done by utilizing a
rudder and a pair of thrusters. The following assumptions are used to obtain a linearized
model of the steering control system by considering the sway and yaw EoM [5].
• The forward velocity $u_0$ is constant.
• The vertical motion control parameters are set to zero.
• The body drag force and moment are negligible.
• The added mass force and moment are negligible.
• The origin of the vehicle coincides with the centre of gravity.

The linearized forms of (8) and (12) are stated as

$m\dot{v} + m u_0 r = Y$   (19)

$I_z \dot{r} = N$   (20)

where $u_0$ is the constant forward vehicle velocity. From (6), for small roll and pitch
angles, the yaw kinematics can be simplified as

$\dot{\psi} = \dfrac{\sin\phi}{\cos\theta}\,q + \dfrac{\cos\phi}{\cos\theta}\,r \approx r$   (21)

Considering the previous assumptions and the linearized modelling of the hydrodynamic
added mass, damping and rudder terms of (14) and (18), $Y$ and $N$ become

$Y = Y_{\dot{v}}\dot{v} + Y_{\dot{r}}\dot{r} + Y_v v + Y_r r + Y_{\delta}\delta_r$   (22)

$N = N_{\dot{v}}\dot{v} + N_{\dot{r}}\dot{r} + N_v v + N_r r + N_{\delta}\delta_r$   (23)

Equations (19), (20), (21), (22) and (23) are expressed in the compact form

$$\begin{bmatrix} m - Y_{\dot{v}} & m x_G - Y_{\dot{r}} & 0 \\ m x_G - N_{\dot{v}} & I_z - N_{\dot{r}} & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \dot{v} \\ \dot{r} \\ \dot{\psi} \end{bmatrix} =
\begin{bmatrix} Y_v u_0 & (Y_r - m)u_0 & 0 \\ N_v u_0 & (N_r - m x_G)u_0 & 0 \\ 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} v \\ r \\ \psi \end{bmatrix} +
\begin{bmatrix} Y_{\delta_r} \\ N_{\delta_r} \\ 0 \end{bmatrix}\delta_r \qquad (24)$$

where $v$ is the sway velocity, $r$ is the yaw angular velocity, $\psi$ is the heading angle and
$\delta_r$ is the rudder deflection (Table 2).
Re-arranging the expression in state-space form gives

$\dot{x} = Ax(t) + Bu(t)$   (25)

$y = Cx(t)$

where $x = [v \;\; r \;\; \psi]^T$, $u = \delta_r$, $C = [1\;0\;0;\; 0\;1\;0;\; 0\;0\;1]$ and $y = \psi$.
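As an illustration of how (24) is rearranged into the state-space form (25), the short sketch below numerically inverts the left-hand mass matrix. The mass value is taken from Table 2, but the remaining hydrodynamic coefficients are placeholders (the actual NPS AUV II values are not reproduced in this excerpt), so the numbers are illustrative only.

```python
import numpy as np

# Placeholder sway-yaw coefficients: only m is taken from Table 2, the rest are
# illustrative values, not the actual NPS AUV II hydrodynamic derivatives.
m, x_G, I_z, u0 = 5443.4, 0.0, 2000.0, 1.8
Y_vdot, Y_rdot, N_vdot, N_rdot = -1.0e3, -50.0, -50.0, -2.0e3
Y_v, Y_r, N_v, N_r, Y_dr, N_dr = -6.0e2, 3.0e2, -3.0e2, -5.0e2, 2.0e2, -4.0e2

# Left- and right-hand matrices of Eq. (24): M_mat * x_dot = N_mat * x + b_vec * delta_r
M_mat = np.array([[m - Y_vdot,   m*x_G - Y_rdot, 0.0],
                  [m*x_G - N_vdot, I_z - N_rdot,  0.0],
                  [0.0,            0.0,           1.0]])
N_mat = np.array([[Y_v*u0, (Y_r - m)*u0,       0.0],
                  [N_v*u0, (N_r - m*x_G)*u0,   0.0],
                  [0.0,    1.0,                0.0]])
b_vec = np.array([[Y_dr], [N_dr], [0.0]])

# State-space form (25): x_dot = A x + B u, y = C x with x = [v, r, psi]
A = np.linalg.solve(M_mat, N_mat)
B = np.linalg.solve(M_mat, b_vec)
C = np.eye(3)
```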

Table 2 The NPS AUV II model parameters [35]

Parameter | Value      | Units
m         | 5443.4     | kg
W         | 53400      | N
z_G       | 0.061      | m
z_B       | 0          | m
I_y       | 13587      | N m s²
M_q̇       | 1.7 × 10²  |

3 Controller Design

3.1 Discrete Sliding Mode Control (DSMC) Design

In this section, the DSMC is designed in the discrete time domain to control the heading
errors of the steering system. By considering the continuous-time system in (25), the
discrete model of (19), (20), (21), (22) and (23) obtained by Zero Order Hold (ZOH)
approximation is

$x(k+1) = \Phi x(k) + \Gamma u(k)$   (26)

$y(k) = Cx(k)$

where $x(k)$ is the state vector, $u(k)$ is the control input, $y(k)$ is the output and $\Phi$ and $\Gamma$
are the discretized system matrices.
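A minimal sketch of this ZOH discretization step is shown below using SciPy's cont2discrete, with the 0.2 s sampling time used later in Sect. 4. The continuous matrices are written as placeholders standing in for the A, B, C assembled from (24), so the snippet runs on its own.

```python
import numpy as np
from scipy.signal import cont2discrete

# Continuous-time matrices of (25); placeholders standing in for the A, B, C of (24).
A = np.array([[-0.5, 0.2, 0.0], [0.1, -0.3, 0.0], [0.0, 1.0, 0.0]])
B = np.array([[0.05], [-0.02], [0.0]])
C = np.eye(3)

T = 0.2  # sampling time in seconds, as used in the simulations of Sect. 4
Phi, Gamma, Cd, Dd, _ = cont2discrete((A, B, C, np.zeros((3, 1))), T, method='zoh')
# Phi and Gamma correspond to the discrete system matrices of Eq. (26).
```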
In this paper, the objective of the controller is to force the variable x to achieve a
constant reference position, xr . Hence, the output tracking error is defined as:

$e = x_r - x$   (27)

Next, the conventional discrete sliding surface is defined as

$S(k) = C_s e(k) = C_s(x_r(k) - x(k))$   (28)

where $e(k)$ is the heading error, $x_r$ is the reference input and $C_s$ is the selected sliding
gain matrix.
The discrete sliding mode control scheme is designed based on the reaching law or
equivalent control method. In order to steer the state trajectory to reach the sliding
surface within one sampling instant, the strategy is developed based on the following
condition:

$S(k) = 0$   (29)

The one-step difference of (29) is expressed as:

$S(k+1) - S(k) = 0$   (30)

The discrete time extension reaching law proposed by Gao et al. [31] is defined
as

$S(k+1) - S(k) = -qT\,S(k) - \varepsilon T\,\mathrm{sgn}(S(k))$   (31)

where $T$ is the sampling interval of the discrete-time system, and $\varepsilon$ and $q$ are positive
constants with $\varepsilon > 0$, $q > 0$ and $1 - qT > 0$.

From Eqs. (26) and (28), the one-step difference of the sliding surface is rewritten as

$S(k+1) - S(k) = C_s(x_r(k+1) - x(k+1)) - S(k)$   (32)

Substituting Eq. (26) into (32), the sliding surface difference is expressed as

$S(k+1) - S(k) = C_s x_r(k+1) - C_s\Phi x(k) - C_s\Gamma u(k) - C_s x_r(k) + C_s x(k)$   (33)

Hence, the DSMC control law for system (26), such that the sliding surface is steered to
zero in finite time, is defined as

$u(k) = -(C_s\Gamma)^{-1}\left[-C_s x_r(k+1) + C_s\Phi x(k) + (1 - qT)S(k) - \varepsilon T\,\mathrm{sgn}(S(k))\right]$   (34)

The following step obtains the sliding gain matrix $C_s$ by substituting the control law (34)
into (26), which generates

$x(k+1) = (\Phi - \Gamma K)x(k)$   (35)

where $K = (C_s\Gamma)^{-1}C_s\Phi$. Hence, the sliding gain matrix $C_s$ becomes the solution of the
following equations:

$C_s(\Phi - \Gamma K) = 0$   (36)

$C_s\Gamma = I$   (37)

where $I$ is an identity matrix and (37) ensures that $C_s\Gamma$ is full rank. Using (37), (36) can
be replaced by $C_s\Phi = K$ and thus the above equations can be written as

$C_s[\Phi \;\; \Gamma] = [K \;\; I]$   (38)

Finally, the sliding matrix $C_s$ is given by

$C_s = [K \;\; I][\Phi \;\; \Gamma]^{+}$   (39)

where $+$ denotes the matrix pseudo-inverse. The feedback matrix $K$ is obtained by
adopting (37) into a Linear Quadratic Controller design [35].
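To make the design procedure of (34)–(39) concrete, the following Python sketch computes the sliding matrix via the pseudo-inverse in (39), obtains the feedback gain K from a discrete Riccati (LQR) solution, and evaluates the reaching-law control input (34). The weighting matrices Q and R, the function names and the commented usage values are our own choices, assuming Phi and Gamma come from the ZOH step above; this is a sketch of the method, not the authors' implementation.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def design_sliding_matrix(Phi, Gamma, Q, R):
    """Compute C_s from Eq. (39); K is a discrete LQR gain, used as in (35)-(38)."""
    P = solve_discrete_are(Phi, Gamma, Q, R)
    K = np.linalg.solve(R + Gamma.T @ P @ Gamma, Gamma.T @ P @ Phi)  # LQR feedback gain
    Cs = np.hstack((K, np.eye(K.shape[0]))) @ np.linalg.pinv(np.hstack((Phi, Gamma)))
    return Cs, K

def dsmc_input(Cs, Phi, Gamma, x, x_ref, x_ref_next, q=0.4, eps=0.01, T=0.2):
    """Evaluate Gao's reaching-law control input of Eq. (34)."""
    S = Cs @ (x_ref - x)                          # sliding variable, Eq. (28)
    inv_CsGamma = np.linalg.inv(Cs @ Gamma)
    return -inv_CsGamma @ (-Cs @ x_ref_next + Cs @ Phi @ x
                           + (1.0 - q*T) * S - eps*T*np.sign(S))

# Example usage (Phi, Gamma from the ZOH discretization of (26)):
# Cs, K = design_sliding_matrix(Phi, Gamma, Q=np.eye(3), R=np.eye(1))
# u = dsmc_input(Cs, Phi, Gamma, x=np.zeros(3),
#                x_ref=np.array([0.0, 0.0, np.deg2rad(50)]),
#                x_ref_next=np.array([0.0, 0.0, np.deg2rad(50)]))
```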

4 Computational Result on Steering Control Motion

This section evaluates the controller performance via Matlab/Simulink simulations, in
which the overall system was treated as a discrete control system using a Zero Order
Hold (ZOH) with a 0.2 s sampling time. To illustrate the effectiveness of the DSMC, a
discrete PID controller was used for comparative analysis. Step response simulations
were performed in sway and yaw motion. The discrete PID controller is widely used due
to its reliability and simplicity, but it is difficult to tune its parameters in the discrete time
domain to achieve optimal performance. The discrete PID gain settings were obtained
from the Ziegler–Nichols method, as tabulated in Table 3 (Fig. 2 and Table 4).
Using (39), the sliding gain matrix $C_s$ is given by

$C_s = [0.1 \;\; 2 \;\; 0.3]$   (40)

Table 3 PID tuning gain values

Gain               | Value
Proportional (K_p) | −0.626
Integral (K_I)     | −0.038
Derivative (K_d)   | −1.179

Fig. 2 Yawing angle evolution

Table 4 Controller performance comparison

Transient response property | Discrete PID | DSMC
Rise time (s)               | 3.045        | 11.5
Overshoot (%)               | 20           | 0
Settling time (s)           | 50           | 45
Steady-state error          | 0            | 0

The reaching law parameters are set as follows:

$q = 0.4,\;\; \varepsilon = 0.01$   (41)

Figure 4 shows the evolution of the steering motion for both controllers, which have
reached the desired output. Both controllers start responding after 5 s of the input
command and manage to achieve the desired 50-degree yawing angle.
The AUV gradually changes the yawing angle, which is stabilized after 45 s and 50 s
for the DSMC and PID respectively. The rudder deflection commanded by the DSMC
gradually changes to a positive value, which results in a smooth and sensible yawing rate
with 0% overshoot. On the other hand, the PID controller results in 20% overshoot of the
rudder deflection before reaching steady state at 50 s, as demonstrated in Figs. 3 and 4. It
is noticeable that both controllers have achieved the desired results, but with different
performance.
Figure 5 illustrates the evolution of the chattering phenomenon in the discrete sliding
mode control input. This occurs because the discrete control algorithm is calculated once
in each sampling period and the result is held constant until the next sampling period.
Due to the finite sampling frequency, a Quasi-Sliding Mode (QSM) occurs in the
closed-loop system. This forces the system state to move around the sliding surface
rather than staying on it. From Gao's reaching law in Eq. (31), the thickness of the
Quasi-Sliding Mode Band (QSMB) in steady state depends on

Fig. 3 Control input evolution

Fig. 4 Yawing rate evolution



Fig. 5 Chattering phenomena in control input

Fig. 6 QSMB in sliding surface

the parameter $\varepsilon$, as illustrated in Fig. 6. The width of the QSMB can be reduced by
using a smaller $\varepsilon$. In other words, the robustness of the system can be improved by
decreasing $\varepsilon$: the smaller $\varepsilon$ is, the smaller the effect of the sampling time on the system.
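A quick way to see the QSMB numerically is to iterate the scalar reaching law (31) directly. The short sketch below uses the same q, ε and T values as in Sect. 4 and prints the steady-state band in which S(k) oscillates; this is only an illustrative experiment, not part of the original study.

```python
import numpy as np

q, eps, T = 0.4, 0.01, 0.2     # reaching-law gains and sampling time from Sect. 4
S = 1.0                        # arbitrary initial value of the sliding variable
history = []
for _ in range(200):
    # Gao's reaching law: S(k+1) = (1 - q*T)*S(k) - eps*T*sign(S(k))
    S = (1.0 - q*T) * S - eps*T*np.sign(S)
    history.append(S)

print("steady-state band:", min(history[-20:]), "to", max(history[-20:]))
# The band shrinks as eps (and T) decrease, consistent with the discussion above.
```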

5 Conclusion

In this study, two controllers, DSMC and discrete PID, were developed for the AUV in
the discrete time domain. The NPS AUV II model was used to design the discrete-time
controllers. Through comparative computer simulations on the NPS AUV II, it is
apparent that the DSMC presents better performance than the discrete PID.
On the other hand, the DSMC generated a chattering phenomenon due to the finite
sampling frequency and the fact that the discrete-time control algorithm is calculated
once in each sampling period and held constant until the next sampling period. The
chattering effect can be mitigated by reducing the thickness of the QSMB. However, the
robustness of the designed controller was not considered in this study.
In future work, the controller performance will be tested by considering parameter
uncertainties and external disturbances in the designed control law.

Acknowledgements The authors would like to thank Universiti Tun Hussein Onn Malaysia
(UTHM) for TIER 1 grant Vot H148, GPPS grant Vot H316 and AdMiRe FKEE for the research
funding support.

References

1. Gelli J, Meschini A, Monni N (2018) Development and design of a compact autonomous


underwater vehicle zeno AUV. IFAC-PapersOnLine 51(29):20–25
2. Yang R (2016) Modeling and robust control approach for autonomous underwater vehicles.
Ph.D. thesis, Ocean University of China (2016)
3. Isa K, Arshad MR, Ishak S (2014) A hybrid-driven underwater glider model, hydrodynamics
estimation, and an analysis of the motion control. Ocean Eng 81:111–129
4. Brutzman DP (1994) A virtual world for an autonomous underwater vehicle, Ph.D. thesis,
Naval Postgraduate School, Monterey, California
5. Healey A, Marco D (1992) Slow speed flight control of autonomous underwater vehicles:
experimental results with NPS AUV II. In: The second international offshore and Polar
engineering conference, pp 523–532
6. Guerrero J, Antonio E, Manzanilla A, Torres J, Lozano R (2018) Autonomous underwater
vehicle robust path tracking- auto-adjustable gain high order sliding mode controller.
IFAC-PapersOnLine 51(13):161–166
7. Farhan M (2017) Sliding mode control of autonomous under water vehicle by Muhammad
Farhan Faculty of Engineering, Master Thesis, Capital University of Science & Technology,
Islamabad
8. Song YS, Arshad MR (2016) Sliding mode depth control of a hovering autonomous
underwater vehicle. In: Proceedings of the 5th IEEE international conference on control
system, computing and engineering ICCSCE (2016)
9. Ullah B, Ovinis M, Baharom MB, Javaid MY, Izhar SS (2015) Underwater gliders control
strategies: a review. In: 10th Asian control conference: emerging control techniques for a
sustainable world, ASCC (2015)
10. Qiao L, Zhang W (2019) Adaptive second-order fast nonsingular terminal sliding mode
tracking control for fully actuated autonomous underwater vehicles. IEEE J Ocean Eng 44
(2):363–385
11. Cui R, Zhang X, Cui D (2016) Adaptive sliding-mode attitude control for autonomous
underwater vehicles with input nonlinearities. Ocean Eng 123:45–54
12. Chu Z, Xiang X, Zhu D, Luo C, Xie D (2017) Adaptive fuzzy sliding mode diving control for
autonomous underwater vehicle with input constraint. Int J Fuzzy Syst 10–11
13. Wang B (2008) On discretization of sliding mode control systems on discretization of sliding
mode control systems, Ph.D thesis, RMIT University
14. Hung JY, Gao W, Hung JC (1993) Variable structure control: a survey. IEEE Trans Ind
Electron 40(1):2–22
15. Sanchez-Gonzalez PL, Díaz-Gutiérrez D, Leo TJ, Núñez-Rivas LR (2019) Toward
digitalization of maritime transport. Sensors 19(4):926
16. Ogata K (1995) Discrete-time control systems, 2nd edn. Prentice Hall International, Inc.,
Upper Saddle River
17. Singh DP, Agarwal S, Gupta UK (2014) A technical review on discrete-time sliding mode
controller for linear time-varying systems. Int J Eng Tech Res 5:289–291
18. Feng Y, Xue C, Yu X, Han F (2018) On a discrete-time quasi-sliding mode control. In:
Proceedings of IEEE international workshop on variable structure systems, pp 251–254
19. Lee P, Hong S, Lim Y (1997) Self-tuning control of autonomous underwater vehicles based
on discrete variable structure system. In: IEEE Conference Proceedings Oceans 1997, vol 2.
MTS, pp 902–909

20. Lee PM, Hong SW, Lim YK, Lee CM, Jeon BH, Park JW (1999) Discrete-time quasi-sliding
mode control of an autonomous underwater vehicle. IEEE J Ocean Eng 24:388–394
21. Zhang S, Yu J, Zhang A (2010) Discrete-time quasi-sliding mode control of underwater
vehicles. In: Proceedings of the world congress on intelligent control and automation,
pp 6686–6690
22. Wu B, Li S, Wang X (2009) Discrete-time adaptive sliding mode control of autonomous
underwater vehicle in the dive plane. Springer, Heidelberg, pp 157–164
23. Lecce ND, Laschi C, Bibuli M, Bruzzone G, Zereik E (2015) Neural dynamics and sliding
mode integration for the guidance of unmanned surface vehicles. In: MTS/IEEE ocean.
Genova discovering sustainable ocean energy a new world, pp 1–6
24. Verma S, Abidi K, Xu JX (2016) Terminal sliding mode control for speed tracking of a
carangiform robotic fish. In: Proceedings of the IEEE international workshop on variable
structure systems, July, pp 345–350
25. Milosavljevic D (1985) General conditions for existence of a quasi-sliding mode on the
switching hyperplane in discrete variable structure systems. Autom Remote Control 46:307–
314
26. Gao W, Wang Y, Homaifa A (1995) Discrete-time variable structure control systems. IEEE
Trans Ind Electron 42(2):117–122
27. Bartoszewicz A (1998) Discrete-time quasi-sliding-mode control strategies. IEEE Trans Ind
Electron 45:633–637
28. Bsili I, Ghabi J, Messaoud H (2018) Discrete sliding mode control of inverted pendulum. In:
World symposium on mechatronics engineering and applied physics 2015, November
29. Dias MSG (2017) Discrete time sliding mode control strategies applied to a multiphase
brushless DC machine, Ph.D. thesis, Kassel University
30. Ngadengon R, Sam YM, Osman JHS, Ghazali R (2011) Controller design for inverted
pendulum system using discrete sliding mode control. In: Proceedings of the 2011 2nd
international conference on instrumentation control and automation ICA 2011, November,
pp 130–133
31. Liao TL (1997) On discrete-time variable structure control systems. J Control Syst Technol 5
(4):285–290
32. Fossen TI (1994) Guidance and control of ocean vehicles, 4th edn. Wiley, New York
33. Geranmehr B, Nekoo SR (2015) Nonlinear suboptimal control of fully coupled non-affine
six-DOF autonomous underwater vehicle using the state-dependent Riccati equation. Ocean
Eng 96:248–257
34. Gerdönmez F (2007) Simulation of motion of an underwater vehicle, Ph.D. thesis, Middle
East Technical University
35. Draženović B, Milosavljević C, Veselić, B (2013) Comprehensive approach to sliding mode
design and analysis in linear systems. In: Advances in sliding mode control, Springer, pp 1–19
Impact of Acoustic Signal on Optical
Signal and Vice Versa in Optoacoustic
Based Underwater Localization

M. R. Arshad and M. H. A. Majid

Abstract Underwater localization is an important process in order to determine the


approximate location of a deployed underwater tool such as different types of
underwater vehicle. A common underwater localization depends on acoustic signal,
but it has disadvantages of high development cost, slow propagation speed, high
attenuation and only works effectively at a long distance. Optic is an alternative
approach for underwater localization. Optical signal has advantages of low cost and
high propagation speed, but it has the disadvantage of shorter detection range
compared to an acoustic signal. A combination of both approaches is known as an
optoacoustic which eliminates the disadvantages of each individual approach and
can be used for both short and long distance localizations. However, since both
signals are travelling waves, the use of both signals simultaneously may introduce
interference. This paper investigates this possibility through experimentation. The
results of the investigation show that interference does exist when both signals are
used simultaneously underwater.

Keywords Underwater localization · Optical based localization · Acoustic based localization · Optoacoustic

1 Introduction

A successful underwater operation depends on a reliable underwater positioning


system. On the ground applications, Global Positioning System (GPS) which uses
radio signal is widely used for positioning or localization purposes. However, in an
underwater environment, GPS or any RF-based localization methods cannot work
properly due to hostile aquatic channel conditions. Underwater localization can be


categorized into two types, namely long range localization and short range local-
ization. Long range localization is used for tracking underwater vehicle as it
maneuvers while short range localization is used for underwater docking during
recovery or power recharging. In general, underwater localization can be performed
through acoustic, radio frequency (RF) and optical waves. However, most of the
current technology of underwater localization depends stiffly on acoustic signal in
which distances are mostly estimated from the time delay estimation. This process
requires an accurate estimation of time delay, in order to obtain an accurate position
estimation which typically involves a complex signal processing. Thus, in order to
obtain high accuracy position estimation, a large size hydrophone array or a high
sampling rate is needed. Additionally, acoustic waves have the disadvantages of
slow propagation speed, high attenuation, low bandwidth and give bad impact on
marine life [1]. However, acoustic signal has a large field-of-view (i.e. detection
radius) and can travel in a long distance. Similarly, RF has disadvantages of high
attenuation, high absorption, requires huge antennas and transmission power, and it
is limited to shallow water applications.
On the other hand, optical-based underwater localization is another approach that has
only recently been studied. Typically, optical waves have the advantages of high speed,
low cost and energy conservation, but they have a shorter operating range and require
point-to-point communication (i.e. a narrower field-of-view), where the receiver and
transmitter must be aligned within a limited detection range in order to avoid losing the
connection, compared to RF or acoustic waves. By combining the acoustic and optical
approaches, optoacoustic-based underwater localization has been developed as a new
way of performing underwater localization. In the following subsections, some related
works on acoustic-based, optical-based and optoacoustic-based localization are
discussed.

1.1 Acoustic-Based Localization

The acoustic-based underwater localization problem is a common research problem
addressed in many studies and is widely used in a wide range of commercial
applications. In a real underwater environment, acoustic-based localization can be used
not only to determine the underwater position of a specific target, but also to track the
source of interest as it moves [2].
Acoustic waves are the most favourable medium for underwater communication,
positioning and localization, since no radio wave can propagate efficiently underwater.
However, underwater acoustic channels are characterized by harsh physical-layer
conditions: low bandwidth, high propagation delay, high bit error rate and a variable
speed of sound pose unique challenges for underwater localization. Common methods
for underwater localization are Ultra Short Baseline (USBL), Short Baseline (SBL) and
Long Baseline (LBL) [3]. Usually, these acoustic-based localization methods depend on a
trigonometric solution expressed in terms of the distances between a transmitter (i.e. the
acoustic source) and a set of receivers (i.e. hydrophones in an array). The distances are
determined directly from the time delay or phase delay estimation [4]. Provided that the
underwater speed of the acoustic signal is known (i.e. can be accurately estimated), the
distance can be estimated from the estimated time or phase delay.
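As a simple illustration of the time-delay principle described above, the sketch below converts a measured one-way propagation delay into a range estimate; the 1500 m/s sound speed is a nominal seawater value, not from the paper, and would in practice be estimated for the actual water conditions.

```python
def range_from_delay(delay_s, sound_speed=1500.0):
    """Estimate the transmitter-receiver distance from a one-way acoustic time delay."""
    return sound_speed * delay_s

# Example: a 20 ms one-way delay corresponds to roughly 30 m of range
print(range_from_delay(0.020))
```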
An underwater acoustic signal is influenced by path loss, noise, multi-path,
Doppler spread and highly variable propagation delay. The direction of the underwater
acoustic communication also affects the acoustic link, which means that different
propagation directions have different propagation characteristics, especially with
respect to the time dispersion, multi-path spread and delay variance. Hence, the
underwater acoustic channel is a temporal and spatial variable system, which makes
the available bandwidth limited and intensely dependent on both range and fre-
quency [4].

1.2 Optical-Based Localization

Optical sources such as lasers and light emitting diodes (LEDs) have been used widely in
many applications, from devices as simple as pointers to advanced defence weapons. In
underwater applications, however, the utilization of optical signals, specifically laser
technology, is still limited due to the strong absorption of light intensity by seawater.
Nevertheless, the use of lasers for underwater communication, imaging and localization
is seen as a future underwater localization technology [5]. Although most lasers cannot
penetrate seawater over a long distance, they perform better than LEDs. Some lasers,
such as blue and green lasers, can propagate from several hundred metres to several
kilometres in seawater, depending on the intensity of the laser. This type of laser has
been studied for underwater broadband communication systems in [6]. Similar works
with different research considerations can be found in [7] and [8].
LEDs, on the other hand, have been used for underwater communication in very short
range applications [9, 10]. High-power LEDs are used to assist acoustic devices in the
localization of underwater swarm robotics [11], where the high-powered LED is used to
calculate distances between the robots. The downside of using LEDs is that they have to
be high powered and their wavelength has to be properly selected due to the high light
absorption coefficient of water. Moreover, compared to lasers, light emitted from an
LED is easily scattered. Optical methods are efficient at close range and in clear water
conditions, while acoustic methods work efficiently at long range and are not
significantly affected by turbidity. In optical-based operation, the laser is proven to have
a better transmission range, higher data rate, lower latency and better power efficiency
compared to the Light Emitting Diode (LED) [1]. However, the light beam propagation
suffers from absorption, scattering and multipath fading. The optical receiver is
commonly developed based on an optical camera, but a camera is easily affected by
lighting. In other research work, a laser-based vision system has been used to localize an
underwater vehicle [12]. The laser-based vision system consists of a camera and

two laser pointers as its major components. Basically, the laser pointers will serve
as the target and the camera will capture the image of the target in the form of two
dotted points. Based on the captured image, the underwater vehicle will be able to
know its location with respect to the targets. The downside of using the camera
instead of acoustic is its short working range (i.e. 40–150 cm). The work has been
expanded to include an inertial measurement unit (IMU) to assist the localization
system [13].

1.3 Optoacoustic-Based Localization

One optoacoustic application in the underwater environment, bathymetry in turbid water
known as optoacoustic underwater remote sensing (OAURS), has been studied in [14] to
improve accuracy and enhance the speed of the process. Additionally, optoacoustics has
been studied for both outdoor and indoor localization [4, 15]. Underwater, optoacoustics
has been researched as an ultra-short laser based underwater acoustic signal generator
[16]. Moreover, remotely operated underwater vehicle (ROV) guidance based on
optoacoustic data fusion, optoacoustic-based mosaicking and positioning of an
underwater vehicle is proposed in [17]. Other examples include fish tracking using
optical and acoustical data fusion, studied in [18], and an optical and acoustic based
underwater sensor network, studied in [19].
Optoacoustics can also be used for seabed mapping and motion estimation of an
underwater vehicle [20]. Beyond the underwater localization application, optoacoustics
has been used to perform underwater mapping using multiple robots; in the context of
that work, 'opto' means an imaging device rather than a laser sensor, and the imaging
device operates together with an acoustic sensor for multi-AUV trajectory optimization
[20]. A communication modem based on the hybridization of acoustic and optical
signals has also been proposed, for which a prominent solution is determined through
simulation studies that show the efficiency and effectiveness of the optoacoustic-based
underwater localization solution [21]. In general, a primary advantage of optoacoustics is
that it allows the drawbacks of the low resolution of acoustic sensors and the limitations
of optical sensors in poor visibility conditions to be compensated. In addition, by
combining both acoustics and optics in a single underwater localization system,
localization accuracy can be maintained for both short and long distance purposes.
However, the above studies mostly focus on the direct implementation of optoacoustic
technology without considering the impact of the acoustic signal on the optical signal.
Since both acoustic and optical signals are travelling waves, it is important to study the
possibility of interference in order to ensure that accurate readings and reliable
optoacoustic-based localization can be realized. This consideration is important to ensure
that the readings obtained by the optical and acoustic sensors can be fully trusted (i.e. are
free from interference) and thus that significant estimation errors can be avoided during
the localization process. In order to study the above-mentioned problem, the rest of this
paper is organized as follows: in Sect. 2, the research methodology taken to investigate
the problem is discussed in detail; in Sect. 3, the research findings are discussed.

2 Methodology

In order to investigate the impact of the acoustic signal on the optical signal, the overall
experimental setup used for the investigation is shown in Fig. 1. In this study, we
investigate the impact when the optical and acoustic sources are located perpendicular to
each other (i.e. the most critical orientation, which gives the largest possible
interference). As can be seen from Fig. 1, the optical source is a diffused green laser and
the acoustic source is transmitted by a wideband underwater acoustic transmitter. A
green laser is selected since it has better penetration performance compared to other
colours (i.e. different wavelengths). Notice that the diffuser is used to increase the
field-of-view of the laser beam. The acoustic source is generated by a signal generator;
the parameters of the generated acoustic signal used in this study are shown in Table 1.
A sine wave is used since its fundamental frequency is easy to identify and it contains no
harmonics.

Fig. 1 Experimental setup to determine effect of acoustic on optic and vice versa

Table 1 Parameters for acoustic signal source

Parameter            | Value      | Unit
Type                 | Sinusoidal | –
Magnitude            | 0–12       | VDC
Frequency            | 1–10000    | kHz
Distance to receiver | 40         | cm

Table 2 Specification of laser, acoustic transmitter and hydrophone

Device               | Parameter             | Value     | Unit
Laser                | Colour/wavelength     | Green/530 | nm
                     | Power supply          | 12        | V
                     | Power                 | 50        | W
Acoustic transmitter | Power supply          | 12        | V
                     | Max bandwidth         | 10        | MHz
Hydrophone           | Frequency             | 170       | kHz
                     | Sensitivity           | −211 ± 3  | dB re 1 V/µPa
                     | Operating temperature | −2 to 80  | °C

The optical receiver is a Photoresistor (i.e. LDR-Light Dependent Resistor)


which is placed in a waterproof container. The optic intensity measured by a
Photoresistor is received by a microcontroller and then transferred to a computer for
real time data monitoring and analysis. The microcontroller is responsible for
converting a received analog signal from LDR to a digital signal. The intensity of
the acoustic signal is measured by a hydrophone (i.e. underwater acoustic sensor)
and transmitted to a computer through a PicoScopeTM (i.e. digital oscilloscope).
The specifications of the laser, acoustic transmitter and hydrophone are given in
Table 2.
The actual lab-scale experimental setup is shown in Fig. 2, and the inside view of the
tank with the orientation of the receivers and transmitters is shown in Fig. 3. The size of
the tank used in this study is 52 × 38 × 31 cm. The impact of the acoustic signal on the
optical signal, and vice versa, is measured for different variations of the input
parameters. The measured outputs are the LDR intensity (i.e. measured as an ADC value
and converted to lux) and the hydrophone intensity (i.e. measured in dBm). In order to
prevent the measured light intensity from being disrupted by ambient lighting (i.e. to
ensure reading consistency), a cover is placed over the tank. In other words, the recorded
data are measured in a dark environment where the LDR measures only the light
intensity from the diffused laser beam. The input and output parameters used in this
experiment are listed in Table 3.


Fig. 2 Actual experimental setup


Fig. 3 View inside the tank used for investigation

Table 3 Parameters for acoustic signal source

Parameter | Acoustic on optic                | Optic on acoustic
Input     | Amplitude (VDC), Frequency (kHz) | Light intensity (lx)
Output    | Light intensity (lx)             | Amplitude (dBm)

3 Results and Discussions

The results shown in Fig. 4(a) through Fig. 4(f) were obtained by recording the
ADC value received from the microcontroller. Then, the value of light intensity in
Lux (lx), Ilx is given by
184 M. R. Arshad and M. H. A. Majid

Fig. 4 Measured light intensity for different values of applied acoustic signal: (a) 3 V, (b) 5 V,
(c) 7 V, (d) 9 V, (e) 12 V, (f) average intensity

$I_{lx} = \dfrac{L_{ADC} \times R_{ANA}}{R_{ADC}}$   (1)

where $L_{ADC}$ is the ADC reading, $R_{ANA}$ is the maximum analog voltage range and $R_{ADC}$
is the maximum ADC value. The presented results are calculated based on an average of
2000 samples with three sets of measurements.
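The conversion in (1) amounts to scaling the raw ADC reading by the analog reference range; a minimal sketch is shown below. The 10-bit, 5 V ADC values are assumptions for a typical microcontroller and are not stated in the paper.

```python
def adc_to_lux(adc_value, r_ana=5.0, r_adc=1023):
    """Apply Eq. (1): I_lx = adc_value * R_ANA / R_ADC (assumed 10-bit, 5 V ADC)."""
    return adc_value * r_ana / r_adc

# Average a batch of raw readings before converting, mirroring the 2000-sample averages
raw_samples = [512, 515, 510, 517]   # placeholder LDR readings
print(adc_to_lux(sum(raw_samples) / len(raw_samples)))
```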
From Fig. 4(a) through Fig. 4(e), it can be observed that both the frequency and the
amplitude of the acoustic source affect the intensity reading of the light. An acoustic
source at low frequencies has a smaller impact (i.e. smaller interference) on the optical
intensity measurement compared to high frequency signals. Note that a high intensity
value means low interference and vice versa. This can be observed from the general
trend of the light intensity value: as the frequency increases, the intensity decreases.
From Fig. 4(f), it can be clearly observed that the magnitude of the acoustic source
also affects the intensity measurement of the optical signal. The larger the magnitude, the
higher the measured intensity, but as discussed earlier, the intensity drops slightly as the
acoustic frequency increases. An illustrative example of how the acoustic signal
interferes with the optical reading is shown in Fig. 5. From the figure, it can be observed
that the intensity measured by the LDR increases as the acoustic pinger is activated
(i.e. ON).
Theoretically, green light has a low absorption coefficient and attenuation, which
relatively gives a good intensity reading. The light beam intensity is affected by
absorption, scattering, and multipath fading effect due to interactions between water
molecules and particles with the photons as it propagates through the water.
However, as the acoustic source is activated (i.e. ON), the scattering effect caused
by the travelling acoustic signal scatters the light beam. As a result, the measured
light intensity by the LDR decreases. Figure 6 shows the impact of optical signal
intensity on acoustic signal intensity measurement (i.e. taken as average value).

Fig. 5 Actual Light sensor (linear scale LDR) response on green light source with ON and OFF
acoustic source (pinger) at 500 kHz. Setting: light source and acoustic source are at 90° from each
other (perpendicular)

Fig. 6 Acoustic intensity versus optical intensity

In this case, the intensity or brightness of the optical signal (i.e. the diffused laser) is
controlled by a potentiometer while the distance between the LDR and the diffused laser
is fixed. It can be observed that the optical signal intensity does not significantly affect
the acoustic signal intensity measurement. Although there are small discrepancies in the
measurements, these are expected to be due to noise from the environment and not to the
change in optical intensity. This is because optical signal transmission is not associated
with the pressure changes measured by the hydrophone. Thus, based on the above
findings, it can be concluded that the acoustic signal has a significant effect on the
optical signal, but not the other way around.

4 Conclusion

In this paper, an experimental study of the interference effect of the acoustic signal on
the optical signal, and vice versa, in an underwater environment is presented. From the
findings, both the frequency and the amplitude of the acoustic signal
affect the intensity reading of the optical signal. On the other hand, the optical
signal does not affect the intensity value of the acoustic signal. In the future, the
study of the impact of various external parameters such as salinity, density and
pressure will be considered and the reliability of the actual optoacoustic based
underwater localization will be studied.

Acknowledgements This research is funded by the Fundamental Research Grant


Scheme (FRGS). Account No.: 1001/PELECT/6071346.

References

1. Saeed N, Celik A, Al-Naffouri TY, Alouini M-S (2019) Underwater optical wireless
communications, networking, and localization: a survey. Ad Hoc Netw 94:101935. https://
doi.org/10.1016/j.adhoc.2019.101935
2. Carroll P, Zhou S, Zhou H, Xu X, Cui J-H, Willett P (2012) Underwater localization and
tracking of physical systems. J Electr Comput Eng 2012:11. https://doi.org/10.1155/2012/
683919
3. Paull L, Saeedi S, Seto M, Li H (2014) AUV navigation and localization: a review. IEEE J
Ocean Eng 39(1):131–149. https://doi.org/10.1109/JOE.2013.2278891
4. Esslinger D, Rapp P, Wiertz S, Rendich H, Marsden R, Sawodny O, Tarín C (2019) Accurate
optoacoustic and inertial 3-D pose tracking of moving objects with particle filtering. IEEE
Trans Instrum Meas: 1–14. https://doi.org/10.1109/tim.2019.2905749
5. Shen C, Guo Y, Oubei HM, Ng TK, Liu G, Park K-H, Ho K-T, Alouini M-S, Ooi BS (2016)
20-meter underwater wireless optical communication link with 1.5 Gbps data rate. Opt
Express 24(22):25502–25509. https://doi.org/10.1364/OE.24.025502
6. Wu T-C, Chi Y-C, Wang H-Y, Tsai C-T, Lin G-R (2017) Blue laser diode enables underwater
communication at 124 Gbps. Sci Rep 7:40480. https://doi.org/10.1038/srep40480
7. Zhou T, Hu S, Mi L, Zhu X, Chen W (2017) A long-distance underwater laser communication
system with photon-counting receiver. In: 2017 16th international conference on optical
communications and networks (ICOCN), 7–10 August 2017, pp 1–3
8. Shan X, Yang C, Chen Y, Xia Q (2017) A free-space underwater laser communication device
with high pulse energy and small volume. In: OCEANS 2017, Anchorage, 18–21 September
2017, pp 1–5
9. Stefano B, Marco C, Silvia G, Ivan S (2013) Advances in underwater acoustic networking. In:
Mobile ad hoc networking: the cutting edge directions. IEEE, pp 804–852
10. Han X, Peng Y, Zhang Y, Ma Z, Wang J (2015) Research on the attenuation characteristics of
some inorganic salts in seawater, vol. 10
11. dell'Erba R, Moriconi C (2015) High power LEDs in localization of underwater robotics
swarms. IFAC-PapersOnLine 48(10):117–122. https://doi.org/10.1016/j.ifacol.2015.08.118
12. Wu S, Zhou P, Yang C, Zhu Y, Zhi H (2019) A novel approach for underwater vehicle
localization and communication based on laser reflection. Sensors 19(10):2253
13. Vila AP (2018) 3D underwater SLAM using sonar and laser sensors. University of Girona
14. Farrant D, Burke J, Dickinson L, Fairman P, Wendoloski J (2010) Opto-acoustic underwater
remote sensing (OAURS) - an optical sonar? In: OCEANS 2010 IEEE SYDNEY, 24–27 May
2010, pp 1–7
15. Esslinger D, Rapp P, Wiertz S, Sawodny O, Tarín C (2018) Highly accurate 3D pose
estimation for economical opto-acoustic indoor localization. In: 2018 15th international
conference on control, automation, robotics and vision (ICARCV), 18–21 November 2018,
pp 1984–1990
16. Brelet Y, Jarnac A, Carbonnel J, André Y-B, Mysyrowicz A, Houard A, Fattaccioli D,
Guillermin R, Sessarego J-P (2015) Underwater acoustic signals induced by intense ultrashort
laser pulse. J Acoust Soc Am 137(4):EL288–EL292. https://doi.org/10.1121/1.4914998
17. Campos R, Gracias N, Palomer A, Ridao P (2015) Global alignment of a multiple-robot
photomosaic using opto-acoustic constraints. IFAC-PapersOnLine 48(2):20–25. https://doi.org/10.1016/j.ifacol.2015.06.004
18. Boudhane M, Nsiri B (2017) Fish tracking using acoustical and optical data fusion in
underwater environment. Paper presented at the proceedings of the international conference
on watermarking and image processing, Paris, France
19. Vasilescu I, Kotay K, Rus D, Dunbabin M, Corke P (2005) Data collection, storage, and
retrieval with an underwater sensor network. Paper presented at the proceedings of the 3rd
international conference on embedded networked sensor systems, San Diego, California, USA

20. Lagudi A, Bianco G, Muzzupappa M, Bruno F (2016) An alignment method for the
integration of underwater 3D data captured by a stereovision system and an acoustic camera.
Sensors 16(4):536
21. Kumar ML, Rani MJ (2019) A design of novel hybrid opto-acoustic modem for underwater
communication. Int J Innov Technol Explor Eng (IJITEE) 8(8):7
Design and Development of Mini
Autonomous Surface Vessel
for Bathymetric Survey

Muhammad Ammar Mohd Adam, Zulkifli Zainal Abidin,


Ahmad Imran Ibrahim, Ahmad Shahril Mohd Ghani,
and Al Jawharah Anchumukkil

Abstract A bathymetric survey is necessary for monitoring the hydrographic
environment. The conventional approach involves operators, including an engineer, a
surveyor and a boat captain, being onboard the vessel during survey works. The
application of the Autonomous Surface Vessel (ASV) is becoming popular for
bathymetric mapping as it reduces operation cost and replaces human operation in
high-risk areas. Available commercial ASVs are typically designed for a particular task
with limited expandability to other similar applications. Also, with the current
miniaturization of industrial sensors, a small-sized ASV is sufficient for most inland
water survey operations. In this paper, the design and development components of a
modular mini ASV, named Suraya-1, are detailed. This vessel is developed for
hydrographic survey using a singlebeam depth sonar, measuring the depth and
temperature of a water body. The specifications and performance of the developed ASV
are compared to existing commercial unmanned vessels of the same class and
application. The dimensions of the vehicle, 1.04 m × 0.35 m × 0.32 m, are the smallest
compared to its counterparts, and it weighs only 6.8 kg without payload. Our ASV is
powered by two paralleled 18.5 V LiPo batteries, which is in the mid-range, yet it is able
to reach the navigation speed of 4 knots required for survey work. The real-time vessel
pose and collected data are transmitted to the ground station within a range of 2 km. For
performance evaluation, the developed ASV is tested in a pool environment. The
qualitative outcome shows minimal error in navigation control, and the output data
obtained are shown to be consistent and reliable for a calm water environment.

Keywords Autonomous Surface Vessel  Bathymetry mapping  Hydrographic


survey

M. A. Mohd Adam (&) · Z. Zainal Abidin · A. I. Ibrahim · A. S. Mohd Ghani · A. J. Anchumukkil
International Islamic University Malaysia, 53100 Gombak, KL, Malaysia
e-mail: [email protected]


1 Introduction

An Unmanned Surface Vessel (USV) is a vessel that operates on the surface of water without an onboard crew. A typical unmanned vessel is made up of a surface vehicle, a ground control station, a communication and control link, as well as a logistics module [9]. While USVs have been developed widely, the fully autonomous version, the Autonomous Surface Vessel (ASV), is still at the development stage. ASVs, which were initially developed for military purposes, have recently also been used extensively in commercial and scientific research applications as they become a cheaper and easier solution for operations. Missions such as maritime security [4], oil spill handling [5], search and rescue [7], bathymetry mapping [3, 6] and environmental monitoring [1] are among the works taking advantage of this technology.
Bathymetric survey is an essential operation in environmental monitoring. Conventionally, these works require multiple operators to be on board the survey vessel, including a surveyor, an engineer and a boat captain. In many hydrography applications, ASVs are becoming a more popular alternative as they replace human operation in dangerous and remote areas and reduce the cost of manpower and operations.
In the current market, the common types of vessels available are mono-hull and catamaran designs [9]. Gürsel et al. presented a hydrodynamic analysis of different hull forms considered for their catamaran-design vessel for geological surveys of coastal and offshore areas [2]. Although a catamaran in general provides better stability and payload capacity than single-hull platforms, a mono-hull offers lower cost with sufficient stability and maneuverability in inland waters, which fulfills the requirement of our application.
In another work by Vasilj et al., the developed surface vessel, which is also a catamaran, is designed with a modular hardware architecture in which components can be replaced with suitable payloads for different applications [10]. Navigation, data collection and propulsion are separated into three control levels with separate microcontrollers, connected over a shared data link. The work specifically targets a research platform, with only low-cost sensors integrated into the prototype. A modular system utilizing marine-standard sensors is yet to be implemented.
Typically, commercial ASVs are designed with specific functionalities and lack the modularity needed to implement other hydrographic operations. Moreover, with the recent advancements in the miniaturization of survey sensors, a small-sized vessel is sufficient to support the necessary operation payloads. Prainetr and Janprom presented a mini survey robot of 1-m length in [8], integrated with a sonar sensor operating at 200 kHz with a scanning range of 45°. However, the specifications of the sonar sensor used are not included in the paper.
In this particular work, the research is limited to the scope of developing an ASV prototype of the small-sized vessel class for hydrographic survey in inland waters, including rivers, lakes and dam reservoirs. Specifically, this ASV is named Suraya-1. The platform can navigate in two modes, autonomous or remotely controlled, while transmitting the real-time survey data obtained by the singlebeam echo-sounder (SBES) installed on the vessel to be monitored onshore by a marine geology surveyor. Among the objectives of the development are:
• To reduce operation and research costs while obtaining comparable data-measurement performance through utilizing industrial-grade sensors
• To modularize the design, allowing flexibility in integrating alternative marine-standard sensors for other forms of water survey work such as water quality monitoring
• To increase portability and ease of operation through a light-weight, small-sized vessel that can be operated by a minimal number of workers
This project is in collaboration with Temasek Hidroteknik, a company experienced in hydrographic survey works and interested in innovative marine product development.

2 System Overview

The overall system is designed to fulfill the functional and specification needs of the hydrographic survey industry, with the limiting scope defined by the collaborating industrial partner.

2.1 Functional Requirements

For bathymetry work, industrial-grade sensors are to be integrated within the system to ensure that quality data output is obtained from the survey conducted using the developed ASV. The typical data required are the depth and temperature of the surveyed water body. However, it is also essential for the system to obtain the current position and timestamp for every reading recorded from the singlebeam sensor. Both of these values are obtained using the Global Navigation Satellite System (GNSS) and transmitted to the ground station via 900 MHz radio frequency wireless communication, together with the information from the sonar, IMU and compass sensors. On the ground, the received data are used to plot the corrected depth readings at the corresponding geo-spatial positions on a global map. The integrated sensors and their specifications are listed in Table 1. The transmission range limit of the existing communication device is 2 km.

Table 1 List of integrated sensors for hydrographic survey with their corresponding functions

Sensor: Hemisphere Atlaslink (GNSS)
  GNSS sensitivity: −142 dBm; Update rate: 10 Hz; Pitch/roll accuracy: 1; Horizontal accuracy: RTK – 10 mm, L-band – 0.04 m, SBAS – 0.3 m

Sensor: Inertial Labs AHRS-II (IMU)
  Heading accuracy: 0.3; Pitch & roll accuracy: 0.05; Gyroscope bias in-run stability: 1°/h; Accelerometer bias in-run stability: 0.005 mg

Sensor: KVH C100 (Compass)
  Accuracy: ±0.5; Repeatability: ±0.2; Resolution: 0.1; Response time: 0.1 to 24 s

Sensor: Airmar SS510 (Singlebeam sonar)
  Min. depth reading: 0.4 m; Max. depth reading: 200 m; Depth resolution: 0.01 m; Depth precision: 0.25% of full range

Apart from data collection, the other essential part of the system is navigation. In terms of functional requirements, it is also important for the operator to be able to control the motion of the vessel in real time. Being unmanned, the system is designed to be controlled remotely using a Remote Control as well as autonomously, following pre-defined waypoints set by the operator. A user-friendly Graphical User Interface (GUI) is also necessary to reduce the learning curve required for operators to adapt from traditional onboard control.

2.2 Specification Requirements

This development is designed considering specifications suitable for the real environment of survey work. To reduce the amount of resources required for an operation, the dimensions and weight should be manageable by one or two operators. The vessel is based on an existing hull design with a deep-V shape, which fits the requirement of surveying calm inland water bodies such as rivers and dams.
In terms of operating period, a minimum of 1 h running time is sufficient, with an operating speed of at least 3 knots during survey. This translates to a technical specification of a minimum battery capacity of 10000 mAh to power a 150–250 W brushed motor at 18.5 V. A longer operation will require a larger-capacity power source.
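As a rough feasibility check of this specification (a back-of-envelope sketch, not part of the original design calculations), the nominal endurance can be estimated from the battery energy and the quoted motor power:

# Back-of-envelope endurance check for the quoted power specification
# (illustrative only; ignores converter losses and non-propulsion loads).
capacity_ah = 10.0            # 10000 mAh minimum battery capacity
voltage_v = 18.5              # nominal pack voltage

energy_wh = capacity_ah * voltage_v   # ~185 Wh stored energy

for power_w in (150, 250):            # quoted brushed-motor power range
    endurance_h = energy_wh / power_w
    print(f"{power_w} W draw -> ~{endurance_h:.2f} h endurance")
# Output: roughly 1.23 h at 150 W and 0.74 h at 250 W, i.e. of the same
# order as the stated minimum 1 h running time.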

One of the distinguishing features of this ASV is the minimal operator requirement for launch and recovery due to its miniaturized size. This is a significant advantage in reducing operating cost as well as improving the efficiency of hydrographic work. Other than that, the ASV is designed with a modular sensor-mount concept, which provides flexibility to accommodate other variations of sensors. This allows the same system to be easily adapted for other similar applications such as water monitoring and sampling.

3 Hardware Design

This section presents the hardware design of the ASV, including the hull design, the overall system block diagram and the sub-systems: navigation, data collection, communication and power management.

3.1 Hull Selection

Among the criteria considered in selecting a suitable vessel as the platform for autonomous survey are a minimal operator requirement, low-cost development and sufficient stability to support payloads of up to 20 kg of survey equipment. In terms of hull design, a comparison between a mono-hull deep-V bottom and a catamaran has been conducted, as these are the two most common hull shapes used for small-sized vessels.
The hull design of the selected vessel is a mono-hull deep-V bottom with dimensions of 1.04 m in length, 0.32 m in height and 0.35 m in width. This design is preferable due to its lower cost and easier operation, while being sufficiently stable for the task required. Figure 1 illustrates the CAD drawing of the selected vessel.

Fig. 1 CAD drawing of hull design selected for development from side (left) and isometric view
(right)

3.2 System Block Diagram

The block diagram in Fig. 2 illustrates the integration between the sub-systems involved in developing the ASV, which include navigation, data collection and the ground station.

Fig. 2 Overall block diagram of the ASV system and sub-systems, including the navigation, data collection and ground station systems

3.3 Navigation System

In the autonomous navigation system, the mission control module is the main controller connecting the feedback sensors with the output effectors. The module is built on a 32-bit STM32F765 Cortex-M7 core (216 MHz, 2 MB memory, 512 KB RAM) with a 32-bit STM32F100 Cortex-M3 I/O co-processor (24 MHz, 8 KB SRAM) for failsafe purposes.
Internally, the module contains multiple sensor modules: accelerometers, gyroscopes, a magnetometer (compass) and a barometer. These integrated sensing components are extended with external sensors such as a GPS and another compass. Redundancy of certain sensors is implemented to allow corrections to be calculated in the measurements, giving more accurate and stable readings for pose estimation – position, orientation and motion.
The vessel is propelled by a brushed DC motor as the thruster and steered by a heavy-duty servo as the rudder. The driving signals for the servo and the thruster motor driver are generated by the mission control module, based on autonomous control calculations in autonomous mode or on the remote control input in manual mode.

3.4 Data Collection System

The hardware architecture of the data collection system is structured to be flexible to changes in the type of integrated sensors. The data combiner module is equipped with 5 serial RS-232 ports (4 for payload sensors and 1 reserved for the RF modem), which enables data from 4 different sensors to be combined and transmitted directly to the receiver at the ground station.
The module is programmed to integrate inputs from National Marine Electronics Association (NMEA) based sensors only, as NMEA is the common communication protocol used in marine applications (typically over an RS-232 interface). It is implemented on an ATmega2560, which communicates with the NMEA sensors by means of UART to RS-232 converters (MAX232 level shifters). The combined information is transferred point-to-point to the PC on the ground via 900 MHz band radio modems and is used by the GUI software module for data monitoring.
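To illustrate the kind of processing the data combiner and ground-station software perform on these streams (a minimal sketch assuming standard NMEA 0183 framing; it is not the authors' ATmega2560 firmware, and the example sentence values are invented), a sentence can be checksum-verified and split into fields as follows:

# Minimal NMEA 0183 sentence handling (illustrative sketch only).
def nmea_checksum_ok(sentence: str) -> bool:
    """Verify the XOR checksum of a '$...*hh' NMEA sentence."""
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    body, _, checksum = sentence[1:].partition("*")
    calc = 0
    for ch in body:
        calc ^= ord(ch)
    return f"{calc:02X}" == checksum.strip().upper()

def nmea_fields(sentence: str) -> list:
    """Split the sentence body into its comma-separated fields."""
    return sentence[1:].split("*")[0].split(",")

# Example with a depth-below-transducer (DBT) sentence of the kind a
# singlebeam echo-sounder emits (values made up for illustration).
example = "$SDDBT,4.9,f,1.5,M,0.8,F*07"
if nmea_checksum_ok(example):
    fields = nmea_fields(example)
    depth_m = float(fields[3])   # field 3 carries the depth in metres
    print("Depth:", depth_m, "m")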
The data collection system is supplied with a power input isolated from the navigation system, so that a malfunction in one system does not affect the other. The connected power source powers the data combiner module, while the other components within the system derive their power from the module. In this case, off-the-shelf lithium polymer (LiPo) batteries are sufficient to support the power requirement of the system.

3.5 Ground Station

The station located on the ground is the central unit collecting information from both systems, navigation and data collection. It provides a platform for real-time status monitoring as well as action control of the launched vessel from a remote location.
Any PC or device with a Wi-Fi connection and USB ports can serve as the ground station. The communication interface of the receiving radio modem is converted to USB using an RS-232 to USB converter, while the Wi-Fi connection links directly to the PW-Link transmitter on the vessel. The communication protocol of the device is the User Datagram Protocol (UDP), chosen to achieve low latency and minimal data loss in transmission.

4 Software Design

The software architecture for the navigation and data collection systems follows the same isolation concept as the hardware design. In this section, the waypoint mission algorithm as well as the hydrographic survey software implementation are explained.

4.1 Navigation

In the navigation system, the main autonomous control runs on the mission control module. This device is based on the NuttX operating system and is supported by ArduPilot, an open-source firmware configured for the ASV application. The architecture implemented for this system is shown in Fig. 3.
In order to obtain reliable and accurate estimates of the vehicle position, velocity and angular orientation, an Extended Kalman Filter (EKF) algorithm is implemented, fusing the data from the IMU, GPS, compass and other integrated sensors.
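To give a flavour of the predict/update fusion cycle behind such a filter, the sketch below shows a deliberately simplified one-dimensional linear Kalman filter with invented noise values and fake GPS fixes. It is not the ArduPilot EKF and omits orientation and compass fusion entirely; it only illustrates the general idea of blending a motion model with noisy position measurements.

import numpy as np

# Simplified 1-D constant-velocity Kalman filter (illustrative only).
x = np.array([0.0, 0.0])               # state: [position (m), velocity (m/s)]
P = np.eye(2)                           # state covariance
Q = np.diag([0.01, 0.1])                # process noise (assumed)
R = np.array([[1.0]])                   # GPS position noise (assumed, m^2)
H = np.array([[1.0, 0.0]])              # only position is measured
dt = 0.5                                # time step (s)
F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity motion model

gps_positions = [0.9, 2.1, 2.9, 4.2, 5.0]   # fake GPS fixes along a line

for z in gps_positions:
    # Predict: propagate the state and covariance with the motion model
    x = F @ x
    P = F @ P @ F.T + Q
    # Update: correct with the GPS measurement
    y = np.array([z]) - H @ x                 # innovation
    S = H @ P @ H.T + R                       # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)            # Kalman gain
    x = x + (K @ y).ravel()
    P = (np.eye(2) - K @ H) @ P
    print(f"fix {z:4.1f} m -> est pos {x[0]:5.2f} m, vel {x[1]:4.2f} m/s")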
At the ground control station (GCS), telemetry data is fed into the ground-based PC through Mission Planner, an open-source mission planning application. The software is used for operation waypoint entry, navigation firmware configuration, real-time monitoring of the output from the mission control module, and mission data logging.

Fig. 3 Zoomed view of the ArduPilot architecture re-configured for the ASV application (main loop with the Extended Kalman Filter, flight mode, position control and motor & servo control, running over the Hardware Abstraction Layer with PWM input and output)

4.2 Data Collection

Payload sensor measurement is the essential part of the whole ASV system for this application. The data received from the vessel are recorded and displayed on the ground PC so that the surveyor can monitor the quality of the data during operation. These sensor readings are processed, displayed on a developed GUI, and pushed to an online server to allow data access from the control station located on the ground.

5 System Integration

The navigation and data collection subsystems are initially tested separately on the vessel to validate the performance of each system. Once verified, both systems are installed within the vessel and the integrated system is tested in a pool environment.
The small size of the vessel, with limited internal space and payload capacity, is among the main challenges in this development. On top of that, an off-the-shelf vessel is used to reduce cost, which limits the possibility of full customization.

Table 2 Parameters in the measurement of the longitudinal center of gravity (LCG) for the existing setup. Calculation of the lateral COG is neglected due to the positions of the components being in the center of the body
List of components Weight (kg) LCG (mm) Moment (kg.mm)
GPS 1.15 390 448.5
Singlebeam sonar 1.30 440 572.0
IMU 0.28 600 168.0
Compass 0.07 670 46.9
Mission control module 0.09 800 72.0
Batteries 10.00 540 5400.0
TOTAL: 12.89 – 6707.4

To overcome the limitations of space and payload, a balanced weight distribution of the payloads on board is critical. Each component is weighed and positioned according to a calculated arrangement using moments, as shown in Table 2. The individual moment is calculated using Eq. (1). The longitudinal center of gravity (COG) of all the components onboard, calculated using Eq. (2), is 520.36 mm from the transom, while the longitudinal center of gravity of the vessel is 520 mm. This setup is near the ideal arrangement for balanced longitudinal weight.

$\mathrm{Moment} = \mathrm{Weight} \times \mathrm{LCG}$  (1)

$\mathrm{Distance}_{\mathrm{Ref.\,to\,COG}} = \mathrm{Moment}_{\mathrm{total}} \,/\, \mathrm{Weight}_{\mathrm{total}}$  (2)
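As a quick cross-check of the arrangement in Table 2 (an illustrative recomputation of Eqs. (1) and (2) from the tabulated values, not the authors' design spreadsheet), the longitudinal COG of the payload follows directly:

# Longitudinal COG check using the component list from Table 2.
components = {            # name: (weight in kg, LCG from transom in mm)
    "GPS":                    (1.15, 390),
    "Singlebeam sonar":       (1.30, 440),
    "IMU":                    (0.28, 600),
    "Compass":                (0.07, 670),
    "Mission control module": (0.09, 800),
    "Batteries":              (10.00, 540),
}

total_weight = sum(w for w, _ in components.values())          # 12.89 kg
total_moment = sum(w * lcg for w, lcg in components.values())  # 6707.4 kg.mm

lcg_payload = total_moment / total_weight
print(f"Payload LCG: {lcg_payload:.2f} mm from transom")       # ~520.36 mm
# The hull's own LCG is 520 mm, so this payload arrangement keeps the
# combined longitudinal weight distribution close to balanced.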

On the other hand, to address the limitation of using a pre-fabricated vessel, the available space is optimized. The singlebeam sonar is mounted externally, with a support extending from the top side of the vessel to the bottom, as shown in Fig. 4. This design minimizes the amount of support material required, and hence the added weight, while positioning the payload as close to the COG as possible.

Fig. 4 Design of the ASV with the mounting for the singlebeam sensor on the external body, from isometric and side views. The singlebeam sensor is below the vessel and the GPS is positioned above the vessel

Fig. 5 Fabricated final design of ASV

6 Results and Discussion

The final overall design is fabricated and integrated with all the components required for the navigation and data collection systems. The setup is shown in Fig. 5. The developed ASV is compared with existing commercial ASVs for shallow and calm water in terms of specifications and capabilities, as summarized in Table 3.

Table 3 Comparison of specifications between recent commercial ASVs for singlebeam hydrographic survey of the small-sized class: USV Inception MK1 [13], Teledyne Z-Boat 1250 [12] and OceanAlpha SL20 [11]

Specification   | Suraya-1   | Inception MK1 | Z-Boat 1250 | OceanAlpha SL20
Length          | 1.04 m     | 1.40 m        | 1.27 m      | 1.05 m
Width           | 0.35 m     | 1.32 m        | 0.94 m      | 0.55 m
Weight          | 20 kg      | 37 kg         | 22 kg       | 24 kg
Hull type       | Mono hull  | Twin hull     | Tri hull    | Mono hull
Power           | 18.7 V DC  | 12 V DC       | 14.4 V DC   | 33 V DC
Speed           | 2–4 knots  | 2–3 knots     | 2–3 knots   | 2–5 knots
Endurance       | Up to 2 h  | Up to 4 h     | Up to 4 h   | Up to 6 h
Range           | Up to 2 km | 750 m         | 750 m       | Up to 2 km
Transport       | Via car or van (all vessels)
Launch/recovery | Suraya-1: 1 person from slipway/launching cradle; 1 person from pontoon/river edge
                | Inception MK1: 1 person from slipway/launching cradle; 2 persons from pontoon/river edge
                | Z-Boat 1250: 1 person from slipway/launching cradle; 1 person from pontoon/river edge
                | OceanAlpha SL20: 1 person from slipway/launching cradle; 1 person from pontoon/river edge
Autonomous      | Yes        | Yes           | No          | Yes

From the comparison, it is shown that our ASV has the smallest dimensions in terms of length and width, which also makes it the lightest of the compared ASVs. With a mid-range motor, Suraya-1 is able to deliver the speed necessary for survey work, which is within the range of 2 to 3 knots. However, its current endurance is the lowest among the compared ASVs. The battery capacity can be increased by connecting another battery of the same rating in parallel, but the downside of such an approach is a significant increase in total weight. Typically, 2 h is sufficient for survey work in small areas. The operating range for data collection and navigation is within 2 km, which is on par with the SL20 and longer than the other two competitors.
For transportation, all the compared vessels are designed to fit in at least a car or van. For launch and recovery, only the MK1 requires an extra operator if the ASV is released from a pontoon or river edge; apart from that, a single operator is sufficient for this process. In terms of navigation, our vessel as well as the MK1 and SL20 have autonomous capability, whereas the Z-Boat only allows remote-controlled navigation.
To further evaluate the performance and capabilities of Suraya-1, an experiment was conducted in the fresh-water swimming pool of the International Islamic University Malaysia (IIUM) Male Sports Complex located at Gombak, Malaysia. The specific objective of the trial was to assess the functionality of the ASV in performing a hydrographic survey in a shallow inland water environment. The depth of the test pool ranges between 1.5 and 2.5 m, which is within the detection range of our singlebeam sonar sensor.

Fig. 6 Sample pool test implementation showing path travelled for pre-defined waypoints. The
yellow line is the travelled path and red line is the target path. Green pinpoints are the target points

The resulting navigation path of our ASV for the pool test is shown in Fig. 6. From the qualitative evaluation, the navigation system of our ASV prototype is able to follow the defined waypoint path with minimal deviation error in maintaining the pre-calculated target path.
From the experiment conducted, the sample raw data output obtained from the payload sensors is extracted and tabulated in Table 4. The data is received as NMEA strings synchronised based on timestamp. It is observed that the GPS coordinates are received in real time and provide accurate positioning of the vessel. Also, the depth and temperature readings match the values expected from manual measurements. For the IMU data, based on our observation, the heave values obtained are as expected, since very minimal rolling motion affects the vessel in the pool. At this stage, the compass performance is also evaluated based on observation only; further analysis will be conducted in future work.

Table 4 Sample of data output collected from payload sensors for the pool test hydrographic survey conducted in IIUM

Timestamp               | Latitude (°) | Longitude (°) | Depth (m) | Temperature (°C) | Heave (m) | Compass (°)
2019-08-27 11:11:52.245 | 3.2504278    | 101.7405416   | 1.50      | 30.08            | −0.01     | 138.2
2019-08-27 11:11:52.740 | 3.2504278    | 101.7405394   | 1.50      | 30.08            | −0.01     | 137.2
2019-08-27 11:11:53.730 | 3.2504289    | 101.7405369   | 1.50      | 30.08            | −0.01     | 136.2
2019-08-27 11:11:54.769 | 3.2504317    | 101.7405336   | 1.50      | 30.08            | −0.01     | 135.3
2019-08-27 11:11:55.264 | 3.2504333    | 101.7405319   | 1.51      | 30.08            | −0.01     | 134.5
2019-08-27 11:11:55.761 | 3.2504355    | 101.7405292   | 1.51      | 30.08            | −0.01     | 133.2
2019-08-27 11:11:56.752 | 3.2504391    | 101.7405257   | 1.51      | 30.08            | −0.01     | 132.7
2019-08-27 11:11:57.749 | 3.2504419    | 101.7405227   | 1.51      | 30.08            | −0.01     | 132.3
2019-08-27 11:11:58.240 | 3.2504429    | 101.7405212   | 1.51      | 30.08            | −0.01     | 131.9
2019-08-27 11:11:59.772 | 3.2504465    | 101.7405171   | 1.51      | 30.08            | −0.01     | 131.6

7 Conclusion and Future Works

In this paper, the design and development of a light-weight class autonomous surface vessel (ASV) for hydrographic survey is presented. The realized vessel is to be tested and applied in a real industrial survey application, where a robust and stable control system is critical. The targeted test environment is specifically a calm inland water body.
A modular architectural design for various payloads has been considered in designing the systems, software and hardware. This introduces the ability to expand the potential of the ASV for use with other sensors and applications supportable by the designed platform. With the existing setup and capacity, it can support extra payloads of up to 20 kg in total.
In comparison to existing commercial ASVs of the same class and application, our ASV stands out as the smallest in dimensions and the lightest in weight. This contributes to a more efficient vessel, which reaches speeds of up to 4 knots with mid-range batteries. However, the endurance of the vessel is currently low compared to its counterparts and requires further improvement.
To improve the capability of the ASV in operation, the propeller and battery capacity can be upgraded to better specifications. As a replacement for the current brushed DC motor, a brushless motor would be a more efficient solution. For the power source, installing more parallel-connected batteries of the same capacity will improve the operation time, with the trade-off of increased weight. Depending on the requirements of the operation, these improvements shall be considered in future development.

Acknowledgements This research is supported by the Research Initiative Grant Scheme RIGS16-348-0512, International Islamic University Malaysia (IIUM), with equipment and additional financial support by Temasek Hidroteknik.

References

1. Dunbabin M, Grinham A (2017) Quantifying spatiotemporal greenhouse gas emissions using autonomous surface vehicles. J Field Robot 34(1):151–169
2. Gürsel KT, Taner M, Ünsalan D, Neşer G (2018) Design of a marine autonomous surface vehicle for geological and geophysical surveys. Sci Bull Nav Acad 21:20–36
3. Han J, Park J, Kim T, Kim J (2015) Precision navigation and mapping under bridges with an
unmanned surface vehicle. Auton Robots 38(4):349–362
4. Johnston P, Poole M (2017) Marine surveillance capabilities of the AutoNaut wave-propelled
unmanned surface vessel (USV). In: OCEANS 2017 – Aberdeen, pp 1–46
5. Maawali WA, Al Naabi A, Yaruubi Al M, Saleem A, Maashri A.A (2019) Design and
implementation of an unmanned surface vehicle for oil spill handling. In: 2019 1st
International Conference on Unmanned Vehicle Systems-Oman (UVS), pp 1–6
6. Mat Idris MH, Sahalan MI, Abdullah MA, Zainal Abidin Z (2015) Development and initial
testing of an autonomous surface vehicle for shallow water mapping. ARPN J Eng Appl Sci
10:7113–7118

7. Matos A, Silva E, Cruz N, Alves JC, Almeida D, Pinto M, Martins A, Almeida J, Machado D
(2013) Development of an unmanned capsule for large-scale maritime search and rescue. In:
2013 OCEANS - San Diego, pp 1–8
8. Prainetr S, Surface A (2017) Development of mini hydrography survey robot, pp 2–5
9. Stateczny A, Burdziakowski P (2019) Universal autonomous control and management system
for multipurpose unmanned surface vessel. Polish Marit Res 26(1):30–39
10. Vasilj J, Stancic I, Grujic T, Music J (2017) Design, development and testing of the modular unmanned surface vehicle platform for marine waste detection. J Multimed Inf Syst 4:195–204
11. Autonomous Survey Boat SL20. https://www.oceanalpha.com/product-item/sl20/. Accessed
11 Nov 2019
12. Teledyne Z-Boat 1250. http://www.teledynemarine.com/zboat1250. Accessed 11 Nov 2019
13. The Inception Class MK1 USV. https://www.unmannedsurveysolutions.com/usv-inception-
mki/. Accessed 11 Nov 2019
Control, Instrumentation and Artificial
Intelligent Systems
Optimal Power Flow Solutions for Power System Operations Using Moth-Flame Optimization Algorithm

Salman Alabd, Mohd Herwan Sulaiman, and Muhammad Ikram Mohd Rashid

Abstract This article proposes a recent metaheuristic optimization technique, the Moth-Flame Optimizer (MFO), to solve one of the most important problems in the power system, namely optimal power flow (OPF). Three objective functions are solved simultaneously using a weighting-factor formulation: minimization of fuel cost, transmission loss and voltage deviation. To show the effectiveness of the proposed MFO in solving this problem, the IEEE 30-bus test system is used. The result obtained by the MFO algorithm is then compared with other selected well-known algorithms. The comparison shows that MFO gives better results than the compared algorithms: MFO achieves a reduction of 14.50%, compared to 13.38% and 14.15% for the artificial bee colony (ABC) and the Improved Grey Wolf Optimizer (IGWO), respectively.

Keywords Optimal power flow · MFO · Economic dispatch · Optimal reactive power

1 Introduction

Optimal power flow (OPF) has attracted increasing interest from electrical researchers since it is a key tool that helps utility power systems determine the optimal economic and secure operation of the electric grid. The predominant purpose of OPF is to optimize certain objective functions, such as minimizing fuel cost, emission, transmission loss and voltage deviation, while meeting certain operation constraints such as line capacity, bus voltage, generator capability and power flow balance. The aforementioned objective functions can be solved as a single- or multi-objective problem.

S. Alabd · M. H. Sulaiman (&) · M. I. M. Rashid (&)
Faculty of Electrical and Electronics Engineering Technology, Universiti Malaysia Pahang, 26600 Pekan, Pahang, Malaysia
e-mail: [email protected]
M. I. M. Rashid: e-mail: [email protected]
S. Alabd: e-mail: [email protected]
Optimal reactive power dispatch (ORPD) is a part of optimal power flow (OPF). ORPD has a substantial impact on the security and the economic operation of the electric grid. The ORPD problem contains continuous and discrete variables, so it is considered a mixed nonlinear problem. The control variables of the ORPD problem are the reactive power outputs of generators and static VAR compensators, bus voltage magnitudes, and angles. Another sub-problem of OPF is economic dispatch (ED), one of the complex problems in the power system, which aims to find the optimal allocation of generator unit outputs to meet the load demand at the lowest generation cost while satisfying the equality and inequality constraints.
Several optimization techniques have been used to solve the OPF problem, ranging from traditional to metaheuristic optimization algorithms. In recent years, metaheuristic optimization algorithms have been developed by simulating chemical, physical and biological phenomena. Lately, many nature-inspired meta-heuristic algorithms have been applied to solve the OPF problem and its sub-problems ORPD and ED. The Artificial Bee Colony (ABC) [1], the Opposition-Based Gravitational Search Algorithm (OGSA) [2], the Grey Wolf Optimizer (GWO) [3] and the Harmony Search Algorithm (HSA) [4] have been used to solve ORPD separately. On the other hand, ED has been solved by many meta-heuristics such as the Grey Wolf Optimizer (GWO) [5], the Moth-Flame Optimization (MFO) algorithm [6], Particle Swarm Optimization (PSO) [7], and the Genetic Algorithm (GA) [8]. Moreover, many optimization techniques have been implemented to solve the ED and ORPD problems simultaneously, such as the Improved Grey Wolf Optimizer (IGWO) [9], the Modified Sine-Cosine Algorithm (MSCA) [10], the Gravitational Search Algorithm (GSA) [11] and Particle Swarm Optimization (PSO) [12].
According to the no-free-lunch theorem, no single meta-heuristic algorithm is best for every problem [13]; therefore, in this paper, the Moth-Flame Optimizer is considered for solving the optimal power flow (OPF) problem. The performance of the proposed technique is tested on the standard IEEE 30-bus test system, where the objective functions are the minimization of generation fuel cost, the minimization of power losses and voltage profile improvement.

2 Problem Formulation

The OPF problem is a nonlinear complex optimization problem that minimizes certain objective functions subject to equality and inequality constraints. It can be expressed as follows:

$\min\, f(y, x)$  (1)

subject to

$h(x) = 0$  (2)

$g(x) \le 0$  (3)

In this paper, economic dispatch, optimal reactive power dispatch and voltage profile improvement are taken into consideration as the objective functions, as follows:

2.1 Economic Dispatch

The main objective function of economic dispatch is to reduce the generation cost
which can be formulated as a quadratic equation [14].

$F_1 = \min \sum_{i=1}^{N} F_i(P_i) = \min \sum_{i=1}^{N} \left( a_i + b_i P_i + c_i P_i^2 \right)$  (4)

where $F_1$ is the total fuel cost, $N$ is the total number of generating units, $F_i$ is the fuel cost of generator $i$, $P_i$ is the power generated by generator $i$, and $a_i$, $b_i$ and $c_i$ are the cost coefficients of generator $i$.
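As an illustration of how Eq. (4) is evaluated for a candidate dispatch (a generic sketch; the coefficients and outputs below are made-up values, not the IEEE 30-bus data used in this paper), the total fuel cost of a set of generators can be computed as:

# Quadratic fuel-cost evaluation of Eq. (4) for one candidate dispatch.
# All numbers below are illustrative placeholders.
def total_fuel_cost(p, a, b, c):
    """Sum a_i + b_i*P_i + c_i*P_i^2 over all generating units ($/h)."""
    return sum(ai + bi * pi + ci * pi**2 for pi, ai, bi, ci in zip(p, a, b, c))

p = [180.0, 48.0, 21.0]          # generator outputs P_i (MW)
a = [0.0, 0.0, 0.0]              # constant cost terms ($/h)
b = [2.00, 1.75, 3.00]           # linear cost coefficients ($/MWh)
c = [0.00375, 0.0175, 0.025]     # quadratic coefficients ($/MW^2 h)

print(f"Total fuel cost: {total_fuel_cost(p, a, b, c):.2f} $/h")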

2.2 Optimal Reactive Power Dispatch Problem

The objective function of ORPD is to minimize the real power losses of the transmission system while satisfying the equality and inequality constraints. It is formulated as follows [15]:

$F_2 = \min(P_{Loss}) = \min \sum_{k=1}^{N} G_{ij} \left( V_i^2 + V_j^2 - 2 V_i V_j \cos\delta_{ij} \right)$  (5)

where $P_{Loss}$ is the real power loss in the transmission system, $N$ is the number of lines, and $k$ indexes the line connecting the i-th and j-th buses. $G_{ij}$ is the line conductance between the i-th and j-th buses, $V_i$ and $V_j$ are the voltages at the i-th and j-th buses respectively, and $\delta_{ij}$ is the difference in voltage phase angle between the i-th and j-th buses.

2.3 Voltage Profile Enhancement

The objective function of voltage profile enhancement is to minimize the voltage deviation [3]:

$F_3 = \min(VD) = \min \sum_{i=1}^{N_d} |V_i - 1|$  (6)

where $V_i$ is the voltage at load bus $i$ and $N_d$ is the number of load buses.

2.4 The Weighted Objective Functions

The proposed optimization objective function can be formulated by combining the three aforementioned objective functions into a single objective function as follows [9]:

$F = F_1 + w_1 F_2 + w_2 F_3 \quad (\$/\mathrm{h})$  (7)

where $w_1$ and $w_2$ are the weighting factors, which can be selected by the user [9].

2.5 Equality Constraints

The power flow balance equations are the equality constraints; they state that the total load demand plus the total power losses must equal the total power generation. The equality constraints can be described as follows [9]:

$P_{Gi} = P_{Di} + V_i \sum_{j \in N_i} V_j \left( G_{ij} \cos\theta_{ij} + B_{ij} \sin\theta_{ij} \right)$  (8)

$Q_{Gi} = Q_{Di} + V_i \sum_{j \in N_i} V_j \left( B_{ij} \cos\theta_{ij} - G_{ij} \sin\theta_{ij} \right)$  (9)

2.6 Inequality Constraints

Generator Limits
The voltage, real power and reactive power of each generator must be constrained within their minimum and maximum limits [9]:

$V_{Gi}^{\min} \le V_{Gi} \le V_{Gi}^{\max}, \quad i = 1, 2, \ldots, N$  (10)

$P_{Gi}^{\min} \le P_{Gi} \le P_{Gi}^{\max}, \quad i = 1, 2, \ldots, N$  (11)

$Q_{Gi}^{\min} \le Q_{Gi} \le Q_{Gi}^{\max}, \quad i = 1, 2, \ldots, N$  (12)

Transformer Tap Setting
The tap ratio of each transformer must be constrained within its minimum and maximum limits [9]:

$T_i^{\min} \le T_i \le T_i^{\max}, \quad i = 1, 2, \ldots, N_T$  (13)

Reactive Compensators
The shunt VAR compensators must be constrained within their minimum and maximum limits [9]:

$Q_{Ci}^{\min} \le Q_{Ci} \le Q_{Ci}^{\max}, \quad i = 1, 2, \ldots, N_C$  (14)

3 Moth-Flame Optimizer (MFO)

The Moth-Flame Optimizer is a recent stochastic nature-inspired algorithm proposed by Mirjalili in 2015 [16]. Moths are insects related to butterflies, and they go through two stages in their lifetime: the larval moth and the adult moth. The special navigation technique used by moths to travel at night is called transverse orientation. The idea of transverse orientation is that, by maintaining a fixed angle to a natural light source such as the moon, moths can ensure that they travel in a straight line. Since the moon is very far away, it is effectively stationary and provides a fixed reference point for moths to navigate in a straight line. However, with the advent of artificial lamps, moths get confused, take the lamplight as an artificial moon and try to maintain a similar angle to it; because the light is too close, they end up in a spiral path around the artificial light.

3.1 MFO Mathematical Formulation

The population of moths can be represented as a matrix [16]:

$M = \begin{bmatrix} m_{1,1} & m_{1,2} & \cdots & m_{1,d} \\ m_{2,1} & m_{2,2} & \cdots & m_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ m_{n,1} & m_{n,2} & \cdots & m_{n,d} \end{bmatrix}$  (15)

where $n$ is the number of moths (candidate solutions) and $d$ is the number of variables. The corresponding fitness value of each moth is stored in an array as follows [16]:

$OM = \begin{bmatrix} OM_1 \\ \vdots \\ OM_n \end{bmatrix}$  (16)

A matrix similar to the moth matrix is defined for the flames [16]:

$F = \begin{bmatrix} F_{1,1} & F_{1,2} & \cdots & F_{1,d} \\ F_{2,1} & F_{2,2} & \cdots & F_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ F_{n,1} & F_{n,2} & \cdots & F_{n,d} \end{bmatrix}$  (17)

where $n$ is again the number of moths and $d$ is the number of variables. The corresponding fitness value of each flame is stored in an array as follows [16]:

$OF = \begin{bmatrix} OF_1 \\ \vdots \\ OF_n \end{bmatrix}$  (18)

It is important to note that flames and moths are both candidate solutions; however, they differ in the way they are updated. The moths are the actual search agents that move around the search space, whereas the flames are the best positions obtained by the moths so far. When searching the space, each moth drops a flame as a pinpoint of its best solution, searches around that flame and updates it whenever a better solution is found. In this way, a moth never loses its best result obtained so far. The way each moth updates its position with respect to a flame can be modeled as follows [16]:

$M_i = S(M_i, F_j)$  (19)

where $M_i$ and $F_j$ indicate the i-th moth and the j-th flame respectively, while $S$ represents the spiral function. The logarithmic spiral function used as the update mechanism is modeled as follows [16]:

$S(M_i, F_j) = D_i \cdot e^{bt} \cdot \cos(2\pi t) + F_j$  (20)

where $D_i$ indicates the distance of the i-th moth from the j-th flame, $b$ is a constant which defines the shape of the logarithmic spiral, and $t$ is a random value within the range $[-1, 1]$. $D_i$ is calculated as follows [16]:

$D_i = |F_j - M_i|$  (21)

where $M_i$ indicates the i-th moth and $F_j$ indicates the j-th flame.

To guarantee both exploration and exploitation of the search space, moths move around the flames and are not restricted to the region between a flame and a moth, as modeled by the spiral in Eq. (20). When the next position falls outside the space between the flame and the moth, exploration occurs; when the next position lies within that space, exploitation occurs. To reach the global optimum and not be trapped in local optima, every moth must update its position according to its corresponding flame using Eq. (20), as illustrated in Fig. 1. In addition, to emphasize exploitation as the search progresses, the number of flames is decreased adaptively over the iterations [16]:

$\text{flame no} = \text{round}\left( N - l \cdot \frac{N - 1}{T} \right)$  (22)

where $l$ is the current iteration number, $N$ is the maximum number of flames and $T$ is the maximum number of iterations.

Fig. 1 The spiral flying path of Moth around light source [16]
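The moth position update in Eqs. (19)–(22) can be summarised in a few lines of code. The sketch below is a minimal, generic illustration of the spiral update and flame-number reduction; the toy sphere objective and all parameter values are invented for illustration, and it is not the authors' MATLAB implementation.

import numpy as np

# Minimal sketch of the MFO update described by Eqs. (19)-(22).
rng = np.random.default_rng(0)

def objective(x):                      # toy objective to minimise
    return np.sum(x**2)

n, d, T, b = 20, 5, 100, 1.0           # moths, dimensions, iterations, spiral shape
moths = rng.uniform(-10, 10, (n, d))   # initial moth positions (Eq. 15)
best_so_far = np.inf

for l in range(1, T + 1):
    fitness = np.array([objective(m) for m in moths])
    best_so_far = min(best_so_far, fitness.min())
    order = np.argsort(fitness)
    # (For brevity, flames here are just the current sorted moths; the full
    # MFO also keeps the best flames from previous iterations.)
    flames = moths[order]
    flame_no = int(round(n - l * (n - 1) / T))    # Eq. (22): shrink flame count

    for i in range(n):
        j = min(i, flame_no - 1)                  # surplus moths share the last flame
        D = np.abs(flames[j] - moths[i])          # Eq. (21): distance to flame
        t = rng.uniform(-1, 1, d)                 # random t in [-1, 1]
        # Eq. (20): logarithmic spiral move around the flame
        moths[i] = D * np.exp(b * t) * np.cos(2 * np.pi * t) + flames[j]

print(f"Best objective found over {T} iterations: {best_so_far:.4f}")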

3.2 Implementing MFO in Solving ORPD and ED Problems

The MFO algorithm is utilized to solve the ORPD and ED problems by obtaining the optimal control variables that minimize the objective functions while fulfilling the equality and inequality constraints. The implementation of MFO for solving the ORPD and ED problems is shown in the flow chart in Fig. 2.

Fig. 2 MFO flow chart for solving the objective function

4 Results and Discussion

To find the best optimal settings of the control variables for the OPF problem, the proposed MFO method is tested on the standard IEEE 30-bus test system. All simulations were carried out with MATLAB R2017a and the MATPOWER 6.0 software package on a personal computer with an i5 processor, 1.6 GHz, 64 bits and 8 GB RAM. In this paper, 30 search agents were selected and the maximum number of iterations was 300. Moreover, the weighting factors $w_1$ and $w_2$ are selected as 1950 and 200 respectively.

4.1 IEEE 30-Bus Systems

The bus and line data of the IEEE 30-bus test system are found in [18]. This test system is composed of six generators located at buses 1, 2, 5, 8, 11 and 13, and four tap-changing transformers located in lines 4–12, 6–9, 6–10 and 28–27. The total load power demand is 283.40 + j126.20 MVA. Moreover, the initial total real power losses and total reactive power losses are 5.6035 MW and 29.9294 MVAr respectively. Figure 3 shows the single line diagram of the IEEE 30-bus system, while Table 1 shows the limits of the control variables for the IEEE 30-bus system.
For the purpose of evaluating the performance of the proposed MFO, its optimal
results will be compared with the simulation results of other popular optimization

Fig. 3 Single line diagram of the IEEE-30 bus system [18]



Table 1 Lower and upper limits of the control variables for the IEEE 30-bus system

Control variable                   | Lower bound | Upper bound
PG1 (MW)                           | 50          | 200
PG2 (MW)                           | 20          | 80
PG5 (MW)                           | 15          | 50
PG8 (MW)                           | 10          | 35
PG11 (MW)                          | 10          | 30
PG13 (MW)                          | 12          | 40
Generator voltages (p.u.)          | 0.95        | 1.1
Transformer tap settings (p.u.)    | 0.9         | 1.1
Reactive compensator sizing (MVAr) | −10         | 10
Load voltage (p.u.)                | 0.95        | 1.05

approaches, namely ABC [9] and IGWO [9]. For a fair comparison between MFO and the chosen methods, the optimization results of these methods reported in their respective references are inserted into the MATPOWER load flow to evaluate the proposed objective function.

4.2 The Weighted-Objective Function

The three objective functions, namely transmission power loss minimization, generation cost minimization and voltage profile improvement, are combined into one single objective function using the weighting factors; this is called the weighted objective function.
Table 2 shows the results obtained by MFO versus the reported optimization methods, namely the artificial bee colony (ABC) and the Improved Grey Wolf Optimizer (IGWO). It can be clearly observed that MFO outperforms the other two methods, achieving an objective value of 967.59 $/h (a reduction of 14.50%), compared to 980.1586 $/h (13.38%) and 971.4114 $/h (14.15%) for the artificial bee colony (ABC) and the Improved Grey Wolf Optimizer (IGWO) respectively. The convergence of MFO is shown in Fig. 4.

Table 2 The obtained results of MFO for the weighted objective function

Control variables        | Initial   | ABC [9]  | IGWO [9] | MFO
Generator output (MW)
PG1                      | 99.00     | 119.338  | 123.3468 | 199.9683
PG2                      | 80.00     | 54.8327  | 50.8357  | 50.84092
PG5                      | 50.00     | 29.2442  | 30.3516  | 31.36332
PG8                      | 20.00     | 35       | 35       | 35
PG11                     | 20.00     | 30       | 28.3808  | 26.79478
PG13                     | 20.00     | 21.041   | 21.5518  | 20.56381
Generator voltage (p.u.)
VG1                      | 1.060     | 1.0268   | 1.0295   | 1.030482
VG2                      | 1.045     | 1.0156   | 1.0171   | 1.016681
VG5                      | 1.010     | 0.994    | 0.9974   | 0.999912
VG8                      | 1.010     | 0.9981   | 1.0006   | 0.999795
VG11                     | 1.082     | 1.0459   | 1.0015   | 1.029194
VG13                     | 1.071     | 1.0331   | 1.0528   | 1.001948
Transformer tap ratio (p.u.)
T4-12                    | 1.0780    | 0.98     | 1.0107   | 1.040193
T6-9                     | 1.0690    | 0.9381   | 0.975    | 1.002741
T6-10                    | 1.0320    | 1.0125   | 1.0556   | 0.953949
T28-27                   | 1.0680    | 0.9672   | 0.978    | 0.979411
Capacitor bank (MVAr)
Qc10                     | 0.0       | 1.4017   | 2.1785   | 10
Qc12                     | 0.0       | −6.1533  | −10      | −1.16987
Qc15                     | 0.0       | 3.5496   | 10       | 2.7043
Qc17                     | 0.0       | 0.5092   | 3.4209   | 1.314517
Qc20                     | 0.0       | 4.8013   | 7.7976   | 8.443245
Qc21                     | 0.0       | −3.0998  | 10       | 10
Qc23                     | 0.0       | 8.7841   | 2.256    | 3.742131
Qc24                     | 0.0       | 8.4659   | 9.8128   | 10
Qc29                     | 0.0       | 2.4237   | 3.5445   | 3.803413
Fuel cost ($/h)          | 901.3495  | 833.9610 | 831.38   | 830.1046
Power loss (MW)          | 5.6035    | 6.0396   | 6.06672  | 6.1289
Voltage deviation (p.u.) | 0.6051    | 0.1421   | 0.10867  | 0.0899
Objective function ($/h) | 1131.6336 | 980.1586 | 971.4114 | 967.59

Fig. 4 Convergence performance of MFO for Case 1 (IEEE 30-bus)

5 Conclusion

In this paper, the application of MFO to solving the OPF problem has been carried out. The three objective functions, namely fuel cost, transmission loss and voltage deviation minimization, were combined into one weighted objective function. The performance of MFO was tested on the standard IEEE 30-bus test system. From the obtained results, MFO shows a competitive result on the OPF problem compared to the other optimization techniques in the literature. The application of MFO to a multi-objective formulation is highly recommended.

Acknowledgements This work was supported by the University Malaysia Pahang (UMP) and the
Ministry of Higher Education Malaysia (MOHE) under Fundamental Research Grant
Scheme FRGS/1/2017/TK04/UMP/03/1 & RDU170129.

References

1. Ayan K, Kiliç U (2012) Artificial bee colony algorithm solution for optimal reactive power
flow. Appl Soft Comput J 12(5):1477–1482
2. Shaw B, Mukherjee V, Ghoshal SP (2014) Solution of reactive power dispatch of power
systems by an opposition-based gravitational search algorithm. Int J Electr Power Energy Syst
55:29–40
3. Sulaiman MH, Mustaffa Z, Mohamed MR, Aliman O (2015) Using the gray wolf optimizer
for solving optimal reactive power dispatch problem. Appl Soft Comput J 32:286–292

4. Khazali AH, Kalantar M (2011) Optimal reactive power dispatch based on harmony search
algorithm. Int J Electr Power Energy Syst 33(3):684–692
5. Sulaiman MH, Ing WL, Mustaffa Z, Mohamed MR (2015) Grey wolf optimizer for solving
economic dispatch problem with valve-loading effects. ARPN J Eng Appl Sci 10(21):9796–
9801
6. Sulaiman MH, Mustaffa Z, Rashid MIM, Daniyal H (2018) Economic dispatch solution using
moth-flame optimization algorithm. In: MATEC web of conferences, vol 214
7. Park J-B, Jeong Y-W, Lee W-N, Shin J-R (2008) An improved particle swarm optimization
for economic dispatch problems with non-smooth cost functions, 20(1):7
8. Chen P-H, Chang H-C (2002) Large-scale economic dispatch by genetic algorithm. IEEE
Trans Power Syst 10(4):1919–1926
9. Taha IBM, Elattar EE (2018) Optimal reactive power resources sizing for power system
operations enhancement based on improved grey wolf optimiser. IET Gener Transm Distrib
12(14):3421–3434
10. Attia AF, El Sehiemy RA, Hasanien HM (2018) Optimal power flow solution in power
systems using a novel Sine-Cosine algorithm. Int J Electr Power Energy Syst 99
(January):331–343
11. Duman S, Güvenç U, Sönmez Y, Yörükeren N (2012) Optimal power flow using gravitational
search algorithm. Energy Convers Manag 59:86–95
12. Abido MA (2002) Optimal power flow using particle swarm optimization. Int J Electr Power
Energy Syst 24(7):563–571
13. Wolpert DH, Macready WG (1997) No free lunch theorems for optimization. IEEE Trans
Evol Comput 1:67–82
14. Sulaiman MH, Ing WL, Mustaffa Z, Mohamed MR (2015) Grey wolf optimizer for solving
economic dispatch problem with valve-loading effects ARPN. J Eng Appl Sci 10(21):1619–
1628
15. Abdel-Fatah S, Ebeed M, Kamel S (2019) Optimal reactive power dispatch using modified
sine cosine algorithm. In: Proceedings of 2019 international conference on innovation trends
computer engineering, ITCE 2019, no February, pp 510–514
16. Mirjalili S (2015) Moth-flame optimization algorithm: a novel nature-inspired heuristic
paradigm. Knowl-Based Syst 89:228–249
17. Ng Shin Mei R, Sulaiman MH, Mustaffa Z, Daniyal H (2017) Optimal reactive power
dispatch solution by loss minimization using moth-flame optimization technique. Appl Soft
Comput J 59:210–222
18. Lee KY, Park YM, Ortiz JL (1985) A united approach to optimal real and reactive power
dispatch. IEEE Power Eng Rev PER-5(5):42–43
A Pilot Study on Pipeline Wall Inspection Technology Tomography

Muhammad Nuriffat Roslee, Siti Zarina Mohd. Muji, Jaysuman Pusppanathan, and Mohd. Fadzli Abd. Shaib

Abstract Malaysia is the world's third-largest exporter of liquefied natural gas and the second-largest oil and natural gas producer in Southeast Asia. This signifies that the oil and gas industry in Malaysia has evolved rapidly, and so has the use of steel pipe. Steel pipe is essential and widely used for fluid transportation, in the sense of transporting petroleum, gas, water, steam, etc. Both corrosion and blockage are the main problems in the oil and gas industry. It is reported that the main inspection technique used in Malaysia relies on radiation sources such as gamma rays or X-rays; this technique is too dangerous if extensive care is neglected. Hence, a thorough discussion of established pipe wall inspection technologies is pivotal, as they are to be applied in different situations of application or study. This paper focuses on the suitability, basic functionality, advantages and disadvantages of every established pipe wall inspection technology. Most tomography researchers, in Malaysia particularly, have used acrylic pipe as the subject for experiments with tomography hardware. Ironically, such an implementation does not entirely portray the real process of pipeline inspection as conducted by oil and gas companies. In this research, steel pipe is used to imitate the real situation of pipeline inspection. The issues raised are therefore more representative when the experiment is conducted using real steel pipe, and the outcome could help solve the industry's problem. Based on the review that has been done, a steel pipe of 203.2 mm diameter and 7.7 mm wall thickness will be used in this research to address the industrial problem. A simulation using the finite element analysis method was carried out with ultrasonic transducers as the main sensors, and it shows that the ultrasonic wave can successfully penetrate the steel pipe. In conclusion, ultrasonic sensing can be used, as the measurement results indicate that a frequency of 40 kHz with an excitation voltage of 20 V is the most suitable for operating the ultrasonic tomography system.

M. N. Roslee · S. Z. Mohd. Muji (&) · Mohd. F. Abd. Shaib
Faculty of Electrical and Electronic Engineering, Universiti Tun Hussein Onn Malaysia, Parit Raja, 86400 Batu Pahat, Johor, Malaysia
e-mail: [email protected]
M. N. Roslee: e-mail: [email protected]
Mohd. F. Abd. Shaib: e-mail: [email protected]
J. Pusppanathan
Sports Innovation & Technology Centre (SiTC), Institute of Human Centered Engineering (iHumen), Faculty of Engineering, Universiti Teknologi Malaysia, 81310 Skudai, Johor, Malaysia
e-mail: [email protected]

1 Introduction

Tomography is a technology that produces an image of the interior of a system, i.e. a process vessel or pipeline, from the measurement signals of sensors located around the desired object. In other words, tomography can be defined as a method of displaying a representation of the image of a solid through the use of any kind of penetrating wave impacting on the object. The word 'tomography' originates from the Greek words 'tomo', meaning 'to slice', and 'graph', meaning 'image' [1]. Over the years, tomography has been widely used in medical imaging, such as X-ray, CT scan, single photon emission computed tomography (SPECT) and MRI, for diagnosing disease, monitoring the effectiveness of therapy and many other purposes. However, in recent decades, tomography has evolved rapidly and has become a most beneficial technology in actual material processes, such as pipelines and vessels, and in many other fields.
The basic components of a tomography system consist of hardware, which includes the sensors and measurement circuits, software for image reconstruction, and a display unit for displaying the obtained image. There are various types of tomography, including X-ray, gamma-ray, microwave, ultrasound, optical, positron emission tomography (PET), nuclear magnetic resonance, capacitance, resistance, impedance and electrical charge [1]. Tomography also has different types of methods applied to the tomography sensor, which are attenuation, transmission, reflection, diffraction and impedance [1]. Figure 1 shows the basic tomography system and its applications. It shows that tomography has become a successful and reliable tool for industrial process tomography.
Applications of tomography include inspection, concentration monitoring, process control monitoring, flow pattern identification, and environmental and flow measurement, as stated in Fig. 1. Every type of sensor has its unique characteristics, advantages and disadvantages. This paper will emphasize and discuss the techniques of wall thickness inspection and multiphase imaging, functionally and technically.

Fig. 1 Basic tomography system and applications [1]

2 Types of Tomography Sensor

In the tomography field, different types of sensors are used to detect the desired parameters. The sensor is essential because it determines the functionality of the sensing method in the process vessel. Nowadays, non-invasive and non-intrusive operation are the features needed for a tomography system, as they ease the monitoring process. However, each type of tomography has its advantages, limitations and drawbacks; hence, the tomography sensor must be selected according to the case being studied.
Rahim and Rahiman [1] in their book mention several factors to consider when choosing tomography sensors, as follows:
(1) The molecular structure of the components contained in the pipeline, vessel, reactor, or desired material (particles, gases, liquids and mixtures).
(2) The industrial environment, such as humidity, temperature, noise, maintenance, and safety implications.
(3) The requirements such as imaging resolution, measurement speed, measurement sensitivity and temporal resolution.
(4) The size and cost of the process equipment, and also the length-scale of the case study area.
(5) The human resource requirements and any potential hazards towards personnel.
The following are the common types of tomography sensors:
(a) X-ray
(b) Gamma rays
(c) Microwave
(d) Ultrasound
(e) Optical
(f) Positron emission tomography
(g) Nuclear magnetic resonance
(h) Capacitance
(i) Impedance
(j) Electrical charge
Each of these sensors is the main part of one of the established tomography technologies discussed in the next subtopic.

2.1 Pipe Wall Inspection Technology

Malaysia is the world's third-largest exporter of liquefied natural gas, the second-largest oil and natural gas producer in Southeast Asia, and strategically located amid important routes for seaborne energy trade [2]. This signifies that the oil and gas industry in Malaysia has evolved rapidly, and so has the use of steel pipe. Steel pipe is essential and widely used for fluid transportation, in the sense of transporting petroleum, gas, water, steam, etc. [3].
However, constant pressure, high temperature, mineral deposition along the pipe wall and corrosion can lead to thinning of the pipe body, crack appearance, or even leakage of oil and gas when the condition worsens [4]. Consequently, such events can lead to huge economic losses, threaten production safety and bring disaster to the surrounding environment [5]. Hence, continuous monitoring of oil and gas pipelines is essential to improve corrosion management and achieve strategic progress in pipeline operation through corrosion prediction; furthermore, it can potentially become an important technology for production safety assurance and a developing trend of the digital oilfield.
Over the years, various technologies have been established for pipeline wall inspection. Each of them has a specific working principle, operating either invasively or non-invasively.

2.1.1 X-Ray

X-ray computed tomography was introduced to the world in the 1970s [6]. Since their discovery, X-rays have become an important tool in the fields of medical diagnosis and materials testing, and are used for many applications [7]. Conventional X-ray imaging relies on the different attenuation of X-rays in structures with high X-ray absorbance, such as bones, and in weakly absorbing parts, such as the surrounding tissue, within the examined target.

However, X-rays are not only absorbed in the object but also refracted and
scattered, producing measurable deviations from their original direction and thus
enabling the measurement of new signal components such as phase contrast and
dark-field contrast.
For instance, over the years the castings industry has used X-ray inspection to verify the structural integrity of its castings, and X-rays have similarly been used for pipeline inspection [8]. The first manually operated, off-line, film-based inspection systems have been replaced with fully automatic real-time X-ray systems able to make pass/fail decisions without operator intervention. These systems can now be integrated directly into the manufacturing process, making monitoring easier [8].
The basic principle of X-ray imaging involves the bombardment of a thick target with energetic electrons to generate the radiation. X-ray tomography then uses the ability of X-ray radiation to penetrate objects. On the way through an object, part of the impinging radiation is absorbed; the longer the radiographic path length through the object, the less radiation escapes from the opposite side.
The absorption also depends on the material. An X-ray detector (sensor) captures
the escaping X-ray radiation as a two-dimensional radiographic image. At detector
sizes of approximately 50 to 400 mm, a large portion of the measured object can be
captured in a single image as shown in Fig. 2.
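To make this attenuation principle concrete, the short sketch below evaluates the standard Beer–Lambert relation I = I0·exp(−μx); the attenuation coefficient and wall thicknesses used here are illustrative placeholders, not values measured in this study.

```python
import math

def transmitted_intensity(i0, mu, thickness_mm):
    """Beer-Lambert law: intensity remaining after passing through a material.

    i0           -- incident X-ray intensity (arbitrary units)
    mu           -- linear attenuation coefficient in 1/mm (assumed value)
    thickness_mm -- radiographic path length through the material in mm
    """
    return i0 * math.exp(-mu * thickness_mm)

# Hypothetical example: a thinner (corroded) wall lets more radiation through,
# which is what the detector image reveals.
MU_STEEL = 0.11  # 1/mm, placeholder attenuation coefficient for illustration only
print(transmitted_intensity(1.0, MU_STEEL, 7.7))   # nominal wall thickness
print(transmitted_intensity(1.0, MU_STEEL, 5.0))   # locally thinned wall
```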
Figure 3 shows the flowchart of the X-ray process starting from the beginning to
the end of the process. This is the basic structural process of x-ray tomography.
An advantage of X-ray inspection is that it can operate at higher temperatures than other established technologies [9]. Furthermore, the image reconstructed from X-ray measurements is more reliable and precisely depicts the internal state of the system [10]. X-ray is also one of the tomography techniques that offers high inspection efficiency, good economic effect and real-time problem evaluation [11].
X-ray also has a number of drawbacks. The use of ionizing radiation is very dangerous because of its hazard potential, and Brian Plonsky from the International Atomic Energy Agency (IAEA) mentioned that the NDT technology most used in Malaysia is radiography [12]. This technique is dangerous to personnel if extensive safety precautions are not handled correctly. Furthermore, the design of the system is usually very bulky and hence not physically suitable for pipeline
Fig. 2 The basic operation of X-ray [9]

Fig. 3 The flowchart of the whole process of X-ray tomography [9]

inspection. Lastly, the cost and maintenance are very high owing to the use of ionizing radiation. Therefore, a new way of assessing the pipe condition is needed, one that replaces the radiation sources with ultrasonic sensors or another tomography method.

2.1.2 Electrical Capacitance Tomography (ECT)

The measurement principle of electrical capacitance tomography depends on the permittivity of the material or media inside the pipeline [13]. Different materials have different permittivity values, and these values are used to reconstruct the image. From the reconstructed image, the user can differentiate and identify each material inside the pipe.
This system has been used in the process industries for measuring the component fraction of a multicomponent flow process. It is very useful since it operates very quickly, non-invasively, and does not use ionizing radiation. Furthermore, from an industry point of view, ionizing radiation is not favoured owing to its high cost and hazard potential. However, according to Ruzairi et al., electrical capacitance tomography has a few disadvantages: there is no simple linear relationship between the measured capacitance and the dielectric distribution, and the change in sensitivity is small [13].

2.1.3 Electrical Impedance Tomography (EIT)

EIT was first developed for medical applications in the early 1980s and was then extended to industrial processes such as process vessels. The basic principle of EIT is that a current signal is injected into one pair of electrodes while the other electrodes measure the voltage developed; this is repeated for the other pairs of current-injection electrodes [14]. This technique is mostly applied in industry for processes that use a conducting fluid to carry a desired compound from one place to another.
The advantages of EIT are its relatively low cost compared with other tomography techniques and its simpler design, which is non-invasive and non-intrusive. However, EIT is not suitable for a process that transports fluid or material containing a large amount of non-conducting solid material.

2.1.4 Magnetic Flux Leakage (MFL)

MFL was established in the 1950s and has since become one of the most commonly used tools for pipeline inspection [13]. MFL is considered the most effective and dependable on-line method among internal corrosion-detection technologies for oil and gas pipelines [15–18].
The working principle of MFL is shown in Figs. 4 and 5. In Fig. 4(a), the inspected pipe is intact, without any metal loss, and the magnetic flux passes entirely through the magnetic circuit. In Fig. 4(b), a defect (metal loss, corrosion) exists within the pipeline. The defect area has a different magnetic permeability from that of an intact steel pipe [15, 19]. Thus, the difference in magnetic flux between an intact and a defective steel pipe is pivotal, as it indicates whether the pipe is in good or defective condition.
As a result, most of the magnetic flux passes around the flaw, a small fraction passes through the defect, and some departs from the top and bottom surfaces and passes around the defect through the air [15]. This leakage flux can be acquired by sensors and stored in a computer for analysis, which can then be used to evaluate the dimensions and characteristics of the defects. Figure 5 shows the operation of the pipeline intervention gadget (PIG) inside the steel pipeline. It is an intrusive technology and hence not efficient in terms of cost and operating time.
MFL can be used to detect corrosion and leaks before pipe failure occurs. MFL has high accuracy compared with other established technologies, since it works invasively inside the pipeline, with high sensitivity and little disturbance [20]. MFL can also provide qualitative information regarding the presence of different defects located on the steel pipe [13].
However, from an industry perspective, technologies that offer a low-cost inspection process are favoured, whereas MFL inspection is costly and time consuming. In addition, it is not a non-invasive technology and thus does not meet the current industry interest, which leans towards Non-destructive Testing

(a) Pipe without metal loss.

(b) Pipe with defect.

Fig. 4 Inspection principle of MFL [15]

Fig. 5 MFL hardware component [13]



(NDT). Lastly, MFL is limited by its invasive mode of operation, because every pipeline system contains pipes of varying diameters; hence it is not suitable for pipelines whose diameter varies. MFL technology would need to become applicable to pipelines with varying diameters and curved sections, and there has been little research on MFL tools that can adjust their size to the pipe diameter to avoid becoming stuck.

2.1.5 Ultrasonic Guided Wave

Guided wave tomography, as shown in Fig. 6, is a technology that emerged in the recent decade. It is a promising method since it can be used to inspect a long range, up to 15 m [21–23]. The conventional point-by-point methods such as ECT, EIT, and X-ray imply a slow inspection process and become very expensive when full inspection coverage is needed. It is therefore useful to introduce a quick and sufficiently accurate method for the detection of corrosion. Guided wave inspection is also a non-destructive and non-intrusive technique [22].
Conceptually, guided waves are generated from the interference of two types
of waves: longitudinal and transverse waves. Longitudinal waves exist when the
movements of the particles of a medium are parallel to the propagation direction of
the waves. Transverse waves exist when the movements of the particles are per-
pendicular to the propagation direction of the waves as shown in Fig. 6 [23].
Generally, the working principle of ultrasonic guided waves is based on the measurement of the wave velocities, attenuation, and mode scattering of the received signal using fast Fourier transform algorithms [24]. In practice, ultrasonic guided-wave inspection detects and assesses the severity of defects on a steel pipe by measuring the amplitude of the waves reflected by the defect area [25]. A quantitative study of the reflection coefficient can relate the defect size and the dimensions of the pipe to the frequency at which the wave is excited [25].
Figure 7 shows an ultrasonic guided wave travelling in a medium (steel pipe) that contains a crack. In this area the ultrasonic guided wave experiences scattering or mode conversion at the discontinuities, and reflection and refraction can be expected at the defect [26, 27]. By processing the wave carrying the discontinuity information, the location of the defect area can be estimated, as shown in Fig. 7 [21, 26].

Fig. 6 Representation of
guided waves.
L = longitudinal wave.
T = transverse wave [23]

Fig. 7 Pipe model of crack defect [22]

The advantages of guided-wave technology are that the guided wave travels along the pipe without much energy attenuation and that the ultrasonic wave propagates by vibrating the particles on both the inner and outer pipe walls; thus 100% pipe-wall inspection coverage is achieved [21, 28].
There are a few drawbacks of the ultrasonic guided-wave technique: it cannot display the exact shape of the defect area of the steel pipe and, compared with other techniques, can only detect a defect by observing the amplitude of the wave [24]. Ultrasonic guided waves are not able to investigate the multiphase flow inside the pipe because the Lamb wave travels only within the pipe wall. Lastly, the effectiveness of this method is based on the assumption that the leakage-induced acoustic waves propagate along the pipeline as an individual non-dispersive guided wave with small attenuation. In reality, this assumption is not always valid because the acoustic waves are multi-modally blended signals, which consequently results in missed detection and localisation of leakage [29–31]. Recently, most studies have focused on signal-processing algorithms to increase the accuracy of the received signal.

2.1.6 Ultrasonic Tomography

In physics, sound is the product of the vibration of an object (particles), and it typically propagates as an audible wave through a transmission medium such as gas, liquid or solid. In human physiology, the range of frequencies humans can hear is between 20 Hz and 20 kHz. Sound waves below 20 Hz and above 20 kHz are not perceptible by humans and are called infrasound and ultrasound, respectively, as shown in Fig. 8.
Tomography is a real-time imaging technique that has come to dominate the oil and gas industry in recent years [33]. Generally, the basic principle of tomography is to produce a density image by exposing the material to sound waves or any other physical stimulus able to penetrate the material, and measuring the response of the object.

Fig. 8 The sound range frequency [32]

By using computers and mathematical models, the internal image of the system can be constructed [34–36].
(i) Waves
To understand the capability of the ultrasonic sensor to penetrate the material, the characteristics of the waves that travel inside the material should first be studied. There are four types of ultrasonic waves: longitudinal waves, shear waves, Rayleigh waves and Lamb waves [37, 38]. All these waves are shown in Figs. 9, 10, 11 and 12.
Figure 9 shows the longitudinal wave (compression wave), which is the type of wave humans can hear; it is used for testing the front end of the pipe body structure and also for testing the integrity of pipe plate [37]. Longitudinal waves move inside a material through compression and rarefaction of the particles of the medium. Figure 10 shows the shear wave (transverse wave), which propagates more slowly and with a shorter wavelength than the longitudinal wave [39]. It is commonly used for detecting discontinuities in both the inner and outer layers of the pipe.
Figure 11 shows the Rayleigh wave, which travels only along the surface of the material at a velocity equal to that of the shear wave [37]. Figure 12 shows the ultrasonic Lamb wave (plate wave), which vibrates from the upper to the lower surface of the material. Lamb waves can be used to detect the location and extent of discontinuities in a metal pipe.

Fig. 9 Graphical depiction of parallel motion response of particles of longitudinal ultrasonic waves [37]

Fig. 10 Graphical depiction of perpendicular motion response of particles of shear ultrasonic waves [37]

Fig. 11 Graphical depiction of limited detection area of Rayleigh waves, confined on the surface
of material [37]

From the discussion above, it is clear that Rayleigh waves are not suitable for detecting cracks or corrosion inside the steel pipeline, because the wave travels only on the surface of the pipe; any crack or corrosion located beneath the surface cannot be detected.
Longitudinal, shear and Lamb waves are the modes most widely used for ultrasonic testing of pipelines [40, 41]. From Figs. 9, 10 and 11, it can be seen that these waves travel throughout the entire medium (steel pipe), so any crack or corrosion located inside the pipe medium can be detected by measuring the received signal.

Fig. 12 Graphical depiction of ultrasonic Lamb waves (plate wave) [28, 37]

3 Result and Analysis

3.1 Frequency Selection

Frequency selection is one of the major factors that determine the success of the ultrasonic tomography sensor, because the right choice of frequency allows the wave to penetrate the internal system so that the internal image of the system can be constructed and analysed.
The higher the frequency of the ultrasonic wave, the faster the wave decays and the shorter its wavelength [42]; hence it cannot travel as far within the pipe material [33]. This follows from the speed-of-sound equation below.
can be proved form the equation speed of sound below.

$c = f\lambda$

where:
c = speed of sound
f = frequency of the sound wave
λ = wavelength of the sound wave
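As a rough illustration of this relation, the sketch below evaluates c = fλ for a few excitation frequencies; the longitudinal sound speed in steel used here (about 5900 m/s) is a typical textbook value assumed for illustration, not a property measured for this pipe.

```python
# Minimal sketch of c = f * lambda, assuming a typical longitudinal sound speed
# in steel of about 5900 m/s (an assumed textbook value, not measured here).
C_STEEL = 5900.0  # m/s

def wavelength(frequency_hz, speed=C_STEEL):
    """Return the wavelength in metres for a given excitation frequency."""
    return speed / frequency_hz

# Higher frequency -> shorter wavelength, hence shorter useful travel distance.
for f in (40e3, 400e3, 2e6):
    print(f"{f/1e3:>7.0f} kHz -> {wavelength(f)*1000:.1f} mm")
```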
However, steel pipe is commonly used in the oil and gas industry; thus, the appropriate ultrasonic sensor should be chosen with regard to its performance on steel. This is because, when an ultrasonic sensor is applied to a steel pipe, the Lamb wave becomes more pronounced and disrupts the reading of the received signal [40, 43]. The Lamb wave does not provide any information about disturbances or obstructions inside the pipe because it propagates only within the pipe wall [44, 45].
Abbaszadeh et al. ran simulation tests using finite element analysis to find the suitable frequency for a steel pipe with minimum disturbance from the Lamb wave (noise), and reported that 40 kHz is the optimum frequency for the steel pipe [43]. However, Nordin et al. stated that the selected frequency should

be high, around 390 kHz, to reduce the percentage of Lamb wave propagation. Afterwards, the frequency range from 40 to 490 kHz should be tested using finite element analysis to obtain a better, optimum frequency for image reconstruction [46]. Thus, it is crucial to balance these trade-offs in developing an ultrasonic tomography system by considering the optimum frequency of the system.
This paper presents a pilot study on established pipeline wall inspection technologies and proposes the use of ultrasonic tomography, because no researcher has yet applied ultrasonic tomography to a steel pipe with an outer diameter of 203.3 mm and a thickness of 7.7 mm. A simulation using the finite element analysis method was carried out with ultrasonic devices as the main sensors, and it shows that the ultrasonic wave can successfully penetrate the steel pipe. It proves that a usable measurement can be obtained when a suitable frequency and voltage are applied to operate the ultrasonic tomography system. This method will be applied in the next study on how the ultrasonic sensor reacts to a pipeline fully filled with oil that has a clogging issue.
Table 1 shows the sound pressure level at four different voltages, namely 5, 10, 20 and 24 V, resonating at different frequencies in the range of 40 kHz to 2 MHz, applied to the steel pipe with a diameter of 203.3 mm and a thickness of 7.7 mm. The sound pressure level changes significantly whenever the frequency and, especially, the voltage change. This can be seen in Fig. 13, where 20 V gives the highest sound pressure level compared with the other voltages; thus it can be concluded that 20 V is the most stable and most suitable excitation for the ultrasonic sound wave to penetrate the steel pipe from transmitter to receiver.

Table 1 The sound pressure level for 5, 10, 20 and 24 V

Frequency (Hz)   Sound pressure level
                 5 V        10 V       20 V       24 V
40 k             −720.7     −714.68    −708.66    −707.07
80 k             −732.74    −726.72    −720.7     −719.11
120 k            −939.78    −733.76    −727.74    −726.16
160 k            −744.78    −738.76    −732.74    −731.16
200 k            −748.66    −742.64    −736.62    −735.03
240 k            −751.82    −745.8     −739.78    −738.2
280 k            −754.5     −748.48    −742.46    −740.88
320 k            −756.82    −750.8     −744.78    −743.2
360 k            −758.87    −752.85    −746.83    −745.34
400 k            −760.7     −754.68    −748.66    −747.07
440 k            −762.35    −756.33    −750.31    −748.73
480 k            −763.87    −757.87    −751.82    −750.24
1 M              −776.62    −770.6     −764.57    −762.99
2 M              −788.66    −782.64    −776.62    −775.03

Fig. 13 The graph between frequency and sound pressure level for 5, 10, 20 and 24 V

3.2 Transducer Projection

A crucial aspect of ultrasonic tomography is the selection of the transducer transmission mode and transmission beam. This is to ensure that a large portion of the region of interest is illuminated with the ultrasonic wave, so that a better image can be reconstructed and more information about the internal system of the pipeline is gained.
Ayob et al. stated that a narrow beam does not gain as much information about the internal system as a wide transmission beam. This is a contrasting requirement to medical ultrasonic sensors, which need a very narrow beam for fine lateral resolution. Hence, a very wide beam is necessary so that the information gained is more reliable and precise.
In the illustration of the ultrasonic sensors mounted on the steel pipe, there are five significant interactions of the ultrasonic sound wave with different boundaries.

Fig. 14 The electronic transducers mounted on the surface of steel pipe



Fig. 15 The electronic transducer with divergence angle of 125° and array estimation

Firstly, the ultrasonic transducer meets the couplant (lithium grease); then the couplant (lithium grease) meets the steel pipe; after that, the steel pipe meets the liquid (hydrocarbon); then the steel pipe meets corrosion; and lastly, the liquid meets sand or mud. Before the actual construction begins, the coefficients of transmitted and reflected sound energy at these boundaries must be known theoretically to ensure the capability and successful operation of the technology. In addition, ultrasonic technology has evolved in the recent decade with the development of dual-function sensors that can act as either transmitter or receiver. In this case, 16 transducers are used. Figure 14 shows the drawing of the transducers mounted on the surface of the steel pipe in the finite element analysis, with a pipe diameter of 203.2 mm (Fig. 15).
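As an illustration of how such coefficients can be estimated, the sketch below applies the standard normal-incidence plane-wave relation based on acoustic impedances; the impedance values are assumed, rounded textbook figures used only for demonstration, not measurements from this work.

```python
def energy_coefficients(z1, z2):
    """Normal-incidence energy reflection/transmission at a boundary.

    z1, z2 -- acoustic impedances (density * sound speed) of the two media.
    Returns (reflected_fraction, transmitted_fraction); they sum to 1.
    """
    r = ((z2 - z1) / (z2 + z1)) ** 2
    return r, 1.0 - r

# Illustrative impedance values in MRayl (assumed, for demonstration only)
Z_GREASE = 1.5   # couplant (lithium grease)
Z_STEEL = 45.0   # steel pipe wall
Z_OIL = 1.3      # hydrocarbon liquid

for name, (za, zb) in {"couplant->steel": (Z_GREASE, Z_STEEL),
                       "steel->oil": (Z_STEEL, Z_OIL)}.items():
    refl, trans = energy_coefficients(za, zb)
    print(f"{name}: reflected {refl:.2f}, transmitted {trans:.2f}")
```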

4 Conclusion

This study has considered the pre-existing and competing technologies, each of which has unique characteristics, advantages and drawbacks. EIT and ECT have low spatial resolution compared with ultrasonic and X-ray inspection technology. This indicates that EIT covers a small area of the targeted object and hence has low efficiency and accuracy for a long pipeline system. X-ray has the deepest penetration of the technologies considered; however, the use of radioactive material is not favoured since it requires extensive care, and prolonged exposure to ionizing radiation is hazardous. Ultrasonic tomography therefore offers a better alternative, with better spatial resolution and no radiation source. MFL is one of the established technologies that work invasively. MFL is not suitable because every pipeline system contains pipes of varying diameters; hence it is not suitable for

pipelines whose diameter varies; to be applicable, MFL technology would need to handle pipelines with varying diameters and curved sections. Ultrasonic tomography, in contrast, has an adjustable sensor rig that can fit any pipeline diameter since it works non-invasively. The effectiveness of the ultrasonic guided wave is based on the assumption that the leakage-induced acoustic waves propagate along the pipeline as an individual non-dispersive guided wave with small attenuation; in reality, this assumption is not always valid because the acoustic waves are multi-modally blended signals, which results in missed detection and localisation of leakage. Ultrasonic tomography is the research interest of this paper compared with the other tomography techniques in the oil and gas industry, and thus it should be given extra focus. Furthermore, ultrasonic tomography inspection has proved to be the most beneficial and practical choice: low cost, less operational time, reliable, free of radioactive material and, above all, a design that is physically suitable for industrial usage. Among ultrasonic tomography technologies, the choice of instrument depends on the situation in which it is applied. In this research a metal pipe is used, hence ultrasonic tomography is applied with a suitable frequency of 40 kHz to minimise the effect of the Lamb wave on the solid material. The longitudinal mode shows that the waves travel throughout the entire medium (steel pipe), so any crack or corrosion located inside the pipe medium can be detected by measuring the received signal. Further study will be carried out on the suitability of ultrasonic tomography for detecting cracks and corrosion located on the pipeline.

Acknowledgements The authors would like to acknowledge the support from grants FRGS
K074, MDR H499, UTMSHINE 09G18, TDR 06G17 and CRG 05G04.

References

1. Rahim RA, Rahiman MHF (2012) Ultrasonic tomography: non-invasive techniques for flow
measurement. Penerbit UTM Press Universiti Teknologi Malaysia, Johor
2. United State Energy Information Administration (2017) Country analysis brief: Malaysia.
Independent Statistic and Analysis, 26 April 2017, pp 1–23
3. Zhao J, Yang S, Li Y, Wang X (2010) Study on detection of industrial pipe network by
high-frequency ultrasound. In: Proceeding of the 2010 symposium on piezoelectricity,
acoustic wave and device application, pp 517–519
4. Yang B, Li Q, Li M, Lu Y (2012) Ultrasonic monitoring system for oil and gas pipeline
corrosion. In: 2012 fourth international conference on multimedia information networking and
security, pp 381–383
5. Qingling Y, Xuan CH, Jun Z (2009) Corrosion control technology for underground pipelines
in oil and gas station. In: Corrosion and protection in petrochemical industry, pp 16–19
6. Morton E, Mann K, Berman A, Knaup M, Kachelrieb M (2009) Ultrafast 3D reconstruction
for X-ray real-time tomography (RTT). In: 2009 IEEE nuclear science symposium conference
record, pp 4077–4088
7. Seyyedi S, Wieczorek M, Pfeiffer F, Lasser T (2018) Incorporating a noise reduction
technique into X-ray tensor tomography. IEEE Trans Comput Imaging 4(1):137–146

8. Chen W, Miao Z, Ming D (2011) Automated inspection using X-ray imaging. In: 2011
international joint conference of IEEE TrustCom-11/IEEE ICESS-11/FCST-11,
pp 1769–1772
9. Onel Y, Emert U, Willems P (2000) Radiographic wall thickness measurement of pipes by a
new tomography algorithms. In: 15th world conference on nondestructive testing, Roma, Italy
10. Maher KP, Edyvean S (2001) Diagnostic radiology physics: a handbook for teachers and
students. International Atomic Energy Agency, United State America
11. Deng Z, Xu F, Zhang X, Chen H (2004) The development of X-ray inspection real time
imaging pipeline robot. In: Proceedings of the 5th world congress on intelligent control and
automation, pp 4846–4850
12. Plonsky B (2015) Non-destructive testing helps Malaysia’s competitiveness. International
Atomic Energy Agency (IAEA), 25 September 2015. https://www.iaea.org/newscenter/news/
non-destructive-testing-helps-malaysias-competitiveness. Accessed 19 June 2019
13. Rostron P (2018) Critical review of pipeline scale measurement technologies. Indian J Sci
Technol 11(17):1–18
14. Qin X, Ji C, Wang Z, Wang P (2018) Reconstruction and simulation of fluid flow pattern in
pipeline based on electrical impedance tomography algorithm. In: 2018 international
symposium on computer, consumer and control (IS3C), pp 262–265
15. LiYing S, LiBo S, LingGe L (2012) Comparison of magnetic flux leakage (MFL) and
acoustic emission (AE) techniques in corrosion inspection for pressure pipeline. In:
Proceedings of the 31st Chinese control conference, pp 5375–5378
16. Kim HM, Yoo HR, Park S (2018) A new design of MFL sensors for self-driving NDT robot
to avoid getting stuck in curved underground pipelines. IEEE Trans Mag 54(11):1–5
17. Zhang Z, Udpa L, Udpa SS, Sun Y, Si J (1996) An equivalent linear model for magnetostatic
nondestructive evaluation. IEEE Trans Magn 32(3):718–721
18. Katragadda G, Lord W, Sun YS, Udpa S, Udpa L (1996) Alternative magnetic flux leakage
modalities for pipeline inspection. IEEE Trans Magn 32(3):1581–1584
19. Lu S, Feng J, Li F, Liu J, Zhang H (2017) Extracting defect signal from the MFL signal of
seamless pipeline. In: 2017 29th Chinese control and decision conference (CCDC),
pp 5209–5212
20. Rahman NA, Rahim R, Ling CP, Pusppanathan J, Rahiman MHF (2015) A review of
ultrasonic tomography for monitoring the corrosion of steel pipes. Jurnal Teknologi
73:151–158
21. Sun L, Li Y, Jin S (2006) Study on guided ultrasonic waves propagating along the pipes with
fluid loading. In: Proceeding on the 6th world congress on intelligent control
22. Yang H, Wang C (2011) Study on simulation of non-destructive testing for pipeline defects
by ultrasonic guided waves. In: 2011 cross strait quad-regional radio science and wireless
technology conference, pp 238–242
23. Silva J, Wanzeller MG, Farias PA, Neto SR (2008) Development of circuit excitation
and reception in ultrasonic transducers for generations of guided waves in hollow cylinders
for fouling detection. IEEE Trans Instrum Meas 57:1149–1153
24. Lyutak I (2005) Wavelet analysis of ultrasonic guided waves in pipeline inspection. In: IEEE
workshop on intelligent data acquisition and advanced computing systems: technology and
applications, pp 571–523
25. Dance DR, Christifides S, Maidment ADA, Mclean ID, Ng HK (2014) Diagnostic radiology
physics. International Atomic Energy Agency, Vienna
26. Li Y, Yang JH, Qiu CC, Yang JS, Shi XS, Wang FB (2017) Shear circumferential guided
waves in coated gas pipeline. In: 2017 symposium on piezoelectricity, acoustic waves, and
device applications, pp 481–485
27. Lowe PS, Sanderson RM, Boulgouris NV, Haig AG, Balachandran W (2016) Inspection of
cylindrical structures using the first longitudinal guided wave mode in isolation for higher
flaw sensitivity. IEEE Sens J 16(3):706–715

28. Huthwaite P, Ribichini R, Carley P, Lowe MJS (2013) Mode selection for corrosion detection
in pipes and vessels via guided waves tomography. IEEE Trans Ultrason Ferroelectr Freq
Control 60:1165–1177
29. Li S, Wen Y, Li P, Yang J, Wen J (2014) Modal analysis of leakage-induced acoustic
vibration in different directions for leak detection and location in fluid-filled pipelines. In:
2014 IEEE international ultrasonic symposium proceedings, pp 1412–1415
30. Wilcox PD, Lowe M, Cawley P (2001) The effect of dispersion on long-range inspection
using ultrasonic guided waves. NDT E Int 34:1–9
31. Long R, Lowe M, Cawley P (2001) Attenuation characteristics of the fundamental modes that
propagate in buried iron water pipes, vol 109, pp 1841–1847
32. Niaz A, Moshin F, Kaleem U, Kashif R, Afia SA, Ishaq BI (2009) Ultrasonic in Wet
Processing. Pak Text J 50–57
33. Shaib MFA, Rahim RA, Muji SZM (2017) The development of non-invasive ultrasonic
measuring system for monitoring multiphase flow liquid media within composite pipeline.
Int J Electr Comput Eng (IJECE) 7(6):3076–3087
34. Goh CL, Rahim RA, Tee ZC (2017) Investigation into slow scan front-end control of a
transmission mode ultrasonic system. IEEE Sens J 17(16):5136–5142
35. Rahim RA, Nyap NW, Rahiman MHF, San CK (2007) Determination of water and oil flow
composition using ultrasonic tomography. ELEKTRIKA 9(1):19–23
36. Ayob NMN, Yacoob S, Zakaria Z, Rahiman MHF, Manan MR (2010) Improving gas
component detection of an ultrasonic tomography system for monitoring liquid/gas flow. In:
2010 6th international colloquium on signal processing & its applications (CSPA), pp 278–
282
37. Alobaidi WM, Alkuam EA, Rizzoa HM, Sanguan E (2015) Applications of ultrasonic
techniques in oil and gas pipeline industries. Am J Oper Res 5:274–287
38. Gachagan A, McNab A, Reynolds P (2004) Analysis of ultrasonic wave propagation in
metallic pipe structure using finite element modelling techniques. In: 2004 IEEE international
ultrasonics, ferroelectric, and frequency, pp 938–941
39. Shivaprasad S, Balasubramaniam K, Kanna KC, Bhattachay S, Singh SP (2013) Multi-mode
tandem ultrasonic technique for tube inspection. In: 2013 joint UFFC, EFTF and PFM
symposium, pp 1307–1310
40. Abbaszadeh J, Rahim HA, Rahim RA, Sarafi S, Ayob MN, Faramarzi M (2013) Design
procedure of ultrasonic tomography system with steel pipe conveyor. Sens Actuators, A
203:215–224
41. Na J (2008) Design, fabrication, and characterization of single-element interdigital transducer
for NDT applications. Sens Actuator A 148:359–365
42. Ayob NMN, Rahiman MHF, Zakaria Z, Yaacob S (2010) Detection of small gas bubble using
ultrasonic transmission-mode tomography system. In: 2010 IEEE symposium on industrial
electronics and applications (ISIEA 2010), pp 165–169
43. Abbaszadeh J, Rahim HA, Rahim RA (2012) Optimizing the frequency of ultrasonic
tomography system with a metal pipe conveyor. In: 2012 IEEE 8th international colloquium
on signal processing and its applications, pp 52–56
44. Rahim RA, Rahiman MHF, Nyap NG, San CK (2004) On monitoring of liquid/gas using
ultrasonic tomography. Jurnal Teknologi 40:77–88
45. Gan TH, Hutchins DA, Billsum DR, Wong FC (2000) Ultrasonic tomography imaging of an
encased highly-attenuating solid media. In: 2000 IEEE ultrasonic symposium, pp 823–826
46. Nordin N, Idoras M, Zakaria Z, Ibrahim MN (2014) Tomography image reconstruction of
monitoring flaws on gas pipeline base on reverse ultrasonic tomography. In: 2014 IEEE 5th
international conference on intelligent and advanced system (ICIAS), pp 1–6
Weighted-Sum Extended Bat Algorithm
Based PD Controller Design
for Wheeled Mobile Robot

Nur Aisyah Syafinaz Suarin, Dwi Pebrianti, Nurnajmin Qasrina Ann,


and Luhur Bayuaji

Abstract The PID controller of a WMR needs to be tuned as precisely as possible in order to develop a good WMR performance, able to move from an initial position to a desired position with fast time response and minimum steady-state error.
The Weighted-sum Extended Bat Algorithm (WS-EBA) is a multi-objective optimization method based on the Extended Bat Algorithm. The weighted-sum optimization approach is used to search for the optimum values of the Proportional-Integral-Derivative (PID) controller gains of the Wheeled Mobile Robot (WMR) with respect to the x and y positions. Several experiments are conducted to test the effect of several control variables on the PID gain values and the WMR performance. Those parameters are the type of PID controller, the number of agents in WS-EBA and the optimization functions used in the system to search for the optimum PID
gains. Results obtained from this research study indicate that a PD controller, 30 searching agents and ITAE as the objective function give the most suitable controller for the WMR, with results for the X position of 11.00 s, 20.08 s and 0.00% for rise time, settling time and overshoot respectively. Additionally, for the Y position, the results are 12.11 s, 22.08 s and 0.00% for rise time, settling time and overshoot respectively. The comparison of WS-EBA with Weighted-sum Particle Swarm Optimization (WS-PSO) and Weighted-sum Bat Algorithm (WS-BA) is also carried out in this research. WS-EBA outperformed the others, giving the best WMR performance, the most consistent solutions, the fastest convergence rate and the best balance between the exploration and exploitation phases.

N. A. S. Suarin  D. Pebrianti (&)  N. Q. Ann


Faculty of Electrical and Electronic Engineering, Universiti Malaysia Pahang (UMP),
26000 Pekan, Pahang, Malaysia
e-mail: [email protected]
L. Bayuaji
Faculty of Computer Science and Software Engineering, Universiti Malaysia Pahang (UMP),
26500 Gambang, Pahang, Malaysia
D. Pebrianti  L. Bayuaji
Magister of Computer Science, Universitas Budi Luhur, Jakarta 12260, Indonesia

© Springer Nature Singapore Pte Ltd. 2021 241


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_17


Keywords Weighted-sum Extended Bat Algorithm · Wheeled mobile robot · Proportional-Integral-Derivative (PID) controller

1 Introduction

The Wheeled Mobile Robot (WMR) has gained increasing popularity due to its ability and flexibility to move freely using its wheels and its potential to be applied in numerous applications, such as lifting and moving heavy, static objects. In order to reach a predefined goal or desired location, the WMR has to be equipped with a good controller. Fast response, minimum settling time and minimum overshoot are the criteria that determine the performance of the controller on the WMR.
There are plenty of controllers available nowadays, e.g. the Proportional-Integral-Derivative (PID) controller, path planning, the fuzzy logic controller and the simplest controller, which is the on-off controller [1–3]. A simple solution is preferable to a complex one for solving the problem. However, the simplest controller, the on-off controller, has an oscillating behaviour which limits its usage. The ultimate aim of the controller is to maintain zero error or minimum steady-state error, which is the difference between the process output and the desired output.
The Proportional-Integral-Derivative (PID) controller is well known for its simple structure and ability to produce a robust performance for the system. It has been applied to many applications, such as machine systems [4], flood control and mobile robot control [5–7]. PID is a basic controller which consists of three unfixed gain variables. The gain that gives an output proportional to the current error is known as the P controller. When applied alone, the P controller tends to produce a steady-state error and needs a manual reset [8]. The Integral or I controller counteracts the limitation of the P gain by eliminating the steady-state error. However, as the value of the I gain increases, the response speed decreases. Last but not least is the Derivative or D controller. The D gain adds flexibility and helps the system react when there is a change in the set point; in effect, the future response can be anticipated by applying the D controller. The lag in the system response caused by the I gain can be compensated by applying the D gain. However, the combination of gains used as a controller depends on the suitability of and performance required by the system; the trade-off is between accuracy, speed and robustness [9].
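To make the roles of the three gains concrete, a minimal discrete-time PID sketch is given below; the sampling time and gain values are arbitrary illustrations and are not the tuned gains reported later in this paper.

```python
class PID:
    """Minimal discrete PID controller: u = Kp*e + Ki*integral(e) + Kd*de/dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error):
        self.integral += error * self.dt                    # I term accumulates past error
        derivative = (error - self.prev_error) / self.dt    # D term reacts to change
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Illustrative usage with arbitrary gains (not the tuned values of this study);
# setting ki = 0 turns this into the PD controller discussed later.
controller = PID(kp=0.2, ki=0.0, kd=0.2, dt=0.1)
print(controller.update(error=1.0))
```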
The gains of the PID controller are not fixed and need to be tuned to suit the system and process. There are several well-known methods. The most basic is the trial-and-error method with manual tuning [9, 10]; it is the simplest method, but it is not systematic and is time consuming. Next is the Ziegler-Nichols method. Two additional constants need to be determined when using this method, namely the ultimate gain for oscillation and the period of oscillation [6]. The lengthy procedure and the additional constants make the controller more complicated and hard to tune to the optimum value. Last but not least, the PID gains can be tuned by using mathematical and metaheuristic

optimization methods. Particle Swarm Optimization (PSO) was applied to tune a PD controller [5, 11, 12], Simulated Annealing was applied to tune a load frequency controller [13] and a genetic algorithm (GA) was applied to a unicycle-type mobile robot [7, 9]. Swarm intelligence is one group of metaheuristic algorithms inspired by the swarm behaviour of living things. There is an abundance of optimization methods available because, according to the No Free Lunch (NFL) theorem, no single optimization method is able to solve all problems. Thus new optimization methods are still rapidly being developed.
The Extended Bat Algorithm (EBA) is a hybrid optimization method developed from the Bat Algorithm (BA) and the Spiral Dynamic Algorithm (SDA). It was proposed in [14] and, as recorded in published papers, has never been applied to the problem of optimizing a PID controller. EBA improves the searching method by searching in a spiral according to SDA while keeping the agent movement of BA. This improves the result by avoiding traps in local minima and speeds up convergence to the best solution. The PID controller needs to be tuned to the optimum values, which give the best output with minimum steady-state error, minimum overshoot and fast response of the system. This matters because the WMR is a mobile robot which keeps moving towards the desired position.
Weighted sum is a multi-objective optimization approach used to find solutions for multiple objectives in one system. It is the simplest way to optimize multiple objectives because of the linearization applied in the approach: the multiple objective functions are combined into a single objective function by applying a weight to each objective function. The Weighted-sum Extended Bat Algorithm (WS-EBA) is a multi-objective optimization that solves the problem based on the EBA approach. WS-EBA is chosen in this research study because there are two objective functions of the WMR to minimize, i.e. the error of the x position and of the y position of the WMR.
Apart from the ultimate aim, which is to tune the gains of the PID controller, there are several variables that need to be taken into consideration. These variables have been reported to influence the results of the controller when tuning with a metaheuristic optimization method. The variables are the number of agents used in the optimization method, the objective function and the hyperparameter tuning of the metaheuristic algorithm. Each algorithm has a different type and number of hyperparameters. EBA involves loudness, pulse rate, spiral radius and spiral angle, while the hyperparameters of the particle swarm optimization (PSO) algorithm are the cognitive component, social component and inertia weight. The hyperparameters are important for determining the local and global searching of the agents and for controlling the exploration and exploitation phases of all the agents.
Thus, in this paper, the PID controller, which is the most popular and simple controller yet can produce good performance, is discussed and investigated for application to the WMR. The paper is organized as follows. Section 2 presents the experimental design and methods, while in Sect. 3 the simulation results and performance comparison are discussed. Lastly, the conclusion is drawn in Sect. 4.

2 Methodology and Experimental Setup

2.1 Closed Loop System for WMR and PID Controller

Figure 1 shows the wheeled mobile robot used in this study, which is the mBot wheeled mobile robot. The most important parameters of the mBot adopted in the kinematic model are the length L, the distance between the two wheels, and the radius r of the wheel. The control system is fully developed from the kinematic model and the equations derived in Fig. 1 and Eqs. (1) to (8). This research is a simulation-based study; thus converting the real mobile robot into a control system whose performance can be measured is necessary. The input of the control system is the desired position (x and y position) that the WMR needs to reach, and the output is the current position of the WMR. A closed-loop control system of the WMR is designed as in Fig. 2. The objective of this system is to minimize or eliminate the error, which is the difference between the desired position and the current position.
$\dot{x} = \frac{r}{2}(\omega_R + \omega_L)\cos\theta$  (1)

$\dot{y} = \frac{r}{2}(\omega_R + \omega_L)\sin\theta$  (2)

$\dot{\theta} = \frac{r}{L}(\omega_R - \omega_L)$  (3)

r is the radius of the mBot's wheel, ω_R is the right wheel angular velocity, ω_L is the left wheel angular velocity and L is the distance between the mBot's wheels.

Fig. 1 Development of
kinematic model of mBot
(WMR)

In order to develop the kinematic model, a single upright wheel on the plane is described by Eqs. (4) to (6).

$\dot{x} = v\cos\theta$  (4)

$\dot{y} = v\sin\theta$  (5)

$\dot{\theta} = \omega$  (6)

By rearranging Eqs. (1) to (6), we get:

$\omega_R = \frac{2v + \omega L}{2r}$  (7)

$\omega_L = \frac{2v - \omega L}{2r}$  (8)

where ω_R is the angular velocity of the right wheel of the mBot and ω_L is the angular velocity of the left wheel of the mBot. The outputs ω_R and ω_L are formed from the inputs v and ω. A constant of 0.113 m is used in the model; this constant is the distance L between the two mBot wheels, and r refers to the radius of the mBot's wheel, which is 0.03 m. The kinematic model has been verified by Dwi Pebrianti et al. by conducting an experiment to compare the developed kinematic model with the actual mBot robot performance [15]. The accuracy of the developed kinematic model is 85%.
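A minimal simulation sketch of this kinematic model, following Eqs. (1) to (8) and using the L = 0.113 m and r = 0.03 m constants above, is given below; the commanded v and ω values and the Euler integration step are arbitrary choices for illustration.

```python
import math

L = 0.113  # distance between the mBot's wheels (m)
R = 0.03   # wheel radius (m)

def wheel_speeds(v, omega):
    """Eqs. (7)-(8): convert body velocities (v, omega) to wheel angular rates."""
    omega_r = (2 * v + omega * L) / (2 * R)
    omega_l = (2 * v - omega * L) / (2 * R)
    return omega_r, omega_l

def step(x, y, theta, omega_r, omega_l, dt):
    """Eqs. (1)-(3): integrate the pose one time step with Euler's method."""
    x += (R / 2) * (omega_r + omega_l) * math.cos(theta) * dt
    y += (R / 2) * (omega_r + omega_l) * math.sin(theta) * dt
    theta += (R / L) * (omega_r - omega_l) * dt
    return x, y, theta

# Example: drive forward at 0.1 m/s while turning at 0.2 rad/s for 1 s
pose = (0.0, 0.0, 0.0)
wr, wl = wheel_speeds(0.1, 0.2)
for _ in range(10):
    pose = step(*pose, wr, wl, dt=0.1)
print(pose)
```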
The Proportional-Integral-Derivative (PID) controller in the system aims to reduce the errors that are fed back into the system. The Weighted-sum Extended Bat Algorithm (WS-EBA) is a multi-objective hybrid metaheuristic algorithm. As can be clearly seen in Fig. 2, there are two errors: the first error is for the x position and the second error is for the y position. Thus, a single-objective optimization technique is not able to minimise both errors by tuning the PID controller, and multi-objective optimization needs to be

Fig. 2 Closed loop control system, controller (PID), plant (kinematic model of WMR), input (x
and y desired position), output (x and y current position)

applied in the system. Weighted sum is the simplest multi-objective approach, as it linearizes the objectives into one function; the equation is stated in Eq. (9).

$f_T = (W_1 \times f_1) + (W_2 \times f_2)$  (9)

where f_T is the total fitness, W_1 is the weight for the first fitness function, W_2 is the weight for the second fitness function, f_1 is the fitness from the first objective function and f_2 is the fitness from the second objective function.
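A minimal sketch of Eq. (9) is given below, assuming equal weights for the two position-error objectives purely for illustration; the actual weights used in the study are not restated here.

```python
def weighted_sum(f1, f2, w1=0.5, w2=0.5):
    """Eq. (9): collapse two objective values into a single scalar fitness.

    f1, f2 -- fitness of the x-position and y-position error objectives
    w1, w2 -- weights (equal weighting assumed here for illustration)
    """
    return w1 * f1 + w2 * f2

print(weighted_sum(f1=0.8, f2=1.2))  # -> 1.0
```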
It is important to have a robust PID controller which can reduce and eliminate the errors in a short time and produce a stable performance of the WMR. The PID controller is a classical, well-known controller owing to its simple structure, convenient debugging, strong adaptability and wide use in many systems. However, the challenge for this controller is to tune the gains to the optimum values so that the best performance of the system can be produced.

2.2 Weighted-Sum Extended Bat Algorithm

The Extended Bat Algorithm (EBA) is a low-level metaheuristic hybridization of the original Bat Algorithm (BA) and the Spiral Dynamic Algorithm (SDA). The hybridization is known as low level because the hybrid involves only one part of the Bat Algorithm, which is the exploration part. The original Bat Algorithm updates the position of the agent using Eq. (10), while SDA updates the position using Eq. (11). In EBA, however, the position update is a combination of BA and SDA, as stated in Eq. (12).

$x_i^t = x_i^{t-1} + v_i^t$  (10)

$x_i^{t+1} = rR(\theta)x_i^t - (rR(\theta) - I_n)x^*$  (11)

$x_i^t = \left[ rR(\theta)x_i^t - (rR(\theta) - I_n)x^* \right] + v_i^t$  (12)

where x is the agent position, v is the velocity of the agent, r is the step rate between x(t) and x* per iteration, θ is the rotation angle in [−π, π], R(θ) is the composite rotation matrix, i is the index of the agent, t is the iteration number, I_n is the identity matrix and x* is the position of the best agent.
The combination of both algorithms is expected to perform well and to improve the performance of the original BA. This is because the performance of an optimization method depends on its ability to balance the exploration and exploitation phases. By applying the SDA searching method, the exploration phase of BA can be improved. The only part taken from SDA is the update of the x value; the remaining algorithm is from BA. Figure 3 shows the flowchart of WS-EBA tuning the PID controller of the WMR, and a small sketch of the position update is given below.
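The sketch below illustrates the position update of Eq. (12) for a two-dimensional search space, using the spiral radius and angle listed in Table 1; the BA velocity update and the loudness/pulse-rate logic are omitted, so this is only a partial illustration of one EBA step, not the full algorithm.

```python
import numpy as np

def eba_position_update(x_i, v_i, x_best, r=0.95, theta=1.0):
    """Eq. (12): spiral (SDA) move around the best agent plus the BA velocity term.

    x_i    -- current position of agent i (2-D vector)
    v_i    -- current velocity of agent i (from the BA velocity update, omitted here)
    x_best -- position of the best agent found so far
    """
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])   # composite rotation R(theta)
    spiral = r * rot @ x_i - (r * rot - np.eye(2)) @ x_best
    return spiral + v_i

# Illustrative call with arbitrary 2-D points
print(eba_position_update(np.array([1.0, 2.0]),
                          np.array([0.1, -0.1]),
                          np.array([0.5, 0.5])))
```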

Fig. 3 Flowchart of Weighted-sum Extended Bat Algorithm (WS-EBA) to tune PID controller for
WMR

Table 1 Parameters setup for WS-EBA, WS-PSO and WS-BA

Parameter                      WS-EBA    WS-PSO    WS-BA
Number of searching agents     10/30     30        30
Number of iterations           100       100       100
Initial loudness, A            0.5       –         0.5
Initial pulse rate, p          0.5       –         0.5
Spiral radius, r               0.95      –         –
Spiral angle, θ                1         –         –
KP boundary                    [0 50]    [0 50]    [0 50]
KI boundary                    [0 100]   [0 100]   [0 100]
KD boundary                    [0 300]   [0 300]   [0 300]
Cognitive component, c1        –         0.9       –
Social component, c2           –         0.9       –
Inertia weight, w              –         0.5       –

The objective function (optimization index) is an important component of the optimization method because the value to be optimized (either minimized or maximized) depends on the objective function. The objective functions usually used to minimize the error in a control system are the Integral Square Error (ISE), Integral Absolute Error (IAE), Integral Time Squared Error (ITSE) and Integral Time Absolute Error (ITAE). Equations (13) to (16) give these objective functions; all of them were tested. The best objective function should be able to minimise the errors and give the optimum values of the PID controller gains for the WMR.
$ISE = \int_0^{\infty} \left( e_1^2(t) + e_2^2(t) + \ldots + e_n^2(t) \right) dt$  (13)

$IAE = \int_0^{\infty} \left( |e_1(t)| + |e_2(t)| + \ldots + |e_n(t)| \right) dt$  (14)

$ITSE = \int_0^{\infty} t \left( e_1^2(t) + e_2^2(t) + \ldots + e_n^2(t) \right) dt$  (15)

$ITAE = \int_0^{\infty} t \left( |e_1(t)| + |e_2(t)| + \ldots + |e_n(t)| \right) dt$  (16)
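A minimal discrete-time sketch of Eqs. (13) to (16) is given below, assuming the error signals are sampled with a fixed step dt; in this study the error signals would be the x- and y-position errors of the WMR, but the signals below are arbitrary examples.

```python
import numpy as np

def error_indices(errors, dt):
    """Discrete approximations of ISE, IAE, ITSE and ITAE for several error signals.

    errors -- array of shape (n_signals, n_samples), e.g. x and y position errors
    dt     -- sampling period in seconds
    """
    e = np.asarray(errors)
    t = np.arange(e.shape[1]) * dt
    ise = np.sum(e ** 2) * dt
    iae = np.sum(np.abs(e)) * dt
    itse = np.sum(t * np.sum(e ** 2, axis=0)) * dt
    itae = np.sum(t * np.sum(np.abs(e), axis=0)) * dt
    return {"ISE": ise, "IAE": iae, "ITSE": itse, "ITAE": itae}

# Illustrative usage with two decaying error signals
t = np.arange(0, 10, 0.1)
ex, ey = np.exp(-0.5 * t), 0.8 * np.exp(-0.4 * t)
print(error_indices(np.vstack([ex, ey]), dt=0.1))
```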

2.3 Experimental Setup

The parameter setup for WS-EBA is shown in Table 1. The experiment is con-
ducted to determine the best PID based controller to be applied to the WMR, the

appropriate number of searching agents to be used to search for the PID controller gains in the algorithm, and the best objective function to be used in the optimization process. All of the criteria mentioned are important in order to produce a robust PID controller performance for the WMR. The table also lists the parameter setups for the WS-PSO and WS-BA optimizations; both are required for comparing the controller tuned by the proposed method with other swarm-based optimization methods. For PSO, the parameter setup follows the recommendation in [12]. For BA, the parameter setup follows our previous research paper [16].

3 Result and Discussion

3.1 PID Controller for WMR Tuned by WS-EBA

The Proportional-Integral-Derivative (PID) controller consists of three gains that need to be tuned as precisely as possible for the system. However, not every gain of the controller, i.e. P, I and D, is compulsory for the system; it depends on the suitability of the system to use those controller components. In order to develop a robust controller for the wheeled mobile robot (WMR), different PID-based controllers are tested and the best-performing controller is selected.

Table 2 PID gains value tuned by WS-EBA


Controller No. of agents Objective functions KP KI KD
PID 10 ISE 0.015 0.013 0.014
IAE 0.004 0.004 2.901
ITSE 0.000 0.004 477.447
ITAE 0.000 0.003 145.828
30 ISE 0.002 0.001 65.549
IAE 0.015 0.013 0.013
ITSE 0.008 0.006 152.931
ITAE 0.000 0.002 236.19
PI 10 ISE 0.013 0.013 0.000
IAE 0.009 0.012 0.000
ITSE 0.014 0.012 0.000
ITAE 0.009 0.011 0.000
30 ISE 0.012 0.013 0.000
IAE 0.011 0.013 0.000
ITSE 0.012 0.012 0.000
ITAE 0.013 0.012 0.000
PD 10 ISE 0.105 0.000 0.093
IAE 0.124 0.000 0.116
ITSE 0.117 0.000 0.108
ITAE 0.144 0.000 0.129
30 ISE 0.221 0.000 0.207
IAE 0.221 0.000 0.208
ITSE 0.188 0.000 0.170
ITAE 0.229 0.000 0.214
No PID 0 ISE 0.000 0.000 0.000
IAE 0.000 0.000 0.000
ITSE 0.000 0.000 0.000
ITAE 0.000 0.000 0.000

The proposed method to tune the PID controller is a multi-objective optimisation method, the weighted-sum Extended Bat Algorithm (WS-EBA). The aim of this experiment is to identify the most suitable PID controller for the WMR (mBot). Table 2 shows the gain values of the three types of PID controller tuned by WS-EBA. Additionally, the system without a PID controller is included in this experiment to observe the difference. The agents in the algorithm are assigned to search for the optimal solution of the PID gains. For the PI and PD controllers, the D gain and I gain respectively have been omitted. For the case of no PID controller, all the gains are omitted and the system is run without a controller.
The solutions, i.e. the PID gain values obtained by WS-EBA for PID, PI and PD, differ because of the different number of agents and objective functions used in the system. The gains obtained for the PID and PI controllers with 10 and 30 agents show only a small difference; on the other hand, the PD controller gains show a large difference between the two cases. The values obtained with 30 agents are higher than with 10 agents. This is because the agents search within the boundaries of the search area, and the number of agents plays an important role in obtaining the best solution, avoiding being easily trapped in local minima and exploring the area containing the best solution.
Table 3 shows the performance of the WMR when applying the gains in Table 2. The WMR system is as shown in Fig. 2. The PD controller with 30 agents, tuned by using the ITAE objective function, outperformed the other controllers with the best values of rise time, settling time and percentage overshoot. In the WMR system, a short rise time is better than a long time to reach the desired position, a short settling time is better than a long time to stabilise and settle down, and a low overshoot percentage is better than a high one when assessing the performance of the WMR. Thus, the PD controller recorded the best result based on the criteria above, with the shortest rise time and settling time and the lowest percentage overshoot.

The PID and PI controllers show the worst results by not reaching the reference position, which is 1 m, as indicated by the settling time and percentage overshoot. The system is run for 60 s, and during that time the system still does not reach its settling time. The percentage of overshoot is too high, which indicates that the system is not stable and unable to reach the desired position. The performance of the robot without a PID controller shows that it takes the longest settling time and rise time. The WMR needs to react fast, as it is a moving robot that must accomplish a task. Although the percentage overshoot of the system without a PID controller is among the lowest, the PD controller performs the best in all aspects, including rise time and settling time.
The PD controller consists of the P and D gain values, without the I gain. I is the integral gain, whose function is to eliminate the steady-state error. The I gain limits the speed of the response and the stability of the system. This gain can cause an integral windup effect, in which a significant error accumulates during the rise and leads to continuous overshooting of the system response. This situation can be avoided by recognizing the appropriate range of the gain. For a system that requires a fast response, the I component of the controller is not encouraged.
For the WMR system, the presence of the I gain, such as in the PID and PI controllers, makes the error fed into the system accumulate and continue to increase the overshoot. The superior controller for the system in this study is the PD controller, with the fastest rise time and the most accurate steady-state value.
Karahan et al. [17] applied a PD controller integrated with a fuzzy controller to a wheeled mobile robot system, Baral et al. [13] applied a PI controller in a load frequency control system, and Ye et al. [4] applied a PID controller in a hydraulic system for position control. The PID controller is a well-known classical controller, easier to implement than other controllers, and can produce reliable results. However, before implementation, its suitability for the system needs to be tested and recognized. For this WMR system, the PD controller is the best controller to apply.
From the results, the similarity among all the controllers, except the case with no controller, is the presence of the P gain. P is the proportional gain, which makes the control action proportional to the feedback error in the system. Its function is to help stabilise the system while leaving a steady-state error, SSE. The optimal value of the P gain is important to control the oscillation of the robot. Referring to Table 2, the table of PID gain values for each controller, the P gain of the PD controller is neither the lowest nor the highest, but its performance is the best of all in terms of rise time, settling time and steady-state error.
The searching agent is an important element of the optimization procedure, as the agents are used to search for the best result. The gain values obtained with 10 agents are higher than those obtained with 30 agents. However, high gain values do not

indicate the best performance of the WMR; the optimum values do. The search area also plays an important role in this process. The number of searching agents must be appropriate for the search area: too many searching agents in a small area might lead to a deadlock situation, while too few searching agents might be unable to explore the whole area and thus produce a poor result.

Table 3 Result analysis performance of WMR with different types of PID controller
Controller  No. of agents  Obj. func.  X position: Tr (s), Ts (s), Os (%)  Y position: Tr (s), Ts (s), Os (%)
PID 10 ISE 22.26 59.59 350.53 9.13 13.38 1.14
IAE 20.61 49.96 1003.54 31.08 46.54 1.52
ITSE 35.89 59.42 96.27 36.08 59.42 96.28
ITAE 37.96 59.02 98.50 37.78 59.02 98.49
30 ISE 37.94 59.35 1051.42 37.84 59.35 1.54
IAE 22.45 59.58 97.07 8.95 13.10 97.07
ITSE 38.05 59.39 98.50 38.03 59.39 98.50
ITAE 36.16 59.40 93.32 36.09 59.40 93.32
PI 10 ISE 22.45 59.58 8557.86 0.33 5.67 1.45
IAE 22.05 59.65 1035.47 9.44 14.01 1.53
ITSE 21.97 59.64 887.99 9.47 13.84 1.42
ITAE 22.04 59.65 8578.58 9.44 14.00 1.46
30 ISE 22.43 59.58 1029.98 8.84 13.05 1.59
IAE 22.47 59.58 1030.47 8.78 13.06 1.55
ITSE 22.04 59.63 915.27 9.32 13.69 1.48
ITAE 22.02 59.64 8848.02 9.33 13.85 1.53
PD 10 ISE 21.89 36.82 0.36 22.36 38.02 0.93
IAE 18.99 32.99 0.79 19.63 34.70 0.43
ITSE 20.47 36.72 0.06 21.16 38.45 0.08
ITAE 16.66 29.83 0.01 17.43 31.59 0.02
30 ISE 11.32 20.67 0.01 12.45 22.62 0.00
IAE 11.31 20.57 0.00 12.43 22.55 0.01
ITSE 12.83 23.48 0.00 13.91 25.82 0.00
ITAE 11.00 20.08 0.00 12.11 22.08 0.00
No PID 0 – 41.28 58.97 257.09 2.31 59.94 0.01

In Table 3 as well shows the result performance for X position of mobile robot
using PID, PI, PD controller and without PID controller tuned by EBA with dif-
ferent number of searching agents. PD controller with 30 number of searching
agents produce better result with shorter rise time and settling time than 10
searching agents. The difference between these two comparison is quite significant
as the difference is 30 searching agent is faster by 5.66 s than 10 searching agent for
rise time and 8.75 faster for rise time.
For the Y position, the results show a large difference in performance between the PD controller with 10 and with 30 searching agents; 30 searching agents produce a settling time that is better by 12.11 s. The searching area of the PD controller, set by the upper and lower boundary parameters for both gains, is appropriate to the number of searching agents. Thus, the searching agents are able to find a result that gives good performance for the mobile robot. The number of searching agents is one of the important factors that influence the performance of the system.
Sahib et al. [18] mentioned in their research that using the most suitable objective function can greatly improve PID tuning optimization. The comparison between different objective functions is also shown in Table 3. Referring to Table 3, ITAE (Integral Time Absolute Error) is the best objective function to apply in this system to tune the PD controller. The rise time and settling time for ITAE are the shortest, 11.00 and 20.08 s respectively. IAE and ISE weight all errors equally and independently of time, which can result in a response with relatively small overshoot compared to ITAE and ITSE.
The results for the Y position of the mobile robot using the PD controller tuned by minimizing the four different objective functions show that ITAE again performs the best, as for the X position. The overshoot produced by ISE and ITSE is higher than that produced by IAE and ITAE by 50%. The rise and settling times for IAE and ITAE are shorter than those produced by ITSE and ISE. These results indicate that squaring the error increases its weight and can make the system unstable. Overshoot plays an important role in measuring the performance of the mobile robot.

3.2 Performance Comparison with WS-PSO and WS-BA

The comparison in this section involves only the PD controller tuned by WS-EBA, WS-PSO and WS-BA, because only the PD controller outperformed the other controllers in the previous experiment. The number of searching agents is set to 30 and the objective function used is ITAE, for the same reason. The PSO algorithm was chosen because it was used in [12] to tune a PID controller for a charger system, and the BA algorithm was chosen based on the previous results for a WMR tuned by BA [16]. Additionally, EBA originates from BA, so it is appropriate to compare their performance.
Figure 4 shows the convergence curves of EBA, BA and PSO. All the algorithms have finished converging by the end of 100 iterations. Among the three, PSO hardly converges and remains steady over the 100 iterations; its fitness value is the maximum. On the other hand, BA keeps converging until about 40 iterations, EBA stops converging at 12 iterations, and the fitness value for BA is the minimum. However, although the problem is to minimise the position error, the outcome depends on the PD controller gains tuned for the mobile robot system.

Fig. 4 Convergence curve fitness function for WS-EBA, WS-BA and WS-PSO

Table 4 Weightage values for WS-EBA, WS-PSO and WS-BA
Algorithm  Total fitness (fT)  Weightage 1 (W1)  Fitness 1 (f1)  Weightage 2 (W2)  Fitness 2 (f2)
WS-EBA  1.321  0.749  0.885  0.251  2.621
WS-PSO  3.224  0.453  1.587  0.547  4.580
WS-BA  0.742  0.642  1.032  0.358  0.222

The control parameters of each algorithm differ from one another. Good algorithm performance depends on how well the algorithm can control and balance the exploration and exploitation phases. Too much exploration will cause the searching agents to diverge from the best solution, while an excessive exploitation phase will make the algorithm fall into deadlock and become trapped in a wrong solution. Thus, it is important to control the searching agents as they search for the best solution.
Table 4 shows the total fitness, the first and second weightages, and the fitness values of the ITAE function used for minimizing the first and second errors in the system. The weightages sum to one. The total fitness obtained by WS-PSO is the maximum, while the total fitness obtained by WS-BA is the minimum. These values may indicate how well each algorithm converged in searching for the best solution. Being trapped in a local minimum leaves the agents unable to explore further and gives a value higher than the best solution, whereas an uncontrolled exploration phase causes the agents to miss the best solution by settling on another solution with poorer fitness.
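The entries of Table 4 are consistent with a simple weighted sum of the two objective values, fT = W1·f1 + W2·f2 with W1 + W2 = 1. A minimal check (plain Python, variable names assumed):

```python
# Weighted-sum fitness: fT = W1*f1 + W2*f2 with W1 + W2 = 1.
# The numbers below are the WS-EBA row of Table 4; the same relation
# reproduces the totals reported for WS-PSO and WS-BA as well.
W1, f1 = 0.749, 0.885
W2, f2 = 0.251, 2.621
fT = W1 * f1 + W2 * f2
print(round(fT, 3))   # 1.321, matching the total fitness reported for WS-EBA
```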
Table 5 shows the PD gains tuned by WS-EBA, WS-PSO and WS-BA. Each algorithm produces different results, depending on how its searching agents explore the search space. In this optimization experiment, the best number of

Table 5 PD gains value for comparison with different algorithm


Optimization KP KD
WS-EBA 14.376 20.223
WS-PSO 16.562 63.962
WS-BA 14.875 40.452

Fig. 5 Box plot of EBA, BA and PSO for result of fitness values for five times repeatability

searching agents used is 30 and the best objective function, ITAE, is used to optimize the gain values of the PD controller. The performance of the results can only be determined after implementing the controller in the kinematic model of the mBot and running the closed-loop system.
Figure 5 shows the box plot of the fitness values over five runs for each algorithm, and Table 6 gives the analysis of performance from the box plot. EBA produces the most consistent data, as shown by the size of its box: the smaller the box, the more consistent the data around the median. PSO has the highest median and the largest range, judging from the maximum and minimum points of its box. This means that PSO is the worst in terms of consistency (Table 6).

Table 6 Information of box plot for EBA, BA and PSO by referring Fig. 5
Algorithm  Maximum point  Minimum point  Median  Number of points  Outliers
WS-EBA 1.284 1.204 1.216 5 No
WS-PSO 4.514 1.229 2.845 5 No
WS-BA 1.404 1.199 1.232 5 No

Fig. 6 Graph of X position for optimizing PD controller by using different algorithm

Table 7 Result of performance with different optimization algorithm approach, X position
Optimization  Tr (s)  Ts (s)  Os (%)
WS-EBA  17.650  34.989  0
WS-PSO  21.454  41.440  0
WS-BA  19.753  38.776  0

Figure 6 and Table 7 display the results for the PD controller tuned by the different algorithms for the X position. The performance of the mobile robot using the PD controller tuned by WS-EBA is the best, with the fastest rise time, 17.65 s, and the fastest settling time, 34.98 s. The worst result is shown by WS-PSO. The PD controller tuned by WS-BA achieves almost the same performance as WS-EBA because WS-BA is the original algorithm and both use the same main searching method. WS-EBA manages to obtain a better result because of the specific method it uses, applying a spiral path to search for the solution.
Figure 7 and Table 8 show the results for the PD controller tuned by the different algorithms for the Y position. Here the performance of the mobile robot is the same when the PD controller gains are tuned by WS-EBA and by WS-BA. The PD controller tuned by WS-PSO produces the worst performance, with the longest rise time, 1.49 s, the longest settling time, 11.63 s, and 62% overshoot. Although WS-BA gives good performance for the Y position, WS-EBA produces the best performance for both the X and Y positions. This makes the controller tuned by WS-EBA the best controller compared with the other algorithms. Rise time, settling time and overshoot are the three main indicators used to determine the performance of the controller for the kinematic model of the system.

Fig. 7 Graph of Y position for optimizing PD controller by using different algorithm

Table 8 Result of performance with different optimization algorithm approach, Y position
Optimization  Tr (s)  Ts (s)  Os (%)
WS-EBA  1.328  9.811  1.352
WS-PSO  1.488  11.626  1.626
WS-BA  1.328  9.821  1.352

4 Conclusion

The Extended Bat Algorithm is one of the latest hybrid algorithms and had not previously been applied to a controller optimization problem. This research study demonstrates the potential of EBA, and solving a multi-objective optimization problem is a new challenge addressed with EBA. EBA produced the best result for optimizing and tuning the gains of the PID-type controllers. Based on the experiments conducted, the PD controller tuned using 30 searching agents and ITAE as the fitness function is the best controller compared with the PD controllers tuned by WS-PSO and WS-BA, and the PD controller was selected as the best among the PID and PI controllers. The PD controller with P gain 14.776 and D gain 20.223 has the best rise time, 17.65 s, and settling time, 34.989 s, and is one of the controllers with the lowest overshoot, 3.5%.

References

1. Abdalla TY, Abed AA, Ahmed AA (2017) Mobile robot navigation using PSO-optimized
fuzzy artificial potential field with fuzzy control. J Intell Fuzzy Syst 32(6):3893–3908

2. Jeng JC, Tseng WL, Chiu MS (2014) A one-step tuning method for PID controllers with
robustness specification using plant step-response data. Inst Chem Eng 92(3):545–558
3. Din A, Jabeen M, Zia K, Khalid A, Saini DK (2018) Behavior-based swarm robotic search
and rescue using fuzzy controller. Comput Electr Eng 70:53–65
4. Ye Y, Yin CB, Gong Y, Zhou JJ (2017) Position control of nonlinear hydraulic system using
an improved PSO based PID controller. Mech Syst Signal Process 83:241–259
5. Kanojiya RG, Meshram PM (2012) Optimal tuning of PI controller for speed control of DC
motor drive using particle swarm optimization. In: Proceeding of 2012 international
conference on advances in power conversion and energy technologies (APCET), Mylavaram,
Andhra Pradesh, pp 1–6
6. Chia KS (2018) Ziegler-nichols based proportional-integral-derivative controller for a line
tracking robot. Indones. J Electr Eng Comput Sci 9(1):221–226
7. Majid NA, Mohamed Z, Basri MAM (2016) Velocity control of a unicycle type of mobile
robot using optimal PID controller. Jurnal Teknologi 78(7–4):7–14
8. Foley MW, Ramharack NR, Copeland BR (2005) Comparison of PI controller tuning
methods. Ind Eng Chem Res 44(17):6741–6750
9. Sujay HS, Suman R, Chaithanya S, Narayanan S, Shamanth U (2018) Tuning and analysis of
PID controllers using soft computing techniques. Int J Sci Res Sci Technol 5(3):67–71
10. Goswami NK, Padhy PK (2018) Sliding mode controller design for trajectory tracking of a
non-holonomic mobile robot with disturbance. Comput Electron Eng 72:307–323
11. Nazari MAD, Khooban MH (2015) Design of optimal mamdani-type fuzzy controller for
nonholonomic wheeled mobile robots. J King Saudy Univ Eng Sci 27(1):92–100
12. Solihin MI, Tack LF, Kean ML (2011) Tuning of PID controller using particle swarm
optimization (PSO). In: Proceeding of international conference of advance science
engineering information technology, Putra Jaya, Malaysia, pp 458–461
13. Baral KK, Barisal AK, Mohanty B (2017) Load frequency controller design via GSO
algorithm for nonlinear interconnected power system. In: Proceeding of 2016 international
conference on signal processing, communication, power and embedded system (SCOPES),
Paralakhemundi, vol 77, pp 662–668
14. Pebrianti D, Ann NQ, Bayuaji L, Abdullah NRH, Zain ZM, Riyanto I (2019) Extended bat
algorithm (EBA) as an improved searching optimization algorithm. In: Md Zain Z, Ahmad H,
Pebrianti D, Mustafa M, Abdullah NRH, Samad R, Mat Noh M (eds) Proceeding of the 10th
national technical seminar on underwater system technology 2018, vol 538. LNEE. Springer,
Heidelberg, pp 229–237
15. Pebrianti D, Hao YH, Suarin NAS, Bayuaji L, Musa Z, Syafrullah M, Riyanto I (2018)
Motion tracker based wheeled mobile robot system identification and controller design. In:
Hassan MHA (ed) Intelligent manufacturing & mechatronics, vol 538. LNME. Springer,
Heidelberg, pp 241–258
16. Suarin NAS, Pebrianti D, Ann NQ, Bayuaji L, Syafrullah M, Riyanto I (2019) Performance
evaluation of PID controller parameters gain optimization for wheel mobile robot based on bat
algorithm and particle swarm optimization. In: Md Zain Z, Ahmad H, Pebrianti D,
Mustafa M, Abdullah NRH, Samad R, Mat Noh M (eds) Proceeding of the 10th national
technical seminar on underwater system technology 2018, vol 538. LNEE. Springer,
Heidelberg, pp 323–333
17. Karahan O, Bingül Z (2011) A fuzzy logic controller tuned with PSO for 2 DOF robot
trajectory control. Expert Syst Appl 38(1):1017–1031
18. Sahib MA, Ahmed BS (2016) A new multiobjective performance criterion used in PID tuning
optimization algorithms. J Adv Res 7(1):125–134
An Analysis of State Covariance
of Mobile Robot Navigation
in Unstructured Environment
Based on ROS

Hamzah Ahmad, Lim Zhi Xian, Nur Aqilah Othman, Mohd Syakirin Ramli, and Mohd Mawardi Saari

Abstract This paper deals with mobile robot navigation in an unstructured environment using the Robot Operating System (ROS). ROS is a framework for developing robotic applications and provides algorithms to build maps, navigate, and interpret sensor data. The system is used to define a mobile robot navigation scenario in a specific environment so that the estimation performance can be evaluated. The research aims to analyze and investigate mobile robot movement in an unknown environment using a Kalman Filter approach that takes uncertainties into account. Only one LiDAR sensor and one IMU sensor are used to measure relative distances and provide information for estimation purposes. An experiment with a TurtleBot that tracks autonomously with collision avoidance was organized to recognize the mobile robot motions through the application of the Kalman Filter. Only after the simulation performed as expected were the experimental analyses organized. The results show that the Kalman Filter can sufficiently estimate the condition of the environment relying only on the LiDAR and IMU sensors, with good performance. Besides, the calculated state covariance also agrees with the theoretical analysis.

Keywords Kalman Filter · Navigation · Mobile robot · LiDAR · Covariance

1 Mobile Robot Navigation

Working with an autonomous mobile robot is a challenging task that requires a lot of system analysis, integration of parts and sensors, environmental conditions and techniques. Introduced more than two decades ago, the simultaneous localization and mapping problem, simply known as SLAM, is an integral part of navigation and demands that researchers take into account several factors that can easily affect mobile robot performance. Issues such as computational cost, complexity,

H. Ahmad (✉) · L. Z. Xian · N. A. Othman · M. S. Ramli · M. M. Saari
Faculty of Electrical and Electronics Engineering, UMP, Pekan, Malaysia
e-mail: [email protected]


dynamics of environments, and uncertainties continuously inspire researchers to seek more reliable techniques for solving the SLAM problem.
Habibie and other researchers [1–3] state that SLAM, also known as Concurrent Mapping and Localization, is an approach to solving the "chicken-and-egg problem" of robot localization and mapping. This problem appears because building a good map of the robot's environment requires a precise self-position estimate, yet good localization can only be achieved when a well-defined map is available. In the SLAM problem, at each observation time, a mobile robot only knows its sensor measurements and the controls given to it. Based on this information, the system needs to estimate the probability of the full robot pose (state) and the map of the environment concurrently.
As stated earlier, since SLAM comprises two main issues, i.e. localization and mapping, each problem demands good and consistent results to guarantee good mobile robot performance [4]. Park [5] stated that, in the map-building process, the simpler the geometry of the working area, the larger the localization estimation error. In other words, less information obtained by the sensors leaves the mobile robot more uncertain about its estimates. Therefore, most current research applies sensor fusion from different sensor types to gain better results. In view of this, this research attempts to implement a single LiDAR sensor and analyze the resulting estimation performance.
Lotfy [6] states that one major problem with SLAM is that the measurements read from the sensors invariably contain noise, and that the motion performed by the mobile robot also introduces uncertainties during its observations. This is why the Kalman Filter, which relies on the behaviour of its state covariance, becomes necessary in the SLAM problem. Since the Kalman Filter uses linear models whereas the typical SLAM problem is nonlinear in nature, a nonlinear variant, the Extended Kalman Filter (EKF), is applied. The EKF SLAM method mainly consists of two steps, the prediction step and the correction step; the details of the EKF algorithm are presented in a later section.
A lot of research has been conducted to examine EKF performance in various conditions, from theoretical analysis to experimental verification, e.g. Huang et al. [3, 7] and Ahmad et al. [8–11]. Note that there are also other available techniques in SLAM, such as the particle filter and other Kalman Filter variants. However, due to shortcomings such as computational cost and complexity, EKF still offers a better choice for solving the SLAM problem. One of the important aspects of EKF is the behaviour of the state covariance. To guarantee that a good estimate is preserved, the state covariance must always converge, and this is one reason why EKF is among the most widely applied techniques in SLAM [7].
Applications of the Kalman Filter in ROS can also be found in a number of papers with different environments and settings. Kokovkina et al. demonstrate that EKF can be used for localization of a mobile robot and then compared with data acquired from sensing devices such as a camera and a laser scanner; the results are satisfactory, showing errors at an acceptable level [12]. UAVs are another example of successful implementation of EKF in the ROS environment
[13, 14]. Images obtained from camera observations are fed as references into the EKF for prediction, particularly for landing estimation when the landing platform is not detected; in fact, the error produced by the EKF is much smaller than that of the raw sensor detection [13]. From the perspective of the sensing devices used with EKF in ROS, Ponce et al. claim that, using only a laser scanner, a robot is still able to transport people in a domestic area. They present the robot as an autonomous wheelchair that moves efficiently from a specific place to its destination using EKF [15]. Their results indicate the possibility of reducing the computation while keeping a sufficient technique for estimation.
Inspired by the findings in the literature, this paper attempts to analyze the performance of EKF in the ROS environment using LiDAR and IMU sensors for measurement. The LiDAR sensor is used because it provides better measurements than a sonar sensor while reducing the computational cost of solving the mobile robot localization and mapping problem, while the IMU is used to identify the mobile robot heading angle as it moves around the environment. The state covariance of the estimation is also examined to understand its relation to the estimates and to compare it with the theoretical results reported in the literature. Analyses of the state covariance of EKF in preceding work, especially in the ROS environment, are few and mostly focus on statistical error performance. Since EKF is also concerned with state covariance behaviour, this paper observes the overall behaviour of the state covariance throughout the estimation process. For verification purposes, a TurtleBot is used as the main platform, since this mobile robot is easy to control and its movements are easy to estimate.
This paper is organized in the following manner. Section 2 describes the Kalman Filter algorithm in SLAM and the mobile robot, the TurtleBot 3 Burger. This is followed by Sect. 3 on the simulation and experimental analysis of the proposed system. Finally, Sect. 4 concludes the findings of the research.

2 Navigation and TurtleBot 3

2.1 SLAM and Kalman Filter

As mentioned in the previous section, SLAM consists of two main parts, namely the process and measurement models. In this paper, the same configuration of the system is applied as in Ahmad et al. [10]. The process model is stated as follows. Consider a state $x_k \in \mathbb{R}^{3+2n}$, which consists of the mobile robot $x$, $y$ position and its heading angle, together with $n$ landmarks marked by their $x$, $y$ locations. The kinematic model of the mobile robot is represented by

$$x_{k+1} = f(x_k, u_k, \omega) \qquad (1)$$

where $u_k$ defines the control input, which basically describes the mobile robot velocity and angular acceleration, and $\omega$ represents the noise that occurs during the mobile robot motion.
To observe the surrounding area, the mobile robot needs to know its environment, and therefore sensors are important to retrieve the related information. This is accomplished by using the LiDAR to measure the relative distance between the mobile robot and any recognized landmarks during its observations. The measurement is calculated as follows:

$$z_{k+1} = h(x_k, \upsilon) \qquad (2)$$

where $z_{k+1}$ describes the measurement matrix, which consists of the relative distances and angles between the mobile robot and the landmarks, and $\upsilon$ denotes the measurement noise.
The above two models are essential for the analysis and further calculation, especially for the Kalman Filter. The Kalman Filter generally consists of two stages, prediction and update. The prediction stage simply uses the kinematic model of the mobile robot to infer its location from its movements. This is followed by the update stage, which continuously updates the mobile robot location as well as the landmarks for each time frame. These two steps look the same as the process and measurement models, but with no noise considered in the calculation. The prediction stage is given by

$$x^-_{k+1} = \hat{x}_k + f(\hat{x}_k, u_k) \qquad (3)$$

where $\hat{x}_k$ is the estimated state, with the associated predicted state covariance matrix expressed by

$$P^-_{k+1} = \nabla f\, \hat{P}_k\, \nabla f^{T} + Q_k \qquad (4)$$

$P^-_{k+1}$ is the predicted state covariance with its associated noise $Q_k$. The information obtained in the prediction stage is then used to update the estimated state. The updated state $x^+_{k+1}$ becomes

$$x^+_{k+1} = x^-_{k+1} + K\left(z_{k+1} - h(x^-_{k+1})\right) \qquad (5)$$

where $K$ is the Kalman gain,

$$K = P^-_{k+1}\nabla h^{T}\left(\nabla h\, P^-_{k+1}\nabla h^{T} + R_k\right)^{-1} \qquad (6)$$

where $R_k$ is the covariance of the measurement error produced by the sensor. The above equations are then used to find the updated covariance,

$$P^+_{k+1} = (I - K\nabla h)\,P^-_{k+1} \qquad (7)$$

One of the important properties of the Kalman Filter is that the state covariance is always positive semidefinite. Besides, the state covariance always converges with respect to its initial state, as reported by Huang et al. [7] and Ahmad et al. [10]. The state covariance is related to the estimation errors and leads to a conclusion about whether the estimation has high accuracy or not. Many studies have found that if the state covariance becomes larger, the mobile robot easily becomes uncertain about its estimates. The problem is especially severe for the technique known as the H∞ Filter, where the state covariance can increase instantaneously. Therefore, much analysis has focused on the state covariance of filters in the same family as EKF, such as the particle filter and the Unscented Kalman Filter. These properties will be observed in the experimental analysis in a later section for verification purposes.
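For clarity, the following is a minimal sketch of one prediction-update cycle in the spirit of Eqs. (3)-(7); the process and measurement models f, h and their Jacobians F, H are generic placeholders here, not the specific models used in the experiments.

```python
import numpy as np

def ekf_step(x, P, u, z, f, F, h, H, Q, R):
    """One prediction-update cycle following Eqs. (3)-(7).
    f, h are the process and measurement models; F, H are their Jacobians
    (all four are generic placeholders, not the models of this paper)."""
    # Prediction, Eqs. (3)-(4)
    x_pred = f(x, u)
    P_pred = F @ P @ F.T + Q
    # Update, Eqs. (5)-(7)
    S = H @ P_pred @ H.T + R                   # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)        # Kalman gain, Eq. (6)
    x_new = x_pred + K @ (z - h(x_pred))       # state update, Eq. (5)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred  # covariance update, Eq. (7)
    return x_new, P_new
```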

2.2 TurtleBot 3 Configuration

This research uses the TurtleBot 3 presented in Fig. 1. In preparation for the experimental analysis, the ROS packages need to be installed on a computer. The installation procedure is widely documented on the ROS wiki, where further information can be obtained. The TurtleBot must remain connected to the computer so that information about the system performance is received continuously. The gmapping technique is applied for the mapping analysis, and the initial result is shown in Fig. 2 below.
Once the system has been prepared, the EKF package from the ROS wiki is installed on the computer. The package fuses odometry and IMU sensor data. Odometry is the use of data from motion sensors or the LiDAR to estimate the change in position over time, while the IMU (Inertial Measurement Unit) is a sensor that determines the orientation of the TurtleBot. The package has been published since 2012, and since then there have not been many updates to it. The system is then tested to ensure that all information can be obtained from these two sensors.
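As a rough sketch of how the covariance traces reported later can be logged, the node below subscribes to the pose published by the robot_pose_ekf package and prints the x and y position variances; the topic name and message type are the package defaults, while the node name is assumed for this sketch.

```python
#!/usr/bin/env python
# Sketch: log the x and y position variances published by robot_pose_ekf.
import rospy
from geometry_msgs.msg import PoseWithCovarianceStamped

def callback(msg):
    cov = msg.pose.covariance      # row-major 6x6 covariance over (x, y, z, roll, pitch, yaw)
    var_x, var_y = cov[0], cov[7]  # diagonal entries for the x and y positions
    rospy.loginfo("t=%.2f var_x=%.6f var_y=%.6f",
                  msg.header.stamp.to_sec(), var_x, var_y)

if __name__ == "__main__":
    rospy.init_node("covariance_logger")   # node name assumed
    rospy.Subscriber("robot_pose_ekf/odom_combined", PoseWithCovarianceStamped, callback)
    rospy.spin()
```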

Fig. 1 Turtlebot 3 burger model

Fig. 2 Testing the gmapping of the mobile robot



3 Analysis and Discussion of the Experimental Results

This section provides the experimental results. For evaluation purposes, two different places are selected to assess the estimation performance using only the LiDAR sensor. The results are mainly discussed in terms of the EKF performance, focusing on the state covariance behaviour as the mobile robot moves around the environment. The mobile robot motions are autonomous and are monitored through the computer for verification purposes. It is assumed that the environment does not contain any dynamic objects and is planar, as the measurements are made in 2D.
Figure 3 shows the initial map constructed by the mobile robot in the dining hall, which has dimensions of 20 m × 4 m. After a period of time, the mobile robot completed the mapping, as presented in Fig. 4. Based on these figures, some erroneous estimation results can be observed. The error does not accumulate over time and depends strongly on the initial measurements made by the mobile robot. Other possible causes are tyre slippage and the initial state covariance values. It was identified that higher values of the initial state covariance yield a higher estimation error.

Fig. 3 Before mapping of the dining hall

Fig. 4 Dining hall final mapping. Blue line shows the real environment based on odometry
measurement

Fig. 5 Odometry of position x against position y with lower initial state covariance

By defining a lower initial state covariance, a better picture of the estimation results is obtained, as shown in Fig. 5 for the x-y positions. In Fig. 5, it is clearly indicated that the measured wheel odometry and the EKF prediction are nearly the same and in fact produce errors similar to the odometry measurement. Even though the Kalman Filter has sufficiently low errors compared with the wheel odometry measurement, the mapping result is not the best it can achieve. Hence, the initial measurements play an important role in guaranteeing that a good estimate is preserved. In addition, the readings from the IMU also have a significant effect on the estimation. The state covariances for both the x and y states are also small, as depicted in Figs. 6 and 7 respectively. It can be observed that at the beginning of the measurement period,

Fig. 6 Covariance of position x by robot pose EKF against time

Fig. 7 Covariance of position y published by robot pose EKF package against time

high uncertainty was perceived, which makes the estimation erroneous. If the state covariance is consistent at all times, the error becomes lower and the robot is able to produce better estimates.
An investigation in a room of another size, 7 m × 7 m, was also organized to analyse the consistency of the mobile robot performance. The same mapping procedure was applied in this room. Figures 8 and 9 show the initial mapping position of the TurtleBot and its movements respectively. In this experiment, the results are better, since the initial state covariance is smaller and the mobile robot moves in a smaller environment. Compared with the previous dining hall estimation, the results show better accuracy, with smaller covariances obtained during the observations (Figs. 10 and 11).

Fig. 8 Initial position of turtlebot in mapping

Fig. 9 Final mapping of the turtlebot

Fig. 10 Covariance position x against time



Fig. 11 Covariance position y against time

4 Concluding Remarks

As demonstrated above, EKF can sufficiently provide a good estimate of the surrounding area, with approximately 90% accuracy, especially when the initial state covariance is designed to suit the environment. This can be accomplished by observing and identifying the mobile robot sensing capabilities and the environment complexity. Even though identifying a good initial state covariance is one of the challenging factors to be considered, the results still preserve good estimation. Besides this finding, the estimation also agrees with the theoretical analysis provided in the literature, even for different surroundings. It was also possible to estimate an environment using a minimal yet efficient set of sensors, namely the LiDAR and IMU sensors. Moreover, it was found that, to ensure a good estimate can be achieved, the design of the robot and the environment must be taken into account.

Acknowledgements The research was conducted under UMP grant, RDU1703139. The authors
would like to thank University Malaysia Pahang for the continuous support in achieving the
research outcomes.

References

1. Habibie N, Nugraha AM, Anshori AZ, Ma’sum MA, Jatmiko W (2017) Fruit mapping mobile
robot on simulated agricultural area in Gazebo simulator using simultaneous localization and
mapping (SLAM). In: 2017 international symposium micro nano mechatronics and human
science (MHS), Japan. IEEE
2. Durrant-Whyte H, Bailey T (2006) Simultaneous localization and mapping: part I. IEEE
Robot Autom Mag 13(2):99–110

3. Dissayanake G, Newman P, Clark S, Durrant-Whyte H, Csorba M (2001) A solution to the simultaneous localization and map building (SLAM). IEEE Trans Robot Autom 17(3):229–241
4. Sebastian T, Wolfram B, Dieter F (2005) Probabilistic robotics. MIT Press, Cambridge
5. Park S, Lee G (2017) Mapping and localization of cooperative robots by ROS and SLAM in
unknown working area. In: 2017 56th annual conference of the society of instrument and
control engineers of Japan (SICE), Japan. IEEE, pp 858–861
6. Saman ABSHM, Lotfy AH (2016) An implementation of SLAM with extended Kalman filter.
In: 2016 6th international conference on intelligent and advanced systems (ICIAS), Malaysia.
IEEE, pp 1–4
7. Huang S, Dissayanake G (2007) Convergence and consistency analysis for extended Kalman
filter based SLAM. IEEE Trans Robot 23(5):1036–1049
8. Ahmad H, Othman NA, Saari M, Ramli MS (2019) Investigating state covariance properties
during finite escape time in H∞ filter SLAM. In: Md Zain Z et al (eds) Proceedings of the
10th national technical seminar on underwater system technology 2018. Lecture notes in
electrical engineering, vol 538. Springer, Heidelberg
9. Ahmad H, Othman N (2015) The impact of cross-correlation on mobile robot localization.
Int J Control Autom Syst 13(5):1251–1261
10. Ahmad H, Othman NA, Saari MM, Ramli MS, Mazlan MBM, Namerikawa T (2017) A
hypothesis of state covariance decorrelation effects to partial observability SLAM. Indones J
Electr Eng Comput Sci 14(2):588–596
11. Othman N, Ahmad H, Namerikawa T (2016) Sufficient condition for estimation in designing
H∞ filter-based SLAM. Math Prob Eng 2015:1–14
12. Kokovkina VA, Antipov VA, Kirnos VP, Priorov AL (2019) The algorithm of EKF-SLAM
using laser scanning system and fisheye camera. In: 2019 systems of signal synchronization,
generating and processing in telecommunications (SYNCHROINFO), Russia. Media
Publisher, pp 1–6
13. Ruiz MS, Vargas AMP, Cano VR (2018) Detection and tracking of a landing platform for
aerial robotics applications. In: 2018 IEEE 2nd colombian conference on robotics and
automation (CCRA), Barranquilla. IEEE, pp 1–6
14. Ponce R, Mosquera Canchingre G, Velarde P, Moya M (2018) Design and construction of an
automatic transport system inside the home for people with reduced mobility. In: 2018
International conference on information systems and computer science (INCISCOS), Equidor.
IEEE, pp 88–93
15. Li B, Liu H, Zhang J, Zhao X, Zhao B (2017) Small UAV autonomous localization based on
multiple sensors fusion. In: 2017 IEEE 2nd advanced information technology, electronic and
automation control conference (IAEAC), Chongqing. IEEE, pp 296–303
Control Strategy for Differential Drive
Wheel Mobile Robot

Nor Akmal Alias and Herdawatie Abdul Kadir

Abstract The wheeled mobile robot is widely used nowadays. It is used not only in industry but has also been developed to aid patients in rehabilitation. Robotics is now used widely as it can reduce the therapist workload and give efficient results. Robots used as rehabilitation devices can help patients regain the ability to walk after losing it to stroke, spinal cord injury or traumatic brain injury. A widely used gait training device is the Andago. The motivation behind this work is to build a control strategy for a differential drive wheeled mobile robot. The robot's job is to move in a straight direction in the workspace despite being powered by two non-identical electric motors. The rotational speed is controlled by the developed controller to achieve a straight trajectory of the WMR. This paper proposes a trajectory tracking control for a WMR using a sliding mode controller. SMC is well suited to trajectory tracking of the nonholonomic robot. The sliding surfaces of the SMC converge to zero and, consequently, the errors produced while the robot moves also converge to zero.


Keywords Kinematics · Dynamics · Wheel mobile robot · Differential drive robot · Sliding mode controller

1 Introduction

In recent years, developments in robotic technology have reached a certain milestone. Heavy work conventionally done by hand by our predecessors is now mostly accomplished with automated machinery. Applications of robots exist in diverse fields such as logistics, aerospace and medicine. In the medical field, robots are normally used to assist doctors in the rehabilitation of bedridden patients.

N. A. Alias · H. A. Kadir (✉)
Faculty of Electrical and Electronic Engineering, Universiti Tun Hussein Onn Malaysia,
83000 Batu Pahat, Johor, Malaysia
e-mail: [email protected]


During the rehab sessions, the patient is strapped into a device which assists them in walking from one point to another. The ergonomic design of the machine helps patients retain a normal skeletal structure. Previously, medical personnel would need to support the patients themselves and guide them slowly. This method is not efficient and is time consuming. Patients must rely on the therapist in order to do gait training, as they cannot do it by themselves after losing the ability to walk due to their illness. The degree of illness differs from one patient to another, and a higher degree of illness needs a higher level of assistance. The therapist workload can be reduced by using a robotic device as a gait trainer.
The non-holonomic properties of the differential drive wheeled mobile robot (DDWMR) impose some mobility restrictions in its applications, mostly regarding the trajectory tracking problem. To overcome this, variable structure control (VSC), a robust control approach, has been successfully designed for diverse applications such as electrical motor control, autonomous underwater vehicles, flight stability and robotics [1]. Thus, SMC may be the best approach for this non-holonomic robot.
Implementing a system for an autonomous robot is, however, not without challenges. Trajectory tracking control of a wheeled mobile robot (WMR) was initially based solely on kinematic models because of their direct expression of the nonholonomic constraints [2, 3]. However, the output does not reproduce the actual situation in real life. Kinematics only ascertains the current position relative to the input, with the reference points derived from the translational and rotational velocities, which significantly reduces the reliability of the trajectory tracking control of the WMR [4]. To negate this problem, the dynamics of the WMR must be taken into consideration. Parameters such as mass, the centre of gravity and moments of inertia are added into the model to calibrate the motion of the WMR.
Researchers have come up with several controller types, for example fuzzy control, neural networks and adaptive control. Among them, sliding mode control (SMC) has shown great prospects in handling uncertainties, reducing the tracking error and giving a fast response [5, 6]. In [7], positive results were achieved in both tracking control and regulation tasks when SMC was utilized in a WMR system. However, that system strictly requires the moving force of the robot to be determined as one of the inputs; although achievable, this would be impractical, mainly because of its complexity and cost.
Introducing the Lyapunov stability equation further enhances the robustness of the system, as done in [7]. Following the example of [8], other researchers have implemented Lyapunov-based controllers for their robots. Dealing with patients in the rehabilitation field requires the utmost consideration of patient safety.
This paper is organized as follows. In Sect. 2, trajectory tracking is introduced by developing the kinematics and dynamics of the differential drive wheeled mobile robot, and the SMC is briefly explained in the final subsection. Section 3 presents the results and discussion of the proposed controller for the DDWMR. Section 4 concludes the paper.

2 System Description

The target of the investigation is the design of a WMR with the capability of following a predefined path or trajectory. Trajectory tracking follows the path at a specified velocity using a precise controller, as studied by previous researchers. Trajectory tracking controllers for the DDWMR have been developed in recent years by [9–11]; some researchers designed the DDWMR controller using backstepping, PID, sliding mode and other methods.
By far, a tracking control law developed using SMC is, in terms of stability analysis, one of the best solutions [12]. SMC is insensitive to uncertainties, which makes it a reliable controller for the DDWMR, especially in rehabilitation. Therefore, SMC is a powerful answer for trajectory tracking control in real applications. In this simulation, the required path can be plotted by means of a series of points extracted from a data file, or it can be generated from an equation or set of equations.
For demonstration purposes, the latter approach was adopted. To synchronize the path points with the simulation time steps, a clock was introduced into the system. The imposition of time constraints on the desired path results in the creation of a trajectory, as defined in earlier sections.
The main objective is to develop a control law for the DDWMR to closely follow the reference trajectory. Figure 1 shows the block diagram of the proposed controller for this study. The first level of the control strategy is to obtain the suitable torques used to drive the left and right wheels. Then, the DDWMR provides suitable velocities to track the reference trajectory.

2.1 Kinematic and Dynamics Model

The wheeled mobile robot (WMR) discussed in this paper is a differential drive robot platform. This WMR consists of two motorized wheels and one castor wheel that provides balance for the robot. The left and right wheels are given specific input velocities for the robot to perform the desired trajectory, while the free-moving castor wheel plays no role in driving or steering the robot. This type of robot is also known as a nonholonomic robot.

Fig. 1 Block diagram for DDWMR using SMC controller



Fig. 2 Kinematic model of differential drive WMR

The kinematics and dynamics form the two control levels for this nonholonomic robot. The kinematics is implemented in the system to obtain the velocities, which are then used by the robot's dynamics to apply the desired torques to the left and right wheels. The goal is to develop a control law that follows the desired trajectory of the robot. Figure 2 shows the kinematic behaviour of a differential drive WMR.

$$\dot{p} = J\dot{q} \qquad (1)$$

$$\begin{bmatrix}\dot{x}\\ \dot{y}\\ \dot{\phi}\end{bmatrix} = \begin{bmatrix}\cos\phi & 0\\ \sin\phi & 0\\ 0 & 1\end{bmatrix}\begin{bmatrix}v\\ \omega\end{bmatrix}$$

The differential drive robot is a 3-degree-of-freedom robot with two-dimensional motion consisting of translational and rotational movement. The kinematic model can be written in the form below, where $v$ and $\omega$ are the linear and angular velocities of the WMR.
This WMR model is a driftless affine system with two vector fields. The vectors $g_1$ and $g_2$ are obtained from the Jacobian matrix of the kinematics; $g_1$ accounts for the translational movement while $g_2$ accounts for the rotational movement. Figure 3 shows this relationship.

$$g_1 = \begin{bmatrix}\cos\phi\\ \sin\phi\\ 0\end{bmatrix}, \qquad g_2 = \begin{bmatrix}0\\ 0\\ 1\end{bmatrix} \qquad (2)$$
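As an illustration of how the kinematic model (1) propagates the pose, the following is a minimal forward-Euler integration sketch; the time step and example inputs are assumed.

```python
import math

def unicycle_step(x, y, phi, v, w, dt):
    """One forward-Euler step of the kinematic model (1):
    pose rates obtained from the linear velocity v and angular velocity w."""
    x   += v * math.cos(phi) * dt
    y   += v * math.sin(phi) * dt
    phi += w * dt
    return x, y, phi

# Example: straight-line motion at 0.5 m/s for a single 0.01 s step (values assumed).
pose = unicycle_step(0.0, 0.0, 0.0, v=0.5, w=0.0, dt=0.01)
```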

The dynamic model of the nonholonomic robot can be presented in terms of the linear and angular velocities. The dynamic equations of the WMR ensure that the

Fig. 3 Translational and rotational movement

actual velocities match the desired velocities. The model is derived from the Lagrange dynamic equation and is written as

$$M(q)\dot{\eta} + V(q,\dot{q})\eta = B(q)\tau \qquad (3)$$

where $\eta = [v \;\; \omega]^T$ is the velocity vector and $\tau = [\tau_R \;\; \tau_L]^T$ contains the right and left wheel torques. The equation is then rearranged into the compact form

$$\dot{v} = \frac{1}{m + \frac{2I_w}{R^2}}\left(\frac{\tau_R + \tau_L}{R} + m_c d\,\omega^2\right) \qquad (4)$$

$$\dot{\omega} = \frac{1}{I + \frac{2L^2}{R^2}I_w}\left(\frac{L(\tau_R - \tau_L)}{R} - m_c d\,\omega v\right) \qquad (5)$$

Where,
mc = mass without wheel and actuators
mw = mass of each wheel with actuators
Ic = Inertia about vertical axis through center of mass
Iw = Inertia of each wheel with actuators about wheel axis
Im = Inertia of each wheel with actuators about wheel diameter
Equations (4) and (5) can be written in matrix form as

$$\begin{bmatrix}\dot{v}\\ \dot{\omega}\end{bmatrix} = \frac{1}{R}\begin{bmatrix}\frac{1}{m + \frac{2I_w}{R^2}} & \frac{1}{m + \frac{2I_w}{R^2}}\\[4pt] \frac{L}{I + \frac{2L^2}{R^2}I_w} & -\frac{L}{I + \frac{2L^2}{R^2}I_w}\end{bmatrix}\begin{bmatrix}\tau_R\\ \tau_L\end{bmatrix} + \begin{bmatrix}0 & \frac{m_c d\,\omega}{m + \frac{2I_w}{R^2}}\\[4pt] -\frac{m_c d\,\omega}{I + \frac{2L^2}{R^2}I_w} & 0\end{bmatrix}\begin{bmatrix}v\\ \omega\end{bmatrix} \qquad (6)$$

Table 1 shows the values of each parameter used in the dynamic equations.

Table 1 DDWMR parameters and values
Parameters  Value  Unit
m  81.05522  kg
mc  80.4144  kg
mw  0.6378  kg
L  0.385  m
R  0.1  m
d  0.2  m
Ic  −0.1821  kgm2
Iw  1.0  kgm2
Im  1.0  kgm2
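To make the dynamics concrete, the following is a small sketch evaluating Eqs. (4)-(5) as reconstructed above with the Table 1 values; the total inertia I about the vertical axis is not listed directly in Table 1, so a placeholder value is assumed, and the wheel torques are illustrative inputs.

```python
# Forward dynamics of the DDWMR, Eqs. (4)-(5) as reconstructed, using Table 1 values.
m, m_c = 81.05522, 80.4144   # total mass and body mass [kg] (Table 1)
L, R, d = 0.385, 0.1, 0.2    # half axle length, wheel radius, CoM offset [m] (Table 1)
I_w = 1.0                    # wheel inertia about its axis [kg m^2] (Table 1)
I = 1.0                      # total inertia about the vertical axis: placeholder, not given in Table 1

def ddwmr_accelerations(tau_r, tau_l, v, w):
    """Return (dv/dt, dw/dt) for the given wheel torques and current velocities."""
    M_v = m + 2.0 * I_w / R**2            # effective translational inertia, Eq. (4)
    M_w = I + 2.0 * (L**2 / R**2) * I_w   # effective rotational inertia, Eq. (5)
    v_dot = ((tau_r + tau_l) / R + m_c * d * w**2) / M_v
    w_dot = (L * (tau_r - tau_l) / R - m_c * d * w * v) / M_w
    return v_dot, w_dot

# Example: equal torques on both wheels accelerate the robot forward without turning.
print(ddwmr_accelerations(tau_r=1.0, tau_l=1.0, v=0.0, w=0.0))
```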

2.2 Sliding Mode Controller

SMC generates a discontinuous control signal that repeatedly drives the system across the sliding surface until the surface value finally converges to zero. A remaining issue after the sliding motion is reached is the chattering phenomenon, in which switching causes the states to deviate from the sliding surface. This issue can be overcome by replacing the sign function (sgn) with the saturation function (sat), which smooths the switching within a boundary layer and reduces the chattering effect.
The controller and its gains are used to drive the tracking errors to zero. When the errors are zero, the real trajectory follows the reference trajectory closely. Tracking errors appear as soon as the real robot starts moving. The errors expressed in the robot coordinate frame are given below:
$$\begin{bmatrix}x_e\\ y_e\\ \phi_e\end{bmatrix} = \begin{bmatrix}\cos\phi_d & \sin\phi_d & 0\\ -\sin\phi_d & \cos\phi_d & 0\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}x - x_d\\ y - y_d\\ \phi - \phi_d\end{bmatrix} \qquad (7)$$

Hence, the error dynamics for trajectory tracking are obtained by differentiating (7) and substituting the kinematic model (1). For example,

$$\dot{x}_e = (\dot{x}-\dot{x}_d)\cos\phi_d + (\dot{y}-\dot{y}_d)\sin\phi_d + \dot{\phi}_d\left[-(x-x_d)\sin\phi_d + (y-y_d)\cos\phi_d\right] = V\cos\phi_e + \omega_d y_e - V_d$$

and, proceeding in the same way for the remaining states,

$$\dot{x}_e = V\cos\phi_e + \omega_d y_e - V_d, \qquad \dot{y}_e = V\sin\phi_e - \omega_d x_e, \qquad \dot{\phi}_e = \omega - \omega_d \qquad (8)$$

The SMC is designed so that the actual velocities follow the desired velocities of the WMR, ensuring that the reference trajectory is closely tracked. In $S_2$, the lateral error $y_e$ and the angular error $\phi_e$ are coupled so that they converge together. $C_0$, $C_1$ and $C_2$ are positive constant parameters of the system. The sliding surfaces are therefore defined as

$$S_i = \begin{bmatrix}S_1\\ S_2\end{bmatrix} \qquad (9)$$

$$S_1 = \dot{x}_e + C_1 x_e, \qquad S_2 = \dot{y}_e + C_2 y_e + C_0\,\mathrm{sgn}(y_e)\,\phi_e$$

Differentiating the sliding surfaces and substituting the error dynamics (8) gives

$$\dot{S}_1 = \ddot{x}_e + C_1\dot{x}_e = \dot{\omega}_d y_e + \omega_d\dot{y}_e + \dot{V}\cos\phi_e - \dot{\phi}_e V\sin\phi_e - \dot{V}_d + C_1\left(V\cos\phi_e + \omega_d y_e - V_d\right) \qquad (10)$$

$$\dot{S}_2 = \ddot{y}_e + C_2\dot{y}_e + C_0\,\mathrm{sgn}(y_e)\,\dot{\phi}_e = \dot{V}\sin\phi_e + \dot{\phi}_e V\cos\phi_e - \dot{\omega}_d x_e - \omega_d\dot{x}_e + C_2\left(V\sin\phi_e - \omega_d x_e\right) + C_0\,\mathrm{sgn}(y_e)\left(\omega - \omega_d\right)$$

The reaching law in the proposed controller is the Gao and Hung reaching law [13]. They suggested that, by using a suitable reaching law, the reaching speed can be controlled. When the proportional rate $P$ is used, the switching is pushed faster if the boundary-layer term $Q$ is larger. Both $P$ and $Q$ must be larger than zero for the sliding surface to converge smoothly to zero. The general form of the law is

$$\dot{S}_i = -Q_i\,\mathrm{sgn}(S_i) - P_i S_i, \qquad i = 1, 2 \qquad (11)$$

The following equations are obtained by equating (11) with (10):

$$-Q_1\,\mathrm{sgn}(S_1) - P_1 S_1 = \dot{\omega}_d y_e + \omega_d\dot{y}_e + \dot{V}\cos\phi_e - \dot{\phi}_e V\sin\phi_e - \dot{V}_d + C_1\left(V\cos\phi_e + \omega_d y_e - V_d\right) \qquad (12)$$

$$-Q_2\,\mathrm{sgn}(S_2) - P_2 S_2 = \dot{V}\sin\phi_e + \dot{\phi}_e V\cos\phi_e - \dot{\omega}_d x_e - \omega_d\dot{x}_e + C_2\left(V\sin\phi_e - \omega_d x_e\right) + C_0\,\mathrm{sgn}(y_e)\left(\omega - \omega_d\right)$$

Solving (12) for the control inputs gives

$$\dot{V} = \frac{1}{\cos\phi_e}\left[-\dot{\omega}_d y_e - \omega_d\dot{y}_e + \dot{\phi}_e V\sin\phi_e + \dot{V}_d - C_1\left(V\cos\phi_e + \omega_d y_e - V_d\right) - Q_1\,\mathrm{sgn}(S_1) - P_1 S_1\right] \qquad (13)$$

$$\omega = \frac{1}{V\cos\phi_e + C_0\,\mathrm{sgn}(y_e)}\left[-Q_2\,\mathrm{sgn}(S_2) - P_2 S_2 - \dot{V}\sin\phi_e + \dot{\omega}_d x_e + \omega_d\dot{x}_e - C_2\left(V\sin\phi_e - \omega_d x_e\right)\right] + \omega_d$$

The sign function within the boundary layer is then replaced with the saturation function, which eliminates the chattering issue:

$$\dot{V} = \frac{1}{\cos\phi_e}\left[-\dot{\omega}_d y_e - \omega_d\dot{y}_e + \dot{\phi}_e V\sin\phi_e + \dot{V}_d - C_1\left(V\cos\phi_e + \omega_d y_e - V_d\right) - Q_1\,\mathrm{sat}(S_1) - P_1 S_1\right] \qquad (14)$$

$$\omega = \frac{1}{V\cos\phi_e + C_0\,\mathrm{sat}(y_e)}\left[-Q_2\,\mathrm{sat}(S_2) - P_2 S_2 - \dot{V}\sin\phi_e + \dot{\omega}_d x_e + \omega_d\dot{x}_e - C_2\left(V\sin\phi_e - \omega_d x_e\right)\right] + \omega_d$$

The obtained control law for the DDWMR is the nominal SMC control law; uncertainties are not considered in this paper. This control law is fed into the DDWMR model to track the generated reference trajectory.
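As a rough sketch of the parts of the controller that follow directly from the text, the snippet below implements the error transform (7), the sliding surfaces (9) and the Gao-Hung reaching law (11) with the saturation replacing the sign function; the gains are those listed later in Table 2, the boundary-layer width of sat is assumed, and the full solve for the control inputs in (13)-(14) is omitted.

```python
import math

# Gains from Table 2 of this paper; the sat() boundary-layer width is assumed.
C0, C1, C2 = 0.4, 0.5, 0.1
P1, P2 = 0.003, 0.1
Q1, Q2 = 100.0, 1.0

def sat(s, width=1.0):
    """Saturation used in place of sgn to soften chattering."""
    return max(-1.0, min(1.0, s / width))

def tracking_errors(x, y, phi, xd, yd, phid):
    """Error transform of Eq. (7), expressed in the desired frame."""
    xe = math.cos(phid) * (x - xd) + math.sin(phid) * (y - yd)
    ye = -math.sin(phid) * (x - xd) + math.cos(phid) * (y - yd)
    return xe, ye, phi - phid

def sliding_surfaces(xe, ye, phie, xe_dot, ye_dot):
    """Sliding surfaces of Eq. (9)."""
    S1 = xe_dot + C1 * xe
    S2 = ye_dot + C2 * ye + C0 * math.copysign(1.0, ye) * phie
    return S1, S2

def reaching_rates(S1, S2):
    """Gao-Hung reaching law of Eq. (11), with sat replacing sgn."""
    return -Q1 * sat(S1) - P1 * S1, -Q2 * sat(S2) - P2 * S2
```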

2.3 Summarize

This paper briefly discusses the DDWMR using SMC in a rehabilitation application. Patients who have lost the ability to walk need an assist-as-needed device to regain their normal walking behaviour, which can be achieved through frequent therapy sessions. A robot-assisted device can help patients regain the ability to walk much faster than therapy assistance alone. Andago [14] is the closest reference for this research.
Both the kinematics and the dynamics of the DDWMR are used in this simulation. Trajectory tracking for the DDWMR is formulated for a mobile robot that moves along a desired path at a specified velocity. The kinematics describes the movement behaviour of the mobile robot, while the dynamics ensures that the physical parameters of the mobile robot are considered.
SMC is the controller used in this research. The control law obtained from the SMC is used by the DDWMR to follow the reference trajectory. A sliding surface that converges to zero shows that the controller follows the input it has been given, which indicates that SMC is an efficient controller for this system and should be able to track the reference trajectory very well.

3 Results and Discussions

Modelling a WMR's behaviour on kinematics alone will probably lead to errors, particularly at increasing mass and velocity. To avoid slippage, dynamic forces must be considered. For the WMR model used in this study, dynamic constraints are imposed on the kinematic solutions in order to produce realistic results.
The model comprises both kinematic and dynamic aspects and incorporates salient components such as the SMC controller and the motors; it also takes into consideration the effects of tyre friction. With this stage completed, the behaviour of the wheeled robot in response to various assigned trajectories can now be simulated. The advantage of using software simulation is that trajectories and other physical parameters can be altered with ease in order to gauge the reaction of the robot.
The simulation starts with a trajectory that should be tracked by the differential drive robot. In Fig. 4, the blue line is the reference trajectory while the yellow line is the real trajectory tracked by the robot. The WMR quickly propels itself from its starting point towards the predefined path located some distance away. Once the WMR has negotiated itself onto the path, it follows it faithfully until the end of the simulation period; hence, the robot gradually follows the reference trajectory. This shows that the proposed control law is validated by the trajectory tracking of the robot, which closely follows the reference trajectory (Table 2).

Fig. 4 Real and reference trajectory

Table 2 Controller parameters and values
Controller parameters  Value
Controller gain, C0  0.4
Controller gain, C1  0.5
Controller gain, C2  0.1
Reaching gain, P1  0.003
Reaching gain, P2  0.1
Boundary layer, Q1  100.0
Boundary layer, Q2  1.0

Some parameters are used in the simulation to achieve the results below. These parameters are chosen within the SMC parameter rules. The selected values of Q are used to eliminate the chattering effect that occurs in the SMC controller; the boundary layer must be thick enough for the chattering within it to be eliminated.
The remaining results shown below can be used to validate the proposed control law. The sliding surfaces of the SMC should converge to zero for the robot to follow the reference trajectory. Errors occur when the robot starts to move, and if the controller is able to eliminate the error, the robot can closely follow the reference trajectory. Referring to Fig. 4, the sliding surface successfully converges to zero as it eliminates the error.

Figure 5 shows results very similar to the figure above. The sliding surface reaches zero, and the WMR follows the trajectory exactly, as the error is eliminated when the sliding surface reaches zero. When the switching function is introduced with a boundary layer, the system reaches zero much faster. Both figures show that the surfaces converge to zero in a short period of time (Fig. 6).

Fig. 5 Sliding surface, S1

Fig. 6 Sliding surface, S2




Fig. 7 Walking speed for the proposed trajectory

The results prove that the dynamic algorithm will slow down the simulated
WMR when situations arise that could cause it to exceed friction limits.
Nevertheless, it is responsive enough to be able to speed up when required to match
the reference trajectory. In general, the SMC model performs reasonably well in a
simulated environment and demonstrates the feasibility of the idea.
It is important not only for a controller to be able to follow a prescribed tra-
jectory, but it must be able to do it with a level of accuracy that is within acceptable
limits. In order to verify the tracking capabilities of the SMC, further simulation
runs must be conducted.
The trajectory tracking generated using SMC must also suit the patients' behaviour. The average normal gait speed is 1.34 ms−1 [15]. When dealing with patients who have difficulty walking, the speed of the gait-assist device must accommodate this situation. Figure 7 shows that the average speed performed by the WMR is 1.25 ms−1, which is lower than the normal speed. It is therefore shown that the controller is suitable both in terms of robustness and for the rehabilitation purpose.

4 Conclusion

This paper discussed the DDWMR in a rehabilitation application using SMC. SMC is a
robust controller that tolerates the nonholonomic behaviour of the WMR very well;
hence, SMC is applied to this robotic device for gait-assisting rehabilitation.
The proposed controller has proven effective in handling the WMR trajectory,
ensuring that the tracking error is eliminated and the trajectory is followed closely.
The WMR follows the programmed desired trajectory as closely as it can using the
proposed SMC. This is because the designed controller works very well with this
nonholonomic robot, which rolls

without slipping. Both sliding surfaces eventually converge to zero, hence making
the tracking errors also equal to zero.
The simulation has also shown that the controller can accommodate the condition of
patients who face difficulties in their walking behaviour. Patients may not be able to
walk like a healthy person; they may produce a lower speed in order to cope with their
current situation.
All in all, the tracking performance produced by the simulation has been thoroughly
evaluated. It can be summarized that the control law works very well in rehabilitation
conditions, specifically in gait training.

Acknowledgements The authors acknowledge support from the Advanced Mechatronic Research (AdMire) Group.

References

1. Filipescu A et al (2011) Trajectory-tracking and discrete-time sliding-mode control of
wheeled mobile robots. In: 2011 IEEE international conference on information and
automation. IEEE
2. Nicolescu A-F, Ilie F-M, Alexandru T-G (2015) Forward and inverse kinematics study of
industrial robots taking into account constructive and functional parameter’s modeling. Proc
Manuf Syst 10(4):157
3. Chwa D (2004) Sliding-mode tracking control of nonholonomic wheeled mobile robots in
polar coordinates. IEEE Trans Control Syst Technol 12(4):637–644
4. Solea R, Nunes U (2007) Trajectory planning and sliding-mode control based
trajectory-tracking for cybercars. Integr Comput-Aided Eng 14(1):33–47
5. Asif M, Khan MJ, Cai N (2014) Adaptive sliding mode dynamic controller with integrator in
the loop for nonholonomic wheeled mobile robot trajectory tracking. Int J Control 87(5):964–
975
6. Tzafestas SG (2013) Introduction to Mobile Robot Control. Elsevier, Amsterdam
7. Belhocine M, Hamerlain M, Meraoubi F (2003) Variable structure control for a wheeled
mobile robot. Adv Robot 17(9):909–924
8. Yun X, Sarkar N (1998) Unified formulation of robotic systems with holonomic and
nonholonomic constraints. IEEE Trans Robot Autom 14(4):640–650
9. Xie D, Wang S, Wang Y (2018) Trajectory tracking control of differential drive mobile robot
based on improved kinematics controller algorithm. In: 2018 Chinese automation congress
(CAC). IEEE
10. Ibrahim AE-SB (2016) Wheeled mobile robot trajectory tracking using sliding mode control.
JCS 12(1):48–55
11. Wu H-M, Karkoub M (2019) Hierarchical fuzzy sliding-mode adaptive control for the
trajectory tracking of differential-driven mobile robots. Int J Fuzzy Syst 21(1):33–49
12. Solea R et al (2009) Sliding mode control for trajectory tracking of an intelligent wheelchair.
Ann Dunarea de Jos Univ Galati. Fascicle III Electrotech Electron Autom Control Inf
32(2):42–50
13. Gao W, Hung JC (1993) Variable structure control of nonlinear systems: a new approach.
IEEE Trans Industr Electron 40(1):45–55
14. Alias NA et al (2017) The efficacy of state of the art overground gait rehabilitation robotics: a
bird’s eye view. Procedia Comput Sci 105:365–370
15. Bohannon RW, Andrews AW (2011) Normal walking speed: a descriptive meta-analysis.
Physiotherapy 97(3):182–189
Adaptive Observer for DC Motor Fault
Detection Dynamical System

Janet Lee, Rosmiwati Mohd-Mokhtar, and Muhammad Nasiruddin Mahyuddin

Abstract The increase in the complexity of manufacturing systems increases the
importance of fault detection and isolation. Fault detection is important to prevent
failure of the system, which may affect productivity. This paper studies fault
detection using an observer-based approach for a dynamical system. A direct current
motor with an encoder is used to represent the dynamical system and the sensor. A linear
observer and an adaptive observer are designed to detect the sensor fault. Two types
of encoder fault are modelled in the simulation via MATLAB Simulink. The results
show that the linear observer is good at estimating states but fails when a fault is present
in the output signal. The adaptive observer is better at estimating the actual
states of the system with additive faults but fails for a gain fault. Comparative analysis
was made to verify the efficacy of the observers in fault detection and estimation.

Keywords Fault detection · Adaptive observer · Sensor fault · Encoder fault

1 Introduction

Improvements in information technology led to the invention of the
Internet, which consequently led to the fourth Industrial Revolution, known
as Industry 4.0 [1]. This leads to the upgrading of manufacturing systems from
traditional factories to smart factories, which increases the complexity of the systems,
and the use of sensors also increases [2]. Fault diagnosis techniques are becoming more
important to ensure the safety of the systems as well as of human beings, including

J. Lee, R. Mohd-Mokhtar, M. N. Mahyuddin
School of Electrical and Electronic Engineering, Universiti Sains Malaysia, Engineering Campus, 14300 Nibong Tebal, Pulau Pinang, Malaysia

industrial workers and customers. Because automation relies heavily on the control
system, any faults in the system should be detected quickly to avoid hard failure of
the system [3].
The feedback system of a control system implemented in the industry usually
relies on the information provided by sensors. Thus, a fault in the sensor may lead
to a loss in control of the system [3]. In a complex system that contains a lot of
subsystems in the system network, a small fault in a subsystem may affect the
stability of the whole network since the effect of the fault may be propagated to
other subsystems via the interconnections [4].
There are various fault detection strategies that are researched, one of them is by
using the observer. Observers can be utilized to estimate the states or disturbance
that are unknown [5]. Its main purpose is to estimate the state output in a given state
input condition. Hence it can be used in the fault detection system to estimate the
errors or the faults in the sensors.
In this paper, a direct current (DC) motor is used as a representative dynamical
system. This is because the DC motor is one of the most commonly used actuators
in the manufacturing industry. Besides that, DC motors are economical, easy to
drive, and easy to obtain in different sizes and shapes [6]. One of the most commonly
used sensors for detecting the speed and position of a DC motor is the encoder. It gives
the exact rotor speed or position of a DC motor in closed-loop operation [3]. This
research aims to design an observer to detect sensor faults in a DC motor with an
encoder and to design a model for the encoder fault signals.
Overall, this paper is organized as follows. Section 2 reviews the related work
and Sect. 3 shows the research methodology. The results are discussed in Sect. 4
and conclusions in Sect. 5.

2 Related Works

There are various fault diagnosis methods that have been developed and proposed
in the last few decades. One of the most commonly used approaches is the
observer-based approach. Observer plays a key role in model-based fault diagnosis
for monitored systems or processes characterized by deterministic models [7]. The
basic idea of the observer-based approach is to construct various observers to
estimate the states, which are compared with the actual states to generate residuals
used to detect faults present in the system [8]. Observers were first designed for
linear systems and are known as linear observers or Luenberger observers. Luenberger
observers are useful for linear systems and have been applied in various applications,
but they need to be modified before being applied to non-linear and uncertain systems,
which are more common in the real world [9].
Besides linear observers, an adaptive observer is an algorithm that estimates
unmeasured states and unknown parameters simultaneously [10]. With some
modification to the Luenberger observer, the adaptive observers can estimate the

states of a system in the presence of disturbances [11]. Another widely used observer
for fault detection in non-linear systems is the sliding mode observer. The sliding mode
observer uses a non-linear high-gain feedback to bring the error dynamics to zero in
finite time. A sliding mode observer is usually implemented with a scaled switching
function, such as the signum of the error between the estimated output and the measured
output [12]. Its advantages are robustness to bounded disturbances and low sensitivity
to parametric uncertainties [12].
Other than observer-based approaches, another model-based approach is the parity
relation approach, where a parity vector is generated to check the consistency
between the model and the process output [7]. The stable factorization approach is a
frequency-domain fault diagnosis method. It generates a residual based on the
stable coprime factorization of the transfer function matrix of the monitored system
[7]. Both the parity relation approach and the stable factorization approach involve the
design of an observer [7]. Besides that, another fault diagnosis approach is the
non-linear geometric approach, which relies on a coordinate change in the state and
output spaces [13]. This approach must be provided with an observable subsystem that
is affected by the fault but unaffected by the disturbances and the other faults to be
decoupled [13].
Data-driven fault detection uses the Takagi-Sugeno fuzzy model (T-S model) in
the dynamic modelling of a non-linear system [14]. This is called the kernel representation
of the non-linear system. Generally, the main concept of the standard
fuzzy fault detection approach is to design the kernel representation based on
the model of the system with the aid of the fuzzy modelling technique [14]. Fault
tree analysis (FTA) approach is widely used to determine system dependability. In a
fault tree, the logical connections between faults and their causes are represented
graphically [15]. It is deductive in nature, in other words, the analysis starts with the
top event or a system failure and works backward from top of the fault tree to the
bottom leaves to find the root causes of the system failure [15].
In this paper, the adaptive observer is employed for fault detection in the DC
motor system. Its ability to estimate the states in the presence of disturbances and
to simultaneously estimate both unmeasured states and unknown parameters is the
main advantage of implementing this technique for DC motor fault detection.

3 Observer Design

The transfer function and the state space model of the dc motor system can be
presented as (1) to (3).

$$\frac{\theta(s)}{V(s)} = \frac{K_m}{s\left[(sJ_m + B_m)(R_a + sL_a) + K_m^2\right]} \qquad (1)$$

$$\frac{d}{dt}\begin{bmatrix} i_a \\ \theta \\ \dot{\theta} \end{bmatrix} =
\begin{bmatrix} -\dfrac{R_a}{L_a} & 0 & -\dfrac{K_m}{L_a} \\ 0 & 0 & 1 \\ \dfrac{K_m}{J_m} & 0 & -\dfrac{B_m}{J_m} \end{bmatrix}
\begin{bmatrix} i_a \\ \theta \\ \dot{\theta} \end{bmatrix} +
\begin{bmatrix} \dfrac{1}{L_a} \\ 0 \\ 0 \end{bmatrix} V \qquad (2)$$

$$y = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} i_a \\ \theta \\ \dot{\theta} \end{bmatrix} \qquad (3)$$

where $V$ is the source voltage, $\theta$ is the position, $R_a$ is the armature resistance, $L_a$ is the electric inductance, $K_m$ is the motor constant, $J_m$ is the rotor moment of inertia, $B_m$ is the frictional coefficient, and $i_a$ is the armature current.

3.1 Luenberger Observer Design

Considering the linear system in (4) and (5) and comparing it with the state space model
in (2) and (3), the system matrices A, B and C can be identified.

$$\dot{x}(t) = Ax(t) + Bu(t) \qquad (4)$$

$$y(t) = Cx(t) \qquad (5)$$

where $A \in \mathbb{R}^{n \times n}$ is the system matrix, $B \in \mathbb{R}^{n \times r}$ is the input matrix, $u \in \mathbb{R}^{r}$ is the control input that satisfies the Sufficiently Rich (SR) condition to guarantee the Persistently Excited (PE) condition defined later, $y \in \mathbb{R}^{q}$ is the output of the system, and $C \in \mathbb{R}^{q \times n}$ is the corresponding output matrix. The observability of the system can be determined by using the observability matrix $O$ in (6).
$$O = \begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \\ CA^{i-1} \end{bmatrix} \qquad (6)$$

The Luenberger observer is formulated as (7). The observer gain $L$ can be designed by using the pole placement method.

$$\dot{\hat{x}}(t) = A\hat{x}(t) + Bu(t) + L\left(y - C\hat{x}(t)\right) \qquad (7)$$

where $\hat{x} \in \mathbb{R}^{n}$ is the estimated state vector.
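As an illustration of this design step, the following MATLAB sketch builds the DC motor state-space matrices of (2) and (3), checks observability via (6), and computes the observer gain $L$ by pole placement. The motor parameter values are the ones listed later in Table 1, while the chosen observer pole locations are assumptions for illustration and are not taken from the paper.

```matlab
% Illustrative MATLAB sketch: Luenberger observer gain via pole placement.
% Motor parameters from Table 1; observer poles below are assumed values.
Ra = 1; La = 1e-3; Bm = 1e-4; Jm = 5e-3; Km = 0.1;

A = [-Ra/La  0  -Km/La;
      0      0   1;
      Km/Jm  0  -Bm/Jm];
B = [1/La; 0; 0];
C = [0 1 0];

% Observability check using the observability matrix O of (6)
O = obsv(A, C);
assert(rank(O) == size(A, 1), 'System is not observable');

% Observer gain by pole placement (example pole locations, assumed)
obsPoles = [-500 -550 -600];
L = place(A', C', obsPoles)';   % duality: observer design as state feedback on (A', C')
```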



3.2 Adaptive Observer Design

Considering the linear system in (4) and (5), a fault $f(t)$ is added to the output equation
to represent the sensor fault, and the system becomes

$$y(t) = Cx(t) + f(t) \qquad (8)$$

The fault signal is represented in a linear regression such that

$$f(t) = w(t)\,q(t) \qquad (9)$$

where $w(t) = \left[w_1(t), \ldots, w_p(t)\right] \in \mathbb{R}^{q \times p}$ are the regressors and $q(t) = \left[q_1(t), \ldots, q_p(t)\right]^{T} \in \mathbb{R}^{p}$ are the unknown coefficients of the regressors. This model comes from the physical knowledge of the possible faults [11]. Let the signal $w(t)$ be filtered through the filter

$$\dot{Y}(t) = \left[A - KC\right] Y(t) - K w(t) \qquad (10)$$

$$\Omega(t) = C Y(t) + w(t) \qquad (11)$$

$Y(t)$ and $\Omega(t)$ are the state and output of the filter, respectively. Assume that $w(t)$ is persistently exciting, so that the filtered signal $\Omega(t)$ satisfies the following inequality for $t \geq t_0$ and some positive constants $a$, $T$, where $I_q \in \mathbb{R}^{q \times q}$ is the $q \times q$ identity matrix [11].

$$\int_{t}^{t+T} \Omega^{T}(s)\,\Omega(s)\,ds \geq a I_q \qquad (12)$$

Thus, the adaptive observer can be formulated as follows, where $\Gamma$ is a positive definite gain matrix [11].

$$\dot{Y}(t) = \left[A - KC\right] Y(t) - K w(t) \qquad (13)$$

$$\dot{\hat{x}}(t) = A\hat{x}(t) + Bu(t) + K\left[y(t) - C\hat{x}(t) - w(t)\hat{q}(t)\right] + Y(t)\,\dot{\hat{q}}(t) \qquad (14)$$

$$\dot{\hat{q}}(t) = \Gamma\left[C Y(t) + w(t)\right]^{T}\left[y(t) - C\hat{x}(t) - w(t)\hat{q}(t)\right] \qquad (15)$$

By considering the state space model in (4) and (8), $K$ can be designed using the pole placement method.
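For concreteness, a minimal MATLAB sketch of one forward-Euler integration step of the adaptive observer (13)-(15) is shown below. The step size, function name, and the way the signals $u$, $y$ and $w$ are supplied are illustrative assumptions; the paper implements these equations as a Simulink block diagram rather than code.

```matlab
% One forward-Euler step of the adaptive observer (13)-(15) (illustrative sketch).
% A, B, C are the model matrices, K the observer gain, Gamma the adaptation gain;
% xhat, Y, qhat are the current estimates; u, y, w are the current signal samples.
function [xhat, Y, qhat] = adaptive_observer_step(xhat, Y, qhat, u, y, w, A, B, C, K, Gamma, dt)
    qhat_dot = Gamma * (C*Y + w)' * (y - C*xhat - w*qhat);          % Eq. (15)
    xhat_dot = A*xhat + B*u + K*(y - C*xhat - w*qhat) + Y*qhat_dot; % Eq. (14)
    Y_dot    = (A - K*C)*Y - K*w;                                   % Eq. (13)

    % Euler update with step size dt
    xhat = xhat + dt*xhat_dot;
    Y    = Y    + dt*Y_dot;
    qhat = qhat + dt*qhat_dot;
end
```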

3.3 Encoder Fault Signal Modelling

Two types of fault are modelled. Among the mechanical causes, loose mounting of the
encoder may result in a random error signal [3]. Therefore, one of the methods to
model the encoder fault signal is by adding a noise signal at the output of the plant.
This can easily be done in Simulink by adding the Signal Generator block from
the library to generate a random signal, as discussed in the next section.
Next, among the electronic causes, if one of the two channels of the quadrature encoder is
malfunctioning and not delivering signals, the number of counted edges is reduced to
half of that of the healthy one [16]. This causes the resulting output to become half of the
actual output. To represent this fault, the output of the state space model of the DC
motor is multiplied by a gain of 0.5.
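As a rough illustration of the two fault models, the sketch below corrupts a sampled output signal with (i) an additive random error and (ii) a multiplicative gain of 0.5. The example signal, sampling and amplitude values are assumptions for illustration; in the paper these faults are injected through Simulink blocks rather than code.

```matlab
% Illustrative encoder fault injection on a sampled output signal (assumed example signal).
t = 0:0.001:2;                 % time vector (assumed sampling)
y = sin(2*pi*0.5*t);           % example healthy position output (illustrative)

rng(0);                                           % reproducible random fault
y_random_fault = y + 1.0*(2*rand(size(y)) - 1);   % additive random error, amplitude 1
y_gain_fault   = 0.5*y;                           % gain fault: one channel lost, output halved
```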

3.4 Simulations in MATLAB Simulink

MATLAB Simulink is used to construct the model of the dynamic system, observer
and the encoder fault model, and is used to simulate the results. To model the whole
system in the simulation, the parameters of the DC motor in Table 1 are used. First,
the Luenberger observer is modelled. A MATLAB source file that calculates the
system matrices of the DC motor and also the observer gain matrix is run. Then, the
block diagram of the system and the observer are built in Simulink as shown in
Fig. 1 for a healthy system. For the random error signal, a Signal Generator block is
added to the x2 signal before it is fed to the observer. The parameters of the block
are set to generate a random waveform with an amplitude equal to 1 and a frequency of
10 Hz. For the gain fault, the x2 signal passes through a gain of 0.5 before entering
the observer. The Scope blocks are used to show the simulated signals for each
state and compare them with the actual signals.
Next, the adaptive observer is modelled following a procedure similar to the simulations
for the Luenberger observer. The MATLAB code is used to load the workspace
with the appropriate parameters and to calculate the gain matrix K. Then, the block
diagram shown in Fig. 2 is built for the observer without faults. The addition of faults
into the system is similar to that of the Luenberger observer. As the system is more
complex than the Luenberger observer, it is divided into four subsystems, three for
Table 1 DC motor parameters for simulation purposes

Parameters                      Values
Armature resistance, Ra         1 Ω
Electric inductance, La         1 × 10−3 H
Frictional coefficient, Bm      1 × 10−4 N m s
Moment of inertia, Jm           5 × 10−3 kg m2
Motor constant, Km              0.1 N m/A

Fig. 1 Luenberger observer

Fig. 2 Adaptive observer

each equation of (13), (14) and (15), and one that generates the regressor w(t),
which is used to estimate the fault. It is based on a Fourier series with four
frequency terms. A low-pass filter is added after x is generated to obtain a better
result.

4 Results and Discussions

The observability check of the system is done and it shows that the system is
observable. The system matrices and the observer gain were calculated as follows.
$$A = \begin{bmatrix} -1000 & 0 & -100 \\ 0 & 0 & 1 \\ 20 & 0 & -0.02 \end{bmatrix}, \quad
B = \begin{bmatrix} 1000 \\ 0 \\ 0 \end{bmatrix}, \quad
C = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix} \qquad (16)$$

$$K = L = \begin{bmatrix} 12419998 \\ 199.98 \\ 310496.0004 \end{bmatrix} \qquad (17)$$

For the Luenberger observer, the input signal of the simulation is generated as
a square wave with an amplitude of −1 V and a frequency of 1 Hz. The graphical results of
the simulation for the three states (current, position and speed) without fault are
shown in Fig. 3. Besides that, Fig. 4 shows the results for the system with the random error
signal fault, and Fig. 5 shows the results for the system with the gain fault. The random
error signal is generated using a signal generator that produces a random signal with an
amplitude of 1 and a frequency of 10 Hz. From these results, we can see that the
Fig. 3 Actual and estimated states for Luenberger observer with no fault

Fig. 4 Actual and estimated states for Luenberger observer with random error signal fault

Fig. 5 Actual and estimated states for Luenberger observer with gain fault

Luenberger observer does very well in estimating the states when no fault or noise
occurs. However, the output of the Luenberger observer is corrupted by the fault
signal when the faulty signal is fed into the observer: the estimated output signal
follows the faulty signal exactly.
For the adaptive observer, the input signal used is a square wave with an amplitude
of −1 V and a frequency of 0.5 Hz. The gain matrix $\Gamma$ is set to $20I_8$, and the fault
regression used takes the form of a Fourier series with four frequency terms,
as follows.

$$\begin{aligned}
f(t) = w(t)\,q &= q_1 \cos 100\pi t + q_2 \cos 200\pi t + q_3 \cos 400\pi t + q_4 \cos 800\pi t \\
&\quad + q_5 \sin 100\pi t + q_6 \sin 200\pi t + q_7 \sin 400\pi t + q_8 \sin 800\pi t
\end{aligned} \qquad (18)$$
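A small MATLAB sketch of the corresponding regressor vector w(t) used in (18) is shown below; the variable names are illustrative assumptions, and in the paper the regressor is generated by a dedicated Simulink subsystem.

```matlab
% Regressor row vector w(t) for the fault model (18): four cosine and four sine terms.
regressor = @(t) [cos(100*pi*t), cos(200*pi*t), cos(400*pi*t), cos(800*pi*t), ...
                  sin(100*pi*t), sin(200*pi*t), sin(400*pi*t), sin(800*pi*t)];

% Example: evaluate the estimated fault f(t) = w(t)*qhat at t = 0.01 s
qhat  = zeros(8, 1);              % placeholder coefficient estimates (assumed)
f_est = regressor(0.01) * qhat;
```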

The results of the simulation for the system with no fault are shown in Fig. 6.
Then, Fig. 7 shows the results for the system with random error signal fault and
Fig. 8 for the system with gain fault. From the results, we can see that for the system
with no fault, the estimated states follow the actual states with a slight delay due to
the low-pass filter. The use of a low-pass filter makes the estimation slower, as
mentioned in [11]. This result is acceptable and almost the same as with the

Fig. 6 Actual and estimated states for adaptive observer with no fault

Fig. 7 Actual and estimated states for adaptive observer with random error signal fault

Fig. 8 Actual and estimated states for adaptive observer with gain fault

Luenberger observer. However, the response was much better than that of the
Luenberger observer for the random error signal fault: the estimated states did not
follow the faulty signal exactly, and tried to converge to the actual values. The results
for the gain fault were not good, as they contained a problem similar to that of the
Luenberger observer, in which the estimated states followed the faulty states; the
estimated states were also delayed due to the low-pass filter. This may be because the
adaptive law is not suitable for detecting a gain fault.

5 Conclusions

In this research, sensor fault detection was studied using a linear observer and an
adaptive observer. A linear observer and an adaptive observer were designed and
applied to fault detection. Before designing the observers, the DC motor system was
modelled and the observability of the system was checked. In designing the
Luenberger observer, the pole placement method was used to design the observer gain
matrix. The adaptive observer was designed by modifying the Luenberger
observer, taking the fault in the system into account. By estimating the fault in the system,
the adaptive observer can reduce or eliminate the effect of the fault and thus
estimate the actual states.
The encoder fault was studied and its effect on the output signal was investigated.
Two types of encoder fault were modelled and applied in the simulations of
the fault detection system. Improper and loose mounting of the encoder, which may
lead to a random error signal fault, was modelled using a noise or random waveform
signal. The other encoder fault, the gain fault, was represented using a gain of
0.5 at the output of the motor system. From the simulations, the linear observer can
estimate the states very well in the absence of a fault signal, but fails when a fault is
present. The adaptive observer can estimate the states well both in the ideal system
and in the random error fault system, but fails to detect the gain fault. Based on this
analysis, a modification of the adaptive observer is required to overcome the issue of
the gain fault. This will be the focus of the next research investigation.

Acknowledgements The authors would like to thank Universiti Sains Malaysia for providing
space and software tool in conducting the research. This research is also partially supported by the
USM RUI Grant: 1001/PELECT/8014093.

References

1. Tjahjono B, Esplugues C, Ares E, Pelaez G (2017) What does Industry 4.0 mean to supply
chain? Procedia Manuf 13:1175–1182
2. Wang S, Wan J, Li D, Zhang C (2016) Implementing smart factory of Industry 4.0: an
outlook. Int J Distrib Sensor Netw 12(1):3159805

3. Bourogaoui M, Jlassi I, El Khil SK, Sethom HBA (2015) An effective encoder fault detection
in PMSM drives at different speed ranges. In: 2015 IEEE 10th international symposium on
diagnostics for electrical machines, power electronics and drives (SDEMPED), Guarda,
pp 90–96
4. Zhu J, Yang G (2018) Robust distributed fault estimation for a network of dynamical systems.
IEEE Trans Control Netw Syst 5(1):14–22
5. Liu C et al (2017) A state-compensation extended state observer for model predictive control.
Euro J Control 36:1–9
6. Tun HM, Aung W (2014) Analysis of control system for A 24 V PM brushed DC motor fitted
with an encoder by supplying H-bridge converter. Bahria Univ J Inf Commun Technol 7
(1):54–67
7. Gao Z, Cecati C, Ding SX (2015) A survey of fault diagnosis and fault-tolerant
techniques-part I: fault diagnosis with model-based and signal-based approaches. IEEE
Trans Industr Electron 62(6):3757–3767
8. Bo L, Tao P, Lu S, Ze-zhou H, Chao Y (2016) Multi fault diagnosis of traction motor current
sensor based on state observer. In: 2016 Chinese control and decision conference (CCDC),
Yinchuan, pp 7058–7063
9. Zhang H, Wang J (2016) Adaptive sliding-mode observer design for a selective catalytic
reduction system of ground-vehicle diesel engines. IEEE/ASME Trans Mechatron 21
(4):2027–2038
10. Oliva-Fonseca P, Rueda-Escobedo JG, Moreno JA (2016) Fixed-time adaptive observer for
linear time-invariant systems. In: 2016 IEEE 55th conference on decision and control (CDC),
Las Vegas, NV, pp 1267–1272
11. Zhang Q (2005) An adaptive observer for sensor fault estimation in linear time varying
systems. IFAC Proc Vol 38(1):137–142
12. Xia J, Guo Y, Dai B, Zhang X (2017) Sensor fault diagnosis and system reconfiguration
approach for an electric traction PWM rectifier based on sliding mode observer. IEEE Trans
Ind Appl 53(5):4768–4778
13. Baldi P, Blanke M, Castaldi P, Mimmo N, Simani S (2018) Fault diagnosis for satellite
sensors and actuators using nonlinear geometric approach and adaptive observers. Int J
Robust Nonlinear Control 29:1–27
14. Li L, Ding SX, Yang Y, Peng K, Qiu J (2018) A fault detection approach for nonlinear
systems based on data-driven realizations of fuzzy kernel representations. IEEE Trans Fuzzy
Syst 26(4):1800–1812
15. Kabir S (2017) An overview of fault tree analysis and its application in model based
dependability analysis. Expert Syst Appl 77:114–135
16. Damdoum A, Berriri H, Slama-Belkhodja I (2012) Detection of faulty incremental encoder in
a DFIM-based variable speed pump-turbine unit. In: 2012 16th IEEE mediterranean
electrotechnical conference, Yasmine Hammamet, pp 1151–1154
Water Level Classification for Flood
Monitoring System Using Convolutional
Neural Network

J. L. Gan and W. Zailah

Abstract This project aims to propose a new water level classification model for
the flood monitoring system by integrating it with an Artificial Intelligence technology,
the Convolutional Neural Network. Various image pre-processing and data
augmentation techniques have been applied in order to increase the dataset from
one image to 300 images that are able to imitate the real images captured by a
camera. The images have undergone transfer learning for weight initialization with
fine-tuning, as well as training from scratch, in order to compare their results and finalize
the most suitable optimizer, initial learning rate and batch size for this application.
The results show that by using pretrained AlexNet with the Adam optimizer, a
0.0001 initial learning rate and a batch size of 16, the validation accuracy is able to
reach 100% at the ninth epoch and shows high stability and consistency for both
training and validation accuracies. Besides, when the model undergoes testing with
15 new images, it is able to obtain a full score for 14 images, and the average testing
accuracy is as high as 99.72%. The model has outperformed previous work
done by other researchers. In conclusion, this project has contributed to improving
the safety of the community by successfully creating a trustworthy and robust water
level classification model that is able to detect the water level, analyze its risk and
display the information by using a camera, which is safer, more durable and more suitable
to be placed in a flood-prone area.

Keywords Convolutional Neural Network · Flood monitoring system · Image classification

J. L. Gan
Department of Mechanical Engineering, Faculty of Engineering, UCSI University, Kuala Lumpur, Malaysia
W. Zailah
Department of Mechatronic Engineering, Faculty of Engineering, UCSI University, Kuala Lumpur, Malaysia


1 Introduction

Flood has been a common issue for many countries across the globe for many
decades, and Malaysia is no exception. As one of the countries located in
Southeast Asia, Malaysia is subject to the monsoon season from November to March
every year. The heavy downpour strongly affects lives in the states of
Kelantan, Terengganu, Pahang and Johor [1]. Flooding caused by overflowing
rivers, high tides and flash floods are the major types of flood in this country.
As flood is an inevitable disaster, many engineers and researchers have been
working on various projects, implementing structural and non-structural measures
in order to mitigate its negative impacts on society, the environment as well as the
economy [1, 2].
As Artificial Intelligence (AI) has gained much greater interest among
researchers in recent years, machine learning, one of the AI applications, is being
widely explored and implemented in various fields. The Convolutional Neural Network
(CNN) falls under the category of supervised learning, as the machine needs to be
taught in order to learn the way to execute certain tasks. CNN is often used for
image classification, object detection, visual saliency detection as well as text
detection and recognition [3, 4]. Compared to other types of neural network, the
input data for a CNN is three-dimensional (3D), representing the width and height
of the image as well as the colour channels, which allows the machine to learn the
full features that exist in the image instead of sacrificing its colour channels and losing
information [5]. CNN has proven its high performance in image classification
by winning the ImageNet Large Scale Visual Recognition Challenge
(ILSVRC) since 2012 with low error rates. Multiple researchers have come up with
different CNN architectures and won the contest with high results [6],
including AlexNet [7], GoogLeNet [8] and ResNet [9].
Even so, CNN has not been widely implemented in the environmental field.
Previous work integrating CNN into disaster management systems has only used it
to detect regions where flooding has occurred, instead of monitoring the rise of the river
water level to provide early warning, even though this is one of the major types of flood
in Malaysia [5, 10]. Current flood monitoring systems also integrate various ground
sensors to obtain hydrological parameters, radar and satellites for flood mapping, as
well as unmanned aerial vehicles to monitor and observe the disaster area [11].
Malaysia has adopted both flood mapping and flood forecasting and warning systems
to manage the disaster. Many stick gauges, rainfall gauges, river gauges and
water level sensors are used to collect data and monitor the situation at selected
areas [12, 13]. However, physical sensors are rather vulnerable during the disaster.
The flood water during harsh weather is so destructive that it can ruin the structures
to which the sensors are attached [1]. Water sensors are very expensive and are subject
to a high risk of damage as well [14]. In some cases, the physical sensors might stop
working at the start of the flood. Unfortunately, they cannot be moved away from the
disaster area, as the water level sensor used is uniaxial and must be placed right above
the water to detect the distance of the water from the sensor.

Therefore, this project aims to propose a new methodology which integrates
CNN technology into the flood monitoring system to monitor the river water level
by performing water risk level classification. Rather than using physical sensors that
need to be installed right above the surface of the water, a camera that can be
placed at a higher position, further away from the disaster region, is preferred. The
camera also serves as a closed-circuit television that enables the workers
to monitor the condition of the river, detect the causes of peculiar data collected
and even observe the lives at the disaster area. The objectives of this project are to develop
the water level classification system model, implement the water level analysis
using CNN and evaluate the classification system performance by comparing it with
previous studies.
2 Methodology and Experimental Setup

The project focuses on modelling an active flood monitoring system which is able
to perform data collection, data analysis, data processing, decision-making and
release of useful information to the target audience [11, 15]. The proposed system
consists of three layers, including the data observing layer, as shown in Fig. 1,
whereas the overall workflow, from obtaining and creating the dataset to the final
implementation of the system on the hardware for real-time testing, is shown in
Fig. 2.

2.1 Database Acquisition

Due to the limited database available from the Jabatan Pengairan dan Saliran
(JPS) Malaysia, the main image used in this project, as shown in Fig. 3, is
obtained from an online resource because it is the same type of stick gauge used near
Fig. 1 The proposed system layer with its components



Fig. 2 Overall workflow of the project

Fig. 3 The original image used for this project

the light rapid transit at Masjid Jamek, Kuala Lumpur. At the same time, it is
also assumed that the dataset is obtained in the daytime, right after the rain, when the
visibility is high while the water current is relatively high. The image in Sect. 3.4 is
assumed to be taken in the evening, after the rain, when the visibility is low and the
camera used has lower resolution and needs maintenance.

Fig. 4 Images of low, medium and high risk levels created by using graphics editor

Data mining is performed in this project to create its own dataset, as actual
river images with different water levels could not be obtained from any official sites.
By using professional graphics editor software, 20 images for each of the low, medium
and high risk levels are created from the image above, according to the water
level data provided by JPS for Sungai Kuantan Kajang, which defines 28.5 m as the
alert level (low risk level), 28.68 m as the warning level (medium risk level) and
29.10 m as the danger level (high risk level) [2].
Besides, the images are also cropped into a square shape in order to match
the expected input of the CNN architecture used in this project, so that when they are
resized automatically by the machine algorithm, the images are not stretched and do
not deviate far from the real-life situation. Each image is then labelled with "l", "m" or
"h", indicating its water risk level, and the folders are renamed as "low", "med" and
"high" respectively. With five images from each category kept aside as the testing set,
a total of 60 images are saved inside the stated folders to be used for the subsequent
model training and validation. Samples of the resulting images are shown in Fig. 4.
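A minimal MATLAB sketch of how such a labelled folder structure can be loaded and split is given below; the folder path and split ratio are assumptions for illustration and are not stated in the paper.

```matlab
% Load the labelled image folders (low/med/high) into a datastore (illustrative).
imds = imageDatastore('waterLevelDataset', ...            % assumed root folder name
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');

% Split into training and validation sets (assumed 70/30 split, randomised)
[imdsTrain, imdsVal] = splitEachLabel(imds, 0.7, 'randomized');
```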

2.2 Data Augmentation and Pre-processing

Data augmentation is often used in CNN image processing to increase the amount
of data and avoid overfitting [16]. It is also one of the techniques in CNN optimization.
The data augmentation is applied to the testing and validation datasets so
that the system is able to detect new incoming images taken under various conditions
[17]. In this project, rotation, translation and scaling are used. Furthermore,
in the pre-processing step, the images are also resized to the expected input size of
the CNN architecture using the machine algorithm; in this project, it is 227*227 pixels.
The resulting images from data augmentation are shown in Fig. 5.
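As an illustration, the rotation, translation and scaling operations together with the 227×227 resize can be expressed in MATLAB as below, continuing from the datastore sketch above; the specific augmentation ranges are assumptions, as the paper does not list them.

```matlab
% Rotation, translation and scaling augmentation plus resize to 227x227 (assumed ranges).
augmenter = imageDataAugmenter( ...
    'RandRotation',     [-10 10], ...   % degrees (assumed range)
    'RandXTranslation', [-15 15], ...   % pixels  (assumed range)
    'RandYTranslation', [-15 15], ...
    'RandScale',        [0.9 1.1]);

% Apply the augmenter and the 227x227 resize to the datastores from the earlier sketch
augTrain = augmentedImageDatastore([227 227], imdsTrain, 'DataAugmentation', augmenter);
augVal   = augmentedImageDatastore([227 227], imdsVal,   'DataAugmentation', augmenter);
```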

Fig. 5 Samples of the resulted images from data augmentation

2.3 Details of CNN Workflow

AlexNet is chosen as the architecture used in this project. With eight layers of
depth, this architecture is able to obtain an error rate as low as 15.4%, defeating all
other participants in the ImageNet Challenge in 2012 [7]. AlexNet is a series
network composed of five convolutional layers, three pooling layers and
three fully-connected layers. Two cross channel normalization layers, two dropouts,
seven Rectified Linear Units (ReLU) and a Softmax classifier are used in the
architecture as well. Table 1 below shows the sequence of the layers along with the
settings of each layer, as referred from the MATLAB deep learning toolbox [18].
At this stage, this project aims to investigate the most suitable training hyper-
parameters and type of optimizer for this water level dataset after performing
fine-tuning and to compare the accuracy result of the model trained by different
techniques as well as results generated by other researchers. To find the suitable
hyperparameters for this application, in this project, the chosen architecture under-
goes two different techniques to compare their accuracies, which include transfer
learning for weight initialization combined with fine tuning and training from
scratch. The original pretrained CNN architecture is used for transfer learning, while
reconstruction of the famous CNN architecture is used in training from scratch.
Transfer Learning: Pretrained Network for Weight Initialization and Fine
Tuning. In this technique, the pretrained network is first loaded into the workspace.
Then the input dataset is resized according to the expected input size of
the network, which is 227*227 pixels. Layer transfer needs to be performed, as

Table 1 Details of AlexNet architecture


No. Type Settings
1 Image input Zero center normalization
2 Convolution S = 4, P = 2, K = 96, F = 11*11*3
3 ReLU
4 Cross channel normalization 5 channels/element
5 Max pooling S = 2, P = 0, F = 3*3
6 Convolution S = 1, P = 2, K = 256, F = 5*5*48
7 ReLU
8 Cross channel normalization 5 channels/element
9 Max pooling S = 2, P = 0, F = 3*3
10 Convolution S = 1, P = 1, K = 384, F = 3*3*256
11 ReLU
12 Convolution S = 1, P = 1, K = 384, F = 3*3*192
13 ReLU
14 Convolution S = 1, P = [1 1 1 1], K = 256, F = 3*3*192
15 ReLU
16 Max pooling S = 2, P = [0 0 0 0], F = 3*3
17 Fully-connected
18 ReLU
19 Dropout 50%
20 Fully-connected
21 ReLU
22 Dropout 50%
23 Fully-connected
24 Softmax
25 Classification output Cross-entropy
* Note S = stride, P = padding, K = number of filters, F = filter size, W = weights, B = bias

the number of classes in this project is only three, instead of the original 1000. The
last fully connected layer, Softmax layer and classification layer of the architecture
are being replaced by a three-output fully connected layer, a new Softmax layer and
a new classification layer. There are two common ways to fine tune the model in
order to obtain the most suitable hyperparameters for the dataset. As one being trial
and error and another is to analyze by observing the graph. In this stage, trial and
error is used as the second method requires more experience to be able to perform
well. Therefore, the pretrained network undergoes trial and error in order to obtain
the most suitable learning rate, batch size and type of optimizer for this application.
Then, the set of hyperparameters finalized in this stage are being used for the
second technique, which is training from scratch. The details are stated in the next
section. The training options used for this part are referring from the works done by
[19]. The best result generated by the fine-tuned model in this section is then being
selected to compare with the results generated in the other technique in order to

finalize the type of technique, optimizer and values of hyperparameters for the
proposed flood monitoring system as they are proven to be the most suitable to be
implemented.
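A minimal MATLAB sketch of the layer replacement step described above is shown below; it is an illustrative reading of the procedure, not the authors' exact script.

```matlab
% Transfer learning: replace the last three layers of pretrained AlexNet (illustrative).
net = alexnet;                          % requires the AlexNet support package
layersTransfer = net.Layers(1:end-3);   % keep all layers except the last three

numClasses = 3;                         % low, medium and high risk levels
layers = [
    layersTransfer
    fullyConnectedLayer(numClasses)     % new three-output fully connected layer
    softmaxLayer                        % new Softmax layer
    classificationLayer];               % new classification output layer
```

The resulting layer array can then be passed to the training routine together with the training options discussed in the next section.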
Train from Scratch. The AlexNet architecture is reconstructed from scratch.
Therefore, the parameters in the filters are randomly initialized from Gaussian
distributions, which is the main difference between the new and the pretrained
network. Deep Network Designer is first used to build the architecture,
which is then loaded into MATLAB. The dataset is again resized according to the
input size of the network and trained using the finalized training hyperparameters
obtained in the transfer learning section. The final result is brought forward to be
compared as well.

3 Results and Discussion

In this section, the pretrained AlexNet architecture is first studied to finalize
the hyperparameters to be used. The effects of the type of optimizer,
learning rate and batch size on the model's accuracy and computational time are
also analyzed. The finalized hyperparameters are then loaded into the new
AlexNet. To compare the performance of the pretrained and the new architectures,
the training graphs and the learned features of both architectures are
studied. Last but not least, both architectures are tested with the testing
dataset and compared against previous studies to finalize the model to be used for this system.

3.1 Transfer Learning: Pretrained Network for Weight Initialization and Finetuning

The details of the hyperparameters for the model training at this stage are as shown
in Table 2. These values are referred from the work of [19].
Tables 3 and 4 show the results generated by comparing different optimizers for
different batch sizes and initial learning rates. As many researchers have proposed
different values for these hyperparameters, at this stage, this project aims to
investigate the relationships among those hyperparameters and to decide the most
desirable hyperparameters for this specific case study.
The initial learning rates investigated in this project are 0.001, 0.0001
and 0.00001. The hyphens "–" that appear in the tables show that the particular
results are not valid, for two reasons: either the batch size is too low and incompatible
with the graphics processing unit, or the results show a constant validation accuracy,
which indicates that the optimizer is not able to effectively update and optimize the
weights of the model. Therefore, they are omitted from the results.

Table 2 Hyperparameters set for weight initialization and fine-tuning

Hyperparameters                        SGDM          RMSProp       Adam
Momentum                               0.9           –             –
Max epochs                             12            12            12
Learning rate drop period (epochs)     6             6             6
Weight decay                           0.0001        0.0001        0.0001
Gradient decay factor                  –             –             0.9
Squared gradient decay factor          –             0.999         0.999
Shuffle                                Every epoch   Every epoch   Every epoch
Validation frequency                   3             3             3

The results show that the SGDM optimizer is able to work well for a wide range of
training options, while RMSProp only works when the initial learning rate is
0.00001 and Adam only works well with a low batch size and a low learning rate.
Next, among all the trials, the SGDM optimizer is able to obtain an average validation
accuracy as high as 99.11% with a 0.0001 initial learning rate and a batch size of
16 in 87.33 s. The Adam optimizer obtained the second highest average validation
accuracy with both 0.0001 and 0.00001 initial learning rates and a batch size of 16, in
115.67 s and 110.67 s respectively. The performance of RMSProp is slightly lower
compared to the other two optimizers; its best result is the third highest, obtaining
98.22% average validation accuracy with a 0.00001 learning rate and a batch size of
16 in 117.33 s. The results show that each optimizer performs differently at
different settings. However, in general, all of them work better at lower batch sizes
and lower learning rates.
The results in Tables 3 and 4 are plotted as graphs of average accuracy
and average computational time against batch size, as shown in Figs. 6 and 7. Based
on Fig. 6, when the batch size increases, the accuracies of all the optimizers
decrease. Besides, when the SGDM optimizer is used, at the same batch size the
accuracy is higher for a higher initial learning rate and lower for a lower initial
learning rate, except when the batch size of 128 is used by SGDM with a 0.0001
learning rate. It also shows that the accuracy of SGDM with a 0.0001 learning rate
and a batch size of 16 is the highest, while the same optimizer
with a batch size of 128 obtained the lowest accuracies in this simulation.
Therefore, large batch sizes might have to be avoided when the available dataset is
small, as in this project.
Overall, Fig. 7 shows that the computational time needed when SGDM is used
is the shortest. With RMSProp being the second fastest, Adam requires the most
computational time. The reason might be that the formula used
by Adam is relatively more complicated than those of SGDM and RMSProp. It can
also be observed that the gradients from batch size 16 to 32 for all the optimizers
are much steeper than the gradients at other points in the graph. Considering the
SGDM results, the gradients remain relatively similar across the different learning rates.

Table 3 Validation accuracy (%) for different optimizers, initial learning rates and batch sizes (three trials, average in parentheses; "–" = not valid)

Initial learning rate 0.001:
  Batch 16:  SGDM –;  RMSProp –;  Adam –
  Batch 32:  SGDM 98.67/96.00/98.67 (97.78);  RMSProp –;  Adam –
  Batch 64:  SGDM 96.00/97.33/97.33 (96.89);  RMSProp –;  Adam –
  Batch 128: SGDM 86.67/86.67/88.00 (87.11);  RMSProp –;  Adam –
Initial learning rate 0.0001:
  Batch 16:  SGDM 100.00/98.67/98.67 (99.11);  RMSProp –;  Adam 100.00/98.67/97.33 (98.67)
  Batch 32:  SGDM 93.33/97.33/100.00 (96.89);  RMSProp –;  Adam 98.67/93.33/94.67 (95.56)
  Batch 64:  SGDM 88.00/94.67/96.00 (92.89);  RMSProp –;  Adam –
  Batch 128: SGDM 85.33/89.33/90.67 (88.44);  RMSProp –;  Adam –
Initial learning rate 0.00001:
  Batch 16:  SGDM 96.00/97.33/96.00 (96.44);  RMSProp 96.00/98.67/100.00 (98.22);  Adam 100.00/98.67/97.33 (98.67)
  Batch 32:  SGDM 96.00/97.33/92.00 (95.11);  RMSProp 97.33/96.00/93.33 (95.55);  Adam 98.67/97.33/96.00 (97.33)
  Batch 64:  SGDM 94.67/90.67/92.00 (92.45);  RMSProp 93.33/93.33/96.00 (94.22);  Adam –
  Batch 128: SGDM –;  RMSProp –;  Adam –

Table 4 Computational time (s) for different optimizers, initial learning rates and batch sizes (three trials, average in parentheses; "–" = not valid)

Initial learning rate 0.001:
  Batch 16:  SGDM –;  RMSProp –;  Adam –
  Batch 32:  SGDM 64/69/68 (67.00);  RMSProp –;  Adam –
  Batch 64:  SGDM 53/54/56 (54.33);  RMSProp –;  Adam –
  Batch 128: SGDM 51/49/49 (49.67);  RMSProp –;  Adam –
Initial learning rate 0.0001:
  Batch 16:  SGDM 89/88/85 (87.33);  RMSProp –;  Adam 115/119/113 (115.67)
  Batch 32:  SGDM 63/63/62 (62.67);  RMSProp –;  Adam 107/107/109 (109.67)
  Batch 64:  SGDM 49/48/47 (48.00);  RMSProp –;  Adam –
  Batch 128: SGDM 51/53/53 (52.33);  RMSProp –;  Adam –
Initial learning rate 0.00001:
  Batch 16:  SGDM 87/87/89 (87.67);  RMSProp 126/107/119 (117.33);  Adam 109/113/110 (110.67)
  Batch 32:  SGDM 62/61/61 (61.33);  RMSProp 74/73/85 (77.33);  Adam 105/106/103 (104.67)
  Batch 64:  SGDM 48/47/48 (47.67);  RMSProp 74/76/75 (75.00);  Adam –
  Batch 128: SGDM –;  RMSProp –;  Adam –

Fig. 6 Graph of average accuracy against batch size

Next, as the batch size increases, the computational time decreases in all cases
except for SGDM with a 0.00001 learning rate and a batch size of 128. The computational
time shows a significant increment at that point, even exceeding the computational
time needed for SGDM with a higher learning rate. The data at that point are
inaccurate, because a higher batch size is supposed to eventually lead to a lower
computational time, since the optimizer requires fewer steps to observe

Fig. 7 Graph of average computational time against batch size

the entire training set [20]. Therefore, that particular result is not considered
in the selection for the next stage.
The graphs in Fig. 8 show the training progresses of the top four results
(highlighted) obtained in Table 3 which are SGDM with 0.0001 learning rate,

Fig. 8 From left to right, top to bottom are the training progress of SGDM, RMSProp, Adam with
0.0001 and Adam with 0.00001 learning rates

RMSProp with 0.00001 learning rate, and Adam with 0.0001 and 0.00001 learning
rates. It is observed that the noise level in the training progress is the highest for
SGDM and the lowest for Adam. The figures show that SGDM and
Adam with 0.00001 learning rate are able to reach 100% validation accuracy at
the sixth epoch, while RMSProp and Adam with 0.0001 learning rate reach it at the ninth
epoch. Even though SGDM requires the least computational time for the
whole training and is able to reach full accuracy at the lowest number of epochs, its
training graph shows unsteadiness even upon termination. On the other
hand, even though Adam with 0.0001 learning rate requires the longest
computational time for training and reaches 100% validation accuracy three
epochs later than SGDM, both its training and validation graphs show high
consistency at 100% accuracy from the ninth epoch onwards. Adam with 0.00001
learning rate shows the smoothest training progress, but it is not selected
for the same consistency reason.
Therefore, with regard to the transfer learning technique, it is concluded that
the pretrained model performs most effectively and efficiently with the Adam
optimizer, a 0.0001 learning rate and a batch size of 16, because both its training
and validation results converge within an adequate time and show high
consistency and accuracy.
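For the finalized setting (Adam, initial learning rate 0.0001, mini-batch size 16, and the remaining values of Table 2), the training call might look as follows, continuing from the layer and datastore sketches above. This is a sketch under those assumptions rather than the authors' exact code; the learning rate drop factor is the MATLAB default and is assumed here.

```matlab
% Training options for the finalized configuration (Adam, LR 1e-4, batch 16), per Table 2.
options = trainingOptions('adam', ...
    'InitialLearnRate',           1e-4, ...
    'MiniBatchSize',              16, ...
    'MaxEpochs',                  12, ...
    'LearnRateSchedule',          'piecewise', ...
    'LearnRateDropPeriod',        6, ...
    'LearnRateDropFactor',        0.1, ...    % default drop factor (assumed)
    'L2Regularization',           1e-4, ...   % weight decay
    'GradientDecayFactor',        0.9, ...
    'SquaredGradientDecayFactor', 0.999, ...
    'Shuffle',                    'every-epoch', ...
    'ValidationData',             augVal, ...
    'ValidationFrequency',        3, ...
    'Plots',                      'training-progress');

trainedNet = trainNetwork(augTrain, layers, options);
```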

3.2 Training from Scratch

The same finalized hyperparameters obtained from Sect. 3.1 are used to train
the new AlexNet. Figure 9 shows the training progress of the model. The figure
shows that the fluctuation at the beginning of training is higher than that seen in
the previous section. The fluctuation keeps decreasing until the eighth epoch,
which is one epoch after the learning rate drops to 0.00001. However, the training
graph (bright blue line) is still fluctuating upon training termination. Nonetheless,
the figure shows that without pretraining, the model is still able to yield 100%
validation accuracy in 111 s. It does not experience underfitting or overfitting
throughout the training. The graph also shows that the model reaches full
accuracy at the eighth epoch; therefore, the number of epochs used in this case can
be decreased. The resulting model is brought forward to the following sections
to compare the features learned by the models trained with the different techniques
and also to compare their results on the testing dataset.

Fig. 9 Training progress for training from scratch

3.3 Extracted Features

Figure 10 shows the features extracted from different layers of the model obtained
from the pretrained AlexNet after performing transfer learning. From left to
right and from top to bottom are the first five convolutional layers and the subsequent
three fully-connected layers. The last three images also indicate the features
learned for the high, low and medium water levels respectively. The features from the
first convolutional layer are rather simple compared to the other layers because, according to
the working principle of CNN, the features in subsequent layers are learned from
the features in the previous layer. Therefore, a deeper layer is able to extract more
meaningful features. Besides, the complexity of the features is due to the pretrained
network having been trained on 1.2 million images from 1000 categories, so the newly
learned features of the stick gauge are not noticeable in the figure. In spite of that, this
does not affect its performance in classifying water levels.
On the other hand, Fig. 11 shows the features extracted from the newly trained
AlexNet. Compared to the previous figure, the extracted features of this model
are more obscure because the model has not been pretrained on the huge database but
only on 300 images from three categories. Therefore, the features extracted
might not be as clear as the ones from the pretrained network. However, starting
from the third convolutional layer, the shape of the stick gauge and its numbers
become more noticeable compared to the pretrained network.

Fig. 10 Features extracted from the pretrained AlexNet

The characteristics and the quality of the features extracted from the two models above
differ from each other. However, in order to verify the models' performance, the
reserved testing dataset is used. The results are shown in the following section.

3.4 Performance on Testing Dataset

The models selected from Sects. 3.1 and 3.2 are used to predict the labels of
the testing dataset. The results are tabulated in Table 5. The pretrained model is able
to obtain 100% testing accuracy on almost all the images across the different categories,
except for the second image of the high risk category, shown in Fig. 12, for which the
predicted accuracy is 95.78%. On the other hand, when the new AlexNet is used,
the same image is unable to obtain 100% prediction accuracy either. It is observed

Fig. 11 Features extracted from the new AlexNet

Table 5 Prediction accuracy (%) on the testing dataset (the average is taken over all 15 test images of each network)

Type of network      Image category   Image 1   Image 2   Image 3   Image 4   Image 5   Average
Pretrained AlexNet   High             100       95.78     100       100       100       99.72
Pretrained AlexNet   Med              100       100       100       100       100
Pretrained AlexNet   Low              100       100       100       100       100
New AlexNet          High             99.98     94.53     99.90     100       100       99.32
New AlexNet          Med              99.81     99.96     99.86     98.48     97.86
New AlexNet          Low              99.96     99.69     99.89     99.95     99.98

Fig. 12 Second image for the high risk category

that the water level is at 29.1 m, which is at the border of the medium risk level. After
checking through the dataset, it is found that the lowest water level in the high-risk
category is 29.12 m while the highest water level in the medium-risk category is
29.06 m. Insufficient data were used to train the model, which causes the model
to be unable to fully distinguish between the two risk levels. Therefore, more
data are required to train the model so that it is able to learn to classify over the full
range of water risk levels. Although the new AlexNet obtained a lower average
prediction accuracy in the testing results, its individual results are only slightly lower
than those of the pretrained network.
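A small MATLAB sketch of how such per-image predictions and class scores can be obtained from a trained network is shown below; the testing folder name is an assumption and the network variable is carried over from the earlier training sketch, since the paper reports the scores directly rather than the code.

```matlab
% Classify the reserved testing images with a trained network (illustrative).
imdsTest = imageDatastore('waterLevelTest', ...            % assumed testing folder
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');
augTest  = augmentedImageDatastore([227 227], imdsTest);   % resize only, no augmentation

[predLabels, scores] = classify(trainedNet, augTest);      % per-image labels and class scores
accuracy = mean(predLabels == imdsTest.Labels);            % overall testing accuracy
```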
In addition, the testing continues in order to obtain more information on the robustness of
the models by feeding them images of the testing dataset taken through a camera, as in the
system proposed in Sect. 2. In this experiment, the difference between the performance
of the two models is enlarged: the pretrained AlexNet outperforms
the new AlexNet by 27.21%, as shown in Table 6. The results also show
that the overall prediction accuracy drops when a webcam is used because,
contrary to the normal images used in the previous sections, the images obtained
through the webcam are blurrier and have lower intensity, as shown in Fig. 13.

Table 6 Prediction accuracy (%) on the testing dataset using a webcam (the average is taken over all 15 test images of each network)

Type of network      Image category   Image 1   Image 2   Image 3   Image 4   Image 5   Average
Pretrained AlexNet   High             99.97     64.01     93.08     100       100       96.98
Pretrained AlexNet   Low              100       100       100       100       100
Pretrained AlexNet   Med              99.46     98.16     99.95     100       100
New AlexNet          High             24.16     83.49     43.66     87.95     38.38     69.77
New AlexNet          Low              99.95     100       100       99.98     99.99
New AlexNet          Med              100       0.07      0.45      68.52     99.95

Table 7 Comparison of current results with previous works

Researcher                                 Architecture   Accuracy
Current results (transfer learning)        AlexNet        99.72%
Current results (training from scratch)    AlexNet        99.32%
Amit and Aoki [5]                          AlexNet        83%, 89%
Cirneanu and Popescu [21]                  CNN            95%

Fig. 13 Sample of image taken by a webcam

Lastly, the results have been compared with previous works related to flood
monitoring or detection systems. Note that the results obtained with
the webcam are not considered in the final comparison, because that testing
was not done according to the same testing procedure as used by other
researchers. Based on the tabulated results, Amit and Aoki [5] trained the
machine to detect the disaster region using aerial images, while Cirneanu and
Popescu [21] created a simple CNN architecture to classify flooded areas based
on the local binary pattern texture operator. Nonetheless, Table 7 shows that the
results of the current project outperform the others.

4 Conclusion and Recommendation

Based on all the results obtained in this project, the final model chosen for the water
level classification system is the pretrained AlexNet model. This model has proven
its high validation accuracy with the Adam optimizer, a 0.0001 learning rate and a batch
size of 16 during the training stage. Its training progress has shown that the
model is able to reach 100% validation accuracy at the ninth epoch, and the result
remains stable and consistent to the end of the training. Although its average testing
accuracy is only slightly higher than that of the new model, it is noticed that only one result
shows an imperfect score. Therefore, the problem can be easily solved by training

more data at that particular level to increase the model's ability to distinguish the
difference. Furthermore, the accuracies (F-scores) obtained in [5] are 89% and 83% for
two different flood locations, which are much lower compared to the results obtained
by the models in this project. In a nutshell, AlexNet with the Adam optimizer, an initial
learning rate of 0.0001 and a batch size of 16 is the most suitable choice for this
application.
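For readers who wish to reproduce this configuration, the following is a minimal sketch in PyTorch of the reported setting (pretrained AlexNet fine-tuned with the Adam optimizer, learning rate 0.0001 and batch size 16). The original work used MATLAB's Deep Learning Toolbox, so this is only an illustrative equivalent; the three-class layout and variable names are assumptions.

```python
# Illustrative sketch only; the study itself used MATLAB's Deep Learning Toolbox.
import torch.nn as nn
import torch.optim as optim
from torchvision import models

num_classes = 3                                   # assumed: high / medium / low risk
model = models.alexnet(pretrained=True)           # transfer learning from ImageNet
# Replace the last fully-connected layer to match the water-level classes
model.classifier[6] = nn.Linear(model.classifier[6].in_features, num_classes)

optimizer = optim.Adam(model.parameters(), lr=1e-4)   # Adam, initial learning rate 0.0001
criterion = nn.CrossEntropyLoss()
batch_size = 16                                        # mini-batch size as reported
```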
The system can be further improved by applying heavier data augmentation to
create images with different brightness and clarity, imitating images of water level
captured at different times of day and in different weather conditions. Next, in order to ease
rescue work, the system can also be trained to detect living organisms at times of flood.
A platform can also be built on top of this application to better distribute the work
force of the different parties involved in the rescue work.

References

1. Zakaria SF, Zin RM, Mohamad I, Balubaid S, Mydin SH, MRD EMR (2017) The
development of flood map in Malaysia. In: 3rd International Conference on Construction and
Building Engineering (ICONBUILD) 2017. AIP Publishing, Malaysia, pp 1–8
2. Department of Irrigation and Drainage Malaysia Homepage. http://publicinfobanjir.water.gov.
my. Accessed 28 Mar 2019
3. Gu JX, Wang ZH, Kuen J, Ma LY, Shahroudy A, Shuai B, Liu T, Wang XX, Wang G,
Cai JF, Chen TH (2018) Recent advances in convolutional neural networks. Pattern Recogn
77:354–377
4. Hadji I, Wildes RP (2018) What do we understand about convolutional networks?. ArXiv,
Toronto
5. Amit SNK, Aoki Y (2017) Disaster detection from aerial imagery with convolutional neural
network. In: 2017 International Electronics Symposium on Knowledge Creation and
Intelligent Computing (IES-KCIC), pp 239–245
6. Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A,
Bernstein M, Berg AC, Fei-Fei L (2015) ImageNet large scale visual recognition challenge.
Int J Comput Vis 115(3):211–252
7. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolu-
tional neural networks, pp 1–9
8. Szegedy C, Liu W, Jia YQ, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V,
Rabinovich A (2014) Going deeper with convolutions, pp 1–12
9. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In:
The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 770–778
10. Ahamed A, Bolten JD (2017) A MODIS-based automated flood monitoring system for
southeast asia. Int J Appl Earth Obs Geoinf 61:104–117
11. Chen ZQ, Chen NC, Du WY, Gong JY (2018) An active monitoring method for flood events.
Comput Geosci 116:42–52
12. Chan NW (2012) Impacts of disasters and disasters risk management in Malaysia: the case of
floods. In: Economic and Welfare Impacts of Disasters in East Asia and Policy Responses,
pp 503–551
13. Shafiai S, Khalid MS (2016) Flood disaster management in Malaysia: a review of issues of
flood disaster relief during and post-disaster. In: International Soft Science Conference. Future
Academy, United Kingdom, pp 163–170

14. Subramaniam SK, Vigneswara RG, Subramonian S, Hamidon AH (2010) Flood level
indicator and risk warning system for remote location monitoring using Flood Observatory
System. WSEAS Transn Syst Control 5(3):153–163
15. Hannan MA, Zailah W (2012) Image extraction and data collection for solid waste bin
monitoring system. J Appl Sci Res 8(8):3908–3913
16. Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolu-
tional neural networks. Commun ACM 60(6):84–90
17. Yurtsever M, Yurtsever U (2019) Use of a convolutional neural network for the classification
of microbeads in urban wastewater. Chemosphere 216:271–280
18. Deep Learning Toolbox Model for AlexNet Network. https://www.mathworks.com/
matlabcentral/fileexchange/59133-deep-learning-toolbox-model-for-alexnet-network.
Accessed 15 Nov 2019
19. Mahbod A, Schaefer G, Ellinger I, Ecker R, Pitiot A, Wang C (2019) Fusing fine-tuned deep
features for skin lesion classification. Comput Med Imaging Graph 71:19–29
20. Goodfellow I, Bengio Y, Courville A (2016) Deep learning. The MIT Press, London
21. Cirneanu AL, Popescu D (2018) CNN based on LBP for evaluating natural disasters. In: 2018
15th International Conference on Control, Automation, Robotics and Vision (ICARCV).
IEEE, New Jersey, pp 568–573
Evaluation of Back-Side Slits with Sub-millimeter Resolution Using a Differential AMR Probe

M. A. H. P. Zaini, M. M. Saari, N. A. Nadzri, A. M. Halil, A. J. S. Hanifah, and K. Tsukada

Abstract The electromagnetic method of Non-destructive Testing is one of the
approaches in the field of crack detection on metallic samples. One of the techniques
in the electromagnetic method is Eddy Current Testing (ECT), which utilizes the
electromagnetic principle to detect cracks in metallic components. In this research, an ECT
probe is developed that is made up of two AMR sensors, two excitation coils, and a
set/reset circuit. Besides, a digital lock-in amplifier has also been developed using
NI-LabVIEW and a data acquisition (DAQ) card. A measurement system is developed that
incorporates the ECT probe and the digital lock-in amplifier as well as an amplifier
circuit, a power supply, a PC and an XY stage to which the probe is attached. Then,
artificial slits with different depths from 768 µm to 929 µm are created on a galvanized
steel plate sample. The slits are evaluated from the back-side of the galvanized steel plate
via two types of scanning, which are the line scan and full map scanning. From the results
of the line scan, the localization of the slits, as well as the estimation of their depths, could
be performed. Furthermore, a 2-D mapping of the sample from the backside has been
generated. The 2-D map shows that the position of the slits could be estimated, including
their depths.

Keywords Non-destructive testing (NDT) · Eddy Current Testing (ECT) · Anisotropic magnetoresistance (AMR)

M. A. H. P. Zaini (&) · M. M. Saari · N. A. Nadzri
Faculty of Electrical & Electronics Engineering, Universiti Malaysia Pahang, 26600 Pekan, Pahang, Malaysia
e-mail: [email protected]

A. M. Halil · A. J. S. Hanifah
Faculty of Mechanical & Manufacturing Engineering, Universiti Malaysia Pahang, Pekan Campus, 26600 Pekan, Pahang, Malaysia

K. Tsukada
Graduate School of Interdisciplinary Science and Engineering in Health Systems, Okayama University, Okayama 700-8530, Japan


1 Introduction

In order to identify and assess cracks in metallic components, the magnetic method
is considered as one of the approaches in the Non-destructive Test (NDT) where the
method is widely used in industry due to its small cost and straight-forward
operation [1], thanks to its capability to analyze metallic compounds since the
compound is conductive and has powerful magnetic characteristics. Furthermore,
the benefits of the magnetic method are that it is contactless and could provide
real-time inspection compared to other NDT techniques [2–4]. Recently, the elec-
tromagnetic method in NDT is extensively being researched since its emergence
due to the growth of technology.
There are a number of NDT techniques which utilize electromagnetic principles to
detect cracks in metallic components, and one of them is Eddy Current Testing (ECT),
which is widely used for the detection of cracks on metallic samples such as aluminum
plates [5, 6]. ECT can be considered one of the most extensively researched techniques,
especially due to the promising characteristics of eddy currents. In ECT, the lift-off
between the magnetic sensor and the metallic sample can heavily affect the eddy current
signal, thus causing alterations in the eddy current readings [7]. Therefore, in order to
minimize or overcome this effect entirely, a compensation method should be proposed.
In this research, a low-frequency ECT technique is used to allow deeper propagation
of electromagnetic waves in order to induce deeper-penetrating eddy currents [8]. This is
because the generated eddy currents are greatly influenced by the skin depth effect, which
implies that at high frequencies the eddy currents are largely distributed near the surface,
thus limiting their penetration depth.
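As a point of reference (a standard textbook relation, not stated explicitly in this paper), the skin depth that motivates the low-frequency choice is

$$\delta = \frac{1}{\sqrt{\pi f \mu \sigma}}$$

where $f$ is the excitation frequency, $\mu$ the magnetic permeability and $\sigma$ the electrical conductivity of the sample; lowering $f$ increases $\delta$, allowing the induced eddy currents to respond to deeper, back-side defects.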
Furthermore, for this research, a small-sized ECT probe consisting of magnetic
sensors and excitation coils is developed to obtain benefits from its small size,
which allows it to be used in the evaluation of small or complex cracks [9]. There is
a range of magnetic sensors that can be used, such as the induction coil [10],
the Anisotropic Magnetoresistive (AMR) sensor, the Tunnel Magnetoresistance
(TMR) sensor, the Giant Magnetoresistance (GMR) sensor and the Superconducting
Quantum Interference Device (SQUID). Among these, the SQUID is known to be
one of the most sensitive magnetic sensors [11]; however, it must be operated with
complex heat insulation structures for cooling purposes, which makes it difficult to
miniaturize [12, 13]. Therefore, the usage of the AMR sensor is proposed in this research,
as it is compact in size and offers highly sensitive sensing. The small size of the AMR
sensor is advantageous in resolving the higher spatial distribution of eddy currents in
conductive materials [14, 15].
A plate that is made up of galvanized steel is used as the primary sample in this
research. One side of the plate is engraved with four slits with different depths at
sub-millimeter resolution. The ECT probe to be developed aims to analyze those
slits from the backside. Finally, the developed ECT probe is utilized to investigate
the magnetic response characteristic of the artificial slits on the sample.

2 Experimental Setup

2.1 ECT Probe

ECT probes, in general, can be found in different forms that vary in terms of types
and designs. In this research, an ECT probe is developed in order to detect and
evaluate back-side slits on a sample. This ECT probe is designed to be small in size so
that it is more suitable for the detection of small cracks as well as cracks that exist in
complex patterns. In addition, the small probe size can also improve the crack detection
performance.
The ECT probe is fabricated with two magnetic sensors and two excitation coils.
The AMR sensor is chosen as the magnetic sensor for this probe thanks to its highly
sensitive detection and small size, with a dimension of 11 × 4 mm². An AMR sensor
consists of 4 magneto-resistive (MR) elements. The MR elements are arranged in a
Wheatstone bridge connection, as shown in Fig. 1, where each MR element is wound with
a set/reset strap. When the AMR sensor is exposed to a magnetic field, the MR elements
change in resistance, which then causes a change in the potential difference at the node
between the two MR elements.
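As a rough guide (a standard full-bridge result rather than a value taken from this paper), if each MR element changes by $\pm\Delta R$ about its nominal resistance $R$ under an applied field, the bridge output referred to the supply voltage $V_b$ is approximately

$$V_{out} \approx \frac{\Delta R}{R}\,V_b$$

so the field measurement reduces to sensing a small differential voltage, which is subsequently amplified by the instrumentation amplifier.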
However, it is worth mentioning a drawback of the AMR sensor: whenever it is
exposed to a strong magnetic field, the sensor becomes saturated and less sensitive.
Therefore, a set/reset circuit is fabricated to supply high current pulses in order to help the
sensor regain its sensitivity. Two AMR sensors are used instead of one in order to
introduce differential detection, which helps in diminishing background noise. The two
AMR sensors are separated by a baseline of 4 mm.

Fig. 1 The schematic diagram of the AMR sensor (HMC1001) used in this research alongside a set/reset circuit and an instrumentation amplifier (AD8249) connected to it

Fig. 2 The developed ECT system (probe with two AMR sensors S1, S2 and excitation coils, amplifier circuit, DAQ card, power supply and PC; the induced eddy current and its induced magnetic field in the sample are indicated)

The AMR sensors are connected to an amplifier circuit made up of two
instrumentation amplifiers (INAs). Each INA is connected to one AMR sensor, and the
output of the sensor is amplified with a gain of 40 dB. Next, by placing the AMR sensors
between two excitation coils, the stability of the sensors can be significantly enhanced.
Each excitation coil is wound with 0.65-mm magnet wire for 100 turns around a ferrite
core with a diameter of 6 mm and a height of 20 mm, as in Fig. 2.

2.2 Measurement System

A measurement system that incorporates the developed ECT probe is constructed
with a few other components, such as a power supply, an amplifier circuit, a data
acquisition (DAQ) card, an XY stage with a size of 55 cm × 45 cm, as well as a
personal computer (PC) for the analysis of the acquired data, as shown in Fig. 2.
The signal from the AMR sensors is pre-amplified by the amplifier circuit
before it is acquired by the DAQ card (NI-USB6212). The ECT probe is attached to
the XY stage. Then, via NI-LabVIEW, an XY-stage controller virtual instrument
(VI) is created. This allows the XY stage to be controlled by the PC.
Furthermore, as an instrument that can extract a signal from a noisy environment
is needed, a lock-in amplifier (LIA) is used in this research. The LIA is crucial due to
its ability to extract signal amplitudes and phases from a very noisy environment.
Therefore, by using LabVIEW, the VI of a digital LIA is constructed as shown in Fig. 3.
Compared to the analog LIA, the digital LIA excels in terms of size, as it only requires a
DAQ card for data acquisition, which makes the measurement system simpler. Besides
that, a VI that controls the power supply is developed. This allows the measurement to be
done automatically, thus reducing the time taken for each measurement as well as
minimizing human intervention. Then, by combining the XY-stage controller, the
developed digital LIA and the power supply controller, a complete measurement system is
produced.

Fig. 3 Block diagram of the digital LIA that is constructed in NI-LabVIEW VI
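The internals of the digital LIA are not detailed beyond Fig. 3; as a rough illustration of the principle, a digital lock-in multiplies the measured signal by in-phase and quadrature references at the excitation frequency and low-pass filters the products. The sketch below (in Python rather than LabVIEW; the sampling rate, reference frequency and synthetic test signal are assumptions) uses simple averaging as the low-pass stage.

```python
import numpy as np

def digital_lock_in(signal, f_ref, fs):
    """Minimal digital lock-in amplifier sketch.

    Multiplies the measured signal with in-phase and quadrature references
    at f_ref and averages over the record to recover the real (in-phase)
    and imaginary (quadrature) components, amplitude and phase.
    """
    t = np.arange(len(signal)) / fs
    ref_i = np.sin(2 * np.pi * f_ref * t)    # in-phase reference
    ref_q = np.cos(2 * np.pi * f_ref * t)    # quadrature reference
    # Averaging acts as a low-pass filter removing the 2*f_ref component
    x = 2 * np.mean(signal * ref_i)          # real part
    y = 2 * np.mean(signal * ref_q)          # imaginary part
    return x, y, np.hypot(x, y), np.arctan2(y, x)

# Example: 210 Hz excitation sampled at 10 kHz (assumed values)
fs, f_ref = 10_000, 210
t = np.arange(0, 1, 1 / fs)
measured = 0.02 * np.sin(2 * np.pi * f_ref * t + 0.3) + 0.01 * np.random.randn(t.size)
print(digital_lock_in(measured, f_ref, fs))
```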
A 2-mm thick galvanized steel plate is used as the sample for this research. On
one surface of the sample, four artificial slits are fabricated with different depths at
the sub-millimeter resolution as shown in Fig. 4. First, line scans are conducted on
the sample as shown in Fig. 4, with a resolution of 1 mm. The experimental settings
of the line scan are sinusoidal currents with an amplitude of 300 mA and variable
frequencies of 30 Hz, 70 Hz, 90 Hz, 110 Hz, 160 Hz, 210 Hz, 410 Hz, and
510 Hz, used to produce excitation fields via the excitation coils. Then, the
optimum frequency is determined from the results of the line scan. After that, by

Fig. 4 Scanning procedure of a line scan on the sample (slit depths from left to right: 929 µm, 849 µm, 817 µm, 768 µm; the direction of scanning is indicated)



using the optimum frequency, a full map scanning is conducted for the back-side
measurement to generate the 2-D representation of the induced magnetic field of the
induced eddy current.

3 Results and Discussions

3.1 Line Scan of the Back-Side Measurement

Compared to the magnetic field supplied via the excitation coil, the induced magnetic
field of the induced eddy current in the sample is delayed by 90°. From the output
of the LIA, the reading of the differential sensors consists of two different parts,
which are the real part and the imaginary part. In other words, the real part represents the
signal component that is in phase with the reference signal, while the imaginary part
represents the component that is out of phase with the reference signal. For this research,
the reference signal is set to be the signal from sensor 1. Therefore, as the induced
magnetic field of the induced eddy current is delayed compared to the supplied magnetic
field, the signal of the magnetic field from the induced eddy current can be detected from
the imaginary part of the output of the lock-in amplifier.
Figure 5 shows the raw waveforms of the induced magnetic fields of the eddy
current signals at the 849-µm slit. The slit is located at the position of 15 mm. From
the waveforms, the location of the slit can be identified to be at the middle of the
transition of voltage from peaks to troughs; i.e., the position of the highest gradient
of the waveforms with respect to the position of the probe. The pattern is similar for
every frequency. In terms of frequency, it can be observed that as the frequency
increases, the average magnitude of the waveform decreases. This could be due

Fig. 5 The raw waveform signal of the 849 µm slit (voltage (V) versus probe position (mm) for excitation frequencies from 30 Hz to 510 Hz, with the slit location and ΔV at 210 Hz indicated)



Fig. 6 Delta values of voltage (ΔV) of the line scan back-side measurement as calculated from the raw waveform (ΔV (V) versus slit depth (µm) for excitation frequencies from 30 Hz to 510 Hz)

to the skin depth effect, as the eddy current may not penetrate further when the
frequency increases and is distributed more on the surface. Furthermore, delta values of
voltage (ΔV), or simply the difference between the peaks and troughs, can be calculated
to characterize the sub-millimeter slits. For example, at the frequency of 210 Hz, the delta
value of voltage, ΔV210 Hz, is calculated as shown in Fig. 5.
Then, a graph of ΔV versus depth is plotted as in Fig. 6. From the graph, a correlation
can be observed where, as the depth increases, the ΔV also increases. Thus, this
characteristic can be used to estimate the crack depth of any unknown defects. However,
this is not the case for the ΔV at frequencies of 410 Hz and 510 Hz, where the ΔV seems
to fluctuate. Also, as the frequency increases, the overall ΔV also increases. This trend,
however, only occurs in the frequency region between 30 Hz and 210 Hz. For the ΔV at
frequencies above 210 Hz, i.e., 410 Hz and 510 Hz, the overall ΔV starts to decrease.
This is suspected to happen due to the skin depth effect, where the eddy current is
distributed on the surface and its distribution is not much affected by the presence of
the back-side slits. Therefore, the frequency dependency characteristic can also be
utilized to provide richer information on the slit depth. Also, it is worth noting that the
maximum percent error of this system can reach 15.41%, and the error variations can be
seen in Fig. 6 from the error bars. Thus, since the depth difference between the 817 µm
and the 849 µm slits is quite small, approximately 32 µm, an overlap between the error
bars of both slits can be expected, which may cause their ΔV values to be quite similar.
After that, for the ΔV at each frequency, a trendline is generated. From here, the
gradient of each trendline is calculated, and the gradient is then plotted versus
frequency, as in Fig. 7. From the figure, it can be clearly seen that as the frequency
increases, the gradient of the trendline of the ΔV of the line scan for the back-side
measurement also increases. However, the gradient of the trendline of the ΔV of

Fig. 7 Graph of the gradient of the trendline of the ΔV of the line scan for the back-side measurement versus frequency, f (Hz)

the line scan for the back-side measurement starts to decrease after 210 Hz, which could
be caused by the skin depth effect. From the gradient of the trendline of the ΔV of the line
scan for the back-side measurement, it can be said that the optimal frequency is 210 Hz,
as the gradient of the trendline is the highest there.
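As an illustration of how the ΔV values and the trendline gradient in Fig. 7 could be computed from recorded line scans (the array names, the synthetic data and the linear-fit choice are assumptions for the sketch, not taken from the paper):

```python
import numpy as np

# depths of the four artificial slits in micrometres (from Fig. 4)
depths = np.array([768.0, 817.0, 849.0, 929.0])

def delta_v(waveform):
    """Peak-to-trough difference of a single line-scan waveform."""
    return np.max(waveform) - np.min(waveform)

def trendline_gradient(depths_um, delta_vs):
    """Slope of the linear fit of delta-V against slit depth."""
    slope, _intercept = np.polyfit(depths_um, delta_vs, 1)
    return slope

# Example with synthetic data standing in for the measured scans at 210 Hz
rng = np.random.default_rng(0)
line_scans_210 = [0.03 * d / 929 * np.sin(np.linspace(0, np.pi, 30))
                  + 0.001 * rng.standard_normal(30) for d in depths]
dv_210 = np.array([delta_v(w) for w in line_scans_210])
print("gradient at 210 Hz:", trendline_gradient(depths, dv_210))
```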

3.2 2-D Map of the Back-Side Measurement

Next, a full map for the back-side measurement is conducted to evaluate the locality
of the slits. As mentioned previously, the frequency of 210 Hz is considered the
optimum frequency. Therefore, the full map for the back-side measurement is
conducted at 210 Hz. As with the line scan measurement, the full map measurement
uses the same scanning resolution of 1 mm. Then, from the full map scanning, a 2-D
map of the sample is generated by using the contour function of MATLAB. The result
of the full map scanning is shown in Fig. 8. The comparison between the 2-D mapping
and the actual sample is also highlighted in Fig. 8. It can be seen from the 2-D mapping
that the location of each slit is at the middle of the voltage intensity change, i.e., the
change from red intensity to blue intensity, or from minimum voltage to maximum
voltage.
Moreover, the depth can also be estimated by observing the level of intensity at
both blue intensity and red intensity regions. For the 768 µm depth slit, the blue and
red color intensity regions can be seen to be much lower compared to the blue and
red color intensity regions for the slit with a depth of 929 µm. Other than the
intensity change, the background signal can also be seen to be lower on the left side
as compared to the right side. This may be caused by the magnetic field distribution

Fig. 8 2-D mapping of the sample from the back-side measurement and comparison with the actual sample (slit depths: 929 µm, 849 µm, 817 µm, 768 µm; colour scale: voltage from −0.015 V to 0.01 V)

inside the sample itself. However, the background voltage differs greatly from the voltage
near the slit, as the intensity change can be clearly distinguished from the background.
From both the line scan and the 2-D map measurements, it can be said that the developed
probe is able to resolve back-side slits with a depth resolution of approximately 54 µm,
showing its potential for early and sensitive back-side crack assessment.

4 Conclusions

An ECT probe with a differential AMR sensor configuration has been developed in
this research. The probe is able to detect the artificial slits that have been created on
a galvanized steel plate sample from the backside with a slit depth resolution of
approximately 54 µm. Two types of scanning are performed in this research: the line
scan and the full map scanning. For the line scan, the location of the slit could be
estimated by observing the patterns of the results from the line scan measurement.
Furthermore, by analyzing the results further, the depth of the slit could also be
estimated. Then, an optimum frequency of 210 Hz is identified for detecting the
artificial back-side slits. Using the optimum frequency, a full map scanning is
conducted on the same sample from the backside, and a 2-D mapping of the sample has
been generated. The location of the slit can be seen on the 2-D map as it is at the

transition from minimum to maximum points of the acquired signal. By observing the
intensity of the blue and red colors on the 2-D map, the depth could be estimated.

Acknowledgements The authors would like to thank the Universiti Malaysia Pahang (grant no.
RDU1903100 and PGRS190321) for laboratory facilities and financial assistance.

References

1. Tsukada K, Kiwa T, Kawata T, Ishihara Y (2006) Low-frequency eddy current imaging using
mr sensor detecting tangential magnetic field components for nondestructive evaluation. IEEE
Trans Magn 42:3315–3317
2. Postolache O, Ribeiro AL, Ramos H (2009) Weld testing using eddy current probes and
image processing. In: 19th IMEKO World Congress 2009, pp 6–10
3. García-Martín J, Gómez-Gil J, Vázquez-Sánchez E (2011) Non-destructive techniques based
on eddy current testing. Sensors 11:2525–2565
4. Zaini MAHP, Saari MM, Nadzri NA, Mohd Halil A, Tsukada K (2019) An MFL probe using
shiftable magnetization angle for front and back side crack evaluation. In: Proceedings - 2019
IEEE 15th International Colloquium on Signal Processing and Its Applications, CSPA 2019,
pp 157–161
5. Sophian A, Tian G, Fan M (2017) Pulsed eddy current non-destructive testing and evaluation:
a review. Chin. J. Mech. Eng. 30:500–514
6. Ghanei S, Kashefi M, Mazinani M (2013) Eddy current nondestructive evaluation of dual
phase steel. Mater Des 50:491–496
7. Nadzri NA, Ishak M, Saari MM, Mohd Halil A (2019) Development of eddy current testing
system for welding inspection. In: Proceeding of the 2018 9th IEEE Control and System
Graduate Research Colloquium, ICSGRC 2018. IEEE, pp 94–98
8. He D, Shiwa M (2014) A magnetic sensor with amorphous wire. Sensors (Switzerland).
14:10644–10649
9. Tsukada K, Hayashi M, Nakamura Y, Sakai K, Kiwa T (2018) Small eddy current testing
sensor probe using a tunneling magnetoresistance sensor to detect cracks in steel structures.
IEEE Trans Magn 54:1–5
10. Saari MM, Zaini MAHP, Ahmad H, Che Lah NA (2019) An AC magnetometer using
automatic frequency switching of a resonant excitation coil for magnetic nanoparticles
characterization. In: Proceeding of the 2018 9th IEEE Control and System Graduate Research
Colloquium, ICSGRC 2018. IEEE, pp 207–210
11. Tumanski S (2007) Induction coil sensors—A review. Meas Sci Technol 18:R31–R46
12. Saari MM, Sakai K, Kiwa T, Sasayama T, Yoshida T, Tsukada K (2015) Characterization of
the magnetic moment distribution in low-concentration solutions of iron oxide nanoparticles
by a high- T c superconducting quantum interference device magnetometer. J Appl Phys
117:17B321
13. Saari MM, Ishihara Y, Tsukamoto Y, Kusaka T, Morita K, Sakai K, Kiwa T, Tsukada K
(2015) Optimization of an AC/DC high- Tc SQUID magnetometer detection unit for
evaluation of magnetic nanoparticles in solution. IEEE Trans Appl Supercond 25:1–4
14. Jander A, Smith C, Schneider R (2005) Magnetoresistive sensors for nondestructive
evaluation (Invited Paper). In: Advanced Sensor Technologies for Nondestructive Evaluation
and Structural Health Monitoring, p 1
15. Tsukada K, Haga Y, Morita K, Song N, Sakai K, Kiwa T, Cheng W (2016) Detection of inner
corrosion of steel construction using magnetic resistance sensor and magnetic spectroscopy
analysis. IEEE Trans Magn 52:1–4
Model-Free Tuning of Laguerre Network for Impedance Matching in Bilateral Teleoperation System

Mohd Syakirin Ramli, Hamzah Ahmad, Addie Irawan, and Nur Liyana Ibrahim

Abstract This paper addresses a tuning method to attain symmetry between the master
and slave manipulators of a bilateral teleoperation system. In the proposed structure, an
equalizer based on the Laguerre network, connected in a feedback loop to the master
manipulator, has been introduced. A set of input-output data were first generated and
recorded, which were later used in a two-step tuning procedure. A fictitious reference
signal was formulated based on these data. In addition, a metaheuristic optimization
algorithm, namely Particle Swarm Optimization, has been employed in seeking the
optimal controller parameters. Numerical analyses utilizing Matlab software have been
performed. The results exhibited that the dynamics of the master manipulator with the
added controller are almost identical to the dynamics of the slave system. Hence, it is
verified that the proposed tuning technique is feasible for achieving symmetry between
both sides of the manipulators.


Keywords Fictitious signal PID controller  Particle Swarm Optimization 

Two-port networks Velocity matching

1 Introduction

A teleoperator system is comprised of two robots: the master robot controlled
by the human operator, and a remote slave robot which tracks the motion of the
master while concurrently transmitting the environment's force back to the human
operator. The teleoperation system extends the human operator's capability to
conduct tasks remotely from a base station. Vast applications of teleoperation
systems can be found in underwater exploration [1, 2], telesurgery [3], and the
military [4].

M. S. Ramli (&) · H. Ahmad · A. Irawan · N. L. Ibrahim
Instrumentation and Control Engineering (ICE) Research Cluster, Faculty of Electrical and Electronics Engineering Technology, Universiti Malaysia Pahang, 26600 Pekan, Malaysia
e-mail: [email protected]


Various studies have been carried out by researchers in the past focusing on the
four-channel architecture of bilateral teleoperation systems. The work in [5, 6]
discussed some of the earlier ideas of the four-channel structure and emphasized
that the proper utilization of all channels is crucial in achieving accurate
transmission of the task impedance to the operator. The work in [7] focused on
designing a symmetric impedance-matched teleoperator with position tracking.
Meanwhile, in [8], the authors provide a survey on the implementation of wave-variable
control in the four-channel structure of bilateral teleoperation systems. On the other hand,
the work in [9] considered the implementation of wave-variable control for the
four-channel architecture in a multilateral framework. In addition, our recent work in [10]
investigated the potential of introducing a controller connected in-feedback to a single
master manipulator, to attain a matched impedance with the locked system derived from
multiple slave manipulators formed by a multi-agent system.
In this paper, we focus on obtaining a matched impedance between the master
and slave sides of a bilateral teleoperator system by using a model-free approach.
Assuming the human and the remote task at the environment form the two sides of
the divide, then by introducing a feedback controller to the master system, a
symmetry between both sides can be established. For this purpose, a Laguerre
network structure is selected as the controller due to the orthonormal properties of its
filters, which simplify the tuning process to only finding the optimal values of the basis of
the filters. Here, the task of tuning the basis of the Laguerre network can be
performed by employing the Fictitious-Reference-Iterative-Tuning (FRIT) and
Particle Swarm Optimization (PSO) algorithms. FRIT only requires a set of
input-output data acquired from a single-shot experiment to be used in the tuning
process [11]. Hence, the mathematical modeling of the complex system, which is
normally needed in conventional controller design, can partly be eliminated
through the employment of FRIT.
The PSO, on the other hand, is a metaheuristic optimization technique for finding
the optimal solution within a predefined search space. First introduced by Kennedy
and Eberhart [12] in 1995, the algorithm mimics the behavior of a swarm or flock of
fish/birds in minimizing or maximizing the specified fitness function. Our work
focuses on implementing the algorithm to minimize the cost function, formulated
based on the fictitious signals utilizing the recorded data.
The organization of this paper is as follows. In Sect. 2, we provide the problem
formulation where the overview of the two-ports and basic teleoperation structures
are presented. In Sect. 3, we discuss our proposed algorithm to achieve impedance
matching. Next, in Sect. 4, a numerical example to illustrate the effectiveness of the
proposed method is discussed. Finally, we conclude the findings in Sect. 5.
Mathematical Preliminaries: We denote $\mathbb{R}$ and $\mathbb{R}^n$ as the set of real numbers and vectors of dimension $n$, respectively. Suppose $v \in \mathbb{R}^n$; then the vector norm is defined by $\|v\| := \sqrt{v^T v}$, where $T$ denotes transposition. Meanwhile, the notation $\|v(k)\|_K^2$ implies

$$\|v(k)\|_K^2 := \sum_{k=1}^{K} \|v(k)\|^2 = \|v(1)\|^2 + \|v(2)\|^2 + \cdots + \|v(K)\|^2. \qquad (1)$$

Finally, we define $1_m = [1, \cdots, 1] \in \mathbb{R}^{1\times m}$ as the $m$-dimensional row vector with all elements equal to 1.

2 Problem Formulation

2.1 Overview of the Two-Ports Network

The general model of a two-ports network in bilateral teleoperation is depicted in
Fig. 1. In the bilateral teleoperation mechanism, the operator's force on the master
$f_h$ is transmitted to the remote task through the teleoperation system $T$, and at the
same time the environment force $f_e$ is transmitted back to the operator. Considering
the master velocity $\dot{x}_m$ and the slave velocity $\dot{x}_s$, perfect transparency is achieved
if $f_h = f_e$ for $\dot{x}_m = \dot{x}_s$. The relation between the forces and motions in a bilateral
teleoperation system can be generalized in the hybrid matrix [13] of

$$\begin{bmatrix} f_h(s) \\ \dot{x}_m(s) \end{bmatrix} = \begin{bmatrix} h_{11}(s) & h_{12}(s) \\ h_{21}(s) & h_{22}(s) \end{bmatrix} \begin{bmatrix} \dot{x}_s(s) \\ f_e(s) \end{bmatrix} \qquad (2)$$

where $h_{ij}(s)$ is a SISO transfer function. From (2), it can be shown that

$$f_h = \underbrace{(h_{11} - h_{12} Z_e)(h_{21} - h_{22} Z_e)^{-1}}_{Z_T}\, \dot{x}_m. \qquad (3)$$

To achieve perfect transparency such that the transmitted impedance $Z_T$ equals the
environment impedance $Z_e$, the necessary and sufficient conditions are $h_{22} = 0$,
$h_{21} Z_e = Z_e(-h_{12})$, and $h_{11} = 0$. Hence, for the ideal case, perfect transparency for
all frequencies implies

Fig. 1 General two-ports model of a bilateral teleoperation system [13]

   
$$\begin{bmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}. \qquad (4)$$

2.2 Basic Structure of a Teleoperation System

We model the motion of the master manipulator by a simple mass-damper-spring system given by

$$m_m \ddot{x}_m + d_m \dot{x}_m + k_m x_m = f_m + f_h \qquad (5)$$

where $m_m$, $d_m$, $k_m$ are the mass, damping factor, and spring constant, respectively.
Meanwhile, $f_m$ and $x_m$ are the master's exerted force and total displacement,
respectively. In a similar form, the slave manipulator is governed by the equation of
motion

$$m_s \ddot{x}_s + d_s \dot{x}_s + k_s x_s = f_s - f_e \qquad (6)$$

where $m_s$, $d_s$, $k_s$ are the mass, damping factor, and spring constant. The signals $f_s$
and $x_s$ are the slave's exerted force and the total displacement of the manipulator.

Figure 2 illustrates the general structure of a four-channel bilateral teleoperation
system. The total impedances of the human and environment are denoted by $Z_h$ and
$Z_e$, respectively. Meanwhile, $Z_m$ and $Z_s$ are the impedances of the master and slave
manipulators. The local controllers for both master and slave manipulators are
denoted by $C_m$ and $C_s$. On the other hand, the controllers $C_1$ to $C_4$ dictate the
communication link between the master and the slave sides. Zhu and Salcudean
[13] reported that perfect transparency can be achieved by properly designing
$C_1$ to $C_4$. For transparency under position control, a fully transparent teleoperator
system satisfies the condition given in Eq. (4) by the selection of $C_1 = Z_s + C_s$,
$C_2 = C_3 = 1$, and $C_4 = -(Z_m + C_m)$. However, this control strategy requires
acceleration measurement to implement $C_1$ and $C_4$. To overcome this issue, the
"intervenient impedance" was introduced to eliminate the need for acceleration
measurement [13]. With low-gain PD control of $C_m$ and $C_s$, and with the selection
of $C_1 = C_s$, $C_2 = C_3 = 1$, $C_4 = -C_m$, a nearly perfect transparency is achievable
when the master impedance is identical to the slave impedance such that
$Z_m \approx Z_s$. However, in most cases $Z_m \neq Z_s$. Hence, this paper will discuss our
proposed method to reach a similar behavior of $Z_m \approx Z_s$.

Fig. 2 Four-channels structure proposed by Zhu and Salcudean [13]

2.3 Improvement to the Existing Structure

To improve the existing structure of the four-channel teleoperation system, Tsuji
et al. [14] introduced an additional equalizer, or controller, connected in-feedback to
the master manipulator. By using the same local controller $C_m$ for both the master
and slave manipulators, the equalizer $F$ can be properly tuned so that there exists
symmetry between the impedances of the master and slave systems. The new
structure of the four-channel teleoperation system is depicted in Fig. 3. With this
implementation, the controllers $C_1$ to $C_4$ can be chosen as $C_1 = C_m$, $C_2 = C_3 = 1$,
and $C_4 = -C_m$. Now, the aim is to design an optimal controller $F$ to achieve
$Z_F := Z_m + F \approx Z_s$. In the next section, we present the structure of the Laguerre
network that forms the basic structure of $F$. Furthermore, the tuning method, the
metaheuristic optimization algorithm and the fictitious-reference signal generation
are also briefly discussed.
Remark 1: Even though the modeling of the manipulators is presented in this paper, it
is not necessary for implementing our proposed algorithm. As discussed further in the
next section, only the recorded input-output data are required in the process of tuning the
controllers. Hence, this technique is truly a model-free approach.

Fig. 3 Four-channel structure illustrating the additional equalizer F

3 Algorithm for Impedance Matching

3.1 Particle Swarm Optimization

The PSO is an optimization method based on the metaphor of the social behavior of
flocks of birds or schools of fish. First introduced by Kennedy and Eberhart [12], the
algorithm starts with the initialization of a pool of particles/agents with random
positions and velocities in a multi-dimensional space. Let $p_i(k) \in \mathbb{R}^{1\times D}$ and
$q_i(k) \in \mathbb{R}^{1\times D}$, $i = 1, 2, \cdots, N$, denote the position and velocity of each agent $i$ in $D$
dimensions at iteration $k$. Let the fitness value associated with the position
$p_i(k)$ be denoted by $Fit \in \mathbb{R}$. Each agent is assumed to optimize the fitness
function $Fit$ by evaluating its best-value-so-far position ($pbest_i \in \mathbb{R}^{1\times D}$) and its current
position. The velocity of each agent $i$ is updated based on the following
equation

$$q_i(k+1) = \omega\, q_i(k) + g_1 r_1 \left(pbest_i - p_i(k)\right) + g_2 r_2 \left(gbest - p_i(k)\right) \qquad (7)$$

where $\omega \in \mathbb{R}$ is the weighting function, $g_1, g_2 \in \mathbb{R}$ are the weighting factors, and
$r_1, r_2 \in \mathbb{R}$ are the cognitive and social learning parameters generated randomly
between 0 and 1. Meanwhile, $pbest_i$ is the pbest value of agent $i$, and $gbest \in \mathbb{R}^{1\times D}$ is
the best value so far in the group among the pbests of all agents. The following
function is used to update the weighting function $\omega$ in Eq. (7):

 
$$\omega = \omega_{max} - \frac{\omega_{max} - \omega_{min}}{iter_{max}} \times iter \qquad (8)$$

where $\omega_{max}, \omega_{min} \in \mathbb{R}$ are the initial and final weights, $iter_{max} \in \mathbb{R}$ is the maximum
number of iterations, and $iter$ is the current iteration number. Thus, based on the
updated velocity in (7), each agent $i$ updates its position such that

$$p_i(k+1) = p_i(k) + q_i(k+1). \qquad (9)$$

At the end of the iterations, the agents shall all converge to the optimal position $p^*$,
where

$$p^* := \arg\min_{p_i} Fit, \quad \forall i. \qquad (10)$$
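A compact sketch of the update rules (7)–(9), with the linearly decreasing weight of (8), is given below. The fitness function, bounds and dimensions are placeholders to be supplied by the user; this is an illustrative implementation, not the authors' code.

```python
import numpy as np

def pso(fitness, dim, n_agents, iter_max, p_min, p_max,
        g1=1.4, g2=1.4, w_min=0.4, w_max=0.9, seed=0):
    """Minimal particle swarm optimization following Eqs. (7)-(9)."""
    rng = np.random.default_rng(seed)
    p = rng.uniform(p_min, p_max, (n_agents, dim))   # positions
    q = np.zeros((n_agents, dim))                    # velocities
    pbest = p.copy()
    pbest_val = np.array([fitness(x) for x in p])
    gbest = pbest[np.argmin(pbest_val)].copy()

    for it in range(iter_max):
        w = w_max - (w_max - w_min) / iter_max * it  # Eq. (8)
        r1, r2 = rng.random((n_agents, 1)), rng.random((n_agents, 1))
        q = w * q + g1 * r1 * (pbest - p) + g2 * r2 * (gbest - p)   # Eq. (7)
        p = np.clip(p + q, p_min, p_max)                            # Eq. (9)
        val = np.array([fitness(x) for x in p])
        improved = val < pbest_val
        pbest[improved], pbest_val[improved] = p[improved], val[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, pbest_val.min()

# Example usage on a toy quadratic fitness function
best, best_val = pso(lambda x: np.sum(x**2), dim=5, n_agents=30,
                     iter_max=100, p_min=-1.0, p_max=1.0)
```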

3.2 Equalizer FðzÞ in the Form of a Laguerre Network

The discrete-time SISO system can be approximated using a series of Laguerre filters [15]

$$L_i(z) = \sqrt{(1-a^2)\,t_s}\;\frac{(z^{-1} - a)^{i-1}}{(1 - a z^{-1})^{i}} \qquad (11)$$

so as to form $y(z) = F(z)s(z) = \sum_{i=1}^{M} c_i L_i(z)$, as shown in Fig. 4. The parameter $a \in \mathbb{R}$ is
the pole of the Laguerre network, with $0 \le a < 1$ for the stability of the network [16],
and $t_s$ is the sampling time. The input and output signals of the network are denoted
by $s(k) = \mathcal{Z}^{-1}[s(z)]$ and $y(k) = \mathcal{Z}^{-1}[y(z)]$, respectively. Here, we use $\mathcal{Z}^{-1}[\cdot]$ to
denote the inverse z-transform operator. The parameters $c_i \in \mathbb{R}$, $i = 1, \cdots, M$ are the
coefficients that form the basis of the Laguerre network. Meanwhile, the signal
$l_i \in \mathbb{R}$, $i = 1, \cdots, M$ is the output of the $i$-th order filter in the Laguerre network.

Fig. 4 Structure of the Laguerre network



By this notation, the SISO state-space model of the overall network can be
represented by

$$F(z): \quad l(k+1) = A\,l(k) + B\,u(k), \qquad y(k) = C\,l(k) \qquad (12)$$

where $l = [l_1, \cdots, l_M]^T \in \mathbb{R}^M$ is the state vector, $A \in \mathbb{R}^{M\times M}$ is the system matrix,
$B \in \mathbb{R}^M$ is the input matrix, and $C = [c_1, \cdots, c_M] \in \mathbb{R}^{1\times M}$ is the output matrix. The
elements of $A$ and $B$ are given by

$$[A]_{ij} := \begin{cases} a & \text{if } i = j \\ (-1)^{(i-j+1)}\, a^{(i-j-1)} (1-a^2) & \text{if } i > j \\ 0 & \text{otherwise} \end{cases} \qquad (13)$$

$$[B]_i := (-a)^{(i-1)} \sqrt{(1-a^2)\,t_s}. \qquad (14)$$
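A small sketch constructing the state-space matrices of (12)–(14) and filtering an input sequence through the network follows; the pole $a$, order $M$, sampling time and coefficient vector below are illustrative placeholders, not identified values.

```python
import numpy as np

def laguerre_state_space(a, M, ts):
    """Build A and B of the discrete-time Laguerre network, Eqs. (13)-(14)."""
    A = np.zeros((M, M))
    for i in range(1, M + 1):
        for j in range(1, M + 1):
            if i == j:
                A[i - 1, j - 1] = a
            elif i > j:
                A[i - 1, j - 1] = (-1) ** (i - j + 1) * a ** (i - j - 1) * (1 - a**2)
    B = np.array([(-a) ** (i - 1) * np.sqrt((1 - a**2) * ts) for i in range(1, M + 1)])
    return A, B

def laguerre_filter(u, a, M, ts, c):
    """Simulate Eq. (12): l(k+1) = A l(k) + B u(k), y(k) = C l(k) with C = c."""
    A, B = laguerre_state_space(a, M, ts)
    l = np.zeros(M)
    y = np.zeros(len(u))
    for k, uk in enumerate(u):
        y[k] = c @ l
        l = A @ l + B * uk
    return y

# Placeholder example: M = 10 filters, pole a = 0.5, sampling time 0.01 s
y = laguerre_filter(np.ones(200), a=0.5, M=10, ts=0.01, c=np.ones(10))
```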

3.3 Fictitious-Reference-Iterative Tuning

The equalizer $F$ needs to be properly designed and tuned to attain $Z_F \approx Z_s$. A tuning
procedure similar to that discussed in [14] is adopted in this work. Figure 5
illustrates the two-step tuning process that is carried out to obtain the
optimal controllers. In the first step (see Fig. 5(a)), an equalizer $H$ is first
determined to match the velocities $\dot{x}_m$ and $\dot{x}_s$. Similar to our previous work in [10],
we select $H(z) := P(z)/Q(z)$ as a bi-proper transfer function in the form of

$$H(z) = \frac{1 + \hat{a}_1 z^{-1} + \cdots + \hat{a}_p z^{-p}}{1 + \hat{b}_1 z^{-1} + \cdots + \hat{b}_p z^{-p}}. \qquad (15)$$

In the second step, a fictitious signal is formulated utilizing (15) (see Fig. 5(b)). The
fictitious signal can be defined as

$$\tilde{f}_s(k) = H^*(z)\left(u^0(k) + F(z)\,\dot{x}^0_m(k)\right) \qquad (16)$$

where $H^*(z)$ is the transfer function $H(z)$ with the optimal parameters.
Meanwhile, $u^0$ and $\dot{x}^0_m$ are the recorded input-output data measured from the
master manipulator.

Fig. 5 Two steps of tuning: (a) to attain H*, (b) to attain F*

3.4 Attaining a Matched Impedance via PSO and FRIT

To obtain the optimal transfer function $H^*(z)$, we need to solve the constrained
optimization problem defined by

$$\min_{H(z)} J_H \quad \text{s.t.} \quad |z| \le 1 \qquad (17)$$

(i.e., the poles and zeros of $H(z)$ lie within the unit circle), where, for the recorded initial
data $\dot{x}^0_m(k)$ and $\dot{x}^0_s(k)$,

$$J_H := \left\| \dot{x}^0_s(k) - H(z)\,\dot{x}^0_m(k) \right\|^2_K. \qquad (18)$$

Meanwhile, to attain the optimal $F^*(z)$, we solve the second optimization problem
given by

$$\min_{F(z)} J_F \qquad (19)$$

where, for the recorded initial data $f^0_m(k)$ and $f^0_s(k)$,

$$\begin{aligned} J_F &:= \left\| f^0_s(k) - \tilde{f}_s(k) \right\|^2_K \\ &= \left\| f^0_s(k) - H^*(z)\left(u^0(k) + F(z)\,\dot{x}_m(k)\right) \right\|^2_K \\ &= \left\| H^{*-1}(z)\, f^0_s(k) - u^0(k) - F(z)\,\dot{x}_m(k) \right\|^2_K \end{aligned} \qquad (20)$$

with $u^0(k) = f^0_m(k) - F^0(z)\,\dot{x}_m(k)$.
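As a data-based illustration of the costs (18) and (20), the evaluation reduces to filtering and summing recorded signals. In the sketch below the filter callables and signal arrays are placeholders; in practice $H$, $H^{*-1}$ and $F$ would be realized, for example, by the discrete transfer function (15) and the Laguerre network sketched above.

```python
import numpy as np

def cost_JH(H_filter, xdot_m0, xdot_s0):
    """Eq. (18): squared error between the recorded slave velocity and the
    filtered recorded master velocity over the whole data record."""
    return np.sum((xdot_s0 - H_filter(xdot_m0)) ** 2)

def cost_JF(Hstar_inv_filter, F_filter, fs0, u0, xdot_m0):
    """Eq. (20): fictitious-reference cost evaluated from one-shot data only."""
    residual = Hstar_inv_filter(fs0) - u0 - F_filter(xdot_m0)
    return np.sum(residual ** 2)
```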

The following algorithm is implemented to obtain the optimal controllers
$H^*(z)$ and $F^*(z)$:

Step 1. Let the tunable parameters of the controller $F(z)$ be defined as
$q = [a, c_1, \cdots, c_M] \in \mathbb{R}^{1\times D_1}$. By arbitrarily selecting the initial value $q^0$,
the set of data $\dot{x}^0_m$, $\dot{x}^0_s$, $f^0_s$ and $u^0$ are generated and recorded.

Step 2. First, we tune the equalizer $H$ by employing the PSO algorithm. Let
$p_i := [\hat{a}_1, \cdots, \hat{a}_p, \hat{b}_1, \cdots, \hat{b}_p] \in \mathbb{R}^{1\times D_2}$, $p_i \in [p^H_{min}, p^H_{max}]$, $\forall i$. Initialize the
positions of the PSO agents in the specified search space. Define the fitness
function $Fit$ for each agent according to Eq. (18), such that $Fit = J_H$.

Step 3. Update the agents' velocities based on Eq. (7) and the agents' positions based
on Eq. (9) at each iteration. At the final iteration, all agents shall
converge to the optimal position $p^*$ corresponding to the optimization
problem defined in Eq. (17). Assign the coefficients of the transfer function in
(15) with $p^*$. Repeat from Step 2 if the results are not satisfactory.

Step 4. Next, we tune the controller $F$, also employing the PSO algorithm. Let
$p_i := q \in \mathbb{R}^{1\times D_1}$, $p_i \in [p^F_{min}, p^F_{max}]$, $\forall i$. Initialize the positions of the PSO
agents in the specified search space. Define the fitness function $Fit$ for
each agent according to Eq. (20), such that $Fit = J_F$.

Step 5. Update the agents' velocities based on Eq. (7) and the agents' positions based
on Eq. (9) at each iteration. At the final iteration, all agents shall
converge to the optimal position $p^*$ corresponding to the optimization
problem defined in Eq. (19). Assign $q = q^*$. Repeat from Step 4 if the results
are not satisfactory.

4 Numerical Results and Analysis

To illustrate the effectiveness of our proposed method, we present an example in
this section. We conducted a numerical analysis employing the Matlab simulation
package to execute the developed theoretical models. The parameters used in the
teleoperation system are summarized in Table 1. The impedance of the human
operator was defined as $Z_h = s^2 + 5s + 10$. Meanwhile, the number of basis functions of
the truncated Laguerre filters was chosen as $M = 10$, and the sampling time $t_s = 0.01$ s.
We assume there is no time delay in the communication link, and the environment
impedance was set to zero to imply that the slave manipulator moves freely
Table 1 Parameter values of the manipulator systems

Manipulator   Mass (kg)     Damper (Ns/m)     Spring (N/m)
Master        m_m = 1.5     d_m = 0.4952      k_m = 0
Slave         m_s = 3       d_s = 2.4762      k_s = 1.4621

without any attached load. The transfer function of the local controllers for both
master and slave were chosen as Cm ¼ 2 1 þ 100s 1
þ 0:2s . Meanwhile, the con-
trollers C1 to C4 were selected based on the description provided in Sect. 2.3.
In Table 2, we provide the parameters of the PSO algorithm that were used in
the tuning process. For both procedures, we used the weighting factor
g1 ¼ g2 ¼ 1:4. Meanwhile, xmin ¼ 0:4 and xmax ¼ 0:9, respectively.
Figure 6 illustrates the performance of the equalizer $H(z)$ with $p = 6$ in equalizing
the velocities between the manipulators. As presented in the figure, the initial
recorded velocity signals of the master and slave manipulators are indicated by the
blue and red lines, respectively. It can clearly be seen that the velocity $\dot{x}^0_m$ was
matched with $\dot{x}^0_s$ through the equalizer $H(z)$ (as indicated by the dashed black line).
The convergence of the cost function (18) is exhibited in Fig. 7, where $J_H = 7.8817$
at the final iteration $k = 150$. Meanwhile, Fig. 8 indicates the location of the poles
and zeros of $H(z)$, which all lie inside the unit circle, signifying that $H(z)$ and $H(z)^{-1}$ are
always stable.

Table 2 Tuning parameters used in the PSO algorithm

           Number of       Number of   Maximum iteration   Minimum range     Maximum range
           parameters D    agents N    iter_max            p_min             p_max
Tuning H   D_2 = 12        200         150                 −1                1
Tuning F   D_1 = 11        100         400                 [0, −200·1_M]     [1, 50·1_M]

Fig. 6 Velocity matching through equalizer H(z)



Fig. 7 Convergence of the cost function JH

Fig. 8 Location of poles and zeros of H(z)

The comparison of the positions, velocities and exerted forces of the master and
slave manipulators before and after tuning is depicted in Fig. 9(a) and (b),
respectively. From Fig. 9(b), it can be observed that the velocity trends of both
manipulators are almost identical for all time $t$, except for the position of the master
manipulator, which slightly lags the position of the slave. A similar observation can be
made for the exerted force responses of the manipulators, which have almost identical
patterns. An additional result illustrating the convergence of the cost function (20) is
provided in Fig. 10. The cost function value was obtained as $J_F = 30294.2498$ at the
final iteration $k = 400$.

Fig. 9 Performance comparison before and after tuning: (a) before tuning, (b) after tuning



Fig. 10 Convergence of the cost function JF

5 Conclusion

In this paper, the tuning algorithm based on a model-free approach to improve


transparency through impedance matching between the master and slave manipu-
lators of a bilateral teleoperation system has been demonstrated. By introducing a
controller connected in-feedback to the master manipulator, it provides the possi-
bility of obtaining a symmetric impedance between both sides of the teleoperation
system. Furthermore, the utilization of FRIT has eliminated the necessity of
obtaining the plant model through mathematical modeling in designing the con-
trollers. Hence, it is truly a model-free approach. Meanwhile, the implementation of
the PSO algorithm further simplified the process of obtaining the optimal controller
parameters. From the presented numerical results, it can be concluded that the
proposed algorithm exhibits promising results to achieve a matched impedance
between the master and slave manipulators. However, the formulation of the cost
function warrants further investigation to ensure ultimate convergence of its value
towards zero.

Acknowledgements This research work has been supported by Research & Innovation
Department, Universiti Malaysia Pahang through short-term grant of RDU1703139.

References

1. Zhang J, Li W, Yu J, Zhang Q, Cui S, Li Y, Li S, Chen G (2017) Development of a virtual


platform for telepresence control of an underwater manipulator mounted on a submersible
vehicle. IEEE Trans Ind Electron 64:1716–1727
2. Saltaren R, Barroso AR, Yakrangi O (2018) Robotics for seabed teleoperation:
part-1-conception and practical implementation of a hybrid seabed robot. IEEE Access
6:60559–60569
Model-Free Tuning of Laguerre Network … 343

3. Berthet-Rayne P, Leibrandt K, Gras G, Fraisse P, Crosnier A, Yang G-Z (2018) Inverse


kinematics control methods for redundant snakelike robot teleoperation during minimally
invasive surgery. IEEE Robot Autom Lett 3:2501–2508
4. Chen JYC, Barnes MJ (2008) Robotics operator performance in a multi-tasking environment.
In: Human-robot interactions in future military operations, pp 293–314
5. Lawrence DA (1992) Designing teleoperator architectures for transparency. In: Proceedings
1992 IEEE international conference on robotics and automation. IEEE Computer Society
Press, pp 1406–1411
6. Lawrence DA (1993) Stability and transparency in bilateral teleoperation. IEEE Trans Robot
Autom 9:624–637
7. Namerikawa T, Kawada H (2006) Symmetric impedance matched teleoperation with position
tracking. In: Proceedings of the 45th IEEE conference on decision and control, pp 4496–4501
8. Sun D, Naghdy F, Du H (2014) Application of wave-variable control to bilateral teleoperation
systems: a survey. Ann Rev Control 38:12–31
9. Kanno T, Yokokohji T (2012) Multilateral teleoperation control over time-delayed computer
networks using wave variables. In: Haptics symposium (HAPTICS). IEEE, pp 125–131
10. Ramli MS, Ahmad H (2018) Data-driven impedance matching in multilateral teleoperation
systems. Indones J Electr Eng Comput Sci 10:713–724
11. Kaneko O, Soma S, Fujii T (2003) A fictitious reference iterative tuning (FRIT) in the
two-degree of freedom control scheme and its application to closed loop system identification.
Instrum Control Soc 42:17–25
12. Kennedy J, Eberhart R (1995) Particle swarm optimization. In: Proceedings of ICNN 1995 -
international conference on neural networks, pp 1942–1948
13. Zhu M, Salcudean SE (1995) Achieving transparency for teleoperator systems under position
and rate control. In: Proceedings 1995 IEEE/RSJ international conference on intelligent robots
and systems. Human robot interaction and cooperative robots. IEEE Computer Society Press,
pp 7–12
14. Tsuji M, Yamamoto S, Kaneko O (2014) A tuning method of a 4-channel bilateral control
system. In: 46th SICE Hokkaido branch academic symposium, pp 1–4. (in Japanese)
15. Wang Q, Zhang J (2011) Wiener model identification and nonlinear model predictive control
of a pH neutralization process based on Laguerre filters and least squares support vector
machines. J Zhejiang Univ Sci C 12:25–35
16. Wang L (2009) Model predictive control system design and implementation using MATLAB.
Springer, London
Identification of Liquid Slosh Behavior Using Continuous-Time Hammerstein Model Based Sine Cosine Algorithm

Julakha Jahan Jui, Mohd Helmi Suid, Zulkifli Musa, and Mohd Ashraf Ahmad

Abstract This paper presents the identification of a liquid slosh plant using the
Hammerstein model based on the Sine Cosine Algorithm (SCA). A remote-controlled car
carrying a container of liquid is considered as the liquid slosh experimental rig. In
contrast to other research works, this paper considers a piece-wise affine function as
the nonlinear function of the Hammerstein model, which is a more general function.
Moreover, a continuous-time transfer function is utilized in the Hammerstein
model, which is more suitable to represent a real system. The SCA method is used
to tune both the coefficients of the nonlinear function and the transfer function of the
Hammerstein model such that the error between the identified output and the real
experimental output is minimized. The effectiveness of the proposed framework is
assessed in terms of the convergence curve response, the output response, and the
stability of the identified model through the pole-zero map. The results show that
the SCA based method is able to produce a Hammerstein model that yields an identified
output response close to the real experimental slosh output, with an 80.44%
improvement in the sum of quadratic error.

Keywords Slosh behavior · Sine Cosine Algorithm · Hammerstein model

1 Introduction

Nowadays, liquid slosh inside a cargo occurs in many situations. For
example, ships carrying liquid containers are at high risk of generating sloshing
loads during operation [1]. In the metal industries, high oscillation can spill molten
metal, which is dangerous to the operator [2]. Meanwhile, sloshing of fuel and other
liquids in moving vehicles may cause instability and undesired dynamics [3].
Hence, it is necessary to thoroughly study the behavior of this residual slosh

J. J. Jui (&) · M. H. Suid · Z. Musa · M. A. Ahmad
Faculty of Electrical and Electronics Engineering Technology, Universiti Malaysia Pahang, 26600 Pekan, Pahang, Malaysia
e-mail: [email protected]


induced by the container motion. One may study the behavior of liquid slosh
by developing an exact mathematical model of the slosh. So far, many
researchers focus on the first-principles approach to model the slosh behavior, while
few works in the literature discuss it from the perspective of a nonlinear system
identification approach.
On the other hand, block oriented nonlinear system identification has become a
popular technique to model a complex plant. The block oriented nonlinear model
can be classified into three categories, which are Hammerstein model, Wiener
model and Hammerstein Wiener model. In particular, Hammerstein model is a
model that consists of a nonlinear function followed by linear dynamic sub-plant,
while Wiener model consists of a linear dynamic sub-plant followed by nonlinear
function, and finally, Hammerstein-Wiener model contains a linear dynamic
sub-plant inserted between two or more nonlinear functions in series. Among these
three block oriented models, Hammerstein model is famous due to its simple model
structure and it has been widely used for nonlinear system identification.
Specifically, the Hammerstein model has been applied to model a real plant such as
Solid Oxide fuel cell [4], bidirectional DC motor [5], oxygen uptake estimation [6],
stretch reflex dynamics [7], turntable servo system [8], pneumatic muscle actuators
[9], amplified piezoelectric actuators [10] and multi-axis piezoelectric micro posi-
tioning stages [11]. On the other hand, there are many tools that have been utilized
to identify the Hammerstein model. These include the iterative method [12–14], the
subspace method [15–17], the least squares method [18], the blind approach [19] and
the parametric instrumental variables method [20]. Moreover, many works also consider
optimization tools for the Hammerstein model, such as the Bacterial Foraging
algorithm [21], the Cuckoo search algorithm [22], Particle Swarm optimization [23], and
the Genetic algorithm [24].
Based on the above literature, several limitations are evident in these works,
which are:
(i) Most of the Hammerstein models used in these studies are based on
discrete-time models, while many real plants are more naturally represented by
continuous-time models.
(ii) Almost all the methods assume a known structure of the nonlinear function, which
consists of several basis functions.
In contrast, our proposed work can solve a more general class of continuous-time
Hammerstein models by assuming an unknown structure of the nonlinear function. In
particular, a piece-wise affine function with many basis functions is adopted. Due
to the introduction of the piece-wise affine function, a high-dimensional design
parameter tuning is considered in this study, which makes the identification problem
more complex. On the other hand, Sine Cosine Algorithm (SCA) [25] has become a
top notch optimization algorithm which has solved various types of engineering
problems [25–27]. To the best of our knowledge, there are still few works to discuss
Identification of Liquid Slosh Behavior … 347

on the SCA for identification of Hammerstein model. Moreover, other recent


optimization methods are quite complex as compared to SCA which may contribute
to high computation time in obtaining the result. Thence, it motivates us to see the
effectiveness of the SCA in modeling the liquid slosh plant from the real experi-
mental data.
This paper presents the identification of a liquid slosh plant using the Hammerstein model based on the SCA method. A remote-control car carrying a container of liquid is considered as the liquid slosh experimental rig. The SCA method is used to tune the coefficients of both the nonlinear function and the transfer function of the Hammerstein model such that the error between the identified output and the real experimental output is minimized. The effectiveness of the proposed framework is assessed in terms of the convergence curve response, the output response, and the stability of the identified model through the pole-zero map.

2 Liquid Slosh Experimental Rig

In this study, a mobile liquid slosh plant is considered to replicate real situation of a
moving container carrying liquid, as shown in Fig. 1. In particular, a remote control
car is used to carry a small tank filled with liquid. The tank is also equipped with
four plastic wheels so that it can move smoothly as shown in Fig. 1(a). Moreover,
three accelerometer sensors (ADXL335) that are floated on the surface of liquid are
used to measure liquid oscillation as shown in Fig. 1(b). For simplicity, the liquid slosh data from only one of the sensors is recorded and only the z-axis output data is considered. Figure 2 shows a general schematic diagram of the liquid slosh experimental rig. In particular, an Arduino UNO is used as a data acquisition platform to process the input and output data. Here, we generate a voltage from the Arduino UNO to the remote car, and concurrently the Arduino UNO acquires the slosh data from the accelerometer. Both the input and output data can be monitored and analyzed from the personal computer using the LabVIEW software. In order to identify the model of liquid slosh, the remote car is required to move a certain distance and stop suddenly to generate a liquid oscillation or slosh inside the tank. Hence, we apply the input voltage as shown in Fig. 3 to move the remote car.
Concurrently, the liquid slosh data is recorded as shown in Fig. 4. These two data
are then used to develop the Hammerstein model based SCA, which is discussed in
the next section.
Fig. 1 Liquid slosh experimental rig: (a) side view, (b) plan view

Fig. 2 Schematic diagram of liquid slosh experimental rig

3 Identification of Liquid Slosh Using Hammerstein Based SCA

In this section, the proposed Sine Cosine Algorithm (SCA) for identification of
liquid slosh plant in Sect. 2 based on Hammerstein model is presented. Firstly, a
problem formulation to identify the liquid slosh plant is explained. Then, it is
shown how to apply the SCA method to identify the liquid slosh based on the Hammerstein model.

Fig. 3 Input voltage applied to the remote car

Fig. 4 Output slosh from the accelerometer

Figure 5 shows a complete block diagram to identify the liquid slosh model in
Sect. 2. The proposed Hammerstein model consists of nonlinear function h(u) fol-
lowed by the transfer function G(s). The nonlinear function is a piece-wise affine
function given by
$$h(u) = \begin{cases} c_0 + m_1 (u - d_0), & \text{if } d_0 \le u < d_1,\\ c_1 + m_2 (u - d_1), & \text{if } d_1 \le u < d_2,\\ \quad\vdots & \\ c_{r-1} + m_r (u - d_{r-1}), & \text{if } d_{r-1} \le u < d_r, \end{cases} \qquad (1)$$

Fig. 5 Block diagram of Hammerstein model based SCA

and the transfer function G(s) is given by

$$G(s) = \frac{B(s)}{A(s)} = \frac{s^m + b_{m-1} s^{m-1} + \cdots + b_0}{a_m s^m + a_{m-1} s^{m-1} + \cdots + a_0}. \qquad (2)$$

In (1), the symbols $m_i = (c_i - c_{i-1})/(d_i - d_{i-1})$ $(i = 1, 2, \ldots, r)$ are the segment slopes, with connecting input and output points $d_i$ $(i = 0, 1, \ldots, r)$ and $c_i$ $(i = 0, 1, \ldots, r)$, respectively. For simplicity of notation, let $d = [d_0, d_1, \ldots, d_r]^T$ and $c = [c_0, c_1, \ldots, c_r]^T$. The input of the real liquid slosh plant and the identified model is denoted by $u(t)$, while the outputs of the real liquid slosh plant and the identified model are denoted by $y(t)$ and $\tilde{y}(t)$, respectively. Hence, the expression of the identified output can be written as

$$\tilde{y}(t) = G(s)\, h(u(t)). \qquad (3)$$

Moreover, several assumptions are adopted in this work:

(i) The orders of the polynomials $A(s)$ and $B(s)$ are assumed to be known.
(ii) The nonlinear function $h(u(t))$ is a one-to-one map of the input $u(t)$, and the values of $d_i$ $(i = 1, 2, \ldots, r)$ are pre-determined according to the response of the input $u(t)$.
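To make the model structure concrete, the following minimal Python sketch (not the authors' implementation; the breakpoint levels c are illustrative placeholders) evaluates the piece-wise affine nonlinearity in (1) for a given input sample:

```python
import numpy as np

def piecewise_affine(u, d, c):
    """Evaluate the piece-wise affine nonlinearity h(u) of Eq. (1).

    d : breakpoints [d0, d1, ..., dr] (monotonically increasing)
    c : output values [c0, c1, ..., cr] at the breakpoints
    Segment slopes m_i = (c_i - c_{i-1}) / (d_i - d_{i-1}).
    """
    d, c = np.asarray(d, float), np.asarray(c, float)
    u = np.clip(u, d[0], d[-1])                 # keep u inside the modeled range
    i = np.searchsorted(d, u, side="right")     # index of the active segment
    i = np.clip(i, 1, len(d) - 1)
    m = (c[i] - c[i - 1]) / (d[i] - d[i - 1])   # slope of that segment
    return c[i - 1] + m * (u - d[i - 1])

# Breakpoints d as chosen later in the paper; c values are placeholders only
d = [0, 0.2, 0.4, 0.6, 0.8, 1, 2, 3, 4, 5]
c = np.linspace(-5, 5, len(d))
print(piecewise_affine(0.5, d, c))
```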

Next, let $t_s$ be a sampling time for the real experimental input and output data $(u(t), y(t))$ $(t = 0, t_s, 2t_s, \ldots, N t_s)$. Then, in order to accurately identify the liquid slosh model, the following objective function in (4) is adopted in this study:

$$E(G, h) = \sum_{g=0}^{N} \left( y(g t_s) - \tilde{y}(g t_s) \right)^2. \qquad (4)$$

Note that the objective function in (4) is based on the sum of squared errors, which has been widely used in the literature [28, 29]. Finally, our problem
formulation can be described as follows.
Problem 1. Based on the given real experimental data (u(t), y(t)) in Fig. 1, find the
nonlinear function h(u) and the transfer function G(s) such that the objective
function in (4) is minimized.
Furthermore, it is shown how to apply the SCA to solve Problem 1. For simplicity, the design parameter of Problem 1 is defined as $x = [\,b_0 \;\; b_1 \;\cdots\; b_{m-1} \;\; a_0 \;\; a_1 \;\cdots\; a_m \;\; c_0 \;\cdots\; c_r\,]^T$, where the elements of the design parameter are the coefficients of both the nonlinear function and the transfer function of the continuous-time Hammerstein model. In the SCA framework, let $x_i$ $(i = 1, 2, \ldots, M)$ be the design parameter of each agent $i$ for $M$ total agents. Then, let $x_{ij}$ $(j = 1, 2, \ldots, D)$ be the $j$-th element of the vector $x_i$ $(i = 1, 2, \ldots, M)$, where $D$ is the size of the design parameter. Hence, by adopting the objective function in (4), the minimization problem is expressed as

$$\arg\min_{x_i(1),\, x_i(2),\, \ldots} E(x_i(k)) \qquad (5)$$

for iterations k = 1, 2, …, until the maximum iteration $k_{\max}$. Finally, the procedure of the SCA in solving Problem 1 is shown below:
Step 1: Determine the total number of agents $M$ and the maximum iteration $k_{\max}$. Set $k = 0$ and initialize the design parameters $x_i(0)$ $(i = 1, 2, \ldots, M)$ according to the upper bound $x_{up}$ and lower bound $x_{low}$ values of the design parameter.
Step 2: Calculate the objective function in (4) for each search agent $i$.
Step 3: Update the values of the best design parameter $P$ based on the generated objective function in Step 2.
Step 4: For each agent, update the design parameter using the following equation:

$$x_{ij}(k+1) = \begin{cases} x_{ij}(k) + r_1 \sin(r_2)\,\left| r_3 P_j - x_{ij}(k) \right|, & \text{if } r_4 < 0.5,\\ x_{ij}(k) + r_1 \cos(r_2)\,\left| r_3 P_j - x_{ij}(k) \right|, & \text{if } r_4 \ge 0.5, \end{cases} \qquad (6)$$

where

$$r_1 = 2\left(1 - \frac{k}{k_{\max}}\right) \qquad (7)$$

for maximum iteration $k_{\max}$ and constant positive value $a$ (set to 2 in (7)). Note that $r_2$, $r_3$ and $r_4$ are random values that are generated independently and uniformly in the ranges $[0, 2\pi]$, $[0, 2]$ and $[0, 1]$, respectively. The detailed justification for the selection of the coefficients $r_1$, $r_2$, $r_3$ and $r_4$ is clearly explained in [25]. In (6), the symbol $P_j$ $(j = 1, 2, \ldots, D)$ denotes the $j$-th element of the best current design parameter $P$ that is kept during the tuning process.
Step 5: After the maximum iteration is achieved, record the best design parameter $P$ and obtain the continuous-time Hammerstein model in Fig. 5. Otherwise, repeat Step 2.

4 Results and Analysis

In this section, the effectiveness of the SCA based method for identifying the liquid
slosh system using continuous-time Hammerstein model is demonstrated. In par-
ticular, the convergence curve response of the objective function in (4), the pole-zero mapping of the linear function and the plot of the nonlinear function will be presented and analyzed in this study.
Based on the experimental setup in Sect. 2, the input response u(t) as shown in
Fig. 3 is applied to the liquid slosh plant, and the output response y(t) is recorded as
shown in Fig. 4. Here, the input and output data are sampled at ts = 0.02 for
N = 450. In this study, the structure of G(s) is selected as follows:

$$G(s) = \frac{B(s)}{A(s)} = \frac{s^3 + b_2 s^2 + b_1 s + b_0}{a_4 s^4 + a_3 s^3 + a_2 s^2 + a_1 s + a_0} \qquad (8)$$

after performing several preliminary tests on the given data (u(t), y(t)). The fourth-order system is chosen by considering a cascade of two second-order systems, one for the DC motor of the remote car and one for the slosh dynamics. Meanwhile, the input points for the piece-wise affine function h(u(t)) are given by $d = [0, 0.2, 0.4, 0.6, 0.8, 1, 2, 3, 4, 5]^T$. The selection of the vector d is obtained after several preliminary experiments. The design parameter $x \in \mathbb{R}^{18}$ with its corresponding transfer function and nonlinear function coefficients is shown in Table 1. Next, the SCA algorithm is applied to tune the design parameter, with the initial values randomly selected between the upper bound $x_{up}$ and lower bound $x_{low}$ shown in Table 1. Note that the values $x_{up}$ and $x_{low}$ are obtained after performing several preliminary experiments. Here, we choose the number of agents M = 40 with maximum iteration $k_{\max}$ = 5000.
Table 1 Design parameter of liquid slosh plant

x     Coefficient   xlow     xup    P
x1    b2            −5       35     −3.7948
x2    b1            −5       35     10.7153
x3    b0            −5       35     −0.9059
x4    a4            −5       35     −0.6154
x5    a3            −2200    −1     −5.3112
x6    a2            −2200    −1     −139.8711
x7    a1            −2200    −1     −1132.2883
x8    a0            −2200    −1     −839.7621
x9    c0            −5       5      −4.8859
x10   c1            −5       5      −0.0219
x11   c2            −5       5      3.3211
x12   c3            −5       5      −4.7295
x13   c4            −5       5      −0.3240
x14   c5            −5       5      −4.4858
x15   c6            −5       5      −0.0002
x16   c7            −5       5      0.0000
x17   c8            −5       5      0.1679
x18   c9            −5       5      −4.3282

Fig. 6 Convergence curve response

Figure 6 shows the convergence response of the objective function, with E(G, h) = 0.1616 at $k_{\max}$ = 5000 and an 80.44% improvement of the objective function, producing the best design parameter P shown in the final column of Table 1. This shows that the SCA based method is able to minimize the objective function in (4) and produce an identified output $\tilde{y}(t)$ that is quite close to the real output $y(t)$, as can be clearly seen in Fig. 7. Note that the identified output response tends to yield high oscillation when the input is injected into the system and starts to attenuate when the input is zero, which is quite similar to the response of the real experimental output.
Fig. 7 Response of the identified output $\tilde{y}(t)$ and real output $y(t)$

Fig. 8 Pole-zero map of transfer function G(s)

Fig. 9 Resultant of piece-wise affine function h(u)

In the real experimental setup, we can say that the liquid slosh system is stable since the liquid slosh output decays gradually as $t \to \infty$. In order to validate the stability of our model, we use the pole-zero map of the identified transfer function G(s) as shown in Fig. 8. From the pole-zero map, all the poles are located in the left half of the s-plane. In particular, the obtained poles are −0.1190 ± j14.8001, −7.5621 and −0.8229, while the obtained zeros are 0.0872 and 1.8538 ± j2.6373. On the other hand, we can also observe the shape of the nonlinear function by plotting the obtained piece-wise function as depicted in Fig. 9. Note that our nonlinear function is not restricted to any particular form (e.g., quadratic), which makes it more general and provides more flexibility in searching for a justifiable function.
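As a quick cross-check of this stability claim, the poles can be recomputed directly from the identified denominator coefficients in Table 1; the short numpy sketch below (an illustrative check, not part of the original paper) does this for A(s) in (8):

```python
import numpy as np

# Identified denominator coefficients of A(s), taken from Table 1 (column P)
a4, a3, a2, a1, a0 = -0.6154, -5.3112, -139.8711, -1132.2883, -839.7621

# Poles of G(s) are the roots of A(s) = a4*s^4 + a3*s^3 + a2*s^2 + a1*s + a0
poles = np.roots([a4, a3, a2, a1, a0])
print(poles)                     # expected close to -0.119 +/- j14.80, -7.56, -0.82
print(np.all(poles.real < 0))    # True -> the identified linear sub-plant is stable
```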

5 Conclusion

In this paper, the identification of a liquid slosh plant using a continuous-time Hammerstein model based on the Sine Cosine Algorithm (SCA) has been presented. The results demonstrated that the proposed generic Hammerstein model based on the SCA has good potential in identifying the real liquid slosh behavior. In particular, it is shown that the proposed method is able to produce an identified output quite close to the real liquid slosh output. Moreover, the resultant linear model has been shown to be stable based on the pole-zero map. It is also shown that the use of the piece-wise affine function gives the SCA more flexibility to search for a more generic nonlinear function. In the future, our work can be extended to other model structures such as the continuous-time Wiener and Hammerstein-Wiener models.

Acknowledgements The authors gratefully acknowledge the Research and Innovation Department of Universiti Malaysia Pahang under grant RDU1703153 for the financial support.

References

1. Rizzuto E, Tedeschi R (1997) Surveys of actual sloshing loads on board of ships at sea. In:
Proceedings of International Conference on Ship and Marine Research, pp 7.29–7.37
2. Terashima K, Schmidt G (1994) Sloshing analysis and suppression control of tilting-type
automatic pouring machine. In: Proceedings of IEEE International Symposium on Industrial
Electronics, pp 275–280
3. Acarman T, Ozguner U (2006) Rollover prevention for heavy trucks using frequency shaped
sliding mode control. Vehi Syst Dyn 44(10):737–762
4. Li C, Zhu X, Cao G, Sui S, Hu M (2008) Identification of the Hammerstein model of a
PEMFC stack based on least squares support vector machines. J Power Sour 175:303–316
5. Kara T, Eker I (2004) Nonlinear modeling and identification of a DC motor for bidirectional
operation with real time experiments. Energy Convers Manag 45(7–8):1087–1106
6. Su SW, Wang L, Celler BG, Savkin AV (2007) Oxygen uptake estimation in humans during
exercise using a Hammerstein model. Ann Biomed Eng 35(11):1898–1906

7. Westwick DT, Kearney RE (2001) Separable least squares identification of nonlinear


Hammerstein models: Application to stretch reflex dynamics. Ann Biomed Eng 29(8):707–
718
8. Zhang Q, Wang Q, Li G (2016) Nonlinear modeling and predictive functional control of
Hammerstein system with application to the turntable servo system. Mech Syst Signal Process
72:383–394
9. Ai Q, Peng Y, Zuo J, Meng W, Liu Q (2019) Hammerstein model for hysteresis
characteristics of pneumatic muscle actuators. Int J Intell Robot Appl 3(1):33–44
10. Saleem A, Mesbah M, Al-Ratout S (2017) Nonlinear Hammerstein model identification of
amplified piezoelectric actuators (APAs): Experimental considerations. In: 2017 4th
International Conference on Control, Decision and Information Technologies (CoDIT),
pp 0633–0638
11. Zhang HT, Hu B, Li L, Chen Z, Wu D, Xu B, Huang X, Gu G, Yuan Y (2018) Distributed
Hammerstein modeling for cross-coupling effect of multiaxis piezoelectric micropositioning
stages. IEEE/ASME Trans Mechatron 23(6):2794–2804
12. Bai EW, Li D (2004) Convergence of the iterative Hammerstein system identification
algorithm. IEEE Trans Autom Control 49(11):1929–1940
13. Hou J, Chen F, Li P, Zhu Z (2019) Fixed point iteration-based subspace identification of
Hammerstein state-space models. IET Control Theory Appl 13(8):1173–1181
14. Ge Z, Ding F, Xu L, Alsaedi A, Hayat T (2019) Gradient-based iterative identification method
for multivariate equation-error autoregressive moving average systems using the decompo-
sition technique. J Frankl Inst 356(3):1658–1676
15. Hou J, Liu T, Wahlberg B, Jansson M (2018) Subspace Hammerstein model identification
under periodic disturbance. IFAC-PapersOnLine 51(15):335–340
16. Hou J, Liu T, Wang QG (2019) Subspace identification of Hammerstein-type nonlinear
systems subject to unknown periodic disturbance. Int J Control, 1–29 (Just-accepted)
17. Jamaludin IW, Wahab NA (2017) Recursive subspace identification algorithm using the
propagator based method. Indones J Electr Eng Comput Sci 6(1):172–179
18. Wang D, Zhang W (2015) Improved least squares identification algorithm for multivariable
Hammerstein systems. J Frankl Inst 352(11):5292–5307
19. Bai EW (2002) A blind approach to the Hammerstein-Wiener model identification.
Automatica 38(6):967–979
20. Ma L, Liu X (2015) A nonlinear recursive instrumental variables identification method of
Hammerstein ARMAX system. Nonlinear Dyn 79(2):1601–1613
21. Lin W, Liu PX (2006) Hammerstein model identification based on bacterial foraging. Electron
Lett 42(23):1332–1333
22. Gotmare A, Patidar R, George NV (2015) Nonlinear system identification using a cuckoo
search optimized adaptive Hammerstein model. Expert Syst Appl 42(5):2538–2546
23. Al-Duwaish HN (2011) Identification of Hammerstein models with known nonlinearity
structure using particle swarm optimization. Arab J Sci Eng 36(7):1269–1276
24. Zhang H, Zhang H (2013) Identification of hammerstein model based on Quantum Genetic
Algorithm. Telkomnika 11(12):7206–7212
25. Mirjalili S (2016) SCA: A sine cosine algorithm for solving optimization problems.
Knowl-Based Syst 96:120–133
26. Suid MH, Tumari MZ, Ahmad MA (2019) A modified sine cosine algorithm for improving
wind plant energy production. Indones J Electr Eng Comput Sci 16(1):101–106
27. Suid MH, Ahmad MA, Ismail MRTR, Ghazali MR, Irawan A, Tumari MZ (2018) An
improved sine cosine algorithm for solving optimization problems. In: IEEE Conference on
Systems, Process and Control (ICSPC), pp 209–213
28. Mjahed M, Ayad H (2019) Quadrotor identification through the cooperative particle swarm
optimization-cuckoo search approach. Comput Intell Neurosci 2019:1–10
29. Gupta S, Gupta R, Padhee S (2018) Parametric system identification and robust controller
design for liquid–liquid heat exchanger system. IET Control Theory Appl 12(10):1474–1482
Cardiotocogram Data Classification
Using Random Forest Based Machine
Learning Algorithm

M. M. Imran Molla, Julakha Jahan Jui, Bifta Sama Bari, Mamunur Rashid, and Md Jahid Hasan

Abstract Cardiotocography is the most broadly utilized technique in obstetrics practice to monitor fetal health condition. The foremost motive of monitoring is to detect fetal hypoxia at an early stage. This modality is also widely used to record fetal heart rate and uterine activity. The exact analysis of cardiotocograms is critical for further treatment. Accordingly, fetal state evaluation using machine learning techniques on cardiotocogram data has attracted significant attention. In this paper, we implement a model-based CTG data classification system utilizing a supervised Random Forest (RF), which can classify the CTG data based on its training data. The overall performance of the supervised machine learning based classification approach was significant. In this study, Precision, Recall and F-score have been employed as the metrics to evaluate the performance. It was found that the RF based classifier could identify normal, suspicious and pathologic conditions from the nature of the CTG data with 94.8% accuracy. We also highlight the major features based on Mean Decrease Accuracy and Mean Decrease Gini.

Keywords Fetal heart rate · Random forest classifier · Cardiotocography

M. M. Imran Molla
Faculty of Computer Science and Engineering, Khwaja Yunus Ali University,
6751 Enayetpur, Sirajganj, Bangladesh
J. J. Jui (&) · B. S. Bari · M. Rashid
Faculty of Electrical and Electronics Engineering, Universiti Malaysia Pahang,
26600 Pekan, Pahang, Malaysia
e-mail: [email protected]
M. J. Hasan
Faculty of Mechanical and Manufacturing Engineering, Universiti Malaysia Pahang,
26600 Pekan, Pahang, Malaysia


1 Introduction

Cardiotocography is a strategy that is utilized to screen fetal health condition during pregnancy. A cardiotocogram (CTG) comprises two signals, namely the
fetal heart rate (FHR) as well as uterine activity (UA). The identification of fetal
hypoxia at early stage is the target for CTG monitoring. Further examinations for
fetal condition may be performed or the baby is delivered by a surgical strategy.
A standardized nomenclature has been embraced to peruse the cardiotocographs
[1]. It incorporates baseline fetal heart rate (110 to 160 beats/minute), uterine
activity, baseline FHR variability (5 to 25 beats/minute above and below the stable
FHR baseline), periods of decreased and increased FHR variability and existence of
any acceleration or deceleration [2]. It is conceivable to recognize the fetal hypoxia
(lack of oxygen, normally in the range of 1 to 5%) by observing the FHR. If fetal hypoxia is prolonged, the possibility of disability in the newborn baby becomes high and, in some cases, it may lead to death. Consequently, it is essential to
detect abnormal FHR patterns and take suitable actions for evading prenatal mor-
bidity as well as mortality [3, 4]. Cardiotocography can be utilized to examine the
fetus health condition, normoxia [5] (oxygen tensions between 10–21%) and nor-
mal or abnormal fetus acid base status [6]. Thus, numerous indicators (occurring
days or hours before fetus death) that can be identified promptly can lead to
appropriate obstetric intervention which could assist in delivering a healthy baby.
CTG interpretation is done manually, which may cause human error. A computerized CTG may enable automatic interpretation, helping to decrease the fetal mortality rate [7, 8]. For the
classification of CTG data, various techniques are utilized. Czabanski et al. [9]
reported that two steps mechanism consisting of weighted fuzzy scoring and LSVM
algorithm are applied to FHR to predict the acidemia hazard. Artificial neural
network is applied to record the fetal wellbeing by Georgieva et al. [10] and
Jezewski et al. [11]. Esra et al. [12] utilized adaptive boosting ensemble of decision
trees for analyzing cardiotocogram to detect pathologic fetus. Neuro-fuzzy method
[13], naïve Bayes classifier [14] are two approaches utilized in the ensemble
classifiers to combine the classification outputs of the weak learners. Random forest
[15] is a classifier that is built on multiple trees from randomly sampled subspaces
of the input features which combine the output of the trees using bagging. It is
applied to different real life applications including protein sequencing [16], clas-
sification of Alzheimer’s disease [17], cancer detection [18], physical activity
classification [19], classification of cardiotocograms using random forest classifier
[20] and so on. Fetal state classification from cardiotocography with feature
extraction utilizing hybrid K-Means and support vector machine has been reported
in [21] with 90.64% accuracy. Fetal state assessment using Cardiotocogram with
Artificial Neural Networks has been presented in [22]. Fetal state assessment using
cardiotocography parameters by applying PCA and AdaBoost has been done by
Zhang et al. [23] with 93% accuracy. In [24], decision Tree is used for analyzing the
Cardiotocogram data for fetal distress determination. In this paper, random forest
classifier is applied for the classification of cardiotocograms into normal, suspicious
as well as pathological classes. The feature importance index is utilized for identifying important features of the database. Fetal state identification from cardiotocogram
applying LS-SVM with PSO (Particle Swarm Optimization) and binary decision
tree has been reported in [25]. Their proposed method provides 91.62% classification accuracy. It has been observed that good classification accuracy can be obtained by applying only ten important features among the twenty-one features [25]. A mathematical modeling strategy has been presented by Beatrijs et al. [26] to simulate early deceleration in CTG. Their results for the uncompromised fetus show that partial oxygen pressure decreases with the strength and duration of the contraction. Sundar et al. proposed classification of cardiotocogram data using a neural network in [27] with an accuracy of 91%. A feature group weighting method for subspace clustering of high-dimensional data was reported in [28], achieving an F-measure of 0.77. Zhou and Sun proposed active learning of Gaussian processes with an accuracy of 89% in [29]. Cruz et al. proposed the META-DES ensemble classifier for the identification with an accuracy of 84.6% in [30].

2 Research Methodology

Figure 1 depicts the complete working procedure of the Random Forest algorithm. For building any model, it is first necessary to import the dataset. In this research the CTG dataset [27] has been used. This dataset is collected from the UCI Machine Learning Repository. Then, various operations have been performed to check whether there is any missing value or misleading data present in the dataset. After that, the dataset is split in order to train the model. For the classification model, the dataset has been split into an 80% training set and a 20% test set. The Random Forest classifier has been used to obtain a trained model using the training dataset. After the training phase, a testing phase is performed to validate the predictive results using the test data. Finally, various measurements are also used to evaluate the performance of the model.

2.1 Dataset Description

A freely accessible CTG data set [31] from the UCI Machine Learning Repository
has been utilized in this study. This data set comprises 2126 instances described by 22 attributes. The last two attributes are class codes for the FHR pattern and the fetal condition, respectively. Each instance can be grouped utilizing the FHR pattern and fetal condition. The attributes are presented in Table 1. CTG is a technique for recording the fetal heartbeat and the uterine contractions during pregnancy, typically in the last trimester.
Fig. 1 Working principle of Random Forest regression

The data set comprises 2126 cardiotocograms which have been collected from the Maternity and Gynecological Clinic [32]. The CTGs were classified by three expert obstetricians, and the majority vote determined the class of each cardiotocogram. The dataset is labeled with one of the three classes, Normal (N), Suspect (S) and Pathologic (P), as shown in Table 2.

Table 1 Explanation of features


Symbol of features Description
LB FHR baseline (beats/min)
AC Number of accelerations/second
FM Number of fetal movements/second
UC Number of uterine contractions/second
DL Number of light decelerations/second
DS Number of severe decelerations/second
DP Number of prolonged decelerations/second
ASTV Percentage of time with abnormal short-term variability
MSTV Mean value of short term variability
ALTV Percentage of time with abnormal long-term variability
MLTV Mean value of long-term variability
Width Width of FHR histogram
Min Minimum of FHR histogram
Max Maximum of FHR histogram
Nmax Number of histogram peaks
Nzeros Number of histogram zeros
Mode Histogram mode
Mean Histogram mean
Median Histogram median
Variance Histogram variance
Tendency Histogram tendency

Table 2 Class distribution of CTGs


Fetal state Class Numeric class Number of FHR recordings
Normal N 1 1655
Suspect S 2 295
Pathologic P 3 176
Total 2126

2.2 Random Forest Classifier

Random forest classifier makes a set of decision trees from arbitrarily chosen subset
of training dataset. It aggregates the votes from various decision trees to choose the
final class of the test objects [33]. Each tree is grown as follows:
1. If the number of cases within the training set is N, sample N cases at random -
but with replacement, from the original data. This sample will be the training set
for growing/developing the tree.

2. If there are M input variables, a number m << M indicates that at each node, m
variables are chosen at random out of the M and the best split on these m is
utilized to split the node. The value of m is held constant during the forest
growing.
3. Each tree is grown to the highest extent possible. There is no pruning.
Decreasing m decreases both the correlation and the strength. On the other hand, increasing it increases both. Somewhere in between is an "optimal" range of m, usually quite wide. Utilizing the OOB error rate as shown below, a value of m in this range can quickly be found. This is the only adjustable parameter to which random forests are somewhat sensitive.
In Laymen’s term,
Assume that the training set is represented as: [X1, X2, X3, X4 …… Xn] with
corresponding labels [L1, L2, L3, L4 …… Ln], random forest may make three
decision trees having input of subset for example,

½X1 ; X2 ; X3 . . .Xn  ð1Þ

½X1 ; X2 ; X4 . . .Xn  ð2Þ

½X2 ; X3 ; X4 . . .Xn  ð3Þ

Thus, the prediction is based on the majority of votes from the decision trees. Classification outcomes are presented using precision, recall and F-measure. Precision, or positive predictive value (PPV), is defined as the proportion of instances that truly belong to a class (TP: True Positive) out of the total instances, including TP and FP (False Positive), that the classifier assigns to this particular class:

$$\text{Precision} = \frac{TP}{TP + FP} \qquad (4)$$

Recall, or sensitivity, is defined as the proportion of instances classified into a class out of the total instances belonging to that class. The total number of instances of a class includes TP and FN (False Negative):

$$\text{Recall} = \frac{TP}{TP + FN} \qquad (5)$$

F-measure can be defined as the combination of precision and recall, represented as

$$\text{F-measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (6)$$
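For illustration, the following minimal scikit-learn sketch (an assumed workflow, not the authors' code; the file name `ctg.csv` and column name `NSP` are placeholders) reproduces the general pipeline of an 80/20 split, random forest training, and the metrics in (4)–(6):

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_recall_fscore_support

# Placeholder file/column names: 21 feature columns and the class label NSP (1/2/3)
data = pd.read_csv("ctg.csv")
X, y = data.drop(columns=["NSP"]), data["NSP"]

# 80% training / 20% testing split, as in the paper
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=300, random_state=0)
clf.fit(X_train, y_train)

# Precision, recall and F-measure per Eqs. (4)-(6), weighted over the three classes
p, r, f, _ = precision_recall_fscore_support(
    y_test, clf.predict(X_test), average="weighted")
print(f"precision={p:.3f} recall={r:.3f} f-measure={f:.3f}")
```

The choice of 300 trees echoes the observation in Sect. 3 that the error becomes almost constant around that forest size; stratified splitting is an added convenience, not a detail stated in the paper.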

3 Results and Discussions

The random forest classifier is used to classify the three classes: Normal (N), Suspect (S) and Pathologic (P). In this experiment, 10-fold cross-validation of the random forest model has been performed, and the resulting accuracy for different numbers of randomly selected predictors is shown in Fig. 2.
The Out-of-Bag (OOB) error along with the class error for each class is also evaluated and shown in Fig. 3. Out-of-bag (OOB) error, also called the out-of-bag estimate, is a method of measuring the prediction error of random forests. It is seen that the error rate is high for small tree sizes and decreases as the number of trees increases. Errors are almost constant when the tree size reaches 300.
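A quick way to reproduce such an error-versus-tree-size curve is scikit-learn's built-in OOB estimate; the snippet below is a hedged sketch (the grid of tree counts is illustrative) assuming `X_train` and `y_train` from the earlier example:

```python
from sklearn.ensemble import RandomForestClassifier

# OOB error for increasing forest sizes (illustrative grid of tree counts)
for n_trees in [10, 50, 100, 200, 300, 400, 500]:
    rf = RandomForestClassifier(n_estimators=n_trees, oob_score=True,
                                bootstrap=True, random_state=0)
    rf.fit(X_train, y_train)
    print(n_trees, "trees -> OOB error:", 1.0 - rf.oob_score_)
```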
The number of nodes for different tree sizes is shown in Fig. 4, which depicts the size of the trees (number of nodes) in the ensemble and the relationship between tree size and the corresponding number of terminal nodes.
Training and testing data sets are created by separating the whole data set into an 80-20 split randomly without any replacement. The random forest classifier is trained on the training set. The class labels of the testing set are predicted by the trained classifier. The mean and standard deviation of Precision, Recall and F-measure are reported for the training and testing data (Table 3). The random forest classifier showed exceptionally good performance for the training data, achieving large values of precision, recall and F-measure. The weighted average of the values appears in Table 3 (last row). Precision and recall of the Normal class are 0.948 with an F-measure of 0.948 for the testing data sets. The Suspect class (S) shows small precision and recall values when contrasted with the other two classes. This is apparent since the specialists put these cardiotocograms in the suspect class too. In this way, it is easier for this class to be confused by the classifier with either the normal (N) class or the pathologic (P) class.

Fig. 2 Randomly selected predictor vs Accuracy (cross validation)



Fig. 3 Classification error with increase of tree

Fig. 4 Number of nodes for trees

Table 3 Classification result for training and testing dataset; values are represented as mean (standard deviation)

Class              Precision        Recall           F-Measure
                   Train    Test    Train    Test    Train    Test
Normal             0.999    0.979   0.999    0.967   0.999    0.973
Suspect            0.996    0.760   0.996    0.905   0.996    0.826
Pathologic         1.00     0.947   1.00     0.857   1.00     0.900
Weighted average   0.999    0.948   0.999    0.948   0.999    0.948

Table 4 shows the confusion matrix for one of the testing data sets. Most of the Normal class is identified as the Normal class, whereas 4 cases of the suspect (S) class are confused with the normal (N) class. Few cases of the pathologic (P) class (only 1) are confused with the normal class.
The accuracy of the overall classification is 94.8% for the testing data set (Table 3). There are 21 features in the data set. All the features may not be equally important in contributing to the classification. Thus, it is necessary to study the impact of

Table 4 Confusion matrix for one of testing data set


Class Normal Suspect Pathologic
Normal 146 4 1
Suspect 2 19 0
Pathologic 1 2 18

Fig. 5 Important variable among the 21 variables

features on the classification of all three classes. The 10 most important variables based on Mean Decrease Accuracy and Mean Decrease Gini are shown in Fig. 5. The Mean Decrease Accuracy of a variable is determined during the out-of-bag error calculation phase. A variable is considered more important if its exclusion (or permutation) decreases the accuracy of the random forest more. That is why variables with a large mean decrease in accuracy are more important for classification. ALTV has the highest mean decrease in accuracy. The Mean Decrease Gini indicates the average (mean) of a variable's total decrease in node impurity, weighted by the proportion of samples reaching that node in each decision tree of the random forest. This is an effective measure that implies how important a variable is for estimating the value of the target variable across all of the trees making up the forest. A variable with a higher Mean Decrease Gini has higher variable importance. MSTV has the highest Mean Decrease Gini among all the variables.
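As an illustration of how such importance scores can be obtained (a sketch assuming the `clf`, `X_test` and `y_test` objects from the earlier example; scikit-learn reports Gini-based importances directly, while permutation importance approximates the accuracy-decrease measure):

```python
import pandas as pd
from sklearn.inspection import permutation_importance

# Gini-based importance (analogous to Mean Decrease Gini)
gini_imp = pd.Series(clf.feature_importances_, index=X_test.columns)

# Permutation importance on held-out data (analogous to Mean Decrease Accuracy)
perm = permutation_importance(clf, X_test, y_test, n_repeats=10, random_state=0)
perm_imp = pd.Series(perm.importances_mean, index=X_test.columns)

print(gini_imp.sort_values(ascending=False).head(10))
print(perm_imp.sort_values(ascending=False).head(10))
```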
A partial dependency on an important variable is shown in Fig. 6. Partial
dependence plot provides a graphical representation of the marginal effect of a
variable on the class probability. Negative values (in the y-axis) indicate the pos-
itive class is less likely for that value of the independent variable (x-axis) according
to the model. Similarly, positive values indicate that the positive class is more likely
for that value of the independent variable according to the model. Clearly, zero
implies no average impact on class probability according to the model.

Fig. 6 Partial dependencies on ASTV

Table 5 Comparison with previous works

References | Method | Accuracy
Sundar et al. [27] | Neural network | Precision (0.91), Recall (0.90) and F-Measure (0.90)
Jezewski et al. [11] | LSVM classifier | Sensitivity (83%), Specificity (92%)
Chen et al. [28] | FG-Kmeans | Precision (0.76), Recall (0.81), F-measure (0.77)
Cruz et al. [30] | META-DES Ensemble Classifier | Overall accuracy 84.6%
Arif [20] | Random Forest (full features) | Precision, Recall and F-measure are 0.936; overall accuracy 93.6%
Zhou and Sun [29] | Active learning of Gaussian Processes | Overall accuracy 89%, with a small training dataset of 140 examples only
Chamidah [21] | 1. SVM; 2. K-Means + SVM | 76.72%; 90.64%
Zhang [23] | PCA and AdaBoost | Overall accuracy 93%
Proposed method | Random Forest (full features) | Precision, Recall and F-measure are 0.948; overall accuracy 94.8%

The proposed work is compared with previous works in Table 5. In this study, the whole dataset is partitioned into 80% (training set) and 20% (testing set). The classification accuracy is reported as the average value of 10 independent runs. It can be concluded that the overall classification accuracy is better than the previous results.

4 Conclusions

The cardiotocograms (CTG) were sorted by three expert obstetricians. The data set used was collected from the Maternity and Gynecological Clinic (University Hospital of Porto in Portugal) (Ayres-de-Campos, Bernardes et al. 2000 [32]). The performance of the random forest classifier is analyzed by utilizing three different performance measures, Precision, Recall and F-measure, to distinguish the pathologic and suspicious conditions of the fetus from the normal condition. The dataset is partitioned into training and testing datasets randomly (80% for training and 20% for testing). As the classifier is stochastic, ten-fold cross-validation is utilized with the 80%-20% split of the CTG dataset. The proposed technique achieves a classification accuracy of 94.8% when the complete feature set is supplied to the classifier. The classifier performance has also been evaluated in terms of precision, F-measure and recall, which are all 0.948.

Acknowledgements The authors would like to acknowledge the great support of the Faculty of Electrical & Electronics Engineering, Universiti Malaysia Pahang, Malaysia.

References

1. Macones GA, Hankins GD, Spong CY, Hauth J, Moore T (2008) The 2008 National Institute
of Child Health and Human Development workshop report on electronic fetal monitoring:
update on definitions, interpretation, and research guidelines. J Obstetric Gynecol Neonatal
Nurs 37(5):510–515
2. Ugwumadu A (2013) Understanding cardiotocographic patterns associated with intrapartum
fetal hypoxia and neurologic injury. Best Practice Res Clin Obstetric Gynaecol 27(4):509–536
3. Chen HY, Chauhan SP, Ananth CV, Vintzileos AM, Abuhamad AZ (2011) Electronic fetal
heart rate monitoring and its relationship to neonatal and infant mortality in the United States.
Am J Obstetric Gynecol 204(6):491–501
4. Lees C, Marlow N, Arabin B, Bilardo CM, Brezinka C, Derks JB, Wolf H (2013) Perinatal
morbidity and mortality in early-onset fetal growth restriction: cohort outcomes of the trial of
randomized umbilical and fetal flow in Europe (TRUFFLE). Ultrasound Obstetric Gynecol 42
(4):400–408
5. Carbonne B, Langer B, Goffinet F, Audibert F, Tardif D, Le Goueff F (1997) Multicenter
study on the clinical value of fetal pulse oximetry. Am J Obstetric Gynecol 177(3):593–598
6. Spencer JA (1993) Clinical overview of cardiotocography. BJOG Int J Obstetrics Gynaecol
100(9):4–7
7. Grivell RM, Alfirevic Z, Gyte GM, Devane D (2010) Antenatal cardiotocography for fetal
assessment. Cochrane Database Syst Rev 1
8. Brown R, Wijekoon JH, Fernando A, Johnstone ED, Heazell AE (2014) Continuous objective
recording of fetal heart rate and fetal movements could reliably identify fetal compromise,
which could reduce stillbirth rates by facilitating timely management. Med Hypotheses 83
(3):410–417

9. Czabanski R, Jezewski J, Matonia A, Jezewski M (2012) Computerized analysis of fetal heart


rate signals as the predictor of neonatal acidemia. Expert Syst Appl 39(15):11846–11860
10. Georgieva A, Payne SJ, Moulden M, Redman CW (2013) Artificial neural networks applied
to fetal monitoring in labour. Neural Comput Appl 22(1):85–93
11. Jezewski M, Wrobel J, Labaj P, Leski J, Henzel N, Horoba K, Jezewski J (2007) Some
practical remarks on neural networks approach to fetal cardiotocograms classification. In:
IEEE 29th annual international conference of the engineering in medicine and biology society
(EMBS 2007), pp 5170–5173
12. Karabulut EM, Ibrikci T (2014) Analysis of cardiotocogram data for fetal distress
determination by decision tree based adaptive boosting approach. J Comput Commun 2
(09):32–37
13. Czabanski R, Jezewski M, Wrobel J, Horoba K, Jezewski J (2008) A neuro-fuzzy approach to
the classification of fetal cardiotocograms. In: 14th nordic-baltic conference on biomedical
engineering and medical physics. Springer, Heidelberg, pp 446–449
14. Menai MEB, Mohder FJ, Al-mutairi F (2013) Influence of feature selection on naïve Bayes
classifier for recognizing patterns in cardiotocograms. J Med Bioeng 2(1):66–70
15. Breiman L (2001) Random forests. Mach Learn 45(1):5–32
16. Kandaswamy KK, Chou KC, Martinetz T, Möller S, Suganthan PN, Sridharan S,
Pugalenthi G (2011) AFP-Pred: a random forest approach for predicting antifreeze proteins
from sequence-derived properties. J Theo Biol 270(1):56–62
17. Gray KR, Aljabar P, Heckemann RA, Hammers A, Rueckert D (2013) Random forest-based
similarity measures for multi-modal classification of Alzheimer’s disease. Neuroimage
65:167–175
18. Ozcift A (2012) Enhanced cancer recognition system based on random forests feature
elimination algorithm. J Med Syst 36(4):2577–2585
19. Arif M, Bilal M, Kattan A, Ahamed SI (2014) Better physical activity classification using
Smartphone acceleration sensor. J Med Syst 38(9):1–10
20. Arif M (2015) Classification of cardiotocograms using random forest classifier and selection
of important features from cardiotocogram signal. Biomater Biomech Bioeng 2:173–183
21. Chamidah N, Wasito I (2015) Fetal state classification from cardiotocography based on
feature extraction using hybrid K-Means and support vector machine. In: 2015 international
conference on advanced computer science and information systems (ICACSIS), pp 37–41
22. Yılmaz E (2016) Fetal state assessment from cardiotocogram data using artificial neural
networks. J Med Biol Eng 36(6):820–832
23. Zhang Y, Zhao Z (2017) Fetal state assessment based on cardiotocography parameters using
PCA and AdaBoost. In: 2017 10th international congress on image and signal processing,
BioMedical engineering and informatics (CISP-BMEI), pp 1–6
24. Permanasari AE, Nurlayli A (2017) Decision tree to analyze the cardiotocogram data for fetal
distress determination. In: 2017 international conference on sustainable information
engineering and technology (SIET), pp 459–463
25. Yılmaz E, Kılıkçıer Ç (2013) Determination of fetal state from cardiotocogram using
LS-SVM with particle swarm optimization and binary decision tree. Comput Math Methods
Med 1–8
26. Beatrijs H, GuidOei S, Bovendeerd PHM (2013) Simulation of reflex late decelerations in
labor with a mathematical model. Early Human Dev 89(1):7–19
27. Sundar C, Chitradevi M, Geetharamani G (2013) An overview of research challenges for
classification of cardiotocogram data. J Comput Sci 9(2):198–206
28. Chen X, Ye Y, Xu X, Huang JZ (2012) A feature group weighting method for subspace
clustering of high-dimensional data. Pattern Recogn 45(1):434–446
29. Zhou J, Sun S (2014) Active learning of Gaussian processes with manifold-preserving graph
reduction. Neural Comput Appl 25(7–8):1615–1625
30. Cruz RM, Sabourin R, Cavalcanti GD, Ren TI (2015) META-DES: a dynamic ensemble
selection framework using meta-learning. Pattern Recogn 48(5):1925–1935

31. UCI Machine Learning Repository, 13 March 2019. http://archive.ics.uci.edu/ml/datasets/Cardiotocography
32. Ayres-de Campos D, Bernardes J, Garrido A, Marques-de-Sá J, Pereira-Leite L (2000)
SisPorto 2.0: a program for automated analysis of cardiotocograms. J Mater Fetal Med
9(5):311–318
33. Patel S (2017) Random Forest Classifier, 18 May 2017. Chapter 5
FPGA Implementation of Sensor Data
Acquisition for Real-Time Human Body
Motion Measurement System

Zarina Tukiran, Afandi Ahmad, Herdawatie Abd. Kadir, and Ariffudin Joret

Abstract In most sensor-based human body motion measurement systems, a microcontroller and a general-purpose unit are used to acquire and process the sensor data. These processing devices, however, have some limitations in obtaining data in parallel, especially from various sensors. This paper focuses on the use of an FPGA as a processing device to acquire real-time sensor data from various sensors concurrently. An architecture for real-time sensor data acquisition is proposed, utilizing the parallelism features of an FPGA. The architecture is also designed to stream the sensor data from the FPGA to the host. This paper also investigates the FPGA performance of the proposed architecture in terms of resource usage and speed for various optimisation techniques. The implementation results show that the synthesis optimisation technique contributes to the overall FPGA performance. In addition, the experimental findings show promising results for implementing a state-of-the-art FPGA-based human body motion measurement system.

Keywords Sensor data acquisition · Body motion measurement · FPGA

1 Introduction

In human motion analysis, most of the researchers focused on sensor data acqui-
sition using a microcontroller and process the sensor data using a general-purpose
unit [1–8]. Field programmable gate array (FPGA) is another type of processing
unit that can be used to process data obtained from the sensor. Some of the FPGA's advantages over a microcontroller are the ability to perform parallel computing and fast real-time operation [9]. In this work, multiple wearable sensors were mounted on the human lower limb to measure the motion. Since there is a need to acquire sensor data from multiple sensors and at the same time perform other tasks, the FPGA is chosen as the processing device.
As illustrated in Fig. 1, the proposed architecture of real-time human body
measurement system consists of three (3) main units; sensing, processing, and
displaying the measurement data. In this study, the sensing unit utilises four
(4) tri-axial accelerometer sensors to measure the lower body movement of both left
and right shank and left and right thigh. These sensors are connected to the pro-
cessing unit; the FPGA board via FPGA I/O analogue pin connectors. In the pro-
cessing unit, there are two main modules; the ResDAQ and joint measurement
modules. The ResDAQ module performs the task of acquiring the sensor data in real-time, whilst the latter module computes the sensor data to obtain the joint

Fig. 1 Proposed architecture

angle before it is streamed to the host via a fast Ethernet cable. At the host, the GUI
module displays the measurement results to the user. The GUI module is also
programmed to save the measurement results for future reference.
This study utilizes LabVIEW FPGA 2011 to implement the ResDAQ and joint measurement modules, whilst LabVIEW 2011 Service Pack 1 (SP1) is used to implement the GUI module. However, this paper only focuses on the implementation of the ResDAQ module, as discussed in Sect. 2. Section 3 discusses the implementation and experimental findings. Section 4 presents the conclusion and future works.

2 Proposed FPGA-Based Sensor Data Acquisition

2.1 Hardware Configuration Between the Sensors, FPGA Board and Host

The proposed real-time sensor data acquisition is designed and implemented in two
(2) phases; the hardware and software. The hardware phase involves configuring a
physical connection between sensors, an FPGA and a host. This physical con-
nection is needed for streaming sensor data to the host via the FPGA board. The
software phase involves programming the acquisition and processing of the sensor data on the FPGA board.
In this work, as depicted in Fig. 1, the hardware configuration has two (2) parts;
(i) the configuration between the FPGA board and personal computer (PC), and
(ii) the configuration between the sensors and the FPGA board. The FPGA board and the host are connected via an Ethernet cable that must be installed properly through the RJ-45 port on the FPGA board and on the host. The configuration between the FPGA and the host is performed automatically by the Measurement & Automation (MAX) software [10]. The sensors are connected to the FPGA board via the FPGA analogue I/O pin connection. Since the FPGA board supplies 5 V and the sensor uses 3.3 V, a voltage regulator LM117-T is used to reduce the 5 V power supply to 3.3 V. The
overall hardware physical connections are shown in Fig. 2. The physical setting
between the FPGA board, the voltage regulator and the sensors are shown in
Table 1.

Fig. 2 The physical connection between the sensors and FPGA board

Table 1 Configuration between the FPGA, voltage regulator and sensor pin connectors

FPGA I/O analogue pin connector   Voltage regulator pin connector   Sensor pin connector
–                                 OUT                               ACC
5 V                               IN                                –
AI GND                            GND                               GND
AI0–AI3                           –                                 X
AI4–AI7                           –                                 Y
AI8–AI11                          –                                 Z

2.2 Implementation of ResDAQ Module

In this work, the main task of ResDAQ module is to obtain, filter and calibrate
sensor data in real-time. The sensor data were obtained from multiple sensors
mounted on the human body. The filter is configured as a second-order Butterworth filter to remove unwanted components from the signal. The conversion of the output voltage and the calibration are performed before the data are processed for the next task or streamed to the host.
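As an illustration of this filtering step (a sketch only; the 1 kHz sampling rate is taken from Sect. 3.2, while the 5 Hz cut-off frequency is an assumed value not stated in the paper):

```python
import numpy as np
from scipy.signal import butter, lfilter

fs = 1000.0          # sampling frequency in Hz (Sect. 3.2)
fc = 5.0             # assumed low-pass cut-off for body motion, in Hz
b, a = butter(N=2, Wn=fc / (fs / 2), btype="low")   # second-order Butterworth

# raw_z would be the z-axis accelerometer samples acquired by the FPGA
raw_z = np.random.randn(1000)        # placeholder signal for illustration
filtered_z = lfilter(b, a, raw_z)    # filtered output passed on for calibration
```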
According to [11], the output voltage of the accelerometer sensor is related to the
acceleration of a particular axis by the relationship in Eq. (1).

$$V_{offset} + S \cdot A_i = V_{out} \qquad (1)$$

where $V_{out}$ is the output voltage of the accelerometer, $V_{offset}$ is the offset of the accelerometer at 0 g, $S$ is the sensitivity of the accelerometer in volts per metre per second squared, and $A_i$ is the acceleration of the particular axis in g. Thus, the acceleration is determined as in Eq. (2). Eq. (2) is then applied to design and implement the ResDAQ module on the FPGA platform using LabVIEW FPGA 2011 via the FPGA VI.

$$A_i = (V_{out} - V_{offset})/S \qquad (2)$$
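A compact host-side equivalent of Eq. (2) is shown below (a sketch; the offset and sensitivity values are typical ADXL335 datasheet figures for a 3.3 V supply and are assumptions, not values reported in the paper):

```python
def voltage_to_acceleration(v_out, v_offset=1.65, sensitivity=0.330):
    """Convert accelerometer output voltage to acceleration, Eq. (2).

    v_offset    : output voltage at 0 g (assumed ~1.65 V for a 3.3 V ADXL335)
    sensitivity : volts per g (assumed ~330 mV/g for the ADXL335)
    Returns the acceleration of the axis in g.
    """
    return (v_out - v_offset) / sensitivity

print(voltage_to_acceleration(1.98))   # roughly +1 g on this axis
```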

The ResDAQ module is also designed to transfer the data from the FPGA to the host. LabVIEW FPGA provides two (2) communication methods of data transfer between the FPGA and the host: (i) FPGA host interface front panel controls and indicators (FPCIs) and (ii) FPGA host interface FIFOs.
The FPGA host interface has registers for the top-level FPGA VI controls and indicators. These registers are created by LabVIEW FPGA and are accessible to the host via the FPGA host interface [12]. Figure 3 illustrates the implementation of the ResDAQ module with FPCIs.
The FPGA host interface FIFOs, on the other hand, use DMA to buffer and transfer data to the host system memory at high speed with little processor involvement [12]. This is an efficient mechanism when sending large blocks of data compared to front panel controls and indicators. The FPGA host interface FIFOs are a unidirectional transfer mechanism and can be configured to transfer host-to-FPGA or FPGA-to-host. The implementation of the ResDAQ module with DMA is illustrated in Fig. 4.

Fig. 3 ResDAQ module with FPCIs interfacing method



Fig. 4 ResDAQ module with FIFOs interfacing method

3 Implementation Results

3.1 Results on FPGA Resources and Performance on the Implementation of ResDAQ

LabVIEW FPGA VI provides two (2) Xilinx settings for synthesis optimisation
upon compilation; (i) speed (SS), and (ii) area (SA). This synthesis optimisation
technique is to translate the G-code to the hardware circuitry. The LabVIEW FPGA
Module was set to speed as a default optimisation technique. Once the optimisation
technique is selected, the Xilinx compiler performs the compilation process for the targeted FPGA device. Once the process is completed, a report containing information about the FPGA resource usage and the maximum frequency is generated.
In this study, the ResDAQ with FPCIs and the ResDAQ with FIFOs are com-
piled with these two (2) Xilinx settings for synthesis optimization. The motivation
is to investigate the impact of Xilinx synthesis optimization settings on the
ResDAQ architecture. Two parameters are selected to evaluate the performance of
the proposed ResDAQ architecture which are FPGA resources and maximum
frequency.
Based on Table 2, in the design of ResDAQ with FPCIs, the synthesis opti-
mization by area (SA) reduces the usage of FPGA resources by approximately 1%.
Conversely, the FPGA speed decreases by 1.38 MHz. When the same design is
optimised for FPGA speed (SS), the usage of FPGA resources increases by 0.7%
Table 2 Comparison of the usage of FPGA resources and speed

FPGA performances              FPCIs            FIFOs
                               SS      SA       SS      SA
(A) Usage of FPGA resources
Total slices (%)               24.1    23.4     25.3    23.5
(B) Maximum frequency (MHz)    41.91   40.53    40.88   40.78

Note: SS = synthesis optimization by speed, SA = synthesis optimization by area, FPCIs = ResDAQ with FPCIs, FIFOs = ResDAQ with FIFOs

Table 3 Details on total slices of FPGA resources usage

FPGA performances       FPCIs            FIFOs
                        SS      SA       SS      SA
Slice registers (%)     9.7     8.7      10.4    10.4
Slice LUTs (%)          18.9    18.6     20.4    20.7
Mult18X18s (%)          92.5    92.5     92.5    92.5
Block RAMs (%)          0       0        5       5

Note: SS = synthesis optimization by speed, SA = synthesis optimization by area, FPCIs = ResDAQ with FPCIs, FIFOs = ResDAQ with FIFOs

and the FPGA speed is improved by approximately 1.5 MHz. These findings show that the optimisation method offers a trade-off in overall FPGA performance.
Table 3 shows further details on total slices in terms of registers, Lookup Tables
(LUTs), multiplier and block Random Access Memory (RAM) usage for both
designs of ResDAQ with FPCIs and ResDAQ with FIFOs.
Based on Table 3, for both optimisation methods (by area and by speed), the design with FIFOs uses more elements, especially registers, LUTs and block RAMs for data storage, by approximately 1%, 2% and 5%, respectively. However, there is no significant difference in terms of multiplier usage between the two designs and synthesis optimisation methods.

3.2 Measurement Results

Two (2) tri-axial accelerometers were used in this study. The output of all sensors is
processed using the FPGA board, which in turn was connected to the computer with
Ethernet cable. The sampling frequency was 1 kHz. The two (2) sensors were
mounted on simple Velcro straps and placed on the shank and thigh as shown in Fig. 5. Before working with the sensors for measurement, the sensors were calibrated on a flat surface that was parallel to the ground. In this case, both sensors have the same zero reference. The assumption that the thigh and shank segments are in the same plane was considered.

Fig. 5 Accelerometer sensors and goniometer placement

Table 4 Experimental results of 500 sample data

Number of samples: 500
Actual knee joint (degrees): 125
RMSE of actual vs. estimated measurement (degrees): 0.0959
Mean of estimated measurement (degrees): 125.0739
Standard deviation of estimated measurement (degrees): 0.0610

Samples of 500 sensor data of static motion during flexed knee were collected
for five (5) cycles. The collected data were saved in a file with .CSV format. The
data processing was done offline using MS Excel 2016. For validating the joint
angle measurement that was estimated by the accelerometer, a goniometer was used
to measure the actual angles from the knee.
The root mean square error (RMSE) is used to quantify the difference between the actual and estimated measurements of the 500 sample data in degrees. For the 500 sample data, the calculated RMSE is small, approximately 0.1°, as shown in Table 4.
As in Table 4, the mean and standard deviation of the estimated measurement are also calculated from the 500 sample data. Then, the minimum and maximum range of accepted estimated data were determined. Figure 6 shows how far the estimated measurement deviates from the calculated mean.
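The offline post-processing described above can equally be expressed in a few lines of code; the sketch below (illustrative only, with the goniometer reference of 125° taken from Table 4 and `knee_angles.csv` standing in for the 500 recorded samples) computes the same statistics:

```python
import numpy as np

actual = 125.0                              # goniometer reference angle in degrees (Table 4)
estimated = np.loadtxt("knee_angles.csv")   # placeholder: the 500 estimated angles

rmse = np.sqrt(np.mean((estimated - actual) ** 2))   # root mean square error
mean, std = estimated.mean(), estimated.std()        # mean and standard deviation
lo, hi = mean - std, mean + std   # one possible accepted range around the mean
print(rmse, mean, std, lo, hi)
```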

Fig. 6 Distribution of estimated measurement from the calculated mean

4 Conclusion

In conclusion, this study proposed the architecture of a real-time sensor data acquisition (ResDAQ) module on the FPGA platform to obtain data from multiple sensors in parallel. The proposed ResDAQ architecture also considered two (2) communication methods to transfer the sensor data from the FPGA to the host: FPCIs and FIFOs. The G-code of the proposed architecture is converted to a hardware circuit using two (2) synthesis optimisation methods: optimise for FPGA area (SA) and optimise for FPGA speed (SS).
The implementation findings concluded that the optimisation methods offer a trade-off in overall FPGA performance in terms of area and speed, whilst the experimental findings show that the measurement data produce a small RMSE of approximately 0.1°. These findings give promising results for using the FPGA platform as a data acquisition and processing device in human body motion measurement applications. Also, the optimisation method for FPGA speed (SS) is suitable to be implemented in future work for the measurement of human body motion in real-time.

Pulse Modulation (PM) Ground
Penetrating Radar (GPR) System
Development by Using Envelope
Detector Technique

Maryanti Razali, Ariffuddin Joret, M. F. L. Abdullah,


Elfarizanis Baharudin, Asmarashid Ponniran,
Muhammad Suhaimi Sulong,
Che Ku Nor Azie Hailma Che Ku Melor, and Noor Azwan Shairi

Abstract GPR system equipment is used to detect objects embedded beneath the earth's surface. The system applies a method based on the reflection of the electromagnetic wave produced by a dipole antenna. To obtain a clear GPR radargram image, the output signal of the GPR antenna is processed using the envelope detector (ED) technique. In this study, the frequency range used in developing the GPR system simulation is from 0.07 to 0.08 GHz. The GPR system simulations were designed to scan for an iron object embedded in dry sandy soil at depths of 0, 10, 100, 500, 900, 1000 and 1500 mm. Based on the GPR radargram, the only embedded object that could not be detected in the simulation is the object embedded at 1500 mm. Comparison of the GPR radargrams produced with and without the envelope detector technique shows that the envelope detector technique is capable of generating a GPR radargram that displays embedded objects more clearly.

M. Razali  A. Joret (&)  M. F. L. Abdullah  E. Baharudin  A. Ponniran 


C. K. N. A. H. C. K. Melor
Faculty of Electrical and Electronics Engineering, Universiti Tun Hussein Onn Malaysia,
Parit Raja, Malaysia
e-mail: [email protected]
M. S. Sulong
Faculty of Technical and Vocational Education, Universiti Tun Hussein Onn Malaysia,
Parit Raja, Malaysia
N. A. Shairi
Faculty of Electronic and Computer Engineering, Universiti Teknikal Malaysia Melaka,
Malacca, Malaysia
A. Joret  M. S. Sulong
Internet of Things (IoT) Focus Group, Universiti Tun Hussein Onn Malaysia, Parit Raja,
Malaysia
A. Ponniran
Power Electronics Converters (PECs) Focus Group, Universiti Tun Hussein Onn Malaysia,
Parit Raja, Malaysia


Keywords Pulse modulation · GPR system · Dipole antenna · Envelope detector

1 Introduction

Ground Penetrating Radar (GPR) is a RADAR (Radio Detection and Ranging) system used to detect the presence of objects embedded beneath the earth's surface. The system, which is able to detect not only metal but also non-metal embedded objects, has been studied by many researchers [1–5]. According to Joret [4], the basic equipment for developing a GPR system involves an electromagnetic wave transmission system (transmitter system) and an electromagnetic wave receiving system (receiver system), as shown in Fig. 1. The proposed transmitter equipment is an alternating current signal generator, while the receiver equipment is an oscilloscope. The arrangement shown in Fig. 1 is known as a bistatic GPR system, where two identical antennas are used, one as the electromagnetic signal transmitter and the other as the detector.
In a GPR system, the antenna used as the electromagnetic wave signal detector can be classified into two categories: wide band antennas and narrow band antennas. The classification depends on the value of the antenna's fractional bandwidth: if the fractional bandwidth exceeds 0.2, the antenna is classified as a wide band antenna, and otherwise as a narrow band antenna [4, 6, 7].
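As a quick illustration of this classification, the sketch below applies the usual fractional-bandwidth definition (absolute bandwidth divided by centre frequency) to the 0.07–0.08 GHz band used later in this paper; the 0.2 threshold is the one quoted above, while the formula itself is the standard definition rather than one stated in the text.

```python
def fractional_bandwidth(f_low_hz: float, f_high_hz: float) -> float:
    """Standard definition: absolute bandwidth divided by centre frequency."""
    return 2.0 * (f_high_hz - f_low_hz) / (f_high_hz + f_low_hz)

fbw = fractional_bandwidth(0.07e9, 0.08e9)             # band used in this study
kind = "wide band" if fbw > 0.2 else "narrow band"
print(f"fractional bandwidth = {fbw:.3f} -> {kind}")   # ~0.133 -> narrow band
```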
The antennas used in GPR systems are usually wide band antennas such as microstrip patch antennas, horn antennas, frame antennas, monopole antennas, bow-tie antennas and circular disc monopoles [4, 8–10]. According to Sato [11] and Ghafoor [12], the wideband antennas used as GPR antennas are capable of providing a good GPR radargram. However, the design of a wide band antenna is more difficult than that of a narrow band antenna due to its complex geometric shape.
Antennas that produce electromagnetic radiation at low frequencies are essential to detect the position of embedded objects at greater depths. Regarding low operating frequencies, an antenna operating at 500 MHz was used by Florian to study the maximum depth of embedded objects detectable by a GPR system [13]. Theoretically, referring to Daniels [14] and Joret et al. [15], the depth of an embedded object that can be detected by a GPR system is given by Eq. (1)

Fig. 1 Basic equipment of the GPR system [4]



d = vt / (2√εr)  (1)

where d is the depth in metres, v is the velocity of the electromagnetic wave (3 × 10^8 m/s), t is the time taken by the electromagnetic wave signal to travel from the GPR system to the surface of the embedded object and back, and εr is the relative permittivity of the medium in which the electromagnetic wave propagates.
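A direct numerical reading of Eq. (1) is sketched below; the relative permittivity and the travel time in the example are illustrative figures, not parameters taken from this study.

```python
import math

def gpr_depth_m(travel_time_s: float, eps_r: float, v_m_per_s: float = 3.0e8) -> float:
    """Depth of a reflector from Eq. (1): d = v*t / (2*sqrt(eps_r))."""
    return v_m_per_s * travel_time_s / (2.0 * math.sqrt(eps_r))

# Example only: a reflection observed 23 ns after transmission in a medium
# with an assumed relative permittivity of 3 (an indicative value for dry sand).
print(f"{gpr_depth_m(23e-9, 3.0):.2f} m")   # ~1.99 m
```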
This paper discusses the simulation development of a GPR system using a dipole antenna to detect an object embedded in a sandy soil area. The GPR system simulation was designed using CST Studio Suite software, while the processing of the antenna output signal was done using MATLAB software. To obtain the pulse signal from the antenna pulse modulation signal, an envelope detector (ED) was used as the signal processing technique. The use of the ED-based signal processing technique in this study shows that the produced GPR radargram displays the embedded object more clearly than the GPR radargram produced without the ED-based technique.

1.1 Antenna Input Signal

The production of an electromagnetic wave signal by an antenna requires an electrical alternating current or pulse signal as the antenna input, as shown in Fig. 2. In Fig. 2(a), the Gaussian modulated pulse signal shown is the input signal for a wideband antenna with a fractional bandwidth value of 1, whereas Fig. 2(b) shows an

Fig. 2 Antenna input signal: (a) Input signal for wideband antenna with value of 1 for its fractional
bandwidth, (b) Input signal for wideband antenna with value of 0.4 for its fractional bandwidth

antenna input signal for a wideband antenna with a fractional bandwidth of 0.4 [4]. Based on Fig. 2, the antenna input signal is affected by the fractional bandwidth value of the antenna: the higher the fractional bandwidth, the less ripple the input signal contains. This directly affects the narrow band antenna, because the input signal of such an antenna will have more ripple.
Compared to wideband antennas, narrow band antennas are easy to design, which is one of the reasons why this type of antenna has been chosen for this study. However, the GPR radargram obtained using a narrow band antenna as the GPR antenna will appear blurred because its input signal contains a lot of ripple. Some examples of narrow band antennas that are often designed are the dipole antenna, loop antenna, dish antenna and Yagi-Uda antenna [16].
Based on its simple design, light weight, ease of fabrication and low cost [17], the dipole antenna was selected as the GPR antenna in this study. The antenna consists of two cylindrical copper wire elements. As the operating frequency is determined by the antenna length, this study can be performed with the dipole antenna operating in the megahertz range by adjusting the antenna length. In this frequency range, the radiated electromagnetic wave is able to penetrate the soil to depths of approximately 1–2 m.

1.2 Signal Processing Technique for PM GPR System

The use of a narrow band antenna and amplitude modulation with a pulse modulation signal in the GPR system results in excessive ripple in the input and output signals. These ripples can be minimized by using signal processing techniques. One of the techniques used in the GPR system to process the signal is the envelope detector technique [4].
The uniqueness of the amplitude modulation signal, as used in the PM GPR system, is that its envelope contains the information signal. Referring to [4, 8, 18], the amplitude modulation signal A(t) can be expressed as

A(t) = Ac cos(ωc t) + μ Am cos(ωm t) cos(ωc t)  (2)

where μ is the modulation index (a positive constant), Ac is the carrier amplitude, ωc is the carrier angular frequency, Am is the amplitude of the information signal, ωm is the angular frequency of the information signal and t is time. In Eq. (2), Ac and Am are set to 1, μ is set to 0.6, ωc to 0.6π and ωm to 2π, and t is measured from 0 to 2. Figure 3(c) shows the amplitude modulation signal generated using Eq. (2). The information signal used to derive this modulation signal is shown in Fig. 3(a).
Referring to the signal of amplitude modulation in Fig. 3(c), the ED technique
can be used to detect the information signal [4, 8]. There are three kinds of

Fig. 3 Generation of amplitude modulation signal: (a) Message signal, (b) Carrier signal, (c) AM
signal [4]

Fig. 4 Block diagram for the AHW envelope detector technique (blocks: thresholding and low-pass filtering between the input and output signals)

techniques for the envelope detector to retrieve the information signal from an amplitude modulation signal: Asynchronous Full-Wave (AFW), Asynchronous Half-Wave (AHW) and Asynchronous Real Square Law (ARSL). The AHW envelope detector technique was used in this paper to detect the information signal, which is the pulse signal, from the amplitude modulation signal. Figure 4 shows the block diagram of the AHW envelope detector technique, while Fig. 5 shows an example of the signal extracted from the amplitude modulation signal using the AHW envelope detector technique. Figure 5(c) shows the resulting signal of the AHW envelope detector technique used to detect the message signal.

Fig. 5 Signal extraction information from amplitude modulation using AHW envelope detector
technique
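The AHW chain of Fig. 4 can be prototyped in a few lines: threshold the AM signal at zero (half-wave rectification) and low-pass filter the result to recover the envelope. The sketch below is only an illustration; it uses example carrier and message frequencies and a simple one-pole filter, not the exact MATLAB processing applied in this study.

```python
import numpy as np

# Illustrative AM test signal in the form of Eq. (2) with Ac = Am = 1;
# the carrier/message frequencies and modulation index are example values.
fs = 10_000                                   # sampling rate (Hz)
t = np.arange(0.0, 2.0, 1.0 / fs)
f_c, f_m, mu = 200.0, 5.0, 0.6
am = np.cos(2 * np.pi * f_c * t) * (1.0 + mu * np.cos(2 * np.pi * f_m * t))

# Asynchronous half-wave (AHW) envelope detection:
rectified = np.maximum(am, 0.0)               # thresholding at zero (half-wave)

alpha = 2 * np.pi * 20.0 / fs                 # one-pole low-pass, ~20 Hz cutoff
envelope = np.zeros_like(rectified)
for i in range(1, len(rectified)):
    # y[i] = y[i-1] + alpha*(x[i] - y[i-1]) removes the carrier ripple and
    # leaves an estimate of the message envelope
    envelope[i] = envelope[i - 1] + alpha * (rectified[i] - envelope[i - 1])
```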

2 Development of GPR System Simulation

The use of narrow band antennas is less popular in PM GPR systems because this kind of antenna uses an input signal with high ripple, which causes the radar image, known as the GPR radargram, to be blurred. In this study, the antenna output signal generated in the GPR system simulation designed using CST software is exported into MATLAB software. Next, the antenna output signal is processed to obtain the GPR radargram of the GPR system simulation. The production of the GPR radargram in this study involved processing the signal using the envelope detector technique. Besides the dipole antenna, the GPR system simulation used dry sandy soil as the background material and an iron plate as the embedded object.

2.1 Antenna Design and GPR System Simulation Using CST Software

The operating frequency of the dipole antenna used in the development of the GPR system simulation is 0.075 GHz. Selecting this frequency value requires the dipole length of the antenna to be in the range of 1500 mm, with the

Fig. 6 Dipole antenna using CST software

Fig. 7 Simulation design of GPR system using CST software

radius of 50 mm. This does not include the port distance of this antenna, which has been set to 200 mm. Figure 6 shows the diagram of the dipole antenna developed in this study using CST software.
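As a quick check on these dimensions, the free-space wavelength at 0.075 GHz and the textbook half-wave dipole length can be computed as below; the 1500 mm element length and 200 mm feed gap quoted above are simply the values reported in this paper, not quantities derived by this sketch.

```python
c = 3.0e8            # free-space speed of light (m/s)
f = 0.075e9          # dipole operating frequency used in this study (Hz)

wavelength = c / f           # 4.0 m at 75 MHz
half_wave = wavelength / 2   # 2.0 m: classic half-wave dipole estimate

print(f"wavelength = {wavelength:.2f} m, half-wave dipole length = {half_wave:.2f} m")
# The study reports a 1500 mm element length with a 200 mm port (feed) gap;
# how that figure relates to the half-wave estimate is not derived here.
```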
In this study, after the dipole antenna was successfully designed to obtain an appropriate reflection parameter of less than −10 dB at 0.075 GHz, background models and embedded objects were added to model the GPR system simulation. The background used for this simulation is 3000 mm in length, width and height and is made of dry sand material. The dimensions of the iron plate used as the embedded object are around 800 mm × 800 mm × 400 mm in width, length and height, respectively. The schematic diagram of the GPR system simulation model presented in this study is shown in Fig. 7.

Fig. 8 Scanning direction procedure of the GPR system simulation

The scanning procedure of the GPR system simulation in this study was performed by running the GPR system simulation several times at several antenna positions. A total of 16 antenna positions, referred to as the GPR system scanning points, were defined with respect to the background model. The movement of the simulated scan point is shown in Fig. 8, where the distance from one antenna position to the next is about 162.5 mm.
In this study, several GPR system simulations were carried out with the iron plate as the embedded object in the dry sandy area at depths of 0, 10, 100, 500, 900, 1000 and 1500 mm. These simulations were performed to assess the effectiveness of the dipole antenna as the GPR antenna in detecting an embedded iron object. As an example, the simulation of the GPR system using the dipole antenna developed in this study to detect an iron object embedded in the dry sand area at a depth of 1000 mm is shown in Fig. 9.

Fig. 9 Simulation of the GPR system scanning an object immersed in dry sand at 1000 mm depth using the dipole antenna

2.2 GPR System Simulation Output Signal Processing

In the GPR system simulation developed using CST software, the antenna output signal calculated by the software is exported into MATLAB software. The simulation frequency range was set from 0.07 GHz to 0.08 GHz. The selection of this frequency range in the CST software produces a modulated Gaussian pulse signal as the information signal with a sinusoidal carrier frequency of 0.075 GHz.
Next, to generate the GPR radargram of the GPR system simulation, the antenna output signal for each antenna position in the simulation was arranged in a column, where the first position corresponds to the first column and so on. To obtain a clearer GPR radargram, the envelope detector technique was applied to these antenna output signals. The selected envelope detector technique is the Asynchronous Half-Wave (AHW), whose block diagram is shown in Fig. 4.

Fig. 10 GPR system algorithm based on magnitude calculation

2.3 GPR System Reconstruction Image

To reconstruct the GPR system image, the output signals from the GPR system are needed and are arranged according to the antenna position scanning process. The antenna scanning position in this study is taken as the y-value for each signal, while the x-value is the signal sample index, forming an (x, y) position for each of the 16 unmodulated signals as in Fig. 8. The output image i in this study is produced by following the mapping procedure of Eq. (3)

i(x, y) = [y1(n, 1)  y2(n, 2)  y3(n, 3) ...]  (3)

where y1, y2, y3, ... represent the unmodulated signals, x is the position along the image width, y is the position along the image length and n is the sample index of the output signal. Based on Fig. 10, the algorithm applied in this system reconstructs the image through signal processing using Eq. (3).
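Read this way, Eq. (3) is simply column-stacking: each antenna scan position contributes one column of output samples to the radargram image. A minimal sketch under that reading, with placeholder arrays standing in for the 16 processed output signals:

```python
import numpy as np

def build_radargram(signals):
    """Stack per-position output signals into an image i(x, y) as in Eq. (3).

    signals: one 1-D array of n output samples per antenna scan position.
    Returns an array whose columns are scan positions and whose rows are
    the sample index, which can then be displayed as the GPR radargram.
    """
    return np.column_stack(signals)

# Hypothetical usage with 16 scan positions of 4000 samples each
# (random placeholders instead of the CST/MATLAB output signals):
signals = [np.random.rand(4000) for _ in range(16)]
radargram = build_radargram(signals)     # shape (4000, 16)
```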

3 Result and Discussion

3.1 Result of Dipole Antenna Design

Figures 11 and 12 show the input and output signals obtained from the simulation of the dipole antenna. Based on the displayed signals, it can be observed that the signal is a modulated Gaussian pulse signal.
The simulation result of the dipole antenna design shown in Fig. 13 is the magnitude of the reflection parameter (S11). This graph shows that the designed dipole antenna is capable of effectively transmitting signals over the spectrum from 0.07 GHz to 0.08 GHz. At the selected center frequency of 0.075 GHz, the antenna has a reflection coefficient of about −31.0955 dB. Based on the radiation pattern shown in Fig. 14, this dipole antenna has an omnidirectional radiation pattern with a gain of 2.03 dBi relative to an isotropic antenna at this center frequency.

Fig. 11 Dipole antenna input signal

Fig. 12 Dipole antenna output signal



Fig. 13 S11 parameter of designed dipole antenna

Fig. 14 Radiation pattern of the designed dipole antenna

3.2 GPR System Simulation Result

According to the simulation outcomes of the GPR system in this study, the positions of the objects embedded in dry sandy soil at depths from 0 mm up to 1000 mm were successfully detected and displayed in the GPR radargram. However, the object embedded at a depth of 1500 mm could not be detected. Figure 15(a–c) shows the GPR radargrams of the GPR system simulation containing an embedded object at depths of 100, 500 and 1500 mm, respectively, processed without the envelope detector-based technique, while Fig. 16(a–c) shows the GPR radargrams processed using the envelope detector-based technique. Based on Figs. 15 and 16, the use of the envelope detector-based technique on the antenna output signal produces a clearer GPR radargram and allows the embedded object to be identified easily.
When the envelope detector technique is not used to reprocess the signal from the GPR system, it is hard to identify the exact position of the embedded iron object from the reflection of the electromagnetic wave. Normally, when the envelope

Fig. 15 (a) GPR image radargram of GPR system simulation with an embedded object at 100 mm depth in dry sand soil processed without ED based technique. (b) GPR image radargram of GPR system simulation with an embedded object at 500 mm depth in dry sand soil processed without ED based technique. (c) GPR image radargram of GPR system simulation with an embedded object at 1500 mm depth in dry sand soil processed without ED based technique

Fig. 16 (a) GPR image radargram of GPR system simulation with embedded object at 100 mm depth in dry sand soil processed with ED based technique. (b) GPR image radargram of GPR system simulation with embedded object at 500 mm depth in dry sand soil processed with ED based technique. (c) GPR image radargram of GPR system simulation with embedded object at 1500 mm depth in dry sand soil processed with ED based technique

detector technique was not applied the GPR system are only able to display a vague
radargram image. This vague radargram image can be seen in the Fig. 15.
In Fig. 16, the position and the electromagnetic wave reflection shown in this figure
can be seen clearly because of the envelope detector appliances. At the depth of
100 mm as well as 500 mm, the embedded object detected by the GPR system is
estimated as be seen in time samples from 3000 to 3500 and at the scanning point
of 7 to 11. However, the depth of the embedded object cannot be determined in
detail and the size of the embedded object is slightly different from the size of the
embedded object set in the GPR system simulation.
For validation purposes, before the GPR system was simulated scanning the embedded object in this study, the GPR system was used to simulate the scanning of the dry sand area with no embedded object as a
Fig. 17 (a) GPR image radargram of GPR system simulation without embedded object without ED based technique. (b) GPR image radargram of GPR system simulation without embedded object with ED based technique

Table 1 Depth of the scanned embedded object using dipole antenna


Metal depth, mm With envelope detector Without envelope detector
0 ✓ ✓
10 ✓ ✓
100 ✓ ✓
500 ✓ ✓
900 ✓ ✓
1000 ✓ ✓
1500 ✗ ✗

reference simulation, which is used to distinguish whether or not the system detects the presence of an embedded object in the dry sand area. Figure 17 shows the GPR radargram image of the GPR system simulation with no embedded object.
Based on the GPR radargram image in Fig. 17, it can be concluded that this GPR system can only detect the presence of an embedded object in the dry sand area at depths of 1000 mm or less. If the depth exceeds 1000 mm, the GPR radargram cannot reveal the presence of the embedded object in the dry sand area, as the image has almost the same pattern as the GPR radargram without an embedded object. From this simulation, there are two possible interpretations when the depth exceeds 1000 mm: either there is no embedded object in the dry sand area, or the electromagnetic wave signal of the GPR system is not able to penetrate deep enough into the dry sand area to detect the embedded object.
Table 1 shows the scanning results of the GPR system using the simulation model designed in this study, including the position of the embedded object in the dry sand area. The objects detected by the GPR system are those at depths of 0, 10, 100, 500, 900 and 1000 mm. The object embedded at a depth of 1500 mm cannot be detected by the GPR system whether or not the envelope detector technique is used.

4 Conclusion

The GPR system simulation was designed using a dipole antenna. The use of this antenna, which is a narrow band antenna, as the GPR antenna has not received much attention because the GPR radargram produced tends to be unsmooth. Apart from replacing the narrow band antenna with a wide band antenna, the GPR radargram image can be smoothed by applying an envelope detector-based signal processing technique to the antenna output signal, which contains high ripple.

Acknowledgement This paper acknowledges the contribution of funding from UTHM under the internal Postgraduate Research Grant (GPPS) Scheme, Vot No. H403. The experimentation and testing were done at the UTHM research project laboratory.

References

1. Daniels JJ (2000) Ground penetrating radar fundamentals, pp 1–21


2. Baker GS, Jordan TE, Pardy J (2007) An introduction to ground penetrating radar (GPR). In:
Special paper 432 stratigraphic analysis using GPR, vol 2432, pp 1–18
3. Lai WWL, Derobert X, Annan P (2018) A review of ground penetrating radar application in
civil engineering: a 30-year journey from locating and testing to imaging and diagnosis.
NDT E Int 96:58–78
4. Joret A (2018) Modulation techniques for GPR system radargram module technique GPR
system radargram, p 283
5. Jazayeri S, Saghafi A, Esmaeili S, Tsokos CP (2019) Automatic object detection using
dynamic time warping on ground penetrating radar signals. Expert Syst Appl 122:102–107
6. Breed G (2005) A summary of FCC rules for ultra wideband communications. High Freq
Electron 4(1):42–44
7. Wiesbeck W, Adamiuk G, Sturm C (2009) Basic properties and design principles of UWB
antennas. Proc IEEE 97(2):372–385
8. Carlson AB, Crilly PB, Rutledge JC (2002) Communication systems: an introduction to
signals and noise in electrical communication, 2nd edn. McGraw-Hill, New York
9. Sharif A, Chattha HT, Aftab N, Saleem R, Rehman S (2015) A tree shaped monopole antenna
for GPR applications, pp 3–5
10. Shebalkova LV, Markov MA, Romodin VB (2018) Broadband antenna for ground
penetrating radar application in soil. In: IOP conference series: earth and environmental
science, vol 134, no 1
11. Sato M, Yarovoy A (2008) GPR (ground penetrating radar) into real world 2. In:
Fundamentals of GPR 3 new technologies in GPR, p 4
12. Riaz MM, Ghafoor A (2012) Information theoretic criterion based clutter reduction for ground
penetrating radar. Progr Electromagnet Res 45:147–164
13. Florian F (2003) Introduction of a ground penetrating radar system, vol 14, pp 35–44
14. Daniels DJ (2004) Ground penetrating radar, 2nd edn. IET London, UK
15. Joret A, Sulong MS, Abdullah MFL, Madun A, Dahlan SH (2018) Design and simulation of
horn antenna using CST software for GPR system. In: Journal of physics: conference series,
vol 995, no 1
16. Zivkovic I, Scheffler K (2013) A new innovative antenna concept for both narrow band and
UWB applications. Progr Electromagnet Res 139:121–131
17. Wu D, Yin Y, Guo M, Shen R (2006) Wideband dipole antenna for 3G base stations, pp 454–
457
18. Ziemer RE, Tranter WH (2014) Principles of communication systems, modulation, and noise.
Wiley, Hoboken
An Overview of Modeling and Control
of a Through-the-Road Hybrid Electric
Vehicle

M. F. M. Sabri, M. H. Husin, M. I. Jobli,


and A. M. N. A. Kamaruddin

Abstract Heavy reliance on fossil fuels poses a challenge to environmental preservation, as hazardous by-products from fuel burning are released uncontrollably into the atmosphere. The introduction of hybrid electric vehicles (HEV) in the
transportation sector serves as a contemporary solution towards the realization of
emission-free vehicles of the future. In this paper, a Through-the-Road (TtR) HEV
configuration with in-wheel motors (IWM) fitted in the rear wheels is proposed and
tested in simulation over standard drive cycles. Due to its simpler configuration,
TtR HEV has a lower efficiency compared to other conventional HEVs but the
architecture also grants several redeeming features such as enhanced acceleration
and stability courtesy of its 4-wheel drive (4WD) setup. Further research is needed
to improve the offering from TtR architecture to make them perform closer in
efficiency to conventional HEVs. Modeling of the TtR HEV uses established
mathematical equations in MATLAB® using Simulink. This is achieved through a
modification of a power-split HEV model in Simulink into a TtR architecture
through the elimination of the planetary gear system, the addition of IWM to the
rear wheels and a slight modification of the EMS. The main objective of this
exercise is to develop a robust simulation platform for future works such as driv-
etrain optimization and development of energy management strategy
(EMS) controller. Simulation results have shown that the proposed TtR HEV is
capable of satisfying the driver’s demand with acceptable fuel consumption.


Keywords Hybrid electric vehicle · Through-the-road HEV · 4-wheel drive · Robust simulation platform · Energy management strategy

M. F. M. Sabri (&)  M. H. Husin  M. I. Jobli  A. M. N. A. Kamaruddin


Universiti Malaysia Sarawak, 94300 Kota Samarahan, Sarawak, Malaysia
e-mail: [email protected]


1 Introduction

1.1 HEV as Key for the Future of Transportation

The burning of fossil fuels is widely regarded as the main contributor to air pollution and is subsequently leading to the global warming phenomenon. The transportation sector is among the biggest fossil fuel consumers and is one of the biggest producers of greenhouse gasses (GHG). GHG comprise hazardous fumes such as nitrogen oxides (NOx), carbon monoxide (CO), sulfur oxides (SOx), unburned hydrocarbons and other pollutants [1–3]. In the effort to lessen the catastrophic impact on the environment and to achieve the 2 °C Scenario (2DS) advocated by the Paris Agreement of 2015, industry leaders and researchers are actively striving to cut down fuel consumption and emissions towards realizing the zero-emission target within this century [4].
In the wake of technological and logistical challenges faced by the
battery-powered electric vehicles (BEV) and fuel cell vehicles (FCV) [5–8], hybrid
electric vehicles (HEV) is thriving in the current market that is still wavering on the
best solution yet for the realization of zero-emission vehicles of the future. A HEV is a type of vehicle that combines the traits of conventional vehicles powered by internal combustion engines (ICE) and BEVs into a single package to deliver uncompromised performance while producing lower emissions [3, 9–11]. A HEV is
equipped with an ICE and one or more electric motors (EM) connected to the
vehicle’s final drive in a certain configuration. The driver’s request for speed and
power in HEVs is administered through a carefully schemed energy management
strategy (EMS) which determines the optimal power delivery from the two energy
forms—fuel and electric, to the wheels while taking their respective efficiency curve
into consideration [3, 9]. HEVs are the result of the synergy between mechanical,
electrical, electronic and power engineering which are working in sync to produce a
short-term solution for the global fuel consumption and emission problems.
In the current market, HEVs offer a better proposition in the skirmish for the
market share of green vehicles or energy-efficient vehicles (EEV) segment com-
pared to BEVs and FCVs. Even though the data is showing that the BEV adoption
rate is on the rise with more than two million BEVs sold by 2016, it only managed a
0.2% market share of the global passenger vehicle market with FCVs almost a
non-factor [12, 13]. In terms of the core technology being used, HEVs have the
advantage by taking the midway approach of what currently available on the
market. BEVs are hindered by expensive battery technology and availability of
dedicated charging stations whereas FCVs are stalled by immature and expensive
fuel cell technology imposed by the production and storage of hydrogen (H2) which
is the essence for the fuel cell operation [6, 8, 14].

1.2 Overview of Hybrid Electric Vehicle Architectures

There are generally three main types of HEVs—series, parallel and power-split,
which are distinguished by their source-to-wheel arrangement of the ICE-driven
mechanical path and the EM-driven electrical path. A series HEV has its
mechanical path and electrical path arranged in a serial configuration with the ICE
only used to spin the generator (GEN) to charge the battery pack that acts as the
secondary energy storage system (ESS) which is the power source for the EM. As
presented in Fig. 1, the EM is the only source for traction driving the wheels. Series
HEVs are very similar to BEVs but with the addition of a generator. As the ICE is
not connected to the final drive, series HEVs achieve great fuel economy by having
the ICE operating at its highest efficiency throughout its operations [3, 10, 11].
Parallel HEVs adopt the parallel arrangement of the two energy paths from their
sources to the wheel. As shown in Fig. 2, both the ICE and EM are connected to the
transmission through a mechanical torque coupling device which blends the torque
output from both sources before delivering it to the final drive. The coupling device
is also necessary to allow ESS recharging by diverting a portion of torque from the
ICE, but this process is only present when the vehicle is in motion. Parallel HEVs
offer a higher degree of flexibility for the choice of ICE and EM capacity compared
to the limited downscaling options in series HEV. However, series HEVs do not
require any mechanical coupling device and gearbox as EMs are generally high
revving and efficient over a wide range of speed [3, 10, 11].
Series and parallel HEVs possess a contrasting set of advantages and disad-
vantages as tabulated in Table 1.

Fig. 1 Power flow for series HEV

Fig. 2 Power flow for parallel HEV

Table 1 Series vs Parallel vs Power-split HEV—a comparison

Series HEV
  Advantages: • Simpler design • Lesser component requirement • ESS recharging always available • Simpler control design • Most suitable for city driving
  Disadvantages: • Less flexible component sizing • Needs separate EM and GEN units • ESS capacity tied to EM capability • Unsuited for highway driving

Parallel HEV
  Advantages: • Flexible component sizing • Needs only one EM/GEN unit • ESS capacity not dictated by EM • Most suitable for highway driving
  Disadvantages: • More complex design • More component requirement • More complicated control design • ESS recharging only when moving • Unsuited for city driving

Power-split HEV
  Advantages: • Flexible component sizing • Needs only one EM/GEN unit • ESS recharging always available • ESS capacity not dictated by EM • Suitable for all types of driving
  Disadvantages: • Most complex design requirement • Requires the most component • Most expensive implementation • Most complicated control design

As vehicles should be designed to suit all types of driving conditions, a new


configuration combining series and parallel HEV was introduced. The aim is to
overcome the weaknesses of the individual designs and harness their strengths.
From Table 1, it can be observed that power-split HEVs solve most of the problems
for the previous two architectures but introduce new areas of concern of their own.
Figure 3 shows the power flow in a power-split HEV and it is very similar to a
parallel HEV but it uses a planetary gear system instead of a simple torque coupling
system. The planetary gear is a complex system that enables the HEV to operate as
both a series and parallel HEV at the same time. However, the inclusion of the
complex planetary gear system with its complex control requirements is also the
root cause of the perceived disadvantages of this configuration [3, 10, 11].

Fig. 3 Power flow for power-split HEV

The introduction of plug-in HEVs (PHEV) has been able to elevate the
fuel-saving capabilities of HEVs to a whole new level. PHEVs are HEVs that can
be connected to the grid for a direct ESS recharging like a BEV [3, 9, 15]. PHEVs
differ from standard HEVs in the hierarchy of its source of energy where the battery
pack is the primary ESS instead. Due to the external charging capability, battery
packs in PHEVs are generally larger with higher energy density than standard
HEVs. PHEVs are also fitted with a bigger and more powerful electrical drivetrain
to match its large ESS. With these added combinations, PHEVs are capable of the
all-electric range (AER) drive for a certain amount of distance [3, 9]. The result is a
zero-emission vehicle with no fuel consumption for trips within the AER. This
breakthrough in HEV technology has driven a large interest from manufacturers
with new models introduced every year. The impacts of PHEVs and the technology
behind it will be one of the keys to unlocking true potentials of all HEV
configurations.
In this paper, a particular HEV architecture called the through-the-road
(TtR) HEV will be focused on. TtR HEV is a derivative form of parallel HEV
but the link between the mechanical and electrical path is established through
contact with the road surface and not using any mechanical torque coupling device
in the drivetrain [3, 9, 16, 17]. This architecture will be explained further by
comparing it with the other configurations in the next section. The merits and shortcomings of the TtR architecture will be discussed based on recent publications
in the hope of evaluating the true potential of this architecture and to catapult it as
the configuration of choice to accelerate the adoption rate of green vehicles.

2 Through-the-Road Hybrid Electric Vehicles

2.1 Synopsis on Concept and Design

TtR HEVs have no mechanical torque coupling device linking the mechanical path
and electrical path of the vehicle. To make up for the absence of the in-transmission
torque coupling mechanism, the link between the two drivetrains is established
externally through the road contact while the vehicle is in motion, hence the name
“through-the-road”. This unconventional coupling mechanism grants a simpler and
cheaper foundation for HEV implementation compared to any other configurations
[3, 9, 16–19].
Configuration-wise, a TtR HEV, also known as separate axle parallel HEV,
obtains propulsion power through two independent propulsion systems compared to
only one in conventional HEV. Taking advantage of the separate axle setup, instead
of a big chassis-mounted EM turning the rear axle, smaller and highly efficient
in-wheel motors (IWMs) are fitted in the rear wheels to provide power directly to
the wheels for minimal losses. The smaller IWMs also have the benefit of being
lighter than conventional chassis-mounted EM giving a TtR HEV the much-needed
advantage in terms of the mass of the vehicle. The smaller size also means that
IWMs are theoretically gentler to the ESS. The extra space which normally
occupied by the EM is now vacant and is perfect for fitting a larger ESS depending
on the budget allocation [3, 9]. For this paper, the design considerations can be summarized by the illustration in Fig. 4.

Fig. 4 Design considerations for the proposed TtR HEV



2.2 Advantages and Disadvantages of Retrofit TtR HEVs

Among the advantages of this configuration are the 4-wheel drive (4WD) capability
that provides a higher level of stability to the vehicle and it also offers exceptional
acceleration. Next is the appeal of retrofitting any conventional ICE vehicle and transforming it into a HEV. This tantalizing property is an excellent motivation for
consumers to start embracing green vehicles at a reasonable cost, considerably
lower than buying a whole new vehicle. However, it is not seen as an enticing
prospect for car manufacturers aiming to keep selling new vehicles unless radical
measures and policies are imposed [3, 9, 18].
One of the trade-offs for the simpler architecture is the lower efficiency of ESS recharging compared to conventional HEVs, since the extra torque needed to recharge the ESS from the ICE is supplied externally through forced interaction with the road surface, is available only when in motion, and incurs a large loss. Even
with the assist from regenerative braking, the amount of energy that can be har-
vested internally is significantly less than what is possible with a conventional
HEV. The result is a much smaller window for optimum EMS operation and a
reduced amount of electrical energy supply for the IWM which will affect the HEV
performance target. Another setback for the TtR architecture is both of its axles are
constrained to spin at a matching frequency and always relative to vehicle speed. In
a conventional HEV, the EM is never subjugated by the vehicle speed to allow it to
operate at its highest efficiency. However, as the road surface becomes the torque
coupling medium for the two drivetrains in TtR HEV, this poses a problem as EM
usually rotates at a higher revolution per minute (RPM) count than the ICE to
produce the same amount of power [3, 9, 16, 17].
One of the nifty features for a conventional HEV is the option to downsize the
ICE to further enhance their fuel-saving potential. However, with retrofitted TtR
HEVs, that option is unavailable as they are limited with existing mechanical
drivetrains which are not originally designed for HEV application [16]. This will
put TtR HEVs at a disadvantage in terms of fuel efficiency compared to natively
designed HEVs. This will also lead to the virtual limitation that, when the
state-of-charge (SOC) level is very low, the ICE can only be expected to recharge
the ESS enough to keep it at the lower threshold rather than replenishing it for
further hybrid mode operation [20]. These lingering issues with the TtR HEV
architecture need to be addressed as it is just as equally important as the EMS side
of the system in ensuring a successful development process [3, 9, 16]. The pros and cons of the architecture are summarized in Table 2.
In the next section, the modeling process for the proposed TtR HEV model is
shown. The process will take these considerations to the fullest in the effort to
understand the behaviour of the vehicle in responding to the driver’s demand. It is
an important step in progressing further with the research because the results will
hopefully show the strengths and weaknesses of the developed model. The simu-
lation is also an opportunity to identify the areas that require further optimizations
that can offer a performance boost.

Table 2 Pros and cons of TtR HEV architecture

TtR HEV
  Advantages: • Simplest design concept • Cheapest cost of entry into HEV • The least demanding component requirement • ESS size not tied to IWM capability • IWM is gentler to the ESS • 4WD capability
  Disadvantages: • Component sizing only limited to EM and ESS • ESS recharging only available while moving • Lower efficiency/high operational loss • Front and rear axles speed matching

3 Simulation Platform Modification and Setup

3.1 Design Considerations

The design of choice for the TtR HEV proposed in this research is a PHEV in order
to maximize the EMS potential by using the external charging feature to provide the
best possible SOC window for optimal EMS operation [9, 19]. This design choice
will consequently eradicate the limited onboard ESS recharging capability of a
TtR HEV. Subsequently, the use of deep-discharge, high energy density battery as
the ESS is also being considered to further enhance the EMS potential. The main
focus of this research is to synthesize an EMS controller capable of performing
favourably in a heavily modified HEV architecture given the best possible condi-
tions. From there onwards, the controller will be optimized towards a more realistic
target using the robustness of MATLAB® as a powerful simulation tool.
In order to identify the challenges and the most suitable EMS for the proposed
TtR HEV, first, it is important to take into consideration every possible operating
mode for a TtR HEV. By design, the direct ESS recharging mechanism by the ICE
is unavailable, therefore, the recharging of the ESS is only achievable when the
vehicle is in motion, ESS recharging cannot occur otherwise. This design choice
also excludes operation modes exclusive to series HEVs. Here are all the possible
operating modes for the proposed TtR HEV model:
1. Load obtains power from ICE alone
2. Load obtains power from IWM alone
3. Load obtains power from both ICE and IWM (hybrid mode)
4. Load returns power to ESS (regenerative braking)
5. Load obtains power from ICE and delivers power to ESS (TtR exclusive)
In this research, a deterministic rule-based strategy is used to carefully work
within these operating modes [9]. The power flow solution for the proposed
TtR HEV is as illustrated in Fig. 5.

Fig. 5 Power flow in the proposed TtR HEV

Fig. 6 Proposed TtR HEV model

3.2 Development of Simulation Model

The simulation platform for the TtR HEV is built in Simulink for efficient devel-
opment. The proposed modified model of TtR HEV is based on the original
series-parallel HEV which can be accessed here [21]. Lookup tables are used in
various parts of the model for quicker system response. The balance between model
fidelity and simulation speed is critical for efficient development. The vehicle model
and controllers are modeled in a single environment to enable system-level opti-
mization. The modeling aspect includes the electrical system, mechanical, thermal
and the control system of the vehicle. The simulation is done using Simulink over
standard drive cycles.
The main modification needed for TtR HEV is the removal of the power split
device from the original model. By this removal, the ICE (mechanical path) and the
IWM (electrical path) now have direct connections to the front and rear wheels
respectively as shown in Fig. 6.

Fig. 7 TtR HEV architecture in Simulink

Fig. 8 Mode Logic for the EMS

As per Fig. 7, the final drive model is also modified into a 4WD configuration to
ensure the ICE is connected to the front wheels and the IWMs are connected to the
rear wheels through two input ports—Port “Conn1” for the IWM directly to the rear
wheels and Port “Conn2” for connection from the ICE to the front wheels. As the
IWMs are efficient for a wide range of speed, it does not need a gearbox.

3.3 Energy Management for TtR HEV

The control system used to test the response of the proposed TtR HEV model is a
rule-based type. It contains multiple proportional-integral (PI) controllers as well as
a controller block containing the rule-based EMS programmed in state-flow. PI
controllers are used in various parts of the main controller to make the system
iterate quickly. Figure 8 illustrates the rule-based EMS controller used here. Basic
rules are imposed for the system based on four inputs, namely current vehicle

Fig. 9 State-flow diagram for the rule-based EMS

speed, brake signal, current SOC level and current ICE speed and it outputs three
switching signals controlling the ICE, IWM and GEN respectively.
The state-flow diagram is shown in Fig. 9. There are two main modes available, which are brake mode and motion mode. Motion mode is further detailed into four
sub-modes. Start mode is during the initial movement where only the IWMs are
used as the ICE stall speed is yet to be exceeded. Once the ICE stall speed is
exceeded, the mode changes to the next sub-mode which is the normal mode where
the ICE is turned ON. The normal mode is further divided into cruise mode and
acceleration mode. These modes are used throughout the operation while corre-
sponding to the driver’s demand. Acceleration is done in hybrid mode with both the
ICE and IWMs supplying power to the wheels. In cruise mode, when the SOC is
high, the GEN is switched OFF but at a lower threshold of 30%, the GEN will be
switched ON to replenish the ESS. Brake mode is when the brake pedal is activated
and the ICE and IWM will be turned OFF to allow the GEN to regenerate energy
through regenerative braking.
The detailed modification and modeling process is reported in a separate pub-
lication [9] which is why it will not be explained further in this paper.
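For reference, the mode logic described above can be condensed into a small deterministic rule set; the sketch below is a simplified restatement of Fig. 9 (the ICE stall-speed value and the acceleration-demand flag are illustrative stand-ins for signals in the Simulink model, and only the 30% SOC threshold is taken from the text).

```python
def ems_mode(brake_on: bool, soc_pct: float, engine_speed_rpm: float,
             accel_demand: bool, ice_stall_rpm: float = 1000.0):
    """Return (mode, ice_on, iwm_on, gen_on) for the rule-based EMS.

    Simplified restatement of the state-flow in Fig. 9; ice_stall_rpm and
    accel_demand are illustrative placeholders, not values from the model.
    """
    if brake_on:
        # Brake mode: ICE and IWMs off, GEN recovers energy (regenerative braking)
        return "brake", False, False, True
    if engine_speed_rpm < ice_stall_rpm:
        # Start mode: only the IWMs drive the vehicle until the ICE stall
        # speed is exceeded
        return "start", False, True, False
    if accel_demand:
        # Acceleration sub-mode: hybrid operation, ICE and IWMs both deliver power
        return "accelerate", True, True, False
    # Cruise sub-mode: the GEN is switched on only below the 30% SOC threshold.
    # (Whether the IWMs assist during cruise is a model detail not spelled out
    # in the text; they are kept off in this sketch.)
    return "cruise", True, False, soc_pct < 30.0
```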

Table 3 Drive cycles data

Urban Drive Cycle (ECE R15): low speed, stop-go urban driving; distance 995 m; average speed 18.4 km/h
Extra Urban Drive Cycle (EUDC): high-speed highway driving; distance 6955 m; average speed 62.6 km/h
New European Drive Cycle (NEDC): combined urban and high-speed highway driving; distance 11017 m; average speed 33.6 km/h
Highway Fuel Economy Test (HWFET): high-speed highway driving; distance 16503 m; average speed 77.7 km/h

4 Simulation Results and Discussions

4.1 Experiment Setups

In this chapter, the proposed TtR HEV model is put to the test using the simulations
on four standard drive cycles. The four drive cycles used in the simulations are as
stated in Table 3.
These simulation runs will give a brief picture of how the proposed TtR HEV
model will perform in real-life driving situations in a controlled environment as
various performance indicators such as drivability, power flow, fuel consumption,
battery SOC, etc. are observed during the duration of the simulations. For this
research, the emphasis is put on drivability and fuel economy of the TtR HEV to
warrant first and foremost, the proposed TtR HEV is capable of responding to
driver's demand while maintaining an acceptable level of fuel economy. The initial SOC is set to the optimum value of 90% to ensure the best possible vehicle performance without an SOC bottleneck, since a detailed battery management strategy for the proposed TtR HEV requires separate research, which has been identified as one of the areas to be focused on in the future. The basic parameters of the proposed TtR HEV used in this simulation are presented in Table 4. The fuel consumption is calculated from the fuel flow rate (g/s) provided by the ICE block, divided by the density of gasoline (750 kg/m3) and multiplied by the total drive cycle time in seconds, to obtain the total fuel consumption in liters.
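That bookkeeping amounts to integrating the ICE fuel mass flow over the cycle and converting mass to volume with the 750 kg/m3 density; a minimal sketch is shown below (the flow-rate trace is a placeholder for the logged Simulink signal).

```python
import numpy as np

GASOLINE_DENSITY_KG_PER_L = 0.750    # 750 kg/m^3, as stated above

def fuel_litres(flow_g_per_s: np.ndarray, dt_s: float) -> float:
    """Total fuel volume from the ICE fuel mass-flow trace sampled every dt_s."""
    fuel_kg = flow_g_per_s.sum() * dt_s / 1000.0    # integrate g/s over time -> kg
    return fuel_kg / GASOLINE_DENSITY_KG_PER_L      # kg -> litres

# Placeholder trace: a constant 0.6 g/s over a 195 s cycle, sampled at 0.1 s
flow = np.full(1950, 0.6)
print(f"{fuel_litres(flow, 0.1):.3f} L")   # ~0.156 L for this made-up trace
```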

Table 4 TtR HEV parameters

Body
Mass 1200 kg
Frontal area 2.16 m2
ICE
Max power 114 kW
Speed at max power 5000 RPM
Max. speed 6000 RPM
Fuel consumption By speed and torque
IWM
Max. power 30 kW
Max. torque 400 Nm
ESS
Type Li-Ion
Nominal voltage 200 V
Rated capacity 22 Ah

4.2 Results and Discussions

ECE R15 Drive Cycle


For the ECE R15 drive cycle, it can be observed in Fig. 10(a) that the TtR HEV has
managed to follow the speed demand with minor difficulties after the first accel-
eration. Figure 10(b) shows the power flow throughout the simulation with the
heavy lifting done mostly by the IWMs and the ICE only activated during the last
section of the drive cycle due to the higher speed demand as can be proven by the
spike in fuel consumption shown in Fig. 10(c). When the ICE is activated, a little
bit of energy is replenished by the GEN as shown by the slight bump in the SOC
level but as the hypothesis suggested, the amount of energy that can be recovered
through regenerative braking in a TtR HEV is limited as proven by this result. The
simulation result shows that the model performs acceptably on the low-speed cycle. At the end of the simulation, the total fuel consumption figure is
0.1609 L and the final SOC level sits at 84.3%. From these results, seeing as the
SOC level is still pretty high by the end of the simulation and considering the short
trip distance, the IWMs could have been utilized more to save more fuel. However,
in the current model, trip distance is not among the considerations for the rule-based
EMS, thus, that kind of minute level adjustment is not possible unless a drastic
change is made to the EMS algorithm.

Fig. 10 Test run on ECE R15 drive cycle: (a) vehicle speed response, (b) power flow and SOC, (c) fuel consumption pattern

EUDC Drive Cycle


EUDC drive cycle provides insight into the TtR HEV performance on higher speed
cycles. From Fig. 11(a), it looks like the proposed model has no issue in responding
to the driver’s demand. The power flow plot in Fig. 11(b) shows the ICE as the
main contributor for power with the IWMs assisting during accelerations. And as
the ICE was running during cruising at a constant speed, it can be observed that the
ESS is getting recharged. The final SOC stands at 78.93% and as per Fig. 11(c), the
fuel consumption total is 0.9167 L. There is no apparent issue that needs to be
highlighted in this part of the simulation, but the fuel consumption figure looks a
little high as ICE is being used heavily here.
NEDC Drive Cycle
On a longer drive cycle such as the NEDC, the proposed model is showing a similar
performance attribute as the previous two drive cycles combined. Figure 12(a)
exhibits that the model is facing a bit of instability at the lower speed region but
performs smoother on high-speed regions. Figure 12(b) shows the power flow and
SOC level of 57.32% at the end which means that the ESS is used heavily espe-
cially during the low-speed section of the drive cycle but with limited regeneration.
The observation that can be made after the three simulations is the proposed model

Fig. 11 Test run on EUDC drive cycle: (a) vehicle speed response, (b) power flow and SOC, (c) fuel consumption pattern

performs best with the ICE as the main source of traction whereas the performance
of the IWMs needs more attention, so further investigation is needed to find the
source of the shaky performance. However, with the ICE taking the centre stage,
fuel consumption takes a hit as can be seen in Fig. 12(c) with the stern increase of
consumption during the later part of the cycle and the simulation ended with
1.561 L of consumption.
HWFET Drive Cycle
HWFET serves as the drive cycle with the highest demand in terms of speed and
power. From Fig. 13(a), it can be clearly observed that the model exhibits insta-
bility which takes a while to be corrected before it is able to follow the speed
profile. And from the Fig. 13(b), it can be deduced that this instability is caused by
the spike of power coming from the IWMs as they try to respond to the steep power
request by the driver but resulted in the overshoots. But as the ICE is being used
more frequently during much of the drive cycle, the performance of the proposed
model is smooth and the ESS is not put under too much strain as the simulation
ended with the SOC of 68.49%. However, as expected by the heavy usage of the
ICE, the fuel consumption is at a high 1.966 L as shown in Fig. 13(c). From the

Fig. 12 Test run on NEDC drive cycle: (a) vehicle speed response, (b) power flow and SOC, (c) fuel consumption pattern

two long drive cycles above, it can be concluded that the fuel-saving is higher at
lower speed regions, but the performance is a little unsteady due to the
inconsistency shown by the IWMs. At higher speed regions, the ICE helps maintain
smoother vehicle performance, but the result leads to unfavourable fuel
consumption.
The summary of the results is presented in Table 5. An interesting point from the
summary is that the proposed TtR HEV model has a better fuel consumption rate on
the higher-speed cycles, namely EUDC and HWFET. This is because the ICE operates
more efficiently at high speed than at low speed, and it also results in longer SOC
preservation, as the ESS is not drained as aggressively as when the proposed model
relies on the IWMs for power in the lower-speed sections.
However, when comparing the performance obtained here with other publications,
such as [22–24], which report fuel efficiencies between 2.01 L/100 km and
4.25 L/100 km on the NEDC and HWFET drive cycles, the fuel-saving capability of
the proposed model is still far from satisfactory. This is due to the rule-based EMS
used here, compared with the more advanced EMS approaches of the publications
mentioned above, and to the choice of ICE, which does not favour the proposed
model since the downsizing option is unavailable.

Fig. 13 Test run on HWFET drive cycle: (a) vehicle speed response, (b) power flow and SOC, (c) fuel consumption pattern

Table 5 Simulation summary

Features/Drive cycles         ECE R15        EUDC             NEDC              HWFET
Fuel consumption (L)          0.1609         0.9167           1.561             1.966
Fuel consumption (L/100 km)   16.17          13.18            14.16             11.91
Final SOC (%)                 84.3 (−5.7)    78.93 (−11.07)   57.32 (−32.68)    68.49 (−21.51)
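As a quick arithmetic check on Table 5, the total consumption divided by the cycle distance gives the per-100 km figure. The sketch below is a minimal helper written for this text (not part of the original simulation platform); the cycle distances used are approximate nominal values for the standard cycles and should be treated as assumptions.

```python
def litres_per_100km(total_litres: float, distance_km: float) -> float:
    """Convert total fuel used over a cycle to L/100 km."""
    return total_litres / distance_km * 100.0

# (total litres from Table 5, assumed nominal cycle distance in km)
cycles = {
    "ECE R15": (0.1609, 0.995),
    "EUDC":    (0.9167, 6.955),
    "NEDC":    (1.561, 11.02),
    "HWFET":   (1.966, 16.5),
}

for name, (litres, km) in cycles.items():
    print(f"{name:8s}: {litres_per_100km(litres, km):5.2f} L/100 km")
# Reproduces Table 5 to within rounding: ~16.2, 13.2, 14.2 and 11.9 L/100 km.
```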

5 Conclusions

Finally, it can be concluded that the modeling of the proposed TtR HEV has been a
success as far as the ability of the model to respond to the driver's demand is
concerned, albeit with some minor instabilities that can be remedied through further
optimization. The focus will be put on the IWMs' performance, because safety is at
risk when a vehicle does not perform as the driver intends. The simulation results
show that the development of the simulation platform is successful, and it can be
used for further research on the proposed TtR model, especially in developing a new
EMS controller to replace the rule-based EMS and take full advantage of the design
approach taken here, in pursuit of the best fuel consumption possible without
sacrificing vehicle performance.
From the results, several areas have been identified as prospective research
focuses, targeting performance gains and increased fuel-saving potential of the
proposed architecture. EMS is certainly the main area in which these goals can be
achieved, as the rule-based EMS currently used by the model has too many
limitations and is not capable of adapting to the different characteristics of
different drive cycles. Another area of interest is hardware-based drivetrain
optimization through component sizing to increase operational efficiency and
minimize losses.

Acknowledgements This research work and publication is supported and funded by UNIMAS
under Special MyRA Assessment Funding (Project ID: F02/SpMYRA/1719/2018).

References

1. Atabani AE, Badruddin IA, Mekhilef S, Silitonga AS (2011) A review on global fuel
economy standards, labels and technologies in the transportation sector. Renew Sustain
Energy Rev 15:4586–4610
2. Mohr SH, Wang J, Ellem G, Ward J, Giurco D (2015) Projection of world fossil fuels by
country. Fuel 141:120–135
3. Sabri MFM, Danapalasingam KA, Rahmat MF (2016) A review on hybrid electric vehicles
architecture and energy management strategies. Renew Sustain Energy Rev 53:1433–1442
4. UNFCCC (2015) Conference of the parties (COP): Paris climate change
conference-November 2015, COP 21. Adoption of the Paris Agreement. Proposed by
President 21932, 32 (2015)
5. Un-Noor F, Padmanaban S, Mihet-Popa L, Mollah M, Hossain E (2017) A comprehensive
study of key electric vehicle (EV) components, technologies, challenges, impacts, and future
direction of development. Energies 10:1217
6. Manoharan Y, Hosseini SE, Butler B, Alzhahrani H, Senior BTF, Ashuri T, Krohn J (2019)
Hydrogen fuel cell vehicles; current status and future prospect. Appl. Sci. 9:2296
7. Williamson SS, Rathore AK, Musavi F (2015) Industrial electronics for electric transporta-
tion: current state-of-the-art and future challenges. IEEE Trans Ind Electron 62:3021–3032
8. Cano ZP, Banham D, Ye S, Hintennach A, Lu J, Fowler M, Chen Z (2018) Batteries and fuel
cells for emerging electric vehicle markets. Nat Energy 3:279–289
9. Mohd Sabri MF, Danapalasingam KA, Rahmat MF (2018) Improved fuel economy of
through-the-road hybrid electric vehicle with fuzzy logic-based energy management strategy.
Int J Fuzzy Syst 20:2677–2692
10. Enang W, Bannister C (2017) Modelling and control of hybrid electric vehicles (a
comprehensive review). Renew Sustain Energy Rev 74:1210–1239
11. Hannan MA, Azidin FA, Mohamed A (2014) Hybrid electric vehicles and their challenges: a
review. Renew Sustain Energy Rev 29:135–150
12. International Energy Agency (IEA) (2017) Global EV outlook 2017: two million and
counting. IEA Publication, New Delhi, pp 1–71
13. Rezvani Z, Jansson J, Bodin J (2015) Advances in consumer electric vehicle adoption
research: a review and research agenda. Transp Res Part D Transp Environ 34:122–136

14. Nykvist B, Nilsson M (2015) Rapidly falling costs of battery packs for electric vehicles. Nat
Clim Change 5:329–332
15. Axsen J, Kurani KS (2013) Hybrid, plug-in hybrid, or electric-What do car buyers want?
Energy Policy 61:532–543
16. Pisanti C, Rizzo G, Marano V (2014) Energy management of through-the-road parallel hybrid
vehicles. IFAC Proc 47:2118–2124
17. Galvagno E, Morina D, Sorniotti A, Velardocchia M (2013) Drivability analysis of
through-the-road-parallel hybrid vehicles. Meccanica 48:351–366
18. Rashid MIM, Danial H (2017) ADVISOR simulation and performance test of split plug-in
hybrid electric vehicle conversion. Energy Procedia 105:1408–1413
19. Meisel J, Shabbir W, Evangelou SA (2013) Evaluation of the through-the-road architecture
for plug-in hybrid electric vehicle powertrains. In: 2013 IEEE international electric vehicle
conference (IEVC). IEEE, pp 1–5
20. Mathews JC, Walp KJ, Molen GM (2006) Development and implementation of a control
system for a parallel hybrid powertrain. In: 2006 IEEE vehicle power and propulsion
conference, pp 1–6
21. Miller S. Hybrid-Electric Vehicle Model in Simulink. https://www.mathworks.com/
matlabcentral/fileexchange/28441-hybrid-electric-vehicle-model-in-simulink. Accessed 10
Oct 2019
22. Dubois MR, Desrochers A, Denis N (2015) Fuzzy-based blended control for the energy
management of a parallel plug-in hybrid electric vehicle. IET Intell Transp Syst 9:30–37
23. Zhang Y, Liu H-P (2012) Fuzzy multi-objective control strategy for parallel hybrid electric
vehicle. IET Electr Syst Transp 2:39
24. Adhikari S, Halgamuge SK, Watson HC (2010) An online power-balancing strategy for a
parallel hybrid electric vehicle assisted by an integrated starter generator. IEEE Trans Veh
Technol 59:2689–2699
Euler-Lagrange Based Dynamic Model
of Double Rotary Inverted Pendulum

Mukhtar Fatihu Hamza, Jamilu Kamilu Adamu,


and Abdulbasid Ismail Isa

Abstract The double rotary inverted pendulum (DRIP) is an important member of the
class of nonlinear, unstable, non-minimum phase, and under-actuated mechanical systems.
The DRIP is widely known as an experimental setup for testing different kinds of
control algorithms. This paper describes the development of the nonlinear dynamical
equations of the DRIP system using the Euler-Lagrange method, which does not require
a complicated and tedious formulation since the DRIP is not a large multi-body system.
The linear model and its state-space representation are also presented. A Simulink
model of the DRIP was developed based on the derived equations. A simulation study was
carried out, and the results indicate that the DRIP system is inherently nonlinear and
unstable. The difficulties and limitations of previous dynamic equations of the DRIP
proposed in the literature are thereby eliminated, and the Euler-Lagrange method can be
regarded as an alternative method for deriving the dynamic model of such systems.

Keywords Rotary inverted pendulum · Dynamic model · Euler-Lagrange · Nonlinear system

M. F. Hamza (&)
Department of Mechanical Engineering, University of Malaya, Kuala Lumpur, Malaysia
e-mail: [email protected]
M. F. Hamza
Department of Mechatronics Engineering, Bayero University, Kano, Nigeria
J. K. Adamu
Department of Engineering Services, Federal Ministry of Power, Works and Housing, Abuja,
Nigeria
A. I. Isa
Department of Electrical and Electronics Engineering, Usmanu Danfodiyo University Sokoto,
Sokoto, Nigeria

© Springer Nature Singapore Pte Ltd. 2021 419


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_29

1 Introduction

The Double Rotary inverted pendulum (DRIP) has two inverted pendulums con-
nected with each other and one is attached to a rotating arm as shown in Fig. 1. The
plane of the two pendulums is orthogonal to the radial arm [1]. This rotary arm is
actuated by a controlling torque with the objective of balancing the two pendulums
in the inverted position. The system therefore has three degrees of freedom (DOF). The
actuated joint has a complete azimuth revolution range with which to stabilize the double
inverted pendulum [2]. The DRIP is an important member of the class of nonlinear,
unstable, non-minimum phase, and under-actuated mechanical systems. The schematic
diagram of the experimental setup is shown in Fig. 2.
DRIP-like systems appear in a wide range of real-life applications such as
aerospace systems, robotics, marine systems, mobile systems, flexible systems,
pointing control, and locomotive systems [3]. Moreover, when the pendulums of the
DRIP are in the hanging position, it represents a simplified model of an industrial
crane application [4].
The control objectives of the DRIP can be categorized into four categories [5, 6]
namely:
1. Controlling the two pendulums from downward stable position to upward
unstable position known as Swing-up control [7].
2. Regulating the pendulums to remain at the unstable position known as stabi-
lization control [8].

Fig. 1 Picture of experimental setup

Fig. 2 Schematic diagram of experimental setup

3. The switching between swing-up control and stabilization control known as


switching control [6].
4. Controlling the DRIP in such a way that the arm tracks a desired time varying
trajectory while the pendulum remains at unstable position known as trajectory
tracking control [9].
The study of system dynamics resides in modeling its behavior. The dynamic
equations of any mechanical system can be obtained from classical Newtonian
mechanics [10–12]. Newtonian dynamics is a mathematical model whose purpose is to
predict the motions of the various objects that we encounter in the world around us
[13]. The drawback of this formalism is its use of vector quantities, which
considerably complicates the analysis as the number of joints increases or when
rotations are present in the system. In such cases it is preferable to employ the
Lagrange equations, whose scalar formalism facilitates the analysis of any
mechanical system [14, 15].
This study describes in detail the development of the nonlinear and linear dynamical
equations of the DRIP system using the Euler-Lagrange method. The state-space
representation of the developed linear model is also presented. A nonlinear Matlab
model of the DRIP was developed based on the derived equations. A simulation study
was carried out, and the results indicate that the DRIP system is inherently
nonlinear and unstable. The difficulties and limitations of previous dynamic
equations of the DRIP proposed in the literature are thereby eliminated.

2 Double Rotary Inverted Pendulum Modelling

The DRIP consists of a series of two pendulums attached to a rotary arm that rotates
around the motor shaft axis. It has three DOF, namely the rotary arm angle θ, the lower
pendulum angle α, and the upper pendulum angle γ. The schematic diagram of the DRIP is
shown in Fig. 3. The derivation of the mathematical equations describing the dynamics
of the DRIP system is based on the Euler-Lagrange equation of motion [16].

2.1 Euler-Lagrange Equation

As described in [17], the Euler-Lagrange equation is given in Eq. (1):

τᵢ = d/dt(∂L/∂q̇ᵢ) − ∂L/∂qᵢ + ∂w/∂q̇ᵢ    (1)

w = ½ bᵢ q̇ᵢ²  ⇒  ∂w/∂q̇ᵢ = bᵢ q̇ᵢ    (2)

where qᵢ(t) are the generalized coordinates, q̇ᵢ(t) are the generalized velocities, τᵢ is
the external force or load vector, L is the Lagrangian, and w is the loss (dissipation) energy.
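To make the recipe of Eqs. (1)–(2) concrete, the sketch below applies the same Euler-Lagrange procedure symbolically to a single damped pendulum using SymPy. It is an illustrative helper written for this text, not the authors' derivation of the full DRIP model; the symbol names are chosen here for the example only.

```python
import sympy as sp

t = sp.symbols("t")
m, l, J, b, g = sp.symbols("m l J b g", positive=True)
q = sp.Function("alpha")(t)          # single generalized coordinate
qd = sp.diff(q, t)

# Lagrangian L = K - P and dissipation w = (1/2) b qdot^2, mirroring Eqs. (1)-(2).
K = sp.Rational(1, 2) * (J + m * l**2) * qd**2
P = m * g * l * sp.cos(q)
L = K - P
w = sp.Rational(1, 2) * b * qd**2

# tau = d/dt(dL/dqdot) - dL/dq + dw/dqdot
tau = sp.diff(sp.diff(L, qd), t) - sp.diff(L, q) + sp.diff(w, qd)
print(sp.simplify(tau))
# -> (J + m*l**2)*alpha'' + b*alpha' - g*l*m*sin(alpha)
```

The same pattern, applied to each of the three DRIP coordinates with the Lagrangian derived below, yields the coupled equations of motion in Sect. 2.1.3.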

2.1.1 Kinetic Energy

The kinetic energy of the DRIP consists of translational and rotational components
for the two pendula and a rotational component for the rotary arm [18].
The total kinetic energy can be expressed in terms of the generalized coordinates
and their first-time derivatives. In order to describe the position and motion of the
system under consideration, we could use standard Cartesian (x, y, z) and polar
coordinates (r, θ), of each of the three links. Each different point in these planes
corresponds to a unique instantaneous state of the DRIP. The Kinetic Energy for
each of the links can be obtained as follows:

Arm (Link 1)
The kinetic energy of the arm consists only of a rotational component. The arm is
constrained to move in the x–o–z plane and rotates around the y-axis through an angle
θ [19]. The instantaneous position of the arm, and hence its kinetic energy K₁, is most
conveniently specified in terms of the plane polar coordinates r and θ:

K₁ = ½ J_a θ̇²    (3)

Fig. 3 Schematic diagram of DRIP

Pendulum
The movements of the two pendulums are constrained to a vertical plane perpen-
dicular to Link 1. A Cartesian coordinate system allows position and direction in
space to be represented in a very convenient manner. Let us define our usual
Cartesian coordinates (x, y, z) and let the origin of our coordinate system corre-
spond to the equilibrium position of each pendulum. The direction of the arrows on

the arcs in Fig. 3 indicating the angular displacements shows the positive
direction of rotary movement for the links. The straight dashed lines in Fig. 4
represent the reference positions of the link angles (i.e. θ = α = γ = 0).

Lower pendulum (Link 2)


If the lower pendulum is deflected from the upward vertical position by a small
angle α, then it is easily seen that:

X₁ = rθ + l₁ sin α    (4)

Ẋ₁ = rθ̇ + l₁ cos α · α̇    (5)

Y₁ = l₁ cos α    (6)

Ẏ₁ = −l₁ sin α · α̇    (7)

The translational kinetic energy of link 2 is given by:

K_t2 = ½ m₁ (Ẋ₁² + Ẏ₁²)    (8)

Substituting (5) and (7) into (8) yields

K_t2 = ½ m₁ [(rθ̇ + l₁ cos α · α̇)² + (l₁ sin α · α̇)²]    (9)

The rotational kinetic energy of link 2 is:

K_r2 = ½ J₁ α̇²    (10)

The total kinetic energy K₂ of link 2 is given by the sum of the rotational and
translational kinetic energies:

K₂ = K_t2 + K_r2 = ½ J₁ α̇² + ½ m₁ [(rθ̇ + l₁ cos α · α̇)² + (l₁ sin α · α̇)²]    (11)

Fig. 4 Position analyses for link 2

Upper pendulum (Link 3)

If the upper pendulum is deflected from the upward vertical position by a small
angle γ, then it is easily seen that

X₂ = rθ + L₁ sin α + l₂ sin γ    (12)

Ẋ₂ = rθ̇ + L₁ cos α · α̇ + l₂ cos γ · γ̇    (13)

Y₂ = L₁ cos α + l₂ cos γ    (14)

Ẏ₂ = −L₁ sin α · α̇ − l₂ sin γ · γ̇    (15)

The translational kinetic energy of link 3 is given by:

K_t3 = ½ m₂ (Ẋ₂² + Ẏ₂²)    (16)

K_t3 = ½ m₂ [(rθ̇ + L₁ cos α · α̇ + l₂ cos γ · γ̇)² + (L₁ sin α · α̇ + l₂ sin γ · γ̇)²]    (17)

The rotational kinetic energy of link 3 is:

K_r3 = ½ J₂ γ̇²    (18)

The total kinetic energy K₃ of link 3 is given by the sum of the rotational (K_r3) and
translational (K_t3) kinetic energies:

K₃ = ½ J₂ γ̇² + ½ m₂ [(rθ̇ + L₁ cos α · α̇ + l₂ cos γ · γ̇)² + (L₁ sin α · α̇ + l₂ sin γ · γ̇)²]    (19)

The total kinetic energy K of the system is given by the sum of the translational and
rotational kinetic energies of the individual components making up the system:

K = K₁ + K₂ + K₃

K = ½ J_a θ̇² + ½ J₁ α̇² + ½ J₂ γ̇² + ½ m₁ [(rθ̇ + l₁ cos α · α̇)² + (l₁ sin α · α̇)²]
  + ½ m₂ [(rθ̇ + L₁ cos α · α̇ + l₂ cos γ · γ̇)² + (L₁ sin α · α̇ + l₂ sin γ · γ̇)²]    (20)

2.1.2 The Potential Energy

The potential energy of the individual links of the DRIP is given below.

Arm
Since the center of mass of the arm remains at the original height (y = 0), the
potential energy of the arm P₁ is zero:

P₁ = 0    (21)

Lower pendulum

P₂ = m₁ g l₁ cos α    (22)

Upper pendulum

P₃ = g m₂ L₁ cos α + g m₂ l₂ cos γ    (23)

The total potential energy P of the system is given by:

P = P₁ + P₂ + P₃ = g m₁ l₁ cos α + g m₂ L₁ cos α + g m₂ l₂ cos γ    (24)

where θ, θ̇, θ̈ are the angular position, velocity and acceleration of the motor shaft
about the vertical axis, respectively; α, α̇, α̈ are the angular position, velocity and
acceleration of the lower pendulum, respectively; and γ, γ̇, γ̈ are the angular position,
velocity and acceleration of the upper pendulum, respectively.

2.1.3 Lagrangian Formulation (L)

Consider the Lagrangian

L = K − P    (25)

Substituting Eqs. (20) and (24) gives:

L = ½ J_a θ̇² + ½ J₁ α̇² + ½ J₂ γ̇² + ½ m₁ [(rθ̇ + l₁ cos α · α̇)² + (l₁ sin α · α̇)²]
  + ½ m₂ [(rθ̇ + L₁ cos α · α̇ + l₂ cos γ · γ̇)² + (L₁ sin α · α̇ + l₂ sin γ · γ̇)²]
  − [g m₁ l₁ cos α + g m₂ L₁ cos α + g m₂ l₂ cos γ]    (26)

Applying the Euler-Lagrange equation (1) to the Lagrangian (26) results in three
coupled nonlinear equations, one for each link.

For the arm (θ), substituting θ in Eq. (1):

τ_a = d/dt(∂L/∂θ̇) − ∂L/∂θ + b_a θ̇    (27)

τ_a = [J_a + r²(m₁ + m₂)] θ̈ + r(m₁l₁ + m₂L₁) cos α · α̈ + m₂l₂r cos γ · γ̈ + b_a θ̇
      − r(m₁l₁ + m₂L₁) sin α · α̇² − m₂l₂r sin γ · γ̇²    (28)

For the lower pendulum (α), substituting α in Eq. (1):

0 = d/dt(∂L/∂α̇) − ∂L/∂α + b₁ α̇    (29)

0 = r(m₁l₁ + m₂L₁) cos α · θ̈ + (J₁ + m₁l₁² + m₂L₁²) α̈ + m₂L₁l₂ cos(α − γ) · γ̈
    + b₁ α̇ + m₂L₁l₂ sin(α − γ) · γ̇² − g(m₁l₁ + m₂L₁) sin α    (30)

For the upper pendulum (γ), substituting γ in Eq. (1):

0 = d/dt(∂L/∂γ̇) − ∂L/∂γ + b₂ γ̇    (31)

0 = m₂l₂r cos γ · θ̈ + m₂L₁l₂ cos(α − γ) · α̈ + (J₂ + m₂l₂²) γ̈ + b₂ γ̇
    − m₂L₁l₂ sin(α − γ) · α̇² − g m₂l₂ sin γ    (32)

Equations (28), (30) and (32) are three nonlinear, coupled, second-order differential
equations of motion describing the dynamics of the DRIP system. They can be written
compactly as:

τ_a = z₁ θ̈ + z₂ cos α · α̈ + z₃ cos γ · γ̈ + b_a θ̇ − z₂ sin α · α̇² − z₃ sin γ · γ̇²    (33)

0 = z₂ cos α · θ̈ + z₄ α̈ + z₅ cos(α − γ) · γ̈ + b₁ α̇ + z₅ sin(α − γ) · γ̇² − z₇ sin α    (34)

0 = z₃ cos γ · θ̈ + z₅ cos(α − γ) · α̈ + z₆ γ̈ + b₂ γ̇ − z₅ sin(α − γ) · α̇² − z₈ sin γ    (35)

where:

z₁ = J_a + r²(m₁ + m₂)    (36)

z₂ = r(m₁l₁ + m₂L₁)    (37)

z₃ = m₂l₂r    (38)

z₄ = J₁ + m₁l₁² + m₂L₁²    (39)

z₅ = m₂L₁l₂    (40)

z₆ = J₂ + m₂l₂²    (41)

z₇ = g(m₁l₁ + m₂L₁)    (42)

z₈ = g m₂l₂    (43)

The torque at the load shaft produced by an applied motor voltage can be expressed as:

τ_m(t) = η_g K_g η_m k_t (V_m(t) − K_g k_m θ̇(t)) / R_m    (44)

Using Eq. (44) with the parameter values of the system under consideration, the arm
torque becomes:

τ_a = 0.117238 V − 0.063 θ̇  N·m    (45)
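The coefficients in Eq. (45) follow directly from substituting the motor and gear constants of Table 1 (Sect. 2.2) into Eq. (44). The snippet below is a simple verification sketch written for this text, not part of the original model files.

```python
# Motor/gearbox constants from Table 1.
eta_g, eta_m = 0.9, 0.63       # gear and motor efficiency
K_g = 70.0                     # total gear ratio
k_t = 0.00768                  # motor torque constant (N m/A)
k_m = 0.00768                  # back-emf constant (V/(rad/s))
R_m = 2.6                      # armature resistance (ohm)

# tau_a = eta_g*K_g*eta_m*k_t*(V_m - K_g*k_m*theta_dot)/R_m   (Eq. 44)
gain_v = eta_g * K_g * eta_m * k_t / R_m     # coefficient of V_m
gain_w = gain_v * K_g * k_m                  # coefficient of theta_dot
print(f"tau_a = {gain_v:.6f}*V - {gain_w:.6f}*theta_dot  [N m]")
# -> tau_a ≈ 0.117238*V - 0.063030*theta_dot, matching Eq. (45).
```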

2.2 System Specifications

The system specification and their description are given in Table 1 (SRV02 DRIP
module).

2.3 MATLAB Modelling

For the purpose of controller design and analysis, the DRIP Simulink model was
developed in Matlab/Simulink using the nonlinear, parameterized mathematical
model as shown in Fig. 5. This is done by first rearranging the nonlinear-coupled
equations of motion (33), (34) and (35) and substituting the values of the param-
eters we have:


θ̈ = 0.8085v − 0.6138 cos α · α̈ − 0.2966 cos γ · γ̈ − 4.5103 θ̇ + 0.6138 sin α · α̇² + 0.2966 sin γ · γ̇²    (46)

Table 1 SRV02 DRIP specifications

Symbol   Description                                                          Value     Unit
J_a      Rotary arm moment of inertia about its center of mass               0.0041    kg·m²
J₁       First pendulum moment of inertia about its center of mass           0.00032   kg·m²
J₂       Second pendulum moment of inertia about its center of mass          0.0012    kg·m²
r        Rotary arm length from pivot to tip                                  0.2159    m
L₁       Lower pendulum length from pivot to tip                              0.2       m
l₁       Lower pendulum length from pivot to center of mass                   0.097     m
l₂       Upper pendulum length from pivot to center of mass                   0.156     m
b_a      Viscous damping coefficient of the motor arm                         0.0024    N·m/(rad/s)
b₁       Upper pendulum viscous damping coefficient as seen at the pivot axis 0.0024    N·m/(rad/s)
b₂       Lower pendulum viscous damping coefficient as seen at the pivot axis 0.0024    N·m/(rad/s)
V_nom    Motor nominal input voltage                                          6.0       V
R_m      Motor armature resistance                                            2.6       Ω
η_m      Motor efficiency                                                     0.63
η_g      Gear efficiency                                                      0.9
K_g      Total gear ratio                                                     70
k_m      Back-emf constant                                                    0.00768   V/(rad/s)
k_t      Motor torque constant                                                0.00768   N·m

α̈ = −1.1266 cos α · θ̈ − 0.4937 cos(α − γ) · γ̈ − 0.3038 α̇ − 0.4937 sin(α − γ) · γ̇² + 51.2405 sin α    (47)

γ̈ = −cos γ · θ̈ − 0.9096 cos(α − γ) · α̈ − 0.5581 γ̇ + 0.9096 sin(α − γ) · α̇² + 45.2093 sin γ    (48)

2.4 Linearization of Nonlinear Model

In most situations where we seek a linearized model, the nominal state is an
equilibrium point. This term refers to an initial state where the system remains
unless perturbed [5]. Therefore, to linearize the model, the following small-angle
approximations, based on a Taylor series expansion, are applied:
cos θ ≈ 1, cos α ≈ 1, cos γ ≈ 1, sin θ ≈ θ, sin α ≈ α, sin γ ≈ γ, and θ̇² = α̇² = γ̇² ≈ 0.

Fig. 5 Simulink model of DRIP

The linearized model of the nonlinear equations (33), (34) and (35) in matrix form is:

[ z₁ z₂ z₃ ] [ θ̈ ]   [ b_a  0   0  ] [ θ̇ ]   [   0    ]   [ τ_a ]
[ z₂ z₄ z₅ ] [ α̈ ] + [  0   b₁  0  ] [ α̇ ] + [ −z₇ α ] = [  0  ]    (49)
[ z₃ z₅ z₆ ] [ γ̈ ]   [  0   0   b₂ ] [ γ̇ ]   [ −z₈ γ ]   [  0  ]

By defining the state variables as x₁ = θ, x₂ = α, x₃ = γ, x₄ = θ̇, x₅ = α̇, x₆ = γ̇ and
substituting the values of the parameters, a linear state-space representation is obtained with

A = [ 0      0         0         1         0        0
      0      0         0         0         1        0
      0      0         0         0         0        1
      0   103.7924    1.7156   14.6318    0.6154   0.0212
      0   211.7365   42.3798   16.7688    1.2554   0.5232
      0    88.2477   81.9312    0.5772    0.5232   1.0115 ]

B = [ 0   0   0   26.2209   30.0506   1.0343 ]ᵀ,

C = [ 1 0 0 0 0 0
      0 1 0 0 0 0
      0 0 1 0 0 0 ]
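The step from the matrix form (49) to a state-space pair (A, B) can be automated. The sketch below shows the generic construction A = [[0, I], [−M⁻¹K, −M⁻¹D]] with B built from M⁻¹; the numerical entries of M, D and K here are placeholders (the pendulum masses m₁ and m₂ are not listed in Table 1), so this is an illustrative template rather than a reproduction of the matrices printed above.

```python
import numpy as np

def drip_state_space(M, D, K):
    """Build A, B for M*qdd + D*qd + K*q = [tau_a, 0, 0]^T, cf. Eq. (49)."""
    n = M.shape[0]
    Minv = np.linalg.inv(M)
    A = np.block([
        [np.zeros((n, n)), np.eye(n)],
        [-Minv @ K,        -Minv @ D],
    ])
    # Only the arm joint is actuated, so the input enters the first equation.
    B = np.vstack([np.zeros((n, 1)), Minv @ np.array([[1.0], [0.0], [0.0]])])
    return A, B

# Placeholder inertia/damping/stiffness matrices (z1..z8 values assumed).
M = np.array([[0.03, 0.02, 0.01],
              [0.02, 0.02, 0.01],
              [0.01, 0.01, 0.01]])
D = np.diag([0.0024, 0.0024, 0.0024])
K = np.diag([0.0, -1.0, -0.5])   # the -z7, -z8 terms destabilize the upright pose

A, B = drip_state_space(M, D, K)
print(np.linalg.eigvals(A))
# With these placeholder numbers, some eigenvalues have positive real parts,
# consistent with the instability analysis in Sect. 3.
```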

3 Test for Stability

As pointed out in [20], the necessary and sufficient condition for stability of a
system is that all the roots of the characteristic equation (λ, also referred to as
the eigenvalues) have negative real parts. If any root has a positive real part, the
contribution from the corresponding exponential term grows with time, the output
response becomes unbounded, and the entire system is regarded as unstable.
The characteristic equation P(λ) is given as

P(λ) = det(λI − A)    (50)

Evaluating det(λI − A) with the state matrix A obtained above and solving P(λ) = 0
gives the eigenvalues

λ₁ = 0, λ₂ = 22.5049, λ₃ = 12.8716, λ₄ = 6.2333, λ₅ = 3.3489, λ₆ = 10.1498

From the eigenvalues λᵢ obtained, two of the poles lie in the right half of the
s-plane (positive real part). Hence, the system is confirmed to be unstable.

4 Open Loop Response

The system dynamic model was derived on the assumption that the tilt angles of all
links (arm, lower pendulum and upper pendulum) are at the zero reference position
(0 rad). When the motor is energized with a step signal and without control, which
serves as a disturbance to the DRIP at its unstable equilibrium, the two pendulums
are unable to maintain the unstable equilibrium position and fall to the downward
stable equilibrium, equivalent to 180°. This behaviour is shown in Figs. 6, 7 and 8.

Fig. 6 Rotary arm open loop response

Fig. 7 Lower pendulum open loop response

Fig. 8 Upper pendulum open loop response
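The open-loop behaviour can be reproduced by integrating Eqs. (46)–(48), which are implicit in the three accelerations and must be solved simultaneously at each step. The sketch below is a re-implementation written for this text (not the original Simulink model); the final γ̇² factor of Eq. (46) is assumed by analogy with Eq. (33).

```python
import numpy as np
from scipy.integrate import solve_ivp

def drip_rhs(t, x, v=1.0):
    """Open-loop DRIP dynamics assembled from Eqs. (46)-(48); v is the step voltage."""
    th, al, ga, thd, ald, gad = x
    # Coefficient matrix of [theta_dd, alpha_dd, gamma_dd].
    M = np.array([
        [1.0,                 0.6138 * np.cos(al),       0.2966 * np.cos(ga)],
        [1.1266 * np.cos(al), 1.0,                       0.4937 * np.cos(al - ga)],
        [np.cos(ga),          0.9096 * np.cos(al - ga),  1.0],
    ])
    rhs = np.array([
        0.8085 * v - 4.5103 * thd + 0.6138 * np.sin(al) * ald**2 + 0.2966 * np.sin(ga) * gad**2,
        -0.3038 * ald - 0.4937 * np.sin(al - ga) * gad**2 + 51.2405 * np.sin(al),
        -0.5581 * gad + 0.9096 * np.sin(al - ga) * ald**2 + 45.2093 * np.sin(ga),
    ])
    acc = np.linalg.solve(M, rhs)
    return [thd, ald, gad, *acc]

# Start slightly off the upright equilibrium and apply a small step voltage.
sol = solve_ivp(drip_rhs, (0.0, 3.0), [0, 0.01, 0.01, 0, 0, 0], max_step=0.01)
print("final lower/upper pendulum angles (rad):", sol.y[1, -1], sol.y[2, -1])
```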

5 Conclusion

This study presented the development of the nonlinear dynamical equations of the DRIP
system using the Euler-Lagrange method. A MATLAB/Simulink model of the DRIP was
developed based on the derived equations. A simulation study was carried out, and the
results show that the DRIP system is inherently nonlinear and unstable. The developed
models can be used by researchers for the application of linear or nonlinear
controllers. The method used can also be applied to the modelling of other nonlinear
systems.

References

1. Casanova V, Salt J, Piza R, Cuenca A (2012) Controlling the double rotary inverted
pendulum with multiple feedback delays. Int J Comput Commun Control 7(1):20–38
2. Pakdeepattarakorn P, Thamvechvitee P, Songsiri J, Wongsaisuwan M, Banjerdpongchai D
(2004) Dynamic models of a rotary double inverted pendulum system. In: 2004 IEEE region
10 conference (TENCON 2004), vol 500. IEEE, pp 558–561 (2004)

3. Hamza MF, Yap HJ, Choudhury IA, Isa AI (2016) Application of Kane's method for
dynamic modeling of rotary inverted pendulum system. In: 2016 MNTMSim conference, vol
1. IEEE, Malaysia, pp 20–27
4. Moreno-Valenzuela J, Aguilar-Avelar C (2018) Motion control of underactuated mechanical
systems. Springer, Cham
5. Hamza MF, Yap HJ, Choudhury IA (2015) Genetic algorithm and particle swarm
optimization based cascade interval type 2 fuzzy PD controller for rotary inverted pendulum
system. Math Probl Eng 12(2015):279–462
6. Hamza MF, Yap HJ, Choudhury IA (2017) Cuckoo search algorithm based design of interval
Type-2 Fuzzy PID Controller for Furuta pendulum system. Eng Appl Artif Intell 2(62):134–
151
7. Yang X, Zheng X (2018) Swing up and stabilization control design for an underactuated
rotary inverted pendulum system: theory and experiments. IEEE Trans Ind Electron
65(9):7229–7238
8. Fantoni I, Lozano R (2002) Stabilization of the Furuta pendulum around its homoclinic orbit.
Int J Control 6(75):390–398
9. Casanova V, Alcaína J, Salt J, Pizá R, Cuenca Á (2015) Control of the rotary inverted
pendulum through threshold-based communication. ISA Trans 1(62):357–366
10. Isa AI, Hamza MF (2014) Effect of sampling time on PID controller design for a heat
exchanger system. In: 6th international conference on adaptive science & technology. IEEE,
pp 1–8
11. Isa AI, Hamza MF, Zimit AY, Adamu JK (2018) Modelling and fuzzy control of ball and
beam system. In: 7th international conference on adaptive science & technology. IEEE,
pp 1–6
12. Zimit AY, Yap HJ, Hamza MF, Siradjuddin I, Hendrik B, Herawan T (2018) Modelling and
experimental analysis two-wheeled self balance robot using PID controller. In: International
conference on computational science and its applications. Springer, pp 683–698 (2018)
13. Georgiadis MC, Macchietto S (2000) Dynamic modelling and simulation of plate heat
exchangers under milk fouling. Chem Eng Sci 9(55):1605–1619
14. Dhaouadi R, Hatab AA (2013) Dynamic modelling of differential-drive mobile robots using
lagrange and newton-euler methodologies: a unified framework. Adv Robot Autom 2(2):1–7
15. Hamza MF, Yap HJ, Choudhury IA, Isa AI, Zimit AY, Kumbasar T (2019) Current
development on using rotary inverted pendulum as a benchmark for testing linear and
nonlinear control algorithms. Mech Syst Sig Process 2(16):347–369
16. García-Alarcon O, Puga-Guzman S, Moreno-Valenzuela J (2012) On parameter identification
of the Furuta pendulum. Procedia Eng 1(35):77–84
17. Hamill P (2015) A student’s guide to Lagrangians and Hamiltonians. J Geom Symmetry Phys
2(37):101–105
18. Madrid JLD, Henao PO, Querubín EG (2017) Dynamic modeling and simulation of an
underactuated system. In: Journal of physics: conference series, vol 1, no 850. IOP
Publishing, p 012005
19. Li B (2013) Rotational double inverted pendulum. University of Dayton (2013)
20. Mandal AK (2006) Introduction to control engineering: modeling, analysis and design. New
Age International, New Delhi
Network-Based Cooperative
Synchronization Control of 3 Articulated
Robotic Arms for Industry 4.0
Application

Kam Wah Chan, Muhammad Nasiruddin Mahyuddin, and Bee Ee Khoo

Abstract This project presents a Control Area Network (CAN) based Cooperative
Synchronization Control of three articulated robotic arms for Industry 4.0 application.
Demand on multi-robot system increases as a result of its flexibility and ability on
handling complex task, especially in the era of our nation approaching Industry 4.0.
In this project, three robotic arms will be commissioned to synchronize with each
other to perform a cooperative task. The cooperative setup employs a multi-agent-
inspired framework. A leader agent is assigned to one of the robotic arms which has
full knowledge on the desired trajectory signal whereas the other follower agents have
partial information. CAN bus will be used as a means of communication between
the three robotic arms due to its convenience in terms of configuration and
future extension. An intelligent cooperative phase-lead controller is designed,
developed and implemented to guarantee smooth synchronizing motions of the robot
arms. An experimental frequency response approach is used to identify the input-output
model of each joint of each robot agent i. A discrete phase-lead controller is designed
from the transfer function obtained. The CAN bus network is designed so that the slave
robots obtain the cooperative consensus errors from each other as input signals. The
distributed cooperative control robot system was successfully developed, and the slave
robots track the master robot successfully.

Keywords Cooperative control · Robotics · Control system · Phase-lead


compensator · Multi-robot · Distributed control

1 Introduction

Efficiency, productivity, interconnectivity and the capability to handle complex tasks
seem to be the ultimate targets of the industrial revolution. Industry 4.0 is based on the

K. W. Chan · M. N. Mahyuddin (B) · B. E. Khoo


School of Electrical and Electronic Engineering, Universiti Sains Malaysia, 14300 Nibong Tebal,
Pulau Pinang, Malaysia
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2021 435


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_30

technological concepts of cyber-physical systems and the Internet of Things (IoT),
which enable the Factory of the Future (FoF) [1, 2].
The nine pillars of technological advancement are autonomous robots, simulation,
big data and analytics, horizontal and vertical system integration, the industrial
internet of things, cybersecurity, the cloud, additive manufacturing, and augmented
reality [3]. Automated robots increase the competitiveness of companies, provide better
quality with lower requirements for post-processing and quality control, speed up
processing operations, decrease occupational injuries, and provide a better working
environment [4].
In order to manage uncertainties, machine operations have become more flexible
and more autonomous in handling their problems [5]. Cooperative control of robotic
arms is vital in manufacturing when it comes to complex tasks, such as assembling two
parts of a semi-finished product together when the motions of the cooperating robot
arms differ from one another. A distributed architecture is the natural setting for
cooperative control.
In a distributed architecture, the key aspect is how communication requests are
handled [6]. The communication concern focuses on solutions that enable data exchange
between the internal elements of the system [11]. The CAN bus protocol allows
distributed network-based control of robot arms with high efficiency and robustness,
while also simplifying the complexity of the system [13].
With network-based control, the robot arms work together to complete a cooperative
task [14–18]. The six main requirements discussed for smart factory systems are
modularity, interoperability, decentralization, virtualization, service orientation,
and responsiveness [7]. Haddara and Elragal emphasized the need for machine-to-machine
communication to ensure the effectiveness and the objective of the smart factory
promoted in Industry 4.0 [2].
Multi-robot systems (MRSs) have been widely investigated in recent years due to their
appealing characteristics in terms of flexibility, redundancy, fault tolerance, and the
possibility they offer for distributed sensing and actuation [8, 19–37]. One application
uses a two-level control procedure and control algorithm to solve the control problem
of a cooperating multi-arm robotic system, such as a gripper with n fingers manipulating
a common object [9]. A decentralised control scheme that does not require communication
between robots has been applied in a collaborative controller for a team of mobile
manipulators transporting a rigid workpiece to a desired position and orientation [10].
In a distributed system, the implementation of network-based communication is easier
than that of a pure sensor system, and the communication cost is reduced as well [11].
A distributed controller-observer scheme with first-order dynamics for tracking control
of the centroid and of the relative formation of a multi-robot system has been
implemented and can potentially be used as a bridge to the solution of the tracking
problem with additional control objectives, including complex tasks such as exploration
and deployment [8].
Kocian et al. implemented a CAN bus on the L601-KT robotic arm with six degrees of
freedom [12]. A plug-in architecture robot platform [13] was designed using STM32
series chips as microcontrollers and a CAN bus as the communication medium.
Multi-robot systems have been extensively studied in the past decade due to the
flexibility and redundancy they offer, providing a viable solution for complex tasks.
The use of communication devices instead of a pure sensor system brings distinct
advantages in both cost and system complexity, and the use of the CAN bus provides
high reliability in terms of robustness and ease of implementation.

2 System Setup

Figure 1 shows the system setup in the lab used to demonstrate a cooperative task
carried out by three articulated robot arms. Each of the robot joints is controlled by
the cooperative control algorithm to be designed, as shown by the block diagram in
Fig. 2. The joint angles of each robot arm are passed among the robot agents for
control purposes, depending on the CAN bus communication topology.

2.1 Robot Arm Joint Model

A frequency response experiment is carried out to estimate the transfer function of the
system. The experiment is conducted by observing the output response, in terms of the
angular position of the robot actuator (a DC motor in this case), when sinusoidal
voltage signals of varying frequencies are fed into the motor. The sinusoidal voltage
input ν(t), varying with time t, is described by

ν(t) = A sin(ωt)    (1)

Fig. 1 The system setup (top view) for 3 DOF articulated robot arms commissioned for a cooperative task

Fig. 2 Block diagram showing the cooperative control scheme

where A is the peak-to-peak voltage amplitude and ω is the frequency in radians per
second. A conventional frequency response method was employed for each robot joint
to obtain the input-output model. It is assumed that each robotic arm link has minimal
coupling effect on its connected linkages; therefore, it is permissible to model each
joint in the linear form of (2), under the assumption that no abrupt motion or
demanding joint acceleration is commanded in this work.

f(jω) = Ak / [jω(jω + k)]    (2)

where Ak is the system gain and k is the system pole. Such transfer function is deemed
suitable when an input-output relationship is desired relating angular position to a
voltage input of a DC motor.
The experimental result for one robot arm joint is recorded in Table 1.
From the data tabulated in Table 1, a bode plot (shown in Fig. 3) is drawn to
conclude the frequency response experiment.
From the bode plot in Fig. 3, we may identify the uncompensated system DC gain
accordingly as in (3).

20logK = −27.2984 dB (3)


K = 10^(−27.2984/20) = 0.04316    (4)

and from 0.8 = 10ω_p, the first-order pole is found to be ω_p = 0.08 rad/s, yielding the
following transfer function for one of the joints:

G(jω) = θi(jω)/νi(jω) = 0.04316 / [jω(jω + 0.08)]    (5)
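The identified model of (5) can be evaluated directly at any test frequency. The short sketch below (an assumed post-processing step written for this text, not code from the original experiment) forms G(jω) from the gain and pole readings of (3)–(5) and prints its magnitude and phase at an example frequency.

```python
import numpy as np

# Gain read from the Bode magnitude plot (Eq. 3) and pole from 0.8 = 10*w_p.
K = 10 ** (-27.2984 / 20.0)          # ≈ 0.04316  (Eq. 4)
w_p = 0.8 / 10.0                     # ≈ 0.08 rad/s

def G(jw):
    """Identified joint model G(jw) = K / (jw (jw + w_p)), Eq. (5)."""
    return K / (jw * (jw + w_p))

w = 0.8                              # example test frequency (rad/s)
g = G(1j * w)
print(f"K = {K:.5f}, pole = {w_p:.2f} rad/s")
print(f"|G(j{w})| = {20*np.log10(abs(g)):.2f} dB, phase = {np.degrees(np.angle(g)):.1f} deg")
```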

Table 1 Frequency response data for one of robot arm agent i's links
Freq (rad/s)   νi (V)   θi (°)   20 log(θi/νi) (dB)   t (s)   T (s)   φ (°)
0.03 25.5 468 5.274 −61 209 −105.071
0.04 25.5 338 2.447 −45 157 −103.184
0.05 25.5 299 1.382 −39 126 −111.428
0.06 25.5 238 −0.599 −30.75 105 −105.428
0.07 25.5 207 −1.811 −25 90 −100
0.08 25.5 186 −2.740 −22.75 79 −103.670
0.09 25.5 151 −4.551 −20 70 −102.857
0.10 25.5 132 −5.719 −19 63 −108.571
0.11 25.5 121 −6.470 −16 57 −101.052
0.12 25.5 111 −7.224 −15.2 52 −105.230
0.13 25.5 99 −8.218 −14.25 48 −106.875
0.15 25.5 95 −8.576 −13 42 −111.428
0.17 25.5 78 −10.288 −11 37 −107.027
0.19 25.5 74 −10.746 −10 33 −109.090
0.20 25.5 67 −11.609 −9.5 31.5 −108.571
0.25 25.5 57 −13.013 −8 25 −115.2
0.30 25.5 44 −15.261 −7 21 −120
0.35 25.5 37 −16.766 −6.6 18 −132
0.40 25.5 32 −18.027 −6 16 −135
0.50 25.5 26 −19.831 −4.5 13 −124.615
0.60 25.5 21 −21.686 −4 10.5 −137.142
0.70 25.5 18 −23.025 −4 9 −160
0.80 25.5 16 −24.048 −4 8 −180
0.90 25.5 12 −26.547 −3.5 7 −180
1.00 25.5 11 −27.302 −3 6 −180
1.10 25.5 9 −29.045 −3 6 −180
1.30 25.5 9 −29.045 −2.5 5 −180
1.50 25.5 5 −34.151 −2 4 −180

3 Cooperative Control Design

3.1 Consensus Error

The cooperative error signal measuring the difference between the joint angles of
the neighbouring robot arm can be written in a consensus-like formulation (inspired
from multi-agent theory),

Fig. 3 Bode plot representing frequency response for joint i

ei = Σⱼ₌₁ᴺ aij(θj − θi) + bi(θ0 − θi)    (6)

where ei = [ei1 ei2 · · · ein ]T ∈ Rn is the consensus error vector for each agent i with
n degrees of freedom, ai j is the element in the adjacency matrix A ∈ RN×N , i.e.
the matrix (from graph theory) that describes how the 3 agent robots are connected
to each other in their communication link through CAN bus. θi ∈ Rn is the angular
position of the current agent i, θ j is the angular position of the neighbouring agent
j and θ0 is the leader agent’s angular position.
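For the three-arm setup used here, Eq. (6) can be evaluated directly once a communication topology is fixed. The sketch below assumes a simple chain topology (leader → robot 1 → robot 2); this topology and the numeric angles are illustrative choices, not necessarily the exact adjacency used in the experiments.

```python
import numpy as np

# Assumed chain topology: follower 1 hears the leader, follower 2 hears follower 1.
A = np.array([[0.0, 0.0],      # a_ij between follower agents
              [1.0, 0.0]])
b = np.array([1.0, 0.0])       # pinning gains b_i to the leader (agent 0)

def consensus_error(theta, theta0):
    """Eq. (6): e_i = sum_j a_ij (theta_j - theta_i) + b_i (theta0 - theta_i)."""
    e = np.zeros_like(theta)
    for i in range(len(theta)):
        e[i] = np.sum(A[i] * (theta - theta[i])) + b[i] * (theta0 - theta[i])
    return e

theta0 = 30.0                          # leader joint angle (deg)
theta = np.array([25.0, 20.0])         # follower joint angles (deg)
print(consensus_error(theta, theta0))  # -> [5., 5.]: each follower lags its reference
```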

3.2 Cooperative Control

The cooperative control algorithm uses the consensus error (which contains information
about the neighbouring agents' states) in the discrete phase-lead controller setup,
creating the following expression:

u_coop,i = Dlead(jω) In ei    (7)

where In ∈ Rⁿˣⁿ is the identity matrix and u_coop,i is the control input, in the form of
a voltage νi, to each robot agent. It is assumed that the coupling between joints is
minimal under the condition that there is no sudden abrupt motion.

3.3 Discrete Phase Lead Controller Algorithm

The discrete phase-lead compensator is designed and implemented to minimise the
response time and maximise the synchronisation effect, based on the transfer function
obtained from the frequency response experiment. The phase-lead compensator's transfer
function is given in the form

Dlead(jω) = (1/√β) · (jω + ωl)/(jω + ωh)    (8)

where ωl , ωh are the lower break and higher break frequency of the controller to be
designed respectively. β is a controller gain coefficient to be designed to satisfy the
performance criteria.
Since the system output is in discrete form, the phase-lead compensator is transformed
into discrete form, and long division is applied to the designed compensator so that it
can be implemented in the Arduino code.
Recalling the transfer function obtained in (5), we may identify the natural bandwidth
of the hardware joint system by observing where the magnitude reaches −6 dB in the
open-loop plot shown in Fig. 3.
The first step in the design of the phase-lead controller is to satisfy the steady-state
performance by increasing the system gain to 1. A Bode plot is drawn again (see Fig. 4)
for this adjusted transfer function, which satisfies the steady-state performance.
From Fig. 4, the phase margin is observed to be,

φuncomp P M = 4.5812◦ (9)

The observed phase margin is too low and requires compensation, which can be achieved
by the discrete phase-lead controller. The additional phase-lead contribution φ_Mlead
and the coefficient β are calculated accordingly:

φ_Mlead = φ_Mcomp − φ_Muncomp + φ_cor = 50.4188°    (10)

φ_Mlead = sin⁻¹[(1 − β)/(1 + β)]  →  β = (1 − 0.7117)/(1 + 0.7117) = 0.1684    (11)

where φ_cor is a phase correction factor in the range of 5°–12°. The compensator's
magnitude contribution at the peak of its phase curve is

|Glead(jω_max)| = 1/√β = 2.437 (≈ 7.7 dB)    (12)

From the negative of this magnitude contribution, we may determine the new gain
crossover frequency ω_max from the Bode plot:

ω_max = 1.145 rad/s = ωl/√β    (13)

→ ωl = ω_max √β = 0.47 rad/s    (14)

→ ωh = ωl/β = 2.791 rad/s    (15)

Fig. 4 Normalised bode plot for one of the joints of robot agent i
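The numbers in (11)–(15) chain together as shown in the short check below; it is an illustrative aid written for this text, using the sine value 0.7117 adopted in (11) and the crossover frequency read from the Bode plot.

```python
import numpy as np

sin_phi = 0.7117                              # value used in Eq. (11)
beta = (1 - sin_phi) / (1 + sin_phi)          # Eq. (11)
peak = 1 / np.sqrt(beta)                      # Eq. (12), linear magnitude
w_max = 1.145                                 # new gain crossover (rad/s), Eq. (13)
w_l = w_max * np.sqrt(beta)                   # Eq. (14)
w_h = w_l / beta                              # Eq. (15)
print(f"beta = {beta:.4f}, 1/sqrt(beta) = {peak:.3f}")
print(f"w_l = {w_l:.3f} rad/s, w_h = {w_h:.3f} rad/s")
# -> beta ≈ 0.1684, peak ≈ 2.437, w_l ≈ 0.470, w_h ≈ 2.791
```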

Consequently, the phase-lead compensator can be written as

Dlead = (2.1277ω + 1)/(0.3593ω + 1)    (16)

The sampling time specified at the microcontroller (adhering to the Nyquist sampling
theorem) is Ts = 0.01 s. Applying the bilinear transformation of the form

ω = (Ts/2) · (z − 1)/(z + 1)    (17)

to discretize the controller for hardware implementation, we arrive at the following
discrete version of the phase-lead compensator:

Dlead(z) = (1.00883z + 0.98758)/(z + 0.99642)    (18)

The designed phase-lead compensator in (18) can be coded in the microcontroller


or digital signal processor by simply performing a long division to establish the
corresponding difference equation,

Dlead (z) = 1.00883 − 0.01764z −1 + 0.01758z −2 − 0.017515z −3 + . . . (19)
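The algebra from (16) through (19) can be checked numerically. The sketch below substitutes the mapping of (17) into the continuous compensator of (16), normalizes the result to reproduce (18), and then gives the equivalent first-order difference equation; it is a verification aid written for this text, not the original Arduino implementation.

```python
import numpy as np

Ts = 0.01                            # sampling time (s)

# Substituting w = Ts*(z - 1) / (2*(z + 1)) (Eq. 17) into a*w + 1 gives
# [(a*Ts/2 + 1)*z + (1 - a*Ts/2)] / (z + 1); the (z + 1) factors cancel in D(z).
def discretize(a):
    return np.array([a * Ts / 2 + 1.0, 1.0 - a * Ts / 2])

num_d = discretize(2.1277)           # numerator of Eq. (16): 2.1277*w + 1
den_d = discretize(0.3593)           # denominator of Eq. (16): 0.3593*w + 1
num_d, den_d = num_d / den_d[0], den_d / den_d[0]   # leading denominator coeff -> 1
print("D(z) numerator:", num_d, " denominator:", den_d)
# -> approx [1.00883, 0.98758] and [1.0, 0.99641], matching Eq. (18).

# Equivalent difference equation for the microcontroller:
# u[k] = -den_d[1]*u[k-1] + num_d[0]*e[k] + num_d[1]*e[k-1]
def lead_step(e_k, e_prev, u_prev):
    return -den_d[1] * u_prev + num_d[0] * e_k + num_d[1] * e_prev
```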

Remark. For the sake of brevity, the control design and analysis here are shown only
for one of the robot joints. It is worth noting that, in practice, although all the
commissioned robot agents are identical in terms of kinematic configuration, each of
the actuators and feedback sensors exhibits quite different characteristics due to wear
and tear. The frequency responses measured for the other joints also differ in
magnitude and phase. Certain joints were observed to operate in a narrower joint-angle
band, and backlash characteristics also differed from one robot agent platform to
another. However, the principled approach to system modeling and control design
elucidated in this paper remains valid for other types of robotic arm. Future work
employing more advanced nonlinear control design techniques and enhanced robotics
instrumentation is feasible.

4 Results

With a low baud rate, the rate at which the current angular position was sent from the
leader robot agent to the neighbouring robot agents was slower than with a high baud
rate, and the slave response was correspondingly slower. To maximise the real-time
synchronisation effect, the maximum baud rate (1000 kbps) was selected for the CAN bus.
Observing Figs. 5 and 6, despite the inherent noise emanating from the aged encoder
signals, all the robot agents (i = 0, 1, 2) are able to track the desired trajectory
satisfactorily. Robot 1 reads its own sensor feedback, while robot 2 takes the sensor
feedback of robot 1, received through the CAN bus, as its input. Thus, errors due to
sensor feedback are also amplified, affecting the tracking performance of robot 2. A
hardware or software filter is recommended to eliminate the noise, since the input to
each slave depends on the sensor feedback of the preceding robot. As mentioned, the
phase-lead controller is a linear controller, which poses some limitations in handling
nonlinearities such as backlash and saturation. It should be noted that in a practical
system (Fig. 7) these nonlinearities do exist, and it depends entirely on the knowledge
and experience of the control engineer to formulate a suitable compensator to overcome
them. Actuator saturation may exist and can be overcome by means of dead-zone
compensation. Nonlinearities exist across all the link members and can be resolved by
nonlinear feedback compensation taking into account the robot arm mass inertia,
Coriolis/centrifugal effects and the gravity effect.

Fig. 5 Preliminary evaluation on the selection of CAN bus communication bandwidth

Fig. 6 Hardware setup of the cooperative control of 3 articulated robotic arms

Fig. 7 Consensus tracking error for joint angle k of all robot agents

5 Conclusion

The network-based cooperative synchronization control has been developed for a
cooperative task between the master, slave 1 and slave 2 robot arms. The frequency
response approach was successfully applied to obtain the system transfer functions. A
phase-lead compensator was chosen as the most suitable controller for the
synchronization task owing to its advantages in improving the transient response with
only a small change in steady-state error, despite its known tendency to emphasize
high-frequency noise. The synchronization control was validated through performance
analysis using data logged to Excel via PLX-DAQ. The implementation of the CAN bus as
the communication medium allows information exchange between the robot arms, which
send and receive signals to each other through the two CAN bus nodes. By using the
error equation developed from the communication network, the cooperative task is
performed. In conclusion, the objectives of this project have been achieved
successfully. The outcomes of this project show that distributed network-based
cooperative control using phase-lead compensation has a substantial effect on the
synchronization control between robot arms.

References

1. Gereald M, Peter Z (2017) Industrial Robots Meet Industry 4.0, Hadmérnök (XII) IV, pp 230–
238
2. Haddara M, Elragal A (2015) The readiness of ERP systems for the factory of the future.
Procedia Comput Sci 6(64):721–728
3. Hecker M, Howe K, Russo M, Küpper D, Spindelndreier D, Whiteman S, Zinser M (2015)
Industry 4.0: The Future of Productivity and Growth in Manufacturing Industries
4. Vysocky A, Novak P (2016) Human-robot collaboration in industry. Sci J 9:903–906
5. Anussornnitisarn P, Nof SY, Etzion O (2005) Decentralized control of cooperative and
autonomous agents for solving the distributed resource allocation problem. Int J Prod Econ
98(2):114–128
6. Migliavacca M, Bonarini A, Matteucci M (2013) RTCAN: a real-time CAN-bus protocol for
robotic applications. In: Proceedings of the 10th international conference on informatics in
control, automation and robotics, ICINCO 2013, vol 2, pp 353–360
7. Mabkhot MM, Al-Ahmari AM, Salah B, Alkhalefah H (2018) Requirements of the smart
factory system: a survey and perspective. Machines 6:23
8. Antonelli G, Arrichiello F, Caccavale F (2014) Decentralized time-varying formation control
for multi-robot. Int J Robot Res 33(7):1029–1043
9. Stoian V, Bobasu E (2015) Control algorithm for a cooperative robotic system in fault condi-
tions. In: 2015 12th international conference on informatics in control, automation and robotics
(ICINCO), Colmar, France

10. He Y, Wu M, Liu S (2018) Decentralised cooperative mobile manipulation with adaptive


control parameters. In: 2018 IEEE conference on control technology and applications (CCTA),
Denmark, Copenhagen
11. Lu M, Liu L (2018) Adaptive leader-following consensus of networked uncertain Euler-
Lagrange systems with dynamic leader based on sensory feedback. In: 2018 15th international
conference on control. automation, robotics and vision (ICARCV), Singapore, Singapore
12. Kocian J, Skovajsa L, Vojcinak P, Kotzian J (2009) Robotic arm controlled by CAN bus. In:
9th IFAC workshop on programmable devices and embedded systems, pp 92–95
13. Lin Z, Wang T, Gao Q, Liu Y (2011) Design of robot platform based on CAN bus. In: 2011
international conference on electrical and control engineering
14. Marino A (2018) Distributed adaptive control of networked cooperative mobile manipulators.
IEEE Trans Control Syst Technol 26(5):1646–1660
15. Khan SG, Bendoukha S, Mahyuddin MN (2018) Dynamic control for human-humanoid inter-
action. In: Humanoid robotics: a reference. Springer, Heidelberg, pp 1–29
16. Mahyuddin MN, Herrmann G (2013) Distributed motion synchronisation control of humanoid
arms. In: 2013 FIRA RoboWorld congress. Springer, Heidelberg, pp 21–35
17. Mahyuddin MN, Herrmann G, Lewis FL (2013) Distributed adaptive leader-following control
for multi-agent multi-degree manipulators with finite-time guarantees. In: 52nd IEEE confer-
ence on decision and control, Florence, pp 1496–1501
18. Mahyuddin MN, Herrmann G (2013) Cooperative robot manipulator control with human ‘pin-
ning’ for robot assistive task execution. In: Herrmann G, Pearson MJ, Lenz A, Bremner P,
Spiers A, Leonards U (eds) Social robotics, ICSR 2013. LNCS, vol 8239. Springer, Cham
19. Zhang HW, Lewis FL (2012) Adaptive cooperative tracking control of higher-order nonlinear
systems with unknown dynamics. Automatica 48(7):1432–1439
20. Peng Z, Wang D, Sun G, Wang H (2014) Distributed cooperative stabilisation of continuous
time uncertain nonlinear multi-agent systems. Int J Syst Sci 45(10):2031–2041
21. Wang W, Wang D, Peng ZH (2015) Cooperative fuzzy adaptive output feedback control
for synchronisation of nonlinear multi-agent systems under directed graphs. Int J Syst Sci
46(16):2982–2995
22. Wang J, Chen K, Lewis FL (2017) Coordination of multi-agent systems on interacting physical
and communication topologies. Syst Control Lett 100:56–65
23. Lewis FL, Zhang H, Hengster-Movric K, Das A (2014) Cooperative control of multi-agent
systems. Springer, London
24. Jiao Q, Modares H, Lewis FL, Xu S, Xie L (2016) Distributed L2-gain output-feedback control
of homogeneous and heterogeneous systems. Automatica 71:361–368
25. Roman RC, Radac MB, Precup R-E (2016) Multi-input-multi-output system experimental
validation of model-free control and virtual reference feedback tuning techniques. IET Control
Theory Appl 10(12):1395–1403
26. Safaei A, Koo YC, Mahyuddin MN (2017) Adaptive model-free control for robotic manipula-
tors. In: Proceedings of the IEEE international symposium on robotics and intelligent sensors
(IRIS2017), Ottawa, Canada, October 2017, pp 7–12
27. Safaei A, Mahyuddin MN (2018) Adaptive model-free control based on an ultra-local model
with model-free parameter estimations for a generic SISO system. IEEE Access 6:4266–4275
28. Safaei A, Mahyuddin MN (2018) Optimal model-free control for a generic MIMO nonlinear
system with application to autonomous mobile robots. Int J Adapt Control Signal Process
29. Cai H, Lewis FL, Hu G, Huang J (2017) The adaptive distributed observer approach to the
cooperative output regulation of linear multi-agent systems. Automatica 75:299–305
30. Modares H, Nageshrao SP, Delgado Lopes GA, Babuska R, Lewis FL (2016) Optimal model-
free output synchronization of heterogeneous systems using off-policy reinforcement learning.
Automatica 71:334–341
31. Peng ZH, Wang D, Sun G, Wang H (2014) Distributed cooperative stabilisation of continuous
time uncertain nonlinear multi-agent systems. Int J Syst Sci 45(10):2031–2041
32. Mahyuddin MN, Herrmann G, Lewis FL (2013) Distributed adaptive leader-following control
for multi-agent multi-degree manipulators with finite-time guarantees. In: 2013 IEEE 52nd
conference on decision and control (CDC2013), Florence, Italy, pp 1496–1501

33. Mahyuddin MN, Herrmann G, Na J, Lewis FL (2012) Finite-time adaptive distributed control
for double integrator leader-agent synchronisation. In: 2012 IEEE international symposium on
intelligent control (ISIC), Dubrovnik, Croatia, pp 714–720
34. Mahyuddin MN, Safaei A (2017) Robust adaptive cooperative control for formation-tracking
problem in a network of non-affine nonlinear agents. In: Rocha J (ed) Multi-agent systems.
InTech
35. Safaei A, Mahyuddin MN (2017) Adaptive model-free consensus control for a network of
nonlinear agents under the presence of measurement noise. In: Asian control conference
(ASCC2017), Gold Coast, Australia, December 2017, pp 1701–1706
36. Li Z, Duan Z (2015) Cooperative control of multi-agent systems. CRC Press/Taylor and Francis
Group, Boca Raton
37. Safaei A, Mahyuddin MN (2017) An optimal adaptive model-free control with a Kalman-filter-
based observer for a generic nonlinear MIMO system. In: Proceedings of the 2017 IEEE 2nd
international conference on automatic control and intelligent systems (I2CACIS 2017), Kota
Kinabalu, Malaysia, October 2017, pp 56–61
EEG Signal Denoising Using Hybridizing
Method Between Wavelet Transform
with Genetic Algorithm

Zaid Abdi Alkareem Alyasseri, Ahamad Tajudin Khader,


Mohammed Azmi Al-Betar, Ammar Kamal Abasi,
and Sharif Naser Makhadmeh

Abstract The most common and successful technique for signal denoising with
non-stationary signals, such as electroencephalogram (EEG) and electrocardiogram
(ECG) is the wavelet transform (WT). The success of WT depends on the optimal con-
figuration of its control parameters which are often experimentally set. Fortunately,
the optimality of the combination of these parameters can be measured in advance
by using the mean squared error (MSE) function. In this paper, genetic algorithm
(GA) is proposed to find the optimal WT parameters for EEG signal denoising. It is
worth mentioning that this is the initial investigation of using optimization method
for WT parameter configuration. This paper then examines which efficient algo-
rithm has obtained the minimum MSE and the best WT parameter configurations.
The performance of the proposed algorithm is tested using two standard EEG dataset,
namely, EEG Motor Movement/Imagery dataset. The results of the proposed algo-
rithm are evaluated using five common criteria: signal-to-noise-ratio (SNR), SNR
improvement, mean square error (MSE), root mean square error (RMSE), and per-
centage root mean square difference (PRD). In conclusion, the results show that the
proposed method for EEG signal denoising can produce better results than manual
configurations based on ad hoc strategy. Therefore, using metaheuristic approaches
to optimize the parameters for EEG signals positively affects the denoising process
performance of the WT method.

Keywords EEG · Signal denoising · Wavelet transform · Metaheuristic


algorithms · Genetic algorithm

Z. A. A. Alyasseri (B) · A. T. Khader · A. K. Abasi · S. N. Makhadmeh


School of Computer Sciences, Universiti Sains Malaysia, Gelugor, Pulau Pinang, Malaysia
e-mail: [email protected]
Z. A. A. Alyasseri
ECE Department, Faculty of Engineering, University of Kufa, Najaf, Iraq
M. A. Al-Betar
IT Department, Al-Huson University College, Al-Balqa Applied University, Irbid, Jordan

© Springer Nature Singapore Pte Ltd. 2021 449


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_31

1 Introduction

Electroencephalogram (EEG) is a graphical recording of brain electrical activity that


is recorded from the scalp. This recording represents the voltage fluctuations resulting
from ionic current flows within the neurons of the brain [1, 2]. Therefore, EEG
signals can provide most of the required information about brain activity. EEG signals
from the brain are captured using invasive or non-invasive techniques [3]. The main
difference between these techniques is that the invasive approach involves the use of
electrode arrays implanted inside the brain, such as ECoG BCI for arm movement
control [4, 5]. Meanwhile, there are several techniques to record the brain activity
can also be captured using different types of signal capturing devices, including EEG
for electrical activity from the scalp, MEG for magnetic field fluctuations caused by
electrical activity in the brain, and fMRI and fNIR for changes in blood oxygenation
level resulting from neural activity [4, 6, 7]. In [8], Berger proposed for the first time
the use of EEG signals as a non-invasive technique for capturing brain activities. Over
the past several decades, researchers have developed Hans’s technique to suit multiple
applications. For instance, EEG signals have been used in medical applications for
prevention, detection diagnosis, rehabilitation and restoration. This technique has
also been used for non-medical applications, such as education and self-regulation,
neuromarketing and advertisement, neuroergonomics and smart environment, games
and entertainment, and learning and education [9, 10]. Recently, EEG signals have
been used as a new biometric technique in security and authentication applications
[1, 9].
In general, several artifact noises can corrupt the original EEG signal during its
recording time, such as eye blink, eye movements, muscle activity, and interference
of electronic device signals [11]. Therefore, the EEG signal must be processed to
reduce such noise. Several EEG noise removal techniques have been proposed in the
literature, such as filtering and adaptive thresholding. Recently, wavelet transform
(WT) has been successfully applied for denoising non-stationary signals, including
ECG and EEG [12–16].
Kumari et al. in [1] proposed a user identification system on the basis of EEG
signal collected from six users using EMOTIVE EPOC headset with 14 channels.
These researchers used wavelet transform (WT) for EEG signal denoising where a
db4 mother wavelet function (MWF) is used with five levels of signal decompo-
sition. They tested their method using the EEG dataset established in [17]. After-
wards, the same authors investigated several cognitive tasks to design an individual
identification system [18]. These researchers used standard EEG datasets related to
motor/movement and imaginary tasks [19] with only one channel (i.e. Cz) to obtain
an input signal. In addition, the authors used WT to decompose the EEG signal into
five levels and then extract four features from each EEG sub-band. Al-Qazzaz et
al. [13, 20] conducted a comparative study to determine the efficient MWFs that
can provide high signal characteristics for an EEG channel. These authors tested 45
MWFs that are categorized into Daubechies, Symlets and Coiflets families. An MWF
called ‘sym9’ showed efficient results in nearly all brain regions. The same team of
EEG Signal Denoising Using Hybridizing Method ... 451

researchers applied WT with independent component analysis to decompose the


EEG signals for obtaining an efficient feature for discriminating stroke-related mild
cognitive impairment and vascular dementia [21]. Reddy et al. [22] proposed WT
for processing the EEG signal. These authors applied WT to EEG signal denoising
and used db8 as an MWF with eight EEG signal decomposition levels. Furthermore,
these authors classified the EEG signal on the basis of the features that are extracted
from the WT signal denoising process [23].
Mowla et al. [24] introduced a new method for removing EMG and electrooculo-
gram (EOG) artifacts from the original EEG signal. The proposed method used two
scenarios for removing these artifacts. In the first scenario, the EMG artifacts were
processed using a combination method where the EEG signal was firstly processed
using canonical correlation analysis, and the output signal will then be reprocessed
by a stationary WT (SWT). A second-order blind identification approach followed
by SWT was used for removing EOG artifacts. The results of the proposed method
showed that combining the techniques provided more effective results than using
each technique individually.
Yang et al. in [25] proposed an artificial method for removing the EOG artifacts
from the EEG raw. The proposed method (CCA-EEMD) involves three steps. In
the first step, the input EEG signal proposed using CCA to spread the EOG. In the
second step, the EOG will be decomposed into multi-level and apply intrinsic mode
functions (IMFs) using EEMD approach. Finally, the clear EEG data are ready to use
and extract more features. The (CCA-EEMD) tested using seven subjects. The results
show that the (CCA-EEMD) method it is not only EOG removal method but also it
can keep the EEG features to the maximum extent. Torabi et al. in [26] introduced
a combining method between nonlinearity EEG features and wavelet coefficients
for improving the performance of the recognition rate classification. The proposed
method applied a linear SVM classifier and the effect of the combining technique
shown significant improvement in the classification results from (54%) to (73%).
Furthermore, the proposed method has been also applied for feature selection for the
same problem, while it is selected up (44%) for nonlinear features.
Several techniques have been proposed for EEG feature extraction. A comprehen-
sive analysis and review of EEG decomposition methods for feature extraction have
been presented [27]. For example, Wang et al. [28] introduced a new method for EEG
feature extraction using spatiotemporal analysis with multivariate linear regression
to improve the accuracy detection of SSVEP features. Zhang et al. [29] proposed a
new algorithm for EEG feature extraction on the basis of common spatial pattern
with motor imagery classification. The proposed method used boost classification to
improve the accuracy rate of the MI EEG. The proposed method was tested using
three public EEG datasets from BCI competition. The performance of the TSGSP
reached 88.5% for these datasets. Jiao et al. [30] proposed a new technique (SGRM)
for EEG classification that is based on reducing the number of training samples for
EEG data by implementing a new representation for the non-zero coefficient samples.
For EEG classification, Zhang et al. [31] proposed the combination of classification
methods between sparse Bayesian and Laplace priors.
452 Z. A. A. Alyasseri et al.

In general, WT has five parameters with each parameter having different types
(Table 1). The efficiency of EEG signal denoising depends on the selection of the
best combination of WT parameters. The selection is usually performed based on
experience or empirical evidence. In previous research, the WT parameter configu-
ration is formulated as an optimization problem with MSE as its objective function
[15]. As aforementioned, WT has five parameters, namely, (i) MWF Φ, (ii) decom-
position level L, (iii) thresholding function β, (iv) threshold selection rules λ, and
(v) threshold re-scaling methods ρ. Each of these parameters has several values and
is used for a specific denoising level. The optimal values of these parameters are
required to empower WT in the denoising process. For ECG signals, El-Dahshan in
[12] attempted to obtain the optimal configuration using GA, the results were better
than those that were produced experimentally. Alyasseri et al. [14, 32] proposed
a hybrid scheme for non-stationary signals denoising, such as ECG and EEG that
is based on β-hill climbing (βhc) optimization algorithm [33] with WT to obtain
the optimal wavelet parameters. The proposed method (βhc-WT) was tested using
an MIT-BIH dataset [34], where the original ECG signal was corrupted with white
Gaussian noise (WGN) using different input SNR noises that corrupted the ECG
from 0 to 40 dB. The performance of the βhc-WT method was evaluated using min-
imum squared error (MSE) and SNR. The proposed method successfully removed
WGN from the ECG and EEG signals [14–16, 32].
The main objective of this paper is to propose genetic algorithm (GA) for optimal
settings of WT parameters. Therefore, a new GA version of WT, called (GA-WT)
is tested in an experiment. The original EEG signal benchmark taken from Motor
Movement/Imagery dataset 1 is used for the evaluation process [19]. To evaluate
the performance of the GA, EEG signals are corrupted using three different noise
mechanisms, including power line noise (PLN), electromyogram (EMG), and white
Gaussian noise (WGN) [12, 35, 36]. Initially, each GA generates optimal parameter
settings for WT to denoise the EEG signal of each dataset. Afterward, the denoisined
results are evaluated using five measurement factors, namely, SNR, SNR improve-
ment, MSE, RMSE, and PRD. For comparative evaluation, the denoising results of
the GA method. Interestingly, FPA-WT achieves efficient EEG signal denoising for
EMG and WGN datasets. In addition, FPA-WT and GA-WT obtain the best denois-
ing levels for PLN dataset. In conclusion, FPA is the best algorithm that can be
incorporated with WT to achieve an efficient EEG signal denoising.
This paper is organized as follows. Section 2 provide a background to Wavelet
Transform (WT). Section 2.1 presents a Wavelet denoising principle for EEG signal
denoising. Genetic algorithm presents in Sect. 3. The hybrid scheme between meta-
heuristic algorithms and WT explains in Sect. 4. The results and discussion presents
in Sect. 5. Finally, the conclusions and future works describes in Sect. 6.

1 https://www.physionet.org/physiobank/database/eegmmidb/.
EEG Signal Denoising Using Hybridizing Method ... 453

2 Wavelet Transform

Wavelet Transform (WT) is a common and powerful tool for representing signals
in the time-frequency domain. WT has been successfully used for non-stationary
signals, such as ECG and EEG, to address several problems, such as those related to
signal compression, feature selection, and signal denoising [14, 37, 38]. Recently,
WT has been extensively tailored for non-stationary signals because of its powerful
performance in removing several EEG artifact noises that can corrupt the original
EEG signal during its recording time. These noises include eye blinking noise, eye
movement noise, muscle activity noise, electromyogram (EMG) noise, and interfer-
ence of electronic device signals [39–41].

2.1 Wavelet Denoising Principle for Non-stationary Signals

As aforementioned in Sect. 2, WT is a powerful tool for time-frequency domain


representation. This technique represents the signal on the basis of the correlation
between the translation and the dilation of MWF [12, 42, 43]. In general, the problems
solved by WT can be categorized into two WT versions, namely, continuous wavelet
transform (CWT) and discrete wavelet transform (DWT) [44]. In this paper, DWT
has been proposed for EEG signal decomposition whereby inverse DWT (iDWT)
is used for EEG signal reconstruction. DWT was originally established in [45] as
the so-called Donoho’s approach. In general, DWT decomposes a signal by using
set of filtering (i.e., low pass and high pass filters) to product the approximation and
details coefficients, respectively. The main objective of using DWT is to decompose
the input signal via different coefficient levels to correct the high frequency of the
input signals [46]. In other word, DWT decomposes the EEG signal into several
frequency bands because it assumed that the artifacts will have large amplitudes
in the respective frequency bands. Normally, the denoising process involves three
phases:
– EEG signal decomposition phase: Assuming the original EEG signals with n
samples x(t) = [x(1), x(2), ..., x(n)] will be divided into three levels, and each
level will be decomposed into two parts, namely, approximation coefficients (c A)
and detail coefficients (cD). cD will be processed using a high-pass filter, while
c A will continue to be decomposed for the next level.


c Ai (t) = c Ai−1 (k)φi (t − k)
k=−∞


cDi (t) = cDi−1 (k)Ψi (t − k) (1)
k=−∞
454 Z. A. A. Alyasseri et al.

where c Ai (t), cDi (t) denotes the approximation and detail coefficients of level i,
Ψ , φ refers to scaling and shifting, respectively.
– Applying thresholding phase: A threshold value is defined for each level according
to the noise level of the coefficient.
– Reconstruction phase: The EEG denoised signal is reconstructed using iDWT.
The formula of iDWT as follows [24]:


 
L ∞

E E G clean (t) = c A L (k)φi (t − k) + cDi+1 (k)Ψi (t − k)
k=−∞ i=1 k=−∞

where E E G clean (t) denotes the reconstructed EEG signal, i refers to decomposi-
tion level (Fig. 1),

Fig. 1 EEG denoising process taken from [2, 7]


EEG Signal Denoising Using Hybridizing Method ... 455

Table 1 The ranges of the WT denoising parameters Method (range)


wavelet denoising parameters
Mother wavelet function Φ Symlet (sym1..sym45),
Coiflet (coif1..coif5),
Daubechies (db1..db45), and
Biorthogonal (bior1.1..
bior1.5&bior2.2 .. bior2.8&
bior3.1..bior3.9)
Thresholding function β soft or hard threshold
Decomposition level L 5
Thresholding selection rule λ Heursure, Rigsure,
Sqtwolog, and Minimax
Re-scaling approach ρ one, sln, mln

Signal noise removal is considered a challenging task in signal processing [47,


48]. Therefore, researchers have developed several approaches to solve this problem,
such as using the filtering technique [49, 50], thresholding technique [6, 51, 52], and
other techniques [53]. WT is one of the powerful techniques for non-stationary signal
denoising [43, 54, 55]. WT has five parameters, with each parameter having different
types (Table 1) the success of EEG signal denoising relies on the selection of WT
parameters. The wavelet denoising parameters are defined in three phases. In the
decomposition phase, the first parameter, namely, MWF (Φ), is used in the EEG
signal decomposition task. The second WT parameter, namely, the decomposition
level (L), is also selected in the decomposition phase based on the EEG signal and
experience.
The third parameter, namely, thresholding functions (i.e, β)), can be divided into
hard and soft thresholding [45, 51]. The thresholding types (soft or hard) in the
second phase must be selected along with the fourth parameter, namely, the selec-
tion rules (λ), and the fifth parameter, namely, the rescaling methods (ρ). These
threshold mechanisms must be applied because the selection will affect the global
denoising performance. The thresholding value is generally defined based on the
standard deviation (σ ) of the noise amplitude [12]. Tables 2 and 3 provide the dif-
ferent types of parameters for the thresholding selection rule and rescaling methods.
The thresholding rules are selected according to Eq. (2).

E E G noisy (n) = x(n) + σ e(n) (2)

where x(n) is the original EEG signal, e is the noise, σ is the amplitude of the noise,
and n is the number samples. The wavelet parameters (β, λ, and ρ) must be separately
applied for each wavelet coefficient (approximation and details) level.
In the last phase, the denoised EEG signal is reconstructed by iDWT as shown in
Eq. (2.1).
456 Z. A. A. Alyasseri et al.

Table 2 Thresholding Thresholding selection rule Description


selection rules
Rule 1: Rigrsure Threshold is selected using
the principle of Stein’s
Unbiased
Risk Estimate (SURE)
Rule 2: Sqtwolog Threshold
√ is selected equal
to (2log M)
Rule 3: Heursure Threshold is selected
according to mixture
(Rigrsure and Sqtwolog)
Rule 4: Minimaxi Threshold is selected equal
to Max(MSE)

Table 3 The wavelet Wavelet threshold rescaling Rescaling


thresholding rescaling methods ρ
methods
one No scaling
sln Single level
mln Multiple level

3 Genetic Algorithm

GA was developed in [56] to mimic the natural phenomenon of Darwin evolution


theory. Based on the ‘survival of the fittest’ principle, GA starts with many solutions,
with each solution being a vector of decision variables and each decision variable
having a specific range of values. In evolution context, the set of solutions is equiva-
lent to population, each solution is analogous to chromosome, each decision variable
is analogous to gene, and each value of the decision variables is analogous to allele.

Algorithm 1. Genetic Algorithm pseudo-code


1: X chr om ← Generate_I nitital_Population
2: Evaluate(X chr om )
3: while (Stopping criterion is not met) do
4: X chr om  ← Selection(X chr om )
5: X chr om  ← Crossover (X chr om  )
6: X chr om  ← Mutation (X chr om  )
7: Evaluate(X chr om  )
8: X chr om ← Replacement (X chr om  ∪ X chr om )
9: end while

In order to apply a successful GA to COPs, both the objective function and prob-
lem representation must be properly adjusted together with parameter tuning. GA
typically has a set of parameter, including the size of the population Psi ze , the number
of generations Pno , the crossover rate Pcr ossover , and the mutation rate Pmutation . In
EEG Signal Denoising Using Hybridizing Method ... 457

order to build an efficient and robust GA, the parameter settings of each COP must
be closely examined.
Algorithm 1 shows the high-level schematic pseudo-code of GA that starts with a
population of candidate solutions X chr om , where X chr om is an augmented matrix of
size Psi ze × N and N is the number of decision variables in each solution. Initially, the
population X chr om is filled with random candidate solutions across the problem search
space, that is, X chr om = {X chr om 1 , X chr om 2 , . . . , X chr om Psi ze }. Each candidate solution
X chr om i is evaluated based on an objective function. The improvement loop in GA
(see Algorithm 1, line 3 to 9) repeats the following steps until a termination criterion
is met: select the parents (new population X chr om  ) that will be used to generate
the next population which will pairwise crossover with a probability of Pcr ossover to
come up with a new population X chr om  . Afterward, each pairwise solution will be
checked if it must be mutated with probability Pmutation to come up with X chr om  .
The new population will be reevaluated, and the X chr om  will be substituted with the
population X chr om based on such selection method. This procedure is followed to
determine whether the offsprings are fit or not. This process will be repeated several
times until an optimal solution is reached.

4 Meta-Heuristic Algorithms and Wavelet Transform for


EEG Signal Denoising: Proposed Method

This section provide a full discussion for the proposed methodology of the meta-
heuristic algorithms with wavelet transform to solve EEG signal denoising problem.
Algorithm 2 shows the pseudocode of the proposed method framework. The proposed
methodology run through four phases where the result of each phase is an input to the
consecutive one. The four phases are presented in Fig. 2 and thoroughly described
as follows:

Algorithm 2. Tuning WT parameters using a meta-heuristic algorithms for EEG


signal denoising
1: Initialize noisy EEG signal (nEEG), calculate the SNR, MSE, RMSE, and PRD for
input EEG signal.
2: Initialize meta-heuristic operators, initialize solution(s) X i (i = 1, 2, .., N ) N=5 wavelet parameters,
the initial solution X i (Φ ,L,β ,λ,ρ )
 = Metheuristic ( X , X )
3: X opt i
4: EEGDenoiseSignals=WT ( X opt  ,nEEG)

5: EEGOutSignals=Evaluate(EEGDenoiseSignals, S N Rout , S N Rimp , MSE, RMSE, PRD).

Phase I: Initialization. This phase involves three steps: firstly, reading the input
EEG signal x(n) from its source. The WT denoising approach was developed
based on the original EEG signal being corrupted with white Gaussian noise
(WGN), Power Line Noise (PLN), and Electromyogram (EMG) estimation
458 Z. A. A. Alyasseri et al.

Fig. 2 Proposed method for EEG denoising

[12, 35, 36]. Where these noises are exactly simulating the noises which will
corrupt the original EEG signal during the recording time such as eye blink
noise, eye movement noise, electro signal distortion, etc. In this paper, the orig-
inal EEG signals are provided then the signals corrupted by PLN using Eq. (3)
EEG Signal Denoising Using Hybridizing Method ... 459

followed by signals corrupted by EMG using Eq. (4) followed by signals cor-
rupted by WGN using Eq. (5) are given. These three types of noises corruption
EEG signals are used as a dataset to evaluate the performance of proposed
methods.
N (t) = A ∗ sin(2 ∗ π ∗ f ∗ t) (3)

N (t) = E ∗ rand(t) (4)

N (t) = x(t) + σ (5)

where A = 60 uV, E = (0–10) uV, f = 60 Hz, e is the noise, σ is the amplitude


of the noise in this work σ = 15 μV. The N signal is added to the original EEG
signal x to simulate PLN, EMG, and WGN respectively.
Secondly, initialize WT denoising parameters (Φ, L, β, λ, ρ) which are shown in
Table 4, as well as the parameter for genetic algorithm is also initialized. Finally,
compute the signal to noise ratio (SNR) by Eq. (15), percentage of root mean
square difference (PRD) by Eq. (14), mean square error (MSE) by Eq. (6), and
root mean square error (RMSE) by Eq. (17). This is to record the results of EEG
signals before and after denoising process (Fig. 3).
Phase II: Tuning WT parameters by GA. In the proposed methodology, GA is
adapted to find the optimal WT parameters which can be used for EEG signal
denoising problem. Initially, the solution of WT parameters configuration is rep-
resented as a vector x = (x1 , x2 , . . . xn ) where n is the total number of parame-
ter used for WT which is normally equal to 5. x1 represent the value of mother
wavelet function parameter Φ, x2 denotes the value of decomposition level param-
eter L, x3 refers to the thresholding method β, x4 represents the value of thresh-
olding selection rule parameter λ, and x5 represents the re-scaling approach ρ,

Original EEG Signal


200
uV

−200
0 200 400 600 800 1000 1200 1400 1600 1800 2000
Noisy EEG Signal with PLN
200
uV

−200
0 200 400 600 800 1000 1200 1400 1600 1800 2000
Noisy EEG Signal with EMG
400

200
uV

−200
0 200 400 600 800 1000 1200 1400 1600 1800 2000
Original EEG Signal + WGN Noise
500
uV

−500
0 200 400 600 800 1000 1200 1400 1600 1800 2000
Time, in Milliseconds

Fig. 3 EEG signal corrupted using PLN, EMG, and WGN noise
460 Z. A. A. Alyasseri et al.

where the possible range for these parameters are selected from Table 1. Figure 4
shows an example solution of WT parameters for denoising EEG signals. The
selected metaheuristic algorithm evaluates the solution using the MSE objective
function which is formulated in Eq. (6).

1 
N
MSE = [x(n) − 
x (n)]2 (6)
N n=1

where x(n) denotes the original EEG signal and  x (n) is the denoised EEG signal
obtained by tuning the wavelet parameters using the meta-heuristic algorithm.
Iteratively, the randomly generated solution(s) undergoes refinement using the
selected meta-heuristic algorithm. The final output of this phase is an optimized

solution xopt = (x1 , x2 , . . . xn ) which will be passed to the next phase.

Phase III: EEG denoising using WT based on xopt . As aformentioned in Sect. 2.1,
the denoising process of WT involves three main steps that are described in more
details below:
• EEG signal decomposition using DWT. In this step the DWT is applied to
decompose the noise of the input EEG signals x(n). In decomposition process,

we must use the first two xopt parameters, namely, the mother wavelet furcation
ρ and the decomposition level L). The noisy EEG signal is divided at each level
into cA and cD. The latter is processed using a high-pass filter, while the former
is processed using a low-pass filter and is decomposed for the next level.
The EEG signal is convolved using the high-pass and low-pass filters, while
the block(↓2), which is represented by the downsampling operator, is used to keep
the even index elements of the EEG signal. The EEG signals are separated into
cA and cD based on their frequency and amplitude.
• The second step of EEG denoising is Thresholding which is applied based on
the noise level of the coefficients. In this step, the last three wavelet parameters,
namely, the thresholding type (β), the thresholding selection rules (λ), and the

re-scaling methods (ρ), must be selected from xopt .
According to [57], using a thresholding operation on the input noisy non-
stationary signal X can estimate the denoised EEG signal as follow:

Z = THR( 
X , δ), (7)

where the THR denotes a thresholding function, while δ denotes a threshold value.
The EEG denoising performance in the wavelet domain depends on the estimation
of δ. Therefore, several methods have been proposed for estimating δ. Donoho and
Johnstone [45] calculated the threshold δ on an orthonormal basis as follows

δ = σ 2log M (8)

where σ represents the standard deviation of DWT detail coefficients, while M


denotes the length vector of the DWT coefficients. Given that the threshold value
EEG Signal Denoising Using Hybridizing Method ... 461

δ only depends on cD and that cA has a low frequency EEG signal and the highest
amount of energy. We estimate the value of δ based on the coefficients level as
follows:
xd (l) = T H R(
 xd (l), δl ), l = 1, 2, .... (9)

where xd represents a vector of threshold DWT detail coefficients, l denotes a


wavelet decomposition level, and δl denotes the threshold value determined for
that level. The wavelet generally provides two standard types of thresholding
functions (β), namely, hard and soft thresholding [45, 51]. The different between
hard and soft thresholding are described as follows:

|
xdi (l)| − δl |
xdi (l)| ≥ δl
xdi (l) =
 (10)
0 |
xdi (l)| < δl

xdi (l) |
xdi (l)| ≥ δl
xdi (l) =
 (11)
0 |
xdi (l)| < δl

where i denotes the index of the DWT details coefficients at a level l. The thresh-
olding DWT coefficients can be expressed as follows:

 xd (1) 
X = [ xd (2) 
xa (2)] (12)

• Reconstruction of the denoising EEG signal by iDWT. We estimate the value


of the original EEG signals 
X by applying iDWT on 
X as follows:


 
L ∞

z[n] = c A L (k)φi (n − k) + cDi+1 (k)Ψi (n − k) (13)
k=−∞ i=1 k=−∞

The reconstruction convolves the EEG signals using upsampling (↑2), which
involves the insertion of zeros at the even index elements of EEG signals. Figure 1
shows the iDWT procedure for five levels as an example.
Phase V: EEG Denoising Evaluation The final phase is evaluating the EEG output
of WT. The evaluation will done based on five criteria which are: Signal-to-Noise-
Ration (SNR), SNR improvement, Mean Square Error (MSE) Eq. (6), Root Mean
Square Error (RMSE), and percentage root mean square difference (PRD).


 N
n=1 [x(n) − x (n)]2
P R D = 100 ∗  N
(14)
n=1 [x(n)]
2

N
n=1 [x(n)]
2
S N Rout = 10 log10 N
(15)
n=1 [x(n) −
x (n)]2
462 Z. A. A. Alyasseri et al.

Fig. 4 Solution of WT parameters for denoising EEG signals using MOFPA

N
n=1 [δ(n) − x(n)]
2
S N Rimp = 10 log10 N
(16)
n=1 [x(n) − 
x (n)]2


1  N
RMSE =  [x(n) − 
x (n)]2 (17)
N n=1

where x(n) denotes the original EEG signal,  x (n) is the denoised EEG signal
obtained by tuning the wavelet parameters through the selected meta-heuristic
algorithms, and N is the sampling number.
The final decision about the denoise results are decided by comparing the original
criteria (i.e., SNR, MSE, RMSE, PRD) with improved one (i.e., S N Rout , S N Rimp ,
MSE, RMSE, PRD).

5 Results and Discussions

5.1 EEG Dataset

The Motor Movement/Imagery’ (See footnote 1) dataset [19] collected the EEG
signals from 109 healthy subjects using a brain-computer interface software called
BCI2000 system [58]. The EEG signals are recorded using 64 Electrodes (EEG
channels) with sampling rate of 160 Hz per second, where each signal is stored in
EEG Signal Denoising Using Hybridizing Method ... 463

Fig. 5 Distribution of electrodes in EEG Motor Movement/Imagery Dataset

a separate EDF file. Each volunteer performs several motor/imagery tasks that are
mainly used in different fields, such as neurological rehabilitation and brain-computer
interface applications. In general, these tasks consist of imagining or simulating a
given action, such as opening and closing the eyes. The EEG signals are recorded
from each volunteer by asking them to perform four tasks according to the position
of a target that appears on the screen placed in front of them. If the target appears on
the right or left side of the screen, then the volunteer must open and close his/her fist
corresponding to the position of the target on the screen. If the target appears on the
top or bottom of the screen, then the volunteer must open and close his/her fists or feet.
Figure 5 shows the distribution of electrodes in the EEG Motor Movement/Imagery
Dataset.

5.2 Comparing the Proposed Method (GA-WT) with


State-of-the-Art Methods

In this section, two state-of-the-art methods for EEG signal denoising are discussed,
namely, the Al-Qazzaz method [13] and the Kumari method [1]. These methods use
WT for solving EEG signal denoising problems in which the WT parameters are set
based on a comparative study. The best parameter configurations for WT as identified
by these two methods are shown in Table 4.
464 Z. A. A. Alyasseri et al.

We compare the results of these two methods with this generated by our proposed
GA-WT method. The comparison is performed based on Kiern’s dataset [17], where
the original EEG signal is corrupted with WGN, PLN, and EMG [12, 35, 36]. The
final results are evaluated using five criteria, namely, MSE, RMSE, SNR, S N Rimp ,
and PRD. Table 5 shows the EEG signal denoising results of the Al-Qazzaz, Kumari,
and GA-WT methods. The first column presents the ranking of each method based
on the evaluation criteria adopted.
The results were evaluated using five measures, namely, MSE, RMSE, SNR_Out,
SNR_imp, and PRD). The performance of the proposed method (GA-WT) has been
compared with two state-of-the-art methods [1, 13]; the results show that the pro-
posed method achieves better outputs than [1, 13], as summarized in Table 5, in terms
of the overall EEG signal denoising criteria.
Figure 6 proves that the proposed GA-WT method outperforms both the Al-
Qazzaz and Kumari methods for EEG signal denoising based on different noises.
GA-WT obtains the best results for WGN and EMG based on MSE, RMSE, S N Rout ,
S N Rimp , and PRD. For PLN, GA-WT outperforms the Al-Qazzaz method [13] in
terms of MSE (0.0144) and RMSE (0.1200). Meanwhile, the S N Rout , S N Rimp , and
PRD values of these two methods are very close. In general, finding optimal param-

Table 4 Wavelet parameters range for Al-Qazzaz and Kumari methods


Wavelet parameters Al-Qazzaz method Kumari method
Mother wavelet (φ) Symlet (sym9) Daubechies (db4)
Decomposition level (L) 5 5
Thresholding type (β) soft and hard soft and hard
Selection method (λ) Rigrsure Rigrsure
Rescaling approach (ρ) sln, one sln, one

MSE and RMSE with PLN PRD with PLN SNR output with PLN SNR imp (dB) for PLN
MSE and RMSE Value

0.2 3 40 0

0.15 30 −1
SNR (dB)

SNR (dB)
PRD (%)

2
0.1 20 −2
MSE
RMSE 1
0.05 10 −3

0 0 0 −4
GA Sym9 GA Sym9 GA Sym9 GA Sym9

MSE and RMSE with EMG PRD with EMG SNR output with EMG SNR imp (dB) for EMG
MSE and RMSE Value

0.2 3 40 0

0.15 30
SNR (dB)

SNR (dB)
PRD (%)

2 −1
0.1 20
MSE
RMSE 1 −2
0.05 10

0 0 0 −3
GA db4 GA db4 GA db4 GA db4

MSE and RMSE with WGN PRD with WGN SNR output with WGN SNR imp (dB) for WGN
MSE and RMSE Value

30 100 3 3
SNR (dB)

SNR (dB)
PRD (%)

20 2 2
MSE
50
RMSE
10 1 1

0 0 0 0
GA Sym9 GA Sym9 GA Sym9 GA Sym9

Fig. 6 Comparative analysis between GA-WT, Sym9 and db4


Table 5 Comparing the proposed GA-WT method with state-of-the-art methods for EEG signals denoising with different noises
Rank Method Noise MSE SNR SNRimp PRD RMSE (φ) L β λ ρ
1 Proposed method WGN 24.7403 2.2045 2.0730 78.7682 4.3497 db35 5 Soft heursure sln
GA-WT
2 Al-Qazzaz WGN 26.186927 0.661388 0.792952 92.668167 5.117316 sym9 5 Soft rigrsure sln
method [13]
3 Kumari method WGN 27.006156 0.527605 0.6592 94.106513 5.196744 db4 5 Soft rigrsure sln
[1]
1 Al-Qazzaz PLN 0.025316 30.808240 −3.522428 2.881296 0.1591 sym9 5 hard rigrsure one
method [13]
2 Proposed method PLN 0.0144 30.5449 −3.7858 2.9700 0.1200 db27 5 hard heursure one
GA-WT
3 Kumari method PLN 0.030888 29.944328 −4.386341 3.182610 0.196744 db4 5 hard rigrsure one
[1]
EEG Signal Denoising Using Hybridizing Method ...

1 Proposed method EMG 0.0098 33.6418 −2.4149 2.0793 0.0990 db1 5 hard rigrsure one
GA-WT
2 Kumari method EMG 0.015076 33.059211 −2.99741 2.223511 0.122786 db4 5 hard rigrsure one
[1]
3 Al-Qazzaz EMG 0.019144 32.021900 −4.034729 2.505561 0.138361 sym9 5 hard rigrsure one
method [13]
1 Proposed method EOG 0.001 36.3513 15.6052 1.5221 0.0329 bior3.9 5 soft heursure one
GA-WT
2 Al-Qazzaz EOG 3.8699 22.4421 13.5106 7.5491 1.9672 sym9 5 hard rigrsure one
method [13]
3 Kumari method EOG 4.6352 21.6583 13.3562 8.262 2.153 db4 5 hard rigrsure one
[1]
Bold value indicates best results where for SNR, SNRimp, highest is best and for MSE, RMSE, and PRD, lowest is best
465
466 Z. A. A. Alyasseri et al.

eter configurations for WT by using metaheuristic-based algorithms especially GA,


can directly improve the performance of WT in the EEG signal denoising process.
The results show that the proposed method (GA-WT) for EEG signal denois-
ing can produce better results than manual configurations based on ad hoc strategy.
Therefore, using metaheuristic approaches to optimize the parameters for EEG sig-
nals positively affects the denoising process performance of the WT method.

6 Conclusions and Future Work

This paper proposes variation of wavelet transform (WT) method for EEG signal
denoising based on genetic algorithm called (GA-WT). As previously mentioned,
the denoising performance of WT depends on its five main parameters, with each
parameter having different types. Selecting the suitable WT parameters is a challeng-
ing task that is usually performed based on empirical evidence or experience. The
proposed method (GA-WT) aim to find the optimal WT parameters that can obtain
the minimum MSE between the original and denoised EEG signals.
The GA-WT is evaluated using a standard EEG dataset, the EEG Motor Movement-
Imagery dataset. These dataset contain 109 volunteers, and capture EEG signals from
64 EEG channels based on different mental tasks. These EEG signals are corrupted
using three different noises namely, PLN, EMG, and WGN [12, 35, 36]. Five eval-
uation criteria are used, namely, SNR, SNR improvement, MSE, RMSE, and PRD.
Several experiments are conducted to compare the performance of the GA-WT can
support WT in producing efficient EEG signal denoising outcomes. Interestingly,
GA-WT outperforms the other proposed methods.

Acknowledgements This research has been done under USM Grant (1001/PKOMP/8014016).
Also, the first author would like to thank The World Academic Science (TWAS) and the University
Science Malaysia (USM) for supporting his study (TWAS-USM Postgraduate Fellowship 2015, FR
number: 3240287134).

References

1. Kumari P, Vaish A (2015) Brainwave based user identification system: a pilot study in robotics
environment. Robot Auton Syst 65:15–23
2. Alyasseri ZAA, Khader AT, Al-Betar MA, Papa JP, ahmad Alomari O (2018) EEG-based person
authentication using multi-objective flower pollination algorithm. In: 2018 IEEE congress on
evolutionary computation (CEC). IEEE, pp 1–8
3. Ramadan RA, Vasilakos AV (2017) Brain computer interface: control signals review. Neuro-
computing 223:26–44
4. Rao RP (2013) Brain-computer interfacing: an introduction. Cambridge University Press, Cam-
bridge
5. Alyasseri ZAA, Khader AT, Al-Betar MA, Papa JP, Alomari OA, Makhadme SN (2018) An
efficient optimization technique of EEG decomposition for user authentication system. In: 2018
EEG Signal Denoising Using Hybridizing Method ... 467

2nd international conference on biosignal analysis, processing and systems (ICBAPS). IEEE,
pp 1–6
6. Alyasseri ZAA, Khader AT, Al-Betar MA, Papa JP, Alomari OA, Makhadmeh SN (2018)
Classification of EEG mental tasks using multi-objective flower pollination algorithm for person
identification. Int J Integr Eng 10(7):7
7. Alyasseri ZAA, Khader AT, Al-Betar MA, Papa JP, Alomari OA (2018) EEG feature extraction
for person identification using wavelet decomposition and multi-objective flower pollination
algorithm. IEEE Access 6:76007–76024
8. Berger H (1929) Uber das elektrenkephalogramm des menschen. Eur Arch Psychiatry Clin
Neurosci 87(1):527–570
9. Abdulkader SN, Atia A, Mostafa M-SM (2015) Brain computer interfacing: applications and
challenges. Egypt Inform J 16(2):213–230
10. Tareq Z, Zaidan B, Zaidan A, Suzani M (2018) A review of disability EEG based wheelchair
control system: coherent taxonomy, open challenges and recommendations. Comput Methods
Programs Biomed 164:221–237
11. Adeli H, Ghosh-Dastidar S, Dadmehr N (2007) A wavelet-chaos methodology for analysis of
EEGs and EEG subbands to detect seizure and epilepsy. IEEE Trans Biomed Eng 54(2):205–
211
12. El-Dahshan E-SA (2011) Genetic algorithm and wavelet hybrid scheme for ECG signal denois-
ing. Telecommun Syst 46(3):209–215
13. Al-Qazzaz NK, Hamid Bin Mohd Ali S, Ahmad SA, Islam MS, Escudero J (2015) Selection
of mother wavelet functions for multi-channel EEG signal analysis during a working memory
task. Sensors 15(11):29015–29035
14. Alyasseri ZAA, Khader AT, Al-Betar MA, Abualigah LM (2017) ECG signal denoising using
β-hill climbing algorithm and wavelet transform. In: ICIT 2017 the 8th international conference
on information technology, pp 1–7
15. Alyasseri ZAA, Khader AT, Al-Betar MA (2017) Optimal electroencephalogram signals
denoising using hybrid β-hill climbing algorithm and wavelet transform. In: Proceedings of the
international conference on imaging, signal processing and communication. ACM, pp 106–112
16. Alyasseri ZAA, Khader AT, Al-Betar MA (2017) Electroencephalogram signals denoising
using various mother wavelet functions: a comparative analysis. In: Proceedings of the inter-
national conference on imaging, signal processing and communication. ACM, pp 100–105
17. Keirn ZA, Aunon JI (1990) A new mode of communication between man and his surroundings.
IEEE Trans Biomed Eng 37(12):1209–1214
18. Sharma PK, Vaish A (2016) Individual identification based on neuro-signal using motor move-
ment and imaginary cognitive process. Opt Int J Light Electron Opt 127(4):2143–2148
19. Goldberger AL, Amaral LA, Glass L, Hausdorff JM, Ivanov PC, Mark RG, Mietus JE, Moody
GB, Peng C-K, Stanley HE (2000) Physiobank, physiotoolkit, and physionet. Circulation
101(23):e215–e220
20. Al-Qazzaz NK, Ali S, Ahmad SA, Islam MS, Ariff MI (2014) Selection of mother wavelets
thresholding methods in denoising multi-channel EEG signals during working memory task. In:
2014 IEEE conference on biomedical engineering and sciences (IECBES). IEEE, pp 214–219
21. Al-Qazzaz NK, Ali SHBM, Ahmad SA, Islam MS, Escudero J (2018) Discrimination of stroke-
related mild cognitive impairment and vascular dementia using EEG signal analysis. Med Biol
Eng Comput 56(1):137–157
22. Reddy CSP et al (2017) Analysis of EEG signal for the detection of brain abnormalities. Int J
Res 4(17):1947–1950
23. Kumari P, Vaish A (2016) Feature level fusion of mental tasks brain signal for an efficient
identification system. Neural Comput Appl 27(3):659–669
24. Mowla MR, Ng S-C, Zilany MS, Paramesran R (2015) Artifacts-matched blind source separa-
tion and wavelet transform for multichannel EEG denoising. Biomed Signal Process Control
22:111–118
25. Yang B, Zhang T, Zhang Y, Liu W, Wang J, Duan K (2017) Removal of electrooculogram arti-
facts from electroencephalogram using canonical correlation analysis with ensemble empirical
mode decomposition. Cogn Comput 9(5):626–633
468 Z. A. A. Alyasseri et al.

26. Torabi A, Jahromy FZ, Daliri MR (2017) Semantic category-based classification using non-
linear features and wavelet coefficients of brain signals. Cogn Comput 9(5):702–711
27. Zhou G, Zhao Q, Zhang Y, Adali T, Xie S, Cichocki A (2016) Linked component analysis from
matrices to high-order tensors: applications to biomedical data. Proc IEEE 104(2):310–331
28. Wang H, Zhang Y, Waytowich NR, Krusienski DJ, Zhou G, Jin J, Wang X, Cichocki A (2016)
Discriminative feature extraction via multivariate linear regression for SSVEP-based BCI.
IEEE Trans Neural Syst Rehabil Eng 24(5):532–541
29. Zhang Y, Nam CS, Zhou G, Jin J, Wang X, Cichocki A (2018) Temporally constrained sparse
group spatial patterns for motor imagery BCI. IEEE Trans Cybern 49(9):3322–3332
30. Jiao Y, Zhang Y, Chen X, Yin E, Jin J, Wang X, Cichocki A (2018) Sparse group representation
model for motor imagery EEG classification. IEEE J Biomed Health Inform 23(2):631–641
31. Zhang Y, Zhou G, Jin J, Zhao Q, Wang X, Cichocki A (2016) Sparse bayesian classification
of EEG for brain-computer interface. IEEE Trans Neural Netw Learn Syst 27(11):2256–2267
32. Alyasseri ZAA, Khader AT, Al-Betar MA, Awadallah MA (2018) Hybridizing β-hill climbing
with wavelet transform for denoising ECG signals. Inf Sci 429:229–246
33. Al-Betar MA (2017) β-hill climbing: an exploratory local search. Neural Comput Appl
28(1):153–168
34. Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PC, Mark RG, Mietus
JE, Moody GB, Peng C-K, Stanley HE (2000) PhysioBank, PhysioToolkit, and Phys-
ioNet: components of a new research resource for complex physiologic signals. Circula-
tion 101(23):e215–e220. https://doi.org/10.1161/01.CIR.101.23.e215 Circulation Electronic
Pages: http://circ.ahajournals.org/content/101/23/e215.fullPMID:1085218
35. Wang J, Ye Y, Pan X, Gao X (2015) Parallel-type fractional zero-phase filtering for ECG signal
denoising. Biomed Signal Process Control 18:36–41
36. Jenkal W, Latif R, Toumanari A, Dliou A, El Bcharri O, Maoulainine FM (2016) An efficient
algorithm of ECG signal denoising using the adaptive dual threshold filter and the discrete
wavelet transform. Biocybern Biomed Eng 36(3):499–508
37. Subasi A, Ercelebi E (2005) Classification of EEG signals using neural network and logistic
regression. Comput Methods Programs Biomed 78(2):87–99
38. Kumar H, Pai SP, Vijay G, Rao R (2014) Wavelet transform for bearing condition monitoring
and fault diagnosis: a review. Int J COMADEM 17(1):9–23
39. Mamun M, Al-Kadi M, Marufuzzaman M (2013) Effectiveness of wavelet denoising on elec-
troencephalogram signals. J Appl Res Technol 11(1):156–160
40. Al-Kadi MI, Reaz MBI, Ali MAM, Liu CY (2014) Reduction of the dimensionality of the
EEG channels during scoliosis correction surgeries using a wavelet decomposition technique.
Sensors 14(7):13046–13069
41. Borse S (2015) EEG de-noising using wavelet transform and fast ICA. IJISET Int J Innov Sci
Eng Technol 2:200–205
42. Poornachandra S, Kumaravel N (2005) Hyper-trim shrinkage for denoising of ECG signal.
Digit Signal Proc 15(3):317–327
43. Yang R, Ren M (2011) Wavelet denoising using principal component analysis. Expert Syst
Appl 38(1):1073–1076
44. Sawant C, Patii HT (2014) Wavelet based ECG signal de-noising. In: 2014 first international
conference on networks & soft computing (ICNSC). IEEE, pp 20–24
45. Donoho DL, Johnstone JM (1994) Ideal spatial adaptation by wavelet shrinkage. Biometrika
81(3):425–455
46. Singh BN, Tiwari AK (2006) Optimal selection of wavelet basis function applied to ECG signal
denoising. Digit Signal Proc 16(3):275–287
47. McSharry PE, Clifford GD, Tarassenko L, Smith LA (2003) A dynamical model for generating
synthetic electrocardiogram signals. IEEE Trans Biomed Eng 50(3):289–294
48. Alyasseri ZAA, Khadeer AT, Al-Betar MA, Abasi A, Makhadmeh S, Ali NS (2019) The effects
of EEG feature extraction using multi-wavelet decomposition for mental tasks classification.
In: Proceedings of the international conference on information and communication technology.
ACM, pp 139–146
EEG Signal Denoising Using Hybridizing Method ... 469

49. Feng J, Wang Z, Zeng M (2013) Distributed weighted robust kalman filter fusion for uncertain
systems with autocorrelated and cross-correlated noises. Inf Fusion 14(1):78–86
50. Sun X-J, Gao Y, Deng Z-L, Li C, Wang J-W (2010) Multi-model information fusion Kalman
filtering and white noise deconvolution. Inf Fusion 11(2):163–173
51. Donoho DL (1995) De-noising by soft-thresholding. IEEE Trans Inf Theory 41(3):613–627
52. Ustundaug M, Gokbulut M, Sengur A, Ata F (2012) Denoising of weak ECG signals by
using wavelet analysis and fuzzy thresholding. Netw Model Anal Health Inform Bioinform
1(4):135–140
53. Zeng K, Dong M (2014) A novel cuboid method with particle swarm optimization for real-life
noise attenuation from heart sound signals. Expert Syst Appl 41(15):6839–6847
54. Lagha M, Tikhemirine M, Bergheul S, Rezoug T, Bettayeb M (2013) De-noised estimation of
the weather doppler spectrum by the wavelet method. Digit Signal Proc 23(1):322–328
55. Vazquez RR, Velez-Perez H, Ranta R, Dorr VL, Maquin D, Maillard L (2012) Blind source
separation, wavelet denoising and discriminant analysis for EEG artefacts and noise cancelling.
Biomed Signal Process Control 7(4):389–400
56. Holland JH (1992) Adaptation in natural and artificial systems: an introductory analysis with
applications to biology, control, and artificial intelligence. MIT Press, Cambridge
57. Kabir MA, Shahnaz C (2012) Denoising of ECG signals based on noise reduction algorithms
in EMD and wavelet domains. Biomed Signal Process Control 7(5):481–489
58. Schalk G, McFarland DJ, Hinterberger T, Birbaumer N, Wolpaw JR (2004) BCI 2000: a general-
purpose brain-computer interface (BCI) system. IEEE Trans Biomed Eng 51(6):1034–1043
Neural Network Ammonia-Based
Aeration Control for Activated Sludge
Process Wastewater Treatment Plant

M. H. Husin, M. F. Rahmat, N. A. Wahab, and M. F. M. Sabri

Abstract The paper proposes an improved effluent control for the operation of a
biological wastewater treatment plant using a neural network ammonia-based aer-
ation control. The main advantage of this control method is the simplicity and
nonlinear approximation ability that beat the performances of the static-gain
Proportional Integral (PI) controller. The trained neural network controller used the
measured value of dissolved oxygen and ammonium in compartment 5 of the
Benchmark Simulation Model No. 1 (BSM1) to regulate the oxygen transfer
coefficient in compartment 5. The effectiveness of the proposed neural network
controller is verified by comparing the performance of the activated sludge process
to the benchmark PI under dry weather file. Simulation results indicate that Ntot,e,
and SNH,e violations are reduced by 22% reduction for Ntot,e, and 4% for SNH,e.
The significant improvement in effluent violation, and effluent quality index of the
BSM1 confirms the advantage of the proposed method over the Benchmark PI. For
future research, the method can also be applied in controlling the nitrate in activated
sludge wastewater treatment plant.

Keywords Aeration control  Activated sludge  Wastewater treatment plant


Nomenclature
AE Aeration Energy
BSM1 Benchmark Simulation Model No. 1
DO Dissolved Oxygen
EQ Effluent Quality
MPC Model Predictive Control
OCI Overall Cost Index

M. H. Husin (&)  M. F. M. Sabri


Department of Electrical and Electronic Engineering, Faculty of Engineering,
Universiti Malaysia Sarawak, 94300 Kota Samarahan, Sarawak, Malaysia
e-mail: [email protected]
M. F. Rahmat  N. A. Wahab
Department of Control and Mechatronics Engineering, Faculty of Electrical Engineering,
Universiti Teknologi Malaysia (UTM), 81310 Johor Bahru, Johor, Malaysia

© Springer Nature Singapore Pte Ltd. 2021 471


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_32
472 M. H. Husin et al.

PI/PID Proportional Integral/Proportional Integral Derivative


WWTP Wastewater treatment plant

1 Introduction

1.1 Activated Sludge Wastewater Treatment Plant

Wastewater treatment plant (WWTP) is a process used to remove the contaminants


from wastewater and convert it into an effluent that is safe or has minimum impact
on the environment. The activated sludge process is a form of the wastewater
treatment process for handling wastewater using aeration and bacteria. The acti-
vated sludge process is a biological process, and it is the most commonly applied
[1, 2] technology in WWTP.
Minimizing the energy expenditure in the activated sludge process can be
achieved by controlling the aeration system. From the total operation cost of the
WWTP, the energy consumption itself may range from 30–50% [3–5], with over
half of the energy requirement comes from the aeration section. Aeration is a costly
process [4–9], and the increases in the cost of the energy will escalate the total
operation cost even more. The wastewater treatment process performance depends
on the effectiveness of maintaining the dissolved oxygen (DO) concentration at a
reasonable level. DO concentration has an immense influence on treatment effec-
tiveness, operational cost, and system stability.
WWTP is a process industry that has influent variations and a large disturbance.
Unlike other process industries, WWTP cannot restraint the crude material to the
plant. Standards have been established for the quality of the effluent discharged
from the WWTP to receiving waters. Due to the imposed of more strict discharge
thresholds, process control at WWTP is becoming gradually more essential. In
adapting to the new requirements, automatic control has been used to improve the
water quality and also to minimize the operational costs to achieve sustainable
treatments.
Two ways are proposed by [10] to control the aeration process, which is the total
aerobic volume and the aeration intensity. In changing the aeration intensity, a
common method used is by adjusting the DO concentration level based on the
ammonium concentration in the effluent. With ammonium concentration as a target
variable, ammonium feedback control can adjust the aeration intensity as required
by the process. Ammonium feedback control can achieve 3–7% [11] energy saving
compared to constant DO control.
The BSM1 [12] has been established as the simulation model and protocol and a
handful of papers working on the control of WWTP being using this benchmark.
Ever since the establishment, in most earlier literature and even in recent years
[13–15], the WWTP operation is usually assessed in terms of overall cost index
Neural Network Ammonia-Based Aeration Control … 473

(OCI) and effluent quality (EQ). Control schemes applied in most of those works
directly attempt to control the DO and nitrate concentration, which are the variables
that defined the attribute of the effluent and cost of the WWTP operation.
Proportional Integral (PI) or Proportional Integral Derivative (PID) control
strategy has is the most commonly used control strategy in the process control of
WWTP. However, the control of the linear PI/PID might be affected by the dis-
turbances or changes in the condition of operation. Various solutions are proposed
to improve DO concentration control performance. By limiting the literature based
on BSM1 as the working scenario, it can be perceived that in most control strategy
solutions to the above mentioned setback of PI/PID controllers are using different
types of controllers such as nonlinear PI controller [16, 17], model predictive
control [18–22] and artificial intelligence control [23–27]. However, the control
strategy remains the same, which is to control the DO and nitrate concentration.
Generally, the enhancement of control performance in the nonlinear PI controller
results in a trivial enhancement of the EQ and infrequent to achieve a reduction of
cost. Model predictive control (MPC) or artificial intelligent control, on the other
hand, usually have better EQ and offers a reduction of cost. However, these
methods have complicated structures, and the complex algorithm of MPC requires a
large number of computations, due to the attempts at every control interval to
optimize upcoming plant behavior by calculating a sequence of upcoming manip-
ulated variable adjustment.
Due to this implementation, the overall performance of the WWTP can be said to
be improved. However, the detailed analysis from the environmental aspect is not
being discussed further in most of the papers. Further analysis of the imposed
pollution limit must be taken into account to ensure the effluent discharge from the
WWTP is safe or has minimum impact on the environment.

1.2 Ammonium Based Aeration Control

Precisely, most critical pollutants are ammonium and ammonium nitrogen, and total
nitrogen. Not many research work yet to be found that are taking into account the
imposed pollution limits. Recently, all the necessary elements for advanced control
are now available and within reach of any wastewater treatment utilities. The arrival
of in situ ISEs to measure ammonium is an important development to the process
industries. This technology is mature and continues to develop and improve. Thus,
ammonia-based aeration control is becoming an increasingly popular aeration
strategy applied to WWTP. The ammonia-based aeration control was made possible
with the availability of numerous sensors, e.g., ammonia ISE probes, that determine
the activity of ammonia ion in solutions. Ammonia-based aeration control would be
beneficial for many wastewater treatment utilities. However, the applicable control
strategy for a particular wastewater treatment facility depends on factors like system
configuration, discharge limitation, and wastewater treatment characteristics.
474 M. H. Husin et al.

Utilities have implemented ammonia-based aeration control based on feedback


and feedforward strategies. Feedback is very common in the process industry, but it
can have limitations in a high dynamics system such as WWTP. Feedforward has
more complexity, but it does offer the possibility to attain the best effluent at the
lowest energy cost. A study on ammonia-based aeration control applied to WWTP
can be found in a few papers [5, 10, 28]. Two of these studies are implemented in
real WWTP (Kappala WWTP and an industrial WWTP), while the other one is
implemented using BSM1. In all papers, the focus of the study is to reduce the
aeration cost while maintaining high-quality effluent. The details on the imposed
pollution limits, e.g., ammonium and total nitrogen, are not mentioned in the paper.
In all papers above, PI controllers are the applied controller. As being stated earlier,
the PI/PID controller might not respond well when dealing with disturbances.
WWTP is a highly nonlinear plant with huge disturbances. Thus, a more advanced
controller is needed to tackle these issues. For this study, the referred article are [11,
29, 30]. The summary of previous related research focuses mainly on the aeration
control of activated sludge WWTP is illustrated in Table 1.

Table 1 Summary of aeration control for activated sludge WWTP


References Approach Major findings
Åmand and Supervisory PI ammonium feedback i. Achieved 1–3.5% savings in the
Carlsson control with DO profile created from a airflow rate compared to constant DO
[11] mathematical minimization of the daily control
air flow rate ii. Use a modified version of BSM1
(no zones for denitrification included)
Santin et al. Hierarchical control architecture i. Complete elimination of total
[29] Lower level: MPC to regulates the DO nitrogen violation is achieved by
of the three aerated tanks based on adding additional carbon at tank 1
ammonium and ammonia nitrogen ii. Manipulating internal recirculating
concentration in the tank 5 flow rate (Qrin) with a combination of
Higher-level: Affine function to linear and exponential function makes
determine the DO setpoint possible of ammonia violations
removal
Santin et al. Effluent pollutants concentration i. A logical signal is generated at the
[30] prediction by using ANN instants where risk is detected
ii. Simulation is done in BSM2
Uprety et al. Ammonia PID control calculated DO i. Implemented are real Industrial
[28] setpoint based on the difference WWTP
between ammonia probe feedback and ii. Significant reduction in
ammonia set point supplemental carbon necessary for
denitrification with a reduction in plant
energy consumption
iii. Reduced the need for increased
reactor volume
Várhelyi Combination of PI ammonia-based i. Potential to achieve a cost reduction
et al. [5] aeration control with the control of of about 43%
nitrate and return activated sludge ii. A data collection form municipal
recycle WWTP
Neural Network Ammonia-Based Aeration Control … 475

Fig. 1 Ammonium cascade control. The NH controller determines the DO setpoint

Most of these study deals with PI/PID controller with ammonium cascade
control structure, as shown in Fig. 1. In this configuration, the ammonia sensor is
located at the aerated zones (reactor 3 to 5). The ammonia probe constantly
transmits a signal of the ammonia measurement to an ammonia PI/PID controller,
which then computes a DO setpoint based on the variation between the reading of
the ammonia probe and the required ammonia set point. Ammonia set point in the
aeration effluent ranges from 1–5 mg NH4/l [28], depending on the permit limits.
This PI/PID calculated DO set point is then relayed to the DO controller. With the
ammonia PI/PID control, it requires two cascade controller.

2 Benchmark Simulation Model No. 1 (BSM1)

The BSM1 is a simulation environment defining a plant layout, a simulation model,
influent loads, a test procedure, and evaluation criteria. The BSM1 is based on the
ASM1, and its layout is shown in Fig. 2. The first component of BSM1 is a
biological activated sludge reactor, which consists of five compartments: two
non-aerated and three aerated. Each non-aerated compartment has a volume of
1000 m3, and each aerated compartment has a volume of 1333 m3. The secondary
settler is a 10-layer non-reactive unit with no biological reaction; its volume is 6000 m3.
The influent data defined in BSM1 consist of dry weather, rain weather, and
storm weather files. The influent data used are sampled with a sampling period of 15 min in
the following order:
[time SI SS XI XS XBH XBA XP SO SNO SNH SND XND SALK Q0]

In any influent: SO = 0 g (−COD) m−3; XBA = 0 g COD m−3; SNO = 0 g N m−3;
XP = 0 g COD m−3; SALK = 7 mol m−3. The details of the influent variables are given in
Table 2. For this study, only the dry weather file is considered. The dry weather file
comprises fourteen days of dynamic dry influent data (see Fig. 3).
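As a hedged illustration only, the sketch below reads a whitespace-delimited influent file with the column order listed above; the file name and the delimiter are assumptions, not part of the benchmark definition.

```python
# Illustrative loader for a BSM1-style influent file (assumed whitespace-delimited text).
import numpy as np

COLUMNS = ["time", "SI", "SS", "XI", "XS", "XBH", "XBA", "XP",
           "SO", "SNO", "SNH", "SND", "XND", "SALK", "Q0"]

def load_influent(path):
    """Return a dict mapping each influent variable to its sampled time series."""
    data = np.loadtxt(path)                       # one row per 15-min sample
    assert data.shape[1] == len(COLUMNS), "unexpected number of columns"
    return {name: data[:, i] for i, name in enumerate(COLUMNS)}

# Example call (the file name is hypothetical):
# influent = load_influent("Inf_dry.txt")
# print(influent["Q0"].mean())   # average dry-weather flow rate, m3/d
```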

Fig. 2 Default control strategy in BSM1

Table 2 Description of variables

SI: Soluble inert organic matter
SS: Suspended solids
XI: Particulate inert organic matter
XS: Slowly biodegradable substrate
XBH: Active heterotrophic biomass
XBA: Active autotrophic biomass
XP: Particulate products arising from biomass decay
SO: Dissolved oxygen
SNO: Nitrate
SNH: Ammonium and ammonia nitrogen
SND: Soluble biodegradable organic nitrogen
XND: Particulate biodegradable organic nitrogen
SALK: Alkalinity
Q0: Input flowrate

The simulation setup starts with initialization, where a simulation of 100 days
of stabilization in closed loop (using constant inputs with no noise on the
measurements) has to be completed. This is followed by a simulation using
the dry weather file, and lastly by simulation of the weather file to be evaluated. Noise
on the measurements must be used with the dynamic files. The system is stabilized
once steady state is attained.
A simulation procedure is set to achieve a fair assessment of results. To
compare different control strategies, a few standards are outlined for
the plant performance assessment. These include the Effluent Quality Index (EQI) and the
OCI to weigh the operating cost. The assessment also comprises the calculation of
the operating time during which the concentration of pollutants in the discharge is above
the limits shown in Table 3. Total nitrogen (Ntot) is the sum of nitrate (NO) and Kjeldahl
nitrogen (NKj).

Fig. 3 Dry weather influent: (a) Q0, input flowrate of the dry weather influent; (b) SS, SNH and SND concentrations of the dry weather influent; (c) XBH, XS, XI and XND concentrations of the dry weather influent



Table 3 Concentration thresholds of pollutants in the effluent

Variable: Maximum accepted value
Ntot [g N/m3]: 18
CODt [g COD/m3]: 100
NH [g N/m3]: 4
TSS [g SS/m3]: 30
BOD5 [g BOD/m3]: 10

2.1 PI Control

The default controller in BSM1 is the PI controller. The primary control objectives
are to maintain the nitrate concentration in tank 2 at a setpoint value of 1 g N m−3
and the DO concentration in tank 5 at a setpoint value of 2 g (−COD) m−3.
The PI controllers are of the following form:

$$u(t) = K\left(e(t) + \frac{1}{T_i}\int_{0}^{t} e(\tau)\,d\tau + T_d\,\frac{de(t)}{dt}\right), \qquad u_{\min} < u(t) < u_{\max} \qquad (1)$$

where u(t) is the controller output, K is the controller gain, T_i is the integral time,
e(t) is the control error, and u_min and u_max are the lower and upper limits of the
controller output, respectively.

2.2 Ammonia Sensor

For the ammonia-based aeration control, the ammonia sensor used is of class B0
(see Fig. 4) with a measurement span of 0–20 g N m−3 and measurement noise
δ = 0.5 g N m−3, as recommended by BSM1 [12]. This sensor is located at the
final aerated compartment and continuously sends the ammonia measurement
to the neural network controller.
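A simplified sketch of such a sensor is given below; it reproduces only the static properties stated above (the 0–20 g N m−3 span and the δ = 0.5 g N m−3 noise level, with Gaussian noise assumed) and deliberately omits the response-time dynamics that the full BSM1 sensor class also specifies.

```python
# Simplified ammonia sensor: range saturation plus additive measurement noise.
# The dynamic (response-time) part of the BSM1 class B0 sensor is intentionally omitted.
import random

SPAN_MIN, SPAN_MAX = 0.0, 20.0   # g N m-3, measurement span stated above
DELTA = 0.5                      # g N m-3, noise level stated above

def measure_snh(true_snh, rng=random):
    noisy = true_snh + rng.gauss(0.0, DELTA)     # additive noise (Gaussian assumed)
    return min(max(noisy, SPAN_MIN), SPAN_MAX)   # clip to the sensor span

print(measure_snh(3.2))
```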

2.3 Performance Assessment

The BSM1 performance assessment provides measures of the outcome of the
proposed control strategy. According to the benchmark, it can be divided into a few
categories: EQ, cost factors for operation (aeration energy (AE), pumping energy,
sludge production, consumption of external carbon source, and mixing energy), influent
quality, and OCI. However, for this study, only three important categories are
highlighted: EQ, AE, and OCI.

Fig. 4 Simulink model of sensor class B0

The EQ is averaged over a 7-day period for each weather file and is based on a
weighting of the effluent loads of compounds that have the main impact on the
quality of the receiving water and are accounted for in regional legislation. It is expressed as:
$$EQ = \frac{1}{T \cdot 1000}\int_{t=7\,\mathrm{days}}^{t=14\,\mathrm{days}} \bigl(B_{SS}\cdot SS_e(t) + B_{COD}\cdot COD_e(t) + B_{NKj}\cdot S_{NKj,e}(t) + B_{NO}\cdot S_{NO,e}(t) + B_{BOD_5}\cdot BOD_e(t)\bigr)\, Q_e(t)\, dt \qquad (2)$$

where

$$S_{NKj,e} = S_{NH,e} + S_{ND,e} + X_{ND,e} + i_{XB}\left(X_{BH,e} + X_{BA,e}\right) + i_{XP}\left(X_{P,e} + X_{I,e}\right)$$
$$SS_e = 0.75\left(X_{S,e} + X_{I,e} + X_{BH,e} + X_{BA,e} + X_{P,e}\right)$$
$$BOD_{5,e} = 0.25\left(S_{S,e} + X_{S,e} + (1 - f_P)\left(X_{BH,e} + X_{BA,e}\right)\right)$$
$$COD_e = S_{S,e} + S_{I,e} + X_{S,e} + X_{I,e} + X_{BH,e} + X_{BA,e} + X_{P,e}$$

The AE takes into account the plant peculiarities and is computed from the kLa
according to the following relation:

$$AE = \frac{S_O^{sat}}{T \cdot 1.8 \cdot 1000}\int_{t=7\,\mathrm{days}}^{t=14\,\mathrm{days}} \sum_{i=1}^{5} V_i \cdot K_L a_i(t)\, dt \qquad (3)$$

with K_L a given in d−1 and i referring to the compartment number.



Finally, the OCI is calculated as:

$$OCI = AE + PE + 5 \cdot SP + 3 \cdot EC + ME \qquad (4)$$

where PE is the pumping energy, SP is the sludge production to be disposed of, EC
is the consumption of the external carbon source, and ME is the mixing energy. Further
details on these equations can be found in [12].
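As a rough numerical illustration of Eqs. (3) and (4), the sketch below integrates a sampled KLa trajectory into AE and combines it with the other cost terms into the OCI. The reactor volumes follow the BSM1 layout described in Sect. 2, while the saturation DO value, the trapezoidal integration, and the example numbers are assumptions.

```python
# Numerical evaluation of AE (Eq. 3) and OCI (Eq. 4) from sampled kLa trajectories.
import numpy as np

SO_SAT = 8.0     # g(-COD)/m3, saturated DO concentration (assumed value)
T_EVAL = 7.0     # days, evaluation window (day 7 to day 14)
VOLUMES = np.array([1000.0, 1000.0, 1333.0, 1333.0, 1333.0])   # m3, BSM1 reactor volumes

def aeration_energy(kla, dt):
    """kla: (n_samples, 5) array in 1/d; dt: sample step in days (Eq. 3)."""
    weighted = (VOLUMES * kla).sum(axis=1)        # sum_i V_i * kLa_i(t)
    integral = np.trapz(weighted, dx=dt)          # integrate over the evaluation window
    return SO_SAT / (T_EVAL * 1.8 * 1000.0) * integral

def oci(ae, pe, sp, ec, me):
    """Overall cost index, Eq. (4)."""
    return ae + pe + 5.0 * sp + 3.0 * ec + me

# Example with a constant aeration profile over the window (15-min samples, values assumed):
dt = 15.0 / (24.0 * 60.0)
kla = np.full((int(T_EVAL / dt), 5), [0.0, 0.0, 240.0, 240.0, 84.0])
ae = aeration_energy(kla, dt)
print(ae, oci(ae, pe=250.0, sp=2500.0, ec=0.0, me=0.0))
```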

3 Methodology

3.1 Feed-Forward Neural Network Ammonia-Based Aeration Control

An artificial neural network (ANN) is an approach that replicates the biological
nervous system, e.g., the brain. It applies nonlinear processing units to mimic biological
neurons, modeling the activity of biological synapses between neurons by
fine-tuning the variable weights until the network output matches the target.
The main features of an ANN are its parallel processing capability and distributed
storage. An ANN offers advantages such as outstanding nonlinear mapping ability,
strong fault tolerance, self-organization, self-learning, and adaptive reasoning ability [23].
In this study, where the neural network is applied in the control system,
the neural network acts as a function approximator. The process (see Fig. 5)
involves adjusting the parameters of the network so that it produces the
same response as the unknown function when the same input is applied to both
systems.
In this paper, the proposed neural network ammonia-based aeration controller
(see Fig. 6) is used to manipulate the oxygen transfer coefficient
KLa5 of reactor tank 5, using the measured values of the DO

Fig. 5 The neural network as a function approximator



Fig. 6 The block diagram of the neural network ammonia-based aeration controller

concentration and the ammonia concentration in tank 5 directly. This study aims to
evaluate the feedforward neural network ammonia-based aeration controller against the PI
benchmark constant DO setpoint strategy.
Two-layer networks, with sigmoid transfer functions in the hidden layer and a
linear transfer function in the output layer, are universal approximators [31]. In this
study, a feed-forward neural network is applied with a two-layer network consisting of
10 sigmoid hidden neurons and a linear output neuron. The schematic of
the feedforward neural network is illustrated in Fig. 7.
Assume that the samples to be trained are $\{x_i, r_i\} \in \{X, R\}$, where $x_i$ represents
the input of the network, $X = [x_1(k), x_2(k), \ldots, x_n(k)]^T$ is the input vector, $r_i$
represents the expected output of the network, and $R = [r_1(k), r_2(k), \ldots, r_n(k)]^T$ is
the expected output vector. The sigmoid function is chosen as the activation function of
the hidden layer of the network, and a linear function as the activation function of the
output layer. $w_{i,j}^{L1} \in W^{L1}$ represents the weight connecting the $i$th neuron of the input
layer and the $j$th neuron of the hidden layer, and the weight connecting the $i$th neuron of
the hidden layer and the $j$th neuron of the output layer is $w_{i,j}^{L2} \in W^{L2}$. A two-layer network is
chosen, and $Y = [y_1(k), y_2(k), \ldots, y_n(k)]$ is the actual output of the network:

$$Y = W^{L2} f\left(X \cdot W^{L1}\right) \qquad (5)$$

where the sigmoid function is used as the function $f$,

$$f(x) = \frac{1}{1 + e^{-x}} \qquad (6)$$

and $e$ is a transcendental number, $e = 2.71828$ [32].



Fig. 7 The topological structure of the feed-forward neural network

The training index is set as:

$$J(k) = \frac{1}{2}\left(e(k)\right)^2 \qquad (7)$$

This structure can fit multidimensional mapping problems well, given reliable
data and sufficient neurons in its hidden layer. The feed-forward neural network is
widely used in modeling and control applications due to its simplicity and efficiency
[14]. An increased learning rate and avoidance of the local minima problem can
be achieved through the nonlinear mapping from the input layer to the hidden layer and
the linear mapping from the hidden layer to the output layer [23]. The network is
trained with the Bayesian Regularization algorithm.
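For illustration, the forward pass of the network just described (two inputs, 10 sigmoid hidden neurons, one linear output, Eqs. (5)–(6)) can be sketched as below. The weights are random placeholders rather than trained values, and the output clamp is an added assumption; in the study the weights result from Bayesian Regularization training, which is not reproduced here.

```python
# Forward pass of the two-layer feed-forward network of Sect. 3.1 (Eqs. 5-6).
# Weights are random placeholders; the paper trains them with Bayesian Regularization.
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(10, 2))   # input (SNH5, SO5) -> 10 sigmoid hidden neurons
b1 = rng.normal(size=10)
W2 = rng.normal(size=(1, 10))   # hidden -> single linear output (KLa5)
b2 = rng.normal(size=1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # Eq. (6)

def nn_controller(snh5, so5):
    x = np.array([snh5, so5])
    hidden = sigmoid(W1 @ x + b1)            # nonlinear hidden layer
    kla5 = (W2 @ hidden + b2)[0]             # linear output layer, Eq. (5)
    return float(np.clip(kla5, 0.0, 240.0))  # clamp to a plausible range (assumed)

print(nn_controller(snh5=4.0, so5=1.5))
```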

4 Results and Discussion

The ammonia-based aeration control applied in this study uses both the ammonium
concentration and the DO concentration as the controlled variables, with the oxygen
transfer coefficient as the manipulated variable. The ammonium sensor is located
in the fifth tank; it is common to locate the sensor in the last zone of the
activated sludge process. Simulations are carried out using a class B0 sensor for SNH
and SNO and a type A sensor for SO. The dry weather influent is used to evaluate the
proposed control strategy.
The pollutants SNH,e and Ntot,e are the most demanding to keep
under the approved limits. Reduction of Ntot,e can be accomplished by adding an
external carbon flow rate (qEC) in the first tank, while reducing the peaks of
SNH,e requires proper manipulation of the internal recirculating flow rate (Qrin). The
proposed control strategy is compared to the default BSM1 PI
controller (see Fig. 8). The dotted line is the Ntot,e limit, the default BSM1 is indicated
by the blue line, and the red line is the proposed neural network ammonia-based
aeration control. It can be observed that by using the proposed method, a large
decrease of the Ntot,e peaks is achieved, and the number of violations is reduced
from 7 occasions to 5 occasions during the evaluation week using the NN-ABAC
control strategy.
However, the proposed control strategy alone will not keep Ntot,e below the
allowed limit. The complete removal of Ntot,e violations can only be achieved if additional
carbon is added at tank 1. This is due to the increased anoxic growth of
XBH when a carbon dosage is added to tank 1.


Fig. 8 Ntot,e performances of one-week simulation using dry weather with the benchmark PI
controller (blue line) and with the NN-ABAC (red line)


Fig. 9 SNH,e performance of one-week simulation using dry weather with the default PI
controller (blue line) and with the NN-ABAC (red line)

As for the SNH,e violations, only a slight decrease of the SNH,e peaks is achieved using
the NN-ABAC control strategy (see the red line in Fig. 9); however, the number of
occasions remains the same. As mentioned previously, control of the SNH,e violations
can be obtained if Qrin is correctly manipulated. Proper manipulation of
Qrin is needed to improve the nitrification process.
Table 4 shows the results for EQ, AE, OCI, and the percentage of time over the
limits for SNH,e and Ntot,e. It shows that with the proposed control strategy
(NN-ABAC), the Ntot,e violation time is reduced by 22% while that of SNH,e is reduced by 4%.
These figures are consistent with the graphs shown in Figs. 8 and 9. Besides, an
improvement of 2% in EQ is obtained. The improvement in EQ is foreseeable due to the
reduction of effluent violations for SNH,e and Ntot,e. However, AE is increased by
1%. The increase in AE is mainly because, in the benchmark, the DO concentration
setpoint is fixed, while in the proposed controller the DO concentration is varied.

Table 4 Results with the proposed NN-ABAC and its comparison with the benchmark BSM1
control strategy for dry weather
BSM1 NN-ABAC % of reduction
EQ (kg poll.units/d) 6096.71 5975.73 2%
AE 3697.57 3749.24 −1%
OCI 16366.3 16435.9 0%
Ntot,e violations (% of operating time) 17.8571 13.8393 22%
SNH,e violations (% of operating time) 16.8155 16.0714 4%

The DO setpoint for the proposed controller depends on the ammonia reading
obtained by the ammonia sensor at tank 5. However, the slight increase in AE
does not increase the OCI significantly.

5 Conclusions

This paper aims to improve the effluent control of the benchmark plant. Using the
proposed control strategy (NN-ABAC), the discharge effluent violations show a
reduction in the total number of violations for the two main pollutants, SNH,e and Ntot,e.
These two pollutants are the ones that are difficult to keep under the established
limits. It can be observed from the simulation results that the Ntot,e and SNH,e
violations are reduced by 22% for Ntot,e and 4% for SNH,e. Also, a
reduction of EQ by 2% is achieved compared to the default PI benchmark. The
large reduction in the number of violations shows that the proposed approach
improves the effluent control of the BSM1.
Nonetheless, for future improvement, adding an additional carbon dosage at
tank 1 can help improve the denitrification process and thus help eliminate more
Ntot,e violations. However, adding a carbon dosage will also increase the OCI.
Good control of the internal recirculation flow rate is needed to
improve the nitrification process, because it can eliminate more SNH,e violations.

Acknowledgements The authors wish to thank the Universiti Malaysia Sarawak and
Special MYRA Assessment Funding (Project ID: F02/Sp/MYRA/1719/2018) for their financial
support. Their support is gratefully acknowledged.

References

1. Mei-jin L, Fei L (2014) A nonlinear adaptive control approach for an activated sludge process
using neural networks. In: The 26th Chinese control and decision conference CCDC 2014.
IEEE, pp 2435–2440
2. Hoang BL, Tien DN, Luo F, Nguyen PH (2014) Dissolved oxygen control of the activated
sludge wastewater treatment process using Hedge Algebraic control. In: 2014 7th
international conference on biomedical engineering and informatics. IEEE, pp 827–832
3. Ghoneim WAM, Helal AA, Wahab MGA (2016) Minimizing energy consumption in
wastewater treatment plants. In: 2016 3rd international conference on renewable energies for
developing countries, REDEC 2016. Institute of Electrical and Electronics Engineers Inc.
4. Fernández FJ, Castro MC, Rodrigo MA, Cañizares P (2011) Reduction of aeration costs by
tuning a multi-set point on/off controller: a case study. Control Eng Pract 19:1231–1237
5. Várhelyi M, Brehar M, Cristea VM (2018) Control strategies for wastewater treatment plants
aimed to improve nutrient removal and to reduce aeration costs. In: Proceedings of the 2018
IEEE international conference on automation, quality and testing, robotics, AQTR 2018,
THETA 21st edn, pp 1–6
6. Amand L, Carlsson B (2012) Optimal aeration control in a nitrifying activated sludge process.
Water Res 46:2101–2110

7. Liu C, Li S, Zhang F (2011) The oxygen transfer efficiency and economic cost analysis of
aeration system in municipal wastewater treatment plant. Energy Procedia 5:2437–2443
8. Rieger L, Jones RM, Dold PL, Bott CB (2013) Ammonia-based feedforward and feedback
aeration control in activated sludge processes. Water Environ Res 86:63–73
9. Rieger L, Jones RM, Dold PL, Bott CB (2014) Ammonia-based feedforward and feedback
aeration control in activated sludge processes. Water Environ Res 86:63–73
10. Åmand L, Carlsson B (2014) Aeration control with gain scheduling in a full-scale wastewater
treatment plant. IFAC
11. Åmand L, Carlsson B (2013) The optimal dissolved oxygen profile in a nitrifying activated
sludge process – comparisons with ammonium feedback control. Water Sci Technol 68:641–
649
12. Alex J, Benedetti L, Copp J, Gernaey KV, Jeppsson U, Nopens I, Pons M, Rieger L, Rosen C,
Steyer JP, Vanrolleghem P, Winkler S (2008) Benchmark Simulation Model no. 1 (BSM1)
13. Chen W, Yao C, Lu X (2014) Optimal design activated sludge process by means of
multi-objective optimization: case study in Benchmark Simulation Model 1 (BSM1). Water
Sci Technol 69:2052–2058
14. Zhang W, Qiao J (2014) Direct adaptive neural network control for wastewater treatment
process. In: Proceeding of the 11th world congress on intelligent control and automation.
IEEE, pp 4003–4008
15. Kumar SS, Latha K (2017) A hybrid intelligent controller to reduce the energy of a
wastewater treatment plant. In: 2017 trends in industrial measurement and automation
(TIMA). IEEE, pp 1–5
16. Samsudin SI, Rahmat MF, Abdul Wahab N (2014) Nonlinear PI control with adaptive
interaction algorithm for multivariable wastewater treatment process. Math. Probl. Eng.
2014 (2014)
17. Samsudin SI, Rahmat MF, Wahab NA, Razali MC, Gaya MS, Salim SNS (2014)
Improvement of activated sludge process using enhanced nonlinear PI controller. Arab J
Sci Eng 39:6575–6586
18. Holenda B, Domokos E, Fazakas J (2008) Dissolved oxygen control of the activated sludge
wastewater treatment process using model predictive control. Comput Chem Eng 32:1270–
1278
19. Akyurek E, Yuceer M, Atasoy I (2009) Comparison of control strategies for dissolved oxygen
control in activated sludge wastewater. Elsevier B.V.
20. Han H-G, Qiao J-F, Chen Q-L (2012) Model predictive control of dissolved oxygen
concentration based on a self-organizing RBF neural network. Control Eng Pract 20:465–476
21. Wahab NA, Katebi R, Balderud J, Rahmat MF (2011) Data-driven adaptive model-based
predictive control with application in wastewater systems. IET Control Theory Appl 5:803–
812
22. Cristea MV, Agachi SP (2006) Nonlinear model predictive control of the wastewater
treatment plant, pp 1365–1370
23. Du X, Wang J, Jegatheesan V, Shi G (2018) Dissolved oxygen control in activated sludge
process using a neural network-based adaptive PID algorithm. Appl Sci 8:261
24. Han HG, Qian HH, Qiao JF (2014) Nonlinear multiobjective model-predictive control scheme
for wastewater treatment process. J Process Control 24:47–59
25. Shi X, Qiao J (2010) Neural network predictive optimal control for wastewater treatment. In:
Proceedings of the 2010 international conference on intelligent control and information
processing, ICICIP 2010, pp 248–252
26. Han H-G, Qiao J-F (2011) Adaptive dissolved oxygen control based on dynamic structure
neural network. Appl Soft Comput 11:3812–3820
27. Fu W, Qiao J, Han G, Meng X (2015) Dissolved oxygen control system based on the T-S
fuzzy neural network
28. Uprety K, Kennedy A, Balzer W, Baumler R, Duke R, Bott C (2015) Implementation of
ammonia-based aeration control (ABAC) at full-scale wastewater treatment plants. In:
Proceedings of the water environment federation 2015, pp 1–10

29. Santin I, Pedret C, Meneses M, Vilanova R (2015) Process based control architecture for
avoiding effluent pollutants quality limits violations in wastewater treatment plants. In: 2015
19th international conference on system theory, control and computing (ICSTCC). IEEE,
pp 396–402
30. Santin I, Pedret C, Meneses M, Vilanova R (2015) Artificial neural network for nitrogen and
ammonia effluent limit violations risk detection in wastewater treatment plants. In: 2015 19th
international conference on system theory, control and computing, ICSTCC 2015, joint
conference SINTES 19, SACCS 15, SIMSIS 19, pp 589–594
31. Hagan MT, Demuth HB, Jesús ODE (1966) An introduction to the use of neural networks in
control systems. Endeavour 25:58
32. Qiao JF, Han G, Han HG (2014) Neural network on-line modeling and controlling method for
multi-variable control of wastewater treatment processes. Asian J Control 16:1213–1223
A Min-conflict Algorithm for Power
Scheduling Problem in a Smart Home
Using Battery

Sharif Naser Makhadmeh, Ahamad Tajudin Khader,


Mohammed Azmi Al-Betar, Syibrah Naim, Zaid Abdi Alkareem Alyasseri,
and Ammar Kamal Abasi

Abstract Scheduling the operations of smart home appliances under an electricity pricing
scheme is a primary issue facing power supplier companies and their users,
owing to the role of scheduling in maintaining the power system and reducing the electricity
bill (EB) for users. This problem is known as the power scheduling problem in a smart
home (PSPSH). PSPSH can be addressed by shifting appliance operation times from
one period to another. The primary objectives of addressing PSPSH are minimizing EB,
balancing power demand by reducing the peak-to-average ratio (PAR), and maximizing
the satisfaction level of users. One of the most popular heuristic algorithms, known
as the min-conflict algorithm (MCA), is adapted in this paper to address PSPSH. A
smart home battery (SHB) is used as an additional source in an attempt to enhance the
schedule. The experimental results showed the robust performance of the proposed
MCA with SHB in achieving the PSPSH objectives. In addition, MCA is compared
with Biogeography Based Optimization (BBO) to evaluate its results. The
comparison showed that MCA obtained a better schedule in terms of reducing EB and
PAR, while BBO performed better in improving user comfort.

Keywords Optimization · Min-conflict algorithm · Power scheduling problem in


a smart home · Smart home battery

1 Introduction

Power demand is increasing over time due to the continuous growth of the population
and the emergence of new smart home appliance technologies that need more power to

S. N. Makhadmeh (B) · A. T. Khader · S. Naim · Z. A. A. Alyasseri · A. K. Abasi


School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia
e-mail: [email protected]
M. A. Al-Betar
Department of Information Technology, Al-Huson University College, Al-Balqa Applied
University, Irbid, Jordan
Z. A. A. Alyasseri
ECE Department, Faculty of Engineering, University of Kufa, Najaf, Iraq
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_33

be operated [15]. Accordingly, old power grids face several issues regarding the
stability of the power system in meeting this massive increase in power demand.
In addition, old power grids cannot accommodate more power generators to meet
power demands due to the primitive nature of their architecture [16, 23, 28].
Smart Grids (SGs) have been developed to address such issues, and they are considered
the next generation of the old power grids. The communication system is the primary
system used in SGs, providing two-way communication between users and
power supplier companies (PSCs) to enhance the distribution and power systems. This
enhancement allows PSCs to distribute more power to users and meet their power
needs.
SGs allow users to manage their power consumption using demand response
(DR) programs. DR provides several programs that motivate users to modify and
balance the appliance power consumption curve in order to maintain the stability of
the power system [20]. DR is categorized into incentive-based programs and dynamic
pricing programs [22]. Dynamic pricing programs provide different electricity prices
over a time range, with high tariffs at peak periods and low tariffs at off-peak
periods. These programs motivate users to schedule appliance operating times
at off-peak periods.
The problem of scheduling smart home appliance operation times at suitable periods
according to dynamic pricing programs is known as the power scheduling problem
in a smart home (PSPSH). PSPSH has been formulated as a scheduling optimization
problem which aims to minimize the electricity bill (EB), balance power demand by
minimizing the ratio between the highest and the average power demand, known as
the peak-to-average ratio (PAR), and maximize the satisfaction level of users.
PSPSH has been addressed in several studies using different optimization algorithms,
including exact and metaheuristic optimization algorithms. Metaheuristic optimization
algorithms are the most popular for handling PSPSH due to their ability
to efficiently explore large and rugged search spaces. In addition, metaheuristic
optimization algorithms have proved their efficiency in several domains, such as power
scheduling [23–26], text feature selection [1–3], authentication [11–13], gene selection
[9, 10], and other domains [6–8]. In contrast, most metaheuristic optimization
algorithms are not able to search locally in the search space efficiently [5]. Therefore,
heuristic optimization algorithms are found to be more efficient than metaheuristic
optimization algorithms in searching for an optimal solution locally in the search space,
due to their concentration on only one solution.
In this paper, one of the most popular and efficient heuristic optimization algorithms,
which has never been used in the power scheduling domain, known as the min-conflict
algorithm (MCA), is adapted to address PSPSH. In addition, a smart home battery
(SHB) is formulated to improve the quality of solutions by storing power at low-pricing
periods and discharging the stored power at high-pricing periods. The dataset used
to evaluate the approaches in [24] and [26] is adopted in the evaluation process of
the proposed approach. The performance of the proposed approach is evaluated and
compared with another approach proposed in [26].
The remainder of this paper is organized as follows. The most important studies
that have addressed PSPSH are presented in Sect. 2. The PSPSH formulation is discussed in

Sect. 3. Section 4 describes MCA and its adaptation to address PSPSH. In Sect. 5, the
simulation results of the proposed method are presented and illustrated, and Sect. 6
concludes the paper.

2 Related Work

Several optimization algorithms have been adapted to address PSPSH, including


exact and metaheuristic optimization algorithms. Metaheuristic optimization algo-
rithms are more popular than exact algorithms in addressing PSPSH. Some of the
studies that use metaheuristic optimization algorithms are discussed in this section.
The authors of [30] formulated PSPSH as a multi-objective optimization problem.
The multi-objective function of PSPSH was formulated to reduce the EB and the user
discomfort level. A genetic algorithm (GA), a binary particle swarm optimization (BPSO)
algorithm, and an ant colony optimization (ACO) algorithm were adapted to schedule 13 home
appliances within one day. GA outperformed ACO and BPSO in achieving the PSPSH
objectives.
In [31], PSPSH was formulated as a multi-objective optimization problem to
optimize EB and the user comfort level simultaneously. Two dynamic pricing programs
were combined to balance power demand and maintain system stability. GA was
adapted to address PSPSH using 16 appliance operations over 90 days. The results
prove the efficiency of the proposed approach in reducing EB and improving the user comfort
level.
Biogeography Based Optimization (BBO) and GA were adapted to address PSPSH
in [20]. A dynamic pricing program, namely time-of-use pricing, was used to schedule
the operations of 12 appliances within one day. The simulation results showed the high
performance of BBO, which performed better than GA in searching for an optimal
schedule.
In [4], GA and the Flower Pollination Algorithm (FPA) were adapted to address
PSPSH. Sixteen appliances were used to evaluate the algorithms in terms of reducing
EB and PAR and improving the user comfort level in accordance with a dynamic
pricing program known as real-time pricing (RTP). In the simulation results, FPA
performed better than GA in reducing EB and PAR, whereas GA performed better than
FPA in improving the comfort level.
The authors of [18] adapted the harmony search algorithm (HSA) and the BAT algorithm to
obtain a near-optimal schedule for 11 appliances. Critical peak pricing was used as
the dynamic pricing program in the simulations. In the simulation results, HSA produced a
better schedule than BAT and performed better in balancing the power consumed through
the time horizon.
The authors of [26] adapted the PSO algorithm in an attempt to obtain an optimal
schedule for 36 appliance operations using a smart battery. RTP was used as the dynamic
pricing program in the simulations. PSO was compared with GA to evaluate its performance,
and PSO produced a better schedule than GA both with and without the smart battery.

Note that heuristic algorithms have not previously been used or adapted to
address PSPSH. Therefore, one of the most popular heuristic algorithms proposed
for solving scheduling problems, known as the min-conflict algorithm, is adapted in this
paper.

3 Problem Formulation

PSPSH can be addressed by scheduling appliance operations at specific periods in
accordance with dynamic pricing program(s). The primary objectives of addressing
PSPSH are minimizing EB, PAR, and the user discomfort level.
In this section, the PSPSH objectives are illustrated and formulated mathematically. In
addition, an SHB is formulated to improve the quality of solution(s) and obtain a more
suitable schedule. The RTP program is used as the dynamic pricing program and is combined
with the inclining block rate (IBR) due to the efficiency of IBR in balancing power demand
and reducing the PAR value [23].

3.1 PSPSH Objectives Formulation

Minimizing EB is the essential objective of PSPSH due to its importance in motivating
users to reschedule their appliance operations. EB is mathematically formulated in
Eq. (1):
$$\mathrm{Cost} = \sum_{i=1}^{S}\sum_{j=1}^{T} P_i^{\,j} \times pc^{\,j} \qquad (1)$$

where S is the maximum number of appliances in the home, T denotes the maximum number
of time slots, and P_i^j is the power consumed at time slot j by appliance i. pc^j is the electricity
tariff at time slot j. In the proposed approach, RTP is combined with the IBR program;
therefore, pc^j has two tariffs based on the amount of power consumed, as follows:

$$pc^{\,j} = \begin{cases} a_j & \text{if } 0 \le P^{\,j} \le C\\ b_j & \text{if } P^{\,j} > C \end{cases} \qquad (2)$$

$$b_j = \lambda \times a_j \qquad (3)$$

where P^j denotes the total power consumption of all appliances at time slot j, C is the
threshold of power consumed, λ is a positive number, a_j denotes the normal price at j, and b_j is
the high price at j.
PAR is the second objective of addressing PSPSH, which is related to balancing the
overall power consumed. PAR is formulated in Eq. (4):

$$PAR = \frac{P_{max}}{P_{avg}} \qquad (4)$$

where P_max denotes the maximum power consumed and P_avg is the average overall power
consumed.
The user comfort level can be improved by reducing the waiting time rate (WTR) of
appliances, because users always prefer to finish appliance operations as soon as
possible. WTR is formulated as follows:

$$WTR_i = \frac{st_i - OTPs_i}{OTPe_i - OTPs_i - l_i}, \quad \forall i \in S \qquad (5)$$

where WTR_i denotes the WTR for appliance i, st_i is the starting operation time of appliance i,
OTPs_i and OTPe_i are the beginning and ending of the allowable period for appliance i to be
scheduled, respectively, and l_i is the length of the operation cycle of appliance i. The average
WTR for all appliances is calculated as follows:
$$WTR_{avg} = \frac{\sum_{i=1}^{m}\left(st_i - OTPs_i\right)}{\sum_{i=1}^{m}\left(OTPe_i - OTPs_i - l_i\right)} \qquad (6)$$

The components of WTR_avg are presented and illustrated in Fig. 1.

In this study, the percentage of satisfaction (comfort) of users (UC_p) is calculated
based on WTR as follows:

$$UC_p = \left(1 - WTR_{avg}\right) \times 100\% \qquad (7)$$
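To make Eqs. (1)–(7) concrete, the following sketch evaluates the electricity bill under the combined RTP/IBR tariff, the PAR, and the comfort percentage for a candidate schedule. The 1-minute slot encoding, the appliance dictionary fields, and the example tariff values are illustrative assumptions.

```python
# Illustrative evaluation of the PSPSH objectives (Eqs. 1-7) for a candidate schedule.
import numpy as np

T = 1440   # number of 1-minute time slots

def bill(power, rtp, C, lam):
    """Eqs. (1)-(3): tariff is rtp[j] unless total load exceeds C, then lam*rtp[j]."""
    total = power.sum(axis=0)                        # P^j: total load per slot
    tariff = np.where(total > C, lam * rtp, rtp)     # IBR switch, Eqs. (2)-(3)
    return float((total * tariff).sum())

def par(power):
    total = power.sum(axis=0)
    return float(total.max() / total.mean())         # Eq. (4)

def wtr_avg(starts, appliances):
    """Eq. (6): starts[i] is the scheduled st_i; each appliance holds OTPs, OTPe, l."""
    num = sum(st - a["OTPs"] for st, a in zip(starts, appliances))
    den = sum(a["OTPe"] - a["OTPs"] - a["l"] for a in appliances)
    return num / den

def comfort(starts, appliances):
    return (1.0 - wtr_avg(starts, appliances)) * 100.0   # Eq. (7)

# Tiny example with two appliances and a flat placeholder tariff (all values assumed):
appliances = [{"OTPs": 540, "OTPe": 780, "l": 105, "kW": 0.6},
              {"OTPs": 1, "OTPe": 120, "l": 30, "kW": 1.0}]
starts = [600, 10]
power = np.zeros((len(appliances), T))
for i, (st, a) in enumerate(zip(starts, appliances)):
    power[i, st:st + a["l"]] = a["kW"] / 60.0        # kWh consumed per 1-min slot
rtp = np.full(T, 2.0)                                # cent/kWh, placeholder tariff
print(bill(power, rtp, C=0.0333, lam=1.543), par(power), comfort(starts, appliances))
```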

3.2 Smart Home Battery (SHB)

The SHB contains a system known as a battery management system, which allows it to
charge and discharge automatically based on predefined constraints. In this section,
the SHB is formulated to enhance the quality of solution(s) and to attempt to achieve the PSPSH
objectives optimally. The proposed SHB can efficiently reduce the power consumed at

Fig. 1 Illustration of the components in Eq. 6

peak periods, as it is formulated to store power at off-peak periods and discharge
the stored power at peak periods.
The proposed SHB stores power at low-pricing periods if it is not completely
charged, and discharges at high-pricing periods if it is not empty. In addition, the power
consumed by the charging operation should not exceed C. The charging and discharging
states of the SHB are formulated as follows:

1 if pc j ≤ pcavg and N S H B = 0 and P j < C
XSH B = (8)
0 if pc j > pcavg and C HS H B > 0

where X_SHB is the state of the SHB, with 1 denoting the charging mode and 0 the
discharging mode. The power charged and discharged at each time slot should not
exceed a maximum allowable limit. pc_avg is the average tariff over all time slots, CH_SHB
is the total power charged in the SHB, and N_SHB is the power needed for the SHB to be full,
formulated as follows:

$$N_{SHB} = C_{SHB} - CH_{SHB} \qquad (9)$$

where C_SHB is the capacity of the SHB.
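The charging/discharging rule of Eqs. (8)–(9) can be sketched as follows; the capacity and power limit match the values reported later in Sect. 5.2 (13.5 kWh and 5 kW), and reading the charging condition as "the battery still needs power" (N_SHB > 0) is an assumption made to match the verbal description above.

```python
# Sketch of the SHB decision rule of Eqs. (8)-(9).
C_SHB = 13.5    # kWh, usable capacity (Sect. 5.2)
P_LIMIT = 5.0   # kW, maximum charge/discharge rate (Sect. 5.2)

def shb_step(pc_j, pc_avg, p_j, C, charged_kwh, dt_hours):
    """Return the new stored energy and the power exchanged in one slot
    (positive = charging, negative = discharging)."""
    needed = C_SHB - charged_kwh                               # N_SHB, Eq. (9)
    if pc_j <= pc_avg and needed > 0 and p_j < C:              # charge at cheap, non-peak slots
        p = min(P_LIMIT, needed / dt_hours)
    elif pc_j > pc_avg and charged_kwh > 0:                    # discharge at expensive slots
        p = -min(P_LIMIT, charged_kwh / dt_hours)
    else:
        p = 0.0
    return charged_kwh + p * dt_hours, p

print(shb_step(pc_j=1.5, pc_avg=2.0, p_j=0.01, C=0.0333, charged_kwh=4.0, dt_hours=1 / 60))
```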

4 Min-conflict Heuristic Algorithm (MCA) for PSPSH

MCA is one of the most popular heuristic optimization algorithms proposed to
address scheduling problems, owing to its simplicity and speed [14]. MCA has been adapted
to address different problems such as scheduling sensor resources [19], job shop
scheduling [21], and n-queens [27].
In PSPSH, an MCA solution contains a vector of the appliances' starting operation
times (st). MCA for PSPSH starts by initializing the PSPSH and SHB parameters,
then initializing the solution vector, as shown in steps 1 and 2 of Algorithm 1. Note
that MCA is a local search algorithm, and its population consists of only one solution
vector of size S × 1. In the third step, the solution is updated by choosing an appliance
randomly, calculating its operation cost at each time slot, and then updating its st
so that it operates at the time slot with the least cost. Recall that each appliance should be
operated while respecting several constraints, such as OTPs, OTPe, and l (see Fig. 1);
therefore, these constraints should be considered during the updating step. In step 4,
the allowable periods and the power that the SHB can charge and discharge are determined
by calculating the power consumed by each appliance (see step 4 of Algorithm 1). Steps
3 and 4 are repeated until the maximum number of iterations is reached, as shown in step 5 of
Algorithm 1.

Algorithm 1. Pseudo code of MCA for PSPSH using SHB

//Step 1:
Initialize the PSPSH parameters
//Step 2:
Initialize the MCA population of size (S × 1)
//Step 3:
while (k < maximum number of iterations) do
Choose an appliance randomly
Calculate the appliance operation cost at each time slot while respecting its OTPs, OTPe, and l
Update the appliance starting time so that it operates at the time slot with the least cost
//Step 4:
Calculate the power consumed by each appliance
Determine the allowable periods and the power that the SHB can charge and discharge
Operate the SHB
Calculate the fitness value of the solution
//Step 5:
k = k + 1
Is the maximum number of iterations reached?
end while
Return fitness value;
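A compact sketch of the core update in step 3 of Algorithm 1 is given below: a randomly chosen appliance is moved to the feasible starting slot with the lowest operation cost. For brevity, the cost here prices only that appliance's own consumption under a given tariff (it ignores the IBR switching and the PAR coupling with the other appliances), so it is a simplification rather than the authors' exact implementation.

```python
# Simplified min-conflict update for PSPSH: move one appliance to its cheapest feasible start.
import random
import numpy as np

def operation_cost(start, appliance, rtp):
    """Cost of running the appliance for l consecutive 1-minute slots from `start`."""
    l, kw = appliance["l"], appliance["kW"]
    return float(rtp[start:start + l].sum() * kw / 60.0)

def min_conflict_step(starts, appliances, rtp, rng=random):
    i = rng.randrange(len(appliances))                      # choose an appliance at random
    a = appliances[i]
    candidates = range(a["OTPs"], a["OTPe"] - a["l"] + 1)   # respect OTPs, OTPe and l (0-based slots)
    starts[i] = min(candidates, key=lambda s: operation_cost(s, a, rtp))
    return starts

# Example with a tariff that is cheap in the second half of the day (assumed data):
rtp = np.concatenate([np.full(720, 3.0), np.full(720, 1.0)])
appliances = [{"OTPs": 540, "OTPe": 780, "l": 105, "kW": 0.6}]
print(min_conflict_step([540], appliances, rtp))
```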

5 Experiments and Results

This section provides the experimental results together with their discussion and illustration. It
begins with a description of the dataset used to evaluate the proposed
approach. The effects of the SHB on the scheduling process and its enhancement are presented
as well. In addition, the adapted MCA is compared with BBO to assess its performance.
The simulations are executed using MATLAB on a PC with 8 GB of memory
(RAM) and an Intel Core2 Quad CPU running at 2.66 GHz.

5.1 Dataset: Dynamic Pricing Program

In this study, the time horizon contains 24 h divided into 1440 slots, where
each slot equals 1 min. RTP is considered as the dynamic pricing program, using
the pricing curve of the 1st of June 2016 adopted from the Commonwealth Edison
Company [17]. The RTP curve used is presented in Fig. 2.
As mentioned previously, RTP is combined with IBR to disperse the power consumed
and maintain the stability of the power system. The IBR has two parameters,
C and λ (see Eq. 2). The values of these parameters are set to 0.0333 for each
slot and 1.543, respectively [24, 26].

Fig. 2 RTP curve of the 1st of June 2016

5.2 Dataset: Smart Home Appliances

Generally, appliances can be operated several times in a time horizon. Therefore,
36 operations of nine appliances are used in the evaluation. The primary
parameters of these operations are presented in Table 1.

Table 1 Parameters of appliances used in the experiments


No. Appliance l OTPs–OTPe Power (kW) No. Appliance l OTPs–OTPe Power (kW)
1 Dishwasher 105 540–780 0.6 19 Dehumidifier 30 1–120 0.05
2 Dishwasher 105 840–1080 0.6 20 Dehumidifier 30 120–240 0.05
3 Dishwasher 105 1200–1440 0.6 21 Dehumidifier 30 240–360 0.05
4 Air conditioner 30 1–120 1 22 Dehumidifier 30 360–480 0.05
5 Air conditioner 30 120–240 1 23 Dehumidifier 30 480–600 0.05
6 Air conditioner 30 240–360 1 24 Dehumidifier 30 600–720 0.05
7 Air conditioner 30 360–480 1 25 Dehumidifier 30 720–840 0.05
8 Air conditioner 30 480–600 1 26 Dehumidifier 30 840–960 0.05
9 Air conditioner 30 600–720 1 27 Dehumidifier 30 960–1080 0.05
10 Air conditioner 30 720–840 1 28 Dehumidifier 30 1080–1200 0.05
11 Air conditioner 30 840–960 1 29 Dehumidifier 30 1200–1320 0.05
12 Air conditioner 30 960–1080 1 30 Dehumidifier 30 1320–1440 0.05
13 Air conditioner 30 1080–1200 1 31 Electric Water Heater 35 300–420 1.5
14 Air conditioner 30 1200–1320 1 32 Electric Water Heater 35 1100–1440 1.5
15 Air conditioner 30 1320–1440 1 33 Coffee Maker 10 300–450 0.8
16 Washing machine 55 60–300 0.38 34 Coffee Maker 10 1020–1140 0.8
17 Clothes dryer 60 300–480 0.8 35 Robotic Pool Filter 180 1–540 0.54
18 Refrigerator 1440 1–1440 0.5 36 Robotic Pool Filter 180 900–1440 0.54

Fig. 3 EB using MCA with and without SHB

For SHB, the usable C S H B is 13.5 kWh and the maximum allowable limit to
charge and discharge is 5 kW [29].
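For illustration, a few rows of Table 1 can be encoded directly as data for the objective formulation of Sect. 3; the field names and the example starting times are arbitrary choices, not notation from the paper.

```python
# A few appliance operations from Table 1 encoded as plain Python data (field names are arbitrary).
appliances = [
    {"name": "Dishwasher",            "l": 105,  "OTPs": 540, "OTPe": 780,  "kW": 0.6},
    {"name": "Air conditioner",       "l": 30,   "OTPs": 1,   "OTPe": 120,  "kW": 1.0},
    {"name": "Refrigerator",          "l": 1440, "OTPs": 1,   "OTPe": 1440, "kW": 0.5},
    {"name": "Electric Water Heater", "l": 35,   "OTPs": 300, "OTPe": 420,  "kW": 1.5},
]

# A candidate schedule is a list of starting slots, one per operation, each kept
# inside its allowable period (OTPs and OTPe treated as inclusive slot indices).
starts = [600, 10, 1, 330]
for st, a in zip(starts, appliances):
    assert a["OTPs"] <= st and st + a["l"] - 1 <= a["OTPe"], a["name"]
print("feasible schedule:", starts)
```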

5.3 The Enhancement of SHB

In this section, the efficiency of the SHB in attaining the PSPSH objectives is examined and
evaluated using MCA. The results with and without the SHB are compared to show
whether the SHB can improve the quality of the schedule. Figure 3 presents the EB obtained
by MCA with and without considering the SHB in the scheduling process. The EB is reduced
from 44.79 cents in the unscheduled mode (i.e., a random schedule) to 41.12 cents
and 28.85 cents using MCA and MCA with SHB, respectively. The results show the
capability of the SHB in improving the quality of the schedule and reducing the EB.
In terms of PAR reduction, the PAR value is reduced from 3.32 in the unscheduled
mode to 2.53 using MCA and 2.60 using MCA with SHB, as shown in Fig. 4. The
results show that MCA without SHB obtained a better PAR value than MCA with
SHB. These results arise from the SHB operation, which stores and consumes
power only at low-pricing periods, thereby increasing the power consumed at these periods
and increasing the value of Pmax (see Eq. 4).
As discussed, the percentage of user comfort could be improved by reducing
the WTR value, because users always prefer to finish appliance operations as soon
as possible. The proposed SHB reduced the WTR and enhanced the user comfort level
significantly, where the WTR value is reduced from 0.4615 in the unscheduled mode
to 0.3581 and 0.3368 using MCA and MCA with SHB, respectively, as shown in
Fig. 5. The percentage of user comfort is 53.85%, 64.19%, and 66.32% using

Fig. 4 PAR using MCA with and without SHB

Fig. 5 WTR with and without SHB

the unscheduled mode, MCA, and MCA with SHB, respectively. The results prove the efficiency of
the proposed MCA with SHB in reducing the waiting time of appliances and improving
the user comfort level.

Table 2 Comparison between MCA and BBO

                 BBO                          MCA
                 EB      PAR    WTR           EB      PAR    WTR
Without SHB      42.46   2.64   0.3534        41.12   2.53   0.3581
With SHB         28.95   2.60   0.3352        28.85   2.60   0.3368

5.4 Comparison Study Between MCA and BBO

This section presents a comparison between the adapted MCA and the BBO algorithm.
This comparison study is provided to show the results of MCA against BBO and to
evaluate its performance.
The results obtained by MCA and BBO with and without the SHB are compared in
Table 2. The table shows the robust performance of MCA in reducing EB and PAR,
where it obtained better results than BBO in terms of reducing EB and PAR, whereas
BBO performed better than MCA in improving the user comfort level.

6 Conclusion and Future Work

PSPSH is a primary issue facing power supplier companies and their users, owing to
the role of scheduling in maintaining the power system and reducing the EB for users.
PSPSH can be addressed by shifting appliance operation times from one period to another
according to a time horizon and a dynamic pricing program. The primary objectives of
addressing PSPSH are minimizing EB and PAR, and maximizing the satisfaction level
of users.
In this paper, MCA is adapted to address PSPSH according to a time horizon
divided into 1440 time slots and the RTP program. The RTP is combined with the IBR
program to efficiently balance power demand through the time horizon. The SHB is
formulated and used as an additional source in an attempt to enhance the quality of the solution.
In the simulation results, the schedule using the SHB is compared with the schedule
without considering the SHB. The SHB proved its efficiency in enhancing the schedule in
terms of EB and WTR, where MCA using the SHB reduces EB and WTR by up to 29.8%
and 6%, respectively, compared with MCA without the SHB. However, MCA without the SHB
obtains a better schedule than MCA with the SHB in terms of reducing PAR. In addition,
MCA is compared with BBO to evaluate its results. The comparison showed
that MCA obtained a better schedule in terms of reducing EB and PAR, whereas BBO
performed better in improving the user comfort level.
In the future, different datasets can be considered in the scheduling process to
evaluate MCA and the SHB more thoroughly. Besides, renewable energy sources can be integrated
with the proposed SHB to improve the quality of the schedule.

Acknowledgments This work has been partially funded by Universiti Sains Malaysia under Grant
1001/PKOMP/8014016.

References

1. Abasi AK, Khader AT, Al-Betar MA, Naim S, Makhadmeh SN, Alyasseri ZAA (2019) Link-
based multi-verse optimizer for text documents clustering. Appl Soft Comput 87:1–36
2. Abasi AK, Khader AT, Al-Betar MA, Naim S, Makhadmeh SN, Alyasseri ZAA (2019) A
text feature selection technique based on binary multi-verse optimizer for text clustering. In:
2019 IEEE Jordan international joint conference on electrical engineering and information
technology (JEEIT). IEEE, pp 1–6
3. Abasi AK, Khader AT, Al-Betar MA, Naim S, Makhadmeh SN, Alyasseri ZAA (2020) An
improved text feature selection for clustering using binary grey wolf optimizer. In: Proceed-
ings of the 11th national technical seminar on unmanned system technology 2019. Springer,
Heidelberg, pp 1–13
4. Abbasi BZ, Javaid S, Bibi S, Khan M, Malik MN, Butt AA, Javaid N (2017) Demand side
management in smart grid by using flower pollination algorithm and genetic algorithm. In:
International conference on P2P, parallel, grid, cloud and internet computing. Springer, Hei-
delberg, pp 424–436
5. Al-Betar MA (2017) β-hill climbing: an exploratory local search. Neural Comput Appl
28(1):153–168
6. Al-Betar MA, Alyasseri ZAA, Khader AT, Bolaji AL, Awadallah MA (2016) Gray image
enhancement using harmony search. Int J Comput Intell Syst 9(5):932–944
7. Al-Betar MA, Awadallah MA, Bolaji AL, Alijla BO (2017) β-hill climbing algorithm for
sudoku game. In: 2017 Palestinian international conference on information and communication
technology (PICICT). IEEE, pp 84–88
8. Al-Betar MA, Khader AT (2012) A harmony search algorithm for university course timetabling.
Ann Oper Res 194(1):3–31
9. Alomari OA, Khader AT, Al-Betar MA, Abualigah LM (2017) Gene selection for cancer classi-
fication by combining minimum redundancy maximum relevancy and bat-inspired algorithm.
Int J Data Min Bioinform 19(1):32–51
10. Alomari OA, Khader AT, Al-Betar MA, Alyasseri ZAA (2018) A hybrid filter-wrapper gene
selection method for cancer classification. In: 2018 2nd international conference on biosignal
analysis, processing and systems (ICBAPS). IEEE, pp 113–118
11. Alyasseri ZAA, Khader AT, Al-Betar MA, Papa JP, Ahmad Alomari O (2018) EEG-based per-
son authentication using multi-objective flower pollination algorithm. In: 2018 IEEE congress
on evolutionary computation (CEC). IEEE, pp 1–8
12. Alyasseri ZAA, Khader AT, Al-Betar MA, Papa JP, Alomari OA, Makhadme SN (2018) An
efficient optimization technique of EEG decomposition for user authentication system. In: 2018
2nd international conference on biosignal analysis, processing and systems (ICBAPS). IEEE,
pp 1–6
13. Alyasseri ZAA, Khader AT, Al-Betar MA, Papa JP, Alomari OA, Makhadmeh SN (2018)
Classification of EEG mental tasks using multi-objective flower pollination algorithm for person
identification. Int J Integr Eng 10(7) (2018)
14. Bouhouch A, Loqman C, El Qadi A (2019) CHN and min-conflict heuristic to solve scheduling
meeting problems. In: Bioinspired heuristics for optimization. Springer, Heidelberg, pp 171–
184
15. Briefing US (2013) International energy outlook 2013. US Energy Information Administration
16. Colak I, Kabalci E, Fulli G, Lazarou S (2015) A survey on the contributions of power electronics
to smart grid systems. Renew Sustain Energy Rev 47:562–579
17. ComED Company (2017). https://hourlypricing.comed.com/live-prices/

18. Farooqi M, Awais M, Abdeen ZU, Batool S, Amjad Z, Javaid N (2017) Demand side man-
agement using harmony search algorithm and bat algorithm. In: International conference on
intelligent networking and collaborative systems. Springer, Heidelberg, pp 191–202
19. Gage A, Murphy RR (2004) Sensor scheduling in mobile robots using incomplete information
via min-conflict with happiness. IEEE Trans Syst Man Cybern Part B (Cybern) 34(1):454–467
20. Iftikhar H, Asif S, Maroof R, Ambreen K, Khan HN, Javaid N (2014) Biogeography based opti-
mization for home energy management in smart grid. In: International conference on network-
based information systems. Springer, Heidelberg, pp 177–190
21. Johnston M, Minton S et al (1994) Analyzing a heuristic strategy for constraint satisfaction
and scheduling. Intell Sched 257–289
22. Khan AR, Mahmood A, Safdar A, Khan ZA, Khan NA (2016) Load forecasting, dynamic
pricing and dsm in smart grid: a review. Renew Sustain Energy Rev 54:1311–1322
23. Makhadmeh SN, Khader AT, Al-Betar MA, Naim S (2018) Multi-objective power scheduling
problem in smart homes using grey wolf optimiser. J Ambient Intell Hum Comput 1–25
24. Makhadmeh SN, Khader AT, Al-Betar MA, Naim S (2018) An optimal power scheduling for
smart home appliances with smart battery using grey wolf optimizer, pp 1–6
25. Makhadmeh SN, Khader AT, Al-Betar MA, Naim S, Abasi AK, Alyasseri ZAA (2019) Opti-
mization methods for power scheduling problems in smart home: survey. Renew Sustain Energy
Rev 115:109362
26. Makhadmeh SN, Khader AT, Al-Betar MA, Naim S, Alyasseri ZAA, Abasi AK (2019) Particle
swarm optimization algorithm for power scheduling problem using smart battery. In: 2019 IEEE
Jordan international joint conference on electrical engineering and information technology
(JEEIT). IEEE, pp 672–677
27. Minton S, Johnston MD, Philips AB, Laird P (1992) Minimizing conflicts: a heuristic repair
method for constraint satisfaction and scheduling problems. Artif Intell 58(1–3):161–205
28. Nexans (2010) Deploying a smarter grid through cable solutions and services. http://www.
nexans.com/Corporate/2010/WHITEPAPERSMARTGRIDS2010.pdf
29. Powerwall T (2018). https://www.tesla.com/powerwall
30. Rahim S, Javaid N, Ahmad A, Khan SA, Khan ZA, Alrajeh N, Qasim U (2016) Exploiting
heuristic algorithms to efficiently utilize energy management controllers with renewable energy
sources. Energy Build 129:452–470
31. Zhao Z, Lee WC, Shin Y, Song KB (2013) An optimal power scheduling method for demand
response in home energy management system. IEEE Trans Smart Grid 4(3):1391–1400
An Improved Text Feature Selection
for Clustering Using Binary Grey Wolf
Optimizer

Ammar Kamal Abasi, Ahamad Tajudin Khader, Mohammed Azmi Al-Betar,


Syibrah Naim, Sharif Naser Makhadmeh, and Zaid Abdi Alkareem Alyasseri

Abstract Text feature selection (FS) is a significant step in text clustering (TC).
Machine learning applications eliminate unnecessary features in order to enhance
learning effectiveness. This work proposes a binary grey wolf optimizer (BGWO)
algorithm to tackle the text FS problem. This method introduces a new implementation
of the GWO algorithm by selecting informative features from the text. These
informative features are evaluated using a clustering technique (i.e., k-means) so
that the time complexity is reduced and the efficiency of the clustering algorithm is improved.
The performance of BGWO is examined on six published datasets, including Tr41,
Tr12, Wap, Classic4, 20Newsgroups, and CSTR. The results showed that BGWO
outperformed the other compared algorithms, such as GA and BPSO, based
on the evaluation measures. The experiments also showed that the BGWO
method could achieve an average purity of 46.29% and an F-measure of 42.23%.

Keywords Binary grey wolf optimizer · Text mining · K-means · Text feature
selection problem · Text clustering

1 Introduction

The number of digital documents is increasing enormously day by day due to the
proliferation of the internet, and these documents can no longer be examined by humans
alone [3]. Therefore, text mining tools can assist in addressing this issue. Automatic
systems, which are not affected by the text explosion, can replace the human reader.
Text mining examines massive document collections to detect previously unknown information. Text

A. K. Abasi (B) · A. T. Khader · S. Naim · S. N. Makhadmeh · Z. A. A. Alyasseri


School of Computer Sciences, Universiti Sains Malaysia, Gelugor, Penang, Malaysia
e-mail: [email protected]
M. A. Al-Betar
Department of Information Technology, Al-Huson University College, Al-Balqa Applied
University, Irbid, Jordan
Z. A. A. Alyasseri
ECE Department-Faculty of Engineering, University of Kufa, Najaf, Iraq
© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_34

document clustering (TDC) is, among other techniques, an effective method used
in the fields of text mining, topic extraction, machine learning, text summarization,
and pattern recognition [16]. An efficient TDC technique allows the automatic
classification of a corpus of documents into semantic cluster hierarchies. It is the
method through which documents are organized into a meaningful classification,
meaning that the documents within a cluster are closer to each other than to the documents of
different clusters [11].
The application of TDC algorithms requires the conversion of raw text files (i.e.,
terms) into numerical formats with document characteristics. Document representation is the most
fundamental stage for obtaining trends and ideas from the documents [17]. In
TDC, the Vector Space Model (VSM) is commonly utilized to represent the documents,
and the terms represent the features/dimensions in the VSM [29].
The conversion process results in a huge number of dimensional features, some
informative and others uninformative, in other words, irrelevant, redundant, and
noisy [12]. The main informative features of the documents are determined by FS. However,
the high-dimensional space represents the key difficulty. The problem is to
remove non-informative features in order to reduce the dimension space and
improve the clustering performance [18]. It is a fact that hundreds of thousands of
textual features are part of a text collection. The document dimensionality
determines the efficiency of TDC. Figure 1 shows the overall steps of TDC.
The FS techniques fall into three categories, the filter method, the wrapper
method, and the hybrid method, based on how the studies obtain
an informative sub-ensemble of features. The filter method examines the feature
set using statistical methods so that a discriminative subset of features is chosen

Fig. 1 Text clustering steps.



regardless of the machine learning algorithm. These include mean-median [15], mean
absolute difference [15], and odd ratio [23], to name a few examples of filter methods.
The previously mentioned methods are widely used in FS due to their advanta-
geous less computational complexity, particularly if the dimension of the text feature
is vast. The search approach in the wrapper methods is used to evaluate the subsets of
features so that effective informative features are obtained. These techniques include
plus-l-take-away-r-process [25], and sequential forward selection/backward elimi-
nation [26]. Although these techniques are computationally costly, they are relatively
more expensive compared to the filter methods. Another class of FS is the hybrid
technique. Various FS techniques are incorporated into the hybrid methods to select
informative subsets of features. They utilize the advantages of one strategy and reduce
the disadvantages of another technique in choosing the subset.
FS is formulated as an NP-hard (non-deterministic polynomial time) optimization
problem [12]. In combinatorial optimization problems, the only way to guarantee the
optimal solution is an exhaustive search [5]. However, an exhaustive search throughout
the full search space is not practical because it involves an overwhelmingly
high computational complexity [9, 21, 22]. Recently, many studies have investigated
metaheuristic algorithms to address combinatorial optimization issues [2, 7, 20]. These
algorithms are extensively utilized with the aim of exploring the unknown search space
of a problem and obtaining the best global solution and, therefore, they are becoming
more and more popular. Numerous metaheuristic algorithms, such as particle swarm
optimization [19], the binary multi-verse optimizer [1], the ant lion optimizer [19],
and harmony search (HS) [6], among others [8, 28], have been used to
address the FS issue.
The Grey Wolf Optimizer (GWO) is a recent metaheuristic swarm optimization technique,
which emulates the hunting and social behaviour of grey wolf packs. It was proposed by
Mirjalili [24]. This algorithm provides many advantages over other swarm-based
intelligence techniques. It has few parameters, and no derivative information
is required. Besides, the exchange of decision variables and the cooperation
between swarm members are significant advantages. Consequently,
GWO has been effectively adapted to several types of optimization problems,
such as engineering, robotics, scheduling [22], economic
dispatch problems, planning, and feature selection for classification problems [13], and
many more, as described in [14].
The FS problem is basically a binary problem, whereas the original GWO variant was
proposed for continuous optimization problems. Based on the above, a binary Grey
Wolf Optimizer (BGWO) is proposed in the present paper as a novel FS application
using all the GWO operators.
The structure of the paper is outlined as follows: the theoretical moti-
vation for this work is provided in Sect. 2. In Sect. 3, the binary grey wolf algorithm is
presented. In Sect. 4, BGWO for text FS is described. Section 5 explains the obtained
empirical results to emphasize the efficiency of the new FS method. Finally, Sect. 6
provides the conclusion and future work.

2 Preliminaries

The preliminary research is briefly presented in this section.

2.1 Text Clustering Problem

TDC aims at finding the best distribution of a vast set of documents into a subset of
clusters based on the clusters' fundamental features. The pre-processing stages of TDC
are introduced in the following subsection, and the k-means technique is briefly
introduced to produce document clusters depending on the obtained features.
Pre-processing Steps. The standard pre-processing stages, which include tokeniza-
tion and stop words removal, as well as stemming, in addition to feature weighting,
are performed before clusters are created to convert the documents into a numerical
format [18]. The pre-processing substeps are shortly outlined as follows:

– Tokenization: Each word (term) in a single document is extracted as a separate unit
called a token in this stage, neglecting special characters, symbols, and white spaces
in the text.
– Stop words removal: This involves a list of common terms, including
(‘in’, ‘on’, ‘at’, ‘that’, ‘the’, ‘of’, ‘an’, ‘a’, ‘she’, ‘he’, etc.). Short words, high-
frequency terms, and functional terms are also treated as stop words in TDC. It
is vital to remove these terms as they often cover a substantial part of the document.
Otherwise, not only is the number of features unnecessarily inflated, but
the clustering efficiency is also deluded and deteriorated. The stop words
list used consists of 571 words.
– Stemming: Reduces several word forms to the same root. This is done by
separating prefixes and suffixes from the term. For instance, ‘multi-coloured’ and
‘multi-media’ share the same root, i.e., /-multi-/.
– Term weighting: The TF-IDF (i.e., term frequency-inverse document frequency)
weighting scheme is frequently utilized for transforming textual data into numerical
form; a short illustrative sketch follows this list.
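As an illustration of the term-weighting step, the following is a minimal Python sketch of TF-IDF weighting (a simplified variant for illustration only; the toy documents are hypothetical and the cited works may use a different TF-IDF normalization):

import math
from collections import Counter

def tfidf_matrix(docs):
    """Build a TF-IDF weight matrix for a list of tokenized documents."""
    n = len(docs)
    vocab = sorted({term for doc in docs for term in doc})
    df = {t: sum(1 for doc in docs if t in doc) for t in vocab}   # document frequency
    matrix = []
    for doc in docs:
        counts = Counter(doc)
        row = []
        for t in vocab:
            tf = counts[t] / len(doc)          # term frequency in this document
            idf = math.log(n / df[t])          # inverse document frequency
            row.append(tf * idf)
        matrix.append(row)
    return vocab, matrix

# Toy usage: documents are already tokenized, stop-word-filtered and stemmed
docs = [["wireless", "sensor", "network"],
        ["sensor", "energy", "cluster"],
        ["text", "cluster", "feature"]]
vocab, X = tfidf_matrix(docs)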

2.2 K-Means Text Clustering Algorithm

K-means represents one of the most popularly used clustering techniques for solving
the TDC problem [16]. Algorithm 1 provides the K-means algorithm steps. It splits
the set of text documents Docs = (doc1, doc2, doc3, ..., docn) into a subset of K clus-
ters via three main steps: (a) choosing random documents as the clusters' centroids (the
number of clusters is predefined); (b) assigning the documents to the nearest clusters;
(c) recalculating the clusters' centroids.

Algorithm 1. K-means clustering algorithm


Data: The clusters’ number K, and a documents’ set Docs (after the pre-processing step)
Result: Clusters K contain homogeneous documents.
Create centroid clusters K by choosing one document randomly for each cluster.
while the number of iterations is not met do
    for each document doci in Docs do
        Compute the distance (i.e., the similarity) between the centroid clusters K and document doci.
    end
    for each document doci in Docs do
        Assign document doci to the nearest cluster k.
    end
    Recalculate the clusters’ centroids k.
end
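A minimal NumPy sketch of Algorithm 1 is given below; the Euclidean distance is used here purely for illustration (the similarity measure actually employed may differ), and X is assumed to be the TF-IDF document-feature matrix:

import numpy as np

def kmeans(X, k, iterations=100, seed=0):
    """Cluster the rows of a document-feature matrix X into k clusters (Algorithm 1)."""
    rng = np.random.default_rng(seed)
    # (a) choose k random documents as the initial cluster centroids
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iterations):
        # (b) assign every document to its nearest centroid
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # (c) recalculate every cluster centroid as the mean of its members
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids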

2.3 Problem Formulation of Unsupervised Feature Selection

In this paper, the text FS problem utilizes the BGWO to cluster text
using a novel model that identifies the most informative text features,
while uninformative features are removed. The following notation defines the
proposed model for addressing the FS problem. Let F be a set of features, F =
{f1, f2, ..., ft}, where t signifies the number of unique features in the vector space model (VSM).
Consider New_sub_features = {Nf1, Nf2, ..., Nfj, ..., Nftn} as the subset
of new features, which is the new dimension of informative features obtained
through the FS algorithm, where tn signifies the number of new features.

3 Binary Grey Wolf Optimizer

The GWO mechanism is modelled on the grey wolves’ lifestyle. Their hunting mech-
anisms were formulated in 2014 as an optimization algorithm by Mirjalili [24] using
the four levels of the grey wolf social hierarchy, denoted (α), (β), (δ), and (ω),
which stand for alpha, beta, delta, and omega, respectively. Alpha is the leader of the
grey wolf pack, and it is at the top of the social hierarchy. Beta wolves perform the
leading role in consulting the alpha wolf. Delta refers to the level positioned in the struc-
ture between the beta and omega wolves. Omega wolves form the last level of the hierarchy.
To hunt prey, the wolves surround it first [22].
The group hunting intelligence is also modelled along with
this social hierarchy. It involves three main phases: chasing, encircling,
and attacking. In optimization terms, the top three solutions in the hunting group
are classified according to their fitness values: alpha (α) is the first-best
solution in the hunting group, beta (β) is the second-best solution, and delta (δ) is
the third-best one. The remaining solutions are the omega (ω) wolves.

All the solutions are guided by these three solutions (i.e., (α), (β), and (δ)) to
discover the search space to find the optimal solution. The following equations are
used to mathematically model the encircling behaviour.

$$\vec{X}(t+1) = \vec{X}_p(t) + \vec{A} \times \vec{D} \qquad (1)$$

$$\vec{D} = \left| \vec{C} \times \vec{X}_p(t) - \vec{X}(t) \right|, \qquad (2)$$

where $\vec{D}$ is as defined in Eq. 2, $t$ signifies the number of iterations, $\vec{X}_p$ signifies the position of the prey, $\vec{A}$ and $\vec{C}$ represent coefficient vectors, and $\vec{X}$ signifies the grey wolf position.

$$\vec{C} = 2 \times \vec{r}_2 \qquad (3)$$

$$\vec{A} = 2 \times \vec{a} \times \vec{r}_1 - \vec{a} \qquad (4)$$

The $\vec{A}$ and $\vec{C}$ vectors are calculated based on Eqs. 4 and 3. The components of $\vec{a}$ are linearly reduced from 2.0 to 0.0 over the course of iterations, and $\vec{r}_1$, $\vec{r}_2$ are random vectors in $[0, 1]$. Hunting is typically driven by alpha; sometimes, beta and delta might be involved in hunting. In order to mathematically simulate the hunting behaviour of the grey wolves, alpha, beta, and delta (i.e., the best solutions) are expected to possess a stronger understanding of the prey location. The other search agents follow the three best solutions achieved so far in the hunting process and update their positions towards them. The position updates of the wolves are presented in the following equations.

$$\vec{D}_{\alpha} = \left| \vec{C}_1 \times \vec{X}_{\alpha} - \vec{X} \right| \qquad (5)$$

$$\vec{D}_{\beta} = \left| \vec{C}_2 \times \vec{X}_{\beta} - \vec{X} \right| \qquad (6)$$

$$\vec{D}_{\delta} = \left| \vec{C}_3 \times \vec{X}_{\delta} - \vec{X} \right| \qquad (7)$$

$$\vec{X}_1 = \vec{X}_{\alpha} - \vec{A}_1 \times \vec{D}_{\alpha} \qquad (8)$$

$$\vec{X}_2 = \vec{X}_{\beta} - \vec{A}_2 \times \vec{D}_{\beta} \qquad (9)$$

$$\vec{X}_3 = \vec{X}_{\delta} - \vec{A}_3 \times \vec{D}_{\delta} \qquad (10)$$

$$\vec{X}(t+1) = \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3} \qquad (11)$$
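A minimal NumPy sketch of the continuous position update of Eqs. 5–11 is given below (the function and variable names are illustrative assumptions, not taken from [24]):

import numpy as np

def gwo_position_update(X, X_alpha, X_beta, X_delta, a):
    """Continuous GWO position update of one wolf X guided by alpha, beta and delta (Eqs. 5-11)."""
    new_X = np.zeros_like(X, dtype=float)
    for leader in (X_alpha, X_beta, X_delta):
        r1 = np.random.rand(X.size)
        r2 = np.random.rand(X.size)
        A = 2.0 * a * r1 - a                  # Eq. 4
        C = 2.0 * r2                          # Eq. 3
        D = np.abs(C * leader - X)            # Eqs. 5-7
        new_X += leader - A * D               # Eqs. 8-10
    return new_X / 3.0                        # Eq. 11: average of X1, X2, X3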
This paper proposes a modification of GWO into a binary GWO (BGWO) for the
adaptation of binary variables in the search space (the nature of the FS problem). The
solution-generation function, as well as the new-position equation $\vec{X}(t+1)$ of
Eq. (11), are adjusted to identify feasible solutions during the execution of BGWO
as follows:

$$Sig(\vec{X}(t+1)) = \frac{1}{1 + e^{-\vec{X}(t+1)}}, \qquad (12)$$

where $Sig(\vec{X}(t+1))$ refers to the probability that a decision variable of solution $X$ takes the value ‘0’ or ‘1’. Equation 13 is used to update the decision variables of solution $X$:

$$\vec{X}(t+1) = \begin{cases} 1 & \text{if } r < Sig(\vec{X}(t+1)) \\ 0 & \text{otherwise,} \end{cases} \qquad (13)$$

where the sigmoid function of Eq. 12 converts the value of $\vec{X}(t+1)$ from Eq. 11 into the range $[0, 1]$, and $r$ refers to a random number in $(0, 1)$. Figure 2 illustrates the sigmoid function of $\vec{X}(t+1)$.
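The binarization of Eqs. 12 and 13 can be sketched as follows (assuming the continuous position from Eq. 11 has already been computed):

import numpy as np

def binarize_position(X_continuous):
    """Map a continuous wolf position to a binary feature-selection vector (Eqs. 12-13)."""
    sig = 1.0 / (1.0 + np.exp(-np.asarray(X_continuous, dtype=float)))  # Eq. 12
    r = np.random.rand(sig.size)            # random numbers in (0, 1)
    return (r < sig).astype(int)            # Eq. 13: 1 = feature selected, 0 = not selected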

Fig. 2 Sigmoid function.



Fig. 3 Solution representation.

4 BGWO for the Text FS Problem

4.1 Solution Representation

Figure 3 illustrates the BGWO solution representation, which is proposed for the text
FS problem. In this representation, the solution involves a subset of text features; the
binary value of each position indicates whether the feature is selected or not
[3, 18]. BGWO starts by creating a random set of solutions, then it improves the
solutions so that the optimal solution can be found (i.e., the best informative
features).
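For instance, the binary vector of Fig. 3 can be read as a mask over the original feature set, as in this small illustrative sketch (the feature names are hypothetical):

# One BGWO solution: position i holds 1 if feature f_i is selected and 0 otherwise
features = ["network", "sensor", "energy", "cluster", "text", "wolf"]
solution = [1, 0, 1, 1, 0, 1]   # a binary wolf position as in Fig. 3
selected = [f for f, bit in zip(features, solution) if bit == 1]
# selected -> ['network', 'energy', 'cluster', 'wolf'], i.e. the New_sub_features subset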

4.2 Fitness Function

The mean absolute difference (MAD) [18] can be utilized by the BGWO algorithm
as a fitness function to evaluate each solution in the population for the text FS
problem. MAD is used to give a weight (i.e., a significance rating) to each feature in the
subset New_sub_features, and all the scores are then summed. The feature weight
is computed by calculating the deviation of each feature using Eq. 14:

$$MAD(U_i) = \frac{1}{n_i} \sum_{j=1}^{t} \left| U_{i,j} - \overline{U}_j \right|, \qquad (14)$$

where

$$\overline{U}_j = \frac{1}{n_i} \sum_{i=1}^{t} U_{i,j}, \qquad (15)$$

where $n_i$ refers to the number of selected features in the text document $i$, $U_{i,j}$ signifies the value of feature $j$ in document $i$, $\overline{U}_j$ refers to the mean value of feature $j$, and $t$ refers to the total number of features. The methodology proposed in this paper is described briefly in Algorithm 2.

Algorithm 2. The proposed BGWO algorithm’s pseudo code for FS problem


Initialize the GWO and FS problem parameters (a, A, C, number of solutions (N), number of iterations, number of features (F))
Create a population matrix of size (N × F)
Calculate the fitness function for all solutions
Assign the best solution to X_α
Assign the second best solution to X_β
Assign the third best solution to X_δ
for each iteration (t) do
    for each solution (i) do
        Update solution (i) using Eq. 13
    end for
    Update a, A, C
    Calculate the fitness function for all solutions
    Update X_α
    Update X_β
    Update X_δ
end for
Return the best solution X_α;
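To make the fitness evaluation of Algorithm 2 concrete, a minimal sketch of the MAD score of Eqs. 14–15, computed only over the features selected by a binary solution, could read as follows (the exact normalization used in [18] may differ; U is assumed to be the document-feature matrix):

import numpy as np

def mad_fitness(U, solution):
    """MAD fitness of a binary solution over the document-feature matrix U (Eqs. 14-15)."""
    selected = np.flatnonzero(solution)           # indices of the selected features
    if selected.size == 0:
        return 0.0
    U_sel = U[:, selected]
    feature_means = U_sel.mean(axis=0)            # Eq. 15: mean value of each selected feature
    mad_per_feature = np.abs(U_sel - feature_means).mean(axis=0)   # Eq. 14
    return float(mad_per_feature.sum())           # summed scores give the solution fitness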

5 Experimental Setup

The proposed BGWO is tested on six standard datasets to solve the text FS problem.
The results were compared with GA [27] and BPSO [18]. The parameter settings of
the compared algorithms are described in Table 1. It should be noted that the
values of the control parameters are set according to the recommendations given by
the founder of GWO in [24].

Table 1 The parameter setting for each algorithm of comparison.


Algorithm Parameters Value
GA Crossover rate 0.70
GA Mutation rate 0.04
binary PSO C1 2
binary PSO C2 2
binary PSO Max weight 0.9
binary PSO Min weight 0.2
BGWO, BPSO, GA Population size 60
BGWO, BPSO, GA Maximum number of iteration 1000
BGWO, BPSO, GA Runs 30

Table 2 Text datasets details.


Datasets ID No. documents (d) No. clusters (K) No. features or terms (t)
tr41 DS1 878 10 6743
tr12 DS2 313 8 5329
Wap DS3 1560 20 7512
Classic4 DS4 2000 4 6500
20Newsgroups DS5 300 3 2275
CSTR DS6 299 4 1725

5.1 Standard Datasets and Evaluation Metric

The BGWO algorithm is tested on six benchmark datasets, and it is compared with
state-of-the-art algorithms in the experiments, including (Tr41, Tr12, Wap)1 and (Clas-
sic4, 20Newsgroups, CSTR)2. These datasets exhibit several characteristics such as spar-
sity and skewness; a description of the datasets is given in Table 2.
The Purity and F-measure are used as standards to evaluate the TDC
algorithms [16]. The implemented measures involve criteria that are
commonly used to validate and compare the clustering of various
datasets [4]; they are calculated after the clustering outcomes are obtained. The
following section describes these measures in detail.
Purity. The purity measure calculates, for every single cluster, the maximum number
of correctly assigned documents; the best purity score is close to 1 because the largest
class size within a cluster is evaluated relative to the cluster size. Through this
measure, each cluster is assigned its most frequent class [1]. Purity over the entire
set of clusters is calculated by Eq. 16:

$$purity = \frac{1}{n} \sum_{j=1}^{k} \max_{i}(i, j), \qquad (16)$$

where $n$ refers to the total number of documents in the dataset, $\max_i(i, j)$ refers to the size of the largest class $i$ in cluster $j$, and $k$ refers to the number of clusters.
F-measure. The F-measure indicates the harmonic combination of the precision
measures (P) with the recall measures (R). When the F-measure’s value is close
to 1, this shows a robust clustering algorithm. Conversely, when the F-measure’s
value is close to 0, the clustering algorithm is considered weak [10]. In the following
Equation, the F-measure is calculated:

1 glaros.dtc.umn.edu/gkhome/fetch/sw/cluto/datasets.tar.gz.
2 sites.labic.icmc.usp.br/text_collections/.

$$P(i, j) = \frac{n_{i,j}}{n_j}, \qquad (17)$$

where $n_{i,j}$ refers to the number of correct documents of class $i$ in cluster $j$, and $n_j$ refers to the total number of documents in cluster $j$.
$$R(i, j) = \frac{n_{i,j}}{n_i}, \qquad (18)$$

where $n_{i,j}$ refers to the number of correct documents of class $i$ in cluster $j$, and $n_i$ refers to the total number of documents in class $i$.

$$F(i, j) = \frac{2 \times P(i, j) \times R(i, j)}{P(i, j) + R(i, j)}, \qquad (19)$$

where R(i, j) refers to the Recall in cluster j of class i, P(i, j) refers to the Precision
in cluster j of class i. For all clusters, the calculated F-measure is shown in Eq. 20


$$F = \sum_{j=1}^{k} \frac{n_j}{n} \max_{i} F(i, j) \qquad (20)$$
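As an illustration, the purity and F-measure of Eqs. 16–20 can be computed from predicted cluster labels and true class labels as in the sketch below (a simplified reconstruction under the usual contingency-table interpretation of these measures):

import numpy as np

def purity_and_fmeasure(true_classes, clusters):
    """Purity (Eq. 16) and overall F-measure (Eq. 20) of a clustering result."""
    true_classes = np.asarray(true_classes)
    clusters = np.asarray(clusters)
    n = len(clusters)
    purity_sum, f_total = 0, 0.0
    for j in np.unique(clusters):
        members = true_classes[clusters == j]          # documents assigned to cluster j
        n_j = len(members)
        max_nij, best_f = 0, 0.0
        for i in np.unique(true_classes):
            n_ij = int(np.sum(members == i))           # class-i documents inside cluster j
            n_i = int(np.sum(true_classes == i))
            max_nij = max(max_nij, n_ij)
            if n_ij > 0:
                P, R = n_ij / n_j, n_ij / n_i          # Eqs. 17 and 18
                best_f = max(best_f, 2 * P * R / (P + R))   # Eq. 19
        purity_sum += max_nij                          # contribution to Eq. 16
        f_total += (n_j / n) * best_f                  # contribution to Eq. 20
    return purity_sum / n, f_total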

5.2 Results and Discussion

The findings achieved through BGWO were compared with BPSO and
GA. In order to make a fair comparison, every single algorithm was run
30 times, and the parameter settings of each clustering algorithm are as shown
in Table 1.
Table 3 provides the averages over 30 runs of the Purity and F-measure results, which
were obtained individually on the six standard text benchmarks by all the FS
algorithms GA, BPSO, and BGWO. For most of the datasets, BGWO exhibited higher purity
and F-measure in comparison with GA and BPSO, which indicates that BGWO is both
effective and efficient in finding the globally optimal solution. In contrast, BPSO obtained the best
F-measure in the DS2 dataset and the best purity and F-measure in the DS5 dataset.
Overall, the results show that BGWO exceeded the other algorithms in terms of purity and F-measure.
Figure 4 demonstrates the percentages of selected features compared with the
other methods on the different datasets. According to the findings, it is possible to state
that the proposed algorithm discovers a better feature subset, with appropriate text
clustering efficiency, than the other algorithms. The selection of features, however,
aims at improving the quality of the clustering and, at the same time,
removing unusable features; otherwise, the efficiency may decrease when the
feature subset is too small. For example, BPSO obtained the smallest subset of features
for the DS3 text dataset; however, its purity and F-measure were smaller (please

Table 3 Comparison of BPSO, GA, BGWO results for different datasets based on k-means clus-
tering algorithm in terms of Purity and F-measure
Dataset Measure K-means without FS BPSO GA BGWO
DS1 Purity 0.4108 0.4358 0.4139 0.4400
F-measure 0.3876 0.4004 0.3904 0.4286
Rank 4 2 3 1
DS2 Purity 0.3908 0.4083 0.4012 0.4354
F-measure 0.3222 0.3471 0.3250 0.3299
Rank 4 2 3 1
DS3 Purity 0.4759 0.4981 0.4887 0.5010
F-measure 0.4315 0.4507 0.4436 0.4627
Rank 4 2 3 1
DS4 Purity 0.5938 0.5970 0.6035 0.6074
F-measure 0.5472 0.5579 0.5504 0.5801
Rank 4 3 2 1
DS5 Purity 0.3741 0.4014 0.3810 0.3953
F-measure 0.3406 0.3499 0.3481 0.3418
Rank 4 1 3 2
DS6 Purity 0.3525 0.3702 0.3558 0.3986
F-measure 0.3460 0.3662 0.3512 0.3962
Rank 4 2 3 1
Average ranks 4 2.00 2.83 1.16
Final rank 4 2 3 1

Fig. 4 Features selected percentage between GA, BPSO, BGWO



refer to Table 3). Instead, BGWO selected a larger subset, which provided higher
purity and F-measure than BPSO on the same dataset. Figure 4 also shows that the
worst or the best clustering performance cannot be guaranteed by the smallest or the
largest feature subset.

6 Conclusion

This paper proposed a binary grey wolf optimizer (BGWO) to solve the FS problem in
TDC, addressing its binary nature. BGWO uses the original features
to produce a subset that contains the most necessary text features. The k-means
clustering technique takes the selected features as input in the clustering step so that
the new subset is evaluated. The proposed algorithm is tested on six benchmark
document datasets in terms of the purity and F-measure criteria. The experimental
findings show that the BGWO algorithm achieved better results than the existing FS tech-
niques. Therefore, the proposed FS algorithm enhanced the outcome of the TDC
by obtaining more homogeneous groups. The hybridization of this algorithm with
other metaheuristic algorithms may potentially improve the results by increas-
ing the search capabilities of the algorithm. Another enhancement in the future can
involve applying various fitness functions so that the results are expected to be further
improved.

Acknowledgements This work was supported by Universiti Sains Malaysia (USM) under Grant
(1001/PKOMP/ 8014016).

References

1. Abasi AK, Khader AT, Al-Betar MA, Naim S, Makhadmeh SN, Alyasseri ZAA (2019) A
text feature selection technique based on binary multi-verse optimizer for text clustering. In:
2019 IEEE Jordan international joint conference on electrical engineering and information
technology (JEEIT). IEEE, pp 1–6
2. Abasi AK, Khader AT, Al-Betar MA, Naim S, Makhadmeh SN, Alyasseri ZAA (2020) Link-
based multi-verse optimizer for text documents clustering. Appl Soft Comput 87:106002
3. Abualigah LM, Khader AT (2017) Unsupervised text feature selection technique based on
hybrid particle swarm optimization algorithm with genetic operators for the text clustering. J
Supercomput 73(11):4773–4795
4. Abualigah LM, Khader AT, Al-Betar MA (2016) Multi-objectives-based text clustering tech-
nique using k-mean algorithm. In: 2016 7th international conference on computer science and
information technology (CSIT). IEEE, pp 1–6
5. Al-Betar MA, Awadallah MA (2018) Island bat algorithm for optimization. Expert Syst Appl
107:126–145
6. Al-Betar MA, Awadallah MA, Khader AT, Bolaji AL, Almomani A (2018) Economic load
dispatch problems with valve-point loading using natural updated harmony search. Neural
Comput Appl 29(10):767–781

7. Alomari OA, Khader AT, Al-Betar MA, Awadallah MA (2018) A novel gene selection method
using modified MRMR and hybrid bat-inspired algorithm with β-hill climbing. Appl Intell
48(11):4429–4447
8. Alyasseri ZAA, Khader AT, Al-Betar MA, Awadallah MA, Yang XS (2018) Variants of the
flower pollination algorithm: a review. In: Yang XS (ed) Nature-inspired algorithms and applied
optimization. Springer, Cham, pp 91–118
9. Alyasseri ZAA, Khader AT, Al-Betar MA, Papa JP, Alomari OA, Makhadme SN (2018) An
efficient optimization technique of EEG decomposition for user authentication system. In: 2018
2nd international conference on biosignal analysis, processing and systems (ICBAPS). IEEE,
pp 1–6
10. Bharti KK, Singh PK (2015) Hybrid dimension reduction by integrating feature selection with
feature extraction method for text clustering. Expert Syst Appl 42(6):3105–3114
11. Bharti KK, Singh PK (2016) Chaotic gradient artificial bee colony for text clustering. Soft
Comput 20(3):1113–1126
12. Bharti KK, Singh PK (2016) Opposition chaotic fitness mutation based adaptive inertia weight
BPSO for feature selection in text clustering. Appl Soft Comput 43:20–34
13. Emary E, Zawbaa HM, Hassanien AE (2016) Binary grey wolf optimization approaches for
feature selection. Neurocomputing 172:371–381
14. Faris H, Aljarah I, Al-Betar MA, Mirjalili S (2018) Grey wolf optimizer: a review of recent
variants and applications. Neural Comput Appl 30(2):413–435
15. Ferreira AJ, Figueiredo MA (2012) Efficient feature selection filters for high-dimensional data.
Pattern Recogn Lett 33(13):1794–1804
16. Forsati R, Mahdavi M, Shamsfard M, Meybodi MR (2013) Efficient stochastic algorithms for
document clustering. Inf Sci 220:269–291
17. Karaa WBA, Ashour AS, Sassi DB, Roy P, Kausar N, Dey N (2016) Medline text mining:
an enhancement genetic algorithm based approach for document clustering. In: Hassanien
AE, Grosan C, Fahmy Tolba M (eds) Applications of intelligent optimization in biology and
medicine. Springer, Cham, pp 267–287
18. Kushwaha N, Pant M (2018) Link based BPSO for feature selection in big data text clustering.
Future Gener Comput Syst 82:190–199
19. Mafarja MM, Mirjalili S (2019) Hybrid binary ant lion optimizer with rough set and approxi-
mate entropy reducts for feature selection. Soft Comput. 23(5):1–17
20. Makhadmeh SN, Khader AT, Al-Betar MA, Naim S, Alyasseri ZAA, Abasi AK (2019) Particle
swarm optimization algorithm for power scheduling problem using smart battery, pp 1–6
21. Makhadmeh SN, Khader AT, Al-Betar MA, Naim S, Alyasseri ZAA, Abasi AK (2020) A min-
conflict algorithm for power scheduling problem in a smart home using battery. In: Proceedings
of the 11th national technical seminar on underwater system technology 2019. Springer, pp
1–12
22. Makhadmeh SN, Khader AT, Al-Betar MA, Naim S (2019) Multi-objective power scheduling
problem in smart homes using grey wolf optimiser. J Ambient Intell Human Comput. 10:3643–
3667
23. Mengle SS, Goharian N (2009) Ambiguity measure feature-selection algorithm. J Am Soc
Inform Sci Technol 60(5):1037–1050
24. Mirjalili S, Mirjalili SM, Lewis A (2014) Grey wolf optimizer. Adv Eng Softw 69:46–61
25. Nakariyakul S, Casasent DP (2009) An improvement on floating search algorithms for feature
subset selection. Pattern Recogn 42(9):1932–1940
26. Pudil P, Novovičová J, Kittler J (1994) Floating search methods in feature selection. Pattern
Recogn Lett 15(11):1119–1125
27. Shamsinejadbabki P, Saraee M (2012) A new unsupervised feature selection method for text
clustering based on genetic algorithms. J Intell Inf Syst 38(3):669–684
28. Sheikhpour R, Sarram MA, Gharaghani S, Chahooki MAZ (2017) A survey on semi-supervised
feature selection methods. Pattern Recogn 64:141–158
29. Song W, Qiao Y, Park SC, Qian X (2015) A hybrid evolutionary computation approach with
its application for optimizing text document clustering. Expert Syst Appl 42(5):2517–2524
Applied Electronics and Computer
Engineering
Metamaterial Antenna for Biomedical
Application

Mohd Aminudin Jamlos, Nur Amirah Othman, Wan Azani Mustafa,


and Maswani Khairi Marzuki

Abstract In this paper, a metamaterial element is applied to an antenna for
biomedical applications. The metamaterial unit cell is constructed using the circular split
ring resonator (CSRR) technique and is attached to the ground of the antenna. The
metamaterial antenna is designed to operate at frequencies between 0.5–3.0 GHz,
which is suitable for biomedical applications such as wireless patient movement
monitoring, telemetry and telemedicine, including micro-medical imaging and
Magnetic Resonance Imaging (MRI). The design and simulation have been carried
out using Computer Simulation Technology Microwave Studio (CST MWS), while
the fabricated antenna is measured using a Vector Network Analyzer (VNA) to
analyse the overall performance.

Keywords Biomedical · Metamaterial · Antenna

1 Introduction

Nowadays, metamaterials have been a popular research topic for almost two decades.
Most researchers agree on certain basic metamaterial characteristics although
different definitions exist [1]. Metamaterials are materials not generally found in
nature; they are artificially engineered media with negative permittivity and permeability,
a negative refractive index, and structures that have properties that are either not or
seldom found in natural materials [1–3].
Various metamaterials have been designed from radio frequencies up to optical
frequencies, and different functions have been realized such as negative refractive
index (NRI), huge chirality, anisotropy, and bianisotropy [4]. As an interdisci-
plinary topic, metamaterials can be classified into different categories based on
different criteria. From an operating frequency point of view, they can be classified
M. A. Jamlos · N. A. Othman · W. A. Mustafa · M. K. Marzuki
Faculty of Engineering Technology, Universiti Malaysia Perlis, UniCITI ALAM Campus,
Sungai Chuchuh, 02100 Padang Besar, Perlis, Malaysia


as microwave metamaterials, terahertz metamaterials, and photonic metamaterials.


From a spatial arrangement point of view, there are 1D metamaterials, 2D meta-
materials, and 3D metamaterials. From a material point of view, there are metallic
and dielectric metamaterials. In this work, we will concentrate on the electromag-
netic properties, and introduce several important types of metamaterials [5].
Metamaterial concepts are mainly focused on the size reduction and improving the
conventional patch antenna characteristics [6, 7].
For some years the metamaterials idea has mostly been considered as a means of
engineering the electromagnetic response of passive micro- and nanostructured
materials. Remarkable results have been achieved so far including negative-index
media that refract light in the opposite direction from that of conventional materials,
chiral materials that rotate the polarization state of light hundreds of thousands of
times more strongly than natural optical crystals, and structured thin films with
remarkably strong dispersion that can slow light in much the same way as resonant
atomic systems with electromagnetically induced transparency [11–13]. These great
achievements in applications of metamaterials encouraged the biomedical scientists
to use these novel materials and their electromagnetic application in medicine.

2 Metamaterial Unit Cell

The dimension layout of the proposed G-shape Ring Resonator (GSRR) metamaterial
unit cell [8] is depicted in Fig. 1. The gap between the splits (W2)
plays a significant role in determining the stop-band phenomenon of the proposed
metamaterial unit cell. As illustrated in Fig. 2, with a proper gap of W2 = 0.5 mm the stop-
band phenomenon of the structure is observed at 3.3 GHz. At 3.3 GHz the
reflection coefficient (S11) is almost near to zero and the transmission coefficient is
below −10 dB.
Similar to the GSRR unit cell, a Hexagon Split Ring Resonator (HSRR) unit cell is
also analyzed, as shown in Fig. 3, while the S-parameters of the HSRR design are
illustrated in Fig. 4.
On the other hand, a schematic view and the design parameters of the proposed
double-negative square-circular ring resonator (SCRR) metamaterial unit cell are
depicted in Fig. 5 [9]. This SCRR metamaterial unit cell is made by combining
split circular and split square ring structures on the front side and a metal strip on
the backside of the substrate. The metal strip on the backside is treated as a wire.
The square, circle and wire structures are made of copper with a
thickness of 0.035 mm. Arlon AD 350 (lossy) is used as the substrate material,
which has a dielectric constant of 3.5 and a loss tangent of 0.003. The square-circular
rings behave as inductors whereas the splits in the square and circular rings behave
as capacitors, which are responsible for the resonance characteristics. The magnetic and
electric fields induced in the SRR and the wire are responsible for the negative
permeability (µ) and negative permittivity (ε), respectively. Due to these characteristics, the
metamaterial exhibits left-handed properties.
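Since the rings act as inductors and the splits as capacitors, the resonance frequency can be estimated to first order from an equivalent LC circuit; the sketch below is only an illustration with hypothetical L and C values, not the extracted equivalent circuit of the SCRR cell:

import math

def lc_resonance_hz(L_henry, C_farad):
    """First-order LC resonance frequency: f0 = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(L_henry * C_farad))

# Hypothetical values; narrowing the split gap raises C and therefore lowers f0
f0 = lc_resonance_hz(L_henry=2.0e-9, C_farad=0.5e-12)   # about 5.0 GHz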

Fig. 1 Detailed dimension layout of GSRR

Fig. 2 S-parameter of proposed design



Fig. 3 HSRR unit cell

Fig. 4 S-parameter of HSRR

Figure 6 shows the simulation setup for the proposed square-circular unit cell. The
frequency-domain solver of the electromagnetic simulator CST Microwave Studio
has been used for the calculation of the reflection and transmission coefficients of the
proposed design.
The unit cell is placed between two waveguide ports on the positive and negative
X-axis. Perfect Electric Conductor (PEC) and Perfect Magnetic Conductor
(PMC) boundary conditions are applied along the Y and Z-axes. The electromagnetic
properties are obtained from the simulated S11 and S21 characteristics of the SCRR
metamaterial unit cell. There are several methods suitable for parameter extraction, such

Fig. 5 SCRR unit cell structure. a Front view. b Back view

Fig. 6 Simulation setup of unit cell

as the transmission/reflection (TR) method, the Nicolson–Ross method and many others. By using a transfer matrix,
the effective parameters of the proposed SCRR metamaterial structure, such as the complex
permittivity and complex permeability, are extracted [10].
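A common way to perform such an extraction is the standard S-parameter retrieval for an effective homogeneous slab (Nicolson–Ross–Weir type); the sketch below illustrates that textbook procedure and is not necessarily the exact routine of [10]:

import numpy as np

def retrieve_n_z(S11, S21, freq_hz, d_m):
    """Retrieve effective refractive index n and impedance z of a slab of thickness d_m
    from complex S11/S21 (principal branch only; branch selection is needed in practice)."""
    k0 = 2.0 * np.pi * freq_hz / 3.0e8                            # free-space wavenumber
    z = np.sqrt(((1 + S11) ** 2 - S21 ** 2) / ((1 - S11) ** 2 - S21 ** 2))
    n = np.arccos((1 - S11 ** 2 + S21 ** 2) / (2 * S21)) / (k0 * d_m)
    return n, z
# The effective permittivity and permeability then follow as eps = n / z and mu = n * z.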
Figure 7 represents the transmission (S21) and reflection (S11) characteristics of the
simulated unit cell structure. The transmission characteristic (|S21| < −10 dB) shows
that the cell can be used from 3.36 to 5.88 GHz, which belongs to the C-band. Meanwhile,
Fig. 8 shows the phase responses of S11 and S21. In Fig. 9, a negative refractive index
is obtained from 5.7 to 6 GHz with the maximum negative value at 5.816 GHz. In
Fig. 10, the real part of the permittivity is negative from 3.22–6 GHz, while Fig. 11
shows that the real part of the permeability is negative from 5.824–6.1 GHz.
For biomedical applications, an attractive property of metamaterials is that a plane
wave propagating in such media has its phase velocity antiparallel to its group
velocity, so the media support backward waves. In this paper we propose a
periodic rectangular split ring resonator metamaterial structure (RSRM); a unit cell is depicted in
Fig. 12. This RSRM unit cell is composed of two nested split rings,
which are etched on an FR4 substrate with a dielectric constant of 4.4. The resonance
frequency of this rectangular split ring unit cell structure depends on the gap
dimension (g).
Normally, slot-loaded miniaturized patch antennas are used in biomedical
applications, but such patch antennas were never extended and analyzed with metama-
terial structures. Hence, the rectangular split ring metamaterial structure is loaded on the
ground plane of a conventional circular microstrip antenna so that the antenna
achieves 75% size reduction together with a good amount of bandwidth and gain for
biomedical and wireless applications. The designed metamaterial circular microstrip
patch antenna is shown in Fig. 13. After varying the width and gap of the meta-
material structure, parametric studies were done for the better improvement of
bandwidth, gain and efficiency for biomedical applications of the antenna under
test (AUT).

Fig. 7 The transmission (S21) and reflection (S11) characteristics



Fig. 8 Phase response of S11 and S21

Fig. 9 Refractive index



Fig. 10 Real part of permittivity

Fig. 11 Real part of permeability



Fig. 12 RSRM unit cell

Fig. 13 Metamaterial circular microstrip patch antenna as AUT (top and bottom view)

3 Conclusion

In conclusion, a variety of metamaterial antenna designs for biomedical applications
has been discussed. The competency of a metamaterial is determined by evaluating its
performance in terms of resonant frequency, gain, efficiency, radiation pattern,
reflection coefficient magnitude, power ratio and bandwidth. Among the challenges in
realizing ideal metamaterial designs are obtaining optimum efficiency and a compact
antenna size; additional effort in designing ideal metamaterials must therefore be
carried on together with the metamaterial antenna designs.

References

1. Gangwar K, Gangwar R (2014) Metamaterials: characteristics, process and applications. Adv


Electron Electric Eng 4:97–106
2. Mendhe SE, Kosta YP (2011) Metamaterial properties and applications. Int J Inf Technol
Knowl Manag 4(1):85–89
3. Sihvola A (2007) Metamaterials in electromagnetics. Metamaterials 1(1):2–11
4. Yan S (2015) Metamaterial design and its application for antennas. KU Leuven, Science,
Engineering & Technology
5. Anandhimeena B, Selvan PT, Raghavan S (2016) Compact metamaterial antenna with high
directivity for bio-medical systems. Circuits Syst 7:4036–4045
6. Islam MM, Islam MT, Samsuzzaman M, Faruque MRI, Misran N, Mansor MF (2015) A
miniaturized antenna with negative index metamaterial based on modified SRR and CLS unit
cell for UWB microwave imaging applications. Materials 8:392–407
7. Ali T, Subhash BK, Biradar RC (2018) Design and analysis of two novel metamaterial unit
cell for antenna engineering. In: Proceedings of 2018 2nd international conference on
advances in electronics, computers and communications, pp 1–4
8. Khombal M, Bagchi S, Harsh R, Chaudhari A (2018) Metamaterial unit cell with negative
refractive index at C band. In: 2018 2nd international conference on electronics, materials
engineering and nano-technology, IEMENTech 2018, pp 1–4
9. Rajput GS, Gwalior S (2012) Design and analysis of rectangular microstrip patch antenna
using metamaterial for better efficiency. Int J Adv Technol Eng Res 2:51–58
10. Koutsoupidou M, Karanasiou IS, Uzunoglu N (2013) Rectangular patch antenna on split-ring
resonators substrate for THz brain imaging: modeling and testing. In: 13th IEEE international
conference on bioinformatics and bioengineering, BIBE 2013. IEEE, pp 1–4
11. Singh G, Marwaha A (2015) A review of metamaterials and its applications. Int J Eng Trends
Technol 19(6):305–310
12. Hosseinzadeh HR (2018) Metamaterials in medicine: a new era for future orthopedics. Orthop
Res Online J 2(5):1–3
13. Tütüncu B, Torpi H, Urul B (2018) A comparative study on different types of metamaterials
for enhancement of microstrip patch antenna directivity at the Ku-band (12 GHz). Turk J
Electr Eng Comput Sci 26:1171–1179
Refraction Method of Metamaterial
for Antenna

Maswani Khairi Marzuki, Mohd Aminudin Jamlos,


Wan Azani Mustafa, and Khairul Najmy Abdul Rani

Abstract This paper reviews several refraction methods of metamaterials.
A metamaterial is an engineered structure that produces electromagnetic properties
not naturally occurring in ordinary materials, such as negative permittivity, negative
permeability and a negative refractive index. This review paper focuses on
negative refractive index applications in the microwave and optical
frequency ranges. Each method covers a different frequency range. The split ring res-
onator used at microwave frequencies enhances the gain, while the fishnet-chiral planar
structure is used at photonic frequencies. The photonic metamaterial acts similarly to a
lens, which leads to enhancing the gain of the microwave antenna.

Keywords Refraction method · Metamaterial · Antenna

1 Introduction

Metamaterial is an artificial material introduced to the world by researchers in the 19th century.
It is known for its unique properties, which do not occur naturally
in other materials [1]. It is formed by multiple composite materials or
meta-atoms arranged in a repeating pattern, also known as a unit cell. The
metamaterial structured atoms are much larger than conventional atoms but much
smaller than the wavelength of the incident waves. The wavelength for microwave
radiation is in millimeters while for photonic metamaterials it is in nanometers [2]. Each
design provides different properties and is capable of manipulating the electro-
magnetic waves, for example blocking, absorbing, enhancing and bending the incident
wave. It also affects electromagnetic radiation or light [3].
The idea to create an unusual material like metamaterial arose because of the
limited abilities of natural materials, which have only positive characteristics, such as
M. K. Marzuki · M. A. Jamlos · W. A. Mustafa · K. N. A. Rani
Faculty of Engineering Technology, Universiti Malaysia Perlis, UniCITI ALAM Campus,
Sungai Chuchuh, 02100 Padang Besar, Perlis, Malaysia


positive dielectric permittivity and positive magnetic permeability, and are also known as
“double positive” materials. Metamaterials can be characterized into two classes:
“single-negative” metamaterials, where either the permittivity or the permeability is
negative, and which support evanescent waves; and “double-negative” metamaterials,
where both the permittivity and the permeability are negative, which leads to a negative
refractive index [4].
The focus of this paper is to explore the applications of negative index
metamaterials (NIM). Theoretically, NIMs are referred to as left-handed materials
(LHM), where the Poynting vector is antiparallel to the wave vector. This differs from
right-handed materials, where the Poynting vector is parallel to the wave vector
and the permittivity and permeability are positive [5]. The important property of a NIM is that it
can bend or refract light differently from a common positive index
material: the refracted light lies on the same side of the normal as the incident
light. A NIM with a −1 refractive index would provide ultrahigh resolution and give
the super-lensing effect. NIMs are used in a variety of applications and can be distin-
guished by different methods [6].

2 Refraction Method

There are several refraction methods of metamaterials discussed in this section. Each
method is used in different applications depending on the design of the unit cell. The
first method uses a cylindrical lens antenna as shown in Fig. 1. The researchers use
this method to replace the array antenna used at the base station for the next
generation mobile system (5G). It supports multi-beam and
Fig. 1 Cylindrical lens antenna

Fig. 2 Huygens’ metasurfaces

multi-frequency use. Besides that, the negative refractive index reduces the thick-
ness of the lens, and the refractive index obtained for this application is n = 2 [7].
Huygens’ metasurface method also produces a negative refractive index, which is
used to focus the beam of the signal. This method is printed on two bonded boards
using standard PCB fabrication techniques, even though there are many stacked and
interspaced layers, as shown in Fig. 2 [8].
The split-ring resonator (SRR) is commonly used in metamaterial antennas for
many applications, depending on the design, as shown in Fig. 3. Many researchers tend to
use this method because of its design characteristics: the permeability value is
controlled by the radius and width of the ring [9]. Five different designs are
discussed for this method. Firstly, a design used the double circular slot ring res-
onator. It acts as a planar surface lens, and a 3-dB transmission band of 2 GHz is
obtained between 8.55 and 10.55 GHz. A high gain antenna is then obtained by
placing a double stacked meta-surface lens over a microstrip patch antenna, and the gain is
enhanced by 8.55 dB in the H-plane and 6.20 dB in the E-plane. Lastly, the cross polarization is
improved by 8 dB [10]. There is also a squared SRR design, which is used to synthesize a
negative refractive index lens and a parabolic lens. This method uses 90 unit cells to get
n = ∞ at 11.6 GHz. The combination of these two meta-surfaces is able to focus the
energy in a point despite the power losses in the air [11]. Besides that, the com-
bination of square and circular shapes is designed to exhibit a negative refractive index
in the 5.7 to 6 GHz frequency band [12], and other researchers also used this design to
produce a negative refractive index in the S-band range between 2.2–3.3 GHz, which
resonated at 2.5 GHz. The radiation directivity was also enhanced and it could be used for
wireless power transfer applications [13]. Lastly, the SRR design is not limited to a

Fig. 3 a Double circular slot ring resonator. b Squared split-ring resonator. c Square-circular
split-ring resonator. d S-shape resonator

Fig. 4 a Chiral planar. b Fishnet structure. c Fishnet-like chiral metamaterial

circular or square shape only. One of the researchers managed to design an SRR in an
S-shape, as shown in Fig. 3d. The negative refractive index occurred at higher
frequencies, between 5 and 9 GHz [14].
All the methods discussed above are
used to obtain a negative index at microwave frequencies; none of them
is used at optical frequencies. Therefore, the fishnet-chiral planar method is
introduced, as shown in Fig. 4. There are three designs reviewed in this section. The
first design is a chiral planar design used at optical frequencies. It manages to reduce the
losses of the negative index metamaterial and exhibits polarization effects for light
fields [2]. Then, the fishnet structure design was introduced, and the researchers found
that this method is used to obtain negative permeability and is able to reach the highest
figure of merit (FOM) without loss compensation. Besides that, the light passing
through undergoes negative refraction at the interface and focuses at the far
field; the negative index metamaterial (NIM) slab acts similarly to a lens. Lastly, the
combination of the fishnet and chiral planar structures was designed, known as the fishnet-like
chiral metamaterial. It is used to reduce the losses exhibited by the chiral metama-
terial and exhibits negative refractive indices in three frequency bands [15].

3 Conclusion

Metamaterial capabilities have been explored in many applications, as reviewed in this paper,
by using negative index metamaterials. However, most of the applications are in the
microwave frequency range; therefore, it is worthwhile to explore the photonic
regime further. As reviewed, the fourth method, the fishnet-chiral planar design, is able to
manipulate electromagnetic radiation or light. There are three different capa-
bilities of this method based on its design: it can exhibit polarization
effects of light, bend and focus the light at a point, and act similarly to a lens. With
these properties, it can be used to explore electromagnetic radiation further and to
manipulate light properties.

References

1. Kuse R, Hori T, Fujimoto M (2015) Variable reflection angle meta-surface using double
layered FSS. In: 2015 IEEE international symposium on antennas and propagation & USNC/
URSI national radio science meeting, Canada. IEEE, pp 872–873
2. Linden S, Wegener M (2007) Photonic metamaterials. In: Conference proceedings of the
international symposium on signals, systems and electronics, USA, pp 147–150
3. Zhu B, Huang C, Zhao J, Jiang T, Feng Y (2010) Manipulating polarization of
electromagnetic waves through controllable metamaterial absorber. In: 2010 Asia-pacific
microwave conference, Japan. IEEE, pp 1525–1528
4. Duan ZY, Guo C, Guo X, Chen M (2016) Double negative-metamaterial based terahertz
radiation excited by a sheet beam bunch. Phys Plasmas 20(9):1–6
5. Solymar L, Shamonina E (2009) Waves in metamaterials. Oxford University Press, Oxford
6. Yang J, Xu F, Yao S (2018) A dual frequency Fabry-Perot antenna based on metamaterial
lens. In: 2018 12th international symposium on antennas, propagation and EM theory
(ISAPE), China. CRIRP, pp 1–3
7. Hamid S, Ali MT, Abd Rahman NH, Pasya I, Yamada Y, Michishita N (2016) Accuracy
estimations of a negative refractive index cylindrical lens antenna designing. In: Proceedings
of the 2016 6th IEEE-APS topical conference on antennas and propagation in wireless
communications, APWC, USA. IEEE, pp 23–26
8. Wong Joseph PS (2015) Design of Huygens’ metasurfaces for refraction and focusing.
A dissertation submitted to the faculty of The University of Toronto in partial fulfillment of
requirement for the degree of Doctor of Philosophy in Electrical and Computer Engineering
9. Singh AK, Abegaonkar MP, Koul SK (2017) A negative index metamaterial lens for antenna
gain enhancement. In: International symposium on antennas and propagation, USA. IEEE,
pp 1–2
10. Yang J, Xu F, Yao S (2018) A dual frequency Fabry-Perot antenna based on metamaterial
lens. In: 12th international symposium on antennas, propagation and EM theory (ISAPE),
China. IEEE, pp 1–3
11. Pan CW, Kehn MNM, Quevedo-Teruel O (2015) Microwave focusing lenses by synthesized
with positive or negative refractive index split-ring resonator metamaterials. In: International
workshop on electromagnetics: applications and student innovation competition, IWEM,
pp 1–2
12. Khombal M, Bagchi S, Harsh R, Chaudhari A (2018) Metamaterial unit cell with negative
refractive index at C band. In: 2nd international conference on electronics, materials
engineering and nano-technology, India. IEEE, pp 1–4
13. Baghel AK, Nayak SK (2018) Negative refractive index metamaterial for enhancing radiation
directivity in S-band. In 3rd international conference on microwave and photonics, India.
IEEE, pp 1–2
14. Fiddy MA, Adams R, Weldon TP (2017) Exploiting metamaterials: fundamentals and
applications. A dissertation submitted to the faculty of The University of North Carolina at
Charlotte in partial fulfillment of the requirements for the degree of Doctor of Philosophy in
Electrical Engineering
15. Fernández O, Gómez Á, Vegas A, Molina-Cuberos GJ, García-Collado AJ (2017) Novel
fishnet-like chiral metamaterial structure with negative refractive index and low losses. In:
IEEE antennas and propagation society international symposium proceedings, USA, pp 1959–
1960
Circular Polarized 5.8 GHz Directional
Antenna Design for Base Station
Application

Mohd Aminudin Jamlos, Nurasma Husna Mohd Sabri,


Wan Azani Mustafa, and Maswani Khairi Marzuki

Abstract Nowadays, the research, development and utilization of directional antennas
with circular polarization have grown rapidly for base station applications.
The High Gain Antenna (HGA) is a directional antenna that focuses on a narrow
beamwidth for this application. The antenna permits more precise targeting
of the radio signal and is usually placed in an open area so that the radio waves to be
transmitted will not be interrupted. In this paper, methods for circularly polarized
microstrip patch antenna designs are reviewed. In order to realize a circularly
polarized antenna, the patch undergoes some design modification, while an array
antenna is designed to improve the antenna performance and realize a high gain so
that it is suitable to be used in base station applications.

Keywords Circular polarized · Base station · Antenna

1 Introduction

A circularly polarized 5.8 GHz directional antenna is designed to be used for base
station applications. The antenna must have very wideband impedance
matching, a stable radiation pattern in a wide frequency band and a high
cross-polarization ratio over a wide angle range [1–3]. For this research, a circularly
polarized microstrip patch antenna is designed since it is suitable for wireless
communication. In order to make a circularly polarized design, the patch must
undergo some modification, such as introducing perturbations, slots or slits, or
truncating corners [1, 4].
In order to enable the antenna to work in the base station, the antenna must
have a very high gain so that the signal can be transmitted and received
consistently. Thus, an array antenna is designed for improving the antenna gain

M. A. Jamlos · N. H. Mohd Sabri · W. A. Mustafa · M. K. Marzuki
Faculty of Engineering Technology, Universiti Malaysia Perlis, UniCITI ALAM Campus,
Sungai Chuchuh, 02100 Padang Besar, Perlis, Malaysia


performance in base station applications [5], where a rectangular microstrip patch
antenna array is designed and some modifications to the patch are made in order to
make a circularly polarized antenna for use in base station applications.
Besides, the requirement of a directional radiation pattern is important since it
provides increased performance and reduced interference during the transmission and
reception of communications [6]. A directional antenna is designed to function more
effectively in some directions than in others. The reason for that directionality is to
improve the transmission and reception of the communication signal as well as to
reduce interference [5]. The antenna for the base station application operates at a
frequency of 5.8 GHz to meet the requirement of large bandwidth and gain for base
station applications.

2 Microstrip Antenna

Microstrip antennas are associated with low cost, light weight and conformal structures
which can be integrated with feed networks and active devices. The basic structure
of a microstrip antenna consists of a radiating patch on one side of a dielectric
substrate and a ground plane on the other side of the substrate [1, 3, 5]. A microstrip
patch antenna structure is shown in Fig. 1. The patch is generally made of a con-
ducting material like copper or gold, and it can take any possible shape. The patch
and the feed lines are photo-etched on the substrate; as this antenna is etched on the
substrate, it can take any desired shape. A rectangular patch is the simplest
patch shape to be etched and analyzed. The microstrip antenna has the advantages of low
profile, light weight, low cost, and ease of integration with active components and radio
frequency devices [3, 7].
However, the microstrip antenna also has disadvantages, namely low gain,
low efficiency and low power handling capability, and all of these disadvantages can be
overcome by using an array concept or a MIMO antenna [5, 8]. Besides, the
radiation pattern of an antenna depends on its dimensions. It also depends on the
effective permittivity of the substrate, which is dependent on the width and height of
the patch.

Fig. 1 Microstrip patch antenna structure [1]



Fig. 2 Types of polarization. a Linear. b Circular. c Elliptical [9]

3 Antenna Polarization

Polarization is the property of an electromagnetic wave describing the time-varying
direction and relative magnitude of the electric field vector as observed along the
direction of propagation. Transmitting and receiving antennas should be similarly
polarized, otherwise there will be more losses. There are three types of polarization:
linear polarization, circular polarization and elliptical polarization.
Figure 2 shows the three types of polarization [9].
The use of linear polarization requires the transmitting and receiving antennas to be
well aligned. This alignment limitation can be removed by using circular polarization,
which is compatible with this research project since a circularly polarized design is
needed [10]. Circularly polarized antennas used to be an exotic microwave technology
for communication; the field of a CP antenna is always rotating. Circular polarization
(CP) can be achieved by making the axial ratio equal to one. Besides, other researchers
state that circularly polarized antennas have an axial ratio of less than 3 dB at a 90°
phase shift [11] (see the sketch below). Circular polarization has two types, Right Hand
CP (RHCP) and Left Hand CP (LHCP). For the practical implementation of an antenna,
it must be considered whether the antenna is an LHCP or an RHCP antenna; if the
transmitting antenna is LHCP and the receiving antenna is RHCP, there will be a 25 dB
gain difference between them. Antenna polarization losses also exist when the transmitting
antenna and receiving antenna polarizations are different [12].
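As a quick check of these conditions, the axial ratio of a wave with orthogonal field components Ex and Ey and phase difference δ can be evaluated with the standard textbook relation, sketched below for illustration:

import math

def axial_ratio_db(Ex, Ey, delta_deg):
    """Axial ratio (dB) of a wave with orthogonal components Ex, Ey and phase difference delta."""
    d = math.radians(delta_deg)
    s = math.sqrt(Ex**4 + Ey**4 + 2.0 * Ex**2 * Ey**2 * math.cos(2.0 * d))
    major = math.sqrt(0.5 * (Ex**2 + Ey**2 + s))            # semi-major axis of the ellipse
    minor = math.sqrt(max(0.0, 0.5 * (Ex**2 + Ey**2 - s)))  # semi-minor axis
    return float('inf') if minor == 0.0 else 20.0 * math.log10(major / minor)

# Equal amplitudes with a 90-degree phase shift give AR = 0 dB (ideal circular polarization)
print(axial_ratio_db(1.0, 1.0, 90.0))   # -> 0.0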

4 Methods for Circular Polarized Antenna Design

Circularly polarized (CP) antennas are increasingly attractive in wireless communi-
cation systems [13]. Circular polarization can be obtained if two orthogonal modes
with equal amplitudes are excited with a 90° time-phase difference. This can be
accomplished, for instance, by adjusting the physical dimensions of the microstrip
patch or by various feed arrangements [14, 15]. The figures below show some of the
antenna designs resulting in circular polarization from several researchers.
Several researchers have modified the antenna design to obtain circular polariza-
tion. As presented by Thoetphan Kingsuwannaphong, the design of a 5.7 GHz cir-
cularly polarized antenna uses a double feeder in order to avoid interference
from the adjacent channels of other wireless devices. However, that antenna requires two input
ports with 0° and 90° phase inputs to achieve the circular polarization property. Since it is
possible to create two output signals with a 90° phase difference, a compact
circularly polarized antenna with an inset feed and a slot is designed, as shown in Fig. 3. The
slot at the edge of the circular patch is made to achieve circular polarization. The resulting
axial ratio is shown in Fig. 4. From the simulation, the axial
ratio is acceptable: at a 90° phase, the AR is below 3 dB, so the design is
circularly polarized.
Another way to achieve circular polarization is to make an inclined or
diagonal slot at the centre of the patch; the slot technique is a way to obtain
circular polarization [16–18]. As contributed by one of the researchers, the
antenna element is a square with an inclined slot at the center. The antenna is
fed by a microstrip line having a characteristic impedance of 100 Ω, and this
antenna was mounted on an FR4 substrate. The antenna dimensions are presented in
Fig. 5. Besides, by introducing asymmetrical slits in the diagonal direction of the
square microstrip patches [18], a single coaxial-feed microstrip patch antenna is
realized for circularly polarized radiation with a compact antenna size. The impe-
dance and axial ratio bandwidths are small, around 2.5 and 0.5%.
Besides, in order to make a circularly polarized antenna, some modifications to
the patch are done, such as making truncated designs on the patch or making slots,
and so on. In previous research, the proposed antenna was developed by com-
bining two array antennas excited from a 50 Ω coaxial feed probe; the array
Fig. 3 Circular polarized antenna design [15]

Fig. 4 Simulation result of axial ratio

Fig. 5 Patch antenna design with inclined slot [16]

Fig. 6 Circular polarized array antenna design [12]

antenna is designed with 4 patch elements on the substrate, and each element is
truncated at the corner of the patch to achieve a circularly polarized result [12, 19, 20].
The designed antenna is shown in Fig. 6.
A single-feed CP U-slot microstrip antenna is proposed in [21]. The asymmet-
rical U-slot structure is able to generate two orthogonal modes for CP operation
without truncating any corner of the square patch. The CP radiation is achieved by
etching a complementary split-ring resonator on the patch; the orientation of the
etched gap relative to the current propagation direction causes the antenna to generate CP
waves. By cutting asymmetrical slots into the square patches, a single probe-fed
microstrip antenna is realized for CP radiation [22]. A new technique to design a
single-feed CP microstrip antenna using a Fractal Defected Ground Structure (FDGS)
has also been presented [21, 23]. By using this method, the polarization of the originally
linearly polarized microstrip antenna is brought to the level required for CP
radiation.
Another technique to obtain a circularly polarized antenna is given in [24]. In that work, a
circular microstrip patch antenna and its two-element array have been proposed for
ISM band applications. The proposed antenna and its array operate in the
5.8 GHz ISM band. The antenna consists of a circular patch which has an elliptical
slot and a vertical strip at the center of the patch, as shown in Fig. 7. The
antenna shows a circularly polarized radiation pattern with good return loss
characteristics.

Fig. 7 Circular polarized array antenna design [24]

5 Conclusion

In conclusion, this paper describes methods for circularly polarized microstrip
patch antenna design and ways to improve its performance so that it is better suited
to base station applications. The bandwidth of the microstrip antenna is its main
limitation, since a base station requires a large bandwidth. Methods for obtaining
circular polarization, including modifying the shape of the patch, using different
feeding techniques and arranging the antenna in an array configuration, are
described; these also help to increase the bandwidth of the antenna. Varying the
slotted antenna in terms of the shape and size of the slot also helps in achieving
increased bandwidth, improved efficiency and higher gain.

References

1. Kingsuwannaphong T, Sittakul V (2018) Compact circularly polarized inset-fed circular
microstrip antenna for 5 GHz band. Comput Electr Eng 65:554–563
2. Chen W-S, Wu C-K, Wong K-L (2002) Compact circularly-polarised circular microstrip
antenna with cross-slot and peripheral cuts. Electron Lett 34:1040
3. Nayan MKA, Jamlos MF, Jamlos MA (2014) Circular polarized phased shift 90° MIMO array
antenna for 5.8 GHz application. In: IEEE international symposium on telecommunication
technologies, ISTT, vol 76, pp 169–173
4. Karvekar S, Deosarkar S, Deshmukh V (2014) Design of compact probe fed slot loaded
microstrip antenna. In: International conference on communication and signal processing,
ICCSP, pp 387–390
5. Midasala V, Siddaiah P (2016) Microstrip patch antenna array design to improve better gains.
Procedia Comput Sci 85:401–409
6. Fauzi DLN, Hariyadi T (2018) Design of a directional microstrip antenna at UHF-band for
passive radar application. IOP Conf Ser Mater Sci Eng 384:012006
7. Balanis CA (2005) Antenna theory analysis and design, 3rd edn. Wiley, Hoboken

8. Nayan MKA, Jamlos MF, Jamlos MA, Lago H (2014) MIMO 2 × 2 RHCP array antenna for
point-to-point communication. In: IEEE symposium on wireless technology and applications,
ISWTA, pp 121–124
9. Orban D, Moernaut GJK (2006) The basics of patch antennas. Orban Microwave Products,
pp 1–4
10. Lacoste R (2010) Robert Lacoste’s the darker side: practical applications for electronic design
concepts. Elsevier Inc., Amsterdam
11. Fujita K, Yoshitomi K, Yoshida K, Kanaya H (2015) A circularly polarized planar antenna on
flexible substrate for ultra-wideband high-band applications. AEU Int J Electron Commun
69:1381–1386
12. Kunooru B, Nandigama SV, Rani SS, Ramakrishna D (2019) Analysis of LHCP and RHCP
for microstrip patch antenna. In: International conference on communication and signal
processing (ICCSP), pp 0045–0049
13. Jamlos MA, Jamlos MF, Ismail AH (2015) High performance of coaxial feed UWB antenna
with parasitic element for microwave imaging. Microw Opt Technol Lett 57:649–653
14. Jackson DR, Long SA, Williams JT, Davis VB (1997) Computer aided design of rectangular
microstrip antennas of advances in microstrip and printed antennas, 2nd edn. Wiley, Hoboken
15. Garg AIR, Bhartia P, Bahl I (2001) Microstrip antenna design handbook. Artech House,
Boston
16. Nayan MK, Jamlos MF, Lago H, Jamlos MA (2015) Two-port circular polarized antenna
array for point-to-point communication. Microw Opt Technol Lett 57:2328–2332
17. Madhuri S, Tiwari VN (2016) Review of circular polarization techniques for design of
microstrip patch antenna. In: International conference on recent cognizance in wireless
communication & image processing, pp 663–669
18. Nasimuddin, Chen ZN, Esselle KP (2008) Wideband circularly polarized microstrip antenna
array using a new single feed network. Microw Opt Technol Lett 50:1784–1789
19. Liang D, Hosung C, Robert WH, Hao L (2005) Simulation of MIMO channel capacity with
antenna polarization. IEEE Trans Wireless Commun 4(4):1869–1873
20. Wei K, Li JY, Wang L, Xu R, Xing ZJ (2017) A new technique to design circularly polarized
microstrip antenna by fractal defected ground structure. IEEE Trans Antennas Propag
65:3721–3725
21. Nasimuddin, Qing X, Chen ZN (2011) Compact asymmetric-slit microstrip antennas for
circular polarization. IEEE Trans Antennas Propag 59:285–288
22. Gupta K, Jain K, Singh P (2014) Analysis and design of circular microstrip patch antenna at
5.8 GHz. Int J Comput Sci Inf Technol 5:3895–3898
23. Nayan MK, Jamlos MF, Jamlos MA (2015) Circularly polarized MIMO antenna array for
point-to-point communication. Microw Opt Technol Lett 57:242–247
24. Singh N, Yadav DP, Singh S, Sarin RK (2010) Compact corner truncated triangular patch
antenna for WiMax application. In: Mediterranean microwave symposium, MMS, pp 163–
165
Medical Image Enhancement
and Deblurring

Reza Amini Gougeh, Tohid Yousefi Rezaii, and Ali Farzamnia

Abstract One of the most common image artifacts is blurring. Blind methods have
been developed to restore a clear image from a blurred input. In this paper, we introduce
a new method which optimizes previous works and is adapted to medical images.
Optimized non-linear anisotropic diffusion is used to reduce noise by choosing the
constants correctly. After de-noising, edge sharpening is done using shock filters;
an enhanced variant called coherence-enhancing shock filters gives strongly
sharpened edges. To obtain the blur kernel, we use the coarse-to-fine method. In the
last step, we use a spatial prior before restoring the unblurred image. Experiments
show that combining these methods can outperform previous image restoration
techniques and yields reliable accuracy.

Keywords Medical images · Blind deconvolution · Deblurring

1 Introduction

Medical images are an indispensable component of the diagnosis and treatment
system, so accurate images are needed. Blur is a type of medical image artifact that
has various sources, such as body movement or the detector.
The blur kernel determines the effect of the blur on the image. If the blur is
shift invariant, it can be modeled as a convolution of the original image with
the blur kernel; thus, obtaining a clear image becomes a deconvolution problem. In
non-blind deconvolution, the blur function is known, and the problem is to find the
original image from the blurred image. In blind deconvolution, the blur function is
unknown [1]. Among the non-blind methods, we can refer to the Wiener filter and
the Lucy-Richardson method, which were introduced decades ago with initial

R. Amini Gougeh · T. Yousefi Rezaii
Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran
A. Farzamnia (&)
Faculty of Engineering, Universiti Malaysia Sabah, Kota Kinabalu, Sabah, Malaysia
e-mail: [email protected]


assumptions about the blur function. Among the newer blind methods, we can mention
the Fergus method [2].
In this article, we investigate the blind deconvolution approach and try to obtain
an efficient method for the medical field by improving previous work. A clear image
can be recovered fully and correctly only when the blurred image is noise-free and
the blur kernel is estimated without error, so the proposed algorithm tries to approach
this ideal. As mentioned, blurry images are noisy, so we have the following
equation for the blurry image:

b = i \otimes k + n        (1)

where b is the blurry image, i is the clear image, k is the blur kernel, n is noise and
\otimes denotes the convolution operator. Taking the Fourier transform, Eq. (1)
becomes the following relation:

B = I \cdot K + N        (2)

Figure 1 shows this equation applied to a carotid MRI image.
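To make the model concrete, the short sketch below generates a blurred, noisy observation from a clean image according to Eq. (1) and forms its Fourier-domain counterpart of Eq. (2). It is purely illustrative: the image, kernel and noise level are invented for the example, and NumPy/SciPy are assumed (the paper itself provides no code).

```python
# Illustrative only: synthetic data following the blur model of Eqs. (1)-(2).
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)
i = rng.random((128, 128))               # stand-in for the clear image
k = np.zeros((15, 15)); k[7, :] = 1.0    # a simple horizontal motion-blur kernel
k /= k.sum()
n = 0.01 * rng.standard_normal(i.shape)  # additive noise

b = fftconvolve(i, k, mode="same") + n   # Eq. (1): b = i (*) k + n

# Eq. (2) in the Fourier domain (up to boundary handling):
B, I, K = np.fft.fft2(b), np.fft.fft2(i), np.fft.fft2(k, s=i.shape)
```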


Projection-based and maximum-likelihood methods are the two major types of
blind deconvolution. The projection-based approach retrieves the blur function and
the real image simultaneously; the procedure is repeated until it meets a predefined
criterion, with the first step being the estimation of the blur function. One of the
benefits of this approach is that it is not sensitive to noise. The second approach
performs maximum-likelihood estimation of blur parameters, such as the covariance
matrix. Since the estimated blur function is not unique, it is possible to constrain
the candidate functions by considering the size and symmetry of the estimated function.
Significant advantages of this approach are its low computational complexity and
that it also helps to estimate the blur, noise and real image power spectra [3].
Blur kernel estimation is an ill-posed problem, so various types of regularization
terms have been used in the models. Fergus et al. [2] used a heavy-tailed distribution,
employing a mixture of Gaussians and Bayes' theorem to estimate the kernel. Shan
et al. [4] developed a parametric model to estimate the heavy-tailed distribution of
natural image gradients. Levin et al. [5] used hyper-Laplacian regularization terms
for the image gradient approximation. Cho and Lee [1] used a coarse-to-fine method
to determine the blur kernel; this iterative method is combined with a bilateral filter
and uses Gaussian regularization terms. Notably, our method is an adaptation of this
approach.

Fig. 1 Practical Eq. (1)



According to previous studies of blur kernel estimation, the existence of
appropriate edges makes the estimation more accurate. Combined methods, such as
shock filters with bilateral filters, have been used by Money and Kang [6] and
Alvarez and Mazorra [7]. Xu et al. [8] used zero norms in the kernel estimation
equations, which handles noise well and prevents errors that appear around
the edges.
Our paper is organized as follows. In Sect. 2, we describe the structure of our
algorithm and the methods we used. Numerical aspects and results are briefly
sketched in Sect. 3. The last section gives a summary which concludes the paper.

2 Materials and Methods

The primary purpose of the iterative alternating optimization is to refine the motion
blur kernel progressively. The final deblurring result is obtained by the last
non-blind deconvolution operation, which is performed with the final kernel K and the
given blurred image B. The intermediate latent images estimated during the
iterations have no direct influence on the deblurring result; they only affect it
indirectly by contributing to the refinement of kernel K. The success of previous
iterative methods comes from two essential properties, namely sharp edge
restoration and noise suppression in smooth regions. These attributes help to
estimate the kernel accurately [1].
Coarse-to-fine schemes have also been developed for medical images; Chen et al.
[9] developed a new framework for 3D brain MR image registration. We used
another method based on spatial priors.

2.1 Noise Reduction

In the first phase of blur function estimation, we try to denoise the blurry image.
The method used in this study is based on the Perona-Malik method [10], which
relies on the use of partial derivatives in image analysis. The values of the
conduction coefficient and diffusion rate play an important role in noise reduction;
a weakness of conventional methods is the manual selection of these constants. In our
method, the image gradient is calculated over the four major neighborhoods of each
pixel, and then the difference between the gradients is calculated in the horizontal
and vertical directions. By calculating the average gradient value and its variance,
we obtain an appropriate criterion for the magnitude of the image gradient changes,
which has a linear relationship with the diffusion rate. Choosing the right values is
critical to maintaining the edges of the image: larger values make the image
overly smooth, while at low values noise reduction will not be possible.

Equation (3) specifies the output image of this method at the (t + 1)th iteration:

I_{i,j}^{t+1} = I_{i,j}^{t} + \lambda \left[ C_N \cdot \nabla_N I + C_S \cdot \nabla_S I + C_E \cdot \nabla_E I + C_W \cdot \nabla_W I \right]_{i,j}^{t}        (3)

where 0 \le \lambda \le 0.25 for the numerical scheme to be stable, N, S, E and W are the
subscripts for the North, South, East and West neighbors, and the symbol \nabla indicates
nearest-neighbor differences:

\nabla_N I_{i,j} \equiv I_{i-1,j} - I_{i,j}
\nabla_S I_{i,j} \equiv I_{i+1,j} - I_{i,j}        (4)
\nabla_E I_{i,j} \equiv I_{i,j+1} - I_{i,j}
\nabla_W I_{i,j} \equiv I_{i,j-1} - I_{i,j}

The conduction coefficients are updated at every iteration as a function of the
brightness gradient:

C_{N_{i,j}}^{t} = g\left( \left| (\nabla I)_{i+1/2,\,j}^{t} \right| \right)
C_{S_{i,j}}^{t} = g\left( \left| (\nabla I)_{i-1/2,\,j}^{t} \right| \right)        (5)
C_{E_{i,j}}^{t} = g\left( \left| (\nabla I)_{i,\,j+1/2}^{t} \right| \right)
C_{W_{i,j}}^{t} = g\left( \left| (\nabla I)_{i,\,j-1/2}^{t} \right| \right)

Figure 2 illustrates the four major neighborhoods of a pixel.
We used the robust function of Black et al. [11] as g(·):

g(\nabla I) = f(x) = \begin{cases} 0.67\left[ 1 - \left( \dfrac{x}{k\sqrt{5}} \right)^{2} \right]^{2}, & x \le k\sqrt{5} \\ 0, & \text{otherwise} \end{cases}        (6)

where k is the diffusion rate, which controls the sensitivity to edges.

\nabla_{NS} I = \nabla_N I - \nabla_S I, \qquad \nabla_{EW} I = \nabla_E I - \nabla_W I        (7)

Fig. 2 Discrete computational structure for simulation of the diffusion equation of Perona and Malik [10]

We calculated the gradient in the vertical and horizontal directions by (7); the
average gradient value is then calculated as follows:

\nabla I = \sqrt{ (\nabla_{NS} I)^{2} + (\nabla_{EW} I)^{2} }        (8)

According to the results of Hasanpor et al. [12], k has a linear relationship with the
variance of the gradients, so we have:

k = a \cdot \mathrm{Var}(\nabla I)        (9)

With respect to the noise properties, we can suggest an optimum value of a so that
k can be calculated more precisely and easily. After applying the modified
Perona-Malik filter, we obtain an image with less noise without removing image
features such as edges, which are essential for blur kernel estimation.
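As a rough illustration of Eqs. (3)–(9), the sketch below applies the explicit Perona-Malik update with the robust edge-stopping function of Black et al. [11] and sets the diffusion rate k automatically from the gradient variance. The constant a, the iteration count and the update rate λ are placeholders, and evaluating the conduction coefficients directly on the neighbour differences is a simplification of Eq. (5); the authors' MATLAB implementation is not reproduced here.

```python
# A minimal sketch of the modified Perona-Malik diffusion of Eqs. (3)-(9).
# Constants (iters, lam, alpha) are assumptions, not values from the paper.
import numpy as np

def modified_perona_malik(img, iters=5, lam=0.2, alpha=1.0):
    I = img.astype(float).copy()
    for _ in range(iters):
        # Eq. (4): nearest-neighbour differences (N, S, E, W)
        dN = np.roll(I, 1, axis=0) - I
        dS = np.roll(I, -1, axis=0) - I
        dE = np.roll(I, -1, axis=1) - I
        dW = np.roll(I, 1, axis=1) - I
        # Eqs. (7)-(9): automatic diffusion rate from the gradient variance
        grad = np.sqrt((dN - dS) ** 2 + (dE - dW) ** 2)
        k = alpha * grad.var() + 1e-8
        # Eq. (6): robust edge-stopping function of Black et al. [11]
        def g(x, k=k):
            s = k * np.sqrt(5.0)
            out = 0.67 * (1.0 - (x / s) ** 2) ** 2
            return np.where(np.abs(x) <= s, out, 0.0)
        # Eq. (3): explicit update with conduction coefficients (0 <= lam <= 0.25)
        I = I + lam * (g(dN) * dN + g(dS) * dS + g(dE) * dE + g(dW) * dW)
    return I
```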

2.2 Shock Filter

The shock filter was introduced by Osher and Rudin [13] to restore salient edges. One
of its disadvantages is that it enhances remnant noise. Money and Kang [6] used a
shock filter to find sharp edges and then estimated a blur kernel. Weickert [14]
introduced an enhanced version called coherence-enhancing shock filters, which we
used in our research.
The basis of the shock filter is the transfer of gray values towards the edge from both
sides by applying morphological operations on the image so as to satisfy the
conditions of the differential equation. The two main operations in image morphology
are dilation and erosion.
The shock filter uses the sign function, with values {−1, 0, +1}, to select
between the two states (dilation and erosion). Applying such a method creates a severe
discontinuity, called a shock, at the boundary between the two zones of influence. We
use a Gaussian filter to smooth the image before solving the shock filter equation.

\frac{\partial I_s}{\partial t} = -\operatorname{sgn}(\Delta I_s)\, \lVert \nabla I_s \rVert        (10)

where \Delta I_s and \nabla I_s are the Laplacian and gradient of I_s, respectively. I_s is the
filtered image, which results from the following equation:

I_s = G_\sigma \otimes I_p        (11)

where I_p is the image after the de-noising stage and G_σ is a Gaussian filter with
standard deviation σ. σ determines the size of the resulting patterns. Often σ is

chosen in the range between 0.5 and 2 pixel units. It is the main parameter of the
method and has a strong impact on the result.
If the right edges are not selected, the estimated blur kernel will be less
accurate. Several modifications have been proposed in order to improve the
performance of shock filters; for instance, replacing ∇I_s with other expressions can
give a better edge detector.
Since the shock filter and the Perona-Malik method are iterative processes, the
number of iterations must be defined. Furthermore, it has been shown that a larger
number of salient edges does not always lead to more accurate estimates. The impact
of the iteration count is shown in Fig. 3.
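The sketch below implements the basic shock filter of Eqs. (10)–(11) with Gaussian pre-smoothing. It does not include the coherence-enhancing extension of Weickert [14] that the paper actually uses, and the step size, σ and iteration count are assumed values.

```python
# Simplified shock filter sketch for Eqs. (10)-(11); illustrative only.
import numpy as np
from scipy.ndimage import gaussian_filter, laplace

def shock_filter(img, iters=50, dt=0.1, sigma=1.0):
    I = img.astype(float).copy()
    for _ in range(iters):
        Is = gaussian_filter(I, sigma)           # Eq. (11): I_s = G_sigma (*) I_p
        gy, gx = np.gradient(I)
        grad_mag = np.sqrt(gx ** 2 + gy ** 2)
        # Eq. (10): dilation/erosion switching driven by the sign of the Laplacian
        I = I - dt * np.sign(laplace(Is)) * grad_mag
    return I
```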

2.3 Edge Selecting

In order to achieve useful edges, Xu and Jia [15] assumed an h × h window
centered at pixel x and moving over all parts of the blurred image; we can obtain a
criterion for choosing the correct gradients as follows:

r(x) = \frac{ \left\lVert \sum_{y \in N_h(x)} \nabla B(y) \right\rVert }{ \sum_{y \in N_h(x)} \lVert \nabla B(y) \rVert + 0.5 }        (12)

B is the blurred image and N_h(x) is the mentioned window. The numerator is the
magnitude of the summed gradients inside the window centered at x, giving an
estimate of the image structure. Flat areas of the image, where the pixel differences
are negligible, and also areas where the pixel sharpness is high (such as impulses)
have small r(x) values because they are neutralized by the other gradient
contributions. It should be noted that the above equation is computed for the x and y
coordinates separately (derived in two directions). The constant 0.5 is used for a
grayscale range of [0, 1]; for a system with values in [0, 255], 20 can be selected
instead of 0.5. The overall magnitude is:

Fig. 3 The output of the shock filter: (a) input image, (b) shock filter iterations = 5, (c) iterations = 50, (d) iterations = 150, (e) iterations = 250

r(x) = \sqrt{ (r_x)^{2} + (r_y)^{2} }        (13)

Figure 4 shows the calculated r(x) for the depicted image.
The phase is given by:

\theta = \arctan\!\left( \frac{r_x}{r_y} \right)        (14)

where \theta \in [-\pi/2, \pi/2]. The r values were then sorted in descending order into
four groups: [-\pi/2, -\pi/4), [-\pi/4, 0), [0, \pi/4) and [\pi/4, \pi/2]. A threshold value
was then defined to ensure a minimum number of pixels is selected in each group:

\tau_r = 0.5 \sqrt{ P_I P_K }        (15)

where P_I is the total number of pixels in the input image and P_K is the total number
of pixels in the kernel.
Using the Heaviside function H(·), the threshold is applied as:

M = H(r - \tau_r)        (16)

Another threshold was defined which works with the gradient magnitude. The selected
edges are determined as:

\tau_s = 2 \sqrt{ P_K }        (17)

\nabla I_s = \nabla I_{sh} \cdot H\!\left( \lVert M \nabla I_{sh} \rVert - \tau_s \right)        (18)

Fig. 4 (a) Input image (b) Calculated r(x)



I_{sh} is the shock-filtered image, and \tau_s is the mentioned threshold which guarantees
that at least 2\sqrt{P_K} pixels participate in the kernel estimation in each group. It also
excludes segments depending on \lVert M \nabla I_{sh} \rVert.
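A possible reading of Eqs. (12)–(18) is sketched below: r(x) is computed per direction with a moving window, combined by Eq. (13), thresholded with Eq. (16), and the shock-filtered gradients are then masked with Eqs. (17)–(18). The window size and kernel size are assumptions, the per-orientation-group handling is omitted, and the thresholds follow the printed formulas literally, so they may need rescaling in practice.

```python
# Sketch of the edge-selection step of Eqs. (12)-(18), after Xu and Jia [15].
import numpy as np
from scipy.ndimage import uniform_filter

def select_edges(B, I_sh, h=5, kernel_size=25):
    # Eq. (12), per direction: |sum of gradients| / (sum of |gradients| + 0.5)
    def r_dir(g):
        num = np.abs(uniform_filter(g, h) * h * h)
        den = uniform_filter(np.abs(g), h) * h * h + 0.5
        return num / den
    By, Bx = np.gradient(B.astype(float))
    r = np.sqrt(r_dir(Bx) ** 2 + r_dir(By) ** 2)          # Eq. (13)

    P_I, P_K = B.size, kernel_size ** 2
    tau_r = 0.5 * np.sqrt(P_I * P_K)                      # Eq. (15), as printed
    M = (r >= tau_r).astype(float)                        # Eq. (16): Heaviside threshold

    Iy, Ix = np.gradient(I_sh.astype(float))
    tau_s = 2.0 * np.sqrt(P_K)                            # Eq. (17), as printed
    mag = np.sqrt((M * Ix) ** 2 + (M * Iy) ** 2)
    keep = (mag >= tau_s).astype(float)                   # Eq. (18)
    return keep * Ix, keep * Iy
```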
Having computed the required edges, the next step is blur kernel estimation; our
target is the kernel k. This problem is ill-posed, so regularization terms are needed
to solve it correctly. Our problem is modeled as follows by Xu and Jia [15]:

E(k) = \lVert \nabla I_s \otimes k - \nabla B \rVert^{2} + \gamma \lVert k \rVert^{2}        (19)

To solve this problem, we separate the dimensions and express the convolution in
matrix form. This can be done by flipping both the rows and the columns of the
image and then multiplying locally corresponding entries and summing:

E(k) = \lVert A_x k - \nabla_x B \rVert^{2} + \lVert A_y k - \nabla_y B \rVert^{2} + \gamma \lVert k \rVert^{2}        (20)

Applying the first-order derivative:

\frac{\partial E(k)}{\partial k} = 2 A_x^{T} (A_x k - \nabla_x B) + 2 A_y^{T} (A_y k - \nabla_y B) + 2 \gamma k        (21)

Setting Eq. (21) to zero and then applying the Fast Fourier Transform (FFT) to all
variables gives:

\left( A_x^{T} A_x + A_y^{T} A_y + \gamma \right) k = A_x^{T} \nabla_x B + A_y^{T} \nabla_y B        (22)

Using Parseval's theorem:

k = F^{-1}\!\left( \frac{ \overline{F(\partial_x I_s)}\, F(\partial_x B) + \overline{F(\partial_y I_s)}\, F(\partial_y B) }{ \left| F(\partial_x I_s) \right|^{2} + \left| F(\partial_y I_s) \right|^{2} + \gamma } \right)        (23)

where F(·) and F^{-1}(·) denote the FFT and inverse FFT, respectively, and the overbar
denotes the complex conjugate. The blur kernel is thus recovered with Eq. (23).
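Below is a sketch of the closed-form Fourier-domain kernel estimate of Eq. (23). The regularization weight γ, the kernel support size and the simple gradient filters are assumptions, and the cropping and normalization at the end are a common post-step rather than something specified in the paper.

```python
# Sketch of the Fourier-domain kernel estimate of Eq. (23); illustrative only.
import numpy as np

def estimate_kernel(I_s, B, gamma=15.0, ksize=25):
    def dft(x):
        return np.fft.fft2(x, s=B.shape)
    Iy, Ix = np.gradient(I_s.astype(float))   # gradients of the selected sharp edges
    By, Bx = np.gradient(B.astype(float))     # gradients of the blurred image
    num = np.conj(dft(Ix)) * dft(Bx) + np.conj(dft(Iy)) * dft(By)
    den = np.abs(dft(Ix)) ** 2 + np.abs(dft(Iy)) ** 2 + gamma
    k = np.real(np.fft.ifft2(num / den))      # Eq. (23)
    k = np.fft.fftshift(k)                    # move the kernel support to the centre
    cy, cx = np.array(k.shape) // 2           # crop to an assumed support, then normalise
    k = k[cy - ksize // 2: cy + ksize // 2 + 1, cx - ksize // 2: cx + ksize // 2 + 1]
    k = np.clip(k, 0, None)
    return k / (k.sum() + 1e-12)
```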
To restore the image, we need to model the ill-posed problem again, but this time we
use a spatial prior:

E(I) = \lVert I \otimes k - B \rVert^{2} + \lambda \lVert \nabla I - \nabla I_s \rVert^{2}        (24)

where \nabla I - \nabla I_s is the new prior, which restores the selected sharp edges
properly. Using the former approach results in:

Fig. 5 (a) Blurred input (b) γ = 15, λ = 0.005 (c) γ = 15, λ = 0.05 (d) γ = 15, λ = 0.5
(e) γ = 15, λ = 5 (f) γ = 5, λ = 0.005 (g) γ = 10, λ = 0.005 (h) γ = 20, λ = 0.005 (i) γ = 30,
λ = 0.005

I = F^{-1}\!\left( \frac{ \overline{F(k)}\, F(B) + \lambda \left[ \overline{F(\partial_x)}\, F(I_{s_x}) + \overline{F(\partial_y)}\, F(I_{s_y}) \right] }{ \overline{F(k)}\, F(k) + \lambda \left[ \overline{F(\partial_x)}\, F(\partial_x) + \overline{F(\partial_y)}\, F(\partial_y) \right] } \right)        (25)

Here I is the latent image, and a non-blind deconvolution technique is then needed to
restore the detailed image. Various methods for reaching the final image have been
developed; we used the method of Cho and Lee [1].
The effect of the λ and γ values is illustrated in Fig. 5.
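A corresponding sketch of the latent-image update of Eq. (25) is given below; the prior weight λ and the forward-difference derivative filters are assumptions, and the result would still be passed to the final non-blind deconvolution of Cho and Lee [1], which is not reproduced here.

```python
# Sketch of the latent-image update of Eq. (25) with the gradient prior of Eq. (24).
import numpy as np

def restore_latent(B, k, Isx, Isy, lambda_=0.05):
    shape = B.shape
    F = lambda x: np.fft.fft2(x, s=shape)
    dx = np.array([[1.0, -1.0]])     # assumed horizontal derivative filter
    dy = np.array([[1.0], [-1.0]])   # assumed vertical derivative filter
    Fk, FB = F(k), F(B)
    Fdx, Fdy = F(dx), F(dy)
    num = np.conj(Fk) * FB + lambda_ * (np.conj(Fdx) * F(Isx) + np.conj(Fdy) * F(Isy))
    den = np.conj(Fk) * Fk + lambda_ * (np.conj(Fdx) * Fdx + np.conj(Fdy) * Fdy)
    return np.real(np.fft.ifft2(num / den))   # Eq. (25)
```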

3 Discussion and Results

The parameters in the calculations play an important role in predicting the blur
kernel. For example, if the threshold values for the function r(x) and the final edge
selection are chosen either too large or too small, the image will be over-smoothed
and important edges will not be selected for kernel estimation. In this paper, we
attempted to improve performance by selecting these values automatically.
Figure 6 depicts the effect of these values on the estimated kernel.
We also tried our algorithm on images which contain text, such as Fig. 7.

Fig. 6 (a) Output image and estimated kernel with γ = 15 (b) Output image and estimated kernel
with γ = 5 (c) Output image and estimated kernel with γ = 1

Fig. 7 Deblurring an image with text: (a) blurred input (b) Perona-Malik output (c) deconvolution
output

Our algorithm was implemented in MATLAB R2016a on an AMD A10 6th generation
CPU at 1.8 GHz; the duration of image restoration is reported in Table 1.

Table 1 Calculation speed

Image                      Restoration duration (s)   Iterations
                                                      Perona-Malik   Shock filter   Coarse-to-fine
Vessels (Fig. 1)           22.5                       5              8              7
Foot (Fig. 3)              31.8                       5              8              7
Arm (Fig. 5)               42                         5              8              7
Brain (Fig. 6)             27.4                       5              8              7
Faculty façade (Fig. 7)    38                         5              8              7

4 Summary

Image processing has improved dramatically in the last decades, and the rate of
development has increased with the advent of more advanced machine vision
technologies in daily life. Medical imaging, as one of the pillars of the modern
medical diagnosis system, is not devoid of this technology.
Different imaging methods have different sensitivities to noise, camera movement,
beam source and other factors, and blur degrades the resulting images. For example,
a slight movement in an MRI or x-ray machine results in blurry images. Figure 1 is
used to detect blockage of a vein, which can result in relative blindness; therefore,
these images must be accurate so that the physician can diagnose the disease with
less error. The current method, in contrast to conventional methods, can compute the
blur kernel and help to reduce the cost of re-imaging by restoring the original image.
Proper edges and reduced initial noise in blurry images lead to an accurate estimation
of the blur kernel. According to the results, using nonlinear noise reduction methods
increases accuracy. The Perona-Malik method has basic parameters that are normally
selected by the user; choosing these parameters automatically reduces error and leads
to optimal results.
The next factor in the accuracy of the blur kernel, after noise reduction, is selecting
the appropriate edges as input to the estimator. The shock filters introduced by
Osher and Rudin [13] perform better here than other methods such as Canny. Our
iterative algorithm refines itself at every step and results in a clearer output.
Local deblurring is one of the accurate approaches that leads to clear images. In
addition, creating a fast algorithm for shift-variant blur models is needed in future
work.

Acknowledgements The authors appreciate those who contributed to making this research
successful. This research is supported by the Center for Research and Innovation (PPPI) and the
Faculty of Engineering, Universiti Malaysia Sabah (UMS) under the Research Grant (SBK0393-2018).

References

1. Cho S, Lee S (2009) Fast motion deblurring. ACM Trans Graph (TOG) 28(5):145
2. Fergus R, Singh B, Hertzmann A, Roweis ST, Freeman WT (2006) Removing camera shake
from a single photograph. ACM Trans Graph (TOG) 25(3):787–794
3. Yadav S, Jain C, Chugh A (2016) Evaluation of image deblurring techniques. Int J Comput
Appl 139(12):32–36
4. Shan Q, Jia J, Agarwala A (2008) High-quality motion deblurring from a single image. ACM
Trans Graph (TOG) 27(3)
5. Levin A, Weiss Y, Durand F, Freeman WT (2009) Understanding and evaluating blind
deconvolution algorithms. In: IEEE conference on computer vision and pattern recognition,
pp 1964–1971
6. Money J, Kang S (2008) Total variation minimizing blind deconvolution with shock filter
reference. Image Vis Comput 26(2):302–314
7. Alvarez L, Mazorra L (1994) Signal and image restoration using shock filters and anisotropic
diffusion. SIAM J Numer Anal 31(2):590–605
8. Xu L, Zheng S, Jia J (2013) Unnatural l0 sparse representation for natural image deblurring.
In: Computer vision and pattern recognition, pp 1107–1114
9. Chen T, Huang TS, Yin W, Zhou XS (2005) A new coarse-to-fine framework for 3D brain
MR image registration. In: International workshop on computer vision for biomedical image
applications, pp 114–124. Springer, Heidelberg, October 2005
10. Perona P, Malik J (1987) Scale-space and edge detection using anisotropic diffusion. IEEE
Trans Pattern Anal Mach Intell 12(7):629–639
11. Black MJ, Sapiro G, Marimont DH, Heeger D (1998) Robust anisotropic diffusion. IEEE
Trans Image Process 7(3):421–432
12. Hasanpor H, Nikpour M (2008) Using adaptive diffusion coefficient to eliminate image noise
using partial equations. Iranian J Electr Comput Eng 6(4)
13. Osher S, Rudin LI (1990) Feature-oriented image enhancement using shock filters. SIAM J
Numer Anal 27(4):919–940
14. Weickert J (2003) Coherence-enhancing shock filters. In: Joint pattern recognition
symposium. Springer, Berlin, pp 1–8
15. Xu L, Jia J (2010) Two-phase kernel estimation for robust motion deblurring. In: European
conference on computer vision. Springer, Berlin, pp 157–170
A Fast and Efficient Segmentation
of Soil-Transmitted Helminths Through
Various Color Models and k-Means
Clustering

Norhanis Ayunie Ahmad Khairudin, Aimi Salihah Abdul Nasir,


Lim Chee Chin, Haryati Jaafar, and Zeehaida Mohamed

Abstract Soil-transmitted helminths (STH) are one of the causes of health problems
in children and adults. Given the large number of helminthiases cases that have been
diagnosed, a productive system is required for the identification and classification of
STH to ensure that people's health is safeguarded. This paper presents a fast and
efficient method to segment two types of STH, Ascaris lumbricoides ova (ALO) and
Trichuris trichiura ova (TTO), based on the analysis of various color models.
Firstly, the ALO and TTO images are enhanced using the modified global contrast
stretching (MGCS) technique, followed by the extraction of color components from
various color models. In this study, color components from models such
as RGB, HSV, L*a*b and NTSC have been used to identify, simplify and extract the
particular color needed. Then, k-means clustering is used to segment the color
component images into three cluster regions, namely the target (helminth eggs),
unwanted and background regions. Additional post-processing steps are then applied to
the segmented images to remove the unwanted regions and to restore the information
of the target regions. The proposed techniques have been evaluated on 100
images each of ALO and TTO. The results obtained show that the saturation component
of the HSV color model is the most suitable color component to be used with the
k-means clustering technique on ALO and TTO images, achieving a segmentation
performance of 99.06% accuracy, 99.31% specificity and 95.06% sensitivity.

Keywords Soil-transmitted helminths · Modified global contrast stretching ·
Color models · k-Means clustering

N. A. A. Khairudin (&) · A. S. A. Nasir · H. Jaafar
Faculty of Engineering Technology, Universiti Malaysia Perlis, UniCITI Alam Campus,
Sungai Chuchuh, 02100 Padang Besar, Perlis, Malaysia
e-mail: [email protected]
L. C. Chin
School of Mechatronic Engineering, Universiti Malaysia Perlis, Pauh Putra Campus,
02600 Arau, Perlis, Malaysia
Z. Mohamed
Department of Microbiology and Parasitology, School of Medical Sciences, Health Campus,
Universiti Sains Malaysia, 16150 Kubang Kerian, Kelantan, Malaysia


1 Introduction

Soil-transmitted helminths (STH) are a group of intestinal parasitic worms that
affect humans through contact with larvae or ingestion of infective eggs. Human
infections are common in underprivileged communities where overcrowding, poor
environmental sanitation and lack of access to clean and safe water are prevalent
[1, 2].
The STH eggs most commonly found in the human body are Ascaris lumbricoides
ova (ALO) and Trichuris trichiura ova (TTO). STH inhabit the intestine, liver, lungs
and blood vessels of their hosts, while the adult worms inhabit the intestine to mate
and release their eggs in feces [3], which are then diffused into soil. The eggs are
microscopic, and their sizes vary for each species [4].
Helminth eggs can remain viable for 1 to 2 months in crops and many months in
soil, freshwater and sewage [5]; they can remain viable for several years in feces,
night soil, sludge and wastewater. STH eggs can be transmitted to the human body
through direct contact with polluted sludge or fecal material, exposure to
contaminated food and water, and also from an animal's body or fur [6].
These parasites can multiply in the human body, which could lead to serious
illnesses such as filariasis and cysts. They may also increase susceptibility to
other illnesses such as tuberculosis, malaria and HIV infection. In children, STH
infection may cause malnutrition, education deficits and intellectual retardation
[7, 8]. Studies have shown that such infections have a strong impact on school
performance and attendance and on future economic productivity [9].
In 2016, around 2.5 billion people around the world were affected by
helminthiases, and over 530 million children, representing 63% of the world's total,
were treated [10]. Given the high number of helminthiases cases, identification and
classification of the types of helminth eggs is of paramount importance in the
healthcare industry.
Early diagnosis is fundamental for patient recovery, especially in children.
Helminth eggs can be diagnosed through patients' stool, blood and tissue samples, and
the parasitologist needs to examine these samples in fresh condition within a limited
time. Problems occur because the procedures take a great amount of time and the
observer must maintain good concentration while observing the samples [11]; the
results obtained are often neither accurate nor reliable. These limitations have
motivated the development of digital image processing for helminth egg recognition
using image processing and computer algorithms.
Hadi et al. [12] used the median filter twice to reduce the artifacts and noise in
the image, while edge enhancement based on sharpness and edge detection with a
Canny filter was used to detect the edges of hard, sharp objects. A Threshold
with Logical Classification Method (TLCM) was proposed for the automatic
identification process, using the shape, shell smoothness and size of the eggs as
features in the feature extraction process. The classification accuracy obtained for
the ALO species is 93% while for the TTO species it is 94%.

Suzuki et al. [13] identified 15 types of human intestinal parasites through
a system that automatically segments and classifies the parasites
from microscopy images. The proposed system explores the image foresting transform
and ellipse matching for segmentation and an optimum-path forest classifier for object
recognition. This system obtained 90.38% sensitivity, 98.19% efficiency and
98.32% specificity.
Kamarul et al. [14] proposed a new classification using the Filtration with Steady
Determinations Thresholds System (F-SDTS) classifier. This classifier is applied at
the feature extraction stage by using the ranges of feature values as a database to
identify and classify the type of parasite. The overall success rate of this
classification system is 94%.
Jimenez et al. [11] proposed a system that identifies and quantifies seven species
of helminth eggs in wastewater. Gray-scale profile segmentation is used to identify
the shape and thus to differentiate the genera and species of the helminth eggs. The
system shows a specificity of 99% and a sensitivity of 80% to 90%.
The systems proposed by previous researchers show progress in the
identification and classification of human intestinal parasites. However,
improvement can still be made in the segmentation part in order to achieve efficient
results. One such improvement is to manipulate the color conversion of an image to
differentiate the features of helminths from the artifacts. This suggestion is based
on the outcomes obtained when color conversion is applied to images in other medical
studies such as cancer, cyst, leukemia and malaria [15–20].
Ghimire and Lee [15] used the HSV color model by keeping the H and S
components unvaried and processing only the V component of the HSV color image,
to prevent a change in the color balance among the HSV components. The color of
the enhanced image is not altered because H and S are not changed. The proposed
method obtained a better image compared to other methods such as histogram
equalization and the adaptive and integrated neighborhood dependent approach for
nonlinear enhancement (AINDANE).
Kulkarni et al. [16] applied color conversion after pre-processing in
order to recognize Acute Lymphoblastic Leukemia (ALL) images. The RGB color space
is converted into the HSV color space to reduce the color dimension from three to two.
The saturation (S) plane is selected as it shows better contrast compared to the hue
(H) and value (V) components. Otsu's thresholding method is used for the
segmentation part and is able to segment the ALL cells into two parts: nucleus and cytoplasm.
Poostchi et al. [17] listed RGB, HSV, YCbCr, LAB and intensity as color
features when analyzing feature computation for classifying malaria
parasites in both thin and thick blood smears. Color features are the most natural
choice for stained parasites to acquire information and to describe the morphological
features of red blood cells.
An analysis of the usability of color models in image processing was presented
by Sharma and Nayyer [18]. Color components provide a rational way to specify,
order, manipulate and effectively display the color of the object under consideration.
Thus, the selected color model should be appropriate for the problem statement and
its solution. The process of selecting the best color representation involves knowing
how the color signals are generated and what information is needed from these
signals. Color models are widely used to facilitate the specification of color in a
standard, generally accepted way.
Aris et al. [19] analyzed color components in several color spaces to improve the
counting performance for malaria parasites based on thick blood smear images. The Y,
Cb, R, G, C, M, S and L components were extracted from the YCbCr, RGB,
CMY, HSV and HSL color models in order to identify which color component
gives the most accurate counting of malaria parasites. Based on the results obtained,
the Y component of YCbCr shows the best segmentation result, with 98.48% average
counting accuracy for 100 malaria thick blood smear images.
A new color component exchanging method across different color spaces was
proposed by Dai and Li [20] in order to segment hematocyte images. This method
exchanges the order of the color components after the color components of the
original image are extracted. The newly formed image is then segmented using Otsu
thresholding and region segmentation techniques. The proposed method can
differentiate the segmentation targets of a hematocyte image, namely the nucleus and
cytoplasm of the hematocyte, erythrocytes and leukocytes, from the background.
However, this method is unsuitable for sample images with different staining methods
and magnifications.
Based on the previous studies, it can be seen that color models play a major role
in improving image segmentation performance. Therefore, this study explores
the potential of various color components for the segmentation process in
order to improve the STH segmentation performance.

2 Methodology

Most researchers have focused on segmentation and classification techniques to
achieve the most accurate results. However, the most crucial part lies in
the pre-processing step, since it affects the subsequent processing steps. In this paper,
several color models are applied to the enhanced images in order to identify which
color component is the most suitable for segmenting the ALO and TTO
images. The methodological steps for segmenting these images are explained in
this section.

2.1 Image Acquisition

The STH samples are acquired from helminthiases patients through stool samples.
The ALO and TTO samples were obtained from the Department of Microbiology and
Parasitology, Hospital Universiti Sains Malaysia (HUSM). These stool samples are
freshly prepared on slides and analyzed under 40X magnification using a Leica DLMA
digital microscope. Normal saline is used as the stain to obtain a clear view of the
eggs. In this study, 100 images for each species, ALO and TTO, have been captured
and saved in .jpg format.

2.2 Image Enhancement Technique Using Modified Global Contrast Stretching (MGCS)

The samples obtained may have different luminance, which needs to be standardized.
This problem is caused by the color of the stool sample or by the lighting of the
microscope. In order to standardize the luminance, a contrast enhancement technique
called modified global contrast stretching (MGCS) is used [21]. This technique
standardizes the lighting in the image as well as improving the quality of the target
region.
One of the advantages of the MGCS technique is its ability to enhance the contrast of
the image without affecting the color structure of the original image. Besides, this
technique is able to preserve as much information as possible from the original image.
MGCS is derived from global contrast stretching (GCS); it overcomes the weakness of
GCS by adjusting the minimum and maximum values of the R, G and B components,
which are acquired through a calculation based on the total number of pixels in the
image. The original GCS equation is shown in Eq. (1) [22].
 
out_{RGB}(x, y) = 255 \times \frac{ in_{RGB}(x, y) - \min_{RGB} }{ \max_{RGB} - \min_{RGB} }        (1)

Several parameters are required in order to obtain the new minimum and maximum
values. These include the minimum percentage, minp, the maximum percentage, maxp,
the number of pixels at each pixel level, Tpix, the total number of pixels that lie
within the specified minimum percentage, Tmin, and the total number of pixels that
lie within the specified maximum percentage, Tmax. The procedure to develop the
MGCS technique is as follows [22]:
1. Select the preferred values for minp and maxp.
2. Initialize Tmin = 0 and Tmax = 0. Set the value of k = 0, where k is the current
pixel level.
3. Estimate the histogram for the red component.
4. Find the number of pixels, Tpix[k], at level k. If Tpix[k] ≥ 1, set Tmin = Tmin + Tpix[k].
5. Check the following condition:

\frac{T_{min}}{\text{total number of pixels in image}} \times 100 \ge min_p        (2)

6. If Tmin fulfills Eq. 2, set the new minimum value, Nmin for the red component
in the image to the k value that satisfies this condition; else set k = k + 1.
7. Repeat steps 4 to 6 for the next pixel levels until Nmin is obtained based on the
k value that satisfies Eq. 2.
8. Set the value of k = 255.
9. Find Tpix[k] at level k. If Tpix[k] ≥ 1, set Tmax = Tmax + Tpix[k].
10. Check the following condition:

\frac{T_{max}}{\text{total number of pixels in image}} \times 100 \ge max_p        (3)

11. If Tmax satisfies Eq. 3, set the new maximum value, Nmax, for the red component
in the image to the k value that satisfies this condition; else set k = k − 1.
12. Repeat steps 9 to 11 for the next pixel levels until Nmax is obtained based on
the k value that satisfies Eq. 3.
13. Repeat steps 2 to 12 in order to calculate Nmin and Nmax for the green and
blue components.
14. Nmin and Nmax are then used to replace the original min and max in the GCS
formula in Eq. (1).
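A compact sketch of this procedure is given below: for each of the R, G and B channels, the new minimum and maximum levels are found from the histogram tails defined by min_p and max_p (Eqs. (2)–(3)), and the stretched output then follows Eq. (1). The percentage values are placeholders and Python/NumPy is used purely for illustration.

```python
# Sketch of MGCS (Eqs. (1)-(3)); min_p/max_p values are assumptions.
import numpy as np

def mgcs(rgb_uint8, min_p=1.0, max_p=1.0):
    out = np.zeros_like(rgb_uint8, dtype=float)
    total = rgb_uint8.shape[0] * rgb_uint8.shape[1]
    for c in range(3):
        hist, _ = np.histogram(rgb_uint8[..., c], bins=256, range=(0, 256))
        # steps 2-7: walk up from level 0 until Eq. (2) is satisfied
        n_min, n_min_level = 0, 0
        for k in range(256):
            n_min += hist[k]
            if n_min / total * 100.0 >= min_p:
                n_min_level = k
                break
        # steps 8-12: walk down from level 255 until Eq. (3) is satisfied
        n_max, n_max_level = 0, 255
        for k in range(255, -1, -1):
            n_max += hist[k]
            if n_max / total * 100.0 >= max_p:
                n_max_level = k
                break
        # step 14 with Eq. (1): stretch using the new per-channel min/max
        span = max(n_max_level - n_min_level, 1)
        out[..., c] = 255.0 * (rgb_uint8[..., c].astype(float) - n_min_level) / span
    return np.clip(out, 0, 255).astype(np.uint8)
```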

2.3 Color Conversion of STH Image Using Various Color Models

Color conversion identifies the colors present in an image. A color model is generally
defined by a 3D coordinate system and a subspace in which each color is represented
by a single point [22]. In image processing, a color model is used to identify,
simplify, extract and edit the particular color needed.
Various color models such as RGB (Red, Green, Blue), HSV (Hue, Saturation,
Value) and L*a*b are used in many applications such as cell detection, lane

(a) Enhanced image (b) R component (c) G component (d) B component

Fig. 1 Results of R, G and B components on STH image



detection, face detection and many more. Sharma et al. [23] stated that a color space
provides a rational way to effectively specify and display the color of objects.
RGB Color Model. The RGB color model is based on the theory that all visible
colors can be created using the primary colors red, green and blue [22]. This color
model is commonly used to recognize, represent and display images in electronic
systems such as televisions, computers and photography. Figure 1 shows the results
of the RGB color model on an STH image; the R, G and B components are all
suitable to be used on STH images.
HSV Color Model. HSV is built from the hue, saturation and value characteristics.
The HSV model is illustrated as a hex-cone and its coordinate system is cylindrical.
H describes the hue, or true color, in the image, while S represents the amount of
white in the image [24]; the higher the amount of white, the lower the image
saturation. V shows the degree of brightness, describing the value or luminance of
the image. The top of the HSV hex-cone is a projection along the main diagonal of
the RGB color cube [25]. Figure 2 shows the hex-cone shape of HSV.
Hue is determined by the one or two largest components, and the range of H is from
0° to 360°. S can be controlled by varying the collective minimum value of R, G and
B, whereas V is controlled by varying the magnitudes while keeping a constant ratio
[23, 25].

H = \begin{cases} H_1, & \text{if } B \le G \\ 360^{\circ} - H_1, & \text{if } B > G \end{cases}        (4)

S = \frac{ \max(R, G, B) - \min(R, G, B) }{ \max(R, G, B) }        (5)

V = \frac{ \max(R, G, B) }{ 255 }        (6)

The advantage of HSV is its simple concept, in which each element attribute
corresponds directly to the basic color model. The disadvantage is that the saturation
attribute corresponds to the mixture of a color with white (tinting), so color
desaturation increases the intensity [26]. In this paper, the S and V components are
applied to the STH images, as the H component is unsuitable to be
Fig. 2 Hex-cone shape of the HSV color space

(a) Enhanced image (b) H component (c) S component (d) V component

Fig. 3 Results of H, S and V components on STH image

(a) Enhanced image (b) Y component (c) I component (d) Q component

Fig. 4 Results of Y, I and Q components on STH image

used on the STH image because the H component shows low contrast between the
foreground and background, as can be seen in Fig. 3(b).
CIE 1976 L*a*b* Color Model. This color conversion is derived from CIE XYZ
and is used to linearize the perceptibility of color differences. The Lab color space
is designed to approximate human vision, with the L component closely matched to
the human perception of lightness [27]. L* stands for luminosity, a* is the red-green
axis and b* is the blue-yellow axis. CIE Lab is popular for measuring reflective and
transmissive objects [25, 27].
NTSC Color Model. The National Television System Committee (NTSC) uses YIQ as
its color space, in which the Y component represents the luma information while I and
Q represent the chrominance information for the television receiver. Luminance can be
obtained from a linear combination of the three primaries. Equation (7) shows the
formula for the conversion from the RGB color space to the YIQ color space, while
Eq. (8) shows the formula determined by the colorimetry of the display system [28].

\begin{bmatrix} Y \\ I \\ Q \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ 0.5959 & -0.2746 & -0.3213 \\ 0.2115 & -0.5227 & 0.3112 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}        (7)

Y = 0.299R + 0.587G + 0.114B        (8)

In this study, only the Y and I components are applied to the enhanced STH images.
This is because Y and I are able to differentiate the foreground and background in the
image, whereas the foreground and background appear in the same color in the Q
component. Figure 4 shows the results obtained from the NTSC color model based
on the Y, I and Q components.
Arithmetic Between Color Models. The components of the color models are also
combined through addition and subtraction arithmetic to increase the chance that the
enhanced image can be segmented accurately. Among the arithmetic combinations of
color model components, two formulas show a good improvement in differentiating
the regions in the enhanced STH image. The first formula is based on the addition of
the G component of the RGB color model to the Lab color model (GLab); the second
is the subtraction of the G component of the RGB color model from the S component
of the HSV color model (SG):

GLab = G + Lab        (9)

SG = S - G        (10)
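The sketch below gathers the component extractions used later in this study: the S component of HSV (Eq. (5)), the Y and I components of YIQ (Eqs. (7)–(8)), and the GLab and SG arithmetic components of Eqs. (9)–(10). Because Eq. (9) adds a single-channel value to the three-channel Lab model, the use of the L* channel (rescaled to [0, 1]) is our assumption, as is the [0, 1] scaling of the other components.

```python
# Sketch of colour-component extraction; scaling choices are assumptions.
import numpy as np
from skimage import color

def extract_components(rgb_uint8):
    rgb = rgb_uint8.astype(float) / 255.0
    R, G, B = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    mx, mn = rgb.max(axis=-1), rgb.min(axis=-1)
    S = np.where(mx > 0, (mx - mn) / mx, 0.0)         # Eq. (5)
    Y = 0.299 * R + 0.587 * G + 0.114 * B             # Eq. (8)
    I = 0.5959 * R - 0.2746 * G - 0.3213 * B          # Eq. (7), I row
    L = color.rgb2lab(rgb)[..., 0] / 100.0            # L* of CIE L*a*b*, assumed for "Lab"
    GLab = G + L                                      # Eq. (9), as we read it
    SG = S - G                                        # Eq. (10)
    return {"S": S, "Y": Y, "I": I, "GLab": GLab, "SG": SG}
```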

2.4 Image Segmentation of STH Image Using k-Means Clustering

The main purpose of segmenting the STH image is to separate the regions in the
image by dividing it into the region of interest and the background region. The
segmentation process is important because it serves as a basic step for all
subsequent analyses.
In this paper, k-means clustering is used in order to identify which color
component gives the best STH segmentation result. The k-means clustering algorithm
is based on the concept of assigning data to their respective centers by the shortest
Euclidean distance. It is one of the most popular unsupervised clustering methods due
to its simplicity [20]. The k-means clustering is constructed by minimizing the
objective function J in Eq. (11):

J = \sum_{i=1}^{n} \sum_{j=1}^{k} \left\lVert x_i - c_j \right\rVert        (11)

where n is the number of data points, k is the number of clusters, x_i is the ith
sample and c_j is the jth cluster center. In this paper, three clusters are used for
the segmentation process in order to differentiate between the target, unwanted and
background regions.
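As an illustration, the sketch below clusters a single colour-component image into three regions with k-means. How the target cluster is identified afterwards is not specified here, so selecting it by the lowest mean component value is only an assumption.

```python
# Sketch of three-cluster k-means segmentation of one colour component (Eq. (11)).
import numpy as np
from sklearn.cluster import KMeans

def segment_kmeans(component, n_clusters=3):
    x = component.reshape(-1, 1).astype(float)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(x)
    labels = labels.reshape(component.shape)
    # assumption: the cluster with the lowest mean component value is the egg region
    means = [component[labels == c].mean() for c in range(n_clusters)]
    target = int(np.argmin(means))
    return labels, (labels == target)
```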

2.5 Post-processing Steps After Segmentation Process

After the segmented images have been obtained from k-means clustering, the
unwanted pixels and regions are removed by using an object remover technique on the
binary image. This technique removes objects smaller than 17,000 pixels and
larger than 70,000 pixels in order to achieve an accurate diagnosis for STH.
However, there is a high tendency for pixels inside the target region to disappear.
A fill-holes operation is therefore selected to overcome this side effect of the object
remover method by filling areas of dark pixels that are surrounded by lighter pixels.
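A sketch of these two post-processing steps, assuming SciPy's connected-component labelling and binary hole filling, is given below; the exact object-remover implementation used by the authors is not specified.

```python
# Sketch of the post-processing: size-based object removal then hole filling.
import numpy as np
from scipy import ndimage

def postprocess(mask, min_area=17000, max_area=70000):
    labelled, n = ndimage.label(mask)
    sizes = ndimage.sum(mask, labelled, index=np.arange(1, n + 1))
    keep = np.zeros_like(mask, dtype=bool)
    for lab, size in enumerate(sizes, start=1):
        if min_area <= size <= max_area:     # keep objects within the stated pixel range
            keep |= labelled == lab
    return ndimage.binary_fill_holes(keep)   # fill dark holes inside kept objects
```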

2.6 Segmentation Performance

The segmentation performance aims to quantify how successful the segmentation is.
In this paper, the segmentation performance is used to compare the segmentation
results when the different color components are applied with the k-means clustering
technique. The segmentation performance comprises three measures: accuracy,
specificity and sensitivity. These measures are calculated by comparing the pixels of
the resulting segmented image with a manually segmented image. The calculations for
accuracy, specificity and sensitivity are defined in Eqs. (12), (13) and (14),
respectively.

\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100        (12)

\text{Specificity} = \frac{TN}{TN + FP} \times 100        (13)

\text{Sensitivity} = \frac{TP}{TP + FN} \times 100        (14)

Accuracy is the ratio of correctly classified pixels to the entire area of the STH
image, while sensitivity is a true positive measure that refers to the proportion of
helminth egg pixels that have been classified correctly. Specificity is the percentage
of pixels that are correctly segmented as the negative region [29].
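The pixel-wise computation of Eqs. (12)–(14) against a manually segmented ground-truth mask can be sketched as follows (illustrative Python; the preparation of the ground truth itself is not shown).

```python
# Sketch of the evaluation measures of Eqs. (12)-(14).
import numpy as np

def segmentation_scores(pred, truth):
    pred, truth = pred.astype(bool), truth.astype(bool)
    TP = np.sum(pred & truth)
    TN = np.sum(~pred & ~truth)
    FP = np.sum(pred & ~truth)
    FN = np.sum(~pred & truth)
    accuracy = (TP + TN) / (TP + TN + FP + FN) * 100   # Eq. (12)
    specificity = TN / (TN + FP) * 100                 # Eq. (13)
    sensitivity = TP / (TP + FN) * 100                 # Eq. (14)
    return accuracy, specificity, sensitivity
```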

3 Results and Discussion

In this study, the MGCS technique has been applied to 100 ALO images and 100 TTO
images. From the enhancement results obtained, nine color components have been
extracted from the enhanced images. The resulting color component images have been

(a) ALO_1 (b) ALO_2 (c) TTO_1 (d) TTO_2

Fig. 5 Original ALO and TTO images

(a) MGCS ALO_1 (b) R ALO_1 (c) k-Means ALO_1 (d) PPS ALO_1

(e) MGCS ALO_2 (f) R ALO_2 (g) k-Means ALO_2 (h) PPS ALO_2

(i) MGCS TTO_1 (j) R TTO_1 (k) k-Means TTO_1 (l) PPS TTO_1

(m) MGCS TTO_2 (n) R TTO_2 (o) k-Means TTO_2 (p) PPS TTO_2

Fig. 6 Results of R component and k-means clustering on enhanced ALO and TTO images

(a) MGCS ALO_1 (b) G ALO_1 (c) k-Means ALO_1 (d) PPS ALO_1

(e) MGCS ALO_2 (f) G ALO_2 (g) k-Means ALO_2 (h) PPS ALO_2

(i) MGCS TTO_1 (j) G TTO_1 (k) k-Means TTO_1 (l) PPS TTO_1

(m) MGCS TTO_2 (n) G TTO_2 (o) k-Means TTO_2 (p) PPS TTO_2

Fig. 7 Results of G component and k-means clustering on enhanced ALO and TTO images

used as the input images for k-means clustering in order to pinpoint the most suitable
color component for the segmentation stage. Then, the segmented images have been
assessed through qualitative and quantitative evaluations.
Figure 5 shows samples of the original ALO and TTO images. The lighting differs
from image to image; the ALO_1 and TTO_2 images are darker than ALO_2 and
TTO_1. The artifacts also come in different colors and sizes in each image. These
differences increase the difficulty of the segmentation process.

(a) MGCS ALO_1 (b) B ALO_1 (c) k-Means ALO_1 (d) PPS ALO_1

(e) MGCS ALO_2 (f) B ALO_2 (g) k-Means ALO_2 (h) PPS ALO_2

(i) MGCS TTO_1 (j) B TTO_1 (k) k-Means TTO_1 (l) PPS TTO_1

(m) MGCS TTO_2 (n) B TTO_2 (o) k-Means TTO_2 (p) PPS TTO_2

Fig. 8 Results of B component and k-means clustering on enhanced ALO and TTO images

However, the MGCS technique eases this problem by enhancing and correcting the
lighting in the images. Figures 6 through 15 show the resulting images when the
proposed color components and k-means clustering are applied to the MGCS images of
ALO and TTO (Figs. 7, 8, 12).
From the resultant images, it can be said that each of the color components has its
advantages and disadvantages when applied to the MGCS images. The results obtained
from the color components are crucial for the k-means clustering and post-processing
stages. Based on observation of the enhanced images, the MGCS technique enhances
the original images into better-quality images; the target regions stand out and can be
distinguished from the artifacts, while the lighting of each image is balanced.

(a) MGCS ALO_1 (b) S ALO_1 (c) k-Means ALO_1 (d) PPS ALO_1

(e) MGCS ALO_2 (f) S ALO_2 (g) k-Means ALO_2 (h) PPS ALO_2

(i) MGCS TTO_1 (j) S TTO_1 (k) k-Means TTO_1 (l) PPS TTO_1

(m) MGCS TTO_2 (n) S TTO_2 (o) k-Means TTO_2 (p) PPS TTO_2

Fig. 9 Results of S component and k-means clustering on enhanced ALO and TTO images

The results obtained show that the R, V, Lab and GLab components are unsuitable
for STH segmentation. The information of the target regions is greatly affected when
the images go through the post-processing procedure, because most of the information
lost from the target regions cannot be restored. Figures 6, 10, 11 and 14 show the
resultant images that have lost their information and cannot be restored, which mostly
come from the TTO images.

(a) MGCS ALO_1 (b) V ALO_1 (c) k-Means ALO_1 (d) PPS ALO_1

(e) MGCS ALO_2 (f) V ALO_2 (g) k-Means ALO_2 (h) PPS ALO_2

(i) MGCS TTO_1 (j) V TTO_1 (k) k-Means TTO_1 (l) PPS TTO_1

(m) MGCS TTO_2 (n) V TTO_2 (o) k-Means TTO_2 (p) PPS TTO_2

Fig. 10 Results of V component and k-means clustering on enhanced ALO and TTO images

The images are successfully segmented when the G, B, Lab and Y components are
applied to the enhanced images in combination with the k-means clustering technique.
However, the final results show that artifacts are still present in the images even
though the target regions are successfully segmented. These artifacts are difficult to
remove because their sizes are within the range of the target region size, which
increases the possibility of misleading analysis in the segmentation performance.

(a) MGCS ALO_1 (b) Lab ALO_1 (c) k-Means ALO_1 (d) PPS ALO_1

(e) MGCS ALO_2 (f) Lab ALO_2 (g) k-Means ALO_2 (h) PPS ALO_2

(i) MGCS TTO_1 (j) Lab TTO_1 (k) k-Means TTO_1 (l) PPS TTO_1

(m) MGCS TTO_2 (n) Lab TTO_2 (o) k-Means TTO_2 (p) PPS TTO_2

Fig. 11 Results of Lab color model and k-means clustering on enhanced ALO and TTO images

The S, I and SG components show better resultant images when applied to the MGCS
images compared with the other components. Artifacts are present but in minimal
amounts. Figure 9 shows the resulting images for the S component; the target regions
are successfully segmented with only a small portion of artifacts present, because
those artifacts fall in the same cluster as the target regions. The I component in
Fig. 13 shows good segmentation results, but the target regions

(a) MGCS ALO_1 (b) Y ALO_1 (c) k-Means ALO_1 (d) PPS ALO_1

(e) MGCS ALO_2 (f) Y ALO_2 (g) k-Means ALO_2 (h) PPS ALO_2

(i) MGCS TTO_1 (j) Y TTO_1 (k) k-Means TTO_1 (l) PPS TTO_1

(m) MGCS TTO_2 (n) Y TTO_2 (o) k-Means TTO_2 (p) PPS TTO_2

Fig. 12 Results of Y component and k-means clustering on enhanced ALO and TTO images

produced in the final images are smaller than in the original images. The results for
the SG component in Fig. 15 show that some information is missing, although the
target regions are successfully segmented with a smaller amount of artifacts.
Table 1 shows the average performance of each proposed color component over the
full set of ALO and TTO images. From the results obtained, the highest accuracy is
99.06%, obtained by the S and SG color components. For specificity,

(a) MGCS ALO_1 (b) I ALO_1 (c) k-Means ALO_1 (d) PPS ALO_1

(e) MGCS ALO_2 (f) I ALO_2 (g) k-Means ALO_2 (h) PPS ALO_2

(i) MGCS TTO_1 (j) I TTO_1 (k) k-Means TTO_1 (l) PPS TTO_1

(m) MGCS TTO_2 (n) I TTO_2 (o) k-Means TTO_2 (p) PPS TTO_2

Fig. 13 Results of I component and k-means clustering on enhanced ALO and TTO images

Table 1 Results of segmentation performances based on different color components and k-means clustering

Color component   Accuracy   Specificity   Sensitivity
R                 96.76%     98.06%        67.81%
G                 98.24%     98.29%        97.33%
B                 98.53%     98.64%        96.54%
S                 99.06%     99.31%        95.06%
V                 96.97%     99.54%        91.46%
Lab               98.02%     98.35%        89.97%
Y                 98.01%     98.12%        95.19%
I                 97.40%     99.96%        56.24%
GLab              96.50%     99.41%        40.83%
SG                99.06%     99.54%        91.46%

(a) MGCS ALO_1 (b) GLab ALO_1 (c) k-Means ALO_1 (d) PPS ALO_1

(e) MGCS ALO_2 (f) GLab ALO_2 (g) k-Means ALO_2 (h) PPS ALO_2

(i) MGCS TTO_1 (j) GLab TTO_1 (k) k-Means TTO_1 (l) PPS TTO_1

(m) MGCS TTO_2 (n) GLab TTO_2 (o) k-Means TTO_2 (p) PPS TTO_2

Fig. 14 Results of GLab arithmetic component and k-means clustering on enhanced ALO and
TTO images

the highest result is 99.96%, obtained by the I component, while the highest result for
sensitivity is 97.33%, obtained by the G component. Comparing the overall
performance, the S component achieved the best segmentation performance when
applied with k-means clustering, with an accuracy of 99.06%, specificity of 99.31%
and sensitivity of 95.06%.

Fig. 15 Results of SG component and k-means clustering on enhanced ALO and TTO images (panels a–p: MGCS image, SG component, k-means result and PPS for each of ALO_1, ALO_2, TTO_1 and TTO_2)

4 Conclusions

In this paper, the results of applying the proposed color models with k-means
clustering have been presented. Color components from various color models are
used for k-means clustering segmentation to ease the identification of the target
image and achieve good segmentation results. A good segmentation result helps to
achieve more accurate results for the classification and diagnosis of STH. The
S component from the HSV color model has proven to be the best at obtaining a
good segmentation of ALO and TTO images, with an accuracy of 99.06%,
specificity of 99.31% and sensitivity of 95.06%. These results can be used as a
reference for the morphology of ALO and TTO in future work such as the
classification and identification processes.

Acknowledgements The authors would like to acknowledge the support from the Fundamental
Research Grant Scheme for Research Acculturation of Early Career Researchers (FRGS-RACER)
under grant number RACER/1/2019/ICT02/UNIMAP//2 from the Ministry of Higher Education
Malaysia. The authors gratefully acknowledge the team members and thank Hospital Universiti
Sains Malaysia (HUSM) for providing the helminth egg samples.

Machine Learning Calibration for Near
Infrared Spectroscopy Data: A Visual
Programming Approach

Mahmud Iwan Solihin, Zheng Zekui, Chun Kit Ang, Fahri Heltha,
and Mohamed Rizon

Abstract Spectroscopy, including near-infrared spectroscopy (NIRS), is a
non-destructive and rapid technique applied increasingly for food quality evaluation,
medical diagnosis, manufacturing, etc. The qualitative or quantitative information
from NIRS is only obtained after a spectra data calibration process based on
mathematical knowledge in chemometrics and statistics. This process naturally
involves multivariate statistical analysis. Machine learning, as a subset of AI
(artificial intelligence), in addition to conventional multivariate statistical tools,
seems to be gaining popularity for chemometric calibration of NIRS data nowadays.
However, the software/toolboxes in chemometrics are often commercial versions
which are not free, while the free versions require programming skills to deal with
applications of machine learning in spectra data calibration. Therefore, this paper
introduces a different approach to spectra data calibration based on visual
programming using Orange data mining, free software which is still rarely used by
the research community in spectroscopy. Two datasets are used: pesticide sprayed
on cabbage (to classify between pure cabbage and cabbage sprayed with different
levels of pesticide solution) and mango sweetness assessment (to predict soluble
sugar content in mango based on the Brix degree value). These two datasets
represent classification and regression, respectively. This approach is intended
mainly for researchers who want to apply machine learning calibration to their
spectroscopy data but do not want to undertake rigorous programming work,
i.e. non-programmers.

M. I. Solihin · C. K. Ang · F. Heltha
Mechatronics Engineering, Faculty of Engineering, UCSI University, Kuala Lumpur, Malaysia
e-mail: [email protected]
M. Rizon
Electrical and Electronics Engineering, Faculty of Engineering, UCSI University, Kuala Lumpur, Malaysia
Z. Zekui
TUM (Technical University of Munich) Asia, Singapore, Singapore



Keywords Machine learning calibration · Near infrared spectroscopy · Orange free software · Handheld near infrared spectrometer

1 Introduction

Machine learning, including deep learning, has become a highly discussed topic
recently in the digital data world. It has tremendous potential to solve complex
human problems, so many fields of application demand the implementation of
machine learning and artificial intelligence broadly to solve their respective
problems [1–3]. Spectroscopy data applications are no exception. Spectroscopy is
the study of the interaction between matter and electromagnetic radiation, which
originated in the study of visible light dispersed according to its wavelength by a prism.
In particular, near-infrared spectroscopy (NIRS) is a non-destructive and rapid
technique that has been applied increasingly in recent years for food quality
evaluation, medical diagnosis, manufacturing, etc. [4–15]. It can provide
quantitative (substance concentration determination) and qualitative (raw material
identification, product adulteration identification) information about samples for
in situ analysis and online applications [4, 5]. For example, it can provide moisture,
protein, fat and starch content information. In each industry, NIR applications vary
and are tailored to suit different companies and their products and needs [16–18].
In spectroscopy, absorption spectra of chemical species (atoms, molecules, or ions)
are generated when a beam of electromagnetic energy (i.e. light) is passed through a
sample and the chemical species absorb a portion of the photons passing through it.
The Beer-Lambert law states that the absorptive capacity of a dissolved substance is
directly proportional to its concentration in a solution. The relationship can be
expressed as shown in Eq. (1) [19].
 
A = log10(Io/I) = εlc                                                    (1)

where:
A = absorbance
ε = the molar extinction coefficient
l = length of the path the light must travel in the solution, in centimeters
c = concentration of the given solution
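As a quick numerical illustration of Eq. (1), the short Python sketch below computes the absorbance from incident and transmitted intensities and then recovers the concentration. The numbers are made up for the example and are not taken from the paper.

```python
import math

def absorbance(incident_intensity: float, transmitted_intensity: float) -> float:
    """Absorbance A = log10(Io / I) from the Beer-Lambert law."""
    return math.log10(incident_intensity / transmitted_intensity)

# Hypothetical values: Io and I in arbitrary detector units,
# molar extinction coefficient in L/(mol*cm), path length in cm.
A = absorbance(100.0, 25.0)                    # A = log10(4) ≈ 0.602
epsilon, path_length = 1500.0, 1.0
concentration = A / (epsilon * path_length)    # c = A / (epsilon * l), in mol/L
print(f"A = {A:.3f}, c = {concentration:.6f} mol/L")
```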
The qualitative or quantitative information from NIRS is only obtained after a
spectra data calibration process using chemometrics, and this process naturally
involves multivariate statistical analysis. Machine learning, as a subset of AI
(artificial intelligence), in addition to conventional multivariate statistical tools,
seems to be gaining popularity for chemometric calibration of NIRS data nowadays
due to its well-known capability to perform complex classification and regression
tasks [20–22]. This emergence may be encapsulated in a subject called intelligent

chemometrics. Among the popular machine learning methods in this regard are
support vector machines (SVM) and artificial neural networks (ANN). Research in
this area, including literature reviews, can be found in [23–28].
There are many software programming tools for chemometric purposes that can
accommodate machine learning, such as Unscrambler, MATLAB, the R language,
WEKA, SIMCA and Python. However, these software packages/toolboxes are often
commercial versions which are not free; free software implementations of the
respective applications are attractive because of cost [29]. For the free versions,
programming skills are required to deal with applications of machine learning in
spectra data calibration.
Therefore, this paper introduces a different approach to spectra data calibration
based on visual programming using the Orange free software developed by Biolab
[30], which is still rarely used by the research community in spectroscopy. This
approach is intended mainly for researchers who want to apply machine learning
calibration to their spectroscopy data but do not want to undertake rigorous
programming work, i.e. non-programmers.
This paper demonstrates the results of machine learning calibration for some
NIRS data in classification and regression modes. The NIRS data are obtained using
a micro handheld spectrometer, a new type of NIR spectroscopy instrument. Two
datasets are used: pesticide sprayed on cabbage (to classify between pure cabbage
and cabbage sprayed with different levels of pesticide solution) and mango
sweetness assessment (to predict soluble sugar content in mango based on the Brix
degree value). These two datasets represent classification and regression, respectively.

2 Instrument and Software

A spectrometer is the instrument used to collect spectra data of objects/samples by
directing an infrared light source at them. The spectra data obtained are unique for
each sample, indicating the uniqueness of its chemical composition. Therefore, an
NIR spectrometer in particular can be used as a means of studying a material's
fingerprint. The spectra can be plotted in units of nm (wavelength) or cm−1
(wavenumber) on the x-axis versus the intensity or absorbance (arbitrary units) on
the y-axis. Figure 1 shows an example of spectra data obtained from a spectrometer.
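For readers who want to reproduce this kind of plot outside Orange, a minimal matplotlib sketch is given below. The CSV layout (first column wavelength, remaining columns one spectrum per sample) is an assumption for illustration, not the paper's actual file format.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumed layout: column 0 = wavelength (nm), columns 1..N = absorbance per sample.
data = np.loadtxt("spectra.csv", delimiter=",", skiprows=1)
wavelength, spectra = data[:, 0], data[:, 1:]

plt.plot(wavelength, spectra)          # one curve per sample
plt.xlabel("Wavelength (nm)")
plt.ylabel("Absorbance (a.u.)")
plt.title("NIR spectra of all samples")
plt.show()
```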
The NIR spectrometer used in this study is a handheld (palm-sized) type with a
wavelength range in the NIR region from 900 to 1700 nm. The opto-electrical
board of this spectrometer is developed by Texas Instruments. Figure 2 shows the
handheld micro spectrometer used in this study. The device is connected via a USB
port so that the user can acquire the spectra of the samples on a personal computer
using GUI software. How the data were collected is explained in the next section
for the respective case studies.
For the multivariate spectra data calibration, the Orange data mining software is
used [30]. This software can be downloaded freely as it is open source. It features a
visual programming front-end for explorative data analysis and interactive data

Fig. 1 An example of spectra data obtained from spectrometer reading on many samples

Fig. 2 The handheld palm-sized NIR spectrometer

visualization, and can also be used as a Python library. Visual programming in
Orange is performed as a workflow. Orange workflow components are called
widgets, and they range from simple data visualization, subset selection and
pre-processing to empirical evaluation of learning algorithms and predictive
modelling. Workflows are created by linking predefined or user-designed widgets,
while advanced users can use Orange as a Python library for data manipulation and
widget alteration [31]. Figure 3 shows a typical Orange workflow example.
The widgets for spectroscopy can be downloaded through the Add-ons option,
which also includes some other applications such as Image Analysis, Time-Series,
Geo, etc. The widgets contained in the Spectroscopy add-on are shown in Fig. 4.
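Since Orange can also be driven from Python, a minimal scripting example is sketched below for completeness. It assumes Orange 3 is installed and that the spectra are stored in a CSV file with a class column (the file name is a placeholder); the exact API details may differ slightly between Orange versions.

```python
import Orange

# Load spectra; Orange infers feature and class columns from the file header.
data = Orange.data.Table("cabbage_spectra.csv")   # placeholder file name

# Train an SVM classifier (a learner applied to data returns a fitted model).
learner = Orange.classification.SVMLearner()
model = learner(data)

# Predict class indices for the same samples (illustration only, no proper test split).
predictions = model(data)
print(predictions[:10])
```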

Fig. 3 An example of workflow visual programming in Orange

Fig. 4 The Orange software widgets available in the Spectroscopy add-on

3 Case Studies

In this section, two case studies of NIR spectra data calibration are presented. One
case represents a classification problem (qualitative analysis) using machine
learning and the other represents a regression problem (quantitative analysis). The
first case, for qualitative analysis, is an experiment on pesticide solution sprayed on
cabbage samples. The second case, for quantitative analysis, is mango sweetness
assessment based on sugar content (Brix value).

3.1 Pesticide Solution Sprayed on Cabbage

This experiment is motivated by the effort to develop a rapid non-destructive
approach to detect pesticide residue on agricultural crops. It is carried out as initial
research to scrutinize whether NIRS is a suitable tool for pesticide residue detection.
Monitoring of pesticides in fruit and vegetable samples has increased in recent
years, since most countries have established maximum residue levels (MRL) for
pesticides in food products [32, 33].
Figure 5 shows the cabbage sample and the pesticide solution used, i.e.
potassium oleate solution (285 g/1000 mL). The experimental procedure can be
summarized as follows:
1. The instrument is set up.
2. A high concentration solution (28.5%, original ratio) of pesticide is blended.
3. The pesticide solution is sprayed on the cabbage.
4. The cabbage sample is scanned six times to verify the result.
5. The spectrum is saved as a .csv file.

Fig. 5 Cabbage and the pesticide solution



Fig. 6 The Orange workflow in the experiment for the classification task

6. Repeat steps 3 to 5, 50 times, for different cabbage leaves.
7. Repeat steps 2 to 6 for 5% pesticide, 1% pesticide and water.
8. Repeat steps 4 to 5, 30 times, for different cabbage leaves.

In total, NIR spectra of 230 samples are collected: 30 samples of pure cabbage
leaves and 50 samples each of cabbage sprayed with 28.5% (original product ratio)
pesticide solution, 5% pesticide solution, 1% pesticide solution and water-only
solution. This means the machine learning model performs classification based on
the recorded NIR spectra to produce a five-class outcome. From these 230 samples,
180 are randomly selected for training and the remaining 50 for testing.
Figure 6 shows the Orange workflow for this experiment, where three classifiers
are used, namely ANN, SVM and KNN (k-nearest neighbour). The classification
results are readily available from the Confusion Matrix widget and the Test & Score
widget, as shown in Figs. 7 and 8. Figure 7 shows the confusion matrix of the
classification performed by SVM. To see the results of the other classifiers (ANN
and KNN), a selection can be made on the left side of the widget. Note that other
classifiers can also be used, such as Random Forest, Naïve Bayes, Decision Tree, etc.
Furthermore, the Test & Score widget can be used to check the classification
accuracy. As can be seen in Fig. 8, the results are mostly expressed in data science
terminology such as AUC (area under the curve), CA (classification accuracy),
precision and recall. The highest CA on the test set is achieved by SVM, followed
by KNN and ANN, at 92, 86 and 72% respectively. These results could be
fine-tuned by changing parameters, and the performance might then differ.
However, the focus of this study is on the use of the software rather than on the
performance of the machine learning algorithms. In addition, other algorithms can
also be used and analysed easily.

Fig. 7 Confusion matrix of classification performed by SVM

Fig. 8 Screenshot of Test & Score widget that shows classification results
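For readers who prefer scripting, the same train/test exercise can be reproduced outside the Orange GUI. The sketch below uses scikit-learn as a stand-in (the paper itself uses Orange widgets); the file name, column layout and classifier settings are assumptions for illustration only.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Assumed layout: one row per sample, spectral intensities in all columns but the last,
# class label (pure / 28.5% / 5% / 1% / water) in the last column.
data = np.genfromtxt("cabbage_spectra.csv", delimiter=",", skip_header=1, dtype=str)
X = data[:, :-1].astype(float)
y = data[:, -1]

# 180 training and 50 testing samples, drawn at random as in the paper.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=180, test_size=50, random_state=0, stratify=y)

classifiers = {
    "SVM": SVC(kernel="rbf", C=10.0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "ANN": MLPClassifier(hidden_layer_sizes=(50,), max_iter=2000, random_state=0),
}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    ca = accuracy_score(y_test, clf.predict(X_test))
    print(f"{name}: CA = {ca:.2f}")
```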

3.2 Brix Value Prediction on Mango

The second case study is a regression problem, part of research on non-destructive
fruit quality assessment using NIR spectroscopy. For this project, three different
types of mango were selected, namely Chokonan, Rainbow and Kai Te. A total of
60 samples was prepared to be scanned by the spectrometer.
The samples were scanned in reflectance mode to record the absorbance spectra,
with each spectrum measured for 3 s. Some samples were scanned twice in
different environments, and some were scanned only once. A total of 80 spectra
were collected from the 60 samples. The training and testing datasets consist of 60
and 20 spectra respectively.
In assessing the fruit maturity of mango, and as a guide to final food quality,
short-wave near-infrared spectroscopy (900–1700 nm) has been investigated. To
obtain a predictive model from spectroscopy data, reference data need to be
collected so that they can be used to calibrate and validate the accuracy of the
prediction model. A refractometer is a device used to measure the refractive index
of plant juices in order to determine the mineral/sugar ratio of the plant cell
protoplasm; it measures in units called Brix. NIRS is used to predict the Brix values
of the mango fruit.
The MA871 is an optical refractometer that employs the measurement of the
refractive index to determine the % Brix of sugar in aqueous solutions, as shown in
Fig. 9 [34]. In this project, the NIR spectra of the mango samples are calibrated by
machine learning (the AdaBoost ensemble algorithm for regression in this case) to
predict the Brix value non-invasively.
Figures 10 and 11 show the raw and pre-processed spectra of the mango samples.
Some pre-processing is applied, namely Gaussian smoothing and EMSC (extended
multiplicative scatter correction). The Test & Score widget can be used to show the
regression accuracy, in terms of R2 (coefficient of determination). The regression
performance obtained by the AdaBoost ensemble regression in this case is
R2 = 0.99 on the training set, indicating very good prediction accuracy. However,
R2 = 0.64 is obtained for testing. This lower attainment

Fig. 9 MA871 digital refractometer



Fig. 10 Raw spectral data of Mango

Fig. 11 Pre-processed spectra data of Mango



Fig. 12 Orange workflow for regression experiment and the regression result

Fig. 13 Actual %Brix value vs predicted value (by AdaBoost)



is an indication of overfitting of the prediction model, and this needs to be
remedied. However, that discussion is beyond the scope of this paper.
Figure 12 shows the Orange workflow (visual programming) used to generate the
data for this regression process. Figure 13 shows the regression plot for the testing
data, indicating the relation between the actual %Brix and the predicted value.
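A scripted equivalent of this regression workflow is sketched below using scikit-learn (the paper itself builds the pipeline from Orange widgets). The smoothing step uses a Savitzky-Golay filter as a stand-in for the Gaussian smoothing/EMSC pre-processing, and the file name, column layout and hyperparameters are assumptions for illustration.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.ensemble import AdaBoostRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Assumed layout: spectral intensities in all columns but the last, %Brix in the last column.
data = np.loadtxt("mango_spectra.csv", delimiter=",", skiprows=1)
X_raw, y = data[:, :-1], data[:, -1]

# Simple spectral smoothing (stand-in for the Gaussian smoothing / EMSC used in the paper).
X = savgol_filter(X_raw, window_length=11, polyorder=2, axis=1)

# 60 training and 20 testing spectra, as in the paper.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=60, test_size=20, random_state=0)

model = AdaBoostRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)
print("Train R2:", r2_score(y_train, model.predict(X_train)))
print("Test  R2:", r2_score(y_test, model.predict(X_test)))
```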

4 Conclusions and Discussions

This paper has introduced a different approach to spectra data calibration,
particularly for near-infrared spectroscopy, based on visual programming using
Orange data mining, free software that is still rarely used by the research
community in spectroscopy. This software tool is particularly useful for
non-programmer researchers who want to apply machine learning algorithms to
spectroscopy data, which leads to an intelligent chemometrics approach. No coding
was involved in the calibration and analysis, which may attract the interest of
non-programmers.
However, there are some recommendations for future improvement, particularly
for Orange software development, that the research community and the authors
should pursue, such as the development of a PLS (partial least squares) regression
widget and a deep learning (e.g. convolutional neural networks) widget. This is
because PLS in particular is among the most popular multivariate regression
methods in chemometrics and spectroscopy. This can only be achieved with
knowledge of the Python programming language.

References

1. Ang CK, Tey WY, Kiew PL, Fauzi M (2017) An artificial intelligent approach using fuzzy
logic for sleep quality measurement. J Mech Eng SI 4(2):31–47
2. Tang SH, Ang CK, Ariffin MKABM, Mashohor SB (2014) Predicting the motion of a robot
manipulator with unknown trajectories based on an artificial neural network. Int J Adv Robot
Syst 11(10):176
3. Hong TS, Kit AC, Nia DN, Ariffin MKAM, Khaksar W (2013) Planning for redundant
manipulator based on back-propagation neural network. Adv Sci Lett 19(11):3307–3310
4. Cen H, He Y (2007) Theory and application of near infrared reflectance spectroscopy in
determination of food quality. Trends Food Sci Technol 18(2):72–83
5. Teixeira Dos Santos CA, Lopo M, Páscoa RNMJ, Lopes JA (2013) A review on the
applications of portable near-infrared spectrometers in the agro-food industry. Appl Spectrosc
67(11):1215–1233
6. Porep JU, Kammerer DR, Carle R (2015) On-line application of near infrared
(NIR) spectroscopy in food production. Trends Food Sci Technol 46(2):211–230
7. Sakudo A (2016) Near-infrared spectroscopy for medical applications: current status and
future perspectives. Clin Chim Acta 455:181–188
8. Qu J-H et al (2015) Applications of near-infrared spectroscopy in food safety evaluation and
control: a review of recent research advances. Crit Rev Food Sci Nutr 55(13):1939–1954

9. Yadav J, Rani A, Singh V, Murari BM (2015) Prospects and limitations of non-invasive blood
glucose monitoring using near-infrared spectroscopy. Biomed Signal Process Control 18:214–
227
10. Saputra I, Jaswir I, Akmeliawati R (2018) Identification of pig adulterant in mixture of fat
samples and selected foods based on FTIR-PCA wavelength biomarker profile. Int J Adv Sci
Eng Inf Technol 8(6):2341
11. Chandran M, Rajamamundi P, Kit AC (2017) Tire oil from waste tire scraps using novel
catalysts of manufacturing sand (M Sand) and TiO 2: production and FTIR analysis. Energy
Sources Part A Recover Util Environ Eff 39(18):1928–1934
12. Elango N, Gupta NS, Lih Jiun Y, Golshahr A (2017) The effect of high loaded multiwall
carbon nanotubes in natural rubber and their nonlinear material constants. J Nanomater
2017:1–15
13. Solihin MI, Shameem Y, Htut T, Ang CK, Hidayab MB (2019) Non-invasive blood glucose
estimation using handheld near infra-red device. Int J Recent Technol Eng 8(3):16–19
14. Abdullah Al-Sanabani DG, Solihin MI, Pui LP, Astuti W, Ang CK, Hong LW (2019)
Development of non-destructive mango assessment using handheld spectroscopy and
machine learning regression. J Phys Conf Ser 1367(1):012030
15. Karunathilaka SR, Yakes BJ, He K, Chung JK, Mossoba M (2018) Non-targeted NIR
spectroscopy and SIMCA classification for commercial milk powder authentication: a study
using eleven potential adulterants. Heliyon 4(9)
16. Martens H, Stark E (1991) Extended multiplicative signal correction and spectral interference
subtraction: new preprocessing methods for near infrared spectroscopy. J Pharm Biomed Anal
9(8):625–635
17. Skogholt J, Liland KH, Indahl UG (2019) Preprocessing of spectral data in the extended
multiplicative signal correction framework using multiple reference spectra. J Raman
Spectrosc 50(3):407–417
18. Manley M (2014) Near-infrared spectroscopy and hyperspectral imaging: non-destructive
analysis of biological materials. Chem Soc Rev 43(24):8200–8214
19. Hardesty JH, Attili B, College C (2010) Spectrophotometry and the Beer-Lambert Law: an
important analytical technique in chemistry
20. Zhang H, Yang Q, Lu J (2013) Classification of washing powder brands using near-infrared
spectroscopy combined with chemometric calibrations. Spectrochim Acta Part A Mol Biomol
Spectrosc 120:625–629
21. Makky M, Soni P (2014) In situ quality assessment of intact oil palm fresh fruit bunches using
rapid portable non-contact and non-destructive approach. J Food Eng 120(1):248–259
22. Devos O, Ruckebusch C, Durand A, Duponchel L, Huvenne J-P (2009) Support vector
machines (SVM) in near infrared (NIR) spectroscopy: focus on parameters optimization and
model interpretation. Chemom Intell Lab Syst 96(1):27–33
23. Barbon S, da Costa Barbon APA, Mantovani RG, Barbin DF (2018) Machine learning
applied to near-infrared spectra for chicken meat classification. J Spectrosc 2018:1–12
24. Madden MG, Howley T (2009) A machine learning application for classification of chemical
spectra. In: Applications and innovations in intelligent systems XVI. Springer, London,
pp 77–90
25. Cheng C, Liu J, Zhang C, Cai M, Wang H, Xiong W (2010) An overview of infrared
spectroscopy based on continuous wavelet transform combined with machine learning
algorithms: application to chinese medicines, plant classification, and cancer diagnosis. Appl
Spectrosc Rev 45(2):148–164
26. Torrione P, Collins LM, Morton KD (2014) Multivariate analysis, chemometrics, and
machine learning in laser spectroscopy. In: Laser spectroscopy for sensing. Elsevier, pp 125–
164
27. Astuti W, Dewanto S, Soebandrija KEN, Tan S (2018) Automatic fruit classification using
support vector machines: a comparison with artificial neural network. IOP Conf Ser Earth
Environ Sci 195:012047

28. Astuti W, Aibinu AM, Salami MJE, Akmelawati R, Muthalif AG (2011) Animal sound
activity detection using multi-class support vector machines. In: 2011 4th international
conference on mechatronics (ICOM), pp 1–5
29. Yie Y, Solihin MI, Kit AC (2017) Development of swarm robots for disaster mitigation using
robotic simulator software, vol 398
30. Orange – data mining fruitful & fun. https://orange.biolab.si/. Accessed 12 Mar 2019
31. Demšar J et al (2013) Orange: data mining toolbox in Python. J Mach Learn Res 14:2349–
2353
32. Zhao G, Guo Y, Sun X, Wang X (2015) A system for pesticide residues detection and
agricultural products traceability based on acetylcholinesterase biosensor and internet of
things. Int J Electrochem Sci 10(4):3387–3399
33. Jamshidi B, Mohajerani E, Jamshidi J, Minaei S, Sharifi A (2015) Non-destructive detection
of pesticide residues in cucumber using visible/near-infrared spectroscopy. Food Addit
Contam Part A Chem Anal Control Expo Risk Assess 32(6):857–863
34. Milwaukee - MA871 Digital Brix Refractometer. http://www.milwaukeeinst.com/site/products/products/digital-refractometers/165-products-g-digital-refractometers-g-ma871. Accessed 22 Aug 2019
Real Time Android-Based Integrated
System for Luggage Check-in Process
at the Airport

Xin Yee Lee and Rosmiwati Mohd-Mokhtar

Abstract Air transportation has become a trend among travellers. However, the
check-in process involves many stages, including flight ticket check-in, luggage
drop-off, luggage scanning and others. This research focuses on developing an
online, software-integrated device capable of managing the luggage check-in
process at the airport. A weight sensor device is used to capture the weight of the
luggage and is synchronized with a real-time database. An Android application is
developed to display the user interface for checking in the luggage. An additional
capability of purchasing add-on luggage is included in the developed software
application. The results show that, with some assumptions about the time taken to
scan, print and attach the bag tag, the estimated recorded time is four minutes and
thirty-seven seconds with no purchase add-on activity, and an average of six
minutes and fifteen seconds with purchase add-on activity. By eliminating the
purchase of additional luggage at a different counter, the system is expected to
reduce the time taken during the luggage check-in process for that passenger.

Keywords Android system · Check-in process · Weight sensor

1 Introduction

Research has been carried out on the procedures and methods of airline passenger
check-in over the years [1]. The outcome is a detailed analysis of each of the
current methods implemented at airports. Examples of the methods used are
agent/check-in desk (assisted counter check-in at the airport), browser (self-service
online web check-in), kiosks (self-check-in at the airport), app (self-service mobile
app check-in) and others. Based on the research in [1], it was observed that
passengers' most preferred check-in method is the agent/check-in desk, which
accounts for a high value of forty-nine
X. Y. Lee · R. Mohd-Mokhtar
School of Electrical and Electronic Engineering, Universiti Sains Malaysia, Engineering Campus, 14300 Nibong Tebal, Pulau Pinang, Malaysia
e-mail: [email protected]


percent, nearly half. This means that most passengers still prefer the traditional
check-in method. However, there is a slight increase in the app method from 2015
to 2017, and it is forecast to become the majority preference of passengers in
2020 [1].
Nowadays, national airlines have adopted advanced technology in their flight
check-in operation systems. In the passenger ticket check-in process, website
browser check-in was popular over the past few years, and mobile applications
have begun to take the lead ahead of browser check-in. One of the popular airlines
in Malaysia, Airasia Berhad, introduces various passenger check-in methods with
advanced technology. The passenger is able to get the boarding pass by scanning
the 2D barcode received on their smartphone at a kiosk at the airport. The
passenger is ready to board the flight if they do not carry any luggage; the whole
process takes less than one minute. However, if the passenger has checked
luggage, they need to head over to the luggage drop counter and are required to get
their travel documents verified two hours before boarding [2].
Like Malaysian airports, other international airports and airlines also facilitate the
boarding process with automation systems. This can be seen, for example, at
Changi Airport in Singapore. The main boarding process can be categorized into
four steps, namely check-in, luggage drop, immigration and boarding [3]. Changi
Airport has adopted automation technology in these processes: first, the automated
check-in kiosks; second, the automated luggage-drop system, which allows
passengers to drop their luggage without counter check-in; and others. If the
luggage is in excess of the allowed weight, the passenger needs to pay for excess
luggage or air-freight it to the destination as unaccompanied luggage [3].
Several studies have been conducted on the check-in process at the airport [4–9].
They usually use simulation designs to study the dynamic behavior of passengers,
estimate queuing times and analyze the operational procedures run at the airport.
These studies mainly focus on the impact of social behavior among passengers at
the airport, and the simulations are developed as agent-based models. There are
also studies that focus on simulation to evaluate airport security during luggage
check-in, as reported by Perboli et al. [10], Hepler [11], Miller [12], and Leone and
Liu [13].
Studies on waiting and queuing times at the airport have also been made, as can
be seen in [14–18]. The aim is to provide an optimized model that is able to reduce
the length of queuing lanes as well as the operational cost of the airport and flight
service providers. Pou et al. [14], for example, used a dynamic programming
technique to simulate the process and modelled the influx of passengers as a
Poisson process. Mehri et al. [15] used quite a similar approach with a different
airport case study. On the other hand, de-Lange et al. [16] proposed a virtual
queuing model for airport security checking lanes.
The flight check-in area is always crowded with people queuing to check in their
flight ticket, their luggage or both at each counter [19]. At the departure gate, the
time available to resolve eventual problems becomes strictly limited [20–22]; one
such problem is the removal of cabin bags whenever the volumetric limits for
stowage in the passenger cabin are reached. The counter will eventually pause and
delay services to the next passengers. Things get worse when passengers without
luggage must queue up and wait behind those who are checking in luggage. These
are the causes of the long lines at each counter whenever issues occur.
In addition, matters come to a head when a passenger's checked luggage
exceeds the allowed weight limit. They are prompted to proceed to the cashier
counter first in order to pay for the excess weight of their luggage, and then to
return to the luggage drop counter to continue with the flight check-in before
boarding at the departure gate. This is inefficient and time-consuming, especially
when travelling abroad, where foreign currency issues may also occur. The process
is troublesome and not user-friendly.
The aim of this research is to implement a real-time integrated system that is able to
store detailed information about the passenger and the luggage weight that he/she
has initially purchased/obtained, to link directly to the weight measurement device,
and to automatically monitor the luggage check-in process via a smartphone. The
system also has a feature to purchase extra luggage weight directly via the
smartphone should excess luggage occur, without the need to change counters.
Even though some commercial airlines already have apps that are able to buy extra
luggage online, an integrated system that provides a real-time luggage check-in
process is yet to be implemented by any airline service so far. This is the innovative
and new contribution of this research.
The remainder of the paper is as follows. Section 2 gives details of the project
implementation: the hardware system is developed to imitate the weighing process
at the luggage check-in counter, and the software synchronized with the weight
sensor, real-time database and user interface is embedded in the system. The
outcome and analysis of the project are presented in Sect. 3. Section 4 concludes
the paper.

2 Project Implementation

2.1 Hardware Components and Circuits

The flight luggage check-in device is a mechanism that is able to manage the
luggage check-in process without human resources. It measures the weight of the
luggage, and the measured values are automatically updated to the database in the
communication system. It analyzes the data collected and triggers notifications
according to the allowed or purchased luggage weight. The system sets up an
internet server to connect the automation device with the passenger's portable
devices. Figure 1 shows the block diagram of the flight luggage check-in system.
A Raspberry Pi is used as the core of the luggage check-in system hardware
implementation. A sensor device which is sensitive to weight is used to measure

Fig. 1 Block diagram of the flight luggage check-in system

the weight of the luggage. The weight sensor module HX711 is utilized to read the
values from the sensor device. The Raspberry Pi is connected to the Wi-Fi module
to establish a connection with the database server.
The Raspberry Pi is able to support a Wi-Fi module, which makes it a suitable
embedded system for setting up a wireless server connection. It can be remotely
controlled using SSH (Secure Shell) alone, as long as the Pi is powered on with a
Wi-Fi connection. The Raspberry Pi is linked to the database system; in this case,
the Firebase real-time database is used. The Raspberry Pi is granted permission to
read the database and change the data, and vice versa. The Raspberry Pi reads the
login identity and verifies the passenger's purchased weight against the checked-in
luggage. Then, it uploads the measured weight of the luggage to the database.
Figure 2 shows the front view of the automatic luggage check-in system platform.
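A minimal sketch of the weighing step is shown below. It assumes one of the community HX711 Python drivers for the Raspberry Pi (the exact library, GPIO pins and calibration constant are assumptions, since the paper does not name them), so the class and method names may differ from the driver actually used.

```python
# Hypothetical HX711 driver; several community libraries expose a similar interface.
from hx711 import HX711   # assumed package name

DOUT_PIN, SCK_PIN = 5, 6          # assumed GPIO wiring
CALIBRATION_FACTOR = 102.3        # assumed raw-counts-per-gram, found by calibration

def read_weight_kg(samples: int = 10) -> float:
    """Average several raw readings and convert them to kilograms."""
    hx = HX711(DOUT_PIN, SCK_PIN)
    raw_values = [hx.read() for _ in range(samples)]      # assumed read() method
    grams = (sum(raw_values) / samples) / CALIBRATION_FACTOR
    return round(grams / 1000.0, 3)

if __name__ == "__main__":
    print("Measured luggage weight:", read_weight_kg(), "kg")
```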

Fig. 2 Front view of hardware platform



2.2 Software Design

The software design can be categorized into three sections, namely the low-level
software design on the hardware device, the database design and the Android user
interface application. The low-level software involves a Python script, while the
Android application is programmed in Java, an object-oriented language. The
database is set up on both the hardware and the software side by configuring the
settings in the respective environments.
A Python script is suitable for the Raspberry Pi program as it is easy to use and
universal. Firebase is a cloud storage database. The database is designed in
categories and groups according to specific features or characteristics. The main
information needed in this project is the user/passenger information. User is
defined as the parent of the group; the second layer holds the various registered
user names, followed by their respective flight information such as the user
password, the purchased luggage weight and the measured luggage weight. Table 1
tabulates the dummy data used in designing the database, and Fig. 3 shows the
design structure of the database created.
The measured weight is synchronized with the weight sensing device. The
Raspberry Pi is linked to the database through the Python script: the Firebase
library is imported on the Raspberry Pi, and an authentication link to the Firebase
account is used to access the data stored in the database. The Android operating
system is used to create the graphical user interface for the luggage check-in
process; it has been chosen for this study as this operating system (OS) is
commonly used among phone users.
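The database link described above can be sketched as follows. The paper does not state which Firebase client library was used, so this example assumes the official firebase_admin SDK; the credential file, database URL and node names mirror the structure of Table 1 but are placeholders.

```python
import firebase_admin
from firebase_admin import credentials, db

# Placeholder service-account file and database URL.
cred = credentials.Certificate("serviceAccount.json")
firebase_admin.initialize_app(cred, {"databaseURL": "https://example-project.firebaseio.com"})

def upload_measured_weight(user: str, weight_kg: float) -> None:
    """Write the latest scale reading into the user's record."""
    db.reference(f"User/{user}").update({"Measured luggage": weight_kg})

def get_purchased_weight(user: str) -> float:
    """Read the luggage allowance the passenger has purchased."""
    return float(db.reference(f"User/{user}/Purchased luggage").get())

upload_measured_weight("Ali", 2.0)
print("Allowance for Ali:", get_purchased_weight("Ali"), "kg")
```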

2.3 Design Layout

Firstly, the user or passenger needs to log in to his/her account. Upon successful
login, the user is presented with the flight he/she is going to board, the details of the
ticket check-in and the weight of luggage purchased.
At the same time, the data are linked to the server, also called the database
system, which is shared with the luggage automation device. If the details of the
passenger and luggage pass verification, the message "Luggage Check-in is
successful." is prompted. After that, the luggage is automatically transferred into
the luggage store and the luggage check-in process is done. However, once the
luggage automation device senses that the weight of the luggage exceeds the
allowed weight, a notification is sent to inform the passenger that the luggage is
overweight. If the luggage is overweight, the passenger is not allowed to proceed to
the final stage of the luggage check-in process. The passenger has the option either
to proceed to the payment session to purchase add-on luggage or to cancel the
luggage check-in process by pressing the logout button. Figure 4 shows the flow of
the proposed flight luggage check-in system.

Table 1 Dummy data to be used in the database

User   Password   Purchased luggage (kg)   Measured luggage (kg)
Ali    0000       4                        0
Lee    1234       3                        0
Raju   4321       2                        0

Fig. 3 Design structure of the database created

2.4 System Functionality Under Several Luggage Conditions

Several luggage conditions are created to test the overall system. There are three
major weight states of the luggage, namely underweight, equal weight and
overweight, and these test cases are simulated. The dummy data listed in Table 1
are used to simulate the test cases. In this case, the user name Ali is selected, whose
purchased weight is 4 kg, with the initial measured weight set to 0 kg. Table 2 lists
the simulated test cases based on the information for the user Ali in the database.
A decision-logic sketch covering these cases is given after Table 2.

Table 2 Test case simulated based on information of the user, Ali in the database
Test case Weight category Load (kg)
1 Underweight 2.0
2 Equal weight 4.0
3 Overweight 5.0
4 No luggage 0.0
5 Overweight with purchase add-on activity 5.0
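The check-in decision described in Fig. 4 can be summarized in a few lines of Python. This is a simplified sketch of the logic only (the message strings and the add-on handling are placeholders), not the code running on the actual device or Android app.

```python
def check_in_decision(measured_kg: float, purchased_kg: float) -> str:
    """Return the prompt the passenger would see for a given scale reading."""
    if measured_kg <= 0.0:
        return "No luggage detected"
    if measured_kg > purchased_kg:
        return "Luggage is overweight: purchase add-on luggage or log out"
    return "Luggage check-in is successful"

# Test cases of Table 2 for user Ali (purchased allowance 4 kg).
for load in (2.0, 4.0, 5.0, 0.0):
    print(f"{load:>4} kg -> {check_in_decision(load, purchased_kg=4.0)}")
```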

3 Simulation Results

The flight luggage weighing device is able to measure the weight of the luggage.
The collected weight values are automatically updated to the database system in
the communication session. The system analyzes the data collected and triggers
notifications according to the allowed or purchased luggage weight.

Fig. 4 Flow chart of the flight luggage check-in system

3.1 Load Cell Testing

Table 3 shows the results obtained from the weight sensor when items of different
known weights are placed on top of the sensor device. It is found that the
percentage error for each test case is minimal and almost insignificant, ranging
from as low as 0.16% to a maximum of 3.00% across the observations. Therefore,
it can be concluded that the weight sensor device functions within the acceptable
percentage error range of ±5%.

Table 3 Results of the load cell tests


Test case Actual weight (kg) Average measured weight (kg) Percentage error (%)
1 0.5 0.485 3.00
2 1.1 1.083 1.55
3 2.0 1.977 1.15
4 4.0 3.983 0.43
5 5.0 4.984 0.32
6 8.0 7.987 0.16

3.2 Android-Based User Interface

The user interface displayed by the software for a passenger who wants to check in
luggage is shown in Fig. 5. Figure 6 shows the simulation for Test Case 1, where
the luggage check-in process is completed with no excess weight issue. After that,
the luggage is automatically transferred to the luggage cabin and the check-in
process is completed. The simulation for Test Case 2, in which the luggage weight
is equal to the purchased weight, also runs successfully.
If the luggage automation device senses that the weight of the luggage exceeds
the allowed weight, a notification is sent to inform the passenger that the luggage is
overweight (Fig. 7(a)). The passenger has the option either to proceed to the
payment session to purchase add-on luggage or to cancel the luggage check-in
process by pressing the logout button. On the other hand, if no

Fig. 5 Android application (a) Welcome homepage (b) User login page (c) User flight details

Fig. 6 Test case 1: Transition of the user interface before and after passenger clicked on the
luggage check-in button

luggage is present on the weighing scale, the no-luggage message is prompted
(Fig. 7(b)).
Figure 8 displays the window for purchasing additional luggage weight. For
simulation purposes, three options are set for purchasing add-on luggage: the first
option is add-on luggage of 1 kg at a price of RM100, followed by 3 kg at RM150
and 5 kg at RM200. Once the purchase is done, the user presses the "Luggage
Check-in" button to finish the purchase add-on process. The add-on purchase
updates the purchased-weight value in the database with the sum of the previously
purchased weight and the add-on weight. The layout activity then switches from
the purchase add-on layout back to the user info layout to resume the luggage
check-in process.
Overall, the process includes account login, weighing of luggage, purchasing of
extra luggage weight, scanning, printing and getting the bag tag, putting the tag on
the bag, account logout, and lastly moving the luggage to storage. For the normal
scenario in which there is no purchase add-on activity, the estimated recorded time
is four minutes and thirty-seven seconds only. The time taken is a rough estimate
based on the user's phone clock, with some assumptions on the time taken to scan,
print and get the bag tag. In contrast, the excess-luggage scenario with purchase
add-on activity consumed an average time of six minutes and fifteen seconds,
while the time assumed for the other processes remains constant.

Fig. 7 (a) Test case 3: Luggage check-in is blocked when the measured luggage weight exceeds the purchased weight of the user (b) Test case 4: Luggage check-in is blocked when no luggage is detected

Fig. 8 Test case 5: Purchase add-on process when the luggage exceeds the weight limit
In the current system, when excess luggage occurs, the passenger has to cancel the
current luggage check-in process and go to the finance counter to purchase the
add-on luggage. Then the passenger returns to the luggage check-in counter to
continue with the check-in process. The passenger may spend a lot of time moving
from counter to counter and queuing repeatedly. If the passenger is in a foreign
country, he/she may need additional time to convert money before purchasing the
add-on luggage.
By eliminating the step of moving to another counter to purchase the extra luggage
weight, as proposed in this project, the time taken to complete the luggage check-in
process for that passenger is indirectly reduced. However, the results are also
subject to the speed of the internet and the processing speed of the mobile device.
For this research, the computational cost is bearable as the developed prototype
system only used the Python programming language (for programming the
Raspberry Pi) and object-oriented Java for the Android system. The processing
time depends very much on the type of smartphone used.

4 Conclusion

A hardware device has been successfully developed with the capability of
obtaining real-time luggage weight measurements linked to the communication
system. The objective is achieved by using a load cell with a maximum capacity of
10 kg connected to the Raspberry Pi and synchronized with the real-time database.
Calibration of the hardware device is performed, in which the accuracy of the
weight sensing device is guaranteed within a tolerance of ±3%.
An Android software application that monitors the luggage check-in process is
designed to interact with the hardware sensing device. The system is simulated
with five test cases as described in Sect. 3. From the results, the estimated time
taken is recorded as six minutes and fifteen seconds with the purchase add-on
procedure, while the normal luggage check-in process takes about four minutes
and thirty-seven seconds.
In conclusion, the real-time Android-based luggage check-in system is able to
eliminate the step of purchasing add-on luggage at another counter. Thus, it
reduces the time consumed during the luggage check-in procedure for that
particular passenger.

Acknowledgement The project is partially supported by the USM RUI Grant: 1001/PELECT/8014093.

References

1. The statistics portal, check in method of airline passengers worldwide from 2015 to 2020,
travel, tourism & hospitality. https://www.statista.com/statistics/493957/check-in-method-airline-passengers/. Accessed June 2018
2. Airasia Berhad: Passenger guide, departure. https://www.airasia.com. Accessed June 2018

3. Changi Airport Singapore: Fast and seamless travel (FAST), passenger guide. http://www.changiairport.com. Accessed Aug 2018
4. Joustra PE, Van Dijk NM (2001) Simulation of check-in at airports. In: Proceedings of the
2001 winter simulation conference, Arlington, VA, USA, vol 2, pp 1023–1028
5. Bevilacqua M, Ciarapica FE (2010) Analysis of check-in procedure using simulation: a case
study. In: 2010 IEEE international conference on industrial engineering and engineering
management, Macao, pp 1621–1625
6. Ma W, Kleinschmidt T, Fookes C, Yarlagadda PKDV (2011) Check-in processing:
simulation of passengers with advanced traits. In: Proceedings of the 2011 winter simulation
conference, 11–14 December, Phoenix, AZ, USA, pp 1783–1794
7. Trakoonsanti L (2016) A process simulation model of airline passenger check-in. Univ J
Manage 4(5):265–276
8. Felix MM (2015) Micro-simulation of check-in operations: case study of Lisbon Airport’s
Terminal 1. Technical report, pp 1–10
9. Felix M, Reis V (2016) A micro-simulation model for assessing the performance of check-in
airports. In: 2016 IEEE 19th international conference on intelligent transportation systems, 1–4
November, Rio de Janeiro, Brazil, pp 1–6
10. Perboli G, Musso S, Perfetti F, Trapani P (2014) Simulation of new policies for the baggage
check in the security gates of the airports: the Logiscan case study. Procedia Soc Behav Sci
111:58–67
11. Hepler W (2003) Simulation of airport baggage screening. In: 2003 simulation workshop
using simulation to evaluate impact of airport security, no. E-C060, Washington D.C.,
January 2003, pp 16–17
12. Miller E (2003) Modeling checked baggage requirements for Dallas/Fort Worth International
Airport. In: 2003 simulation workshop using simulation to evaluate impact of airport security,
no. E-C060, Washington D.C., January 2003, pp 21–22
13. Leone K, Liu R (2003) Measures of effectiveness for passenger baggage security screening.
In: 2003 simulation workshop using simulation to evaluate impact of airport security, no.
E-C060, Washington D.C., January 2003, pp 23–24
14. Pou S, Kunin D, Xiang D (2017) Reducing wait times at airport security. Technical report
team 56632, pp 1–21
15. Mehri H, Djemel T, Kammoun H (2008) Solving of waiting lines models in the airport using
queuing theory model and linear programming the practice case: A.I.M.H.B., Hal-00263072v2,
pp 1–26
16. De-Lange R, Samoilovich I, Van der Rhee B (2012) Virtual queuing at airport security lanes.
Eur J Oper Res 225(2013):153–165
17. Al-Sultan AT (2017) Simulation and optimization for modeling the passengers’ check-in
system at the airport terminal. Rev Integr Bus Econ Res 7(1):44–53
18. Simaiakis I, Balakrishnan H (2016) A queuing model of the airport departure process. Transp
Sci 50(1):1–30
19. Liu XC (2018) Field investigation on characteristics of passenger flow in a Chinese hub
airport terminal. Build Environ 133:51–61
20. Xavier F, Ricardo F (2016) How do airlines react to airport congestion? The role of networks.
Reg Sci Urban Econ 56:73–81
21. Alvaro R, Fernando GC, Rosa AV, Javier PC, Rocio BM, Sergio CS (2019) Assessment of
airport arrival congestion and delay: prediction and reliability. Transp Res Part C Emerg
Technol 98:255–283
22. Rui CM, Pedro S (2010) Measuring the influence of congestion on efficiency in worldwide
airports. J Air Transp Manage 16:334–336
Antenna Calibration in EMC
Semi-anechoic Chamber Using Standard
Antenna Method (SAM) and Standard
Site Method (SSM)

Abdulrahman Ahmed Ghaleb Amer, Syarfa Zahirah Sapuan, Nur Atikah Zulkefli,
Nasimuddin Nasimuddin, Nabiah Binti Zinal, and Shipun Anuar Hamzah

Abstract Electromagnetic Compatibility (EMC) engineers should continuously
self-check the antenna's condition and parameters, including the Antenna Factor
(AF) and gain, by calibrating the antenna. Therefore, several analyses have been
done to compare SAM and SSM in an EMC testing lab. Based on the analysis, the
antenna to be used inside a 3 m semi-anechoic chamber needs to be positioned
1.5 m above the ground plane to avoid reflections. The Antenna Factor (AF) results
show good agreement with the manufacturer's data. SAM is recommended as the
calibration method in the semi-anechoic chamber because its percentage error is
5%, which is lower compared with SSM (18%). This is due to the site
imperfections in the EMC lab. The uncertainty for the EMC lab is up to ±4 dB,
compared with the calibration test site where the allowed uncertainty is ±1 dB.
Therefore, an absorber needs to be placed between the two antennas. In addition,
the phase center of the reference antenna needs to be taken into consideration for a
highly accurate AF.


Keywords Antenna calibration · Standard antenna method (SAM) · Standard site method (SSM) · Semi-anechoic chamber · Antenna factor · EMC

A. A. G. Amer (✉) · S. Z. Sapuan (✉) · N. A. Zulkefli · S. A. Hamzah
Faculty of Electrical and Electronic Engineering, Universiti Tun Hussein Onn Malaysia,
86400 Parit Raja, Batu Pahat, Johor, Malaysia
e-mail: [email protected]
S. Z. Sapuan
e-mail: [email protected]
N. Nasimuddin
Institute for Infocomm Research, A-STAR, Singapore, Singapore
N. B. Zinal
Centre for Diploma Studies, Universiti Tun Hussein Onn Malaysia, Parit Raja, Batu Pahat,
Johor, Malaysia

© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666, https://doi.org/10.1007/978-981-15-5281-6_42

1 Introduction

Electromagnetic Compatibility (EMC) consists of two aspects, radiation emission,


and immunity. Emission is the generation of electromagnetic energy, whether
deliberate or accidental, by some source and its release into the environment.
Immunity is the opposite of susceptibility, being the ability of equipment to
function correctly in the presence of EMI/RFI, with the discipline of “hardening”
equipment being known equally as susceptibility or immunity.
During the EMC radiated emission and immunity measurement, several anten-
nas are used to sense or radiate the electromagnetic wave and later the result will be
compared with an allowable limit set by an EMC Standard. Therefore, the accuracy
of the antenna for EMC measurements is important to reduce the uncertainty of the
result obtained.
Antenna factor (AF) is a fundamental requirement for reliable EMC measure-
ments in radiated emission and radiated immunity tests [1]. Usually, calibration of the antenna is conducted once a year due to deviation of the AF values or expiry of the calibration date. Unfortunately, the antenna needs to be sent to a calibration laboratory, which increases the cost. Most semi-anechoic chambers are designed for EMC measurement and are not suitable for antenna calibration. In addition, EMC engineers in the test lab need to perform an intermediate check once every two or three months to ensure the antenna provides an accurate reading for radiated emission testing. Therefore, it is important to conduct a feasibility study of the EMC semi-anechoic chamber as an antenna calibration test site. SAM, SSM and SFM are used for AF determination in the antenna calibration labs assigned by the National Institute of Standards and Technology (NIST). SAM and SSM are the most commonly used methods for AF determination in the antenna calibration laboratory. Therefore, it is important to analyze the capability of the EMC lab for antenna calibration using SAM and SSM. Both methods are analysed to ensure the accuracy of antenna calibration in the EMC laboratory [2, 3].

2 Antenna Calibration

An Open Area Test Site (OATS) [4], a full anechoic chamber or a semi-anechoic chamber designed for radiated emission EMC measurement testing (i.e. a COMTS) does not usually meet the standard requirement for an antenna calibration test site (CALTS), because the uncertainty due to site imperfections of a COMTS is up to 4 dB, which is considered high for antenna calibration. Therefore, NPL has designed an OATS as a calibration test site with a specific antenna range, close to the ideal site, which can be regarded as a national standard site against which measurements on other CALTS can be compared for consistency. In about 2000, NPL built a fully anechoic chamber with special pyramidal absorbers and ferrite to achieve the low reflection required for free-space AF [5]. The decision to move the CALTS from an outdoor OATS to indoor shielded rooms was made to eliminate ambient RF interference and to enable accurate testing free from extreme weather [6, 7].
Since then, the free-space AF measured in a calibrated fully anechoic chamber dedicated to antenna calibration has been accepted as a standard. Standard sites that are used for antenna calibration, or so-called antenna calibration test sites (CALTS), must first be calibrated to ensure they satisfy the standards provided by ANSI and CISPR. The key difference during the calibration of a CALTS is the use of theoretical computations and geometry-specific correction factors [8], as compared to the validation of a reference test site (REFTS) using an analytically calculable dipole in CISPR 16-1-5 [8].
Two standard techniques that are commonly used in a national laboratory for
antenna calibration are: 1) Standard Site Method (SSM); and 2) Standard Antenna
Method (SAM).

2.1 Standard Site Method (SSM)

Historically, the SSM has been used to calibrate antennas in the frequency range of
30 to 1000 MHz, and is based on the work of Albert Smith, in the early 1980s [9].
The author proposed that the SSM of determining AFs is based on site attenuation
measurements made on a near ideal, open-field site.
The method of SSM is based on the far-field Friis transmission equation, and
adds a ray tracing component from the ground bounce of the wave, over the
conducting ground plane used for these calibrations. Even though the ground plane
is used, the SSM in recent times now aims to produce free space AF by removing
the ground effect mathematically [10].
In the standard site method, three sets of site-attenuation measurements were taken under identical geometries using three different antennas, taken two at a time, as shown in Fig. 1. For the test set-up, the transmitting and receiving antennas were kept at heights of 2 m and 1–4 m, respectively. The distance between the

Fig. 1 Standard Site Method set up [5]



transmitting and receiving antennas was kept at 3 m. The three equations associated with the three site-attenuation measurements are given in [10].
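As a simplified illustration only (not the full normalized-site-attenuation formulation of ANSI C63.5), the essence of the three-antenna solution can be sketched in Python: after the geometry- and frequency-dependent terms are accounted for, each pairwise site-attenuation measurement constrains the sum of two antenna factors, so the three AFs follow from a small linear system. The function name and the example numbers below are ours, not values from this work.

# Simplified SSM sketch: recover the three antenna factors (dB/m) at one
# frequency from the pairwise AF sums derived from the three site-attenuation
# measurements. This is an illustrative assumption-based example.
import numpy as np

def ssm_antenna_factors(af_sum_12, af_sum_13, af_sum_23):
    """Solve AF1, AF2, AF3 from the three pairwise sums (all in dB)."""
    A = np.array([[1.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0]])
    b = np.array([af_sum_12, af_sum_13, af_sum_23])
    return np.linalg.solve(A, b)

# Hypothetical pairwise sums at a single frequency point
af1, af2, af3 = ssm_antenna_factors(30.2, 31.5, 32.1)
print(af1, af2, af3)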

2.2 Standard Antenna Method (SAM)

According to ANSI 63.5, the SAM is different from the reference antenna method
(RAM). Commonly, most researchers highlight that both methods are the same
owing to the same procedure in calibrating the antenna. However, in the reference
antenna method (RAM), the reference/standard antenna is a dipole with a
well-matched balun, but in SAM, the reference/standard antenna can be any
accurate antenna with known AF, including dipole. For this reason, RAM is also
known as SAM, but SAM is not always a RAM. Therefore, to simplify the terms in
this paper, SAM will be used, and the reference antenna terms will be used as a
standard antenna [11, 12].
The SAM of antenna calibration usually uses a dipole with an accurately mat-
ched balun as a reference antenna. The AF of any other antenna may be derived by
the substitution against the reference antenna [13]. The AF measurement was
carried out on the 3 m OATS, or a certified anechoic chamber, by keeping a distance of 3 m between the transmitting antenna and the receiving antenna. ANSI 63.5 (2006) states that the two antennas must be 10 m from each other, but a 3 m measurement has been accepted by CISPR 16-1-5 as a standard owing to its low-cost implementation. Both CISPR and ANSI have the same calibration techniques with different criteria, and the discrepancy is acceptable, so either method may be used based on the customer requirement [14].
The transmitting antenna, S1, was kept at a height of two metres, and the
receiving antenna, S2, was kept at a height between 2.5 and 4 m, as shown in
Fig. 2. It is not important to position the antenna for a signal maximum, but it is
important to avoid the region around a null, where readings will be changing
rapidly with the antenna position. Therefore, the use of an absorber on the floor in a
semi-anechoic chamber, or using a fully anechoic chamber, is preferable for this
method.

Fig. 2 SAM configuration

To calibrate the antenna against the reference dipole antenna, first measure the
signal strength of the reference antenna at S2. Then, the reference antenna should be
substituted with the antenna under test (AUT), keeping the height and position from S1
the same as the reference antenna. The AF for the AUT is calculated as Eq. (1)–(2) [8].

$V_{ref} + AF_{ref} = E$   (1)

$E - V_{AUT} = AF_{AUT}$   (2)

where:
$V_{ref}$, $V_{AUT}$ = received voltage (dBµV) of the reference antenna and the AUT, respectively
$AF_{ref}$, $AF_{AUT}$ = AF of the reference antenna and the AUT (dB/m)
$E$ = electric field strength (dBµV/m)
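A minimal sketch of how Eqs. (1)–(2) are applied in the substitution step is shown below; the function name, array layout and example readings are illustrative assumptions rather than values from the measurement campaign.

# SAM substitution sketch: the field established with the reference antenna
# (Eq. 1) is reused to derive the AF of the AUT (Eq. 2), point by point.
import numpy as np

def sam_antenna_factor(v_ref_dbuv, af_ref_db, v_aut_dbuv):
    """AF of the AUT (dB/m) from received voltages (dBuV) and the known
    reference-antenna factor (dB/m) at the same frequency points."""
    e_field = np.asarray(v_ref_dbuv) + np.asarray(af_ref_db)   # Eq. (1)
    return e_field - np.asarray(v_aut_dbuv)                    # Eq. (2)

# Made-up readings at three frequency points
af_aut = sam_antenna_factor([60.1, 58.7, 57.9], [12.3, 13.1, 14.0],
                            [55.4, 54.2, 53.8])
print(af_aut)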

3 Results and Discussion

3.1 Measurement Set-up

Figure 3 shows the standard antenna method measurement set-up. Three antennas
are involved in this measurement: 1) Log periodic antenna as a transmitter, 2)
reference antenna, which is a calibration antenna and has a highly accurate AF
(receiver) and 3) An antenna under calibration (AUC), which in this research is a
direct feed horn, dipole, and bi-log antenna (receiver).
The transmitter and receiver must be placed at a distance of 3 m from each other.
Both transmitter and receiver must be positioned at 2 m above the floor, however,
for self-calibration, CISPR 16-1-5 states that the height of an antenna depends on
the chamber size and must be positioned to avoid any reflection. Therefore, height
scanning for the receiver is not important in this method. The transmitter was

Fig. 3 Measurement set-up for SAM



connected to the signal generator and the receiver (either the reference antenna or the AUC) was connected to a spectrum analyzer. The measurement is started by measuring the voltage received from the reference antenna. The reference antenna is then replaced with the AUC and the received voltage is measured. During the replacement of the reference antenna with the AUC, all cables, connections, the transmitter position and any equipment must remain static and must be the same as in the previous layout to reduce uncertainties.
The measurement set-up for SSM is almost the same as for SAM in Fig. 3, except that SSM is arranged without an absorber. Three antennas are involved in this measurement, and the three different antennas are taken in pairs. The transmitter and receiver must be placed at a distance of 3 m from each other, and both must be positioned 2 m above the floor. For each frequency, −20 dBm was fed to the transmitting antenna and the receiving-antenna height was scanned from 1 to 4 m to record the maximum received signal. All frequency points are measured with the above procedure for the first transmitter–receiver antenna pair, and the procedure is repeated for the remaining antenna pairs. Then the antenna is replaced by the reference antenna. The resulting AF is calculated by using the equations in [9].

3.2 Antenna Height Analysis

The height of the AUT antenna was analyzed by using the SSM method without an absorber. For SAM, it is stated in the literature that the suitable antenna height in the semi-anechoic chamber for antenna factor measurement is 1.5 m. The author mentioned that both the reference antenna and the antenna under test should obviously be positioned at the same height of 1.5 m from the ground plane to achieve reliable AF data [15].
For the SSM measurement, the absorber was removed from the ground plane to obtain a maximum signal from the combination of the direct and reflected signals. The transmitter was fixed at 1.5 m from the ground plane for all measurements, but the receiver was positioned at 1 to 4 m from the ground plane, as shown in Fig. 4.

Fig. 4 SSM measurement for antenna height analysis

Figure 5 shows the Antenna Factor (AF) vs frequency for various antenna heights. The measurement result is compared with the manufacturer data from the calibration certificate of the AUC. The AF for the antenna height of 1.5 m gives good agreement with the manufacturer data because this height is located in the middle of the semi-anechoic chamber. Therefore, the maximum received power is measured due to the direct signal and the signal reflected from the ground plane, as mentioned in the standard procedure of the SSM. However, at a height of 4 m, the measurement results show that the voltage gives the highest deviation compared with the manufacturer data. This is because the AUC is too close to the ceiling, which has no absorber, as shown in Fig. 6. Therefore, the reflection from the ceiling at the 4 m antenna height is unavoidable.
In conclusion, the effect of the antenna height on the AF in the semi-anechoic chamber has been analysed. Any measurement, including antenna analyses such as antenna gain and radiation pattern, needs to be conducted at an antenna height of 1.5 m from the ground plane due to the lower reflection that occurs at that height in the EMC semi-anechoic chamber testing lab.

3.3 Analysis of the Antenna Phase Center

Phase center is an important parameter in determining the AF. EMC tests usually do
not emphasize the position of the phase center of the antenna to be used. But for the
antenna calibration, several studies were conducted using horn antenna as a ref-
erence antenna for SAM. Therefore, an absorber will be used between the trans-
mitter and receiver.

Fig. 5 AF vs frequency for various antenna height


Fig. 6 SSM measurement at 4 m antenna height

Fig. 7 Horn antenna measurement without considering a phase center position (Front)

Two different analyses regarding the phase centre of the horn antenna were
undertaken. The horn antenna is a reference antenna while a dipole antenna is an
AUC. For the first measurement, the 3 m distance was positioned at the edge
(Front) of the horn antenna, as shown in Fig. 7. For this measurement, the actual
distance between the horn antenna (reference antenna) and the transmitter was
greater than 3 m.
For the second measurement, the horn antenna was positioned exactly 3 m
below the phase centre from the transmitter, as shown in Fig. 8. As a result, the
horn antenna with exact phase centre position (Center) is in good agreement with
the simulation, compared to the approximation distance without considering a phase
centre as shown in Fig. 9. The horn antenna must be positioned exactly below the
phase centre to ensure the 3 m measurement can be achieved accurately. The phase
centre is therefore important and must be taken into consideration, because devi-
ation from the simulation for horn antenna was quite high, which is up to 3 dB.

Fig. 8 Horn antenna measurement with correct phase center position (Center)

Fig. 9 AF for dipole antenna and horn as a reference antenna

4 Comparison Between SAM and SSM in Semi-anechoic Chamber

Figure 9 shows the AF result for a dipole antenna by using SAM. In the mea-
surement, the horn is used as a reference antenna and the dipole is the AUC. The
result indicates that the AF for dipole by using SAM gives a good agreement with
manufacturer data from a calibration certificate and an absorber needs to be placed
in between transmitter and receiver.
Figure 10 shows the AF result for a dipole antenna (AUC) by using SSM. The
result indicates that the AF without absorber for dipole antenna gives a good
agreement with the manufacturer data. Based on the result, SSM requires a reflective ground plane to ensure that the maximum combination of the direct and reflected signals is received accurately.

Fig. 10 AF result for dipole antenna using SSM

Fig. 11 Percentage error between SSM and SAM
Figure 11 displays the percentage error of the dipole antenna using SAM and
SSM between 500 and 1000 MHz. The percentage error of the dipole antenna using
SAM is 5%, whereas 18% using SSM. Therefore, for an accurate AF calibration in
semi-anechoic chamber (EMC testing laboratory), it is preferable to use SAM
instead of SSM. Apparently, SSM requires a highly accurate measurement site, whereas for SAM the site performance is not as critical, but a highly accurate antenna with known AF is required.

5 Conclusion

Based on the studies in this paper, it can be concluded that SAM is the best method for antenna calibration in a semi-anechoic chamber (EMC test site) because it gives a lower error compared with SSM. SSM is not recommended because it requires a highly accurate calibration test site with low uncertainties [16]. However, the semi-anechoic chamber of an EMC testing lab has a high uncertainty due to site imperfection, up to ±4 dB, compared to a calibration test site, where the allowed uncertainty is around ±1 dB [17].

Acknowledgements The authors would like to acknowledge Universiti Tun Hussein Onn
Malaysia (UTHM) for their funding of this research under TIER 1 research grant, H150.

References

1. Betta G, Capriglione D, Carobbi CFM, Migliore MD (2011) The accurate calibration of EMC
antennas in compact chambers — measurements and uncertainty evaluations. Comput Stand
Interfaces 33:201–205
2. Chen Z (2015) An improved method for simultaneous calibrations of gain, phase center and
near boresight patterns for Log-Periodic Dipole Arrays. In: 2015 9th European Conference on
Antennas and Propagation, EuCAP 2015, pp 1–5
3. Lim JH, Lee BW, Choi YJ, Kim HB (2017) A study of measuring a commercial antenna gain
using an R-SAM. In: 2017 Asia-Pacific International Symposium on Electromagnetic
Compatibility, pp 131–133
4. Meng D, Liu X, Dabo L (2015) Research on unwanted reflections in an OATS for precise
omni antenna measurement. In: 2015 IEEE 6th international symposium on microwave,
antenna, propagation, and EMC technologies (MAPE), pp 245–249
5. Dawson L, Clegg J, Porter SJ, Dawson JF, Alexander MJ (2002) The use of genetic
algorithms to maximize the performance of a partially lined screened room. IEEE Trans
Electromagn Compat 44:233–242
6. Seki Y, et al (2018) Antenna calibration in anechoic chambers (30 MHz to 1 GHz): new
approach to antenna calibration. In 2018 IEEE international symposium on electromagnetic
compatibility and 2018 IEEE asia-pacific symposium on electromagnetic compatibility
(EMC/APEMC), pp 1230–1235
7. Sapuan SZ, Jenu MZM (2016) Time domain analysis of direct-feed biconical antenna for
Antenna calibration and EMC measurement. In: 2016 IEEE Asia-pacific conference on
applied electromagnetics, APACE 2016, pp 198–201
8. Eser S, Sevgi L (2010) Open-area test site (OATS) calibration. IEEE Antennas Propag Mag
52:204–212
9. Smith AA (1982) Standard-site method for determining antenna factors. IEEE Trans
Electromagn Compat 3:316–322 EMC-24
10. Standards, C (2008) Calibration standards in the united states. 13
11. Jeong MJ, Lim JH, Park JW, Park SW, Kim N (2019) Validation of compact-standard
antenna method for antenna calibration above 1 GHz. J Electromagn Eng Sci 19:89–95
12. Lim JH, Lee BW, Park SH, Choi YJ, Seo MW (2016) A study of standard antenna method
using two homogeneous horn antennas. In: 2016 URSI asia-pacific radio science conference,
URSI AP-RASC 2016, pp 793–795

13. Chand S (2003) Calibration of antennae on 3 m OATS. In: Proceedings of the international
conference on electromagnetic interference and compatibility, pp 375–379
14. Kaketa S, Fujii K, Sugiura A, Matsumoto Y, Yamanaka Y (2003) A novel method for EMI
antenna calibration on a metal ground plane. In: 2003 IEEE international symposium on
electromagnetic compatibility, 2003. EMC 2003, vol 1, pp 66–69
15. Sapuan SZ, Jenu M, Zarar M, Kazemipour A (2014) Issue on calibration of direct feed
biconical antenna in a semi- anechoic chamber using standard antenna method. In: Advanced
materials research, vol 903, pp 273–278
16. Alexander MJ, et al (2004) Calibration and use of antennas, focusing on EMC applications.
Director
17. Fujii K, Alexander M, Sugiura, A (2012) Uncertainty analysis for three antenna method and
standard antenna method. In: IEEE international symposium on electromagnetic compatibility,
pp 702–707
An Automatic Driver Assistant Based
on Intention Detecting Using EEG
Signal

Reza Amini Gougeh, Tohid Yousefi Rezaii, and Ali Farzamnia

Abstract Each year, vehicle safety is increasing. Recently brain signals were used
to assist drivers. Attempting to do movement produces electrical signals in specific
regions of the brain. We developed a system based on motor intention to assist
drivers and prevent car accidents. The main objective of this work is improving
reaction time to external hazards. The motor intention was recorded by 16 channels
of a portable device called Open-BCI. Extracting features was done by common
spatial patterns which is a well-known method in motor imagery based brain
computer interface (BCI) systems. By using enhanced common spatial pattern
(CSP) called strong uncorrelated transform complex common spatial pattern
(SUTCCSP), features of preprocessed data were extracted. Regarding the nonlinear
nature of electroencephalogram (EEG), support vector machine (SVM) with kernel
trick classifier was used to classify features into 3 classes: left, right and brake. Due
to the developed SVM, commands can be predicted 500 ms earlier, with a system accuracy of 94.6% on average.

Keywords Intentional EEG · Driving assistant · BCI

1 Introduction

Physiological signals have been widely used in clinical trials to evaluate patients.
Small voltage sensing systems were developed in the past years and now we can
sense and amplify signals of the brain called electroencephalogram (EEG) which is
mainly used to detect mental diseases. The first link between brain and computer
was developed in 1970. Different types of brain computer interface (BCI) systems
have been proposed depending on the target areas. We used motor intention signals;

R. A. Gougeh · T. Y. Rezaii
Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran
A. Farzamnia (✉)
Faculty of Engineering, Universiti Malaysia Sabah, Kota Kinabalu, Sabah, Malaysia
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666, https://doi.org/10.1007/978-981-15-5281-6_43

when intention happens, characteristic signals in some regions of the brain will
appear. Brain waves are made before actual movement of the body, so we can
predict intention of an action just before it happens.
Each year, thousands of people die from car accidents. Reducing this number is a
priority for countries and automakers. Nowadays, manufacturers use radar systems (for automatic braking), drowsiness detectors and EEG signals to safeguard commuters' safety. Conventional systems use external sensors such as sonar and video cameras to obtain and analyze information from the vehicle and its surroundings in order to react to dangerous situations [1]. Several behavioral experiments have shown that drowsi-
ness can have a serious impact on driving performance [2]. To recognize distrac-
tions in driving, Sigari et al. [3] proposed a method based on driver’s face image
processing which evaluates distraction symptoms related to the eye region. Reacting to emergency situations is the final outcome of cognitive and peripheral processes [4]. Non-invasive EEG was used by Haufe et al. [5], who showed that event-related potentials (ERP) recorded in a simulated driving environment can be used to distinguish emergency braking intention from non-braking driving.
suggests that emergency situations are characterized by specific neural patterns of
sensory perception and processing, as well as motor preparation and execution,
which can be utilized by braking assistance systems. Scientific evidence has shown
that the attention level during the driving task is influenced by stress, workload, and
fatigue, and thus they tend to increase the braking reaction time [7]. Teng and Bi [8]
analyzed EEG signals for the early detection of emergency braking situations and
reducing reaction time. In recent research, Alyasseri et al. [9] proposed a novel method for EEG signal denoising based on a multi-objective Flower Pollination Algorithm with the wavelet transform. Nguyen et al. [10] used CSP to extract significant discriminant features from multi-class data; a proposed fuzzy logic system then classifies the data. This method outperforms other competing classification methods including k-nearest neighbor, but its learning process is generally time consuming.
In this paper, unlike most of the EEG-based assistants which simply evaluate
mental state, our objective is using EEG to detect driver’s decision in 3 states,
turning to left, to right and brake. Open-BCI headset is utilized to acquire EEG data
and machine learning techniques are used to find a discriminant pattern of motor
intention. Common spatial patterns (CSP) is a well-known preprocessing method to
find features of data in motor imagery systems and we used it in the proposed
intention detection system. Several extensions have been developed for CSP to
improve its performance. Standard CSP does not involve phase information, but the strong uncorrelating transform complex CSP (SUTCCSP) is an improved variant of CSP which is shown to improve the accuracy by 4% compared to the standard CSP [11].
Support vector machines were created by Vapnik based on statistical learning theory and can solve the problems associated with small sample sizes, nonlinear relationships, and multiple classifications [12]. The main idea is to separate the data with a line in two dimensions. Training this system is an optimization problem. If the data are not linearly separable, we may apply transformations and map them into a higher-dimensional space. The objective is to find a space in which the data separate linearly; the problem is therefore converted from nonlinear to linear by changing the kernel. This method was called the kernel trick by M. Aizerman et al. in 1964 [13].
There are two methods to obtain a multiclass SVM: one-versus-all distinguishes between one of the labels and the rest, and one-versus-one distinguishes between every pair of classes. Research shows that the one-versus-one method is suitable for practical use [5].

2 Materials and Methods

The procedure of our work is illustrated in Fig. 1. In the following, we explain each stage.

2.1 Data Acquisition

Twenty students between the ages of 20–25 contributed to this study on a voluntary
basis. Participant’s gender distribution was equal. They were informed by
instruction files and were examined physically and mentally via questionnaire
before taking the exam.
The experiment was executed in various time slots covering morning and evening sessions. The examination system was made up of a screen, a 3-button controller, the Open-BCI headset and a personal computer. The experiment has 3 states, left, right and brake, which are illustrated in Fig. 2, and the recorded data are therefore labeled with the 3-button controller box.
Electroencephalography signals were recorded during the experiment by
16-channel Open-BCI headset. EEG signals were recorded from frontal, temporal,
parietal and occipital lobe (Fp1, Fp2, F7, F3, F4, F8, T3, C3, C4, T4, T5, P3, P4,
T6, O1, O2) in accordance with 10–20 international system as shown in Fig. 3. The
reference electrodes were placed in the left and right ear. Open-BCI headset has its
own dry electrodes with acceptable impedance.
In order to increase the adaptation of the interface to each subject, eight types of noise/interference data (blinking, eye up/down movement, eye left/right movement, clenching, tongue movement and relaxation state) were recorded before starting the main tasks. Subjects sat in front of a monitor and were asked to put their dominant hand on the box. A fixation section was shown for 5 s; the participant

Fig. 1 Block diagram of the proposed work: data acquisition → preprocessing → SUTCCSP feature extraction → SVM with kernel trick classifier


Fig. 2 Test equipment used in the study

Fig. 3 EEG channels' names and locations (Baker et al. [14])

was then prepared to take the intention task. Then, one of the left, right and brake marks appeared for 3 s on the screen and the subject was asked to press the related key on the box. Finally, a blank screen was shown and the subject was given a break for 3 s (Fig. 4).
This procedure was repeated 10 times for one run, and 4 runs were performed each day. Between runs, the subjects received a 3-min break. The whole platform was based on OpenSesame and its Python core. We repeated this procedure for a second day; therefore, there are 80 trials over 8 runs per subject.
Two days before the experiment began, the participants were asked to control their sleep and food intake due to their effect on EEG signals [16]. On the experiment day, they were asked to fill out a questionnaire. An Excel document was created to store each subject's data. Additionally, they were requested to report false actions.

Fig. 4 Procedure of signal acquisition

2.2 Preprocessing

In the preprocessing stage, having the raw signal with labels, latency and channel information available, we extract each trial using a constant time frame of 3000 ms. Each extracted trial is then band-pass filtered between 8 and 30 Hz. Furthermore, we extract the different types of noise data recorded at the beginning of the experiment, as mentioned above: blinking, eye up/down movement, eye left/right movement, clenching, tongue movement and relaxation state.
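A minimal sketch of this preprocessing step (epoch extraction followed by 8–30 Hz band-pass filtering) is shown below, assuming a channels-by-samples NumPy array; the sampling rate is an assumption of ours, not a value stated in the paper.

# Illustrative preprocessing sketch: cut a 3000 ms epoch and band-pass it.
import numpy as np
from scipy.signal import butter, filtfilt

FS = 125  # assumed Open-BCI sampling rate (Hz), for illustration only

def extract_epoch(raw, onset_sample, fs=FS, duration_ms=3000):
    """Return one (channels x samples) epoch starting at the event onset."""
    n = int(fs * duration_ms / 1000)
    return raw[:, onset_sample:onset_sample + n]

def bandpass(epoch, low=8.0, high=30.0, fs=FS, order=4):
    """Zero-phase Butterworth band-pass applied to each channel."""
    b, a = butter(order, [low / (fs / 2), high / (fs / 2)], btype="band")
    return filtfilt(b, a, epoch, axis=1)

raw = np.random.randn(16, 60 * FS)        # placeholder 16-channel recording
epoch = bandpass(extract_epoch(raw, onset_sample=1250))
print(epoch.shape)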

2.3 Data Analysis

CSP algorithm is an efficient method used to extract features from different classes.
It finds vectors that maximize the variance for one class while simultaneously
minimizing the variance for the other. A complex version of CSP uses the
covariance matrix that maintains the power sum information of the real and
imaginary parts of the complex-valued data. Another complex-valued CSP algo-
rithm, analytic signal-based CSP (ACSP), was proposed by Falzon et al. [17] to
discriminate different mental tasks. However, given that the Hilbert transformed
analytic signals could only produce circular signals (rotation invariant probability
distribution) and that physiological signals are improper (mismatch of power
between different channel data), the augmented complex CSP was introduced to
fully exploit the second-order statistics of noncircular complex vectors [18].
SUTCCSP is an improved version of ACCSP which provides a higher classification
rate compared to the standard CSP algorithm [19].
Suppose $z$ is a complex-valued random vector:

$z = z_r + j z_i, \quad j = \sqrt{-1}$   (1)

For circular data, $z_r$ and $z_i$ become uncorrelated and the pseudo-covariance equals zero, but most real signals are non-circular. When the SUT is applied, multichannel complex data become uncorrelated. Let $Z_a$ and $Z_b$ be zero-mean complex-valued matrices whose entries are complex values of the form (1).

$C_a = E[Z_a Z_a^H], \quad C_b = E[Z_b Z_b^H]$
$P_a = E[Z_a Z_a^T], \quad P_b = E[Z_b Z_b^T]$   (2)

where $a$ and $b$ denote the two different classes, $C$ and $P$ stand for the covariance and pseudo-covariance respectively, $E[\cdot]$ is the statistical expectation operator, $(\cdot)^H$ is the Hermitian transpose and $(\cdot)^T$ is the transpose. $Z_a$ and $Z_b$ are $N \times S$ matrices with $N = 16/2$ complex channels (the 16 real channels taken in pairs) and $S$ the sample size. So, for $k \in \{a, b\}$:

$C_k = E[Z_k Z_k^H], \quad P_k = E[Z_k Z_k^T]$   (3)

We can define the composite covariance $C_c$ and pseudo-covariance $P_c$ matrices:

$C_c = \sum_k C_k = E[Z_a Z_a^H] + E[Z_b Z_b^H]$
$P_c = \sum_k P_k = E[Z_a Z_a^T] + E[Z_b Z_b^T]$   (4)

Then, taking $U_c$ and $\Lambda_c$ as the eigenvector and eigenvalue matrices respectively:

$C_c = U_c \Lambda_c U_c^H$   (5)

Each eigenvector in $U_c$ is associated with the diagonal eigenvalue matrix $\Lambda_c$. Now, if we whiten $C_c$ by the whitening matrix $G = \sqrt{\Lambda_c^{-1}}\, U_c^H$:

$G C_c G^H = I$
$\tilde{P}_c = G P_c G^T = D K D^T$   (6)

where $I$ denotes the identity matrix and $D$ and $K$ are yielded by the symmetric singular value decomposition (SSVD), also known as Takagi's factorization. The SUT then takes the following form:

$Q = D^H G$   (7)

Now we can diagonalize the covariance and pseudo-covariance matrices simultaneously:

$Q C_c Q^H = I$
$Q P_c Q^H = K = Q P_a Q^H + Q P_b Q^H$   (8)

If we define $S_a = Q C_a Q^H$ and $S_b = Q C_b Q^H$, an estimation of the class eigenvalue matrices can be obtained as:

$\Lambda_a = B^{-1} S_a B$
$\Lambda_b = B^{-1} S_b B$   (9)

If we multiply Eq. (8) by $\sqrt{K^{-1}}$ and $\sqrt{K^{-T}}$:

$\sqrt{K^{-1}}\, Q P_c Q^H \sqrt{K^{-T}} = \sqrt{K^{-1}}\, Q P_a Q^H \sqrt{K^{-T}} + \sqrt{K^{-1}}\, Q P_b Q^H \sqrt{K^{-T}} = I$   (10)

$K$ is a diagonal matrix. Now, if we define $\hat{S}_a = \hat{Q} P_a \hat{Q}^T$ and $\hat{S}_b = \hat{Q} P_b \hat{Q}^T$, the objective is to find the eigenvectors of the pseudo-covariance:

$\hat{Q} = K^{-1/2} Q = K^{-1/2} D^H G$
$\hat{S}_a + \hat{S}_b = I$
$\hat{K}_a = \hat{B}^H \hat{S}_a \hat{B}, \quad \hat{K}_b = \hat{B}^H \hat{S}_b \hat{B}$   (11)

So we have diagonalized $P_c$ and $C_c$. Thus, we obtain the spatial filters:

$W^H = B^H Q, \quad \hat{W}^H = \hat{B}^H \hat{Q}$   (12)

The spatially filtered vectors are calculated as follows:

$V = W^H Z, \quad \hat{V} = \hat{W}^H Z$   (13)

So, if we have $N$ data channels, the first $m$ and last $m$ rows of the filtered data are retained:

$\bar{V} = \begin{pmatrix} v_1 \\ \vdots \\ v_m \\ v_{N-m+1} \\ \vdots \\ v_N \end{pmatrix} = \begin{pmatrix} \bar{v}_1 \\ \vdots \\ \bar{v}_{2m} \end{pmatrix}$   (14)

If $\bar{v}_p$ corresponds to each row of $\bar{V}$, the features obtained by SUTCCSP are calculated as follows:

$f_p = \log\!\left(\dfrac{\operatorname{var}(\bar{v}_p)}{\sum_{i=1,\ldots,2m} \operatorname{var}(\bar{v}_i)}\right)$   (15)

where $p \in [1, \ldots, 2m]$. In the denominator of this equation, the variable $m$ determines how many rows of $\bar{V}$ contribute variance terms.
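For illustration, a hedged NumPy sketch of the log-variance feature of Eqs. (14)–(15) is given below; it covers only the row selection and variance-ratio step, not the complex-valued SUT/Takagi decomposition itself, and all names are ours.

# CSP-style log-variance features from one spatially filtered trial.
import numpy as np

def log_variance_features(V, m=1):
    """V: (channels x samples) filtered trial with rows sorted by eigenvalue.
    Keep the first m and last m rows and return 2m log-variance-ratio features."""
    selected = np.vstack([V[:m], V[-m:]])           # Eq. (14): 2m rows
    variances = selected.var(axis=1)
    return np.log(variances / variances.sum())      # Eq. (15)

V = np.random.randn(8, 375)     # placeholder filtered trial (8 complex-channel rows)
print(log_variance_features(V, m=1))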
After obtaining the features, a nonlinear classifier is needed to sort them. The support vector machine (SVM) is a relatively low-cost classifier which is widely used in BCI systems. SVM uses a set of mathematical functions that are defined as the kernel. In order to use SVM on data which are not linearly separable, the kernel trick can be used to overcome the problem. Simply put, it maps the nonlinear data into a higher-dimensional space where a hyperplane that separates the samples can be found. We used the method of Li et al. [20]. As SVM is a binary classifier, the one-versus-one method, according to Hsu and Lin [5], is used to separate the classes. It constructs one classifier per pair of classes, so if the number of classes is N, then N(N − 1)/2 SVMs must be trained. The number of SVMs in this method is larger than in the one-versus-all method, but the training set of each SVM is smaller (it only includes two-class data). In the test phase, each sample is applied to all of the SVMs. We have three classes: i, j, k. If the SVM for classes i and j classifies the test sample as belonging to class i, it votes for i; otherwise, it votes for j. At prediction time, the class which received the most votes is selected. When two classes have an equal number of votes, the class with the lowest index is selected.
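A minimal sketch of this classification stage is shown below, assuming scikit-learn is available; its SVC implementation uses the one-versus-one scheme internally for multiclass problems, and the feature array here is a random placeholder rather than real SUTCCSP features.

# One-versus-one kernel SVM sketch for the three intention classes.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(240, 2))            # placeholder features (2m = 2 per trial)
y = rng.integers(0, 3, size=240)         # 0 = left, 1 = right, 2 = brake

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = SVC(kernel="rbf", decision_function_shape="ovo")   # kernel trick, one-vs-one
clf.fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))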

3 Simulation Results and Discussion

In previous experiments we have done, our systems used AC power, so a notch filter was required to reduce line noise (50 or 60 Hz). Furthermore, gel electrodes were used, which have low impedance but need washing after the experiment. Dry electrodes were used in this study instead of the gel type; therefore, subjects are comfortable while signal accuracy remains acceptable. Also, we used the Open-BCI with DC power, which can reduce the line noise. Its platform is based on Python and Java, and it stores all 16 channels with minimum latency, especially in WiFi mode (Table 1).
A minor issue in this study was the weight of the headset and the sharpness of the electrodes, which annoyed some of the subjects. In the 3-min breaks, we unscrewed the electrodes.
The classification rate of the conventional method is provided in Table 2. Classification was repeated 100 times for each subject. To train and test our system, all of the acquired data are split in an 80–20 ratio: 80% for training and 20% for testing.

Table 1 Scheduling of procedure (adapted from Cho et al. [15])
Number | Task | Duration (min:sec)
1 | Filling questionnaire | 5
2 | Open-BCI headset placement | 7
3 | Acquisition of noise data | 2
4 | Run 1 | 7:20
5 | Relaxation | 3
6 | Run 2 | 7:20
7 | Relaxation | 3
8 | Run 3 | 7:20
9 | Relaxation | 3
10 | Run 4 | 7:20
11 | Relaxation | 3
12 | Removing Open-BCI headset | 5
Sum | | 60:20

Table 2 Results of classifying without kernel trick SVM
Method | Classification rate (%)
CSP | 69.77
SUTCCSP | 71.62

Table 3 Overall accuracy with one-versus-one method
m | | CSP | SUTCCSP
1 | Average | 83 | 94.6
1 | Maximum | 89.3 | 96
2 | Average | 79.05 | 90.6
2 | Maximum | 86 | 92.1

The overall accuracy of the multiclass SVM with the one-versus-one method and the kernel trick is provided in Table 3 for specific values of the variable m. A 94.6% accuracy rate is achieved with the proposed method.
The improvement of the proposed combination (Table 3) over the conventional SVM (Table 2) is quite obvious. Table 4 contains the linear discriminant analysis (LDA) results with CSP and SUTCCSP.
Although LDA can be used in multiclass problems using the one-versus-one method, its main disadvantage is its linearity, which can yield insufficient results on nonlinear EEG data [21]. Therefore, the optimized SVM is used in our research to obtain more accuracy and reliability.

Table 4 Overall accuracy of classifying with LDA
Method | Classification rate (%)
CSP | 78.3
SUTCCSP | 80.7

4 Conclusion

Limited field of vision, high speed, careless driving and fatigue are the most common causes of accidents. To reduce the number of injuries, we have two options: first, we can prepare drivers to obey the rules; second, we can prepare cars to take less damage from an incident. One of the pioneering technologies in this field is using EEG signals to brake more quickly. Our work is just a step toward a system with maximum accuracy. We showed that using the SUTCCSP method with SVM leads to accurate results compared to previous combinations. Our proposed method starts with a good-quality data acquisition system. In the second stage, we applied conventional preprocessing methods to prepare the data for the main stage. In the most important part of the work, we used SUTCCSP to extract the features of our classes. SVM with different kernels helped us reach a beneficial procedure and obtain reliable accuracy. Several approaches have been suggested for multiclass classification using SVM, and here we adopted the one-versus-one approach.
We hope that using this approach in actual systems will have a positive effect on passengers' safety.

Acknowledgement The authors appreciate those who contributed to make this research suc-
cessful. This research is supported by Center for Research and Innovation (PPPI) and Faculty of
Engineering, Universiti Malaysia Sabah (UMS) under the Research Grant (SBK0393-2018).

References

1. Shaout A, Colella D, Awad S (2011) Advanced driver assistance systems - past, present and
future. In: 2011 seventh international computer engineering conference (ICENCO 2011),
pp 72–82, Giza
2. Liu CC, Hosking SG, Lenné MG (2009) Predicting driver drowsiness using vehicle measures:
recent insights and future challenges. J Saf Res 40(4):239–245
3. Sigari MH, Fathy M, Soryani M (2013) A driver face monitoring system for fatigue and
distraction detection. Int J Veh Technol 2013:1–11
4. Sherk H, Fowler GA (2001) Chapter 16 Neural analysis of visual information during
locomotion. Prog Brain Res 134:247–264
5. Haufe S, Kim JW, Kim IH, Sonnleitner A, Schrauf M, Curio G, Blankertz B (2014)
Electrophysiology-based detection of emergency braking intention in real-world driving.
J Neural Eng 11(5):056011
6. Kim IH, Kim JW, Haufe S, Lee SW (2014) Detection of braking intention in diverse
situations during simulated driving based on EEG feature combination. J Neural Eng 12
(1):016001

7. Borghini G, Astolfi L, Vecchiato G, Mattia D, Babiloni F (2014) Measuring neurophysi-


ological signals in aircraft pilots and car drivers for the assessment of mental workload,
fatigue and drowsiness. Neurosci Biobehav Rev 44:58–75
8. Teng T, Bi L (2014) A novel EEG-based detection method of emergency situations for
assistive vehicles. In 2017 seventh international conference on information science and
technology (ICIST), IEEE, pp 335–339
9. Alyasseri ZAA, Khader AT, Al-Betar MA, Papa JP, Ahmad Alomari O (2018) EEG-based
person authentication using multi-objective flower pollination algorithm. In: 2018 IEEE
congress on evolutionary computation (CEC), IEEE, pp. 1–8
10. Nguyen T, Hettiarachchi I, Khatami A, Gordon-Brown L, Lim CP, Nahavandi S (2018)
Classification of multi-class BCI data by common spatial pattern and fuzzy system. IEEE
Access 6:27873–27884
11. Kim Y, Park C (2015) Strong uncorrelated transform applied to spatially distant channel EEG
data. IEIE Tran Smart Process Comput 4(2):97–102
12. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297
13. Aizerman MA (1964) Theoretical foundations of the potential function method in pattern
recognition learning. Autom Remote Control 25:821–837
14. Baker M, Akrofi K, Schiffer R, O’Boyle MW (2008) EEG patterns in mild cognitive
impairment (MCI) patients. Open Neuroimaging J 2:52
15. Cho H, Ahn M, Ahn S, Kwon M, Jun SC (2017) EEG datasets for motor imagery
brain-computer interface. Gigascience 1(7):1–8
16. Hoffman LD, Polich J (1998) EEG, ERPs and food consumption. Biol Psychol 48(2):139–151
17. Falzon O, Camilleri KP, Muscat J (2010) Complex-valued spatial filters for task
discrimination. In 2010 annual international conference of the IEEE engineering in medicine
and biology, IEEE, pp 4707–4710
18. Park C, Took CC, Mandic DP (2013) Augmented complex common spatial patterns for
classification of noncircular EEG from motor imagery tasks. IEEE Trans Neural Syst Rehabil
Eng 22(1):1–10
19. Kim Y, Ryu J, Kim KK, Took CC, Mandic DP, Park C (2016) Motor imagery classification
using mu and beta rhythms of EEG with strong uncorrelating transform based complex
common spatial patterns. Comput Intell Neurosci 2016:1
20. Li X, Chen X, Yan Y, Wei W, Wang ZJ (2014) Classification of EEG signals using a multiple
kernel learning support vector machine. Sensors 14(7):12784–12802
21. Garcia GN, Ebrahimi T, Vesin JM (2003) Support vector EEG classification in the Fourier
and time-frequency correlation domains. In: First international IEEE EMBS conference on
neural engineering, 2003, conference proceedings, IEEE, pp 591–594
Hybrid Skull Stripping Method
for Brain CT Images

Fakhrul Razan Rahmad, Wan Nurshazwani Wan Zakaria,


Ain Nazari, Mohd Razali Md Tomari, Nik Farhan Nik Fuad,
and Anis Azwani Muhd Suberi

Abstract Ischemic stroke is a medical condition in which blood flow is obstructed in a region of the brain. This condition causes the brain tissue to be deprived of oxygen, resulting in the death of the tissue. In medical imaging, computed tomography (CT) images or magnetic resonance imaging (MRI) are used to display a series of slices of the head section when diagnosing ischemic stroke. With the progress of image processing technology, ischemic stroke can be detected through a series of detection algorithms instead of visual perception by radiologists. This approach promotes the accuracy of diagnosis by reducing human error. While MRI is more accurate than CT, the limited availability of MRI impedes the prognosis when diagnosing ischemic stroke. A CT scan is more reliable when dealing with an emergency situation since it is widely available, despite providing less accurate data, especially in the early stage of ischemic stroke detection. This is where image processing approaches provide comprehensive data for ischemic detection. This paper proposes a preliminary processing stage to remove the unrelated non-brain region, which can be considered an obstacle in ischemic detection. The hybrid method, consisting of intensity-based and morphology-based methods, converts the image scanned from the CT scanner to 8-bit DICOM format before the skull is stripped in the subsequent process. This method shows remarkable results in terms of the visual representation of the stripped skull and the processing time.

Keywords Image pre-processing · Skull stripping · Medical imaging · Brain CT image · Ischemic stroke

F. R. Rahmad (✉) · W. N. Wan Zakaria · A. Nazari · M. R. Md Tomari · A. A. Muhd Suberi


Faculty of Electrical and Electronic Engineering (FKEE), Universiti Tun Hussein Onn
Malaysia (UTHM), 86400 Parit Raja, Batu Pahat, Johor, Malaysia
e-mail: [email protected]
N. F. Nik Fuad
UKM Medical Centre, Jalan Yaacob Latif, Bandar Tun Razak, 56000 Cheras, Kuala Lumpur,
Malaysia

© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666, https://doi.org/10.1007/978-981-15-5281-6_44

1 Introduction

Stroke from brain ischemia has been in the spotlight for quite some time in the medical field and has been pressuring paramedics in a race against the clock [1]. A collective study found that brain stroke is progressively growing and is well known to be among the top causes of death around the globe. Even in Malaysia, ischemic stroke is responsible for 79.4% of admitted patients [2]. The key to an accurate diagnosis when treating a stroke patient lies within the medical imaging process, which provides the visual data that decide the subsequent treatment planned by the doctors [3]. Introducing machine learning into the medical field has helped doctors make accurate diagnoses, hence improving patients' prognoses.
As much advantage as machine learning can provide, there will always be room for improvement. Machine learning processes a multitude of data simultaneously, disregarding whether the data are pertinent to the diagnosis or not, which may cause the system to slow down. Hence, a preprocessing stage is incorporated to ensure that only related images are considered for the processing stage. In this paper, a preprocessing method is proposed to remove the skull from the image, leaving only the tissue for the processing stage, using a derivative technique that combines aspects of conventional methods. Introducing a pre-processing stage such as skull stripping for a CT image is meant to remove the unnecessary non-brain region from the scanned images with a rapid processing time, which is crucial in promoting accuracy and efficiency for succeeding processes such as automated detection, segmentation and analysis [4].
Among the earliest morphology-based skull-stripping methods, histogram-based thresholding was used for the morphological computation. One of the problems related to this operation is that the optimum morphological size is difficult to determine [5]. Another way to remove the skull from a CT image is by addressing the intensity properties of a particular region, also known as the intensity-based method. This method addresses the difference in shade which characterizes the different tissue components that make up the skull and isolates them [6]. Nevertheless, it is limited by its sensitivity towards intensity bias that may arise from low resolution and contrast of the CT image, the presence of artifacts, etc. Theoretically, the drawback of the morphology-based method is complemented by the intensity-based method to ensure concise stripping.
Fig. 1 Axial, Coronal and Sagittal views of CT image scanning window [9, 10]

2 Medical Imaging

2.1 Brain Anatomy in Non-enhanced Computed


Tomography (CT) Image

CT permits the sampling of brain images with a thickness between 4 and 14 mm at a specific given value [7]. Depending on the patient and the data visualization, the number of brain image slices varies, and all these variables should be taken into account to ensure accurate visualization. Through the CT scanning process, the tube with x-ray beams is rotated around the patient's head to produce three types of views: the sagittal, axial and coronal perspectives [8]. Figure 1 shows the CT image scanning windows.
In order to display the most optimized visualization, the window setting of the image should be in the range of +40 to +80 Hounsfield Units (HU) for both or either the window centre (Wc) and the window width (Ww). This paper utilizes the window setting of Wc = 40 HU and Ww = 40 HU since this setting has been proven to show the most optimized display of brain morphology [11].

2.2 CT vs MRI

In the battle against this progressively threatening medical condition, every second that passes before treating the patient could lead to fatality. Consequently, the sooner the radiologists take action, the higher the chance of saving the patient. Under time constraints, CT acts as the first line of imaging due to its fast response, despite being less reliable compared to MRI [12]. For a very long time, paramedics have been relying on CT to diagnose head injury despite concerns about its credibility in constrained situations [13].
CT is suitable for providing rapid prescreening, hence encouraging radiologists to monitor unstable patients. Its availability has made CT a reliable clinical measuring tool, since MRI is not as widely available as CT, particularly in the countryside [14]. Table 1 displays the differences between CT and MRI.

Table 1 Comparison between MRI and CT [15]
MRI – Advantages: no radiation imposed on patient; distinctive contrast for soft tissue; excellent visual representation compared to CT
MRI – Disadvantages: low accessibility; complex real-time imaging implementation; expensive
CT – Advantages: wide availability; rapid data provision; economical
CT – Disadvantages: high radiation; poor soft tissue visibility

For example, in the event where an ischemic sign is present but too subtle or hard to perceive, the CT detection procedure can be combined with and succeeded by CT Angiography (CTA) and CT Perfusion (CTP) [16]. CTA aids radiologists in determining salvageable brain tissue by distinguishing infarcted tissue. CTP, on the other hand, evaluates the flow blockade in the major vessels. The integration of these techniques is proven to provide a more accurate diagnosis but, similar to MRI, is not widely accessible.

2.3 CT Attenuation (Hounsfield Unit)

The whole brain is constituted of several parts when displayed in a CT image. Every part of the brain structure is represented by a different Hounsfield Unit (HU) attenuation when portrayed as a grayscale value. The representation of different attenuations is crucial to segment the different parts of the brain. Figure 2 shows the Hounsfield Unit for different parts of the head, where each part lies within a specific range. This paper utilizes some parts of that range, especially the bone, to implement the stripping process on the image. Figure 2 visualizes the different cranial regions with their respective attenuations. For example, the attenuation ranges between −1000 and 1000, where −1000 represents the darkest hue, in this case air, and 1000 represents the brightest hue, in this case the bone, which is the focus of this research.

Fig. 2 Hounsfield Unit representation of different head structures [17]

3 Pre-processing Method

This section discusses the method for skull stripping which combined
intensity-based method and morphology-based method to eliminate non-brain tis-
sues from CT images.

3.1 Image Acquisition

The experimental methods are conducted with MATLAB R2018b and Intel®
Core™ i3-3110M processor with 2.40 GHz CPU and 6 GB RAM as the testing
platform. The algorithm uses CT images with a resolution of 512 by 512 in DICOM format obtained from the Radiology Department of UKM Medical Centre, acknowledging patients' discretion. The images were scanned with an Aquilion One scanner by Toshiba. The scanner produces 152 to 207 slices for each patient, where each slice is 1 mm thick with 16-bit depth. The default window centre (Wc) and window width (Ww) for the scanner are 40 HU and 90 HU respectively.

3.2 Image Normalization

In order to proceed with skull stripping, it is crucial to prepare the images by converting them to DICOM format. This process involves normalizing the 16-bit Hounsfield unit (HU) values from the scanner to a suitable window setting, namely 8-bit grayscale. Previous studies have suggested several window values that can be used to emphasize tissue visibility. Nevertheless, the difference in values may tamper with the results, in addition to the different suitability of the machine used with the window setting. Thus, the most suitable window setting values were tested and evaluated to decide on the suitable value. Consequently, the default values of Wc = 40 HU and Ww = 40 HU are selected since this setting permits the effective visualization of the brain tissue (Fig. 3(b)) [11].
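A minimal sketch of this window normalization is given below; the function and variable names are ours rather than from the paper's MATLAB implementation, and it simply clips the HU range defined by Wc and Ww and rescales it to 8 bits.

# Window a 16-bit HU slice to 8-bit grayscale using window centre/width.
import numpy as np

def window_to_uint8(hu_image, wc=40.0, ww=40.0):
    """Clip HU values to [wc - ww/2, wc + ww/2] and rescale to 0-255."""
    lo, hi = wc - ww / 2.0, wc + ww / 2.0
    clipped = np.clip(hu_image.astype(np.float32), lo, hi)
    return np.uint8(np.round((clipped - lo) / (hi - lo) * 255.0))

hu_slice = np.random.randint(-1000, 1000, size=(512, 512))  # placeholder CT slice in HU
brain_view = window_to_uint8(hu_slice, wc=40, ww=40)
print(brain_view.dtype, brain_view.min(), brain_view.max())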

Fig. 3 Output of DICOM conversion: (a) CT bone view, (b) CT brain view

3.3 Hybrid Method (Intensity-Morphology-Based Hybrid)

The proposed hybrid method combines some aspects of the conventional intensity-based and morphology-based methods. This paper uses histogram normalization to separate the brain region from the non-brain region and present only the tissue. Figure 4 shows the flow of the skull-stripping method.
The process begins with image acquisition from the CT in 16-bit format as
shown in Fig. 5(a). The presented image is a whole head image in 8-bit DICOM
format which is comprised of tissue, background and skull. The skull is then

Fig. 4 Flow of skull-stripping method



Fig. 5 Stages of skull-stripping method with product of each stage



sampled with the value 0.9 from the 255 unit of the image value. This process
produced image as illustrated in Fig. 5(b). The same image is then filled with holes
to produce a single subject masked image as shown in Fig. 5(c). The skull region
and inner part of skull region are shown in Fig. 5(b) and Fig. 5(c) respectively.
Subtracting Fig. 5(b) from Fig. 5(c) yields the tissue region as shown in Fig. 5(d).
However, notice that Fig. 5(d) includes an unwanted cavity which is shown by
the smaller region unattached to the tissue region. This small region is removed to
ensure that the image displayed is only the tissue region as shown in Fig. 5(e). The
image is then fine-tuned to remove residual grain by filling the tissue region with
holes similar to previous process and produced Fig. 5(f). Afterwards, the original
image is subtracted with Fig. 5(f) to produce Fig. 5(g) which is a clean tissue
region, without grain. Finally, the original image is used again by subtracting it with
Fig. 5(g) to produce Fig. 5(h) which is the brain tissue without the skull.
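The stage sequence above can be sketched, as an assumption-laden illustration rather than the authors' MATLAB code, using SciPy and scikit-image; the threshold follows the 0.9 × 255 sampling described above, while the minimum object size used to drop the unattached cavity is a made-up value.

# Illustrative skull-stripping sketch following the Fig. 5 stage sequence.
import numpy as np
from scipy import ndimage
from skimage import morphology

def strip_skull(img_u8, skull_thresh=0.9):
    """img_u8: 8-bit head slice. Returns the brain-tissue-only image."""
    skull = img_u8 >= skull_thresh * 255                   # Fig. 5(b): bone mask
    head = ndimage.binary_fill_holes(skull)                # Fig. 5(c): filled mask
    inner = head & ~skull                                  # Fig. 5(d): inside the skull
    inner = morphology.remove_small_objects(inner, 500)    # Fig. 5(e): drop small cavities
    tissue_mask = ndimage.binary_fill_holes(inner)         # Fig. 5(f): cleaned tissue mask
    return np.where(tissue_mask, img_u8, 0)                # Fig. 5(h): tissue without skull

slice_u8 = np.zeros((512, 512), dtype=np.uint8)            # placeholder image
print(strip_skull(slice_u8).shape)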

3.4 Performance Measurement

The skull-stripping technique is carried out and analyzed against the conventional morphology-based method. The performance of the proposed method is evaluated qualitatively by comparing the segmented output with the original image through image overlays. In addition to the qualitative evaluation, this method is also compared quantitatively with the conventional morphology-based method using the Jaccard and Dice similarity indices.
The qualitative method of overlay is meant to compare the segmented image with the original image by layering the masked segmented image on the original image. The Sørensen–Dice coefficient is a statistical method that can be utilized to analyze the similarity of two samples [18]. This coefficient was initially used for discrete data and can be represented as shown in Eq. (1).

$DSC = \dfrac{2|X \cap Y|}{|X| + |Y|}$   (1)

where X and Y are two sets and |X| and |Y| are the cardinalities of those respective sets. The Jaccard coefficient, on the other hand, utilizes the intersection-over-union principle as shown in Eq. (2).

$J(A, B) = \dfrac{|A \cap B|}{|A \cup B|}$   (2)

In this equation, A and B represent the individual sets. As the principle suggests, the Jaccard coefficient takes the number of intersecting elements and divides it by the number of elements in the union [19].
Additionally, the traditional morphology-based method from which this hybrid technique is derived is used for the quantitative comparison.
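A small sketch of how the two indices of Eqs. (1)–(2) can be computed on binary segmentation masks is given below; the toy masks are placeholders, not data from this study.

# Dice and Jaccard similarity for boolean masks of equal shape.
import numpy as np

def dice(mask_a, mask_b):
    inter = np.logical_and(mask_a, mask_b).sum()
    return 2.0 * inter / (mask_a.sum() + mask_b.sum())

def jaccard(mask_a, mask_b):
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union

a = np.zeros((512, 512), bool); a[100:300, 100:300] = True   # toy masks
b = np.zeros((512, 512), bool); b[120:320, 120:320] = True
print(dice(a, b), jaccard(a, b))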

4 Result and Discussion

Table 2 shows that the proposed method manages to segment the brain tissue region as accurately as possible within the skull boundary.

Table 2 Qualitative analysis for proposed method

Input Output Overlay Dice Jaccard


1

As shown in Table 3 and Table 4, it is worth noticing that the processing time is faster with the proposed method, whereas the morphology-based method is rather slower. However, the similarity indices of the segmented images are quite close to each other.

Table 3 Quantitative analysis of proposed method
Image | 1 | 2 | 3 | 4 | 5
Processing time (s) | 0.0394 | 0.0240 | 0.0158 | 0.0202 | 0.0176
Dice | 0.9430 | 0.9367 | 0.9456 | 0.9233 | 0.9472
Jaccard | 0.8922 | 0.8809 | 0.8969 | 0.8575 | 0.8997

Table 4 Quantitative analysis of the conventional morphological-based method

Image                 1        2        3        4        5
Processing time (s)   0.1245   0.1407   0.1396   0.0901   0.0797
Dice                  0.9427   0.9361   0.9402   0.9230   0.9469
Jaccard               0.8916   0.8780   0.2273   0.8570   0.8991

5 Conclusion and Future Works

In this paper, we propose a hybrid skull-stripping method for brain CT images by combining aspects of the intensity-based and morphology-based methods. This paper provides insight for the medical imaging field, especially for image pre-processing, into plausible methods for skull stripping as a means towards better image processing, by comparing the performance of each method qualitatively and quantitatively. As it turns out, the hybrid method is shown to provide a faster outcome while retaining the accuracy of the methods from which it is derived. A well-structured medical imaging technique, mainly image segmentation, may be handy in assisting medical professionals in making decisions and even has the potential to support a more intricate diagnosis pipeline to fulfill the needs of the ever-expanding clinical data. Progress in the medical imaging field may provide an intuitive means to process and manage a huge amount of clinical data, mostly for analysis purposes. This paper generally emphasizes the segmentation of the top of the anterior part of the brain. Acknowledging this fact, the method could be refined to be utilized not only for the top of the brain, but for the whole brain, including both the anterior and posterior regions. On top of that, these methods hold potential for facilitating the diagnosis of diseases such as stroke and head trauma if they were to be incorporated with a particular deep learning algorithm.

Acknowledgements The authors are grateful to Universiti Tun Hussein Onn Malaysia (UTHM) for supporting this research work under Postgraduate Research Grant (GPPS) Vot 402 and Tier 1 Grant Vot H203.

References

1. Park E, Kim JH, Nam HS, Chang H-J (2018) Requirement analysis and implementation of
smart emergency medical services. IEEE Access 6:42022–42029
2. Aziz ZA, Lee YY, Ngah BA, Sidek NN, Looi I, Hanip MR, Basri HB (2015) Acute stroke
registry Malaysia, 2010-2014: results from the National Neurology Registry. J Stroke
Cerebrovasc Dis 24(12):2701–2709
3. Yahiaoui AFZ, Bessaid A (2016) Segmentation of ischemic stroke area from CT brain
images. In: 2016 international symposium on signal, image, video and communications
(ISIVC), Tunis, pp 13–17
4. Kalavathi P, Prasath VBS (2015) Methods on skull stripping of MRI head scan images—a
review. J Digit Imaging 29(3):365–379
5. Brummer ME, Mersereau RM, Eisner RL, Lewine RRJ, Caeslles V, Kimmel R, Sapiro G
(1993) Automatic detection of brain contours in MRI datasets. IEEE Trans Image Process 12
(2):153–166
6. Subudhi A, Jena J, Sabut S (2016) Extraction of brain from MRI images by skull stripping
using histogram partitioning with maximum entropy divergence. In: 2016 international
conference on communication and signal processing (ICCSP), Melmaruvathur, 2016,
pp 0931–0935
7. Rekik I, Allassonnière S, Carpenter TK, Wardlaw JM (2012) Medical image analysis methods
in MR/CT-imaged acute-subacute ischemic stroke lesion: Segmentation, prediction and
insights into dynamic evolution simulation models. A critical appraisal. NeuroImage Clin. 1
(1):164–178
8. Zaki WMDW (2012) Content-based medical image analysis and retrieval of intracranial
haemorrhage CT brain images. Doctoral dissertation. Multimedia University Malaysia
9. Gulsen S, Terzi A (2013) Multiple brain metastases in a patient with uterine papillary serous
adenocarcinoma: treatment options for this rarely seen metastatic brain tumor. Surg Neurol Int
4(1):111
10. Clare S (1997) Functional MRI: methods and applications
11. Suberi AAM, Zakaria WNW, Tomari R, Fuad NFN (2018) Classification of Posterior
Fossa CT brain slices using Artificial Neural Network. Procedia Comput Sci 135:170–177
12. Li S, Manogaran G (2019) Design and implementation of networked collaborative service
system for brain stroke prevention and first aid. IEEE Access 7:14825–14836
13. Jauch EC, Saver JL, Adams HP Jr, Bruno A, Connors JJ, Demaerschalk BM, Khatri P,
McMullan PW Jr, Qureshi AI, Rosenfield K, Scott PA (2013) Guidelines for the early
management of patients with acute ischemic stroke: a guideline for healthcare professionals
from the American Heart Association/American Stroke Association. Stroke 44(3):870–947
14. Dubey P, Pandey S, Moonis G (2013) Acute stroke imaging: recent updates. Stroke Res.
Treat. 2013:1–6
15. Saad NM, Bakar SARSA, Muda AS, Mokji MM (2015) Review of brain lesion detection and
classification using neuroimaging analysis techniques. Jurnal Teknologi 74(6):73–85
16. van Seeters T, Biessels GJ, Kappelle LJ, Van Der Schaaf IC, Dankbaar JW, Horsch AD,
Niesten JM, Luitse MJ, Majoie CB, Vos JA, Schonewille WJ (2015) The prognostic value of
CT angiography and CT perfusion in acute ischemic stroke. Cerebrovasc Dis 40(5–6):258–
269
17. M.D. D. O. K. CT Imaging for Stroke. http://neurovascularmedicine.com/imagingct.php.
Accessed 29 July 2019
18. Sørensen T (1948) A method of establishing groups of equal amplitude in plant
sociology based on similarity of species and its application to analyses of the vegetation on
Danish commons. Kongelige Danske Videnskabernes Selskab 5(4):1–34
19. Revoledu homepage. https://people.revoledu.com/kardi/tutorial/Similarity/Jaccard.html.
Accessed 17 Sept 2019
Improvising Non-uniform Illumination
and Low Contrast Images of Soil
Transmitted Helminths Image Using
Contrast Enhancement Techniques

Norhanis Ayunie Ahmad Khairudin, Aimi Salihah Abdul Nasir,


Lim Chee Chin, Haryati Jaafar, and Zeehaida Mohamed

Abstract Image enhancement plays an important role in image processing and


computer vision. It is used to enhance the visual appearance of an image and to convert the image into a form suited to the requirements of subsequent image processing. In this paper, image enhancement is used to produce a better image by enhancing the image quality and highlighting the morphological features of the helminth eggs. The results obtained from enhancement are prepared for the segmentation and classification processes. The helminth eggs used in this paper are Ascaris Lumbricoides Ova
(ALO) and Trichuris Trichiura Ova (TTO). In this study, several enhancement
techniques have been performed on 100 images of ALO and TTO which have been
captured under three different illuminations: normal, under-exposed and
over-exposed images. The techniques used are global contrast stretching, limit
contrast, linear contrast stretching, modified global contrast stretching, modified
linear contrast stretching, partial contrast and reduce haze. Based on results
obtained from these techniques, modified linear contrast stretching and modified
global contrast stretching are able to equalize the lighting in the non-uniform
illumination images of helminth eggs. Both techniques are suitable to be used on
non-uniform illumination images and also able to improve the contrast in the image
without affecting or removing the key features in ALO and TTO images as com-
pared to the other techniques. Hence, the resultant images would be useful for parasitologists in analyzing helminth eggs.

Keywords Helminth eggs · Image processing · Contrast enhancement techniques


N. A. A. Khairudin (&) · A. S. Abdul Nasir · H. Jaafar
Faculty of Engineering Technology, Universiti Malaysia Perlis, UniCITI Alam Campus,
Sungai Chuchuh, 02100 Padang Besar, Perlis, Malaysia
e-mail: [email protected]
L. C. Chin
School of Mechatronic Engineering, University Malaysia Perlis, Pauh Putra Campus,
02600 Arau, Perlis, Malaysia
Z. Mohamed
Department of Microbiology and Parasitology, School of Medical Sciences, Health Campus,
Universiti Sains Malaysia, 16150 Kubang Kerian, Kelantan, Malaysia

© Springer Nature Singapore Pte Ltd. 2021


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_45

1 Introduction

Helminths are the infectious agents of parasitic worm diseases, globally known as helminthiases. Parasitic helminths can be classified into three major groups, which
are nematodes, trematodes and cestodes. Nematodes have a general vermiform
body shape and mostly live on insects. Trematodes are worms with solid body,
hermaphroditic and have a complex life cycle. They use snails as their intermediate
hosts. Trematodes usually inhabit lungs, liver, intestine and blood vessels of their
hosts; while cestodes are flat-worm with a segmented body. The adult cestodes live
in the human intestine [1].
Helminths are multicellular animals whose eggs are microscopic. The egg sizes of different species vary, ranging from around 20 to 80 µm for those that occur in significant amounts in the sanitary field [2]. Their shape and resistance also differ.
Helminth eggs can remain viable for 1 to 2 months in crops and many months in
soil, freshwater and sewage [1]. They can remain viable for several years in feces,
night soil, sludge and wastewater. Helminth eggs are able to be transmitted into a
human body through contact with polluted sludge or fecal material, exposure to
contaminated food, water and dust from fur or animal body [3]. These parasites can
multiply in the human body and could lead to a serious illness such as filariasis,
cancerous diseases and cysts.
Helminthiases may cause anemia, diarrhea and severe problems of malnutrition.
Usually, children aged 5 to 15 years old are affected, which can impair their physical and mental development and their quality of life [4]. Early diagnosis of the disease is
fundamental for patient recovery especially in the case of children. Helminth eggs
can be diagnosed from stool, cell and blood samples taken from patients. A problem arises because the parasitologist needs to diagnose these samples in fresh condition within a limited time. The diagnosis procedure is also conducted manually using a light microscope. This procedure consumes a great amount of time and the observer must maintain good concentration while observing the samples [5]. The results obtained are often neither reliable nor accurate.
These limitations have triggered the improvement in digital image processing for
helminth egg recognition by using image processing and computer algorithms. For
example, a digital image processing technique is proposed by Hadi et al. [6] in
detecting ALO and TTO from fecal samples. Three pre-processing methods have
been compared in order to obtain the best segmentation result. Method I used
contrast enhancement while method II used canny edge detection and method III
used combination of contrast enhancement and canny edge detection. The results
obtained concluded that method III has the highest accuracy which is 93% for ALO
and 94% for TTO. Then, 15 types of human intestine parasites are diagnosed by
Suzuki et al. [7] through a proposed automatic segmentation and classification
system of human intestinal parasites from microscopy images. The proposed system
explored image foresting transform and ellipse matching for segmentation and

optimum-path forest classifier for object recognition. This system obtained 90.38%
for sensitivity, 98.32% for specificity and 98.19% for efficiency. Next, Jimenez
et al. [5] proposed a system that identifies and quantifies seven species of helminth
eggs in wastewater. The system shows a specificity of 99% and a sensitivity in a
range of 80% to 90%.

2 Literature Reviews on Enhancements and Image Quality Assessments

The exposure setting of the microscope may affect the quality of the captured images. An over-exposure setting produces a bright image, while an under-exposure setting produces a dark image [8]. It is hard to visualize and analyze the morphological features of helminth eggs when the quality of the image is low. Various enhancement techniques have been developed and published. Most studies aimed at enhancing the visibility of low contrast images while reducing the noise, in order to improve the image quality and obtain a better visual appearance [9].
Abdul-Nasir et al. [10] proposed modified global and modified linear contrast
stretching which are good in enhancing the contrast and brightness of the image
compared to the conventional global contrast stretching (GCS) and linear contrast
stretching (LCS) techniques, based on qualitative and quantitative analysis. Kaur and Choudhary [11] compared five enhancement techniques on acute leukemia images, namely LCS, GCS, dark contrast stretching, bright contrast stretching and partial contrast, and found that partial contrast gives a better result in enhancing the contrast of non-uniformly illuminated acute leukemia images compared to the other techniques.
Ho et al. [12] proposed a dehazing algorithm based on dark channel prior and contrast enhancement approaches. The usual dark channel prior method restores the color of objects in the scene after eliminating the haze, but it does not consider the enhancement of image contrast. On the contrary, the image contrast method improves the local contrast of objects, but the color is generally distorted as a consequence of the over-stretching of contrast. The proposed algorithm combines the advantages of these two conventional approaches, keeping the color while dehazing the image.
Jang et al. [13] proposed an adaptive contrast enhancement using edge-based
lighting to improve the perceptual visual quality by lessening the structural dis-
tortion to a tolerable level. This method estimated the lighting conditions and
adaptively adjusts the luminance in the images. Enhancement performance measure
is used to analyze the results obtained.
Al-Ameen [14] improved LCS by developing an adjustable contrast stretching technique for improving the contrast of color images. The proposed technique was evaluated by comparing it with four specialized enhancement techniques:

TW-CESBLK, NMHE, ESIHE and RESIHE. The proposed technique has provided
satisfactory results as it produced natural contrast images with no visible artefacts
and outperformed the comparative techniques by scoring the highest in terms of
recorded accuracy.
Hitam et al. [15] applied CLAHE to the underwater images in red–green–blue
(RGB) and hue–saturation–value (HSV) color models separately. Then, the indi-
vidual images are combined through Euclidean distance to produce a contrast
enhanced image with a low-noise. In several cases, however, this method produces
output images with more noise than the conventional CLAHE does. As such, the
output image is greenish.
Image quality assessment is used to develop and monitor quantitative measures that are able to automatically predict image quality [16]. Kumar and Rattan [17] analyzed 10 types of quality metrics for medical images and stated that SSIM gave better accuracy and higher performance compared to the other quality metrics. Saha et al. [18] analyzed the performance of full-reference image quality assessment by combining global and local distortion measures and concluded that better distortion-wise performance does not guarantee better overall performance; for the overall performance on an image, a relative assessment across different distortions is also required.
In conclusion, many enhancement techniques have been proposed and applied to medical images in order to acquire the targeted information. Quantitative measures for the enhanced images have also improved greatly, so an image can readily be measured against the required criteria and the quality of the resultant image can be verified through the obtained results.

3 Methodology

3.1 Image Acquisition

In image acquisition, images of the ALO and TTO species are captured from stool sample slides by using a computerized microscope. These stool samples, freshly collected from patients, are prepared by the Department of Microbiology and Parasitology, Hospital Universiti Sains Malaysia (HUSM). The stool is placed on a microscope slide and normal saline is used as the stain to obtain a clearer image of ALO and TTO. The stool slides are then observed under 40X magnification and the captured images are saved in .jpg format under three different conditions: normal, under-exposed and over-exposed. For each species, 100 images are randomly selected from the captured images to be tested in this paper. Figure 1 shows samples of the captured ALO and TTO images under different illuminations.

Fig. 1 Samples of ALO and TTO images captured under three different conditions: (a) normal ALO image, (b) under-exposed ALO image, (c) over-exposed ALO image, (d) normal TTO image, (e) under-exposed TTO image, (f) over-exposed TTO image

3.2 Image Enhancement Techniques on Helminth Eggs

In image enhancement, the quality of the image is enhanced by modifying the brightness, darkness [19] or sharpness of the image to more suitable values or conditions depending on the user's preference. In this paper, seven image enhancement techniques are applied and tested for enhancing the ALO and TTO images.
Global Contrast Stretching (GCS). GCS simplifies the contrast problems that
occur in global fashion such as poor or excessive light conditions in the environ-
ment source [20]. Image with high global contrast will have a detailed and
variation-rich image while an image with a lower global contrast contains less
information, fewer details and seems to be more uniform [21].
The whole color palette range is considered at once to determine the maximum and minimum over all RGB channels of the color image [22, 23]. The combination of the RGB channels therefore gives only one maximum value and one minimum value. These maximum and minimum values are used as the desired values for the contrast stretching process. The calculation for GCS is defined as in Eq. 1.

$$\mathrm{out}_{RGB}(x,y) = 255 \times \frac{\mathrm{in}_{RGB}(x,y) - \min_{RGB}}{\max_{RGB} - \min_{RGB}} \qquad (1)$$
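As a rough MATLAB illustration of Eq. 1, a single minimum and maximum taken over all three channels at once can be used to stretch the whole color image; this is a sketch under that reading of GCS, with an assumed file name, and not the authors' code.

```matlab
% Global contrast stretching: one shared min/max pair for the R, G and B channels.
I    = double(imread('alo_sample.jpg'));            % hypothetical ALO/TTO image, values 0-255
gMin = min(I(:));                                   % global minimum over all channels
gMax = max(I(:));                                   % global maximum over all channels
outGCS = uint8(255 * (I - gMin) / (gMax - gMin));   % Eq. 1 applied to every pixel and channel
imshow(outGCS);
```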

Limit Contrast (LC). Limit contrast also known as normalization is a simple


enhancement technique that is used to improve the contrast in an image to a desired

range of values through stretching the range of intensity values of original image
[24]. The original purpose for LC is to enhance the dynamic range in image gray
level but for this project, the dynamic range is enhanced on the image color
component.
Lower and upper pixel value limits (LP and UP) are decided from the pixel values of the RGB components, and the highest and lowest pixels are chosen from these components automatically. The image is then scanned to find the lowest and highest pixel values (L and H) currently present in the image, and each pixel is scaled using Eq. 2.

$$P_{out} = (\mathrm{image} - L)\,\frac{UP - LP}{H - L} + LP \qquad (2)$$

The lowest values from R, G and B components are combined to produce a new
L value to be applied on the color image. The disadvantage of this technique is that
LC will not achieve any results if the original range covers the full possible set of
pixel values.
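A short MATLAB sketch of Eq. 2 is shown below; choosing LP = 0 and UP = 255 is an assumption made for illustration, whereas in the text the limits are derived from the RGB components automatically.

```matlab
% Limit contrast (normalization): rescale the current range [L, H] to the desired [LP, UP].
I  = double(imread('alo_sample.jpg'));              % hypothetical input image, values 0-255
LP = 0;  UP = 255;                                  % assumed lower and upper output limits
L  = min(I(:));  H = max(I(:));                     % lowest and highest pixel values present
outLC = uint8((I - L) * (UP - LP) / (H - L) + LP);  % Eq. 2
```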
Linear Contrast Stretching (LCS). LCS locally adjusts the value of each element
in the image to simultaneously enhance the visualization structure of both the
lightest and darkest parts of the image. This technique helps to highlight the
information in the regions that are initially very light or dark [25].
In LCS, a certain amount of stretching is applied in a neighborhood and is controlled by the original contrast in that neighborhood. The linear contrast technique considers the range of each RGB component in the image separately. Thus, the range of each color component is used during the contrast stretching process to represent each range of color, giving each component its own set of minimum and maximum values [26]. Equation 3 shows the calculation for the LCS technique.

$$\mathrm{out}_{RGB}(x,y) = 255 \times \frac{\mathrm{in}_{RGB}(x,y) - \min}{\max - \min} \qquad (3)$$

Modified Global Contrast Stretching (MGCS). MGCS overcomes the weakness of GCS by adjusting the minimum and maximum values of the RGB components in the image. MGCS is able to enhance the contrast of the image without affecting the color structure of the original image. This technique is also able to preserve as much information as the original image [21].
This technique is altered from GCS by using new minimum and maximum values which differ from those of the original GCS. The new minimum and maximum are selected from the values of the RGB components, acquired through a calculation based on the total number of pixels in the image. The process must satisfy these conditions:

$$\frac{T_{\min(RGB)}}{\text{total number of pixels in image}} \times 100 \;\geq\; \mathrm{min}_p \qquad (4)$$

$$\frac{T_{\max(RGB)}}{\text{total number of pixels in image}} \times 100 \;\geq\; \mathrm{max}_p \qquad (5)$$

Tmin and Tmax are the total number of pixels that lie in a specific minimum and
maximum while minp and maxp are the desired values for minimum and maximum.
Modified Linear Contrast Stretching (MLCS). MLCS overcomes the weakness of LCS by adjusting the minimum and maximum values of the RGB components in the image. MLCS is capable of enhancing the contrast of the image without affecting the color structure of the original image. This technique is also able to retain the same information as the original image [21].
MLCS is altered from the LCS enhancement by replacing the minimum and maximum values of each RGB component with new minimum and maximum values that lie beyond the original values for each component [25]. The equation is the same as that of the original LCS. The desired percentages (minp and maxp) are obtained from the values of each RGB component, which must satisfy these conditions:

$$\frac{T_{\min(RGB)}}{\text{total number of pixels in image}} \times 100 \;\geq\; \mathrm{min}_p \qquad (6)$$

$$\frac{T_{\max(RGB)}}{\text{total number of pixels in image}} \times 100 \;\geq\; \mathrm{max}_p \qquad (7)$$
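The conditions in Eqs. 4–7 amount to choosing new minimum and maximum levels so that a desired percentage of pixels falls below and above them before stretching. The MATLAB sketch below illustrates this idea for the per-channel (MLCS-like) case; the percentage values, the sorting-based percentile estimate and the clipping step are assumptions made only for illustration.

```matlab
% Percentile-based stretching in the spirit of MGCS/MLCS (per-channel version).
I    = double(imread('alo_sample.jpg'));        % hypothetical input image, values 0-255
minp = 1;  maxp = 1;                            % assumed desired percentages (Eqs. 4-7)
out  = zeros(size(I));
for c = 1:size(I, 3)
    ch = I(:, :, c);
    v  = sort(ch(:));
    n  = numel(v);
    newMin = v(max(1, round(n * minp / 100)));          % ~minp % of pixels lie below this level
    newMax = v(min(n, round(n * (100 - maxp) / 100)));  % ~maxp % of pixels lie above this level
    ch = 255 * (ch - newMin) / (newMax - newMin);       % stretch using the new limits
    out(:, :, c) = min(max(ch, 0), 255);                % clip saturated pixels to [0, 255]
end
outStretch = uint8(out);
```

For an MGCS-like variant, a single pair of newMin/newMax values computed from all channels together would replace the per-channel loop.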

Partial Contrast (PC). Partial contrast is a linear mapping function that is used to
increase the contrast level and brightness level of the image. The technique is based
on the original brightness and contrast level of the images to be adjusted. This
technique is able to enhance the contrast of image even with a different illumination
[26, 27].
Before the mapping process starts, the range where the majority of the input pixels converge needs to be identified for each color channel. Since the input images are in the RGB color model, it is necessary to find the ranges of the red, green and blue intensities [26].
The pixels within the range minTH to maxTH are stretched to the desired range NminTH to NmaxTH, whereas the remaining pixels experience compression. By these stretching and compressing processes, the pixels of the image can be mapped to a wider range and brighter intensities. As a result, the contrast and brightness levels of the raw images are increased [27]. PC is defined as in Eq. 8.

$$P_{out} = \begin{cases} \dfrac{NminTH}{minTH}\,\mathrm{img}, & \mathrm{img} < minTH \\[2mm] \dfrac{NmaxTH - NminTH}{maxTH - minTH}\,(\mathrm{img} - minTH) + NminTH, & minTH \le \mathrm{img} \le maxTH \\[2mm] \dfrac{NmaxTH}{maxTH}\,\mathrm{img}, & \mathrm{img} > maxTH \end{cases} \qquad (8)$$
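A MATLAB sketch of the piecewise mapping in Eq. 8 is given below; the threshold values minTH, maxTH, NminTH and NmaxTH are assumed here purely for illustration.

```matlab
% Partial contrast: stretch [minTH, maxTH] to [NminTH, NmaxTH] and compress the rest.
img    = double(imread('tto_sample.jpg'));      % hypothetical input image, values 0-255
minTH  = 50;   maxTH  = 180;                    % assumed input range containing most pixels
NminTH = 20;   NmaxTH = 235;                    % assumed stretched output range
out    = zeros(size(img));

low  = img < minTH;                             % lower region (compressed)
mid  = img >= minTH & img <= maxTH;             % middle region (stretched)
high = img > maxTH;                             % upper region (compressed)
out(low)  = img(low)  * NminTH / minTH;
out(mid)  = (NmaxTH - NminTH) / (maxTH - minTH) * (img(mid) - minTH) + NminTH;
out(high) = img(high) * NmaxTH / maxTH;
outPC = uint8(min(max(out, 0), 255));           % clip to the valid 8-bit range
```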

Reduce Haze (RH). The RH technique helps to improve the visibility of information in a low contrast image, as well as in an image that has lost its color fidelity, by reducing the haze in the image. This technique employs a per-pixel dark channel in order to identify the low intensity (dark) pixels in the image. The haze transmission is estimated from the dark pixels in the image.
The intensity of the dark channel acts as an estimate of the thickness of the haze. Then, the haze and a quad-tree decomposition are evaluated in order to compute the atmospheric light. The estimation of the haze is based on the value of the haze thickness at each pixel and the amount of atmospheric light, which represents the value of the brightest non-specular haze [28, 29]. Equation 9 represents the calculation for the RH technique.

$$H = \frac{I(x) - A}{\max\big(t(x),\, t_0\big)} \qquad (9)$$

where I is the observed intensity, A is the global atmospheric light, t(x) is the transmission in the local patch and t0 is equal to 0.1.

3.3 Image Quality Assessment (IQA)

Image Quality Assessment (IQA) is divided into qualitative analysis and quanti-
tative analysis. For qualitative analysis, over enhancement, unnatural enhancement
or the presence of the artefacts in the images are inspected by a human through
naked eyes. Generally, the visual quality or appearance of the resultant image is
evaluated in the qualitative measurement process [30]. Quantitative analysis is
divided into three parts which are full reference (FR) IQA, no-reference (NR) IQA
and reduce reference (RR) IQA. Similarity, brightness and contrast in the image are
examples of inspection in the quantitative analysis.
In this paper, seven techniques from FR IQA are used in order to analyze the results of the proposed enhancement techniques. These techniques have mostly been used for restoration and enhancement [31]. Among them, MSE, PSNR, SSIM and FSIM calculate the similarity and intensity level between the original image and the enhanced image, while AMBE, EMEE and Entropy are used to measure the contrast between the original and resultant images. The descriptions of these quantitative analyses are as follows:
Mean Square Error (MSE). MSE is computed by averaging the squared intensity difference between the original (input) image and the resultant (output) image pixels. MSE is defined as in Eq. 10.

$$\mathrm{MSE} = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \big(B(i,j) - A(i,j)\big)^2 \qquad (10)$$

where A(i, j) and B(i, j) are the original and enhanced images and m × n signifies the image size [30, 32].
Peak Signal-to-Noise Ratio (PSNR). Signal-to-noise ratio (SNR) is a mathe-
matical measure of image quality based on the pixel difference between two images
[32]. The SNR measure is an estimation of the quality of a reconstructed image
when compared to the original image. The PSNR is basically the SNR when all
pixel values are equal to the maximum possible value. PSNR is defined as in
Eq. 11.

$$\mathrm{PSNR} = 10 \log_{10}\!\left(\frac{s^2}{\mathrm{MSE}}\right) \qquad (11)$$

where s = 255 for the 8-bit image.


Structural Similarity Index Measurement (SSIM). SSIM measures the simi-
larity between two images and is correlated with the quality perception of human
visual system. SSIM not only has good image quality prediction accuracy but also has a simple formulation and low computational complexity [32, 33]. The calculation of SSIM is given in Eq. 12.

$$\mathrm{SSIM}(x,y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)} \qquad (12)$$

where
μx and μy are the averages of x and y,
σxy is the covariance of x and y,
σx² and σy² are the variances of x and y,
c1 = (k1 L)² and c2 = (k2 L)² are variables to stabilize the division with a weak denominator,
L is the dynamic range of the pixel values,
k1 = 0.01 and k2 = 0.03 by default.

Feature Similarity Index Measurement (FSIM). FSIM considers the luminance


component of images by calculating the similarity score between the original image
and the resultant image [33]. Equation 13 presents the calculation for FSIM.
$$\mathrm{FSIM} = \frac{\sum_{x \in \Omega} S_L(x)\, PC_m(x)}{\sum_{x \in \Omega} PC_m(x)} \qquad (13)$$

where PC_m(x) is the phase congruency, S_L(x) is the similarity measure and Ω is the spatial domain of the whole image.
Absolute Mean Brightness Error (AMBE). AMBE is used to calculate the
difference in mean brightness between two images. Higher AMBE value indicates
that the brightness is better preserved [34]. The calculation for AMBE is shown in
Eq. 14.

$$\mathrm{AMBE} = \mathrm{mean}(\text{original image} - \text{resultant image}) \qquad (14)$$

Weber-Law Based Contrast Measure with Entropy (EMEE). EMEE measures the contrast in an image based on entropy. A high EMEE value is desired because it indicates the degree of contrast enhancement in the compared images [32]. Equation 15 presents the calculation for EMEE.

$$\mathrm{EMEE}_{k_1,k_2} = \frac{1}{k_1 k_2} \sum_{l=1}^{k_1} \sum_{k=1}^{k_2} \alpha \left(\frac{Y_{\max}(k,l)}{Y_{\min}(k,l)+c}\right)^{\alpha} \ln\!\left(\frac{Y_{\max}(k,l)}{Y_{\min}(k,l)+c}\right) \qquad (15)$$

where α is a constant and c = 0.00001 is added to avoid division by zero.


Entropy. Entropy is defined in terms of the grey-level states taken by the individual pixels. A high entropy value is preferred since it discloses that an image contains much information [35]. The equation for entropy is as in Eq. 16.

$$e = x_{(i,j)} \ln\!\big(x_{(i,j)}\big) \qquad (16)$$
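Most of the full-reference measures above reduce to a few lines of MATLAB; the fragment below sketches MSE, PSNR and AMBE for a pair of images (SSIM and FSIM require windowed statistics and phase congruency maps and are not reproduced here). The file names and grayscale conversion are illustrative assumptions.

```matlab
% Simple full-reference quality measures between original image A and enhanced image B.
A = double(rgb2gray(imread('alo_original.jpg')));   % hypothetical original image, 0-255
B = double(rgb2gray(imread('alo_enhanced.jpg')));   % hypothetical enhanced image, 0-255

mseVal  = mean((B(:) - A(:)).^2);                   % Eq. 10
psnrVal = 10 * log10(255^2 / mseVal);               % Eq. 11 with s = 255 (8-bit images)
ambe    = abs(mean(A(:)) - mean(B(:)));             % difference in mean brightness (Eq. 14)
fprintf('MSE = %.3f, PSNR = %.2f dB, AMBE = %.3f\n', mseVal, psnrVal, ambe);
```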

4 Results and Discussions

In this paper, seven proposed enhancement techniques have been applied on the
helminth egg images which are ALO and TTO images. Figure 2 shows the original
images of ALO and TTO while Figs. 3 until 9 show the resultant images of
enhancement techniques on ALO and TTO images.

(a) ALO_1 (b) ALO_2 (c) TTO_1 (d) TTO_2

Fig. 2 The original images for two types of STH

Fig. 3 The resultant images of GCS on ALO and TTO images

Figure 2 shows the original images of ALO and TTO. These images differ in illumination: the ALO_1 image is under normal illumination, ALO_2 and TTO_1 are over-exposed images, and TTO_2 is an under-exposed image. The artefacts in the images differ in size and color, which may lead to false diagnosis results if they are wrongly detected in the image.
Figure 3 shows the results when the GCS technique is applied on the original
images. The targeted images appear clearer in the images. The background is
slightly enhanced but the artefact also becomes visible. The illumination is still
different for each image.
Figure 4 shows the results when the LC technique is applied on the original
images. The target images become dimmer compared to the original color. The
background color is enhanced and become brighter. The illumination is still dif-
ferent for each image.

(a) ALO_1 (b) ALO_2 (c) TTO_1 (d) TTO_2

Fig. 4 The resultant images of LC on ALO and TTO images



Figure 5 shows the resultant images for the LCS technique. The targeted images
appear clearly in the images. The background is also enhanced but differs for each
image. The background colors in the ALO_2 and TTO_1 images are completely changed from the original colors. The illumination in each image is still different from the others.
Figure 6 shows the resultant images from the MGCS technique. The enhanced images show a good contrast between the targeted images and the background. The targeted images appear clearly in the images. The background is enhanced and appears in a similar color for each image. The illumination is identical for each image.
Figure 7 shows the resultant images from the MLCS technique. The enhanced images show a good contrast between the targeted images and the background. The targeted images appear clearly in the images but in a darker color. The background is enhanced and appears in a similar color for each image. The illumination is identical for each image.

(a) ALO_1 (b) ALO_2 (c) TTO_1 (d) TTO_2

Fig. 5 The resultant images of LCS on ALO and TTO images

(a) ALO_1 (b) ALO_2 (c) TTO_1 (d) TTO_2

Fig. 6 The resultant images of MGCS on ALO and TTO images

(a) ALO_1 (b) ALO_2 (c) TTO_1 (d) TTO_2

Fig. 7 The resultant images of MLCS on ALO and TTO images



Figure 8 shows the resultant images for the PC technique. The targeted images
in ALO_1, TTO_1 and TTO_2 appear clearer in images while the targeted image in
ALO_2 is dimmer compared to other images. The background color in ALO_1,
ALO_2 and TTO_2 are slightly enhanced while the background in TTO_1 is over
enhanced. The illumination in the images are not identical to each other.
Figure 9 shows the resultant images for the RH technique. A good contrast is
obtained from the targeted images and background images. The targeted images
appear clearer in the images but the artefacts also become visible. Then, the
background is enhanced but the color in each image is different from each other.
The illumination in the images are not identical to each other.
Based on the results obtained, GCS and LCS show an improvement in both the target images and the background. The color of the target images is more highlighted in GCS compared to LCS. With LC, the target images are only slightly enhanced and the background color does not differ much from the original images. For MGCS and MLCS, the enhanced images show a good contrast between the target images and the background; the target images in MLCS are darker than in MGCS. In PC, the resultant images are slightly enhanced, while in RH the enhancement shows a decent contrast for the target image but the background color is only slightly enhanced. Artefacts also appear in the images, which can lead to false diagnosis results.
The most suitable technique for enhancement based on the qualitative measure is
MGCS technique. This is because the resultant image is clearer and the feature of
the target image is improved. The resultant images also show that the color for
target image is brighter but it does not change the original color of the target image.

(a) ALO_1 (b) ALO_2 (c) TTO_1 (d) TTO_2

Fig. 8 The resultant images of PC on ALO and TTO images

(a) ALO_1 (b) ALO_2 (c) TTO_1 (d) TTO_2

Fig. 9 The resultant images of RH on ALO and TTO images



This shows that MGCS is able to enhance the image while preserving the originality
of the STH color. Therefore, the resultant image is suitable to be used for further
process of image processing.
In order to identify the significance of the enhancement techniques, seven quantitative analyses are used to compare each resultant image from the proposed techniques with the original images. MSE, PSNR, SSIM and FSIM are similarity measures while AMBE, EMEE and Entropy are contrast measures. For the enhanced ALO images, Table 1 shows the similarity quantitative analysis results while Table 2 shows the contrast quantitative analysis results. For the enhanced TTO images, Table 3 shows the similarity quantitative analysis results while Table 4 shows the contrast quantitative analysis results.
Table 1 shows the similarity average results from enhanced ALO images. MSE
analysis shows the lowest value is obtained by LC technique while the highest
value is obtained by MLCS technique. Then, in PSNR analysis MLCS technique
has the lowest value while LC technique has the highest value. For SSIM analysis,
LCS technique obtained the lowest value while LC technique obtained the highest
value. Lastly, in FSIM analysis, RH technique has the lowest value and LC tech-
nique has the highest value. Overall, the values of the LC technique are the lowest in MSE but the highest in the PSNR, SSIM and FSIM quantitative analyses, while the MLCS technique has the lowest value in the PSNR analysis. The LCS technique has the lowest value in the SSIM analysis and the RH technique has the lowest value in the FSIM analysis.

Table 1 Average values for similarity quantitative analysis from 100 enhanced ALO images

Techniques   MSE        PSNR     SSIM    FSIM
GCS          460.331    24.235   0.932   0.978
LC           157.190    28.136   0.974   0.999
LCS          804.981    20.039   0.624   0.953
MGCS         3403.958   13.797   0.827   0.915
MLCS         3664.069   13.375   0.722   0.902
PC           754.941    20.252   0.819   0.922
RH           1052.279   18.535   0.652   0.837

Table 2 Average values for contrast quantitative analysis from 100 enhanced ALO images

Techniques   AMBE     EMEE     Entropy
GCS          11.084   2.960    2.462
LC           11.044   1.070    0.051
LCS          13.073   4.720    2.453
MGCS         42.974   23.493   3.856
MLCS         44.616   25.636   3.998
PC           12.481   5.063    3.617
RH           17.011   5.982    2.375

Table 3 Average values for similarity quantitative analysis from 100 enhanced TTO images

Techniques   MSE        PSNR     SSIM    FSIM
GCS          357.951    26.429   0.950   0.935
LC           167.246    31.801   0.959   0.968
LCS          614.836    21.963   0.658   0.954
MGCS         3324.678   14.140   0.824   0.891
MLCS         3559.440   13.715   0.733   0.870
PC           1095.490   18.791   0.614   0.828
RH           875.567    19.697   0.673   0.819

Table 2 shows the contrast results from the enhanced ALO images. In AMBE
analysis, the lowest value is obtained by LC technique while the highest value is
obtained by MLCS. Then, EMEE analysis shows that LC has the lowest value while
MLCS has the highest value. In entropy analysis, the lowest value is obtained by LC
and the highest is obtained by MLCS. Overall, LC technique has the lowest value while
MLCS has the highest value for all techniques in the contrast quantitative analysis.
Table 3 shows the similarity results from the enhanced TTO images. MSE
analysis shows that LC technique has the lowest value and MLCS has the highest
value. Then, the lowest value in PSNR analysis is MLCS technique while the
highest is LC technique. Then, PC technique has the lowest value and LC technique
has the highest value in SSIM analysis. For FSIM, the lowest value is RH technique
while the highest value is LC technique. Overall, the LC technique has the highest values in the PSNR, SSIM and FSIM analyses but the lowest value in the MSE analysis. MLCS has the highest value in MSE but the lowest in PSNR, while the PC technique obtained the lowest value in the SSIM analysis and the RH technique obtained the lowest value in the FSIM analysis.
Table 4 shows the contrast results from the enhanced TTO images. In the AMBE analysis, the lowest value is obtained by the LC technique and the highest is obtained by the MLCS technique. EMEE likewise shows that the lowest value belongs to the LC technique while the highest value belongs to the MLCS technique. In entropy, the lowest is obtained by the LC technique while the highest is the MLCS technique. Overall, the lowest value for all contrast quantitative analyses is obtained by the LC technique while the MLCS technique has the highest value.

Table 4 Average values for contrast quantitative analysis from 100 enhanced TTO images

Techniques   AMBE     EMEE     Entropy
GCS          10.507   2.365    2.573
LC           8.257    1.130    0.173
LCS          11.899   3.168    2.874
MGCS         45.987   22.959   4.043
MLCS         47.573   24.727   4.232
PC           22.026   7.368    3.919
RH           16.105   4.755    1.904

In the similarity quantitative analysis, a lower MSE value indicates a closer match to the original image, while for PSNR, SSIM and FSIM a higher value indicates a closer match. Based on these criteria, the LC technique has the highest similarity to the original image for both ALO and TTO images. For the AMBE, EMEE and Entropy analyses, the highest value is preferred as the most suitable contrast measure. Hence, MLCS is preferred as giving the best contrast in the quantitative analysis for both ALO and TTO.
Based on the qualitative results, MGCS has a better contrast compared to the other techniques, followed by the MLCS technique. The results from the quantitative analysis show that MLCS has the highest contrast value, followed by the MGCS technique. This shows that both techniques are suitable to be used on the non-uniform illumination images of ALO and TTO.

5 Conclusions

The comparison results from the proposed enhancement techniques showed the advantages and disadvantages of the enhancement techniques when applied to images with different illuminations. The similarity and contrast quantitative analyses show the effect of enhancement in tabulated data, which helps in analyzing the resultant images. The comparison between the quantitative and qualitative analyses proves that, although numerical results are provided, the visual appropriateness of the image is also important. It can therefore be concluded that the suitable methods for non-uniform illumination images are the MLCS and MGCS techniques, because both techniques help to improve the enhancement quality of the morphological features in ALO and TTO images. This helps parasitologists to diagnose helminth eggs without much difficulty. The aims of this paper are achieved as the visibility of low contrast features is increased and the noise in the image is reduced to a minimum.

Acknowledgments The author would like to acknowledge the support from the Fundamental
Research Grant Scheme for Research Acculturation of Early Career Researchers (FRGS-RACER)
under a grant number of RACER/1/2019/ICT02/UNIMAP//2 from the Ministry of Higher
Education Malaysia. The authors gratefully acknowledge team members and thank Hospital
Universiti Sains Malaysia (HUSM) for providing the helminths eggs samples.

References

1. World Health Organization (2004) Training manual on diagnosis of intestinal parasites:


tutor’s guide, no. 98.2. Organisation mondiale de la Santé, Gevene
2. Ghazali KH, Hadi RS, Zeehaida M (2013) Microscopy image processing analysis for
automatic detection of human intestinal parasites ALO and TTO. In: 2013 international
conference on electronics, computer and computation, ICECCO 2013, pp 40–43

3. Amoah ID, Singh G, Stenström TA, Reddy P (2017) Detection and quantification of
soil-transmitted helminths in environmental samples: a review of current state-of-the-art and
future perspectives. Acta Trop 169(2017):187–201
4. World Health Organization (WHO) (2015) Third WHO report on neglected diseases:
investing to overcome the global impact of neglected tropical diseases. World Health
Organisation, Geneva, Switzerland
5. Jiménez B, Maya C, Velásquez G, Torner F, Arambula F, Barrios JA, Velasco M (2016)
Identification and quantification of pathogenic helminth eggs using a digital image system.
Exp Parasitol 166:164–172
6. Hadi RS, Ghazali KH, Khalidin IZ, Zeehaida M (2012) Human parasitic worm detection
using image processing technique. In: 2012 IEEE symposium on computer applications and
industrial electronics, SCAIE 2012, Kota Kinabalu, Malaysia, pp 196–201
7. Suzuki CTN, Gomes JF, Falcão AX, Papa JP, Hoshino-Shimizu S (2013) Automatic
segmentation and classification of human intestinal parasites from microscopy images. IEEE
Trans Biomed Eng 60(3):803–812
8. Aris TA, Abdul Nasir AS, Mohamed Z, Jaafar H, Mustafa WA, Khairunizam W, Jamlos MA,
Zunaidi I, Razlan ZM, Shahriman AB (2019) Colour component analysis approach for
malaria parasites detection based on thick blood smear images. In: MEBSE 2018 - IOP
conference series: materials science and engineering, vol 557, p 012007
9. Wu Q, Wang Y-P, Liu Z, Chen T, Castleman KR (2002) The effect of image enhancement on
biomedical pattern recognition. In: Proceedings of the second joint 24th annual conference
and the annual fall meeting of the biomedical engineering society. IEEE, pp 1067–1069
10. Abdul-Nasir AS, Mashor MY, Mohamed Z (2012) Modified global and modified linear
contrast stretching algorithms: New color contrast enhancement techniques for microscopic
analysis of malaria slide images. Comput Math Methods Med. https://doi.org/10.1155/2012/
637360
11. Kaur J, Choudhary A (2012) Comparison of several contrast stretching techniques on acute
leukemia images. Int J Eng Innov Technol (IJEIT) 2(1):332–335
12. Ho, KT, Lee SH, Cho NK (2013) A dehazing algorithm using dark channel prior and contrast
enhancement. In: IEEE international conference on acoustics, speech and signal processing
(ICASSP). IEEE, pp 2484–2487
13. Jang CY, Kang SJ, Kim YH (2012) Adaptive contrast enhancement using edge-based lighting
condition estimation. Digit Sig Process 58:1–9
14. Al-Amen Z (2018) Contrast enhancement for color images using an adjustable contrast
stretching technique. Int J Comput 17(2):74–80
15. Hitam MS, Yussof WNJW, Awalludin EA, Bachok Z (2013) Mixture contrast limited
adaptive histogram equalization for underwater image enhancement. In: IEEE international
conference on computer applications technology (ICCAT). IEEE, Sousse, pp 1–5
16. Wang Z, Bovik AC (2002) A universal image quality index. IEEE Sig Process Lett 9(3):81–84
17. Kumar R, Rattan M (2012) Analysis of various quality metrics for medical image processing.
Int J Adv Res Comput Sci Softw Eng 2(11):137–144
18. Saha A, Wu QMJ (2016) Full-reference image quality assessment by combining global and
local distortion measures. Sig Process 128:186–197
19. Fiete RD (2010) Modelling the imaging chain of digital cameras. SPIE, pp 127–132
20. Arici T, Altunbasak Y (2006) Image local contrast enhancement using adaptive non-linear
filters. In: International conference of image processing. IEEE, Atlanta, pp 2881–2884
21. Matkovic K, Neumann L, Neumann A, Psik T, Purgathofer W (2005) Global contrast factor—
a new approach to image contrast. In: The computational aesthetics in graphics, visualization
and imaging workshop, pp 159–168
22. Abdul-Nasir AS, Mashor MY, Mohamed Z (2012) Modified global and modified linear
contrast stretching algorithms - new colour contrast enhancement techniques for microscopic
analysis of malaria slide images. Comput Math Methods Med 2012:637360

23. Rizzi A, Algeri T, Medeghini G, Marini D (2004) A proposal for contrast measure in digital
images. In: Second European conference on color in graphics, imaging and vision.
International symposium on multispectral color science, pp 187–192
24. Sulur KM, Abdul Nasir AS, Mustafa WA, Jaafar H, Mohamed Z (2017) Analysis of color
constancy algorithms for improving segmentation of malaria images. J Telecommun Electron
Comput Eng 10(1–16):43–49
25. Khairudin NAA, Ariff FNM, Abdul Nasir AS, Mustafa WA, Khairunizam W, Jamlos MA,
Zunaidi I, Razlan ZM, Shahriman AB (2019) Image segmentation approach for acute and
chronic leukaemia based on blood sample images. In: MEBSE 2018-IOP Conference Series:
Materials Science and Engineering, vol 557, p 012008 (2019)
26. Radha N, Tech M (2012) Comparison of contrast stretching methods of image enhancement
techniques for acute leukemia images. Int J Eng Res Technol 1(6):1–8
27. Abdul-Nasir AS, Mashor MY, Mohamed Z (2012) Segmentation based approach for
detection of malaria parasites using moving k-means clustering. In: 2012 IEEE EMBS
international conference of biomedical engineering and science. https://doi.org/10.1109/
IECBES.2012.6498073
28. He K, Sun J, Tang X (2011) Single image haze removal using dark channel prior. IEEE Trans
Pattern Anal Mach Intell 33(12):2341–2353
29. Tian QC, Cohen LD (2018) A variational-based fusion model for non-uniform illumination
image enhancement via contrast optimization and color correction. Sig Process 153:210–220
30. Silpa K, Mastani S (2012) Comparison of image quality metrics. Int J Eng Res 3(8):1–5
31. Martens JB, Meesters L (1998) Image dissimilarity. Sig Process 70(3):155–176
32. Zhang L, Zhang L, Mou X, Zhang D (2011) FSIM: a feature similarity index for image
quality assessment. IEEE Trans Image Process 20(8):2378–2386
33. Gupta S, Porwal R (2016) Appropriate contrast enhancement measures for brain and breast
cancer images. Int J Biomed Imaging 2016:4710842
34. Ghani ASA, Isa NAM (2015) Enhancement of low quality underwater image through
integrated global and local contrast correction. Appl Soft Comput J 37:332–344
35. Naxos G, Scotti F (2005) Automatic morphological analysis for acute leukemia identification
in peripheral blood microscope images In: IEEE international conference on computational
intelligence for measurement systems and applications, July 2005, pp 96–101
Signal Processing Technique for Pulse
Modulation (PM) Ground Penetrating
Radar (GPR) System Based on Phase
and Envelope Detector Technique

Che Ku Nor Azie Hailma Che Ku Melor, Ariffuddin Joret,


Asmarashid Ponniran, Muhammad Suhaimi Sulong, Rosli Omar,
and Maryanti Razali

Abstract Ground Penetrating Radar (GPR) system is a system used to detect and
locate underground embedded objects which are based on the principles of
RADAR. This system use reflection of electromagnetic wave technique, which will
be generated towards the ground and detected back using antenna. This paper
focuses on the analysis of signal processing of Pulse Modulation (PM) GPR system
based on phase and envelope detector (ED) technique to detect and estimate the
depth of embedded object in 3 dimension model of GPR system simulation
designed using CST Studio Suite software. The antenna used in the simulation
model is a Dipole antenna which operates at 70 MHz until 80 MHz. The back-
ground model used is a rectangular object with dry sandy soil material while the
embedded object chosen is a rectangular iron object. Based on the output signal
calculated by the CST software, the signal will be exported to be processed using
MATLAB to produce GPR radargram. The simulation results show that by
applying the proposed signal processing technique based on the phase of the GPR output signal, the embedded object can be seen clearly and its depth estimated at about 900 and 1000 mm. By using the ED technique, the embedded object can be detected at all depths but its depth is hard to estimate.

C. K. N. A. H. C. K. Melor · A. Joret (&) · A. Ponniran · R. Omar · M. Razali


Faculty of Electrical and Electronic Engineering, Universiti Tun Hussein Onn Malaysia,
Parit Raja, Malaysia
e-mail: [email protected]
M. S. Sulong
Faculty of Technical and Vocational Education, Universiti Tun Hussein Onn Malaysia,
Parit Raja, Malaysia
A. Joret · M. S. Sulong
Internet of Things (IOT) Focus Group, Universiti Tun Hussein Onn Malaysia, Parit Raja,
Malaysia
A. Ponniran
Power Electronic Converters (PECs) Focus Group, Universiti Tun Hussein Onn Malaysia,
Parit Raja, Malaysia

© Springer Nature Singapore Pte Ltd. 2021


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_46

Keywords Ground penetrating radar · Dipole antenna · Pulse Modulation GPR

1 Introduction

GPR system can be used to detect and localize target objects in soil [1]. GPR has
been extensively used in engineering, geological exploration and other fields
recently [2, 3]. The concept used by the GPR system to detect embedded objects is the scattering principle of electromagnetic waves [4]. The GPR system can be classified into two groups according to its operating domain, namely time domain
and frequency domain. The time domain or impulse GPR transmits discrete pulses
of nanosecond duration and digitizes the returns at GHz sample rates, while the
GPR system of frequency domain emits electromagnetic wave signal with a variable
frequency called chirp signal [5].
Another time domain GPR system, which radiates electromagnetic waves in Gaussian pulse form, is known as Pulse Modulation GPR (PM GPR) [6–11]. This GPR system uses the amplitude modulation (AM) technique, which involves two types of signal known as the information signal and the carrier signal. The carrier signal used in this GPR system is a high frequency sinusoidal signal while the information signal is a low frequency Gaussian pulse. From the AM signal, the information signal can be retrieved using the envelope detector technique [12].

1.1 Amplitude Modulation (AM)

In AM signal, the amplitude of carrier signal is varied according to the instanta-


neous amplitude of the information signal m(t). The AM signal s(t) can be shown
as:

$$\text{Carrier signal: } A_c \cos(2\pi f_c t) \qquad (1)$$

$$\text{Modulating information signal: } m(t) \qquad (2)$$

$$\text{AM signal: } s(t) = A_c\,[1 + m(t)]\cos(2\pi f_c t) \qquad (3)$$
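To make Eqs. (1)–(3) concrete, the following MATLAB sketch builds an AM signal from a Gaussian-pulse information signal and a 75 MHz carrier; the sampling rate, pulse centre and pulse width are assumptions chosen only for illustration.

```matlab
% Sketch of the PM GPR excitation: a Gaussian pulse m(t) amplitude-modulating a carrier.
fs    = 1e9;                                  % assumed sampling frequency (1 GHz)
fc    = 75e6;                                 % carrier frequency within the Dipole antenna band
t     = (0:1/fs:200e-9)';                     % 200 ns time window
t0    = 100e-9;  sigma = 10e-9;               % assumed pulse centre and width
Ac    = 1;                                    % carrier amplitude
m     = exp(-(t - t0).^2 / (2 * sigma^2));    % Gaussian information signal m(t)
s     = Ac * (1 + m) .* cos(2*pi*fc*t);       % AM signal s(t), Eq. (3)
plot(t, s); xlabel('Time (s)'); ylabel('Amplitude');
```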

1.2 Fast Fourier Transform (FFT)

The Fast Fourier Transform (FFT) has been widely used for real time measurement due to its higher computational efficiency and its ability to produce high precision received signal levels for a large class of signal processes [13]. The FFT was developed as an efficient method of computing the Fourier transform. In this method, the complete time history of the Fourier transform values at all points of interest is stored and the exponential components are computed iteratively [14]. The FFT, which performs Fourier analysis, converts a signal from the time domain to a representation in the frequency domain [15].
To convert the AM signal from the analogue time domain s(t) into the frequency domain S(k), the time domain signal is first transformed into a discrete time signal as:

$$s(n) = A_c\,[1 + m(n t_s)]\cos(2\pi f_c n t_s) \qquad (4)$$

where t is time, ts is the sampling period, n is the sample index and fc is the carrier frequency. By using the Discrete Fourier Transform (DFT), the discrete frequency domain representation S(k) of the AM signal is:

$$S(k) = \sum_{n=0}^{N-1} s(n)\, e^{-j 2\pi n k / N}, \qquad k = 0, 1, 2, \ldots, N-1 \qquad (5)$$

$$S(k) = \sum_{n=0}^{N-1} A_c\,[1 + m(n t_s)]\cos(2\pi f_c n t_s)\, e^{-j 2\pi n k / N} \qquad (6)$$

Based on Eqs. (5) and (6), the DFT describes N sets of equations and thus requires N² multiplications for its computation. The computationally efficient algorithm for obtaining the DFT, the FFT, requires the number of samples N to be a power of 2 and computes the DFT using only N log₂ N multiplications.
In this study, the antenna signal of the PM GPR system is processed using the FFT technique based on the phase value. The signal is converted into the frequency domain and arranged in columns, one per scanning point of the simulation, to generate the GPR radargram. Figure 1 shows the block diagram of the phase-based signal processing technique of this study. Based on the flow chart shown in Fig. 2, the phase difference between the output and input signals is calculated to produce the PM GPR system radargram.

Fig. 1 PM GPR system using phase technique



Fig. 2 PM GPR radargram processing technique based on phase calculation
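A minimal MATLAB sketch of the phase-based processing in Figs. 1 and 2 is given below: at each scan point the FFT of the input and output antenna signals is taken, the phase difference is computed, and the resulting columns are stacked into the radargram. The matrix names, the phase unwrapping and the plotting calls are assumptions and not the exact script used with the CST output.

```matlab
% Phase-based radargram: one column of phase differences per scanning point.
% inSig and outSig are assumed to be N-by-P matrices (N time samples, P scan points)
% exported from the CST Studio Suite simulation.
[N, P]    = size(outSig);
radargram = zeros(N, P);
for p = 1:P
    Sin  = fft(inSig(:, p));                   % spectrum of the excitation (input) signal
    Sout = fft(outSig(:, p));                  % spectrum of the antenna output signal
    dphi = angle(Sout) - angle(Sin);           % phase difference per frequency sample
    radargram(:, p) = unwrap(dphi);            % column p of the phase radargram
end
imagesc(radargram); colormap(gray);
xlabel('Scan point'); ylabel('Frequency sample');
```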

1.3 Envelope Detector

One of the techniques used to retrieve the information signal from an AM signal is the Envelope Detector (ED) technique [1, 16]. There are three types of ED technique for demodulating an AM signal: Asynchronous Full Wave (AFW), Asynchronous Half Wave (AHW) and Asynchronous Real Square Law (ARSL). In this study, the AHW type of ED technique was used to detect the information signal, which is the pulse signal, from the antenna output signal of the PM GPR system.
Figure 3 shows the block diagram of the AHW type of ED technique and the
processing algorithm flowchart is illustrated in Fig. 4.

Fig. 3 AHW envelope detector technique

Fig. 4 PM GPR radargram processing technique based on AHW ED
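The AHW envelope detector of Fig. 3 can be sketched in MATLAB as a half-wave rectification followed by low-pass smoothing; the moving-average window of one carrier period is an assumption, and a dedicated low-pass filter design could be used instead.

```matlab
% Asynchronous half-wave (AHW) envelope detection of one received AM trace.
% s is the antenna output signal (column vector), fs the sampling frequency and
% fc the carrier frequency; all are assumed to be defined as in the earlier AM sketch.
rectified = max(s, 0);                                   % half-wave rectification
win       = max(1, round(fs / fc));                      % smoothing window: ~one carrier period
envelope  = filter(ones(win, 1) / win, 1, rectified);    % moving-average low-pass filter
plot(envelope);                                          % recovered pulse (information signal)
```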

2 GPR System Simulation Using CST

The Dipole antenna used in this simulation operates from 70 to 80 MHz and is a very common practical wire antenna, as shown in Fig. 5. The material used for this antenna is copper. There is a gap (G) between the two arms of the antenna for feeding purposes. The detailed parameters of this antenna are described in Table 1.
Figure 6 provides an overview of the 3-dimensional GPR system simulation designed in this study, obtained by adding models of the ground and the embedded object to the Dipole antenna design in the CST software. The ground object model in this simulation is a rectangular block of dry sandy soil material measuring 3000 mm in length, 3000 mm in width and 2000 mm in height.

Fig. 5 Dipole antenna design in CST software

Table 1 Design parameters of the Dipole antenna

Parameter                  Value   Unit
Operating frequency        70–80   MHz
Length of the dipole (L)   1500    mm
Radius of the dipole (R)   50      mm
Gap (G)                    200     mm

Fig. 6 GPR system simulation model using CST software

Meanwhile, the embedded object model has been set up as iron, and its size is 800 mm in length, 800 mm in width and 400 mm in height.
In this study, the GPR system simulation was designed in order to determine the capability of the GPR system in detecting and estimating the depth of an embedded iron object in dry sandy soil. The Dipole antenna in this simulation has been positioned 5 mm above the surface and placed in the middle of the ground object.

3 Result and Discussion

In order to produce a radargram of the GPR system, the output signal of the simulation generated in CST Studio Suite software has to be exported to MATLAB. In MATLAB, this output signal, calculated by CST Studio Suite using the Finite Difference Time Domain (FDTD) technique, is demodulated using the ED technique in order to obtain the pulse of the output signal and then produce a 2D output image of the GPR system known as the GPR radargram. In addition, the output signal is also converted into the frequency domain to retrieve its phase value and produce the GPR radargram.

3.1 Design of Dipole Antenna

The Dipole antenna of the GPR system simulation has been designed to be resonant at 75 MHz. Figure 7 shows the S11 result of the designed antenna, which is less than −10 dB over the frequency range from 70 MHz to 80 MHz.

Fig. 7 S11 result of Dipole antenna



Table 2 Summary of the analysis result for magnitude and phase technique of the PM GPR
system in detection and estimation depth of embedded iron object

Depth (mm): 10, 100, 500, 900, 1000
[Radargram images at each depth, shown for the ED technique and for the phase calculation technique]

3.2 Simulation Result of GPR System Using GPR Radargram

Several simulations were conducted with the embedded object at depths of 0, 10, 100, 500, 900 and 1000 mm. Referring to Table 2, based on the ED technique, the position of the iron object at depths from 0 mm to 1000 mm in dry sandy soil can be identified easily. Unfortunately, its depth cannot be estimated clearly, as the reflections from the embedded object appear at around 3000 to 4000 time samples in these radargrams.
Meanwhile, based on the GPR radargrams produced using the phase calculation technique, the position of the iron object at depths of 0, 10, 100 and 500 mm in dry sandy soil is not detected; however, the embedded iron object can be detected at depths of 900 mm and 1000 mm, with estimated depths at about frequency samples 4 and 5, respectively.

4 Conclusion

The processing of the output signal of the Dipole antenna in the GPR system simulation using CST Studio Suite has been performed and shows good results in detecting and estimating the embedded iron object in the dry sandy soil area. The simulation of the PM GPR system using the ED technique shows that all GPR radargrams can clearly show the detected embedded iron object at depths of 0, 10, 100, 500, 900 and

1000 mm. This result indicates that the performance of the ED as an algorithm for signal processing of the antenna output signal can smoothen the GPR radargram. With the phase calculation technique, the GPR radargram produced cannot detect an embedded object at depths of 0, 10, 100 and 500 mm. However, using this technique, the depth of the detected embedded object at 900 and 1000 mm can be estimated correctly. Further investigation will focus on the capability of the PM GPR system in detecting and estimating the depth of a variety of embedded objects, such as wood and water, in dry sandy soil using other antennas.

References

1. Joret A, Sulong MS, Abdullah MFL, Madun A, Dahlan SH (2018) Design and simulation of
horn antenna using CST software for GPR system. IOP Conf Ser J Phys 995:012080 ISSN
2600–7495
2. Sokolov KO, Prudetckii ND, Fedorova LL, Savvin DV (2018) GPR investigation of ice-filled
cracks in loose deposits. In: 17th international conference on ground penetrating radar (GPR)
3. Kulyandin GA, Fedorova LL, Savvin DV, Prudetskii ND (2016) GPR mapping of bedrock of
alluvial gold deposits in permafrost. In: Proceedings of the 2016 16th international conference
on ground penetrating radar (GPR), Hong Kong, China, pp 1–4
4. Oskooi B, Julayusefi M, Goudarzi A (2014) GPR noise reduction based on wavelet
thresholdings. Arab J Geosci 8(5):2937–2951
5. Joret A, Abdullah MFL, Dahlan SH, Madun A, Sulong MS (2016) Development of ground
penetrating radar hybrid system using Vivaldi antenna for buried object detection. Int J Electr
Eng Appl Sci IJEEAS 1(1) ISSN: 2600-7495
6. Warren C, Giannopoulos A (2016) Experimental and modeled performance of a ground
penetrating radar antenna in lossy dielectric. IEEE J Sel Topics Appl Earth Obs Remote Sens
9(1):29–36
7. Nishimoto M, Yoshida D, Ogata K, Tanabe M (2012) Target response extraction from
measured GPR data. In: International symposium on antennas and propagation (ISAP). IEEE,
pp 427–430
8. Seyfried D, Schoebel J (2015) Stepped-frequency radar signal processing. J Appl Geophys
112:42–51
9. Li L, Tan AEC, Jhamb K, Rambabu K (2012) Buried object characterization using
ultra-wideband ground penetrating radar. IEEE Trans Microw Theory Tech 60(8):2654–2664
10. Gurbuz AC, McClellan JH, Scott WR (2012) Compressive sensing of underground structures
using GPR. Digit Signal Proc 22(1):66–73
11. Qiao L, Qin Y, Ren X, Wang Q (2015) Identification of buried objects in GPR using
amplitude modulated signals extracted from multiresolution monogenic signal analysis.
Sensors 15(12):30340–30350
12. Joret A (2018) Modulation technique for GPR system radargram. PhD thesis, Universiti Tun
Hussein Onn Malaysia
13. Tan L, Jiang J (2013) Discrete Fourier transform and signal spectrum in digital signal
processing, 2nd edn., pp 87–136
14. Tan L, Jiang J (2013) Introduction to digital signal processing in digital signal processing, 2nd
edn., pp 1–13
15. Sidney CB Fast Fourier Transforms. http://cnx.org/content/col10550/1.22/. Accessed 03 Nov
2019
16. Chaparro L (2015) Fourier analysis in communications and filtering in signals and systems
using MATLAB, 2nd edn., pp 449–490
Evaluation of Leap Motion Controller
Usability in Development of Hand
Gesture Recognition for Hemiplegia
Patients

Wan Norliyana Wan Azlan, Wan Nurshazwani Wan Zakaria, Nurmiza Othman, Mohd Norzali Haji Mohd, and Muhammad Nurfirdaus Abd Ghani

Abstract A hand gesture recognition system is developed for hemiplegia patients to undergo rehabilitation which can encourage patients' motor function. The Leap Motion controller has been studied to detect human hand motion for the development of hand gesture controlled robotic arms. It was shown that the Leap Motion sensor is useful for obtaining the coordinate position and orientation of each human finger, palm and wrist movement. A set of test programs has been designed using a healthy hand to investigate the accuracy and reliability of the sensor. The test results show the effectiveness of the device in the recognition of human hand gestures, with a high accuracy rate of 100% for opening and closing of the hand, 97.61% for whole hand tapping, and 99.6% for right movement and 98.71% for left movement of whole hand lateral rotation.

Keywords Hand gesture recognition · Leap Motion Sensor · Hemiplegia · Data acquisition

1 Introduction

1.1 Hand Gesture Recognition

Hand gesture recognition has attracted a growing interest due to its applications in
many different fields, such as human-computer interaction, robotics, computer
gaming, automatic sign-language interpretation. Gesture-based remote operation is
potentially one of the most effective means of communication with a robotic hand

W. N. Wan Azlan · W. N. Wan Zakaria (&) · N. Othman · M. N. Haji Mohd · M. N. Abd Ghani
Faculty of Electrical and Electronic Engineering, Universiti Tun Hussein Onn Malaysia, Batu Pahat, Johor, Malaysia
e-mail: [email protected]


as it is one of the most effective and intuitive means of human communication [1,
2]. Nowadays, a large variety of robots are used in various industries. Controlling is
highlighted as one of the main points for electro-mechanical designing of robot
manipulators [3].
However, cameras and vision-based control have existed for a long time, and developing robust gesture recognition solutions remains a challenging task. The introduction of low-cost devices such as Microsoft's Kinect has made it possible to exploit the depth information acquired by these devices to achieve proper gesture recognition in a low-cost and user-friendly manner. Recently, the Leap Motion has been introduced as an inexpensive choice with proper accuracy for detecting human hand motions [4, 5]. By using the Leap Motion Controller (LMC), a technique for controlling robot manipulators in the slave side of a bilateral system can be developed [3].
The Leap Motion is a motion sensor which can detect three-dimensional hand gestures in the air. It provides complete information about the hands to help track hand movements and gestures through an API (Application Programming Interface). An overview of the real-time hand gesture recognition process using this device is given in [6], along with insight into existing machine learning models and the complexities of the device and the models.

1.2 Hemiplegia

Hemiplegia is a type of paralysis that affects one side of the body, for example, only an arm and a leg on the left or the right side of the body [7]. Hemiparesis is a milder condition described as a loss of strength or weakness and mobility on one side of the body. Unlike a person with hemiplegia, who experiences full paralysis on one side of the body, a person with hemiparesis might be unable to make movements using their arm or leg, or may feel a tingling or other unusual sensations on just one side of the body [8].
However, some people develop hemiplegia after experiencing hemiparesis, while others may alternately experience both conditions, hemiparesis and hemiplegia, over a period of time. While paralysis is an extreme form of weakness and nerve dysfunction, hemiplegia is an extreme form of hemiparesis [8].
Hemiplegia can be divided into two categories: congenital and acquired hemiplegia. Congenital hemiplegia occurs when the brain is damaged before, during or shortly after birth. The damage happens when there is a bleed or a blood clot which damages the brain. Hemiplegia occurs in about 1 in 1000 children in the world. Acquired hemiplegia, on the other hand, happens when a person experiences a stroke, accident, infection or tumor in adulthood [9].

2 Methodology

2.1 Leap Motion Sensor

Recent studies have shown that Leap Motion controller is the best device to be used
to detect and track movements of the human fingers and hand gestures. This paper
demonstrates the usage of Leap Motion device as the first step in detecting the
movement of the subject’s hand.
The Leap Motion is capable of recognizing hands, fingers and arms. The tracking mechanism is very precise as it detects discrete positions and movements. Unlike other tracking devices such as the Microsoft Kinect, which tracks using a depth camera, this device operates based on optical sensors and infrared light.
The Leap Motion is able to detect motion in 3D because its tracking system imitates the human pair of eyes. Despite its surface area of 24 cm2, it has two infrared (IR) cameras and three IR light emitters. The range of detection is roughly 2.5 to 61 cm from the device. The device's tracking system is at its optimum when it has a clear, high-contrast view of an object's silhouette, which it needs in order to identify and track the object.
Leap Motion system uses a right-handed Cartesian coordinate system with the
origin that is placed at the top surface of the Leap Motion device as illustrated in
Fig. 1. The x-axis runs along horizontally and is parallel to the longer edge of the
device. The y-axis is vertical with increasing positive values as the direction goes
upwards. The z-axis runs horizontal as positive values are increasing towards the
user.
When the Leap Motion tracks hands and fingers in its field of view, a set of data known as a frame is provided. The hand model gives data about the identity (right or left hand), position, orientation (roll, pitch and yaw), grab strength, pinch strength and other characteristics of the detected hand.
If parts of the hand are not visible to the Leap Motion's cameras, its software predicts them using an internal model of a human hand. All five fingers of the human hand will be displayed even if the fingers are not clearly captured.
However, finger occlusions, fingers that are blocked by the palm, sleeves of a shirt, a watch on the wrist and jewelry on the hands and fingers may result in inaccurate tracking or false detection. Other than that, data tracking could also be

Fig. 1 The Leap Motion right-handed coordinate system [10]

lost when the palm is not perpendicular to the Leap Motion's camera. On the other hand, a human foot can also be detected by the Leap Motion, as it is a hand-like object.
The human arm is detected as a bone-like object that gives the direction, length, width and end points of the arm. When the elbow is not in the field of view, the Leap Motion controller estimates its position based on past observations as well as typical human proportions [10].

2.2 Data Acquisition System

The control architecture consists of a Leap Motion device and an open-source integrated development environment (IDE) called Processing. The Leap Motion detects hand gestures made by the human hand and transmits the acquired data to the Processing IDE. Processing has its own coding language within the visual arts context.
In the Processing IDE, the hand orientation data in degrees (roll, pitch and yaw) were extracted and written to a .txt file. The .txt file was then imported into Microsoft Excel in order to visualize the acquired data and subsequently for further data analysis.
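The sketch below shows one possible way to read such a log back for analysis, assuming the exported .txt file holds one comma-separated roll, pitch, yaw triple per line; the actual logging format used in the Processing sketch is not specified in the paper.

```python
import csv

def load_orientation(path):
    """Read roll, pitch, yaw samples (degrees) from an exported text log,
    assuming one comma-separated 'roll,pitch,yaw' triple per line."""
    roll, pitch, yaw = [], [], []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if len(row) < 3:          # skip blank or malformed lines
                continue
            r, p, y = (float(v) for v in row[:3])
            roll.append(r); pitch.append(p); yaw.append(y)
    return roll, pitch, yaw

# roll, pitch, yaw = load_orientation("hand_orientation.txt")  # hypothetical file name
```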

2.3 Test Procedure

Motor training rehabilitation was carried out in order to test the efficiency of the Leap Motion sensor for this study. A 6-min training session was designed, comprising three hand gesture activities: (1) opening and closing of the hand, (2) whole hand tapping and (3) whole hand lateral rotation [11], as shown in Table 1.
According to [12], only the hand is trained instead of the whole arm because motor impairments usually affect the distal more than the proximal limb region, resulting in reduced motor abilities and cortical representation of the subject.
The tests were designed to improve motor functions of the hemiplegic hand. The subject will hover their healthy hand over the Leap Motion sensor, making sure that the palm is perpendicular to the Leap Motion's camera and that the hand is in the field of view of the sensor. The subject will then move their

Table 1 Hand movements during 6 min of motor training

Hand gestures                    Duration
Opening and closing of hand      2 min
Whole hand tapping               2 min
Whole hand lateral rotation      2 min

Fig. 2 Opening and closing of hand

Fig. 3 Whole hand tapping

healthy hand according to the hand gestures listed in Table 1 subsequently within
the total period of 6 min. Illustrations of the described activities are as shown in
Figs. 2, 3 and 4.

3 Results and Discussions

Subsections 3.1, 3.2 and 3.3 present the results and the calculations of the mean, error percentage and accuracy percentage that demonstrate the usability of the Leap Motion Sensor for the three test procedures described in Sect. 2.3.

Fig. 4 Whole hand lateral rotation

3.1 Opening and Closing of Hand

According to [13], the word ‘grasping’ means a firm hold or grip, while ‘release’
means to allow something to move, act or flow freely [14]. In this paper, the action
of closing the hand is identified as ‘grasping’ while opening of the hand is known as
‘releasing’. A grab strength test has been conducted in order to identify the grasping
and releasing behavior detected by the Leap Motion Sensor.
Figure 5 shows the graph of grab strength test as conducted in Fig. 2. The value
of grab strength parameter ranges from 0 to 1. Initially, the hand is opened which is

[Plot: Grab Strength Test, grab strength (0 to 1) versus time (s)]

Fig. 5 Grab strength value for opening and closing of hand

parallel to the Leap Motion Sensor's z-axis. At t = 20 s, the hand is closed for 5 s and then opened again, and this is repeated at intervals of 20 s. The test was carried out over 2 min.
Grab strength is 0 when the hand is opened or 'released' because the Leap Motion Sensor does not detect the fingers being drawn towards the palm, while grab strength is 1 when the hand is closed or 'grasping' because the Leap Motion Sensor detects that the fingers are close to the palm.
It can be concluded that the accuracy of the Leap Motion Sensor in detecting the grab strength is 100%.
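For illustration only, a simple threshold on the logged grab-strength values is enough to label the two states; the 0.5 threshold below is an assumed value, not one taken from the paper.

```python
def grasp_release_labels(grab_strength, threshold=0.5):
    """Map Leap Motion grab-strength samples (0..1) to 'grasp'/'release' labels."""
    return ["grasp" if g >= threshold else "release" for g in grab_strength]

print(grasp_release_labels([0.0, 0.1, 0.97, 1.0, 0.05]))
# ['release', 'release', 'grasp', 'grasp', 'release']
```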

3.2 Whole Hand Tapping

A whole hand tapping gesture was conducted to test the detection of angular motion
of the hand by the Leap Motion Sensor. Whole hand tapping means that all five
fingers including the palm move simultaneously. A test was carried out as shown in
Fig. 3.
The graph in Fig. 6 shows the three rotational parameters of orientation, roll, pitch and yaw, while the whole hand tapping gesture test was carried out. Roll, pitch and yaw are the angular motions about the x-axis, y-axis and z-axis, respectively, in degrees (°). Based on this graph, it is clear that pitch has the most obvious change in angle compared to roll and yaw, because the gesture moves about the rotational y-axis.
Since the change in pitch was significant, the pitch signal was extracted when the whole hand tapping movement was made. Figure 7 shows the graph of pitch against time. The hand is in a resting position from t = 0 s until t = 19 s. At t = 20 s, the hand tapped from a 0° to a −45° position.

[Plot: Hand Orientation Test, roll, pitch and yaw (°) versus time (s)]

Fig. 6 Orientation of the hand during whole hand tapping

[Plot: Pitch Test, pitch (°) versus time (s), with tap minima between about −40° and −49°]

Fig. 7 Pitch value for whole hand tapping

The pitch drops to a negative value and rises back to a positive value when the hand returns to its initial position. The process repeats every 20 s for up to 2 min. Based on the graph, the tapping result varied in the range of −40° to −49°.
Table 2 shows the pitch angle recorded each time the hand taps, every 20 s. The mean angle calculated over the 6 samples was −46.07°. Therefore, the error captured by the sensor was 1.07°, which corresponds to an error of 2.39% (a short calculation sketch is given after Table 2). In conclusion, the accuracy of the pitch angle test result was 97.61% for the whole hand tapping gesture.

Table 2 Pitch angle test result

Time (s)    Pitch angle (°)
20          −48.17
40          −40.05
60          −44.77
80          −46.15
100         −49.34
120         −47.97
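The mean, error and accuracy figures quoted above can be reproduced from the Table 2 samples as sketched below, assuming the intended tap angle is −45°, as implied by the text.

```python
pitch_samples = [-48.17, -40.05, -44.77, -46.15, -49.34, -47.97]   # values from Table 2
target_angle = -45.0                                               # intended tap angle (assumed)

mean_pitch = sum(pitch_samples) / len(pitch_samples)               # ~ -46.07 deg
error = abs(mean_pitch - target_angle)                             # ~ 1.07 deg
error_pct = 100.0 * error / abs(target_angle)                      # ~ 2.39 %
accuracy_pct = 100.0 - error_pct                                   # ~ 97.61 %
print(round(mean_pitch, 2), round(error, 2), round(error_pct, 2), round(accuracy_pct, 2))
```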

3.3 Whole Hand Lateral Rotation

The whole hand lateral rotation gesture is the movement of the palm from side to side without moving the wrist. The test shown in Fig. 4 was conducted to investigate the gesture behavior captured by the Leap Motion Sensor.
Figure 8 compares the rotational parameters of orientation and shows that yaw, the rotation about the z-axis, has the most noticeable change in palm angle compared to roll and pitch for this hand movement.
The yaw signal is emphasized in the graph of Fig. 9 because of its significant change of angle when the hand moves in a lateral rotation. The left hand was used in this experiment. Every movement in this experiment has an interval of 10 s, and a total of 2 min was taken to complete this activity.
Initially, the hand was at the 0° position. Then, the palm was moved horizontally to its maximum right without moving the wrist; the manually measured maximum positive angle was 20°. The palm then returned to its resting position at 0°. After that, the palm was moved to its foremost left, still keeping the wrist static; the manually measured negative angle was −40°. Subsequently, the palm moved back to the resting position and the process repeated for up to 2 min.

[Plot: Hand Orientation Test, roll, pitch and yaw (°) versus time (s)]

Fig. 8 Orientation of the hand during whole hand lateral rotation



Fig. 9 Yaw value for whole hand lateral rotation

Table 3 Pitch angle test calculation

Pitch angle calculation    Value
Mean                       −46.07°
Error                      1.07°
Error percentage           2.39%
Accuracy percentage        97.61%

According to the graph in Fig. 9, the captured right-side movements were in the range of 19° to 22°, while the left-side movements were in the range of −38° to −43°.
Table 4 shows the yaw angle recorded as the hand moves from side to side every 10 s. The mean angle over each 10 s interval was calculated for every movement.
The total mean yaw angle calculated for the right direction movement was 19.92°, which leads to an error of 0.08°. As a result, the percentage error and the accuracy for the right direction movement test were 0.40% and 99.60%, respectively.

Table 4 Yaw angle test result

Towards right direction              Towards left direction
Time (s)    Mean yaw angle (°)       Time (s)     Mean yaw angle (°)
10–20       19.22                    30–40        −41.31
50–60       19.93                    70–80        −41.57
90–100      20.61                    110–120      −38.67

Table 5 Yaw angle test calculation

                 Towards right direction    Towards left direction
Total mean (°)   19.92                      −40.52
Error (°)        0.08                       0.52
Error %          0.40                       1.30
Accuracy %       99.60                      98.71

On the other hand, the total mean yaw angle calculated for the left direction movement was −40.52°, which results in an error of 0.52°. In conclusion, the percentage error and the accuracy for the left direction movement test were 1.29% and 98.71%, respectively.

4 Conclusion

This paper discusses the usability and efficiency of the Leap Motion sensor for hand gesture recognition for hemiplegia patients. Several rehabilitation hand movements were tested to investigate the accuracy of the Leap Motion sensor in detecting hand gestures. It was found that there were slight errors between the actual measurements of the hand gestures and the measurements made by the Leap Motion sensor. In addition, the inaccuracies might be due to a mild hand tremor, which is a normal occurrence for every person but varies between individuals.
Three different hand gesture tests were carried out: opening and closing of the hand, whole hand tapping and whole hand lateral rotation. The accuracy of the Leap Motion in detecting the hand gestures was 100% for opening and closing of the hand, 97.61% for whole hand tapping, and 99.6% for right movement and 98.71% for left movement of whole hand lateral rotation.
The present study establishes a quantitative framework for detecting human hand gestures, particularly for further use in the development of robot-based controlled rehabilitation devices. For future work, the usability of human hand motion recognition to control a robot arm as a prosthetic hand will be studied further for the rehabilitation of hemiplegia patients.

Acknowledgements The authors are grateful for Universiti Tun Hussein Onn Malaysia (UTHM)
for supporting this research work under Postgraduate Research Grant (GPPS) Vot H409.

References

1. Cheng H, Yang L, Liu Z (2015) A survey on 3D hand gesture recognition. IEEE Trans Circ
Syst Video Technol 1–14
2. Wachs JP, Kölsch M, Stern H, Edan Y (2011) Vision-based applications. Commun ACM 54
(2):60–71
3. Gunawardane H, Medagedara N, Madhusanka A (2015) Control of robot arm based on hand
gestures using leap motion sensor technology. Int J Robot Mechatronics 2(1):7–14
4. Ren Z, Yuan J, Meng J, Zhang Z (2013) Robust part-based hand gesture recognition using
kinect sensor. IEEE Trans Multimed 15(5):1110–1120
5. Leap Motion. https://www.leapmotion.com/. Accessed 20 July 2019
6. Panduranga HT, Mani C (2018) Dynamic hand gesture recognition system: a short survey. In:
International conference on inventive research in computing applications (ICIRCA), pp 689–
694

7. Hemiplegia, SpinalCord.com, Spinalcord.com. https://www.spinalcord.com/hemiplegia.


Accessed 10 June 2019
8. What is the difference between hemiplegia and hemiparesis. Spinalcord.com. https://www.
spinalcord.com/blog/what-is-the-difference-between-hemiplegia-and-hemiparesis. Accessed
10 June 2019
9. What is Hemiplegia, Epilepsy Society. https://www.epilepsysociety.org.uk/what-
hemiplegia#.XP44JYgzbIU. Accessed 10 June 2019
10. API Overview - Leap Motion C++ SDK v3.2 Beta documentation. https://developer-archive.
leapmotion.com/documentation/cpp/devguide/Leap_Overview.html. Accessed 20 July 2019
11. Tosi G, Romano D, Maravita A (2018) Mirror box training in hemiplegic stroke patients
affects body representation. Journal 11(617):1–10
12. Dohle C, Püllen J, Nakaten A, Küst J, Rietz C, Karbe H (2009) Mirror therapy promotes
recovery from severe hemiparesis: a randomized controlled trial. Journal 23(3):209–217
13. Lexico Dictionaries, English (2019) Grasp Definition of Grasp by Lexico. https://www.lexico.
com/en/definition/grasp. Accessed 12 Nov 2019
14. Lexico Dictionaries, English (2019) Release Definition of Release by Lexico. https://www.
lexico.com/en/definition/release. Accessed 12 Nov 2019
Using Convolution Neural Networks
Pattern for Classification of Motor
Imagery in BCI System

Sepideh Zolfaghari, Tohid Yousefi Rezaii, Saeed Meshgini, and Ali Farzamnia

Abstract Electroencephalography (EEG) based brain-computer interfaces (BCI) enable humans to control external devices by extracting informative features from brain signals and converting these features into control commands. Deep learning methods have become advanced classification algorithms used in various applications. In this paper, the informative features of EEG signals are obtained using the filter-bank common spatial pattern (FBCSP); the features selected using the mutual information method are then fed to the classifiers as input. Convolution neural network (CNN), Naive Bayesian (NB), multiple support vector machines (SVM) and linear discriminant analysis (LDA) algorithms are used to classify EEG signals into left and right hand motor imagery (MI) across nine subjects. Our framework has been tested on the BCI competition IV-2a 4-class dataset. The results show that the CNN classifier yielded the best average classification accuracy, 99.77%, compared to the other classification methods. The experimental results indicate that our proposed method can enable more refined control in BCI applications such as controlling robot arm movement.


Keywords Electroencephalography (EEG) · Brain-computer interface (BCI) · Motor imagery (MI) · Filter-bank common spatial pattern (FBCSP) · Convolution neural network (CNN)

1 Introduction

Brain-computer interface (BCI) is an important research field that provides communication between the brain of a subject with motor impairment and external devices without intervention of the peripheral nervous system [1]. BCI systems have been used for

S. Zolfaghari · T. Y. Rezaii · S. Meshgini
Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran
A. Farzamnia (&)
Faculty of Engineering, Universiti Malaysia Sabah (UMS), Kota Kinabalu, Sabah, Malaysia
e-mail: [email protected]


various purposes such as rehabilitation [2, 3], control [4], games [5], authentication
[6] and so on. Based on the type of electrodes used for obtaining brain signals, BCI systems can be invasive or non-invasive. Electroencephalography (EEG) is a non-invasive method that records the electrical activity of the brain from a user's scalp [7]. Different mental states lead to different types of brain signals. Interfaces based on P300 [8, 9], event related desynchronization and synchronization (ERD/ERS), steady-state visually evoked potential (SSVEP) [10], error-related potential (ErrP) [11] and movement related cortical potential (MRCP) [12] are the most widely used. ERD and ERS refer respectively to the decrease and increase of EEG signal power in the mu (8−12 Hz) and beta (16−24 Hz) frequency bands during movement execution or imagery [13].
BCI measures EEG signals associated with the user's intent and translates the recorded signals into control commands, so signal processing, pattern recognition and classification are very important in BCI research [14]. Various methods have been proposed to extract informative features and to classify the subject's mental states. The common feature extraction algorithms used to train classifiers are the common spatial pattern (CSP) [15], Filter Bank CSP (FBCSP) [16], wavelet transformation (WT) [17] and other methods. Also, support vector machine (SVM), linear discriminant analysis (LDA), Naive Bayesian (NB) and other algorithms have been used as classifiers. EEG signals obtained from an experiment on changes in the direction of actual hand movement are reported in [18]; the authors proposed a wavelet-CSP algorithm to extract features of the brain signals. In [19], to differentiate slow and fast execution of left/right hand movements, Welch-method-based power spectral density estimates were used to create the feature vectors, which were fed to SVM, NB, LDA and KNN classifiers; the NB classifier yielded the best accuracy among the classifiers.
Recently, people have begun to study deep learning (DL) approaches to extract
features and classify data. This method has been used in different fields of study
such as natural language processing [20], speech recognition [21] and currently in
computer vision applications [22]. For authentication systems using biometric methods, the authors in [23] used five different mental tasks from a standard EEG database. Features were obtained by the WT method with 10 and 5 decomposition levels and then classified with an artificial neural network (ANN) classifier; the visual counting task achieved better accuracy than the other mental tasks. A novel multi-objective flower pollination and WT algorithm was proposed in [24]; optimal WT parameters were obtained, and decision tree, neural network, SVM and NB were used as classifiers. Inspired by MI-based ERD/ERS, a 5-layer CNN architecture is proposed in [25] for feature extraction and hand motor imagery classification; the accuracy improved by 5%−10% compared with three conventional classification methods (power + SVM, CSP + SVM, AR + SVM). In [26], a wavelet transform time-frequency representation is used, a 2-layer CNN is built as a classifier, and convolution kernels of various sizes are validated.
In [27] a CNN classification method for the detection of P300 waves is suggested by Cecotti and Graser; this classifier obtained the highest accuracy of about 95.5% on the BCI competition data. The CSP algorithm

based deep neural network (DNN) is presented in [28]. In that paper, a four-layer neural network consisting of two hidden layers is employed to classify the MI signals. The results of several models are compared with each other, and it is observed that the CSP-DNN method is computationally efficient and reduces the overall maximum error.
In a more recent paper, Sakhavi et al. [29] presented a new classification method for MI tasks by introducing a temporal representation of the data as well as using a CNN architecture for classification. This representation is generated by modifying the FBCSP method.

2 Data

In this paper, the EEG data of 9 subjects from BCI competition IV-2a are used [30]. The dataset was recorded from 22 Ag/AgCl electrodes with a 250-Hz sampling rate. Two sessions were recorded for each subject and each session comprised 288 trials.
The timing plan, shown in Fig. 1, consists of a fixation cross as a sign of the subject's preparation, a cue in the form of an arrow corresponding to one of the four motor imagery classes (left hand, right hand, both feet and tongue) for 1.25 s, a motor imagery period of 3 s and a black screen for the subject to rest.
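For illustration only, the snippet below shows one way the 3-s motor imagery window could be cut from the continuous recording at the 250-Hz sampling rate; the array layout and the availability of cue-onset sample indices are assumptions, not details given in the paper.

```python
import numpy as np

FS = 250                        # sampling rate (Hz)
CUE_SAMPLES = int(1.25 * FS)    # cue duration: 1.25 s
MI_SAMPLES = int(3.0 * FS)      # motor-imagery window: 3 s

def extract_mi_epochs(eeg, cue_onsets):
    """eeg: (channels, samples) array; cue_onsets: cue-onset sample indices (assumed known)."""
    epochs = [eeg[:, s + CUE_SAMPLES: s + CUE_SAMPLES + MI_SAMPLES] for s in cue_onsets]
    return np.stack(epochs)     # (trials, channels, samples)
```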

3 Method

Section 3.1 summarizes the FBCSP algorithm and describes feature extraction from the EEG signal. Section 3.2 discusses the mutual information method used for feature selection. Section 3.3 describes the proposed CNN algorithm and the classification.
The Filter Bank Common Spatial Pattern (FBCSP) algorithm, the mutual information method and the use of different classifiers for 2-class motor imagery in the training and test phases are shown in Fig. 2.

Fig. 1 The timing plan of a visual cue



Fig. 2 The block diagram representation of our general scheme.

3.1 Filter Bank Common Spatial Pattern

The CSP algorithm is widely used to discriminate features between two classes. By applying spatial filters to the inputs, this method maximizes the variance of the signals in the first class and simultaneously minimizes the variance of the signals in the second class [31].
The CSP filter is sensitive to noisy data and operates on the full frequency band. Therefore, the FBCSP method addresses these problems by passing the EEG signal through a filter bank [16]. The feature extraction steps are described as follows:
1. The EEG signals consist of hand motor imageries are filtered by a filter bank
that has nine band-pass filters (4−8 Hz, 8−12 Hz, …, 36−40 Hz). These
band-pass filters are Chebyshev type II filters.
2. Each output of the filters is split into training data and test data.
3. The CSP algorithm is performed on the training and test data in each of the sub-bands to compute the spatial filter. Then, the data matrix is projected onto this spatial filter as shown in (1).

   Z = WE    (1)

   where E ∈ R^(C×T) denotes the data matrix of EEG signals, W ∈ R^(C×C) denotes the spatial filter, and the spatially filtered signal which maximizes the discrimination in the variance of the two classes is denoted by Z ∈ R^(C×T). C is the number of channels and T is the number of samples per signal.

4. Feature vectors are obtained as in (2); an illustrative NumPy sketch of steps 3–4 follows this list.

   f_i = log( var(Z_i) / Σ_{j=1..2m} var(Z_j) )    (2)

   where Z_j ∈ R^(2m×T) represents the rows of Z corresponding to the largest and smallest eigenvalues. We set m = 2 in this paper.
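A minimal NumPy sketch of steps 3–4 is given below, assuming the spatial filter W for one sub-band has already been computed (e.g., from the generalized eigen-decomposition of the two class covariance matrices); only the filtering of Eq. (1) and the log-variance feature of Eq. (2) are shown.

```python
import numpy as np

def csp_features(E, W, m=2):
    """E: (C, T) band-passed trial; W: (C, C) CSP spatial filter; m filter pairs kept."""
    Z = W @ E                                                        # Eq. (1): spatially filtered signal
    rows = list(range(m)) + list(range(Z.shape[0] - m, Z.shape[0]))  # first and last m rows of Z
    v = np.var(Z[rows], axis=1)                                      # variance of the 2m selected rows
    return np.log(v / v.sum())                                       # Eq. (2): normalized log-variance features
```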

3.2 Mutual Information

One of the commonly used methods of feature selection is mutual information. This method selects an informative subset of an initial set of features by measuring the mutual dependence between random variables and the relevance of the various features [32]. The procedure is described as follows:
1. A set of feature vectors F = [f_1, f_2, …, f_{9×2m}] from Eq. (2), a set of selected features S = ∅, and the true label of each trial are initialized.
2. The mutual information of each feature with the class label is computed.
3. The features are sorted in descending order of the mutual information computed in the previous step.
4. The earliest K features are selected from the sorted features (an illustrative sketch of this selection follows this list).
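A hedged sketch of this ranking step is shown below using scikit-learn's mutual information estimator, which is an assumed tool rather than necessarily the one used by the authors.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def select_top_k(features, labels, k):
    """features: (trials, n_features) FBCSP feature matrix; labels: class of each trial."""
    mi = mutual_info_classif(features, labels)   # MI between each feature and the class label
    order = np.argsort(mi)[::-1]                 # descending mutual information
    return order[:k]                             # indices of the K most informative features

# selected = select_top_k(train_features, train_labels, k=10)  # hypothetical usage
```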

3.3 Convolution Neural Network

A convolution neural network is a class of neural networks that has proven very effective in image recognition and classification. This algorithm was first introduced by LeCun et al. [33] in the LeNet-5 architecture. The general architecture of a CNN has the following layers: an input layer, hidden layers and an output layer. The hidden layers consist of convolution layers, ReLU layers, pooling layers and fully connected layers. Convolution layers convolve the input to extract its features, which are subsampled to a smaller size in the pooling layer. The goal of max pooling and average pooling, which function as the sub-sampling layer, is to reduce the dimension of the data. The fully connected layer includes neurons that are connected to all previously obtained features and is used for flattening the features, and the output layer shows which class is identified.
The CNN model structure is shown in Fig. 3. The features of each subject are classified by the following procedure.

Fig. 3 The CNN model structure for MI classification

1. The training and test data are uniformly resized to N × 1 × 1 matrices and are labeled according to the MI tasks.
2. The convolution-2D layer, which has a kernel size of 3 × 1 and 16 filters, is convolved with the input.
3. The max pooling-2D layer with a kernel size of 2 × 1 is used for down-sampling.
4. Two fully connected layers consisting of 150 and 2 neurons, respectively, are set. The flattened matrix goes through the last fully connected layer to tell which class represents the imagery of left or right hand movement (an illustrative sketch of a matching network follows this list).
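A minimal Keras sketch of a network matching this description is given below; the input length N, the activation functions, padding, optimizer and training settings are assumptions, since they are not all specified in the paper.

```python
from tensorflow.keras import layers, models

def build_cnn(n_features):
    """Conv2D(16, 3x1) -> MaxPool(2x1) -> Dense(150) -> Dense(2, softmax)."""
    model = models.Sequential([
        layers.Conv2D(16, kernel_size=(3, 1), padding="same", activation="relu",
                      input_shape=(n_features, 1, 1)),   # N x 1 x 1 input as in step 1
        layers.MaxPooling2D(pool_size=(2, 1)),           # 2 x 1 down-sampling as in step 3
        layers.Flatten(),
        layers.Dense(150, activation="relu"),            # first fully connected layer
        layers.Dense(2, activation="softmax"),           # left- vs right-hand motor imagery
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```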

4 Results and Discussion

In our study, the experiments were performed in MATLAB 2017a on a desktop PC with an Intel Core i7 (up to 3 GHz) and 8 GB RAM. In all experiments, the model learns on the training data in order to predict the test data. Features are randomly selected for training and testing, and 100 runs are performed for each classifier. The average accuracy of the MI classifiers is reported for all subjects.
Table 1 shows the results obtained from the four classifiers: SVM, LDA, NB and CNN. As noted in the table, the CNN classifier yields the highest average accuracy (more than 99%) relative to the other methods. The average SVM, NB and LDA classification accuracies over all subjects were 98.42%, 97.75% and 96.82%, respectively.
Also, the training and test data of all subjects were combined separately for each class. The very distinctive features of the two classes in the training phase are shown in Fig. 4. As noted in the figure, it can be concluded that the proposed method, consisting of feature extraction with the FBCSP algorithm and feature selection with

Table 1 Accuracy of classifiers obtained from 9 healthy subjects

              SVM      LDA      NB       CNN
Subject1      97.22    96.30    97.68    100
Subject2      98.61    94.90    96.75    99.50
Subject3      98.60    96.29    97.22    100
Subject4      98.98    96.46    97.47    100
Subject5      98.49    97.98    98.98    99.50
Subject6      99.44    98.33    98.89    99.00
Subject7      99.49    95.45    98.98    100
Subject8      97.47    96.96    96.96    100
Subject9      97.53    98.76    96.91    100
Average       98.42    96.82    97.75    99.77
All subjects  99.74    99.40    99.59    100

Fig. 4 The diagram of the feature vectors for the training data of all subjects in two classes

the mutual information method, has been effective and able to obtain distinctive features. These informative features have increased the classification performance. As indicated in Table 1, the classification accuracies for all subjects combined were more than 99%, with the CNN classifier yielding the best performance of 100%.
In Fig. 5, the two curves, the accuracy of all subjects combined and the average accuracy, are compared with each other. It is observed that the first curve performs better than the second.

Fig. 5 Compare the two graphs (average accuracy and the accuracy of all subjects)

5 Conclusion and Future Direction

This paper proposes a CNN classification algorithm to discriminate the imagery of left and right hand movement. The discriminative features are obtained by the FBCSP method, and mutual information is used for feature selection. These features are transformed into images and fed into the CNN classifier. The results show that the method using the CNN classifier achieves better results compared to the other, traditional classifiers. In future work, we will focus on the classification of multi-class motor imagery tasks and the study of other hand movements, which can help achieve better and more precise control of an external prosthesis.

Acknowledgements The authors appreciate those who contributed to make this research suc-
cessful. This research is supported by Center for Research and Innovation (PPPI) and Faculty of
Engineering, Universiti Malaysia Sabah (UMS) under the Research Grant (SBK0393-2018).

References

1. Cincotti F, Pichiorri F, Aricò P, Aloise F, Leotta F, de Vico Fallani F, Millán JDR,


Molinari M, Mattia D (2008) EEG-based Brain-Computer Interface to support post-stroke
motor rehabilitation of the upper limb. In: 2012 annual international conference of the IEEE
engineering in medicine and biology society, IEEE EMBS San Diego, California. IEEE,
pp 4112–4115
2. Bandara DSV, Arata J, Kiguchi K (2018) A noninvasive brain–computer interface approach
for predicting motion intention of activities of daily living tasks for an upper-limb wearable
robot. Int J Adv Rob Syst 15(2):1–10

3. Kang BK, Kim JS, Ryun S, Chung CK (2018) Prediction of movement intention using
connectivity within motor-related network: an electrocorticography study. PLoS ONE 13
(1):1–14
4. He S, Zhang R, Wang Q, Chen Y, Yang T, Feng Z, Zhang Y, Shao M, Li Y (2017) A
P300-based threshold-free brain switch and its application in wheelchair control. IEEE Trans
Neural Syst Rehabil Eng 25(6):715–725
5. Kreilinger A, Hiebel H, Müller-Putz GR (2016) Single versus multiple events error potential
detection in a BCI-controlled car game with continuous and discrete feedback. IEEE Trans
Biomed Eng 63(3):519–529
6. Alyasseri ZAA, Khader AT, Al-Betar MA, Papa JP, Ahmad Alomari O (2018) EEG-based
person authentication using multi-objective flower pollination algorithm. In: 2018 IEEE
congress on evolutionary computation (CEC). IEEE, pp 1–8
7. Daly JJ, Wolpaw JR (2008) Brain–computer interfaces in neurological rehabilitation. Lancet
Neurol 7(11):1032–1043
8. Obeidat QT, Campbell TA, Kong J (2015) A P300 brain–computer interface for spelling
written words. IEEE Trans Hum Mach Syst 45(6):727–738
9. Martínez-Cagigal V, Gomez-Pilar J, Álvarez D, Hornero R (2016) An asynchronous
P300-based brain-computer interface web browser for severely disabled people. IEEE Trans
Neural Syst Rehabil Eng 25(8):1332–1342
10. Bi L, Fan XA, Jie K, Teng T, Ding H, Liu Y (2014) Using a head-up display-based
steady-state visually evoked potential brain–computer interface to control a simulated vehicle.
IEEE Trans Intell Transp Syst 15(3):959–966
11. Bhattacharyya S, Konar A, Tibarewala DN (2017) Motor imagery and error related potential
induced position control of a robotic arm. IEEE/CAA J Automatica Sinica 4(4):639–650
12. Lin C, Wang BH, Jiang N, Xu R, Mrachacz-Kersting N, Farina D (2016) Discriminative
manifold learning based detection of movement-related cortical potentials. IEEE Trans Neural
Syst Rehabil Eng 24(9):921–927
13. Pfurtscheller G, Neuper C, Flotzinger D, Pregenzer M (1997) EEG-based discrimination
between imagination of right and left hand movement. Electroencephalogr Clin Neurophysiol
103(6):642–651
14. Graimann B, Allison B, Pfurtscheller G (2009) Brain-computer interfaces: a gentle
introduction. In: Brain-computer interfaces, Heidelberg
15. Mishuhina V, Jiang X (2018) Feature weighting and regularization of common spatial
patterns in EEG-based motor imagery BCI. IEEE Signal Process Lett 25(6):783–787
16. Park SH, Lee D, Lee SG (2017) Filter bank regularized common spatial pattern ensemble for
small sample motor imagery classification. IEEE Trans Neural Syst Rehabil Eng 26(2):498–
505
17. Hernández-González CE, Ramírez-Cortés JM, Gómez-Gil P, Rangel-Magdaleno J,
Peregrina-Barreto H, Cruz-Vega I (2017) EEG motor imagery signals classification using
maximum overlap wavelet transform and support vector machine. In: 2017 IEEE international
autumn meeting on power, electronics and computing (ROPEC), Ixtapa. IEEE, pp 1–5
18. Robinson N, Vinod AP, Guan C, Ang KK, Peng TK (2011) A Wavelet-CSP method to
classify hand movement directions in EEG based BCI system. In: 2011 8th international
conference on information, communications & signal processing, Singapore. IEEE, pp 1–5
19. Bhattacharyya S, Hossain MA, Konar A, Tibarewala DN, Ramadoss J (2014) Detection of
fast and slow hand movements from motor imagery EEG signals. In: Advanced computing,
networking and informatics-volume 1. Springer, Cham, pp 645–652
20. Mikolov T, Deoras A, Kombrink S, Burget L, Černocký J (2011) Empirical evaluation and
combination of advanced language modeling techniques. In: Twelfth annual conference of the
international speech communication association. INTERSPEECH. ISCA, Florence, pp. 605–
608
21. Rezazadeh Sereshkeh A, Trott R, Bricout A, Chau T (2017) EEG classification of covert
speech using regularized neural networks. IEEE/ACM Trans Audio Speech Lang Process
(TASLP) 25(12):2292–2300

22. Dobhal T, Shitole V, Thomas G, Navada G (2015) Human activity recognition using binary
motion image and deep learning. Proc Comput Sci 58:178–185
23. Alyasseri ZAA, Khadeer AT, Al-Betar MA, Abasi A, Makhadmeh S, Ali NS (2019) The
effects of EEG feature extraction using multi-wavelet decomposition for mental tasks
classification. In: Proceedings of the international conference on information and commu-
nication technology, pp 139–146
24. Alyasseri ZAA, Khader AT, Al-Betar MA, Papa JP, Alomari OA, Makhadmeh SN (2018)
Classification of eeg mental tasks using multi-objective flower pollination algorithm for
person identification. Int J Integr Eng 10(7):102–116
25. Tang Z, Li C, Sun S (2017) Single-trial EEG classification of motor imagery using deep
convolutional neural networks. Optik-Int J Light Electron Opt 130:11–18
26. Xu B, Zhang L, Song A, Wu C, Li W, Zhang D, Xu G, Li H, Zeng H (2018) Wavelet
transform time-frequency image and convolutional network-based motor imagery EEG
classification. IEEE Access 7:6084–6093
27. Cecotti H, Graser A (2010) Convolutional neural networks for P300 detection with
application to brain-computer interfaces. IEEE Trans Pattern Anal Mach Intell 33(3):433–445
28. Kumar S, Sharma A, Mamun K, Tsunoda T (2016) A deep learning approach for motor
imagery EEG signal classification. In: 2016 3rd Asia-Pacific world congress on computer
science and engineering (APWC on CSE), Nadi. IEEE, pp 34–39
29. Sakhavi S, Guan C, Yan S (2018) Learning temporal information for brain-computer interface
using convolutional neural networks. IEEE Trans Neural Netw Learn Syst 29(11):5619–5629
30. Tangermann M, Müller KR, Aertsen A, Birbaumer N, Braun C, Brunner C, Leeb R,
Mehring C, Miller KJ, Müller-Putz G, Nolte G (2012) Review of the BCI competition IV.
Front Neurosci 6(55):10–3389
31. Ramoser H, Muller-Gerking J, Pfurtscheller G (2000) Optimal spatial filtering of single trial
EEG during imagined hand movement. IEEE Trans. Rehabil Eng 8(4):441–446
32. Battiti R (1994) Using mutual information for selecting features in supervised neural net
learning. IEEE Trans Neural Networks 5(4):537–550
33. LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document
recognition. Proc IEEE 86(11):2278–2324
Metasurface with Wide-Angle Reception
for Electromagnetic Energy Harvesting

Abdulrahman A. G. Amer, Syarfa Zahirah Sapuan, Nasimuddin, and Nabiah Binti Zinal

Abstract A wide-angle and polarization-insensitive metasurface, instead of a traditional antenna, is built as the primary ambient energy harvester. The proposed metasurface harvester can receive electromagnetic (EM) energy from wide angles, and the received rectified power can be combined by DC combining for an EM harvesting system. The reflection coefficient, power distribution, EM harvesting efficiency and absorption efficiency at normal and oblique incidence are studied and presented. For a single unit cell with periodic boundary conditions, simulation results show that the absorption and harvesting efficiencies are more than 98% and 96%, respectively, under normal incidence at an operating frequency of 2.4 GHz. The results also show that at an incidence angle of 0° the maximum absorption efficiency is more than 98%, and more than 80% reception efficiency is achieved at incidence angles of up to ±60°.

Keywords Metasurface · Electromagnetic energy harvesting · Harvesting efficiency · Wide-angle

1 Introduction

Electromagnetic waves fill the human surroundings due to the quick advances in wireless communication systems. The wireless power transfer (WPT) concept was demonstrated in the early years of the 20th century by Nikola Tesla. The far-field

A. A. G. Amer · S. Z. Sapuan (&)
Faculty of Electrical and Electronic Engineering, Universiti Tun Hussein Onn Malaysia, 86400 Parit Raja, Batu Pahat, Johor, Malaysia
e-mail: [email protected]
Nasimuddin
Institute for Infocomm Research, A-STAR, Singapore, Singapore
N. B. Zinal
Centre for Diploma Studies, Universiti Tun Hussein Onn Malaysia, Parit Raja, Batu Pahat 86400, Johor, Malaysia


WPT system operating at radio frequency (RF) was first built in the 1960s by William Brown [1]. Due to the advanced developments in wireless technologies, energy harvesting has attracted significant attention in recent years.
Energy harvesting, or energy scavenging, is defined as capturing amounts of energy from surrounding environmental power sources, such as electromagnetic waves, vibration and heat, and storing them for later use. Rectenna systems are usually used in wireless power transfer (WPT) to collect electromagnetic energy from the ambient and transform it into a DC current to replace battery usage. A rectenna system usually comprises three sections: an antenna, a rectifier circuit and a load. The antenna is the key part and is used to receive the incident waves. Conversion efficiency is a very important parameter that determines the performance of the harvesting mechanism and depends strongly on the conversion medium. Practically, to obtain higher conversion efficiency, high-efficiency collectors need to be used. Conventional antennas are used in rectenna systems as collectors in single or array configurations. When used as an array, the distance between two neighboring elements is typically λ/2 to avoid destructive mutual coupling between the array elements [2–4].
Providing power to stand-alone electronic devices is one of the important applications of electromagnetic energy harvesting and requires highly efficient collectors to capture enough power from the ambient. Therefore, metamaterial cells have been used as electromagnetic collectors. Metamaterials are engineered surfaces based on arrays of small metal unit cells, designed to display unique properties not easily found in nature, such as tailored permittivity (ε) and permeability (µ) to control the electromagnetic field [5]. The metasurface is the 2-D counterpart of 3-D metamaterials. Unlike absorbers [6–8], metasurface harvesters can capture power from the surrounding environment and drive it to a load for collection. Split-ring resonators (SRR) [9, 10] and complementary split-ring resonators (C-SRR) [11, 12] are the common resonators used in energy harvesting applications.
A conventional harvesting antenna cannot maintain high absorption efficiency for arbitrary directions and polarizations, due to the random and unknown incidence angles and polarization of ambient EM waves. In this work, a wide-angle metasurface-based EM harvester is demonstrated for the Wi-Fi band (2.4 GHz) to capture ambient microwave energy from a wide area. The suggested metasurface unit cell includes a cross-slot patch with a probe to maximize the energy absorption and drive it to the resistive load through a via interconnect with near-unity absorption. CST Microwave Studio was used to optimize the structure for metasurface analysis.

[Unit cell schematic annotations: via, load, ground plane; dimensions P, L, W, S; axes x, y and E-field direction]

Fig. 1 Illustration of proposed metasurface unit cell harvester (a) Top view and (b) back view

2 Methodology

Figure 1 shows the topology of the suggested metasurface unit cell, which comprises a cross-slot patch with a probe/via. The power collected by the cell is delivered to the resistive load using the probe/via interconnect through the substrate layer. The via interconnect is placed at a distance of 5.0 mm from the center and connected to the ground through a resistive load. The resistive load can be replaced by a rectification circuit having an input impedance matched to that of the structure at the operating frequency. The optimal resistance value was found by a sweep to be 80 Ω, where maximum power transfer occurs. To reduce the dielectric loss that can deteriorate the efficiency of the harvester, a Rogers TMM10i substrate with εr = 9.8, loss tangent δ = 0.002 and thickness 3.81 mm was used. In addition, both the top layer (the unit cell) and the bottom layer (the ground) were copper. The unit cell is designed at 2.4 GHz (ISM band) and its design dimensions are shown in Table 1.
The electromagnetic waves impinge on the structure at normal incidence with the electric field vector parallel to the cross-slot patch. The effect of the incidence angle on the absorption efficiency is investigated by changing the direction of the incident waves onto the metasurface structure.

Table 1 The parameter of metasurface unit cell


Parameter Dimension (mm)
Periodicity of the cell, P 13.7
Length of a slot, L 7.8
Width of a slot, W 1.5
Distance between the edge and contact surface, S 0.4

Fig. 2 Boundary conditions setup of CST Microwave Studio

3 Results and Discussion

CST Microwave Studio was used to perform the numerical simulation of the proposed metasurface design. The periodic boundary conditions were aligned along the x- and y-axes, and the structure was excited by Floquet ports such that the incident waves propagate along the z-axis, as seen in Fig. 2.
The periodic boundary condition was applied to numerically model an infinite array of the metasurface unit cell and to investigate the effect of the incidence angle on the absorption efficiency. The Floquet excitation port was set with two modes of TE and TM polarization on the top boundary to simulate the incident wave.
Equation (1) describes the absorption efficiency

A(ω) = 1 − |S11|² − |S21|²    (1)

where S11 is the reflection coefficient and S21 is the transmission coefficient. The transmitted power (S21) is almost zero because the bottom layer of the resonator is copper, which means that the absorption can be defined by

A(ω) = 1 − |S11|²    (2)

Therefore, the reflection coefficient needs to be minimized in order to maximize the absorption efficiency.
The harvesting efficiency can be calculated by

η = P_load / P_incident    (3)

where P_load is the total time-average power consumed in the load, and P_incident is the total time-average power incident on the cell.
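For illustration, the two efficiencies can be evaluated directly from simulated quantities as in the short sketch below; variable names are illustrative, and the S-parameters are taken as linear magnitudes rather than dB values.

```python
def absorption(s11, s21=0.0):
    """Eqs. (1)-(2): fraction of incident power neither reflected nor transmitted."""
    return 1.0 - abs(s11) ** 2 - abs(s21) ** 2

def harvesting_efficiency(p_load, p_incident):
    """Eq. (3): time-average power delivered to the load over incident power."""
    return p_load / p_incident

print(absorption(0.1))                     # ~0.99 for a -20 dB reflection
print(harvesting_efficiency(0.96, 1.0))    # 0.96, i.e. 96 %
```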

Fig. 3 Reflection coefficient of the proposed metasurface unit cell

Fig. 4 The absorption, reflection and transmission coefficients at normal incidence

Figure 3 shows the reflection coefficient S11 of the meta-harvester at θ = 0° (normal incidence).
The absorption coefficient of the meta-harvester under normal incidence is over 98%, as depicted in Fig. 4. The meta-harvester can capture and absorb almost all the power incident on its surface because the impedance of the harvester is matched to the free-space impedance at the resonance frequency.
The power losses in the load, the Rogers substrate and the metal were evaluated using full-wave simulation to analyze the power dissipation within the unit cell. Figure 5 shows that 98% of the power is dissipated in the resistive load and 2% is consumed in the substrate and copper. For the meta-harvester, most of the power within the unit cell is dissipated in the load, which is the significant difference from a meta-absorber, where most of the power is dissipated in the substrate. Figure 6 shows that the harvesting efficiency is 96% at the resonance frequency of 2.4 GHz under normal incidence.

Fig. 5 Power conversion efficiency at normal incidence (dissipation in the load, copper and substrate versus frequency)

Fig. 6 Harvesting efficiency under normal incidence

Finally, the metasurface unit cell was simulated at different oblique incidence angles to observe the effect of the incidence angle on the absorption efficiency. The angle between the electric field (E-field) and the x-axis shown in Fig. 1a is varied from 0 to 60°. The harvesting efficiency of the metasurface energy harvester at different oblique incidence angles is shown in Fig. 7.
It can be seen that the maximum absorption efficiency is more than 98% when the incidence angle is 0° and decreases gradually as the incidence angle increases. Therefore, maximum energy harvesting can be achieved over a wide area, as more than 80% absorption efficiency is achieved for wave incidence angles of up to ±60°.

Fig. 7 Absorption efficiency under various incidence angles (θ = 0°, 30°, 45° and 60°) versus frequency (GHz)

4 Conclusion

An EM energy collector based on a metasurface structure has been presented. It demonstrates strong absorption of electromagnetic waves in the microwave regime. The proposed collector has a significant harvesting efficiency for polarization angles from 0° to 60°. At an incidence angle of 0°, a maximum harvesting efficiency of more than 98% was observed. The proposed wide-angle metasurface-based EM harvester is useful for efficient energy harvesting systems.

Acknowledgements The authors would like to acknowledge Universiti Tun Hussein Onn
Malaysia (UTHM) for their funding of this research under TIER 1 research grant, H150.

References

1. Brown WC (1984) The history of power transmission by radio waves. IEEE Trans Microw
Theory Tech 32:1230–1242
2. Sharma T, Saini G (2016) Microstrip antenna array for RF energy harvesting system 5:145–
149
3. Heikkinen J, Salonen P, Kivikoski M (2000) Planar rectennas for 2.45 GHz wireless power
transfer. In: IEEE radio and wireless conference. Colorado, pp 63–66
4. Sun H, Geyi W (2017) A new rectenna using beamwidth-enhanced antenna array for RF
power harvesting applications. IEEE Antennas Wirel Propag Lett 16:1451–1454
5. Holloway CL et al (2012) An overview of the theory and applications of metasurfaces: the
two-dimensional equivalents of metamaterials. IEEE Antennas Propag Mag 54:10–35
6. Sood D, Tripathi CC (2016) A wideband wide-angle ultrathin low profile metamaterial
microwave absorber. Microw Opt Technol Lett 58:1131–1135
7. Ramya S, Srinivasa Rao I (2017) A compact ultra-thin ultra-wideband microwave
metamaterial absorber. Microw Opt Technol Lett 59:1837–1845
8. Bağmancı M et al (2019) Polarization independent broadband metamaterial absorber for
microwave applications. Int J RF Microw Comput Eng 29:1–10

9. Almoneef T, Ramahi OM (2014) A 3-dimensional stacked metamaterial arrays for


electromagnetic energy harvesting. Prog Electromagn Res 146:109–115
10. Ramahi OM, Almoneef TS, Alshareef M, Boybay MS (2012) Metamaterial particles for
electromagnetic energy harvesting. Appl Phys Lett 101:173903
11. Alavikia B, Almoneef TS, Ramahi OM (2015) Complementary split ring resonator arrays for
electromagnetic energy harvesting. Appl Phys Lett 107:033902
12. Alavikia B, Almoneef TS, Ramahi OM (2014) Electromagnetic energy harvesting using
complementary split-ring resonators. Appl Phys Lett 104:163903
Integrated Soil Monitoring System
for Internet of Things (IoT) Applications

Xin Yi Lau, Chun Heng Soo, Yusmeeraz Yusof, and Suhaila Isaak

Abstract Spectroscopy is widely used in various fields, including agriculture, to determine the contamination of soil in order to produce good-quality food and to avoid the excessive use of fertilizer, thereby minimizing the impact on the environment. However, the commercial and common methods of soil spectroscopy have some limitations, such as bulky size, high cost and non-real-time operation. In this study, high-speed electronic data acquisition via machine learning on an FPGA is implemented to efficiently monitor the macronutrient level in soil, which would offer economic benefit. Our focus is particularly on recognizing the exact photon level absorbed by soil by applying photon-count processing techniques to monitor the macronutrients in soil samples. The hardware architecture on the FPGA features a 16-bit Kogge-Stone adder to process the input signals from the sensing module, an LED light control system, a time frame setting system and data synchronization via the cloud for Internet of Things (IoT) applications. The proposed photon counting system has been demonstrated using visible-range wavelengths of 630, 550 and 470 nm. In addition, the input photon signal can be varied from 0 to 200 kHz, and a frame time period of 10 ms produces the optimum counting result, with a percentage variation from 0% to a maximum of 15% compared to the actual count of the signal generated by the function generator. Apart from that, a real-time system for IoT application has been successfully tested.

Keywords Spectroscopy · Soil monitoring · Photon counting · FPGA · IoT

X. Y. Lau · C. H. Soo · Y. Yusof · S. Isaak (✉)
Department of Electronics, School of Electrical Engineering, Faculty of Engineering,
Universiti Teknologi Malaysia, 81310 Johor Bahru, Malaysia
e-mail: [email protected]


1 Introduction

Monitoring soil nutrients in the orchard or farmland is strongly required for quality and process control of agricultural produce and for the control of soil fertility. Soil needs to supply essential elements for growth, which are known as soil macronutrients. These macronutrients are nitrogen (N), phosphorus (P) and potassium (K). N, P and K fertilizer should be used based on the optimum requirement at each location. Excessive use of these fertilizers can lead to groundwater pollution. Therefore, the monitoring of soil nutrient levels is very important not only for effective production but also to avoid groundwater pollution by nitrate [1, 2]. To sustain the soil condition and to control the amount of fertilizer, farmers should regularly monitor the content of soil nutrients in their farms. A method using color-developing chemicals for soil nutrients is also useful and is commonly used by farmers. The chemical reagents are commercially available as a soil analyzer. Solutions of nutrients extracted from a soil, whose color is developed by chemical reagents, are estimated by subjective judgment against the color charts for the nutrients. Thus, the measured soil nutrient content always fluctuates due to the different types of crop production, and it is difficult to achieve a quantitative analysis. To execute a precise measurement, a spectrophotometer can be applied to investigate the color developed in the solutions. A spectrometer can be used to identify molecules of elements because each molecule vibrates at its characteristic frequency. Spectrophotometry involves the use of a spectrophotometer as a device to measure the intensity of light as a function of the light wavelength [3, 4].
Absorbance spectroscopy is a technique to measure the amount of light absorbed by a sample at a given wavelength. Transmission spectroscopy can be used for all states of sample (solid, liquid and gas), as the light passed through the sample is compared to the light that has not passed through, while reflectance spectroscopy is the study of light as a function of wavelength that has been reflected or scattered from a solid, liquid or gas [3]. According to the Beer-Lambert law, as illustrated in Fig. 1, there is a linear relationship between the absorbance and transmittance of a sample. Lu et al. state that the Beer-Lambert law is a logarithmic relationship between the radiant power and the concentration of a target compound or particle within the path length from source to detector [4].

Fig. 1 Basic concept of light absorption [3]



The proportion of light absorbed or transmitted is independent of the intensity of the light source. In addition, the absorbance of light is directly proportional to the concentration of the absorbing material in the sample. The transmitted light and absorbance are defined as in Eq. (1).

A = -\log\left(\frac{I}{I_0}\right) = \varepsilon C L \quad (1)

where I0 is the light intensity of the light source, I is the transmitted light after passing through the material sample, ε is the molar absorptivity (L mol−1 cm−1), C is the molar concentration (mol L−1) and L is the path length in cm.
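As a small worked example of Eq. (1), the sketch below, with hypothetical intensity readings and an assumed molar absorptivity, converts a transmission measurement into absorbance and concentration:

import math

def absorbance(i_transmitted, i_source):
    """A = -log10(I / I0), Eq. (1)."""
    return -math.log10(i_transmitted / i_source)

def concentration(a, epsilon, path_cm):
    """Rearranged Beer-Lambert law: C = A / (epsilon * L), in mol/L."""
    return a / (epsilon * path_cm)

# Hypothetical readings: 100 units emitted, 40 transmitted through a 1 cm cell;
# epsilon = 250 L/(mol cm) is an assumed molar absorptivity.
a = absorbance(40.0, 100.0)                       # ~0.398
c = concentration(a, epsilon=250.0, path_cm=1.0)  # ~1.6e-3 mol/L
print(a, c)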
Soil is an important substance on the earth that assists the growth of plants, providing food production for humans. To increase the production and quality of food, farmers may apply an excessive amount of fertilizer to the soil, which may result in soil contamination. The excess fertilizer not only increases the cost of production but also causes a huge impact on the environment. Thus, a soil-monitoring device is needed in agriculture to determine the macronutrient contents in the soil, such as nitrogen and phosphorus. Therefore, there is a huge demand for soil spectrophotometers to help provide healthy food to humans. Optical spectroscopy has good merit for realizing a low-cost and smart tool for a soil nutrient monitoring system [5, 6]. New-generation digital sensors are smart enough to replace chemical lab testing in real time with minimum effort and with almost precise results. A portable remote data acquisition system coupled with sensors could let researchers collect results from wide locations. The Internet of Things (IoT), which is the ability of technology in everyday objects to send and receive data, will revolutionize how we do everything from agriculture to communication. Agriculture also stands to benefit greatly from integrating this technology into simple electronics: IBM estimates that IoT will enable farmers to increase food production by 70% by the year 2050 [7]. In addition to better pest management and weather forecasting, IoT could save up to 50 billion gallons of water annually, as sensors can better help farmers optimize water usage. Being able to better optimize crop management will have a transformative effect on agriculture in the following years. Therefore, IoT solutions with integrated affordable sensors that monitor soil humidity/moisture levels and soil/air temperatures would automatically help farmers know what is best for their crops without having to manually estimate or make an educated guess.
Photon counting is a method to count single photons by using a single-photon detector device. The number of photons is accumulated over a fixed period and used to determine the strength of a slowly changing optical signal [8]. Single-photon counting in the near-infrared and visible light spectrum has become one of the popular processes for many applications [9]. In addition, the photon counting method is fast and accurate in terms of signal to noise and results in a high signal-to-noise ratio [10]. Thus, detecting single-photon absorption events can perform better than measuring an optical intensity or power in a spectrophotometer.

An avalanche photodiode (APD) operated in reverse bias and Geiger mode is a promising single-photon detector and is able to detect a photon even in low-light conditions [11].
This paper reports the integration of an inexpensive mobile method for monitoring soil macronutrients and fertilization usage with an IoT system. The integrated system is adapted from previous work [12] on soil spectroscopy for macronutrient detection. The proposed system aims to realize a portable handheld device for soil testing with result uploading over IoT. The data acquisition part is implemented on a field programmable gate array (FPGA) based spectrophotometer, which detects the signal from the APD, and the counted photons trapped in the soil sample are sent to a mobile application over wireless sensor network (WSN) communication. This project requires high-speed data acquisition to avoid loss of photon counts. The FPGA is a promising device due to its high speed and high channel density [13]. In addition, the advantages of using an FPGA are the ability to reprogram the digital circuit and the powerful supporting software, Quartus II. Therefore, the same device can be used for multiple tasks by changing the circuitry in the FPGA.

2 Integrated Circuit Design

The block diagram of the integrated circuit design is shown in Fig. 2. The circuit design is composed of five modules, namely soil testing, photon signal detection, amplification and filtering, data processing, and data synchronization via the cloud for Internet of Things (IoT) application. The main system is implemented on an FPGA for complex computations, where the obtained intensity level is processed mostly by the FPGA. The FPGA-based spectrophotometer is implemented as the integration of the front-end module, the back-end module and real-time monitoring. The front end consists of the LED light control system, which activates the LED light and the time frame signal when the input is asserted. The light source illuminates the soil sample in the container; the sample absorbs photons, and the transmitted photons enter the photodiode. The resulting small current is converted and amplified by the transimpedance amplifier (TIA) circuit. The back-end module consists of the data acquisition system on the FPGA. This module implements the signal conditioning, discrimination, analysis and counting. A Kogge-Stone adder is utilized in the photon counting circuit to prevent losing signal counts during high-frequency signal detection. The third part of the system consists of the real-time data monitoring system. The data is uploaded to the cloud via the NodeMCU, and the user can download the developed app to perform real-time monitoring.
The sensing module is used to detect the photon signal corresponding to the concentration level in soil at the various wavelengths of 630, 550 and 470 nm. A photodetector, the APD, is used to detect the very low light absorbed in the soil and convert the photons into a very small current signal. The current signal is converted to a voltage signal by using a passively quenched circuit, and then the small voltage is amplified using a TIA.

Fig. 2 The block diagram of FPGA based spectrophotometer

The gain of the TIA can be defined as the transimpedance gain, RT, which is given by [14]:

R_T = \frac{\partial V_{out}}{\partial I_{in}} \quad (2)

where ∂Vout is the output voltage in volts (V) and ∂Iin is the input current in amperes (A). In the hardware setup, the transimpedance gain design is adapted from designs proposed by other researchers [15, 16]. The RT is selected based on the specification sheet of the APD. The analog voltage from the TIA output is then converted to a digital signal using an analog-to-digital converter (ADC) circuit before the digitized signal is discriminated with a pulse discriminator for counting purposes using the FPGA data acquisition system. The data acquisition system on the FPGA consists of a few modules, including the LED control system. The LED light control system controls the LED on and off times during the macronutrient level test. When the user asserts the start mode for the macronutrient test, the count mode is activated to trigger the counting module on the FPGA. The user may choose various frame times for sensitivity analysis purposes. The summation of the total counted photon signal is performed using the Kogge-Stone adder algorithm for the fastest counting. The Kogge-Stone adder (KSA) is able to generate its carries quickly due to the logarithmic scaling of the delay with log2 N, where N is the number of bits through the carry path. As an example, a 16-bit KSA requires a 4-stage carry path, while a 16-bit ripple carry adder (RCA) requires 16 carry stages [17, 18].
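For illustration, a small behavioural Python model of the Kogge-Stone prefix addition (a sketch, not the authors' Verilog implementation) is given below; it mirrors the pre-processing, the log2 N carry-generation stages and the post-processing described in Sect. 3.2.

def kogge_stone_add(a, b, width=16):
    """Behavioural model of a width-bit Kogge-Stone adder: generate/propagate
    bits are merged in log2(width) prefix stages, then the sum is formed."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(width)]   # generate
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(width)]   # propagate
    p0 = p[:]                                                     # keep for sum bits
    dist = 1
    while dist < width:                        # 4 prefix stages when width = 16
        g = [g[i] | (p[i] & g[i - dist]) if i >= dist else g[i] for i in range(width)]
        p = [p[i] & p[i - dist] if i >= dist else p[i] for i in range(width)]
        dist *= 2
    carry_in = [0] + g[:-1]                    # carry into each bit position
    total = sum((p0[i] ^ carry_in[i]) << i for i in range(width))
    return total, g[-1]                        # sum and carry-out

# Example: 16-bit addition with overflow shown on the carry-out bit
print(kogge_stone_add(40000, 30000))           # (4464, 1), since 70000 mod 65536 = 4464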
The total photon count and the macronutrient concentration level in the soil are displayed on the 7-segment LEDs of the DE2 FPGA board after the data acquisition completes all processes. The macronutrient concentration module is developed to compare the counted number of photons in soil with light illumination and without

illumination. This comparison is initiated in percentage value on the 7 segment


LED to define the useful signal over dark count signal. For IoT application, the data
on data acquisition system is uploaded to the Nodemcu to perform the real time data
analysis. Nodemcu is connected to the Wi-Fi and uploaded the data from
DE2 FPGA board to firebase database. Firebase is chosen as database for this work,
which is also acted as a cloud to store the data form Nodemcu. Finally, the user can
download the develop application and perform the real time monitoring.

3 Experimental Results and Discussions

The implementation of this project is divided into two parts, which are the front-end and back-end modules. The front-end module consists of the passively quenched APD, the ADC and the TIA. The back-end module involves only the Verilog programming for data processing of the digitized photon count from the front-end module. The overall process of the back-end module is illustrated in Fig. 3. This flowchart summarizes the development of the integrated system and the algorithm of the proposed data acquisition for the macronutrient monitoring task.

3.1 Performance of Data Acquisition System

One pulse wave represents a single photon that has been detected by the photodetector and converted into an electrical signal. The light absorbed in soil for various macronutrient concentrations is converted to a voltage signal by the TIA. The generated signal from the TIA output is filtered at a threshold voltage of 1.4 V via the pulse discriminator module. The counter circuit only recognizes input signals between 1.4 and 3.3 V. The comparator output acts as the clock of the counter. The time frame enable signal becomes active only when an input signal triggers the buffer in the data acquisition system. Once the time frame enable signal is activated, the counter starts counting. The accumulated photon count for a single time frame of 1 ms is sent to the 7-segment LEDs, as shown in Fig. 4. The percentage of counting variation between the actual count and the experimental count is 10–15% due to some delay in the data processing module on the FPGA. The photon counting circuit comprises 16-bit buffers for sampling the incoming data from the SPAD sensor, a 16-bit Kogge-Stone adder that works as a stabilizer for the incoming signal from the buffer, a 16-bit parallel-in serial-out (PISO) shift register to serialize the output from the adder, and a clock divider that provides the necessary divided clock input to the buffers and PISO as required by the design implementation. All the components of the counting circuit are fully implemented using Verilog HDL. The implemented design is synthesized onto the Cyclone II FPGA

Fig. 3 The flowchart of data acquisition system on FPGA board

and simulated using the Quartus II software by Altera to verify its functionality and analyse its performance characteristics. The developed Kogge-Stone adder has a counting limitation of 2 GHz, which affects the counting percentage.
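A simplified software analogue of this discrimination-and-counting chain, assuming the 1.4 V threshold and a synthetic pulse train (the real design performs this in Verilog on the FPGA), is sketched below:

import numpy as np

def count_photons(samples, sample_rate_hz, frame_time_s, threshold_v=1.4):
    """Discriminate the TIA output at a voltage threshold and count the
    rising edges (photon pulses) that fall inside one frame time."""
    n_frame = int(frame_time_s * sample_rate_hz)
    window = samples[:n_frame]
    digital = window > threshold_v                     # pulse discriminator
    rising = np.count_nonzero(~digital[:-1] & digital[1:])
    return rising

# Hypothetical 100 kHz pulse train sampled at 10 MHz, counted over a 1 ms frame
t = np.arange(0, 1e-3, 1e-7)
signal = 3.3 * (np.sin(2 * np.pi * 100e3 * t) > 0)     # 0 V / 3.3 V square wave
print(count_photons(signal, 10e6, 1e-3))               # ~100 counts expected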
The Kogge-Stone adder circuit managed to achieve a delay of 12.283 ns, and it was successfully integrated with the remaining components of the data acquisition circuit to complete the design implementation of the counting circuit before synthesis. The synthesis results report that the designed circuit achieves an operating frequency of 420 MHz and an average power consumption of 38 mW. Therefore, the implemented counting circuit was simulated with an input clock frequency of 400 MHz to verify its functionality.

Fig. 4 The accumulated photon count for a single time frame is displayed on the 7-segment LED

3.2 Kogge-Stone Adder Implementation Overview

The functional block diagram of the Kogge-Stone adder is shown in Fig. 5. The flow from the incoming inputs to the Kogge-Stone adder circuit output can be separated into three main stages. The first stage is the pre-processing stage, which is represented by the square blocks. This pre-processing stage is required to determine whether the inputs, when added together, will generate or propagate a carry. The carry-generation stage is represented by all the circular and triangular blocks. This stage is an important step in allowing the Kogge-Stone adder circuit to achieve its fast speed compared to other adder circuits. Finally, the diamond-shaped blocks represent the post-processing stage, which is responsible for producing the final output of the Kogge-Stone adder circuit.
A 16-bit Kogge-Stone adder was successfully implemented using structural dataflow modelling, and the simulation has verified its functionality. The RTL code is synthesized on the Cyclone II family FPGA using the Altera Quartus II software. From the synthesis analysis, the implemented Kogge-Stone design uses a total of 89 logic elements and has an average fan-out of 2.47. After the successful synthesis of the Kogge-Stone adder circuit is completed, an RTL

Fig. 5 16-bit Kogge-Stone adder implementation functional block diagram

and gate-level simulation is performed to verify the circuit functionality. For the simulation, a random set of numbers is injected into the inputs of the adder to verify its functionality as an adder. The results from the functional RTL simulation show that the output of the adder circuit corresponds to the inputs injected into the circuit. It can also be noted that the carry-out bit asserts, as it should, when the outcome of the addition of the injected inputs is greater than the supported bit size, thus resulting in an overflow.
Figure 6 shows the timing diagram of the data processing system on the FPGA-based spectrophotometer. The comparator output is assigned as the clock of the counter. However, the counter will not start counting if no input is triggered from the DE2 board. When there is an incoming input (L1 = active low), the time frame enable signal (clken) is asserted. Thus, the counter starts counting within the set time frame. When the incoming signal is deactivated, the counter stops counting and the total counted number of photons is displayed on the 7-segment LEDs of the DE2 board. In general, the generated digital signal is counted within a given frame time to produce counting values for various soil concentrations.

Fig. 6 The timing diagram of the data acquisition system

3.3 Real Time Monitoring and App Development

The timing simulation results obtained from the implemented Kogge-Stone circuit show that the propagation delay from the first input to the first output is 12.283 ns. The reported worst-case propagation delay is similar to the result obtained by Kaur and Kumar [19], who implemented a 16-bit KSA on a Xilinx FPGA through the Xilinx software and achieved a delay of 12.84 ns. A design synthesized with a more recent CMOS process technology would have better performance compared to a design with a much older process technology. Another attribute that contributes to the performance of a design is the type of optimization performed during synthesis, as a different optimization setting could, for

Fig. 7 Database interface system

example, result in higher speed but at the cost of higher power consumption, or vice versa. For this circuit, a more balanced optimization setting was selected, and therefore the results obtained are a reflection of that setting.
One of the vital specifications of this project is to perform real-time data analysis and monitoring. Thus, the data obtained from the FPGA is transferred to the NodeMCU (refer to Fig. 1) and uploaded to the database through a Wi-Fi connection. Figure 7 depicts the interface of the project database. Two types of data, the wavelength range and the contamination value in soil, are uploaded from the NodeMCU to Firebase via the Wi-Fi connection. MIT App Inventor II is utilized as the application development tool due to its open-source, free and user-friendly features. Therefore, the application developer can easily develop an app by using block coding to assign the functions of the application. Figure 8 shows the graphical user interface of the developed application. The data from Firebase is retrieved and displayed in the application when the user presses the GET button. The user may select any visible wavelength range for the illumination. In addition, the type of macronutrient is set to phosphorus. Meanwhile, the contamination of the macronutrient depends on the input signal from the TIA. Since Firebase stores the latest result, in which the LED light type is red and the contamination level is 50%, when the GET button is pressed the data from Firebase is retrieved and displayed on the screen.
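To illustrate this data path, a minimal sketch of pushing one reading to a Firebase Realtime Database over its REST interface is shown below; the project URL and field names are hypothetical, and in the actual system this step runs in the NodeMCU firmware rather than in Python.

import json
import urllib.request

# Hypothetical Firebase Realtime Database endpoint; the real project URL differs.
FIREBASE_URL = "https://example-soil-monitor.firebaseio.com/soil.json"

def upload_reading(wavelength_nm, contamination_pct):
    """Push one soil reading (wavelength and contamination level) to Firebase."""
    payload = json.dumps({"wavelength_nm": wavelength_nm,
                          "contamination_pct": contamination_pct}).encode()
    req = urllib.request.Request(FIREBASE_URL, data=payload, method="PUT",
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return resp.status

# Example: red LED (630 nm) reading with 50% contamination, as in the Fig. 7 record
print(upload_reading(630, 50))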
In view of the overall result, the discriminator module outputs the correct pulse waveform. Besides, the high level, low level and undefined region of the DE2 FPGA board inputs are determined through the experiment. In addition, the counting module for the total photons within one frame time is also verified by the experimental performance. Table 1 shows the performance specification of the FPGA-based spectrophotometer for the phosphorus macronutrient in soil samples.
The results show that, as the speed of the input clock configuration increases, the number of gates used to implement the design also increases. The number of gates used is also dependent on the type of optimization setting that is selected for the circuit. In the case of this parallel photon counting circuit, a more

Fig. 8 Graphical user interface

Table 1 Specification of FPGA based spectrophotometer


Parameters Specifications
RGB LED time control (s) 1
Frame time (ms) 1–1000
Type of macronutrient 0–15
Wavelength 630 nm, 550 nm, and 470 nm
Input frequency (kHz) 0–200
Optimum input frequency range (kHz) 1–100
Optimum frame time (ms) 10

balanced optimization setting is selected, and hence the results obtained are a reflection of that selection. Another observation is that the timing slack of the circuit slowly turns negative as the input clock frequency of the circuit increases. This indicates that the signal arrival time (AT) is unable to meet the required time (RT) of the circuit before the next clock edge, and the behaviour can be attributed to the limitations of the cells available in the standard cell library in which this circuit is implemented. It can be observed that the lowest propagation delay achieved by the synthesized circuit was 1.18 ns; therefore, making the input clock faster reduces the required time of the circuit, hence causing the calculated slack to become negative.

4 Conclusions

The proposed project gives a viable tool for in situ soil nutrient monitoring with a high-speed nutrient, pH and water level alert system, and can potentially help daily agricultural activities by reducing excessive fertilizer usage. IoT ensures accurate and efficient communication to farmers of real-time data related to soil quality, so they can plan agricultural activities beforehand and take corrective or preventive measures in advance. This promises less fertilizer usage and more organic food production, as well as a better impact on healthy and economical meals for people, with a smart soil nutrient monitoring system at the starting point of farm production. In addition, this project replaces the standard library adder, the RCA, with the Kogge-Stone adder (KSA). This is because the speed of the KSA is the fastest compared to other types of adders, especially the RCA. Through analysis of the simulation results, it could be observed that the implemented parallel photon counting circuit did present some limitations in terms of meta-stability when injected with a randomly toggling asynchronous input at high speeds. This limitation was concluded to be mainly attributed to the minimum setup and hold times required by the circuit to properly sample the incoming signal before a corresponding output can be produced. Another limitation of the design was the slow buffer sample rate that was needed to meet the timing requirement of the PISO shift register circuit to properly output all of the sampled data from the buffer of the parallel photon counting circuit.
By adopting the KSA, the percentage of photon count loss can be reduced, which would improve the accuracy of the spectrophotometer. Possible contributions that can be generated from this project include improved automated soil monitoring for healthy and economical daily meal production from farms with IoT, and a low-cost portable soil monitoring system for the field.

Acknowledgements Authors would like to express gratitude to the financial support from
Universiti Teknologi Malaysia under GUP Grant Tier 1 (Vot. 11H44) and FRGS Grant (Vot.
4F959). This work was primarily conducted in Advanced Electronics Laboratory and Basic
Communication Laboratory in Faculty of Electrical Engineering. The authors would also like to
express utmost appreciation to the assistant engineers, Mr. Ahmad Hassan and Mrs. Wan Norafiza
for their contributions to this work.

References

1. Dick WA, Cheng L, Wang P (2000) Soil acid and alkaline phosphatase activity as pH
adjustment indicators. Soil Biol Biochem 32:1915–1919
2. Sankpal A, Warhade K (2015) Review of optoelectronic detection method for the analysis of
soil nutrients. Int J Adv Comput Electron Technol 2(2):2394–3416
3. Lu C, Wang L, Hu H, Zhuang Z, Wang Y, Wang R, Song L (2013) Analysis of total nitrogen
and total phosporus in soil using laser induced breakdown spectroscopy. Chin Opt Lett 11
(5):053004

4. Yusof KM, Isaak S, Rashid NCA, Ngajikin N (2016) NPK detection spectroscopy on non
agriculture soil. Jurnal Teknologi (Sci Eng) 78(11):227–231
5. Albert DR, Todt MA, Davis FA (2012) A low-cost quantitative absorption spectrophotometer.
J Chem Educ 89(11):1432–1435
6. Bah A, Balasundram SK, Husni MHA (2012) Sensor technologies for precision soil nutrient
management and monitoring. Am J Agric Biol Sci 7(1):43–49
7. Isaak S, Yusof Y, Ngajikin NH, Ramli N, Chuan MW (2019) A low cost spectroscopy with
Raspberry Pi for soil macronutrient monitoring. Telkomnika 17(4):1867–1873
8. Zappa F, Tosi STA, Cova S (2007) Principles and features of single photon avalanche diode
arrays. Sens Actuators A 140:103–112
9. Isaak S, Pitter MC, Bull S, Harrison I (2010) Fully integrated linear single photon avalanche
diode (SPAD) array with parallel readout circuit in a standard 180 nm CMOS process. AIP
Conf Proc 1341(1):175–180
10. Yusof KM, Isaak S, Ngajikin NH, Rashid NCA (2016) LED based soil spectroscopy. Buletin
Optik 3:1–7
11. Chuah JH, Holburn D (2014) An integrated solid-state solution for secondary electron
detection. Analog Integr Circ Sig Process 1:395–411
12. Isaak S, Pitter MC, Bull S, Harrison I (2010) Design and characterisation of 16 × 1 parallel
outputs SPAD array in 0.18 um CMOS technology. In: 2010 IEEE Asia pacific conference on
circuits and systems, Kuala Lumpur, Malaysia. IEEE, pp 979–982
13. Zheng W, Liu R, Zhang M, Zhuang G, Yuan T (2014) Design of FPGA based high-speed
data acquisition and real-time data processing system on J-TEXT tokama. Fusion Eng Des 89
(5):689–701
14. Li M (2007) 5 GHz optical front end in 0.35 mm CMOS. PhD dissertation, Nottingham
15. Lu Z, Yeo KS, Lim WM, Do MA, Boon CC (2010) Design of a CMOS broadband
transimpedance amplifier with active feedback. IEEE Trans Very Large Scale Integr (VLSI)
Syst 18(3):461–472
16. Isaak S, Yusof Y, Leong CW (2018) A 2.5-GHz optical receiver front-end in a 0.13 µm
CMOS process for biosensor application. In: Proceedings of 2018 IEEE-EMBS conference on
biomedical engineering and sciences (IECBES), Kuching, Malaysia. IEEE, pp 376–381
17. Butchibabu S, Babu SK (2014) Design and implementation of efficient parallel prefix adder
on FPGA. Int J Eng Res Technol 3(7):239–244
18. Xiang LM (2017) VLSI implementation of Kogge stone adder. Universiti Teknologi Malaysia
19. Kaur J, Kumar P (2014) Analysis of 16 & 32 bit Kogge stone adder using Xilinx tool.
J Environ Sci Comput Sci Eng Technol 3(3):1639–1644
Contrast Enhancement Approaches
on Medical Microscopic Images:
A Review

Nadzirah Nahrawi, Wan Azani Mustafa,


Siti Nurul Aqmariah Mohd Kanafiah, Mohd Aminudin Jamlos,
and Wan Khairunizam

Abstract Nowadays, there are many methods for medical identification, for example microscopic and non-microscopic methods. The microscopic method uses a microscope to capture an image and identifies the disease based on the captured image. The quality of a medical image is very important for patient diagnosis. An image with poor contrast and poor quality may lead to a mistaken decision, even in experienced hands. Therefore, contrast enhancement methods were proposed in order to enhance image quality. Contrast enhancement is a process that improves the contrast of an image to make various features more easily perceived. Contrast enhancement is widely used and plays important roles in image processing applications. This paper reviews the contrast enhancement techniques used on microscopic images, namely microscopic images of cervical cancer, leukemia, malaria, tuberculosis and anaemia.

Keywords Contrast · Enhancement · Microscopic · Image · Review

1 Introduction

Contrast condition is an important factor in any assessment of image quality. It is created by the difference in luminance reflected from two adjacent surfaces [1]. In human visual perception, contrast is determined by the difference in color and brightness between an object and other objects [2–4]. An image may have poor contrast quality because the imaging device specification is too low, the operator lacks expertise, or there are adverse external conditions. The result of the imaging will not

N. Nahrawi · S. N. A. M. Kanafiah · W. Khairunizam
School of Mechatronic Engineering, University of Malaysia Perlis, Pauh Putra Campus,
02600 Arau, Perlis, Malaysia
W. A. Mustafa (✉) · M. A. Jamlos
Faculty of Engineering Technology, University of Malaysia Perlis, UniCITI Alam Campus,
Sungai Chuchuh, 02100 Padang Besar, Perlis, Malaysia
e-mail: [email protected]


show all the details in the captured scene [5]. Thus, the aim of contrast enhancement (CE) is to solve these problems. CE is a process that improves the contrast of an image to make various features more easily perceived. CE is widely used and plays important roles in image processing applications. The purpose of CE is to enhance the interpretability or to acquire a more visually pleasing and informative image [5, 6]. There are two categories of image enhancement techniques, namely direct and indirect enhancement methods. In the direct enhancement method, a measure of image contrast is defined and then improved. In the indirect method, the intensity values of the image are redistributed to improve the contrast [5, 7]. There are many CE techniques, and each technique has its own merits and demerits [8, 9].

2 Contrast Enhancement Method on Microscopic Images

2.1 Cervical Cancer

Previous research by Chang [10] proposed the energy method. The objective of that paper is to classify abnormal cells in Pap smear images. The images were collected from the Pathology Department, China Medical University Hospital, Taichung, Taiwan. The resolution of the images is 1280 × 960 pixels. The energy method is used to enhance the cell nucleus, which has a low grey level. Two mean filters are used in the energy method: a background energy operator and a local energy operator. The background energy operator is the mean filter with a larger mask, and the mean filter with a smaller mask is used as the local energy. The local energy is subtracted from the background energy to enhance the cell nucleus in the image.
Plissiti et al. [11] used contrast-limited adaptive histogram equalization (CLAHE). The aim of that paper is to detect cell nuclei boundaries. There are 19 Pap smear images, stored in JPEG format. In the pre-processing step, CLAHE and a global threshold are applied to the image to extract the background and obtain smooth regions of interest. The authors used the same method in another paper [12]. The above finding is consistent with the study by Tareef et al. [13], who also used contrast-limited adaptive histogram equalization (CLAHE). The aim of that paper is to segment the nucleus and cytoplasm in Pap smear images. There are 135 Pap smear images with a size of 512 × 512 pixels. In the pre-processing step, a Gaussian filter and CLAHE are used to reduce noise and enhance the images. The nuclei in the images have poor contrast due to cytoplasm overlap; thus, CLAHE is applied to enhance the nuclei.
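A minimal sketch of this kind of pre-processing using OpenCV is given below; the file name, clip limit and tile size are assumed values for illustration, not those reported in [11, 13].

import cv2

# Gaussian filtering followed by CLAHE, as used for Pap smear pre-processing
img = cv2.imread("pap_smear.jpg", cv2.IMREAD_GRAYSCALE)   # hypothetical input file
img = cv2.GaussianBlur(img, (5, 5), 0)                     # noise reduction
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(img)                                # contrast-limited AHE
cv2.imwrite("pap_smear_clahe.jpg", enhanced)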
With the same objective of improving the contrast of Pap smear images, Isa [14] proposed a combination of the moving k-means clustering algorithm and linear contrast enhancement. The k-means clustering algorithm is used to segment the images into 60 regions. Next, the linear contrast enhancement method is applied to enhance the contrast of the images. Three Pap smear images are tested with three different methods, which are the proposed method, the moving k-means

clustering algorithm and linear contrast enhancement. The result shows that the size
and shape of the nucleus and cytoplasm can be seen clearly. Thus, the proposed
method produces better contrast of Pap smear images.

2.2 Leukemia

The combination of a pixel-removal technique and a Gaussian filter for image enhancement was proposed by Nasir [15]. A total of 91 microscopic images were collected from Hospital Universiti Sains Malaysia (HUSM). The image resolution is 600 × 400. There are 14 microscopic images of Acute Lymphoblastic Leukemia (ALL), 43 images of Acute Myelogenous Leukemia (AML) and 34 images of normal blood cells. In the image enhancement process, the pixel-removal technique and the Gaussian filter are used. The pixel-removal technique is applied to eliminate unwanted objects and noise: a value of 100 pixels is set, and objects containing fewer than 100 pixels are eliminated. The Gaussian filter removes noise by smoothing and preserves edges better than a similar-sized mean filter.
Another research finding by Nasir also points towards a new method based on partial contrast, dark stretching and bright stretching techniques [16]. The images are acute leukemia microscopic images with 800 × 600 resolution. Partial contrast, bright stretching and dark stretching are applied to improve the quality of the images. In this research, partial contrast and bright stretching are combined and applied for blast detection, while a combination of partial contrast and dark stretching is applied for nucleus detection. Partial contrast is a linear function, and the bright stretching technique is based on a linear mapping function. This technique is applied to the resulting partial contrast image to enhance the cytoplasm of the image. The mapping function is given in Eq. 1, and the bright stretching technique is expressed as Eq. 2. The dark stretching technique is also a linear mapping function and is the reverse process of the bright stretching technique. To enhance the contrast between the nucleus and the background, the dark stretching technique is applied to the resulting partial contrast image.

P_k = \frac{(max - min)}{(f_{max} - f_{min})}(q_k - f_{min}) + min \quad (1)

Pk : Colour level of the output level


qk : Colour level of the input level
fmax : Maximum colour level values in the input image
fmin : Minimum colour level values in the input image
min: Desired minimum colour levels in the input image
max: Desired maximum colour levels in the input image
718 N. Nahrawi et al.

out(x, y) = \begin{cases} \dfrac{in(x, y)}{TH} \times SF_b & \text{for } in(x, y) < TH \\ \dfrac{in(x, y) - TH}{255 - TH} \times (255 - SF_b) + SF_b & \text{for } in(x, y) \ge TH \end{cases} \quad (2)

in(x, y): Colour level for the input pixel
out(x, y): Colour level for the output pixel
TH: Threshold value
SFb: Bright stretching factor
out(x, y) = \begin{cases} \dfrac{in(x, y)}{TH} \times SF_d & \text{for } in(x, y) < TH \\ \dfrac{in(x, y) - TH}{255 - TH} \times (255 - SF_d) + SF_d & \text{for } in(x, y) \ge TH \end{cases} \quad (3)

in(x, y): Colour level for the input pixel
out(x, y): Colour level for the output pixel
TH: Threshold value
SFd: Dark stretching factor
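A compact sketch of Eqs. (1)-(3) in Python/NumPy is given below; the threshold and stretching factors are assumed values used only for illustration.

import numpy as np

def partial_contrast(img, new_min=0, new_max=255):
    """Partial contrast stretching, Eq. (1): map [f_min, f_max] to [min, max]."""
    f_min, f_max = float(img.min()), float(img.max())
    return (new_max - new_min) / (f_max - f_min) * (img - f_min) + new_min

def bright_stretch(img, th, sf_b):
    """Bright stretching, Eq. (2)."""
    img = img.astype(float)
    low = img / th * sf_b
    high = (img - th) / (255.0 - th) * (255.0 - sf_b) + sf_b
    return np.where(img < th, low, high)

def dark_stretch(img, th, sf_d):
    """Dark stretching, Eq. (3), the reverse process of bright stretching."""
    img = img.astype(float)
    low = img / th * sf_d
    high = (img - th) / (255.0 - th) * (255.0 - sf_d) + sf_d
    return np.where(img < th, low, high)

# Hypothetical usage: nucleus enhancement on a greyscale image array `img`
# enhanced = dark_stretch(partial_contrast(img), th=120, sf_d=40)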
However, interestingly, this is contrary to a study conducted by Harun et al. [17], who focused on Local Contrast Stretching (LCS) in order to enhance the contrast variation. The size of the images is 800 × 600. For the enhancement process, LCS is used to increase the contrast of an image; the equation is shown in Eq. 4. The goal of the study is to compare segmentation techniques based on the HSI and RGB color spaces.
 
I_{output}(x, y) = 255 \times \frac{I_{input}(x, y) - min}{(max - min)} \quad (4)

where
I_output(x, y): The color level for the output pixel
I_input(x, y): The color level for the input pixel
max: Maximum value of the color level in the input image
min: Minimum value of the color level in the input image
Almost similar to the technique in [17], Halim [18] applied the Global Contrast Stretching (GCS) technique to the degraded images. The sample images are acute leukemia microscopic images. A kernel (sliding window) is applied across the image, and the center element is adjusted using Eq. 5. During the contrast stretching process, the maximum and minimum values are used. To determine the maximum and minimum values of an RGB color image, GCS considers the full range of all color plates.
Contrast Enhancement Approaches on Medical Microscopic Images … 719

I_p(x, y) = 255 \times \frac{[I_o(x, y) - min]}{(max - min)} \quad (5)

I_p(x, y): The color level for the output pixel
I_o(x, y): The color level for the input pixel
max: Maximum value of the color level in the input image
min: Minimum value of the color level in the input image
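A minimal sketch of the GCS mapping in Eq. (5), determining a single minimum and maximum over all three RGB planes at once, could be:

import numpy as np

def global_contrast_stretch(img_rgb):
    """Global contrast stretching, Eq. (5): one min/max over all RGB planes."""
    lo, hi = float(img_rgb.min()), float(img_rgb.max())
    out = 255.0 * (img_rgb.astype(float) - lo) / (hi - lo)
    return out.astype(np.uint8)

# Usage on a hypothetical H x W x 3 image array: stretched = global_contrast_stretch(img)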
In another study, Rejintal [19] used histogram equalization (HE). The aim of that paper is to segment the cells and extract features to detect cancer. Leukemia microscopic images are used as sample images. For pre-processing, the images are converted into grayscale. Next, a filtering process and HE are applied to the images. HE is used to increase the contrast of the images.

2.3 Malaria

The aim of the research by Purwar [20] is to classify positive and negative cases of malaria using thin blood smear images. Local histogram equalization is used to enhance the grayscale images. Over 500 microscopic images from two independent laboratories were tested. Local histogram equalization is used to enhance the visibility of the RBCs and parasites. The partial contrast stretching (PCS) technique is used by Nasir [21] to improve the contrast of pixels in the image. The objective of that research is to segment the RBCs with malaria parasites. Microscopic sample images are used. An Infinity-2 digital camera is used to capture the images at a resolution of 800 × 600 pixels. The PCS is applied to the original image.
Mehrjou [22] proposed an adaptive histogram shaping function to improve the contrast of the image. The aim of that research is to quantify the number of RBCs and determine whether an RBC is normal or infected by parasites. The image is divided into several tiles and histogram shaping is applied. To eliminate artificially induced boundaries, bilinear interpolation is applied to adjacent tiles. Uniform histogram shaping is used for each tile to get the best results. May [23] applied histogram stretching to adjust the contrast or intensity values of the image. There are 507 sample images of malaria Plasmodium vivax at the trophozoite stage, and the resolution is 764 × 574 pixels. The aim of that research is to detect infected RBCs. Somasekar [24] found that Gamma Equalization (GE) can improve a low-contrast image. The GE algorithm starts with the input image. Then, the RGB color image is converted to a grayscale image using Eq. 6.

G = [0.299 \ \ 0.587 \ \ 0.114] \, [F_r \ \ F_g \ \ F_b]^{T} = 0.299 F_r + 0.587 F_g + 0.114 F_b \quad (6)

Fr : Red Channel intensity of original image F


Fg : Green Channel intensity of original image F
Fb : Blue Channel intensity of original image F
After conversion to grayscale, the c-th order transformation of the image is carried out.

H = [G]^{c} \quad (7)

The maximum and minimum values of the image intensity are calculated.

M_a = \max(H), \quad M_i = \min(H) \quad (8)

The difference between the maximum and minimum intensity values is defined as the range value.

R = M_a - M_i \quad (9)

The LUT value is calculated using Eq. 10.


 
L = \frac{(H - M_i)}{R} \quad (10)

Lastly, the LUT values are transformed to intensity values of the grayscale image.
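Putting Eqs. (6)-(10) together, a short sketch of the GE pipeline is given below; the exponent c used here is an assumed value.

import numpy as np

def gamma_equalization(img_rgb, c=2.0):
    """Gamma equalization following Eqs. (6)-(10); c is an assumed exponent."""
    r, g, b = img_rgb[..., 0], img_rgb[..., 1], img_rgb[..., 2]   # assumes RGB order
    grey = 0.299 * r + 0.587 * g + 0.114 * b      # Eq. (6)
    h = grey.astype(float) ** c                   # Eq. (7)
    ma, mi = h.max(), h.min()                     # Eq. (8)
    rng = ma - mi                                 # Eq. (9)
    lut = (h - mi) / rng                          # Eq. (10), values in [0, 1]
    return (255 * lut).astype(np.uint8)           # back to greyscale intensities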
GE is compared with histogram equalization (HE), Imadjust (IA) and contrast-limited adaptive histogram equalization (CLAHE). Three image quality measures (IQM), namely entropy, average luminance and absolute mean brightness error (AMBE), are used to evaluate the performance on 20 malaria images. As a result, GE shows better image quality. Savkare [25] used Imadjust (IA) to increase the contrast of the image. There are 68 images of malaria parasites at the ring, trophozoite and gametocyte stages of P. falciparum and P. vivax, with different resolutions. The objective of that research is to identify the species of malaria. To enhance the image, the intensity values of the grayscale image are mapped to new values.
A study by Abidin [26] found that the combination of lowpass filtering and contrast stretching shows the best result among six combinations. There are 50 sample images of malaria parasites, and the resolution of the data is 140 × 140 pixels. This research focuses more on the image enhancement and segmentation steps. The images are tested with the proposed method and the other combination methods. The combination methods are median filtering-contrast stretching, Gaussian

filtering-contrast stretching, lowpass filtering-contrast stretching, median


filtering-dark stretching, Gaussian filtering, dark stretching, dan lowpass
filtering-dark stretching. The result of this test is based on visual inspection. As a
result, Lowpass filtering and Contrast stretching shows that the background image
is darker and the object is brighter.

2.4 Tuberculosis

Another study, by Raof [27], also used Partial Contrast Stretching (PCS). The sample images were collected from the Department of Microbiology and Parasitology, School of Medical Science, Universiti Sains Malaysia, Kubang Kerian. Contrast stretching is based on a linear mapping function and is usually used to enhance the brightness and contrast level of the image. The mapping function is shown in Eq. 11.

P_k = \frac{(max - min)}{(f_{max} - f_{min})}(q_k - f_{min}) + min \quad (11)

Pk : Colour level of the output level


qk : Colour level of the input level
fmax : Maximum colour level values in the input image
fmin : Minimum colour level values in the input image
min: Desired minimum colour levels in the input image
max: Desired maximum colour levels in the input image
Partial contrast is a combination of stretching and compressing processes. Figure 1 illustrates the stretching and compressing process.

Fig. 1 The partial contrast stretching process



Wahidah [28] applied a linear contrast stretching technique to 50 images. The images were captured from a collected set of positive-control TB slide samples of sputum stained with the Ziehl-Neelsen method. The sample images were collected from the Department of Microbiology and Parasitology of HUSM, Kelantan. The aim of that paper is to compare color thresholding and global thresholding techniques. Before the segmentation process, the linear contrast stretching technique is applied to enhance the contrast of the images.

2.5 Anaemia

Hirimutugoda [29] proposed an adapted grey world normalization method. The aim of that paper is to develop an automated diagnosis of RBC disorders and to detect malaria parasites and thalassemia in blood. There are 300 images with a size of 160 × 160 pixels. The average color in each sensor channel is assumed to be grey. This method is based on the diagonal model of illumination change and uses certain characteristics of microscopic peripheral blood images.
Transform is the aim of the research. There are five microscopic images. The image
sample used for this paper is the blood cell of microscopic images. For the pre-
processing process, adaptive histogram equalization is used to enhance the images.
In 2016, Tyagi [31] published a paper in which they applied histogram equal-
ization for image enhancement. The objective of the research is to classify normal
RBC and poikilocyte cells by using Artificial Neural Network. There are 100
images with different blood samples collected from Haematological
Department AIIMS, New Delhi. The image is captured in JPEG format and the size
is 1024  768 pixels. There are pre-processing, segmentation, morphological
operations, feature extraction and classification process to identify the cells. In the
pre-processing process, the images are converted into grey scale image and his-
togram equalization is applied to obtain cells boundary (Table 1).

Table 1 Summary of selected contrast enhancement methods

Energy method: Two mean filters are used in the energy method, a background energy operator and a local energy operator. The background filter has a larger mask and the local filter has a smaller mask. The local energy is subtracted from the background energy to enhance the cell nuclei in the image [10]
Contrast-limited adaptive histogram equalization (CLAHE): A variant of adaptive histogram equalization (AHE). CLAHE was developed to reduce the problem of noise amplification that AHE can give rise to [11]
Moving k-means + linear contrast enhancement: Moving k-means constantly checks the fitness of each centre during the clustering process; a centre is moved to the most active centre region if it fails to satisfy a specific criterion [32]. Linear contrast enhancement linearly manipulates the histogram of the image to fill its dynamic range, so the contrast of the image becomes more uniform [14]
Histogram equalization (HE): Increases the global contrast of an image; the most frequent intensity values are spread out [19]
Partial contrast stretching (PCS): Increases the contrast of the overall image; the minimum and maximum threshold values are mapped to a new, wider range of pixel values [16]
Local contrast stretching (LCS): Locally adjusts each pixel value to improve the visualization of structures in both the darkest and lightest areas of the image at the same time. In the contrast stretching process, the range of each color palette in the image is used to represent each range of color; each color palette has its own minimum and maximum values [33]
Global contrast stretching (GCS): All color palette ranges are considered at once to determine only one maximum and one minimum value for the combined RGB color; these values are used in the contrast stretching process [33]
Gamma equalization (GE): Improves a low-contrast image. The look-up-table (LUT) values are calculated and the intensity values of the greyscale image are converted into LUT values; the c value is important in obtaining the enhancement of the input image [24]
Grey world normalization: The average color in each sensor channel is assumed to be grey over the entire image. This method is based on the diagonal model of illumination change and uses certain characteristics of microscopic peripheral blood images [29]

3 Conclusion

Contrast enhancement is widely used and plays important roles in image processing applications. The purpose of CE is to enhance the interpretability or to acquire a more visually pleasing and informative image. HE, CLAHE and contrast stretching techniques are commonly used. In this paper, various contrast enhancement techniques for microscopic images have been reviewed. This review will hopefully help researchers to improve the existing techniques and to develop new algorithms that produce better-quality images. An image with poor contrast and poor quality may lead to a mistaken conclusion, even in experienced hands; image quality is important for the diagnosis result.

Acknowledgements This work was supported by the Ministry of Higher Education Malaysia
under the Fundamental Research Grant Scheme (FRGS/1/2018/SKK13/UNIMAP/02/1)

References

1. Al-amri SS, Kalyankar NV, Khamitkar SD (2010) Linear and non-linear contrast enhancement
image. J Comput Sci 10:139–143
2. Mustafa WA, Yazid H (2017) Image enhancement technique on contrast variation: a
comprehensive review. J Telecommun Electron Comput Eng 9:199–204
3. Kanafiah SNAM, Mashor MY, Mustafa WA, Mohamed Z (2018) A novel contrast
enhancement technique based on combination of local and global statistical data on malaria
images. J Biomim Biomater Biomed Eng 38:23–30. https://doi.org/10.4028/www.scientific.net/JBBBE.38.23
4. Mustafa WA, Yazid H, Yaacob S (2014) A review : comparison between different type of
filtering methods on the contrast variation retinal images. In: IEEE international conference on
control system, computing and engineering, pp 542–546
5. Arici T, Dikbas S, Altunbasak A (2009) A histogram modification framework and its
application for image contrast enhancement. IEEE Trans Image Process 18:1921–1935.
https://doi.org/10.1109/TIP.2009.2021548
6. Ibrahim H, Kong NSP (2007) Brightness preserving dynamic histogram equalization for
image contrast enhancement. IEEE Trans Consum Electron 53:1752–1758.
https://doi.org/10.1109/TCE.2007.4429280
7. Baby J, Karunakaran V (2014) Bi-Level Weighted Histogram Equalization with Adaptive
Gamma Correction. Int J Comput Eng Res 4(3):25–30
8. Mustafa WA, Yazid H (2017) Contrast and luminosity correction based on statistical region
information. Adv Sci Lett 23:5383–5386
9. Mustafa WA, Yazid H (2016) Illumination and contrast correction strategy using bilateral
filtering and binarization comparison. J Telecommun Electron Comput Eng 8:67–73
10. Chang CW, Lin MY, Harn HJ, Harn YC, Chen CH, Tsai KH, Hwang CH (2009) Automatic
segmentation of abnormal cell nuclei from microscopic image analysis for cervical cancer
screening. In: 2009 IEEE 3rd international conference on nano/molecular medicine and
engineering NANOMED 2009, pp 77–80. https://doi.org/10.1109/NANOMED.2009.5559114
11. Plissiti ME, Nikou C, Charchanti A (2011) Accurate localization of cell nuclei in pap smear
images using gradient vector flow deformable models, pp 284–289.
https://doi.org/10.5220/0002746702840289
12. Plissiti ME, Nikou C, Charchanti A (2011) Automated detection of cell nuclei in pap smear
images using morphological reconstruction and clustering. IEEE Trans Inf Technol Biomed
15:233–241. https://doi.org/10.1109/TITB.2010.2087030
13. Tareef A, Song Y, Cai W, Huang H, Chang H, Wang Y, Fulham M, Feng D, Chen M (2017)
Automatic segmentation of overlapping cervical smear cells based on local distinctive features
and guided shape deformation. Neurocomputing 221:94–107.
https://doi.org/10.1016/j.neucom.2016.09.070
14. Isa NAM (2015) Contrast enhancement image processing technique on segmented pap smear
cytology images, 6:3375–3379. https://doi.org/10.13040/IJPSR.0975-8232.6(8).3375-79
15. Abdul-nasir AS, Mustafa N, Mohd-nasir NF (2009) Application of thresholding technique in
determining ratio of blood cells for leukemia detection. In: Proceedings of the international
conference on man-machine systems, pp 11–13
Contrast Enhancement Approaches on Medical Microscopic Images … 725

16. Aimi Salihah AN, Mashor MY, Harun NH, Abdullah AA, Rosline H (2010) Improving colour
image segmentation on acute myelogenous leukaemia images using contrast enhancement
techniques. In: 2010 IEEE EMBS conference on biomedical engineering and science, pp 246–
251. https://doi.org/10.1109/IECBES.2010.5742237
17. Harun NH, Mashor Y, Mokhtar NR, Osman MK (2010) Comparison of acute leukemia image
segmentation using HSI and RGB. In: International conference on information science, signal
processing and their applications 2010, pp 749–752
18. Halim NHA, Mashor MY, Abdul Nasir AS, Mokhtar NR, Rosline H (2011) Nucleus
segmentation technique for acute leukemia. In: Proceedings - 2011 IEEE 7th international
colloquium on signal processing and its applications CSPA 2011, pp 192–197. https://doi.org/
10.1109/CSPA.2011.5759871
19. Ashwini R, Aswini N (2017) Image processing based leukemia cancer cell detection. In: 2016
IEEE international conference on recent trends in electronics, information & communication
technology RTEICT 2016 – Proceedings, pp 471–474. https://doi.org/10.1109/RTEICT.2016.
7807865
20. Purwar Y, Shah SL, Clarke G, Almugairi A, Muehlenbachs A (2011) Automated and
unsupervised detection of malarial parasites in microscopic images. Malar. J. 10:364. https://
doi.org/10.1186/1475-2875-10-364
21. Nasir ASA, Mashor MY, Mohamed Z (2012) Segmentation based approach for detection of
malaria parasites using moving k-means clustering. In: 2012 IEEE EMBS conference on
biomedical engineering and science, pp 653–658. https://doi.org/10.1109/IECBES.2012.
6498073
22. Mehrjou A, Abbasian T, Izadi M (2013) Automatic malaria diagnosis system. In:
International conference on robotics and mechatronics, ICRoM 2013, pp 205–211. https://
doi.org/10.1109/ICRoM.2013.6510106
23. May Z, Sarah S, Mohd A (2013) Automated quantification and classification of malaria
parasites in thin blood smears, pp 369–373
24. Somasekar J, Reddy BE (2015) Contrast-enhanced microscopic imaging of malaria parasites.
In: 2014 IEEE international conference on computational intelligence and computing research
IEEE ICCIC 2014, pp 1–4. https://doi.org/10.1109/ICCIC.2014.7238439
25. Savkare SS, Narote SP (2015) Automated system for malaria parasite identification. In:
Proceedings - 2015 international conference on communication, information & computing
technology ICCICT 2015, pp 15–18. https://doi.org/10.1109/ICCICT.2015.7045660
26. Abidin SR, Salamah U, Nugroho AS (2017) Segmentation of malaria parasite candidates from
thick blood smear microphotographs image using active contour without edge. In:
Proceedings of the 2016 1st international conference on biomedical engineering: empowering
biomedical technology for better future IBIOMED 2016. https://doi.org/10.1109/IBIOMED.
2016.7869824
27. Raof RAA, Mashor MY, Ahmad RB, Noor SSM (2012) Image segmentation of
Ziehl-Neelsen sputum slide images for tubercle bacilli detection. In: Image segmentation.
https://doi.org/10.5772/15808
28. Wahidah MFN, Mustafa N, Mashor MY, Noor SSM (2015) Comparison of color thresholding
and global thresholding for Ziehl-Neelsen TB bacilli slide images in sputum samples. In:
Proceedings - 2015 2nd international conference on biomedical engineering ICoBE 2015,
pp 30–31. https://doi.org/10.1109/ICoBE.2015.7235913
29. Hirimutugoda Y, Wijayarathna G (2010) Image analysis system for detection of red cell
disorders using artificial neural networks. Sri Lanka J Bio-Med Inform 1. https://doi.org/10.
4038/sljbmi.v1i1.1484
30. Maitra M, Kumar Gupta R, Mukherjee M (2012) Detection and counting of red blood cells in
blood cell images using Hough transform. Int J Comput Appl 53:13–17. https://doi.org/10.
5120/8505-2274
726 N. Nahrawi et al.

31. Tyagi M, Saini LM, Dahyia N (2016) Detection of Poikilocyte cells in iron deficiency
anaemia using artificial neural network. In: 2016 international conference on computation of
power, energy information and commuincation ICCPEIC 2016, pp 108–112. https://doi.org/
10.1109/ICCPEIC.2016.7557233
32. Mashor MY (2000) Hybrid training algorithm for RBF network. Int J Comput Internet Manag
8:50–65
33. Ravindraiah R, Srinu MV (2012) Quality improvement for analysis of leukemia images
through contrast stretch methods. Procedia Eng 30:475–481. https://doi.org/10.1016/j.proeng.
2012.01.887
Effect of Different Filtering Techniques on Medical and Document Image

Wan Azani Mustafa, Syafiq Sam, Mohd Aminudin Jamlos, and Wan Khairunizam

Abstract Image enhancement is a very important stage in image processing. A common image enhancement approach is the filtering technique. Filtering addresses problems in the displayed image and can improve image quality. The problems that commonly occur in an image are non-uniform illumination, noise and under-lit regions. These problems also cause difficulties in image recognition for the daily work of certain users. The objective of this study is to explore and compare a few state-of-the-art filtering techniques based on the mathematical algorithms of the filters and then to identify the best filtering method. Several methods were selected in this project, such as the high pass filter, low pass filter, high boost filter and others. All the selected filters were tested on medical images and document images. The resulting images were evaluated using Image Quality Assessments (IQA), namely the global contrast factor (GCF) and the signal to noise ratio (SNR). Based on the numerical results, the homomorphic low pass filter (HLPF) provides better performance than the other filters in terms of the GCF (2.066) and SNR (8.907) values of the selected images.

Keywords Contrast · Illumination · Signal noise ratio · Global contrast factor · Filtering

W. A. Mustafa (&) · S. Sam · M. A. Jamlos
Faculty of Engineering Technology, University of Malaysia Perlis, Kampus Sg. Chuchuh, 02100 Padang Besar, Perlis, Malaysia
e-mail: [email protected]

W. Khairunizam
School of Mechatronic Engineering, University of Malaysia Perlis, 02600 Arau, Perlis, Malaysia


1 Introduction

In recent decades, the filtering technique has been one of the most interesting research subjects because it improves image quality. In the image processing field, filtering is one of the important steps that should be considered before applying post-processing such as detection or segmentation [1, 2]. The main objective of filtering is to remove or reduce the noise in the input image [3]; this unwanted signal would otherwise affect the output image in the later processing stages. Nowadays, many researchers have studied the effect of filtering techniques on non-uniform images such as medical images [4] and document images [5].
In 2013, Xu et al. [6] from Beihang University proposed a new Hessian-based filtering technique. They applied several filtering steps on a directional filter bank, and the filtered directional images were reassembled to produce the final result. Mustafa et al. [7] presented a comprehensive review of filtering types such as the high pass filter, low pass filter and homomorphic filtering. This comparison was tested on medical images, namely retinal images from the DRIVE dataset. According to that experiment, they concluded that the homomorphic high pass filter is more effective than the other filters. In another study, Rahul Rajendran et al. presented a different filtering technique to improve low resolution in the input images [8]. The proposed technique applied guided filtering, edge enhancement, morphological filtering and other operations. This filtering method has numerous uses, such as image fusion and colorization [9]. The results achieved show that the filtering algorithm also effectively enhanced X-ray images. Their method delivers better outcomes than the approach in [10] because it removes the unwanted noise. The experiment was applied to Computed Tomography (CT) and X-ray images. In 2016, a project on brain tumour detection examined the accuracy and efficiency of locating the tumour position and area in MRI images [11]. The brain was scanned and the X-ray image was taken during the entire process; the pre-processing stage removes the noise in the image and resizes the X-ray image of the brain. According to an investigation by Mustafa et al. [12], mean filtering is a good alternative way to remove the noise in non-uniform input images. In that study, they proposed a new enhancement technique by applying mean filtering twice, called the Double Mean Filtering (DMF) technique. Based on the results, the luminosity was successfully normalized and the image quality was improved [13].
In this paper, a comprehensive review of a few selected filtering types is presented. The objective of this study is to explore the mathematical algorithm of each filter and to find the best technique. The experiment was conducted on document images and medical images (cell images). Several image quality assessments, namely the Global Contrast Factor (GCF) and the Signal to Noise Ratio (SNR), were computed in order to compare the effectiveness of every method. The rest of this paper is organized as follows: Sect. 2 describes the selected filtering techniques. Experimental results and discussion are presented in Sect. 3. Finally, Sect. 4 concludes this work.

2 Methodology

The methods used in this paper are explained in this section, including the basic meaning of each filter and its mathematical algorithm.

2.1 Low Pass Filter (LPF)

An LPF is also called a blurring or smoothing filter; the filtered image looks blurrier. No matter how good the camera is, it always adds some amount of snow or noise to the image, and the nature of light itself also contributes noise to the image [14]. The LPF can be divided into two types, discrete and continuous. The equation of the ideal LPF is as follows:
$$\mathrm{LPF}(x, y) = \begin{cases} 1, & \text{if } \sqrt{x^{2} + y^{2}} < c_f \\ 0, & \text{if } \sqrt{x^{2} + y^{2}} > c_f \end{cases} \qquad (1)$$

2.2 High Pass Filter (HPF)

An HPF can be implemented with the same technique as the LPF but with a different convolution kernel [7]. Equation (2) shows the HPF mathematical algorithm. In practice, a Fourier-domain HPF is applied to separate the low-frequency illumination from the high-frequency reflectance. The main purpose of using the high pass filter is to block the low-frequency component while the high-frequency component of the signal is passed [15, 16].

$$\mathrm{HPF}(x, y) = \begin{cases} 1, & \text{if } r(x, y) > c_f \\ 0, & \text{otherwise} \end{cases} \qquad (2)$$
$$\mathrm{HPF}(x, y) = 1 - \mathrm{LPF}(x, y)$$
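The ideal masks of Eqs. (1) and (2) can be written down directly. The following is a minimal NumPy sketch (our illustration, not code from the paper) that builds an ideal low pass mask, applies it to the centred Fourier spectrum of an image and derives the HPF mask as 1 − LPF; the cutoff value and image size are arbitrary illustrative choices.

```python
import numpy as np

def ideal_lowpass_mask(shape, cutoff):
    """Ideal LPF mask of Eq. (1): 1 inside a circle of radius `cutoff`
    (in frequency-domain samples), 0 outside."""
    rows, cols = shape
    # Distance of every frequency sample from the centre of the spectrum
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    radius = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)
    return (radius < cutoff).astype(float)

def apply_frequency_filter(image, mask):
    """Filter a greyscale image by masking its centred Fourier spectrum."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return np.real(filtered)

if __name__ == "__main__":
    img = np.random.rand(256, 256)           # stand-in for a document/cell image
    lpf = ideal_lowpass_mask(img.shape, 30)  # cutoff chosen arbitrarily here
    hpf = 1.0 - lpf                          # Eq. (2): HPF = 1 - LPF
    smooth = apply_frequency_filter(img, lpf)
    edges = apply_frequency_filter(img, hpf)
```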

2.3 High Boost Filter (HBF)

High boost filtering of a black and white image is one of the spatial-domain image sharpening techniques. It is frequently necessary to emphasise the high-frequency parts that carry the image detail while retaining, rather than removing, the low-frequency components that represent the basic form of the signal [17]. In this case, the HBF can be used to amplify the high-frequency parts while leaving the low-frequency parts unchanged. Figure 1 shows the comparison between the low pass filter, the high pass filter and the high boost filter.

Fig. 1 Homomorphic high boost filter graph

2.4 Homomorphic Filtering

This method is a typical approach for filtering and improving an image. It stabilizes the illumination of the image and improves the contrast when the illumination component cannot be excluded [18, 19]. Two components are treated simultaneously to reduce the unwanted appearance, namely the illumination and the contrast [20]. Fundamentally, an image can be expressed mathematically in terms of illumination and reflectance as in Eq. (3). The model is first taken into the natural log domain and the Fourier transform is obtained as in Eq. (4). Next, the inverse Fourier transform is applied as denoted in Eq. (5), where I(x, y) is the illumination image, R(x, y) the reflectance image and H(x, y) the (spatial) filter. Lastly, the inverse of the natural log, the exponential, is applied to return to the spatial domain, as in Eq. (6).

$$F(x, y) = I(x, y)\, R(x, y) \qquad (3)$$

$$Z(x, y) = \ln\{F(x, y)\} = \ln\{I(x, y)\} + \ln\{R(x, y)\} \qquad (4)$$

$$S(x, y) = F^{-1}\{H(x, y)\, I(x, y)\} + F^{-1}\{H(x, y)\, R(x, y)\} \qquad (5)$$

$$G(x, y) = \exp\{I(x, y)\} \times \exp\{R(x, y)\} \qquad (6)$$
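As an illustration of the log–transform–filter–exponentiate pipeline of Eqs. (3)–(6), the sketch below implements homomorphic filtering in NumPy. The Gaussian high-frequency-emphasis form of H(x, y) and the gamma/sigma parameters are our assumptions for illustration; the paper does not specify the exact filter shape used.

```python
import numpy as np

def homomorphic_filter(image, gamma_low=0.5, gamma_high=1.5, sigma=30.0):
    """Log -> FFT -> emphasis filter H -> IFFT -> exp, following Eqs. (3)-(6).
    gamma_low/gamma_high weight the illumination and reflectance bands."""
    img = image.astype(float) + 1.0                  # avoid log(0)
    log_img = np.log(img)                            # Eq. (4): work in the log domain
    spectrum = np.fft.fftshift(np.fft.fft2(log_img))

    rows, cols = img.shape
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    d2 = u[:, None] ** 2 + v[None, :] ** 2
    # Gaussian high-frequency emphasis: low gain on illumination,
    # higher gain on reflectance (one common choice of H(x, y))
    H = (gamma_high - gamma_low) * (1.0 - np.exp(-d2 / (2.0 * sigma ** 2))) + gamma_low

    filtered = np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * H)))  # Eq. (5)
    return np.exp(filtered) - 1.0                    # Eq. (6): back to intensities
```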



2.5 Image Quality Assessment (IQA)

Image quality assessment (IQA) is a method to measure the display quality of an image, and it plays a variety of roles in many image processing techniques [21]. In general, the evaluation of image quality can be categorized into two types, subjective and objective quality evaluation. A subjective quality technique such as the Mean Opinion Score (MOS) is decisive but troublesome, time consuming and costly. Objective evaluation, on the other hand, is simpler and easier than the subjective technique, and it is more accurate because it uses quantitative data. In this study, objective measurement techniques were used to evaluate the performance of each filtering type. In this assessment, two types of image quality assessment (IQA) were calculated, namely the Global Contrast Factor (GCF) and the Signal to Noise Ratio (SNR).
Signal to Noise Ratio (SNR). The SNR is the ratio of the average signal value to the standard deviation of the signal. A higher SNR value indicates a better quality image, while a lower SNR indicates regions of the image corrupted by background noise [22, 23]. Here I(x, y) represents the input image, mean[I(x, y)] its average intensity and std[I(x, y)] its standard deviation. The SNR equation is as follows:

$$\mathrm{SNR} = 10 \log_{10} \frac{\mathrm{mean}[I(x, y)]}{\mathrm{std}[I(x, y)]} \qquad (7)$$

Global Contrast Factor (GCF). The more recently introduced GCF is closer to human perception of contrast: it calculates the local contrasts c_i at several spatial frequencies and combines them with weights w_i into the global contrast factor. A low GCF value indicates that the image is uniform [24, 25].

$$\mathrm{GCF} = \sum_{i=1}^{N} w_i \, c_i \qquad (8)$$

In this study, 10 document images and 10 cell images were tested. First, the input data went through each filtering process, and each output image was saved to storage. After all the processes were finished, the GCF and SNR were computed in order to identify the most effective filtering technique.
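For concreteness, the following sketch computes the SNR exactly as defined in Eq. (7) and a simplified GCF in the spirit of Eq. (8); the local-contrast measure and the uniform weights w_i are placeholders chosen for illustration only (the original GCF weights and local-contrast definition are given in Matkovic et al. [25]).

```python
import numpy as np

def snr(image):
    """Signal-to-noise ratio of Eq. (7): 10*log10(mean / std) of the image."""
    img = image.astype(float)
    return 10.0 * np.log10(img.mean() / img.std())

def local_contrast(image):
    """Mean absolute difference between horizontally and vertically
    neighbouring pixels (a simple local-contrast measure)."""
    img = image.astype(float)
    dh = np.abs(np.diff(img, axis=1)).mean()
    dv = np.abs(np.diff(img, axis=0)).mean()
    return (dh + dv) / 2.0

def gcf(image, levels=7):
    """Simplified global contrast factor following the form of Eq. (8):
    weighted sum of local contrasts c_i over `levels` resolutions.
    Uniform weights w_i are used here purely for illustration."""
    img = image.astype(float)
    contrasts = []
    for _ in range(levels):
        if min(img.shape) < 2:
            break
        contrasts.append(local_contrast(img))
        img = img[::2, ::2]                 # crude 2x downsampling per level
    weights = np.ones(len(contrasts)) / len(contrasts)
    return float(np.dot(weights, contrasts))
```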

3 Results and Discussion

An indication of image quality is significant for image processing techniques. Objective methods for measuring perceptual image quality were traditionally designed to measure the visibility of defects between a degraded image and an original image using a selection of known properties. First, such a measure can be used to dynamically monitor and control the quality of the picture; for example, a network digital video server can observe the quality of the video being switched and distributed to display assets. Next, it can be used to tune the algorithms and parameter settings of image processing systems; for instance, in a vision system, a quality measurement can contribute to the optimal design of filtering and related algorithms [6].
In this part, two types of image quality assessment (IQA) were calculated, namely the SNR and the GCF. The primary goal of this analysis is to explore and study the behaviour of each IQA on the tested images. Figures 2 and 3 show the resulting images after applying the selected filtering techniques to the document images and medical images. As summarized in Table 1, twenty (20) images were filtered, divided into two types: document images and medical images.

Fig. 2 Comparison of different types of filter on the document images (rows: original image, LPF, HPF, HHPF, HLPF, HHBF)



Fig. 3 Comparison of different types of filter on the medical images (rows: original image, LPF, HPF, HHPF, HLPF, HHBF)

The figures show the outcome images after applying the different filters. The homomorphic low pass filter (HLPF) gives the best output in suppressing the contrast variation in both the document images and the medical images compared with the others. Table 1 shows the GCF and SNR values obtained with the different kinds of filter; the best image should produce a high SNR value and a low GCF value. Based on Table 1, image number 9 from the document images gives the highest SNR value, 10.974, and the lowest GCF value, 0.679, using the homomorphic LPF method. Similarly, image number 17 from the medical images gives the highest SNR value, 10.177, and the lowest GCF value, 1.791, also using the homomorphic LPF method. According to the theory of the SNR, a high SNR value indicates a better image because the noise relative to the intensity has diminished, while the GCF should be low; if the GCF approaches zero, the image is considered ideal (fully uniform). The best performance in this IQA comparison is therefore achieved by the homomorphic LPF, which gives the highest SNR value (10.974) and the lowest GCF value (0.679) compared with the other filters over all of the tested images.

Table 1 Comparison of GCF and SNR values using different filtering methods (images 1–10 are document images, images 11–20 are medical images)

| Image no. | LPF GCF | LPF SNR | HPF GCF | HPF SNR | HHPF GCF | HHPF SNR | HLPF GCF | HLPF SNR | HHBF GCF | HHBF SNR |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 4.445 | 5.900 | 3.653 | 6.272 | 4.494 | 5.912 | 2.970 | 6.512 | 4.318 | 5.769 |
| 2 | 3.661 | 10.001 | 3.341 | 6.308 | 3.525 | 4.926 | 3.132 | 6.615 | 3.685 | 5.361 |
| 3 | 3.666 | 7.748 | 2.323 | 8.477 | 2.652 | 8.244 | 1.707 | 9.396 | 2.643 | 8.058 |
| 4 | 2.613 | 7.335 | 2.129 | 8.427 | 2.240 | 8.334 | 1.434 | 9.350 | 2.160 | 8.381 |
| 5 | 2.510 | 7.248 | 1.875 | 8.487 | 2.313 | 7.708 | 1.644 | 8.802 | 2.197 | 7.822 |
| 6 | 3.072 | 7.187 | 2.530 | 7.630 | 2.920 | 7.190 | 1.832 | 7.737 | 2.689 | 7.156 |
| 7 | 3.431 | 7.052 | 2.634 | 8.029 | 2.850 | 7.743 | 1.806 | 8.596 | 2.769 | 7.749 |
| 8 | 2.965 | 6.904 | 2.454 | 8.157 | 2.730 | 7.873 | 1.924 | 8.524 | 2.730 | 7.729 |
| 9 | 1.307 | 4.591 | 1.088 | 10.420 | 1.136 | 10.235 | 0.679 | 10.974 | 1.069 | 10.357 |
| 10 | 3.367 | 2.965 | 2.912 | 7.469 | 3.093 | 7.355 | 1.952 | 8.085 | 2.921 | 7.417 |
| 11 | 3.514 | 7.316 | 3.023 | 8.314 | 3.278 | 7.669 | 2.655 | 8.974 | 2.987 | 8.282 |
| 12 | 3.540 | 7.132 | 3.062 | 7.937 | 3.350 | 7.358 | 2.736 | 8.389 | 3.083 | 7.834 |
| 13 | 2.896 | 8.002 | 2.099 | 9.626 | 2.676 | 8.345 | 1.857 | 10.157 | 2.241 | 9.300 |
| 14 | 3.222 | 7.406 | 2.614 | 8.571 | 3.027 | 7.727 | 2.288 | 9.172 | 2.678 | 8.415 |
| 15 | 2.915 | 7.925 | 2.340 | 9.035 | 2.647 | 8.364 | 1.983 | 9.710 | 2.356 | 8.942 |
| 16 | 3.062 | 7.740 | 2.610 | 8.539 | 2.861 | 8.004 | 2.299 | 9.039 | 2.603 | 8.476 |
| 17 | 2.678 | 8.368 | 2.103 | 9.563 | 2.438 | 8.756 | 1.791 | 10.177 | 2.147 | 9.414 |
| 18 | 2.745 | 8.135 | 2.180 | 9.339 | 2.548 | 8.450 | 1.906 | 9.812 | 2.288 | 9.020 |
| 19 | 2.733 | 8.411 | 2.211 | 9.452 | 2.519 | 8.758 | 1.907 | 10.046 | 2.233 | 9.357 |
| 20 | 3.865 | 6.566 | 3.244 | 7.497 | 3.608 | 6.868 | 2.816 | 8.075 | 3.308 | 7.333 |
| Total | 62.20 | 143.93 | 50.42 | 167.54 | 56.90 | 155.81 | 41.31 | 178.13 | 53.10 | 162.17 |
| Average | 3.110 | 7.197 | 2.521 | 8.377 | 2.845 | 7.791 | 2.066 | 8.907 | 2.655 | 8.109 |
| Highest | 4.445 | 10.001 | 3.653 | 10.420 | 4.494 | 10.235 | 3.132 | 10.974 | 4.318 | 10.357 |
| Lowest | 1.307 | 2.965 | 1.088 | 6.272 | 1.136 | 4.926 | 0.679 | 6.512 | 1.069 | 5.361 |

4 Conclusion

The filters can be applied to images that suffer from illumination problems, noise and under-lit regions. This paper has given an overview of the background and related work in the area of filtering techniques. The work described a few selected popular filtering methods such as the HPF, LPF and homomorphic filters, and also explored the mathematical algorithms behind these filtering techniques. The filters were tested on medical images and document images, and the output images were evaluated using the global contrast factor (GCF) and the signal to noise ratio (SNR). The method giving the highest SNR value and the lowest GCF value is considered the best. Based on the numerical results, the homomorphic low pass filter (HLPF) provides better performance than the other filters in terms of the GCF (2.066) and SNR (8.907) values of the selected images. Further experimental research is recommended to address the weakness of this method, because the GCF value is still not close to zero; from the results, the closer the GCF is to zero, the better the corrected image. More work is therefore needed to determine how to obtain the best GCF and SNR values. A future study investigating many more filters would also be very interesting.

Acknowledgements This work was supported by the Ministry of Higher Education Malaysia
under the Fundamental Research Grant Scheme (FRGS/1/2018/SKK13/UNIMAP/02/1).

References

1. Al-Rawi M, Qutaishat M, Arrar M (2007) An improved matched filter for blood vessel
detection of digital retinal images. Comput Biol Med 37:262–267
2. Thilagamani S, Shanthi N (2014) Gaussian and Gabor filter approach for object segmentation.
J Comput Inf Sci Eng 14:1–7. https://doi.org/10.1115/1.4026458
3. Chandel R, Gupta G (2013) Image filtering algorithms and techniques: a review. Int J Adv
Res Comput Sci Softw Eng 3:198–202
4. Mustafa WA, Yazid H, Yaacob S (2015) Illumination correction of retinal images using
superimpose low pass and Gaussian filtering. In: International conference on biomedical
engineering (ICoBE), pp 1–4
5. Sehad A, Chibani, Y, Hedjam R, Cheriet M (2018) Gabor filter-based texture for ancient
degraded document image binarization. https://doi.org/10.1007/s10044-018-0747-7
6. Xu X, Liu B, Zhou F (2013) Hessian-based vessel enhancement combined with directional
filter banks and vessel similarity. In: 2013 ICME international conference on complex
medical engineering, CME 2013, pp 80–84. https://doi.org/10.1109/ICCME.2013.6548216
7. Mustafa WA, Yazid H, Yaacob S (2014) A review : comparison between different type of
filtering methods on the contrast variation retinal images. In: IEEE international conference on
control system, computing and engineering, pp 542–546
8. Rajendran R, Panetta K (2016) A versatile edge preserving image enhancement approach for
medical images using guided filter. In: IEEE international conference on systems, man and
cybernetics, SMC 2016, pp 2341–2346
9. Li S, Kang X, Hu J (2013) Image fusion with guided filtering. IEEE Trans Image Process
22:2864–2875. https://doi.org/10.1109/TIP.2013.2244222
10. Rajendran R, Rao SP, Agaian SS, Liss M (2016) A novel technique to enhance low resolution
CT and magnetic resonance images. In: Simulation series
11. Shivakumarswamy GM, Aksha Patil V, Chethan TA, Prajwal BH, Hande SV (2016) Brain
tumour detection using Image processing and sending tumour information over GSM. Int J
Adv Res Comput Commun Eng 5:179–183. https://doi.org/10.17148/IJARCCE.2016.5543
12. Mustafa WA, Yazid H, Yaacob S (2014) Illumination normalization of non-uniform images
based on double mean filtering. In: IEEE international conference on control systems,
computing and engineering, pp 366–371
13. Mustafa WA, Yazid H, Kader MMMA (2018) Luminosity correction using statistical features
on retinal images. J Biomim Biomater Biomed Eng. 37:74–84. https://doi.org/10.4028/www.
scientific.net/JBBBE.37.74
14. Zhu S, Zeng B, Yan S (2012) Image super-resolution via low-pass filter based multi-scale
image decomposition. In: Proceedings - IEEE international conference on multimedia and
expo, pp 1045–1050. https://doi.org/10.1109/ICME.2012.29
15. Liu M, Wang A (2014) Fully homomorphic encryption and its applications. Comput Sci Res
Dev 51:2593–2603. https://doi.org/10.7544/issn1000-1239.2014.20131168
16. Gangkofner UG, Pradhan PS, Holcomb DW (2008) Optimizing the high-pass filter addition
technique for image fusion. Photogramm Eng Rem S 74:1107–1118. https://doi.org/10.
14358/PERS.74.9.1107
17. Alirezanejad M, Amirgholipour S, Safari V, Aslani S, Arab M (2014) Improving the
performance of spatial domain image watermarking with high boost filter. Indian J Sci
Technol 7:2133–2139
18. Mustafa WA, Kader MMMA (2018) Contrast enhancement based on fusion method: a
review. J Phys Conf Ser 1019:1–7. https://doi.org/10.1088/1742-6596/1019/1/012025

19. Gu H, Lv W (2012) A modified homomorphic filter for image enhancement. In: Proceedings
of the 2nd international conference on computer application and system modeling. https://doi.
org/10.2991/iccasm.2012.45
20. Mustafa WA, Khairunizam W, Yazid H, Ibrahim Z, Ab S, Razlan ZM (2018) Image
correction based on homomorphic filtering approaches : a study. In: IEEE international
conference on computational approach in smart systems design and applications (ICASSDA).
IEEE, pp 1–5
21. Mustafa WA, Yazid H, Jaafar M, Zainal M, Abdul- AS, Mazlan N (2017) A review of image
quality assessment (IQA): SNR, GCF, AD, NAE, PSNR, ME. J Adv Res Comput Appl 7:1–7
22. Mustafa WA, Yazid H (2016) Background correction using average filtering and gradient
based thresholding. J Telecommun Electron Comput Eng 8:81–88
23. Mustafa WA, Yazid H (2016) Illumination and contrast correction strategy using bilateral
filtering and binarization comparison. J Telecommun Electron Comput Eng 8:67–73
24. Kanafiah SNAM, Mashor MY, Mustafa WA, Mohamed Z (2018) A novel contrast
enhancement technique based on combination of local and global statistical data on malaria
images. J Biomim Biomater Biomed Eng 38:23–30. https://doi.org/10.4028/www.scientific.
net/JBBBE.38.23
25. Matkovic K, Neumann L, Neumann A, Psik T, Purgathofer W (2005) Global contrast factor -
a new approach to image contrast. In: Computational aesthetics in graphics, visualization and
imaging, pp 159–167
Implementation of Seat Belt Monitoring and Alert System for Car Safety

Zainah Md Zain, Mohd Hairuddin Abu Bakar, Aman Zaki Mamat, Wan Nor Rafidah Wan Abdullah, Norsuryani Zainal Abidin, and Haris Faisal Shaharuddin

Abstract Modern cars have many safety features which play a significant role in reducing traffic injuries and deaths. One of the causes of fatalities in car accidents is not wearing a seat belt. In order to overcome this problem, an attempt has been made to design a car safety system whereby the car will not run unless the driver and passengers fasten their seat belts before starting the car. In the proposed system, ultrasonic devices and limit switches are used to detect the driver and passengers and to detect that the seat belts have been fastened, respectively. In addition, a switching circuit is designed and installed between the seat belts and the ignition system to control engine starting. An Arduino Mega microcontroller acts as the signal processing unit that controls the safety system in the car. The experimental results show that the system is able to enhance the safety of the driver and passengers.

Keywords Seat belt · Alert systems · Car safety

1 Introduction

A seat belt is one of the best safety features in a modern car, securing passengers during collisions and other accidents [1]. All cars are equipped with three-point seat belts for the driver (mandatory), the front passenger (mandatory) and the rear passengers (optional). The seat belt plays a vital role in preventing injuries. The basic idea of a seat belt is very simple: it keeps you from flying through the windshield or hurtling toward the dashboard when the car comes to an abrupt stop. A seat belt applies the stopping force to the more durable parts of the body over a longer period of time to prevent injuries. A typical seat belt consists of a lap belt,

Z. Md Zain (&) · M. H. Abu Bakar · A. Z. Mamat · W. N. R. Wan Abdullah · N. Zainal Abidin · H. F. Shaharuddin
Robotics and Unmanned Research Group (RUS), Instrument and Control Engineering (ICE) Cluster, Faculty of Electrical and Electronics Engineering, Universiti Malaysia Pahang, 26600 Pekan, Pahang, Malaysia
e-mail: [email protected]


which rests over the pelvis, and a shoulder belt, which extends across the chest. The two belt sections are tightly secured to the frame of the car in order to hold passengers in their seats. In modern cars, the ECUs are intelligent enough to alert the driver and passengers about the seat belt state, whether buckled or unbuckled, in different ways: sometimes only a tell-tale, sometimes a tell-tale with a buzzer (if the vehicle is moving), or a tell-tale with a buzzer and a text warning display [2–5].
According to a study by the Malaysian Institute of Road Safety Research (MIROS) [6], the chance of surviving an accident in Malaysia improves by about 60% when seat belts are worn. The public attitude of treating seat belt use as trivial is the major contributor to the increase in the death rate due to accidents. The excuses given include discomfort or only travelling a short distance, even though fastening a seat belt takes only a couple of seconds. Users assume that the airbag in the car is sufficient to minimize the impact in an accident, whereas that equipment should be used together with the main safety device, namely the seat belt [5].
Accordingly, the aim of this study is to design a system that enforces the use of seat belts to reduce the risk of death in an accident. The car will not start the engine as long as the driver or a passenger has not fastened the seat belt. If only the driver is in the car, the system detects the driver, who must fasten the seat belt before the engine can be started; if the driver turns the ignition without fastening the seat belt first, the engine cannot be switched on until the seat belt is used. If passenger 1 is present, then even though the driver has fastened the seat belt, the engine still cannot be started until passenger 1 fastens the seat belt as well. The same behaviour is designed for passengers 2, 3 and 4.
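The interlock logic described above can be summarised in a few lines. The following Python sketch is a hypothetical simulation of that decision logic, not the actual Arduino Mega firmware; the occupancy threshold, seat labels and grace period are illustrative assumptions based on the description.

```python
# Hypothetical simulation of the seat-belt interlock logic; thresholds,
# seat labels and the 15-minute grace period mirror the description above
# but do not reproduce the actual Arduino Mega firmware.
OCCUPIED_DISTANCE_CM = 50.0   # ultrasonic reading below this -> seat occupied (assumed)

def occupied_seats(distances_cm):
    """Map per-seat ultrasonic distances to occupancy flags."""
    return {seat: d < OCCUPIED_DISTANCE_CM for seat, d in distances_cm.items()}

def engine_start_allowed(distances_cm, belts_fastened):
    """Engine may start only when every occupied seat has its belt fastened."""
    occupancy = occupied_seats(distances_cm)
    return all(belts_fastened[seat] for seat, used in occupancy.items() if used)

def engine_keeps_running(belt_reopened, minutes_since_reopen, grace_minutes=15):
    """If a belt is opened while driving, the engine stops after the grace period."""
    return (not belt_reopened) or (minutes_since_reopen < grace_minutes)

if __name__ == "__main__":
    distances = {"driver": 20.0, "passenger1": 35.0, "passenger2": 400.0}
    belts = {"driver": True, "passenger1": False, "passenger2": False}
    print(engine_start_allowed(distances, belts))   # False: passenger1 is unbelted
    belts["passenger1"] = True
    print(engine_start_allowed(distances, belts))   # True: the empty seat is ignored
```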

2 System Overview

The main components proposed in the system of Fig. 1 include an Arduino Mega microcontroller acting as the heart of the system, limit switches, proximity switches, an ISD1820 voice recorder and player, an 8-channel 5 VDC relay module, HC-SR04 ultrasonic sensors, a mini siren and a 16 × 2 display. The connections between the electronic components are illustrated in Fig. 2.
The flowchart of the proposed system is shown in Fig. 3. The design starts with the ultrasonic sensor circuit and the limit switches, which ensure that the car engine cannot be switched on as long as a limit switch has not been activated while the corresponding ultrasonic detector senses an occupant. A voice message is then used to announce the current status of the system to the driver. The alarm output has been designed using three items deemed suitable for this project: the audio instruction, the LED indicator, whose red code indicates that the car cannot yet be switched on (when the LED turns from red to green the car can be started), and the siren. If the car is already running and a seat belt is removed, the voice output sounds together with the siren to indicate that the engine will stop within 15 min if the seat belt is not re-fastened; if it is re-fastened within the given 15 min, the engine keeps running.

4 ULTRASONIC
SENSORS:

DRIVER_1, SPEAKER ALARM


PASSENGER_1,
PASSENGER_2,
PASSENGER_3 OUTPUT
CONTROLLER
TO
(ARDUINO
START
MEGA 2560)
ENGINE
4 LIMIT SWITCH:
BELT DRIVER_1,
BELT PASSENGER_1, DISPLAY STARTING
BELT PASSENGER_2, ENGINE
BELT PASSENGER 3 CAR

Fig. 1 Block diagram of the proposed system

Fig. 2 Wiring diagram of the proposed system

The hardware was then designed and built, combining the designed circuits with a compact panel mounted in a real car. Once the hardware design is completed, the process moves to the next stage, hardware testing.

Fig. 3 Flowchart of the developed system



At this stage, the hardware is tested to check whether it functions and achieves the goal. If the test is unsuccessful, the process returns to redesigning and rebuilding the hardware. The last stage is to analyse the results, in which the outputs of the simulation and of the tests performed are recorded.

3 Project Implementation

3.1 Panel Design

The front panel is required to show users what the system is doing. The front of the panel contains an LCD display to indicate the current status of the system, four LEDs that can change colour between green and red, each representing the driver or a vehicle passenger, and a blue LED to display a warning when a safety belt is opened while the engine is running.

Fig. 4 Front panel and internal panel design



Fig. 5 Power supply circuit

The red LED lights up when the ultrasonic sensor detects the driver or a passenger in the vehicle; once the driver and passengers are wearing their safety belts, the red LEDs turn green and the driver can then start the vehicle's engine. Behind the front panel are placed the power supply, the Arduino Mega, the voice recorder and the 8-channel relay, to facilitate the installation of this project in the vehicle so that it looks neat. The front panel is mounted on top of the radio (Fig. 4).

3.2 Arduino Circuit Power Supply from the Car Electrical System

The Arduino board can operate on an external supply of 6–20 V. However, if the supply voltage is less than 7 V, the 5 V pin may supply less than 5 V and the board may become unstable, while the voltage regulator may overheat and damage the board if the voltage is more than 12 V; the recommended range is 7–12 V. The basic car electrical system gives around 12–13 V when the engine is off and 13–14 V when it is running. The basic idea is therefore to use a simple regulator which takes an unregulated voltage in and outputs a regulated voltage. The LM1084IT-12 is a 12 V, 5 A low-dropout voltage regulator in a TO220 package, to which a heatsink is added; the low-dropout feature is useful because the car supply varies so much. The power supply is looped from the accessory point (radio), since this point has a maximum current of 10 A. From calculation, the radio uses about 6 A and the Arduino controller draws a maximum of 1.5 A when running with full load (Fig. 5).
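A quick sanity check of this supply budget can be written down from the figures above; the alternator voltage used below is an assumed typical running value, not a measurement from the paper.

```python
# Hypothetical sanity check of the supply budget described above;
# component values come from the text, the alternator voltage is assumed.
V_ALTERNATOR = 13.5      # typical running voltage (V), assumed
V_REGULATOR_OUT = 12.0   # LM1084IT-12 output (V)
I_CONTROLLER = 1.5       # maximum controller current with load (A)
I_RADIO = 6.0            # approximate radio current (A)
FUSE_LIMIT = 10.0        # accessory circuit limit (A)

# A linear regulator dissipates the headroom voltage times the load current
p_dissipated = (V_ALTERNATOR - V_REGULATOR_OUT) * I_CONTROLLER
print(f"Regulator dissipation: {p_dissipated:.2f} W")        # ~2.25 W -> heatsink needed

total_accessory_current = I_RADIO + I_CONTROLLER
print(f"Accessory circuit load: {total_accessory_current:.1f} A "
      f"(limit {FUSE_LIMIT:.0f} A)")                           # 7.5 A < 10 A
```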

3.3 Start-Stop Wiring Diagram

The start-stop wiring diagram for starting and stopping the engine is shown in Fig. 6.

Fig. 6 Start-stop wiring diagram

3.4 Buckle Design

The buckle is one of the important parts of this project: the buckle is fitted with a limit switch to detect whether the vehicle occupant is wearing the safety belt or not. The limit switch is connected to a relay, and the relay signals the display. The buckle design is not difficult; it only requires drilling the screw hole and installing the limit switch at a position where it can detect that the safety belt is already worn (Fig. 7).

4 Results and Discussions

Wearing the rear seat belt also protects the driver and the front passengers, since a rear passenger who does not wear the belt poses a risk to the occupants in front: when a vehicle moving at 50 km/h stops suddenly, a 60 kg rear passenger can hit the front seat with an impact of about 2.4 tons. The belt also prevents passengers from being thrown about or ejected from the vehicle when it stops suddenly. This project is therefore a very useful reminder to drivers who ignore the importance of the seat belt. The complete electronic part of the developed panel is shown in Fig. 8.

Fig. 7 Buckle design

Fig. 8 Electronic parts in the panel

4.1 Power Consumption

The power consumption of the developed hardware was measured as shown in Fig. 9, and the full power consumption readings are presented in Table 1.
• The calculation uses the formula Power = Voltage × Current.
• 404.8 mA to 140.4 mA is the measured LED current range as the number of LEDs used in the system goes from 1 to 5 units.
• 134.5 mA to 624.5 mA is the current range for 1 to 8 relays operating.
• 141.8 mA to 361.8 mA is the current range for 1 to 4 ultrasonic sensors operating.
• The total current for the full system operating is 1.5 A and the power is 16859 mW (a rough cross-check of these totals is sketched below).
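The following small sketch cross-checks the reported readings using P = V × I; the per-subsystem maxima come from the list above, while the supply voltage is back-calculated from the reported totals rather than taken from the paper.

```python
# Rough cross-check of the measurements listed above. The per-subsystem maxima
# are taken from the text; the supply voltage is back-calculated, not measured.
subsystem_max_ma = {"LEDs": 404.8, "relays": 624.5, "ultrasonic sensors": 361.8}
total_current_a = 1.5            # reported total system current (A)
reported_power_mw = 16859.0      # reported total system power (mW)

sum_of_maxima_a = sum(subsystem_max_ma.values()) / 1000.0
implied_voltage = reported_power_mw / 1000.0 / total_current_a   # from P = V * I

print(f"Sum of subsystem maxima: {sum_of_maxima_a:.2f} A (reported total 1.5 A)")
print(f"Supply voltage implied by P = V * I: {implied_voltage:.2f} V")
```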

Fig. 9 Measuring power consumption

Table 1 Power consumption

4.2 Working Principle

Figure 10 shows that when the ultrasonic sensors detect the driver and the passengers, the LEDs show red and the LCD displays the status "TIGHTEN SEAT BELT FIRST" so that the safety belts are fastened.
Figure 11 shows that after all the safety belts are worn, the LEDs turn green and the LCD shows that the driver can start the vehicle engine.

Fig. 10 Ultrasonic sensors detect the passengers in the car



Fig. 11 After all the safety belt is worn

Figure 12 shows the passenger 2 LED turning red; this is because passenger 2 has opened the safety belt. In the meantime, the blue warning LED lights up and the LCD display shows that the engine will stop within 15 min if the safety belt is not re-fastened.

Fig. 12 The LED passenger 2 lights turn red



Fig. 13 The safety belt is still not worn

Figure 13 shows that if the safety belt is still not worn, the LCD display indicates that the engine has stopped.

5 Conclusion

In this paper, we have presented a system designed to monitor the driver's and passengers' seat belts and to alert them, providing a car safety system in which the car cannot be started unless the driver and passengers fasten their seat belts first; this system has been successfully developed. Therefore, it can be used in a car as an alert system for the driver and passengers in order to reduce the fatalities caused by accidents in which a seat belt was not worn while driving.

Acknowledgements The authors would like to thank Universiti Malaysia Pahang (UMP) for the support given to this research under grant RDU1803189.

References

1. Hammadi KA, Ismaeel M, Faisal T (2016) Intelligent car safety system. In: 2016 IEEE
industrial electronics and applications conference (IEACon), pp 319–322. https://doi.org/10.
1109/ieacon.2016.8067398
2. Seelam K, Lakshmi CJ (2017) An Arduino based embedded system in passenger car for road
safety. In: 2017 international conference on inventive communication and computational
technologies (ICICCT). https://doi.org/10.1109/icicct.2017.7975201
3. Johansson P, Bernhard J (2012) Advanced control of a remotely operated underwater vehicle,
Department of Electrical Engineering, Linköpings Universitet, Sweden, Technical report.
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-79364

4. Kulanthayan S, Law TH, Raha AR, Radin Umar RS (2004) Seatbelt use among car users in
Malaysia. IATSS Res 28:19–25
5. Fahmi AM, Othman I, Ahmad MS, Batcha WA, Mohamed N. Chapter of seatbelt wearing
among vehicle occupants, report of evaluation of the effectiveness of ops Chinese new year
2013 conducted over the Hari Raya period from February 3 to February 17 2013. Malaysian
Institute of Road Safety Research, Kuala Lumpur
6. Seat belts, helmet reduces fatalities, 27 July 2015. http://www.thesundaily.my/news/1500754
Electroporation Study: Pulse Electric Field Effect on Breast Cancer Cell

Nur Adilah Abd Rahman, Muhammad Mahadi Abdul Jamil, Mohamad Nazib Adon, Chew Chang Choon, and Radzi Ambar

Abstract Electroporation has been known since the mid-1980s, but the first clinical trial, conducted in the 1990s, used 100 µs pulses with an amplitude of 1–1.5 kV/cm to deliver bleomycin to the target region of carcinoma cells. Nowadays the pulse electric field has rapidly growing applications in medicine, food, industry and the environment. The pulse electric field can be applied in a variety of ways, characterized by the pulse width (nanoseconds to milliseconds) chosen for the intended application. Electroporation means increasing the permeability of the plasma membrane. There are two types of electroporation, reversible (temporary) and irreversible (permanent). These applications are widely used for cancer therapy, and reversible electroporation is the most frequently used. In this research, a small range of pulse electric field amplitudes (100–1000 V/cm, 30 µs, single pulse) is applied to breast cancer cells to explore the electroporation method. Besides this, the research concentrates on the efficacy of reversible and irreversible electroporation parameters in order to examine the anti-proliferation impact on the cancer cells. The use of an electric field as the stimulator for aggressive adsorption of an anti-cancer agent into the cells introduces new variables in cancer cell therapy. This technique will help in understanding the factors in cancer cell therapy that may lead to a new, drug-free therapy method.

Keywords Electroporation · Pulse electric field · Cancer treatment · Irreversible · Reversible

N. A. Abd Rahman · M. M. Abdul Jamil (&) · M. N. Adon · C. Chang Choon · R. Ambar
Biomedical Modelling and Simulation Research Group, Faculty of Electrical and Electronics Engineering, Universiti Tun Hussein Onn Malaysia, Batu Pahat, Johor, Malaysia
e-mail: [email protected]


1 Introduction

1.1 Electroporation

Electroporation is a technique that has been commonly used in medical areas since 1991, when the first clinical trial was conducted to introduce an anti-cancer agent into malignant cells for cancer treatment. The phenomenon was discovered in 1754 by J. A. Nollet, who experimented with electrical fields and noticed red spots on human and animal skin in the areas where the sparks were applied; this was attributed to Joule heating, or could be described as damage to capillaries indicating irreversible electroporation, and it was supported by A. J. Jex-Blake, who observed the same injuries in 1913. Stämpfli and Willi reported in 1957 that the membrane breakdown is irreversible under certain circumstances and reversible otherwise. In 1961 and 1977, the technique continued to grow in the food processing industry. K. Kinosita and T. Tsong proposed that cell membrane permeabilization following the application of a pulse electric field is linked to the creation of pores which can vary in size. In 1987 and 1988, R. C. Lee started a significant sequence of irreversible-electroporation research on electrical discharge-induced tissue trauma. From 2003 to 2007, R. V. Davalos and B. Rubinsky pointed out that IRE can readily be implemented in regions with an elevated level of perfusion, such as in the proximity of blood vessels, and performed a clinical scenario on the liver using 18-gauge stainless steel needles placed with sonography assistance. In 2007, Al-Sakere et al. performed IRE on mice with subcutaneous tumors and studied the immune reactions. In 2010, G. Onik and B. Rubinsky reported the first irreversible-electroporation clinical trial on patients with prostate cancer in a series of outpatient procedures, and in 2015 this was extended to 200 patients with locally advanced pancreatic adenocarcinoma (stage III) [1].
The transfer of material into cells and tissues through electrical induction provides an opportunity for many new medical procedures and a precious tool for research into the fundamental structural and biochemical behaviour of the cellular and intercellular system [2]. It has been found that this technique is an effective way to overcome the membrane barrier [3]. It is therefore interesting to explore morphological modifications of the cell membrane, such as cells moving, separating, fusing and deforming during electroporation [4]. Pulse electric fields influence not only excitable tissues, such as muscles and nerves, but also non-excitable tissues, either thermally, by generating heat within the tissue, or by causing structural changes down to the cellular membranes [5].
However, electroporation is hard to observe directly because the pores are very small (nanometres) and their creation and development are very rapid (microseconds). Therefore, several quantities need to be measured in any experimental method. While important progress has been made, there are still fundamental elements of electroporation that have not been fully determined experimentally.

This research uses an in vitro method to observe particular interactions at the cellular level by exposing cells to the PEF outside the living organism under controlled settings. Many EP systems have been created since Eberhard Neumann used an EP scheme in molecular biology [6]. Treatments using the electroporation technique include cancer treatment, tissue ablation, food sterilization and much more.

2 Literature Review

2.1 Breast Cancer Cell

Breast cancer is one of the most prevalent cancers among women in Malaysia, with the largest proportion of patients dying as a result, namely 52 percent among Malaysians [7]. Breast cancer develops from breast tissue, with signs such as a change in breast shape, the presence of a lump in the breast, fluid from the nipple, skin dimpling or a red scaly skin patch. The risk factors are being a woman, obesity, absence of physical activity, alcohol consumption, hormone replacement therapy during menopause, ionizing radiation, early age at first menstruation, age and family history. It usually develops in the cells of the milk duct lining and in the lobules that supply milk to the ducts. Surgery, radiation therapy, chemotherapy, hormonal therapy and targeted therapy are the treatments given. Breast cancer is a metastatic disease that can spread beyond the initial organ, for example to the bone, liver, lung and brain. As shown in Fig. 1, the breast cancer cell has the trait of growing in grouped shapes.

Fig. 1 Breast cancer cell, MCF-7

2.2 Electroporation Type

Electroporation is a method that relates to applying electric fields to cells. When short pulses of high voltage are induced, this phenomenon makes the cell permeable to ions and macromolecules [8]. Because a voltage breakdown occurs across the cell membrane, causing the lipid bilayer to fold, an open pathway is produced in the cell membrane, called an open pore. Many applications, such as the introduction of proteins and of large and small molecules, as well as cell fusion, can be performed once the cell membrane has an open pore, as shown in Fig. 2. A temporarily open pore characterizes reversible electroporation, but if the induced voltage is too large, it can result in cell destruction, which is called irreversible electroporation.

2.3 Reversible Electroporation

Reversible electroporation is a temporary opening of pores in the membrane, with the cell surviving after the pulse electric field induction; it is primarily used for molecular delivery into the cell. RE is frequently used for the introduction of substances such as dyes, drugs, proteins and nucleic acids into cells [9]. By applying an electrical pulse of adequate amplitude and length, it is feasible to produce small pores in biological and artificial membranes. For reversible electroporation, low-amplitude and short-duration pulses are sufficient to create pores in the cell membrane, and the pores close within milliseconds to minutes, which makes this the safer mode of electric field induction.

Fig. 2 Electroporation applications



This type of electroporation was selected to observe the response of the breast cancer cells within the reversible range of the electroporation technique.

2.4 Reversible Application

Several reversible applications, commonly used in the fields of drug delivery and gene therapy, are presently undergoing thorough investigation as electroporation-based cancer treatment techniques. Starting in the early 1980s, this has developed into a clinically tested therapy for skin and subcutaneous tumor nodules [10]. The so-called electrochemotherapy (ECT) is derived from the combination of electroporation and chemotherapy. Electrochemotherapy permeabilizes the cell membrane with electrical pulses and is used as a local therapy. Its main mechanism is electroporation of the tumour, a process that increases drug effectiveness by allowing the drug to reach intracellular targets [11]. Electrical pulses can be applied to tumors either through plate electrodes positioned on the skin above the tumors or through needle electrodes inserted into them [12]. The benefits of electrochemotherapy are that it is an effective, safe, low-cost, once-only therapy that can be given to cancer patients with tumors of various histologies [13].

2.5 Irreversible Electroporation

Irreversible electroporation (IRE) creates permanent defects in cell membranes and leads to cell death. Irreversible electroporation is considered to occur above the upper limit of the reversible range, as shown by the after-effects of IRE. If IRE is induced in the cells, the pores in the cell membrane remain permanently open [14]. This technique is mostly used for tissue ablation and in food technology and water treatment. It occurs when the magnitude of the induced transmembrane potential exceeds a critical value that disrupts the cell membrane to the extent that the cell dies as a result of the failure of homeostasis.

2.6 Irreversible Application

In medical applications over the previous seven years, IRE has emerged as a novel ablation instrument, using the effect of an applied electric field to kill cancer cells without damaging the surrounding extracellular matrix, vessels, nerves and adjacent normal tissue [15]. Although IRE has been investigated for only a short time, its potential use for cancer and tissue ablation has received increasing attention, resulting in a considerable number of validity and safety studies, including recent in vivo animal and human studies.
In food technology, irreversible electroporation is referred to as pulsed electric field processing or electroplasmolysis, in relation to cell membrane lysis used to remove the cell contents and to the bactericidal effect of these treatments. Through the first and second half of the twentieth century, the non-thermal bactericidal effect of electric fields remained a research area in the food industry, and it continues today. The use of electroporation as a method of microbial inactivation in foods is a good implementation of non-thermal food pasteurization. This implementation relies heavily on several elements: the strength of the electric field, its duration, the power supplied and the electrical properties of the treated food, as well as microbial features including the form, size and structure of the cell wall and the composition and growth conditions. This implementation of microbial-inactivation electroporation aims primarily at pasteurizing food rather than sterilization [16].
Using electroporation for microbial inactivation is often referred to as pulsed electric field (PEF) treatment. One of its applications is the treatment of wastewater, which utilizes irreversible electroporation for the bacterial decontamination of hospital wastewater and also eradicates antibiotic-resistant strains, therefore limiting the spread of such bacteria into the surroundings.

3 Methodology

Figure 3 shows the flow of work for this study. Breast cancer cells were selected and maintained by the subculture method. Next, the selected parameters, 100–1000 V/cm with a 30 µs pulse duration, are induced on the breast cancer cells in order to find the best parameter for the breast cancer cell line (MCF7).

Fig. 3 Flow of work



The morphological changes and cell responses are then monitored with a view to cancer cell treatment or wound healing applications.

3.1 Electroporation Setup

The electroporation setup consists of a high-voltage pulse generator (ECM 830) and an inverted Nikon TS100 microscope linked to a Dino camera and Dino Capture 2.0 software, as shown in Figs. 4 and 5. The cell suspension was placed inside a cuvette with a 4 mm gap and connected to the safety stand attached to the ECM 830. The parameters set on the pulse generator are an electric field intensity of 100–1000 V/cm (in 100 V/cm steps), a pulse duration of 30 µs and a single pulse. EP effectiveness depends on the pulse amplitude, duration, repetition frequency, number of pulses and pulse shape [9, 12].
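Because the field strengths are specified in V/cm and the cuvette gap is 4 mm, the voltage actually set on the pulse generator follows from V = E × d. The short helper below is our addition for illustration, not part of the original protocol description.

```python
# Convert the field-strength sweep (V/cm) into the voltage set on the pulse
# generator for a 4 mm (0.4 cm) cuvette gap, using V = E * d.
CUVETTE_GAP_CM = 0.4

field_strengths_v_per_cm = range(100, 1001, 100)   # 100-1000 V/cm in 100 V/cm steps
for e_field in field_strengths_v_per_cm:
    voltage = e_field * CUVETTE_GAP_CM
    print(f"{e_field:4d} V/cm -> {voltage:5.0f} V across the cuvette")
```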

Fig. 4 Experimental setup for electroporation exposure by using cuvette system

Fig. 5 Nikon TS100 inverted microscope with Dino camera and Dino capture 2.0 Software

3.2 Cell Culture Setup

The breast cancer cell line is an immortalized cell line that is used mostly for scientific research; it is among the oldest and most commonly used human cell lines. The primary benefit of this cell line is that it can be divided an unlimited number of times as long as the fundamental cell survival requirements are met. Breast cancer cells were therefore used as the main cell type. The breast cancer cell samples were obtained from the animal laboratory cell cultures (Kulliyyah of Allied Health Sciences, IIUM).

4 Morphological Changes During Electroporation

The results in Figs. 6 and 7 show the difference in MCF-7 cell diameter with and without exposure to the pulse electric field. The cells expanded after the pulsed electric field induction due to cell swelling. Before the PEF induction the size of the MCF-7 cells was in the range of 20–24 µm, while at the beginning of the expansion after the PEF induction it was in the range of 26–30 µm.

Fig. 6 MCF7 cells (a) with pulse electric field inducement and (b) without pulse electrical field
inducement

Fig. 7 MCF7 cell before (0–300 s), during (600–1200 s) and after (1200–1800 s) PEF inducement

This corresponds to an increase of 10–20% over the cell size before PEF induction, due to the adsorption of nutrients from the cell surroundings; after the EP induction the size of the MCF-7 cells decreased back to the range of 24–25 µm, and the cells continued to grow and proliferate. The result of this experiment demonstrated the effect of Reversible Electroporation (RE).

5 Summary

In conclusion, the electroporation method is a convenient and non-invasive technique that requires no chemical agent to be applied to the cells. Electroporation is the phenomenon in which the voltage breakdown across the cell membrane makes the cell permeable to adjacent molecules through an open pore. This occurs because the electric field causes the lipid molecules to change their orientation and create hydrophilic pores. The induction of the pulse electric field influences the morphological changes in the cell depending on the range used for the induction. As explained above, there are two types of electroporation, reversible (temporary) and irreversible (permanent), which can either heal or kill the cell.
Numerous applications make use of these phenomena, most of them in the medical and food industries, as elaborated in the literature review above. This study, however, focuses on medical applications, namely cancer treatment, with the aim of improving the cancer treatments already available in the medical industry. The cancer treatments already used in the medical field include chemotherapy and targeted therapy, but in some cases the drugs used in these treatments cause varied side effects for the cancer patient. Finding an alternative that gives fewer or no side effects would be a great contribution to cancer treatment applications.
Reversible electroporation was selected because the pores open only temporarily and the cells survive after induction. It may therefore not damage the ordinary cells surrounding the tumour and can be used to improve the adsorption of anti-cancer agents into the cancer cells without harming the other ordinary cells. The goal is to test a low range of pulsed electric field, from 100 to 1000 V/cm (in 100 V/cm steps), on breast cancer cells, monitor the response of the cells and find the pulsed electric field induction parameters that best enhance the anti-proliferative response of the cancer cells.
Research on the electroporation method is of interest because of the growing demand for alternative and less invasive treatments for localized tumors, which has driven the development and investigation of several electroporation methods across several applications. However, if the applied electric voltage is above a certain threshold, it leads to a larger potential gradient, the cells are unable to reseal the formed pores and the result is cell death. Therefore, finding the best parameter for each cancer cell type would benefit researchers and medical practitioners in providing better treatment.

Thus, for future biomedical applications such as cancer treatment or wound healing, the optimal parameters of the pulsed electric field identified here may have excellent implications.

Acknowledgements The authors would like to thank the Research Management Center (RMC),
UTHM and Ministry of Higher Education for sponsoring the research under Tier 1 Research
Grants (H161) and Geran Penyelidikan Pascasiswazah (GPPS), VOT U949.

References

1. Rolong A, Davalos RV, Rubinsky B (2018) History of electroporation. In: Meijerink M, Scheffer H, Narayanan G (eds) Irreversible electroporation in clinical practice. Springer, Cham
2. Jordan DW, Gilgenbach RM, Uhler MD, Gates LH, Lau YY (2004) Effect of pulsed,
high-power radiofrequency radiation on electroporation of mammalian cells. IEEE Trans
Plasma Sci 32(4):1573–1578
3. Dev SB, Rabussay DP, Widera G, Hofmann GA (2000) Medical applications of
electroporation. IEEE Trans Plasma Sci 28(1):206–223
4. Adon MN (2015) Pulse electric field exposure effect on morphological. Universiti Tun
Hussein Onn Malaysia
5. Abdul Jamil MM, Milad Zaltum MA, Abd Rahman NA (2018) Optimization of pulse
duration parameter for hela cells growth rate. J Telecommun Electron Comput Eng 10(1–
17):1–4
6. Milad Zaltum MA, Adon MN, Hamdan S, Dalimin MN, Abdul Jamil MM (2015)
Investigation a critical selection of pulse duration effect on growth rate of HeLa cells. In:
International conference on BioSignal analysis, processing and systems ICBAPS 2015,
pp 33–36
7. Weaver JC (2000) Electroporation of cells and tissues for drug and gene delivery. IEEE Trans
Plasma Sci 28(1):1–10
8. Lim GCC (2002) Overview of cancer in Malaysia. Jpn J Clin Oncol 32(1):S37–S42
9. Batista Napotnik T, Miklavčič D (2018) In vitro electroporation detection methods – an
overview. Bioelectrochemistry 120:166–182
10. Gehl J (2003) Electroporation: theory and methods, perspectives for drug delivery, gene
therapy and research. Acta Physiol Scand 177(4):437–447
11. Rems L et al (2019) The importance of electric field distribution for effective in vivo
electroporation of tissues. Bioelectrochemistry 125(2):127–133
12. Cemazar M, Sersa G, Frey W, Miklavcic D, Teissié J (2018) Recommendations and
requirements for reporting on applications of electric pulse delivery for electroporation of
biological samples. Bioelectrochemistry 122:69–76
13. Marty M et al (2006) Electrochemotherapy - an easy, highly effective and safe treatment of
cutaneous and subcutaneous metastases: results of ESOPE (European Standard Operating
Procedures of Electrochemotherapy) study. Eur J Cancer Suppl 4(11):3–13
14. Gothelf A, Mir LM, Gehl J (2003) Electrochemotherapy: results of cancer treatment using
enhanced delivery of bleomycin by electroporation. Cancer Treat Rev 29(5):371–87
15. Deipolyi R, Golberg A, Yarmush ML, Arellano RS, Oklu R (2014) Irreversible electropo-
ration: evolution of a laboratory technique in interventional oncology. Diagn Interv Radiol
20:147–154
16. Kumar Y, Patel KK, Kumar V (2015) Pulsed electric field processing in food technology. Int J
Eng Stud. Tech Approach 1(2):6–17
Influence of Electroporation on HT29
Cell Proliferation, Spreading
and Adhesion Properties

Hassan Buhari Mamman, Muhammad Mahadi Abdul Jamil, Nur Adilah Abd Rahman, Radzi Ambar, and Chew Chang Choon

Abstract The aim of this study is to investigate the influence of pulse electric field
on cell proliferation, spreading and adhesion properties of HT29 cell line towards
the enhancement of tissue regeneration and wound healing process. The HT29 cells
were treated with an electric field of 600 V/cm for 500 µs in vitro. A time-lapse live
imaging of the adhesion properties of the HT29 cell was carried out using integrated
devices that was equipped with digital camera and an inverted microscope. The
study found that when HT29 cells were electroporated at 600 V/cm with a 500 µs pulse duration, they reached 96.1% confluence after 64 h of seeding, whereas the non-electroporated (NEP) cells reached only 76% confluence after 64 h. Interestingly, both the EP and the NEP cells attained their maximum lengths of 34.76 and 29.73 µm respectively after 24 h of seeding. Furthermore, the electric treatment was found to decrease the adhesiveness of the cells, which detached from the substrate after 5.6 min as compared to the control group, which took 8 min to detach completely. Hence, the study suggests that the application of an appropriate electric field treatment can cause cellular changes such as proliferation and adhesion, which could contribute to the facilitation of the wound healing process via increased cellular proliferation and migration.

Keywords Adhesion · Electroporation · Proliferation · Cell size · Wound healing

1 Introduction

Adhesion of cells to each other and to their ECM is important in creating cell shape and organization in tissue engineering. Likewise, understanding how cells adhere is significant for understanding the development of diseases such as cancer and muscular dystrophies, which largely involve failures in cell adhesion. Additionally, cell adhesion and

H. B. Mamman · M. M. Abdul Jamil (&) · N. A. Abd Rahman · R. Ambar · C. Chang Choon
Biomedical Modelling and Simulation Research Group, Faculty of Electrical and Electronics Engineering, Universiti Tun Hussein Onn Malaysia, Batu Pahat, Johor, Malaysia
e-mail: [email protected]


migration are essential processes in numerous physiological events such as wound healing, malignancy as well as embryogenesis [1, 2]. In anchorage-dependent cells, cell adhesion plays a vital role in cell survival and growth because it
supports tissue organization [1]. For example, the inhibition of early cell attachment
events, like cell spreading triggers rapid apoptosis [3] or lack of cell colonization
and differentiation [4].
In recent times, cells lines in tissue culture have been found to reveal a great
dependency on adhesion to surface [5]. Additionally, cell to cell and cell to
extracellular matrix (ECM) interaction have a great implication on numerous bio-
logical processes like migration, adhesion and differentiation [6]. Cell adhesion and
migration have numerous common features and have vital role in wound healing
process [7], embryogenesis [8] and in inflammatory reactions [9]. Impairment of
cell attachment has profound effects and results in uncontrolled states such as defects in development, and metastasis and invasion in cancer [10].
Many physical signals such as external force [11], topography [12, 13] and the elastic properties of the ECM [14] have been considered as significant factors that can control numerous biological processes linked to cell migration and adhesion. Electrostatic and London-van der Waals interactions (attraction and repulsion) have been related to the adhesion of like charges [15]. That is, for bodies of like charge to adhere to each other, the van der Waals force of attraction must dominate the electrostatic force of repulsion between the like charges [16].
In addition to several other effects, external pulse electric field has been proven
to alter cellular functions including cell surface redistribution and cytoskeletal
reorganization [17]. Manipulations of the cell adhesion, proliferation and migration
abilities are very significant preconditions for inhibition of the cancer cell ability to
grow and invade [2]. However, the effect of pulse electric field exposure on cell behaviour, which could alter the electrostatic properties of the cells (by changing the charges on the cell due to external field exposure) and also affect cell attachment, proliferation, adhesion and migration, has not been fully investigated. Therefore, the effect of
pulse electric field on the proliferation, spreading and adhesion properties on HT29
cell line was investigated in this study. All experiments were repeated three times.

2 Materials and Methods

2.1 Introduction

The experimental setup involved two major sections: cell culture, and cell exposure to the electric field under a controlled environment for live cell imaging. In the cell culture, the cells were grown and harvested using standard cell culture techniques. Cell exposure to the electric field was done with the help of integrated live cell imaging equipment comprising an ECM 830 electroporator, a simulation magnetic chamber, a Nikon (Ti-series) inverted microscope, live cell imaging equipment and MetaMorph 7.5.0 imaging software.

2.2 Cell Culture

The human colon cell line HT29 was used for the experiments. The HT29 cells were grown in a 25 cm² culture flask. Details of the cell culture procedure were explained in a previous study [18].

2.3 Electroporation

In this study, the commercial electroporator ECM 830 was used to electrically treat the HT29 cell line. The low voltage (LV) mode of the ECM 830 electroporator, at a voltage of 240 V with a 4 mm gap cuvette, was used to achieve a 600 V/cm electric field for the cell proliferation assay, cell length analysis (cell spreading) and cell adhesion assay. The high voltage (HV) mode, at a voltage of 600 V with a 10 mm electrode gap, was used to achieve a 600 V/cm electric field strength for the cell attachment analysis and cell migration assay.
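The field strengths quoted above follow directly from the applied voltage and the electrode gap (E = V/d). The short Python sketch below simply reproduces that arithmetic for the two modes; it is an illustrative check, not part of the original experimental software.

```python
def field_strength_v_per_cm(voltage_v: float, gap_mm: float) -> float:
    """Nominal electric field (V/cm) across a cuvette or electrode gap: E = V / d."""
    return voltage_v / (gap_mm / 10.0)  # convert the gap from mm to cm

# LV mode: 240 V across a 4 mm cuvette gap -> 600.0 V/cm
print(field_strength_v_per_cm(240, 4))
# HV mode: 600 V across a 10 mm electrode gap -> 600.0 V/cm
print(field_strength_v_per_cm(600, 10))
```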
First of all, cells were detached using the procedure explained elsewhere [18]. After neutralizing the effect of the detaching enzyme, 800 µl of cell suspension at a concentration of 4.3 × 10⁵ cells/ml was put in a 4 mm cuvette and then placed in the BTX ECM 830 electroporator chamber. Electroporation was executed with an electric field of 600 V/cm intensity (240 V using a 4 mm cuvette) for a 500 µs duration. Immediately after electroporation, the cuvette was transferred to a biosafety hood.
For cell proliferation and cell length measurement, 600 µl of the electroporated
cells were seeded in 25 cm2 flasks containing 7 ml of pre-warm complete growth
medium and incubated at 37 °C and 5% CO2. At the same time, 600 µl of cell
suspension from the same initial flask, but without electric treatment, were seeded
into another 25 cm2 flask containing 7 ml of pre-warm complete growth medium
and incubated in the same incubator as a control. The flasks were then observed
after 6, 24, 48, 64 and 72 h. Images from four different fields of view were acquired
at each time point for cell proliferation and cell length analysis. Each experiment
was repeated three times.
For the cell trypsinization adhesion assay, 300 µl of the electroporated cells were seeded in
a well of 6-well plate, containing 2.5 ml of pre-warm complete growth medium and
incubated at 37 °C, and 5% CO2. The well was labelled as EP. At the same time,
300 µl of non-electroporated cells were seeded in another well of the 6-well plate
containing 2.5 ml of pre-warm complete growth medium and incubated as a con-
trol. The second well was labelled as NEP. The cells were harvested after 48 h at
room temperature as described in Sect. 2.

2.4 Cell Proliferation Assay

Images were acquired after 6, 24, 48, 64 and 72 h of cells seeding for the elec-
troporated and non-electroporated cells. This was done to check cell proliferation
rate or confluence percentage. Image acquisition was achieved with the Nikon
Eclipse TS100 inverted microscope (phase contrast 10×/0.25) equipped with Dino
camera and DinoCapture2.0 software. Images from four different fields of view
were captured during each image acquisition at each time point. The average
confluence percentages were calculated for each time point. The measurement was
carried out using graduation squares in the DinoCapture2.0 software. Each exper-
iment was repeated three times. The average confluence percentage of HT29 cell
lines and its standard error of the mean (SEM) over time for both electroporated
(EP) and non-electroporated (NEP) cell were measured and exported to Microsoft
Excel for analysis.
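The confluence percentage here was read off manually with the graduation squares in DinoCapture2.0. As a purely illustrative alternative (not the authors' method), a rough automated estimate can be obtained by thresholding each field of view and taking the fraction of pixels assigned to cells, as sketched below with scikit-image; the file name and the assumption that cells appear darker than the background are hypothetical.

```python
from skimage import io, filters
from skimage.color import rgb2gray

def confluence_percentage(image_path: str) -> float:
    """Rough confluence estimate: percentage of pixels on the 'cell' side of an Otsu threshold."""
    gray = rgb2gray(io.imread(image_path))
    threshold = filters.threshold_otsu(gray)
    cell_mask = gray < threshold  # assumes cells are darker than the background in this modality
    return 100.0 * cell_mask.mean()

# Hypothetical usage on one field of view captured at the 24 h time point
# print(confluence_percentage("ep_field1_24h.png"))
```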

2.5 Cell Spreading (Cell Length Measurement)

Cell length measurement of the electroporated (EP) and non-electroporated (NEP) cells (control) began 6 h after the cell plating. This time is enough for the HT29 cells to attach onto the substrate. Images were captured with a Nikon Eclipse TS100 inverted microscope (phase contrast 10×/0.25) equipped with a Dino camera
at 6, 24, 48, 64, and 72 h. The cell length measurement was carried out using
DinoCapture2.0 software on the images acquired at 6, 24, 48 and 72 h. At each
time, forty cells were randomly selected, and their mean length was computed. The
experiment was repeated three times. The mean and standard error of the mean for
the measurement were evaluated using Microsoft Excel.
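The mean and standard error of the mean were computed in Microsoft Excel; for clarity, the equivalent calculation is sketched below in Python. The length values are hypothetical placeholders, not measured data.

```python
import statistics

def mean_and_sem(lengths_um):
    """Mean and standard error of the mean (SEM = s / sqrt(n)) for cell lengths in µm."""
    mean = statistics.mean(lengths_um)
    sem = statistics.stdev(lengths_um) / len(lengths_um) ** 0.5
    return mean, sem

# Hypothetical example standing in for the forty measured cells at one time point
lengths = [20.5, 21.8, 22.4, 19.9, 23.1, 21.0, 22.7, 20.2] * 5  # 40 values
print(mean_and_sem(lengths))
```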

2.6 Cell Adhesion or Trypsinization Assay

The aim of this test was to investigate the effect of the electric field (600 V/cm for
500 µs) on the HT29 cell line's adhesion properties. After 48 h in culture, the 6-well
plate was transferred from the main incubator to the biosafety cabinet. A time-lapse,
multi-stage acquisition system in the DinoCapture2.0 software was prepared in
order to capture the cell images from the wells during trypsinization. Cells from
each well were washed twice with 1 ml of PBS. Thereafter, 0.5 ml of tryple express
solution (detaching enzyme) was added to the cells in each well. The cells were
immediately placed on the stage of the Nikon TS100 inverted microscope. Images were acquired every 10 s for a duration of 10 min (60 frames in total) using a 10×/0.25 phase-contrast objective and DinoCapture2.0 software (with the time-lapse multi-dimensional acquisition).

The process was carried out for both the electrically treated cells and the cells in the control group (cells seeded without the electric treatment). In the course of the trypsinization process, cells detaching from a substrate usually become rounded [19]. The amount of cell detachment was then computed by counting the number of spherical or rounded cells at each time point, dividing by the total number of cells in that field of view (both rounded and unrounded) and multiplying by 100%. This gives the percentage of cell detachment at that time. Each experiment was repeated three times and the mean percentage of cell detachment was calculated at each time point.
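The detachment quantification described above reduces to a simple per-frame ratio; a minimal sketch is shown below, with hypothetical counts.

```python
def detachment_percentage(rounded_cells: int, total_cells: int) -> float:
    """Percentage of detached (rounded) cells in one field of view at one time point."""
    return 100.0 * rounded_cells / total_cells

# Hypothetical counts from a single frame of the trypsinization time-lapse
print(detachment_percentage(rounded_cells=55, total_cells=80))  # 68.75
```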

3 Results and Discussion

3.1 Cell Proliferation Assay

All data obtained from the experiment were found to be normally distributed (P > 0.05) using the Kolmogorov–Smirnov and Shapiro–Wilk tests for normality. Therefore, one-way analysis of variance and the Tukey HSD post hoc test were used to test for statistical significance among the data obtained. A P-value of less than 0.05 indicates a significant difference between treatments, whereas a P-value greater than 0.05 indicates no significant difference. Table 1 gives the P-values obtained from the statistical analysis (Tukey HSD post hoc test in one-way ANOVA) via SPSS. The PEF-treated cells and the control group were found to be at 37.5% ± 5.58 and 31.0% ± 4.14 confluence respectively after 6 h of seeding, with no significant difference in confluence percentage between the EP and NEP cells (P = 0.0910 > 0.05). This could be because the cells had not started proliferating at this time, and since both flasks were seeded with equal cell concentrations, there should be no difference in confluence percentage between them. The PEF-treated cells and the cells in the control group were found to reach 51.3% ± 5.48 and 40.0% ± 3.28 confluence respectively after 24 h of seeding. There was a significant difference between the confluence percentages of the PEF-treated cells and the cells in the control group after 24 h (P = 0.007 < 0.05).

Table 1 P-values from statistical analysis of proliferation percentage over time
Time of treatment (hour)   Parameter   P-values
6                          EP - NEP    0.0910
24                         EP - NEP    0.0070
48                         EP - NEP    0.0001
64                         EP - NEP    0.0001
72                         EP - NEP    0.0700
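The normality check, one-way ANOVA and Tukey HSD post hoc comparison reported above were run in SPSS; an equivalent open-source pipeline is sketched below with SciPy and statsmodels, using hypothetical confluence values rather than the study data.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical confluence percentages (three repeats per group) at one time point
ep = np.array([50.1, 52.3, 51.5])
nep = np.array([39.2, 41.1, 39.7])

# Normality checks; p > 0.05 is consistent with normally distributed data
print(stats.shapiro(ep), stats.shapiro(nep))

# One-way ANOVA between the EP and NEP groups
print(stats.f_oneway(ep, nep))

# Tukey HSD post hoc test on the pooled observations
values = np.concatenate([ep, nep])
groups = ["EP"] * len(ep) + ["NEP"] * len(nep)
print(pairwise_tukeyhsd(values, groups))
```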

Furthermore, the PEF-treated cells and the cells in the control group reached 75.9% ± 2.13 and 60.0% ± 8.58 confluence respectively after 48 h of seeding. There was also a significant difference (P = 0.0001 < 0.05) between the confluence percentages of the PEF-treated cells and the cells in the control group after 48 h in culture. Moreover, the PEF-treated cells reached 96.1% ± 1.76 confluence after 64 h of seeding, whereas the cells in the control group only reached 75.6% ± 5.75 confluence after 64 h of seeding. The results revealed further that there was a significant difference (P = 0.0001 < 0.05) in the confluence percentage of the cells after 64 h in culture. The cells in the control group were found to reach 90.6% ± 1.43 after 72 h in culture. There was no significant difference in the confluence percentages of the PEF and control groups after 72 h in culture. This could be because the cells in the PEF group had already reached 96.1% ± 1.76 confluence after 64 h and stopped proliferating due to contact inhibition and unavailability of space for growth.
The results tabulated in Table 2 show that growth is more rapid with EP than without (NEP). This could be because EP assisted the cells in taking more nutrients from the growth media and facilitated their growth as a result of pore formation. It could also be stated that the electric field modulated the S-phase and M-phase of the cell cycle, the consequence of which is the increase in the proliferation rate of the PEF cells as compared to the cells in the control group. Furthermore, it shows that the field parameters used did not irreversibly damage the cells, because if the opening stayed for a long time the cells might take up more than enough nutrients, which could cause the membrane to burst and result in cell death. As shown in Fig. 1, there is a steady increase in the growth rate of the HT29 cells treated with EP. Even though the NEP cells also showed a continuous increase in confluence percentage, it is always less than that of the PEF cells.

Table 2 Average confluence percentage of HT29 cell lines over time ± Standard error of the
mean
Treatment Time (hours) Percentage confluence ± SD (%) Standard error
EP 6 37.5 ± 4.92 2.84
24 51.3 ± 5.48 3.17
48 75.9 ± 2.13 1.23
64 96.1 ± 1.76 1.01
72 98.6 ± 0.35 0.20
NEP 6 31 ± 4.14 2.40
24 40 ± 3.28 1.89
48 60 ± 8.58 4.96
64 75.6 ± 5.75 3.32
72 90.6 ± 1.43 0.83

Fig. 1 Graph showing confluence percentage over time for both electroporated and
non-electroporated HT29 cell lines

3.2 Cell Spreading (Cell Length Measurement)

From the results tabulated in Table 3, the electrically treated HT29 cells showed a continuous increase in cell length until they reached a maximum length of 34.76 µm after 24 h of seeding. The cells then exhibited a decrease in length with increased proliferation (cells begin to divide) until they reached a length of 16.46 µm at full confluence after 72 h. Interestingly, the cells in the NEP group also reached their maximum length of 29.73 µm after 24 h of seeding, and likewise showed a decrease in length after 24 h. The NEP cell length decreased to 15.6 µm when the cells reached the confluence stage after 72 h. Since cells must attain a certain size before they can divide [20], and the cell lengths in the electroporated group are larger than those in the control group, this could be the reason for the higher proliferation rate in cells treated with PEF. It could be that electroporation increased extracellular matrix (ECM) production in the cells and also stimulated their spreading capability, resulting in the increment in cell length. This increase in length further accelerated cell division in the M phase of the cell cycle.

Table 3 Average length of HT29 cell line over time
Time (hour)   Average cell length ± SEM (µm)
              EP               NEP
6             21.4 ± 1.08      16.84 ± 0.4
24            34.76 ± 0.69     29.73 ± 1.35
48            26.37 ± 1.8      24.73 ± 0.63
64            23.25 ± 0.75     17.69 ± 1.06
72            16.46 ± 1.19     15.6 ± 0.55

Fig. 2 Graph showing average cell length over time for both electroporated and non-electroporated HT29 cell lines

Even though cell length is greater in the treated cells than in the control, it is interesting that both treated and control cells reached their maximum length after 24 h of plating, as shown in Fig. 2. This is possible because cells divide every 24 h [21] after reaching a certain minimum size [22]. Similarly, both groups almost returned to their initial length after 72 h of seeding with an increase in proliferation rate, which was more pronounced in EP cells than NEP cells. Therefore, it could be stated that the pulse electric field facilitated cell division and increased cell growth. Figure 3 shows images of the growth of HT29 cells treated with and without EP after 0, 6, 24, 48, 64 and 72 h, reflecting the proliferation rates described above.

3.3 Cell Adhesion or Trypsinization Assay

Figure 4 shows the trypsinization progression for the electrically treated HT29 cell line and the control group over a period of eight minutes. The electrically treated cells began to detach from the surface of the substrate at about 3 min after the application of the tryple express solution. On the other hand, the control group started dissociating from the surface of the substrate at about 4.6 min after trypsinization. Cell detachment in the electrically treated group was 85.6% after 4 min, while cell detachment in the control group was 68.7% after 4 min.
The electrically treated cells and the cells in the control group completely detached from the substrate at 5.6 and 8 min respectively after the addition of the tryple express solution. The result shows that the electrically treated cells detached relatively faster during the trypsinization process. This decrease in adhesion properties could be beneficial to cellular behaviour during the wound healing process [23]. The results suggest that the use of 600 V/cm and 500 µs decreased the degree of cell adhesion, which could influence cell migration in the wound healing process [24].

Fig. 3 Images of electroporated (EP, 600 V/cm at 500 µs) and non-electroporated (NEP, control) HT29 cells at 0, 6, 24, 48, 64 and 72 h respectively (scale bar = 100 µm)

In this research, the electrically treated HT29 cell line was found to reach 96.1%
confluence after 64 h of seeding whereas the non-electroporated (NEP) cells
reached only 76% confluence after 64 h. Interestingly, both the EP cell and the NEP

Fig. 4 Trypsinization process of HT29 cell line under pulse electric treatment (EP, 600 V/cm at 500 µs) and control group (NEP) at 0, 4, 6 and 8 min (scale bar = 50 µm)

cell attained their maximum length of 34.76 and 29.73 µm respectively after 24 h
of seeding. Furthermore, the electric treatment was found to decrease the adhe-
siveness of the cells where they detach from the substrate after 5.6 min as compared
to the control group that took 8 min to completely detach from the substrate.
Therefore, the findings suggested that the application of appropriate electric field
treatment can cause cellular changes in cells such as attachment, proliferation and
adhesion which could contribute to the facilitation of wound healing process via
increased cellular proliferation and migration.
The study revealed that electroporation has a significant effect on HT29 cell line attachment, proliferation and adhesion. Exposing the HT29 cell line to a 600 V/cm electric field strength and 500 µs pulse duration made the

cells attach faster to the monolayer for growth and development when compared to those in the control group. This could be because the electric field up-regulates the signalling pathways of cell adhesion molecules such as integrins and cadherins, facilitating cell attachment [25]. The study also revealed that electroporation has an
influence on the cell length and proliferation rate of the HT29 cell line. The reason could be that electroporation facilitated the synthesis of extracellular matrix proteins and assisted the cells in taking up more nutrients for growth and proliferation due to pore formation. Therefore, the study could be useful in understanding cell adhesion and migration in wound healing applications, since cell adhesion and proliferation form the basis of cell migration and other physiological processes.
The study further investigated the effect of PEF on the HT29 cell proliferation rate and found that PEF has a great influence on it. The HT29 cells under PEF treatment were found to reach 75% confluence about 16 h faster than the untreated cells. Similarly, cells spread wider under PEF treatment when compared to the untreated cells. The increase in the spreading characteristics of the cells could also be the reason for their increase in proliferation. The outcome of this investigation is in agreement with those of [1] and [2]. This is because cell proliferation is greatly dependent on cell size [22]; that is, cells must grow to a certain size before they can divide. The study therefore suggests an opportunity for facilitating the wound healing process without the need for adding an external drug or growth factor.

4 Summary

In this study, the influence of EP on the proliferation, spreading and adhesion properties of the HT29 cell line was investigated using the optimum parameters identified for the HT29 cell line [24]. EP was found to increase the proliferation rate of the HT29 cells by 27.1% as compared to the NEP cells. On the other hand, EP reduced the adhesion strength of the cells by 29.2% as compared to the cells in the NEP group. This increment in cell proliferation and the corresponding decrement in cell adhesion have great implications for wound applications, where cells are required to proliferate and migrate faster to heal a wound. Thus, the outcomes of this study could have great potential use in a drug-free wound healing process.

Acknowledgements The authors would like to thank the Research Management Center (RMC),
UTHM and Ministry of Higher Education for sponsoring the research under Tier 1 Research
Grants (H161).

References

1. Pehlivanova VN, Tsoneva IH, Tzoneva RD (2012) Multiple effects of electroporation on the
adhesive behaviour of breast cancer cells and fibroblasts. Cancer Cell Int 12(9):1–14
2. Pehlivanova VN, Tsoneva IH, Tzoneva RD (2010) Influence of electroporation on cell
adhesion, growth and viability of cancer cells and fibroblasts. Biol Cell 64(4):581–590
3. Gekas J, Hindié M, Faucheux N, Lanvin O, Mazière C, Fuentès V, Nagel MD (2004) The
inhibition of cell spreading on a cellulose substrate (cuprophan) induces an apoptotic process
via a mitochondria-dependent pathway. FEBS Lett 563(1–3):103–107
4. Ivanov II, de Llanos Frutos R, Manel N, Yoshinaga K, Rifkin DB, Sartor RB, Littman DR
(2008) Specific microbiota directs the differentiation of IL-17-producing T-helper cells in the
mucosa of the small intestine. Cell Host Microbe 4(4):337–349
5. Lloyd AC (2013) The regulation of cell size. Cell 154(6):1194–1205
6. Geiger B, Bershadsky A, Pankov R, Yamada KM (2001) Transmembrane crosstalk between
the extracellular matrix and the cytoskeleton. Nat Rev Mol Cell Biol 2(11):793–805
7. Fletcher SJ (2013) Investigating the role of vesicle trafficking in epithelial cell migration.
University of Birmingham, PhD Thesis
8. Jacinto A, Woolner S, Martin P (2002) Dynamic analysis of dorsal closure in Drosophila:
from genetics to cell biology. Dev Cell 3(1):9–19
9. Muller WA (2003) Leukocyte–endothelial-cell interactions in leukocyte transmigration and
the inflammatory response. Trends Immunol 24(6):326–333
10. Thiery JP (2002) Epithelial–mesenchymal transitions in tumour progression. Nat Rev Cancer
2(6):442–454
11. Desprat N, Supatto W, Pouille PA, Beaurepaire E, Farge E (2008) Tissue deformation
modulates twist expression to determine anterior midgut differentiation in Drosophila
embryos. Dev Cell 15(3):470–477
12. Dalby MJ (2009) Nanostructured surfaces: cell engineering and cell biology. Nanomedicine 4
(3):247–248
13. Le Digabel J, Ghibaudo M, Trichet L, Richert A, Ladoux B (2010) Microfabricated substrates
as a tool to study cell mechanotransduction. Med Biol Eng Comput 48(10):965–976
14. Ghassemi S, Meacci G, Liu S, Gondarenko AA, Mathur A, Roca-Cusachs P, Hone J (2012)
Cells test substrate rigidity by local contractions on submicrometer pillars. Proc Nat Acad Sci
109(14):5328–5333
15. Poortinga AT, Bos R, Norde W, Busscher HJ (2002) Electric double layer interactions in
bacterial adhesion to surfaces. Surf Sci Rep 47(1):1–32
16. Leckband D (2000) Measuring the forces that control protein interactions. Ann Rev Biophys
Biomol Struct 29(1):1–26
17. Kanthou C, Kranjc S, Sersa G, Tozer G, Zupanic A, Cemazar M (2006) The endothelial
cytoskeleton as a target of electroporation-based therapies. Mol Cancer Ther 5(12):3145–3152
18. Mamman HB, Sadiq AA, Adon MN, Jamil, MMA (2015) Study of electroporation effect on
HT29 cell migration properties. In: 2015 IEEE international conference on control system,
computing and engineering (ICCSCE). IEEE, pp 342–346
19. Rubinsky B (2007) Irreversible electroporation in medicine. Technol Cancer Res Treat 6
(4):255–259
20. Marguerat S, Bähler J (2012) Coordinating genome expression with cell size. Trends Genet
28(11):560–565
21. Cooper GM, Hausman RE (2000) The cell. Sinauer Associates, Sunderland, pp 725–730
22. Turner JJ, Ewald JC, Skotheim JM (2012) Cell size control in yeast. Curr Biol 22(9):R350–
R359

23. Sevilla C (2013) The role of extracellular matrix fibronectin and collagen in cell proliferation
and cellular self-assembly. University of Rochester, PhD thesis
24. Mamman HB, Jamil MMA, Adon MN (2016) Optimization of electric field parameters for
HT29 cell line towards wound healing application. Indian J Sci Technol 9(46):1–7
25. Zhao M (2009) Electrical fields in wound healing—an overriding signal that directs cell
migration. Semin Cell Dev Biol 20(6):674–682 Academic Press
Wound Healing and Electrofusion
Application via Pulse Electric Field
Exposure

Muhammad Mahadi Abdul Jamil, Mohamad Nazib Adon, Hassan Buhari Mamman, Nur Adilah Abd Rahman, Radzi Ambar, and Chew Chang Choon

Abstract This study is concerned with the investigation of the effect of a pulsed electric field (PEF) on biological cells. The biological cells selected in this study are HeLa (cervical cancer) cells. The experimental setup involves several important parts, which include the source of the square wave PEF (ECM®830), which can generate field strengths of up to 3 kV; a modified EC magnetic chamber with an incubator system used to expose HeLa cells to the PEF; and a Nikon inverted microscope (Ti-series) for subsequent visualization of images and video. In the early stage, the experimental setup was tested by monitoring the proliferation rate of HeLa cells from 0 to 48 h. HeLa cells were then tested to examine the swelling effect of PEF exposure. After that, we continued to identify the optimum PEF parameters for the reversible condition on HeLa cells. As a result, HeLa cells give a good response at 2.7 kV field strength and 30 µs pulse length with a single pulse. Further study showed that two or more adjacent HeLa cells merged together due to increased cell membrane permeability (electrofusion). This discovery triggered the idea of looking at the PEF effect on the wound healing process. An artificial wound site was investigated with and without PEF exposure. The findings show that the wound area exposed to PEF took 3 h to heal completely while the untreated area took 10 h. This provides a novel electrical-based technique which could be an alternative to drug usage in the wound healing process. Overall, the findings achieved in this study could lead towards a drug-free wound healing method.

Keywords HeLa cells · Electric field · Electro-fusion · Wound healing

M. M. A. Jamil (&) · M. N. Adon · H. B. Mamman · N. A. A. Rahman · R. Ambar · C. C. Choon
Biomedical Modelling and Simulation Research Group, Faculty of Electrical and Electronics Engineering, Universiti Tun Hussein Onn Malaysia, Batu Pahat, Johor, Malaysia
e-mail: [email protected]


1 Introduction

Despite the widespread use of electroporation in biotechnology and biomedical engineering, there is no uniform and comprehensive theory that explains the mechanism that triggers the electroporation phenomenon. It is therefore important and interesting to look into the experimental facts that such a theory needs to encompass. Accordingly, this paper is intended to offer an explanation of experiments in the area of electroporation for wound healing applications.
The effects of a pulse electric field (PEF) on biological cells have been intensively investigated over the last decade. Ušaj et al. (2010) studied the optimization of electric field amplitude and hypotonic treatment for mouse melanoma (B16-F1) and Chinese hamster ovary (CHO) cells [1]. It has been shown that appropriate hypotonic treatment of cells before the application of electric pulses can cause a significant increase in electrofusion efficiency. This phenomenon is potentially the basis for many in vivo applications such as electrochemotherapy and gene therapy. However, it still lacks a comprehensive theoretical and experimental basis.
This study uses an in vitro technique to evaluate specific cellular-level interactions with PEF under controlled environments outside a living organism. The fascination of controlling cell functions by using PEF has led to the discovery of electrofusion and has been a topic of great interest regarding physiological and morphological changes [2–4]. Based on previous simulation combined with the current experimental work, the main advantage of this study is that some of the exposure conditions can be easily and precisely controlled (e.g., changing exposure duration, background temperature, or exposure field intensity) as a means of determining the dose-response relationships and the effect of applying different threshold levels [5–8]. In order to delve deeper into the quantitative interaction mechanisms between electric fields and biological cells, an experimental setup that confines cell observation during exposure to the electric field has been reported [9]. Here, we report the effect on multicellular samples of the transient increase in the permeability of cell membranes for cultured HeLa cells. Besides that, we also monitor this effect in real time using a confocal laser Nikon inverted research microscope (Ti-series). A 2.7 kV/cm amplitude is applied with pulsing sequences of 30 µs long pulses; these are single pulses produced by a commercially available pulsed power generator [10]. Because the cell membrane permeability depends on the applied pulse number, the experiment was performed with only one pulse at this amplitude.
Furthermore, wound healing is a complex and dynamic process with the wound
environment changing with the changing health status of the individual. The
knowledge of the physiology of the normal wound healing trajectory through the
phases of homeostasis, inflammation, granulation and maturation provides a
framework for an understanding of the basic principles of wound healing. Through
this understanding the health care professional can develop the skills required to
care for a wound and the body can be assisted in the complex task of tissue repair.

A chronic wound should prompt the health care professional to begin a search for unresolved underlying causes. Healing a chronic wound requires care that is patient-centered, holistic, interdisciplinary, cost-effective and evidence-based. The Canadian Association of Wound Care, for example, has made a series of articles available to assist wound care clinicians in developing an increased understanding of wound healing.

2 Materials and Methods

2.1 Preparation of HeLa Cells

In this study HeLa cells were cultured on circular coverslip glass (25 mm diameter)
in a 6 well culture plate. The cell was harvested and appropriate amount was seeded
into the six well and incubated for 24 h until HeLa cells growth rate increased as
much as 40%. Lastly, the glass cover slip was transferred into controlled EC
magnetic chamber to performed real time visualization and data analysis at the same
time.

2.2 Inducement of PEF Towards HeLa Cells

The same optimum PEF parameters were used in this investigation, namely 2700 V (voltage amplitude), 30 µs (pulse duration) and a single pulse. These parameters excite the plasma membrane to induce a transmembrane potential, producing polarization effects due to the ion concentrations, namely of potassium and sodium. The ions can then cross the plasma membrane and cause cell depolarization, until the cell attempts to return to its resting state (normal condition).

2.3 Integrated Devices of Real Time Imaging System

In this study, real-time imaging with a high-speed camera was a high priority in order to record electrofusion phenomena that happen over a few seconds [11, 12]. A high-resolution, high-sensitivity CCD camera (QImaging Exi Aqua) with a short exposure time (10.9 frames per second at full resolution @ 14 bits (20 MHz)) was used to perform high-resolution bright field imaging, such as phase contrast, of electrofusion phenomena of short duration (<60 s).

2.4 Preparation of HeLa Cells

In this investigation, a 6-well cell culture plate was used with 25 mm circular coverslips. The plate was stored in a Nuaire CO2 incubator (5% CO2; temperature 37 °C) until the cells reached 80 to 85% confluence. After the HeLa cells reached 85% confluence, the culture medium was aspirated and the confluent 6-well cell dish was washed once with 5 ml PBS. Two 25 mm circular coverslips were then transferred into the EC and CMB magnetic chambers using tweezers, and 1 ml of fresh RPMI medium was added to each chamber. A line was scratched on the surface of the 25 mm circular coverslip cell layer with a needle. Finally, the two chambers were placed on the 2-hole Chamlide MB for 25 mm round coverslips (model MB-R25-2), so that the prepared sample could be induced with PEF and visualized in real time simultaneously.

2.5 Optimum PEF Exposure

The optimum PEF parameters used in this investigation were 2700 V (voltage amplitude), 30 µs (pulse duration) and a single pulse.

2.6 Real Time Imaging Technique

Real-time monitoring for this study was conducted by using MetaMorph® time-lapse software to observe the morphological changes of wound closure between the two samples of HeLa cells. The motorized microscope stage (Nikon Ti-series) was used to alternate successively between the EC and CMB magnetic chambers, capturing images at a 1 min time interval (MetaMorph®).

3 Result and Discussion

Electrofusion is a related phenomenon by which neighboring cells can be induced to fuse by applying a PEF. Both electroporation and electrofusion are related to the electrical breakdown of the cell membrane. The experimental results show that the PEF has an effect on the HeLa cells, in which the exposed cells develop an induced transmembrane potential. This experimental result is related to the morphological phenomena observed when multicellular samples are exposed to PEF. From Fig. 1, it can be seen that for HeLa cells in contact with each other, the pulse-induced change in cell structure eventually leads to the fusion of the cells (electrofusion). Finally, single cells or hybrid cells can be produced.

Fig. 1 Electrofusion effect due to PEF exposure; image panels at 0, 4, 8, 12, 16, 20, 24 and 28 s (scale bar = 24 µm)

From the diagram, it can be seen that the intensity profile for the border area between the cells is very high, reaching 100% (5519, arbitrary units) before the sample is exposed to PEF. Once the sample is exposed to PEF, the intensity profile decreases gradually by about 82% (to 962). This situation does not change until the whole of these cells merge and form what is called a hybrid cell. This is due to the PEF, which is known to cause temporary membrane defects such as electroporation or electrofusion. It was also experimentally shown that the electrical impedance of the plasma membrane began to fluctuate after the application of a certain level of PEF before settling within several hundred seconds. This delay is likely dependent on the energy of the pulses provided.
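The roughly 82% drop quoted above follows directly from the two intensity readings; a one-line check:

```python
i_before, i_after = 5519, 962  # border-region intensities before and after PEF (arbitrary units)
print(100 * (i_before - i_after) / i_before)  # ≈ 82.6% decrease
```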
The electrofusion study demonstrated that cells are able to combine with adjacent adherent cells depending on the PEF exposure, within the examined optimum range. Thus, the contribution of this study in enabling the combination of HeLa cells could be useful in the wound healing process. The electrofusion phenomenon shows that the single cell becomes integrated with the adjacent cells, thus covering a bigger surface area in comparison to the area before the phenomenon took place. One of the challenges in wound healing studies is finding ways to direct cells to grow in a specific desired direction, which directly relates to how much surface is covered by the cells' presence at the wound site. The PEF effect does show a mechanism or way to increase the cell size, covering a bigger area and potentially saving time towards wound closure. Thus, to prove this, it is useful to look at the process of wound healing assisted with PEF exposure by imaging the wound closure rate in real time. It is highly anticipated that the following study and investigation may contribute towards a novel method of the wound healing process by means of PEF.

Fig. 2 Wound healing process with PEF treatment; image panels at 0, 10, 30, 60, 90, 120, 150 and 180 min (scale bar = 40 µm)

Fig. 3 Wound healing process without PEF; image panels at 0, 10, 30, 60, 180, 360, 480 and 600 min (scale bar = 40 µm)



In a previous study, we began with proliferation rate monitoring to look at the timing and motion of cell activity in the migratory response. After the convincing results produced by the proliferation rate study, we went on to investigate one potential application of the system: live recording of a PEF-assisted wound healing process. The results obtained from this experiment show that exposing the HeLa cells to PEF (2.7 kV, 30 µs, single pulse) causes a more rapid wound healing process, taking approximately 3 h as shown in Fig. 2. In contrast, the wound in the HeLa cell control environment (no PEF) was still open at that time, with a gap of 29.7 µm. Further observation of the cells in the control group revealed that they took about 10 h to close the wound model, as shown in Fig. 3.
In this study, the rate of wound closure with respect to time and area was investigated closely. The imaging results show that the duration taken by the HeLa cells to proliferate towards the wound area makes a significant contribution to the rate of wound closure. For the control sample it was found that complete wound closure was achieved 10 h after scratching the line. Remarkably, for the PEF-exposed sample, complete wound closure was achieved at 3 h, a great reduction in closure time. These findings were further investigated by repeating the experiments five times, and the results were the same. Therefore, this study shows that the physical activity and morphological properties of HeLa cells are affected by the presence of PEF. The reduction in time for wound closure demonstrated in this study represents a novel contribution of a PEF-assisted, electrical technique that may be useful for the treatment of wounds in the future [13–15].

Fig. 4 Swelling effect due to PEF; image panels A: 0 s, B: 10 s, C: 15 s, D: 20 s, E: 20 s (scale bar = 15 µm)



The results in Fig. 4 show the effect on HeLa cell diameter when induced by 2.7 kV/cm. The pulse duration is 30 µs and the number of pulses is 1. The imaging technique is phase contrast microscopy with a 20× objective, and the PEF excitation was performed using the magnetic EC chamber combined with the incubator system. The result shown in Fig. 4(A) demonstrates that the HeLa cell diameter is 11.98 µm without PEF exposure. After exposure to PEF, it can be seen in Fig. 4(B) that the HeLa cell diameter expanded to 13.51 µm, an increase of 12.77%. The diameter then gradually decreased until the cell returned to its initial value of 11.98 µm, as shown in Fig. 4(E). This confirms the physical changes exhibited by the HeLa cell, and the response encountered in this study is termed reversible EP [16, 17].
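The 12.77% expansion quoted above is simply the relative change between the two diameters; a one-line check:

```python
d_before, d_peak = 11.98, 13.51  # HeLa diameter in µm before and at peak swelling
print(100 * (d_peak - d_before) / d_before)  # ≈ 12.77% increase
```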
The use of MetaMorph® software enabled time-lapse recording of the reversible electroporation (EP) response of the cells, thus providing qualitative data for analysis. This gives us the advantage of recording real-time images and video at as little as 1 ms per frame, which is also an important factor for close monitoring of morphology-related changes in the cells being interrogated. Lastly, in this investigation we have shown, through the complete experimental setup and the preliminary results attained, how PEF exposure does have an effect on HeLa cells. This shows good agreement with the theoretical modelling, which also predicts a similar outcome when cells exposed to PEF undergo changes in transmembrane potential. The experimental studies and setup also confirmed that the swelling and resealing process in HeLa cells is achievable by means of EP.
Overall, we have shown that the PEF changes the HeLa cell membrane by inducing a transmembrane potential. This is due to the significant increase in electrical permeability, and the variation in conductivity, when the cell is subjected to PEF. This change occurs when the high voltage amplitude is set to 2.7 kV with a pulse length of 30 µs and a single pulse. All the data were recorded with real-time observation using the Ti-series Nikon microscope. Owing to the simple control of the high voltage amplitude, the optimized high voltage pulse is preferred when the PEF is repetitively applied to biological targets.

Acknowledgements The authors would like to thank the Research Management Center (RMC),
UTHM and Ministry of Higher Education for sponsoring the research under Tier 1 Research
Grants (H161).

References

1. Ušaj M, Trontelj K, Damijan M, Kanduser M (2010) Cell-cell electrofusion: optimization of electric field amplitude and hypotonic treatment for mouse melanoma (B16–F1) and Chinese hamster ovary (CHO) cells. J Membr Biol 236(1):107–116
2. Joshi R, Hu Q (2010) Analysis of cell membrane permeabilization mechanics and pore shape
due to ultrashort electrical pulsing. Med Biol Eng Comput 48(9):837–844
3. Pech M, Janitzky A, Wendler JJ et al (2011) Irreversible electroporation of renal cell
carcinoma: a first-in-man phase I clinical study. Cardiovasc Intervent Radiol 34(1):132–138

4. Schönenberger C, Schütz A, Franco-Obregón A et al (2011) Efficient electroporation of peptides into adherent cells: investigation of the role of mechano-growth factor in chondrocyte culture. Biotechnol Lett 33(5):883–888
5. Huo R, Ma Q, Wu JJ et al (2010) Noninvasive electromagnetic fields on keratinocyte growth
and migration. J Surg Res 162(2):299–307
6. Garcia PA, Neal RE, Sano MB et al (2011) An experimental investigation of temperature
changes during electroporation. In: XXXth URSI general assembly and scientific symposium,
pp 1–4
7. Khan OGM, El-Hag AH (2011) Biological cell electroporation using nanosecond electrical
pulses. In: 1st middle East conference on biomedical engineering (MECBME), pp c1–c4
8. Reigada R, Fernandez ML (2011) Structure and electroporation of lipid bilayers: a molecular
dynamics study. In: XXXth URSI general assembly and scientific symposium, Istanbul,
pp 1–4
9. Mamman HB, Jamil MMA, Adon MN (2018) Investigation of electroporation effect on HT29
proliferation rate and spreading properties when plated on fibronectin coated substrate. Adv
Sci Lett 24(6):4387–4390
10. Mitsutake K, Satoh A et al (2012) Effect of pulsing sequence of nanosecond pulsed electric
fields on viability of HeLa S3 cells. IEEE Trans Dielectr Electr Insul 19(1):337–342
11. Hamdi FS, Français O (2014) How medium osmolarity influences dielectrophoretically
assisted on-chip electrofusion. Bioelectrochemistry 100:27–35
12. Salipante PF, Vlahovska PM (2014) Vesicle deformation in DC electric pulses. Soft Matter 10
(19):3386–3393
13. Mamman HB, Adon MN, Jamil MMA (2016) Optimization of electric field parameters for
HT29 cell line towards wound healing application. Indian J Sci Technol 9(46):1–7
14. Ganeson S, Jamil MMA, Mamman HB, Rahman NAA (2018) Correlation between electrical
field strength and pulse width analysis on cell viability. J Phys 1019(1):1–5
15. Zegman Y, Bonazzi D, Minc N (2015) Measurement and manipulation of cell size parameters
in fission yeast. Methods Cell Biol 125:423–443
16. Hjouj M, Last D et al (2012) MRI study on reversible and irreversible electroporation induced
blood brain barrier disruption. PLoS ONE 7(8):e42817
17. Mamman HB, Jamil MMA, Adon MN (2017) Low amplitude pulse electric field for
elimination of unpleasant sensation associated with high amplitude electric field for
electrochemotherapy. J Adv Technol Eng Stud 3(2):27–33
Color Constancy Analysis Approach
for Color Standardization on Malaria
Thick and Thin Blood Smear Images

Thaqifah Ahmad Aris, Aimi Salihah Abdul Nasir, Haryati Jaafar, Lim Chee Chin, and Zeehaida Mohamed

Abstract Malaria is an extensively prevalent blood infection and among the most severe and widespread parasitic diseases, affecting millions of people in the world. Currently, microscopy is still the most widely used method for malaria diagnosis. However, this procedure carries a probability of miscounting parasites due to human error. A computerized system using image processing is recognized as a quick and easy way to analyze large numbers of blood samples. However, the non-standard preparation of the blood slides produces color variation across slides, resulting in low quality images. Hence, it is difficult to identify the presence of malaria parasites as well as to observe the morphological characteristics needed to recognize them. Therefore, this paper aims to analyze the standardization performance of six color constancy algorithms, namely gray world (GW), white patch (WP), modified white patch (MWP), progressive hybrid (PH), shades of gray (SoG) and gray edge (GE), on both thick and thin blood smear malaria images of P. falciparum and P. vivax species. The standardization performance of the six color constancy algorithms is analysed by using quantitative measures, namely peak signal to noise ratio (PSNR), normalized absolute error (NAE), mean square error (MSE) and root mean square error (RMSE). Based on the qualitative and quantitative findings, the results show that the SoG algorithm is the best color constancy method compared to the other proposed color constancy algorithms. The SoG algorithm achieved the highest PSNR and lowest NAE, MSE and RMSE values, thus proving that the quality of the malaria images has been improved.

T. A. Aris (&) · A. S. A. Nasir · H. Jaafar
Faculty of Engineering Technology, Universiti Malaysia Perlis, UniCITI Alam Campus,
Sungai Chuchuh, 02100 Padang Besar, Perlis, Malaysia
e-mail: [email protected]
L. C. Chin
School of Mechatronic Engineering, University Malaysia Perlis, Pauh Putra Campus,
02600 Arau, Perlis, Malaysia
Z. Mohamed
Department of Microbiology and Parasitology, School of Medical Sciences, Health Campus,
Universiti Sains Malaysia, 16150 Kubang Kerian, Kelantan, Malaysia


Keywords Malaria thick and thin blood smear · Color constancy · Color standardization · Quantitative measure

1 Introduction

Malaria is a blood disease caused by the plasmodium parasite, which is spread from person to person by the bite of a female Anopheles mosquito [1]. According to the current WHO malaria report, there were 219 million cases of malaria in 2017 [2]. Given the high number of malaria incidents, it is crucial to carry out medical inspection every year. Over time, malaria parasites can crowd out the normal blood cells. One to two weeks after a person is infected, the first ailments of malaria appear, such as fever, headache, chills and vomiting [3]. If not administered properly with effective medications, this disease can infect and destroy red blood cells and clog the capillaries that carry blood to the brain or other crucial organs [3]. Microscopic diagnosis based on Giemsa-stained blood smears currently stands as the "gold standard" for malaria detection and identification. At present, the microscopic investigation of blood smears is done manually through identification under a light microscope [4]. However, this operation still carries a probability of miscounting and misidentification of the parasites due to human error [3].
Nevertheless, digital image processing has become one of the common computerized systems in the medical field, as it substantially reduces the time compared to the manual procedure since it allows larger numbers of blood sample slides to be scanned. However, from the technical point of view, issues appear in constructing a computer-aided system [5]. One issue is that the blood sample images often require pre-processing. The aim of the pre-processing step is to obtain images that will ease the subsequent procedure, which is the segmentation process. The effectiveness of segmentation is one of the important elements for the success of an image analysis system [6]. This is because segmentation of the malaria parasites or infected red blood cells (RBCs) in a malaria image is required for recognition of the malaria parasite. Nonetheless, the segmentation process is challenging since most of the parasites are usually part of the infected RBCs, which makes the parasites appear in a complex and uncommon visual form [7]. This is due to the human aspect in the preparation of the slides, which results in non-standard and inhomogeneous staining concentrations and colors [8]. In addition, the blood sample images might be affected by illumination and color propagation due to staining variability and camera calibration. Therefore, a pre-processing step called color constancy is needed to normalize the staining variation. Thus, this study aims to compare several color constancy algorithms for the standardization of both thick and thin blood smear images of P. falciparum and P. vivax species.

2 Color Constancy

Color constancy is a part of the human color visual system which ensures that the perceived color of objects remains nearly constant even under varying illumination conditions. Therefore, color constancy needs to be applied to the images in order to lessen the side effect of particular light sources, as it turns a variable input into a stable perception. Color constancy plays a significant role in discarding unreal color from the image, which is caused by the acquisition framework [9]. Color constancy is one of the most necessary elements in an image processing procedure, needed to enhance the visual display of the image or to provide a better representation for further image processing methods such as image analysis, detection, segmentation and recognition. In general, color constancy algorithms can be separated into two groups. The first group consists of algorithms based on low-level image features that can be applied directly to images, called statistics-based methods. The second group consists of algorithms that use data acquired in a learning phase to obtain information about the images, called learning-based methods. Statistics-based methods are characterized by relatively fast execution, simplicity, and usually lower accuracy, while learning-based methods are slower but have higher accuracy. However, since all recognized learning-based methods are supervised, a major difficulty for their application is that, for a given sensor, despite proposed workarounds [10], supervised learning-based methods have to be trained on calibrated images taken by preferably the same sensor [11].
As a result, assorted applications of color constancy algorithms have been proposed by previous researchers. For instance, illumination estimation using statistics-based methods such as gray world [12], white patch [13], iterative white balance [14], retinex [15], shades of gray [16], weighted gray edge [17], normalization using bright pixels [18], exploiting illumination color statistics perception [19], and exploiting expected illumination statistics [20]. Meanwhile, some examples using learning-based methods are gamut mapping [21], neural networks [22], high-level visual information [23], Bayesian learning [24], spatio-spectral learning [25], color or edge moments [26], convolutional neural networks [27–30], modelling color constancy using overlapping asymmetric Gaussian kernels with surround-pixel-contrast-based sizes [25], and finding paths for the longest dichromatic line produced by specular pixels [31].
Weijer et al. [32] proposed a grey edge based color constancy algorithm which assumes the average edge difference in a scene to be achromatic. Gijsenij et al. [33] proposed a generalized gamut mapping color constancy algorithm based on image derivative structures. Wang et al. [34] proposed a novel color constancy remote sensing algorithm, which can be used in the dark areas of an image and is also self-adaptive according to the luminance conditions. Gijsenij et al. [35] proposed the natural image statistics algorithm (NIS) based on selection and combination of basic color constancy algorithms. In the NIS algorithm, image statistics are used to identify the most important characteristics of color images.

Apart from that, several color constancy algorithms have been applied to improve the quality of malaria images. For example, Sulur et al. [36] analyzed four different color constancy algorithms on thin blood smear images of P. falciparum and P. vivax species at the schizont and gametocyte stages to determine which color constancy algorithm gives the most powerful segmentation performance. The experimental outcomes show that segmentation on gray world images successfully segmented 100 malaria images with average segmentation accuracy, sensitivity and specificity of 99.60, 91.26 and 99.85%, respectively. Khatri et al. [37] applied divisive normalization, a process that changes the range of pixel intensity, to malaria images. It expands the dynamic range of pixel values in an image into a range in which the image appears more stable. Hirimutugoda et al. [38] and Devi et al. [39] utilized an adapted grey world normalization method on malaria images in order to reduce the effect of different light sources. Das et al. [40] used the gray world assumption for correcting the illumination of malaria images. Tek et al. [41] proposed a practical method which exploits the special characteristics of peripheral thin blood film images, which are easily separable into foreground and background regions. After separation, the method employs the simple grey world assumption in two consecutive steps to provide an effective color correction. Based on these previous studies, a good quality of malaria images plays a major role in improving image processing performance. Therefore, this study exploits the potential of six color constancy algorithms in order to find the best algorithm to standardize both thick and thin blood smear malaria images of P. falciparum and P. vivax species by measuring the standardization performance of each algorithm using several quantitative measures.

3 Methodology

In this study, the proposed work comprises three main steps: image acquisition, image pre-processing, and measurement of the standardization performance of thick and thin blood smear image quality for the six color constancy algorithms.

3.1 Image Acquisition

A total of 600 malaria images have been used in this study, consisting of 100 thick blood smear images each from P. falciparum and P. vivax, and 200 thin blood smear images comprising ring, trophozoite, schizont, and gametocyte stages captured from each of the P. falciparum and P. vivax species. The malaria slides were prepared by the Department of Microbiology & Parasitology, Hospital Universiti Sains Malaysia (HUSM). The slides were examined using a Leica DLMA microscope under 100× magnification with oil immersion. The images were then captured at a resolution of 1200 × 900 pixels for thick blood smear images and 800 × 600 pixels for thin blood smear images. Both types of images were saved in JPEG format.
During observation of the slide samples, it was found that the appearance of the infected red blood cells (RBCs) and their morphological features are very important for recognizing the presence of malaria parasites. Thus, the quality of the malaria images is important to increase the effectiveness of a digital image processing system for the recognition of malaria parasites. The malaria images were captured using a fixed computerized microscope under different illuminants, such as normal, overexposed and underexposed lighting conditions, as shown in Fig. 1. Moreover, based on these malaria images, it can be seen that the color of the parasites and normal RBC regions varies in each slide due to the non-standard preparation of the blood slides.

(a) Thick smear PF (b) Thick smear PV

(c) Thin smear PF (d) Thin smear PV

Fig. 1 Original images of malaria thick and thin blood smear from P. falciparum and P. vivax
species

3.2 Image Pre-processing

In this study, two image pre-processing operations have been implemented in order to obtain good quality images for both thick and thin blood smear samples. This process consists of contrast enhancement and a color constancy algorithm.
Applied Contrast Enhancement Technique Using Modified Global Contrast Stretching (MGCS). In this study, the images were captured under particular illumination conditions such as underexposure, overexposure and normal lighting, which may result in low quality images. Therefore, image enhancement is needed to improve the quality of the images for visual recognition. The images are enhanced using the MGCS technique to emphasize and sharpen the image quality. This technique uses specific minimum and maximum values that lie within a specified percentage of pixels, based on the total number of pixels in the RGB image histogram, to enhance the malaria images significantly. Here, the minimum and maximum percentage values used to determine the new minimum and maximum RGB values for the contrast stretching process are minp = 0.1 and maxp = 15. Detailed explanations of the MGCS enhancement method can be found in [42].
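As an illustration only, the following Python sketch shows one way a percentage-based global contrast stretch can be applied; it assumes the new channel limits are read from the cumulative histogram at minp and (100 − maxp) percent, which is an assumption here, since the exact MGCS rule is the one defined in [42].

```python
import numpy as np

def global_contrast_stretch(channel, min_p=0.1, max_p=15.0):
    """Stretch one 8-bit channel between histogram-derived limits.

    min_p / max_p are the percentages of pixels assumed to lie below the
    new minimum and above the new maximum (assumed reading of the paper's
    parameters; see [42] for the exact MGCS formulation).
    """
    flat = channel.ravel().astype(np.float64)
    lo = np.percentile(flat, min_p)            # new minimum grey level
    hi = np.percentile(flat, 100.0 - max_p)    # new maximum grey level
    stretched = (channel.astype(np.float64) - lo) / max(hi - lo, 1e-6) * 255.0
    return np.clip(stretched, 0, 255).astype(np.uint8)

def mgcs_like_enhance(rgb_image):
    """Apply the stretch independently to each RGB channel."""
    return np.dstack([global_contrast_stretch(rgb_image[..., c]) for c in range(3)])
```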
Applied Color Constancy Algorithms on the Enhanced Image. Color constancy is the ability to perceive the color of objects as relatively constant under different illumination conditions [7]. Obtaining color constancy is important for many computer system applications, such as image retrieval, image classification and color object recognition [7, 8]. Six types of color constancy algorithms, namely gray world (GW), white patch (WP), modified white patch (MWP), progressive hybrid (PH), shades of gray (SoG) and gray edge (GE), are applied to the thick and thin smear malaria images of P. falciparum and P. vivax species.
Gray World Algorithm. The gray world algorithm is one of the oldest and simplest color constancy algorithms. It is built on the assumption that, under a normal light source, the average reflectance of the entire image is achromatic, with some pre-specified value referred to as "gray" [43]. It is also said that, given an image with a sufficient amount of color variation, the average values of the R, G, and B components of the image should average to a common gray value [44]. This implies that any deviation of the average color away from gray is caused by the effects of the light source. In real life, color variations in an image occur randomly and are very common. Based on the gray world assumption, we can consider that the average converges to the mean value, gray. Therefore, the algorithm can estimate the color of an image under an unknown light source by forcing the image to have a common average gray value for all R, G, and B components. As a result of this approximation, lighting casts can be properly removed and the color condition of the image becomes much closer to the original scene. This assumption can be represented mathematically as:

$$k e_c = \frac{\int S_c(x, y)\, dx}{\int dx} \qquad (1)$$

Here, $S_c$ is the image itself, $(x, y)$ is the position of the pixel, and $k$ is a constant representing the concept of achromaticity, between 0 for no reflectance (black) and 1 for total reflectance (white) of the incident light [43]. According to this equation, the color of the light source can be obtained from the average values of the three color channels of the whole image. Therefore, any deviation from this condition is due to the color of the illumination.
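A minimal NumPy sketch of this idea (not the authors' MATLAB implementation): the illuminant is estimated per channel as the mean value, and each channel is scaled towards the common gray value.

```python
import numpy as np

def gray_world(image):
    """Gray world correction: scale each channel so its mean matches the
    common gray value, following the assumption of Eq. (1)."""
    img = image.astype(np.float64)
    channel_means = img.reshape(-1, 3).mean(axis=0)      # illuminant estimate per channel
    gray_mean = channel_means.mean()                     # common "gray" target
    gains = gray_mean / np.maximum(channel_means, 1e-6)  # per-channel correction gains
    return np.clip(img * gains, 0, 255).astype(np.uint8)
```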
White Patch Algorithm. The white patch algorithm estimates the illuminant color from the maximum response of the RGB components [45]. The maximum response is obtained from perfect reflectance. Perfect reflectance refers to a region with particular white values, and the maximum response in the image is caused by this white patch. It assumes that, if there is a white patch in the imaged scene, the maximum value of each channel of the RGB image will appear in that white patch. Thus, under the assumption of the presence of a white patch, the maximum value of each R, G, B channel of the image is the illumination color. The white patch algorithm is also known as the max-RGB algorithm, since it uses this assumption to compute the maximum response in each of the three channels R, G and B and sets the color of the light source to the maximum value [45]. A surface having perfect reflectance will reflect the full range of captured light colors. The algorithm is based on the assumption that the reflectance achieved for each of the three channels is equal, where the maximum-detection operation is carried out separately for each RGB channel. Even though it works in a very similar way to the gray world algorithm in Eq. (1), it still differs in that the mean value is replaced by the maximum value of the sensor responses in each channel. The white patch algorithm is stated in the following way:
$$P_c = \max_{x, y} S_c(x, y) \qquad (2)$$

$$k e_c = \frac{255}{P_c} \qquad (3)$$
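A corresponding sketch of Eqs. (2)–(3), assuming 8-bit images so that the white reference is 255:

```python
import numpy as np

def white_patch(image):
    """White patch (max-RGB): assume the brightest response in each channel
    comes from a white patch and rescale it to 255."""
    img = image.astype(np.float64)
    per_channel_max = img.reshape(-1, 3).max(axis=0)    # P_c in Eq. (2)
    gains = 255.0 / np.maximum(per_channel_max, 1e-6)   # k*e_c in Eq. (3)
    return np.clip(img * gains, 0, 255).astype(np.uint8)
```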

Modified White Patch Algorithm. The modified white patch algorithm estimates the illuminant color by considering the mean of the thresholded values instead of the maximum values of the RGB components of the image [46]. The modified white patch algorithm is expressed in the following way:
$$P_c^{\theta} = \operatorname{mean}\!\left(R^{\theta}, G^{\theta}, B^{\theta}\right) \qquad (4)$$

$$k e_c = \frac{S_c^{\theta}}{P_c^{\theta}} \qquad (5)$$

where

$$R^{\theta} \geq \theta \qquad (6)$$

$$G^{\theta} \geq \theta \qquad (7)$$

$$B^{\theta} \geq \theta \qquad (8)$$
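A hedged sketch of Eqs. (4)–(8): the threshold θ and the white target used below are illustrative values, not the ones from [46].

```python
import numpy as np

def modified_white_patch(image, theta=200, white_target=255.0):
    """Modified white patch: per channel, average only the pixels at or above a
    threshold theta and rescale that mean towards a white target (theta and the
    target are assumed values here; see [46] for the original formulation)."""
    img = image.astype(np.float64)
    out = np.empty_like(img)
    for c in range(3):
        bright = img[..., c][img[..., c] >= theta]           # R^theta, G^theta, B^theta
        p_c = bright.mean() if bright.size else img[..., c].max()
        out[..., c] = img[..., c] * (white_target / max(p_c, 1e-6))
    return np.clip(out, 0, 255).astype(np.uint8)
```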

Progressive Hybrid Algorithm. The progressive hybrid algorithm is based on a combination of the gray world and modified white patch algorithms [47]. Classic color constancy methods such as gray world and white patch are global algorithms, which determine the color cast from the whole image. Gray world measures the most prominent cast in the image, since it uses the mean of the image to measure the cast, while white patch measures the cast above a threshold and is very sensitive to noise and clipping. Therefore, since progressive hybrid is based on a combination of gray world and modified white patch, it is more robust to noise and clipping effects. To avoid artifacts caused by loss of gradation in intensity levels, different methods are used depending on the intensity level of the current pixel, which lies between two threshold values. The progressive hybrid algorithm is computed in the following way:

$$M = (R + G + B)/3 \qquad (9)$$

where

$$\text{if } M \geq \theta_1:\quad k e_c = \frac{S_c^{\theta_1}}{P_c^{\theta_1}} \qquad (10)$$

$$\text{if } M \leq \theta_2:\quad k e_c = \frac{S_c}{P_c} \qquad (11)$$

$$\text{if } \theta_2 \leq M \leq \theta_1:\quad k e_c = (1 - \delta)\,\frac{S_c}{P_c} + \delta\,\frac{S_c^{\theta_1}}{P_c^{\theta_1}} \qquad (12)$$

Here, $P_c$ is the mean of channel $c$, $P_c^{\theta_1}$ is the mean of the pixels of channel $c$ greater than $\theta_1$, $S_c$ is the target mean of channel $c$, and $S_c^{\theta_1}$ is the target mean of the pixels of channel $c$ greater than $\theta_1$. Progressive hybrid is able to handle more than one cast in the same image. In fact, gray world estimates and corrects the color cast in mid-tones and shadows, while modified white patch estimates and corrects the color cast in highlights.
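The per-pixel switching of Eqs. (9)–(12) can be sketched as follows; the thresholds θ1 and θ2, the target means and the blending weight δ are illustrative assumptions, since the original parameterisation is given in [47].

```python
import numpy as np

def progressive_hybrid(image, theta1=170, theta2=85, s_target=128.0, white_target=255.0):
    """Progressive hybrid sketch: gray-world style gain in shadows/mid-tones,
    modified-white-patch style gain in highlights, linear blend in between.
    All numeric parameters here are illustrative choices."""
    img = image.astype(np.float64)
    M = img.mean(axis=2)                                    # Eq. (9): per-pixel intensity
    out = np.empty_like(img)
    for c in range(3):
        ch = img[..., c]
        p_c = ch.mean()                                     # mean of channel c
        bright = ch[ch >= theta1]
        p_c_t = bright.mean() if bright.size else ch.max()  # mean of pixels above theta1
        gain_gw = s_target / max(p_c, 1e-6)                 # gray-world style gain
        gain_wp = white_target / max(p_c_t, 1e-6)           # modified-white-patch style gain
        delta = np.clip((M - theta2) / max(theta1 - theta2, 1e-6), 0.0, 1.0)
        out[..., c] = ch * ((1.0 - delta) * gain_gw + delta * gain_wp)
    return np.clip(out, 0, 255).astype(np.uint8)
```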
Shades of Grey Algorithm. The shades of gray algorithm is based on the gray world and white patch algorithms [42]. This algorithm works by considering the scene average of gray world and white patch. If the average scene color approximates gray or the maximum is white, a trade-off between both algorithms is needed. This algorithm introduces the Minkowski norm, which expands the scope of the algorithm to a certain degree, so that the gray world and white patch algorithms can both be included. The normalized Minkowski norm forms the estimated illumination vector. It computes a weighted average of the pixel values, assigning higher weights to pixels with higher intensities. The weight function proposed is based on the Minkowski norm of order p. Notably, for p = 1 the equation is equal to Eq. (1), the gray world assumption, and for p = ∞ it is equal to computing the maximum value, which is the white patch assumption. Besides, through repeated experiments over 1 < p < ∞, Finlayson obtained an approximately optimal value: for p = 6, the SoG algorithm produces the best estimation results. The shades of gray algorithm is represented in the following way:
$$k e_c = \left(\frac{\int \left(S_c(x, y)\right)^p dx}{\int dx}\right)^{1/p} \qquad (13)$$
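A sketch of Eq. (13) with the reported p = 6; the final normalisation of the per-channel estimates towards a common gray is an implementation assumption.

```python
import numpy as np

def shades_of_gray(image, p=6):
    """Shades of gray: estimate the illuminant per channel with the Minkowski
    p-norm of the pixel values (p=1 reduces to gray world, p->inf to white patch)."""
    img = image.astype(np.float64)
    estimate = np.power(np.power(img.reshape(-1, 3), p).mean(axis=0), 1.0 / p)
    gains = estimate.mean() / np.maximum(estimate, 1e-6)   # assumed normalisation to gray
    return np.clip(img * gains, 0, 255).astype(np.uint8)
```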

Gray Edge Algorithm. The gray edge algorithm works by extending pixel-based methods to incorporate derivative information such as edges and higher-order statistics. The gray edge algorithm is in fact based on the shades of gray algorithm [48]. The grey edge hypothesis can also be adapted to incorporate the Minkowski norm. The algorithm based on this equation assumes that the Minkowski p-norm of the derivative of the reflectance in a scene is achromatic. For p = 1, the illuminant is derived by a normal averaging operation over the derivatives of the channels. For p = ∞, the illuminant is computed from the maximum derivative in the scene. The derivations of color constancy from the shades of gray and gray edge hypotheses are therefore similar, and both methods can be combined in a single framework of color constancy methods based on low-level image features. The gray edge algorithm is computed in the following way:

$$k e_c = \left(\frac{\int \left|\nabla S_c(x, y)\right|^p dx}{\int dx}\right)^{1/p} \qquad (14)$$
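A sketch of Eq. (14) using simple finite-difference derivatives; the choice of gradient operator and of p is an assumption here.

```python
import numpy as np

def gray_edge(image, p=6):
    """Gray edge: as shades of gray, but the Minkowski p-norm is taken over the
    magnitude of the image derivatives instead of the pixel values."""
    img = image.astype(np.float64)
    estimates = []
    for c in range(3):
        gy, gx = np.gradient(img[..., c])      # finite-difference derivatives
        mag = np.hypot(gx, gy)
        estimates.append(np.power(np.power(mag, p).mean(), 1.0 / p))
    estimates = np.array(estimates)
    gains = estimates.mean() / np.maximum(estimates, 1e-6)
    return np.clip(img * gains, 0, 255).astype(np.uint8)
```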

3.3 Measurement Performance

Various quantitative measures are used in this study. The peak signal-to-noise ratio (PSNR), mean square error (MSE), normalized absolute error (NAE) and root mean square error (RMSE) are used to evaluate the performance of the color constancy algorithms.
Peak Signal to Noise Ratio. PSNR computes the peak signal-to-noise ratio between two images [49]. This ratio is used as a quality measurement between the original and a reconstructed image. It expresses the ratio between the maximum possible power of a signal and the power of the distorting noise that affects the quality of its representation. The higher the PSNR, the better the quality of the reconstructed image. The mathematical representation of the PSNR is as follows:
$$PSNR = 10 \times \log_{10}\!\left(\frac{255^2}{MSE}\right) \qquad (15)$$

Normalized Absolute Error. NAE is a measure of how far the reconstructed image deviates from the original image [50]. This is a conventional measure and can be used to detect the blurring effect present between the original and reconstructed images. A large value of NAE indicates poor image quality. The mathematical representation of the NAE is expressed as follows:
$$NAE = \frac{\sum_{i=1}^{M}\sum_{j=1}^{N}\left|A_{ij} - B_{ij}\right|}{\sum_{i=1}^{M}\sum_{j=1}^{N} A_{ij}} \qquad (16)$$

Mean Square Error. MSE is the most common estimator used as an image quality measurement metric [49]. An estimator here refers to a procedure for measuring an unobserved quantity of an image. MSE measures the average of the squares of the errors between the original image and the reconstructed image, where the error is the amount by which the values of the original image differ from those of the reconstructed image. MSE can be defined mathematically as follows:

$$MSE = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left(A_{ij} - B_{ij}\right)^2 \qquad (17)$$

Root Mean Square Error. RMSE is another quantitative measure used to evaluate the differences between the value predicted by an estimator and the actual value [51]. It evaluates the error magnitude and is a suitable measure of accuracy for comparing the forecasting errors of different estimators for a particular variable.
$$RMSE = \sqrt{MSE} \qquad (18)$$
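For reference, the four measures of Eqs. (15)–(18) can be computed together as in the following sketch (both images assumed to be 8-bit and of equal size).

```python
import numpy as np

def quality_metrics(original, reconstructed):
    """Compute PSNR, NAE, MSE and RMSE between an original image and a
    colour-standardized image, following Eqs. (15)-(18)."""
    a = original.astype(np.float64)
    b = reconstructed.astype(np.float64)
    mse = np.mean((a - b) ** 2)                                             # Eq. (17)
    rmse = np.sqrt(mse)                                                     # Eq. (18)
    psnr = 10.0 * np.log10(255.0 ** 2 / mse) if mse > 0 else float("inf")   # Eq. (15)
    nae = np.sum(np.abs(a - b)) / np.sum(np.abs(a))                         # Eq. (16)
    return {"PSNR": psnr, "NAE": nae, "MSE": mse, "RMSE": rmse}
```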

4 Results and Discussions

In this study, six types of color constancy algorithms have been applied to both thick and thin blood smear malaria images of P. falciparum and P. vivax species. A total of 100 thick blood smear images from each species and 200 thin blood smear images, consisting of ring, trophozoite, schizont, and gametocyte stages, from P. falciparum and P. vivax species have been used in this study. In order to assess the proposed work, the captured images with different lighting conditions, such as normal, blurred and underexposed, have been processed using the proposed procedure. Four original malaria images of thick and thin blood smears from P. falciparum and P. vivax species are shown in Fig. 2 (a)–(d), respectively. Based on these malaria images, it can be seen that the color of the parasites and normal RBC regions differs in each slide due to the non-standard preparation of the blood slides.

(a) Original image of thick smear PF (b) Original image of thick smear PV
(c) Original image of thin smear PF (d) Original image of thin smear PV
Fig. 2 Original images of malaria thick and thin blood smear from P. falciparum and P. vivax species
Firstly, the malaria images are enhanced using MGCS since the parasites can hardly be seen due to the low image contrast. The results of applying the MGCS technique on thick and thin blood smear images are shown in Fig. 3 (a)–(d). Based on the resultant images, this technique has produced a good image contrast compared to the original images, since the background region has become brighter and the presence of the parasites can be easily seen.

(a) MGCS image of thick smear PF (b) MGCS image of thick smear PV
(c) MGCS image of thin smear PF (d) MGCS image of thin smear PV
Fig. 3 Enhanced images after applying MGCS technique

From the enhanced images of thick and thin blood smears of P. falciparum and P. vivax species, six types of color constancy algorithms have been applied. The resulting images after applying the color constancy algorithms are shown in Fig. 4 (a)–(x). By applying the color constancy algorithms, the color of the malaria images has been standardized and the quality of the images has been improved. Comparing the results among these six algorithms, it can be seen that the images for both SoG and GW algorithms have been normalized to suit the human visual perception.
However, Fig. 4 (a), (b), (m) and (n) show that the GW algorithm failed to standardize both thick and thin smear malaria images. Next, the results after applying the MWP and PH algorithms are shown in Fig. 4 (e), (f), (g), (h), (q), (r), (s) and (t). Based on the resultant images, both thick and thin smear images of P. falciparum and P. vivax species have been standardized, but since the contrast of the resultant images has been enhanced, some local details of the images cannot be preserved. Apart from that, the GE algorithm also failed to produce standardized resultant images on both thick and thin blood smears for P. falciparum and P. vivax species, as shown in Fig. 4 (k), (l), (w), and (x). As for the WP algorithm, the resultant images shown in Fig. 4 (c), (d), (o) and (p) are almost identical to the enhanced image. Hence, nothing can be compared for this algorithm.
In order to evaluate the color standardization performance of each color constancy technique, the malaria images have been assessed using four quantitative measures, namely PSNR, NAE, MSE and RMSE. Table 1 shows the comparative analysis of the PSNR. Based on Table 1, it is observed that the WP algorithm has the highest PSNR value. However, since the WP algorithm produced resultant images that are almost identical to the enhanced image, this result is invalid.

(a) GW on thick smear PF (b) GW on thick smear PV

(c) WP on thick smear PF (d) WP on thick smear PV

(e) MWP on thick smear PF (f) MWP on thick smear PV

(g) PH on thick smear PF (h) PH on thick smear PV

Fig. 4 Results after applied GW, WP, MWP, PH, SoG and GE algorithms on thick and thin blood
smear images of P. falciparum and P. vivax species

(i) SoG on thick smear PF (j) SoG on thick smear PV

(k) GE on thick smear PF (l) GE on thick smear PV

(m) GW on thin smear PF (n) GW on thin smear PV

(o) WP on thin smear PF (p) WP on thin smear PV

Fig. 4 (continued)

(q) MWP on thin smear PF (r) MWP on thin smear PV

(s) PH on thin smear PF (t) PH on thin smear PV

(u) SoG on thin smear PF (v) SoG on thin smear PV

(w) GE on thin smear PF (x) GE on thin smear PV

Fig. 4 (continued)

Therefore, since the SoG algorithm has the second highest PSNR value, and it can be seen in Fig. 4 (i), (j), (u) and (v) that the color of the malaria images has been standardized for both thick and thin blood smear images, it gives the best indication of how the standardization performance is improved. Figure 5 shows the PSNR values plotted from Table 1. Although the WP algorithm shows a higher PSNR value, that result is invalid, so the algorithm with the second highest PSNR value, SoG, is selected as the best color constancy algorithm that performs best on both thick and thin smear images of P. falciparum and P. vivax species, with a value of 30.12499.
Table 2 presents the comparative analysis of the six color constancy algorithms using the other quantitative measures, namely NAE, MSE and RMSE. For all of these quantitative measures, the lowest value indicates the best result.

Fig. 5 Plotted PSNR parameter

Table 1 Average PSNR value of thick and thin blood smear for GW, WP, MWP, PH, SoG and
GE algorithm
Details GW WP MWP PH SoG GE
PSNR 26.36467 55.47285 19.72508 18.94157 30.12499 19.74716

Table 2 Average NAE, MSE and RMSE values of thick and thin blood smear for GW, WP,
MWP, PH, SoG and GE algorithm
Details GW WP MWP PH SoG GE
NAE 0.02958 0.00022 0.11313 0.13021 0.01929 0.07611
MSE 76.92677 0.04835 659.05694 890.48924 37.11536 349.33970
RMSE 6.60749 0.19686 25.31024 29.38089 4.32364 17.10886
Fig. 6 Plotted NAE, MSE and RMSE parameter

Therefore, the SoG algorithm again shows better results than the other algorithms, as it clearly demonstrates the lowest values for every performance measure, excluding the WP algorithm.
Figure 6 shows the NAE, MSE and RMSE values plotted from Table 2. The graph clearly reveals that the SoG algorithm produced the lowest NAE, MSE and RMSE values compared to the other algorithms, excluding the WP algorithm.

5 Conclusion

In this paper, the results of measuring the standardization performance of malaria images by applying six types of color constancy algorithms have been presented. The proposed color constancy algorithms have shown significant improvement of the malaria images. Overall, the results indicate that the SoG algorithm is the best color constancy algorithm, with the highest PSNR value and the lowest NAE, MSE and RMSE values. Thus, the results also significantly demonstrate the suitability of the SoG algorithm for standardizing both thick and thin blood smear malaria images of P. falciparum and P. vivax species.

Acknowledgements The authors would like to express their gratitude to Hospital Universiti
Sains Malaysia (HUSM) for providing the thick and thin smear malaria blood samples.

References

1. Chemistry AAfC (2017) Centers for Disease Control and Prevention: About Malaria
2. World Health Organization (2018) World Malaria Report 2018. Technical Report World
Health Organization
3. Prevention CfDCA (2013) Guidelines for Treatment of Malaria in the United States (2013)
4. Aris TA, Nasir ASA, Mohamed Z, Jaafar H, Mustafa WA, Khairunizam W, Jamlos MA,
Zunaidi I, Razlan ZM, Shahriman AB (2018) Color component analysis approach for malaria
parasites detection based on thick blood smear images. In: IOP conference series: materials
science and engineering, pp 1–7. IOP Publishing
5. World Health Organization (2010) Basic Malaria Microscopy (part I and II), 72
6. Nasir ASA, Mashor MY, Rosline H (2011) Detection of acute leukaemia cells using variety of
features and neural networks. In: 5th Kuala Lumpur international conference on biomedical
engineering. International Federation for Medical and Biological Engineering (IFMBE),
Kuala Lumpur, pp 40–46
7. Nasir ASA, Jaafar H, Mustafa WAW, Mohamed Z (2018) The cascaded enhanced k-means
and fuzzy c-means clustering algorithms for automated segmentation of malaria parasites. In:
Malaysia technical universities conference on engineering and technology. MATEC Web of
Conferences, pp 1–6
8. Halim NHA, Mashora MY, Abdul AS, Hassan NR (2017) The effectiveness of color
constancy on segmenting leukemia cells using unsupervised clustering technique. In: 1st
International conference on green and sustainable computing. IOP Publishing, pp 1–8
9. Sato Y, Ikeuchi K (2003) Illumination from shadows. IEEE Trans Pattern Anal Mach Intell 25
(3):290–300
10. Kulkarni SG, Kamalapur PSM (2014) Color constancy techniques. Int J Eng Comput Sci 3
(11): 9147–9150
11. Finlayson GD, Hordley SD (2006) Gamut constrained illuminant estimation. Int J Comput
Vis 67:93–109
12. Aulakh AS, Arora A, Kaur M (2014) Color correction using color constancy algorithms. Int J
Appl Innov Eng Manage 3(5):71–80
13. Barata C, Celebi ME (2015) Improving dermoscopy image classification using color
constancy. IEEE J Biomed Health Inform 19(3):1146–1152
14. Cardei VC, Funt B Barnard JJ (1999) White point estimation for uncalibrated images. In:
Proceeding of the seventh color imaging conference, pp 97–100
15. Funt B, Ciurea F, McCann J (2000) Retinex in matlab. In: Proceedings of the eighth color
imaging conference: color science, systems and applications, pp 112–121
16. Trezzi GD, Finlayson GD (2004) Shades of gray and color constancy. In: Color imaging
conference: color science engineering system technology, Scottsdale, pp 37–41
17. Gijsenij A, Gevers T, Weijer JVD (2012) Improving color constancy by photometric edge
weighting. IEEE Trans Pattern Anal Mach Intell 34(5):918–929
18. Joze HRV, Drew MS, Finlayson GD (2012) The role of bright pixels in illumination
estimation. In: Color and imaging conference. Society for Imaging Science and Technology,
pp 41–46
19. Banic N, Loncaric S (2019) Blue shift assumption: improving illumination estimation
accuracy for single image from unknown source. In: VISAPP, pp 191–197
20. Banic N, Loncaric S (2018) Green stability assumption: unsupervised learning for
statistics-based illumination estimation. J. Imaging 4(11):127–132
21. Barnard K (2000) Improvements to Gamut mapping color constancy algorithms. In: 6th
European conference on computer vision, Ireland, pp 390–403
22. Cardei VC, Funt B, Barnard K (2003) Estimating the scene illumination chromaticity by using
a neural network. J Opt Soc Am 19(12):2374–2386
23. Weijer JVD, Schmid C, Verbeek J (2007) Using high-level visual information for color
constancy. In: IEEE 11th International conference on computer vision. IEEE, pp 1–8

24. Gehler PV, Rother C, Blake A, Minka T, Sharp T (2008) Bayesian color constancy revisited.
In: Computer vision and pattern recognition. IEEE, pp 1–8
25. Akbarinia A, Parraga CA (2018) Color constancy beyond the classical receptive field. IEEE
Trans Pattern Anal Mach Intell 40(9):2081–2094
26. Shi W, Loy CC, Tang X (2016) Deep specialized network for illuminant estimation. In:
European conference on computer vision. Springer, pp 371–387
27. Barron JT (2015) Convolutional color constancy. In: Proceedings of the IEEE international
conference on computer vision. IEEE, pp 379–387
28. Bianco S, Cusano C, Schettini R (2015) Color constancy using CNNs. In: Proceedings of the
IEEE conference on computer vision and pattern recognition workshops, pp 81–89
29. Hu Y, Wang B, Lin S (2017) Fully convolutional color constancy with confidence-weighted
pooling. In: IEEE Conference computer vision and pattern recognition, pp 4085–4094
30. Oh SW, Kim SJ (2017) Approaching the computational color constancy as a classification
problem through deep learning. Pattern Recognit 61:405–416
31. Woo S-M, Lee S-H, Yoo J-S, Kim J-O (2018) Improving color constancy in an ambient light
environment using the Phong reflection model. IEEE Trans Image Process 27(4):1862–1877
32. Weijer Jvd, Gevers T (2005) Color constancy based on grey edge hypothesis. In: IEEE,
pp 722–725
33. Gijsenij A, Gevers T, Jvd Weijer (2010) Generalized Gamut mapping using image derivative
structures for color constancy. Int J Comput Vis 86(2–3):127–139
34. Wang M, Zheng X, Feng C (2013) Color constancy enhancement for multi-spectral remote
sensing images. In: IEEE international conference on geoscience and remote sensing
symposium (IGARSS), pp 864–867
35. Gijsenij A, Gevers T (2011) Color constancy using natural image statistics and scene
semantics. IEEE Trans Pattern Anal Mach Intell 33(4):687–698
36. Sulur KM, Nasir ASA, Mustafa WA, Jaafar H, Mohamed Z (2017) Analysis of color
constancy algorithms for improving segmentation of malaria images. J Telecommun Electron
Comput Eng 10(1–16):43–49
37. Khatri KM, Ratnaparkhe VR, Agrawal SS, Bhalchandr AS (2014) Image processing approach
for malarial parasite identification. In: IJCA proceedings on national conference on growth of
technologies in electronics, telecom and computers, India, pp 5–7. International Journal of
Computer Applications
38. Hirimutugoda YM, Wijayarathna DG (2010) Image analysis system for detection of red cell
disorders using artificial neural networks. Sri Lanka J Bio-Med Inform 1(1):35–42
39. Devi SS, Sheikh SA, Talukdar A, Laskar RH (2016) Malaria infected erythrocyte
classification based on the histogram features using microscopic images of thin blood smear.
Indian J Sci Technol 9(45):1–10
40. Das DK, Ghosh M, Pal M, Maiti AK, Chakraborty C (2013) Machine learning approach for
automated screening of malaria parasite using light microscopic images. Elsevier, pp 97–
106
41. Tek FB, Dempster AG, Kale I (2006) Malaria parasite detection in peripheral blood images.
In: Proceedings of the British machine vision conference, United Kingdom, pp 4–7
42. Nasir ASA, Mashor MY, Mohamed Z (2012) Modified global and modified linear contrast
stretching algorithms: new color contrast enhancement techniques for microscopic analysis of
malaria slide images. Comput Math Methods Med pp 1–16
43. Singh M, Sharma DS (2014) Enhanced color correction using histogram stretching based on
modified gray world and white patch algorithms. Int J Comput Sci Inf Technol 5(3):4762–
4770
44. Gijsenij A, Gevers T, Lucassen MP (2009) Perceptual analysis of distance measures for color
constancy algorithms. J Opt Soc Am 26(10):2243–2256
45. Barnard K, Martin L, Coath A, Funt B (2002) A comparison of computational color constancy
algorithms-part ii: experiments with image data. IEEE Trans Image Process 11(9):985–996

46. Chambah M, Besserer B, Courtellemont P (2002) Recent progress in automatic digital


restoration of color motion pictures. In: Color imaging: device-independent color, color
hardcopy, and applications VII, United States, pp 98–109
47. Chambah M (2006) More than color constancy: non-uniform color cast correction. Springer,
Netherlands, pp 780–786
48. Jvd Weijer, Gevers T, Gijsenij A (2007) Edge-based color constancy. IEEE Trans Image
Process 16(9):2207–2214
49. Sara U, Akter M, Uddin MS (2019) Image quality assessment through FSIM, SSIM, MSE and
PSNR—a comparative study. J Comput Commun 8–18
50. Rajkumar S, Malathi G (2016) A comparative analysis on image quality assessment for real
time satellite images. Indian J Sci Technol 9(34):1–11
51. Vora VS, Suthar ACP, Makwana YN, Davda SJ (2010) Analysis of compressed image quality
assessments. Int J Adv Eng Appl 225–229
Stochastic Analysis of ANN Statistical
Features for CT Brain Posterior Fossa
Image Classification

Anis Azwani Muhd Suberi, Wan Nurshazwani Wan Zakaria, Razali Tomari, Ain Nazari,
Nik Farhan Nik Fuad, Fakhrul Razan Rahmad, and Salsabella Mohd Fizol

Abstract Automated classification of the Posterior Fossa (PF) in Computed Tomography (CT) image sequences is important for the diagnosis of ischemic stroke. The identification of the PF remains ambiguous due to its varying structures and regions. In this paper, a stochastic analysis of a classification system is presented to classify PF slices in CT image sequences. This system can highlight the specified slices with confidence, which can be a good visual companion for the radiologist to accurately identify the region with minimum processing time. The choice of eleven statistical features is based on the t-test and sensitivity analysis, which were conducted to investigate the significant features for an Artificial Neural Network (ANN) predictive model. This classifier has been compared with the Support Vector Machine (SVM), and the effectiveness of both classifiers is evaluated through a 10-fold cross-validation strategy. It is found that the ANN achieved the highest classification accuracy of 95.1% in 1.695 s.

Keywords Posterior Fossa · Feature extraction · Classification · Computed Tomography

A. A. Muhd Suberi · W. N. Wan Zakaria (✉) · R. Tomari · A. Nazari · F. R. Rahmad
Faculty of Electrical and Electronic Engineering, Universiti Tun Hussein Onn Malaysia,
86400 Batu Pahat, Johor, Malaysia
e-mail: [email protected]
N. F. Nik Fuad
Department of Radiology, UKM Medical Centre, Jalan Yaacob Latif, Bandar Tun Razak,
56000 Cheras, Kuala Lumpur, Malaysia
S. Mohd Fizol
Faculty of Technical and Vocational Education, Universiti Tun Hussein Onn Malaysia,
86400 Batu Pahat, Johor, Malaysia

© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_58

1 Introduction

Computed Tomography (CT) is a reliable imaging technique utilised to visualise the morphological anatomy of the brain. A CT scan is regarded as more widely accessible and inexpensive while being capable of rapid acquisition, as compared to its counterpart, Magnetic Resonance Imaging (MRI) [1]. As a consequence of the aforementioned advantages, CT imaging has been accepted as the conventional method in the medical field and implemented in all medical institutions, from rural hospitals and private medical institutions to specialized centres. Nonetheless, CT imaging comes with a limitation, whereby scanning the Posterior Fossa (PF) region may result in the production of artifacts from the beam hardening effect [1, 2]. These artifacts can substantially alter the presented image, which in turn affects the process of diagnosis. Neurologically, the PF structure consists of the midbrain, pons and medulla, as depicted in Fig. 1. Consequently, the morphology of these components differs from one slice to another, depending on the slice thickness and the orientation of the patient during scanning. Finding an ischemic lesion or tumour in the slice of interest is a challenging task when manual identification is conducted on the CT images. This is because the PF is monotonous, and manual identification has been proven to have adverse effects on the subsequent stages [3].
Fig. 1 PF anatomy in: a axial and b sagittal plane image [3]

Over the last decades, the applicability of CT has been significantly magnified by the precise classification of automated CT images. Rapid development in computer technology has led to the introduction of Computer-aided Diagnosis (CAD) schemes to offer radiologists a second opinion in the clinical decision-making process [4–9]. Generally, the traditional classifier design for CAD requires significant handcrafted feature extraction to achieve high classification accuracy [10–13]. Several authors have highlighted the importance of feature extraction and selection in reducing data dimensionality to interpret the input data [11, 14, 15].
Thus, the implementation of feature extraction and selection on a given large medical image dataset as input is appropriate for achieving robust pattern classification. Most of the recent solutions have confirmed the promising performance of texture, such as first-order features and the Gray Level Co-occurrence Matrix (GLCM), as significant features in characterizing the low distribution of gray level values [9, 14]. Some of the advantages of these features include non-redundancy, interpretability, and distinguishability, among others [14].
In recent years, researchers have proposed many medical image classification techniques [16–20]. Classification falls into two paradigms: supervised and unsupervised. While both methods can attain high performance, supervised classifiers produce higher accuracy than unsupervised ones. Artificial Neural Network (ANN) and Support Vector Machine (SVM) are examples of supervised classifiers. SVM has performed remarkably in medical fields, with the advantages of high accuracy and suitability for a limited number of samples [21]. Meanwhile, ANN appears to be one of the preferable classifiers in medical image analysis [22]. The ability to perform well and its robustness for numerous data types and conditions are among the major advantages of ANN [22].
To date, existing CAD systems for the brain often lack the feasibility to analyse Posterior Fossa (PF)-like structures. Previously, a new ANN predictive model was proposed and extensively investigated to classify PF slices from CT image sequences [3]. That was the first time an ANN-based approach was proposed for classifying PF slices with the integration of first- and second-order statistical textures as inputs. In this paper, the Support Vector Machine (SVM) is introduced, which extends the previous paper [3] by comparing this classifier with the ANN. Meanwhile, the aim of this paper is to further explore the prominent extracted first- and second-order features in distinguishing CT images into PF and non-PF slices. This paper presents two main contributions: (a) investigation of significant parameters in distinguishing PF slices using a one-sample t-test and sensitivity analysis; and (b) a comparative study between ANN and SVM using the K-fold cross-validation method and time analysis. This paper is organized starting from the description of the proposed research method and dataset in Sect. 2, followed by the results and discussion in Sect. 3, and finally, the summary of findings in Sect. 4.

2 Methodology

The experimental work was conducted using an Intel® Core™ i7-7500U processor and an 8 GB RAM platform via MATLAB R2018b. The proposed automatic classification system for PF CT brain slices involves the following three stages: (1) pre-processing, (2) effective feature extraction, and (3) classification, as depicted in Fig. 2.

Fig. 2 Automated PF and non-PF slices classification

2.1 Dataset

The proposed methodology is applied to 1838 non-contrast head CT images retrieved from the UKM Medical Centre, Malaysia. Approval was obtained for this retrospective study and the patients' consent was waived. Each CT dataset has between 152 and 198 slices. Among them, the average number of CT images per patient is 73 for PF and 110 for non-PF. The dataset came from 10 male subjects aged from 30 to 60 years old, scanned using an Aquilion One scanner by Toshiba in the axial plane following the standard parameters, as described in previous work [3]. All the two-dimensional (2D) CT images are stored in DICOM format with 512 × 512 resolution. Each patient has a varying number of image slices owing to the scanning procedure conducted by the radiologist. A board-certified radiologist with five years of experience provided the ground truth data manually on the computer screen.
The prominent features for classification of PF slices in CT image sequences are influenced by the visual features interpreted by the radiologist. As described in Sect. 2.3, these features are defined based on the anatomical parts presented in the slice. According to Iqbal et al. [23], the PF can be seen as a funnel shape, as outlined using red dotted lines (3rd–7th slices) in Fig. 3. The rest are CT images with non-PF structures. The yellow arrow indicates the position of each slice in the sagittal view. All slices present different bone structures, which can introduce various challenges in the PF decision-making process. For instance, one of the perennial difficulties in identifying the PF region in the 6th and 7th CT slices is associated with minimal visibility of the bone boundary. Meanwhile, the PF in the 3rd CT slice is visualised as a small region, having a different shape compared to other slices.

2.2 Pre-processing

Fig. 3 PF region (red dotted lines) in CT image sequences

DICOM image conversion is achieved through a windowing operation of 40 HU for both centre and width. This is confirmed by the results obtained from previous studies [3]. Generally, this operation converts the original DICOM CT images from the 16-bit into the 8-bit range of grayscale intensity levels. The output exhibits a clearer and better-preserved visualization of brain tissues. The details of this stage can be found in the previous study [3].
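A sketch of the linear windowing rule assumed here (centre and width of 40 HU, output mapped to 8-bit); the clipping convention is the usual windowing formula, not necessarily the exact implementation of [3].

```python
import numpy as np

def window_ct(hu_image, center=40.0, width=40.0):
    """Map a CT slice in Hounsfield units to 8-bit grayscale using a
    window centre/width of 40 HU (assumed linear windowing rule)."""
    lo = center - width / 2.0
    hi = center + width / 2.0
    windowed = np.clip(hu_image.astype(np.float64), lo, hi)
    return np.round((windowed - lo) / (hi - lo) * 255.0).astype(np.uint8)
```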

2.3 Feature Extraction

Various studies have assessed the efficacy of statistical and texture features in acquiring the prominent features in medical images [6, 9, 14]. This method is particularly useful in interpreting the vital components of human visual recognition. Therefore, six first-order and five second-order statistical features are selected for their reliability in medical image analysis applications. First-order features are statistical features which provide information based on the gray level distribution of the image. Meanwhile, second-order features are extracted as Gray Level Co-occurrence Matrix (GLCM) textures by measuring the relationship between neighbouring pixels in the image. In this feature extraction stage, feature computation is performed and the features are extracted from the PF and non-PF slices in the CT images. All the extracted feature values corresponding to the CT slices in Fig. 3 can be found in Tables 1 and 2; a short extraction sketch follows the tables.

Table 1 First-order statistical features of PF and non-PF in 10 CT slices


Slice Mean Std. dev. RMS Variance Kurtosis Skewness
Non PF 1 47.0717 90.2496 6.5079 6661.9845 3.7154 1.5739
Non PF 2 52.5591 94.7007 6.8504 7218.7811 3.2221 1.4180
PF 3 60.6546 100.4566 7.3175 7735.2059 2.6507 1.2115
PF 4 58.5293 96.8693 7.4086 7289.0883 2.8290 1.2550
PF 5 60.7755 94.3901 8.1997 6997.7864 2.8928 1.2357
PF 6 57.7147 84.5990 8.8613 5701.4005 3.4256 1.3114
PF 7 52.9180 79.2254 9.0176 5178.5310 4.0914 1.4803
Non PF 8 48.4564 77.8005 8.9339 5291.9117 4.8295 1.7119
Non PF 9 48.1100 77.6303 8.7671 5223.5784 4.8679 1.7207
Non PF 10 36.3791 77.0026 6.4176 5220.0434 6.0133 2.1008

Table 2 Second-order statistical features of PF and non-PF in 10 CT slices


Slice Contrast Correlation Energy Homogeneity Entropy
Non PF 1 0.8721 0.9309 0.5649 0.9271 1.8187
Non PF 2 0.7772 0.9438 0.5372 0.9314 1.8868
PF 3 0.7787 0.9497 0.5016 0.9322 1.9354
PF 4 0.8491 0.9411 0.4871 0.9236 2.0825
PF 5 0.7778 0.9431 0.4135 0.9063 2.4942
PF 6 0.7013 0.9364 0.3521 0.8885 2.8895
PF 7 0.6280 0.9351 0.3458 0.8875 2.9422
Non PF 8 0.6150 0.9339 0.3634 0.8948 2.8370
Non PF 9 0.5912 0.9361 0.3689 0.8989 2.7972
Non PF 10 0.4187 0.9537 0.5771 0.9457 1.8326
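The sketch below illustrates one way to obtain the eleven features with SciPy and scikit-image; the GLCM distance and angle, the RMS convention and the use of Pearson (non-Fisher) kurtosis are assumptions here, not necessarily the paper's exact settings.

```python
import numpy as np
from scipy import stats
from skimage.feature import graycomatrix, graycoprops

def first_order_features(gray):
    """Six first-order statistics of an 8-bit slice (conventional RMS of
    intensities; Pearson kurtosis so that a normal distribution gives 3)."""
    x = gray.astype(np.float64).ravel()
    return {
        "mean": x.mean(), "std": x.std(), "rms": np.sqrt(np.mean(x ** 2)),
        "variance": x.var(), "kurtosis": stats.kurtosis(x, fisher=False),
        "skewness": stats.skew(x),
    }

def second_order_features(gray, distance=1, angle=0.0):
    """Five GLCM (second-order) textures; distance/angle are illustrative."""
    glcm = graycomatrix(gray, distances=[distance], angles=[angle],
                        levels=256, symmetric=True, normed=True)
    p = glcm[:, :, 0, 0]
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))   # entropy from the normalised GLCM
    return {
        "contrast": graycoprops(glcm, "contrast")[0, 0],
        "correlation": graycoprops(glcm, "correlation")[0, 0],
        "energy": graycoprops(glcm, "energy")[0, 0],
        "homogeneity": graycoprops(glcm, "homogeneity")[0, 0],
        "entropy": entropy,
    }
```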

2.4 ANN Classification

ANN is one of the popular approaches used in medical image classification. It is a widely used algorithm that represents the brain's neural structure through mathematical models [24]. Along the learning process, the ANN aims to create a functional relationship between two data groups, which it then reproduces in the recall process [25]. One of the most prominent ANN architectures frequently used in various applications is the Multilayer Perceptron (MLP) [26]. MLP is a layered feedforward network trained by the backpropagation algorithm. This supervised learning process encompasses the input, hidden and output layers [27]. The overview of the ANN predictive model used in this study is depicted in Fig. 4. Eleven statistical features, including first- and second-order statistical features, have been selected as the input to classify two types of classes: PF and non-PF slices. The dataset is divided into 90% and 10% for training and testing, respectively. In the previous development of this predictive model, a hidden layer with six nodes was applied with a learning rate and epoch number of 0.168 and 92, respectively [3]. The details of this developed ANN predictive model can be found in [3].

Fig. 4 ANN predictive model [3]
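A minimal scikit-learn sketch of the described setup (the original model was developed in MATLAB); the feature matrix and labels below are placeholders standing in for the eleven extracted statistics.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Placeholder data: (n_samples, 11) feature matrix and PF / non-PF labels.
X = np.random.rand(200, 11)
y = np.random.randint(0, 2, 200)      # 1 = PF, 0 = non-PF (illustrative)

# 90% training / 10% testing split, as described in the text.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=0)

# One hidden layer of six neurons, learning rate 0.168, 92 epochs (sketch).
mlp = MLPClassifier(hidden_layer_sizes=(6,), learning_rate_init=0.168,
                    max_iter=92, solver="sgd", random_state=0)
mlp.fit(X_train, y_train)
print("Testing accuracy:", mlp.score(X_test, y_test))
```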

3 Results and Discussion

The t-test and sensitivity analyses were conducted to determine which extracted features are significant and represent the characteristics of the PF structure in CT image sequences. In order to verify the performance of the ANN, this method was compared to the SVM classifier with linear and quadratic kernels. The performances of these classifiers were properly evaluated through 10-fold cross-validation. This type of validation examines the classifier model using nine of the ten partitions for training, while the remaining partition is used for validation [28]. Afterwards, the performance in terms of computation time was measured to assess the time taken by each classifier in the cross-validation stage.

3.1 Important Features in PF Classification

A statistical analysis using a two-sample t-test was conducted using the SPSS software. It is an important consideration when selecting the optimal features that could differentiate the PF and non-PF image slices. The t-test results are calculated based on the extracted feature values with a 95% confidence interval (CI); all values are presented in Table 3. The p-value refers to the probability measuring evidence against the null hypothesis. It is important to note that p < 0.05 indicates a statistically significant difference between the means of PF and non-PF. Closer inspection of the table shows that there are clearly significant differences between PF and non-PF slices. Each proposed feature provides a significant value, p < 0.05. These findings indicate a relatively good distinction between PF and non-PF slices, with statistical differences for all p-values. Therefore, it can be concluded that all of these features can be applied to differentiate the PF and non-PF structures in CT image sequences. In order to verify the prediction performance using all features as inputs, a sensitivity analysis was conducted and the results are presented in Sect. 3.2.

Table 3 T-test analysis for texture features
Statistical feature Descriptive P-value (95% CI)
First-order Mean 0.000
Std. dev. 0.000
RMS 0.000
Variance 0.000
Kurtosis 0.000
Skewness 0.000
Second-order Contrast 0.000
Correlation 0.000
Energy 0.000
Homogeneity 0.000
Entropy 0.000
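A sketch of this per-feature significance test with SciPy (the original analysis was done in SPSS); the feature arrays below are placeholders for the PF and non-PF feature values.

```python
import numpy as np
from scipy import stats

# Placeholder (n_slices, 11) arrays of the extracted statistics per class.
pf_features = np.random.rand(100, 11)
non_pf_features = np.random.rand(120, 11)

# Two-sample t-test per feature; a feature is kept if p < 0.05.
for k in range(pf_features.shape[1]):
    t_stat, p_value = stats.ttest_ind(pf_features[:, k], non_pf_features[:, k])
    print(f"feature {k}: p = {p_value:.4f}", "(significant)" if p_value < 0.05 else "")
```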

3.2 Sensitivity Analysis

Fig. 5 Sensitivity analysis for each parameter

A series of experiments on sensitivity analysis were conducted to validate the classification accuracy of the developed predictive model with the exclusion of each parameter. Accuracy here is defined as the percentage of correctly classified PF. This evaluation is important to identify the prominent features among a large number of parameters. The results in Fig. 5 illustrate a reasonable agreement between the performances of the sensitivity analysis on each removed parameter. It is interesting to note that the combination of all features from the first and second order (GLCM) is capable of demonstrating high training and testing accuracy at 94% and 96.2%, respectively. Simultaneously, higher accuracies of 93.4% and 96.7% for training and testing, respectively, can be observed when the standard deviation parameter is excluded. In the training stage, most of the ANN predictive models attained above 90% accuracy when each of the parameters was excluded; only a few show training accuracy between 82% and 89%. However, it is noticeable that, in the testing stage, excluding parameters such as the mean, skewness, contrast, correlation, homogeneity, and entropy resulted in higher misclassification errors, which cause the accuracy to fall below 90%. Meanwhile, excluding homogeneity gives the lowest testing accuracy at only 78.1%.
Based on the brain structures, some of the PF regions are presented with and some without the bone boundary. In the meantime, each patient may have a different PF structure visible in each slice due to the parameters applied during the CT scanning procedure. Therefore, achieving the optimum value for each parameter is relatively challenging, as the parameter value depends on the visual context of the image. This finding raises the possibility that the varying structures of the PF region can be represented using internal texture characteristics instead of focusing on external characteristics, such as shape. Overall, this inconsistency may be due to the different structures of the PF presented in the CT images. Each pixel of the PF structure may correlate with other pixels of non-PF due to the intensity variation in a specific pixel of interest. Although the exclusion of each parameter in the ANN predictive model did not affect the accuracy significantly, these results should be interpreted with caution. These results provide further support for the hypothesis that it is important to use all eleven proposed statistical features as input parameters in the ANN predictive model.

3.3 Classification Comparison

The experiments were conducted to compare the performance of the ANN with other current state-of-the-art classifiers, such as SVM. Generally, training in SVM involves kernel parameters, which influence the decision boundary. SVM is capable of categorizing non-linear data by using kernel functions, such as Gaussian, polynomial and sigmoid kernels [29]. These kernel functions transform the data into a higher dimensional feature space, where the SVM determines the hyperplane and performs data division into distinct classes [29, 30]. The linear and quadratic kernels were chosen here, as polynomial and Gaussian kernels have frequently introduced overfitting issues [21]. In this work, the SVM classifier used at this stage is integrated with linear and quadratic kernels, as well as the 10-fold cross-validation strategy. These classifiers use the 11 extracted features discussed in Sect. 2.3. Similar to the ANN, the training and testing data are divided into 90% and 10%, respectively.
The classification comparison between ANN and SVM is shown in Fig. 6. Compared with the SVM results obtained, there is a marked improvement for the ANN, with an accuracy of 95.1%. The classification of PF and non-PF is much more accurate when the ANN classifier is used rather than SVM + linear kernel or SVM + quadratic kernel. In particular, for classification using SVM + linear kernel, 93.7% of PF and non-PF are correctly classified, with this classification rate decreasing to 90% for the SVM + quadratic kernel. The false positive rate (FPR) of the ANN, which represents the proportion of non-PF slices incorrectly classified as PF, is also negligible, as opposed to the FPR of SVM + linear kernel and SVM + quadratic kernel using statistical features. Classification using the ANN achieves a small false positive rate of 4.9%, which increases to 6.3% and 10% for SVM + linear kernel and SVM + quadratic kernel, respectively.
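A sketch of the 10-fold cross-validation comparison; the quadratic kernel is taken to be a degree-2 polynomial kernel, and the data below are placeholders for the eleven-feature matrix and labels.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Placeholder feature matrix and PF / non-PF labels.
X = np.random.rand(200, 11)
y = np.random.randint(0, 2, 200)

models = {
    "ANN": MLPClassifier(hidden_layer_sizes=(6,), max_iter=500, random_state=0),
    "SVM + linear kernel": SVC(kernel="linear"),
    "SVM + quadratic kernel": SVC(kernel="poly", degree=2),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10)   # 10-fold cross-validation
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```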

3.4 Time Analysis

Computation time is one of the significant factors for evaluating classifier performance. The findings of this time analysis can be found in Table 4. The average computation times for the ANN, SVM + linear kernel and SVM + quadratic kernel classifications using 10-fold cross-validation are 1.695 s, 2.271 s and 4.672 s, respectively.

[Bar chart: overall accuracy (%) of the ANN, SVM + linear kernel and SVM + quadratic kernel methods, ranging between 87% and 97%]

Fig. 6 Overall accuracy of 10-fold cross-validation

Table 4 Computation time for different classifiers

Classifier                  Computation time (seconds)
ANN                         1.695
SVM + linear kernel         2.271
SVM + quadratic kernel      4.672

kernel is the most time-consuming, followed by the SVM + linear kernel. Meanwhile,
the ANN has the shortest average computation time, which is sufficiently rapid for
medical image diagnosis.

4 Conclusion

A predictive model has been developed to classify PF and non-PF slices in CT
image sequences. To this end, this paper has investigated the prominent features in
the development of the ANN predictive model using t-test and sensitivity analyses.
The results show that all proposed features are statistically significant (p < 0.05)
and important to be fed as input parameters to the ANN predictive model. One of
the more significant findings to emerge from this study is that better classification
accuracy is obtained when all parameters are included. These findings have
significant implications for understanding how the statistical features are capable of
representing the PF structure based on texture. Then, the performance of this ANN

predictive model was compared to the SVM classifier with different kernels: linear
and quadratic kernels. The experiment using 10-fold cross-validation demonstrates
that the ANN has obtained 95.1% classification accuracy in 1.695 s, which is higher
than the SVM with linear and quadratic kernels. Future work can focus on the
following aspects: first, the proposed predictive model could employ Principal
Component Analysis (PCA) and Particle Swarm Optimization (PSO) to accelerate
the performance of the classifier; and second, other machine-learning classifiers,
such as K-Nearest Neighbour (KNN) or decision trees, could be explored to classify
the PF and non-PF slices. The insights gained from this study may be of assistance
to developing an automated classification system for ischemic prediction in the PF
region.

Acknowledgements The authors would like to thank Universiti Tun Hussein Onn Malaysia for
supporting this work under the TIER 1 Vot H203 and GPPS Vot U954 Research Grants.

References

1. Inoue T, Nakaura T, Yoshida M, Yokoyama K, Hirata K, Kidoh M, Oda S, Utsunomiya D,


Harada K, Yamashita Y (2017) Diagnosis of small posterior fossa stroke on brain CT: effect
of iterative reconstruction designed for brain CT on detection performance. Eur Radiol
27(9):3710–3715
2. Boulter DJ, Luigetti M, Rumboldt Z, Chalela JA, Cianfoni A (2012) Multimodal CT imaging
of a posterior fossa stroke. Neurol Sci 33(1):215–216
3. Suberi AAM, Zakaria WNW, Tomari R, Fuad NFN (2018) Classification of posterior fossa
CT brain slices using artificial neural network. In: 3rd international conference on computer
science and computational intelligence, pp 3–10. Procedia Comput Sci. Elsevier (2018)
4. Tyan YS, Wu MC, Chin CL, Kuo YL, Lee MS, Chang HY (2014) Ischemic stroke detection
system with a computer-aided diagnostic ability using an unsupervised feature perception
enhancement method. Int J Biomed Imaging. Article ID 947539, 12 p
5. Kanchana R, Menaka R (2015) Computer reinforced analysis for ischemic stroke recognition:
a review. Indian J Sci Technol 8(35):1–9
6. Rajini NH, Bhavani R (2013) Computer aided detection of ischemic stroke using
segmentation and texture features. Measurement 46(6):1865–1874
7. Titano JJ, Badgeley M, Schefflein J, Pain M, Su A, Cai M, Swinburne N, Zech J, Kim J,
Bederson J, Mocco J (2018) Automated deep-neural-network surveillance of cranial images
for acute neurologic events. Nat Med 24(9):1337–1341
8. Tang FH, Ng DKS, Chow DHK (2011) An image feature approach for computer-aided
detection of ischemic stroke. Comput Biol Med 41(7):529–536
9. Kanchana R, Menaka R (2017) A novel approach for characterisation of ischaemic stroke
lesion using histogram bin-based segmentation and gray level co-occurrence matrix features.
Imaging Sci J 65(2):124–136
10. Berbar MA (2018) Hybrid methods for feature extraction for breast masses classification’.
Egypt Inf J 19(1):63–73
11. Sharma V, Juglan KC (2018) Automated classification of fatty and normal liver ultrasound
images based on mutual information feature selection. IRBM 39(5):313–323
12. Kumar M, Pachori RB, Acharya UR (2016) An efficient automated technique for CAD
diagnosis using flexible analytic wavelet transform and entropy features extracted from HRV
signals. Expert Syst Appl 63:165–172

13. Zhou S, Shi J, Zhu J, Cai Y, Wang R (2013) Shearlet-based texture feature extraction
for classification of breast tumor in ultrasound image. Biomed Signal Process Control 8(6):
688–696
14. Kairuddin WNHW, Mahmud WMHW (2017) Texture feature analysis for different resolution
level of kidney ultrasound images. In: IOP conference series: materials science and
engineering, vol 226
15. Khalid S, Khalil T, Nasreen S (2014) A survey of feature selection and feature extraction
techniques in machine learning. In: Science and information conference. IEEE, pp 372–378
16. Giger ML (2018) Machine learning in medical imaging. J Am Coll Radiol 15(3):512–520
17. Spanhol FA, Oliveira LS, Petitjean C, Heutte L (2015) A dataset for breast cancer
histopathological image classification. IEEE Trans Biomed Eng 63(7):1455–1462
18. Camlica Z, Tizhoosh HR, Khalvati F (2015) Medical image classification via SVM using LBP
features from saliency-based folded data. In: IEEE 14th international conference on machine
learning and applications. IEEE, pp 128–132
19. Li Q, Cai W, Wang X, Zhou Y, Feng DD, Chen M (2014) Medical image classification with
convolutional neural network. In: 13th international conference on control automation
robotics & vision. IEEE, pp 844–848
20. Lai Z, Deng H (2018) Medical image classification based on deep features extracted by deep
model and statistic feature fusion with multilayer perceptron. Comput Intell Neurosci.
Article ID 2061516, 13 p
21. Zhang Y, Wu L (2014) An MR brain images classifier via principal component analysis and
kernel support vector machine. Prog Electromagnet Res 130:369–388
22. Lee JG, Jun S, Cho YW, Lee H, Kim GB, Seo JB, Kim N (2017) Deep learning in medical
imaging: general overview. Korean J Radiol 18(4):570–584
23. Iqbal S, Robert A, Mathew D (2016) Computed tomographic study of posterior cranial fossa,
foramen magnum, and its surgical implications in Chiari malformations. Asian J Neurosurg
12(3):428–435
24. Jiang Y, Yang C, Na J, Li G, Li Y, Zhong J (2017) A brief review of neural networks based
learning and control and their applications for robots. Complexity. Article ID 1895897, 14 p
25. Chojaczyk AA, Teixeira AP, Neves LC, Cardoso JB, Soares CG (2015) Review and
application of artificial neural networks models in reliability analysis of steel structures. Struct
Saf 52:78–89
26. Park YS, Lek S (2016) Artificial neural networks: multilayer perceptron for ecological
modeling. Dev Environ Model 28:123–140
27. Abedini R, Esfandyari M, Nezhadmoghadam A, Rahmanian B (2012) The prediction of
undersaturated crude oil viscosity: an artificial neural network and fuzzy model approach. Pet
Sci Technol 30(12):2008–2021
28. Talo M, Baloglu UB, Yıldırım Ö, Acharya UR (2019) Application of deep transfer learning
for automated brain abnormality classification using MR images. Cogn Syst Res 54:176–188
29. Tomar D, Agarwal S (2015) Twin support vector machine: a review from 2007 to 2014.
Egypt Inf J 16(1):55–69
30. Müller KR, Mika S, Rätsch G, Tsuda K, Schölkopf B (2001) An introduction to kernel-based
learning algorithms. IEEE Trans Neural Netw 12(2):181–201
Improvement of Magnetic Field
Induction for MPI Application
Using Maxwell Coils Paired-Sub-coils
System Arrangement

Muhamad Fikri Shahkhirin Birahim, Nurmiza Othman,


Syarfa’ Zahirah Sapuan, Mohd Razali Md Tomari,
Wan Nurshazwani Wan Zakaria, and Chua King Lee

Abstract The development of excitation coil for Magnetic Particle Imaging


(MPI) requires an adequate and homogeneous magnetic field induction to excite the
magnetic nanoparticles (MNPs) located under the skin surface. The complex sur-
face of human body and excitation coil physical properties can influence the
magnetic field induction and homogeneity towards the MNPs. In this paper, several
excitation coil designs with and without the combination of Maxwell coils pair were
proposed. The simulation studies were performed to determine the magnetic field
induction and homogeneity towards the targeted MNPs distance of 10–50 mm
below the excitation coils using ANSYS Maxwell. The results show that the
combination of several sub-coils with a Maxwell coils pair can produce stronger
magnetic field induction than the excitation coil used in a conventional MPI
system, whereas a single coil with no Maxwell coils pair shows the weakest
induction among all the designs. The proposed
excitation coil design can be used to enhance the MPI system performance for
various medical applications.

Keywords Magnetic Particle Imaging · Excitation coil · Coil design · Sub-coil · Maxwell coils pair

M. F. S. Birahim (&) · N. Othman · S. Z. Sapuan · M. R. M. Tomari · W. N. W. Zakaria · C. K. Lee
Faculty of Electrical and Electronic Engineering, Universiti Tun Hussein Onn Malaysia,
86400 Parit Raja, Johor, Malaysia
e-mail: muhamadfi[email protected]
N. Othman · S. Z. Sapuan · M. R. M. Tomari · W. N. W. Zakaria · C. K. Lee
Research Centre for Applied Electromagnetic (EMCentre), Universiti Tun Hussein Onn
Malaysia, 86400 Parit Raja, Johor, Malaysia


1 Introduction

MPI is a new imaging technique that uses the magnetic response of an MNP
distribution. MPI uses non-ionizing radiation and biocompatible superparamagnetic
iron oxide nanoparticles (SPIONs) as a tracer material. These advantages of the MPI
system offer great potential in a wide range of medical imaging applications such as
cancer diagnosis, brain injury detection, cardiovascular imaging and lung imaging.
However, effort to improve MPI system performance is still required in several
important aspects, such as high-resolution image contrast, high signal-to-noise ratio
(SNR), tracer material development, fast acquisition time for image reconstruction
and optimum coil design configuration [1–4].
Gleich and Weizenecker invented the first MPI scanner at Philips Research
Laboratories in Hamburg, Germany, in 2005, utilizing the magnetization behaviour
of the MNPs for tomographic imaging [5]. Since then, many groups have carried out
research on improving MPI performance, covering magnetic nanoparticle (MNP)
materials, coil design configurations and image reconstruction methods. A summary
of recent MPI developments is listed in Table 1 [6–12].
MPI operation starts with the injection of MNPs into a patient, where they
accumulate at the targeted tissue or cells. The MNPs are then excited by the
excitation coil, and the magnetic field signal produced by the magnetized MNPs is
detected by the receiving coil to produce an image.
There are three main components in the MPI system: the MNPs, the magnetic
coils (excitation coil, selection coil and receiving coil) and image reconstruction.
The excitation coil generates a strong AC magnetic field at a high frequency of
around 1 to 30 kHz (depending on the design) so that the harmonic response of the
MNPs can be utilized; the selection coil is a DC gradient coil that creates a
field-free point (FFP) or field-free line (FFL) so that the MNP harmonic response
can be suppressed for spatial localization of the MNPs; and the receiving coil detects
the harmonic responses from the MNPs to be used in the image reconstruction
process.

Table 1 Summary of the recent MPI development


Year Milestones
2013 The world’s first preclinical MPI system was announced by Bruker [6]
2014 Vogel et al. developed the first hybrid MRI/MPI system [7]
2015 Zheng et al. developed the MPI system for stem cell imaging [8]
2016 Grafe et al. recorded 2D images with MPI [9]
2017 Ohki et al. developed the MPI system for hyperthermia [10]
2018 Muslu et al. proposed a method to generate a multi-colour relaxation map in MPI
system [11]
2019 Kosch et al. developed solid phantoms containing freeze-dried MNPs for comparison
of resolution capabilities between different MPI [12]

In this study, the magnetic field induction and its homogeneity generated from
several different excitation coil arrangements and parameter sets have been
investigated. A homogeneous and adequate magnetic field excited at a certain
detection position will produce a high-quality image. The aim of this study was
to generate a strong and homogeneous magnetic field induction from the excitation
coil towards the MNPs at the targeted distance of 10–50 mm. In the simulation work,
Irms = 1 A was used to observe the static magnetic field distribution at a particular
location and at a particular period of time, while in a real MPI application an
alternating current (AC) at a frequency of 25 kHz will be used to generate the
harmonic response from the MNPs. A current of 1 A was used to ensure minimal
power consumption in the whole MPI system.
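For intuition about how the on-axis field of such a coil decays over the 10–50 mm range, a thin-loop Biot–Savart sketch can be used (a simplification only, not the ANSYS Maxwell model used in this work, and not expected to reproduce the reported values; the loop layout below is an assumed idealization of the single-coil dimensions):

import numpy as np

MU0 = 4e-7 * np.pi           # vacuum permeability (T*m/A)
I = 1.0                      # excitation current (A)
R_IN, R_OUT = 0.010, 0.050   # inner/outer coil radius (m)
LENGTH = 0.070               # single-coil length L (m)
TURNS = 2800                 # turns of one single-coil

def on_axis_field(z):
    """Sum the on-axis field of TURNS thin circular loops spread uniformly
    over the coil length and radial build (Biot-Savart approximation)."""
    zs = np.linspace(0.0, LENGTH, TURNS)        # axial position of each loop
    rs = np.linspace(R_IN, R_OUT, TURNS)        # radius assigned to each loop
    d = z + zs                                  # distance from each loop to the point
    return np.sum(MU0 * I * rs**2 / (2.0 * (rs**2 + d**2) ** 1.5))

for z_mm in (10, 20, 30, 40, 50):
    print(f"{z_mm} mm: B = {on_axis_field(z_mm / 1000.0) * 1e6:.1f} µT")

The sketch simply sums the on-axis field of many circular loops, which is enough to see the rapid fall-off of the induction with distance from the coil.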

2 Method

Several different configurations of excitation coils were designed and simulated,
namely Design A, Design B, Design C, Design D, Design E and Design F. The
magnetic field induction and homogeneity of all the excitation coils towards the
MNPs at the targeted distance of 10–50 mm below the coils were then determined
using ANSYS Maxwell, as shown in Fig. 1.
The excitation coils (Design A - Design F, red coil) were designed as shown in
Fig. 2. L denotes the length of the single coil used in Design A and Design D, while
L1 and L2 denote the shorter lengths used for (Design B, Design E) and (Design C,
Design F), respectively. L2 is shorter than L1, and L1 is shorter than L, so the length
relationship can be summarized as L > L1 > L2. Design B and Design E have three
sub-coils, while Design C and Design F have five smaller sub-coils employed as the
excitation coil. The difference between the excitation coils of (Design A - Design C)
and (Design D - Design F) is whether or not a Maxwell coils pair is employed:
Design A - Design C do not include the Maxwell coils pair, while Design D - Design F
do (blue coil). The other coil specification parameters were set to the same values
except for the excitation coil length, as listed in Table 2.
The standard specification parameters set for all coils, such as the supplied current
(1 A), the winding material (copper wire) and the coil shape (cylindrical), are shown
in Table 2. Note that the excitation coil length was varied from L to L2. Since the
majority of the excitation coils proposed in previous research used copper wire,
copper wire was chosen for all coil models [13–16]. A conventional MPI system
needs high-current excitation to obtain stronger magnetic field induction, but this
project aims to avoid high currents and additional accessory circuits; therefore, a
new excitation coil design and arrangement is proposed. The excitation coil
parameters and specifications can be varied depending on the particular design.

Fig. 1 Magnetic field induction was determined from the distance of 10–50 mm under the excitation coil

Table 2 Excitation coil design specification of (Design A - Design F)

Specification                              Value
Length of one single-coil, L               70 mm
Length of one sub-coil, L1                 20 mm
Length of one smaller sub-coil, L2         10 mm
Inner radius, a                            10 mm
Outer radius, b                            50 mm
Wire diameter                              1 mm
Turns of one single-coil                   2800
Turns of one sub-coil                      800
Turns of one smaller sub-coil              400
Wire material                              Copper
Current                                    1 A

In the conventional MPI system, the Maxwell coils pair is used to create the FFP for
MNP magnetization purposes [17]. An example of the Maxwell coils pair
configuration used in conventional MPI can be found in Fig. 3. Two identical
circular coils are arranged symmetrically along the same axis, each carrying an
equal current I flowing in opposite directions. The blue coils indicate the Maxwell
coils pair configuration, which generates the FFP between the pair. The FFP is very
important in MPI as it is used for spatial encoding, which the MPI system needs in
order to encode the spatial position of the MNPs for later use in image
reconstruction.
In this work, Maxwell coils pair was used to increase the induced magnetic field
produced by the vertically stacked sub-coils instead of creating the FFP as con-
ventionally used by other MPI researchers. This method suggests a new idea of the
functionality of the Maxwell coils pair in MPI.

Fig. 2 Excitation coil (Design A - Design C) without, and (Design D - Design F) with Maxwell
coils pair (blue coil) from xyz-axis view

Fig. 3 Example of Maxwell coils pair used in conventional MPI system [17]

3 Results and Analysis

3.1 The Functionality of Maxwell Coils Pair for Magnetic Field Induction from Single-Coil Design

Figure 4 shows the comparison of the magnetic field induction from two different
designs of the excitation coil, a single coil (a) Design A (without) and (b) Design D
(with) the Maxwell coils pair. Figure 4(a) shows solid lines of the magnetic field
distribution flowing towards the MNPs position (10–50 mm). The strength of the
magnetic field induction decreases with increasing distance from the centre of the
excitation coil. The red and blue lines indicate the strongest and the weakest
magnetic field induction, respectively. The magnetic field vector generated from Design

Fig. 4 Magnetic field distribution from a Design A and b Design D excitation coils. Magnetic
field induction was indicated by the rainbow colour spectrum on the right post of the figure

A appears more homogeneous than in Design D because small interferences from
the Maxwell coils pair may slightly reduce the homogeneity of the magnetic field
induction in Design D. I indicates the direction of current flow inside the Maxwell
coils pair and the single coil. Btotal indicates the total magnetic field induction
generated by the excitation coil towards the targeted MNPs position.
It can be seen that the magnetic field induction from Design D is higher than that
from Design A. This is caused by the Maxwell coils pair employed with the single
coil, which increases the total magnetic field induction from the excitation coil.
Figure 4(b) shows that the magnetic field induced from the Maxwell coils pair
(blue coil) flows outwards from the vertical single coil (red coil), hence increasing
the total magnetic field induction, Btotal, generated by both the single coil and the
Maxwell coils pair towards the targeted MNPs distance.

3.2 Improvement of Magnetic Field Induction from Vertically Stacked Sub-coils Design (Without Maxwell Coils Pair)

Table 3 and Fig. 5 show the comparison of the magnetic field induction from
Design A, Design B and Design C excitation coils. From the simulation results, it
can be seen that the magnetic field induced from Design C was higher than Design
A and Design B. For instance, the magnetic field induction from excitation coil of
Design A, Design B, and Design C at distance of 10 mm were 3.5271, 10.6802, and
18.0564 µT respectively.
When the single coil with length L from Design A was divided into smaller
sub-coils in Design B (with length L1) and Design C (with length L2), it generated
stronger magnetic field induction at the targeted distance of 10–50 mm. The length
of one sub-coil of Design B and Design C was smaller than in Design A
(L > L1 > L2).
The magnetic field induction from the excitation coil is affected by the power
loss. In order to generate a high level of magnetic field induction, the power loss
from the coils must be reduced, which can be achieved by decreasing the copper
loss. Copper loss is the term given to heat produced by electrical currents in the
winding conductors [18]. The power loss due to copper loss is given by Eq. (1)
[19], where I is the current, R is the resistance and t is the time the current is
supplied. When current is supplied to the coils, copper loss occurs, and it increases
when the coil resistance increases as long as the supplied current is maintained. The
coil resistance is determined by the resistivity, length and cross-sectional area of the
wire, as given in Eq. (2) [20]. As the coil length decreases, the coil resistance
decreases and thus the copper loss can be reduced, minimizing the power lost to the
surroundings [19].

Table 3 Magnetic field induced from the excitation coil of Design A, Design B and Design
C (without Maxwell coils pair)
Distance (mm) Design A (µT) Design B (µT) Design C (µT)
10 3.5271 10.6802 18.0564
20 1.6559 5.3177 9.0644
30 0.7888 2.5558 4.5156
40 0.3362 1.2523 2.1504
50 0.0684 0.1681 0.5069
[Line chart: magnetic field induction (µT) versus distance (mm) for Design A, Design B and Design C]

Fig. 5 Magnetic field induced from the excitation coil of Design A, Design B, and Design C when
measured at different distances

This shows the significance of using vertically stacked sub-coils as the excitation
coil compared with just one single coil, as normally used in a conventional MPI
system.

\text{Copper Loss} = I^2 \times R \times t \qquad (1)

R = \rho \frac{L}{A} \qquad (2)
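As a simple numerical illustration of Eqs. (1) and (2), a shorter winding has a lower resistance and therefore a lower copper loss for the same current. The sketch below uses the copper resistivity and the 1 mm wire diameter of Table 2; the wire lengths are placeholders, not values from this work:

import numpy as np

RHO_CU = 1.68e-8                      # resistivity of copper (ohm*m)
WIRE_AREA = np.pi * (0.5e-3) ** 2     # cross-section of a 1 mm diameter wire (m^2)
I = 1.0                               # supplied current (A)
T = 1.0                               # duration the current is supplied (s)

def copper_loss(wire_length_m):
    R = RHO_CU * wire_length_m / WIRE_AREA   # Eq. (2): R = rho * L / A
    return I ** 2 * R * T                    # Eq. (1): loss = I^2 * R * t

print(copper_loss(500.0))   # e.g. total wire length of the long single-coil
print(copper_loss(150.0))   # e.g. total wire length of one shorter sub-coil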

3.3 Further Improvement of Magnetic Field Induction from Vertically Stacked Sub-coils Design Employed (with Maxwell Coils Pair)

Table 4 and Fig. 6 show the comparison of the magnetic field induction from
Design D, Design E, and Design F excitation coils. From the simulation results, it
can be seen that the magnetic field induced from Design F was higher than Design
D and Design E. For instance, the magnetic field induction from excitation coil in

Table 4 Magnetic field induced from the excitation coil of Design D, Design E and Design
F (with Maxwell coils pair)
Distance (mm) Design D (µT) Design E (µT) Design F (µT)
10 3.8150 12.8799 22.8015
20 2.0702 7.3557 12.7592
30 1.0331 3.7020 6.0898
40 0.5075 1.7850 3.1155
50 0.1250 0.3935 0.4801
[Line chart: magnetic field induction (µT) versus distance (mm) for Design D, Design E and Design F]

Fig. 6 Magnetic field induced from the excitation coil of Design D, Design E and Design F when
measured at different distances

Design D, Design E, and Design F at a distance of 10 mm was 3.8150, 12.8799, and
22.8015 µT, respectively. When the single coil with length L from Design D was
divided into smaller sub-coils in Design E (with length L1) and Design F (with
length L2), it generated stronger magnetic field induction at the targeted distance of
10–50 mm. The length of one sub-coil of Design E and Design F was smaller than
in Design D (L > L1 > L2). As the excitation coil length decreases, the coil
resistance decreases and thus the copper loss can be reduced, minimizing the power
lost to the surroundings, as previously explained in Subsect. 3.2.
In addition, these excitation coils were employed with the Maxwell coils pair to
further increase the total magnetic field induction. The magnetic field induced
from the Maxwell coils pair flows towards the excitation coils, as shown previously
in Fig. 4(b), and increases the total magnetic field induction at the targeted distance
of 10–50 mm. Thus, the total magnetic field induction was higher than that of the
designs without the Maxwell coils pair in the excitation coils. Here, the addition of
the Maxwell coils pair to an excitation coil shows a role different from creating the
FFP in conventional MPI.

4 Conclusion

This work presented several designs of excitation coil for MPI application that were
simulated to measure the magnetic field induction towards the MNPs located at a
targeted distance of 10–50 mm under the excitation coils. As a result, the magnetic
field induction from the excitation coils at certain targeted positions can be
enhanced by dividing the excitation coil into smaller sub-coils.
Furthermore, the excitation coil employed with the Maxwell coils pair generated
stronger magnetic field induction than the excitation coil without it. Improvement of
the magnetic field induction and homogeneity can thus be obtained by combining
the sub-coils design with the Maxwell coils pair.
For future work, experiments will be conducted on the proposed excitation coil
design to support the simulation and calculation results; a 1 A excitation current at
25 kHz will be used so that the harmonic response of the MNPs can be utilized in
the MPI system.
Therefore, a new excitation coil design concept can be proposed from these
findings and applied to a single-sided MPI system for future development in
medical imaging applications.

Acknowledgements The authors would like to acknowledge the Geran Penyelidikan
Pascasiswazah (GPPS) Vot H308, TWAS-COMSTECH Joint Research Grant Vot X071,
Fundamental Research Grant Scheme (FRGS) K087 and Universiti Tun Hussein Onn Malaysia
(UTHM) for the support of this research.

References

1. Paysen H et al (2018) Improved sensitivity and limit-of-detection using a receive-only coil in


magnetic particle imaging. Phys Med Biol 63(13):13NT02. https://doi.org/10.1088/1361-
6560/aacb87
2. Shimada K, Murase K (2017) Effect of signal filtering on image quality of projection-based
magnetic particle imaging. Open J Med Imaging 7(2):43–55
3. Khan I, Saeed K, Khan I (2017) Nanoparticles: properties, applications and toxicities. Arab J
Chem 12:908–931
4. Ilbey S, Top CB, Cukur T, Saritas EU, Guven HE (2017) Image reconstruction for magnetic
particle imaging using an augmented Lagrangian method. In: Proceedings - international
symposium on biomedical imaging, pp 64–67
5. Gleich B, Weizenecker J (2005) Tomographic imaging using the nonlinear response of
magnetic particles. Nature 435(7046):1214–1217
6. Thiel T (2013) Bruker Announces the World’s First Preclinical Magnetic Particle Imaging
(MPI) System. Bruker Corporation. https://ir.bruker.com/press-releases/press-release-details/
2013/Bruker-Announces-the-Worlds-First-Preclinical-Magnetic-Particle-Imaging-MPI-
System/default.aspx. Accessed 28 July 2019
7. Vogel P et al (2014) MRI meets MPI: a bimodal MPI-MRI tomograph. IEEE Trans Med
Imaging 33(10):1954–1959
8. Zheng B et al (2015) Magnetic particle imaging tracks the long-term fate of in vivo neural cell
implants with high image contrast. Sci Rep 5(1):1–9

9. Gräfe K, Von Gladiss A, Bringout G, Ahlborg M, Buzug TM (2016) 2D images recorded


with a single-sided magnetic particle imaging scanner. IEEE Trans Med Imaging 35(4):1056–
1065
10. Kuboyabu T, Yamawaki M, Ohki A, Aoki M, Murase K (2016) Quantitative evaluation of
tumor early response to magnetic hyperthermia combined with vascular disrupting therapy
using magnetic particle imaging. Int J Nanomed Nanosurgery 2(3):1–7
11. Muslu Y, Utkur M, Demirel OB, Saritas EU (2018) Calibration-free relaxation-based
multi-color magnetic particle imaging. IEEE Trans Med Imaging 37(8):1920–1931
12. Kosch O et al (2019) Evaluation of a separate-receive coil by magnetic particle imaging of a
solid phantom. J Magn Magn Mater 471:444–449
13. Bringout G, Buzug TM (2015) Coil design for magnetic particle imaging: application for a
preclinical scanner. IEEE Trans Magn 51(2):1–8
14. Wojtczyk H et al (2014) Toward the optimization of D-shaped coils for the use in an open
magnetic particle imaging scanner. IEEE Trans Magn 50(7):1–7
15. Bringout G et al (2013) High power driving and selection field coil for an open MPI scanner.
In: 2013 international workshop on magnetic particle imaging, IWMPI 2013
16. Othman NB, Tsubaki T, Yoshida T, Enpuku K, Kandori A (2012) Magnetic nanoparticle
imaging using harmonic signals. IEEE Trans Magn 48(11):3776–3779
17. Knopp T, Buzug TM (2012) How to build an MPI scanner. Springer. https://link.springer.
com/chapter/10.1007/978-3-642-04199-0_3#citeas. Accessed 01 Jan 2019
18. Yadav A (2016) Effect of temperature on electric current, magnets and electromagnet. Int J
Adv Technol 7(4):1000167. https://doi.org/10.4172/0976-4860.1000167
19. Lin D, Fuchs EF (2006) Real-time monitoring of iron-core and copper losses of transformers
under (non)sinusoidal operation. IEEE Trans Power Deliv 21(3):1333–1341
20. Cho YC et al (2010) Copper better than silver: electrical resistivity of the grain-free
single-crystal copper wire. Cryst Growth Des 10(6):2780–2784
DCT Image Compression Implemented
on Raspberry Pi to Compress Image
Captured by CMOS Image Sensor

Ibrahim Saad Mohsin, Muhammad Imran Ahmad, Saad M. Salman,


Mustafa Zuhaer Nayef Al-Dabagh, Mohd Nazrin Md Isa,
and Raja Abdullah Raja Ahmad

Abstract The purpose of compression is to reduce the amount of data while
maintaining the quality of the image or signal for subsequent use. The Discrete
Cosine Transform (DCT) belongs to a family of image compression methods in
which the raw image is transformed into another domain to produce a smaller
amount of data. The DCT has low computational complexity and a fast processing
algorithm. In this project, the DCT is implemented using a Pi camera and a
Raspberry Pi SBC development board running on an ARM-based processor. The
Raspberry Pi board has an advantage for image processing implementation because
existing software development tools such as OpenCV offer rich image processing
features. The DCT compression algorithm is applied to images at six compression
rate levels: 10, 20, 50, 100, 170 and 200. The best performance is achieved with
compression rate level 200. However, on reducing the quality level of the
compression rate, the error measurements become worse until a point is reached
where the perceptual difference from the original image can be easily noted.

Keywords DCT algorithm · Image compression · Raspberry Pi board

I. S. Mohsin · M. I. Ahmad (&) · R. A. R. Ahmad


School of Computer and Communication Engineering, Universiti Malaysia Perlis,
Kampus Pauh Putra, 02600 Perlis, Malaysia
e-mail: [email protected]
S. M. Salman
Department of Mathematical and Computer Application, College of Science, Al-Nahrain
University, Baghdad, Iraq
M. N. M. Isa
School of Microelectronic Engineering, Universiti Malaysia Perlis, Kampus Pauh Putra,
02600 Perlis, Malaysia
M. Z. N. Al-Dabagh
Department of Computer Science, College of Science, Knowledge University, Kurdistan
Region, Erbil, Iraq


1 Introduction

Image compression is the technique used to efficiently code digital images in
order to minimize the number of bits needed to represent an image. Advancement
in digital technology has enhanced communication media in which the representation
of visual information plays a vital role. High-definition TV, videoconferencing,
medical imaging, wireless video transmission, virtual reality, video telephony and
video servers are emerging applications of image compression [1]. If the number of
users is large, video signals and original biometric digital images often contain a
large amount of information and hence require a large channel and storage
capacity. The cost of implementation increases exponentially with the required
capacity despite improvements in communication channel and storage technology.
Compression techniques are therefore important to overcome the limitations of the
channel or storage capacity, reducing the data rate while retaining the quality of the
decoded image or video signal [2].
Image compression techniques can reduce data, by exploiting statistical
redundancies in the data, to the extent that the loss is almost imperceptible to the
human eye. Spatial and spectral redundancies are the two types of redundancy that
occur in images [3]. The correlation between neighboring pixels causes spatial
redundancy, while spectral redundancy arises from the correlation between
different color planes.
In compression theory, both spatial and spectral redundancies can be eliminated
by utilizing transform coding or sub-band coding such as the Discrete Cosine
Transform. Another type of redundancy, temporal redundancy, is due to the
correlation between different frames in a sequence of images, such as in video
conferencing applications or broadcast images. Inter-frame coding, also called
Motion Compensated Predictive Coding (MCPC), can be used to remove temporal
redundancies [4]. The DCT is characterized by a fast processing algorithm and
low computational complexity, and it is applied to all images so that no blocking
artefact is produced. This is a key advantage of DCT compression when compared
to other transform compression methods [5]. This work consists of a number of
design stages, which include the DCT algorithm, image pre-processing, and MSE
and PSNR measurement. The Python programming language was used to develop
the algorithm, along with the NumPy mathematical library and an additional image
processing library, OpenCV.
The main objectives of this work are to implement a DCT algorithm on a
Raspberry Pi to compress biometric images and to measure the performance of an
embedded system in executing the compression algorithm at different compression
ratios. The purpose of compression is to reduce the amount of data while
maintaining the quality of the image or signal for subsequent use [6]. In this
project, the DCT is implemented on a Raspberry Pi SBC development board
running on an ARM-based processor. A compression algorithm has a high
computational cost, and thus implementation on a low-cost processor is a
challenging task. The Raspberry Pi board has an advantage for image processing

implementation due to the existence of a CMOS camera input socket that handles
the raw data captured by the camera module. On other SBC boards, the camera
module needs to be connected via a USB cable, which produces lower-quality
images compared to the direct camera input offered by the Raspberry Pi board [7].

2 Biometric Image Compression

Biometric imaging involves the measurement and statistical analysis of people's
physical attributes. It is mainly used to identify and control access for individuals
or objects of interest under surveillance. Biometric authentication usually relies on
unique parameters of an individual's inherent physical traits. For example, a
government may launch a program to gather biometric features, particularly
fingerprint, iris and facial patterns, for its residents. Managing the large system
storage can be challenging, including database transmission over the internet or
designing a portable sensor device [8]. This forms an underlying reason to
investigate compression algorithms for several biometric images based on JPEG
compression techniques. It is also important to measure the compression
performance of compressed biometric images using several measurement standards.
Images generally dominate the major share of the communication bandwidth;
hence, an efficient image compression technique is essential. The focal objective of
image compression is to eliminate redundancy and exclude irrelevancy: redundancy
reduction removes redundancy from the signal source, and irrelevancy reduction
omits pixel values which are not noticeable by the human eye [9].

3 Proposed Compression Method

The Discrete Cosine Transform (DCT) is a compression method based on a
frequency transform that is able to compress biometric images. The DCT has
unique features which allow the creation of efficient image compression. The main
advantage of image transformation using the DCT is the elimination of redundancy
between neighboring pixels [10]. The efficiency or performance of the
transformation scheme can be directly measured by its ability to pack the input data
into as few coefficients as possible [8]. Image compression can be implemented in
both software and hardware. Figure 1 shows the hardware and software setup using
the Raspberry Pi board and Pi camera. Hardware-supported implementations
achieve faster processing than pure software implementations because of the highly
parallel algorithms available on specific hardware such as the Raspberry Pi board
and its dedicated camera socket for the Pi camera.

Fig. 1 Raspberry Pi and camera interface

3.1 Raspberry Pi Board

The main signal processing chipset used in the Raspberry Pi system is the
Broadcom 2835 700 MHz chip, in which the core CPU is a 32-bit ARM1176JZF-S
RISC processor designed by Advanced RISC Machines. This main processing chip
has dedicated connections for the camera and monitor. Figure 2 shows the
Raspberry Pi board used in this project, which has a dedicated port for the Pi camera.
The Raspberry Pi design does not include a hard disk or hard drive, and uses
an SD card for long-term storage [11]. This Raspberry Pi unit is designed to run the
Debian Linux operating system, and features a Samsung Micro 4 micro SD card
preloaded with the Raspberry Pi NOOBS package (New Out Of the Box Software)
and a micro SD card adapter [12]. Because of the small memory size, the design
should first be modelled using a PC-based method to optimize the statistical
parameters; once the algorithm is finalized, all statistical parameters are transferred
to the board.

Fig. 2 Raspberry Pi board

3.2 PI Camera Interface

The camera unit used in this project is a Pi camera module connected to the
CSI connector of the Raspberry Pi. It can deliver a 5-megapixel image, or 1080p
video recording at 30 frames per second. The Raspberry Pi camera is connected via
a 15-pin ribbon cable to the MIPI camera serial interface (CSI), which is
specifically designed to communicate with cameras. The CSI bus is capable of
extremely high data rates and transmits pixel data exclusively to the BCM2835
processor. The camera unit used in this project is the Raspberry Pi Camera Module
v2, as shown in Fig. 3 [13]. In this project, the OpenCV library is used extensively
to access the camera and to perform simple image pre-processing. This is another
advantage of using the Raspberry Pi, as the existing image processing libraries
shorten the development process.
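As an illustration of this workflow (a minimal sketch assuming the legacy picamera library and OpenCV are installed; the file name is a placeholder), one frame can be grabbed from the Pi camera directly into a NumPy array and pre-processed with OpenCV:

import cv2
from picamera import PiCamera
from picamera.array import PiRGBArray

camera = PiCamera()
camera.resolution = (640, 480)
raw = PiRGBArray(camera)
camera.capture(raw, format="bgr")                 # BGR frame captured straight into memory
frame = raw.array
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)    # simple pre-processing step
cv2.imwrite("capture_gray.png", gray)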

3.3 Discrete Cosine Transform Algorithm

The DCT has been the most popular technique for compressing images over the
past few years and can be implemented in hardware or software. The main
drawback of this transformation is that it is time-consuming for large amounts of
data; however, many researchers have improved the structure of the algorithm in
order to speed up the computation process. Its adoption in the JPEG standard is one
of the main reasons for its popularity in many applications. The DCT is used in
many applications such as image processing and DSP applications to process
signals such as video conferencing [14]. The DCT is an orthogonal transform, which
contains a fixed set of basis functions with a unique pattern, and this pattern can be
precomputed in

Fig. 3 Raspberry Pi camera module

advanced. DCT is able to transform the image space to a specific frequency region.
DCT has the power to fill the low-frequency coefficient of image data and has the
ability to reduce the effect of prohibited artefacts and this effect resulting from the
boundary between the sub-images become visible [15]. The two-dimensional DCT
of image f(x, y) can be obtained from the one dimensional DCT as expressed
below.
F(u, v) = \frac{2}{N} C(u) C(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y) \cos\left[\frac{\pi(2x+1)u}{2N}\right] \cos\left[\frac{\pi(2y+1)v}{2N}\right] \qquad (1)

The two-dimensional inverse DCT is given by


f(x, y) = \frac{2}{N} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} C(u) C(v) F(u, v) \cos\left[\frac{\pi(2x+1)u}{2N}\right] \cos\left[\frac{\pi(2y+1)v}{2N}\right] \qquad (2)

Fig. 4 Compression and reconstruction implemented in this project

where u and v are coordinates in the transform domain, and N represents the
number of pixels along each dimension of the transformed block. The successive
zeros that occur within each block are exploited in a zigzag scan, which usually
proceeds from low-frequency to high-frequency terms, and quantization may
eliminate the high-frequency terms at the quantization stage [16]. The quantized
coefficients are then passed to entropy coding for further compression. Figure 4
represents the schematic coding and decoding schemes for the proposed method
implemented on the Raspberry Pi.
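A minimal block-wise sketch of this coding idea is given below (assuming NumPy, SciPy and OpenCV; keeping only the top-left low-frequency coefficients of each block stands in for the quantization and zigzag steps, and the file names are placeholders rather than the exact implementation of this work):

import numpy as np
import cv2
from scipy.fftpack import dct, idct

def dct2(block):
    # 2-D DCT of Eq. (1), applied separably along rows and columns
    return dct(dct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

def idct2(block):
    # 2-D inverse DCT of Eq. (2)
    return idct(idct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

def compress_blockwise(img, block=8, keep=4):
    """Transform each block, keep only keep x keep low-frequency coefficients,
    and reconstruct; img is a 2-D grayscale array."""
    h, w = img.shape
    out = np.zeros_like(img, dtype=np.float64)
    mask = np.zeros((block, block))
    mask[:keep, :keep] = 1.0                   # retain only low-frequency terms
    for y in range(0, h - h % block, block):
        for x in range(0, w - w % block, block):
            coeffs = dct2(img[y:y + block, x:x + block].astype(np.float64))
            out[y:y + block, x:x + block] = idct2(coeffs * mask)
    return np.clip(out, 0, 255).astype(np.uint8)

img = cv2.imread("capture_gray.png", cv2.IMREAD_GRAYSCALE)
rec = compress_blockwise(img, keep=4)
cv2.imwrite("reconstructed.png", rec)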

3.4 Implementation of Algorithm

The algorithm is modelled in Python with the OpenCV and NumPy libraries.
OpenCV is an image processing library that supports several image processing
tasks such as cropping, resizing and controlling the Pi camera. NumPy is a library
for mathematical operations such as trigonometric functions. The work is performed
by employing Python on the Raspberry Pi platform to access the Pi camera and
execute the DCT algorithm. The DCT-based scheme is employed to compress the
input image to a certain level and to measure the performance in terms of the
signal-to-noise ratio. Qualitative analysis is done by comparing the reconstructed
images at different compression ratios with the original image. For the compression
and processing time analysis, images captured using the Pi camera are used. The
reconstruction of the compressed coefficients into an image similar to the original is
done through the same DCT coefficients. For different decomposition levels, the
values of the peak signal-to-noise ratio (PSNR), mean square error (MSE) and
compression ratio are determined to conduct the quantitative analysis. The MSE
(Eq. 4) and PSNR (Eq. 3) are usually employed to measure the video or image data
quality at the output of the decoder.

The MSE is also known as the quantization error variance, \sigma_q^2. Normally, when
the PSNR has a value of 40 dB or greater, human observers are not able to
distinguish between the original and the reconstructed images. The quality
assessment formulas used in this paper are shown below.

\text{PSNR} = 10 \log_{10}\left(\frac{255^2}{\text{MSE}}\right) \qquad (3)

\text{MSE} = \frac{1}{NM} \sum_{i=1}^{N} \sum_{j=1}^{M} \left( I_{\text{original}}(i, j) - I_{\text{reconstructed}}(i, j) \right)^2 \qquad (4)
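The two metrics follow directly from Eqs. (3) and (4); a small helper sketch for 8-bit grayscale images (illustrative functions, not the exact code of this work) is:

import numpy as np

def mse(original, reconstructed):
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    return np.mean(diff ** 2)                     # Eq. (4)

def psnr(original, reconstructed):
    error = mse(original, reconstructed)
    if error == 0:
        return float("inf")                       # identical images
    return 10.0 * np.log10(255.0 ** 2 / error)    # Eq. (3)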

4 Performance Analyses

A series of experiments was carried out to compress biometric images, namely face
and palmprint images. The compression algorithm was implemented on two
platforms: a PC and the Raspberry Pi board. The results show that, although the
algorithm is a time-consuming process for large images, it can be efficiently
implemented and deployed on a portable and fast platform such as the Raspberry Pi.
We apply the DCT compression algorithm to a palmprint image
Fig. 5 Fingerprint image reconstruction with different compression rate



Fig. 6 Face image reconstruction with different compression rate

(200 × 200 pixels) using six compression rates, i.e. levels 1–6, and the results are
assessed using the MSE and PSNR.
Figures 5 and 6 show the original palm and face images reconstructed at various
qualities. As presented in Tables 1 and 2, the compression rate is significant for
maintaining the image quality level: when the quality level of the compression rate
is high, the error measurements for the reconstructed image are very small, and
when the compression rate is low, lower image quality is obtained. Based on the
results tested on the PC and the Raspberry Pi board, the best values of MSE and
PSNR are achieved at compression rate 200. All these parameters signify that no
perceptual difference is found between the reconstructed image and the original.
However, on reducing the quality level of the compression rate, the error
measurements become higher until a point is reached where the perceptual
difference from the original image can be easily noted, as presented in Figs. 5 and 6.
Table 3 shows the comparison between the PC, a DSP processor and the Raspberry
Pi. The results show that in real-time mode the Raspberry Pi gives a better
processing time than the PC but lower performance than the DSP processor,
whereas in offline mode the PC performs better than the Raspberry Pi.

Table 1 Performance analysis of DCT compression for palmprint image (200 × 200)

Compression rate    MSE           PSNR
10                  52.643        33.703
20                  19.893        42.156
50                  2.073         61.798
100                 0.353         77.175
200                 5.272e−26     279.524

Table 2 Performance analysis of DCT compression for face image (200 × 200)

Compression rate    MSE           PSNR
10                  181.521       22.953
20                  71.694        31.021
50                  13.801        45.332
100                 3.721         56.717
200                 2.508e−25     277.291

Table 3 Performance comparison between online and offline computation time (seconds)

Platform             Real time (image captured using Pi camera/webcam)    Offline (image pre-stored in memory)
PC based [17]        1.2                                                  0.3
DSP processor [18]   0.05                                                 0.05
Raspberry Pi         0.3                                                  0.5

5 Conclusions

This paper has presented the implementation of the Raspberry Pi to compress
biometric images using the 2D-DCT. The algorithm is combined with quantization
and a zigzag process to compress biometric images such as facial and palmprint
images. In this paper, the biometric images are captured using the Pi camera, and
the OpenCV library is used to pre-process the captured images. The DCT is used to
transform the raw image from the spatial domain to the frequency domain, and the
low-frequency components are preserved for reconstruction purposes. The
utilisation of DCT algorithms in image compression preserves imaging quality by
producing high compression ratios while minimising the amount of data, thus
resulting in increased transmission efficiency and reduced data storage. The
biometric image compression performance is evaluated with five different quality
levels, and the results are reported in terms of MSE, PSNR and processing time.
The Raspberry Pi board outperformed the PC in terms of processing speed when

the algorithm is implemented in real time using the Pi camera. However, for the
offline case (where the image is pre-captured and saved in memory), the PC gives a
speed comparable to that of the Raspberry Pi.

References

1. Chen Y, Huang Z, Sun H, Chen M, Tan H (2016) Lossy image compression using PCA and
contourlet transform. In: MATEC web of conferences, p 8002. EDP Sciences
2. Ballé J, Laparra V, Simoncelli EP (2016) End-to-end optimized image compression. arXiv
Preprint arXiv:1611.01704
3. Theis L, Shi W, Cunningham A, Huszár F (2017) Lossy image compression with compressive
autoencoders. arXiv Preprint arXiv:1703.00395
4. Ismail Y, Abdelgawad A, El-etriby S (2019) High speed on-chip multiple cosine transform
generator. J Real-Time Image Process 16:255–269
5. Madanayake A, Cintra RJ, Dimitrov V, Bayer F, Wahid KA, Kulasekera S, Edirisuriya A,
Potluri U, Madishetty S, Rajapaksha N (2015) Low-power VLSI architectures for DCT\/
DWT: precision vs approximation for HD video, biomedical, and smart antenna applications.
IEEE Circuits Syst Mag 15:25–47
6. Sahitya S, Lokesha H, Sudha LK (2016) Real time application of Raspberry Pi in
compression of images. In: 2016 IEEE international conference on recent trends in
electronics, information & communication technology (RTEICT), pp 1047–1050. IEEE
7. Richardson M, Wallace S (2012) Getting started with Raspberry PI. O’Reilly Media, Inc.,
Sebastopol
8. Hussein AF, AlZubaidi AK, Al-Bayaty A, Habash QA (2017) An IoT real-time biometric
authentication system based on ECG fiducial extracted features using discrete cosine
transform. arXiv Preprint arXiv:1708.08189
9. Razaque A, Chavali SSV, Malkapuram SG, Nadimpalli SSV, Vommina S, Atukuri DK,
Malllapu VS (2016) Trans receiving multimedia using Raspberry-Pi. In: 2016 international
conference on electrical, electronics, and optimization techniques (ICEEOT), pp 1904–1908.
IEEE
10. Mora H, Signes-Pont MT, Jimeno-Morenilla A, Sanchez-Romero J-L (2019) High-
performance architecture for digital transform processing. J Supercomput 75:1336–1349
11. Pajankar A (2015) Raspberry Pi computer vision programming. Packt Publishing, Birmingham
12. Shilpashree KS, Lokesha H, Shivkumar H (2015) Implementation of image processing on
Raspberry Pi. Int J Adv Res Comput Commun Eng 4:199–202
13. Senthilkumar G, Gopalakrishnan K, Kumar VS (2014) Embedded image capturing system
using Raspberry Pi system. Int J Emerg Trends Technol Comput Sci 3:213–215
14. Masera M, Martina M, Masera G (2016) Adaptive approximated DCT architectures for
HEVC. IEEE Trans Circuits Syst Video Technol 27:2714–2725
15. Raid AM, Khedr WM, El-Dosuky MA, Ahmed W (2014) JPEG image compression using
discrete cosine transform-a survey. arXiv Preprint arXiv:1405.6147
16. Abdelsalam AM, Langlois JMP, Cheriet F (2017) A configurable FPGA implementation of
the tanh function using DCT interpolation. In: 2017 IEEE 25th annual international
symposium on field-programmable custom computing machines (FCCM), pp 168–171. IEEE
17. Li S, Qu X, Li Q (2010) Implementation of the JPEG on DSP processors. Appl Mech Mater
34:1536–1539
18. Azuwam AAA, Ahmad MI, Md Isa MN, Ilyas MZ, Raja Ahmad RA (2017) Lossy
compression of biometric images implemented using floating point DSP processor. In:
MATEC web of conferences, vol 140
A Racial Recognition Method Based
on Facial Color and Texture
for Improving Demographic
Classification

Amer A. Sallam, Muhammad Nomani Kabir,


Athmar N. M. Shamhan, Heba K. Nasser, and Jing Wang

Abstract Facial recognition is one of the important techniques in the present-day
security and authentication domain. Facial image recognition involves complex
processing, which reduces the overall performance of the system for a large
database and may consequently make the system inefficient in the commercial
sector. In this paper, we split the image database into a set of smaller groups by
classifying the face images in terms of race demography. First, facial components
(i.e., eyes, nose and mouth) are captured using a segmentation technique, and then
race-sensitive features, namely chromatic/skin tone and local features, are extracted
from the face images using the Color Coherence Vector and Gabor filter. K-Nearest
Neighbors, Artificial Neural Network and Support Vector Machines are then used
to classify the face images into race groups. We consider racial classification into
Asian, African and European groups. It was found that the average classification
accuracy with Gabor and CCV features for the Artificial Neural Network is 91.74%
and 84.18%, respectively, providing plausible results compared to some other
existing models.

 
Keywords Color Coherence Vector · Gabor filter · K-Nearest Neighbors · Artificial Neural Network · Support Vector Machines

1 Introduction

Race is a classification system used to categorize humans into large and distinct
populations or groups by heritable, phenotypic characteristics, geographic ancestry,
physical appearance, ethnicity and social status [1, 2].

A. A. Sallam · A. N. M. Shamhan · H. K. Nasser
Faculty of Engineering and Information Technology, Taiz University, Taiz, Yemen
M. N. Kabir (&) · J. Wang
Faculty of Computing, Universiti Malaysia Pahang, 26300 Gambang, Kuantan, Pahang,
Malaysia
e-mail: [email protected]


Race recognition is generally performed through face analysis, which has been one
of the most studied research topics in the field of computer vision and pattern
recognition for the past few decades [2–4]. Although the face of a person provides a
variety of demographic information such as gender, age and race, race remains one
of the invariant and fundamental attributes that cannot be easily masked, unlike age
and gender. For this reason, racial classification can be a key component deployed
in various systems at security checkpoints [5, 6]. Furthermore, this classification has
a potential application in image search queries, where prior knowledge of race
would narrow down the search space in the database [7], thus simplifying the
process. Due to the similar skin color of different races and the extreme variation in
illumination conditions in real-world scenarios, skin tone alone cannot classify race
reliably. However, when combined with local or global descriptors, the
classification accuracy can be improved. The crucial features in race recognition are
characterized by facial organs, e.g., eyes, nose and mouth. Thus, facial feature
extraction involves segmenting the eye, nose and mouth regions and building the
features from these organs [8].
In this research, first, facial components (i.e., eyes, nose and mouth) are iden-
tified from images using a segmentation technique and then the features from face
images are extracted using color coherence vector (CCV) and Gabor filter. Next,
k-nearest neighbors (KNN) [9, 10], artificial neural network (ANN) [11, 12], and
support vector machines (SVM) [13] are employed to classify the face image
according to race groups.

2 Methodology

In this research, two race feature extraction approaches are considered and then
three different classifiers are used to classify the features into different groups.
Figure 1 illustrates the proposed racial recognition framework which follows the
standard procedure of training and testing phases [14]. In this framework, the input
facial images are segmented and the Gabor filter is used to extract features. In
addition, CCV is applied directly to the images to extract CCV features. These two
feature sets are used to classify the images through KNN, SVM and ANN, and the
classification performance of all these classifiers is then investigated.

2.1 Feature Extraction

Many systems retrieve images by color histogram. However, the color histogram is
not robust to small changes in images, since different images may have the same
histogram. In order to incorporate more spatial information, the CCV [15] is used.
The CCV divides the color histogram into two parts: coherent and incoherent.
Suppose that, for the ith discretized color, ai denotes the number of coherent pixels
and bi denotes the

Fig. 1 Proposed racial recognition framework

number of incoherent pixels. Thus, ai + bi is the total number of pixels of that
color, and the color histogram of an image is given by:

[(a_1 + b_1), (a_2 + b_2), \ldots, (a_n + b_n)] \qquad (1)

However, the coherence pair (ai, bi) for the ith color is considered, and the CCV of
the image is given by

[(a_1, b_1), (a_2, b_2), \ldots, (a_n, b_n)] \qquad (2)

Thus, the CCV contains spatial information as it includes the distribution of
coherent and incoherent pixels for each discrete color.
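A simplified sketch of this idea is given below (assuming OpenCV and NumPy; the number of color bins, the connectivity and the threshold tau are illustrative choices rather than the exact settings of this paper):

import numpy as np
import cv2

def ccv(image_bgr, n_bins=4, tau=50):
    """Simplified color coherence vector: a pixel is coherent when its
    connected region of the same discretized color has at least tau pixels."""
    quant = (image_bgr // (256 // n_bins)).astype(np.int32)
    labels = quant[:, :, 0] * n_bins * n_bins + quant[:, :, 1] * n_bins + quant[:, :, 2]
    alpha = np.zeros(n_bins ** 3, dtype=np.int64)   # coherent pixel counts a_i
    beta = np.zeros(n_bins ** 3, dtype=np.int64)    # incoherent pixel counts b_i
    for color in np.unique(labels):
        mask = (labels == color).astype(np.uint8)
        n_cc, cc = cv2.connectedComponents(mask)    # connected regions of this color
        for cc_id in range(1, n_cc):
            size = int(np.sum(cc == cc_id))
            if size >= tau:
                alpha[color] += size
            else:
                beta[color] += size
    return list(zip(alpha, beta))                   # [(a_i, b_i), ...] as in Eq. (2)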
A two-dimensional Gabor filter [16] is defined as:

g(x, y) = e^{-\frac{x'^2 + a^2 y'^2}{2d^2}} \cos\left(\frac{2\pi x'}{\lambda} + \phi\right) \qquad (3)

where $x' = x\cos(\theta) + y\sin(\theta)$ and $y' = -x\sin(\theta) + y\cos(\theta)$. The standard deviation $\delta$ of the Gaussian factor determines the effective size of the surrounding of a pixel; the parameter $\alpha$ is the spatial aspect ratio and $\lambda$ is the wavelength. The following values are considered [17] for the parameters: $\delta = 8$, $\alpha = 0.5$, $\lambda = 4$, and

$$F \in \{0.13,\ 0.22,\ 0.31,\ 0.40,\ 0.49\},$$
$$\theta \in \{0,\ \pi/8,\ 2\pi/8,\ 3\pi/8,\ 4\pi/8,\ 5\pi/8,\ 6\pi/8,\ 7\pi/8\}.$$

Hence, a Gabor filter bank with 40 channels is considered. The facial image is then fed into the filter bank, which convolves the image with each Gabor filter, and the results from the process are concatenated into a feature vector.
The Gabor filter is used to extract the face features by detecting the eyes, mouth and nose. In the proposed method, 5 frequencies and 8 orientations are used to extract the features from the face components. The mean $\mu$ and standard deviation $\sigma$ of each component for the 5 frequencies and 8 orientations were calculated and stored in feature vectors as follows:

$$V_{Le} = \{\mu_{1Le}, \ldots, \mu_{40Le}, \sigma_{1Le}, \ldots, \sigma_{40Le}\} \qquad (4)$$

$$V_{Re} = \{\mu_{1Re}, \ldots, \mu_{40Re}, \sigma_{1Re}, \ldots, \sigma_{40Re}\} \qquad (5)$$

$$V_{N} = \{\mu_{1N}, \ldots, \mu_{40N}, \sigma_{1N}, \ldots, \sigma_{40N}\} \qquad (6)$$

$$V_{M} = \{\mu_{1M}, \ldots, \mu_{40M}, \sigma_{1M}, \ldots, \sigma_{40M}\} \qquad (7)$$

where the subscript Le denotes the left eye; Re, the right eye; N, the nose; and M,
the mouth. A face was then represented by a feature vector V of 320 components as:

$$V = \{\mu, \sigma\} \qquad (8)$$

where

$$\mu = \{\mu_{1Le}, \ldots, \mu_{40Le}, \mu_{1Re}, \ldots, \mu_{40Re}, \mu_{1N}, \ldots, \mu_{40N}, \mu_{1M}, \ldots, \mu_{40M}\} \qquad (9)$$

$$\sigma = \{\sigma_{1Le}, \ldots, \sigma_{40Le}, \sigma_{1Re}, \ldots, \sigma_{40Re}, \sigma_{1N}, \ldots, \sigma_{40N}, \sigma_{1M}, \ldots, \sigma_{40M}\}. \qquad (10)$$
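For illustration, the 40-channel Gabor feature extraction for one facial component could be sketched as follows in Python/OpenCV; the kernel size and the wavelength values are assumptions for the sketch rather than the exact parameters reported above.

```python
import cv2
import numpy as np

def gabor_features(component_gray):
    """Sketch: filter one facial component (e.g. an eye patch) with a bank of
    5 wavelengths x 8 orientations and keep the mean and standard deviation
    of each response, as in Eqs. (4)-(7)."""
    means, stds = [], []
    wavelengths = [4, 6, 8, 10, 12]                  # 5 frequencies (assumed values)
    thetas = [k * np.pi / 8 for k in range(8)]       # 8 orientations
    for lam in wavelengths:
        for theta in thetas:
            kernel = cv2.getGaborKernel((31, 31), 8.0, theta, lam, 0.5, 0)
            resp = cv2.filter2D(component_gray.astype(np.float32), cv2.CV_32F, kernel)
            means.append(resp.mean())
            stds.append(resp.std())
    return np.array(means + stds, dtype=np.float32)  # 40 means followed by 40 stds
```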



2.2 Classification

Three classifiers, namely KNN [18], ANN [19] and support vector machines (SVM) [20], were used to classify the features extracted from the faces. Each classifier is applied separately to each of the two feature sets obtained from CCV and the Gabor filter. We evaluate the performance of each classifier using different parameters associated with each algorithm. For the KNN classifier, different k-values were tested, while for the ANN, different numbers of neurons in the hidden layer were tried to find the number of neurons that gives the best performance.
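Purely as an illustration of this comparison, a sketch using scikit-learn (an assumption; the paper does not state its implementation library) might look as follows.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

def evaluate_classifiers(X_train, y_train, X_test, y_test, k=7, hidden=5):
    """Train KNN, SVM and a small ANN (MLP) on one feature set (CCV or Gabor)
    and return the test accuracy of each; parameter values are illustrative."""
    models = {
        "KNN": KNeighborsClassifier(n_neighbors=k),
        "SVM": SVC(kernel="rbf"),
        "ANN": MLPClassifier(hidden_layer_sizes=(hidden,), max_iter=2000),
    }
    return {name: accuracy_score(y_test, m.fit(X_train, y_train).predict(X_test))
            for name, m in models.items()}
```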

3 Results and Discussion

For our tests, we used the dataset MR2 [21] consisting of 74 color images of men
and women of European, African, and East Asian descent. 65% of the images were used for training and the rest for testing. The accuracy A (%) of the classification is estimated by

Ic
A¼  100 ð11Þ
It

where $I_c$ is the number of correctly classified images and $I_t$ is the total number of images.
Extensive experiments were carried out for each feature extraction algorithm
with each classifier and race classification results are provided in Tables 1, 2, 3 and
4 and Figs. 2 and 3. The final results, listed in Table 1 and shown in Figs. 2 and 3, demonstrate that the proposed method achieves a high degree of accuracy. The best result, 91.74%, is obtained with Gabor feature extraction and the ANN classifier. On the other hand,
SVM provides the best result (i.e., 85.21%) for CCV feature extraction. This reveals that the selection of a specific classification approach does not affect the recognition rate much; the feature extraction approach is the main factor in increasing the recognition rate of the system. The Gabor features prove more accurate than the CCV features because texture-based features are robust to illumination, scale and rotation, whereas color-based features (CCV) may not vary significantly among race groups. For example,

Table 1 Accuracy of different classifiers with two feature extraction methods

Classifier   Feature extraction method   Accuracy (%)
KNN          CCV                         83.53
             Gabor                       87.38
SVM          CCV                         85.21
             Gabor                       90.40
ANN          CCV                         84.18
             Gabor                       91.74

Table 2 Racial classification recognition rate using the KNN algorithm with different k-values

Feature extraction method   Value of k             Correct recognition rate (%)
CCV                         2-Nearest Neighbors    83.53
                            5-Nearest Neighbors    80.60
                            7-Nearest Neighbors    77.56
Gabor filter                2-Nearest Neighbors    83.30
                            5-Nearest Neighbors    84.60
                            7-Nearest Neighbors    87.38

Table 3 Racial classification recognition rate using ANN with different numbers of hidden-layer neurons

Feature extraction method   No. of hidden-layer neurons   Correct recognition rate (%)
CCV                         5                             79.94
                            10                            74.68
                            15                            84.18
Gabor filter                5                             91.74
                            10                            85.21
                            15                            79.94

Table 4 Effectiveness of facial components of the racial recognition system with Gabor filter

Deletion of face   KNN (%)                     SVM (%)                     ANN (%)
component          Asian  African  European    Asian  African  European    Asian  African  European
Left eye           88     79       63          79     79       67          76     70       81
Right eye          85     74       65          83     76       73          85     75       77
Mouth              83     76       67          85     68       81          71     71       69
Nose               87     77       64          78     65       76          86     63       61
None               91     86       85          93     90       88          93     90       92

there are a number of samples in the European group that overlap with those in the Asian group in terms of skin color.
The performance of each classifier using different parameters associated with each algorithm is evaluated. For the KNN classifier, different k-values were tested. Depending on the feature extraction method and on how the features are distributed in the feature space, we perform cross-validation to identify a suitable value of k. Table 2 illustrates that the suitable value of k is 2 for the CCV features, while it is 7 for the Gabor filter.
The number of neurons in the hidden layer of the ANN is given as an input to build the neural network. Too few hidden neurons may cause inaccuracy in computation, while too many increase the computational cost and the error may grow due to numerical errors [22]. A trial-and-error approach is therefore used: the root mean squared error (RMSE) is calculated to determine a suitable number of neurons in the hidden layer. Table 3 shows that the suitable number of neurons is 15 for the CCV features, while it is 5 for the Gabor filter.

Fig. 2 Classification accuracy with KNN and SVM

Fig. 3 Classification accuracy with ANN with different numbers of neurons in the hidden layer
To determine the most significant facial components affecting the racial recognition rate with the Gabor filter, additional tests were carried out in which the features of the components (right eye, left eye, nose and mouth) were deleted one at a time and the recognition accuracy was computed. The test results are listed in Table 4 to show the most effective components for different races.
Table 5 shows the performance of the proposed method compared to some existing methods. In the first and second rows of the table, the race classification by Manesh et al. [23] and Lyle et al. [24] was treated as a two-class (Asian/Non-Asian) problem, which is simpler than our three-class problem. The accuracy in the second row is almost the same as that of our approach with the Gabor filter. Although the accuracy

Table 5 Comparative performance of some race classification systems

Reference            Features               Classifier   Race groups                Database         Recognition rate (%)
Manesh et al. [23]   Gabor                  SVM          Asian, Non-Asian           FERET and PEAL   98
Lyle et al. [24]     Local binary patterns  SVM          Asian, Non-Asian           FRGC             91
Proposed method      Gabor/CCV              ANN          Asian, European, African   MR2              91.74 with Gabor; 84.18 with CCV

reported in the literature is comparable to that of our method, we notice that the feature extraction approach based on gray-scale pixel intensities, i.e., CCV, is not as efficient as the Gabor-filter approach.

4 Conclusion and Future Work

In our research, we presented a racial classification method using both texture analysis of face components and skin tone color. We conducted tests on images obtained from the MR2 face database, which covers African, Asian and European subjects, and carried out experiments using 74 images of both genders. To extract texture fea-
tures, multi-channel 2D Gabor filters were used to find the texture information of
the face components. On the other hand, CCV algorithm was used to extract color
coherence vector from detected face image for skin tone color feature purposes. We
classified the test images by KNN algorithm, ANN and SVM. Gabor feature
extraction algorithm with SVM, KNN and ANN classifier achieved an accuracy of
90.40%, 87.38% and 91.74%, respectively, while with CCV feature extraction
algorithm, accuracy dropped to 85.21%, 83.53% and 84.18%, respectively. From
our experiments, we notice that the mouth and the nose could be further investigated for better racial classification. In future work, one can consider geometric features for further improvement. Moreover, increasing the number of classes to
cover the commonly accepted major race groups of the world will also be of great
interest.

Acknowledgements This work was supported by Universiti Malaysia Pahang (UMP) through the
University Research Grant (RDU1901150).

References

1. Hosoi S, Takikawa E, Kawade M (2004) Ethnicity estimation with facial images. In:
Sixth IEEE international conference on automatic face and gesture recognition 2004,
proceedings, pp 195–200. IEEE
2. Roomi SMM, Virasundarii S, Selvamegala S, Jeevanandham S, Hariharasudhan D (2011)
Race classification based on facial features. In: 2011 third national conference on computer
vision, pattern recognition, image processing and graphics (NCVPRIPG), pp 54–57. IEEE
3. Hooi YK, Kalid KS, Tachmammedov S (2018) Multi-factor attendance authentication system.
Int J Softw Eng Comput Syst 4(2):62–79
4. Levi G, Hassner T (2015) Age and gender classification using convolutional neural networks.
In: Proceedings of the IEEE conference on computer vision and pattern recognition
workshops, pp 34–42. IEEE
5. Alginahi YM, Kabir MN (2019) Authentication technologies for cloud technology, IoT, and
big data. Institution of Engineering & Technology, London
6. Ernawan F, Kabir MN (2018) A block-based RDWT-SVD image watermarking method using
human visual system characteristics. Vis. Comput 1–19
7. Kabir M, Alginahi Y, Ali J, Abdel-Raheem E (2019) Optimal search algorithm in a big
database using interpolation–extrapolation method. Electron Lett 55(21):1130–1133
8. Sharif M, Mohsin S, Javed MY, Ali MA (2012) Single image face recognition using
Laplacian of Gaussian and discrete cosine transforms. Int Arab J Inf Technol 9(6):562–570
9. Gou J, Ma H, Ou W, Zeng S, Rao Y, Yang H (2019) A generalized mean distance-based
k-nearest neighbor classifier. Expert Syst Appl 115:356–372
10. Parveen P, Thuraisingham B (2006) Face recognition using multiple classifiers. In: 2006 18th
IEEE international conference on tools with artificial intelligence (ICTAI 2006), pp 179–186.
IEEE
11. Le TH (2011) Applying artificial neural networks for face recognition. Adv Artif Neural Syst
2011:15
12. Dwivedi AK (2018) Artificial neural network model for effective cancer classification using
microarray gene expression data. Neural Comput Appl 29(12):1545–1554
13. Guo G, Li SZ, Chan KL (2001) Support vector machines for face recognition. Image Vis
Comput 19(9–10):631–638
14. Sallam AA, Kabir MN, Ahmed AA, Farhan K, Tarek E (2018) Epilepsy detection from EEG
signals using artificial neural network. In: International conference on intelligent computing &
optimization, pp 320–327. Springer
15. Ponti M, Nazaré TS, Thumé GS (2016) Image quantization as a dimensionality reduction
procedure in color and texture feature extraction. Neurocomputing 173:385–396
16. Grigorescu SE, Petkov N, Kruizinga P (2002) Comparison of texture features based on Gabor
filters. IEEE Trans Image Process 11(10):1160–1167
17. Moreno P, Bernardino A, Santos-Victor J (2005) Gabor parameter selection for local feature
detection. In: Iberian conference on pattern recognition and image analysis, pp 11–19.
Springer
18. Murty MN, Devi VS (2011) Nearest neighbour based classifiers. In: Pattern recognition,
pp 48–85. Springer
19. Hopfield JJ (1988) Artificial neural networks. IEEE Circuits Devices Mag 4(5):3–10
20. Suykens JA, Vandewalle J (1999) Least squares support vector machine classifiers. Neural
Process Lett 9(3):293–300
21. Strohminger N, Gray K, Chituc V, Heffner J, Schein C, Heagins TB (2016) The MR2: a
multi-racial, mega-resolution database of facial stimuli. Behav Res Methods 48(3):1197–1204
22. Kabir MN (2007) Numerical tools for some identification problems in industrial applications.
PhD thesis, Technische Universität Braunschweig, Germany

23. Manesh FS, Ghahramani M, Tan YP (2010) Facial part displacement effect on template-based
gender and ethnicity classification. In: 2010 11th international conference on control
automation robotics & vision, pp 1644–1649. IEEE
24. Lyle JR, Miller PE, Pundlik SJ, Woodard DL (2010) Soft biometric classification using
periocular region features. In: 2010 fourth IEEE international conference on biometrics:
theory, applications and systems (BTAS), pp 1–7. IEEE
Automatic Passengers Counting System
Using Images Processing Based
on YCbCr and HSV Colour Spaces
Analysis

Muhammad Shahid Che Husin and Aimi Salihah Abdul Nasir

Abstract Public transportation such as trains, commuter rail, buses and taxis has become the main mode of transport for a large share of citizens in Malaysia. In recent years, Malaysia's government has been improving modern public transportation at great cost through projects such as the MRT and ECRL. In order to improve the public transportation system, the management must propose proper schedules as well as keep the passenger counts updated. In general, the counting is conducted by a traditional manual headcount in which the driver or bus conductor counts the passengers manually; hence, the process is time consuming. Therefore, this research proposes an automatic passenger counting system using image processing based on YCbCr and HSV colour model analysis. To obtain the passenger counting result, several image processing steps are implemented: (1) data collection by capturing 50 passenger images, (2) colour conversion from RGB to the YCbCr and HSV colour spaces, (3) image segmentation using the Otsu thresholding method, (4) removal of unwanted objects and (5) the passenger counting process. This study analyses two different colour conversions, from RGB to HSV and from RGB to YCbCr. Overall, segmentation using the HSV colour model has proven to be the best in segmenting the passenger images, with a segmentation accuracy of 85.63%. This project is expected to help bus administrations improve their services by updating their passenger counts automatically.


Keywords Automatic passenger counting Image processing  Colour
 
segmentation YCbCr colour spaces HSV colour spaces

M. S. Che Husin (&)  A. S. Abdul Nasir


Faculty of Engineering Technology, Universiti Malaysia Perlis UniCITI Alam Campus
Sungai Chuchuh, 02100 Padang Besar, Perlis, Malaysia
e-mail: [email protected]
A. S. Abdul Nasir
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2021 853


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_62

1 Introduction

1.1 Introduction

Public transport is a significant transportation framework that connects individuals to their destinations. It refers to group travel using large vehicles such as buses, trains, commuter rail and ships, handled by an administration that controls the timetables and routes of the transportation and collects fares as revenue. Malaysia's transport network covers up to 63,445 km, including 1,630 km of expressways [1]. Public transportation is valuable for people who cannot drive, children, people who do not own a vehicle, and others. There are many reasons why people use this system; demand keeps growing because more people need it to reach their destinations, and its use reduces traffic congestion and the risk of accidents. In Malaysia, about 1.3 million people use public transport every day [2], which shows that public transportation is significant in a developing nation.
Accordingly, public transport services likewise need to be upgraded to serve passengers better. The schedule and the location of the vehicles must be communicated to the passengers to help them board on time [3]. The number of passengers in the vehicles also needs to be updated regularly to help the administration deal with the problem of insufficient transportation capacity. Over the past few years, automatic passenger counting systems have had a wide impact on people's lives by improving transportation safety and mobility and enhancing productivity through the use of advanced technologies [4].
In the era of technology and globalization, information and data accuracy are crucial in every aspect to increase the performance and effectiveness of any organization. Passenger data are important for bus management to improve their service. Automatic counting systems such as automatic passenger counting (APC) are relevant to today's public transport all over the world. By knowing the flow of passengers, bus management can improve their service quality and decrease the cost of transport [5].
Generally, there are many ways to obtain the total number of passengers travelling on a bus. Most current systems use sensors such as laser beams, infrared and thermal sensors [6]. However, the problem faced by these systems is inaccurate counting due to faults in the transmission of data from sensor to counter. This occurs because the sensor can easily be disturbed by the surroundings; for example, a sensor that counts people passing through the door can easily be blocked by a passenger standing in front of it [7].
The current procedure to obtain the passenger count is a complicated task [8], for instance the procedure based on a traditional manual headcount [9]. This technique requires a person to count the passengers manually, which is a time-consuming process. Furthermore, this counting is conducted by the bus conductor or the driver himself. As a result, the manual headcount procedure is not passenger friendly, because the passengers need to queue outside the bus before getting in.
Crowding affects operating speed, waiting time, travel time reliability, and route and bus choice [10]. Therefore, knowing the total number of passengers and the bus location is important to avoid crowding in the bus; management can add extra buses if the main buses are full, especially during peak hours. For these reasons, bus management companies need to know and be concerned with the passenger count.

1.2 Previous Study for Automated Passenger Counting


System

There are various methods related to passenger counting these days. Passenger counting can be performed in various transportation systems such as buses, trains, flights and so on, and the methods for counting passengers keep expanding over time. Researchers have investigated many methods in search of a suitable methodology to implement in transportation systems.
For instance, Perng et al. [11] proposed the implementation of a vision-based people counting system in buses. This study used a camera mounted on the bus door to collect data. The research proposed three main components: moving object extraction, people detection, and people tracking and counting. To perform the process, the researchers used foreground segmentation to extract the main object in the image and applied morphological processing to clean up the object.
Meanwhile, Liciotti et al. [12] proposed an intelligent RGB-D video system for bus passenger counting. The system is mainly based on an RGB-D sensor located over each bus door together with image processing and understanding software; the passenger images are processed through RGB conversion before counting. Barabino et al. [13] proposed an offline framework for handling automatic passenger counting raw data, which uses a matching algorithm to obtain the passenger count.
Bernini et al. [14] proposed an embedded system for counting passengers in public transportation vehicles. The system uses stereo vision and was developed for future installation over bus doors. To count the passengers, it uses people counting algorithms that can be grouped into two major categories: people detection and feature based.
Potter et al. [15] proposed automatic available seat counting in public rail transport using wavelets. This research implemented two automatic wavelet-based available seat counting algorithms: the spatial-domain Laplacian-of-Gaussian based wavelet and the frequency-domain Non-Linear Difference of Gaussians based wavelet. To collect the data, a camera was used to capture images in the rail carriage, which were then passed through an image processing pipeline. The accuracy reported in that paper is up to 81%.
Arslan et al. [16] proposed a service-oriented passenger counting application for public transportation systems. To count the passengers, two PIR sensors are placed, one at the entry and the other at the exit of the bus. This system is suitable for the double-door buses commonly used in European countries. However, the accuracy of passenger counting is not stated in that paper.
Many methodologies have been used for counting people nowadays, and many researchers have upgraded counting systems from sensor based to image processing based. Image processing for counting people is still relatively new and not many applications use this platform. Therefore, this paper proposes an image processing technique for counting passengers in the bus, whereas the latest deployed technologies for passenger counting still rely on sensors.

2 Methodology

In this research, various image processing techniques and steps have been applied to segment the images of the passengers. The automated counting system is designed using Matlab R2015a. Data acquisition starts by capturing the images of passengers with a mobile phone camera. The data then go through several steps in the counting process using different colour spaces, as follows:
Step 1: Capture images of passengers using a mobile phone.
Step 2: Apply colour conversion from RGB to the HSV and YCbCr colour spaces.
Step 3: Perform image segmentation using the Otsu thresholding method.
Step 4: Perform image filtering using a Gaussian filter.
Step 5: Perform automatic image cropping and draw the bounding boxes.
Step 6: Perform the passenger counting.
The steps above describe the process applied to the passenger images. A detailed description of these steps is presented in the following subsections.

2.1 Data Acquisition

Data acquisition is the process of acquiring the images of passengers in the bus. For this research, to design the image segmentation process, the images of the passengers were captured using a mobile phone camera with an 8 MP sensor at HD resolution. Two cameras are utilized in order to capture the images of the front and back rows of passengers.
Fig. 1 Sample images of the front rows in the bus

Fig. 2 Sample images of the back rows in the bus

In order to design this application, 50 passenger images were collected, divided into two positions, front and middle. The front camera takes the images of the passengers in the first five rows, and the middle camera takes the pictures of passengers from the sixth row to the last row of the bus. To support real-time image processing, the collected data are based on real situations in the bus, for example different lighting, different passenger activities and different skin tones. The data were not collected based on the number of passengers but based on the situation. Figure 1 shows the images of the front rows in the bus, while Fig. 2 shows the images of passengers in the back rows.

2.2 Skin Pixels Detection Based on Colour Conversion


Technique

Colour conversion needs to be performed before segmenting the images. This step changes the RGB colour space into the colour components of the HSV and YCbCr colour spaces. The conversion is required because it is difficult to segment the images based on the original colour image (RGB form). Figure 3 shows the images that have been converted to HSV and YCbCr form based on their colour components. The HSV colour model is represented by hue (e), saturation (f) and value (g), while YCbCr is represented by Y for luminance (b), chroma blue (c) and chroma red (d). From the figure, the colour components that can be used to segment the skin pixels are chosen based on how well they separate skin pixels from other objects. For example, the Cr component shows a lighter colour for skin pixels and darker values for other objects. The Y and value components are not suitable for segmenting skin pixels, because in them the skin pixels are almost the same as the pixels of other objects, so these two components cannot be used. As a result, hue and saturation are chosen from the HSV colour model, and Cb and Cr are chosen from the YCbCr colour model, because of their suitability for detecting skin pixels.
The following Eqs. 1, 2 and 3 are used for the conversion from digital RGB images to the HSV colour model. For the conversion of 8-bit-per-sample RGB images to YCbCr, Eqs. 4, 5 and 6 are used.
RGB to HSV colour model:

$$S = 1 - \frac{3}{R+G+B}\,\min(R, G, B) \qquad (1)$$

$$H = \begin{cases} \theta, & \text{if } B \le G \\ 360^{\circ} - \theta, & \text{if } B > G \end{cases} \qquad (2)$$

$$\theta = \cos^{-1}\left\{ \frac{\tfrac{1}{2}\left[(R-G) + (R-B)\right]}{\left[(R-G)^2 + (R-B)(G-B)\right]^{1/2}} \right\} \qquad (3)$$

RGB to YCbCr colour model:

$$Y = 16 + \frac{65.738R}{256} + \frac{129.057G}{256} + \frac{25.064B}{256} \qquad (4)$$

$$Cb = 128 - \frac{37.945R}{256} - \frac{74.494G}{256} + \frac{112.439B}{256} \qquad (5)$$

$$Cr = 128 + \frac{112.439R}{256} - \frac{94.154G}{256} - \frac{18.285B}{256} \qquad (6)$$
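As a small illustration of this conversion step (sketched here in Python/OpenCV rather than the Matlab implementation used in this work, and with an assumed function name), the components needed for skin segmentation can be obtained from the built-in conversions; note that OpenCV's converters serve the same purpose as Eqs. 1–6 but use full-range YCrCb coefficients and scale hue to 0–179.

```python
import cv2

def extract_skin_channels(bgr_image):
    """Return the hue, saturation, Cb and Cr components used for skin
    segmentation from a BGR input image."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    ycrcb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2YCrCb)   # channel order: Y, Cr, Cb
    hue, sat = hsv[:, :, 0], hsv[:, :, 1]
    cr, cb = ycrcb[:, :, 1], ycrcb[:, :, 2]
    return hue, sat, cb, cr
```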

2.3 Image Segmentation Using Automatic Thresholding

This section discusses the analysis involved in segmenting the images using the YCbCr and HSV colour spaces. To design the automatic segmentation process using both colour models, the Otsu thresholding technique is used in this
Fig. 3 Sample images in the HSV and YCbCr colour spaces: (a) original image, (b) Y colour component, (c) Cb colour component, (d) Cr colour component, (e) hue colour component, (f) saturation colour component, (g) value colour component
process. This algorithm returns a single intensity threshold that separates pixels into two classes, foreground and background. When the within-class variances of the two classes are different, the threshold is biased toward the class with the larger variance; as a result, some pixels belonging to this class will be misclassified into the other class with the smaller variance [17].
In this research, automatic segmentation is performed by combining the Cb and Cr segmented images and by combining the hue and saturation segmented images. Based on an initial study, it was found that combining the two components of each colour model with an AND function gives the best result. The AND function compares the two binary images obtained from the Cb and Cr

Table 1 Sample of segmented images and their combination results (columns: Image A, Image B, Image C; rows: original, segmented Cb, segmented Cr, combination of Cb and Cr, segmented hue, segmented saturation, combination of hue and saturation)

or hue and saturation colour spaces. If the pixels are white for both pictures, the
pixels will still remain in the images. If only one colour spaces are not detected the
white pixel, it will be removed for the result for combined images. This combined
methodology are the best methodology to produce best colour segmentation for
skin pixels comparing with single colour component. Table 1 shows the examples
of the combination of dual colour component.
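A minimal sketch of this segmentation and combination step, again in Python/OpenCV rather than Matlab and with an assumed function name, is given below. Depending on the component, the skin class may fall on either side of the Otsu threshold, so in practice the mask polarity has to be checked.

```python
import cv2

def skin_mask_from_pair(comp_a, comp_b):
    """Otsu-threshold two colour components (hue/saturation or Cb/Cr) and
    combine the binary results with a pixel-wise AND, keeping only pixels
    that are white in both masks."""
    _, mask_a = cv2.threshold(comp_a, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    _, mask_b = cv2.threshold(comp_b, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return cv2.bitwise_and(mask_a, mask_b)
```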

2.4 Unwanted Object Removing and Images Filtering

After segmentation, unwanted objects that have pixel values similar to skin remain in the segmented images. To remove this noise, an image filtering process is applied. Several methods can be used to remove noise and smooth the image; this research uses a Gaussian filter. The filter smooths the objects by blending neighbouring pixels so that, for example, two objects that are separated after segmentation but lie close together are merged into one object. Since unwanted objects and skin regions differ in size, the remaining noise is removed by declaring a range for the area of the objects of interest using a region growing method.
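For illustration, this filtering and size-based removal step could be sketched as follows (a Python/OpenCV stand-in for the Matlab routine; the 100–7000 pixel range is the one mentioned in Sect. 3.1, and the function name is an assumption):

```python
import cv2
import numpy as np

def clean_skin_mask(mask, min_area=100, max_area=7000):
    """Smooth the binary skin mask with a Gaussian filter, then keep only the
    connected components whose area lies inside the range of interest."""
    blurred = cv2.GaussianBlur(mask, (5, 5), 0)
    _, binary = cv2.threshold(blurred, 127, 255, cv2.THRESH_BINARY)
    n_labels, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
    cleaned = np.zeros_like(binary)
    for i in range(1, n_labels):                      # label 0 is the background
        if min_area <= stats[i, cv2.CC_STAT_AREA] <= max_area:
            cleaned[labels == i] = 255
    return cleaned
```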

2.5 Automatic Images Cropping

After the images have gone through the filtering process, they are cropped. This step is needed because the system uses two cameras: the first camera captures the first five rows and the second camera, placed in the middle of the bus, captures the last five rows of passengers. Since the first camera also captures the back-seat passengers, cropping is needed to avoid counting a passenger twice. Cropping also removes unnecessary parts of the image. After segmentation, the white pixels are the skin pixels, which means the skin of the hands and other parts of the passengers' bodies is also detected; cropping makes the image focus only on the heads. Figure 4 shows a segmented image with unwanted parts (a) and the image after cropping (b). Each image is cropped into four sections: the first and second sections cover the first and second rows on the right and left, and the third and fourth sections cover the third, fourth and fifth rows on the right and left. A rough sketch of such a split is shown after this paragraph.
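The sketch below is an illustration only; the fixed fractional boundaries are assumptions, since the actual regions of interest depend on the camera placement in the bus.

```python
def crop_sections(image):
    """Split an image into four sections (upper/lower halves, left/right
    halves) as a stand-in for the row-based cropping described above."""
    h, w = image.shape[:2]
    top, bottom = image[: h // 2], image[h // 2:]
    return [top[:, : w // 2], top[:, w // 2:],
            bottom[:, : w // 2], bottom[:, w // 2:]]
```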

Fig. 4 Cropped segmented image: (a) original and segmented image with an unwanted object marked, (b) cropped image based on the focused area

2.6 Automated Passenger Counting Method

The passengers in the bus are counted by counting the objects left in the segmented images. In Matlab, this is done by finding the connected components in the binary image: a bounding box is computed for each remaining object from its height and width, the boxes are drawn based on the centre of each object, and they are displayed on the cropped original image. The counting is done by counting the objects left in each cropped section and then summing these values to obtain the total count for the whole image. The accuracy of the system can then be calculated using the formula in Eq. 7.

Percentage of accuracy:

$$\frac{\text{total manual count} - \left|\text{total manual count} - \text{total people count}\right|}{\text{total manual count}} \times 100 \qquad (7)$$
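A minimal Python/OpenCV sketch of this counting step (as a stand-in for the Matlab connected-component routine described above) is given below; the per-section counts are then summed to give the total for the whole image, and Eq. 7 compares that total with the manual count.

```python
import cv2

def count_passengers(cleaned_mask, cropped_original):
    """Find the connected components left in the cleaned binary mask, draw a
    bounding box for each on the cropped original image, and return the count."""
    contours, _ = cv2.findContours(cleaned_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    annotated = cropped_original.copy()
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        cv2.rectangle(annotated, (x, y), (x + w, y + h), (0, 255, 0), 2)
    return len(contours), annotated
```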

3 Results and Discussion

3.1 Results of Image Processing Techniques for Passengers


Images

This section provides the results of segmenting the passenger images using the image processing techniques that have been applied. Several steps are involved in the segmentation process before the number of passengers in an image is obtained. To illustrate the processing, four images are chosen: two represent images from the front camera and two represent images from the middle camera that captures the back rows of seats. They are segmented based on two different colour spaces, YCbCr and HSV. The chosen images are named Image A and Image B for the front camera, and Image C and Image D for the middle camera. These selected images contain different lighting conditions and different passenger activities and positions in the bus. Figure 5 shows the four selected original passenger images.
Each segmentation run takes a pair of images, which are processed simultaneously using the image processing technique explained briefly in Sect. 2.3. This step converts the grayscale YCbCr and HSV components to black-and-white form.

Fig. 5 Samples of the original passengers' images: (a) Image A (front camera), (b) Image B (front camera), (c) Image C (middle camera), (d) Image D (middle camera)



To illustrate the process, Image A and Image C are paired and named Pair 1, and Image B and Image D are paired and named Pair 2. Using the two selected colour models, Table 2 shows the results of segmenting the converted images with the same methodology.
In the segmented images, many unwanted objects with pixel values similar to skin remain. To remove these objects, the images go through the object removal process explained in Sect. 2.4. By keeping only objects larger than 100 pixels and smaller than 7000 pixels, the unwanted objects are removed. The results of this process are shown in Table 3.

Table 2 The results of image segmentation of the passengers' images (columns: Pair 1, Pair 2; rows: YCbCr and HSV colour models, each showing the segmented Images A–D)



After the filtering process, the cropping process takes place. The purpose of cropping is explained briefly in Sect. 2.5. This process uses the cropping functionality of the Matlab software, and its results for the sample images can be seen in Table 4.

Table 3 Filtered images after the segmentation process (columns: Pair 1, Pair 2; rows: YCbCr and HSV colour models, each showing Images A–D)



After the images have been cropped based on the area of interest, bounding boxes are drawn to mark the remaining objects in the images, as explained in Sect. 2.6. Table 5 shows the cropped original images with the bounding boxes that mark the objects counted by the system.

Table 4 Cropped segmented images (columns: Pair 1, Pair 2; rows: YCbCr and HSV colour models, each showing Images A–D)



Table 5 Cropped original images with bounding boxes (columns: Pair 1, Pair 2; rows: YCbCr and HSV colour models, each showing Images A–D)

3.2 Results for Counting of Overall Passengers Images

After performing all the image processing steps, the counting of the passenger images is carried out. This section discusses the results obtained for all the sample data; they compare the counts computed with Matlab image processing against manual counting. The accuracy is obtained using Eq. 8 and the percentage of error is calculated using Eq. 9. The results of the counting process using the YCbCr and HSV colour models are presented in Table 6; 50 passenger images were evaluated using both colour models. For the YCbCr colour model, the percentage of accuracy calculated in Eq. 10 is 72.53%, with a percentage of error of 27.47% as calculated in Eq. 11. For the HSV colour model, the percentage of accuracy calculated in Eq. 12 is 85.63% and the percentage of error in Eq. 13 is 14.37%. From these results, it can be concluded that the HSV colour model is more suitable for counting passengers in the bus under several lighting conditions and passenger activities, and that it is also more suitable than the YCbCr colour model for detecting skin pixels.

Table 6 Counting of passengers using YCbCr and HSV colour spaces

Image   Manual count   System count (YCbCr)   Accuracy (%)   System count (HSV)   Accuracy (%)
1 7 4 57.1 7 85
2 6 9 66.67 6 100
3 6 7 85 6 100
4 7 6 85 7 85.71
5 7 6 85 7 85.71
6 6 7 85 6 100
7 14 17 82 14 64.3
8 13 13 100 13 77
9 12 15 80 12 92.3
10 7 4 57 7 100
11 7 5 71 7 100
12 14 11 78.5 14 57.14
13 12 16 75 12 83.33
14 6 6 100 6 100
15 7 4 57.14 7 71.5
16 12 13 92.3 12 83.3
17 10 11 90.9 10 83.3
18 10 12 83.3 10 90.9
19 3 5 60 3 100
20 1 2 50 1 100
21 3 3 100 3 100
22 3 4 75 3 100
23 4 7 57.14 4 100
24 4 6 66.67 6 66.67
25 11 8 72.7 11 91.66
26 7 10 70 7 63.63
27 10 6 60 10 100
28 10 7 70 10 80
29 10 8 80 10 90
30 10 5 50 10 100
31 10 7 70 10 60
32 10 7 70 10 50
33 10 6 60 10 100
34 10 4 40 10 100
35 12 14 85 12 83.3
36 10 7 70 10 100
37 6 5 83.3 6 100
38 3 4 75 3 75
39 7 3 42.8 7 71
40 4 8 50 4 100
41 7 4 57.14 7 71
42 11 11 100 11 90.9
43 6 6 100 6 100
44 3 3 100 3 100
45 6 5 83.33 6 100
46 10 8 80 10 90.9
47 6 3 50 6 66.67
48 9 6 66.67 9 66.67
49 6 8 75 6 100
50 14 13 92.85 14 71.4
Total 363 395 3626.84 395 4281.62

Percentage of accuracy:

$$\frac{\text{total manual count} - \left|\text{total manual count} - \text{total people count}\right|}{\text{total manual count}} \times 100 \qquad (8)$$

Percentage of error:

$$\frac{\left|\text{total manual count} - \text{total people count}\right|}{\text{total manual count}} \times 100 \qquad (9)$$

Percentage of accuracy for the YCbCr colour model:

$$\frac{5000 - \left|5000 - 3626.84\right|}{5000} \times 100 = 72.53\% \qquad (10)$$

Percentage of error for the YCbCr colour model:

$$\frac{\left|5000 - 3626.84\right|}{5000} \times 100 = 27.47\% \qquad (11)$$

Percentage of accuracy for the HSV colour model:

$$\frac{5000 - \left|5000 - 4281.62\right|}{5000} \times 100 = 85.63\% \qquad (12)$$

Percentage of error for the HSV colour model:

$$\frac{\left|5000 - 4281.62\right|}{5000} \times 100 = 14.37\% \qquad (13)$$

4 Conclusion

In this research, an automated system for counting the number of passengers has been developed. Several techniques were implemented for counting the passengers using image processing, and several images were collected as samples for this purpose. The image processing was implemented in Matlab R2015a. To perform the image segmentation, the images in RGB form are converted to the YCbCr and HSV colour models in order to determine which colour model is best for segmentation based on skin pixel detection. After the conversion, the images are segmented using the automatic Otsu thresholding method, which changes the images from grayscale to black-and-white form. The images then go through filtering and smoothing using a Gaussian filter, and a region growing method is applied to remove noise and unwanted objects. To obtain a more accurate count, the images are cropped to the area around the passengers' heads, and the counting is performed afterwards. After running the system on all collected samples, the resulting accuracy is 72.53% for the YCbCr colour model and 85.63% for the HSV colour model. This shows that the HSV colour model is more accurate than the YCbCr colour model in segmenting skin pixels; it is also more suitable for various passenger conditions and various lighting. Overall, this project is expected to help bus administrations improve their services by updating their passenger counts automatically using an image processing technique.

References

1. India Education. http://www.indiaeducation.net/studyabroad/malaysia/transport-system.aspx.


Accessed 30 Aug 2019
2. The Star Online. https://www.thestar.com.my/news/nation/2017/10/07/spad-public-transport-
average-daily-ridership-increases. Accessed 21 June 2019
3. Nasir ASA, Gharib NKA, Jaafar H (2018) Automatic passenger counting system using image
processing based on skin detection approach. In: 2018 international conference on
computational approach in smart systems design and applications (ICASSDA)
4. Anagnostopoulos CNE, Anagnostopoulos IE, Loumos V, Kayafas E (2006) A license
plate-recognition algorithm for intelligent transportation system applications. IEEE Trans
Intell Transp Syst 7(3):377–392
5. Lengvenis P, Simitus R, Vaitkus V, Maskeliunas R (2013) Application of computer vision
systems for passenger counting in public transport. Elektronika ir Elektrotechnika 19(3):69–72
6. Bernini N, Bombini L, Buzzoni M, Cerri P, Grisleri P (2014) An embedded system for
counting passenger in public transportation vehicles. In: 2014 IEEE/ASME 10th international
conference on mechatronic and embedded system and application (MESA), Italy, pp 1–6.
IEEE
7. Giken Trastem Co. Ltd. http://www.trastem.co.jp/eng/product/passenger_counter.html.
Accessed 21 June 2019
8. Li F, Yang F, Liang H, Yang W (2016) Automatic passenger counting system for bus based
on RGB-D video. In: 2nd annual international conference on electronics, electrical
engineering and information science (EEEIS 2016), China, pp 209–220. IEEE
9. Northland Primary School. https://northlandpri.moe.edu.sg/about-us/school-safety-n-security/
road-safety. Accessed 21 June 2019
10. Tirachinia A, Hensher A, Rose JM (2013) Crowding in public transport systems: effects on
users, operation and implications for the estimation of demand. Transp Res Part A Policy
Pract 53:36–52
11. Perng JW, Wang TY, Hsu YW, Wu BF (2016) The design and implementation of a
vision-based people counting system in buses. In: 2016 international conference on system
science and engineering (ICSSE), Taiwan. IEEE
12. Liciotti D, Cenci A, Frontoni E, Mancini A, Zingaretti P (2016) An intelligent RGB-D video
system for bus passenger counting. In: Weidong C, Hosoda K, Menegatti M, Shimizu M,
Wang H (eds) International conference on intelligent autonomous systems. AISC, vol 14,
pp 473–484. Springer, Heidelberg
13. Barabino B, Francesco MD, Mozzoni S (2014) An offline framework for handling automatic
passenger counting raw data. IEEE Trans Intell Transp Syst 15(6):2433–2456

14. Bermini N, Bombini L, Buzzoni M, Cerri P, Grisleri P (2014) An embedded system for
counting passenger in public transportation vehicles. In: 2014 IEEE/ASME 10th international
conference on mechatronic and embedded system and application (MESA), Italy, pp 1–6.
IEEE
15. Potter PD, Kypraios I, Verstockt S, Poppe C, Walle RVD (2011) Automatic available seat
counting in public rail transport using wavelets. In: 53rd international symposium
ELMAR-2011, Crotia, pp. 14–16. IEEE
16. Arslan S, Türk E, Evin M (2017) A service oriented passenger counting application for public
transportation systems. In: 2017 international conference on computer science and
engineering, Turkey. IEEE
17. Xu X, Xu S, Jin L, Song E (2011) Characteristic analysis of Otsu threshold and its
applications. Pattern Recogn Lett 32(7):956–961
Face Recognition Using PCA
Implemented on Raspberry Pi

Ibrahim Majid Mohammed, Mustafa Zuhaer Nayef Al-Dabagh,


Muhammad Imran Ahmad, and Mohd Nazrin Md Isa

Abstract Currently, many consumer electronics utilize personal identification techniques such as ID, fingerprint, face, iris and palmprint recognition for security reasons. Among biometric traits, the face image is easy to capture using a Pi Camera. This paper proposes facial recognition using the principal component analysis (PCA) method implemented on a Raspberry Pi embedded processor. The algorithm is modelled using a PC-based method to identify the best eigenvectors to represent the facial image, and the pre-calculated statistical parameters are then used to implement the PCA algorithm on the Raspberry Pi. OpenCV image processing libraries are used to support basic image pre-processing tasks such as cropping, resizing and colour conversion. In the proposed work, the whole system is implemented on a low-cost processor to evaluate its performance in terms of recognition rate and processing time. The proposed method includes three main phases. The pre-processing phase processes the ORL facial images to collect significant information. The second phase extracts important features from the image and decreases its size using the PCA method; this phase uses a linear projection technique to reduce redundancy and remove noise from the image, and also improves the discrimination power in the feature space. The Euclidean distance classifier is utilized in the third phase for classification. The best recognition rate achieved with the proposed method on the low-cost embedded system is 82.5%.

I. M. Mohammed  M. I. Ahmad (&)


School of Computer and Communication Engineering, University Malaysia Perlis,
Arau, Perlis, Malaysia
e-mail: [email protected]
I. M. Mohammed
e-mail: [email protected]
M. N. M. Isa
School of Microelectronic Engineering, University Malaysia Perlis, Arau, Perlis, Malaysia
M. Z. N. Al-Dabagh
Department of Computer Science, College of Science, Knowledge University,
Erbil, Kurdistan Region, Iraq

© Springer Nature Singapore Pte Ltd. 2021 873


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_63
Keywords Face recognition · Raspberry Pi · Principal component analysis · Python · OpenCV · NumPy · ORL dataset

1 Introduction

A trustworthy identity management system is now needed in order to combat the widespread growth in identity theft and to meet the security requirements of many applications such as transportation, health care, public justice and safety, forensics, finance and education. Research in information security is concerned with guaranteeing the integrity, confidentiality and availability of information in many settings [1]. Many tools and techniques can support the control of information security; for example, biometric-based systems have been developed to support several aspects of information security. Biometric verification supports authentication, identification and non-repudiation in security systems.
Establishing a person's identity is a critical process in an identity management system. The traditional approach to human recognition depends on external inputs such as keys and passwords, but these may be forgotten or lost. A possible way to solve these problems is to use biometrics, since every person has a unique biometric pattern [2]. Popular biometric traits used by researchers are gait, speech, face, iris, fingerprint, palmprint and palm veins [3]. Nowadays, biometric systems have received increasing consideration in many low-cost applications; with advancing embedded system technology, biometrics has been widely adopted in consumer products such as mobile phones. Biometric signals or data have several necessary features with respect to their use as an authentication token, namely convenience, reliability and universality. These features have led to the widespread deployment of biometric verification systems. Since biometrics are unique to each individual, there is a robust and reasonably permanent relation between a human and the biometric traits. Thus, biometric identification can be used to recognize individuals in surveillance operations where a person may attempt to conceal their true identity (e.g., by using forged documents to claim social welfare) [4]. A large-scale biometric authentication system must have high accuracy in terms of Genuine Accept Rate (GAR), False Accept Rate (FAR) and Equal Error Rate (EER) to be safe and useful in many different applications [5]. A face image involves a large amount of data representing a unique pattern; thus a statistical method is one approach to capture the underlying statistical information for each person. Researchers have proposed improving existing algorithms with several new projection techniques such as advanced PCA and kernel LDA [6]. However, these methods require expensive computation platforms such as PCs and workstations. A face recognition system on a low-cost embedded platform is a system that automatically identifies a person from digital images captured by a CMOS image sensor; the image processing is done using OpenCV libraries and the templates are stored in memory [7]. The basic structure of a face recognition system implemented on an embedded platform has several stages such as image pre-processing, feature extraction, statistical learning and matching.

The PCA algorithm is among the popular algorithms, is computationally light, and is suitable for low-cost processors such as the Raspberry Pi. The attraction of the Raspberry Pi for biometric algorithm computation comes from the combination of the computer's small size and affordable price. The Raspberry Pi, a credit-card-sized low-cost Linux computer, can be used to develop a small-scale biometric system as it supports connecting Pi cameras and fingerprint scanners via USB ports. It also has an Ethernet port for Internet connectivity, can be connected to Wi-Fi, and has become popular in many consumer applications [8].

2 Face Recognition

Face recognition is a form of biometric system using physiological biometric traits. A facial image has distinctive features in the spatial and frequency domains that allow an individual to be recognized based on their unique physical or behavioural characteristics. The process of facial recognition involves automated methods to determine identity based on statistical features as important elements of difference [9]. In automated facial recognition, the machine makes its decision in the feature space and does not recognize humans in the same way as our brain does; the way humans interact with other people is firmly based on our ability to recognize them from prior knowledge [10]. A face recognition system would allow a user to be identified by simply walking past a surveillance camera. A robust facial recognition method requires a low-dimensional feature representation that enhances the discrimination power in the feature space and avoids overlapping between different subjects. The face recognition algorithm therefore starts with a dimensionality reduction technique, since the high dimensionality of the raw data makes the estimation of statistical parameters very difficult and time consuming. A good facial descriptor should be discriminative enough for inter-class differences but robust to intra-class variations, and at the same time efficient to compute [11]. Basically, the system consists of four important stages: face detection, face alignment, feature extraction and pattern classification [12]. Face detection provides information on the location and scale of each detected face region. Several techniques have been proposed for feature extraction, which can be broadly classified into appearance-based (holistic) and model-based methods [13]. Holistic methods such as Gabor features, principal component analysis and linear discriminant analysis use the whole facial area to derive the feature vector, whereas model-based methods such as local binary patterns and the discrete cosine transform construct the feature vector from small regions of the facial image. The performance of a face recognition system can be measured using two approaches depending on the specific application: verification or identification. Face verification (one-to-one matching) compares the test face image against the stored templates of the identity being claimed, while face identification (one-to-many matching) compares a query face image against all image templates in a face database.

3 System Hardware Requirements

The proposed system is implemented in an embedded system where the face image is used as the biometric trait. The Raspberry Pi is a credit-card-sized microprocessor development board available in various designs and processing speeds, ranging from 700 MHz for model B, B+ or the older versions, with several hardware peripherals. The Raspberry Pi runs a powerful and effective Linux-based operating system called Raspbian, whose main advantage is that it has thousands of pre-built libraries for many processing tasks and operating system optimisation. The Raspberry Pi Foundation in the UK developed the Raspberry Pi as a low-price series of palm-sized single-board computers. Initially this board was created to promote the teaching of basic computer skills in schools, which it achieves well, but by now the Raspberry Pi has extended its footprint and exceeded its intended purpose through market penetration in embedded research and industrial systems. Currently, the main programming platform is Python, but languages like C, C++, Java, Perl and Ruby can also be used to program the Raspberry Pi. In this project we employ the Raspberry Pi 3 Model B+ to develop several computer vision tasks such as image preprocessing and statistical signal analysis. The Broadcom 2837 700 MHz chip is the primary chipset used in the Raspberry Pi, where the core CPU is a 32-bit ARM1176JZF-S RISC processor developed by Advanced RISC Machines; this primary processing chip is able to connect to and control the camera. The Raspberry Pi uses an SD card for long-term storage [14]. Figures 1 and 2 show the hardware used in the proposed project.

Fig. 1 Raspberry Pi 3 model B+



Fig. 2 The algorithm is designed and simulated using Python on a PC, then transferred to the Raspberry Pi after all design parameters are finalized

4 The Proposed Face Recognition Algorithm

In this work, virtual network computing (the VNC viewer application) is utilized on the computer; using this technique, the computer and the Raspberry Pi are connected over Wi-Fi. Figure 3 shows the block diagram of the hardware involved in the proposed method. The Raspberry Pi board can be accessed using program code developed on the PC. The system works in two parts: the first part captures images and creates a database by storing them, and the second part compares a test image with the stored images in the database. For feature extraction we use the eigenfaces methodology, and Euclidean distance is used for recognition of the face.
The system is tested using the ORL database, which contains a total of 400 facial images. Different pre-processing techniques, such as RGB-to-grayscale conversion and image resizing, are applied to the face images to improve image quality and increase recognition rates. Low-dimensional face features extracted with a linear projection technique are used for the recognition process. In this work, PCA is the statistical projection technique used to remove noise and redundant features present in the raw image. PCA is used for feature extraction and dimensionality reduction, which produces better class separation in the feature space. The low-dimensional vectors of the testing and training images in the feature space are classified using the Euclidean distance classifier: a single matching value provided by the Euclidean distance measures the similarity between two vectors in the feature space. In our project, the Euclidean distance is computed on the Raspberry Pi, and the templates used in this process are computed offline using the PC (Fig. 4).
878 I. M. Mohammed et al.

Fig. 3 Hardware block diagram

Fig. 4 Flowchart of the proposed method. The template is the projected training image computed offline on the PC

4.1 Principal Component Analysis

PCA is one of the most successful techniques used in biometric recognition for data reduction and feature extraction in appearance-based approaches [15]. PCA is used profusely in all forms of analysis, from neuroscience to computer graphics, because it is a simple non-parametric method of extracting relevant information from complex data sets. Several approaches use PCA to perform data prediction, redundancy removal, feature extraction, data compression and noise removal. The main idea of using PCA for face recognition is to express the large 1-D vector of pixels constructed from the 2-D facial image in terms of the compact principal components of the feature space.

The working principle of the PCA algorithm is to reduce the data dimension by transforming the raw data into a set of linearly independent components through a linear transformation, which can be used to extract the main feature components of the data [16]. PCA can extract the important features, capture the main variations in the data components of the samples, and then select the most significant components from all the feature components. Through PCA, an efficient and simple recognition process can be obtained compared with other linear projection approaches [17]. Therefore, it is often used for dimensionality reduction of high-dimensional 2D and 1D data.

The eigenspace is calculated from the eigenvectors of the covariance matrix derived from a set of facial training images. The mathematical process of PCA involves converting each 2-D facial image into a long 1-D vector by concatenating each row (or column) into a stretched narrow vector. Suppose there are M vectors of size N (= image rows × image columns) representing a set of sampled images, where the Pi values are the pixel values. The images are mean centered by subtracting the mean image from each image vector.

$$x_i = [P_1 \; P_2 \; \ldots \; P_N]^T, \quad i = 1, \ldots, M \qquad (1)$$

The mean image m is represented as:

$$m = \frac{1}{n} \sum_{i=1}^{n} x_i \qquad (2)$$

Compute the covariance matrix S:

$$S = \frac{1}{n} \sum_{i=1}^{n} (x_i - m)(x_i - m)^T \qquad (3)$$

Compute the eigenvalues $\lambda_i$ and eigenvectors $v_i$ of S:

$$S v_i = \lambda_i v_i, \quad i = 1, 2, \ldots, n \qquad (4)$$

Order the eigenvectors by descending values of their corresponding eigenvalues. The K principal components can now be obtained from the observed vector as:

$$y = W^T (x - m) \qquad (5)$$

The reconstruction from the PCA is given by:

$$x = W y + m \qquad (6)$$

where $W = (v_1, v_2, \ldots, v_n)$ (Fig. 5).


Fig. 5 PCA algorithm used in the proposed method
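To make the projection step concrete, the following is a minimal sketch of the eigenface training and projection procedure described by Eqs. (1)-(6), assuming the ORL images are already loaded as flattened 1-D vectors; the function and variable names are illustrative only, not the authors' original code.

```python
import numpy as np

def train_eigenfaces(train_vectors, num_components=35):
    """Compute the mean face and the top-k eigenface projection matrix W.

    train_vectors: (n_images, n_pixels) array of flattened grayscale faces.
    """
    X = np.asarray(train_vectors, dtype=np.float64)
    mean_face = X.mean(axis=0)                       # Eq. (2)
    centered = X - mean_face                         # mean-centred images

    # Covariance in the small (n_images x n_images) space for efficiency,
    # then map its eigenvectors back to pixel space (standard eigenface trick).
    S_small = centered @ centered.T / len(X)
    eigvals, eigvecs_small = np.linalg.eigh(S_small)
    order = np.argsort(eigvals)[::-1]                # descending eigenvalues
    eigvecs = centered.T @ eigvecs_small[:, order[:num_components]]
    eigvecs /= np.linalg.norm(eigvecs, axis=0)       # columns of W = eigenfaces

    return mean_face, eigvecs

def project(face_vector, mean_face, W):
    """Project a face onto the eigenface space, y = W^T (x - m), Eq. (5)."""
    return W.T @ (face_vector - mean_face)
```

In the setup described above, these templates would be computed offline on the PC, with only the stored projections and the `project` step running on the Raspberry Pi.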

4.2 Euclidean Distance Classifier

The data distribution of each class in the feature space is assumed to be normally distributed; thus the Bayes' rule of the classification process can be simplified to the Euclidean distance given in Eq. 7. The Euclidean distance is widely used in biometrics due to its fast processing time and the low complexity of the mathematical model. The Euclidean distance is the length of the straight line between two points in the feature space, computed as the root of the squared differences between pairs of object coordinates, as shown in Fig. 6. The equation below represents the Euclidean distance:

$$d(x, y) = \sqrt{\sum_{i=1}^{M} (X_i - Y_i)^2} \qquad (7)$$

Fig. 6 Euclidean distance structure for one to many matching process
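A minimal sketch of this one-to-many matching step, assuming the gallery templates are the projected training vectors from the PCA stage; the names used are illustrative:

```python
import numpy as np

def classify_nearest(test_feature, templates, labels):
    """Return the label of the template closest to the test feature (Eq. 7)."""
    distances = np.linalg.norm(templates - test_feature, axis=1)  # Euclidean distances
    best = int(np.argmin(distances))
    return labels[best], distances[best]
```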

4.3 Facial Image Preprocessing

The performance of a face recognition system depends strongly on the flexibility (availability of a moderately large number of samples per person) of the data set used. Many data sets are created to suit the particular needs of algorithm development and contain many face image variations; UMIST, ORL, AR Face, Yale, MUCT and FERET are some of the important face databases available publicly. The recognition performance of the proposed work is evaluated using the ORL face dataset. The ORL database includes a total of 400 images of 40 persons (36 males and 4 females), and each of them has 10 different images. Table 1 shows the properties of the ORL database. Figure 7 shows sample images of one person with several variations.

Table 1 Properties of the ORL database

Properties        Description
No. of subjects   40
No. of images     400
Gray/Color        Gray
Resolution        92 × 112
Purpose           Used for face recognition
Background        Dark homogeneous

Image pre-processing techniques are applied to the ORL dataset to increase recognition rates. Image preprocessing can also decrease the processing time, depending on the algorithm used to process the image, and it improves performance because cleaner images are used in the system. Pre-processing is used to remove the background, increase the brightness, reduce illumination effects and align the geometry. In this project two preprocessing stages are used: image color conversion, as shown in Fig. 8, and image resizing, as shown in Fig. 9; a short sketch of these two stages follows Fig. 9.

Fig. 7 Example of image variations in one subject


Fig. 8 Image color conversion from RGB to Grayscale


Original image (112 × 92) and resized image (93 × 72)

Fig. 9 Image resizing
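A minimal sketch of these two preprocessing stages, assuming OpenCV is available on both the PC and the Raspberry Pi; the target size follows Fig. 9 and the function name is an illustrative assumption:

```python
import cv2

def preprocess_face(image_bgr, size=(72, 93)):
    """Convert a colour face image to grayscale and resize it (Figs. 8 and 9)."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)   # 3-D colour pixels -> 1-D values
    resized = cv2.resize(gray, size)                      # size is (width, height)
    return resized.flatten()                              # 1-D vector ready for PCA
```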

5 Result and Discussion

In this analysis, a series of experiments is conducted to evaluate the performance of the biometric system using the benchmark ORL face dataset. This dataset contains males, females, people of different ethnicities and different expressions. There are 10 images in each class with an image size of 112 × 92 pixels. The images contain small variations such as pose changes, different expressions and wearing glasses. The recognition analysis is performed using all possible tests, where images are randomly chosen for training and the others are used for testing, over all 400 images in the ORL dataset. To examine the effect of the amount of information on the recognition rates, the number of PCA coefficients and the number of training images are varied. In order to examine the effect of the PCA coefficients, one image is used for training and one for testing, for several numbers of PCA coefficients, as shown in Table 2. PCA utilizes an orthogonal transformation to convert a group of observations of possibly correlated variables into a group of values of linearly uncorrelated variables called principal components. The principal components with high eigenvalues are selected to represent the images, as these components carry the major variances in the feature space. Considering measurement cost and classification accuracy, the number of principal components used should be kept as small as possible. The best recognition rate of 82.5% is achieved using 35 PCA coefficients, as shown in Table 2. With only a few PCA coefficients, a performance similar to that obtained with 100 PCA coefficients can be achieved. With 12 PCA coefficients the recognition rate drops to 80% because more information is lost during the projection. Figure 10 shows the recognition rates using one training image and one test image with different numbers of PCA coefficients. From this analysis, using a small number of training images with a sufficiently large number of PCA coefficients gives the best result.

Table 2 Recognition rates with various numbers of PCA coefficients

Number of train images   Number of test images   Number of eigenvectors   Recognition rate (%)
1                        1                       12                       80
1                        1                       20                       80
1                        1                       25                       80
1                        1                       30                       80
1                        1                       35                       82.5
1                        1                       40                       82.5
1                        1                       45                       82.5
1                        1                       50                       82.5
1                        1                       55                       82.5
1                        1                       60                       82.5
1                        1                       65                       82.5
1                        1                       70                       82.5
1                        1                       75                       82.5
1                        1                       80                       82.5
1                        1                       85                       82.5
1                        1                       90                       82.5
1                        1                       100                      82.5

Fig. 10 Recognition (%) rate with various number of PCA coefficients


5.1 Analysis of Recognition Rate Using Different Numbers of Training Images

This analysis uses all 400 images in the ORL dataset with different numbers of training images. Different numbers of training images produce different amounts of information in the feature space. For each class we randomly select one to nine training images and compute the best recognition rate for each number of training images using the best 35 PCA coefficients found in the previous analysis. Figure 11 shows the different recognition rates achieved when the number of training images is varied from one to nine. The performance is low when a small number of training images is used with 35 PCA coefficients. Table 3 shows the recognition rate when using different numbers of training images with one image for testing and 35 PCA coefficients. The best recognition rate is 82.5%, obtained when using 6 training images with each image represented by 35 PCA coefficients.

5.2 Recognition Rate Using Different Numbers of Testing Pictures

In this analysis, we use different numbers of testing images to examine the effect of interclass variations on the recognition rates. Figure 12 shows the analysis when the number of testing images is varied from 1 to 9. Using one image for testing and one for training with 35 PCA coefficients, the recognition rate is 77.5%. The performance decreases when many testing images are used, which shows that the algorithm has a limitation in handling such variations. Table 4 shows the recognition rate using different numbers of testing images with one image for training and 35 PCA coefficients. The best recognition rate is 82.5% when the

Table 3 Recognition rates with different numbers of training images

Number of training images   Number of testing images   Number of eigenvectors   Recognition rate (%)
1                           1                          35                       65.0
2                           1                          35                       67.5
3                           1                          35                       72.5
4                           1                          35                       75.0
5                           1                          35                       80.0
6                           1                          35                       82.5
7                           1                          35                       75.0
8                           1                          35                       62.5
9                           1                          35                       60.0

Fig. 11 Recognition rate with different number of train images

Fig. 12 Recognition rate with different number of test images


Table 4 Recognition rates with different numbers of testing images

Number of training images   Number of testing images   Number of eigenvectors   Recognition rate (%)
1                           1                          35                       77.5
1                           2                          35                       82.5
1                           3                          35                       70.0
1                           4                          35                       62.5
1                           5                          35                       77.5
1                           6                          35                       67.5
1                           7                          35                       67.5
1                           8                          35                       67.5
1                           9                          35                       67.5

testing is performed using two testing images. From the analysis, PCA still has a limitation in handling interclass variations; thus the testing images must have fewer variations in terms of pose and expression. In order to overcome the variations in the testing images, the number of training images needs to be increased so that enough information exists in the feature space.

6 Comparison of the Proposed Algorithm with PC-Based Method

This experiment is conducted to compare the proposed method with PC-based methods using PCA and LDA data projection. The recognition and data projection are conducted on both a personal computer and the Raspberry Pi board, with different computational times. The Raspberry Pi board is based on the BCM2837 quad-core processor running at 1400 MHz with 1 GB of RAM. Meanwhile, the general-purpose personal computer has an Intel Core(TM) i7-8550U CPU (1.80 GHz) and 8 GB of physical memory (RAM). The analysis is performed on both the PC and the Raspberry Pi to evaluate the recognition rates and computational time. The algorithm on both platforms involves several processing tasks such as preprocessing, image normalization, feature extraction, dimension reduction and matching. Table 5 shows the performance of all systems in terms of recognition rates and processing time. The proposed method achieved a better processing time compared with the existing methods using the PCA and LDA projection methods. In terms of recognition rates, the LDA-based method outperforms both PCA-based methods, due to the better class separation in the feature space obtained with LDA. However, the processing time for LDA is higher compared to PCA.

Table 5 Performance comparison of the recognition rates and processing time

Platform                              Recognition rate (%)   Processing time (s)
PC based with DCT and LDA [18]        89                     0.903
PC based using Gabor and PCA [19]     86                     0.705
Proposed method                       82                     0.206

6.1 Conclusion

This paper discusses the performance analysis of a face recognition system implemented on a low-cost embedded system. Pre-processing methods are utilized to increase the identification rate on the ORL database, and the results of the implemented PCA algorithm are discussed. We also noticed several limitations in this work, which depend on missing information, the pose of a person, expressions and illumination conditions. There are 40 distinct subjects and each of them has 10 different images. Despite the difference in processing time between the PC and the Raspberry Pi, the Raspberry Pi still offers advantages such as small size, portability, low power consumption and low cost. The face recognition system implemented in this project can be developed further to improve the identification rate and performance by using feature extraction methods that produce highly discriminative features. The limited memory space of the Raspberry Pi board can be addressed by developing the algorithm on the PC and then transferring the best parameters to the board. The best recognition rate obtained using this board is 82.5%, which is lower compared to PC-based methods with advanced feature extraction.

References

1. Shah D, Haradi V (2016) IoT based biometrics implementation on Raspberry Pi. Procedia
Comput Sci 79:328–336
2. Senthilkumar G, Gopalakrishnan K, Kumar VS (2014) Embedded image capturing system
using Raspberry Pi system. Int J Emerg Trends Technol Comput Sci 3:213–215
3. Profile SEE (2019) Digital image processing on face, 6
4. Jain AK, Nandakumar K, Ross A (2016) 50 years of biometric research: accomplishments,
challenges, and opportunities. Pattern Recognit Lett 79:80–105
5. Al Abdulwahid A, Clarke N, Stengel I, Furnell S, Reich C (2016) Continuous and transparent
multimodal authentication: reviewing the state of the art 19:455–474
6. Patil SK, Narole NG (2018) Smart face recognition security based on Raspberry Pi. Int J Eng
Sci Comput 8:16409–16412
7. Peter M, Minoi JL, Hipiny IHM (2019) 3D face recognition using kernel-based PCA
approach. Notes Electr Eng 481:77–86
8. Gupta B, Chaube A, Negi A, Goel U (2017) Study on object detection using open CV -
Python. Int J Comput Appl 162:17–21
9. Yli-Heikkilä VM (2015) Home surveillance with Raspberry Pi, pp 1–65 (2015)

10. Wilmer JB (2017) Individual differences in face recognition: a decade of discovery. Curr Dir
Psychol Sci 26:225–230
11. Harel A (2016) What is special about expertise? Visual expertise reveals the interactive nature
of real-world object recognition. Neuropsychologia 83:88–99
12. Huang W, Yin H (2017) Robust face recognition with structural binary gradient patterns.
Pattern Recognit 68:126–140
13. Gaur S, Dixit M, Hasan SN, Wani A, Kazi T, Rizvi AZ (2019) Comparative studies for the
human facial expressions recognition techniques. Int J Trend Sci Res Dev Int J Trend Sci Res
Dev 3:2421–2442
14. Patel A, Verma A (2017) IOT based facial recognition door access control home security
system. Int J Comput Appl 172:11–17
15. Senthilkumar R, Gnanamurthy RK (2016) A comparative study of 2D PCA face recognition
method with other statistically based face recognition methods. J Inst Eng Ser B 97:425–430
16. Li X, Lin Z (2018) Proceedings of the fourth Euro-China conference on intelligent data
analysis and applications, vol 682, pp 10–22
17. Barnouti NH (2016) Improve face recognition rate using different image pre-processing
techniques. Am J Eng Res (AJER) 5:46–53
18. Ahmad MI, Ilyas MZ, Md Isa MN, Ngadiran R, Darsono, AM (2014) Information fusion of
face and palmprint multimodal biometrics 2014, pp 635–639
19. Ahmad MI, Ilyas MZ, Ngadiran R, Isa MNM, Yaakob SN (2014) Palmprint recognition using
local and global features. In: Proceeding of international conference on systems, signals, and
image processing 2014, pp 79–82
Comparability of Edge Detection
Techniques for Automatic Vehicle
License Plate Detection and Recognition

Fatin Norazima Mohamad Ariff, Aimi Salihah Abdul Nasir,


Haryati Jaafar, and Abdul Nasir Zulkifli

Abstract The license plate recognition system is one of the well-known topics in image processing for identifying the vehicle registration number. This system has provided many benefits to the transportation system, especially for security. However, obtaining a perfect segmentation of the character shapes for recognition purposes is quite challenging due to the non-uniform conditions of image acquisition. Hence this paper proposes a methodology for segmentation of the license plate number by using edge-based segmentation. In this study, image segmentation based on edge detection has been chosen due to its sharpness and detail in detecting the shape of an object. Since various types of edge detection techniques have been proposed by previous researchers, several of the most commonly used edge detection techniques have been chosen in order to compare and analyze their results for license plate recognition. In this paper, several edge detection techniques, namely Approxcanny, Canny, Chan-Vese, Kirsch, Prewitt, Roberts, Sobel, Quadtree and Zero Crossing, are compared on greyscale images. The grayscale image is first enhanced by a modified white patch algorithm. Then, the hole areas of the segmented license plate image are filled to obtain the characters, followed by a step for removing the unwanted objects from the segmented license plate images. Later, the characters of the license plate are recognized based on a template matching approach. This recognition analysis consists of two stages. In the first stage, all edge detection techniques use the same standard values for removing the noise, and the five edge detectors with the best performance are selected for the next stage. In the second stage, the unwanted objects are removed using values appropriate for each of the edge detection techniques. The final result shows that

F. N. M. Ariff (&)  A. S. A. Nasir  H. Jaafar


Faculty of Engineering Technology, Universiti Malaysia Perlis, UniCITI Alam Campus,
Sungai Chuchuh, 02100 Padang Besar, Perlis, Malaysia
e-mail: [email protected]
A. N. Zulkifli
Institute of Creative Humanities, Multimedia and Innovation, School of Creative Industry
Management and Performing Arts, Universiti Utara Malaysia, Kedah, Malaysia

© Springer Nature Singapore Pte Ltd. 2021 891


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_64
Chan-Vese achieves the highest edge detection accuracy in license plate recognition.

Keywords License plate recognition · Image segmentation · Edge detection · Template matching

1 Introduction

License plate recognition (LPR), which falls under intelligent transportation systems (ITS), has become one of the important systems in transportation safety [1, 2]. The license plate number is the identity of a car, and each country has its own license plate arrangement which represents its national identification. This makes LPR systems a high priority in computer vision technology [3]. Various LPR techniques have been developed to suit the purpose of the recognition system, such as speed monitoring, tracking stolen cars, toll collection, etc. In parking systems, number plate recording is currently used to measure the time a car spends parked in an area. When the car enters the gate, a camera captures the license plate and records the entry time. When the car exits the gate, a second camera captures the license plate again and compares it with the first record to calculate the duration the car has been parked; this time difference determines the parking fee [4]. Another purpose is using the license plate as identification, normally at companies or restricted places, to control access so that only authorized vehicles can enter the area [5].
Even though LPR has been discussed over the past decade, the task is still challenging due to the varying quality of the images obtained. In addition, in this modern era various license plate patterns have been introduced following current trends, and there are common problems during the image acquisition process. Examples of problems encountered are outdoor illumination conditions such as reflection of sunlight, glare from vehicle headlights or projections, and complex non-essential background scenes in the image or on the license plate itself, such as unclean or damaged plates or non-license-plate characters printed on the vehicle [6]. Different distances and angles between the vehicle and the viewpoint also cause distortion for image processing. A capable LPR system must be able to handle unexpected conditions in the image obtained [7].
LPR is normally divided into three parts: license plate detection, character segmentation and character recognition [8]. Detection of a license plate is the process of identifying the region or localizing the position of the license plate in the image. Many researchers have been interested in license plate detection, from simple methods to the most complex mechanisms that exist [9]. Among the several segmentation methods that have been proposed, the edge-based approach is the most popular, since its detection rate is quite fast, high and more sensitive compared to other methods, although detecting edges is inherently difficult when the images are noisy or corrupted. Thus, the aim of this paper is
to compare and analyze which edge detection method is most suitable for use in license plate detection.
The paper is organized as follows: Sect. 2 briefly reviews previous work on edge detection. Section 3 discusses the differences between each edge detection technique. The procedure proposed for this analysis is presented in Sect. 4. Section 5 shows the results obtained and analyses which techniques produce the best performance. Finally, the conclusion is given in Sect. 6.

2 Previous Work Review

Image segmentation is a method of separating the target object from the background. There are five different segmentation techniques, which are region-based, thresholding, edge-based, feature-based clustering and model-based [10]. This paper focuses on edge-based segmentation. Edge detection is the process of identifying the adequate edges in the image. This process is quite popular for LPR because it makes detecting shape edges, especially the characters, easier and works well in noisy or complex images. A review of previous work has been carried out in order to find the best edge detector by comparing them against each other for license plate detection. The performance of different edge operators is normally measured using two methods, objective and subjective. Objective means the result is obtained through scientific calculation or a ratio, while subjective means the conclusion is made based on human eye observation.
A comparative study of commonly used edge detection techniques, namely Sobel, Roberts, Prewitt and Laplacian of Gaussian, was made by Sahoo et al. [11]. Bala Krishnan et al. [12] also compared the commonly used edge detectors, adding three more in their study: Canny, Kirsch and Robinson edge detection. The study shows that the Canny edge detection algorithm produces the best result compared to the other techniques in terms of sensitivity. However, none of this research performed a specific comparison on a particular item such as a license plate. In this paper, all the commonly used edge detection methods are included together with other good edge detection methods, namely Canny, Roberts, Prewitt, Sobel, Quadtree, Approxcanny, Kirsch, Zero Crossing and Chan-Vese, which are tested on license plates to find out which is the best edge detector for license plates.
Basically, some researchers have used edge detection techniques at the image pre-processing stage, before the image segmentation stage, in order to detect the license plate. For example, Rokibul et al. [13] developed an ALPR system for Bangladeshi vehicles, where the edges of the license plate were detected using the Sobel edge detection method, and the horizontal and vertical extraction from the Sobel edges was used to highlight the license plate location. Gou et al. [14] also used the Sobel method to obtain the vertical edges of the license plate in the image. On the other hand, the Canny edge method has been used for localization purposes, since the Canny operator is a very useful algorithm for determining edges from a noisy image [15, 16].
In the meantime, edge detectors have also been used for segmentation purposes. For example, Sharma [17] segmented the greyscale image with the Sobel edge detection technique; the edges were then dilated using a morphological technique, and the holes inside the objects were filled in order to recover the characters. Meanwhile, Babu and Raghunadh [18] used the edges from Sobel for segmentation and applied a dilation process to find rectangular shapes that could be similar to the license plate shape; the rectangle obtained is then used to mask the original image to extract the plate number after the unwanted pixels have been deleted. In other work, Prewitt edge detection was used by Al Taee [19] for recognition purposes. Fundamentally, this shows that edge detection techniques are widely used in image processing for LPR approaches and are flexible enough to be used at any image processing stage.

3 Edge Detection Technique

Edge detection is a process of identifying and locating discontinuity boundaries in an image. This technique is significant in many image processing applications such as pattern recognition, object recognition, motion analysis, etc. Edge detectors are developed from mathematical methods aiming to distinguish the intensity changes present in the image. An edge detection algorithm involves the following four steps:
• Filtering highlights the interesting features or removes possible noise in the image.
• Enhancement measures the gradient magnitude with respect to the pixels in order to emphasize the intensity changes in the digital image.
• Detection identifies the acceptable edges from the pixels.
• Localization refines the edge position, down to sub-pixel resolution where required.
These steps are processed in sequence to predict the edges efficiently. Various edge detection algorithms have been developed to achieve the segmentation target. This section explains the differences between each edge detection technique, namely the Canny, Approxcanny, Sobel, Roberts, Prewitt, Zero crossing, Laplacian of Gaussian (LoG), Kirsch, and Chan-Vese edge detectors.

3.1 Canny Edge Method

The main objective of the Canny operator is to isolate the interesting object from the background by finding the effective edges. Potential edges are marked and the centre of each edge is accurately localized, with each edge point detected only once while preventing the creation of false edges. This operator works similarly to the LoG, which overcomes the noise sensitivity problem. The first derivative is used to measure the mean squared distance, the signal-to-noise ratio and the error in the edge map. The Canny edge detection algorithm is designed to give optimal solutions with respect to noisy images [7]. In brief, the Canny edge detection steps are:
Step 1: Smooth the image with a Gaussian filter.
Step 2: Take the gradient of the image.
Step 3: Evaluate the edges using the gradient in the x and y coordinates.
Step 4: Trace the edges with respect to the x and y coordinates.
Step 5: Non-maximum suppression.
Step 6: Hysteresis thresholding.
Canny edge detection is essentially based on the first derivative of a Gaussian smoothing concept, using probability to find the errors. The operator balances the signal-to-noise ratio and localization to optimize the result, and the non-maxima suppression step automatically improves the signal with respect to the noise ratio.

3.2 Approxcanny Edge Method

The Approxcanny edge method is an edge detector based on an approximate version of the Canny method. It provides a faster execution time at the expense of less precise detection of tiny details in images. This simple approximation of the Canny operator uses different-sized Gaussian filters to smooth the image, and the gradient orientation and magnitude produced by the gradient operators (Gx, Gy) are applied in sequence to the smoothed image [20]. The five steps of the Canny approach used in this method are:
Step 1: Eliminate image noise and reduce fine details by applying the Gaussian filter.
Step 2: Derive the gradient intensity of the image. For instance, the gradient can be acquired by applying edge detector operators such as Roberts, Prewitt and Sobel.
Step 3: Non-maximum suppression is used to suppress false edge detection responses.
Step 4: Potential edges are determined from the double threshold values.
Step 5: Hysteresis traces the edges with strong responses and automatically removes faint edges from the image.
Fig. 1 Sobel 3 × 3 kernels for the x-direction and y-direction

3.3 Sobel Edge Method

The Sobel edge method applies a filter kernel to the input image, and the gradient magnitude and direction are calculated for each pixel of the image. The magnitude of the gradient vector is expressed as:

$$\nabla f = \text{mag}(\nabla f) = \left[ G_x^2 + G_y^2 \right]^{1/2} \qquad (1)$$

where $\nabla f$ represents the magnitude of the gradient vector, and $G_x$ and $G_y$ are the masks that identify horizontal and vertical edges. Sobel uses a pair of 3 × 3 kernels for the x- and y-directions, as shown in Fig. 1; one kernel is the other rotated by 90° [6].
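A brief sketch of how the Sobel gradient magnitude in Eq. (1) could be computed with OpenCV; the kernel size and the final thresholding rule are assumptions for illustration:

```python
import cv2
import numpy as np

gray = cv2.imread("plate.jpg", cv2.IMREAD_GRAYSCALE)
gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)    # gradient along x
gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)    # gradient along y
magnitude = np.sqrt(gx ** 2 + gy ** 2)             # Eq. (1)
# Simple illustrative threshold to obtain a binary edge map.
edges = (magnitude > magnitude.mean() + magnitude.std()).astype(np.uint8) * 255
```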

3.4 Roberts Edge Method

Roberts' edge method provides a simple and efficient differential operator. It approximates the gradient intensity through discrete differentiation, obtained from the sum of the squares of the differences between diagonally adjacent pixels. It uses a pair of 2 × 2 convolution kernels, similar to the Sobel method.
Both kernels are designed so that a pixel responds maximally to an edge lying at a 45° angle, with one kernel for each of the two perpendicular orientations. Separating the gradient measurement into the Gx and Gy orientations (see Fig. 2) allows each kernel to be applied independently to the input image; the results are then combined to find the absolute magnitude and orientation of the gradient at each point. The gradient magnitude is given as:

Fig. 2 Roberts 2 × 2 kernels, Gx and Gy
$$|G| = \sqrt{G_x^2 + G_y^2} \qquad (2)$$

The approximate magnitude, as well as the angle of orientation of the edge that gives rise to the spatial gradient, is also computed in Eqs. 3 and 4 [11]:

$$|G| = |G_x| + |G_y| \qquad (3)$$

$$\theta = \arctan\left(\frac{G_x}{G_y}\right) - \frac{3\pi}{4} \qquad (4)$$

3.5 Prewitt Edge Method

The Prewitt edge detector is quite similar to Sobel edge detection, where edge lines are detected in the horizontal (Gx) and vertical (Gy) directions and the differences of the image pixel intensities are calculated over the edge region.
The Prewitt kernels for Gx and Gy are defined as shown in Fig. 3. The central element of the 3 × 3 matrix in both the x- and y-directions contains zero, so the original pixel value at that position is not included. The intensity differences of the pixel values below and above are calculated along the x-direction, while the y-direction is calculated from the pixel values to the right and left. Thus, this increases the edge intensity [6].

3.6 Zero-Crossing (Laplacian) Edge Method

The Zero Crossing method is also called the Laplacian edge method, where the edge points of an image can be detected by finding the zero crossings of the second derivative of the image intensity. Zero crossing operates as follows:

Fig. 3 Prewitt 3 × 3 kernels for the x-direction and y-direction

Fig. 4 First derivative


Fig. 5 Second derivative

Fig. 6 Identification of zero crossing

Figure 4 shows the gradient of a one-dimensional signal, which is the first derivative with respect to t.
Figure 5 clearly shows that the gradient has a large peak centred on the edge. By comparing the gradient to a threshold level, an edge is found wherever the threshold is exceeded, but the edge has become "concentrated" due to the thresholding. Since the edge occurs at the peak, the Laplacian operation can be applied in one dimension; it is the second derivative with respect to t, and the zero crossings are then located.
Figure 6 depicts the Laplacian operation on a one-dimensional signal. As expected, the edge corresponds to a zero crossing, but other zero crossings corresponding to small ripples in the original signal are also marked. In this method the Laplacian operator is applied to the test image [21].

3.7 Laplacian of Gaussian (LoG) Edge Method

The Laplacian of Gaussian (LoG), which operates much like the Sobel operator, is also known as the Marr-Hildreth edge method. It works well at separating objects and at the same time decreases noise by finding the second derivative of the given image. A Gaussian filter first smooths the image f(x, y) together with the Laplacian filter (see Fig. 7), and then the true edges and zero crossings are analysed properly. This type of filter normally takes a grey-level image as input and produces a grey-level image as output. The Laplacian L(x, y) of an image with pixel intensity values I(x, y) is given as:
Fig. 7 Three commonly used discrete approximations to the Laplacian filter

$$L(x, y) = \frac{\partial^2 I}{\partial x^2} + \frac{\partial^2 I}{\partial y^2} \qquad (5)$$

The 2-D Gaussian function is explained as:

$$h(x, y) = e^{-\frac{x^2 + y^2}{2\sigma^2}} \qquad (6)$$

Here σ is the standard deviation, which controls the blurring of the image; the amount of blurring is determined by the value of σ [11].
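A short sketch of how a LoG response and its zero crossings might be obtained with SciPy; the file name and sigma value are illustrative assumptions:

```python
import cv2
import numpy as np
from scipy.ndimage import gaussian_laplace

gray = cv2.imread("plate.jpg", cv2.IMREAD_GRAYSCALE).astype(np.float64)
log_response = gaussian_laplace(gray, sigma=2.0)   # Gaussian smoothing + Laplacian (Eqs. 5-6)

# Mark pixels where the LoG response changes sign (zero crossings = candidate edges).
signs = np.sign(log_response)
zero_cross = np.zeros(gray.shape, dtype=bool)
zero_cross[:-1, :] |= signs[:-1, :] != signs[1:, :]
zero_cross[:, :-1] |= signs[:, :-1] != signs[:, 1:]
```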

3.8 Kirsch Edge Method

Kirsch is similar to Sobel, but Kirsch detects edges in more detail because each mask is rotated by 45°. The Kirsch operator is also called the Kirsch compass kernel because it has eight directions, which are North (N), North West (NW), West (W), South West (SW), South (S), South East (SE), East (E) and North East (NE), as shown in Fig. 8. With this approach the edges can be found effectively at the maximum response rate [12].

Fig. 8 Kirsch 3 × 3 kernels

3.9 Chan-Vese Edge Method

Chan-Vese is proposed based on an active contour model using an energy minimization technique. Edges are found without relying on gradient values, and the method works well on blurred and noisy images. Assume that I is the given image and the curve is represented implicitly via a Lipschitz function φ(t, x, y). Chan and Vese deduced the associated Euler-Lagrange equation for φ on the basis of the Mumford-Shah functional:
$$
\begin{cases}
c_1(\phi) = \dfrac{\int_\Omega I \, H(\phi)\, dx}{\int_\Omega H(\phi)\, dx}, \qquad
c_2(\phi) = \dfrac{\int_\Omega I \, (1 - H(\phi))\, dx}{\int_\Omega (1 - H(\phi))\, dx} \\[2ex]
\dfrac{\partial \phi}{\partial t} = \delta(\phi)\left[ \mu \, \nabla \cdot \dfrac{\nabla \phi}{|\nabla \phi|} - \nu - \lambda_1 (I - c_1)^2 + \lambda_2 (I - c_2)^2 \right] \\[2ex]
\phi(0, x) = \phi_0(x)
\end{cases} \qquad (7)
$$

where $\mu \geq 0$, $\nu \geq 0$, $\lambda_1, \lambda_2 > 0$ are fixed parameters,

$$H(z) = \begin{cases} 1, & z \geq 0 \\ 0, & z < 0 \end{cases}$$

is the Heaviside function and $\delta(z) = \frac{d}{dz} H(z)$ is the Dirac function, regularized as $\delta_\varepsilon(z) = \varepsilon / (\pi (\varepsilon^2 + z^2))$. This algorithm can spontaneously trace the interior contours. When the interior contour is small and far away from the initial contour, this model tends to behave unsatisfactorily by converging to local minima [22].

3.10 QuadTree Edge Method

Quadtree segmentation follows a top-down technique, which makes the algorithm simple and flexible. A quadtree is a segmentation based on a tree structure in which each internal node has four branches. Each branch containing an internal node points to a node that is a child of the given node in the quadtree; simply put, every node has either its own four branches or none at all. Figure 9 illustrates the tree structure of a three-level quadtree.
Each node in the quadtree is linked to a subblock of the input image with a determined size and location. One node is split into four equal-sized square blocks, and these four subblocks are represented by the four children of that parent node.
A moment-preserving technique is used to classify the block activity in order to decide whether a node needs to be split. The calculation is based on the mean value of the block for a given node. The pixels corresponding to the image block are classified into two functional groups guided by the mean value: the first group holds pixel values greater than or equal to the mean, while the second group holds pixel values smaller than the mean.


Fig. 9 Three-level of the quadtree

An absolute error is calculated whenever the means of the two groups are computed. This absolute error is used to decide whether the node needs to be split or not. A block is considered low-detail only when the absolute error is below a threshold value; otherwise, the block continues to be split. The same splitting process continues until the smallest allowable block size is reached [23].

4 The Proposed Procedure for License Plate Recognition

A simple LPR system is used as the test bed for the various edge detection techniques in order to find the best edge detector for an LPR system. The flowchart of the method is presented in Fig. 10. In general, the proposed procedure consists of three common steps, which are license plate detection, character segmentation and character recognition. This approach skips the license plate detection step by directly segmenting the characters and removing unwanted regions based on fixed character characteristics; the elements highlighted as character characteristics are the height, width and pixel area of the alphanumeric characters.
Based on the flowchart in Fig. 10, the procedure starts with the input of the original colour image, also known as the RGB image. RGB stands for Red, Green, and Blue, the colour elements, which indicates that the image is in colour. The images were captured by a 14-megapixel Fujifilm digital camera from three different angles, 45, 90 and 135° from the centre of the target cars, at a distance of 1.5-2.0 m.
The method starts with image pre-processing using a modified white patch algorithm to enhance the edges in the image [24]. This is because unbalanced image lighting conditions cause the license plate edges to be unclear for the detection

Fig. 10 The flowchart of the proposed procedure: Start → Original image → Apply image enhancement using modified white patch → Convert into grayscale image → Filter the image using a median filter → Apply edge detection (Canny, Sobel, etc.) for segmentation → Fill holes → Remove unwanted regions based on height, width and area → Recognize characters using template matching → End

process. Then, the enhanced image is converted into greyscale by taking the weighted average of the RGB channels. A median filter is applied to remove noise and smooth the image.
Next, the characters are segmented using the edge detection techniques discussed before. A fill-hole algorithm is used to fill the regions enclosed by the detected edge lines so that they join together and form objects. This algorithm helps in identifying the alphanumeric characters and separating the unwanted regions from the characters.
After that, non-alphanumeric objects are removed based on the specific height (46–107), width (14–93) and area (442) of the alphanumeric character pixels. These steps are important in order to clear the segmented image of unwanted pixels and to make the later recognition process easier. The process is done using the region growing technique to analyse which objects achieve the

(a) Original image (b) Modified white patch

(c) Gray scale image (d) Edge detection

(e) Fill hole (f) Remove noise

(g) Template matching (h) Result obtain

Fig. 11 Result of license plate segmentation


mentioned criteria. Unselected objects are eliminated from the segmented image.
Lastly, the segmented objects undergo an Optical Character Recognition (OCR) process to classify the alphanumeric characters. This process is done using the bounding box technique to highlight each object. The highlighted object is then evaluated using a matching algorithm, which finds the similarity of the obtained object with the training database. The matching algorithm used is template matching, which is widely used in OCR because it is among the simplest and fastest OCR approaches. Figure 11 shows the segmentation results based on the flowchart explanation. A sketch of how such a pipeline could be assembled is given below.
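The following is a minimal, hedged sketch of the segmentation-and-matching pipeline described above, using OpenCV; the simplified white-patch step, the size thresholds and the choice of edge detector are illustrative stand-ins for the study's actual settings:

```python
import cv2
import numpy as np
from scipy.ndimage import binary_fill_holes

def segment_characters(image_bgr):
    """Enhance, edge-detect and filter candidate character regions (Fig. 10)."""
    # Simplified white-patch enhancement: scale each channel so its maximum becomes 255.
    scale = 255.0 / image_bgr.reshape(-1, 3).max(axis=0)
    enhanced = np.clip(image_bgr * scale, 0, 255).astype(np.uint8)

    gray = cv2.cvtColor(enhanced, cv2.COLOR_BGR2GRAY)
    gray = cv2.medianBlur(gray, 3)                           # median filter
    edges = cv2.Canny(gray, 50, 150)                         # any of the compared detectors
    filled = binary_fill_holes(edges > 0).astype(np.uint8)   # fill-hole step

    chars = []
    n, labels, stats, _ = cv2.connectedComponentsWithStats(filled)
    for i in range(1, n):
        x, y, w, h, area = stats[i]
        if 46 <= h <= 107 and 14 <= w <= 93 and area >= 442:  # character criteria
            chars.append(gray[y:y + h, x:x + w])
    return chars

def match_character(char_img, templates):
    """Template matching: return the label of the best-matching template."""
    scores = {label: cv2.matchTemplate(cv2.resize(char_img, t.shape[::-1]), t,
                                       cv2.TM_CCOEFF_NORMED).max()
              for label, t in templates.items()}
    return max(scores, key=scores.get)
```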

5 Results and Discussions

The performance of the proposed procedure has been evaluated using two camera view positions: the license plate in a direct (frontal) position, with no angle involved in the captured image, and a second photo of the license plate skewed 45° from the first position. The images were taken during daylight under different lighting conditions, at distances between 1.3 and 2.5 m from parked vehicles, and were photographed with a 14-megapixel Fujifilm digital camera. The total number of images captured is 131, involving Malaysian license plates only. Figure 11 shows an overview of the segmentation results on a license plate.
In Fig. 11, image (a) shows the original image captured by the digital camera. Image (b) shows the result of the modified white patch, where the captured image has been enhanced to provide a clearer image in order to make the edge detection process easier. Subsequently, image (c) shows the greyscale image converted from the enhanced RGB image; the conversion is based on the weighted average of the RGB channels. Edge-based segmentation is shown in image (d); this segmentation step is repeated for all the selected techniques. Next, the fill-hole algorithm is applied after segmentation, and image (e) shows the image after the holes inside the objects have been filled to help the recognition step. Image (f) shows the result after noise removal based on region growing, which aims to smooth the recognition process by eliminating unwanted objects. Image (g) shows bounding boxes surrounding the objects that undergo the recognition process using template matching. Finally, image (h) shows the result obtained from the template matching process, which is written to a text file. Figure 12(a) shows the original image used for comparing the techniques, while Fig. 12(b) and (c) show the segmentation results of two techniques and Fig. 13 shows the results of the other techniques.
Figures 12 and 13 show the comparison between the edge detectors. All the edge detectors have been applied in the proposed method to determine which edge detection technique can

(a) Original image

(b) Approxcanny (c) Canny

Fig. 12 (a) The original image used for comparing the edge detection techniques, and the segmentation results of (b) Approxcanny and (c) Canny

extract the alphanumeric characters perfectly. The edges in the images are mostly correctly detected, but several techniques such as Approxcanny (see Fig. 12(b)) detect in detail every edge present in the image. Kirsch, shown in Fig. 13(b), and Quadtree, shown in Fig. 13(h), show that the characters are expanded from their actual size, which prevents the characters from being recognized well. Some characters, as in Fig. 13(d), show that the result of the Prewitt edge is not filled correctly because broken edge lines prevent the characters from filling up properly. This can cause large numbers of unwanted regions and can interfere with extraction. Template matching is used to recognize the true characters from the edge-based segmentation, and the evaluation is then made based on the confusion matrix technique. Table 1 shows the segmentation results from Canny edge detection; this classification has also been made for the rest of the edge detection techniques.
Table 2 shows the LPR system performance evaluated in the comparison of the different edge detectors. From the table, the highest number of characters correctly recognized by template matching is achieved by the Canny edge technique, with 473 correct detections out of 892 characters, while 419 characters could not be identified. The lowest true character recognition is produced by Quadtree edge detection, where only 51 characters are recognized correctly and about 841 characters are missed. Unwanted characters exist due to remaining noise which cannot be removed because it has the same features as the characters.

(a) Chan-Vese (b) Kirsch

(c) LoG (d) Prewitt

(e) Robert (f) Sobel

(g) Zero crossing (h) Quadtree

Fig. 13 The edge detection of various techniques



Table 1 Segmentation result from Canny edge detection


Input Actual number Result obtained Correct segment Remark
Car1 PMX 6573 Px6573 P _ X6573 Incorrect
Car2 JKH 6423 344JKH6423 JKH6423 Correct
Car3 BHK 2563 HK53 _ HK _5_3 Incorrect
Car4 BNR 9288 Incorrect
Car5 AHT 656 AHT656vjriziif AHT656 Correct
Car6 PHT 8578 PHT 8578 PHT8578 Correct
Car7 PLK 8585 PL8585 PL_8585 Incorrect
Car8 PLL 4128 rzt1PLL428 PLL4_128 Incorrect
Car9 WA 795 J JwA795Jrsrr wA795J Correct
Car10 QS 810 F S1FP _S_1_F Incorrect

Table 2 Performance comparison with different edge detectors

Edge method     Successful character recognition   Unsuccessful character recognition
Approxcanny     77                                 815
Canny           473                                419
Chan-Vese       329                                563
Kirsch          323                                569
LoG             436                                456
Prewitt         255                                637
Robert          310                                582
Sobel           253                                639
Quadtree        51                                 841
Zero crossing   88                                 804

The second set of results is shown in Table 3. The evaluation is calculated separately for three classes: TP (true positive), meaning the character is correctly recognized by the system; FP (false positive), meaning the character is recognized wrongly; and FN (false negative), meaning the true character is missing. From the results obtained, the five edge detectors best at producing true characters were selected to proceed to a second test. In this second test, the character criteria were implemented with measurement values suited to each edge detector, and localization of the license plate was also added to the system. Chan-Vese dominates the overall result with 60.15% accuracy, producing the highest number of correctly recognized characters, while the Canny operator comes second with 51.90% true positives. This shows that every edge detector has its own characteristics, and license plate localization is also needed in order to achieve a higher character recognition rate. Template matching can recognize most of the alphabet well. However, there are some issues regarding specific characters which have a
However, there have some issues regarding the specific character which have a
Table 3 Performance of the second test on selected edge detectors

Edge        TP (%)   FP (%)   FN (%)   Accuracy (%)
Canny       51.90    6.18     41.29    51.90
LoG         48.70    8.45     42.85    48.70
Robert      36.23    5.78     57.99    36.23
Chan-Vese   59.39    28.64    11.64    60.15
Kirsch      36.69    4.34     58.96    36.69

unique shape, which leads to recognition errors. The characters that give confusing recognition results are, for example, 'I' and 'J' with '1', 'B' with 'D', 'O' with 'Q', '2' with 'Z' and '9' with 'S'. All of this confusion automatically leads to recognition errors and reduces the measured accuracy of the edge detection performance. Thus, this part will be the focus of future work to overcome the misleading recognitions in template matching.

6 Conclusion

In this paper, a simple method has been proposed in order to carry out a comparison of various edge detection techniques. The edge detection techniques used for this experiment were Approxcanny, Canny, Chan-Vese, Kirsch, LoG, Prewitt, Roberts, Sobel and Zero Crossing. The performance of the edge detectors was evaluated in a two-stage test. The first stage was segmentation based on all edge detection techniques with standard measurement values for noise removal based on height, width and area. In the second stage, the five techniques with the best performance in the first analysis proceeded, and the noise was removed with specific measurement values for each technique. The first experiment was won by Canny, which achieved 473 successful character recognitions out of 892. However, the overall recognition rate was low, which may be caused by the small database used and the poor quality of the images. In future, the number of images in the database needs to be increased; another option is to try or include other enhancement techniques in order to give better detection for every edge detector. The second test results for the best edge detection were dominated by Chan-Vese with 59.39% true character recognition. Template matching is the simplest recognition system and is unstable in recognizing the characters, because noise that has the same pixel values as the characters cannot be reduced; this may cause the low performance rate obtained. Thus, in future the classification method must be changed in order to obtain more accurate character recognition. In addition, the analysis data can be extended in future work by using images taken in real time and also indoor license plate images, such as in building car parks.

References

1. Desertot M, Lecomte S, Gransart C, Delot T (2013) Intelligent transportation systems. In:


Computer science and ambient intelligence. ISTE, London
2. Nasir ASA, Gharib NKA, Jaafar H (2018) Automatic passenger counting system using image
processing based on skin colour detection approach. In: 2018 International conference on
computational approach in smart system design and applications (ICASSDA), pp 1–8
3. Musoromy Z, Ramalingam S, Bekooy N (2010) Edge detection comparison for license plate
detection. In: 11th International conference on control automation robotic vision, ICARCV
2010, no December, pp 1133–1138
4. Rashid MM, Musa A, Rahman MA, Farahana N, Farhana A (2012) Automatic parking
management system and parking fee collection based on number plate recognition. Int J Mach
Learn Comput 2(2):93–98
5. Saghaei, H.: Proposal for automatic license and number plate recognition system for vehicle
identification. In: International conference on new research achievements in electrical and
computer engineering, pp 1–5 (2016)
6. Yu S, Li B, Zhang Q, Liu C, Meng MQH (2015) A novel license plate location method based
on wavelet transform and EMD analysis. Pattern Recognit 48(1):114–125
7. Kumari S, Gupta L, Gupta PP, Abdul APJ (2013) Automatic license plate recognition using
OpenCV and neural network. Int J Comput Sci Trends Technol 5(3):1786–1792
8. Kakani BV, Gandhi D, Jani S (2017) Improved OCR based automatic vehicle number plate
recognition using features trained neural network. In: 8th International conference on
computing communication and networking technology, ICCCNT 2017, pp 1–6
9. Wen Y, Lu Y, Yan J, Zhou Z, Von Deneen KM, Shi P (2011) An algorithm for license plate
recognition applied to intelligent transportation system. IEEE Trans Intell Transp Syst 12(3):
830–845
10. Yogheedha K., Nasir ASA, Jaafar H, Mamduh SM (2018) Automatic vehicle license plate
recognition system based on image processing and template matching approach. In: 2018
International conference on computational approach in smart systems design and applications,
ICASSDA 2018, pp 1–8
11. Sahoo T, Pine S (2017) Design and simulation of various edge detection techniques using
Matlab Simulink. In: International Conference on Signal Processing, Communication, Power
and Embedded System, SCOPES 2016, pp 1224–1228
12. Bala Krishnan K, Prakash Ranga S, Guptha N (2017) A survey on different edge detection
techniques for image segmentation. Indian J Sci Technol 10(4):1–8
13. Rokibul M, Hossain S, Roy S, Alam N, Jahirul M (2016) Line segmentation and orientation
algorithm for automatic Bengali license plate localization and recognition. Int J Comput Appl
154(9):21–28
14. Gou C, Wang K, Yao Y, Li Z (2016) Vehicle license plate recognition based on extremal
regions and restricted Boltzmann machines. IEEE Trans Intell Transp Syst 17(4):1096–1107
15. Ha PS, Shakeri M (2016) License plate automatic recognition based on edge detection. Artif
Intell Robot IRANOPEN 2016:170–174
16. Choubey S, Sinha GR, Choubey A (2011) Bilateral partitioning based character recognition
for vehicle license plate. Commun Comput Inf Sci 147:422–426
17. Sharma G (2018) Performance analysis of vehicle number plate recognition system using
template matching techniques. J Inf Technol Softw Eng 8(2):1–9
18. Babu KM, Raghunadh MV (2017) Vehicle number plate detection and recognition using
bounding box method. In: Proceedings of 2016 international conference on advanced
communication, control & computing technologies, no 978, pp 106–110
19. Al Taee EJ (2018) The proposed Iraqi vehicle license plate recognition system by using
Prewitt edge detection algorithm. J Theor Appl Inf Technol 96(10):2754–2764
20. Arxiv.org. https://arxiv.org/abs/1905.11731v1. Accessed 15 Sept 2019

21. Shrivakshan GT, Chandrasekar C (2012) A comparison of various edge detection techniques
used in image processing. Int J Comput Sci Issues (IJCSI) 9(5):269–276
22. Xia R, Liu W, Zhao J, Li L (2007) An optimal initialization technique for improving the
segmentation performance of Chan-Vese model. In: Proceedings of IEEE international
conference on automation and logistics, ICAL 2007, pp 411–415
23. Hu YC, Chang CC (1999) Variable rate vector quantization scheme based on quadtree
segmentation. IEEE Trans Consum Electron 45(2):310–317
24. Khairudin NAA et al (2019) Image segmentation approach for acute and chronic leukaemia
based on blood sample images. IOP Conf Ser Mater Sci Eng 557(1):1–6
Classification of Facial Part Movement
Acquired from Kinect V1 and Kinect V2

Sheng Guang Heng, Rosdiyana Samad, Mahfuzah Mustafa,


Zainah Md Zain, Nor Rul Hasma Abdullah, and Dwi Pebrianti

Abstract The aim of this study is to determine which motion sensor, Kinect v1 or Kinect v2, gives better performance in facial part movement recognition. This study applies several classification methods, such as neural network, complex decision tree, cubic SVM, fine Gaussian SVM, fine kNN and QDA, to the datasets obtained from Kinect v1 and Kinect v2. The facial part movement is detected and extracted into 11 features and 15 classes. The chosen classifiers are then applied to train and test the dataset. The Kinect sensor whose dataset gives the highest testing accuracy will be selected to develop an assistive facial exercise application, in terms of tracking performance and detection accuracy.


Keywords Kinect V1 Kinect V2  Face tracking  Classification  Confusion

matrix Facial part movement

1 Introduction

Recently, assistive technologies have been widely used in human life in various aspects such as vision and hearing care. Hence, devices that feature assistive technologies must have satisfactory performance in terms of detection accuracy and response time. Furthermore, assistive technologies can help in rehabilitation by restoring an ability to its original state. For example, patients with Bell's Palsy syndrome find it difficult to make facial expressions correctly or as precisely as before, and they are required to undergo a series of rehabilitation sessions to get back to normal. Normally, the rehabilitation process takes a long time because of the difficulty of the physical exercises and the lack of motivation to repeat the same exercises. Assistive technologies help to improve motivation for efficient rehabilitation.

S. G. Heng (&)  R. Samad  M. Mustafa  Z. M. Zain  N. R. H. Abdullah  D. Pebrianti


Faculty of Electrical and Electronics Engineering, Universiti Malaysia Pahang, 26600 Pekan,
Pahang, Malaysia
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2021


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_65

For detailed facial feature detection and tracking accuracy, 3D facial image processing is better than 2D [1, 2] because a 3D sensor supports depth sensing. There are various choices of RGB-D sensors, but Kinect version 1 (v1) and Kinect version 2 (v2) are the better options due to their lower price, robust face tracking performance and high portability [3]. Besides, the Microsoft Kinect software development kits (SDKs) [4] are free and available online. The featured face tracking algorithms are robust enough to detect a human face in the presence of occlusions and poor lighting conditions.
In this study, the Kinect motion sensors are chosen to acquire facial part
movements for facial exercise analysis. The facial part movements are classified in
order to determine which Kinect sensor has higher degree of recognition and
accuracy. Hence, the Kinect sensor with higher classification accuracy will be
chosen to develop an assistive facial exercise application.

2 Literature Review

Face data is necessary in face recognition, facial expression analysis and biometric applications, and the facial features collectible from a human face are important in many fields. In facial image processing for the medical field, the facial features of a stroke patient can help the specialist to diagnose the illness condition and the rehabilitation progress [5]. Thus, the analysis of facial part movement has been carried out by using Kinect v1 and Kinect v2.
The classification types used in this study include neural network, complex decision tree, cubic kernel SVM (Support Vector Machines), fine Gaussian kernel SVM, fine kNN (k Nearest Neighbors) and QDA (Quadratic Discriminant Analysis). The basic working concept of each classifier is summarized below.

2.1 Neural Network

A neural network consists of neurons organized in layers [6]. Each neuron multiplies its inputs by adjustable weight values and delivers the sum through a transfer function to the neurons in the next layer. The number of neurons in each hidden layer is commonly set between the number of input nodes and the number of output nodes. The output of each neuron is given by

Y = \sum (\text{weight} \times \text{input}) + \text{bias}    (1)
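As an illustration of Eq. (1), the following is a minimal sketch, not taken from the paper, of a single neuron and a small feed-forward network in Python with NumPy; the layer sizes and weights are arbitrary placeholders.

```python
import numpy as np

def neuron(inputs, weights, bias):
    # Eq. (1): weighted sum of the inputs plus bias, passed through a transfer function
    return np.tanh(np.dot(weights, inputs) + bias)

def forward(x, layers):
    # Propagate an input vector through a list of (weight_matrix, bias_vector) layers
    for W, b in layers:
        x = np.tanh(W @ x + b)
    return x

# Hypothetical sizes: 11 distance-ratio features in, 30 hidden neurons, 15 classes out
rng = np.random.default_rng(0)
layers = [(rng.normal(size=(30, 11)), np.zeros(30)),
          (rng.normal(size=(15, 30)), np.zeros(15))]
scores = forward(rng.normal(size=11), layers)
print(scores.argmax())  # index of the predicted class
```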

2.2 Decision Tree

A decision tree breaks a dataset down into smaller and smaller subsets in the form of a tree structure. The decision nodes represent predictors or features of the dataset, while the leaf nodes represent the decision outputs or classes.
Quinlan [7] designed ID3, which uses entropy and information gain to construct a decision tree. The entropy computed from the frequency table of one attribute, E(S), and of two attributes, E(T, X), is

E(S) = -\sum_{i=1}^{c} p_i \log_2 p_i    (2)

E(T, X) = \sum_{c \in X} P(c)\, E(c)    (3)

The information gain is

Gain(T, X) = Entropy(T) - Entropy(T, X)    (4)

The attribute with largest information gain is chosen as the decision node and the
branch with 0 entropy is considered a leaf node. The non-leaf branches need further
splitting by using the ID3 algorithm until all data is classified.
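The following is a minimal sketch, under stated assumptions and not the authors' implementation, of computing Eqs. (2)-(4) for one candidate split; the labels and attribute values are toy placeholders.

```python
import numpy as np
from collections import Counter

def entropy(labels):
    # Eq. (2): E(S) = -sum p_i * log2(p_i) over the class frequencies
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(labels, attribute_values):
    # Eq. (3): E(T, X) = sum over attribute values of P(c) * E(c)
    # Eq. (4): Gain(T, X) = Entropy(T) - Entropy(T, X)
    total = len(labels)
    e_tx = 0.0
    for v in set(attribute_values):
        subset = [y for y, a in zip(labels, attribute_values) if a == v]
        e_tx += (len(subset) / total) * entropy(subset)
    return entropy(labels) - e_tx

# Toy example with made-up class labels and one categorical attribute
labels = ["open", "open", "closed", "closed", "open"]
attr   = ["low",  "high", "low",    "low",    "high"]
print(information_gain(labels, attr))
```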

2.3 SVM

Support vectors are the data points that lie nearest to the hyperplane and are the most difficult to classify. SVMs maximize the width of the margin around the hyperplane [8]. These hardest-to-classify points lie on the hyperplanes H_1 and H_2, while the plane H_0 is the median between them. The weight vector, input vector and bias are represented by w, x and b respectively.

H1 : w  xi þ b ¼ þ 1 ð5Þ

H2 : w  xi þ b ¼ 1 ð6Þ

H 0 : w  xi þ b ¼ 0 ð7Þ

The kernel trick is used to handle nonlinearly separable datasets. The common kernels used in this paper are the cubic and Gaussian radial basis function (RBF) kernels.
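The following is a minimal scikit-learn sketch, not the authors' MATLAB implementation, assuming a feature matrix of distance ratios and a label vector; "cubic SVM" is expressed as a degree-3 polynomial kernel and "fine Gaussian SVM" as an RBF kernel, with the data below as random placeholders.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Placeholder data: 11 distance-ratio features, 15 movement classes (random here)
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 11))
y = rng.integers(0, 15, size=300)

# "Cubic SVM" = polynomial kernel of degree 3; "fine Gaussian SVM" = RBF kernel
cubic_svm = SVC(kernel="poly", degree=3)
rbf_svm = SVC(kernel="rbf", gamma="scale")

for name, clf in [("cubic", cubic_svm), ("rbf", rbf_svm)]:
    scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validation as in the study
    print(name, scores.mean())
```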

2.4 kNN

Given a testing point x in a 2-class training dataset, the kNN algorithm identifies the k nearest neighbors of x to find its class [9]. In a 2-class problem, k is set to an odd value to avoid ties. For example, if k is set to 3, the 3 nearest feature points of x are identified, and the class with the greatest number of nearest points becomes the class of x. Basically, the Euclidean function is used to measure the distance between the testing point and the feature points from the training dataset:
\sqrt{\sum_{i=1}^{k} (x_i - y_i)^2}    (8)
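A minimal kNN sketch based on Eq. (8), illustrative only and not the study's MATLAB fine kNN, with toy 2-D data:

```python
import numpy as np

def knn_predict(x, train_X, train_y, k=1):
    # Eq. (8): Euclidean distance from the testing point to every training point
    dists = np.sqrt(((train_X - x) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    # Majority vote among the k nearest neighbours (k = 1 gives "fine kNN")
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy data: 4 training points with 2 features, 2 classes
train_X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
train_y = np.array([0, 0, 1, 1])
print(knn_predict(np.array([0.95, 1.0]), train_X, train_y, k=3))  # -> 1
```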

2.5 LDA

LDA (Linear Discriminant Analysis) creates a new axis between the feature points and projects the points onto that axis to maximize the separability between classes. The axis is chosen to maximize the distance between the class means while minimizing the scatter within each class. Overall, LDA is a classification method that reduces the dimensionality of the data and maximizes the separation between classes [10].
The scatter matrix S_i of multi-class data (c_1, c_2, \ldots, c_n) with m p-dimensional samples x_1, x_2, \ldots, x_m (where x_i = (x_{i1}, \ldots, x_{ip})) is

S_i = \sum_{x \in c_i} (x - \bar{x}_i)(x - \bar{x}_i)'    (9)

The intra-class scatter matrix \Sigma_w and the inter-class scatter matrix \Sigma_b are given by

\Sigma_w = S_1 + S_2 + \cdots + S_n = \sum_{i=1}^{n} \sum_{x \in c_i} (x - \bar{x}_i)(x - \bar{x}_i)'    (10)

\Sigma_b = \sum_{i=1}^{n} m_i (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})'    (11)
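A minimal NumPy sketch of Eqs. (9)-(11), computing the within- and between-class scatter matrices on placeholder data (not the study's dataset):

```python
import numpy as np

def scatter_matrices(X, y):
    # X: (m, p) samples, y: class labels; implements Eqs. (9)-(11)
    overall_mean = X.mean(axis=0)
    p = X.shape[1]
    Sw = np.zeros((p, p))
    Sb = np.zeros((p, p))
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        diff = Xc - mean_c
        Sw += diff.T @ diff                 # Eq. (10): sum of the per-class scatter S_i
        md = (mean_c - overall_mean).reshape(-1, 1)
        Sb += len(Xc) * (md @ md.T)         # Eq. (11): between-class scatter
    return Sw, Sb

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = rng.integers(0, 3, size=60)
Sw, Sb = scatter_matrices(X, y)
print(Sw.shape, Sb.shape)
```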

3 Materials and Methodology

Both Kinect sensors have face tracking algorithms provided in their respective
version of Kinect SDK library. The face tracking algorithms are modified and
executed on Microsoft Visual Studio to extract desired facial point coordinates.

The user is first required to perform a neutral expression as a reference for the subsequent facial part movements. Then, the distance between two paired facial points is calculated from the extracted 3D coordinates. The distance ratio is obtained by dividing each facial feature distance by the corresponding neutral reference distance. The distance ratios are then categorized into 15 classes for classification purposes.
MATLAB is selected as the platform to apply classification to the Kinect v1 and Kinect v2 datasets. There are 3000 training sets and 3000 testing sets for each Kinect sensor. Each set comes with 11 distance ratios corresponding to the paired facial feature points. The classifiers applied to the dataset are compared based on their accuracy results. Figure 1 shows the methodology workflow.
Initially, the user is asked to perform 15 facial part movements, including a neutral pose, raising the eyebrows, closing the eyelids and lip expressions. The collected 3D facial points with significant changes in coordinate location are set as feature points. Then, the feature points are paired as facial components to calculate the 3D distances, and the distance ratio between the neutral face and the moving face is obtained for classification. In total, 6000 facial data samples are collected and divided into a 3000-sample training set and a 3000-sample testing set for each of Kinect v1 and Kinect v2. For the neural network classification method, 30 hidden neurons are used to optimize the training and testing accuracy, while for the other classification methods, 5-fold cross-validation is used to validate the training set against unseen data. The class category of each facial part movement is listed in Table 1.
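The following is a minimal sketch, not the authors' MATLAB code, of the distance-ratio feature extraction and fine-kNN evaluation described above; the point pairs and data are hypothetical placeholders.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def distance_ratios(points, neutral_points, pairs):
    # 3D Euclidean distance of each paired facial point, divided by the
    # corresponding distance measured on the neutral face
    d = np.array([np.linalg.norm(points[i] - points[j]) for i, j in pairs])
    d0 = np.array([np.linalg.norm(neutral_points[i] - neutral_points[j]) for i, j in pairs])
    return d / d0

# Placeholder: 20 tracked 3D points and 11 hypothetical point pairs
rng = np.random.default_rng(0)
neutral = rng.normal(size=(20, 3))
moving = neutral + 0.05 * rng.normal(size=(20, 3))
pairs = [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9), (10, 11),
         (12, 13), (14, 15), (16, 17), (18, 19), (0, 19)]
print(distance_ratios(moving, neutral, pairs))

# Fine kNN (1 nearest neighbour) with 5-fold cross-validation, as in the study
X = rng.normal(size=(3000, 11))          # stand-in for the 3000 training sets
y = rng.integers(1, 16, size=3000)       # 15 movement classes
print(cross_val_score(KNeighborsClassifier(n_neighbors=1), X, y, cv=5).mean())
```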

Fig. 1 Methodology workflow



Table 1 Classes with corresponding facial part movement

Class   Facial part movement
1       Neutral
2       Raising eyebrows slightly
3       Raising eyebrows greatly
4       Frowning
5       Closing right eyelid
6       Closing left eyelid
7       Closing both eyelids
8       Stretching lips
9       Lowering lip corner
10      Pouting
11      Opening mouth
12      Stretching right lip corner
13      Stretching left lip corner
14      Pulling left lip corner
15      Pulling right lip corner

4 Results and Discussion

The optimum number of hidden neurons for the Kinect v1 dataset is 30. The training accuracy is satisfying at 96.6%; however, the testing accuracy is only 68.9%. This is because some classes are undetectable by Kinect v1: the noise has been learned by the model and caused overfitting, so the model fits the testing data poorly. Overall, the model does not generalize well.
From Fig. 2, the 6th and 15th target classes have the lowest prediction accuracy because these classes are asymmetrical facial part movements, which are not supported by Kinect v1. For the 4th target class, the predicted output classes include the 3rd, 5th and 6th classes because Kinect v1 tends to detect raising behavior between the inner corners of the eyes and the eyebrows. Moreover, Kinect v1 is responsive to the lip corner movement only, which results in poor output accuracy for the 9th target class. Furthermore, Kinect v1 gives inaccurate upper and lower lip detection when the lips are closed because the distance between the upper and lower lips is nearly zero. This overlapping issue in lip detection causes the low prediction accuracy of the 10th target class.
Both Kinect v1 and Kinect v2 use 30 hidden neurons. From Fig. 3, the testing accuracy is 91.2%, which is far better than Kinect v1. For eyebrow-raising detection, Kinect v2 is less responsive to the outer eyebrows, which explains why some testing data from the 2nd and 3rd target classes are predicted wrongly as the 3rd and 2nd output classes respectively.
For the decision tree classification, the complex tree classifier type is chosen. The model flexibility is high, as the maximum number of splits is up to 100.

Fig. 2 Kinect v1: confusion matrices of neural network training (left) and testing set (right)

Fig. 3 Kinect v2: confusion matrices of neural network training (left) and testing set (right)

From Fig. 4, the testing accuracy is 50.4%, which is considered low because the 5th, 6th, 7th, 12th, 13th, 14th and 15th classes are not supported by Kinect v1. The only way to increase the training and testing accuracy is to drop some of the noise from the dataset.
From Fig. 5, the testing accuracy is 80.0%, which is satisfying for Kinect v2. The 2nd class has low detection accuracy because raising the eyebrows slightly has been misdetected as raising the eyebrows greatly and lowering the lips. Besides, the 4th class has low detection accuracy because the frowning expression has been wrongly categorized as closing the right or left eyelid.
The cubic kernel function is applied in the SVM classification. From Fig. 6, the testing accuracy of Kinect v1 is 71.0%, which is an acceptable result. The 15th class has the lowest detection accuracy as Kinect v1 does not support asymmetrical detection.

Fig. 4 Kinect v1: confusion matrices of complex tree training (left) and testing set (right)

Fig. 5 Kinect v2: confusion matrices of complex tree training (left) and testing set (right)

From Fig. 7, the testing accuracy is 92.8%, which is high for Kinect v2. The 2nd and 3rd classes have low detection accuracy because Kinect v2 is less responsive to the outer eyebrows; hence, the 2nd and 3rd classes have been classified wrongly as the 3rd and 2nd classes respectively.
The Gaussian kernel is applied to the training dataset for both Kinect v1 and Kinect v2. From Fig. 8, the testing accuracy for Kinect v1 is 76.0%, which is considered acceptable. The 15th class has the lowest detection accuracy because Kinect v1 fails to detect asymmetrical expressions. Besides, the low detection accuracy of the 9th class indicates that Kinect v1 is responsive to the lip corner movement only.

Fig. 6 Kinect v1: confusion matrices of cubic SVM training (left) and testing set (right)

Fig. 7 Kinect v2: confusion matrices of cubic SVM training (left) and testing set (right)

From Fig. 9, the testing accuracy for Kinect v2 is 93.7%, which is satisfying. The lowest detection rates are recorded for the 2nd and 3rd classes, where some testing samples have been misclassified into each other's class due to the similarity between raising the eyebrows slightly and greatly. Besides, the 1st and 9th classes have been misdetected as each other's class as well.
For the fine kNN classification, the distinction between classes is finely detailed as the number of neighbors is set to 1 only. From Fig. 10, Kinect v1 has a testing accuracy of 78.8%, which is considered acceptable. The 9th class has the lowest detection rate because lowering the lip corner has been misclassified into the 12th and 13th classes, which are stretching the right and left lip corner respectively.

Fig. 8 Kinect v1: confusion matrices of fine Gaussian SVM training (left) and testing set (right)

Fig. 9 Kinect v2: confusion matrices of fine Gaussian SVM training (left) and testing set (right)

From Fig. 11, Kinect v2 has the highest testing accuracy among all the classification types, 94.3%. Some 2nd and 3rd class samples are predicted wrongly as the 3rd and 2nd classes respectively because Kinect v2 is less responsive to the outer eyebrow movement.
The chosen classifier type for discriminant analysis is the quadratic discriminant, which creates nonlinear boundaries between the training classes. From Fig. 12, Kinect v1 has a low testing accuracy of only 60.1%. The model almost fails to detect the 6th class as Kinect v1 does not support eyelid movement detection. The 10th class is also wrongly categorized into the 9th class.

Fig. 10 Kinect v1: confusion matrices of fine kNN training (left) and testing set (right)

Fig. 11 Kinect v2: confusion matrices of fine kNN training (left) and testing set (right)

From Fig. 13, Kinect v2 has a good testing accuracy of 89.0%. Some of the 2nd and 3rd class facial part movements have been misclassified into the 3rd and 2nd classes respectively because Kinect v2 is less responsive to the outer eyebrow movement.
Overall, the fine kNN classifier dominates the other classification types in training and testing accuracy for both Kinect v1 and v2, as shown in Table 2, while the complex tree classifier records the lowest training and testing accuracy for both sensors.
For the fine kNN classification, Kinect v1 has a training accuracy of 98.8%, which is slightly better than the 97.8% of Kinect v2. Both Kinect v1 and v2 also achieve their highest testing accuracy with the fine kNN classification.

Fig. 12 Kinect v1: confusion matrices of QDA training (left) and testing set (right)

Fig. 13 Kinect v2: confusion matrices of QDA training (left) and testing set (right)

Table 2 Classification accuracy


Classification type    Training accuracy                 Testing accuracy
                       Kinect v1 (%)   Kinect v2 (%)     Kinect v1 (%)   Kinect v2 (%)
Neural network         96.6            97.5              68.9            91.2
Complex tree           66.8            85.3              50.4            80.0
Cubic SVM              97.4            97.5              71.0            92.8
Fine Gaussian SVM      97.0            97.5              76.0            93.7
Fine kNN               98.8            97.8              78.8            94.3
QDA                    72.0            92.8              60.1            89.0

However, Kinect v1 has only 78.8% testing accuracy compared to the 94.3% of Kinect v2. Because Kinect v1 does not support eyelid and asymmetrical facial part movements, some relevant responses are not classified correctly, which drags down the testing accuracy. On the other hand, Kinect v2 has a satisfying testing accuracy of 94.3%; most responses are classified correctly because Kinect v2 supports eyelid and asymmetrical facial part movements.

5 Conclusion

In conclusion, the testing accuracy of Kinect v2 is better than that of Kinect v1. For the Kinect v2 dataset, the fine kNN classification, which uses only 1 nearest neighbor, has the highest training and testing accuracy among all classification types used in this study, 97.8% and 94.3% respectively. The training and testing accuracy are satisfying and close to each other; hence, the training dataset is proven suitable for the testing dataset, as Kinect v2 supports the eyelid and asymmetrical facial part movements. In contrast, Kinect v1 has a relatively low testing accuracy of only 78.8% for the fine kNN classification. Although Kinect v1 has the highest training accuracy (98.8%) among all classification types, its poor testing accuracy shows that Kinect v1 has a lower face tracking performance than Kinect v2. Overall, the classification results show that Kinect v2 is more suitable for developing an assistive facial exercise application.

Acknowledgements The research is funded by Fundamental Research Grant Scheme FRGS/1/


2016/TK04/UMP/02/1 and Universiti Malaysia Pahang (UMP).

References

1. Xu CH, Wang YH, Tan TN, Quan L (2004) Depth vs. intensity: which is more important for
face recognition? In: 17th International conference on pattern recognition on proceedings,
Cambridge, UK, vol 1. IEEE, pp 342–345
2. Abate AF, Nappi M, Riccio D, Sabatino G (2007) 2D and 3D face recognition: a survey.
Pattern Recogn Lett 28(14):1885–1906
3. Wu HH, Bainbridge-Smith A (2011) Advantages of using a Kinect camera in various
applications. University of Canterbury, pp 1–4
4. Webb J, Ashley J (2012) Beginning Kinect programming with the Microsoft Kinect SDK, 1st
edn. Apress, New York
5. Umirzakova S, Whangbo TK (2018) Study on detect stroke symptoms using face features. In:
2018 International conference on information and communication technology convergence
(ICTC), Korea. IEEE, pp 429–431
6. Hansen LK, Salamon P (1990) Neural network ensembles. IEEE Trans Pattern Anal Mach
Intell 12(10):993–1001

7. Quinlan JR (1986) Induction of decision trees. Mach Learn 1(1):81–106


8. Berwick R An idiot's guide to support vector machines (SVMs). http://web.mit.edu/6.034/wwwbob/svm.pdf. Accessed 15 Oct 2019
9. Guo G, Wang H, Bell D, Bi Y, Greer K (2003) KNN model-based approach in classification.
In: Meersman R, Tari Z, Schmidt DC. (eds.) OTM confederated international conferences
“On the move to meaningful internet systems”. LNCS, vol 1. Springer, Heidelberg, pp 986–
996
10. Li T, Zhu S, Ogihara M (2006) Using discriminant analysis for multi-class classification: an
experimental investigation. Knowl Inf Syst 10(4):453–472
Hurst Exponent Based Brain Behavior
Analysis of Stroke Patients Using EEG
Signals

Wen Yean Choong, Wan Khairunizam, Murugappan Murugappan,


Mohammad Iqbal Omar, Siao Zheng Bong, Ahmad Kadri Junoh,
Zuradzman Mohamad Razlan, A. B. Shahriman,
and Wan Azani Wan Mustafa

Abstract Stroke patients perceive emotions differently from normal people due to emotional disturbances, and this emotional impairment can be effectively analyzed using the EEG signal. The EEG signal is known to be non-linear, and the neuronal oscillations under different mental states can be observed with non-linear methods. The non-linear analysis of different emotional states in the EEG signal was performed using the Hurst exponent (HURST). In this study, the long-range temporal correlation (LRTC) was examined in the emotional EEG signals of stroke patients and normal control subjects. The estimated HURST was more statistically significant in the normal group than in the stroke groups: the statistical test on the HURST showed a more significant difference among the emotional states of the normal subjects than among those of the stroke patients. In particular, the gamma frequency band of the emotional EEG was found to be the most statistically significant among the different emotional states.

Keywords Electroencephalogram · Hurst exponent (HURST) · Stroke · Emotion

W. Y. Choong (&)  W. Khairunizam  M. I. Omar  S. Z. Bong  Z. M. Razlan 


A. B. Shahriman
School of Mechatronic Engineering, Universiti Malaysia Perlis, 02600 Arau, Perlis, Malaysia
e-mail: [email protected]
M. Murugappan
Department of Electronics and Communication Engineering, Kuwait College of Science and
Technology, Block 4, Doha, Kuwait
A. K. Junoh
Institute of Engineering Mathematics, Universiti Malaysia Perlis, 02600 Arau, Perlis,
Malaysia
W. A. W. Mustafa
Faculty of Engineering Technology, Universiti Malaysia Perlis, 02600 Arau, Perlis, Malaysia

© Springer Nature Singapore Pte Ltd. 2021


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_66

1 Introduction

Stroke was ranked second among the top ten causes of death worldwide according to the World Health Organization (WHO), whose 2015 statistics reported about 85 deaths caused by stroke per 100,000 population [1]. Stroke, or cerebrovascular accident (CVA), happens due to a problem with the oxygen supply to the brain; it damages brain cells and leads to the loss of brain functions. Hence, stroke patients often experience emotional problems, including emotional impairment and having different emotional perceptions than normal people for the same emotional situation [2, 3]. Yuvaraj reviewed the emotional impairment in stroke patients and concluded that stroke patients experience emotions differently from normal people, and that right-brain-damaged stroke patients have more impairment than left-brain-damaged stroke patients [4].
Most human activities are regulated by the brain, and a huge amount of information, including emotional experiences, can be analyzed from brain activity [5-8]. Several methodologies have been implemented in the literature to "capture" brain activity for analysis, such as positron emission tomography (PET), functional magnetic resonance imaging (fMRI), magnetoencephalography (MEG) and electroencephalography (EEG). In this study, EEG was used because its data acquisition devices are cheaper and more portable than the other devices, and EEG signals have good temporal and spatial resolution.
The EEG signal is a biosignal generated from the stochastic phenomena of biological systems, so the analysis of EEG signals is useful in explaining the neuronal activities of different mental states. The human brain is made up of four lobes, known as the frontal (F), parietal (P), temporal (T) and occipital (O) lobes, located at the cerebral cortex (gray matter) of the brain. Different parts of the brain form a complex interconnected network that controls and coordinates body activities. In order to understand how the brain works, researchers have been studying EEG signals from different brain parts. In this study, the emotional states of human subjects were collected in the form of EEG signals to study the brain behavior in processing different emotional states.

1.1 Properties of EEG Signals

The EEG signal is the most powerful biosignal and is used in several clinical and real-time system design applications. Mostly, this signal is used for the diagnosis of abnormalities or disorders in the brain region, such as traumatic brain injury [9], stroke [6, 10, 11], Parkinson's disease [12, 13], and Alzheimer's disease [14]. This characteristic enables us to track minute changes of emotion through brain signal activity more readily than with other biosignals.

From earlier studies, EEG is found to be a non-linear, non-stationary, and non-Gaussian signal [15, 16]. Researchers have been working on verifying the properties of EEG data to understand more about the human brain [17-19]. Thus, features from different domains and methods have been used in EEG signal analysis. Since EEG exhibits non-linear characteristics, there have been studies on EEG analysis using non-linear features, and in past studies non-linear analysis has been reported to perform better than linear analysis in EEG signal processing [20].
EEG signals are oscillatory in nature, and most of the useful information about the brain state is reflected in the low frequency bands (<100 Hz). The behavior of the EEG signal oscillation differs for different emotions. There are five frequency bands (or frequency components, rhythms) in the EEG signal: delta (1-4) Hz, theta (4-8) Hz, alpha (8-13) Hz, beta (13-30) Hz, and gamma (30-49) Hz [21]. The delta and theta bands are used to analyze unconscious states or the sleep stages of humans. Notably, the alpha, beta and gamma bands are used to investigate the conscious or awake states of humans, as they are related to the intensity of awareness. From the delta band to the gamma band, the intensity varies from the least aware to the most aware states of the brain.

2 Related Studies

From the literature, different non-linear features have been reported as effective methods for analyzing EEG signals, for example higher order statistics, bispectrum features, approximate entropy (ApEn), the Hurst exponent (HURST), detrended fluctuation analysis (DFA), correlation dimension and the Lyapunov exponent [22-24].
HURST was first introduced for the modeling of the Nile river [25]. Later, it has
been applied in financial markets to predict the long-term effect of the time series,
as the HURST is effective in observing the fractal nature of financial markets [26].
Furthermore, the HURST is useful for other natural system such as the EEG signal.
HURST has been used in EEG signals to identify the persistent trends in time
series, and to analyze the EEG signals under different mental states [24, 27, 28].
In the EEG signal analysis using HURST, researchers reported the existence of
the long-range temporal correlation (LRTC) in EEG signals [29–31]. The LRTC
indicates that the interaction of the underlying neuronal population is able to
operate over a broad temporal scale. The LRTC shows the correlation of the
oscillatory brain activity [29]. Moreover, the LRTC relates to the severity level of
depression [31]. The LRTC in the EEG signals can be estimated from the Hurst
exponent and detrended fluctuation analysis (DFA) [31–34], these methods were
useful in estimating the LRTC in EEG signals.
Meanwhile, HURST expresses the autocorrelation of a time series and is presented as the H value. An H value between 0 and 0.5 shows anti-correlation in the time series, an H value between 0.5 and 1 shows positive correlation, and an H value of 0.5 indicates that there is no correlation behavior in the time series [35].

3 Materials and Methods

In this study, the EEG database was collected from stroke and normal subjects. Pre-processing was then carried out prior to the extraction of the HURST feature, and statistical analysis was performed on the HURST features to analyze their significance among the emotional states. The summary of the methodology is shown in Fig. 1.

3.1 EEG Data

The EEG database was collected from Hospital Canselor Tuanku Muhriz (HCTM), Kuala Lumpur, Malaysia [6]; the detailed experimental settings and procedures can be found in [6]. To analyze the EEG of different emotional states, three groups of subjects were studied: left brain damage stroke patients (LBD), right brain damage stroke patients (RBD), and normal controls (NC). There were 15 subjects in each group, so 45 subjects in total were involved in this study.
The subjects were emotionally induced into six different emotional states by using audio-visual stimuli taken from the International Affective Picture System (IAPS) and the International Affective Digitized Sounds (IADS). The six emotional states were anger, disgust, fear, happiness, sadness and surprise. The EEG signals were recorded using a wireless 14-channel Emotiv Epoc; the EEG device has a sampling frequency of 128 Hz, and the electrode placement followed the international standard 10-20 system.
Fig. 1 Methodology block diagram: EEG Data (Anger, Disgust, Fear, Happiness, Sadness, Surprise) → Pre-processing (Alpha, Beta, Gamma) → Feature Extraction (HURST) → Statistical Analysis (ANOVA)

The emotional EEG signals were preprocessed by applying a Butterworth 6th-order bandpass filter with a cut-off frequency range of (0-49) Hz. Three EEG sub-bands were involved in this study: the alpha (8-13) Hz, beta (13-30) Hz and gamma (30-49) Hz frequency bands. The HURST was analyzed in each frequency band for the six different emotional states, and the features were estimated from every 6 s epoch of the EEG data.
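A minimal sketch of this pre-processing step, not the authors' code, assuming a NumPy array of shape (channels, samples) at 128 Hz; the band edges follow the sub-bands above and the data are placeholders.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 128  # Emotiv Epoc sampling frequency (Hz)

def bandpass(signal, low, high, fs=FS, order=6):
    # 6th-order Butterworth band-pass (applied zero-phase with filtfilt)
    b, a = butter(order, [low / (fs / 2), high / (fs / 2)], btype="band")
    return filtfilt(b, a, signal, axis=-1)

def epochs(signal, seconds=6, fs=FS):
    # Split one channel into non-overlapping 6 s epochs
    n = (signal.shape[-1] // (seconds * fs)) * seconds * fs
    return signal[..., :n].reshape(-1, seconds * fs)

rng = np.random.default_rng(0)
eeg = rng.normal(size=(14, 60 * FS))           # placeholder: 14 channels, 60 s of data
bands = {"alpha": (8, 13), "beta": (13, 30), "gamma": (30, 49)}
sub = {name: bandpass(eeg, lo, hi) for name, (lo, hi) in bands.items()}
print({name: epochs(x[0]).shape for name, x in sub.items()})
```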

3.2 Hurst Exponent

In this study, the HURST was estimated using the Rescaled Range Statistics (RRS) method, which evaluates the variability of data in a time series [36]. In order to investigate the amplitude fluctuations of the neuronal EEG, the RRS-based HURST was estimated from the amplitude envelope of the oscillatory EEG. A detailed discussion of the HURST estimation can be found in [36].
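A standard rescaled-range (R/S) sketch of the Hurst exponent is shown below; it is not the exact estimator of [36] and uses white noise as placeholder input (H should come out near 0.5).

```python
import numpy as np

def hurst_rs(x):
    # Rescaled Range (R/S) estimate of the Hurst exponent of a 1-D series x
    x = np.asarray(x, dtype=float)
    n = len(x)
    windows = [int(w) for w in 2 ** np.arange(3, int(np.log2(n)) + 1) if w <= n]
    sizes, rs_values = [], []
    for w in windows:
        rs = []
        for start in range(0, n - w + 1, w):
            seg = x[start:start + w]
            z = np.cumsum(seg - seg.mean())      # cumulative deviation from the mean
            r = z.max() - z.min()                # range of the cumulative deviation
            s = seg.std()
            if s > 0:
                rs.append(r / s)                 # rescaled range for this segment
        if rs:
            sizes.append(w)
            rs_values.append(np.mean(rs))
    # H is the slope of log(R/S) against log(window size)
    slope, _ = np.polyfit(np.log(sizes), np.log(rs_values), 1)
    return slope

rng = np.random.default_rng(0)
print(hurst_rs(rng.normal(size=4096)))
```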

3.3 Statistical Validation

The extracted features were tested for statistical significance among the six emotions by applying one-way analysis of variance (ANOVA). Two types of values are obtained from ANOVA, the calculated f-value and the p-value. The f-value is the ratio of the mean square values, where a higher f-value indicates a larger difference in the variation among the group means [37, 38]. The p-value is used to make a decision according to the f-value from the ANOVA table, and represents the probability of the feature being statistically significant. In this study, the features were validated as statistically significant if the p-value was less than or equal to 0.05.
The ANOVA was used to test the significant changes in each EEG channel among the six emotions for each of the three groups, LBD, RBD and NC. The f-value calculated from the ANOVA was used to rank the EEG channels from the highest to the lowest value for each EEG frequency band and group, and the p-value was then used to determine the statistically significant EEG channels.
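A minimal sketch of this per-channel ANOVA ranking, assuming a matrix of HURST values per epoch and channel with an emotion label per epoch; the data here are placeholders, not the study's measurements.

```python
import numpy as np
from scipy.stats import f_oneway

def rank_channels(features, labels, alpha=0.05):
    # features: (n_epochs, n_channels) HURST values; labels: emotion per epoch.
    # Runs one-way ANOVA per channel and ranks the channels by f-value.
    results = []
    for ch in range(features.shape[1]):
        groups = [features[labels == e, ch] for e in np.unique(labels)]
        f_val, p_val = f_oneway(*groups)
        results.append((ch, f_val, p_val))
    results.sort(key=lambda r: r[1], reverse=True)      # highest f-value first
    return [(ch, f, p, p <= alpha) for ch, f, p in results]

rng = np.random.default_rng(0)
hurst_feats = rng.uniform(0.5, 1.0, size=(600, 14))     # placeholder HURST values
emotions = rng.integers(0, 6, size=600)                  # six emotional states
for ch, f, p, sig in rank_channels(hurst_feats, emotions)[:6]:
    print(f"channel {ch}: f = {f:.3f}, p = {p:.3f}, significant = {sig}")
```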

4 Results and Discussion

The HURST values of the EEG in the different emotional states were estimated from three EEG sub-bands, the alpha, beta and gamma bands, and the average HURST values of the EEG channels are tabulated in Table 1. From Table 1, the HURST values of all the frequency bands are between 0.5 and 1, which indicates that the emotional EEG signals exhibit positive correlation and that LRTC is present in the emotional EEG signals of the LBD, RBD and NC groups.

Table 1 Average HURST values of the six emotional states


EEG band Group Anger Disgust Fear Happiness Sadness Surprise
Alpha LBD 0.792 0.792 0.793 0.787 0.789 0.791
RBD 0.781 0.780 0.783 0.784 0.784 0.782
NC 0.783 0.787 0.789 0.794 0.785 0.793
Beta LBD 0.727 0.734 0.733 0.719 0.716 0.731
RBD 0.707 0.707 0.713 0.705 0.701 0.705
NC 0.723 0.727 0.733 0.738 0.720 0.732
Gamma LBD 0.727 0.739 0.731 0.717 0.723 0.739
RBD 0.713 0.714 0.715 0.704 0.703 0.708
NC 0.728 0.733 0.731 0.740 0.720 0.732

Table 2 ANOVA results of HURST among the six emotional states in alpha band

Rank   LBD                           RBD                           NC
       Channel   f-cal    p-value    Channel   f-cal    p-value    Channel   f-cal    p-value
1 T7 6.1391 < 0.001 FC6 3.056 0.009 T7 5.255 < 0.001
2 F4 2.8259 0.0150 AF3 2.426 0.033 O1 4.654 < 0.001
3 F8 2.6046 0.0234 O2 1.910 0.089 FC6 4.064 0.001
4 F3 1.7445 0.1210 F4 1.786 0.112 T8 3.040 0.010
5 P8 1.7439 0.1211 FC5 1.571 0.165 O2 2.835 0.015
6 FC6 1.5656 0.1663 T7 1.437 0.207 P8 2.477 0.030
*f-cal: f-value calculated from ANOVA

Moreover, the f-value and the p-value of the ANOVA were analyzed for each EEG channel, and the values of the top six EEG channels are tabulated in Tables 2, 3 and 4 for each EEG frequency band. The results show that not all of the EEG channels are statistically significant among the six emotional states. In Table 2, the LBD and RBD groups have fewer statistically significant EEG channels (p-value less than 0.05): the LBD group has three statistically significant channels and the RBD group has only two, whereas all of the top six channels of the NC group are statistically significant with p-values less than 0.05.
In Table 3, some EEG channels in the RBD group show p-values higher than 0.05 and are not statistically significant; only three EEG channels in the RBD group are statistically significant. For the LBD and NC groups, the p-values show that all the top six EEG channels are statistically significant among the emotional states in the beta band. In the gamma band, shown in Table 4, which is the highest EEG frequency band, all the top six EEG channels are statistically significant among the six emotional states in all three groups, LBD, RBD and NC.

Table 3 ANOVA results of HURST among the six emotional states in beta band

Rank   LBD                           RBD                           NC
       Channel   f-cal    p-value    Channel   f-cal    p-value    Channel   f-cal    p-value
1 T7 9.443 < 0.001 T7 3.684 0.003 T7 10.179 < 0.001
2 F7 8.916 < 0.001 F8 3.217 0.007 FC6 5.093 < 0.001
3 F4 6.530 < 0.001 FC5 3.047 0.010 T8 4.596 < 0.001
4 O2 4.995 0.000 AF3 1.907 0.090 FC5 4.402 0.001
5 P8 4.526 0.000 F7 1.822 0.105 F3 4.289 0.001
6 O1 4.327 0.001 AF4 1.669 0.139 P7 3.480 0.004
*f-cal: f-value calculated from ANOVA

Table 4 ANOVA results of HURST among the six emotional states in gamma band
Rank LBD RBD NC
Channel f-cal p-value Channel f-cal p-value Channel f-cal p-value
1 T7 9.699 <0.001 AF4 4.713 <0.001 T7 9.865 <0.001
2 F4 8.767 <0.001 F4 4.361 0.001 FC5 6.409 <0.001
3 O1 5.011 <0.001 FC5 4.226 0.001 FC6 5.478 <0.001
4 AF4 4.339 0.001 O1 3.121 0.008 T8 4.796 <0.001
5 F7 4.203 0.001 F7 2.631 0.022 P8 4.085 0.001
6 T8 3.971 0.001 P8 2.629 0.022 F3 3.667 0.003
*f-cal: f-value calculated from ANOVA

From the ANOVA results in Tables 2, 3 and 4, the NC group has the highest number of EEG channels that are statistically significant among the six emotional states, followed by the LBD group, while the RBD group has the fewest statistically significant EEG channels. Overall, the gamma frequency band is the EEG sub-band with the highest number of statistically significant EEG channels.
Furthermore, from the ranking of the EEG channels in Tables 2, 3 and 4, the EEG channel T7, which is located at the left temporal area of the scalp, ranks first for alpha (LBD and NC), beta (all groups) and gamma (LBD and NC), and also obtains the highest f-value in the beta band (NC), as shown in Table 3.

5 Conclusion

This work presented the statistical analysis of EEG channels among the emotional states of stroke patients and normal controls using HURST. The estimated HURST showed the presence of LRTC in the emotional EEG of stroke and normal subjects. In addition, the ANOVA results showed that there are more statistically significant EEG channels in the NC group, and that the gamma frequency band has the highest number of statistically significant EEG channels among the six emotional states.

Acknowledgements The author would like to acknowledge the support from the Fundamental
Research Grant Scheme (FRGS) under a grant number of FRGS/1/2019/ICT04/UNIMAP/02/1
from the Ministry of Education Malaysia.

References

1. World Health Organization (2017) Top 10 causes of death. World Health Organization
2. Robinson RG, Jorge RE (2016) Post-stroke depression: a review. Am J Psychiatry 173:221–
231
3. Cheah WK, Hor CP, Zariah AA, Looi I (2016) A review of stroke research in Malaysia from
2000–2014. Med J Malaysia 71:58–69
4. Yuvaraj R, Murugappan M, Norlinah MI, Sundaraj K, Khairiyah M (2013) Review of
emotion recognition in stroke patients. Dement Geriatr Cogn Disord 36:179–196
5. Choong WY, Khairunizam W, Omar MI, Murugappan M, Bong SZ, Shahriman AB,
Zuradzman MR, Zunaidi I (2018) Analysis of the distance metrics of KNN classifier for EEG
signal in stroke patients. In: 2018 International conference on computational approach in
smart system design and applications, Kuching, Malaysia. IEEE, pp 1–4
6. Bong SZ, Wan K, Murugappan M, Ibrahim NM, Rajamanickam Y, Mohamad K (2017)
Implementation of wavelet packet transform and non linear analysis for emotion classification
in stroke patient using brain signals. Biomed Signal Process Control 36:102–112
7. Daly I, Malik A, Hwang F, Roesch E, Weaver J, Kirke A, Williams D, Miranda E, Nasuto SJ
(2014) Neural correlates of emotional responses to music: an EEG study. Neurosci Lett
573:52–57
8. Yuvaraj R, Murugappan M, Palaniappan R (2017) The effect of lateralization of motor onset
and emotional recognition in PD patients using EEG. Brain Topogr 30:333–342
9. Thatcher RW, North DM, Curtin RT, Walker RA, Biver CJ, Gomez JF, Salazar AM (2001)
An EEG severity index of traumatic brain injury. J Neuropsychiatr 13:77–87
10. Adamaszek M, Olbrich S, Kirkby KC, Woldag H, Willert C, Heinrich A (2013) Event-related
potentials indicating impaired emotional attention in cerebellar stroke-a case study. Neurosci
Lett 548:206–211
11. Doruk D, Simis M, Imamura M, Brunoni AR, Morales-Quezada L, Anghinah R, Fregni F,
Battistella LR (2016) Neurophysiologic correlates of post-stroke mood and emotional control.
Front Hum Neurosci 10:428
12. Yuvaraj R, Murugappan M, Mohamed Ibrahim N, Sundaraj K, Omar MI, Mohamad K,
Palaniappan R (2014) Detection of emotions in Parkinson’s disease using higher order
spectral features from brain’s electrical activity. Biomed Signal Process Control 14:108–116
13. Stam CJ, Jelles B, Achtereekte HAM, Rombouts SARB, Slaets JPJ, Keunen RWM (1995)
Investigation of EEG non-linearity in dementia and Parkinson’s disease. Electroencephalogr
Clin Neurophysiol 95:309–317
14. Blinowska KJ, Rakowski F, Kaminski M, De Vico Fallani F, Del Percio C, Lizio R,
Babiloni C (2017) Functional and effective brain connectivity for discrimination between
Alzheimer’s patients and healthy individuals: a study on resting state EEG rhythms. Clin
Neurophysiol 128:667–680
15. Klonowski W (2009) Everything you wanted to ask about EEG but were afraid to get the right
answer. Nonlinear Biomed Phys 3:2

16. Gonen FF, Tcheslavski GV (2012) Techniques to assess stationarity and Gaussianity of EEG:
an overview. Int J Bioautomation 16:135–142
17. Sugimoto H, Ishii N, Iwata A, Suzumura N (1977) Stationarity and normality test for
biomedical data. Comput Programs Biomed 7:293–304
18. Sugimoto H, Ishii N, Iwata A, Suzumura N, Tomita T (1978) On the stationarity and
normality of the electroencephalographic data during sleep stages. Comput Programs Biomed
8:224–234
19. Hosseini SA, Khalilzadeh MA, Naghibi-Sistani MB, Niazmand V (2010) Higher order spectra
analysis of EEG signals in emotional stress states. In: 2010 Second international conference
on information technology and computer science, pp 60–63
20. Yuvaraj R, Murugappan M, Ibrahim NM, Omar MI, Sundaraj K, Mohamad K,
Palaniappan R, Satiyan M (2014) Emotion classification in Parkinson’s disease by
higher-order spectra and power spectrum features using EEG signals: a comparative study.
J Integr Neurosci 13:89–120
21. Teplan M (2002) Fundamentals of EEG measurement. Meas Sci Rev 2:1–11
22. Hosseini SA (2012) Classification of brain activity in emotional states using HOS analysis.
Int J Image Graph Signal Process 4:21–27
23. Yuvaraj R, Murugappan M, Ibrahim NM, Sundaraj K, Omar MI, Mohamad K, Palaniappan R
(2014) Optimal set of EEG features for emotional state classification and trajectory
visualization in Parkinson’s disease. Int J Psychophysiol 94:482–495
24. Natarajan K, Acharya RU, Alias F, Tiboleng T, Puthusserypady SK (2004) Nonlinear analysis
of EEG signals at different mental states. Biomed Eng Online 3:7
25. Hurst HE (1951) Long-term storage capacity of reservoirs. Trans Am Soc Civ Eng 116:770–799
26. Mandelbrot BB, Wallis JR (1968) Noah, Joseph, and operational hydrology. Water Resour
Res 4:909–918
27. Hosseini SA, Akbarzadeh-T M-R, Naghibi-Sistani M-B (2013) Qualitative and quantitative
evaluation of EEG signals in epileptic seizure recognition. Int J Intell Syst Appl 5:41–46
28. Lahmiri S (2018) Generalized Hurst exponent estimates differentiate EEG signals of healthy
and epileptic patients. Phys A Stat Mech Appl 490:378–385
29. Watters PA, Martin F (2004) A method for estimating long-range power law correlations from
the electroencephalogram. Biol Psychol 66:79–89
30. Lee J-M, Kim D-J, Kim I-Y, Park KS, Kim SI (2004) Nonlinear-analysis of human sleep EEG
using detrended fluctuation analysis. Med Eng Phys 26:773–776
31. Bornas X, Fiol-Veny A, Balle M, Morillas-Romero A, Tortella-Feliu M (2014) Long range
temporal correlations in EEG oscillations of subclinically depressed individuals: their
association with brooding and suppression. Cogn Neurodyn 9:53–62
32. Hartley C, Berthouze L, Mathieson SR, Boylan GB, Rennie JM, Marlow N, Farmer SF (2012)
Long-range temporal correlations in the EEG bursts of human preterm babies. PLoS ONE 7
(2):e31543
33. Sheng H, Chen Y, Qiu T (2012) Multifractional property analysis of human sleep EEG
signals. Int J Bifurc Chaos 22:1250080
34. Nikulin VV, Brismar T (2005) Long-range temporal correlations in electroencephalographic
oscillations: relation to topography, frequency band, age and gender. Neuroscience 130:549–
558
35. Mansukhani S (2012) The Hurst Exponent: Predictability of Time Series. http://analytics-magazine.org/the-hurst-exponent-predictability-of-time-series/
36. Voss J (CFA) Rescaled Range Analysis: A Method for Detecting Persistence, Randomness, or Mean Reversion in Financial Markets. https://blogs.cfainstitute.org/investor/2013/01/30/rescaled-range-analysis-a-method-for-detecting-persistence-randomness-or-mean-reversion-in-financial-markets/
37. Singh G A Simple Introduction to ANOVA. https://www.analyticsvidhya.com/blog/2018/01/anova-analysis-of-variance/
38. Lowry R Conceptual Introduction to ANOVA. http://vassarstats.net/textbook/ch13pt1.html
Examination Rain and Fog Attenuation
for Path Loss Prediction in Millimeter
Wave Range

Imadeldin Elsayed Elmutasim and Izzeldin I. Mohd

Abstract The core solution for the bandwidth congestion seen in today's radio propagation is inevitably to move to higher frequencies, which the millimeter wave bands can provide while satisfying client demands such as huge capacity and fast data rates. On the other hand, these bands are highly vulnerable to complex conditions, in particular attenuation caused by atmospheric absorption from rain, fog, foliage, oxygen and water vapor. Coverage planning depends on which propagation model is used in the various atmospheres to accomplish a reasonable and dependable system. In this paper we concentrate on the rain and fog parameters, which both represent the most significant condition variables, to help wireless network planners select and implement an appropriate model. We also compare three different models, exponential, polynomial and power, using mathematical equations and goodness-of-fit parameters to clearly describe the data curves. The outcome shows that the polynomial model is agreeable for rainy and foggy atmospheres; the methodology will be extended to other models using real data analysis to obtain comprehensive results, which will potentially be a valuable assessment for millimeter wave planners.

Keywords Millimeter Wave (mmWave) · Attenuation · Goodness of fit · Atmospheric environment

I. E. Elmutasim  I. I. Mohd (&)


College of Engineering, Universiti Malaysia Pahang (UMP), 26300 Gambang,
Pahang, Malaysia
e-mail: [email protected]
I. E. Elmutasim
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2021


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_67

1 Introduction

Radio wave propagation in different media is characteristically subject to a wide range of atmospheric environment parameters. The new era of mmWave will utilize extraordinarily high frequencies, ranging from 30 to 300 GHz, to provide huge data rates and high information capacity per second to keep up with client demands. A special aspect of mmWave is its sensitivity to climate change and the challenge of working with very small wavelengths at high frequencies. Because of these technology demands, many models need to be studied to accomplish the best performance, particularly regarding atmospheric parameters such as rain and fog, and much of the concern is about the attenuation caused by these phenomena [1]. High buildings, climate change, fog droplets and rain drops may become obstacles and cause high path loss in the high frequency bands [2].
To provide a wide scope of results that can be used in wireless research, this paper focuses on the impact of rain drop and fog attenuation at mmWave frequencies using real collected data [3] and curve fitting to analyze the data through goodness-of-fit parameters, and compares three models [4]: exponential, polynomial and power. These models can be evaluated to select the appropriate structure in various climatic circumstances, helping planners make a choice and pick the ideal model.

2 Millimeter Wave Path Loss Parameters

Before starting to plan a wireless network, one of the significant factors to consider is path loss, which depends on the spatial separation, the signal attitude [5] and the frequencies used, and which is employed in development and research to build a solid system. Researchers place great expectations on the mmWave band [6] to satisfy all client demands, since it will give enormous bandwidth and a huge data rate per second. On the other hand, these expectations face many difficulties related to mmWave characteristics, for example penetrating obstacles and high sensitivity to attenuation under changing climate conditions [7]. This complicated behavior makes telecommunication planners and designers more motivated to think about the effectiveness of environmental factors in the coming mmWave era, for example rainy and foggy weather, which clearly affect the signal through scattering and absorption [8]. Many attenuation models can be applied to study these phenomena, for instance DAH, Lognormal, SAM and ITU [1, 9]. In order to push scientific research at mmWave frequencies forward, the models strongly need to be assessed through a deep, specific review that shows the impact of the propagation parameters, especially attenuation, which is the main component at high frequencies caused by rain drops and fog [10, 11]. Attenuation has a profound effect on the whole system, especially on the link budget, as it degrades the system performance, particularly when using higher Gigahertz frequencies. Many factors affect high frequencies and influence the overall performance, for example free space path loss, attenuation due to water vapor, oxygen, fog and rain rate [12], as well as the separation between stations.
From the modeling perspective, the path loss measures the ratio between the signal radiated by the transmitter antenna and the signal at the receiver antenna; the signal should always remain stronger than the noise to be received without distortion [13]. The path loss model is

PL [dB] = 10 \log_{10}(P_T / P_R)    (1)

where P_T and P_R are the transmitted and received power respectively. The Friis equation, which assumes free space, calculates the received power as a function of distance:

P_R(d) = P_T G_T G_R (\lambda / 4\pi d)^2    (2)

where G_T and G_R are the antenna gains at the transmitter and receiver respectively, \lambda is the wavelength, and d is the spatial distance between the near end and the far end.
In the close-in (CI) path loss model [14], under the different atmospheric conditions that produce additional attenuation and a clear effect on the signals [3], the path loss as a function of frequency and distance is expressed as

PL^{CI}(f, d) [dB] = FSPL(f, 1\,m) [dB] + 10\, n \log_{10}(r) + AT [dB] + \chi_{\sigma}^{CI}, \quad r \ge 1\,m    (3)

where f is the carrier frequency in GHz, r is the 3D separation distance between transmitter and receiver, FSPL(f, 1 m) is the free space path loss in dB at a spatial distance of 1 m, n is the path loss exponent, AT is the attenuation term due to the atmosphere, and \chi_{\sigma}^{CI} is a zero-mean Gaussian random variable with a standard deviation \sigma in dB:

FSPL(f, 1\,m) [dB] = 20 \log_{10}(4\pi f \times 10^9 / c) = 32.4 [dB] + 20 \log_{10}(f)    (4)

where c is the speed of light in free space and f is the frequency in GHz. The attenuation term AT is calculated by

AT [dB] = \alpha [dB/m] \times d [m]    (5)

where \alpha is the attenuation factor in dB/m from the collective atmospheric parameters such as water vapor, fog and rain over the frequency range from 1 to 100 GHz, and d is the distance between the transmitter and receiver antennas.

Considering the attenuation factor, which has become a highly influential parameter, the path loss equation including attenuation is

PL = 32.4 [dB] + 20 \log_{10}(f) + \alpha [dB/m] \cdot d [m]    (6)

where \alpha is the attenuation due to atmospheric conditions and d is the spatial distance between transmitter and receiver. To give useful results, much attention is paid to the \alpha factor, which plays a major role in providing an accurate model in the various atmospheric conditions related to the operating frequency.
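A minimal sketch of Eqs. (3)-(5), assuming an attenuation factor alpha in dB/m supplied by the user; the 60 GHz, 200 m and alpha values below are placeholders, not the measured data of [3].

```python
import math

def fspl_1m_db(f_ghz):
    # Eq. (4): free space path loss at the 1 m reference distance
    c = 3e8
    return 20 * math.log10(4 * math.pi * f_ghz * 1e9 / c)

def path_loss_db(f_ghz, d_m, alpha_db_per_m, n=2.0):
    # Eqs. (3) and (5): CI-style path loss plus an atmospheric attenuation term
    return fspl_1m_db(f_ghz) + 10 * n * math.log10(d_m) + alpha_db_per_m * d_m

# Placeholder example: 60 GHz link over 200 m with alpha = 0.003 dB/m
print(round(fspl_1m_db(60), 1))                 # about 68.0 dB at 1 m
print(round(path_loss_db(60, 200, 0.003), 1))
```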
In this context, following [3], an analysis was made to examine propagation effectiveness at various frequencies, ranging from 1 to 100 GHz, and to study the impact of the rain and fog phenomena at high frequencies such as mmWave by fitting a curve to groups of data points. Curve fitting makes it much simpler to recognize the information contained in the data points by tracing the curve, and makes it easy for planners to estimate the model from the data entered.
The figures below use particular weather conditions, a temperature of 20 °C, humidity of 80%, a rain rate of 5 mm/h and an atmospheric pressure of 1013.25 mbar [3], applied to rain and fog respectively, with the exponential model compared against the polynomial and power models. Adding more aggregate data contributes to a mathematical model that describes the whole system and predicts its behavior through equations, which can help designers construct a reliable wireless communication network (Figs. 1, 2 and 3).
Similar to the rain attenuation models and under the same specific atmospheric conditions, fog is considered below for the Gigahertz frequencies up to 100 GHz (Figs. 4, 5 and 6).

Fig. 1 Propagation attenuation due to rain at high frequencies using expo 1 and 2

Fig. 2 Propagation attenuation due to rain at high frequencies using poly 3 and 4

Fig. 3 Propagation attenuation in rain at high frequencies using power 1 and 2

Fig. 4 Propagation attenuation in fog at high frequencies using expo 1 and 2



Fig. 5 Propagation attenuation in fog at high frequencies using poly 1 and 2

Fig. 6 Propagation attenuation in fog at high frequencies using power 1 and 2

3 Model Approach

Fitting strategies are used to construct curves that represent the mathematical functions and to assess model performance; the modeling approach has many useful features [15], such as simplifying and clearly summarizing a specific system and providing an easier way to predict the final outcomes of the system. The values gathered in the curves use the mmWave frequencies (GHz) on the x axis, while the rain and fog phenomena are represented by alfa (\alpha in dB/m) on the y axis.
The goodness of fit gives a concise comparison between values through a graphical interface, so it becomes very clear how the model relates to the whole dataset, and thus how to evaluate the model performance and accuracy class. The modeling approach uses curve fitting and goodness-of-fit tests, which can estimate the performance according to the effect of different pillars [16, 17]. Through this variety of quantities, the model performance and validity can be examined, and a reasonable model can be chosen under explicit conditions. These pillars contain the most significant quantities representing the connection with each model, which are:

3.1 R-Square

R-square quantifies the proportion of the data variation explained by the model. In this paper, a high percentage indicates a robust description and smaller differences in model variability.

3.2 RMSE

The standard error, or Root Mean Squared Error, quantifies the typical deviation of the data from the model. To increase the model validity, the values should be as close to zero as possible.

3.3 Number of Coefficient

This is the number of coefficients associated with the variables or letters in the equations. The coefficient parameters determine the size and direction of the model terms and help to choose and evaluate the fit.

3.4 Adjusted R-Square

By adding more coefficient parameters, the fitting becomes higher quality and more accurate. The value nearest to 1 indicates the best fitting.

3.5 SSE

Based on the deviation of the values, this measures the Sum of Squared Errors of the fit; the fit is most useful, with less random error, when the value is near zero.
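A minimal sketch of fitting the exponential, polynomial and power forms of Tables 1 and 3 with SciPy and computing the goodness-of-fit pillars above; the (f, alfa) data here are synthetic placeholders, not the measured values of [3].

```python
import numpy as np
from scipy.optimize import curve_fit

def goodness_of_fit(y, y_hat, n_coeff):
    sse = float(np.sum((y - y_hat) ** 2))                 # Sum of Squared Errors
    sst = float(np.sum((y - y.mean()) ** 2))
    r2 = 1 - sse / sst                                     # R-square
    n = len(y)
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_coeff)        # Adjusted R-square
    rmse = float(np.sqrt(sse / (n - n_coeff)))             # standard error (RMSE)
    return {"SSE": sse, "R2": r2, "AdjR2": adj_r2, "RMSE": rmse}

exp1 = lambda x, a, b: a * np.exp(b * x)                   # Exp 1: a*e^(bx)
power2 = lambda x, a, b, c: a * x ** b + c                 # Power 2: a*x^b + c

f = np.linspace(1, 100, 50)                                # frequency in GHz
rng = np.random.default_rng(0)
alfa = 1e-4 * np.exp(0.02 * f) + 1e-5 * rng.random(50)     # placeholder attenuation (dB/m)

for name, model, p0 in [("Exp 1", exp1, (1e-4, 0.02)),
                        ("Power 2", power2, (1e-5, 1.0, 0.0))]:
    params, _ = curve_fit(model, f, alfa, p0=p0, maxfev=20000)
    print(name, goodness_of_fit(alfa, model(f, *params), len(params)))

# Polynomial 4 (ax^4 + bx^3 + cx^2 + dx + g) via a linear least-squares fit
coeffs = np.polyfit(f, alfa, 4)
print("Poly 4", goodness_of_fit(alfa, np.polyval(coeffs, f), 5))
```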
The parameters mentioned above, together with the coefficient values, produce the outcomes that matter most to designers when choosing a reasonable mathematical model, as shown below in Tables 1 and 2. For an exact outcome for the rain model, in addition to the polynomial 1 and 2 values, polynomial 3 and 4 are specified and plotted, while the coefficient values and numbers play a major role in fitting the plot. Table 2 gives the coefficient values for the rain model.

Table 1 The mathematical models at rainy climate

Model          Equation
Exp 1          a*e^(bx)
Exp 2          a*e^(bx) + c*e^(dx)
Polynomial 1   ax + b
Polynomial 2   ax^2 + bx + c
Polynomial 3   ax^3 + bx^2 + cx + d
Polynomial 4   ax^4 + bx^3 + cx^2 + dx + g
Power 1        a*x^b
Power 2        a*x^b + c

Table 2 The mathematical coefficient values at rainy climate


Model     a            b          c           d           g
Exp 1 0.0006702 0.01823
Exp 2 0.0007466 0.01735 −2.395 0.02002
Poly 1 4.114 −0.00020
Poly 2 −3.057 7.324 −0.000793
Poly 3 −8.983 1.109 1.236 −0.000196
Poly 4 2.721 −6.613 5.033 −8.459 0.000423
Power 1 0.03645 1.011
Power 2 0.2454 0.6448 −1.066

Table 3 The mathematical models at foggy climate

Model          Equation
Exp 1          a*e^(bx)
Exp 2          a*e^(bx) + c*e^(dx)
Polynomial 1   ax + b
Polynomial 2   ax^2 + bx + c
Power 1        a*x^b
Power 2        a*x^b + c

From the fog parameter perspective, the same models are applied to give an accurate comparison with the rain results. Table 3 shows the mathematical models for the fog parameter, while the associated coefficient values are given in Table 4 below:

Table 4 The mathematical coefficient values at foggy climate

Model     a       b        c        d
Exp 1     1.673   0.03402
Exp 2     9.036   0.01276  0        0.01276
Poly 1    4.535   −7.256
Poly 2    3.015   1.369    −1.451
Power 1   1.101   1.269
Power 2   1.27    1.269    −3.844

4 Results and Discussion

The results were obtained by using particular attenuation parameters for prediction of the rain and fog phenomena at Gigahertz frequencies, and were evaluated using curve fitting of the goodness-of-fit data points with the help of the mathematical equations.
From the goodness-of-fit perspective, each value of a specific model indicates its effectiveness and validity, which is strongly related to the plot fit. Tables 5 and 6 show the goodness-of-fit values.
Comparing the Exponential, Polynomial and Power models under the particular weather conditions of rain and fog respectively, the following was observed during the analysis of the rain climate:
a- The Exponential models show high sums of squared errors (6.1785 and 6.1520 respectively), which indicates an unhelpful data fit. Furthermore, the poor Adjusted R-square values (78.42% and 75.83%) confirm the ineffective fitting, with similar explained variability (79.56% and 79.65% respectively) and a decay caused by the negative coefficient value (c = −2.395).

Table 5 The goodness of fit parameters values in rain attenuation in mmWave

Model     R-square   RMSE     No. Coeff   Adj R-square   SSE
Exp 1     0.7956     5.8587   2           0.7842         6.1785
Exp 2     0.7965     6.2008   4           0.7583         6.1520
Poly 1    0.9310     3.4035   2           0.9272         2.0850
Poly 2    0.9650     2.4963   3           0.9608         1.0594
Poly 3    0.9834     1.7731   4           0.9802         5.0301
Poly 4    0.9937     1.1307   5           0.9920         1.9176
Power 1   0.9246     3.5594   2           0.9204         2.2804
Power 2   0.9474     3.0571   3           0.9412         1.5888

Table 6 The goodness of fit parameters values in fog attenuation in mmWave

Model     R-square   RMSE     No. Coeff   Adj R-square   SSE
Exp 1     0.9210     3.9394   2           0.9166         2.7934
Exp 2     0.7318     7.6975   4           0.6816         9.4802
Poly 1    0.9674     2.5311   2           0.9656         1.1532
Poly 2    0.9956     9.5617   3           0.9951         1.5542
Power 1   0.9627     2.7082   2           0.9606         1.3202
Power 2   0.9874     1.6214   3           0.9859         4.4693

b- The linear model appears as Polynomial 1, with a higher random error in SSE (2.0850) than Polynomial 2 (1.0594), while Polynomial 2 also gives the better Adjusted R-square fit (0.9608 versus 0.9272). The standard error of Polynomial 2 (0.2496) is smaller than that of Polynomial 1 (0.3403), indicating the lower error of Polynomial 2, together with a higher explained variation (96.50% versus 93.10%), a larger number of coefficients (3) and a small decay rate (c = −0.793). For the estimations referred to earlier, the plots are not reproduced; only the related values are reported.
c- For Polynomials 3 and 4, both terms fit the model strongly, but Polynomial 4 is more useful, with a higher Adjusted R-square of 99.20% against 98.02% for Polynomial 3. The random error of Polynomial 4 is smaller and therefore better than that of Polynomial 3 (1.9176 versus 5.0301). Also, the stronger description by Polynomial 4 (99.37%) makes the model more helpful than Polynomial 3 (98.34%). Moreover, Polynomial 4 gives a lower expected error in the model (1.1307) than Polynomial 3 (1.7731), and the negative coefficient indicates the clear slope of the fitting curve.
d- Linear fitting appears again in the Power 1 model, with a larger sum of squares due to error (2.2804) compared with 1.5888 for Power 2, which has the more effective adjusted value (94.12%), a larger number of coefficients (3) and a higher percentage of data variation explained by the model (94.74% compared with 92.46% for Power 1); the decay rate is expected to increase at higher frequencies (c = −0.0009553).
When the atmosphere changes to foggy, the investigated data become:
a- Firstly, the Exponential models show clear differences between Exp 1 and Exp 2. The SSE indicates less random error in Exponential 1 (2.7934), being closer to zero than in Exponential 2 (9.4802), and the Adjusted R-square value nearest to 1 (0.9166) contributes to the better fitting, whereas Exponential 2 explains a lower percentage of the data (73.18%) than Exponential 1 (92.10%) despite its higher number of coefficients. Also, the smallest RMSE value, found in Exponential 1 (3.9394), gives a better expectation compared with the Exponential 2 value of 7.6975 for the same parameter.
b- Secondly, examination of the Polynomial models shows only a small difference in explained variability between Polynomial 2 (99.56%) and Polynomial 1 (96.74%), bearing in mind that the number of coefficients is lower in Polynomial 1. Even though Polynomial 2 has slightly more random error (1.5542), its Adjusted R-square gives the more accurate model (99.51%), which validates the usefulness of the fit. The clearly linear model of Polynomial 1 is therefore not appropriate for fitting the data.
c- Thirdly, the Power model has the least impact in atmospheric fog among the models mentioned; Power 1 and Power 2 are very close to each other graphically and mathematically. The total variation explained is 96.27% and 98.74% for Power 1 and Power 2 respectively, and the RMSE values (2.7082 and 1.6214) indicate that Power 2 has a smaller expected error than Power 1. In addition, the Adjusted R-square value of Power 2, being close to 1, illustrates a better data fit (98.59%) than Power 1 (96.06%), although with more random error (4.4693) and a larger number of coefficients.

5 Conclusion

The paper presented three unique models evaluated with goodness-of-fit parameters in order to compare the models and choose a reasonable one under explicit climate conditions. In general, the outcomes demonstrate that the Polynomial model, especially in its higher terms, is a satisfactory model for rainy and foggy atmospheres. To be more precise, the rain results for polynomial 4 can be divided into segments, each of which shows fluctuating behaviour. The curve did not completely fit the data below 30 GHz, with slight divergences until reaching the best agreement at 35 GHz, then degrading again at higher frequencies up to 70 GHz, where a robust fit clearly appears once more, although a small variation in the data fit remains visible. From the foggy atmosphere aspect, the higher Polynomial term was divided into segments in a similar way, but with more fitting points for the term 2 model until reaching the best agreement at 50 GHz, followed by slight fluctuations towards higher frequencies. Certainly, higher values of R-square and Adjusted R-square together with the number of coefficients, as well as a smaller SSE, play the key role in the data fit investigation.
Eventually, the results show that attenuation can be evaluated with good regard to the rain and fog rate at gigahertz frequencies. These results could be extremely useful when designing wireless network systems, while future work will consider other weather parameters.

References

1. Mandeep JS (2011) Modified ITU-R rain attenuation model for equatorial climate. In:
Proceeding of the IEEE international conference on space science and communication
(IconSpace), Penang
2. Rappaport T (2013) Millimeter wave mobile communications for 5G cellular: it will work!
IEEE Access 1:335–349
3. Liebe HJ, Hufford GA, Cotton MG (1993) Propagation modeling of moist air and suspended
water/ice particles at frequencies below 1000 GHz. In: AGARD, atmospheric propagation
effects through natural and man-made obscurants for visible to MM-Wave radiation, (SEE
N94-30495 08-32), 11 p
4. Elkotby H, Vu M (2017) Interference modeling for cellular networks under beamforming
transmission. IEEE Trans Wirel Commun 16(8):5201–5217
5. Sun S, MacCartney GR, Rappaport TS (2016) Millimeter-wave distance-dependent
large-scale propagation measurements and path loss models for outdoor and indoor 5G
systems. In: 10th European conference on antennas and propagation (EuCAP). IEEE
6. Roh W (2014) Millimeter-wave beamforming as an enabling technology for 5G cellular
communications: theoretical feasibility and prototype results. IEEE Commun Mag 52(2):106–
113
7. Seker C, Güneser MT, Ozturk T (2018) A review of millimeter wave communication for 5G.
In: 2nd international symposium on multidisciplinary studies and innovative technologies
(ISMSIT), Ankara
8. Golovachev Y, Etinger A, Pinhasi GA, Pinhasi Y (2018) Millimeter wave high resolution
radar accuracy in fog conditions-theory and experimental verification. Sensors (Basel) 18
(7):2148. https://doi.org/10.3390/s18072148
9. Choi, KS (2011) Trends in rain attenuation model in satellite system. In: 13th international
conference on advanced communication technology. IEEE
10. Qingling Z, Li J (2006) Rain attenuation in millimeter wave ranges. In: 7th international
symposium on antennas, propagation & EM theory, Guilin, pp 1–4
11. Golovachev Y, Etinger A, Pinhasi GA, Pinhas Y (2019) Propagation properties of
sub-millimeter waves in foggy conditions. J Appl Phys 125:151612. https://doi.org/10.1063/1.5083711
12. Hemadeh IA, Satyanarayana K, El-Hajjar M, Hanzo L (2017) Millimeter-wave communi-
cations: physical channel models, design considerations, antenna constructions, and
link-budget. IEEE Commun Surv Tutor 20(2):870–913. Second quarter 2018. https://doi.org/10.1109/COMST.2017.2783541
13. Goyal M, Prakash S, Xie W, Bashir Y, Hosseini H, Durresi A (2010) Evaluating the impact of
signal to noise ratio on IEEE 802.15.4 PHY-level packet loss rate. In: 13th international
conference on network-based information systems, Takayama
14. Sun S (2016) Propagation path loss models for 5G urban micro-and macro-cellular scenarios.
In: 83rd vehicular technology conference (VTC Spring). IEEE
15. Shen L, Yang F, Wang X, Comerford P (2016) Functional curve fitting algorithm via
multi-heterogeneous data curve. IEEE Access 5:470–479. https://doi.org/10.1109/ACCESS.2016.2643679
16. Munandar D (2015) Optimization weather parameters influencing rainfall prediction using
Adaptive Network-Based Fuzzy Inference Systems (ANFIS) and linier regression. In:
International conference on data and software engineering (ICoDSE), Yogyakarta, pp 1–6
17. Liren Z, Xin Y, Yang P, Li Z (2017) Magnetic performance measurement and mathematical
model establishment of main core of magnetic modulator. In: 13th IEEE international
conference on electronic measurement & instruments (ICEMI), Yangzhou, pp 12–16
Introduction of Static and Dynamic
Features to Facial Nerve Paralysis
Evaluation

Wan Syahirah W Samsudin, Rosdiyana Samad, Kenneth Sundaraj,


and Mohd Zaki Ahmad

Abstract This paper presents a novel facial nerve paralysis evaluation system that includes both static and dynamic features to evaluate the severity level of paralysis and to classify the type of paralysis as either an Upper Motor Neuron (UMN) lesion or a Lower Motor Neuron (LMN) lesion. Two assessments are proposed in the system, a regional assessment and a lesion assessment, which use dynamic and static features respectively. Individual score, total score and paralysis score are introduced, and experiments reveal that the proposed approach achieves up to 100% accuracy in classifying the subjects into normal and patient, the level of severity, and also the type of lesion by using the k-NN classifier. The results suggest that with more experiments and by increasing the amount of data, the system will become a great aid to clinicians in the evaluation of facial nerve paralysis and in rehabilitation programs for patients.

Keywords Facial nerve paralysis · Facial nerve evaluation · Optical flow · Gabor filter · Upper motor neuron · Lower motor neuron

W. S. W. Samsudin (✉) · R. Samad


Faculty of Electrical and Electronics Technology Engineering,
Universiti Malaysia Pahang, 26600 Pekan, Pahang, Malaysia
e-mail: [email protected]
K. Sundaraj
Faculty of Electronics and Computer Engineering, Universiti Teknikal
Malaysia Melaka, Hang Tuah Jaya, 76100 Durian Tunggal, Melaka, Malaysia
M. Z. Ahmad
Department of Otorhinolaryngology, Hospital Tuanku Ampuan Afzan,
Jalan Tanah Putih, 25100 Kuantan, Pahang, Malaysia


1 Introduction

Facial nerve paralysis (FNP) happens when any of the structures innervated by the facial nerve is disrupted. Because the pathway of the facial nerve starts from the motor cortex and covers a long distance, various causes may result in facial paralysis. The prevalent cause of this paralysis, reported in approximately 50% of cases, is Bell's palsy, an idiopathic disease category named after Sir Charles Bell (1774–1842) [1]. Patients with these paralyses suffer from serious functional, cosmetic and psychological problems and exhibit an impaired ability to communicate both verbally and non-verbally. Numbness can occur on the affected side of the face, although no actual sensory loss occurs, and the eye should be lubricated with artificial tears until the facial paralysis ends [2]. However, the most notable impact of the paralysis is its psychological effect, i.e., the patients may experience a lack of confidence and fear when interacting with others.
Facial paralysis can be categorized into two types, central facial paralysis and peripheral facial paralysis. Central facial paralysis is normally known as Upper Motor Neuron (UMN) paralysis since it results from a lesion of the central neuron and is often caused by stroke. The forehead and eyes are not distorted in UMN, but the lower half of the face is affected because of the bilateral UMN innervation received by the neurons of the upper face [3]. Peripheral facial paralysis, mostly known as Lower Motor Neuron (LMN) paralysis, is caused by a lesion of the peripheral neuron, where the upper, middle and lower parts of the face are affected, as illustrated in Fig. 1. The most common cause of acute LMN paralysis is Bell's palsy. The patient experiences partial or complete disability on one half of the face, which makes them unable to raise the eyebrows symmetrically or close the eyes tightly.
The determination of the degree of the facial lesion is crucial in specifying the cause of paralysis and suggesting further treatment and rehabilitation procedures [3, 4]. Even though the identification of the lesion is the main concern in the initial facial nerve assessment, only a single reported work has been proposed to identify the type of paralysis; that work developed a system to identify the type of paralysis based on iris segmentation [5, 6]. Moreover, to date, no study other than ours [7] has analysed the wrinkles on the forehead during the eyebrow-lifting movement as practiced in clinical applications, despite the fact that the forehead is actually the most significant area for distinguishing between UMN and LMN lesions [8, 9].
In this study, the main contribution is highlighted and presented. A novel method that includes both static and dynamic features has been introduced, where the study involves two types of assessment, a regional assessment (dynamic features) and a lesion assessment (static features). Even though many computer-aided systems have been suggested for measuring facial dysfunction and its level of severity, this study proposes an assessment that treats both the facial paralysis type and the level of severity as the classification problem. Image processing has been utilized in

Fig. 1 Two types of lesion. a Upper Motor Neuron (UMN). b Lower Motor Neuron (LMN)

existing facial paralysis assessment for many decades [10]; however, the efficiency of each method is still being evaluated, and producing a clinically usable and reliable method is still very challenging [6].

2 Subjects and Methods

The study performs an objective evaluation of facial nerve paralysis which consists
of two types of facial assessment, regional assessment (dynamic) and lesion
assessment (static) by using different types of data.
The framework of the proposed objective facial nerve paralysis evaluation
system is presented in Fig. 2. The details of each assessment will be explained throughout this section.

Fig. 2 The development of the facial nerve paralysis evaluation system

2.1 Ethical Statement

The procedures and experiments in this study were approved by the Medical Research and Ethics Committee (MREC), Ministry of Health, Malaysia (Ref. No.: KKM/NIHSEC/800-2/2/2/P13-96). The study was conducted in accordance with the Declaration of Helsinki because of the involvement of human subjects, and consent was obtained from all the participants.

2.2 Experimental Setup

Two locations were involved in the data acquisition for this study, the Laboratory of Universiti Malaysia Pahang (UMP) and the Physiotherapy and Rehabilitation Department of Hospital Tuanku Ampuan Afzan (HTAA). An additional lighting system with adequate illumination was developed in order to improve the processing of the data; the white light used in the lighting system is not harmful to the patient. Since no standard is available for the experimental setup, all the protocols for the experiment were designed according to the requirements of the research.

2.3 Data Acquisition

Since facial nerve paralysis databases are very restricted and not easily obtainable, because they involve human faces and require consent for publication, this study developed a database comprising frontal views of the subjects' faces. The data in the database are composed of two categories, static and dynamic data. Static data are the images of the cropped forehead, whereas the dynamic data come from the facial exercise videos. Both types of data are used depending on the type of assessment in the study. The study consisted of 62 recordings of normal subjects and 40 recordings of patients. The ages of the subjects ranged between 23 and 75 years, and the main etiology of the patient subjects was Bell's palsy.

2.4 Regional Assessment

The development of the regional assessment is presented in Fig. 3. Each video clip of the subjects in this regional assessment was analysed using the Kanade-Lucas-Tomasi (KLT) algorithm [11, 12]. Being a simple method that makes limited assumptions about the underlying image, the KLT method remains among the best video feature tracking algorithms [13, 14]. The method was chosen as it offers a high level of computational efficiency, accuracy and robustness.

2.5 Video Processing

For this assessment, the initial position of each landmark in the video clip was determined by manually clicking with the cursor on the anatomical points agreed upon with a medical professional, as shown in Tables 1 and 2.

Fig. 3 The development of regional assessment of facial nerve paralysis



Table 1 Landmarks selection for distance measurement [15]

Facial Exercise   Involved Muscle                  Selected Landmarks
Rise Forehead     Frontalis                        DRB1, DRB2, DLB1, DLB2
Close Eyes        Orbicularis Oculi                DRE1, DLE1
Smile             Risorius and Zygomaticus Major   DRL1, DRL2, DLL1, DLL2

Once the landmarks were specified, the KLT algorithm automatically tracks each point on the face within the Cartesian plane. Then, by using a custom-written MATLAB® application, the pixel coordinates of all landmarks are acquired as features and recorded for further investigation.
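As an illustration only, a minimal MATLAB sketch of this tracking step is given below, assuming the Computer Vision Toolbox; the video file name video.avi, the number of clicked points and the tracker settings are hypothetical and do not reproduce the authors' application.

% Hedged sketch: track manually selected facial landmarks through a video
% clip with the KLT point tracker and record their pixel coordinates.
reader = VideoReader('video.avi');            % hypothetical file name
frame  = readFrame(reader);

imshow(frame);
initialPoints = ginput(4);                    % landmarks clicked with the cursor

tracker = vision.PointTracker('MaxBidirectionalError', 2);
initialize(tracker, initialPoints, frame);

coords = {};                                  % landmark coordinates per frame
while hasFrame(reader)
    frame = readFrame(reader);
    [points, validity] = tracker(frame);      % KLT update for this frame
    coords{end+1} = points(validity, :);      %#ok<AGROW>
end
release(tracker);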

Table 2 Landmarks selection for area measurement [15]

Facial Exercise   Involved Muscle                  Selected Landmarks
Rise Forehead     Frontalis                        DRB3, DRB4, DLB3, DLB4
Close Eyes        Orbicularis Oculi                DRE1, DRE2, DLE1, DLE2
Smile             Risorius and Zygomaticus Major   DL1, DL2

2.6 Motion Analysis Using Distance and Area Computation

Two parameters, the Euclidean distance and the ellipse area, were examined in the study by using the following formulas respectively:

D = √[(x2 − x1)² + (y2 − y1)²]   (1)

A = π·D1·D2   (2)

Table 3 The characteristics of each measurement [15]


Facial exercise   Distance measurement   Area measurement              Characteristics
Rise forehead     DRB = DRB1 + DRB2      ARB = π·(DRB3/2)·(DRB4/2)     Right brow
                  DLB = DLB1 + DLB2      ALB = π·(DLB3/2)·(DLB4/2)     Left brow
Close eyes        DRE = DRE1             ARE = π·(DRE1/2)·(DRE2/2)     Right eye
                  DLE = DLE1             ALE = π·(DLE1/2)·(DLE2/2)     Left eye
Smile             DRL = DRL1 + DRL2      AL = π·(DL1/2)·(DL2/2)        Right lip
                  DLL = DLL1 + DLL2                                    Left lip

The x and y denoted in Eqs. 1 and 2 are the pixel coordinates of each landmark. The distance and area are quantified from these values for all the facial exercises, as shown in Table 3. Both distance and area are measured on both sides of the face in order to find the difference between them. Since the most common cause of facial paralysis is Bell's palsy, which affects only one side of the face, the results of patients will differ from those of normal subjects, whose facial function is normal on both sides. Even by naked-eye inspection, the patient has some trouble elevating the eyebrows, even in mild paralysis cases. Furthermore, in a severe condition, the patient is almost unable to close the eyelids, and a maximum smile cannot be executed because the corner of the mouth droops. Hence, the distance and area of the eyebrows, eyelids and mouth are appropriate measurements for finding the difference between normal subjects and patients.
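A minimal MATLAB sketch of these two measurements is shown below; the landmark coordinates and axis lengths are hypothetical values used only to illustrate Eqs. 1 and 2.

% Hedged sketch: Euclidean distance (Eq. 1) between two landmarks and the
% ellipse area (Eq. 2) from two axis lengths. All values are hypothetical.
p1 = [120, 210];                               % landmark 1 (x1, y1) in pixels
p2 = [168, 190];                               % landmark 2 (x2, y2) in pixels

D  = sqrt((p2(1) - p1(1))^2 + (p2(2) - p1(2))^2);   % Eq. 1

D1 = 42;                                       % first axis of the ellipse (pixels)
D2 = 18;                                       % second axis of the ellipse (pixels)
A  = pi * D1 * D2;                             % Eq. 2

fprintf('Distance = %.2f px, area = %.2f px^2\n', D, A);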

2.7 Quantitative Assessment of Facial Nerve Paralysis

An individual score was assigned to both sides of the face based on the score chart presented in Table 4, with the objective of devising a quantitative tool for the evaluation of facial nerve paralysis.
This score chart was generated after repeated trials with normal subjects, whereby a corresponding score was assigned based on the maximum possible movements of the landmarks; this was determined after numerous discussions with the medical professional. These scores are assigned based on the calculated difference between the initial and maximum movement of the landmarks, as shown in Eqs. 3 and 4 for distance and area respectively:

Table 4 Individual score chart [15]

Percentage value (%)   Score assigned
<1                     0
1–5                    1
6–10                   2
11–15                  3
16–20                  4
21–25                  5
26 and above           6

Ddiff = Dmax − D0   (3)

Adiff = Amax − A0   (4)

Then, the percentage value of the difference in distance and area was calculated using Eqs. 5 and 6. These values are crucial in the analysis for determining the paralysed side of the face and then distinguishing normal from patient subjects. The individual score chart was also developed from these values.

Percentage value, Ddiff (%) = (Ddiff / D0) × 100%   (5)

Percentage value, Adiff (%) = (Adiff / A0) × 100%   (6)
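The following minimal MATLAB sketch illustrates how Eqs. 3 and 5 can be combined with the score chart of Table 4 for one landmark distance; the rest and maximal values are hypothetical, and the band edges simply encode Table 4.

% Hedged sketch: percentage change of a landmark distance (Eqs. 3 and 5)
% mapped to the individual score chart of Table 4. D0 and Dmax are
% hypothetical rest and maximal-movement distances.
D0   = 55.0;                        % distance at rest (pixels)
Dmax = 63.5;                        % distance at maximal movement (pixels)

Ddiff    = Dmax - D0;               % Eq. 3
DdiffPct = Ddiff / D0 * 100;        % Eq. 5

edges = [1 6 11 16 21 26];          % lower bounds of the bands in Table 4
score = sum(DdiffPct >= edges);     % 0 when below 1 %, up to 6 when 26 % or more
fprintf('Change = %.2f %%, individual score = %d\n', DdiffPct, score);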

Finally, a total score can be obtained after all the facial exercises have been conducted successfully. The total score demonstrates the severity level of the paralysis. A paralysis score chart was designed and developed based on the total scores, as shown in Table 5, with the agreement of the medical professional. Referring to the table, the range of the total score for normal facial function is 22–36, which corresponds to the lowest grade, level I, of the House-Brackmann score. If a subject scores lower than 22, then he/she is classified as a patient with abnormal facial function at a different level of severity.
In this study, the k-nearest neighbour (k-NN) method was applied to classify the subjects into normal and patients with different levels of severity of paralysis. The percentage values of the distance and area are the input features to the classifier. The k-NN is a classifier that labels objects based on the closest training examples in the feature space. Previous research has shown that this algorithm is one of the simplest of all machine learning algorithms and that there is great potential in applying the k-NN classifier to identify various types of image data [16, 17]. There are 62 normal data and 40 patient data with different levels of severity of paralysis. Table 6 shows the data distribution for the training and testing used in the analysis of the study.

Table 5 Paralysis score chart [15]


House-Brackmann grade Descriptions Total score
I Normal 22–36
II Slight dysfunction 11–21
III Moderate dysfunction 7–10
IV Moderate severe dysfunction 4–6
V Severe dysfunction 1–3
VI Total paralysis 0

Table 6 Distribution of training and testing data

Types of video data   Training data   Testing data
Normal                42              20
Patient               29              11
Total                 71              31

Table 7 Classification performance of individual score using distance

Value k   Accuracy (%)
          Rise forehead   Close eye   Smile
k = 1     93.55           100.00      100.00
k = 2     93.55           100.00      100.00
k = 3     100.00          100.00      100.00
k = 4     93.55           100.00      100.00
k = 5     93.55           100.00      100.00

3 Results and Discussions

The analysis was done by varying the value of the parameter k from 1 to 5, with the Euclidean distance as the distance metric. Table 7 shows the accuracy for each investigated region; the results are obtained from the testing data. Since the distance parameter showed better results than the area parameter [15], only the distance results are presented for the classification of the individual score, as shown in Fig. 4.
As shown in Fig. 4, using k = 3 produces the best result, with 100% accuracy for all the facial exercises. However, the other values of k also show promising results in classifying the subjects based on the individual score.
As mentioned before, the total score is the sum of all the individual scores from each region of the face. Applying k = 3 again, the results presented in Fig. 5 show the highest accuracy of 87.1% for the area and 64.52% for the distance in classifying the subjects based on their total score. From these total scores, the subjects are classified into the normal and patient categories and also by the level of the House-Brackmann score. For the total score, the results of the area parameter are more promising than those of the distance in classifying the subjects.
The paralysis score, which is based on the House-Brackmann scale, is assigned from the total score as mentioned before. Based on Fig. 6, k = 1 and k = 2 show higher percentage accuracies in classifying the patients from the total score than the other values of k; here, k = 3 shows the lowest accuracy and is not preferable in this case. Hence, the results indicate that the appropriate value for this study is k = 1 because it shows promising results throughout the analyses. Furthermore, the classification performances show that k-NN is also capable of achieving high

Fig. 4 Classification performance of individual score (distance parameter) for rise forehead, close eye and smile at k = 1 to 5

Fig. 5 Classification performance of total score (distance and area parameters) at k = 1 to 5

Fig. 6 Classification performance of paralysis score (distance and area parameters) at k = 1 to 5

accuracy in identifying the subjects as normal or patient, including the level of severity of the paralysis. The reference data (ground truth) used in the results are the assessment outcomes obtained from the medical professional, consisting of the type of paralysis and the level of severity based on the House-Brackmann (HB) score.

3.1 Lesion Assessment

The development of the facial lesion assessment is presented in Fig. 7. The whole process in this type of assessment starts with facial images of the subjects; cropping is then performed to produce forehead-only images. As illustrated in Fig. 2 for the whole system, the lesion assessment only proceeds with the patient subjects who were successfully classified in the regional assessment. However, due to the limited source database, the data used for the UMN lesion in this study are taken from normal subjects, since a patient with a UMN lesion exhibits the same criteria as a normal subject, where the forehead is spared and able to be lifted

Fig. 7 The development of lesion assessment of facial nerve paralysis



symmetrically when raising the forehead. This is the difference between UMN and LMN lesions: in LMN cases, the patient is not able to wrinkle the forehead on the paralysed side or raise it symmetrically.

3.2 Image Processing

All the facial images of the subjects are cropped to the forehead region only. Both the resting condition of the forehead, without any movement, and the maximal condition of the forehead during eyebrow raising were taken for comparison.
The Gabor filter was then applied to all the images. The Gabor filter, named after the physicist Dennis Gabor, has been used extensively in many applications such as texture segmentation and classification, face recognition, edge detection and fingerprint recognition.
The Gabor wavelets of the Gabor filter have been acknowledged as one of the most favourable local feature extraction methods for representing the face [18]. The Gabor wavelet family has 40 filters; however, not all the wavelets are required to extract the features [19]. Hence, five scales and eight orientations of the Gabor filter were considered in this study.

3.3 Lesion Analysis Using Gabor Features Computation

Two main features were computed and extracted from the Gabor response in this study. Both features, Local Energy (LE) and Mean Amplitude (MA), were extracted for the two sides of the forehead to find the difference between them. The computation of the percentage difference of the LE and MA values between the right and left halves of the forehead is illustrated in Fig. 8.
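As a rough illustration, the MATLAB sketch below (Image Processing Toolbox assumed) computes one Gabor magnitude response for a forehead image and derives Local Energy and Mean Amplitude for the two halves; the file name, wavelength, orientation and the exact LE/MA definitions are assumptions and do not reproduce the paper's Scale 2 / Orientation 5 configuration.

% Hedged sketch: LE and MA of one Gabor response for the right and left
% halves of a cropped forehead image. All parameters are placeholders.
I = im2double(imread('forehead.png'));        % hypothetical grayscale image

[mag, ~] = imgaborfilt(I, 8, 112.5);          % Gabor magnitude response

half  = floor(size(mag, 2) / 2);
right = mag(:, 1:half);                       % right half of the forehead
left  = mag(:, half+1:end);                   % left half of the forehead

LE = [sum(right(:).^2), sum(left(:).^2)];     % local energy, right / left
MA = [mean(right(:)),   mean(left(:))];       % mean amplitude, right / left

pctDiffLE = abs(diff(LE)) / max(LE) * 100;    % percentage difference (LE)
pctDiffMA = abs(diff(MA)) / max(MA) * 100;    % percentage difference (MA)
fprintf('LE diff = %.2f %%, MA diff = %.2f %%\n', pctDiffLE, pctDiffMA);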

3.4 Quantitative Assessment of Types of Lesion

The k-NN classifier is also used in this assessment, to classify whether the patient has a UMN lesion or an LMN lesion. The analysis was again done by varying the value of the parameter k from 1 to 5, with the Euclidean distance as the distance metric. Based on our previous research [7], Scale 2 and Orientation 5 of the Gabor filter were chosen as giving the best assessment results.
From the results in Fig. 9, k = 1 to k = 4 show the same performance, with 100% accuracy in classifying the UMN and LMN lesions when applying the Local Energy feature and 96.7% when applying the Mean Amplitude feature as the input to

Fig. 8 Flow of percentage difference computation [7]

Fig. 9 Classification performance of lesion assessment (Local Energy and Mean Amplitude) at k = 1 to 5

the k-NN classifier. Even though the Local Energy feature performs better, the results demonstrate that both Gabor features are good at determining the facial lesion.

4 Conclusions

The main contribution of this study is the development of an objective system for a complete assessment of facial nerve function. However, much work remains to be done in this research area. The number of patients should be increased, especially for the UMN lesion, by cooperating and exchanging data with other

research sites and hospitals. An increase in patient data may improve the accuracy of the system, and a public database should be developed to encourage more interest in this research area.

Acknowledgements The research is funded by Fundamental Research Grant Scheme FRGS/1/


2016/TK04/UMP/02/1 and Universiti Malaysia Pahang (UMP). The authors would like to thank
the Medical Research Committee (MREC) of Malaysia for providing the ethical approval for data
collection (Ref. No.: NMRR-12-1195-14375). Finally, thanks also go to the Ministry of Higher
Education of Malaysia (MoHE) for the financial support.

References

1. Singhi P, Jain V (2003) Bell’s palsy in children. Semin Pediatr Neurol 10(4):289–297
2. Piercy J (2005) Bell’s palsy. BMJ 330(7504):1374
3. Scully C (2013) 20 - Sensory and motor changes. In: Oral and maxillofacial medicine, 3rd
edn. Churchill Livingstone, pp 145–153
4. Finsterer J (2008) Management of peripheral facial nerve palsy. Eur Arch Otorhinolaryngol
265(7):743–752
5. Barbosa J, Seo WK, Kang J (2019) paraFaceTest: an ensemble of regression tree-based facial
features extraction for efficient facial paralysis classification. BMC Med Imaging 19(30):1–14
6. Barbosa J et al (2016) Efficient quantitative assessment of facial paralysis using iris
segmentation and active contour-based key point’s detection with hybrid classifier. BMC Med
Imaging 16(23):1–18
7. Samsudin WSW, Samad R, Ahmad MZ, Sundaraj K (2019) Forehead lesion score for facial
nerve paralysis evaluation. In: IEEE international conference on automatic control and
intelligent systems (I2CACIS), Shah Alam, Malaysia. IEEE, pp 102–107
8. Hira HS (2011) Clinical examination: a practical guide in medicine. Jaypee Brothers Medical
Publishers, New Delhi, pp 245–256
9. Masterson L, Vallis M, Quinlivan R, Prinsley P (2015) Assessment and management of facial
nerve palsy. BMJ Br Med J 351(8025):h3725
10. Samsudin WS, Sundaraj K (2012) Image processing on facial paralysis for facial
rehabilitation system: a review. In: IEEE international conference on control system,
computing and engineering, Penang, Malaysia. IEEE, pp 259–263
11. Lucas BD, Kanade T (1981) An iterative image registration technique with an application to
stereo vision. In: Proceedings DARPA image understanding workshop, pp 121–130
12. Tomasi C, Kanade T (1991) Tracking of point features. Technology report CMU-CS-91-132.
Carnegie Mellon University, pp 2–3
13. Ali S, Shah M (2007) A Lagrangian particle dynamics approach for crowd flow segmentation
and stability analysis. In: IEEE conference on computer vision and pattern recognition,
Minneapolis, USA. IEEE, pp 1–6
14. Hu M, Ali S, Shah M (2008) Detecting global motion patterns in complex videos. In: 19th
international conference on pattern recognition, ICPR, Tampa, USA. IEEE, pp 1–5
15. Samsudin WSW, Sundaraj K, Ahmad A, Salleh H (2016) Initial assessment of facial nerve
paralysis based on motion analysis using an optical flow method. Technol Health Care 24
(2):287–294
16. Samsudin WSW, Samad R, Sundaraj K, Ahmad MZ, Pebrianti D (2019) Regional assessment
of facial nerve paralysis using optical flow method. In: Proceedings of the 10th national
technical seminar on underwater system technology. Springer, Singapore, pp 505–514

17. Heroux P, Diana S, Ribert A, Trupin E (1998) Classification method study for automatic form
class identification. In: Proceedings of fourteenth international conference on pattern
recognition, Australia. IEEE, pp 926–928
18. Ismail I, Ramadan M, El Danf T, Samak A (2008) Multiresolution Fourier-wavelet descriptor
for fingerprint recognition. In: 2008 international conference on computer science and
information technology, Singapore. IEEE, pp 951–955
19. Zhao W, Park J, Lee S (2011) Fully automatic face detection and facial feature point’s
extraction using local Gabor filter bank and PCA. In: International conference on machine
learning and cybernetics, Guilin, China. IEEE, pp 1789–1792
Offline EEG-Based DC Motor Control
for Wheelchair Application

Norizam Sulaiman,
Nawfan Mohammed Mohammed Ahmed Al-Fakih,
Mamunur Rashid, Mohd Shawal Jadin, Mahfuzah Mustafa,
and Fahmi Samsuri

Abstract A brain-computer interface (BCI) connects the human brain with a computer and allows people with physical disabilities to operate different electronic devices with the help of brain waves. The process can be performed without any physical touch, so the system provides easy use and operation of devices by disabled people. It is suited to people who have no control over their muscles and therefore cannot use peripheral devices directly. In addition, because this technology is user friendly and low cost, it has recently been gaining popularity. The applications of BCI are very wide and cover medical and non-medical fields; for instance, playing games, BCI spellers, cursor control, social interaction by detecting emotions, robotic arm control, wheelchair control, home appliance control and smart phone operation using electroencephalogram (EEG) signals are all applications of BCI technology. In this research, the possibility of DC motor control using a single-channel EEG headset has been investigated. The research aims to find the best EEG features and classifier so that the output of the classifier can provide a correct device command to control the DC motor movement. Here, an EEG feature in terms of power spectral density has been extracted and classified using a support vector machine (SVM), with a classification accuracy of 92%. The classified EEG features were then translated into three device commands to control the direction of the DC motor, which can be driven in three directions, namely forward, right and left. The entire process, from data collection by the EEG headset to sending commands to the DC motor, was done wirelessly. The multiple directions of the DC motor will enhance the wheelchair application for disabled people.

Keywords BCI · EEG · Motor control · EEG features · EEG classification · Device command

N. Sulaiman (✉) · N. M. M. A. Al-Fakih · M. Rashid · M. S. Jadin · M. Mustafa · F. Samsuri


Faculty of Electrical and Electronics Engineering Technology, Universiti Malaysia Pahang,
Pahang, Malaysia
e-mail: [email protected]


1 Introduction

The human brain is the control unit of the entire body. Neuron activity in the brain due to changes of cognitive or neural state produces electrical signals, and these signals can be measured using a device called an electroencephalogram (EEG); brain signals are therefore often called EEG signals. EEG signals can be used to study brain-related diseases, mental condition and intelligence quotient (IQ), and to allow humans to communicate with machines, which is currently known as the Brain-Computer Interface (BCI) and has become an emerging technique in the biomedical engineering field [1]. However, the human brain can be affected by some severe neuromuscular disorders, for example amyotrophic lateral sclerosis (ALS), spinal cord injury and brainstem stroke [2]. Fundamentally, the exact cause of ALS is unknown, and there is no cure. ALS starts with muscle weakness and atrophy. Generally, all voluntary movement, such as swallowing, breathing, walking and speaking, deteriorates over many years and is ultimately lost completely. However, the disease has no impact on cognitive functions or sensations: people can still see, hear, and understand what is happening around them, but cannot control their muscles. This is because ALS only affects special neurons, the large alpha motor neurons, which are an integral part of the motor pathways. This leads to life in the locked-in condition. Once the motor pathway is lost, any natural way of communicating with the environment is also lost [3, 4]. Any natural form of communication demands peripheral nerves and muscles. The process starts with the user's intent; this intent triggers a complex process in which certain brain areas are activated, and signals are transferred via the peripheral nervous system (specifically, the motor pathways) to the corresponding muscles, which in turn perform the movement necessary for the communication or control task. Here, BCI aims to provide an alternative communication system by offering the human brain a way to control a machine or device without involving any muscle movement.
Basically, BCI consists of five elements, namely signal acquisition, signal pre-processing, feature extraction, feature classification and device command, as illustrated in Fig. 1. The quality of a BCI application depends on the device command; thus, the EEG feature classification process is crucial for providing the correct command to control the direction of the DC motor that moves the wheelchair.

Fig. 1 Elements of BCI



An EEG feature is a unique quantity extracted from the raw EEG signals to indicate the neural state after the human brain has been stimulated with certain exercises.
EEG signals consist of a set of signals that can be categorised by their frequency. The well-known frequency bands are described according to their distribution over the scalp or their biological significance. These frequency bands are, from low to high: Delta (δ), Theta (θ), Alpha (α), Beta (β) and Gamma (γ) [5], as depicted in Table 1. The Delta band exists below 4 Hz; the amplitude of these signals measured in babies decreases as they age. Delta rhythms are generally measured in adults in a deep sleep state and are unusual in adults in an awake or alert state. A large quantity of Delta activity in awake, normal adults is abnormal and is associated with neurological disease [6]. Because of their low frequency, Delta waves are easily confused with artifact signals or biological noise induced by the large muscles of the neck or jaw. The range of Theta waves is between 4 and 7 Hz. In a typical awake adult, only a small amount of Theta activity can be captured; larger amounts of Theta activity may be found in young children, older children and adults in meditative, drowsy or sleep states [6]. As with Delta waves, a large amount of Theta activity in awake, normal adults can be associated with neurological disease. The Theta band has been associated with meditative concentration [7] and a wide range of cognitive processes such as mental calculation [8], maze task demands [9] and conscious awareness [10]. Alpha rhythms are measured over the occipital region of the brain [11].
The range of these waves lies between 8 and 12 Hz. The wave amplitude increases when the eyes close and the body relaxes, and attenuates when the eyes open and mental effort is made [12]. These rhythms fundamentally reflect visual processing in the occipital brain region and may likewise be related to brain function [11]. Additionally, there is evidence that Alpha activity may be related to mental effort: increasing mental effort creates a suppression of Alpha activity, especially in the frontal regions [13]. Consequently, these rhythms may be valuable signals for detecting mental effort. Mu rhythms may also be observed in the same range as Alpha rhythms, even though there are important physiological differences between them. Unlike Alpha rhythms, Mu rhythms are strongly associated with motor activities and sometimes appear to correlate with the Beta rhythm [13–16]. Beta rhythms, which exist between 12 and 30 Hz, are measured in the frontal and central regions of the brain and are related to motor activities; these rhythms are desynchronized during real movement or motor imagery [6]. Beta waves are characterized by their symmetrical
Table 1 Frequency range and activity of EEG bands

Band    Frequency         Activity
Delta   0.5–4 Hz          Deep Sleep
Theta   4–8 Hz            Drowsiness, Light Sleep
Alpha   8–13 Hz           Relaxed
Beta    13–30 Hz          Active Thinking, Alert
Gamma   More than 30 Hz   Hyperactivity

distribution without any motor activity. With active movement, the Beta waves
attenuate, and their symmetrical distribution changes [17, 18]. The existence of
Gamma waves in the brain activity of a healthy adult is associated with certain
motor functions or perceptions, among others [19–22].
The main objective of this research is to search for the best EEG features and classifier so that the classifier output can be converted into correct device commands to control the DC motor. Here, the EEG power spectral density is selected as the EEG feature, while the Support Vector Machine (SVM) is chosen as the classifier for the selected feature. An algorithm is constructed in MATLAB to analyse the EEG data, select the feature, classify it and convert the result into a machine command.

2 Materials and Methods

Figure 2 shows the complete flow chart of the research, where the first step was subject selection for EEG data collection. A suitable EEG capturing device was selected, the 1-channel Neurosky Mindwave EEG headset, and a measurement protocol was designed to capture the EEG data in the right manner. After the data collection is completed, the next step is to remove biological and instrument noise from the EEG data in order to obtain clean EEG data.
The clean EEG data are then converted into their frequency spectrum using the Fourier Transform, filtered and split into their frequency bands. Next, the most suitable EEG feature is extracted according to the literature review, and the selected features are classified using the selected machine learning algorithms. After classifying the selected EEG feature, a translational algorithm is constructed to convert the classifier output into device commands. Finally, a prototype is designed and tested.

2.1 Data Acquisition and Measurement Protocol

The first step of the BCI design is data acquisition. The Neurosky Mindwave EEG headset was used for collecting the raw EEG data in this research. This headset contains one electrode, placed on the FP1 area of the scalp, and a reference electrode clipped to the ear lobe. The EEG amplifier captures the raw EEG data at a 512 Hz sampling rate. The EEG data were collected from seven subjects; Table 2 shows their description.

Fig. 2 Flowchart of proposed research work

Table 2 Subjects for EEG data collection

Subject     Sex      Age   Subject     Sex      Age
Subject-1   Male     25    Subject-5   Male     19
Subject-2   Female   23    Subject-6   Female   19
Subject-3   Male     21    Subject-7   Female   20

Fig. 3 Raw EEG data acquisition procedure

After subject selection, a measurement protocol was designed as shown in Fig. 3. Here, an Android mobile app called eegID on the mobile phone and the Neurosky Mindwave are paired through Bluetooth.
Three classes of mental exercise have been considered in this research, namely relax (do nothing), quick math solving and playing a game. During the data collection of the relax state, subjects were asked to sit on the chair comfortably and avoid any body movement. During quick math solving, subjects were asked to solve randomly presented math problems as quickly as possible from the website in [11]. During game playing, subjects were asked to play an Android racing game on a smart phone. The duration of every measurement was one minute.

2.2 Analysis of EEG Signals: Feature Extraction Technique

In order to get an excellent classification result, an effective feature extraction technique is necessary [15]. To make a decision about the classification technique, it is essential to know what the features are, what their application is and in which way they may help classification. There are quite a number of different feature sets for different designs and applications of BCI systems, such as Principal Component Analysis (PCA), Independent Component Analysis (ICA), Band Power (BP), Power Spectral Density (PSD) values, Spectral Centroid, Common Spatial Filters (CSF), Wavelet Transformations (WT), Wavelet Packet Decomposition (WPD), Auto Regressive (AR) and Adaptive Autoregressive (AAR) parameters and the Fast Fourier Transform (FFT) [16]. Applying combinations of features to EEG signals is a new development in BCI research; several experiments show that a combined feature set produces a better classification rate than using the features independently [17]. The average power spectral density of the EEG alpha, beta and gamma bands has been analysed as the EEG feature in this research.

Power Spectral Density


With the help of the Fast Fourier Transform (FFT), the power spectral density of the EEG data has been calculated. Equations 1 and 2 describe the FFT formula, while Eq. 3 gives the formula for the power spectral density.

X(k) = Σ_{n=0}^{N−1} x(n) · W_N^{kn},  k = 0, …, N − 1   (1)

W_N = e^{−j2π/N}   (2)

PSD = |X(k)|² = |Σ_{n=0}^{N−1} x(nTs) · e^{−j2πnk/N}|²   (3)

where each value of k requires N complex multiplications, since k = 0, 1, …, N − 1, and the multiplication of x(n) and W_N^{kn} is performed N times, since n = 0 to N − 1.
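A minimal MATLAB sketch of this computation is given below; the one-minute random signal, the band edges and the upper gamma limit of 45 Hz are assumptions used only to illustrate Eqs. 1–3 and the band-averaging step.

% Hedged sketch: average power spectral density of the alpha, beta and
% gamma bands from one minute of EEG sampled at 512 Hz (Eqs. 1-3).
fs  = 512;                          % sampling rate of the headset (Hz)
eeg = randn(60*fs, 1);              % hypothetical one-minute recording

N   = numel(eeg);
X   = fft(eeg);                     % Eq. 1: DFT of the samples
psd = abs(X).^2;                    % Eq. 3: squared magnitude spectrum
f   = (0:N-1)' * fs / N;            % frequency axis (Hz)

names = {'alpha', 'beta', 'gamma'};
lims  = [8 13; 13 30; 30 45];       % band edges in Hz (45 Hz is an assumed cap)
for b = 1:numel(names)
    avgPsd = mean(psd(f >= lims(b, 1) & f < lims(b, 2)));
    fprintf('%-5s band: average PSD = %.4f\n', names{b}, avgPsd);
end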
Classification
In this work, LDA, SVM and KNN have been used to classify the selected features.
Each classifier has been applied to the extracted feature to find the best result. The SVM classifier was selected after producing the highest classification rate for the selected EEG feature in terms of the power spectrum.
Linear Discriminant Analysis (LDA)
LDA is deployed to find the linear combinations of feature vectors that describe the characteristics of the corresponding signal. LDA seeks to separate two or more classes of objects or events and utilizes hyperplanes to accomplish this mission. The separating hyperplane is obtained by searching for the projection through which the distance between the class means is maximized and the intraclass variance is minimized. This technique has a very low computational requirement and is simple to use. As a result, LDA has been used with success in a great number of BCI systems, including motor imagery based BCI, multiclass BCI and P300 [16].

Support Vector Machine (SVM)


SVM is an algorithm belonging to a category of classification methods that use supervised learning to separate two different classes of data. It uses a discriminant hyperplane to determine the classes, as LDA does. However, in SVM the selected hyperplane maximizes the margin from the nearest training points. This optimal hyperplane is described by the vectors that lie on the margin, which are called support vectors. Such an SVM enables classification using linear decision boundaries and is called a linear SVM; this classifier has been applied to a relatively large number of synchronous BCI problems [17]. However, it is possible to create nonlinear decision boundaries using a kernel function. The kernel commonly used in BCI research is the Gaussian or Radial Basis Function (RBF) kernel, and the corresponding SVM is called a Gaussian SVM or RBF SVM [17].

K-Nearest Neighbors (KNN)


The aim of this technique is to assign to an unseen point the dominant class among its k nearest neighbours within the training set [12]. For BCI designs, these nearest neighbours are generally obtained using a distance metric. With a sufficiently high value of k and enough training samples, k-NN can approximate any function, which allows it to develop nonlinear decision boundaries. The main advantage of the k-NN algorithm is its simplicity.

3 Results and Discussion

The raw EEG data captured from two subjects after performing the given tasks (solving mathematical problems, playing a game and doing nothing (relax mode)) are illustrated in Fig. 4, while Fig. 5 shows the EEG data in the frequency domain. Figures 6 to 9 show the EEG data in the time domain (Figs. 6 and 7) and in the frequency domain (Figs. 8 and 9) for subjects 1 and 2.

Fig. 4 EEG Raw data in time domain for three types of exercises

Fig. 5 EEG Raw data in frequency domain for three types of exercises

Fig. 6 EEG raw data in time domain for three types of exercises for subject 1

Fig. 7 EEG raw data in time domain for three types of exercises for subject 2

Fig. 8 EEG raw data in frequency domain for three types of exercises for subject 1

Fig. 9 EEG raw data in frequency domain for three types of exercises for subject 2

Fig. 10 EEG power spectral density versus the given tasks for subjects 1 and 2

Fig. 11 Confusion matrix of the SVM classification of the EEG power spectrum

The selected EEG feature, the power spectral density, versus the given tasks (solving mathematical problems, playing a game and doing nothing) is shown in Fig. 10. According to Fig. 10, the Beta band of both subjects is higher than the other bands when solving mathematical problems; however, the relax-mode activity of subject 2 is higher than that of subject 1.
The classification of the EEG power spectrum using SVM with a training-to-testing ratio of 70:30 shows that the selected feature, the Power Spectral Density, is able to produce a high classification accuracy of 94%, as shown by the confusion matrix in Figs. 11 and 12.
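The sketch below shows, in MATLAB, one way such a 70:30 SVM evaluation with a confusion matrix could be reproduced; the feature file, the use of an RBF kernel inside a multiclass ECOC wrapper and the categorical label format are assumptions, since the paper does not state these implementation details.

% Hedged sketch: multiclass SVM (ECOC wrapper) on the PSD features with a
% 70:30 hold-out split and a confusion matrix. X and y are hypothetical.
load('eegFeatures.mat', 'X', 'y');            % hypothetical feature file

cv       = cvpartition(y, 'HoldOut', 0.30);   % 70 % training, 30 % testing
trainIdx = training(cv);
testIdx  = test(cv);

mdl   = fitcecoc(X(trainIdx, :), y(trainIdx), ...
                 'Learners', templateSVM('KernelFunction', 'rbf'));
ypred = predict(mdl, X(testIdx, :));

acc = mean(ypred == y(testIdx)) * 100;        % assumes categorical labels
fprintf('Testing accuracy = %.2f %%\n', acc);
confusionchart(y(testIdx), ypred);            % confusion matrix plot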

Fig. 12 The true positive and false negative rate which achieved at 94% rate

Fig. 13 Overall block diagram to control the DC motor using EEG features

Fig. 14 DC motor attached to wheels

Thus, the overall block diagram for controlling the DC motor is shown in Fig. 13. Here, a Bluetooth module is employed to send the captured EEG signals from the EEG device to the EEG application on the mobile phone. Next, the EEG data are sent from the mobile phone to the computer so that they can be analysed in MATLAB. From the analysis in MATLAB, the best EEG feature, the Power Spectral Density with 94% classification accuracy, is selected, converted into a device command and downloaded to the Arduino microcontroller. A motor driver is used to control the speed of the 6 V DC motor. For testing purposes, the DC motor is attached to the wheels as shown in Fig. 14. The results show that the DC motor attached to the wheels can successfully move the wheels forward, right and left after receiving instructions from the Arduino microcontroller. Here, a translational algorithm was constructed to convert the classification output into several classes representing the mental exercises and to determine the direction of the DC motor, as shown in Table 3 below (a sketch of this command mapping is given after the table).

Table 3 Setting of Arduino Microcontroller


Mental exercises   Output of classification   Microcontroller Pin 10 & 11 setting   DC motor direction
Relax              1                          High                                  Forward
Math solving       2                          High & Low                            Right
Gaming             3                          Low & High                            Left
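As a sketch of the command mapping in Table 3, the MATLAB fragment below sends a one-byte direction command to the Arduino over a serial (Bluetooth) link; the port name COM5 and the command characters 'F', 'R' and 'L' are hypothetical, as the paper does not specify the wire protocol.

% Hedged sketch: translate the classifier output (1 = relax, 2 = math
% solving, 3 = gaming) into a direction command for the Arduino.
classOut = 2;                                            % e.g. SVM output

commands = containers.Map({1, 2, 3}, {'F', 'R', 'L'});   % forward / right / left
port = serialport("COM5", 9600);                         % hypothetical port
writeline(port, commands(classOut));                     % send the command
clear port                                               % close the connection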

4 Conclusion

In this research, the signal processing and hardware platforms for an EEG-based DC motor control system have been developed. In the signal processing platform, PSD and SVM have been used as the feature and classifier respectively. The high classification accuracy of 94% achieved using a single-channel EEG amplifier shows that a single-channel EEG device can be employed to control device movement; the PSD is the best EEG feature, and SVM is the best classifier for the selected EEG feature. The complete DC motor prototype has been tested successfully. The results of the study show that the DC motor can follow only three directions, namely forward, right and left. Hence, in order to make the DC motor move in multiple directions in the future, a multi-channel EEG amplifier is required. Moreover, this type of EEG amplifier should be capable of recording different EEG control signals such as P300, SSVEP and Motor Imagery. By using these control signals, more BCI commands can be generated for the wheelchair application.

Acknowledgements The author would like to acknowledge the great research done by the second
author for his final year undergraduate project, postgraduate student and research team members.
Acknowledgement also goes to Faculty of Electrical and Electronics Engineering Technology,
Universiti Malaysia Pahang and Ministry of Education Malaysia for providing fundamental
research grant scheme to support the research, FRGS/1/2018/TK04/UMP/02/3.

Automated Cells Counting
for Leukaemia and Malaria Detection
Based on RGB and HSV Colour Spaces
Analysis

Amer Fazryl Din and Aimi Salihah Abdul Nasir

Abstract There are various types of diseases which originate from the blood,
for example leukaemia, malaria and anaemia. Leukaemia is a cancer which starts in
blood-forming tissues, usually the bone marrow. On the other hand, malaria is
transmitted through the bite of an infected mosquito carrying the Plasmodium
parasite. Haematologists need to perform a WBC count in order to determine whether
a person has leukaemia, and a parasite count to check the malaria density.
However, the conventional procedure is vulnerable to human error and is very
time-consuming. As a solution, this study proposes automated cell counting
for leukaemia and malaria detection by analysing the best colour components of the
RGB and HSV colour spaces. To obtain the cell counting result, several
image processing steps are implemented: (1) image acquisition by capturing the
leukaemia blood samples using a computerized Leica DLMA 1200 digital micro-
scope, (2) colour conversion from RGB to single colour components of RGB and
HSV, (3) image segmentation using Otsu thresholding, (4) removal of unwanted
regions and (5) the cell counting process. Overall, segmentation using the green
component of the RGB colour space has proven to be the best for segmenting leukaemia
images with 83.84%, while the saturation component of the HSV colour space holds the
highest accuracy for malaria images with 89.87%. Conclusively, this research is
expected to help improve the detection phase of malaria and leukaemia
by overcoming the problems identified in this research.

  
Keywords Leukaemia · Malaria · Image processing · RGB colour models · HSV colour models · Cells counting

A. F. Din (&)  A. S. Abdul Nasir


Faculty of Engineering Technology, Universiti Malaysia Perlis, UniCITI Alam Campus,
Sungai Chuchuh, 02100 Padang Besar, Perlis, Malaysia
e-mail: [email protected]
A. S. Abdul Nasir
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_70

1 Introduction

1.1 Introduction

Leukaemia is known as a cancer originating from the blood or bone marrow [1]. The
bone marrow acts as the place where blood cells are produced in the body. Basically,
leukaemia occurs when there is a problem with the production of blood cells. It
is most likely to affect people over the age of 55 years, but it is also the most
common cancer in those aged under 15 years. Meanwhile, malaria is a
mosquito-borne disease caused by a parasite. This parasite is transmitted to people
through the bites of infected female Anopheles mosquitoes [2]. Generally, symptoms
such as fever, chills and flu-like illness may occur after infection [3]. The disease
may become severe and, if left untreated, can cause serious complications and death.
For leukaemia, the statistics given by the Global Cancer Observatory show that
1717 new cases were recorded in 2018 [4]. Leukaemia was placed in 9th position
among the most frequently detected cancers in Malaysia, with 1377 deaths recorded. In
general, this indicates that leukaemia is one of the most common cancers in Malaysia.
In addition, the number of deaths is worrying, and precautionary steps
must be taken to prevent a higher death toll. One of the most convincing steps is
the early detection of leukaemia. In other words, early detection is crucial in
preventing the cancerous cells from spreading throughout the blood stream.
The World Health Organization stated that 200 million cases were recorded in the
African region, which bore the largest burden of malaria morbidity in 2017 [5].
Meanwhile, the number of indigenous human malaria cases recorded in Malaysia was
508 in 2017. Malaria transmission in Malaysia occurs mostly in Sabah and
Sarawak, which are located on the island of Borneo. In response, the country is
stepping up implementation of its international strategic plans for elimination
covering the period 2011–2020. One of the plan's aims is the early detection of
malaria, which will have a huge effect in decreasing the number of cases.
The presence of leukaemia can be detected from the growing number of abnormal white
blood cells (WBCs), while the presence of parasites in the blood may indicate malaria.
Currently, the most economical and reliable diagnosis, which is based on microscopic
examination of blood slides, still remains the gold standard for the laboratory diag-
nosis of malaria [6]. According to Patel et al. [7], the traditional method of manual
counting under the microscope yields inaccurate results. The possibility of faulty
detection due to human error may arise. During the WBC counting process, the
haematologist is under high physical strain in order to complete the process; it takes a
lot of focus to examine just one blood sample. As a consequence of this physical
strain, the haematologist experiences eye fatigue. In addition, other methods of
detecting leukaemia and malaria are the bone marrow test, computed tomography (CT)
scan, rapid diagnostic test (RDT) and magnetic resonance imaging (MRI) scan [8].
Subsequently, the manual recognition method is time-consuming and laborious,
especially in situations where a large number of samples require reliable analysis.
Within just 48 to 72 h, the parasites inside the red blood cells multiply, causing the
infected cells to burst open [9]. It is crucial to detect the presence of parasites within
a short time to reduce the possibility of the patient's condition becoming more severe.
Next, the lack of research on malaria density based on parasite counting is one of
the main problems. Most of the research only focuses on identifying the species of
the Plasmodium (P.) parasite, which consists of P. falciparum, P. vivax, P. ovale,
P. malariae and P. knowlesi [10].

1.2 Previous Study

Multiple types of digital image segmentation have been used by researchers in
order to obtain accurate results for malaria and leukaemia detection [11–15]. All
of these methods show different accuracies in concluding the density of malaria
parasites. In simple terms, each method needs to be carefully reviewed in order to
choose the correct method for this study.
According to Punitha et al. [14], detection of the malarial parasite is performed
through a thresholding technique. Before that, the original image needs to go through
a pre-processing phase, which is usually used to remove unwanted effects from the
image and to adjust it. After the pre-processing phase, the image is ready to undergo
the thresholding phase, which basically segments the image into meaningful regions
with respect to the object of interest.
Bashir et al. [6] proposed a system for the detection of malaria parasites using digital
image processing and successfully obtained a high accuracy of 99.68%. Firstly, the
original images go through a number of enhancement steps, which include resizing,
noise elimination and greyscale conversion. After that, the images undergo a
segmentation phase using a threshold technique, which separates the region of interest
by distinguishing the pixel values. Lastly, a multilayer neural network was trained to
detect the Plasmodium parasites.
Savkare et al. [16] used the Otsu thresholding method in an automated system for
malaria parasite identification. In this system, the original image is first enhanced
with greyscale conversion. Next, a median filter is used to remove unwanted pixels
and a Laplacian filter works for smoothing and edge enhancement. Subsequently,
Otsu's thresholding method is applied to obtain the region of interest in the
images. As a result, the authors managed to obtain an accuracy of 99.43%.
Patel et al. [7] performed automated cell counting for leukaemia detection using the
k-means clustering technique. Firstly, the original image is enhanced using median
filtering and a Wiener filter. Next, the image is converted from greyscale into a cluster
index image. The image undergoes three clustering processes before the WBCs are
successfully detected and the counting process can begin. This proposed segmentation
procedure managed to obtain an accuracy of 93.57%.
According to Negm et al. [17], an accuracy of 99.74% was obtained in their
research on a decision support system for acute leukaemia. Firstly, the images
undergo colour conversion into three colour components, namely red, green and blue.
After that, the green component is selected and the k-means clustering technique is
performed on the image. Watershed distance segmentation is performed to create a
boundary for overlapping WBCs.
Agaian et al. [18] proposed a new automated acute leukaemia classification
system. The system involves a pre-processing phase that includes the conversion of
RGB images into the CIELAB colour space. Next, the images undergo a seg-
mentation phase using the k-means clustering method to obtain the region of interest.
As a result, this system is able to obtain an accuracy of over 94%.

2 Methodology

In this research, various image processing techniques have been applied in order to
achieve a high cell counting accuracy. The techniques used have been chosen
carefully and arranged in the proper order, from data acquisition until the cell
counting is performed. The steps are as follows:
Step 1: Capturing blood slide images using a computerized Leica DLMA 1200
digital microscope
Step 2: Applying colour conversion into single colour components of the RGB and
HSV colour spaces
Step 3: Segmentation using Otsu thresholding
Step 4: Applying watershed segmentation (leukaemia only)
Step 5: Removing unwanted objects based on size
Step 6: Drawing bounding boxes
Step 7: Performing cell counting
Detailed descriptions of the image processing steps are presented in the next
section.

2.1 Data Acquisition

Data acquisition is the process of obtaining the data that need to be processed. In this
case, the data to be collected are the images of blood slides. The samples
were obtained from Hospital Universiti Sains Malaysia (HUSM). For this study, a
total of 200 images were obtained, consisting of 100 malaria images, 72 leukaemia
images and 28 normal blood slide images. Leukaemia blood slide images were
captured at 10X magnification, while malaria slide images were captured at 100X
magnification, under the computerized microscope at 800 × 600 resolution and saved
in bitmap (*.bmp) format. Figure 1 shows the blood slide images of leukaemia and
malaria captured under the computerized microscope.

Fig. 1 Image of leukaemia and malaria blood slides

2.2 Colour Conversion into Single Components of the RGB and HSV Colour Spaces

The original images were captured as RGB images, which are quite difficult to
segment directly. Therefore, the RGB images are converted into single components of
the RGB and HSV colour spaces for an easier segmentation process. Colour conversion
helps to reduce the colour dimension. The images are analysed in multiple colour
spaces to obtain the most suitable colour component, with the region of interest
clearly highlighted. Figure 2 shows the colour conversion applied to both leukaemia
and malaria images. For the leukaemia image, the colour components are represented
by red (c), green (e), blue (g), hue (i), saturation (k) and value (m). Meanwhile, the
colour components for the malaria image are represented by red (d), green (f), blue (h),
hue (j), saturation (l) and value (n). From the colour conversion, the most suitable
colour component for the segmentation process can be chosen based on certain
criteria. For example, the Value and Blue components show a lighter contrast in the
image, which might cause cells to disappear after segmentation. As a result, the
saturation component was chosen from the HSV colour space and the green component
was chosen from the RGB colour space to be segmented, because of their suitability
for detecting the cells.
The following Eqs. 1, 2 and 3 were used for the conversion from the digital
images into the single colour components red, green and blue. Equations 4, 5 and 6
were used to obtain the HSV colour space components.

(a) Original image of leukaemia (b) Original image of malaria

(c) Red component of leukaemia (d) Red component of malaria

(e) Green component of leukaemia (f) Green component of malaria

(g) Blue component of leukaemia (h) Blue component of malaria

Fig. 2 Colour conversion on both leukaemia and malaria image



(i) Hue component of leukaemia (j) Hue component of malaria

(k) Saturation component of leukaemia (l) Saturation component of malaria

(m) Value component of leukaemia (n) Value component of malaria

Fig. 2 (continued)

RGB into single R, G and B colour components:

$\text{Red} = \dfrac{R}{R + G + B}$  (1)

$\text{Green} = \dfrac{G}{R + G + B}$  (2)

$\text{Blue} = \dfrac{B}{R + G + B}$  (3)

RGB to HSV colour space:

$\text{Saturation} = 1 - \dfrac{3}{R + G + B}\,\min(R, G, B)$  (4)

$\text{Hue} = \begin{cases} \theta, & \text{if } B \le G \\ 360^{\circ} - \theta, & \text{if } B > G \end{cases}$  (5)

$\theta = \cos^{-1}\left\{ \dfrac{\tfrac{1}{2}\left[(R - G) + (R - B)\right]}{\left[(R - G)^{2} + (R - B)(G - B)\right]^{1/2}} \right\}$  (6)
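A minimal NumPy sketch of Eqs. 1–6 is given below; it is an illustration of the conversion formulas rather than the authors' code, and the small epsilon added to the denominators is an assumption to avoid division by zero.

```python
# Illustrative sketch (not the authors' code) of Eqs. 1-6: normalized R/G/B components
# and the saturation/hue components from an RGB image, using NumPy.
import numpy as np

def colour_components(rgb):
    """rgb: float array of shape (H, W, 3)."""
    rgb = rgb.astype(np.float64)
    R, G, B = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    total = R + G + B + 1e-12                                   # avoid division by zero

    red, green, blue = R / total, G / total, B / total          # Eqs. 1-3
    saturation = 1.0 - 3.0 / total * np.minimum(np.minimum(R, G), B)  # Eq. 4

    num = 0.5 * ((R - G) + (R - B))
    den = np.sqrt((R - G) ** 2 + (R - B) * (G - B)) + 1e-12
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))      # Eq. 6
    hue = np.where(B <= G, theta, 360.0 - theta)                      # Eq. 5
    return red, green, blue, hue, saturation
```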

2.3 Image Segmentation Using Automatic Thresholding

This section discusses the analysis involved in segmenting the images using the
Green and Saturation colour components. To design the automatic segmentation
process for both colour components, the Otsu thresholding technique is used. This is
one of the popular methods in image processing, originally introduced by Otsu in
1979. The Otsu thresholding method is used to create a binary image from a
grey-level image.
Generally, this method minimizes the intra-class variance of the thresholded black
and white pixels. It stores the intensities of the pixels in an array. Subsequently, a
suitable grey-level threshold is determined to allow extracting the foreground from
the background of an image. For this research, the image is segmented into two and
three grey levels for malaria and leukaemia respectively, in order to obtain the best
segmentation result. Table 1 shows the results of the segmentation process.
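As an illustration, a short scikit-image sketch of the two-level and three-level Otsu segmentation described above follows; the choice of foreground polarity (objects brighter or darker than the threshold) depends on the colour component and is an assumption here.

```python
# Illustrative sketch (not the authors' code): Otsu segmentation of a single colour
# component with scikit-image; `component` is a 2-D grey-level array.
import numpy as np
from skimage.filters import threshold_otsu, threshold_multiotsu

def segment_two_levels(component):
    t = threshold_otsu(component)
    return component > t                           # foreground polarity depends on the component

def segment_three_levels(component):
    t1, t2 = threshold_multiotsu(component, classes=3)
    return np.digitize(component, bins=[t1, t2])   # per-pixel label 0, 1 or 2
```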

2.4 Separation of Cells Using Watershed Segmentation

Watershed segmentation is mainly about finding the catchment basins and watershed
lines in a greyscale image. In other words, it is helpful in the process of
separating the cells. This segmentation technique is based on the concept of a
topographic representation of image intensity. Normally, this process is used to
separate the overlapping cells in chronic leukaemia images, which in turn enhances
the accuracy of the cell counting process. Figure 3 shows the watershed segmentation,
which manages to divide the cells successfully. The border of each cell appears to
mark the difference between the regions.
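A minimal marker-controlled watershed sketch with scikit-image is shown below; it is not the authors' implementation, and the `min_distance` used to pick one marker per cell is an assumed tuning parameter.

```python
# Illustrative marker-controlled watershed sketch (not the authors' code): split
# touching cells by flooding the negated distance transform from one marker per cell.
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def split_touching_cells(binary_mask, min_distance=10):   # min_distance is an assumed value
    distance = ndi.distance_transform_edt(binary_mask)
    peaks = peak_local_max(distance, min_distance=min_distance, labels=binary_mask)
    markers = np.zeros(binary_mask.shape, dtype=int)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
    return watershed(-distance, markers, mask=binary_mask)  # one label per separated cell
```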

Table 1 Results of the segmentation process

Image                | Leukaemia | Malaria
Original             | (image)   | (image)
Segmented green      | (image)   | (image)
Segmented saturation | (image)   | (image)

(a) Original image shows the overlapping cells

(b) Applying watershed segmentation to create a boundary between cells

Fig. 3 Result of watershed segmentation

2.5 Removing Unwanted Object Based on Size

This technique measures the size of the objects in the images and determines the
unwanted noise. After separating the touching cells, the unwanted cells at the image
border must be removed, since they disturb the counting accuracy. Therefore, this
operation helps to identify the size of all the cells in the images and makes it easier
to determine the average area of a cell. By determining the average area, all the
unwanted noise is filtered out.
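The following sketch illustrates size-based filtering with scikit-image region properties; the area ranges quoted from Sect. 3.1 are the paper's values, while the function itself is only an assumed implementation.

```python
# Illustrative sketch (not the authors' code): keep only connected regions whose pixel
# area falls inside an allowed range; everything else is treated as noise.
import numpy as np
from skimage.measure import label, regionprops

def remove_by_size(binary_mask, min_area, max_area):
    labels = label(binary_mask)
    keep = np.zeros(labels.max() + 1, dtype=bool)
    for region in regionprops(labels):
        keep[region.label] = min_area <= region.area <= max_area
    return np.where(keep[labels], labels, 0)

# Ranges reported in Sect. 3.1: 80-2000 px (chronic leukaemia),
# 40-300 px (acute leukaemia), 30-230 px (malaria parasites).
```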

2.6 Automated Cells Counting Method

The method for this part is to count the objects left in the segmented images. In
simple terms, this can be done by finding the connected components in the binary
images. The bounding box operation detects each object left in the image and
calculates its height and width. After that, the boxes are drawn based on the centre
of each object and displayed on the original image. The accuracy and error of the
system are calculated using Eqs. 7 and 8, respectively.

$\text{Percentage of Accuracy} = \dfrac{\text{Total manual count}}{\text{Total automatic count}} \times 100$  (7)

$\text{Percentage of Error} = \dfrac{\text{Total manual count} - \text{Total automatic count}}{\text{Total manual count}} \times 100$  (8)
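A short sketch of the connected-component counting, bounding boxes and the accuracy/error formulas (Eqs. 7 and 8) follows; it is an illustration under the stated assumptions, not the authors' code.

```python
# Illustrative sketch (not the authors' code): count connected components, get their
# bounding boxes, and evaluate the counting with Eqs. 7 and 8.
from skimage.measure import label, regionprops

def count_cells(binary_mask):
    labels = label(binary_mask)
    boxes = [r.bbox for r in regionprops(labels)]   # (min_row, min_col, max_row, max_col)
    return len(boxes), boxes

def percentage_accuracy(manual, automatic):
    return manual / automatic * 100                 # Eq. 7

def percentage_error(manual, automatic):
    return (manual - automatic) / manual * 100      # Eq. 8

# e.g. percentage_accuracy(6037, 7201) ~ 83.84, matching the leukaemia/green result in Sect. 3.2.
```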

3 Results and Discussion

3.1 Results of the Image Processing Techniques for Leukaemia and Malaria Images

In this section, all the results obtained are discussed. From the previous section, it is
clear that several steps need to be applied before obtaining the end result. Based on
the method in Sect. 2.3, the original image needs to go through colour conversion to
make the segmentation process easier. The comparison between colour components
depends on several criteria, including the light intensity and the contrast of the
region of interest. It is clearly shown that segmentation on the green colour
component is the best for leukaemia, and the saturation colour component for malaria.
After that, automatic thresholding is performed to remove the background of the
image. This retains only the region of interest in the image, making it easier to
identify the cells or parasites. In the segmented images, there are countless unwanted
objects that need to be removed. Before removing unwanted objects, the area of each
object in the image needs to be calculated to avoid the system removing the cells.
After establishing the average area of the cells and parasites, a pixel size range was
decided. With ranges of 80 to 2000 pixels for chronic leukaemia, 40 to 300 pixels for
acute leukaemia and 30 to 230 pixels for malaria, all the unwanted objects were
successfully removed. Table 2 shows the result of removing unwanted objects.
In this example, a chronic leukaemia image has been used in the diagnosis. In this
case, there are multiple overlapping cells in the image, making it harder to acquire
the accurate number of WBCs. In order to solve this, watershed

Table 2 Removing unwanted objects based on size

Image                                  | Leukaemia | Malaria
Original image                         | (image)   | (image)
Segmented image                        | (image)   | (image)
Removing unwanted object based on size | (image)   | (image)

segmentation was applied to the image to create boundaries between cells and
separate them. Based on the results shown above, it can be observed that all the
unwanted objects have been successfully removed. The next part is to draw a
bounding box around each object left in the images. The bounding box helps with
the mapping of each cell onto the original image for reference. Figure 4 shows
the bounding boxes drawn on (a) the leukaemia image and (b) the malaria image.

(a) Bounding box drawn on leukaemia image

(b) Bounding box drawn on malaria image

Fig. 4 Bounding box drawn on both leukaemia and malaria image



3.2 Result of Counting for Leukaemia and Malaria Images

After processing the images with the methods above, the counting results are finally
obtained. The results cover 100 malaria images, 72 leukaemia images and 28
normal images. The counting process was compared between two colour components,
Green and Saturation. The accuracy was obtained through Eqs. 7 and 8 mentioned
earlier. Table 3 shows that, for leukaemia, the Green colour component has the
higher percentage at 83.84%. The Saturation colour component, with a percentage of
21.68%, can be considered a failure in obtaining the accurate number of WBCs. This
is mainly because the Saturation colour component highlights most of the noise in
the images, resulting in bad segmentation.
For normal blood slides, the result is basically the same, with the Green colour
component dominating at 90.91% while the Saturation colour component has a much
lower percentage of 1.10%. The normal image is easier to segment with the Green
colour component because it is able to separate the contrast between the region of
interest and the other noise. For the malaria images, the Saturation colour component
holds the higher percentage at 88.97%. The Saturation colour component performs
better than the Green colour component because malaria images have low light
intensity for the parasites; the Saturation component is able to increase the contrast
of the parasites, thus easing the segmentation process. As a result, it is clearly shown
that leukaemia images are suited to the Green colour component while malaria
images are suited to the Saturation colour component.
Result of accuracy on leukaemia images:

$\dfrac{6037}{7201} \times 100 = 83.84\%$  (9)

$\dfrac{6037}{27850} \times 100 = 21.68\%$  (10)

Result of accuracy on normal images:

$\dfrac{80}{88} \times 100 = 90.91\%$  (11)

Table 3 Results of the image processing

Details   | Manual cells counting | Automatic cells counting (Green) | Automatic cells counting (Saturation)
Leukaemia | 6037                  | 7201                             | 27,850
Normal    | 80                    | 88                               | 7242
Malaria   | 3936                  | 14,503                           | 4424

$\dfrac{80}{7242} \times 100 = 1.10\%$  (12)

Result of accuracy on malaria images:

$\dfrac{3936}{14503} \times 100 = 27.14\%$  (13)

$\dfrac{3936}{4424} \times 100 = 88.97\%$  (14)

4 Conclusion

In conclusion, an automated screening and cell counting system for leukaemia and
malaria detection has been successfully developed. The system is based on various
image processing techniques applied to blood slide images. The detection of
leukaemia is based on the number of WBCs, while the parasite count is used as a
measure of the malaria density in the blood. In order to get the best segmented
images, appropriate image processing techniques have been developed. This study
has shown that the automated screening and cell counting for both malaria and
leukaemia is able to count the total number of WBCs and parasites. Even though the
results obtained are good, some improvements can still be made in the future, based
on multiple aspects of the current system.

References

1. Medical News Today Article. https://www.medicalnewstoday.com/articles/142595.php. Accessed 11 Nov 2019
2. Medical News Today Article. https://www.medicalnewstoday.com/articles/150670.php. Accessed 11 Nov 2019
3. Healthline. https://www.healthline.com/health/malaria. Accessed 12 Nov 2019
4. Registry PC (2019) Global Cancer Observatory. Malaysia Cancer Stat 593:1–2
5. World Health Organization (2018) World Malaria Report. ISBN 978 92 4 156469 4
6. Bashir A, Mustafa ZA, Abdelhameid I, Ibrahem R (2017) Detection of malaria parasites using
digital image processing. In: 2017 international conference on communication, control,
computing and electronics engineering. IEEE, Sudan
7. Patel N, Mishra A (2015) Automated leukaemia detection using microscopic images. Procedia
Comput Sci 58:635–642
8. Cancer. https://www.cancer.org/cancer/acute-lymphocytic-leukaemia/detection-diagnosis-staging/how-diagnosed.html. Accessed 28 Mar 2019
9. Your Genome. https://www.yourgenome.org/facts/what-is-malaria. Accessed 11 Nov 2019
10. World Health Organization. https://www.who.int/ith/diseases/malaria/en/. Accessed 11 Nov 2019

11. Aris TA, Nasir ASA, Mohamed Z, Jaafar H, Mustafa WA, Khairunizam W, Jamlos MA,
Zunaidi I, Razlan ZM, Shahriman AB (2018) Color component analysis approach for malaria
parasites detection based on thick blood smear images. In: IOP conference series: materials
science and engineering. IOP Publishing, pp 1–7
12. Aris TA, Nasir ASA, Mustafa WA (2017) Analysis of distance transforms for watershed
segmentation on chronic leukaemia images. J Telecommun Electron Comput Eng 10:1–16
13. Nasir ASA, Mashor MY, Mohamed Z (2018) Enhanced k-means clustering algorithm for
malaria image segmentation. J Adv Res Fluid Mech Therm Sci 42(1):1–15
14. Punitha S, Logeshwari P, Sivaranjani P, Priyanka S (2017) Detection of malarial parasite in
blood using image processing. SSRN J 1(2):211–213
15. Khairudin NAA, Ariff FNM, Nasir ASA, Mustafa WA, Khairunizam W, Jamlos MA,
Zunaidi I, Razlan ZM, Shahriman AB (2019) Image segmentation approach for acute and
chronic leukaemia based on blood sample images. In: MEBSE 2018 - IOP conference series:
materials science and engineering. IOP, vol 557
16. Savkare SS, Narote SP (2015) Automated system for malaria parasite identification. In:
Proceedings - 2015 international conference on communication, information and computing
technology. IEEE, India
17. Negm AS, Hassan OA, Kandil AH (2018) A decision support system for acute leukaemia
classification based on digital microscopic images. Alexandria Eng J 57(4):2319–2332
18. Agaian S, Madhukar M, Chronopoulos AT (2018) A new acute leukaemia-automated
classification system. Comput Methods Biomech Biomed Eng Imaging Vis 6(3):303–314
Simulation Studies of the Hybrid
Human-Fuzzy Controller for Path
Tracking of an Autonomous Vehicle

Hafiz Halin, Wan Khairunizam, Hasri Haris, Z. M. Razlan, S. A. Bakar, I. Zunaidi, and Wan Azani Mustafa

Abstract Human intelligence and experience help humans make decisions and
recognize patterns. This ability enables a driver to take action even in an unexpected
situation. A hybrid integration between human intelligence/experience and a machine
controller is able to improve the path tracking capability of an autonomous vehicle,
which is the vehicle's main concern. The Fuzzy controller was developed from
experimental data. The experiments (human navigation experiments) were used to
gather the appropriate data from humans while controlling the buggy car. The data
were then used to develop the membership functions for the inputs and output of the
Fuzzy controller. Simulation was used to study the performance of the Fuzzy
controller. The path tracking errors recorded from the simulations for the right and
left turn maneuvers are 9 m and 7.5 m, respectively.

Keywords Simulation · Fuzzy · Autonomous · Hybrid

H. Halin  W. Khairunizam (&)  H. Haris  Z. M. Razlan  S. A. Bakar


Advanced Intelligent Computing and Sustainable Research Group (AICoS),
School of Mechatronic Engineering, Universiti Malaysia Perlis (UniMAP),
Pauh Putra Campus, 02600 Arau, Perlis, Malaysia
e-mail: [email protected]
H. Halin
e-mail: abdhafi[email protected]
I. Zunaidi
Faculty of Technology, University of Sunderland, St Peter’s Campus,
Sunderland SR6 0DD, UK
W. A. Mustafa
Faculty of Engineering Technology, Universiti Malaysia Perlis, UniCITI Alam Campus,
Sungai Chuchuh, 02100 Padang Besar, Perlis, Malaysia

© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_71

1 Introduction

The autonomous vehicle is a future technology in the automotive industry. Active
sensor integration and real-time computation of the surrounding environment help
the autonomous vehicle navigate through an unknown environment. Negligence [1],
drowsiness and fatigue [2] are some of the factors that contribute to traffic accidents.
An autonomous system helps to reduce human errors because a programmable
autonomous system is more reliable [3].
A small path tracking error in autonomous vehicle navigation makes it possible
to keep the autonomous vehicle on the desired path [4, 5]. Complex control
algorithms such as Model Predictive Control (MPC) [6], Multiple Model Adaptive
Control (MMAC) [7], Neural Networks [8] and others need a lot of information
from the surrounding area to navigate through an unknown path. The fuzzy
controller serves the right functions for complex sensor integration [9] compared to
other controllers.
Simulations can reduce cost and time compared to real-time experiments. A
simulation can also be used as a benchmark for the experimental results because it is
developed under ideal conditions with disturbances/noise neglected [10]. When the
internal and external noises are neglected, the full potential of the developed Fuzzy
controller can be observed. Some researchers use a combination of simulation and
experiments, with the simulation results used to validate the results of the experiments.
For example, Em Poh Ping and Sim Kok Swee [11] used hardware-in-the-loop
simulation (HILS), a real-time hardware simulation, to test their lane keeping
algorithm. Many cases show that simulation studies produce better results than
experiments [12, 13].
This research aims to create a Fuzzy controller for an autonomous vehicle with
human characteristics embedded in it. The rule base for the Fuzzy controller was
developed from human data collected while controlling the vehicle through the
designed paths [14]. Simulations of the path tracking and steering wheel control are
used to study the performance of the Fuzzy controller.

2 Methodologies

Figure 1 shows the overall process of developing the Fuzzy controller. The human
navigation experiment was conducted to gather the human driving characteristics
while driving on the selected path. The experiment details were discussed in depth in
[15] and [16]. Then, the Fuzzy membership functions and rule base were developed
based on the human navigation experiment data. The simulation was developed to
imitate the path taken in the human navigation experiments and is also used to study
the performance of the developed Fuzzy controller.

Fig. 1 Overall flow chart for the development of the fuzzy controller

2.1 Development of the Fuzzy Controller

The human navigation experiments were developed to gather raw data from the
human driver. As the driver drove through the selected path, data from the GPS and
encoder were collected. Data from the steering wheel, coordinates, speed and heading
were used to calculate the parameters of the Fuzzy membership functions. The
average (μ) and standard deviation (σ) were calculated to develop the parameters of
the membership functions, as in Fig. 2.
Two Fuzzy controllers were developed for this research, namely Fuzzy controller 1
(FZ1) and Fuzzy controller 2 (FZ2). The inputs for FZ1 were the steering wheel angle
and speed, while the inputs for FZ2 were the heading angle and steering angle. The
output of both FZ1 and FZ2 was the steering wheel turn angle. The outputs from FZ1
and FZ2 have their dedicated rule weights, as shown in Fig. 3. The controller with the
highest rule weight is chosen, and its steering wheel turn angle is used for the
navigation. The overall Fuzzy architecture (Fig. 3) is summarised in Table 1.
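As a rough illustration of how membership function parameters can be derived from the recorded data, the sketch below builds Low/Med/High triangular membership functions around the mean (μ) and standard deviation (σ) of one input; the ±σ spacing and the label names are assumptions, not the paper's exact design.

```python
# Hypothetical sketch (not the authors' code): derive triangular membership function
# vertices for one input (e.g. steering wheel angle) from the mean and standard
# deviation of the recorded human-driving data.
import numpy as np

def membership_params(samples):
    mu, sigma = float(np.mean(samples)), float(np.std(samples))
    return {
        "Low":  (mu - 3 * sigma, mu - 2 * sigma, mu - sigma),
        "Med":  (mu - sigma,     mu,             mu + sigma),
        "High": (mu + sigma,     mu + 2 * sigma, mu + 3 * sigma),
    }

def trimf(x, abc):
    a, b, c = abc
    return float(np.clip(min((x - a) / (b - a + 1e-12), (c - x) / (c - b + 1e-12)), 0.0, 1.0))

params = membership_params([-20.0, -5.0, 0.0, 4.0, 18.0])   # example steering angles (deg)
print(params["Med"], trimf(2.0, params["Med"]))
```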

Fig. 2 Steering wheel angle membership function (input)

Fig. 3 The architecture of the fuzzy controller

2.2 Fuzzy Simulation

A LabVIEW program was used to simulate the performance of the developed Fuzzy
controller. The simulated autonomous vehicle follows the waypoints listed in
Table 2. The scenario for the simulation is the same as in the human navigation
experiments. Two paths were used for the simulations: a left turn and a right turn.
The path tracking errors were recorded during the simulation; the path tracking error
is the distance between the centre of the vehicle (CG) and the reference path.
Figure 4 shows the positions of the five predetermined waypoints used for
navigation, Point 1 to Point 5 (P1 to P5).

Table 1 Detailed explanation of the fuzzy controller architecture

Stage 1: The inputs for the two Fuzzy sub-controllers were speed, steering wheel angle, steering angle and heading. The membership functions for the inputs were developed from the experiments.
Stage 2: Two sub-controllers, named FZ1 and FZ2, were developed to process all four inputs at once.
Stage 3: The output of both sub-controllers was the steering wheel turn angle. Only one value is needed to navigate the autonomous vehicle, so the suitable steering wheel turn angle is taken from the output of the sub-controller with the higher rule weight.
Stage 4: The selected steering wheel turn angle (the correction steering angle for path tracking) is used for the path tracking of the autonomous vehicle.

Table 2 Coordinates of the waypoints used in the simulation

Waypoint | Longitude  | Latitude
P1       | 100.234486 | 6.555692
P2       | 100.234599 | 6.555694
P3       | 100.234647 | 6.555715
P4       | 100.234661 | 6.555769
P5       | 100.234661 | 6.555961
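The following sketch is a hypothetical way to compute the path tracking error from the waypoints of Table 2: the longitude/latitude values are projected to local metres with a simple equirectangular approximation (an assumption) and the distance from the vehicle's CG to the reference segment is taken as the error. It is not the authors' LabVIEW implementation.

```python
import math

R_EARTH = 6371000.0   # assumed mean Earth radius in metres

def to_xy(lon, lat, lon0, lat0):
    """Project longitude/latitude to local metres around (lon0, lat0)."""
    x = math.radians(lon - lon0) * R_EARTH * math.cos(math.radians(lat0))
    y = math.radians(lat - lat0) * R_EARTH
    return x, y

def cross_track_error(cg, p_start, p_end):
    """Distance from the vehicle's CG to the segment p_start -> p_end."""
    (x, y), (x1, y1), (x2, y2) = cg, p_start, p_end
    dx, dy = x2 - x1, y2 - y1
    t = max(0.0, min(1.0, ((x - x1) * dx + (y - y1) * dy) / (dx * dx + dy * dy)))
    px, py = x1 + t * dx, y1 + t * dy          # closest point on the segment
    return math.hypot(x - px, y - py)

lon0, lat0 = 100.234486, 6.555692              # P1 taken as the local origin
p1 = to_xy(100.234486, 6.555692, lon0, lat0)
p2 = to_xy(100.234599, 6.555694, lon0, lat0)
print(cross_track_error(to_xy(100.234550, 6.555720, lon0, lat0), p1, p2))
```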

Fig. 4 Position of the waypoints



3 Results

3.1 Simulations-Right Turn Maneuver

Figures 5 and 6 show the simulation results for the right turn maneuver. Figure 5
shows the path taken by the simulated autonomous vehicle together with the
reference. The reference path was created from the average path used by the subjects
in the human navigation experiments. The highest path tracking error recorded in the
simulations is 9 m. Figure 6 shows the comparison between the reference and
simulated steering wheel control. The reference steering wheel angle is generated
from the average steering wheel angle values for the right turn maneuver in the
human navigation experiments. The recorded steering wheel angle error is below
103.9°. The error is small and acceptable, as the simulated vehicle is able to navigate
through the designed path.

Fig. 5 The simulation of the path tracking for right turn maneuver

Fig. 6 Steering wheel angle for the right turn maneuver simulation

3.2 Simulations-Left Turn Maneuver

Figures 7 and 8 show the simulation results for the left turn maneuver. Figure 7
shows the path taken by the simulated autonomous vehicle. The maximum path
tracking error recorded in the simulations is 7.5 m. Figure 8 shows the difference
between the reference steering wheel angles and the simulated result for the left turn
maneuver. The simulated steering wheel angle demonstrates the ability of the Fuzzy
controller to navigate. The angle transition is not as smooth as the reference angle,
but the simulated vehicle still manages to complete the navigation from the start to
the end point. The highest error recorded for the steering wheel angle is 79.6°.

Fig. 7 The simulation of the path tracking for the left turn maneuver

Fig. 8 Steering wheel angle for the left turn maneuver simulation

4 Conclusions

The human navigation experiments were one of the methods used to gather human
characteristics while driving the buggy car. Human behaviour while driving the
vehicle through the designed path was studied for the development of the Fuzzy
controller. The membership functions were developed using data from the human
navigation, and the Fuzzy rule base was developed based on the researchers'
knowledge of how to control the vehicle. Based on the simulation results, the fusion
between the human element and the controller makes it possible to develop
comfortable and reliable autonomous vehicles. The simulation was developed to
study the performance of the Fuzzy controller. Even though the simulation results
show a high path tracking error, the simulated vehicle is able to reach its final target.

Acknowledgements The authors would like to acknowledge the support from Universiti
Malaysia Perlis (UniMAP) for financial support under the Publication Scheme.

References

1. Jo K, Kim J, Kim D, Jang C, Sunwoo M (2014) Development of autonomous car-part 1:


distributed system architecture and development process. IEEE Trans Ind Electron 61
(12):7131–7140
2. Krajewsk J, Sommer D, Trutschel U, Edwards D, Golz M (2009) Steering wheel behavior
based estimation of fatigue. In: Proceedings of the 5th international driving symposium on
human factors in driver assessment, training, and vehicle design : driving assessment 2009,
University of Iowa, Iowa, pp 118–124
3. Manawadu U, Ishikawa M, Kamezaki M, Sugano S (2015) Analysis of individual driving
experience in autonomous and human-driven vehicles using a driving simulator. In: 2015
IEEE international conference on advanced intelligent mechatronics (AIM). IEEE, Korea,
pp 299–304
4. Bjelonic M, Homberger T, Kottege N, Borges P, Chli M, Beckerle P (2017) Autonomous
navigation of hexapod robots with vision-based controller adaptation. In: 2017 IEEE
international conference on robotics and automation (ICRA). IEEE, Singapore, pp 5561–5568
5. Kahveci NE (2008) Adaptive steering control for uncertain vehicle dynamics with crosswind
effects and steering angle constraints. In: 2008 IEEE international conference on vehicular
electronics and safety. IEEE, Ohio, pp 168–173
6. Yu R, Guo H, Sun Z, Chen H (2015) MPC-based regional path tracking controller design for
autonomous ground vehicles. In: 2015 IEEE international conference on systems, man, and
cybernetics. IEEE, Hong Kong, pp 2510–2515
7. Aoki T, Sugimachi T, Fukao T, Kawashima H (2012) Autonomous steering control with
adaptive mechanism by MMAC. In: 2012 proceeding of SICE annual conference. IEEE,
Japan, pp 203–210
8. Yang S, Wang W, Liu C, Deng W, Hedrick JK (2017) Feature analysis and selection for
training an end-to-end autonomous vehicle controller using deep learning approach. In: 2017
IEEE intelligent vehicles symposium (IV). IEEE, United States, pp 1033–1038
9. Rasheed U, Ahmed M, Afridi MJ, Kunwar F (2010) Road trajectory mining and autonomous
steering control for vision-based unmanned vehicles. In: 2010 10th international conference
on intelligent systems design and applications. IEEE, Egypt, pp 197–202

10. Guo J, Hu P, Wang R (2012) Nonlinear coordinated steering and braking control of
vision-based autonomous vehicles in emergency obstacle avoidance. IEEE Trans Intell Transp
System 17(11):3230–3240
11. Ping EP, Swee SK (2012) Simulation and experiment of automatic steering control for lane
keeping manoeuvre. In: 2012 4th international conference on intelligent and advanced
systems (ICIAS2012) [Internet]. IEEE, pp 105–110
12. Gelbal SY, Altug E, Kececi EF (2016) Design and HIL setup of an autonomous vehicle for
crowded environments. In: IEEE/ASME International Conference Advance Intelligent
Mechatronics, AIM, pp 1370–1375
13. Hamzah N, Aripin MK, Sam YM, Selamat H, Ismail MF (2012) Yaw stability improvement
for four-wheel active steering vehicle using sliding mode control. In: 2012 IEEE 8th
international colloquium on signal processing and its applications. IEEE. Malaysia, pp 127–
132
14. Halin H, Khairunizam W, Ikram K, Haris H, Zunaidi I, Bakar S, et al (2018) Design
simulation of a fuzzy steering wheel controller for a buggy car. In: 2018 international
conference on intelligent informatics and biomedical sciences. IEEE, Japan, pp 85–89
15. Halin H, Khairunizam W, Ikram K, Haris H, Bakar SA, Razlan ZM, et al (2018) Investigation
of steering wheel control of an electric buggy car for designing fuzzy controller. In:
Hassan MHA (eds) proceeding of symposium intelligent manufacturing & mechatronics.
LNME, vol 1, Springer, Heidelberg, pp 459–468
16. Halin H, Khairunizam W, Ikram K, Haris H, Zunaidi I, Bakar SA, et al (2019) Analysis of
human behaviour while controlling the steering wheel of a buggy electric vehicle (EV). In:
IOP Conference Series Material Science Engineering, vol 557, no 1
A New Approach in Energy
Consumption Based on Genetic
Algorithm and Fuzzy Logic for WSN

Ali Adnan Wahbi Alwafi, Javad Rahebi, and Ali Farzamnia

Abstract Although a sensor node is tiny, connecting many such nodes together
wirelessly covers large areas; this is called a wireless sensor network (WSN). WSNs
are still evolving very quickly nowadays, and energy consumption is the main
challenge for their routing protocols. In this paper, we propose a new Fuzzy Logic
and Genetic Algorithm based protocol (FL-GA) for WSNs: we use the Mamdani
fuzzy logic method, with energy and distance as the two inputs, to find the best
cluster heads, and we use the Genetic Algorithm for optimization. By taking these
variable parameters into account, the choice of cluster heads becomes more efficient
and the cluster forming becomes more accurate, so all the nodes die at almost the
same time. One of the classic routing protocols is the Low Energy Adaptive
Clustering Hierarchy (LEACH) protocol, and we compared our protocol against it.
Our network nodes stay alive much longer than the LEACH protocol nodes. The
proposed method is more efficient in extending the network lifetime and maximizing
the total number of data packets received at the sink.

Keywords Wireless sensor network · Genetic Algorithm · Clustering · Fuzzy logic

A. A. W. Alwafi
University of Turkish Aeronautical Association, Ankara, Turkey
J. Rahebi
Altinbas University, Istanbul, Turkey
A. Farzamnia (&)
Universiti Malaysia Sabah, Kota Kinabalu, Sabah, Malaysia
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_72

1 Introduction

A wireless sensor network (WSN) is a collection of small sensors (nodes), each
generally with a built-in low-power battery. These nodes have the ability to gather
information (data) from the environment in which they are placed and then send it to
the base station (sink), either directly or through other nodes [1].
WSNs are used more and more in our daily lives in various fields, such as health
monitoring, weather, military, underwater and air contamination applications, among
many others. A WSN is more efficient than a wired sensor network, especially in
places that are dangerous and difficult to reach, because the sensor nodes are tiny and
easy to deploy. The small nodes have small batteries, so the most challenging part of
a WSN is its energy consumption [2]. Each node consists of a sensing unit, a
processing unit, an energy supply unit (battery), a transceiver and possibly an
external memory. Each node senses a physical quantity, converts it to an electric
signal and then sends it through the transmitter [3].
Sensors located on the sea/ocean bottom are used in many applications, such as
observing seismic movements, measuring seawater pollution, searching for
underwater mines, and monitoring underwater and over-water devices and vehicles.
To enable these applications, the devices must communicate underwater. By
installing two- or three-dimensional underwater networks of distributed and scalable
wireless sensors, it is possible for each sensor to observe environmental events
locally. In these cases, the sensors are usually fixed on a platform and placed
underwater. Underwater sensors cost more, and they are deployed less densely in the
application area than terrestrial sensors. Communication in underwater networks is
weak due to the distances between the sensors [4, 5].
There are three criteria by which the lifetime of WSN nodes is defined: the First
Node Dies (FND), Half of the Nodes Die (HND) and the Last Node Dies (LND).
These expressions are important when designing a WSN for a specific application,
because their importance differs from one application to another. FND is important,
for example, for a patient monitored with heterogeneous sensors (e.g. monitoring a
patient who has previously suffered a heart attack); if one of the sensors dies, there
will be an unfixable problem, so in situations like these it is important to know the
FND of the network before using it.
Routing refers to finding the best path for the gathered data so that it can be
delivered to the required destination. Routing is the critical part of saving a WSN's
energy, so different types of communication protocols have been introduced for this
purpose, such as flat (SPIN, GBR), location-based (GAR, SPAN) and hierarchical
(LEACH, TEEN) protocols [6]. In a flat protocol, every node senses and routes the
data to the base station, while in a location-based protocol the routing path is
established depending on the location of the sensor node. In hierarchical protocols,
the network is divided into clusters, and each cluster has one cluster head (CH)
chosen from among the network nodes; the CHs gather the information from the
nodes belonging to their clusters and send it to the base station [7–9]. The
hierarchical protocols have the advantage that all nodes expend their energy equally,
so the nodes stay alive as long as possible. LEACH is one of the widely known
clustering-based hierarchical protocols. Like many other hierarchical protocols, it has
two phases, the setup phase and the steady state phase. In the setup phase, the CHs
are chosen first in terms of energy, the node with more energy being more likely to
become a CH; then the clusters are formed by assigning each node to the nearest CH.
After the clusters are formed, the steady state begins: each node senses and collects
data and then sends the data to the CH it belongs to, and the CHs then send the data
collected from the nodes to the base station [10].
The main problem of each protocol is the choice of cluster heads among the sensor
nodes of the network. The second problem is sending data and gathering
information, and the third problem is how to keep the sensors active when there is
information to send and how to switch them to sleep mode when there is no
information to gather or transmit, and vice versa.
The sequence of this paper is as follows: in Sect. 2 we review related work on WSN
protocols by other researchers. Sections 3 and 4 present our proposed protocol, which
contains the Mamdani fuzzy logic for choosing cluster heads depending on two
variable factors, the distance from the node to the base station and the node's
residual energy, and the Genetic Algorithm (GA) with a fitness function for
organizing the fuzzy table. We then discuss the performance of our work compared
with the LEACH protocol in Sect. 5. Finally, Sect. 6 contains the conclusion, along
with ideas and proposals for future work and applications.
In this paper, we propose a new protocol based on fuzzy logic and the Genetic
Algorithm (GA). We use Mamdani fuzzy logic to choose the cluster heads by
considering two variables, the residual energy of the node and the distance between
the node and the base station. Like the LEACH protocol, our new protocol also has
two phases, a setup phase and a steady state phase; the selection of the cluster heads
is the challenge. In the proposed method, the input values, energy and distance, are
first analysed to find the fuzzy rules, and then suitable CHs are selected. The cluster
head selection becomes more efficient and the clusters are formed accurately. We use
the GA as a fitness function to optimize the fuzzy rules. At the beginning of each
round, the energy and the distance of all nodes are measured, and these measurements
are used to find the fuzzy rule for each node individually.

2 Related Work

Since hierarchical clustering-based protocol schemes have been the most effective
technique for saving sensor energy, and thereby extending WSN lifetime, many
variants of this technique have been introduced in recent years. These protocols are
classified into different types according to the techniques they are combined with,
such as classical methods, swarm intelligence based methods and fuzzy based methods.
LEACH (Low Energy Adaptive Clustering Hierarchy) is one of the most common
and oldest routing protocols for WSNs [8]. It consists of two phases, a setup phase
and a steady state phase. Each node decides by itself whether to become a CH,
without any control by the user: in the setup phase, each node selects a random
number between 0 and 1, and if the number is less than the threshold T(n), the node
becomes a cluster head for this round. The threshold is determined by:
$T(n) = \begin{cases} \dfrac{p}{1 - p\left(r \bmod \frac{1}{p}\right)}, & \text{if } n \in G \\ 0, & \text{otherwise} \end{cases}$  (1)

where p is the desired percentage of CHs, r is the round number, and G is the set of
all nodes that have not been elected as CH in the last 1/p rounds.
After the CHs have been chosen, they send a message to the other nodes informing
them that they have become CHs. Each non-CH node calculates the distance between
itself and the CHs based on the signal strength with which it received the message,
and the clusters are then formed by each CH and the nodes nearest to it. The steady
state phase begins after the clusters have been formed: each node sends its collected
information to the corresponding CH, and the CHs then transmit all the collected data
to the base station.
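A minimal sketch of the LEACH cluster-head election in Eq. 1 is given below; the node count and CH percentage are example values, not taken from the cited papers.

```python
import random

# Illustrative sketch (not the cited papers' code) of the LEACH cluster-head election
# in Eq. 1: each eligible node draws a random number and becomes a CH for the round
# if the draw falls below the threshold T(n).
def leach_threshold(p, r):
    """p: desired CH fraction (e.g. 0.05); r: current round number."""
    return p / (1.0 - p * (r % int(round(1.0 / p))))

def elect_cluster_heads(nodes, p, r, eligible):
    """`eligible` marks nodes that have not been CH in the last 1/p rounds (the set G)."""
    t = leach_threshold(p, r)
    return [n for n in nodes if eligible[n] and random.random() < t]

# Example: 100 nodes, 5 % CHs, round 3
nodes = list(range(100))
eligible = {n: True for n in nodes}
print(elect_cluster_heads(nodes, 0.05, 3, eligible))
```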
LEACH-C [10] and LEACH-DT [11] are both based on the LEACH protocol, but the
first one is a centralized variant that uses a simulated annealing algorithm for picking
the cluster heads, while the other considers the distance between the node and the
sink when selecting the CH.
Jin et al. [12] presented a clustering protocol based on a genetic algorithm that tries
to find appropriate CHs so as to minimize the clustering distances. Generally, the
evolutionary clustering approaches are used to dynamically cluster the sensor nodes
in the setup phase, in order to reduce the energy consumption and form appropriate
clusters.
Gupta et al. [13] proposed a centralized fuzzy logic scheme to select CHs depending
on the energy, concentration and centrality of the node. It has two phases like the
LEACH protocol, but the difference is in the choice of cluster heads, which depends
on the output value of the fuzzy rule system: the node with the better value becomes
the CH, and the clusters are then formed.

3 System Model

3.1 Network Model

We arrange a single-hop clustering-based WSN model, as can be seen in Fig. 1. All
sensor nodes are randomly spread and then remain static. We use normal nodes and
advanced nodes. As shown in Fig. 1, the normal nodes have 0.5 J of energy and are
marked by coloured plus signs, while the advanced nodes have 1 J of energy and are
marked by coloured circles, but all nodes have the same hardware capabilities. Each
node is provided with a GPS device, so it is aware of its own location and the
location of the sink. The nodes can communicate among themselves, and each of
them is able to communicate with the sink. The information gathering strategy
operates in rounds. In each round, the nodes gather the data and send

Fig. 1 Initial area and the nodes separation for the new FL-GA method

it to their CH. Then, the CHs send all the collected data to the sink. All nodes
communicate with each other and with their CHs using time division multiple access
(TDMA). All CHs use a medium access control (MAC) protocol to communicate
with the sink.
Clustered WSN protocols have more features for extending the network lifetime than
non-clustered protocols [8]. First, routing is easily administered; it reduces the
network traffic and improves the flexibility of the network. Second, the redundant
data in homogeneous WSNs can be discarded from the gathered data at the CHs;
therefore, the overall energy consumption is reduced because no redundant data need
to be transmitted. Third, the network bandwidth is conserved because excessive
message exchange between nodes is prevented.
The parameters used in this study are shown in Table 1.

Table 1 The parameters used as the entry for the FL-GA protocol

Parameter                        | Value
xm                               | 100 m
ym                               | 100 m
Sink.x                           | 50 m
Sink.y                           | 50 m
Number of nodes in the field     | 100
Initial energy                   | 0.5 J
Energy of transmitter            | 50 × 10⁻⁹
Energy of receiver               | 50 × 10⁻⁹
Energy of free space             | 10 × 10⁻¹²
Energy of multipath              | 0.0013 × 10⁻¹²
Energy of Data Aggregation (EDA) | 5 × 10⁻⁹
Percentage of advanced nodes     | 0.1
Maximum number of rounds         | 4000

We used MATLAB to simulate our method. At the beginning, all values are
initialized to zero. The initial area of the sensor node distribution which we obtained
is shown in Fig. 1.

3.2 Energy Model

In this paper, the network energy model from Heinzelman [10] is used. The energy
dissipated when an l-bit data packet is transmitted over a distance d is computed by:

ETX ðl; d Þ ¼ fl  Eelec þ l  efs  d 2 if d  d0 l  Eelec þ l  emp


ð2Þ
d 2
if d [ d0

ERX ðlÞ ¼ l  Eelec ð3Þ

where ðETX Þ is the wasted energy in the transmitter and ðERX Þ is the wasted energy
in the receiver. Eelec is the wasted electronic energy (per bit). The distance threshold
ðd0 Þ is found by dividing the free space ðefs Þ channel by the multipath fading
channel ðemp Þ.
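For illustration, the short Python sketch below (not part of the paper) evaluates Eqs. (2) and (3) with the radio parameters of Table 1; the packet length and distances used in the example are arbitrary.

```python
import math

# Radio parameters taken from Table 1 (per-bit energies)
E_ELEC = 50e-9       # electronics energy, J/bit
EPS_FS = 10e-12      # free-space amplifier coefficient, J/bit/m^2
EPS_MP = 0.0013e-12  # multipath amplifier coefficient, J/bit/m^4
D0 = math.sqrt(EPS_FS / EPS_MP)   # distance threshold d0 (about 87.7 m)

def e_tx(l_bits, d):
    """Transmitter energy for l bits over distance d, Eq. (2)."""
    if d <= D0:
        return l_bits * (E_ELEC + EPS_FS * d ** 2)
    return l_bits * (E_ELEC + EPS_MP * d ** 4)

def e_rx(l_bits):
    """Receiver energy for l bits, Eq. (3)."""
    return l_bits * E_ELEC

# Example: a 4000-bit packet sent over 60 m (below d0) and 120 m (above d0)
print(e_tx(4000, 60), e_tx(4000, 120), e_rx(4000))
```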

4 New Fuzzy-Genetic Based Protocol

4.1 Fuzzy Structure

Fuzzy logic system converts the normal values of the input to fuzzy variables using
the membership functions that have been memorized in the fuzzy learning system.
We used MATLAB Toolbox FIS (Fuzzy inference system), the system structure is
shown in Fig. 2. The stages of our FL-GA clustering protocol are similar to LEACH: a setup phase and a steady-state phase. The main difference between LEACH and FL-GA lies in the setup phase, where a node's chance of becoming a CH depends on the FIS results before the clusters are formed. The other difference is that LEACH depends on only one quantity (energy or distance) when choosing the CHs, whereas FL-GA depends on two variables, the energy and the distance.
The inference techniques and our fuzzy logic system design are illustrated in Fig. 2. As mentioned, our new FL-GA system has two input variables (Energy and Distance) and one output variable (Chance).
As shown in Fig. 3, for each input we used the same three membership values (Low, Med and High). Also, as shown in Fig. 3, for the output we used three membership values (Small, Medium and Large). Later the fuzzy inference

Fig. 2 New FL-GA system

system will determine our rules as shown in Table 2. The AND-based Mamdani fuzzy rule table used in FL-GA is shown in Table 2; the number of rules is equal to 3² = 9. We use Eqs. (2) and (3) to find the energy dissipated by the transmitter and the receiver during transmission and reception.
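As an illustration of how the rule base of Table 2 can be evaluated, the sketch below implements a simplified AND-based inference in Python. It is not the MATLAB FIS used in the paper: triangular membership functions are used instead of the actual ones, and the value ranges and output centroids are assumptions chosen only to exercise the rules.

```python
import numpy as np

# Rule table from Table 2: (energy term, distance term) -> chance term
RULES = {
    ("low", "low"): "medium", ("low", "med"): "small",  ("low", "high"): "small",
    ("med", "low"): "medium", ("med", "med"): "medium", ("med", "high"): "large",
    ("high", "low"): "large", ("high", "med"): "large", ("high", "high"): "large",
}
# Representative output values assumed for Small / Medium / Large
CHANCE = {"small": 0.2, "medium": 0.5, "large": 0.8}

def memberships(x, lo, hi):
    """Low/Med/High degrees of x on [lo, hi] using simple triangular MFs."""
    mid = (lo + hi) / 2.0
    return {
        "low":  np.interp(x, [lo, mid], [1.0, 0.0]),
        "med":  np.interp(x, [lo, mid, hi], [0.0, 1.0, 0.0]),
        "high": np.interp(x, [mid, hi], [0.0, 1.0]),
    }

def ch_chance(energy, distance, e_range=(0.0, 1.0), d_range=(0.0, 70.0)):
    """Crisp CH chance via min (AND) firing and a weighted-average defuzzifier."""
    mu_e = memberships(energy, *e_range)
    mu_d = memberships(distance, *d_range)
    num = den = 0.0
    for (te, td), out in RULES.items():
        w = min(mu_e[te], mu_d[td])      # AND-based rule firing strength
        num += w * CHANCE[out]
        den += w
    return num / den if den > 0 else 0.0

# A node with high residual energy that is close to the sink gets a high chance
print(ch_chance(energy=0.9, distance=10.0))
```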

4.2 Genetic Algorithm Structure

For the genetic algorithm, the clustering cost is found from the equation below:

z = \sqrt{(Sink.x - X)^{2} + (Sink.y - Y)^{2}}   (4)

The genetic parameters used in this paper are shown in Table 3.
After selecting the values, the parameters are initialized. Figure 4 shows how the GA works.
1. Initialize Position
2. Evaluation
3. Sort Population
4. Store Best Solution
5. Array to Hold Best Cost Values
6. Store Cost
The parent selection index is calculated from:

P = \frac{e^{-\beta C / W_{c}}}{\sum e^{-\beta C / W_{c}}}   (5)

In this equation, \beta is the selection pressure, C is the cost of an individual, and W_{c} is the worst cost in the population.
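A minimal sketch of this selection rule in Python, assuming roulette-wheel sampling over the weights of Eq. (5); the cost values in the example are only illustrative.

```python
import numpy as np

def select_parent_index(costs, beta, rng=np.random.default_rng()):
    """Roulette-wheel parent selection with selection pressure beta (Eq. 5)."""
    costs = np.asarray(costs, dtype=float)
    worst = costs.max()                      # worst (largest) cost in the population
    weights = np.exp(-beta * costs / worst)  # cheaper solutions get larger weights
    probs = weights / weights.sum()
    return rng.choice(len(costs), p=probs)

# Example: with beta = 8 (Table 3) the lowest-cost individual is picked most often
print(select_parent_index([0.05, 0.12, 0.30, 0.90], beta=8))
```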

Fig. 3 Membership functions for: a input 1 (energy), b input 2 (distance), and c output (chance)

Table 2 The Mamdani fuzzy rule table

Energy   Distance   Chance
Low      Low        Medium
Low      Med        Small
Low      High       Small
Med      Low        Medium
Med      Med        Medium
Med      High       Large
High     Low        Large
High     Med        Large
High     High       Large

Table 3 Parameters used in the GA

Parameter                      Value
Maximum number of iterations   5
Population size                100
Crossover percentage           0.8
Number of offspring            2 × round(0.8 × 100/2)
Mutation percentage            0.3
Number of mutants              round(0.3 × 100)
Gamma                          0.2
Mutation rate                  0.02
Selection pressure             8

In the main loop, we have the following steps (a short code sketch follows the list):


1. Crossover
2. Select Parents Indices
3. Select Parents
4. Apply Crossover
5. Evaluate Off springs
6. Mutation
7. Select Parent
8. Apply Mutation
9. Evaluate Mutant
10. Create Merged Population
11. Sort Population
12. Update Worst Cost
13. Truncation
14. Store Best Solution Ever Found
15. Store Best Cost Ever Found
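The following compact Python sketch puts the listed steps together for the cost of Eq. (4), using the settings of Table 3 and the field and sink values of Table 1. The crossover and mutation operators (arithmetic blend crossover with factor gamma and Gaussian mutation with a 10 m standard deviation) are assumptions, since the text does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

# GA settings from Table 3, field size and sink position from Table 1
MAX_IT, N_POP = 5, 100
PC, PM, GAMMA, MU, BETA = 0.8, 0.3, 0.2, 0.02, 8
XM, SINK = 100.0, np.array([50.0, 50.0])

def cost(pos):
    """Clustering cost of a candidate CH position, Eq. (4)."""
    return float(np.linalg.norm(SINK - pos))

def crossover(p1, p2):
    """Arithmetic (blend) crossover with extrapolation factor GAMMA."""
    alpha = rng.uniform(-GAMMA, 1 + GAMMA, size=p1.shape)
    return alpha * p1 + (1 - alpha) * p2, alpha * p2 + (1 - alpha) * p1

def mutate(p):
    """Gaussian mutation applied to each gene with probability MU."""
    q = p.copy()
    mask = rng.random(p.shape) < MU
    q[mask] += 10.0 * rng.standard_normal(int(mask.sum()))
    return q

# Steps 1-6: initialise, evaluate, sort, store best solution / best cost
pop = rng.uniform(0.0, XM, size=(N_POP, 2))
costs = np.array([cost(p) for p in pop])
order = np.argsort(costs)
pop, costs, best_costs = pop[order], costs[order], []

for _ in range(MAX_IT):                        # main loop
    probs = np.exp(-BETA * costs / costs.max())
    probs /= probs.sum()                       # parent probabilities, Eq. (5)
    kids = []
    for _ in range(round(PC * N_POP / 2)):     # crossover offspring
        i, j = rng.choice(N_POP, size=2, p=probs)
        kids.extend(crossover(pop[i], pop[j]))
    mutants = [mutate(pop[rng.choice(N_POP, p=probs)])
               for _ in range(round(PM * N_POP))]          # mutation
    merged = np.clip(np.vstack([pop, kids, mutants]), 0.0, XM)
    merged_costs = np.array([cost(p) for p in merged])
    keep = np.argsort(merged_costs)[:N_POP]    # sort and truncate
    pop, costs = merged[keep], merged_costs[keep]
    best_costs.append(costs[0])                # store best cost ever found

print("best CH position:", pop[0], "best cost:", best_costs[-1])
```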

Fig. 4 Flowchart of the Genetic Algorithm operation

5 Results and Analysis

The best cost function value of the GA after the first round is shown in Fig. 5. From the first up to the 4th iteration the best cost is 0.1654; in the last iteration this value is 0.05395. Good convergence indicates that the GA is working well. We noticed that the initial variations die out quickly, which indicates only a tiny amount of improvement over longer generations. It is then evident that beyond 100

Fig. 5 Best cost function of the GA after first round

Fig. 6 The sensitive points for node death (alive nodes vs. rounds)

generations, the objective function achieves a sufficiently stagnant fitness value and
stabilizes.
After this step, the CHs selected depending on the Fuzzy formula outputs. Then,
the cluster formation has been done.
The situation of the alive nodes is shown in Fig. 6, which gives a comparison between the proposed protocol and the LEACH protocol. In the LEACH method, the first node dies in round 507 and all nodes are dead by round 759. In the proposed method, the first node dies in round 801 and the last node dies in round 4000.
Figure 7 compares the packets transmitted to the base station over 4000 rounds.
Although the LEACH protocol transmits packets linearly over the rounds, it stops in round 759, as mentioned before. Even before all of its nodes are dead (before round 759), the LEACH protocol sends less information to the base station than the proposed method; and since the nodes of the proposed method only all die in round 4000, the proposed method sends much more information overall than the LEACH protocol.

Fig. 7 Packet to base station vs. number of rounds

Fig. 8 Cost function for 100 iterations

The cost function for 100 iterations is shown in Fig. 8.
The relationship between the cost and the iteration is close to linear, and the cost function in FL-GA rises only by a small amount, which means that the optimization function is efficient. In fact, the cost function has a sensitive slope that changes instantly when there are very small changes in the input values; for example, a small change in the energy dissipation range or in the available energy range can lead to a large difference in the function value. So, if there is a small change in the level of the cost function value, it will be clearly noticeable.

6 Conclusion

In this study, a new energy-efficient clustering-based routing protocol (named FL-GA) has been introduced, which uses a genetic algorithm and fuzzy logic to make the best use of the limited energy available in the network. A fuzzy inference system is used to choose the best cluster heads. We compared the simulation results with the LEACH protocol in terms of the first node death (FND) and last node death (LND) criteria. Our method selects the cluster heads more accurately, which leads to better-formed clusters; the GA fitness function further improves the cluster head selection and thereby increases the overall network lifetime. Because we took two variables into account, energy and distance, while the LEACH protocol used only the energy for picking the CHs, the number of data packets received at the sink is maximized.

References

1. Yick J, Mukherjee B, Ghosal D (2008) Wireless sensor network survey. Comput Netw
52:2292–2330
2. Sohrabi K, Gao J, Ailawadhi V, Pottie GJ (2000) Protocols for self-organization of a wireless
sensor network. IEEE Pers Commun 7
3. Akyildiz IF, Su W, Sankarasubramaniam Y, Cayirci E (2002) Wireless sensor networks: a
survey. Comput Netw 38(4):393–422
4. Mihajlov B, Bogdanoski M (2011) Overview and analysis of the performances of ZigBee
based wireless sensor networks. Int J Comput Appl 29:28–35 (0975 – 8887)
5. Maraiya K, Kant K, Gupta N (2011) Application based study on wireless sensor network. Int J
Comput Appl 21:9–15 (0975 – 8887)
6. Ming LY, Wong VW (2006) An energy-efficient multipath routing protocol for wireless
sensor networks. Int J Commun Syst 20(7):747–766
7. Xu Y, Govindan R, Estrin D (2001) Geographical and energy aware routing: a recursive data
dissemination protocol for wireless sensor networks. Technical report UCLA/
CSD-TR-01-0023, UCLA Computer Science Department
8. Heinzelman W, Chandrakasan A, Balakrishnan H (2000) Energy-efficient communication
protocol for wireless microsensor networks. In: Proceedings of the 33rd annual Hawaii
international conference on system sciences (HICSS), Maui, HI, January 2000
9. Intanagonwiwat C, Govindan R, Estrin D (2000) Directed diffusion: a scalable and robust
communication paradigm for sensor networks. In: Proceedings of ACM MobiCom,
Boston USA, pp 56–67. ACM
10. Heinzelman WB, Chandrakasan AP, Balakrishnan H (2002) An application-specific protocol
architecture for wireless microsensor networks. IEEE Trans Wireless Commun 1(4):660–670
11. Kang SH, Nguyen T (2012) Distance based thresholds for cluster head selection in wireless
sensor networks. IEEE Commun Lett 16(9):1396–1399
12. Jin S, Zhou M, Wu AS (2003) Sensor network optimization using a genetic algorithm. In:
Proceedings of the 7th world multiconference on systemics, cybernetics and informatics
13. Gupta I, Riordan D, Sampalli S (2005) Cluster-head election using fuzzy logic for wireless
sensor networks. In: Proceeding of the 3rd annual conference on communication networks
and services research, pp 255–260. IEEE Computer Society Washington
Sustainable Energy and Power Engineering
Comparison of Buck-Boost Derived
Non-isolated DC-DC Converters
in a Photovoltaic System

Jotham Jeremy Lourdes, Chia Ai Ooi, and Jiashen Teh

Abstract A DC-DC converter plays a crucial role in a photovoltaic (PV) system.


Power generated by the PV system is a function of solar irradiance and temperature.
Power voltage (P-V) characteristic of a PV module exhibits a single power peak at
uniform irradiance and temperature. To operate the PV array at its maximum power
point, a maximum power (MP) point (MPP) tracking (MPPT) algorithm is required.
The DC-DC converter placed in between the PV array and load, works as an
impedance matching circuit. Depending on the application, a suitable selection of
DC-DC converter is an important decision. In this study, a comparative simulation analysis of different buck-boost derived non-isolated DC-DC converters is discussed in terms of power conversion efficiency, output power ripple, and tracking speed. The converters studied are the buck-boost, Cuk, single-ended primary-inductor converter (SEPIC), and Zeta converters. To evaluate the performance of each DC-DC converter, a
widely used Perturb and Observe (PO) MPPT algorithm is modeled and simulated
in MATLAB Simulink.

Keywords Renewable energy · Photovoltaic system · DC-DC converter

1 Introduction

In recent years, the installation of photovoltaic (PV) panel has increased due to the
abundance of solar radiation, non-polluting, and low-maintenance characteristics.
The constantly changing surrounding temperature and irradiance affect the output
voltage and current from the PV system [1–7]. These changes cause the
power-voltage (P-V) curve to be non-linear. A maximum power (MP) point
(MPP) tracker (MPPT) is used with power converter to ensure that MP from PV
array is consistently achieved [1–3, 8–12].

J. J. Lourdes · C. A. Ooi (&) · J. Teh


School of Electrical and Electronic Engineering, Universiti Sains Malaysia,
Engineering Campus, 14300 Nibong Tebal, Penang, Malaysia
e-mail: [email protected]


With the aid of MPPT algorithm, the duty cycle (D) is controlled till load
matching between PV source and load is achieved and thus, extraction of MP is
feasible. Based on the D, the output voltage (Vout) of a converter can be increased
or otherwise [8, 11, 13].
Besides MPPT, the quality of power output (Pout) also depends on the converter
design. Among the available non-isolated DC-DC converters, buck, boost,
buck-boost, Cuk, single-ended primary-inductor converter (SEPIC) and Zeta con-
verters are the most commonly used converters in the PV systems. Such converters
are usually cheaper and simpler in design than the isolated DC-DC converter [14].
The choice of DC-DC converter also plays an important part in utilizing the system output fully and in safeguarding the overall operation [15]. The converter's switching frequency, inductance, capacitance and other parameters play an important role in ensuring that the converter is efficient.
A power converter in a PV system, along with the MPPT algorithm, is installed between the PV module and the load and is used to transfer power from the source to the load. Each converter has its own operating method, and the converters differ in terms of cost, efficiency, size and volume of the structure, robustness, power density, and constructive aspects. The design of the converter is important to ensure reliability and safety and to produce maximum efficiency in a PV system.
The main interest of this paper is to investigate the differences between the
available non-isolated buck-boost derived DC-DC converters, i.e. buck-boost, Cuk,
SEPIC, and Zeta converters in terms of topology, power conversion efficiency, Pout
ripple, and tracking speed under several conditions.
This paper is arranged as follows: Sect. 2 describes the operation of the con-
verters used in this work, Sect. 3 shows the methodology of this comparative
research, Sect. 4 explains the results, and finally a conclusion and future exploration
are provided in Sect. 5.

2 Non-isolated DC-DC Converters

In order to extract MP from the PV module, it is important to maintain the load


impedance equal to source impedance. Therefore, an impedance matching circuit
(converter) is necessary between PV panel and the load. The purpose of this
impedance circuit is to adjust the internal impedance of the PV module at which it
can operate at its MP [8, 11, 16].

[Figure 1 plots the I-V curves (a) and P-V curves (b) for irradiance levels from 100 W/m² to 1000 W/m²; the marked operating points are Vmpp = 54.70 V, Impp = 5.76 A, MPP = 315.1 W, Rmpp = 9.50 Ω at 1000 W/m², and Vmpp = 51.56 V, Impp = 0.58 A, MPP = 29.75 W, Rmpp = 89.4 Ω at 100 W/m².]

Fig. 1 Basic a I-V and b P-V curve characteristic of PV module at 25 °C

Figure 1 shows the (a) I-V and (b) P-V curves of the PV module at 25 °C for
various irradiance levels ranging from 100 W/m2 to 1000 W/m2. It can be observed
that the output power from PV module increases as the irradiance level increases
and for each irradiance value, the source impedance at MPP is unique. This relation
of source impedance and irradiance level will be clarified in Sect. 3.
In this work, four (4) types of buck-boost derived converters are considered.
These are buck-boost, Cuk, SEPIC, and Zeta converters. Figure 2 shows the basic
PV system used in this work. The output from PV module is connected to the input
side of the converter. The output from the converter is connected to MPPT which is
sent back to the switch in the converter in terms of D. At the same time, the
converter output is sent to the scope to record the transient response of the
waveforms. The topology and the parameter of converter are shown in Sect. 3.
Perturb and observe (PO) algorithm is used as the MPPT for this work due to its
good response time and accuracy. The flowchart of PO algorithm can be found in
[2, 3, 12, 13]. PO records the Pout before and after the perturbation, if the power
increases, the perturbation continues in same direction, else, the perturbation is
reversed. The perturbation is continued until there is an oscillation at the MPP.
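A minimal duty-cycle based P&O sketch in Python is given below. It is not the Simulink model used in the paper; the step size, duty-cycle limits and sample values are arbitrary illustrative choices.

```python
def perturb_and_observe(p, state, step=0.01, d_min=0.05, d_max=0.95):
    """One P&O step on the duty cycle D, following the rule described above:
    if the power increased since the last perturbation keep the direction,
    otherwise reverse it."""
    if p < state["p_prev"]:
        state["dir"] = -state["dir"]
    state["d"] = min(max(state["d"] + state["dir"] * step, d_min), d_max)
    state["p_prev"] = p
    return state["d"]

# Toy usage: feed the measured PV power once per control period
state = {"p_prev": 0.0, "d": 0.5, "dir": +1}
for p_meas in [280.0, 295.0, 310.0, 305.0, 312.0]:
    d = perturb_and_observe(p_meas, state)
    print(round(d, 3))
```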

Fig. 2 PV system set-up in MATLAB Simulink

2.1 Buck-Boost Converter

Buck-boost is a converter formed by cascading buck and boost converters; hence, it


is capable of performing tracking over the entire I-V curve region. Figure 3 shows the
buck-boost converter topology. At ON state (switch is closed), there is a short
circuit at the switch and open circuit at the diode since it is reversed biased. The
current from the source will flow into the inductor, and its energy will rise. At this
time, the load receives continuous current from the capacitor. At OFF state (switch
is opened), there is an open circuit at the switch and short circuit at the diode since it
is forward biased. The energy will be discharged from the inductor, and the current
will flow to the load and capacitor through the diode [1, 3, 11]. The Vout of this
converter can be higher or lower than the input voltage (Vin) depending on the D
value of the converter. The relationship between the Vout, Vin, output current (Iout),
input current (Iin), output impedance (Rout), and input impedance (Rin) are shown in
Eq. 1.

V_{out}/V_{in} = D/(1-D)
I_{out}/I_{in} = (1-D)/D        (1)
R_{in}/R_{out} = (1-D)^{2}/D^{2}

The complexity of this converter is medium. Its drawbacks are the inverted Vout polarity compared to its input and the resulting high load current ripple. Furthermore, the efficiency of the buck-boost converter is also lower than that of the SEPIC and Cuk converters. To overcome these drawbacks [1, 3, 8, 11, 17–20], some derivations were made to its circuitry and new converters emerged, i.e. Cuk, SEPIC and Zeta. The tracking region can be found in [8, 11, 21].
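As a numerical illustration of Eq. 1 (a sketch, not part of the original study): since Rin/Rout = (1 − D)²/D², the duty cycle that makes the converter present a desired MPP impedance to the PV side can be solved in closed form. With the 9.50 Ω and 89.4 Ω source impedances noted in Fig. 1 and the 30 Ω load used later in this work, the result is roughly the range the MPPT settles into in Fig. 8(b).

```python
import math

def matching_duty_cycle(r_mpp, r_load):
    """Solve R_in/R_out = (1 - D)^2 / D^2 of Eq. 1 for the duty cycle D."""
    return 1.0 / (1.0 + math.sqrt(r_mpp / r_load))

print(round(matching_duty_cycle(9.50, 30.0), 3))   # ~0.64 at 1000 W/m^2
print(round(matching_duty_cycle(89.4, 30.0), 3))   # ~0.37 at 100 W/m^2
```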


Fig. 3 Circuit topology of buck-boost converter


Fig. 4 Circuit topology of Cuk converter

2.2 Cuk Converter

Similar to buck-boost converter, Cuk converter also tracks over all I-V curve
regions and gives inverted Vout polarity. The difference between a buck-boost and
Cuk converter is, Cuk converter uses additional inductor and capacitor to store
energy and transfer power. Figure 4 shows the Cuk converter circuit topology.
At ON state, the diode is opened and the switch is shorted. The current in the
inductor 1 increases linearly and capacitor 1 transfers its energy to inductor 2 and
capacitor 2. At OFF state, the diode is forward biased, and it carries the current from
inductor 1 and inductor 2 and transfers to capacitor 1 and capacitor 2 respectively.
This results in a reduction of inductors current value [3, 11]. The Vout, Vin, Iout, Iin,
Rout, Rin equations of Cuk converter are similar to Eq. 1 respectively. The com-
plexity of this converter is high. The drawback of Cuk converter is it has high load
current ripples.

2.3 SEPIC Converter

SEPIC converter is unique compared to buck-boost and Cuk converter as it over-


comes the drawback of inverted Vout polarity in buck-boost and Cuk converters.
The SEPIC converter also is able to track over all the I-V curve regions. Similar to
Cuk converter, the energy transferring elements are inductor and capacitor. Figure 5
1028 J. J. Lourdes et al.


Fig. 5 Circuit topology of SEPIC converter


Fig. 6 Circuit topology of Zeta converter

shows the circuit topology of SEPIC converter. At ON state, the diode is in open
circuit and the switch is in short circuit. The inductor 1 receives the energy from the
source. Capacitor 1 charges the inductor 2 and capacitor 2 keeps the current flowing
in the load. At OFF state, the diode is in short circuit and it starts conducting. The
inductor 1 transfers energy to capacitor 1, and current flows from inductor 2 to the
capacitor 2 and load [3, 11]. The equations of the Vout, Vin, Iout, Iin, Rout, Rin are as
in Eq. 2 below. The complexity of this converter is high. The drawbacks are that it produces high load current ripple and high Vout ripple.

V_{out}/V_{in} = D/(1-D)
I_{out}/I_{in} = (1-D)/D        (2)
R_{in}/R_{out} = (1-D)^{2}/D^{2}

2.4 Zeta Converter

Figure 6 shows the circuit topology of Zeta converter. At ON state, there is a short
circuit at the switch and open circuit at the diode. The inductor 1 and inductor 2 are at
charging phase, their current increases linearly. The capacitor 1 and capacitor 2 are
discharging, and energy is transferred to the load. At OFF state, the diode is in short
circuit and the switch is in open circuit. Energy in inductor 1 is transferred to capacitor
1, whereas the energy in inductor 2 is transferred to capacitor 2 and load. Hence, their

current decreases linearly [1, 3, 11]. Similar to the other converters, it can track over
all I-V curve regions. It also has a non-inverting characteristic like SEPIC converter.
The equations are presented in Eq. 2. Both inductor and capacitor are the energy
transferring elements. The complexity of this converter is high. The Zeta converter improves on the buck-boost by minimizing the Iout ripple [1, 3, 11, 18–20].

3 Methodology

The circuit topology of buck-boost, Cuk, SEPIC and Zeta converters for PV
application is modeled in MATLAB Simulink. In this work, PV array is built based
on the parameter specifications in Table 1.
In Fig. 1, it is noticeable that the MP can be harvested when the impedance level
is 9.50 Ω for irradiance level of 1000 W/m2 and 89.4 Ω for irradiance level of
100 W/m2. The source impedance for each irradiance value is listed in Table 2.
These values are calculated based on the Ohm’s law and Power law shown in Eq. 3.
The load impedance was set at 30 Ω which is within the operating range of the PV
module.

Voltage = Current × Impedance
Power = Voltage × Current        (3)
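For instance (illustrative only), the 1000 W/m² entry of Table 2 follows directly from the MPP values of Table 1:

```python
v_mpp, i_mpp = 54.70, 5.76        # MPP voltage and current at 1000 W/m^2 (Table 1)
p_mpp = v_mpp * i_mpp             # Power = Voltage x Current      -> ~315.1 W
r_mpp = v_mpp / i_mpp             # Impedance = Voltage / Current  -> ~9.50 ohm
print(round(p_mpp, 1), round(r_mpp, 2))
```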

The specifications of the components for the converters are stated in Table 3.
The component values are set equal for each converter for a fair comparison. To
justify the component values, the transient waveforms of Vin, Vout, Iin, Iout, input
power (Pin) and Pout of PV module under irradiance level of 1000 W/m2 and
temperature value of 25 °C were recorded and are shown in Fig. 7. Figure 7 (b) and
(d) show that the Vout and Iout of buck-boost and Cuk converters are inverted as
described in Sect. 2.1 and Sect. 2.2.
Figure 8 (a) and (b) indicate the measurement of power conversion efficiency
(%), Pout ripple (%) and tracking speed (settling time (s)). The steady state
(SS) represents a situation in which the waveform has converged, or the oscillation
remains within a certain range. The settling time is the time taken to achieve the
steady state and it is used to resemble the tracking speed of the converter.
The power conversion efficiency (%) is calculated from the Eq. 4. The Pout
ripple (%) is calculated from the Eq. 5, and the settling time (s) is obtained from the
respective scope in the simulation shown in Fig. 8 (b).

Power conversion efficiency (%) = (Pout at SS / Pin at SS) × 100   (4)

Pout ripple (%) = ((Highest peak at SS − Lowest peak at SS) / Pout at SS) × 100   (5)
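The sketch below shows one way to compute these three figures of merit from recorded waveforms in Python. It is an illustrative post-processing routine, not the scope measurement used in the paper: the steady-state window (last 25% of the record) and the +/- 2% settling band are assumptions.

```python
import numpy as np

def steady_state_metrics(t, p_out, p_in, tol=0.02):
    """Efficiency (Eq. 4), Pout ripple (Eq. 5) and settling time from waveforms."""
    t, p_out, p_in = map(np.asarray, (t, p_out, p_in))
    ss = t >= t[0] + 0.75 * (t[-1] - t[0])          # assumed steady-state window
    p_ss, pin_ss = p_out[ss].mean(), p_in[ss].mean()
    efficiency = 100.0 * p_ss / pin_ss                              # Eq. (4)
    ripple = 100.0 * (p_out[ss].max() - p_out[ss].min()) / p_ss     # Eq. (5)
    inside = np.abs(p_out - p_ss) <= tol * p_ss
    if inside.all():
        settling_time = 0.0
    else:
        # first sample index after the last excursion outside the band
        idx = len(inside) - int(np.argmin(inside[::-1]))
        settling_time = t[min(idx, len(t) - 1)] - t[0]
    return efficiency, ripple, settling_time
```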

Table 1 PV module specifications


Parameters Values
PV module type SunPower SPR-315E-WHT-U
Parallel strings 1
Series-connected modules per string 1
Maximum power (W) 315.072
Cells per module (Ncell) 96
Open circuit voltage Voc (V) 64.6
Short-circuit current Isc (A) 6.14
Voltage at MPP Vmpp (V) 54.7
Current at MPP Impp (A) 5.76

Table 2 The calculated matching impedance for each irradiance level


Irradiance (W/m2) Voltage (V) Current (A) Power (W) Impedance (Ω)
100 51.56 0.58 29.75 89.4
200 52.86 1.15 61.04 45.8
300 53.65 1.73 92.74 31.0
400 54.05 2.31 124.6 23.4
500 54.32 2.88 156.5 18.9
600 54.45 3.46 188.4 15.7
700 54.59 4.03 220.2 13.5
800 54.60 4.62 252.0 11.8
900 54.96 5.16 283.5 10.7
1000 54.70 5.76 315.1 9.50

Table 3 Converter topology specifications

Components        Buck-boost       Cuk    SEPIC  Zeta
Impedance (Ω)     30               30     30     30
Capacitor 1 (F)   1e-4             1e-4   1e-4   1e-4
Capacitor 2 (F)   Not applicable   1e-4   1e-4   1e-4
Inductor 1 (H)    3e-3             3e-3   3e-3   3e-3
Inductor 2 (H)    Not applicable   3e-3   3e-3   3e-3


Fig. 7 The transient waveform of a Vin, b Vout, c Iin, d Iout, e Pin, and f Pout over the timeframe of
0.2 s

[Figure 8a marks the highest and lowest peaks at steady state, the steady-state region and the settling time on the buck-boost output power waveform; Fig. 8b shows the duty-cycle traces and the settling time for each converter.]

Fig. 8 a Power conversion efficiency (%), Pout ripple (%), and b settling time (s)

4 Simulation Results and Discussion

The simulation results of PV module under 100 W/m2 to 1000 W/m2 at 25 °C are
shown in Table 4, Table 5 and Table 6. Table 4 shows the power conversion
efficiency of each converter, Table 5 shows the Pout ripple of each converter and
Table 6 shows the tracking speed of each converter at specified irradiance level.
Figure 9 shows the power conversion efficiency (%) of converters under dif-
ferent irradiance levels. It can be observed that as the irradiance level decreases the
converters power conversion efficiency decreases. At 1000 W/m2, the average
power conversion efficiency of all converters is 95.76%, whereas at 100 W/m2, the
average efficiency is 87.40%. Furthermore, from Table 4, the Cuk converter performs the best with an average power conversion efficiency of 93.50%, while Zeta yields the lowest efficiency of 93.40%. The validity of the curve in Fig. 9 is confirmed by [21–23],
which also justifies that the values of the components are within the operating
range.

Figure 10 shows the Pout ripple (%) of converters under different irradiance
levels. Buck-boost, Cuk and Zeta converters show an overall decreasing trend in
terms of Pout ripple (%), whereas no trend can be observed for SEPIC converter.
Furthermore, it can be observed from Table 5 that, the Zeta converter produces the
smallest ripple among all the converters with an average of 6.66% ripple at steady
state and SEPIC converter producing the highest ripple with 18.81%.
Figure 11 shows the relationship between the converters and the tracking speed
(settling time) at different irradiance levels. No distinct pattern can be observed, but
from Table 6, it is known that the buck-boost converter is the fastest to reach steady
state with an average time of 0.038 s, and the Cuk converter is the slowest, with an average time of 0.072 s.
Table 7 shows the summarized performance comparison of the converters. There
are trade-offs between the converters. It is critical to choose the converters based on
the applications. Zeta converter performs slightly better than the rest.

Table 4 The converter’s power conversion efficiency (%) under different irradiance levels (W/m2)
Irradiance (W/m2) Buck-boost Cuk SEPIC Zeta Mean
100 87.37 87.65 87.37 87.20 87.40
200 90.78 91.04 90.91 90.73 90.86
300 92.39 92.46 92.52 92.37 92.44
400 93.46 93.45 93.46 93.46 93.46
500 94.17 94.09 94.16 93.99 94.10
600 94.61 94.66 94.50 94.60 94.59
700 94.99 94.98 95.03 94.99 95.00
800 95.27 95.29 95.15 95.30 95.25
900 95.57 95.57 95.58 95.57 95.57
1000 95.78 95.78 95.70 95.78 95.76
Mean 93.44 93.50 93.44 93.40

Table 5 The converter’s Pout ripple (%) under different irradiance levels (W/m2)
Irradiance (W/m2) Buck-boost Cuk SEPIC Zeta Mean
100 31.74 47.37 37.23 17.85 33.55
200 22.46 27.80 17.53 7.40 18.80
300 16.53 21.62 13.47 7.70 14.83
400 10.66 9.92 15.61 5.16 10.34
500 6.47 6.20 13.30 4.37 7.58
600 7.17 6.37 19.35 5.63 9.63
700 6.28 1.79 4.02 4.09 4.04
800 8.27 7.66 35.97 5.25 14.29
900 5.66 0.32 10.20 3.94 5.03
1000 6.72 5.35 21.40 5.26 9.68
Mean 12.20 13.44 18.81 6.66

Table 6 The converter’s settling time (s) under different irradiance levels (W/m2)
Irradiance (W/m2) Buck-boost Cuk SEPIC Zeta Mean
100 0.052 0.088 0.116 0.058 0.079
200 0.040 0.066 0.051 0.038 0.049
300 0.023 0.058 0.045 0.034 0.040
400 0.026 0.121 0.056 0.043 0.062
500 0.040 0.049 0.035 0.039 0.041
600 0.035 0.073 0.055 0.043 0.052
700 0.042 0.050 0.040 0.054 0.047
800 0.041 0.057 0.044 0.043 0.046
900 0.044 0.093 0.117 0.059 0.078
1000 0.041 0.064 0.053 0.052 0.053
Mean 0.038 0.072 0.061 0.046


Fig. 9 The power conversion efficiency (%) of each converter under different irradiance levels


Fig. 10 The Pout ripple (%) of each converter under different irradiance levels


Fig. 11 The settling time (s) of each converter under different irradiance levels

Table 7 The summarized performance comparison of converters


Comparison Buck-boost Cuk SEPIC Zeta
Power conversion efficiency Good Best Good Good
Output power ripple Bad Bad Worst Best
Tracking speed Best Worst Bad Good

5 Conclusion

A comparison of four (4) types of buck-boost derived DC-DC converters was carried out for a PV system under several irradiance levels. MATLAB Simulink was used to design and perform the simulation, and PO was used as the MPPT algorithm for this work. It is observed that when the irradiance level is altered, there is a unique trend in the power conversion efficiency of all the converters. Furthermore, it is concluded that each converter has its advantages and disadvantages. A trade-off has to be made when selecting the appropriate converter for an application, with the Zeta converter performing slightly better than the rest under the given conditions.
This comparative research can be further explored using 1) different MPPT algo-
rithms such as Incremental Conductance, Cuckoo Search Algorithm and others; 2)
wider range of component values to investigate its individual effect on each con-
verter performance; and 3) under partial shading conditions to find its dependency
on the surrounding condition.

Acknowledgments This work was supported by Universiti Sains Malaysia under the project
1001/PELECT/8014137.

References

1. Chaurasia MK, Gidwani L (2017) Analysis of PV module with buck-boost and zeta
converters with incremental conductance maximum power point tracking method. In: 2017
International Conference On Smart Technologies For Smart Nation (SmartTechCon). IEEE,
Bangalore, pp 1591–1596
2. Husain MA, Tariq A, Hameed S (2017) Comparative assessment of maximum power point
tracking procedures for photovoltaic systems. Green Energy Environ. 5–17
3. Soedibyo AB, Ashari M (2015) The comparative study of Buck-boost, Cuk, Sepic and Zeta
converters for maximum power point tracking photovoltaic using P&O method. In: 2015 2nd
international conference on information technology, computer, and electrical engineering
(ICITACEE). IEEE, Semarang, pp 327–332
4. Huynh DC, Dunnigan MW (2016) Development and comparison of an improved incremental
conductance algorithm for tracking the MPP of a solar PV panel. IEEE Trans Sustain Energy
1421–1429
5. Dolara A, Grimaccia F, Mussetta M (2018) An evolutionary-based MPPT algorithm for
photovoltaic systems under dynamic partial shading. Applied Sciences, 558
6. Haque A (2014) Maximum Power Point Tracking (MPPT) scheme for solar photovoltaic
system. energy technology & policy, 115–122
7. Sivachandran P, Lakshmi PS, Janani R (2015) Survey of maximum power point tracking
techniques in solar PV system under partial shading conditions. ARPN J Eng Appl Sci 256–264
8. Taghvaee MH, Radzi MAM, Moosavain SM (2013) A current and future study on
non-isolated DC–DC converters for photovoltaic applications. Renew Sustain Energy Rev
216–227
9. Ram JP, Rajasekar N, Miyatake M (2017) Design and overview of maximum power point
tracking techniques in wind and solar photovoltaic systems: a review. Renew Sustain Energy
Rev 1138–1159
10. Onat N (2010) Recent Developments in maximum power point tracking technologies for
photovoltaic systems. Int J Photoenergy, 11 p
11. Dileep G, Singh SN (2017) Selection of non-isolated DC-DC converters for solar photovoltaic
system. Renew Sustain Energy Rev, 1230–1247
12. Seyedmahmoudian M, Horan B, Soon TK (2016) State of the art artificial intelligence-based
MPPT techniques for mitigating partial shading effects on PV systems – A review. Renew
Sustain Energy Rev, 435–455
13. Saravanan S, Babu NR (2016) Maximum power point tracking algorithms for photovoltaic
system – a review. Renew Sustain Energy Rev, 192–204
14. Elankurisil SA, Dash SS (2011) Comparison of isolated and non-isolated bi-directional DC -
DC converter for DC motor. Int J Res Appl Sci Eng Technol (IJRASET)
15. Jiyong L, Honghua W (2009) A novel stand-alone PV generation system based on variable
step size INC MPPT and SVPWM control. In: 2009 IEEE 6th international power electronics
and motion control conference. IEEE, Wuhan, pp 2155–2160
16. Ramki T, Tripathy LN (2015) Comparison of different DC-DC converter for MPPT
application of photovoltaic system. In: international conference on electrical, electronics,
signals, communication and optimization (EESCO). IEEE, Visakhapatnam, pp 1–6
17. Tytelmaier K, Husev O, Veligorskyi O (2016) A review of non-isolated bidirectional dc-dc
converters for energy storage systems. In: 2016 II international young scientists forum on
applied physics and engineering (YSF). IEEE, Kharkiv, pp 22–28
18. Izadian A, Khayyer P, Yang H (2012) Adaptive voltage tracking control of zeta buck-boost
converters. In: 2012 IEEE energy conversion congress and exposition (ECCE). IEEE,
Raleigh, pp 414–417
19. Kambli O, Arun L (2016) Comparison between Zeta Converter and Boost Converter using
Sliding Mode Controller. Int J Eng Res Technol (IJERT)

20. Babu PR, Prasath SR, Kiruthika R (2015) Simulation and performance analysis of CCM Zeta
converter with PID controller. In: 2015 international conference on circuits, power and
computing technologies [ICCPCT 2015]. IEEE, Nagercoil, pp 1–7
21. Sivakumar S, Sathik MJ, Manoj PS (2016) An assessment on performance of DC–DC
converters for renewable energy applications. Renew Sustain Energy Rev, 1475–1485
22. Yan Z, Chaoyi Z, Jinjun L (2012) Comparison of conventional DC-DC converter and a family
of diode-assisted DC-DC converter. In: Proceedings of the 7th international power electronics
and motion control conference. IEEE, Harbin, pp 1718–1723
23. Rosas-Caro JC, Ramirez JM, Peng FZ (2010) A DC-DC multilevel boost converter. IET
Power Electron 129–137
Fault Localization and Detection
in Medium Voltage Distribution
Network Using Adaptive Neuro-Fuzzy
Inference System (ANFIS)

N. S. B. Jamili, Mohd Rafi Adzman, Wan Syaza Ainaa Wan Salman,


M. H. Idris, and M. Amirruddin

Abstract This paper provides an overview of an ANFIS-based method to improve the performance of locating and detecting a single line to ground fault in a medium voltage (MV) distribution power system network. For this research, the proposed method used fault current signals from the network as inputs to the ANFIS to
obtain the expected fault location and detection. In order to accurately locate the
faults in the network, an ANFIS was trained and tested using various sets of data,
which were obtained from the simulation of faults under various fault scenarios such as inception angle, load, fault location, and fault resistance. Based on the results obtained, the fault distance and the main and minor branch of the single line to ground fault can be located with minimal error.


Keywords Single line to ground · Fault location · Fault detection · Medium voltage · ANFIS · Distribution network

1 Introduction

In electrical power systems network, transmission and distribution lines are vital
links that achieve the continuity of service from the generating plants to the end
users. Protection systems for transmission lines are one of the most important parts
of power systems. Fault location is a desirable feature in any protection scheme.
Accurate location of transmission line faults has been a subject of interest in recent
years [1, 2].
The major reason for this activity is that an accurate location of the fault can
reduce the time required for restoring service to customers [3]. The requirement to
improve service quality, forces the utilities to rapidly locate faults in their distri-
bution networks thus improving service continuity indexes and avoiding penalties

N. S. B. Jamili (&) · M. R. Adzman · W. S. A. W. Salman · M. H. Idris · M. Amirruddin


School of Electrical System Engineering, Universiti Malaysia Perlis, Kampus Pauh Putra,
02600 Arau, Perlis, Malaysia
e-mail: [email protected]


in case of failures to accomplish continuity indexes [4]. There were many


researchers who tried to solve this problem with their own methods. One of the
most recent methods is by using an adaptive neuro-fuzzy inference system
(ANFIS). By referring to [5, 6], the system is designed to determine the type of the
fault and the fault resistance. While [7–10] advances the research to determine the
fault condition, the inception angle, fault impedance and fault distance in the
transmission line. Along with those researchers, [11–15] combined ANFIS and
another method to find the fault location in an electrical network. The respective
studied method would find the fault location with increased accuracy and reliability
for locating all types of earth fault at a certain type of medium voltage distribution
network. However, another electrical system in another region may need different
specifications or requirements to find the fault location. Thus, the research must be
continued. In this study, the ANFIS hybrid algorithm is used to find the fault
location and detection in a typical 11 kV medium voltage (MV) distribution
network.

2 ANFIS Concept

The ANFIS is an integrated fuzzy Sugeno model in which the final fuzzy inference system is optimized via ANN training [4]. The ANFIS approach targets a single output from several given inputs. The target is shaped by the membership function curves evaluated for a particular input datum. The curve parameters are identified from the respective rule weights obtained as the product of the created learning rules. A ratio between the individual and overall weighted values is calculated, and this ratio is scaled by the parameters of the output membership functions. Finally, ANFIS predicts the target by producing the overall summed value as the output. The membership function parameters on the input and output sides are adjusted through a learning process to reach the targeted values.
ANFIS uses a hybrid algorithm that consists of a combination between
back-propagation and least-square estimation techniques. The techniques are
implemented in an artificial neural network as a learning algorithm that gives very
fast convergence and more accuracy in ANFIS targets. The ANFIS makes use of a
hybrid learning rule to optimize the fuzzy system parameters of first-order Sugeno
system, which can be graphically represented by Fig. 1. It maps inputs through
input membership functions and associated parameters, and then through output
membership functions to outputs [6]. The process starts with fuzzification, followed by the “IF-THEN” rules, normalization, and defuzzification, and ends with neuron addition.
The nodes represented with squares are adaptable nodes. Adaptable nodes are
where their parameters are adjustable.

Fig. 1 Example of the structure of an ANFIS model with two input data and two MFs

The structure of the network is composed of 5 steps, described as follows [16] (a numerical sketch is given after the list):


i. Fuzzification
Its function is to convert the input signal to a fuzzy signal in which the signal
is yielded via the input side of the MF curve. The curve is performed by using
the following equations:

X_{i}(x) = \frac{1}{1 + \left((x - c_{i})/a_{i}\right)^{2b_{i}}}   (1)

Y_{i}(y) = \frac{1}{1 + \left((y - c_{i})/a_{i}\right)^{2b_{i}}}   (2)

where Xi(x) and Yi(y) are the fuzzified values for each input datum, and ai, bi and ci are the parameter sets of the generalized bell input membership functions.
ii. IF-THEN
An output signal from the fuzzification stage becomes an input to the stage of
the ‘IF-THEN’ rule. In this stage, the fuzzy signal is gained by using Eq. (3).

R_{i} = X_{i}(x) \times Y_{i}(y), \quad i = 1, 2, 3, 4   (3)

iii. Normalization
Next, the output signal from the ‘IF-THEN’ stage becomes an input signal and is divided by the sum of the gained signals using the following equation:

N_{i} = \frac{R_{i}}{R_{1} + R_{2} + R_{3} + R_{4}}, \quad i = 1, 2, 3, 4   (4)

iv. Defuzzification
In this stage, a normalized signal is gained again through a linear equation that
is formed from the MF of the output signal as shown in the following
equation:

G_{i} = N_{i}\,(p_{i}x + q_{i}y + r_{i}), \quad i = 1, 2, 3, 4   (5)

where pi, qi and ri are the parameters of the linear output function.
v. Neuron addition
The last process in the ANFIS operation is called neuron addition in which all
defuzzification signals, Gi are added together as shown below:
O_{T} = \sum_{i=1}^{4} G_{i}   (6)

OT is a predicted value.
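To make the five steps concrete, the following Python sketch evaluates the forward pass of the two-input, two-MF ANFIS of Fig. 1 (four rules). The parameter values at the end are placeholders used only to exercise the structure, not trained values.

```python
import numpy as np

def gbell(x, a, b, c):
    """Generalized bell membership function used in Eqs. (1)-(2)."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

def anfis_forward(x, y, prem_x, prem_y, conseq):
    """Forward pass of a two-input, two-MF, four-rule first-order Sugeno ANFIS.

    prem_x, prem_y : two (a, b, c) premise parameter sets per input
    conseq         : four (p, q, r) consequent parameter sets, one per rule
    """
    mu_x = [gbell(x, *p) for p in prem_x]          # fuzzification, Eq. (1)
    mu_y = [gbell(y, *p) for p in prem_y]          # fuzzification, Eq. (2)
    R = [mx * my for mx in mu_x for my in mu_y]    # rule firing strengths, Eq. (3)
    N = [r / sum(R) for r in R]                    # normalization, Eq. (4)
    # linear consequents, Eq. (5), and neuron addition, Eq. (6)
    G = [n * (p * x + q * y + r) for n, (p, q, r) in zip(N, conseq)]
    return sum(G)

prem_x = [(1.0, 2.0, 0.0), (1.0, 2.0, 1.0)]
prem_y = [(1.0, 2.0, 0.0), (1.0, 2.0, 1.0)]
conseq = [(0.5, 0.1, 0.0), (0.2, 0.4, 0.1), (0.3, 0.3, 0.2), (0.1, 0.6, 0.3)]
print(anfis_forward(0.4, 0.7, prem_x, prem_y, conseq))
```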

3 Power Distribution System

To test the accuracy of the ANFIS in determining the location of the fault, a circuit
with multiple branches is used. An 11 kV MV distribution network based on the distribution power system network located in Perlis, Malaysia is selected to test the hybrid model. The network was drawn using ATPDraw. Figure 2 shows the selected feeder from the MV distribution network. The network consists of four main branches (CB1, CB2, CB3, CB4) and 17 minor branches (CB11, CB12, CB13, CB14, CB31, CB32, CB …, CB410) that are connected to the loads. CB and V in Fig. 2 denote the current and voltage probes used to measure the signals.

4 Proposed Methodology

The following are the total data used:


Number of training samples = 4104
Number of testing samples = 324
The following are the factors used in constructing the training data:
Name = Location
Type = Sugeno
Number of Inputs = 6
Number of Output = 1
Number of Membership function = 2
Membership function type = generalized bell
Number of rules = 64
The six inputs are the peak current value at the current branch (CB) 1, 14, 2, 3,
32, and 4 after a fault occurs. Expected output is the distance from the main branch

Fig. 2 A typical 11 kV MV distribution network

or the minor branch. When the training process is completed, the testing and validation part is needed. The testing data are collected from various fault resistances and loads that are within the range of the training sets. Tables 1 and 2 show the training and testing scenarios of the simulation.

Table 1 Scheme for ANFIS training


Fault type Single line to ground
Fault resistance 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 1.2, 1.4, 1.6, 1.8, 2, 2.2, 2.4,
2.6, 2.8, 3, 3.5, 4, 4.5, 5, 5.5, 6, 7, 8, 9, 10, 15, 20, 30, 40, 50, 60, 70 Ω
Inception angle 0, 45, 90°
Load 4390, 6270, 8150 kVA
Fault distance: 3.55 km (branch 14) and 2.9 km (branch 32) from the sending end
(output)

Table 2 Scheme for ANFIS testing


Fault type Single line to ground
Fault resistance 0, 5.5, 10.5, 15, 25, 35, 45, 55, 65 Ω
Inception angle 0, 45, 90°
Load 4390, 6270, 8150 kVA
Fault distance: (output) 3.75 km (branch 14) and 2.9 km (branch 32) from the sending end

5 Single Line to Ground Fault Location Based on ANFIS

The scenario in Table 1 needs to be specified before conducting the test. The types of input (earth fault generated current signals) and output (distance, main branch or minor branch) play an important role in producing an accurate measurement. When the factors have been decided, a typical MV distribution network is constructed and compiled in ATPDraw. Then, the signals produced by the current probes are saved and analyzed in MATLAB. The peak values of the current signals are used as the input data of the ANFIS, while the fault distance and fault branch are used as the output. The ANFIS hybrid model is simulated once the collected data are complete, and finally the estimated result is produced.
The flowchart shown in Fig. 3 explains the process taken to simulate the ANFIS hybrid model. There are two types of data, namely training and testing data. Both are collected with the same method but with different factors (different fault resistances and fault distances). The training data are configured and trained in the MATLAB environment. When the training is finished, the training error appears at the bottom of the GUI. If the error is close to 0.0, it is acceptable and the process can proceed; otherwise, the number of MFs needs to be changed until the error is reduced to an acceptable level. Then, the testing data are loaded from the workspace and their accuracy is checked against the training data simulated before. After the simulation is complete with minimum error, the training and testing results are exported to the workspace for the execution of the estimated fault distance and fault branches.

6 Simulation Result and Discussion

For ANFIS fault location and detection estimation, two different fault distances and locations were tested with three different loads, three different inception angles, and various fault resistances. The ANFIS hybrid model was trained for fault location and fault detection. Once trained, the network performance was validated using testing data different from the training data. The estimated fault distance and detection obtained from the simulation are compared with the actual values.

Fig. 3 Flowchart of conducting the ANFIS hybrid model (Start → Load the training data → Choose number of membership functions → Choose type of membership function → Select number of epochs → Train the data → if the error is low: Load the testing data → Test the data → Export training and testing to workspace → Execute the output for training and testing → End; otherwise adjust the MFs and retrain)

For the simulation, the first fault was located at 3.55 km, on main branch 1 and minor branch 14, while the second fault was located at 2.9 km, on main branch 3 and minor branch 32. Figure 4 shows the locations of the faults in the network.
Table 3 summarizes the simulations performed to locate and detect the fault. The table shows the training and testing data sets and the Root Mean Square Error (RMSE) for the three different simulations.
For training, 684 sets of current samples were used to build the fault location and fault main and minor branch estimation models. The performance graphs for the training data of the estimation models are shown in Figs. 5(a), 6(a) and 7(a), with RMSE equal to 0.0073474, 0.021798 and 0.19527 at epoch 700.

Fig. 4 Location of the fault in the network

Table 3 Summary of the simulation for fault distance, fault main and minor branch estimation
Fault distance Main branch Minor branch
Training Testing Training Testing Training Testing
Number of sets 684 54 684 54 684 54
RMSE 0.0073474 0.15469 0.021798 0.079476 0.19527 0.52271

For testing, 54 sets of current samples were used to evaluate the fault location and fault main and minor branch estimation models. The performance graphs for the testing data of the estimation models are shown in Figs. 5(b), 6(b) and 7(b), with average errors of 0.15469, 0.079476 and 0.52271.
Table 4 shows the training results for the estimated fault distance and the fault detection at the main and minor branches. The error produced in the fault location estimation is less than 0.001 km, while for fault detection, the system accurately detected the location of the fault at minor branches 14 and 32. Table 5 shows the testing simulation results for fault distance and detection. From the table, the results show accurate fault detection estimation.
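For reference, the RMSE values reported in Table 3 correspond to the usual definition, sketched below; the sample values in the example are illustrative only.

```python
import numpy as np

def rmse(actual, estimated):
    """Root mean square error used to report training/testing performance."""
    a, e = np.asarray(actual, float), np.asarray(estimated, float)
    return float(np.sqrt(np.mean((a - e) ** 2)))

# e.g. actual vs. estimated fault-distance samples (values illustrative only)
print(rmse([3.55, 2.90, 3.55], [3.5497, 2.9013, 3.5501]))
```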


Fig. 5 Models for fault distance (a) Training (b) Testing




Fig. 6 Models for fault main branch detection (a) Training (b) Testing


Fig. 7 Models for fault minor branch detection (a) Training (b) Testing

Table 4 Training output result for the estimated fault distance, main and minor branch at single
line to ground fault
Load (kVA); Actual: Distance (km), Main (branch), Minor (branch); Inception angle; Estimated: Distance (km), Main (branch), Minor (branch)
4390 3.55 1 14 0 3.5497 1 14
45 3.5501 1 14
90 3.5501 1 14
2.9 3 32 0 2.9013 3 32
45 2.9003 3 32
90 2.9003 3 32
6270 3.55 1 14 0 3.5495 1 14
45 3.5500 1 14
90 3.5493 1 14
2.9 3 32 0 2.9000 3 32
45 2.9007 3 32
90 2.9006 3 32
8150 3.55 1 14 0 3.5502 1 14
45 3.5500 1 14
90 3.5495 1 14
2.9 3 32 0 2.9000 3 32
45 2.8997 3 32
90 2.9006 3 32

Table 5 Testing output result for the estimated fault distance, main and minor branch at single
line to ground fault at inception angle of 0° with various load
Actual: Distance (km), Main (branch), Minor (branch); Rf (Ω); Estimated: Distance (km), Main (branch), Minor (branch)
3.75 1 14 0 3.4780 1 14
5.5 3.5169 1 14
10.5 3.5267 1 14
15 3.5375 1 14
25 3.5545 1 14
35 3.5664 1 14
45 3.5656 1 14
55 3.5482 1 14
2.90 3 32 0 2.9022 3 32
5.5 2.8958 3 32
10.5 2.9054 3 32
15 2.8926 3 32
25 2.8954 3 32
35 2.8930 3 32
45 2.8954 3 32
55 2.9004 3 32

Based on the results, it is possible to locate the exact position of the fault when all the outputs (fault distance, main and minor branch) are combined. For example, by referring to Fig. 4, the fault is detected at 3.55 km, at main branch 1 and minor branch 14. Obtaining the exact location of the fault minimizes the time taken to locate it and perform the necessary maintenance.

7 Conclusion

This paper presents an application of fault location and detection in a typical 11 kV medium voltage distribution network based on an Adaptive Network-Based Fuzzy Inference System. The proposed ANFIS used a composite of data gathered from the earth fault simulations as its inputs and outputs. Based on the obtained results, it can be concluded that the proposed method has high accuracy in finding the exact location of the fault, thus minimizing the time taken to locate it.

References

1. Jamili NSB, Adzman MR, Rahim SRA, Zali SM, Isa M, Hanafi H (2019) Evaluation of earth
fault location algorithm in medium voltage distribution network with correction technique.
Int J Electr Comput Eng 9:1987–1996. https://doi.org/10.11591/ijece.v9i3
2. Adzman MR, Lehtonen M (2012) A method for ground fault distance computation in
unearthed medium voltage distribution network. Int Rev Model Simul 5:1765–1772
3. Banu G, Suja S (2014) Fault location technique using GA-ANFIS for UHV line. Arch Electr
Eng 63:247–262. https://doi.org/10.2478/aee-2014-0019

4. Mora JJ, Carrillo G, Pérez L (2006) Fault location in power distribution systems using ANFIS
nets and current patterns. In: 2006 IEEE PES Transmission and Distribution Conference and
Exposition: Latin America, TDC 2006, pp 3–8
5. Huấn VP, Hùng LK (2014) An ANFIS based approach to improve the fault location on
110 kV transmission line Dak Mil—Dak Nong. Int J Comput Sci Issues 11:1–7
6. Elbaset AA, Hiyama T (2009) Fault detection and classification in transmission lines using
ANFIS. IEEJ Trans Ind Appl 129:705–713. https://doi.org/10.1541/ieejias.129.705
7. Kamel T, Hassan M (2009) Adaptive Neuro Fuzzy Inference System (ANFIS) for fault
classification in the transmission lines. Online J Electron Electr Eng 2:164–169
8. Jarrahi MA, Samet H, Raayatpisheh H, Jafari A, Rakhshan M (2015) An ANFIS based fault
classification approach in double circuit transmission line using current samples. In: Advances
in Computational Intelligence: 13th International Work-Conference on Artificial Neural
Networks, IWANN 2015
9. Zhang J, He ZY, Lin S, Zhang YB, Qian QQ (2013) An ANFIS based fault classification
approach in power distribution system. Int J Electr Power Energy Syst 49:243–252. https://
doi.org/10.1016/j.ijepes.2012.12.005
10. Yeo SM, Kim CH, Hong KS, Lim YB, Aggarwal RK, Johns AT, Choi MS (2003) A novel
algorithm for fault classification in transmission lines using a combined adaptive network and
fuzzy inference system. Int J Electr Power Energy Syst 25:747–758. https://doi.org/10.1016/
S0142-0615(03)00029-2
11. Laib A, Melit M, Nekhoul B, Kerroum K, Elkhamilichi Drissi K (2017) Soft fault
identification in electrical network using time domain reflectometry and adaptive neuro-fuzzy
inference system. In: The 5th International Conference on Electrical Engineering, Boumerdes,
pp 2–5
12. Hakim SJS, Abdul Razak H (2013) Adaptive Neuro Fuzzy Inference System (ANFIS) and
Artificial Neural Networks (ANNs) for structural damage identification. Struct Eng Mech
45:779–802. https://doi.org/10.12989/sem.2013.45.6.779
13. Eldin ESMT (2010) Fault location for a series compensated transmission line based on
wavelet transform and adaptive neuro-fuzzy inference system. In: Proceedings of the 2010
Electric Power Quality and Supply Reliability Conference, vol 46, pp 393–401. https://doi.
org/10.1016/j.measurement.2012.07.014
14. Abdullah A, Banmongkol C, Hoonchareon N (2014) Improvement of fault identification and
localization using Gustafson-Kessel algorithm in adaptive neuro-fuzzy inference system.
Aust J Basic Appl Sci 8:455–461
15. Veerasamy V, Abdul Wahab N, Ramachandran R, Mansoor M, Thirumeni M, Lutfi
Othman M (2018) High impedance fault detection in medium voltage distribution network
using discrete wavelet transform and adaptive neuro-fuzzy inference system. Energies
11:3330. https://doi.org/10.3390/en11123330
16. Barakat S, Eteiba MB, Wahba WI (2014) Fault location in underground cables using ANFIS
nets and discrete wavelet transform. J Electr Syst Inf Technol 1:198–211. https://doi.org/10.
1016/j.jesit.2014.12.003
Flashover Voltage Prediction
on Polluted Cup-Pin Insulators
Under Polluted Conditions

Ali. A. Salem, R. Abd-Rahman, M. S. Kamarudin, N. A. Othman,


N. A. M. Jamail, N. Hussin, H. A. Hamid, and I. M. Rawi

Abstract Natural pollution, which is mainly affected by the weather conditions, is
the main cause of flashovers on high voltage insulators, leading to outages in power
systems. In this work, the flashover characteristics of contaminated cup-pin insulators
have been studied based on experimental tests and a mathematical model. Information
from laboratory tests combined with the results of a new mathematical model is used
to develop an Artificial Neural Network (ANN) algorithm and an Adaptive Neuro-fuzzy
Inference System (ANFIS) for calculating the flashover characteristics (current IF and
voltage UF). Several experiments and measurements were carried out for 1:1, 5:1, 10:1
and 15:1 ratios of bottom to top surface salt deposit density on contaminated samples (z).
The Dimensional Analysis Method (DAM) was used to derive a new model for the
variables which are most influential in the flashover phenomenon of polluted insulators.
The model was derived by establishing the relationship between the flashover voltage UF
and current IF, the length of the pollution layer LP, the exposure time t, the arc constant A
and the pollution layer conductivity of the insulator σ. Both arc constants A and n are
computed using a genetic algorithm. Comparative investigations have clearly shown that
the AI-based approach gives agreeable results compared to the mathematical model.

Keywords Polluted insulators · ANN · Flashover · Mathematical model · DAM

A. A. Salem  R. Abd-Rahman (&)  M. S. Kamarudin  N. A. Othman  N. A. M. Jamail


Dept. of Electrical Power, Uni. Tun Hussein Onn Malaysia, 86400 Johor, Malaysia
e-mail: [email protected]
N. Hussin  H. A. Hamid
School of Electrical System, Uni. Malaysia Perlis, 02600 M-Arau, Malaysia
I. M. Rawi
TNB Labs Sdn Bhd, 43000 Kajang, Selangor, Malaysia


1 Introduction

In almost all energy systems, insulators are subjected to many environmental stresses
such as erosion and pollution. Under the effect of these pollutants, outdoor insulators are
susceptible to small discharges over their surfaces, which may consequently result in
flashover [1]. One of the important criteria for assessing a power system's
reliability is controlling the flashover characteristics, such as the voltage and current on
the contaminated outdoor insulator. In recent studies [2–5], flashover assessment,
particularly using mathematical methods, has always been of interest. Artificial
intelligence, for instance, can be utilized to investigate levels of contamination;
other effects such as tracking and erosion of the insulator are also used to calculate the
flashover voltage.
In this paper, results from laboratory tests of uniformly and non-uniformly
contaminated insulators, combined with the proposed numerical model, are applied
to develop an enhanced prediction of the flashover variables by means of ANN and
ANFIS. The flashover current and voltage of polluted insulators are measured for 1:1,
5:1, 10:1 and 15:1 ratios of bottom-to-top (B/T) surface conductivity σ on cup-pin
porcelain and glass insulators.

2 The Proposed Model

The pollution flashover process is represented by an arc resistance Rarc in series
with the pollution layer resistance RP. Figure 1 illustrates the electrical equivalent
circuit representing the pollution arc development schematically [6].
The flashover phenomenon over the surface of contaminated insulators has been
studied by many researchers. In this study, the Dimensional Analysis Method
(DAM) has been applied to establish the flashover current/voltage mathematical
model for contaminated insulators. From the literature, it has been observed that
the parameters which most influence the contaminated flashover are the current I,
the length of the pollution layer LP, the resistance of the pollution layer RP, the arc
constant A and the exposure time t. The current as a function of these parameters can
be given as

$$I_C = f(L_P, R_P, A, t) \qquad (1)$$

Fig. 1 The electrical equivalent circuit of a polluted insulator

The fundamental dimensions in SI units are length L, mass M, time T, and current I. The
dimensional matrix of the above variables using DAM, with the exponents of the
respective variables, can be written as in Table 1.
The rank of the dimensional matrix (r) is four, and the number of parameters (n) is 5.
Therefore, according to Buckingham's π theorem [7], the solution can be written in
the form of (n − r) = 1 independent dimensionless product (π), which has
dimension M⁰L⁰T⁰I⁰. Taking L_P, R_P, A and t as independent variables and
I as the dependent variable, the dimensionless product can be given as follows:

$$\pi = I_F\, (L_P)^{a}\, (R_P)^{b}\, (A)^{c}\, (t)^{d} \qquad (2)$$

where a, b, c and d are the exponents of L_P, R_P, A and t respectively. The dimensionless
product π, expressed in terms of the basic dimensional units of the variables, is

$$\pi = [I]\,[L]^{a}\,\big[M L^{2} T^{-3} I^{-2}\big]^{b}\,\big[M L T^{-3} I^{\,n-1}\big]^{c}\,[T]^{d} \qquad (3)$$

The dependent variable considered is the current I_F, and the product containing it must be dimensionless:

$$[\pi] = M^{0} L^{0} T^{0} I^{0} \qquad (4)$$

Writing the dimensionless product in terms of the fundamental dimensions of the
repetitive variables therefore gives

$$[I]\,[L]^{a}\,\big[M L^{2} T^{-3} I^{-2}\big]^{b}\,\big[M L T^{-3} I^{\,n-1}\big]^{c}\,[T]^{d} = M^{0} L^{0} T^{0} I^{0} \qquad (5)$$

The power indices a, b, c and d are calculated by comparing the powers of the
fundamental dimensional units M, L, T and I on both sides of the homogeneous linear
algebraic relation in (5):

$$0 = b + c \qquad (6)$$

$$0 = a + 2b + c \qquad (7)$$

$$0 = -3b - 3c + d \qquad (8)$$

$$0 = 1 - 2b + (n-1)c \qquad (9)$$

Table 1 DAM matrix

       IF   LP   RP   A     t
  M    0    0    1    1     0
  L    0    1    2    1     0
  T    0    0    -3   -3    1
  I    1    0    -2   n-1   0

The solutions obtained are

$$a = -\frac{1}{n+1}, \qquad b = \frac{1}{n+1}, \qquad c = -\frac{1}{n+1}, \qquad d = 0$$

Substituting the obtained power indices, the equation for the current can be given as

$$I_F = \delta\, (R_P)^{-\frac{1}{n+1}}\, (A \cdot L_P)^{\frac{1}{n+1}} \qquad (10)$$

where I_F is the flashover current, δ is the dimensional constant calculated from the
flashover current measured in the laboratory tests, A and n represent the arc constants,
and R_P is the resistance of the pollution layer in kΩ. This resistance can be
expressed as [4]

$$R_P = \frac{F}{\sigma} \qquad (11)$$

$$F = \int_{0}^{l} \frac{1}{\pi \cdot D(l)}\, dl \qquad (12)$$

$$x = \frac{L}{n+1} \qquad (13)$$

where F is the form factor, σ is the electrical conductivity, L is the creepage distance and
x is the arc length. Substituting for R_P from (11), the new equation for the flashover
current I_F is written as

$$I_F = \delta \left(\frac{F}{\sigma}\right)^{-\frac{1}{n+1}} (A \cdot L_P)^{\frac{1}{n+1}} \qquad (14)$$
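As a quick cross-check of the dimensional analysis above, the linear system (6)–(9) can be solved symbolically. The short sketch below is an illustrative aid only (it is not part of the original derivation; the symbol names simply follow the text) and uses SymPy to confirm that a = c = −1/(n+1), b = 1/(n+1) and d = 0.

```python
# Symbolic check of the exponent system (6)-(9) from the dimensional analysis.
# Illustrative sketch only; variable names (a, b, c, d, n) follow the text above.
import sympy as sp

a, b, c, d, n = sp.symbols('a b c d n')

equations = [
    sp.Eq(b + c, 0),                 # (6)  mass M
    sp.Eq(a + 2*b + c, 0),           # (7)  length L
    sp.Eq(-3*b - 3*c + d, 0),        # (8)  time T
    sp.Eq(1 - 2*b + (n - 1)*c, 0),   # (9)  current I
]

solution = sp.solve(equations, (a, b, c, d))
print(solution)  # {a: -1/(n + 1), b: 1/(n + 1), c: -1/(n + 1), d: 0}
```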

From [7], the relation between the conductivity and the equivalent salt deposit
density (ESDD) is plotted, and a power curve fitted for σ as a function of ESDD is
shown in Fig. 2.

Fig. 2 Relation between the conductivity σ (µS) and ESDD (mg/cm²); fitted power curve σ = 86(ESDD)^0.9, R² = 0.9994

The obtained relationship is given by,

$$\sigma = 86 \cdot (\mathrm{ESDD})^{0.9} \qquad (15)$$

For non-uniformly contaminated samples, the bottom and top surfaces were polluted
separately with conductivities σ_B and σ_T; the average conductivity σ_av should
satisfy

$$\sigma_{av} = \frac{\sigma_T \cdot S_T + \sigma_B \cdot S_B}{S_T + S_B} \qquad (16)$$

$$z = \frac{\sigma_B}{\sigma_T} \qquad (17)$$

where σ_T is the electrical conductivity of the top surface, σ_B is the electrical
conductivity of the bottom surface, S_T is the top-side surface area, S_B is the
bottom-side surface area and z is the ratio of the pollution level of the bottom
pollution layer (area S_B) to that of the top pollution layer (area S_T). The flashover
current in (10) under the non-uniform condition becomes

$$I_F = \delta_1\, (L_P)^{\frac{1}{2n+2}} \left(\frac{F}{\sigma_{av}}\right)^{-\frac{1}{n+1}} (A)^{\frac{1}{n+1}} \qquad (18)$$

3 Flashover Voltage Proposed Model

Based on Obenaus, the flashover voltage is expressed by

$$U_F = x A I_F^{-n} + R_p I_F (L - x) \qquad (19)$$

The final expression of the flashover voltage (FOV), after substituting for I_F from
(18), is

$$U_F = x A \left\{\delta_1 (L_P)^{\frac{1}{2n+2}} \left(\frac{F}{\sigma}\right)^{-\frac{1}{n+1}} (A)^{\frac{1}{n+1}}\right\}^{-n} + R_p \left\{\delta_1 (L_P)^{\frac{1}{2n+2}} \left(\frac{F}{\sigma}\right)^{-\frac{1}{n+1}} (A)^{\frac{1}{n+1}}\right\} (L - x) \qquad (20)$$

For non-uniformly polluted insulators it can be written as

$$U_F = x A \left\{\delta_1 (L_P)^{\frac{1}{2n+2}} \left(\frac{F}{\sigma_{av}}\right)^{-\frac{1}{n+1}} (A)^{\frac{1}{n+1}}\right\}^{-n} + R_p \left\{\delta_1 (L_P)^{\frac{1}{2n+2}} \left(\frac{F}{\sigma_{av}}\right)^{-\frac{1}{n+1}} (A)^{\frac{1}{n+1}}\right\} (L - x) \qquad (21)$$
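To make the use of Eqs. (15)–(21) concrete, the sketch below evaluates the flashover current and voltage model for one set of inputs. It is an illustrative aid only: the exponent signs follow the reconstruction above, and the numerical values (ESDD, form factor, creepage distance, δ1, A, n) are placeholders rather than the paper's data.

```python
# Illustrative evaluation of the DAM-based flashover model; placeholder inputs only.
def flashover_point(esdd, form_factor, creepage, pollution_len, delta1, A, n, x=None):
    """Return (I_F, U_F) from Eqs. (11), (13), (15), (18) and (19) as reconstructed above."""
    sigma = 86.0 * esdd ** 0.9                      # Eq. (15), layer conductivity
    if x is None:
        x = creepage / (n + 1.0)                    # Eq. (13), arc length
    r_p = form_factor / sigma                       # Eq. (11), pollution layer resistance
    # Eq. (18): flashover (critical) current
    i_f = (delta1 * pollution_len ** (1.0 / (2 * n + 2))
           * r_p ** (-1.0 / (n + 1)) * A ** (1.0 / (n + 1)))
    # Eq. (19): arc drop x*A*I^-n plus the drop across the remaining pollution layer
    u_f = x * A * i_f ** (-n) + r_p * i_f * (creepage - x)
    return i_f, u_f

# Placeholder values for demonstration only (not the paper's measurements)
i_f, u_f = flashover_point(esdd=0.12, form_factor=0.89, creepage=310.0,
                           pollution_len=310.0, delta1=4.7, A=62.5, n=0.727)
print(f"I_F = {i_f:.3f}, U_F = {u_f:.1f}")
```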

Furthermore, the flashover voltage U_F and current I_F have been calculated for two types
of insulators (glass and porcelain) and compared with the test findings and with the
predicted results computed using artificial intelligence methods such as ANN and ANFIS.
To calculate δ₁, the relation between δ₁ and the pollution severity ESDD is graphed;
the fitting curves for uniform and non-uniform pollution are given in the following
equations:

$$\delta_1 = C e^{0.569\,\mathrm{ESDD}} \quad \text{for uniform pollution} \qquad (22)$$

$$\delta_1 = C e^{0.138\,(B/T)} \quad \text{for non-uniform pollution} \qquad (23)$$

where C is a constant that depends on the type of insulator; the values of C for the glass
and porcelain insulators used in this work are 4.698 and 7.927 for uniform pollution
and 9.658 and 9.317 for non-uniform pollution, respectively.

4 Experimental Design

4.1 Experiment Facilities

The experiments were conducted in a multipurpose artificial chamber of roughly 1 m,
and the arrangement of the circuit is shown in Fig. 3. The supply voltages were
measured using a capacitive voltage divider, while the leakage current was measured
using a shunt resistance of 10 kΩ. The data acquisition and monitoring system consists
of a PC, a DAQ card and a LabVIEW program.

Fig. 3 Schematic graph of test setup circuit



4.2 Test Samples

The samples used in this work are of porcelain type (XP-70) and glass type
(LXY-70). The polluted and clean insulators were investigated under different
supplied voltages. The schematic diagram and structural parameters of the insulators
are given in Table 2, in which H, D and L are the height, diameter and leakage
distance of the sample, respectively.

4.3 Test Procedure

Prior to the experiment, the samples were first carefully cleaned using alcohol and
water with a conductivity of less than 0.001 mS/cm. The contamination suspension was
prepared by mixing a suitable ratio of kaolin with sodium chloride (NaCl) salt in
water to achieve the desired conductivity [8–12]. The salinity of the contamination
layer on the insulator is then measured using a conductivity meter. The pollution
was applied with a brush using the solid layer method and then allowed to dry naturally
for approximately one day. A sprayer is used to wet the contaminated insulator under
normal ambient conditions with a temperature of 27 °C and a humidity of 87–91%.

5 Determination of (A, n)

According to Ayrton [9], the electric field at the point where flashover occurs can be
expressed by:

$$E_a = A I_a^{-n} \qquad (24)$$

In this equation, A and n are the arc constants, while E_a and I_a respectively
represent the arc electric field and the discharge arc current [13].

Table 2 Tested insulator dimensions

  Model    Material    Height (mm)   Diameter (mm)   Leakage distance (mm)   F
  LXY-70   Glass       146           255             310                     0.89
  XP-70    Porcelain   146           255             295                     0.736

and conductivity were utilised to formulate the A, n at various electrical conduc-


tivity by using genetic algorithm GA optimization method. The data of test arc
electric field and arc current are then used for Eq. 24 which results in 46 equations
for each insulator type. These equations have the unknown’s arc constants of ‘A’
and ‘n’ which can be determined by using Eq. 25 as follow:
X46  
i¼1
½Ea  f N  Ian  ð25Þ

The numerical computation is performed using MATLAB software and simplified with
the following assumption, as adopted by previous researchers [11]:

$$31 \le A \le 500 \quad \text{and} \quad 0.21 \le n \le 1 \qquad (26)$$

The optimization computation gave the arc constant values tabulated in Table 3 below.
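The fit described above can be reproduced in outline with any population-based optimiser. The sketch below is an illustration under assumptions (synthetic data and a least-squares objective), not the MATLAB/GA code used in this study; it uses SciPy's differential evolution to search for A and n within the bounds of Eq. 26.

```python
# Illustrative fit of the arc constants (A, n) from (E_a, I_a) samples.
# Not the paper's implementation: synthetic data and a squared-residual objective
# are assumed here; bounds follow Eq. (26).
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(0)
i_arc = rng.uniform(0.3, 2.5, size=46)                                   # synthetic arc currents
e_arc = 62.5 * i_arc ** (-0.727) * (1 + 0.02 * rng.standard_normal(46))  # noisy E = A * I^-n

def residual(params):
    A, n = params
    return np.sum((e_arc - A * i_arc ** (-n)) ** 2)

result = differential_evolution(residual, bounds=[(31, 500), (0.21, 1.0)], seed=1)
A_fit, n_fit = result.x
print(f"A = {A_fit:.1f}, n = {n_fit:.3f}")
```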

6 Derivation of Artificial Intelligence (AI)

In an effort to estimate the measured insulator flashover voltage using an artificial
neural network and ANFIS, the dimensions of the insulators and the degree of
contamination deposited on them under uniform and non-uniform pollution were
examined. The MATLAB R2014a Toolbox has been used to simulate the ANN and
ANFIS. The four parameters reflecting the conductivity of the pollution layer (σ),
the form factor (F), the pollution layer area (S) and the arc constant (A) were used as
the ANN and ANFIS inputs, and the flashover current and voltage as the output parameters.
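As a rough illustration of this input/output mapping (four inputs σ, F, S, A; two outputs I_F and U_F), the sketch below trains a small feed-forward network with scikit-learn. It is an assumed stand-in for the MATLAB ANN/ANFIS toolboxes used by the authors, with made-up training arrays.

```python
# Minimal sketch of the 4-input / 2-output regression described above.
# scikit-learn's MLPRegressor stands in for the MATLAB ANN/ANFIS toolboxes;
# X and Y below are placeholder arrays, not the paper's measurements.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Columns: layer conductivity sigma (uS), form factor F, layer area S, arc constant A
X = np.array([[7, 0.89, 1.0, 62.5],
              [13, 0.89, 1.0, 62.5],
              [18, 0.736, 1.2, 75.9],
              [24, 0.736, 1.2, 75.9]], dtype=float)
# Columns: flashover current I_F (mA), flashover voltage U_F (kV) -- placeholders
Y = np.array([[0.5, 22.0],
              [0.7, 18.0],
              [1.0, 15.0],
              [1.3, 13.0]], dtype=float)

scaler = StandardScaler().fit(X)
model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000, random_state=0)
model.fit(scaler.transform(X), Y)

print(model.predict(scaler.transform([[29, 0.89, 1.0, 62.5]])))
```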

7 Results and Discussion

7.1 Flashover Current and Voltage

Figures 4 and 5 demonstrate the distribution of the critical current and the flashover
voltage in relation to the contamination surface conductivity for the proposed model
under uniform and non-uniform pollution conditions. The contamination level and the
conductivity of the pollution layer on the surface of the insulator have a major impact
on the critical current and flashover voltage of the tested insulators.

Table 3 Arc constants based on optimization computation

  Insulator type   A        n
  Porcelain        62.529   0.727
  Glass            75.928   0.677
Fig. 4 Flashover current and layer conductivity σ under non-uniform pollution levels: a LXY-70 and b XP-70

Fig. 5 Flashover voltage and layer conductivity σ with different non-uniform pollution levels: a LXY-70 and b XP-70

The flashover voltage of the insulator decreases with increasing electrical conductivity,
while the critical current increases with increasing pollution level in all cases.
Consider insulator type LXY-70 as shown in Fig. 4(a): under T/B = 1/5, when the
conductivity σ is 7, 13, 18, 24, 29 and 34 µS, the corresponding critical current is
0.6 mA, 0.9 mA, 1.14 mA, 1.35 mA, 1.54 mA and 1.71 mA, respectively.

Table 4 Lab test results of flashover current and voltage of the tested samples (IF in mA, VF in kV)

  Insulator  σ (µS)   z = 1           z = 5           z = 10          z = 15
  type                IF     VF       IF     VF       IF     VF       IF     VF
  A          7        0.44   21.96    0.51   22.96    0.62   25.98    0.72   27.54
             13       0.62   16.08    0.78   17.08    0.95   19.08    0.98   20.08
             18       0.81   12.22    1.00   15.22    1.15   15.22    1.28   17.22
             24       0.99   10.10    1.26   12.99    1.34   13.99    1.40   15.99
             29       1.14   8.98     1.37   11.98    1.54   12.98    1.73   13.98
             34       1.21   8.03     1.55   10.31    1.73   12.31    1.77   13.23
  B          7        0.61   25.82    0.81   32.56    0.86   35.13    1.02   37.26
             13       0.88   18.26    1.17   24.13    1.29   26.08    1.36   28.12
             18       1.05   15.01    1.40   19.22    1.47   20.22    1.81   22.65
             24       1.41   13.55    1.74   16.99    1.90   18.10    1.85   19.68
             29       1.54   12.87    1.85   15.98    2.24   17.98    2.31   18.06
             34       1.60   11.58    2.06   13.77    2.32   16.03    2.48   17.51

It can be seen that the critical current increases by 50%, 90%, 125%, 156.6% and 185%
when the electrical conductivity σ increases from 7 to 13, 18, 24, 29 and 34 µS,
respectively. For the insulator type XP-70 in Fig. 4(b), the critical current increases
from 0.84 to 2.35 mA as the conductivity increases from 7 to 34 µS; this means the
percentage increase of the current for the porcelain insulator is 389.6%, approximately
double the increase of the critical current for the glass insulator. The experimental
results of the tested insulators are shown in Table 4.
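For instance, the quoted percentage increases follow directly from the measured currents at T/B = 1/5:

$$\frac{0.9 - 0.6}{0.6}\times 100\% = 50\%, \qquad \frac{1.71 - 0.6}{0.6}\times 100\% \approx 185\%$$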
Figure 6 illustrates the flashover current and voltage curves corresponding to the
level of uneven pollution (z) at different pollution levels (ESDD). From Fig. 6, it
can be noted that the unequal level of exposure (z) has a great impact on the
flashover characteristic. The flashover voltage grows with a rise in the contamination
ratio for both insulators in the region near the high-voltage terminal on the lower
layer of the insulator. On the other side, with less contamination at the bottom, the
flashover voltage of the uniformly polluted insulator for the LXY-70 and XP-70
samples rises by 43.57 percent and 37.81 percent, respectively.

7.2 System Verification and Effects in Artificial Intelligence

The arc constant values can be employed in the flashover voltage model (Eq. 13) to
calculate the flashover voltage. The ANN and ANFIS models for the glass and porcelain
insulators were developed for contamination conditions dependent on the properties of
the insulators.

Fig. 6 Flashover voltage against 1/z for non-uniform pollution: (a) LXY-70 and (b) XP-70

Fig. 7 ANN, ANFIS, DAM model and lab results for all non-uniform pollution conditions of the LXY-70 insulator: (a) voltage and (b) current

Fig. 8 ANN, ANFIS, DAM-based model and test results for all non-uniform pollution cases of the XP-70 insulator: (a) voltage and (b) current

The comparison of the ANN, the ANFIS, the DAM-based mathematical model and the
experimental findings under different non-uniform pollution conditions is shown in
Figs. 7 and 8. It is evident from Figs. 6 and 7 that, under various levels of pollution,
the simulation based on AI techniques achieves a precise evaluation of the flashover
current and voltage.

8 Conclusion

In this paper a new mathematical model for uniformly and non-uniformly polluted
insulators based on the Dimensional Analysis Method (DAM) was proposed. Using
genetic algorithm optimization, the results obtained from the laboratory experiments
were used to determine the correct values of the arc constants A and n. For artificially
non-uniformly polluted glass and porcelain insulators, two artificial intelligence
techniques were developed to calculate the flashover current and voltage. The
mathematical model was modified and tested using the experimental flashover voltage
outcomes based on the contamination properties of the insulator, the surface
conductivity and the arc constant. Therefore, it is possible to simulate critical
conditions for the flashover voltage by establishing the arc constants. In this article,
the non-uniform pollution cases 1:1, 5:1, 10:1 and 15:1 are found to have a major
impact on the flashover voltage. The ANN and ANFIS results have been compared with
the findings of the experimental measurements and a satisfactory agreement has been
reached.

References

1. Abd-Rahman R, Harid N, Haddad A (2010) Stress control on polymeric outdoor insulators.
In: Proceedings of the universities power engineering conference. ISBN: 978-095655702-5
2. Salem AA, Abd-Rahman R (2018) A review of the dynamic modelling of pollution flashover
on high voltage outdoor insulators. J Phys Conf Ser 1049(1):012–019
3. Ahmad H, Abd-Rahman R, Ahmad MH (2006) Evaluation of transmission line insulator for
I-Type string insulator design. In: IEEE 6th international conference on power energy
(PECON 2016), pp 50–53
4. Salem AA et al (2019) Proposal of a dynamic numerical approach in predicting flashover
critical voltage. Int J Power Electron Drive Syst 10(2):602–610
5. Salem AA et al (2019) The effect of insulator geometrical profile on electric field
distributions. Indones J Electr Eng Comput Sci 14(2):618–626
6. Salem AA, Abd Rahman R, Kamarudin MS, Othman NA (2017) Factors and models of
pollution flashover on high voltage outdoor insulators: review. In: IEEE conference on energy
conversion (CENCON), pp 241–246
7. Alawi SA, Salam MA, Maqrashi AA, Ahmad H (2006) Prediction of flashover voltage of
contaminated insulator using artificial neural networks. Electr Power Compon Syst
34(8):831–840
8. Mahdjoubi A, Zegnini B, Belkheiri M, Seghier T (2019) Fixed least squares support vector
machines for flashover modelling of outdoor insulators. Electr Power Syst Res 173(Jan):29–37
9. Shahabi S, Gholami A (2016) Dynamic model to predict ac critical flashover voltage of
nonuniformly polluted insulators under thermal ionization conditions. Turk J Electr Eng
Comput Sci 25(3):2232–2235
10. IEC Std. 60507 (1991) Artificial pollution tests on high-voltage insulators to be used on AC
systems
11. Dixit P, Krishnan V, Nagabhushana GR (2009) Studies on pollution performance of ceramic
insulators under AC excitation. In: Proceedings of 16th international symposium on high
voltage engineering (ISH 2009), 28th August 2009, Cape town, South Africa, pp 1331–1336
12. Wilkins R (1969) Flashover voltage of high voltage insulator with uniform surface pollution
films. Proc IEE 116(3):457–465
13. Suresh AG, Dixit P (2017) ANN model to predict critical flashover voltages of polluted
porcelain disc insulators. Int J Appl Eng Res 12(11):2942–2951
Effect of Distributed Generation
to the Faults in Medium Voltage
Network Using ATP-EMTP Simulation

Wan Syaza Ainaa Wan Salman, Mohd Rafi Adzman, Muzamir Isa,
N. S. B. Jamili, M. H. Idris, and M. Amirruddin

Abstract During the past few years, distributed generation (DG) technology has become
widely known in the industry as it can help in providing backup power during high
power demand. However, adding a new system may change the traditional power
system, which usually works in one direction, from the generation to the consumer.
When DG is added, the power flows in two directions. Therefore, a short circuit study
was carried out to study the effects of DG on the local system during a fault. An IEC
Standard calculation was also performed as guidance to determine the thermally
permissible current of a cable.

Keywords Distributed generation (DG) · ATPDraw software · Short circuit

1 Introduction

The increasing power demand has introduced distributed generation (DG) into the
power system as backup power [1–3]. DG units are referred to as small-scale
generation units that are connected to the distribution system and supply a residence
[4]. However, adding another system such as DG into the existing power network may
cause changes, especially in the protection system. In radial systems, the power flows
from the upper terminal voltage levels down to the customers placed along the
radial feeders. Thus, over-current protection in radial systems is relatively
straightforward, as the fault current is unidirectional. With the inclusion of DG into
the system, the flow changes to two directions. This leads to the complexity of the
existing power network. One of the major issues is the voltage and fault current
levels during the short circuit. Due to the potential current influence, DG may
increase the fault current level compared to the normal network. Besides, the power
system has to

W. S. A. Wan Salman (&)  M. R. Adzman  M. Isa  N. S. B. Jamili  M. H. Idris 


M. Amirruddin
School of Electrical System Engineering, Universiti Malaysia Perlis, Pauh Putra Campus,
02600 Arau, Perlis, Malaysia
e-mail: [email protected]


deal with the mechanical forces and thermal effects of the short circuit due to the
high current that passes through the equipment. Even if the fault is moderate in
magnitude, it may cause long-term effects such as accelerated aging of the dielectric
insulation. The objectives of this paper are to model a circuit using the IEEE 34
node test feeder as reference and to study the effect of DG during faults in the
distribution network. It also discusses the standard calculation of the thermally
permissible current of a cable based on the IEC Standard as guidance.

2 Literature Review

2.1 Distributed Generation

DG is a new approach in the electricity industry, and an analysis of the relevant
literature shows that there is no generally accepted definition of distributed
generation yet [5, 6]. In the literature, a large number of terms and definitions are
used in relation to distributed generation [4]. DG can be referred to as a small-scale
unit that is linked to the distribution system to support a residence. According to
[7], DG is defined as a generating plant that contains one or more generating units
interconnected to a distribution network at a medium voltage level, whose total power
output is consistently arranged to be totally utilized by the loads in the distribution
network. Some researchers [8–10] have defined DG as a unit that generates from
renewable resources; it produces electrical power by converting another form of
energy in a generating unit attached to a distribution system. DG also supplies the
system with less than 10 MW to produce electricity for consumers [6, 11]. Numerous
studies have investigated the effect of DG when it is embedded into a power
system [12–19].

2.2 Calculation of the Thermally Permissible Current of a Cable Based on IEC Std

Short circuit analysis mainly helps to evaluate the level of the fault current in order
to rate the breaking capacity of a device, and it offers a basis for the coordination
of the system protection in order to guarantee selectivity [20]. Injury, damage to
electrical equipment, and costly downtime are some of the many consequences that
can happen during a fault when it exceeds the rating of the interrupting devices.
Besides, the analysis helps to incorporate the changes that happen in a system, such as
utility fault levels, transformer size and impedance values, and conductor size and
type; the short circuit also increases the mechanical and thermal stress on electrical
equipment.

To calculate the fault contribution of numerous grid elements, standard calculation
methods such as IEC 60909 and IEC 949 have been proposed [21]. Short circuit
currents flowing through the conductors of various power system equipment create a
thermal effect on conductors and equipment due to heating and excess energy input
over time, as measured by I²t, where I is the short circuit current magnitude and t is
the short circuit current duration [22]. Protective relay operating times and
circuit-breaker clearing times must be fast enough to prevent prolonged overheating [23].
The thermal withstand strength of any equipment against short circuits is determined
by the short circuit duration, the initial short circuit current and the conductor
temperature prior to the short circuit [24]. The following calculation gives the short
circuit current rating of the conductor according to IEC 949 [25]. An ACSR type of
cable is used, as referred to in the IEEE 34 node test feeder datasheet, with a cross
section of 159.2 mm² and a short circuit duration time of 0.5 s [26].
Cross-sectional area of cable,

$$A = \frac{\sqrt{I^{2} t}}{k} \qquad (1)$$

where
A = nominal cross-sectional area, mm²
I = fault current, A
t = duration of fault current, s
k = factor dependent on cable type (aluminium: 148)
From physics,

$$Q = c\, m\, \Delta T \qquad (2)$$

where
Q = heat added, J
c = specific heat constant of material, J·g⁻¹·K⁻¹
m = mass of material, g
ΔT = temperature rise, K

Energy into the cable during a fault,

$$Q = I^{2} R t \qquad (3)$$

where
R = cable resistance, Ω

From the physical cable properties,

$$m = \rho_c\, A\, l \qquad (4)$$

and

$$R = \frac{\rho_r\, l}{A} \qquad (5)$$

where
ρ_c = material density, g·mm⁻³
ρ_r = resistivity of conductor, Ω·mm
l = length
Combining and substituting,

$$I^{2} R t = c\, m\, \Delta T \qquad (6)$$

$$I^{2} \left(\frac{\rho_r\, l}{A}\right) t = c\, (\rho_c\, A\, l)\, \Delta T \qquad (7)$$

From (7), we can derive k as below,

$$k = \sqrt{\frac{c\, \rho_c\, \Delta T}{\rho_r}} \qquad (8)$$

*Note that ΔT = θ_f − θ_i, where
θ_f = final (maximum) cable insulation temperature, °C (250 °C)
θ_i = initial operating cable insulation temperature, °C (90 °C)

General form of the adiabatic temperature rise formula,

$$I^{2} t = k^{2} A^{2} \ln\!\left(\frac{\theta_f + \beta}{\theta_i + \beta}\right) \qquad (9)$$

where
I = short circuit current (rms over duration), calculated on an adiabatic basis, A
t = duration of the short circuit, s
β = reciprocal of the temperature coefficient of resistance of the current-carrying
component at 0 °C (aluminium: 228)
A final formula for calculating the value of the fault current has been derived in
Eq. 9.

 
$$I^{2} t = k^{2} A^{2} \ln\!\left(\frac{\theta_f + \beta}{\theta_i + \beta}\right) \qquad (10)$$

Substituting the values into (10),

$$I^{2}(0.5) = (148)^{2} (159.2)^{2} \ln\!\left(\frac{250 + 228}{90 + 228}\right) \;\Rightarrow\; I = 21.28\ \mathrm{kA}$$

The above calculation shows that the cable can allow a maximum of 21.28 kA of fault
current to pass through during a fault. If the fault current exceeds this value, it may
cause interruptions as well as damage to the system. Therefore, using the above value
as a reference, a simulation was conducted to study the behaviour of the short circuit
in the presence of DG.
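The adiabatic check above is easy to script. The following sketch is an illustrative aid only; the constants mirror the values quoted in the text and it reproduces the 21.28 kA figure from Eq. 10.

```python
import math

def adiabatic_withstand_current(area_mm2, k, t_s, theta_f, theta_i, beta):
    """Maximum short-circuit current (A) a conductor can carry for t_s seconds,
    from the adiabatic formula I^2 t = k^2 A^2 ln((theta_f + beta)/(theta_i + beta))."""
    i_squared = (k ** 2) * (area_mm2 ** 2) * math.log((theta_f + beta) / (theta_i + beta)) / t_s
    return math.sqrt(i_squared)

# Values quoted in the text: ACSR 159.2 mm^2, k = 148, 0.5 s fault duration,
# 250 C final / 90 C initial insulation temperature, beta = 228 for aluminium.
i_max = adiabatic_withstand_current(159.2, 148, 0.5, 250, 90, 228)
print(f"{i_max / 1000:.2f} kA")   # ~21.28 kA
```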

3 Methodology

The IEEE 34 node test feeder was selected as reference and modelled in the ATPDraw
software. An IEC Standard calculation was also presented to calculate the thermally
permissible current of the cable; it serves as guidance on the limit of the current that
is allowed to pass through the cable during the fault. The results compare two types of
faults, a single line to ground fault and a three-phase fault, applied to the system
during the short circuit with the addition of DG. The Hilbert Transform was used to
compare the results before and after DG was inserted into the system. Some
modifications were made to the modelled network, such as removing the regulator, to
clearly determine the influence of DG on the system voltage. The load was also changed
to balance the three phase loads. Throughout this project, the DG was considered to be
located at bus 828, bus 848 and bus 890. The feeder contains 34 buses with an exposure
of approximately 87.5 km. The DG was modelled with SM-type 58. A current and voltage
probe was located at the main substation to measure the short circuit value during a
fault. The DG size is 300 kV.
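Since the comparison signals in Figs. 2–17 are magnitude envelopes, a Hilbert-transform envelope can be computed as sketched below. This is an illustrative stand-in (SciPy on a synthetic 50 Hz waveform), not the authors' actual post-processing of the ATPDraw output.

```python
# Envelope of a fault-current waveform via the Hilbert transform.
# Illustrative only: the waveform below is synthetic, not an ATPDraw export.
import numpy as np
from scipy.signal import hilbert

fs = 7000.0                                   # samples per second (assumed)
t = np.arange(0, 0.5, 1.0 / fs)               # 0.5 s simulation window
current = 500.0 * np.sin(2 * np.pi * 50 * t)  # pre-fault 50 Hz current (A)
current[t >= 0.05] *= 5.0                     # crude step-up when the fault is applied at 0.05 s

envelope = np.abs(hilbert(current))           # magnitude envelope, as plotted in the figures
print(envelope.max())
```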

4 Result and Discussion

This section discusses the overall results of the simulation. The collected results are
presented in Table 1.

Table 1 Overall results

  Fault  Distance of fault to   Distance of fault  Type of  Current (no   Current (with  Voltage (no   Voltage (with
         substation (km)        to DG (km)         fault    DG) (A)       DG) (A)        DG) (kV)      DG) (kV)
  F1     0.80                   34.00              LG       2794.00       2800.00        22.03         22.57
                                                   LLLG     2617.00       2601.00        30.68         31.18
  F2     32.00                  3.40               LG       508.10        515.30         22.12         22.70
                                                   LLLG     636.20        639.60         23.46         23.98
  F3     42.00                  6.40               LG       387.20        228.90         20.85         21.77
                                                   LLLG     481.50        385.50         24.30         24.81
  F4     56.50                  21.00              LG       286.10        105.00         21.32         21.64
                                                   LLLG     351.30        241.00         23.83         24.52

4.1 Impact of DG to the Location of Faults

The main goal of this section is to study the effect of DG on faults located at several
places. The fault locations were randomly selected, and the system was equipped with
DG at node 828. The simulation was run for 0.5 s while a single line to ground fault
and a three-phase fault were separately applied starting from 0.05 s. The results
compare the values of phase current and phase voltage without DG and with DG embedded
into the system. All locations of the faults and the DG are shown in Fig. 1.

Fig. 1 Location of faults (F1–F4) and DG (node 828) on the modelled feeder



Case Study 1: Fault at F1. Firstly, a single line to ground fault was located at F1,
which is 800 m from the substation and 34 km from the DG. After the DG was inserted
into the system, the fault current increased from 2790 to 2800 A, as shown in Fig. 3.
Similarly, as shown in Fig. 2, the DG increased the short circuit voltage from 22.03 to
22.60 kV. Next, a three-phase fault was inserted into the system at the same distance.
The current decreased from 2617 to 2601 A and the voltage increased from 30.68 to
31.18 kV. Figures 4 and 5 show both results respectively.
Case Study 2: Fault at F2. A single line to ground fault was located at F2, which is
32 km from the substation and 3.4 km from the DG. In Fig. 6, the result shows that
after the DG was inserted into the system, the voltage increased from 22.12 to
22.70 kV, while the fault current increased from 508.1 to 515.3 A, as shown in Fig. 7.
When the three-phase fault was inserted, the voltage increased from 23.46 to 24.00 kV,
while the current increased from 636.2 to 639.6 A, as shown in Figs. 8 and 9
respectively.
Case Study 3: Fault at F3. Another fault was inserted at F3, which is located 42 km
from the substation and 6.4 km from the DG. In the single line to ground fault case,
the addition of DG increased the voltage from 20.85 to 21.77 kV, while the current
magnitude decreased from 387.2 to 228.9 A. The results are shown in Figs. 10 and 11
respectively. When the three-phase fault was inserted into the system, the voltage
magnitude increased from 24.30 to 24.80 kV, as in Fig. 12. In contrast, by referring to
the signal in Fig. 13, the current magnitude of the system decreased from 481.5 to
385.5 A.
Case Study 4: Fault at F4. Lastly, a fault was inserted at F4, located 56.5 km from the
substation and 21 km from the DG. From Fig. 14 it can be seen that the

Fig. 2 Comparison of voltage magnitude before and after the DG was inserted when a single line to ground fault was inserted at F1

Fig. 3 Comparison of current magnitude before and after the DG was inserted when a single line to ground fault was inserted at F1

Fig. 4 Comparison of voltage magnitude before and after the DG was inserted when the three-phase fault was inserted at F1

addition of DG into the system increased the fault voltage from 21.32 to 21.64 kV. On
the other hand, after the DG was added into the system, the fault current decreased
from 286.1 to 105 A; the signal is shown in Fig. 15. When a three-phase fault was
injected into the system, the voltage increased from 23.83 to 24.52 kV after the DG
was added, whereas the current decreased from 351.3 to 241 A. Figures 16 and 17 show
both results respectively.

Fig. 5 Comparison of current magnitude before and after the DG was inserted when the three-phase fault was inserted at F1

Fig. 6 Comparison of voltage magnitude before and after the DG was inserted when a single line to ground fault was inserted at F2

For easier understanding, all of the collected results are given in Table 1. From the
results, the nearer the fault is to the substation, the more the voltage drops from its
nominal value. It can also be observed that the addition of DG into the system
increased the magnitude of the voltage. This indicates

Fig. 7 Comparison of current magnitude before and after the DG was inserted when a single line to ground fault was inserted at F2

Fig. 8 Comparison of voltage magnitude before and after the DG was inserted when the three-phase fault was inserted at F2

that the presence of DG can help in improving the voltage profile of the system during
all three types of faults. When a single line to ground fault occurs in the system, it
causes an increment of the fault current if the fault is located before the DG.

Fig. 9 Comparison of current magnitude before and after the DG was inserted when the three-phase fault was inserted at F2

Fig. 10 Comparison of voltage magnitude before and after the DG was inserted when a single line to ground fault was inserted at F3

In contrast, if the fault is located after the DG, the fault current decreases from its
original value. However, with the occurrence of a three-phase fault, the magnitude of
the fault current shows a decrease in value, even though it is still the highest
compared to the other type of fault. Referring to the calculation made in the

Fig. 11 Comparison of current magnitude before and after the DG was inserted when a single line to ground fault was inserted at F3

Fig. 12 Comparison of voltage magnitude before and after the DG was inserted when the three-phase fault was inserted at F3

literature review, the current should not exceed 21.28 kA for acceptable performance.
Even though the fault current increases only by a small value with the addition of the
DG, the increment should not be ignored, so that it does not exceed the limiting rating
of the cable.

Fig. 13 Comparison of current magnitude before and after the DG was inserted when the three-phase fault was inserted at F3

Fig. 14 Comparison of voltage magnitude before and after the DG was inserted when a single line to ground fault was inserted at F4

Fig. 15 Comparison of current magnitude before and after the DG was inserted when a single line to ground fault was inserted at F4

Fig. 16 Comparison of voltage magnitude before and after the DG was inserted when the three-phase fault was inserted at F4

Fig. 17 Comparison of current magnitude before and after the DG was inserted when the three-phase fault was inserted at F4

5 Conclusion

In this paper, a standard IEEE 34 node test feeder, with some modifications, was
modelled in the ATPDraw software. It includes DG in order to study the short circuit
behaviour during faults. Two types of faults were used, a single line to ground fault
and a three-phase fault. By using the IEC standard, the thermally permissible current
of the cable was calculated as guidance on the limit of the fault current that may pass
through the cable.
According to the simulation results, it can be concluded that the fault current may
increase if the fault occurs before the location of the DG in the case of a single line
to ground fault, while for the three-phase fault the current decreases. However, the
fault current is still highest for the three-phase fault compared to the single line to
ground fault. DG also helps in improving the voltage profile, as the voltage increases
in all cases. Even though the fault current shows only a small increment with the
addition of the DG, it should not be ignored. The calculation can be used as a
reference for the limit of the cable and can help to improve the performance of the
protection system.
Further research needs to be conducted in order to improve the protection of the power
system. In future, the scope of work can be extended to study the effect of the short
circuit current on mechanical strength and substation equipment, as this paper only
discusses the thermal permissibility of a cable. This can help engineers to study how
severe the effect of DG is and to improve the protection system of a network. Besides,
future work may include the impacts of different sizes and numbers of DG embedded into
a system.

References

1. Yusof S, Osrin Z (2009) EPSCOM 2009 electrical power system competition 2009
2. Kotamarty S (2006) Impact of distributed generation on distribution contingency analysis
3. Anfossi RH (2010) An assessment of the technical and economic benefits of distributed
generation
4. Gharehpetian GB, Shahnia F (2005) Short circuit fault transient studies of power distribution
systems including dispersed generation. Ezepdico.Ir. 1–5
5. Ackermann T, Anderson G, Soder L (2001) Distributed generation: a definition. Electr Power
Syst Res 57:195–204
6. Bhadoria V, Singh N, Shrivastava V (2013) A review on distributed generation definitions
and DG impacts on distribution system. Researchgate. https://doi.org/10.13140/RG.2.1.4439.
4328
7. Osman H, Mohd Siam DF, Yusof DS, Ngah H, Azit AH, Majid AA, Koon LC (2005)
Asnawi: technical guidebook for the connection of generation to distribution system
8. Balan D, Dobri I (2011) Protection coordination issues distribution generation system
9. Christos A (2016) Investigation of the distribution grid hosting capacity for distributed
generation and possible improvements by SmartGrid technologies
10. Driesen J, Belmans R (2006) Distributed generation: challenges and possible solutions. In:
2006 IEEE power engineering society general meeting, pp 1–8. https://doi.org/10.1109/pes.
2006.1709099
11. Balafouti V (2006) Impact of Distributed generation on the power quality in distribution
networks
12. Yazdanpanahi H (2014) Protection and power quality impact of distributed generation on
distribution system
13. Hussain Shah Jillani SS (2012) Impact of distributed generation on distribution systems and
its protection
14. Toribio EEJ (2009) Impact of Distributed Generation on unbalanced power systems
15. Singh SN, Ostergaard J, Jain N (2009) Distributed generation in power systems: an overview
and key issues. In: 24rth Indian engineering congress, India
16. Teshome AD (2016) Study of distributed generation in improving power system reliability
17. Wan Salman WSA, Adzman MR, Isa M, Abdul Hamid H, Hanafi NH (2019) Effect of
distributed generation (DG) to earth fault. Springer, Singapore
18. Sudhakar P, Malaji S, Sarvesh B (2018) Reducing the impact of DG on distribution networks
protection with reverse power relay. Mater Today Proc 5:51–57. https://doi.org/10.1016/j.
matpr.2017.11.052
19. Hudananta S, Haryono T, Sarjiya (2017) Study of overcurrent protection on distribution
network with distributed generation : an Indonesian case. In: 2017 international seminar on
application for technology of information and communication (iSemantic), pp 126–131.
https://doi.org/10.1109/isemantic.2017.8251856
20. Bracale A, Caramia P, Carpinelli G, Fazio AR Di (2017) Electrical power and energy systems
modeling the three-phase short-circuit contribution of photovoltaic systems in balanced power
systems. Int J Electr Power Energy Syst 93:204–215. https://doi.org/10.1016/j.ijepes.2017.05.
032
21. Kamara W (2013) Representation of type 4 wind turbine generators for steady state
short-circuit calculation
22. Tleis N (2008) Power system modelling and fault analysis. Elsevier Ltd.
23. Adapa R (2006) Fault current management guidebook - updated. EPRI
24. Schlabbach J, Rofalski K-H (2008) Power system engineering. Wiley, VCH
25. British Standards Institute Staff (2003) Calculation of thermally permissible short-circuit
currents, taking into account non-adiabatic heating effects
26. Baughman ML, Dugan RC (1921) IEEE 34 node test feeder
Optimal Reactive Power Dispatch
Solution by Loss Minimisation Using
Dragonfly Optimization Algorithm

Ibrahim Haruna Shanono, Masni Ainina Mahmud,


Nor Rul Hasma Abdullah, Mahfuzah Mustafa, Rosdiyana Samad,
Dwi Pebrianti, and Aisha Muhammad

Abstract Optimal reactive power dispatch (ORPD) is an essential optimisation aspect in
an electrical power system that aids in attaining optimal reliability, security as well
as cost-effective operation of the system network. ORPD is an optimal power flow (OPF)
problem with nonlinear and non-convex constrained objective functions having both
discrete and continuous control variables. The concept involves the minimisation of the
entire network active power losses using either classical or nature-inspired
optimisation algorithms. This research study deploys the Dragonfly Optimization
Algorithm (DOA) to obtain the best control variable vector that gives the minimal total
active power loss on a standard IEEE 30-bus system network. To ascertain the
algorithm's performance, the obtained result was compared with Particle Swarm
Optimization (PSO), where DOA proved to be more robust and effective in solving the
ORPD problem.


Keywords Optimal reactive-power dispatch · Dragonfly optimisation algorithm ·
Nature-inspired optimisation algorithms · Loss minimisation · IEEE 30-bus system
network

I. H. Shanono (&)  M. A. Mahmud  N. R. H. Abdullah  M. Mustafa  R. Samad 


D. Pebrianti
Faculty of Electrical and Electronics Engineering, Universiti Malaysia Pahang, 26600 Pekan,
Pahang, Malaysia
e-mail: [email protected]
I. H. Shanono
Department of Electrical, Faculty of the Engineering, Bayero University, Kano, Nigeria
A. Muhammad
Department of Mechatronics, Faculty of the Engineering, Bayero University, Kano, Nigeria


1 Introduction

One major critical economic indicator in the contemporary world is the amount of
electricity a country generates and utilises. By virtue of this, the need for a stable,
sustainable, sufficient as well as reliable electrical power system network is crucial.
As shown in Fig. 1, the power grid consists of three stages, namely generation,
transmission and distribution, which deliver power to both industrial and residential
consumers [1]. The power network needs to be robust to accommodate all the dynamic
factors associated with the system [2].
Usually, network losses in a power system vary between 5–10% of the total generation,
which translates to millions of dollars per annum [3]. Due to the rapid voltage
fluctuations caused by the variation in consumer demands, there is a need to devise a
suitable means of maintaining stable load bus voltages, apparent power through the
lines, as well as the generator reactive power [4]. Consequently, loss minimisation as
well as voltage balancing need to be adequately addressed in any power system network
[5–7]. The two can be controlled by regulating and monitoring the magnitude of the bus
voltages, transformer tap settings and capacitive reactive power injection while
adhering to the unit and system constraints [8, 9].
Previously, ORPD problems were solved using conventional optimisation strategies such
as linear programming [10, 11], non-linear programming, quadratic programming [12],
the Newton-Raphson method, gradient-based algorithms and the interior point method
[13]. However, they were found to be inefficient in handling nonlinear functions as
well as discrete variables [14, 15], hence resulting in inaccurate solutions. Due to
these disadvantages, coupled with the rapid advancement in computer technology,
stochastic optimisation techniques for solving power system problems emerged.
Heuristic and meta-heuristic population-based strategies with stochastic transition
rules, such as the genetic algorithm (GA) [16, 17], improved GA [18], real parameter
GA [19], adaptive GA [17], evolutionary strategy (ES) [20], hybrid evolutionary
programming [21], particle swarm optimization (PSO) [22], hybrid PSO [23], bacterial
foraging optimization (BFO) [24],

Fig. 1 Power grid system (generation, transmission and distribution stages, with step-up and step-down transformers)



self-adaptive differential evolution (SADE) [25, 26], artificial bee colony
(ABC) [27], harmony search algorithm (HSA) [28], improved and modified HSA
[29, 30], Tabu search (TS), gravitational search algorithm (GSA) [31, 32], gray
wolf optimization (GWO) [33], hybrid firefly algorithm (HFA) [34] and
teaching-learning-based optimization algorithm (TLBO) [35], and many more were
successfully applied to obtain ORPD solutions. Their stochastic nature, their ability
to produce the global optimum, and their capability of handling non-convex as well as
discontinuous functions are essential to their success.
A significant number of ORPD research articles have been published; among the
prominent ones is that presented in [20]. The work employs evolutionary programming
(EP) to simulate power flow computations on an IEEE 30-bus network. EP appeared to
provide a broader landscape search and faster convergence than the
Broyden–Fletcher–Goldfarb–Shanno (BFGS) method [36]. The simulated evolution is robust,
with the ability of parallel processing [20]. The same author presented an improved
genetic algorithm for ORPD, which equally finds the optimal global solutions within the
shortest computational time [17].
A combination of fuzzy and linear programming (LP) was presented in [37] and used to
obtain ORPD solutions. The hybrid model is called Multi-objective Fuzzy Linear
Programming (MFLP). The same MFLP technique was later applied to cases where the
network is under emergency conditions, and it is capable of assisting in restoring the
network [37, 38].
Nakawiro et al. proposed a novel heuristic optimisation called Mean-Variance Mapping
Optimization (MVMO) and tested it on both the IEEE 57 and 118 bus systems. The results
obtained were then compared against enhanced evolutionary algorithms, where the
heuristic proved to be superior on the IEEE 118 test case [39]. In similar work
reported by El-Ela et al., an Ant Colony Optimisation (ACO) algorithm was used to solve
the ORPD problem on 14 and 30-bus systems, and the result was later tested on a real
power system called the Unified Egyptian Network (UEN). The obtained simulation results
appeared to show superiority over Linear Programming (LP), the Genetic Algorithm (GA)
and Particle Swarm Optimization (PSO) [40]. Chao et al. combined the variable scaling
mutation of differential evolution and the probabilistic state transition rule of an
ant system to solve the ORPD problem on the IEEE 30 bus system. The proposed hybrid
method was compared against EP, DE and PSO, where it presented the least active power
losses in the lines and the fastest convergence time [41]. Sulaiman et al. used a
nature-inspired swarm technique called the Cuckoo Search Algorithm (CSA) to minimise
the transmission line power loss in the IEEE 57-bus network. The technique proved to be
the most effective when compared with the PSO and DE algorithms [42]. Lenin et al. used
a modified shuffled frog-leaping (MSFL) algorithm to solve the ORPD problem [43].
In this paper, a population-based Dragonfly meta-heuristic optimisation algorithm is
employed to solve the ORPD problem. It intends to reduce the transmission line power
losses as well as enhance the voltage profile under the control and dependent variable
constraints. The dragonflies' static and dynamic swarming behaviour is modelled to fit
the two optimisation stages called exploration and exploitation. The proposed DOA is
used on the standard IEEE 30-bus system. The

obtained result is then compared with a PSO algorithm, where the DOA appears to exhibit
better performance than PSO.

2 The ORPD Problem Formulation

This section explains the ORPD problem formulation by detailing the objective function
expressions, the equality and inequality constraints, and the reactive power and
transformer tap ratio relations.

2.1 Objective Function

In the ORPD problem, the objective function is formed to minimise the transmission line
power losses under different sets of equality and inequality constraints. This is
achieved through proper variation of the generator bus voltages, transformer tap
settings and capacitor banks. The equality constraints include the power flow
equations, and the inequality constraints are the minimum and maximum voltage limits,
capacitor banks and transformer tap ratios. The mathematical expression for the
non-linear objective function of the ORPD problem to be minimised is given by the
expression below:

$$\text{Minimise } F(x, u) \qquad (1)$$

Subject to:

$$g(x, u) = 0 \qquad (2)$$

$$h(x, u) \le 0 \qquad (3)$$

where F, g and h signify the objective function and the equality and inequality
constraints, and u and x are the control and dependent variables. The control variables
are the generator voltages, transformer tap ratios and the capacitive reactive power,
whereas the dependent variables comprise the bus power, load bus voltages, generator
reactive power and line flows. Therefore, the solution of the ORPD objective function
will provide the optimal control variables that minimise the real power loss and the
voltage deviation. The function can be formulated using a weighted approach given in
Eq. (4):

$$F = w \cdot P_{loss} + (1 - w) \cdot V_d \qquad (4)$$

Where w, Ploss and Vd represent the weighted factor, real power loss and the
voltage deviation respectively. Ploss and Vd are computed using the below
equations:
X X  
Minimise Ploss ¼ keNl
Pkloss ¼ keNE
gk Vi2 þ Vj2  2Vi Vj coshij ð5Þ

XNl  
Minimise Vd ¼ Vi  Vref  ð6Þ
i¼1
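To make the weighted formulation of Eqs. (4)–(6) concrete, a minimal Python sketch of the objective evaluation is given below; the three-bus line data, voltages, reference voltage and weighting value are illustrative placeholders, not values from this chapter. In practice the voltage magnitudes and angles would come from a load flow solution (e.g. MATPOWER, as used later in this work).

```python
import math

def weighted_orpd_objective(lines, V, theta, v_ref=1.0, w=0.7):
    """Evaluate F = w*Ploss + (1 - w)*Vd, following Eqs. (4)-(6).

    lines : list of (i, j, g_k) branch tuples, with g_k the branch conductance (p.u.)
    V     : bus voltage magnitudes (p.u.)
    theta : bus voltage angles (rad)
    """
    # Eq. (5): real power loss summed over all branches
    p_loss = sum(g_k * (V[i] ** 2 + V[j] ** 2
                        - 2.0 * V[i] * V[j] * math.cos(theta[i] - theta[j]))
                 for i, j, g_k in lines)
    # Eq. (6): total voltage deviation from the reference voltage
    v_dev = sum(abs(v - v_ref) for v in V)
    # Eq. (4): weighted single-objective function
    return w * p_loss + (1.0 - w) * v_dev

# Toy 3-bus example with made-up data (illustrative only)
lines = [(0, 1, 4.0), (1, 2, 3.0), (0, 2, 2.5)]
print(weighted_orpd_objective(lines, V=[1.05, 1.00, 0.98], theta=[0.0, -0.03, -0.05]))
```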

2.2 Equality Constraints

The equality constraints are the active and reactive power balance equations:


P_i − V_i ∑_{j=1}^{N_B} V_j ( G_ij cos θ_ij + B_ij sin θ_ij ) = 0    (7)

Q_i − V_i ∑_{j=1}^{N_B} V_j ( G_ij sin θ_ij − B_ij cos θ_ij ) = 0    (8)

where k refers to the branch between buses i and j; P_loss, G_k and B_k are the active power loss, mutual conductance and susceptance of branch k, respectively; V_i and V_j are the voltages at load buses i and j; P_i and Q_i are the active and reactive power at the buses; θ_ij is the voltage angle difference between buses i and j; and N_l and N_B are the total numbers of transmission lines and load buses.

2.3 Inequality Constraints

The inequality constraints are the generator constraints, the transformer tap settings and the reactive compensation sources, each restricted by upper and lower bounds:

V_i^min ≤ V_i ≤ V_i^max,  i ∈ B    (9)

T_k^min ≤ T_k ≤ T_k^max,  k ∈ BRK    (10)

Q_ci^min ≤ Q_ci ≤ Q_ci^max,  i ∈ B_C    (11)

where T_k is the transformer tap of branch k, Q_ci is the reactive power of the shunt capacitor at bus i, B is the set of system buses, B_C is the set of buses with shunt capacitors and BRK is the set of transformer branches.
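One simple way to enforce the box constraints (9)–(11) inside a meta-heuristic is to peg each out-of-limit control variable at its nearest bound before the load-flow evaluation. A minimal Python sketch of that repair step is given below; the 13-variable layout and the limit values are illustrative placeholders, not the exact vector ordering used in this chapter.

```python
def clamp_controls(u, lower, upper):
    """Peg each control variable at its bound if it violates Eqs. (9)-(11)."""
    return [min(max(value, lo), hi) for value, lo, hi in zip(u, lower, upper)]

# Illustrative layout: 6 generator voltages, 4 tap ratios, 3 shunt VAR injections
lower = [0.90] * 6 + [0.95] * 4 + [-12.0] * 3
upper = [1.10] * 6 + [1.05] * 4 + [36.0] * 3
u = [1.13, 1.02, 0.88, 1.05, 1.00, 1.07, 1.06, 0.97, 1.00, 1.01, 40.0, 5.0, -15.0]
print(clamp_controls(u, lower, upper))
```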

3 Dragonfly Algorithm

The Dragonfly Algorithm optimisation technique originates from the static and dynamic swarming behaviour of dragonflies, or Odonata insects. The two behaviours are analogous to the exploration and exploitation stages in optimisation. The swarming behaviour is classified into three main categories, namely food hunting, survival and self-defence from enemies [44].
Therefore, the dragonfly foraging and migration swarming social interactions are designated as the static and dynamic swarms. The static swarm models a small group of dragonflies that hovers back and forth over an area in search of food, whereas the dynamic swarm represents a large group of migrating dragonflies moving in the same direction for a long duration. Figure 2 below illustrates the static and dynamic swarm behaviour.

3.1 Movement of Dragonfly

The dragonfly has five primitive behaviours, namely separation, alignment, cohesion, attraction towards a food source and distraction/evasion from enemies [44]. Figure 3(a)–(e) below depicts a pictorial representation of the five behaviours.
All these behavioural characteristics need to be considered and modelled in MATLAB. Several researchers treat the ORPD problem using multi-objective functions, such as the work presented by Ouafa et al., who used a Multi-objective Dragonfly Algorithm (MODA) to minimise fuel cost, emission, voltage deviation and power losses in the same system [45].
The DA optimisation algorithm starts by creating a set of random solutions for
the given function, which then initiates the dragonflies’ position and step vectors

Fig. 2 (a) Static swarm of dragonflies; (b) dynamic swarm of dragonflies



Fig. 3 Dragonfly algorithm principles

using random values defined within the lower and upper boundaries. For every iteration, both the position and step of each dragonfly are updated using Eqs. (13)–(17) given below. Equation (13) is the mathematical model of the separation property, with S representing the individual dragonfly's collision avoidance. Equation (14) is for the alignment, with A denoting the individual velocity of the dragonfly. Equation (15) is the mathematical model of the cohesion among the dragonflies, with C taken as the tendency of each dragonfly towards the centre of mass. Lastly, Eqs. (16) and (17) are for the attraction towards the food and the distraction away from the enemy, with X⁺ and X⁻ representing the food and enemy positions respectively, while X is the current dragonfly position.
S_i = − ∑_{j=1}^{N} ( X − X_j )    (13)

A_i = ( ∑_{j=1}^{N} V_j ) / N    (14)

C_i = ( ∑_{j=1}^{N} X_j ) / N − X    (15)

F_i = X⁺ − X    (16)

E_i = X⁻ + X    (17)

In updating the X and ΔX vectors, the neighbourhood of each dragonfly is selected by computing the Euclidean distance between all the selected (N) dragonflies. The updating process proceeds iteratively until one of the stopping criteria is satisfied: either the solution converges to the optimal value or the maximum number of iterations is reached (Fig. 4).
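As an illustration of how the terms of Eqs. (13)–(17) combine into a position update, a simplified Python sketch of one DA-style iteration step is given below. The behaviour weights, the neighbourhood radius and the swarm data are illustrative assumptions rather than the settings used in this chapter, and the step-weight scheme is kept deliberately simple.

```python
import numpy as np

def dragonfly_step(X, dX, food, enemy, radius=0.3,
                   s=0.1, a=0.1, c=0.7, f=1.0, e=1.0, w=0.9):
    """One simplified Dragonfly update for a swarm X (agents x dims).

    X, dX       : current positions and step vectors
    food, enemy : best and worst solutions found so far
    radius      : Euclidean neighbourhood radius
    s,a,c,f,e,w : behaviour weights (illustrative values)
    """
    X_new, dX_new = X.copy(), dX.copy()
    for i in range(X.shape[0]):
        dist = np.linalg.norm(X - X[i], axis=1)
        nbrs = np.where((dist > 0.0) & (dist <= radius))[0]
        if nbrs.size == 0:
            # No neighbours: random walk pulled towards the food source
            dX_new[i] = w * dX[i] + np.random.rand(X.shape[1]) * (food - X[i])
        else:
            S = -np.sum(X[i] - X[nbrs], axis=0)      # separation, Eq. (13)
            A = dX[nbrs].mean(axis=0)                # alignment,  Eq. (14)
            C = X[nbrs].mean(axis=0) - X[i]          # cohesion,   Eq. (15)
            F = food - X[i]                          # food,       Eq. (16)
            E = enemy + X[i]                         # enemy,      Eq. (17)
            dX_new[i] = s * S + a * A + c * C + f * F + e * E + w * dX[i]
        X_new[i] = X[i] + dX_new[i]
    return X_new, dX_new

# Tiny usage example: 5 agents, 3 control variables in [0.9, 1.1]
rng = np.random.default_rng(0)
X = rng.uniform(0.9, 1.1, size=(5, 3))
dX = np.zeros_like(X)
X, dX = dragonfly_step(X, dX, food=X[0].copy(), enemy=X[-1].copy())
```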

Fig. 4 Flowchart of the Dragonfly algorithm for the ORPD solution (set the number of dragonflies (search agents) and maximum iterations, the line outages (contingency analysis) and the loading factor; initialise and assign the control variables to the load flow data; evaluate the transmission losses from the load flow calculation using MATPOWER; assign the fitness to the step vector and update the velocity and position of each dragonfly, pegging out-of-limit control variables at their limits, until the maximum iteration is reached)



4 Results and Discussion

To assess the performance and efficacy of the proposed DA in solving ORPD problems, the IEEE 30-bus system is used. As mentioned, the goal is to minimise the active power transmission losses by optimally setting the control variables within their limits while making sure that the equality and inequality constraints are satisfied. Therefore, the variation of the loading factor with and without contingencies, the effect of varying the number of iterations and the effect of varying the number of search agents on the losses are investigated. The script is coded as a MATLAB M-file.

4.1 IEEE 30 Bus Test System

The standard IEEE 30-bus system [46] has been used to demonstrate the effectiveness of the DA algorithm in handling ORPD problems. It comprises 30 buses, including six generators (situated at buses 1, 2, 5, 8, 11 and 13) and twenty-four load buses, of which three (buses 3, 10 and 24) are selected for shunt VAR compensation. It has forty-one branches, of which four (lines 6-9, 6-10, 4-12 and 27-28) are tap-changing transformers. In total, therefore, thirteen control variables are selected to be optimised. Based on the solution obtained from the initial load flow study using the Newton-Raphson method, the initial system active (PLOAD) and reactive (QLOAD) load powers are 283.40 MW and 126.20 MVAR, respectively, while the initial total generated power and total power loss are 289.06 MW (PG) and 97.61 MVAR (QG), and 5.663 MW (PLOSS) and 30.33 MVAR (QLOSS), respectively.
Table 1 summarises the limits of the control variables, namely the generator bus voltage magnitudes (Vg), the transformer tap settings and the reactive power limits of the shunt VAR injections (Qc).

Table 1 Control variable limits for the IEEE 30-bus system

  Variables                  Minimum limit    Maximum limit
  Generator voltage          0.9 p.u.         1.1 p.u.
  Transformer tap setting    0.95 p.u.        1.05 p.u.
  Shunt reactive power       −12 MVar         36 MVar

4.2 Case 1: Analysis Based on the Number of Search Agents

The proposed algorithm is evaluated to establish the relationship between the number of search agents and the loss minimisation capability, in order to select the most suitable number of agents for the entire optimisation process. Table 2 illustrates the results obtained over 50 iterations for five sets of agents, from a minimum of 10 to a maximum of 50 in increments of 10 agents. To compare the viability of DA with an existing optimiser, the same process was repeated with the PSO algorithm. In all five sets of values, DA performs better than PSO, since it produces the lowest transmission losses for the same number of search agents. In both the DA and PSO algorithms, the search performed with 30 agents produces the least transmission losses, standing at 4.6482 and 4.7310 MW, respectively. Based on that, 30 is chosen as the most suitable number of agents for the ORPD optimisation process.
Table 3 tabulates the optimised control variables of the IEEE 30-bus system for both the DA and PSO algorithms. As mentioned earlier, 13 control variables were optimised with the aid of 30 agents over 50 iterations. Looking at all the control parameters in both cases, DA shows better performance than PSO, reducing the PSO transmission losses by 1.75%. In high-power transmission lines this percentage reduction represents a substantial amount of energy, thereby enhancing network reliability and saving a considerable amount of cost.
Table 4 presents the minimum losses and the percentage loss reduction of both DA and PSO with respect to the base case. It clearly shows that the loss obtained by DA has been reduced to 4.6482 MW with respect to the 5.663 MW base-case value, equivalent to a 17.92% reduction, whereas PSO achieves a minimum loss of 4.7310 MW, a 16.47% reduction from the initial base case.
The variation of the transmission line losses with respect to the number of search agents is analysed and presented in Table 5. The DA algorithm is run 20 times using 20 and 30 search agents. Across the entire trial, the results obtained using 30 agents are better than those with 20 agents. Figure 5 shows the convergence performance curves for the two cases. From the plot, the trials performed with 30 agents provide lower losses throughout the 20 free-run simulations; the best, average and worst losses are 4.6482, 4.67775 and 4.7156 MW, respectively.

Table 2 Transmission losses based on variations in the number of search agents

  No. of agents    PSO (MW)    DA (MW)
  10               4.8771      4.6980
  20               4.7744      4.7074
  30               4.7310      4.6482
  40               4.8272      4.6955
  50               4.6968      4.7831

Table 3 ORPD results of control variables after optimization by DA and PSO for the IEEE 30-bus system (13 control variables, 30 search agents)

  Variables    PSO (p.u.)    DA (p.u.)
  V1           1.1           1.1
  V2           1.094         1.1
  V3           1.072         1.082
  V4           1.076         1.084
  V5           1.083         1.074
  V6           1.08          1.1
  T1           1.01          1.05
  T2           0.994         1.05
  T3           1.036         1.033
  T4           1.037         1.031
  QC1          9.732         4.139
  QC2          3.094         22.67
  QC3          12.52         13.06
  Loss (MW)    4.7310        4.6482

Table 4 Percentage of loss reduction after optimization by DA and PSO for the IEEE 30-bus system with 30 search agents

  Compared algorithm                  Base case    DA        PSO
  PLOSS (MW)                          5.663        4.6482    4.7310
  Percentage of loss reduction (%)    –            17.92     16.47

The run with 30 agents also shows only small step transitions between trials. In contrast, the run with 20 agents exhibits large sudden transitions between trials, with 4.7074, 4.775 and 4.877 MW as the best, average and worst losses.
For comparison, the PSO algorithm is simulated and run 20 times under the same parameters and characteristics as the DA. Table 6 summarises the results obtained using 30 and 20 PSO search agents, and Fig. 6 shows the comparison plot for PSO with 20 and 30 agents together with DA with 30 agents. Similar to the DA result, PSO with 30 agents shows a smaller amount of losses than with 20 agents. The best, average and worst values stood at 4.6909, 4.6767 and 4.731 for 30 agents and 4.6525, 4.6501 and 4.9536 for 20 agents. Moreover, PSO with 30 agents exhibits performance close to that of the proposed DA with 30 agents, since in some trials it gives smaller losses than the DA technique. The averages for the DA and PSO optimisation algorithms are 4.6764 and 4.677, respectively, indicating how close the performance of the two methods is in handling the ORPD problem.

Table 5 Comparison using 30 and 20 DA search agents over 20 trials

  No. of trial    DA (30) (MW)    DA (20) (MW)
  1               4.7060          4.7810
  2               4.6957          4.7666
  3               4.6911          4.7576
  4               4.6749          4.7537
  5               4.6778          4.8468
  6               4.6806          4.7074
  7               4.6744          4.7552
  8               4.6482          4.7243
  9               4.6615          4.7609
  10              4.6495          4.7830
  11              4.6556          4.7400
  12              4.6611          4.8212
  13              4.6714          4.8215
  14              4.6987          4.8770
  15              4.6806          4.8207
  16              4.6872          4.7636
  17              4.6575          4.7701
  18              4.6746          4.7433
  19              4.6666          4.7611
  20              4.7156          4.7445

Fig. 5 Comparison plot for 20 and 30 DA agents over 20 trials (line losses in MW versus trial number)



Table 6 Comparison using 30 and 20 PSO agents (alongside 30 DA agents) over 20 trials

  No. of trial    PSO (30) (MW)    PSO (20) (MW)
  1               4.6336           4.8788
  2               4.6615           4.6956
  3               4.7270           4.6835
  4               4.7003           4.6908
  5               4.6742           4.8193
  6               4.6600           4.6813
  7               4.6774           4.8055
  8               4.6552           4.7735
  9               4.6233           4.7284
  10              4.6705           4.6779
  11              4.6909           4.7727
  12              4.6822           4.7047
  13              4.6077           4.8943
  14              4.6494           4.7018
  15              4.6952           4.9536
  16              4.6965           4.6946
  17              4.6767           4.7439
  18              4.7100           4.7842
  19              4.7107           4.6525
  20              4.7310           4.6653

Fig. 6 Comparison plot for 20 and 30 PSO and 30 DA agents over 20 free trials (line losses in MW versus trial number)

4.3 Case 2: Variation of Loading Factor and Transmission Losses Without Line Outages

In this case, bus 26 is uniformly loaded in order to assess the transmission system stability without considering line outages. With 30 agents and 50 iterations, the system loss minimisation is evaluated for 0, 5, 10, 15, 20 and 23 MVAR of loading. Table 7 below presents both the initial and post-ORPD losses for the DA and PSO optimisers along with their percentage reductions.
The graphical representation of the obtained loss parameters is depicted in Fig. 7 below. Based on the curves, both optimisers perform positively by reducing the transmission losses from the initial values. Relatively, the DA optimiser exhibits better performance than PSO, as it yields the minimum losses and, at 23 MVAR, a maximum percentage loss reduction of 34.21% with respect to the pre-ORPD value. It is also worth mentioning that the maximum loading capacity of bus 26 is 23 MVAR; anything beyond that will render the system unstable and probably lead to total network collapse.
Table 8 shows the 13 control variables and transmission losses for both the DA and PSO algorithms at bus 26 with a 23 MVar loading factor. The result shows that DA produces better performance than PSO, with a higher percentage of loss reduction of 28.05% compared to 24.46% for PSO.

4.4 Case 3: Variation of Loading Factor and Transmission Losses with Line Outages

In this case, bus 26 is uniformly loaded in order to analyse the transmission system stability with line outages taken into consideration. With 30 agents and 50 iterations, the system loss minimisation is evaluated under the same loading conditions as the prior case. Tables 9 and 10 below present the transmission line

Table 7 DA and PSO transmission losses based on variation of the loading factor without line outages at Bus 26

  Loading factor (MVar)    Pre-ORPD (MW)    DA (MW)    DA % reduction    PSO (MW)    PSO % reduction
  0                        5.5954           4.7309     15.5              4.7771      14.6
  5                        5.8125           4.7247     18.7              4.8253      17.0
  10                       6.3277           5.0450     20.3              5.1371      18.8
  15                       7.2594           5.4915     24.4              5.6228      22.5
  20                       8.8413           6.3613     28.1              6.6788      24.5
  23                       10.3193          6.7895     34.2              7.6361      26.0

Fig. 7 Loss variation based on loading without an outage (losses in MW versus loading factor in MVar for the pre-ORPD, DA post-ORPD and PSO post-ORPD cases)

Table 8 ORPD results of control variables after DA and PSO optimization for the IEEE 30-bus system with loading = 23 MVar and no line outage

  Variables    DA (p.u.)    PSO (p.u.)
  V1           1.1          1.1
  V2           1.1          1.1
  V3           1.082        1.089
  V4           1.086        1.089
  V5           1.040        1.087
  V6           1.072        1.1
  T1           1.050        1.006
  T2           1.035        1.040
  T3           1.015        0.961
  T4           0.969        0.951
  C1           15.281       14.697
  C2           36           34.225
  C3           18.313       −1.257
  Loss (MW)    6.7895       7.6361

losses prior to and after the DA and PSO ORPD optimisation, respectively. The loading is varied in the same way as in the previous cases, with an outage at line 7, an outage at line 9 and a multi-contingency at lines 7 and 9. For both DA and PSO the power losses were minimised in all three cases, with DA showing better performance than PSO. The losses tend to increase with the loading factor, and the increase is most pronounced for the multi-contingency outage at lines 9 and 7, followed by line 7, with line 9 the least affected. Figure 8(a) and (b) show the DA and PSO comparison plots for the line-9 outage and the multi-contingency outage of lines 9 and 7. As stated earlier, both DA and PSO significantly reduce the amount of loss from the initial value. The two optimisers' curves almost overlap, signifying a very close performance between the two, but overall the DA optimisation produces the lowest transmission losses.
Table 11 shows the voltage profile obtained using the DA optimiser with a variation of the loading factor at bus 26, considering outages at line 9, at line 7 and the multi-contingency of lines 9 and 7. In all three cases, DA has significantly improved the voltage level compared with the initial value prior to the application of the ORPD optimiser.

4.5 Case 4: Variation of the Number of Iterations and Search Agents

In this case, bus 26 is fixed at a 23 MVar loading factor with an outage at line 9, and the effect of changing the number of iterations on the network stability is analysed with 30 search agents for both DA and PSO under the given system contingencies. The voltage profile and transmission losses for both optimisers were compared and analysed over the range of 10 to 50 iterations in steps of 10. Table 12 below summarises the transmission losses and voltages obtained for DA and PSO with the line-9 outage. The obtained results clearly show how both optimisers converge to a better value after a number of iterations, thereby obeying the rule of experience in artificial intelligence. At the 20th iteration DA deviates slightly from this trend,

Table 9 DA transmission loss variation with loading factor and line outage at Bus 26

  Loading factor (MVAR)    Analysis (MW)    Line 9     Line 7     Lines 9, 7
  0                        Before ORPD      6.1909     6.3568     6.5946
                           After ORPD       5.1342     5.5073     5.7372
  5                        Before ORPD      6.401      6.5768     6.8079
                           After ORPD       5.3447     5.5986     5.7831
  10                       Before ORPD      6.9047     7.1046     7.3232
                           After ORPD       5.518      5.8684     6.0365
  15                       Before ORPD      7.8174     8.0641     8.2616
                           After ORPD       6.1796     6.5308     6.6494
  20                       Before ORPD      9.3667     9.7031     9.8628
                           After ORPD       6.7994     7.2083     7.6041
  23                       Before ORPD      10.8109    11.2465    11.3661
                           After ORPD       7.4088     7.6526     7.8161

Table 10 PSO transmission loss variation with loading factor and line outage at Bus 26

  Loading factor (MVAR)    Analysis (MW)    Line 9     Line 7     Lines 9, 7
  0                        Before ORPD      6.1909     6.3568     6.5946
                           After ORPD       5.1437     5.6289     5.6994
  5                        Before ORPD      6.401      6.5768     6.8079
                           After ORPD       5.2225     5.5812     5.7133
  10                       Before ORPD      6.9047     7.1046     7.3232
                           After ORPD       5.5534     5.8461     6.1055
  15                       Before ORPD      7.8174     8.0641     8.2616
                           After ORPD       6.4317     6.4874     6.5838
  20                       Before ORPD      9.3667     9.7031     9.8628
                           After ORPD       6.7389     7.6642     7.7116
  23                       Before ORPD      10.8109    11.2465    11.3661
                           After ORPD       7.8703     7.8042     8.0529

Fig. 8 (a) Loss variation based on loading factor with an outage at line 9; (b) loss variation based on loading factor with an outage at lines 9 and 7 (losses in MW versus loading factor in MVar for the pre-ORPD, DA post-ORPD and PSO post-ORPD cases)

allowing PSO to produce the better result, but it subsequently recovers and produces the least amount of transmission loss, with an average of 7.5076 MW compared to 7.7489 MW for PSO. Both methods improve the voltage level, but DA shows a better result than its PSO counterpart, which produces its lowest voltage profile at the 20th iteration.
The same bus condition is then considered, but with the number of search agents varied instead, and the effect of this variation on the network transmission loss and voltage profile is analysed using 50 iterations. In both situations, DA shows better performance than PSO, with an average transmission loss of 7.52 MW

Table 11 Voltage profile results using DA with varying loading factor at Bus 26

  Loading factor (MVar)    Before ORPD (p.u.)                After ORPD (p.u.)
                           Line 9    Line 7    Lines 9, 7     Line 9    Line 7    Lines 9, 7
  0                        0.919     0.911     0.915          1.006     1.008     0.986
  5                        0.906     0.898     0.902          1.028     0.99      1.026
  10                       0.865     0.858     0.861          0.99      0.99      0.988
  15                       0.815     0.808     0.811          0.935     0.943     0.954
  20                       0.756     0.747     0.751          0.951     0.932     0.924
  23                       0.676     0.665     0.67           0.923     0.902     0.923

Table 12 Transmission loss and voltage variation results for DA and PSO with a line-9 outage and varying iteration number

  No. of iterations    DA (MW)    PSO (MW)    DA voltage (p.u.)    PSO voltage (p.u.)
  10                   7.8624     8.3275      0.8633               0.811
  20                   7.7694     7.7533      0.876                0.79
  30                   7.366      7.4812      0.946                0.902
  40                   7.2998     7.4337      0.945                0.91
  50                   7.2404     7.4076      0.965                0.929

as against 7.90 MW for PSO, and likewise 0.93 p.u. against 0.90 p.u. for the voltage profile. In general, DA gives a lower loss and a better voltage profile than its PSO counterpart (Table 13).

5 Conclusion

The Dragonfly Algorithm is a new meta-heuristic algorithm recently proposed to solve various engineering problems. In this work, DA is implemented to solve a vital non-linear optimisation problem in power systems. The proposed optimiser is successfully

Table 13 Transmission loss and voltage profile results for DA and PSO with a line-9 outage and varying number of agents

  No. of agents    DA (MW)    PSO (MW)    DA voltage (p.u.)    PSO voltage (p.u.)
  10               7.6792     7.9914      0.891                0.885
  20               7.4221     8.106       0.934                0.895
  30               7.4608     7.7608      0.941                0.904
  40               7.5227     7.7516      0.954                0.92

applied to solve the ORPD problem, making it among the first applications of this algorithm to the standard IEEE 30-bus system network. The effectiveness and performance of the algorithm were tested against the well-known meta-heuristic PSO technique. The DA results indicate superiority over those of PSO by providing better control variables and a lower power loss in the standard IEEE 30-bus network. Conclusively, the Dragonfly Algorithm is a robust optimiser that aids in achieving an optimal, reliable, secure and economically viable electrical power system network. In the future, multi-objective instead of single-objective formulations should be used with DA to solve the ORPD problem, and the application should be extended to other IEEE standard bus networks.

Acknowledgements The authors gratefully acknowledge the financial support from the Universiti Malaysia Pahang Grant RDU1703226.

References

1. Adebayo I, Sun Y (2017) New performance indices for voltage stability analysis in a power
system. Energies 10(12):1–18
2. Loriya V, Patel V (2016) A review on optimal reactive power dispatch. IJIREEICE 4(1):83–87
3. Conejo AJ, Galiana FD, Kockar I (2001) Z-bus loss allocation. IEEE Trans Power Syst 16
(1):105–110
4. Mamundur KRC, Chenoweth RD (1981) Optimal control of reactive power flow for
improvements in voltage profiles and for real power loss minimization. IEEE Trans Power
App Syst 100(7):3185–3194
5. Roy PK, Ghoshal SP, Thakur SS (2012) Optimal VAR control for improvements in voltage
profiles and for real power loss minimization using biogeography based optimization. Int J
Electr Power 43(1):830–838
6. Abdelmoumene M, Mohamed B, Boubakeur A (2013) Optimal reactive power dispatch using
differential evolution algorithm with voltage profile control. Int J Intell Syst 5(10):28–34
7. Suresh R, Kumar C, Sakthivel S, Jaisiva S (2013) Application of gravitational search
algorithm for real power loss and voltage deviation optimization. Int J Eng Sci Innov Technol
2(1):283–291
8. Ghasemi M, Taghizadeh M, Ghavidel S, Aghaei J, Abbasian A (2015) Solving optimal
reactive power dispatch problem using a novel teaching–learning-based optimization
algorithm. Eng Appl Artif Intell 39:100–108
9. Shaheen AM, Spea SR, Farrag SM, Abido MA (2016) A review of meta-heuristic algorithms
for reactive power planning problem. Ain Shams Eng J 9(2):215–231
10. Deeb NI, Shahidehpour SM (1988) An efficient technique for reactive power dispatch using a
revised linear programming approach. Electr Power Syst Res 15(2):121–134
11. Aoki K, Fan M, Nishikori A (1988) Optimal VAR planning by approximation method for
recursive mixed-integer linear programming. IEEE Trans Power Syst 3(4):1741–1747
12. Lo KL, Zhu SP (1991) A decoupled quadratic programming approach for optimal power
dispatch. Electr Power Syst Res 22(1):47–60
13. Granville S (1994) Optimal reactive dispatch through interior point methods. IEEE Trans
Power Syst 9(1):136–146
14. Zhao B, Guo CX, Cao YJ (2005) A multiagent-based particle swarm optimization approach
for optimal reactive power dispatch. IEEE Trans Power Syst 20(2):1070–1078
15. AlRashidi MR, El-Hawary ME (2009) Applications of computational intelligence techniques
for solving the revived optimal power flow problem. Electr Power Syst Res 79(4):694–702

16. Durairaj S, Devaraj D, Kannan PS (2006) Genetic algorithm applications to optimal reactive
power dispatch with voltage stability enhancement. J Inst Eng India Part 87:42–47
17. Wu QH, Cao YJ, Wen YJ (1998) Optimal reactive power dispatch using an adaptive genetic
algorithm. Int J Electr Power Energy Syst 20(8):563–569
18. Devaraj D (2007) Improved genetic algorithm for multi-objective reactive power dispatch
problem. Eur Trans Electr Power 17:569–581
19. Devaraj D, Durairaj S, Kannan PS (2008) Real parameter genetic algorithm to multiobjective
reactive power dispatch. Int J Power Energy Syst 28(1):1710–2243
20. Wu QH, Ma JT (1995) Power system optimal reactive power dispatch using evolutionary
programming. IEEE Trans Power Syst 10(3):1243–1249
21. Das DB, Patvardhan C (2003) A new hybrid evolutionary strategy for reactive power
dispatch. Electr Power Syst Res 65(2):83–90
22. Yoshida H, Kawata K, Fukuyama Y, Takamura S, Nakanishi Y (2000) A particle swarm
optimization for reactive power and voltage control considering voltage security assessment.
IEEE Trans Power Syst 15(4):1232–1239
23. Esmin AAA, Lambert-Torres G, De-Souza ACZ (2005) A hybrid particle swarm optimization
applied to loss power minimization. IEEE Trans Power Syst 20(2):859–866
24. Tripathy M, Mishra S (2007) Bacteria foraging-based solution to optimize both real power
loss and voltage stability limit. IEEE Trans Power Syst 22(1):240–248
25. Ela AA, Abido MA, Spea SR (2011) Differential evolution algorithm for optimal reactive
power dispatch. Electr Power Syst Res 81:458–464
26. Liang CH, Chung CY, Wong KP, Duan XZ, Tse CT (2007) Study of differential evolution for
optimal reactive power flow. IEEE Proc Gener Transm Distrib 1(2):253–260
27. Ayan K, Kılıc U (2012) Artificial bee colony algorithm solution for optimal reactive power
flow. Appl Soft Comput 12(5):1477–1482
28. Khazali AH, Kalantar M (2011) Optimal reactive power dispatch based on harmony search
algorithm. Int J Electr Power Energy Syst 33(3):683–692
29. Sinsuphan N, Leeton U, Kulworawanichpong T (2013) Optimal power flow solution using
improved harmony search method. Appl Soft Comput 13(5):2364–2374
30. Valipour K, Ghasemi A (2017) Using a new modified harmony search algorithm to solve
multi-objective reactive power dispatch in deterministic and stochastic models. AI Data Min 5
(1):89–100
31. Duman S, Güvenc U, Sönmez Y, Yörükeren N (2012) Optimal power flow using gravitational
search algorithm. Energy Convers Manage 59:86–95
32. Shaw B, Mukherjee V, Ghoshal SP (2014) Solution of reactive power dispatch of power
systems by an opposition-based gravitational search algorithm. Int J Electr Power Energy Syst
55:29–40
33. Sulaiman MH, Mustaffa Z, Mohamed MR, Aliman O (2015) Using the gray wolf optimizer
for solving optimal reactive power dispatch problem. Appl Soft Comput 32:286–292
34. Rajan A, Malakar T (2015) Optimal reactive power dispatch using hybrid Nelder-Mead
simplex based firefly algorithm. Electr Power Energy Syst 66:9–24
35. Mandal B, Roy PK (2013) Optimal reactive power dispatch using quasi-oppositional teaching
learning based optimization. Electr Power Energy Syst 53:123–134
36. Soler EM, Asada EN, Da-Costa GRM (2013) Penalty-based nonlinear solver for optimal
reactive power dispatch with discrete controls. IEEE Trans Power Syst 18(1):2174–2182
37. Abou El Ela, AA, El Sehiemy R, Shaheen AM (2013) Multi-objective fuzzy-based procedure
for enhancing reactive power management. IET Gener Transm 7(12):1453–1460
38. Shaheena AM, El-Sehiemyb RA, Farrag SM (2016) Optimal reactive power dispatch using
backtracking search algorithm. Aust J Electr Electron Eng 13(3):200–210
39. Nakawiro W, Erlich I, Rueda JL (2011) A novel optimization algorithm for optimal reactive
power dispatch: a comparative study. In: International conference on electric utility
deregulation and restructuring and power technologies, Weihai, Shandong. IEEE, pp 1555–
1561

40. Abou El-Ela AA, Kinawy AM, El-sehiemy RA, Mouwafi MT (2011) Optimal reactive power
dispatch using ant colony optimization algorithm. Electr Eng 93:103–116
41. Huang CM, Huang YC (2012) Combined differential evolution algorithm and ant system for
optimal reactive power dispatch. Energy Procedia 14(2012):1238–1243
42. Sulaiman MH, Mohd Rashid MI, Aliman O, Mohamed MR, Ahmad AZ, Bakar MS (2014)
Loss minimisation by optimal reactive power dispatch using cuckoo search algorithm. In: 3rd
IET international conference on clean energy and technology (CEAT), Kuching, Malaysia.
IEEE, pp 1–4
43. Lenin K, Reddy BR, Kalavathi MS (2015) Modified monkey optimization algorithm for
solving optimal reactive power dispatch problem. Indones J Electr Eng Inform 3(2):1049–
1058
44. Mirjalili S (2016) Dragonfly algorithm: a new meta-heuristic optimization technique for
solving single-objective, discrete, and multi-objective problems. Neural Comput Appl 27
(4):1053–1073
45. Ouafa H, Linda S, Tarek B (2017) Multi-objective optimal power flow considering the fuel
cost, emission, voltage deviation and power losses using multi-objective Dragonfly algorithm.
In: Proceedings of the international conference on recent advances in electrical systems,
Tunisia
46. Attia A, Al-Turki YA, Abusorrah AM (2012) Optimal power flow using adapted genetic
algorithm with adjusting population size. Electr Power Compon Syst 49:1285–1299
Analysis of Pedal Power Energy
Harvesting for Alternative Power Source

Sheikh-Muhammad Haziq Sah-Azmi and Zuraini Dahari

Abstract Numerous research efforts have been conducted to investigate energy harvesting methods based on human motions such as walking, running and cycling. Human-powered energy is very useful to exploit in remote areas and in emergency or unexpected situations such as major floods and natural disasters. In such situations there may be power shortages, and it becomes very difficult to fulfil basic power necessities, for instance powering lights, charging mobile phones and other simple tasks. Out of the variety of energy recovery techniques that exist, pedal-motion electricity generation shows the most promise for large-scale recovery. The kinetic energy produced by the pedalling motion can generate voltages of 15 V to 30 V. Since bicycles are commonly owned by a vast range of people, cycling offers an alternative power source. In this research, a stationary-bike prototype comprising mechanical and software parts is developed to investigate the energy produced from pedal power, and different speeds are tested to analyse the amount of power produced. It is found that at slow cycling speed the time taken to fully charge the 12 V battery is 14.02 h, whereas at high speed it is 5.44 h. The harvested electrical energy is enough to charge a mobile phone and power a low-power LED light and a fan during an emergency situation.

Keywords Human energy harvesting · Pedal power · Renewable energy

1 Introduction

The global population is approaching 7.7 billion people, yet almost 1 billion people still do not have access to electricity [1]. The energy crisis and the escalating costs of power have moved researchers' attention toward new sustainable and

S.-M. H. Sah-Azmi · Z. Dahari (✉)
School of Electrical and Electronic Engineering, Universiti Sains Malaysia, Engineering Campus, 14300 Nibong Tebal, Pulau Pinang, Malaysia
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2021 1105


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_78

renewable energy sources and materials. Currently, some of the mainstream renewable energy technologies are hydropower, wind energy, solar energy, biomass energy, biofuels and geothermal energy [2–6]. In recent years, research into harvesting ambient energy from natural and human sources has shown a rapid acceleration.
Recently, human energy harvesting has been gaining increasing attention from researchers. On average, the human body consumes about 1300 kcal per day; an adult female consumes about 1500 kcal per day while an adult male consumes about 1600 to 1800 kcal per day, which corresponds to roughly 5.44 MJ to 7.53 MJ per day [7]. Some of this energy is used by the body's systems for survival purposes, and the rest is simply wasted.
There are several human activities that burn off unwanted calories, such as walking, running and cycling. Among the human-body energy harvesting methods, cycling/pedalling can be considered one of the most significant, with a relatively high output power. The number of cycling activities has been increasing rapidly as more people look for a healthy lifestyle and an eco-friendly, cost-effective mode of transportation. Due to the relatively low and reasonable price of bicycles, most people own at least one bicycle at home. A cyclist with regular training 2–3 times per week has a maximum power output of 250–400 W, while a world-class cyclist generates up to 600 W [8]. Hence, cycling/pedalling is one of the easiest and most promising energy harvesting approaches. In emergency situations where there is a power shortage, and in remote locations, a bicycle can be used as an alternative power source.
Mudaliar [9] conducted research on energy harvesting from a bicycle pedal system, comparing the output power of two types of generator, brushless DC (BLDC) and permanent magnet DC (PMDC), with the output load fixed at a 75 Ω, 8 W resistance. Five persons with non-athletic backgrounds and different weights cycled the bicycle to generate the energy. The results showed that the BLDC generator is more efficient at generating energy than the PMDC.
Several researchers have explored using gymnasium bicycles as energy harvesters [10, 11]. Méndez-Gayol et al. [11] developed a smart grid for a sustainable gym, where all the electrical energy generated by the sport machines is injected into the grid. People who come to the gym do exercises to burn their calories and at the same time contribute power to the gym. They used permanent magnet generators (PMG) mounted on bicycles to mimic gym training bicycles; the generators produce a three-phase AC voltage, which is rectified to a DC voltage. Before being injected into the grid, it needs to be boosted and converted back to a high AC voltage. Some of the energy produced is used to light the gym itself. Different values of load and speed were used to determine the current drawn and the power produced by the generator while pedalling.
In this research work, a prototype of a pedalling energy harvester is developed. It focuses on several factors that can affect the amount of electrical energy produced by the cycling motion at low and high speeds. Techniques applied by previous researchers are used as the basis of the study. The dependence of the generator output on the speed and the amount of load is investigated. The time taken and the energy required to fully charge a 12 V lead-acid battery are also studied and analysed.

2 Methodology

Initially, the prototype of the pedal energy harvester is designed and developed. The hardware consists of a stationary bike, a PMDC generator and a friction drive system. A preliminary test to select a suitable friction drive is conducted. The next stage involves the development of the energy harvester circuit, which is then connected to the prototype for further investigation. Since the output of the generator is proportional to the speed and load, speed and load tests are carried out and the data are collected to analyse the raw output of the generator.
Then a regulator circuit is built in order to regulate the voltage generated by the generator. A controller circuit is built for safety purposes; it isolates the battery from the generator, cutting off the power input when the battery is full or at its minimum state, to avoid overcharge and over-discharge, which can harm the battery. Finally, a charging test is conducted to record the time taken and to calculate the energy needed to fully charge the battery at different cycling speeds. Figure 1 shows the flowchart of the methodology.

2.1 Development of Prototype

The proposed pedal energy harvester comprises a hardware prototype of a bicycle, a permanent magnet DC generator, a software controller, a switch relay, a voltage regulator and a rechargeable lead-acid battery. Figure 2 shows the block diagram of the harvester and the components needed. The PMDC generator is used to convert the kinetic energy into electrical energy, while the Arduino board is programmed to isolate the battery, cutting off the power input when the battery is full or at its minimum state to avoid overcharge and over-discharge. The regulator is used to regulate the voltage to the value required for charging the battery.
The hardware prototype is initially designed in SolidWorks. It consists of a ribbed rear-tyre stand, an adjustable base with a hinge, and a threaded screw to engage the friction shaft of the generator with the rear tyre of the bicycle, as shown in Fig. 3. It is made of wood to minimise the total weight, and several brackets and screws are used to assemble the prototype. In order to mount the rear tyre of the bicycle on the prototype, a shaft with a threaded hole is used to extend the length of the original shaft. This prototype is then connected to the energy harvester circuit for further investigation.

Fig. 1 Flowchart of the methodology

In this study, a PMDC generator is selected as the transducer to convert kinetic energy into electrical energy. The specifications of the PMDC generator are a 24 V voltage output, a 350 W rated power and a 2200 rpm rated speed. The controller consists of two parts, the automatic charger control and the speed measurement, both of which use an Arduino as the main controller.
In this project, an infrared sensor module and reflective material are used as the speed detector. The output power from the generator is measured by using a MAX471

Fig. 2 Block diagram of the harvester and the components needed (external energy: kinetic energy from pedalling the bicycle; energy conversion: PMDC generator; voltage regulator circuit; controller circuit: Arduino and IR sensor; storage circuit: rechargeable battery; output circuit: inverter from 12 V DC to 220 V AC)

Fig. 3 Developed prototype

current module, which is connected to the Arduino to read the current drawn by the load resistor. From the current output, the power can be calculated.
A 12 V, 7.2 Ah sealed lead-acid battery is used for charging and discharging. The prototype also includes the automatic charger controller, which disconnects the generator from the battery when the battery is fully charged. It can be divided into three parts: the battery-level reading, the voltage regulator and the switching that connects and disconnects the generator from the battery. The battery voltage reader converts the voltage level to a level state by using an LM3914 IC; this is read by the Arduino through digital inputs, and the battery level is displayed during the charging

Fig. 4 Complete circuit of the pedal energy harvester

process. Two types of voltage regulator are used for testing: a linear voltage regulator and a DC-DC buck converter voltage regulator. An NPN transistor acts as the switching circuit to the battery. The complete circuit is shown in Fig. 4.

2.2 Prototype Testing

Several stages of prototype testing need to be conducted prior to the final testing: the friction-drive testing, the transducer testing and the final testing (refer to Fig. 1).
In the friction drive testing, three types of material are tested to decide which is most suitable. The materials are PLA (polylactic acid), ABS (acrylonitrile butadiene styrene) and aluminium. The PLA and ABS shafts were prepared using a 3D printer, while the aluminium shaft was machined on a lathe according to the specified dimensions. The bicycle is then engaged with the prototype and the angle of the adjustable base is adjusted until the rear tyre of the bicycle touches the friction drive shaft. Each shaft is tested by pedalling the stationary bike for about 5 min, and the temperature and speed are measured. The physical appearance of the shaft is also observed.

The next stage is the transducer testing, in which the generated voltage and current are measured. The bicycle is cycled for 2 min at two speeds, slow (3000–4500 rpm) and fast (5000–8000 rpm). Two types of voltage regulation circuit were built to compare which one is suitable for energy harvesting: the first is a linear voltage regulator and the second a DC-DC buck converter. The main aim of the voltage regulator is to regulate the output voltage to 14.4 V, the voltage required to charge the battery.
The final stage is the test to investigate the time taken to cycle the bike until the battery is fully charged. In this experiment, the person carrying out the test cranks the bicycle at two speeds, slow and fast, for a set amount of time.

3 Results and Discussion

Based on the friction drive test, it is found that aluminium is the most suitable shaft material. Aluminium can withstand the heat produced by the friction between the shaft and the bearing, and it also has good thermal conductivity, hence the heat is distributed evenly.
In the next experiment, the electrical output of the generator is measured with different load resistors of 47 Ω, 33 Ω and 27 Ω. Table 1 shows the output voltage, current and power generated after 1 min at the two cycling speeds.
Based on the obtained results, the 27 Ω load produced the smallest output-voltage range between the slow and fast cycling speeds, 16.66–30.07 V, followed by 33 Ω and 47 Ω. Moreover, the output current with the 27 Ω load is the highest among the loads, which leads to the highest output power. Therefore, the 27 Ω load is chosen in this study.
In the next stage, the voltage regulation test is carried out. This experiment sets the limit of the output voltage from the regulator: the output voltage is regulated to 14.4 V while the output current is not regulated. Two types of voltage regulator are used in this experiment, a linear voltage regulator and a DC-DC buck converter. Figures 5, 6 and 7 show the output voltage, current and power for the raw generator output and the two regulator types at both cycling speeds.

Table 1 Output voltage, current and power generated by the PMDC generator

  Load (Ω)    Voltage (V)          Current, I (A)       Power (W)
              Slow      Fast       Slow      Fast       Slow      Fast
  27          16.66     30.70      0.603     1.112      10.388    34.605
  33          15.02     31.69      0.458     0.966      7.175     31.274
  47          13.38     32.62      0.284     0.694      3.853     23.193

Fig. 5 Output voltage for the voltage regulation testing (output voltage in V of the raw generator, linear regulator and buck converter at slow and fast cycling speeds)

Fig. 6 Raw output current and current after voltage regulation (output current in A of the raw generator, linear regulator and buck converter at slow and fast cycling speeds)

Fig. 7 Raw output power and power after voltage regulation (output power in W of the raw generator, linear regulator and buck converter at slow and fast cycling speeds)

Table 2 Calculated time taken and energy needed to fully charge the 12 V battery

  Speed    Voltage (V)    Current (A)    Power (W)    Time (h)    Energy (Wh)    Energy (kJ)
  Slow     14.4           0.51           7.39         14.02       103.7          373.2
  Fast     14.4           1.32           19.05        5.44        103.7          373.2

Fig. 8 Energy harvested in 5 min at slow cycling speed (theoretical versus measured energy in Wh)

Figure 5 shows that both the linear regulator and the buck converter successfully regulated the raw voltage to 14.4 V at the slow and fast cycling speeds. In terms of the current flow after the regulator circuit, the linear regulator gives the highest current flow, while the buck converter gives the same output current at the fast and slow cycling speeds (refer to Fig. 6).
Figure 7 shows that the output power of the linear regulator at slow cycling speed is maintained at high efficiency, whereas at fast cycling speed there is a high power loss as heat in regulating the voltage from 30.07 V down to 14.4 V. Meanwhile, the buck converter shows a constant output power for both speeds, but lower than that of the linear regulator. Therefore, the linear regulator was chosen in this study under the slow cycling speed condition.
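The heat loss referred to above follows directly from the voltage drop across a series linear regulator. A short illustrative Python check is given below, treating the regulator as a simple series element passing the Table 1 (27 Ω) load currents; this is an approximation for illustration, not a measurement from the paper.

```python
def linear_regulator_loss(v_in, v_out, i_out):
    """Power dissipated as heat across a series linear regulator."""
    return (v_in - v_out) * i_out

# Raw generator voltages/currents from Table 1 (27 ohm load), regulated to 14.4 V
for label, v_raw, i_raw in (("slow", 16.66, 0.603), ("fast", 30.70, 1.112)):
    print(f"{label}: about {linear_regulator_loss(v_raw, 14.4, i_raw):.1f} W lost as heat")
```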
The final experiment investigates the time taken and the amount of energy needed to fully charge the 12 V battery with 7.2 Ah capacity. The bicycle is cranked at the two speeds, slow and fast, for 5 min. The speed meter guides the cyclist on which speed to crank at, either slow or fast, in order to generate a constant power output. Table 2 shows the theoretical time taken and the energy needed to fully charge the battery.
Figure 8 shows the comparison of the theoretical and measured energy harvested in 5 min of cycling at slow speed. Based on Table 2, the time taken to fully charge the battery at slow cycling speed is about 14 h, whereas at fast speed it is about 5.44 h.
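The figures in Table 2 can be reproduced (to rounding) from the battery capacity and the regulated charging power. A small Python sketch is shown below, assuming the 14.4 V charging voltage, the 7.2 Ah capacity and the charging powers quoted in Table 2.

```python
def charge_estimate(charge_voltage_v, capacity_ah, charging_power_w):
    """Energy to be stored (Wh and kJ) and the hours needed at a given charging power."""
    energy_wh = charge_voltage_v * capacity_ah
    energy_kj = energy_wh * 3.6
    hours = energy_wh / charging_power_w
    return energy_wh, energy_kj, hours

for label, power_w in (("slow", 7.39), ("fast", 19.05)):
    wh, kj, h = charge_estimate(14.4, 7.2, power_w)
    print(f"{label}: {wh:.1f} Wh ({kj:.1f} kJ), about {h:.2f} h to full charge")
```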

4 Conclusion

In this research, the kinetic energy produced by the cycling movement was harvested, converted to electrical energy and stored in a 12 V battery. Based on the experimental results, the 12 V battery can be fully charged in approximately 14 h at slow cycling speed, and in 5.44 h at fast cycling speed. The fast cycling speed generates almost triple the current of the slow cycling speed; however, there is a large power loss at the fast cycling speed. The experimental results show that pedal power energy harvesting has great potential to be used as a renewable energy source and as backup power during emergency situations. Some future improvements to be considered are addressing the practicality issues related to the size of the bicycle and integrating the energy harvester circuit onto a printed circuit board (PCB). In addition, different generator and rechargeable battery types can be considered for further investigation. The electrical energy generated by the pedalling movement can be increased by using an alternator that produces a higher current, which would shorten the time taken to fully charge the battery, and a PWM power regulator can be used to stabilise the output power. Overall, this simple pedalling energy harvester, which relies on a common bicycle and human power, offers a good backup and alternative source of energy, especially in emergency situations such as disaster recovery in remote areas.

References

1. IEA (The International Energy Agency) Homepage. https://www.iea.org/newsroom/news/


2018/october/population-without-access-to-electricity-falls-below-1-billion. Accessed 12 Oct
2019
2. Mathiesen et al (2015) Smart energy systems for coherent 100% renewable energy and
transport solutions. Appl Energy 145:139–154
3. Quaschning V (2016) Understanding renewable energy systems, 2nd edn. Routledge, London
4. Dincer I (2000) Renewable energy and sustainable development: a crucial review. Renew
Sustain Energy Rev 4(2):157–175
5. Cheng JJ (2018) Biomass for renewable energy processes. CRC Press, London
6. Khare V et al (2016) Solar–wind hybrid renewable energy system: a review. Renew Sustain
Energy Rev 58:23–33
7. Purves WK et al (2004) Life: the science of biology, 7th edn. Freeman, New York
8. Lindberg-Poulsen K, Andersen MAE, Knott A, Andersen T (2010) Energy harvesting from an
exercise bike using a switch-mode converter controlled generator. In: IEEE international
conference on sustainable energy technologies (ICSET), Sri Lanka, pp 1–5
9. Mudaliar SM (2015) Electrical power generation harnessing human energy and its analysis as
sustainable renewable energy solutions. In: International conference on energy systems and
applications (ICESA 2015). Institute of Engineering and Technology, Pune
10. Ahsan-uz-Zaman KM, Ullah KM, Mishir M, Alam M (2017) Generation of electrical power
using gymnasium bicycle. In: IEEE region 10 humanitarian technology conference
(R10-HTC), pp 198–200
11. Méndez-Gayol MR-S, Calleja M, Pablo J Quintana AJ (2016) Working in a smart grid for a
sustainable gym. In: 13th international conference on power electronics (CIEP), Mexico
An Application of Barnacles Mating
Optimizer Algorithm for Combined
Economic and Emission Dispatch
Solution

Mohd Herwan Sulaiman, Zuriani Mustaffa, Mohd Mawardi Saari, and Amir Izzani Mohamed

Abstract This article proposes the application of a recent technique, namely the Barnacles Mating Optimizer (BMO), to solve Combined Economic and Emission Dispatch (CEED) problems. CEED is an extension of the well-known Economic Dispatch (ED) problem which considers not only minimising the cost of power generation but also minimising the emission level released to the environment by the generated power. These two objectives conflict with each other, and the CEED problem is to find the trade-off between them simultaneously while satisfying all the equality and inequality constraints. BMO, on the other hand, is a recent meta-heuristic technique based on the mating behaviour of barnacles. It is used here to solve the CEED problem by implementing price penalty and weighting factors in order to obtain the best combination of power generation so that the minimisation of cost and emission can be achieved. In order to show the effectiveness of the proposed BMO, 6-unit and 10-unit systems are utilised and the results are compared with selected techniques available in the literature.


Keywords Barnacles Mating Optimizer Combined Economic and Emission
 
Dispatch Economic Dispatch Price penalty factor

1 Introduction

Combined Economic and Emission Dispatch (CEED) is an extension of the Economic Dispatch (ED) problem in which the effect of emissions from the power generators is considered in determining the optimal generation output, so that the total cost of power production can be minimised. In CEED, the emission is

M. H. Sulaiman (✉) · M. M. Saari · A. I. Mohamed
Faculty of Electrical and Electronics Engineering Technology, Universiti Malaysia Pahang, 26600 Pekan, Pahang, Malaysia
e-mail: [email protected]
Z. Mustaffa
Faculty of Computing, Universiti Malaysia Pahang, 26300 Gambang, Pahang, Malaysia

© Springer Nature Singapore Pte Ltd. 2021 1115


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_79

transformed into a cost by implementing price penalty factors, where the emission function conflicts with the cost function. This leaves the area of research wide open: the problem can be solved using a multi-objective approach or by transforming the bi-objective problem into a single-objective optimisation problem.
Various techniques have been proposed in the literature to solve the CEED problem, especially using meta-heuristic approaches such as the Cuckoo Search Algorithm (CSA) [1], Flower Pollination Algorithm (FPA) [2–4], Gravitational Search Algorithm (GSA) [5], Non-dominated Sorting Genetic Algorithm (NSGA-II) [6], Multi-Objective Differential Evolution (MODE) [6], Particle Swarm Optimization (PSO) [7], Grey Wolf Optimizer (GWO) [8, 9], Improved Harmony Search Algorithm (IHSA) [10], Sine Cosine Algorithm (SCA) [11] and many more.
This paper proposes a recent meta-heuristic technique, namely the Barnacles Mating Optimizer (BMO), for solving the CEED problem. This algorithm mimics the mating behaviour of barnacles in nature. The rest of this paper is organised as follows: Sect. 2 briefly discusses the formulation of CEED, followed by the BMO implementation for the CEED problem in Sect. 3. Section 4 presents the results and discussion, and lastly Sect. 5 concludes the paper.

2 Combined Economic and Emission Dispatch (CEED) Formulation

The CEED problem is to find the optimal output of the generators that yields the minimum cost as well as the minimum emission level simultaneously. These two objectives are formulated as follows:

min(Cost_T) = w F_T + (1 − w) h E_T    (1)

F_T = ∑_{i=1}^{N} F_i(P_i) = ∑_{i=1}^{N} [ a_i P_i² + b_i P_i + c_i + |d_i sin(e_i (P_i^min − P_i))| ]    (2)

E_T = ∑_{i=1}^{N} E_i(P_i) = ∑_{i=1}^{N} [ f_i P_i² + g_i P_i + j_i + k_i exp(m_i P_i) ]    (3)

where Cost_T is the total cost in $/hour, w is the weighting factor that can be varied between 0 and 1, F_T is the cost function, E_T is the total emission generated by the power plants and h is the price penalty factor. P_i is the real power generation of generator i for the dispatched hour, N is the total number of generation units, a_i, b_i and c_i are the coefficients of the fuel cost function, and f_i, g_i and j_i are the coefficients of the emission function. The coefficients d_i, e_i, k_i and m_i are used only if the valve-loading effect is considered. The price penalty factor h_i is the ratio between the maximum fuel cost and the maximum emission of the corresponding generator, as follows:

h_i = F_i(P_i^max) / E_i(P_i^max)   ($/kg),  i = 1, 2, …, N    (4)

Detailed steps to find the price penalty factor can be found in [12]. The equality and inequality constraints that need to be fulfilled are as follows:

∑_{i=1}^{N} P_i = P_Load + P_Loss    (5)

P_i^min ≤ P_i ≤ P_i^max    (6)

where P_Load and P_Loss are the demand and the transmission loss, respectively, and P_i^min and P_i^max are the minimum and maximum capacities of the respective generator i.
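For illustration only, the following minimal Python sketch evaluates the weighted CEED objective of Eqs. (1)–(4) for a candidate dispatch, applying the price penalty factor per unit. The two-unit coefficient values and the weighting factor are made-up placeholders rather than data from the test systems, and the power-balance constraint of Eq. (5) is not enforced here.

```python
import math

def ceed_objective(P, units, w=0.5):
    """Weighted CEED objective (Eqs. 1-4) for a dispatch vector P in MW.

    Each unit is a dict holding fuel-cost coefficients a, b, c, d, e,
    emission coefficients f, g, j, k, m, and the limits Pmin, Pmax.
    """
    def fuel(u, Pi):      # Eq. (2), including the valve-loading term
        return (u["a"] * Pi**2 + u["b"] * Pi + u["c"]
                + abs(u["d"] * math.sin(u["e"] * (u["Pmin"] - Pi))))

    def emission(u, Pi):  # Eq. (3)
        return (u["f"] * Pi**2 + u["g"] * Pi + u["j"]
                + u["k"] * math.exp(u["m"] * Pi))

    total = 0.0
    for Pi, u in zip(P, units):
        h_i = fuel(u, u["Pmax"]) / emission(u, u["Pmax"])            # Eq. (4)
        total += w * fuel(u, Pi) + (1.0 - w) * h_i * emission(u, Pi)  # Eq. (1)
    return total

# Illustrative two-unit example (coefficients are placeholders, not from Table 2)
units = [
    dict(a=0.12, b=40.0, c=1000.0, d=33.0, e=0.017, Pmin=10.0, Pmax=55.0,
         f=0.047, g=-3.98, j=360.0, k=0.25, m=0.012),
    dict(a=0.11, b=39.5, c=950.0, d=25.0, e=0.018, Pmin=20.0, Pmax=80.0,
         f=0.046, g=-3.95, j=350.0, k=0.25, m=0.012),
]
print(ceed_objective([45.0, 70.0], units, w=0.5))
```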

3 Barnacles Mating Optimizer for CEED Solution

BMO is a recent technique proposed in [13–15] that is inspired by the mating behaviour of barnacles in nature. In BMO, only one parameter needs to be tuned to obtain acceptable results, apart from the population size and the maximum number of iterations: the barnacle's penis length, pl, which is set initially and determines the exploitation and exploration processes. If the barnacles selected for mating are out of the range of pl, the new offspring are generated using sperm-cast mating as shown in Eq. (8):

x_n^new = p x_dad^n + q x_mum^n,  for k ≤ pl    (7)

x_n^new = rand() × x_mum^n,  for k > pl    (8)

where p is a random number, q = (1 − p), x^n_dad and x^n_mum are the
control variables (generator outputs) of the Dad and Mum barnacles which have been
selected for mating, k is the range within which the barnacle's penis can reach for the mating
process, and rand() is a random number in (0–1). It can be noticed that Eq. (7) is the
exploitation process and Eq. (8) is the exploration process. If the solution is out of
bound, the algorithm will select the boundary values before proceeding to the
evaluation process. Figure 1 shows the flow of BMO in solving CEED problems.
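For readers who prefer code to flowcharts, the short Python sketch below mirrors the offspring generation of Eqs. (7) and (8); the way parents are picked, the uniform choice of p and the sorting of the population are assumptions made for this illustration rather than the exact procedure of the original BMO implementation.

```python
import numpy as np

def bmo_offspring(pop, fitness, pl, rng=np.random.default_rng(0)):
    """One generation of barnacle mating following Eqs. (7) and (8).

    pop     : (n_barnacles, n_vars) array of candidate generator outputs
    fitness : objective value of each barnacle (lower is better)
    pl      : barnacle penis length separating exploitation from exploration
    """
    n, n_vars = pop.shape
    pop = pop[np.argsort(fitness)]            # best barnacles first
    offspring = np.empty_like(pop)
    for i in range(n):
        dad, mum = rng.integers(0, n, size=2)
        k = abs(int(dad) - int(mum))          # distance between the selected parents
        if k <= pl:                           # Eq. (7): normal mating (exploitation)
            p = rng.random()
            offspring[i] = p * pop[dad] + (1.0 - p) * pop[mum]
        else:                                 # Eq. (8): sperm-cast mating (exploration)
            offspring[i] = rng.random(n_vars) * pop[mum]
    return offspring
```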

Fig. 1 Flow of BMO into solving the CEED problems

4 Results and Discussion

To exhibit the effectiveness of BMO for the CEED solution, it has been tested and
analyzed on two test systems: a 6-unit and a 10-unit system. The performance of BMO is
compared with other techniques that are available in the literature.

4.1 6-Units System

This system consists of 6 generating units with quadratic cost and emission
functions. The load for this system is set to 1000 MW and the system is assumed to
be lossless. All the coefficients for fuel cost and emission level together with the generator
capacity constraints can be obtained in [1]. The population is set to 30 barnacles,
pl = 21 and the number of iterations is set to 100.
The best results of CEED compared to CSA and GSA are tabulated in Table 1.
From this table, it can be seen that BMO gave competitive results compared to
CSA and outperformed GSA in terms of total cost and emission levels. BMO is able
to obtain the minimum total cost compared to CSA and GSA, but for total emission,
CSA is slightly better than BMO, as shown in bold in the table.
In order to show the consistency and robustness of the proposed BMO in solving the
CEED problem, BMO has been run 30 times, as shown in Fig. 2. It can be seen
that BMO is able to give consistent results throughout the 30 simulation runs. The
convergence graph for the best result is depicted in Fig. 3. From this graph, it
can be seen that the proposed BMO converges early, within 40 iterations.

4.2 10-Units System

This system comprises 10 thermal generating units, where the data for the coefficients
and the capacity of the individual generators are exhibited in Table 2 [2]. The loss
coefficients, Bij, are also given below, and the system includes the
valve-loading effect. The load is set to 2000 MW.

$$B_{ij} = 10^{-2} \times \begin{bmatrix}
0.49 & 0.14 & 0.15 & 0.15 & 0.16 & 0.17 & 0.17 & 0.18 & 0.19 & 0.20 \\
0.14 & 0.45 & 0.16 & 0.16 & 0.17 & 0.15 & 0.15 & 0.16 & 0.18 & 0.18 \\
0.15 & 0.16 & 0.39 & 0.10 & 0.12 & 0.12 & 0.14 & 0.14 & 0.16 & 0.16 \\
0.15 & 0.16 & 0.10 & 0.40 & 0.14 & 0.10 & 0.11 & 0.12 & 0.14 & 0.15 \\
0.16 & 0.17 & 0.12 & 0.14 & 0.35 & 0.11 & 0.13 & 0.13 & 0.15 & 0.16 \\
0.17 & 0.15 & 0.12 & 0.10 & 0.11 & 0.36 & 0.12 & 0.12 & 0.14 & 0.15 \\
0.17 & 0.15 & 0.14 & 0.11 & 0.13 & 0.12 & 0.38 & 0.16 & 0.16 & 0.18 \\
0.18 & 0.16 & 0.14 & 0.12 & 0.13 & 0.12 & 0.16 & 0.40 & 0.15 & 0.16 \\
0.19 & 0.18 & 0.16 & 0.14 & 0.15 & 0.14 & 0.16 & 0.15 & 0.42 & 0.19 \\
0.20 & 0.18 & 0.16 & 0.15 & 0.16 & 0.18 & 0.18 & 0.16 & 0.19 & 0.44
\end{bmatrix}$$

Table 1 CEED results using BMO vs GSA vs CSA for 6-units generator system

Unit (MW)              CSA [1]      GSA [5]      BMO
P1                     80.995168    78.8221      80.9896
P2                     80.756741    83.0013      80.7559
P3                     165.42311    164.2907     165.4332
P4                     164.25914    164.9136     164.2717
P5                     255.09727    258.1108     255.109
P6                     253.46857    250.8619     253.4406
Total cost ($/h)       51254.726    51255.788    51254.676
Total emission (kg/h)  826.98366    827.13808    826.98621

Fig. 2 Performance of BMO for 30 running simulations for 6-unit generator system

Fig. 3 Convergence graph for 6-unit generator system


Table 2 Coefficients data for 10-units generator system
Unit   a ($/MW²h)   b ($/MWh)   c ($/h)   d ($/h)   e (rad/MW)   Pmin (MW)   Pmax (MW)   f (kg/MW²h)   g (kg/MWh)   j (kg/h)   k (kg/h)   m (1/MW)
P1 0.12951 40.5407 1000.403 33 0.0174 10 55 0.04702 −3.9864 360.0012 0.25475 0.01234
P2 0.10908 39.5804 950.606 25 0.0178 20 80 0.04652 −3.9524 350.0056 0.25475 0.01234
P3 0.12511 36.5104 900.705 32 0.0162 47 120 0.04652 −3.9023 330.0056 0.25163 0.01215
P4 0.12111 39.5104 800.705 30 0.0168 20 130 0.04652 −3.9023 330.0056 0.25163 0.01215
P5 0.15247 38.539 756.799 30 0.0148 50 160 0.0042 0.3277 13.8593 0.2497 0.012
P6 0.10587 46.1592 451.325 20 0.0163 70 240 0.0042 0.3277 13.8593 0.2497 0.012
P7 0.03546 38.3055 1243.531 20 0.0152 60 300 0.0068 −0.5455 40.2669 0.248 0.0129
P8 0.02803 40.3965 1049.998 30 0.0128 70 340 0.0068 −0.5455 40.2669 0.2499 0.01203
P9 0.02111 36.3278 1658.569 60 0.0136 135 470 0.0046 −0.5112 42.8955 0.2547 0.01234
P10 0.01799 38.2704 1356.659 40 0.0141 150 470 0.0046 −0.5112 42.8955 0.2547 0.01234

Table 3 Results for the best solution of CEED for 10-units generator system
Power outputs (MW)   NSGA II [6]   FPA [2]   MODE [6]   CSA [1]   BMO
P1 51.9515 53.188 54.9487 54.960624 54.236872
P2 67.2584 79.975 74.5821 78.90525 80
P3 73.6879 78.105 79.4294 88.758608 82.564799
P4 91.3554 97.119 80.6875 83.809362 79.113712
P5 134.0522 152.74 136.8551 134.55947 151.64057
P6 174.9504 163.08 172.6393 172.32656 151.16191
P7 289.4350 258.61 283.8233 287.1269 300
P8 314.0556 302.22 316.3407 307.27663 310.0416
P9 455.6978 433.21 448.5923 443.066 442.94573
P10 431.8054 466.07 436.4287 433.02639 432.40779
PLoss 84.2500 84.317 84.3270 83.816 84.113
Cost ($/h) 113542.9 113658.96 113477.63 113421.45 113389.02
Emission (Kg/h) 4151.0 4147.1677 4124.8642 4121.6658 4125.8961

Fig. 4 Convergence graph for 10-unit generator system



The results of the CEED solution for this system are presented in Table 3. It can be seen
that BMO outperformed CSA, GSA, FPA, MODE and NSGA II in terms of cost
minimization, while CSA emerged as the best for emission and loss minimization
among all the compared techniques. Nevertheless, BMO gave competitive
results after CSA in terms of emission and loss minimization. From this table it
can also be noticed that all the optimal power generations obtained by all techniques
are within the capacity of the individual generators as presented in Table 2. Figure 4
shows the convergence graph of BMO for this case system. The proposed BMO
converged within 50 iterations to obtain the optimal result for this case system.
From the simulation results, it can be said that BMO offers promising results in
solving the CEED problems and can be used as an optimizer for other optimization
problems.

5 Conclusion

The application of a recent meta-heuristic technique, the Barnacles Mating Optimizer
(BMO), to solving the well-known Combined Economic Emission Dispatch
(CEED) problem has been presented in this paper. The performance of BMO has been
evaluated through two test systems, viz. the 6-unit and 10-unit systems. From the
simulations that have been carried out, it can be said that BMO is able to produce
competitive and promising results, especially for CEED problems. The application of
BMO to more complicated optimization problems such as dynamic CEED will be
proposed in the near future.

Acknowledgements This work was supported by the Universiti Malaysia Pahang (UMP) and the
Ministry of Higher Education (MOHE) Malaysia under the Fundamental Research Grant
Scheme RDU 170105 & FRGS/1/2017/ICT02/UMP/02/3.

References

1. Sulaiman MH, Mustaffa Z, Mohamed MR, Abdullah NRH (2015) An application of cuckoo
search algorithm for solving combined economic and emission dispatch problem. In: 2015
International Conference on Informatics, Electronics and Vision (ICIEV 2015), Fukuoka,
Japan, pp 1–5
2. Abdelaziz AY, Ali ES, Abd Elazim SM (2016) Flower pollination algorithm to solve
combined economic and emission dispatch problems. Int J Eng Sci Technol 19:980–990
3. Mee Song H, Sulaiman MH, Mohamed MR (2015) An application of flower pollination
algorithm to solve combined economic emission dispatch by considering valve-point loading
effect. Int Rev Model Simul (IREMOS) 8(4):427–435
4. Abdelaziz AY, Ali ES, Abd Elazim SM (2016) Implementation of flower pollination
algorithm for solving economic load dispatch and combined economic emission dispatch
problems in Power Systems. Energy 101:506–518
5. Güvenç U, Sönmez Y, Duman S, Yörükeren N (2012) Combined economic and emission
dispatch solution using gravitational search algorithm. Scientia Iranica 19:1754–1762
6. Basu M (2011) Economic environmental dispatch using multi-objective differential evolution.
Appl Soft Comput 11:2845–2853
7. Krishnamurthy S, Tzoneva R, Apostolov A (2017) Method for a parallel solution of a
combined economic emission dispatch problem. Electric Power Compon Syst 45:393–409
8. Mee Song H, Sulaiman MH, Mohamed MR (2014) An application of grey wolf optimizer for
solving combined economic emission dispatch problems. Int Rev Model Simul (IREMOS) 7
(5):838–844
9. Halbhavi SB, Kulkarni DB, Ambekar SK, Manjunath D (2017) Adaptive grey wolf
optimization for weightage-based combined economic emission dispatch in Hybrid
Renewable Energy Systems. New Rev Inf Netw 22:124–142
10. Rezaie H, Kazemi-Rahbar MH, Vahidi B, Rastegar H (2019) Solution of combined economic
and emission dispatch problem using a novel chaotic improved harmony search algorithm.
J Comput Des Eng 6:447–467
11. Gonidakis D, Vlachos A (2019) A new sine cosine algorithm for economic and emission
dispatch problems with price penalty factors. J Inf Optim Sci 40:679–697
12. Venkatesh P, Gnanadass R, Padhy NP (2003) Comparison and application of evolutionary
programming techniques to combined economic emission dispatch with line flow constraints.
IEEE Trans Power Syst 18:688–697
13. Sulaiman MH, Mustaffa Z, Saari MM, Daniyal H, Daud MR, Razali S, Mohamed AM (2018)
Barnacles mating optimizer: a bio-inspired algorithm for solving optimization problems. In:
9th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence,
Networking and Parallel/Distributed Computing (SNPD 2018), Busan, Korea, pp 265–270
14. Sulaiman MH, Mustaffa Z, Saari MM, Daniyal H, Musirin I, Daud MR (2018) Barnacles
mating optimizer: an evolutionary algorithm for solving optimization. In: 2018 IEEE
International Conference on Automatic Control and Intelligent Systems (I2CACIS 2018),
Shah Alam, Malaysia, pp. 99–104
15. Sulaiman MH, Mustaffa Z, Saari MM, Daniyal H (2020) Barnacles mating optimizer: a new
bio-inspired algorithm for solving engineering optimization problems. Eng Appl Artif Intell
87:103330
Development of Microcontroller Based
Portable Solar Irradiance Meter Using
Mini Solar Cell

Lee Woan Jun, Mohd Shawal Jadin, and Norizam Sulaiman

Abstract The aim of this paper is to design and develop a portable solar irradiance
meter which allows users to measure the real-time solar irradiance. The measuring
device uses a solar cell as the main sensor in order to convert solar energy
into electrical energy. The actual solar irradiance is measured and displayed on
the LCD. The project uses a microcontroller to obtain and process the output
parameters from the solar cell. In addition, the developed device also provides the
ambient temperature in real time. Based on experiments, the developed system
showed an accuracy of more than 90%, i.e. an overall error of less than 10%.

Keywords Solar irradiance · Irradiation · Solar energy

1 Introduction

1.1 Solar Energy and PV System

The increasing energy demand in the future has driven human beings to explore
more environmentally friendly alternative energy sources, such as solar energy, wind
energy, geothermal energy and hydroelectric energy [1]. Solar energy nevertheless is a
great alternative energy source to be applied in Malaysia as the region is located near the
equator.
A photovoltaic (PV) system performs the direct conversion of sunlight into electricity.
The light emitted onto the solar cells is converted into electric energy. This energy
conversion occurs according to a principle named the “photo-electric effect”. The
basic unit of solar electric generation is the solar cell. Solar cells produce electricity and create
current flow when light shines on them. Solar cells rely on the special electric
properties of the element silicon (and other semiconductor materials) that enable it

L. W. Jun · M. S. Jadin (✉) · N. Sulaiman


Faculty of Electrical and Electronic Engineering, Universiti Malaysia Pahang (UMP),
26600 Pekan, Pahang, Malaysia
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2021


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_80

to act as both insulator and conductor. Specially treated wafers of silicon ‘sort’
electrons dislodged by solar energy across an electric field on the solar cell to
generate an electric current. Other materials are also used, but the majority of solar cells
are made from silicon [2].
Before a PV system is installed, the system designer must take several steps to
ensure the PV system can operate to its full potential. One of the key factors is to
determine the solar access and the available solar energy resource for the selected
site [3]. Different techniques and devices are currently used to measure the amount
of solar irradiance at a site to verify if the site is suitable for PV system installation,
such as measurements made by meteorological satellites of the cloud-cover and of
the solar energy reflected and scattered back to space by the earth-atmosphere
system, Photovoltaic Geographical Information System (PVGIS) web application in
Europe, Pyrheliometer, Pyranometer, Longwave (Infrared) Measurements, Solar
Irradiance Meter and Photoelectric sunshine recorder. The existing solar radiation
instruments such as pyrheliometer and pyranometer have poor commercial avail-
ability as they are too large and bulky to be used for site survey. The size of these
devices creates a burden for common users, and they need specialized knowledge for
handling them. The existing solar radiation measuring devices such as the solar irradiance
meter and pyranometer are also very costly.
This project aims to develop a portable solar irradiance meter that uses a silicon
solar cell which acts as the solar reference cell to obtain and display the solar
irradiance. By using this solar irradiance measuring device, users are able to keep
their eyes on the solar radiation at any location conveniently.

1.2 Recent Development of Global Solar Radiation Measuring System

Different methods and instruments used to monitor global solar radiation
have been studied. A computer program for estimating the
solar radiation was developed with the input of latitude, longitude, date
and temperature. The horizontal solar radiation is estimated using the Angstrom
equation. The results obtained from the program are compared to the data
collected from the Meteorological Service. The tabulated project results have a
similar pattern to the Meteorological Service data. The differences between project
results and actual values vary from approximately 20 to 100 W m−2. Although this
method is easy to use and suitable for rough estimation, it is not accurate
enough and not practical for an actual site survey [4].
Other research used a statistical analysis of solar radiation data, predicting
solar radiation for 9 years from 1990 to 1998 by using an empirical model based on
the modified Angstrom-type regression equation. This equation is used to estimate
the monthly average daily global radiation on a horizontal surface. The results
obtained have small errors compared to the actual solar radiation values. The results
are reliable for the experimental location. This method is only applicable to locations
with clear days [5]. In other research, the solar energy measurement used a PIC
microcontroller. An energy measurement system was built to monitor the
power output of a solar module by using a voltage sensor and a current sensor, and the
voltage, current and power output are displayed on a Liquid Crystal Display (LCD).
The measured parameters such as voltage, current, light intensity, temperature and
power are displayed on the LCD from time to time. This method is suitable for
power monitoring to ensure the efficiency of the solar panel [6].
Other research developed a handy and low-cost digital solar insolation meter by
using an 8051 microcontroller (AT89S51) with some interesting features, which can
be used by common users. The device designed is able to display readings and perform data
logging. The short-circuit current through the solar cell is converted to a proportional
voltage across a small shunt resistance installed in parallel to the solar cell.
The measured parameters of the solar cell are displayed on the LCD and compared to the
solar panel parameters. Losses are considered and minimized. This method is
suitable for common users as it requires minimum cost. The circuit can be simplified
by using a different microcontroller to eliminate the use of an Analog-to-Digital
Converter (ADC). Accuracy can be improved by comparing results to actual solar
radiation readings [7].
Another method used global solar radiation estimation from measurements of
visibility and air temperature extremes. A model is presented for the estimation of the
solar radiation value and other meteorological parameters where solar radiation is
not available or unable to be measured. This model uses the relationship between
solar radiation and measured air temperature and visibility extremes, which is
described by the Bristow and Campbell 1984 equation. Regions with complex
weather conditions like cloudy and rainy weather have low accuracy for the calculated
solar radiation. The percentage errors vary from 5% up to 35%. However, for
certain locations with clear weather, the estimated solar radiation is reliable for
short-term estimation. This method is not applicable to most places which
have complex weather but is suitable for short-term estimation [8]. A method for the
determination of the global solar radiation from meteorological satellite data has also
been proposed. It is a statistical method for the determination of global solar radiation
at ground level by collecting data from meteorological satellites. The satellite data are
compared to the computed albedo map to determine the cloud coverage index for
each ground point. The results are plotted to show a linear regression between the
satellite cloud index and the pyranometer-measured transmission factor. A few trials were
done and the results with the least deviation were recorded in sunny weather. This
method yields estimates with low accuracy and is not suitable to be used in bad
weather [9].
The conventional approaches for measuring solar irradiation are the pyranometer and
pyrheliometer. Most pyranometers can be categorized into thermopile type
and silicon-cell type. Both the pyranometer and pyrheliometer are often used
together to obtain multiple parameters including solar radiation. The measured
parameters can be recorded in software for data logging purposes. These devices are
able to achieve excellent accuracy and sensitivity covering a broad spectral
range, approximately 360 nm to 2600 nm depending on type and model. However, these
devices are relatively large, not handheld size, and they require
specialized knowledge to handle and operate [10]. In contrast, solar power or solar
radiation can be measured and displayed instantaneously by using a solar cell or
silicon crystal as the solar reference cell. Other parameters such as current and voltage
can be obtained with a data logger feature. Measurements of solar energy can be
taken in situ or recorded for a data log. Some of these meters have low sensitivity,
which makes them unable to measure solar radiation at locations with minimal light intensity,
and they are fairly costly as well [11].

2 Methodology

2.1 Overview

The system development is divided into hardware and software parts. The hardware
development discusses the integration of solar reference cell, sensors, microcon-
troller and the display unit. The software development is about the measurement of
solar irradiance by interpreting a series of data collected on solar cell and devel-
oping the equations and programming to calculate the irradiance. The solar radia-
tion monitoring device starts with the detection of ambient temperature. The
ambient temperature will be detected by the temperature sensor (LM35) and the
data collected will be sent to the microcontroller.
Next, the microcontroller will detect if there is any current flow in the solar cell.
There will only be current flows in the solar cell if there is solar radiation reaches
the solar cell. If there are no current flows, indicating there is no solar radiation to
be measured, or the solar radiation is too weak, leading to extremely low current
flow in the solar cell to be detected. The microcontroller will continue detect current
flow until there is a valid current value to be obtained. When there is current flow
detected in the solar cell, the microcontroller works to interpret the current readings,
and compute the corresponding solar radiation. This computation for solar radiation
values corresponding to current flows in solar cell requires initial data collection for
the solar cell with simple circuit which will be further explained.

2.2 System Design

Figure 1 shows the block diagram of the system. The solar cell works as the solar
reference cell and is connected to a current sensor module (ACS712) which is used to
detect the current flow in the solar cell whenever solar radiation reaches
the solar cell surface. The current sensor is connected to the microcontroller in order
to interpret the current readings from the solar cell. At the same time, a temperature

Fig. 1 Block diagram of this project

sensor LM35 will collect the ambient temperature data and send the readings to
microcontroller. Arduino UNO which works as the microcontroller of the system
will interpret and compute solar radiation readings corresponding to the current
flows in solar cell. The solar radiation values and the ambient temperature are
displayed on the LCD screen of the handheld device. The data can be sent to the
developed app on a smartphone to give the user clearer monitoring.

2.3 Data Collection and Software Development

The proposed system should first determine the relationship between solar irradiance
and the solar cell output characteristics. The results from the data collection are
computed using Microsoft Excel software to obtain the relationship between the
parameters. After the relationship between the solar radiation, voltage and current
output of the solar cell has been determined, the Arduino IDE software is used to
write the program code that generates the corresponding solar radiation readings while
the microcontroller reads the input current and voltage from the solar cell.
Figure 2 illustrates the series circuit of a solar cell and a 30 Ω resistor. The voltage
sensor module is used to determine the voltage drop Vd across the resistor connected
in series to the solar cell. By applying Ohm's law, V = IR, the current
flowing in the circuit can be determined as in Eq. (1).

$$I = \frac{V_d}{30} \qquad (1)$$

The output current data is used to calculate the real-time solar irradiance based
on the relationship developed between solar irradiance, voltage and current output

Fig. 2 Series circuit of solar cell and 30 Ω resistor

Fig. 3 Connection of circuit for data collection

from the solar cell in the data collection phase. At the data collection phase, the current and
voltage readings from the voltage sensor module are compared to a multimeter
reading to evaluate the accuracy of the voltage sensor module. A solar irradiance
meter, a multimeter and the voltage sensor module are the equipment and components
that are used to collect the parameters, which are the solar irradiance, current and
voltage. The connection between each component in the circuit is illustrated in
Fig. 3. A multimeter is used to measure the current flows in the circuit. The
Seaward Solar Irradiance Meter is placed next to the multimeter to obtain the solar
irradiance readings correspond to the voltage and current output of solar cell.

Fig. 4 Data collection by using voltage sensor module, multimeter and solar irradiance meter

Figure 4 shows the data collection by using the voltage sensor module, multimeter
and solar irradiance meter. The recorded parameters are used to plot a graph in
Microsoft Excel to obtain the relationship between the solar irradiance, current and voltage.
A suitable line of best fit is drawn, and its equation is obtained
for programming the microcontroller. Table 1 shows the recorded
readings for solar irradiance, current and voltage.
Table 1 shows part of the recorded data. The readings are recorded in
such a table in Microsoft Excel and then a graph is plotted. The error between the
actual reading (multimeter reading) and the approximate reading (sensor reading) is
calculated based on the following equation:

$$\mathrm{Error}\,(\%) = \frac{\left| Approx - Actual \right|}{Actual} \times 100\% \qquad (2)$$

Equation (2) is crucial for knowing how accurate the sensor reading is compared
to the actual reading. Multiple calibration processes should be done to achieve better
accuracy of the system. The ideal system accuracy corresponds to a percentage of error less
than 10%. Figure 5 illustrates the graph of solar irradiance vs. current plotted based
on Table 1.

Table 1 Recorded solar irradiance, current and voltage readings


No.   Solar irradiance meter: Solar irradiance [W/m²]   Multimeter: Current [mA]   Voltage sensor: Current [mA]   Voltage sensor: Voltage [V]   Error (%)
1 239 7.79 7.32 0.2197 6.03
2 237 7.79 6.51 0.1953 16.43
3 239 8.18 7.32 0.2197 10.51
4 244 8.19 6.51 0.1953 20.51
5 245 8.22 7.32 0.2197 10.95
6 257 8.64 8.14 0.2441 5.79
7 242 8.63 8.14 0.2441 5.68
8 252 8.65 8.14 0.2441 5.90
9 267 8.97 8.14 0.2441 9.25
10 255 8.98 8.14 0.2441 9.35
11 261 9.3 8.14 0.2441 12.47
12 275 9.3 8.14 0.2441 12.47
13 271 9.3 8.14 0.2441 12.47
14 264 9.38 8.14 0.2441 13.22
15 274 9.38 8.14 0.2441 13.22
16 265 9.35 8.95 0.2686 4.28
17 271 9.74 8.95 0.2686 8.11
18 299 10.06 8.95 0.2686 11.03
19 292 10.08 8.95 0.2686 11.21
20 313 10.55 9.77 0.293 7.39
21 314 10.54 9.77 0.293 7.31
22 312 10.66 9.77 0.293 8.35
23 318 11.22 9.77 0.293 12.92
24 321 11.22 9.77 0.293 12.92
25 326 11.21 9.77 0.293 12.85
26 325 11.19 10.58 0.3174 5.45
27 336 11.47 10.58 0.3174 7.76
28 336 11.51 10.58 0.3174 8.08
29 340 11.51 10.58 0.3174 8.08
30 321 11.49 10.58 0.3174 7.92

The relationship between the solar irradiance and the current can be interpreted
by an equation based on the collected data. The equation will be used in Arduino
programming to obtain the respective solar irradiance at any instance where the
current or voltage in the circuit is measured by using the voltage sensor. The
solar irradiance calculated by the microcontroller will be displayed on the LCD along with
the ambient temperature data obtained from the temperature sensor.
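As an illustration of the computation chain described in this section, the sketch below shows in plain Python (rather than Arduino code) how a measured voltage drop could be converted to a current via Eq. (1) and then to an irradiance estimate through a fitted straight line; the slope and intercept are placeholder values, not the coefficients actually programmed into the microcontroller.

```python
R_SHUNT_OHM = 30.0       # series resistor from Fig. 2

# Hypothetical straight-line fit G = SLOPE * I + INTERCEPT standing in for the
# Microsoft Excel regression of Table 1 (placeholder values).
SLOPE = 28.0             # (W/m^2) per mA
INTERCEPT = 25.0         # W/m^2

def current_ma(v_drop_volts):
    """Eq. (1): I = Vd / 30, expressed in milliamperes."""
    return v_drop_volts / R_SHUNT_OHM * 1000.0

def irradiance(v_drop_volts):
    """Apply the fitted irradiance-versus-current relationship."""
    return SLOPE * current_ma(v_drop_volts) + INTERCEPT

def error_percent(approx, actual):
    """Eq. (2): percentage error against the reference meter."""
    return abs(approx - actual) / actual * 100.0

# Example: a 0.2197 V drop corresponds to about 7.3 mA.
print(irradiance(0.2197), error_percent(irradiance(0.2197), 239.0))
```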

Fig. 5 Graph of solar irradiance vs current

3 Result and Discussion

3.1 Calibration and Accuracy of System

The equation of the fitted relationship is applied to the system designed. Multiple sets of data are
taken again to calculate the percentage of error between the sensor reading and the multimeter
reading, followed by the calibration process. The calibration process is
carried out with the prototype placed right next to the Seaward Solar Meter. The two
devices are placed closely together to ensure both devices capture an equal amount of
solar irradiance. The data before calibration are recorded in a table. Figure 6 shows a
photo of the measurement taken from the two systems, the system developed and the Seaward
Solar Survey 200R device as the benchmark.
The collected data from the developed system are plotted. As can be seen in
Fig. 7, there are notable errors and differences in the solar irradiance readings between
the system developed and the Seaward Solar Survey 200R device. Thus, in order to
improve its accuracy, the data are divided into several clusters and the average
difference for each specific cluster is calculated. The average difference that is used
for the calibration process is shown in Table 2.

Fig. 6 Comparing system to actual solar irradiance reading

Fig. 7 Solar irradiance graph of seaward irradiance meter and system developed before
calibration

After the calibration, the data collection process is repeated. A total of 173
data samples have been collected for the purpose of calibration. The new percentages
of error are calculated, and a new graph is plotted in Fig. 8. From Fig. 8,
it can be seen that the percentage of error of the solar irradiance readings between the system
developed and the Seaward Solar Survey 200R device has been further reduced. The
percentage of error has successfully been reduced to less than 10%. Figure 9 shows a
photo of prototype testing. A prototype of this system is developed and its solar irradiance measurement is compared

Table 2 Data calibration

Cluster                                   Calibration
Irradiance ≤ 260 W/m²                     −33.5263 W/m²
260 W/m² < Irradiance ≤ 300 W/m²          −49 W/m²
300 W/m² < Irradiance ≤ 350 W/m²          −48.8883 W/m²
350 W/m² < Irradiance ≤ 405 W/m²          −57.7143 W/m²
405 W/m² < Irradiance ≤ 510 W/m²          −44.5625 W/m²
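A minimal sketch of how the cluster-based calibration of Table 2 could be applied to a raw reading is given below; the boundaries and offsets are copied from Table 2, while the handling of readings above the last cluster is an assumption of this example.

```python
# (upper bound of the cluster in W/m^2, calibration offset in W/m^2), from Table 2
CALIBRATION = [
    (260.0, -33.5263),
    (300.0, -49.0),
    (350.0, -48.8883),
    (405.0, -57.7143),
    (510.0, -44.5625),
]

def calibrate(raw_irradiance):
    """Add the offset of the first cluster whose upper bound covers the raw reading."""
    for upper, offset in CALIBRATION:
        if raw_irradiance <= upper:
            return raw_irradiance + offset
    return raw_irradiance + CALIBRATION[-1][1]   # above 510 W/m^2: reuse the last offset

print(calibrate(320.0))   # 320 - 48.8883 = 271.1117 W/m^2
```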

Fig. 8 Solar irradiance graph of seaward irradiance meter and system developed after calibration

Fig. 9 Comparing solar irradiance reading from the system developed and seaward solar survey
200R device

Fig. 10 Final prototype of the system developed

with that of a Seaward Solar Survey 200R device. The
comparison shows only a slight difference in the solar irradiance measurement.
Figure 10 shows the exterior design of the final prototype of the system
developed. A mini solar cell is placed on the top surface to collect solar radiation
and convert it into electricity. A temperature sensor LM35 is placed on the right
surface, located close to the bottom. The solar irradiance is computed and
displayed on the yellow-backlight LCD. The prototype is powered by a 9 V battery.

4 Conclusion

The developed system managed to measure the instantaneous solar irradiance and
ambient temperature and is able to display the measured parameters on the LCD for users.
The device developed is of handheld size as proposed in the early stage, making it
convenient for the user to carry around. It is very user-friendly and common users do
not need any special handling knowledge to operate the device. Even though the
device is able to measure solar radiation and ambient temperature, the accuracy can be
further improved. The system developed used a voltage sensor module with a ±0.02 V
resolution, which is not enough resolution for the voltage drop across the small-value resistor
connected to the solar cell. The solar cell output, with a 30 mA maximum current, is too
small to be detected. Minimal voltage drops across the 30 Ω resistor are not detectable
due to the low resolution of the voltage sensor module. Only once the voltage drop increases to a
certain level detectable by the voltage sensor module can a new solar irradiance
value be computed. To improve the device, a voltage sensor or current
sensor with better resolution and accuracy, able to detect the extremely small current
flowing from the solar cell, should be implemented. A voltage amplifier circuit can also
be inserted in the circuit to amplify the voltage drop. Alternatively, a solar cell
with greater output current can be used. However, the size of the solar cell should be
small enough to be placed in a handheld device. More features such as data logging,
angle measurement and a compass can also be added to the system.

Acknowledgement The authors gratefully acknowledge Universiti Malaysia Pahang (UMP) under
UMP Research Grant (RDU1703128) for the financial support and facilities.

References

1. Kamat PV (2007) Meeting the clean energy demand: nanostructure architectures for solar
energy conversion. J Phys Chem C 111(7):2834–2860
2. Hankins M (2010) Stand-Alone Solar Electric Systems. Earthscan, London
3. Shaari S, Omar AM, Haris AH, Sulaiman, SI (2010) Solar photovoltaic power: fundamentals.
Putrajaya, Kementerian Tenaga, Teknologi Hijau dan Air
4. Shaari S, Omar AM, Haris AH, Sulaiman SI (2010) Solar photovoltaic power: designing
grid-connected systems. Putrajaya, Kementerian Tenaga, Teknologi Hijau dan Air
5. SEDA PORTAL. http://www.seda.gov.my/. Accessed 09 May 2019
6. Bezir NC, Akkurt I, Özek N (2010) The development of a computer program for estimating
solar radiation. Energy Sources Part A Recov Util Environ Effects 32(11):995–1003
7. Oturanc G, Hepbasli A, Genc A (2003) Statistical analysis of solar radiation data. Energy
Sources 25(11):1089–1097
8. Dhole SV, Grover A, Kumar S, Kumar J (2017) Solar energy measurement system using pic.
Int J Adv Comput Electron Technol (IJACET) 4(2):4
9. Singh D, Sharma Thakur A (2011) Microcontroller based digital solar insolation meter
10. Ekici C, Teke I (2018) Global solar radiation estimation from measurements of visibility and
air temperature extremes. Energy Sources Part A Recov Util Environ Effects 41(11):1344–
1359
11. Cano D, Monget JM, Albuisson M, Guillard H, Regas N, Wald L (1986) A method for the
determination of the global solar radiation from meteorological satellite data. Sol Energy 37
(1):31–39
Performance of Graphite and Activated
Carbon as Electrical Grounding
Enhancement Material

Mohd Yuhyi Mohd Tadza, Tengku Hafidatul Husna Tengku Anuar,
Fadzil Mat Yahaya, and Rahisham Abd Rahman

Abstract The main purpose of an electrical grounding system is to disperse unwanted
electrical faults to the ground within the shortest time. Graphite and activated carbon
(AC) are used as alternative materials to improve the performance of the earth
grounding system. This study aims to investigate the physical and mechanical
behavior of conductive aggregates derived from graphite and AC, namely the
electrical resistivity, water absorption and crushing strength. In addition, similar
tests were also conducted on mortar aggregates for comparison. The electrical
resistivity of the aggregates was measured by using the soil box method. Test results
showed that the electrical resistivity, water absorption and crushing strength of all
aggregates varied with time. These values were found to stabilize after
approximately 14 days. The electrical resistivity for aggregates containing
graphite and AC was found to be 49.2 and 185 Ω m respectively, far lower than the
12,700 Ω m obtained for mortar-based aggregates. Similarly, the water absorption for graphite
and AC aggregates was greater compared to mortar aggregates. On the other hand,
the crushing strength for graphite and AC aggregates was found to be lower. Incorporating
graphite and AC significantly improved the electrical resistivity behavior while
maintaining acceptable mechanical properties crucial for electrical grounding purposes.
Overall, graphite shows better performance in electrical resistivity compared to AC.

Keywords Grounding system · Conductive aggregates · Ground enhancement material

M. Y. M. Tadza (✉) · T. H. H. T. Anuar · F. M. Yahaya


Faculty of Civil Engineering Technology, Universiti Malaysia Pahang,
Lebuhraya Tun Razak, 26300 Gambang, Kuantan, Pahang, Malaysia
e-mail: [email protected]
R. A. Rahman
Faculty of Electrical and Electronic Engineering, Universiti Tun Hussien Onn,
86400 Parit Raja, Batu Pahat, Johor, Malaysia

© Springer Nature Singapore Pte Ltd. 2021


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_81

1 Introduction

Ideally, an earth grounding system should have a low resistivity in order
to allow high current to be effectively discharged to ground, but a low earth impedance
can never be perfectly achieved [1]. An earth grounding system serves as lightning
protection in industrial and power plants in order to distribute current during a
lightning strike [2]. The system protects the equipment in the substation by
providing a low-impedance path for the fault current to flow to the earth [3]. An electrical
grounding system is applied to dispose of unwanted flows of electrical charge, such as
fault and lightning-induced currents, to the earth or to return them back to the generator via the
earth mass [4, 5]. A good grounding system should be able to provide a low-resistance
path while maintaining safe operation of the power system and electrical
components [6]. Similarly, Amadi [7] stated that a proper installation of a grounding
system helps to safeguard equipment and buildings from any damage related to
electricity.
Previous studies have shown that soil resistivity plays a key role in
the design of an electrical grounding system for high-voltage transmission and
lightning prevention systems [8]. The resistivity values
of the soil are controlled by the type of soil and the water content. It was noted that
dry soil has higher resistivity compared to wet soil [9]. Kizlo and Kanbergs
[10] and Coelho et al. [11] noted that the resistivity of soil changes widely
depending on the geological structure of the ground. The most significant variations
occur with water content, chemical composition of the material, porosity, temperature,
conductivity, vertical thickness and soil compaction. Interestingly, for any given
soil type, dry conditions give a greater resistivity value as compared to wet conditions.
Nowadays the use of conductance-enhancement material or ground
enhancement material as a backfill material is the most preferred method to maintain
low resistance for a long period of time [12].
Although an effective grounding system is one of the most crucial
features in electrical design and installations, the electrical grounding concept in the
electrical industry is still under-emphasized or rather misunderstood, despite the knowledge
that failure to achieve an effective grounding system may lead to serious injuries
including fatalities [13]. As a result of such accidents, the design of electrical
grounding has been seriously taken into consideration, as it is the major parameter
that affects the behavior of earth faults in an electrical power system. Ideally, the
lower the ground resistance, the better the grounding system in terms of adequately
meeting the requirements. Yet, the value of ground resistance varies between different
installations, as larger systems with higher levels of fault current require lower
ground resistance than smaller systems which yield lower fault currents. In general,
a ground resistance ranging from 1 to 5 Ω is suitable for large commercial installations
and industrial plant substations and buildings, as stated in the IEEE 142-2007
Green Book [14].
Previous studies show that there are many materials that can be used as ground
enhancement material, either chemical or natural based, but not all of them are used in
practice. Metal oxide powder, granite powder, limestone powder and coke breeze
are low-cost materials that can be used as ground enhancement or
backfill materials other than bentonite. However, these materials take a long period of
time to stabilize compared with bentonite for low earth resistance [15]. Based on the
study in [16], electrically conductive aggregate is a cement-based composite that
contains conductive materials to achieve a stable and relatively low electrical
resistivity. In a previous study, ordinary Portland cement (OPC) was used as a backfill
material due to its applications and electrical conductivity [17]. Chen et al. [18]
state that the electrical conductivity of normal aggregate is very low due to the
very limited conductivity of C-S-H (Calcium-Silicate-Hydrate) and the highly
tortuous pore structure [19, 20]; electrically conductive concrete is normally
developed through adding certain conductive materials such as steel slag, stainless
steel fiber, graphite, carbon fiber and carbon black into the cement base.
Carbon materials such as AC and carbon black (CB) are predominantly composed
of carbon atoms, only one kind of element, but they have largely diverse
structures and properties. Depending on the crystal structure, carbon can exist in
various forms. Graphite, for example, is an electrical conductor and its conductivity is
strongly enhanced by arsenic pentafluoride (AsF5) intercalation, becoming higher than that of
metallic copper, whereas diamond is completely insulating [21]. According to
Shioya et al. [22], graphite compounds with high electrical conductivity (i.e.
9 × 10^5 S/cm) were produced by intercalation of AsF5. Carbon materials have
large amounts and various sizes of pores [23].
In order to achieve low resistivity, a low-resistivity ground enhancement material
(GEM) is commonly used. Bentonite and Marconite are widely used as backfill
materials [24, 25]. Marconite has an extremely low resistivity of 0.001 Ω m at its
initial state, while bentonite demonstrates extremely low resistivity in wet conditions
[26]. Bentonite is a natural clay which is widely used as a ground
enhancement material (GEM); it is able to absorb and retain moisture while
maintaining the humidity in soil [27]. Apart from that, carbon fibre and CB are
also used as alternative grounding materials due to their good performance. Carbon
fibre is a good material to modify electrically conductive concrete due to its high
conductivity and tensile strength, while carbon black can significantly improve
the conductivity of the cement matrix. Thus, when carbon fibre is mixed with CB, it
provides high conductive performance and excellent mechanical properties [28].
Apart from that, Wang et al. [29] stated that graphite is used in practical
engineering due to its good conductivity and low cost. Similarly, adding
graphite to concrete will reduce the compressive strength, flexural strength and
bending strength of the composite. Ruohan et al. [30] proposed a new GEM
called flexible graphite copper composite electrical grounding material. Similarly,
graphite powder has a high carbon content which can significantly improve the
conductivity and decrease the resistivity of cement concrete, and it easily possesses
good electrical conductivity. Graphite results indicate that the resistivity of conductive
concrete ranges from 0.1 to 100,000 Ω cm as the graphite dosage changes,
with good concrete conductivity at high graphite dosage [31]. Apart from graphite,
activated carbon has shown remarkable improvement in the resistivity value
as an electrical grounding enhancement material. Egashira et al. [32] stated that AC
can be used as an electrode material because it can store a substantial amount of water
on its pore surfaces. AC with high surface area and pore volume is produced
from a variety of carbonaceous source materials such as coal and agricultural waste
[33]. In addition, activated carbon is produced from nearly all carbon-containing
organic materials, mainly wood, sawdust, nutshells, fruit stones, peat, lignite and
petroleum coke [34].
To date, information on the physical and mechanical properties of graphite- and AC-derived
aggregates appears to be limited. In this study, graphite and AC are
incorporated in a cement mix to produce conductive aggregates used as an alternative
grounding material. A semi-dry mixing method was adopted to disperse the graphite
and AC within the cement matrix. The physical and mechanical properties
were later determined and compared with mortar aggregates to identify the electrical
resistivity, water absorption and crushing strength behaviors.

2 Experimental Study

2.1 Raw Material

Ordinary Portland cement (OPC) and sand were used in this study as the main
materials to prepare the aggregates. Graphite and AC, acting as conductive materials,
were initially mixed into the main materials to form conductive aggregates. In
addition, a conventional cement mix (i.e. mortar) was also tested without the addition
of graphite and AC. The specific gravity and surface area of the OPC are 3.15 and
3310 cm2/g, respectively. The chemical composition of the OPC is presented in
Table 1.
Graphite and AC in powder form were purchased from R&M Chemical. The
specific gravities of the graphite and AC used were about 2.07 and 1.93, whereas the
surface areas of the graphite and AC were around 3.56 m2/g and 2010 m2/g.

Table 1 Chemical composition of OPC

Chemical composition    OPC (wt.%)
SiO2 19.85
Al2O3 3.68
CaO 65.14
SO4 5.40
Fe2O3 2.90
MgO 1.78
K2O 0.91
TiO2 0.27
Na2O –

2.2 Preparation of Aggregate

The graphite and AC samples were oven dried for 24 h at a temperature of 105 °C.
Later, the samples were kept in sealed plastic bags. The weight ratio of OPC to
sand was fixed at 0.25, while the water to solid material ratio was fixed at 0.3.
The electrically conductive aggregates were prepared by following the method
described by Chen et al. [18]. In this study, the main materials used to prepare the
aggregates were 500 g of OPC, two kilograms of sand and 750 g of water. The dry
powders of OPC and sand were first mixed in the dry condition. During the mixing stage,
one third of the total water was used to mix the OPC and sand in the semi-dry condition
to improve the dispersion. The remaining two thirds of the water was added gradually
after 10 min to bind the fine particles. One litre of the mixture was weighed to calculate
the mass of conductive material to be added in the mixing, which is 1% by
mass of the total raw materials. 1% of the mixed material was removed before the
conductive material was added. After that, the 1% of conductive material, graphite or AC,
was added into the mixture and then mixed vigorously for five minutes to ensure
a constant distribution of the conductive component.
Three different series of mix formulations were considered in this study. The first
series of mix formulations was OPC and sand mixed with water without adding
carbon-based materials, known as mortar. Next, graphite was added in the
second series of formulations while AC was added in the third series. Before
proceeding with the aggregate process, the materials were tested for the electrical resistivity
reading at day 0. The produced aggregates, as shown in Fig. 1, were cured at room
temperature for 7, 14 and 28 days.
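As a small worked example of the dosing step described above, the sketch below computes the mass of graphite or AC corresponding to 1% of a batch; taking the full 500 g OPC, 2 kg sand and 750 g water batch as the reference mass is an assumption made purely for illustration.

```python
def conductive_material_mass(opc_g, sand_g, water_g, dosage=0.01):
    """Mass of conductive material equal to 'dosage' (1% by default) of the raw materials."""
    return dosage * (opc_g + sand_g + water_g)

# 500 g OPC + 2000 g sand + 750 g water gives a 3250 g batch, so 1% is 32.5 g.
print(conductive_material_mass(500.0, 2000.0, 750.0))
```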

2.3 Test Methods

Water Absorption. The water absorption of the aggregates was measured according
to BS EN 1097-6 after 7, 14 and 28 days of the curing process. The aggregates were
removed from water and dried at a temperature of 105 °C until they reached constant
mass. The same aggregates were then immersed in water for 24 h. Later, the aggregates
were removed from the water and dried by gently rolling them on a piece of dry tissue. The
water absorption of the aggregates over 24 h is calculated using Eq. (1).
$$WA_{24h} = \frac{M_{sat} - M_{dry}}{M_{dry}} \qquad (1)$$

where Mdry is the mass of the oven dry aggregates, and Msat is the mass of saturated
aggregate after 24 h.
Fig. 1 Photo of manufactured electrically conductive aggregates

Crushing Strength. The crushing strength of the conductive aggregates was tested by
a uniaxial compression test between two parallel rigid plates. The aggregates were
immersed in water for the curing process and the crushing strength was determined at 7,
14 and 28 days. Each aggregate was removed from the water, dried on the surface by
gently rolling it on a towel, and placed at the centre of the plates. The load
at fracture was recorded. The crushing strength of each aggregate is
calculated using Eq. (2).

$$S = \frac{2.8\, P_c}{\pi X^2} \qquad (2)$$

where Pc is the load at fracture and X is the diameter of aggregate.
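Both laboratory quantities reduce to simple arithmetic; the sketch below is a minimal illustration of Eqs. (1) and (2), assuming masses in grams, load in newtons and diameter in millimetres so that the strength comes out in N/mm².

```python
import math

def water_absorption_24h(m_sat_g, m_dry_g):
    """Eq. (1): 24-hour water absorption relative to the oven-dry mass."""
    return (m_sat_g - m_dry_g) / m_dry_g

def crushing_strength(load_at_fracture_n, diameter_mm):
    """Eq. (2): S = 2.8 Pc / (pi * X^2), giving N/mm^2 for the units assumed here."""
    return 2.8 * load_at_fracture_n / (math.pi * diameter_mm**2)

# Example: about 11% absorption and roughly 7.4 N/mm^2 crushing strength.
print(water_absorption_24h(55.6, 50.0) * 100)
print(crushing_strength(830.0, 10.0))
```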


Electrical Resistivity. The aggregates were prepared in Sect. 2.2 were carefully
molded in a rectangular soil box apparatus for easy measurement, ensuring that the
aggregates were placed as close as possible in order to achieve good contact
between each other. Electrical resistivity of cement as the main binder will be
directly affect the electrical resistivity of aggregates. The electrical resistivity for
conductive aggregates were measured at different elapsed time of 7, 14 and 28 days
after curing process. The aggregates were first removed form water and gently
rolled on a surface of dry tissues. Later, the aggregates were placed in a soil box.
The rate of concrete strength gained depends on the reaction rate of the cement and
addition of water (hydration). As the cement mixture age and matured, concrete
Performance of Graphite and AC as Electrical Grounding Enhancement … 1145

Fig. 2 Tinker & Rasor resistivity meter with soil box attachment

Fig. 3 Schematic diagram of full testing impulse breakdown voltage

gain strength rapidly when temperature is higher [35]. Concrete strength increased
with age, where after 7 days curing, the concrete strength reaches up to 65% and
matured consistently to 99% of its strength after 28 days [36]. A four-point elec-
trode (Wenner probe setup) techniques were employed to record the aggregate
electrical resistance. A Tinker & Rasor SR-2 resistivity meter was used for this
purpose. Schematic view of measurement is illustrated in Fig. 2. The dimension of
soil box used in this study is 279  76  38 mm. This method conforms with
ASTM G57 for field measurement of soil resistivity using the Wenner
four-electrode method.
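A minimal sketch of converting a soil-box resistance reading into resistivity is shown below; it assumes the usual soil-box relation ρ = R·A/L, with the box cross-section formed by the sample width and depth and L taken as the spacing of the potential pins, which is an assumption about the geometry rather than a description of the SR-2 meter's internal procedure.

```python
def soil_box_resistivity(resistance_ohm, width_m, depth_m, pin_spacing_m):
    """Resistivity (ohm.m) from a soil-box reading using rho = R * A / L."""
    area = width_m * depth_m                 # cross-section seen by the current
    return resistance_ohm * area / pin_spacing_m

# Example with the 76 mm box width, a 38 mm fill depth and an assumed
# 0.1 m spacing between the potential pins.
print(soil_box_resistivity(1700.0, 0.076, 0.038, 0.10))   # about 49 ohm.m
```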
HV Impulse Breakdown. The schematic diagram of the lightning impulse breakdown
test is shown in Fig. 3. Aggregates were filled into a sample container
connected to the load terminals, to be energized with a standard lightning impulse
1.2/50 µs source up to 100 kV. The output voltage and waveform were measured
on the low-voltage arm of the capacitive divider until the highest voltage was achieved,
and the data were recorded by using a digital oscilloscope.

According to the BS EN 60060-1 Standard [37], the 'Up and Down Method' is one of
the methods used to determine the impulse breakdown voltage through the
50% probability breakdown voltage, known as U50. The sample was initially
energized with the estimated impulse breakdown voltage, Vk, until the sample
achieved the breakdown value. The subsequent impulse magnitude depends on
whether a breakdown occurs in the present energization. The impulse voltage is
increased by ΔUk, ranging between 3% and 6% of Vk, if no breakdown
occurs on the sample aggregates. This procedure was repeated for a total of 25
impulses, in order to achieve a considerable breakdown certainty [38]. The U50
voltage was calculated using the numerical expression given by Eq. (3).
$$U_{50} = \frac{\sum n_i U_i}{\sum n_i} \qquad (3)$$

where n_i is the number of impulse energizations at the same impulse magnitude U_i.
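Since Eq. (3) is simply a weighted average over the applied impulse levels, the bookkeeping of the up-and-down test can be coded compactly as below; the example voltage sequence is invented for illustration only.

```python
from collections import Counter

def u50(applied_levels_kv):
    """Eq. (3): U50 = sum(n_i * U_i) / sum(n_i) over the applied impulse levels."""
    counts = Counter(applied_levels_kv)               # n_i energizations at each level U_i
    return sum(n * u for u, n in counts.items()) / sum(counts.values())

# Example: 25 energizations stepped up and down around the breakdown level.
levels = [60, 62, 64, 62, 60, 62, 64, 66, 64, 62, 64, 66, 64, 62,
          64, 66, 68, 66, 64, 62, 64, 66, 64, 62, 64]
print(u50(levels))   # about 63.7 kV
```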

3 Results and Discussion

3.1 Water Absorption

Figure 4 shows the typical water absorption results for graphite, AC and mortar
aggregates. The relationship between water absorption and curing time shows that
the resistivity value of concrete is affected by the curing regimes. The test data
agreed very well with the observation made by Presuel-Moreno et al. [39].

Fig. 4 Water absorption of aggregate with graphite, AC and mortar (bar chart of water absorption (%) at ages of 7, 14 and 28 days)

Referring to Fig. 4, it was noted that prolonged curing time resulted in increased
water absorption for all aggregates. The aggregate values for all materials were
found to stabilize after 14 days of curing. In this case, the graphite and AC
aggregates retained a higher level of water absorption after 28 days compared to the mortar aggregates.
Graphite increased steadily from 11.13 to 13.25% between 7 and 14 days but
reduced after 28 days of curing to a value of 12.61%, which was about a 0.64% decrease
in water absorption. AC shows an increase in water absorption from 11.0% to
13.82% between 7 and 28 days. In addition, the mortar aggregates show a minor
increase in water absorption from 7 to 28 days, from 10.18 to 10.85%, respectively.
Similar to lightweight aggregates, the water absorption of the graphite and AC aggregates
is acceptable within the range of 9 to 13% [40]. Similarly, Tuan et al. [41]
suggested that the water absorption of lightweight aggregates should be lower than 20%
after 24 h of immersion. Referring to the standard GB/T 17431.1 [42], the absorption after 1 h of
immersion of lightweight aggregates is specified in the range of 10 to 30%, depending on the
type and density of the lightweight aggregates. The water absorption during concrete
mixture immersion is mainly influenced by the factors governing concrete
porosity [43]. In other words, the pores present in the graphite and AC aggregates are
greater and affect the water absorption characteristic. Isolated pores or a vitrified
surface tend to absorb little water, while connected or open pores will
absorb more water [44]. It was also suspected that the carbon derived from graphite and
AC caused agglomeration of carbon, influencing the high porosity of the aggregates [18].
Studies have shown that carbon has the ability to absorb water and retain ions,
which contributes to the improved ability of current to flow within the matrix
[45]. Thus, conductive aggregates produced by adding graphite and AC exhibit
satisfactory water absorption compared to a common mortar aggregate. The availability
of higher water content in the graphite and AC aggregates also enhances the
resistivity value, as the water film acts as a bridging agent to allow more electrical current
to flow.

3.2 Crushing Strength

The relationship between the crushing strength of the graphite, AC and mortar aggregates
and the curing time is shown in Fig. 5. The crushing strength of the aggregates is
affected by the elapsed time and the maturity of the cement mixture. As shown, the
crushing strength for graphite and mortar increased consistently and stabilized
after the 28-day curing process, whereas the AC differed slightly after 28 days.
The crushing strength of the graphite aggregate gradually increased from
5.88 to 7.44 MPa, while that of the AC aggregates increased between 7 and 14 days, from 8.20
to 10.76 MPa, but decreased to 10.31 MPa after 28 days of curing. There was a first
significant drop in crushing strength when carbon was added. On the other hand, the
crushing strength of the mortar aggregate was 9.20 MPa after 7 days and steadily
increased to 10.24 MPa and 11.75 MPa at 14 and 28 days, respectively. Previous
studies have shown that the compressibility behavior of material incorporated with
carbon is generally much lower [46].

Fig. 5 Crushing strength of aggregate with graphite, AC and mortar (bar chart of crushing strength (N/mm²) at ages of 7, 14 and 28 days)

González-Corrochano et al. [40] stated that
González-Corrochano et al. [40] stated that the crushing strength of artificial lightweight aggregates should be above 0.5 MPa to be comparable with commercial lightweight aggregates. Gennaro et al. [47] proposed that lightweight aggregates with a compressive strength of around 1.0 MPa can be used to produce structural concrete. Concrete strength at a given age depends on the type and mix proportion of the slag in the concrete [48]. Compressive strength is affected by interrelated factors such as pore size and distribution, aggregate shape, and sintering effects, which can enhance porosity and strength while reducing water absorption [49, 50]. Many factors contribute to the compressive strength of concrete: mix proportioning, aggregate quality, aggregate gradation, cement type, mixing and placing method, curing conditions, curing temperature and the water-cement ratio are among the most significant. The water-cement ratio has the most prominent impact on concrete strength; additional water increases workability but reduces strength [51].

3.3 Electrical Resistivity

Figure 6 shows the electrical resistivity of the aggregates prepared with mortar, graphite and AC as a function of elapsed time. The values were found to stabilize after 14 days of the curing process. As shown in Fig. 6, the electrical resistivity values for graphite and mortar remained essentially unchanged at 7 days of water curing. Subsequently, the graphite aggregates decreased slightly from 54.8 to 49.2 Ω·m between 14 and 28 days, while the mortar aggregates increased gradually, reaching 12,810 and 12,700 Ω·m at 14 and 28 days, respectively. In the case of the AC aggregates, the electrical resistivity was 213 Ω·m after 7 days and then changed to 220.5 and 185 Ω·m at 14 and 28 days, respectively. Thus, the graphite and AC aggregates show only slight variation in electrical resistivity.
Fig. 6 Electrical resistivity of aggregate with graphite, AC and mortar

Test results showed that time had no significant influence and that the maturity of the cement mixture has no impact on the resistivity behavior of both materials. Hallmark [52] stated that conductive cement has a resistivity of about 20 Ω·m, which is better than the most problematic soils with high resistance (i.e. rocky and sandy in nature). As dry cement and sand have high resistivity values (of the order of 10^6 Ω·m), the initial lowering of the electrical resistivity is primarily due to the presence of water, which acts as a bridging agent to conduct electricity within the cement-sand mixture. During hardening, the water evaporates, reducing the ability of current to flow through the mortar aggregate [2, 28]. On the other hand, the presence of the highly conductive carbon materials maintained the electrical resistivity of the derived aggregates, and water has less influence in determining their overall performance as a grounding material [53].

3.4 Impulse Breakdown

Breakdown Voltage of Conductive Materials. The breakdown voltage is the voltage at which the condition of a material changes under an applied voltage; it indicates the maximum voltage a material can withstand before it breaks down. A lower breakdown voltage is preferable to a higher one, since it allows even low-intensity lightning strikes to be grounded for safety purposes [6]. The results and general trend of the breakdown voltage for the conductive aggregates are shown in Fig. 7. Figure 7a shows the impulse breakdown voltage of graphite when injected with impulses of up to 60 kV. The breakdown cutoff was observed at about 42.02 kV. No oscillation of rising and falling voltage with declining amplitude was observed as time progressed. Similar observations were noted for the other breakdown events, as shown in Fig. 7b. This indicates that there is no noticeable difference between these two materials based on the voltage profile alone.
Similarly, Fig. 7b shows the breakdown behavior of activated carbon. A distinct breakdown value was observed compared with the graphite aggregate: the cutoff value was slightly higher, at about 44.26 kV. This is understandable, as graphite has a lower resistivity and is more conductive than activated carbon. Different types of material exhibit different physical properties and electrical resistivities.
Figure 7c shows the breakdown characteristics of mortar when subjected to impulses of up to 60 kV. As shown in Fig. 7c, the waveform of the mortar impulse exhibits an oscillation of rising and falling voltage with declining amplitude as time progresses, as is commonly observed in air.

Fig. 7 Typical breakdown characteristic: a Graphite, b Activated carbon, c Mortar (voltage in kV versus time in µs)

Table 2 50% breakdown voltage of the conductive materials

Test material        50% breakdown voltage (kV)   Standard deviation (kV)
Graphite             42.06                        2.50
Activated carbon     44.26                        2.63
Mortar               60.00                        3.57

Table 2 summarizes the 50% breakdown voltages for each material tested. A low breakdown voltage is a desirable property from the perspective of backfill material performance: a backfill material with a low breakdown voltage undergoes breakdown or ionization relatively easily, temporarily raising its conductivity in the event of a lightning strike [17]. Among the three materials tested, graphite had the lowest breakdown value compared with activated carbon and mortar, and hence would be the better candidate as a ground enhancement material for electrical grounding purposes.

4 Conclusion

In this study, two enhancement materials, graphite and AC, were used to investigate the performance of conductive aggregates when mixed with ordinary Portland cement (OPC) and sand. Electrical resistivity, water absorption and crushing strength were investigated for the aggregates incorporating the carbon-based materials. It was found that incorporating graphite and AC in the newly developed conductive aggregates improved the electrical resistivity to 49.2 and 185 Ω·m, respectively. Graphite showed the more promising result compared with AC based on electrical resistivity performance. The impulse breakdown voltage test results are also in conformity with the resistivity behavior of both materials. Both values were significantly lower than those obtained with cement and sand alone. In addition, the water absorption of the graphite and AC aggregates is higher and would further enhance the resistivity performance under wet conditions. However, incorporating graphite and AC produced a lower crushing strength compared with mortar. Nevertheless, this aspect does not hinder the use of graphite and AC aggregates as alternative materials for grounding applications.

References

1. Nor MM, Haddad A, Griffiths H (2006) Performance of earthing systems of low resistivity
soils. IEEE Trans Power Delivery 21(4):2039–2047
2. Liu Y, Zitnik M, Thottappillil R (2001) An improved transmission-line model of grounding
system. IEEE Trans Electromagn Compat 43(3):348–355

3. Loo KH, Ukil A (2017) Design of safety critical substation earthing system based on soil
characteristics. In: 2017 IEEE industry applications society annual meeting
4. Lim SC, Ab Kadir MZA, Gomes C (2013) Electrical earthing in troubled environment. Int J
Electr Power Energy Syst 47:117–128
5. Lim SC, Ab Kadir MZA, Gomes C, Jasni J (2012) Ambiguity of grounding specifications:
IEC 62305 revisited. In: International conference on lightning protection (ICLP), pp 1–6
6. Hasni NAM, Abd-Rahman R, Ahmad H, Jamail NAM, Kamaruddin MS, Ridzwan SS (2017)
Investigation of potential grounding compound for portable applications. Int J Electr Comput
Eng 7(6):31–40
7. Amadi HN (2017) Soil resistivity investigation for substation grounding system in wetland
regions – a case study of Lagos State, Nigeria. Asian J Nat Appl Sci 6(December):90–99
8. Slaoui FH, Kanbergs A (2010) Evaluation of grounding resistance and inversion method to
estimate soil electrical grounding parameters. Int J Multiphys 4(3):201–215
9. Malanda SC, Davidson IE, Buraimoh E (2018) Analysis of soil resistivity and its impact on
grounding systems design. In: 2018 IEEE PES/IAS powerafrica. IEEE, pp 324–329
10. Kizlo M, Kanbergs A (2019) The causes of parameters changes of soil resistivity. Power
Electr. Eng. 25:43–46
11. Coelho VL, Piantini A, Almaguer HAD, Coelho RA, Boaventura WDC, Paulino JOS (2015)
The influence of seasonal soil moisture on the behavior of soil resistivity and power
distribution grounding systems. Electr Power Syst Res 118:76–82
12. Gomes C, Lalitha C, Priyadarshanee C (2010) Improvement of earthing systems with backfill
materials. In: 30th international conference on lightning protection (ICLP). Institute of
Electrical and Electronics Engineers, Cagliari, pp 1–9
13. Amadi HN (2015) Effective earthing system in the corrosive soil of Niger Delta. Int J Emerg
Sci Eng (IJSESE) 3(6):5–9
14. IEEE Standard 142 (Green Book) (2007) IEEE recommended practice for grounding of
industrial and commercial power systems
15. Gomes C, Ab Kadir MZA, Kottachchi CL, Lim SC (2014) Industrial wastes and natural
substances for improving electrical earthing systems. Int J Electr Eng Chin Inst. Electr. Eng
21(2):39–47
16. Chung DDL (2004) Electrically conductive cement-based material. Adv Cem. Res. 16
(4):167–176
17. Lim SC, Gomes C, Kadir MZAA, Nourirad G, Malek ZA (2015) Behaviour of backfill
materials for electrical grounding systems under high voltage conditions. J Eng Sci Technol
10(6):811–826
18. Chen B, Li B, Gao Y, Ling TC, Lu Z, Li Z (2017) Investigation on electrically conductive
aggregates produced by incorporating carbon fiber and CB. Constr Build Mater 144:106–114
19. Ma H, Hou D, Liu J, Li Z (2014) Estimate the relative electrical conductivity of C-S-H gel
from experimental results. J Constr Build Mater 71:392–396
20. Ma H, Li H (2013) Realistic pore structure of Portland cement paste: experimental study and
numerical simulation. Comput Concr. 11(4):317–336
21. Inagaki M (2013) Advance carbon materials. In: Handbook of advanced ceramics, pp 20–60
22. Shioya J, Matsubara H, Murakami S (1986) Properties of AsF5-intercalated vapor-grown
graphite. Synth Metal 14(1–2):113–123
23. Inagaki M, Kang F, Toyoda M, Konno H (2014) Carbon materials for adsorption of
molecules and ions. In: Advanced materials science and engineering of carbon, pp 335–361
24. Androvitsaneas VP, Ganos IF, Stathopulos IA (2012) Performance of ground enhancing
compounds during the year. In: 31st international conference of lightning protection, ICLP
2012
25. Tshubwana RP, Shongwe T, Gomes C (2016) Characteristics and performance of soil,
bentonite and conductive cement during impulse test in Swaziland. In: 2016 international
conference on emerging technological trends (ICETT). IEEE, pp 1–6

26. Tadza MYM., Mohamad D, Tripathy S, Rahman RA, Ismail MAM (2019) Bentonite and
marconite for electrical grounding applications from geotechnical engineering perspective. In:
AIP conference proceedings, vol 2129, no 1. AIP Publishing, p 020078
27. Azmi A, Ahmad NA, Yiew LK, Abdul-Malek Y (2019) The use of enhancement material in
grounding system: a review. Indones J Electr Eng Comput Sci 13(2):453–460
28. Zhang J, Xu L, Zhao Q (2017) Investigation of carbon fillers modified electrically conductive
concrete as grounding electrodes for transmission towers: computational model and case
study. Constr Build Mater 145:347–353
29. Wang D, Wang Q, Huang Z (2019) Investigation on the poor fluidity of electrically
conductive cement-graphite paste: experiment and simulation. Mater Des 169:107679
30. Ruohan G, Jiangjun R, Yuanchao H, Yangcong W, Shuo J (2016) Research on flexible
graphite-copper composited electrical grounding material. In: 12th IET international
conference on AC and DC power transmission (ACDC 2016), Beijing
31. Wu T, Huang R, Chi M, Weng T (2013) A study on electrical and thermal properties of
conductive concrete. Comput Concr 12(3):337–349
32. Egashira M, Izumi T, Yoshimoto N, Morita M (2016) Determining water content in AC for
double-layer capacitor electrodes. J Power Sources 326:635–640
33. Ahmadpour A, Do DD (1997) The preparation of AC from macadamia nutshell by chemical
activation. Carbon 35(12):1723–1732
34. Marsh H, Rodríguez-Reinoso F (2007) Production and reference material. In: Activated
carbon
35. Newman J, Choo B (2003) Advance concrete technology. Elsevier, Amsterdam
36. Misra AK (2017) Building materials and construction, 1st edn. S. Chand & Co Ltd., New
Delhi
37. BS EN 60060 – 1 (2010) 2010 – High-voltage test techniques. Part 1: general definitions and
test requirements: British Standard Institution Std
38. Kamarudin MS, Zainuddin H, Haddad A, Abd Rahman R, Radzi NH, Ponniran A, Zahari A
(2016) Purpose-built test rig for gas insulation breakdown tests under lightning impulse. In:
6th international conference on power and energy (PECon). IEEE, Melaka, pp 333–337
39. Presuel-Moreno F, Wu YY, Liu Y (2013) Effect of curing regime on concrete resistivity and
aging factor over time. Constr Build Mater 48:874–882
40. González-Corrochano B, Alonso-Azcárate J, Rodas M (2014) Effect of prefiring and firing
dwell times on the properties of artificial lightweight aggregates. Constr Build Mater 53:91–
101
41. Tuan BLA, Hwang CL, Lin KL, Chen YY, Young MP (2013) Development of lightweight
aggregate from sewage sludge and waste glass powder for concrete. Constr Build Mater
47:334–339
42. GB/T 17431.1 (2010) Lightweight aggregates and its test methods – part 1: lightweight
aggregates, p 6
43. Ahmed H, Bogas JA, Guedes M (2018) Mechanical behaviour and transport properties and
cementitious composites reinforced with carbon nanotubes. J Mater Civ Eng 30
(10):04018257
44. Hung MF, Hwang CL (2007) Study of fine sediments for making lightweight aggregate.
Waste Manag Res 25(5):449–456
45. Baharudin F, Tadza MYM, Imran SNM, Jani J (2018) Removal of iron and manganese in
groundwater using natural biosorbent. In: IOP conference series: earth and environmental
science, vol 140, no 1, p 012046
46. Tadza MYM, Baharudin F (2017) Treatment efficiency and compressibility behaviour of soil
modified with powdered AC. Int J 12(33):122–126
47. Gennaro R, Langella A, D’Amore M, Dondi M, Colella A, Cappelletti P, Gennaro A (2018)
Use of zeolite-rich rocks and waste materials for the production of structural lightweight
concretes. Appl Clay Sci 41(1–2):61–72
48. Shariq M, Prasad J, Ahuja AK (2008) Strength development of cement mortar and concrete
incorporating GGBFS. Asian J Civ Eng (Build Hous) 9(1):61–74

49. Fakhfakh E, Hajjaji W, Medhioub M, Rocha F, Lopez-Galindo A, Setti M, Kooli F,


Zargouni F, Jamoussi F (2007) Effects of sand addition on production of light-weight
aggregates from Tunisian smectite-rich clayey rocks. Appl Clay Sci 35(3–4):228–237
50. Huang SC, Chang FC, Lo SL, Lee MY, Wang CF, Lin JD (2007) Production of lightweight
aggregates from mining residues, heavy metal sludge, and incinerator fly ash. J Hazard Mater
144(1–2):52–58
51. Hasan M, Kabir A (2012) Early age tests to predict 28 days compressive strength of concrete.
In: Awam international conference on civil engineering (AICCE 2012)
52. Hallmark CL (2000) The use of conductive cement to extend and protect made ground
electrodes. In: Proceedings of the AREMA annual conference, AREMA
53. Tadza MYM, Anuar THHT, Yahaya FM (2019) Investigation on electrically conductive
aggregates as grounding compound produced by Marconite. Civ. Environ. Eng. Rep. (CEER)
3(30):86–96
Design on Real Time Control for Dual
Axis Solar Tracker for Mobile Robot

Muhammad Hanzolah Shahul Hameed, Mohd Zamri Hasan,


and Junaidah Ali Mohd Jobran

Abstract Nowadays, solar-powered technology is commonly used in many industries and applications, such as home appliances, mobile robots and surveillance systems. Solar energy is also used for recharging robot batteries. The problem is that most solar panels in these applications are kept in a static position. To maximize the solar energy received from the sun, the solar panel has to be positioned perpendicular to the sun, so tracking the sun's location and positioning the solar panel accordingly is important. The project developed here is therefore called "Solar Tracking System for Mobile Robot". To obtain the maximum solar energy, the robot system consists of a few processes. First, four light dependent resistors (LDRs) sense the light intensity and their readings are sent to an Arduino microcontroller. The solar tracker then adjusts the position of the solar panel by measuring the differences between the LDR readings and driving the servo motors that move the solar panel perpendicular to the sun. Two servo motors, one vertically mounted and one horizontally mounted, are used to move the solar panel about its two axes. The solar panel collects solar energy and converts it to electrical power. The output of the solar panel is automatically buck-boosted to 12 V to charge a 7.4 V battery, which powers the mobile robot and the solar tracker itself, using a Lithium Polymer charger adapter. The battery is charged and used at the same time. An on-board display shows the voltage received by the solar panel and the battery percentage. Overall, this project helps any battery-powered mobile robot to recharge itself using a dual-axis solar tracker, preventing energy shortage while receiving the maximum amount of energy available from the sun.

Keywords Dual axis solar tracker · LDR · Self-powered robot · Charging system · Mobile robot

M. H. S. Hameed · M. Z. Hasan · J. A. M. Jobran
Faculty of Engineering Technology, Universiti Malaysia Perlis (UniMAP), Arau, Malaysia
e-mail: [email protected]
M. Z. Hasan
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666, https://doi.org/10.1007/978-981-15-5281-6_82

1 Introduction

Today, robots are widely used for various applications such as surveillance and patrolling, security and freight transport [1–3]. Almost all mobile robots use a battery as the power source and are controlled wirelessly [4]. However, the battery has to be charged manually, and sometimes the robot runs out of battery far from its base and requires a person to retrieve and charge it. In some cases, advanced robots have an automatic system to identify a low-battery condition and head to the charging station when the battery is low [5]. The implementation of a self-recharging system on a mobile robot can therefore be a solution to this problem.
A self-recharging system can be realized by mounting a solar panel on the mobile robot. Solar energy is the most sustainable resource for generating electricity [6]. To boost solar panel efficiency, the panel needs to be perpendicular to the sun [7], and a dual-axis solar tracker is used to track the position of the sun [8]. A solar tracker is an apparatus that follows the sunlight so as to receive the highest electrical energy. Over time, solar tracker efficiency has been improved by studying different solar tracker types [9].
There are two main types of solar tracker based on their motion: single-axis and dual-axis trackers [10]. Previous experiments have shown that moving the solar panel so that it stays perpendicular to the sun yields a higher voltage input than a solar panel kept in a static position [10]. This indicates that a dual-axis solar tracker will be more productive than a static solar panel [11].

1.1 Previous Research

Many researchers around the globe have conducted experiments with various methods to increase the efficiency of solar panels. Previous studies are important when conducting a project, as they highlight the work previously done by others that relates to this project. Related work is also considered a primary reference and guideline that contributes to producing this project. A total of four related works have been chosen as references.
Mallick et al. (2014) developed a solar tracker using a robot and a reflector to increase the solar energy received. The proposed method analysed three different conditions: static at the ideal azimuth angle, a single-axis solar tracker robot, and a single-axis solar tracker robot with a reflector. The experimental results show that the solar energy received by the single-axis solar tracker robot without and with the reflector is 4% and 7% higher, respectively, than the static panel at the ideal azimuth angle [12].
Fathabadi (2016) proposed an offline dual-axis solar tracker driven by a system without feedback. The solar tracker is pointed directly at the sun by calculation: the sun's position is computed for the whole year and programmed into an Arduino to operate the system offline. The results show that, using this offline system, the solar panel receives 20%–50% more solar energy than a static panel, and the experimental tracking error is only 0.43% [13].
Sawant (2018) designed and analysed an automated dual-axis solar tracker based on light sensors. This method uses four LDR sensors to accurately locate the position of the sun and gives signals to H-bridge DC motors to move the solar panel accordingly. Compared with a single-axis tracker, the dual-axis tracker using light sensors received an additional 24.78% of solar energy [14].
Afarulrazi (2011) developed a solar tracker robot using two LDRs to control the horizontal axis with a servo motor. The vertical axis is controlled using a digital compass so that it always faces the same side if any external force is applied. The project is controlled by a PIC16F877A microcontroller. Compared with a fixed solar panel, this method is 19.26% more efficient [15].

2 Methodology

This project focuses on designing a dual-axis solar tracker for a mobile robot; the energy received by the solar panel is used to charge a 7.4 V lithium polymer battery. This battery also powers all actuators and the microcontroller used in this project. The method used to track the sunlight relies on four LDRs. The voltages read from the LDRs are inputs to the Arduino, which calculates their differences in the code and then sends signals to the servo motors to move accordingly. The voltage received from the solar panel, the battery percentage and the battery voltage are displayed on an OLED display. The mobile robot itself is controlled using a Bluetooth module (HC-05). For the supply, an auto buck-boost converter converts the 7.4 V battery supply to 5 V for the DC motors and servo motors. The method is divided into two parts, hardware and software development, which are described in detail in the next sections.

2.1 Project Flow

The sun tracker flowchart is shown in Fig. 1. The sun tracker tracks the sun only when the mobile robot is not moving. The flowchart of the mobile robot is shown in Fig. 2; the robot is controlled using a Bluetooth controller and moves only when there is an input signal from the controller.

Fig. 1 Solar tracker flowchart



Fig. 2 Mobile robot flowchart

2.2 Software Development

In this project, Proteus is used to simulate the solar tracker circuit. In this circuit, a 10 kΩ resistor is connected to each LDR to protect the Arduino. For the output, the servo motors are connected to the Arduino Uno. Both the inputs (LDRs) and outputs (servo motors) are powered from the battery. The mobile robot simulation shows that if button A is pressed, both the right and left motors rotate in the forward direction. If button B is pressed, the left motor moves forward and the right motor moves in reverse so that the mobile robot turns right. If button C is pressed, the right motor moves forward and the left motor moves in reverse so that the robot turns left. If button D is pressed, both motors move in reverse. This simulation is shown in Fig. 3, and a sketch of the corresponding command mapping is given below.
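To make the button-to-motion mapping above concrete, here is a minimal sketch of the decision logic. It is written in Python purely for illustration (the actual controller runs as Arduino code on the robot), and the function and variable names are illustrative rather than taken from the project.

```python
# Minimal sketch of the Bluetooth command-to-motor mapping described above.
# Shown in Python for clarity; the real firmware runs on an Arduino Uno.

def drive_command(button: str) -> tuple:
    """Return (left_motor, right_motor): +1 forward, -1 reverse, 0 stop."""
    mapping = {
        "A": (+1, +1),   # both motors forward
        "B": (+1, -1),   # left forward, right reverse -> turn right
        "C": (-1, +1),   # right forward, left reverse -> turn left
        "D": (-1, -1),   # both motors reverse
    }
    return mapping.get(button, (0, 0))  # stop on any other input

if __name__ == "__main__":
    for cmd in "ABCDX":
        print(cmd, drive_command(cmd))
```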

Fig. 3 Circuit simulation in Proteus

Bluetooth Interface. A Bluetooth controller interface for the mobile robot is created with MIT App Inventor (Fig. 4). The signal from this interface is read by the Arduino, which executes the corresponding code.

2.3 Hardware Development

Hardware selection is an important part of the hardware development process. Each part must be reviewed before being chosen for this project, and the component specifications must be explored and researched first. Components are selected according to their advantages and characteristics to satisfy the functionality of each part used.

Fig. 4 Bluetooth interface

Solar Tracker Circuit Construction. Before creating the real circuit on a breadboard, the wiring connections between the interfacing devices must be obtained. For this project, the wiring connections were first created and designed using the Proteus software, based on the research that had been conducted. The solar panel wiring design is shown in Fig. 5.
This circuit operates when an input signal from the LDRs is received. The LDRs are positioned together, separated from each other by barriers, as shown in Fig. 6. As programmed in the Arduino code, the X-axis servo motor moves anti-clockwise if only LDRs A and B detect sunlight, and rotates clockwise if only LDRs C and D detect sunlight. The Z-axis servo motor rotates anti-clockwise if LDRs A and C detect sunlight and clockwise if LDRs B and D detect sunlight. A sketch of this decision logic is given below.
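A minimal sketch of this four-LDR tracking logic is given below. It is written in Python for illustration only (the project implements it in Arduino code), and the deadband value, step size and function names are assumptions introduced here.

```python
# Minimal sketch of the dual-axis tracking decision described above, assuming
# four LDR readings A, B, C, D where a higher value means more light.

DEADBAND = 20  # assumed tolerance on raw readings to avoid servo jitter

def tracking_step(ldr_a, ldr_b, ldr_c, ldr_d, x_angle, z_angle, step=1):
    """Return updated (x_angle, z_angle) servo commands in degrees."""
    ab = ldr_a + ldr_b      # LDRs A and B on one side of the X-axis barrier
    cd = ldr_c + ldr_d      # LDRs C and D on the other side
    ac = ldr_a + ldr_c      # LDRs A and C on one side of the Z-axis barrier
    bd = ldr_b + ldr_d      # LDRs B and D on the other side

    if ab - cd > DEADBAND:          # A and B brighter -> X-axis anti-clockwise
        x_angle -= step
    elif cd - ab > DEADBAND:        # C and D brighter -> X-axis clockwise
        x_angle += step

    if ac - bd > DEADBAND:          # A and C brighter -> Z-axis anti-clockwise
        z_angle -= step
    elif bd - ac > DEADBAND:        # B and D brighter -> Z-axis clockwise
        z_angle += step

    # clamp to a typical servo range of 0-180 degrees
    return max(0, min(180, x_angle)), max(0, min(180, z_angle))
```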
Fig. 5 Solar tracker circuit design

Fig. 6 LDR position

Charging Circuit Construction. The solar panel output is connected to a buck-boost converter (Fig. 7) to boost the input voltage from the solar panel to 12 V. The output is then connected to a Lithium Polymer adapter (Fig. 8) to charge the battery. The nominal voltage of the solar panel is 5 V and the nominal voltage of the battery is 7.4 V, so whenever the solar panel receives enough sunlight, the charger is turned on. When the battery is fully charged and sunlight is still available, the whole mobile robot is powered by solar energy, conserving the battery. In addition, the solar panel input voltage and the battery percentage are displayed on the OLED display, as shown in Fig. 9.

Fig. 7 Buck-boost converter.

Fig. 8 Lithium Polymer adapter.



Fig. 9 OLED display

Fig. 10 Motor driver wiring



Mobile Robot Circuit Construction. A basic circuit is constructed first to study the wiring connection between the Arduino and the Bluetooth controller. Then, the wiring between the L298N motor driver, the DC motors and the Arduino microcontroller is constructed. This circuit is shown in Fig. 10.
Overall Design. The solar tracker robot is built from various materials. The mobile robot body is made of Perspex, while pine wood is used to build the solar tracker stand. The solar plate, solar plate holder and LDR position divider are made of PVC foam board. All the parts are painted before being assembled; Fig. 11 shows the parts before assembly.
The solar tracker and the mobile robot body are assembled separately before the whole body is put together. The solar tracker is then attached to the assembled mobile robot body, including the motors and tyres. Figure 12 shows the solar tracker assembly with the solar panel and LDRs attached, together with its dimensions. The solar plate is mounted on the servo motor attached to the X-axis of the solar tracker. Figure 13 shows the assembled mobile robot base with dimensions.
Figure 14 shows the whole mobile robot hardware assembly. The solar tracker is placed at the front of the mobile robot body to balance the weight of the robot and to leave space for the other components, such as the Arduino microcontroller and the battery.

Fig. 11 Robot parts



Fig. 12 Solar tracker with dimensions

Fig. 13 Assembled mobile robot base

Fig. 14 Mobile robot assembly

3 Result and Discussion

After the project was fully constructed, functionality testing was conducted using a torchlight as the sun. The project was then tested in the field (Fig. 15) to collect data on the solar panel output and battery charging.

Fig. 15 Project testing on field and voltage reading



3.1 Efficiency Analysis

To evaluate the performance and efficiency of the solar panel, six different times were selected and the readings taken on the same day were compared. The experiment was carried out at Uniciti ALAM, UniMAP on 10 November 2019, from 8 am until 6 pm. The solar panel was tested in two positions: pointing directly at the sun (tracking) and a static position.
Directly to Sun Position. The voltage received by the solar panel, as displayed on the OLED screen, is recorded in Table 1, and the corresponding line graph is shown in Fig. 16. After all the voltages were taken throughout the day, the average output voltage was calculated. The average output for the directly positioned solar panel is given in Eq. (1).

\[
\left( 6.58 + 6.70 + 6.83 + 6.38 + 6.21 + 5.36 \right) \mathrm{V_{dc}} \div 6 = 6.34\ \mathrm{V_{dc}} \qquad (1)
\]

Static Position (30°/Azimuth Angle). For the static position, an angle of 30° is selected because the azimuth angle for a solar panel is nearly 30° throughout the year in Malaysia. The solar panel is placed inclined at 30° facing south by coding this angle into the Arduino. The voltage received by the solar panel is recorded in Table 2, and the corresponding line graph is shown in Fig. 17. After all the voltages were taken throughout the day, the average output voltage was calculated. The average output for the static (30°) solar panel is given in Eq. (2).

\[
\left( 5.17 + 5.66 + 6.03 + 5.34 + 5.14 + 4.81 \right) \mathrm{V_{dc}} \div 6 = 5.35\ \mathrm{V_{dc}} \qquad (2)
\]

Comparison Between Directly Positioned and Static Solar Panel. After all the voltages were taken from 8.00 a.m. to 6.00 p.m. and recorded in the tables and graphs, the curves were compared to verify that a solar panel pointed directly at the sun receives a higher voltage input than a static solar panel placed at the year-round azimuth angle.

Table 1 Voltage reading for solar panel using solar tracker

Time (hour)   Voltage (Vdc)
8 am          6.58
10 am         6.70
12 pm         6.83
2 pm          6.38
4 pm          6.21
6 pm          5.36

Fig. 16 Line graph for voltage reading for solar panel using solar tracker

Table 2 Voltage reading for solar panel at static position (30° facing south)

Time (hour)   Voltage (Vdc)
8 am          5.17
10 am         5.66
12 pm         6.03
2 pm          5.34
4 pm          5.14
6 pm          4.81

Fig. 17 Line graph for voltage reading for azimuth angle facing south

Fig. 18 Voltage reading differences

Figure 18 shows the difference between the directly positioned solar panel and the static panel at the azimuth angle. The figure confirms that a solar panel pointed directly at the sun receives a higher voltage input than a static panel placed at the year-round azimuth angle. The efficiency of the solar panel is calculated below.

\[
(6.34 / 6.83) \times 100\% = 92.83\% \qquad (3)
\]

The average voltage from the obtained data is divided by the open-circuit voltage to obtain the efficiency. The efficiency of the panel using the solar tracker is 92.83% (Eq. 3).

\[
(5.35 / 6.83) \times 100\% = 78.33\% \qquad (4)
\]

The efficiency of the panel placed at the azimuth angle is 78.33% (Eq. 4).
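For reference, the averages and efficiencies of Eqs. (1)-(4) can be reproduced directly from the tabulated readings; the short script below does exactly that, taking the 6.83 Vdc peak reading as the reference value, as in Eq. (3). Small differences in the last decimal place come only from rounding the averages.

```python
# Reproducing the averages and efficiency figures of Eqs. (1)-(4)
# from the readings in Tables 1 and 2.

tracked = [6.58, 6.70, 6.83, 6.38, 6.21, 5.36]   # Table 1, dual-axis tracker
static_ = [5.17, 5.66, 6.03, 5.34, 5.14, 4.81]   # Table 2, fixed 30 deg tilt
reference = 6.83                                  # peak reading used in Eq. (3)

avg_tracked = sum(tracked) / len(tracked)   # ~6.343 Vdc (paper rounds to 6.34)
avg_static = sum(static_) / len(static_)    # ~5.358 Vdc (paper rounds to 5.35)

print(f"tracker efficiency: {100 * avg_tracked / reference:.2f}%")  # ~92.9% (paper: 92.83%)
print(f"static  efficiency: {100 * avg_static / reference:.2f}%")   # ~78.5% (paper: 78.33%)
```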

3.2 Battery Charging Analysis

This analysis discusses the time taken to charge the battery while it is in use. The test was conducted for four hours at Uniciti ALAM, UniMAP on 14 November 2019, from 12 pm until 4 pm. During this test, the battery percentage rose from 35% to 56%, as shown in Table 3. The highest increment of battery percentage was from 12.00 pm to 12.30 pm, at 5% per 30 min, while the lowest increment was from 3.00 pm to 4.00 pm, at only 1% per 30 min. This is because the output from the solar panel is highest when the sun is brightest, at 12.00 pm–12.30 pm, whereas less energy is received from the solar panel at 3.00 pm–4.00 pm.

Table 3 Battery charging percentage over time

Time (hour)   Voltage (Vdc)   Battery percentage
12.00 pm      7.68            35%
12.30 pm      7.72            40%
1.00 pm       7.74            43%
1.30 pm       7.78            47%
2.00 pm       7.80            50%
2.30 pm       7.82            52%
3.00 pm       7.83            54%
3.30 pm       7.84            55%
4.00 pm       7.85            56%

Fig. 19 Line graph for battery percentage while charging

From Fig. 19, the line graph shows that the battery percentage increases roughly proportionally with time. Based on the results, when the solar panel receives enough energy from the sun, it takes 30 min to increase the battery percentage by 5%. Theoretically, it would therefore take 10 h to charge the battery from 0% to 100% if enough energy is received from the sun.

4 Conclusion

To encapsulate, the sun tracker alone can be implemented in a solar plant to increase the solar energy received. Even though the cost of implementing this system in an existing solar plant is somewhat higher than installing new solar panels, the cost of land in this country is increasing year by year [16], so installing dual-axis solar trackers in an existing solar plant is more efficient than installing new solar panels. From the results above, the battery voltage increases when charged while the solar tracker mobile robot is in use, which proves that the input power from the solar panel is higher than the consumption of the solar tracker mobile robot. Furthermore, the development of a sun tracker attached to a mobile robot as a charger for the robot system can be utilized in many fields, such as surveillance robots that can operate for a long time and over long distances without losing power. Moreover, it can also be implemented in security surveying, complementing existing human security guards.

References

1. Park H, Uhm TY, Bae GD, Choi YH (2018) Stability evaluation of outdoor unmanned
security robot in terrain information. In: 18th international conference on control, automation
and systems (ICCAS 2018), pp 955–957
2. Meghana S, Nikhil TV, Murali R, Sanjana S, Vidhya R, Mohammed KJ (2018) Design and
implementation of surveillance robot for outdoor security. In: 2017 2nd IEEE international
conference on recent trends in electronics information & communication technology
(RTEICT). IEEE, India, pp 1679–1682
3. Gwynne VDK, Ron VD (2018) Robotisation of urban freight transport. In: Bijdragen
Vervoerslogistieke Werkdagen 2018. University Press Zelzate, Belgium, pp 1–11
4. Shnaps I, Rimon E (2016) Online coverage of planar environments by a battery powered
autonomous mobile robot. IEEE Trans Autom Sci Eng 13(2):425–436
5. Zhang Y, Wei H, Yang B, Jiang C (2018) Sambot II: a self-assembly modular swarm robot.
In: AIP conference proceedings 1995. AIP Publishing, China, pp 1–11
6. Popovski V, Ackovska N (2013) A robotic system powered by solar energy. In: 10th
conference for informatics information technology (CIIT 2013). CIIT, Macedonia, pp 85–88
7. Yao Y, Hu Y, Gao S, Yang G, Du J (2014) A multipurpose dual-axis solar tracker with two
tracking strategies. Renew Energy 72:88–98
8. Wu J, Chen X, Wang L (2016) Design and dynamics of a novel solar tracker with parallel
mechanism. IEEE/ASME Trans Mechatron 21(1):88–97
9. Mitrofanov SV, Baykasenov DK, Suleev MA (2018) Simulation model of autonomous solar
power plant with dual-axis solar tracker. In: 2018 international ural conference on green
energy (UralCon). IEEE, Russia, pp 90–96
10. Hoffmann FM, Molz RF, Kothe JV, Nara EOB, Tedesco LPC (2018) Monthly profile analysis
based on a two-axis solar tracker proposal for photovoltaic panel. Renew Energy 115:750–
759
11. Vieira RG, Guerra FKOMV, Vale MRBG, Araújo MM (2016) Comparative performance
analysis between static solar panels and single-axis tracking system on a hot climate region
near to the equator. Renew Sustain Energy Rev 64:672–681
12. Mallick TC, Munna MS, Barua B, Rahman KM (2014) A design & implementation of a
single axis solar tracker with diffuse reflector. In: 2014 9th international forum on strategic
technology (IFOST). IEEE, Bangladesh, pp 289–293
13. Fathabadi H (2016) Novel high efficient offline sensorless dual-axis solar tracker for using in
photovoltaic systems and solar concentrators. Renew Energy 95:485–494
14. Sawant A (2018) Design and analysis of automated dual axis solar tracker based on light
sensors. In: 2018 2nd international conference I-SMAC (IoT soc. mobile, analysis cloud).
IEEE, India, pp 454–459
15. Afarulrazi AB, Utomo WM, Liew KL, Zarafi M (2011) Solar tracker robot using
microcontroller. In: 2011 international conference on business, engineering and industrial
applications. IEEE, Malaysia, pp 47–50
16. Szabó L (2017) The history of using solar energy. In: The 7th international conference on
modern power systems (MPS 2017), pp 1–8
Modified Particle Swarm Optimization
for Robust Anti-swing Gantry Crane
Controller Tuning

Mahmud Iwan Solihin, Wei Hong Lim, Sew Sun Tiang,


and Chun Kit Ang

Abstract This paper discusses an application of metaheuristic optimization algorithms for a single-objective constrained optimization in a robust feedback controller design for anti-swing gantry crane control. A set of robust feedback controller gains is optimized based on the plant's linear model with structured parametric uncertainty, i.e. the gantry crane system. A wedge region is assigned as the optimization constraint to specify the desired closed-loop pole locations, which are directly related to the desired time-domain response. The simulation results of the proposed robust control design using multiswarm particle swarm optimization without velocity (MPSOWV) are presented. The control performance of the feedback system optimized with MPSOWV is compared with that of PSO (particle swarm optimization), DE (differential evolution) and TLBO-PSO (improved teaching-learning-based optimization with the social character of particle swarm optimization). The simulation studies show that the controller optimized by the proposed MPSOWV demonstrates the most robust performance compared with the other peer algorithms used in this paper, being able to consistently produce the largest stability radius $r_c$, i.e. $r_c = 3.4325$ on average over 20 runs.

Keywords Modified particle swarm optimization · Anti-swing gantry crane · Robust control · Differential evolution · Teaching learning-based optimization

1 Introduction

Many real-world situations basically involve optimization. Optimization literally


means finding the best/feasible solution of the variable(s) according to the prede-
fined objective(s) of the optimization problems and the constraint(s). It can be found
everywhere ranging from engineering design, control systems, economics, job
scheduling to internet routing [1, 2]. Some application examples in control

M. I. Solihin · W. H. Lim · S. S. Tiang · C. K. Ang
Faculty of Engineering, UCSI University, 56000 Kuala Lumpur, Malaysia
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666, https://doi.org/10.1007/978-981-15-5281-6_83

engineering can be found in [3–5]. The classification of optimization problems is so


wide such as single- and multi-objectives, constrained and unconstrained, contin-
uous and discrete, static and dynamic, etc. The detailed information on this topic
can be found in many textbooks [6–8].
Since the optimization problems are various, the methods or algorithms to solve
the problems ought to be various as well. They can be classified into several
categories, depending on the focus. Algorithms with stochastic (randomness)
components were often referred to as heuristic in the past, though the recent lit-
eratures tend to refer them as meta-heuristics. Meta-heuristic algorithm is regarded
as further development of the heuristic algorithms as the word ‘meta’ means beyond
or higher level [1]. In literatures, the terms heuristics and meta-heuristics may be
used interchangeably since there is no clear cut between these two definitions. The
main characteristic of these algorithms is the randomization component. Some
famous meta-heuristic algorithms are: genetic algorithm (GA), simulated annealing
(SA), taboo search (TS), ant colony optimization (ACO), differential evolution
(DE), particle swarm optimization (PSO), bacterial foraging optimization (BFO),
harmony search (HS), bee algorithms (BAs), firefly algorithm (FA), cuckoo search
(CS), bat-inspired algorithm (BIA) and recently teaching learning-based opti-
mization (TLBO).
PSO and DE are among the most significantly influential in this area. Their
modified versions (variants) are also proposed and applied to solve global opti-
mization problems [9–11]. In robust control system applications, PSO, DE and their
variants have been proposed to solve various problems. The author has proposed to
use PSO to tune feedback controller for anti-swing gantry crane based on state
space model of the plant with structured uncertainty such that the closed-loop
system would have maximum stability radius [12]. For the similar problem, DE has
been used to optimize the feedback controller gains [13] and also used to tune
robust PID controller gain [14, 15]. The overall advantage of Differential Evolution
(DE) and Particle Swarm Optimization (PSO) as compared with other considered
approaches has been discussed [16].
Anti-swing control design for crane systems has attracted considerable attention
due to its under-actuation property of the payload angular swing. Crane systems
have many applications in industries, workshops, constructions and shipyards. In an
automatic gantry crane, a hanging payload is transferred by a cart as fast as possible
without causing any excessive swing during motion. Therefore, anti-swing control
is well-known term for automatic gantry crane control. The robust anti-swing
control for crane system is therefore proposed to ensure the payload angular sup-
pression is maintained despite the changes of parameters variations such as payload
mass/weight and cable length hanging the payload.
Some efforts of anti-swing control using PID control optimized by PSO and its
variants has been proposed [17, 18]. These kind of studies can be considered as
applications of artificial intelligence and optimization for robotics, control engi-
neering and other applications as found in many other works [19–24].

In these studies, time-domain error functions are used as the fitness function; therefore, the robustness property of the designed controller is not addressed. In addition, it is not easy to choose the most suitable fitness function for a given system. Standard PSO also tends to suffer from premature convergence due to the limited search operators and directional information used to guide the search process.
On the other hand, DE is known to be more robust and to produce more consistent optimization results than PSO. However, DE is normally slower than PSO in terms of computation time. Applications of DE to anti-swing gantry crane controller tuning have been proposed [13, 15]. Although robustness criteria were used as the cost function in the optimization, a performance comparison with other optimization approaches has not been made.
A PSO variant known as particle swarm optimization without velocity
(PSOWV) was proposed in [9] by discarding the velocity component of particles
and updating the particle’s position based on the linear combination of its personal
and global best position. While this enhances the algorithm's convergence speed, the robustness of PSOWV in addressing complex optimization problems such as anti-swing crane control remains limited due to the presence of a single search operator with limited exploration and exploitation strengths. An enhanced version of PSOWV, namely multiswarm particle swarm optimization without velocity (MPSOWV), is hence proposed in this paper as a metaheuristic constrained optimization approach to tackle the robust state feedback controller design. To deal with the system uncertainty, we employ the complex stability radius as a tool for measuring system robustness. In addition, the control performance is automatically specified by assigning a region for closed-loop pole placement, which is incorporated in the optimization as the constraint. In other words, the controller design technique is a constrained optimization that seeks a set of feedback controller gains such that the closed-loop system has the maximum complex stability radius.
Simulation study will be carried out and the effectiveness of the proposed
technique of robust controller gain tuning will be compared based on different
algorithms. Those algorithms used in this study are standard PSO, standard DE,
improved teaching-learning-based optimization with social character of particle
swarm optimization (TLBO-PSO) [25] and multiswarm particle swarm optimiza-
tion without velocity (MPSOWV) [2].
The rest of the paper is organized as follows. Section 2 presents problem for-
mulation about feedback control and stability radius. Section 3 discusses the pro-
posed control design process including MPSOWV mechanism, constraint handling
and stopping criterion. Section 4 presents the simulation results and analysis from
the optimization point of view and robustness test. The performance comparison
using the different optimization algorithms is also shown here. Finally, conclusions
will be presented in Sect. 5.

2 Problem Formulation

2.1 State Feedback Control Design

Consider a plant model of a linear time-invariant continuous-time system in state-space form:

\[
\begin{aligned}
\dot{x}(t) &= A x(t) + B u(t) \\
y(t) &= C x(t) + D u(t)
\end{aligned}
\qquad (1)
\]

where $x \in \mathbb{R}^{L}$, $u \in \mathbb{R}^{M}$ and $y \in \mathbb{R}^{P}$ are the state vector, control input and output vector, respectively. It is assumed that the system given in (1) is completely state controllable and that all state variables are available for feedback. One can use a state feedback controller with a feed-forward integral gain ($k_{int}$) as shown in Fig. 1. The controller gains, $K = [k_1, k_2, \ldots, k_n, k_{int}]$, can be computed using various methods, such as the optimal control method that minimizes the performance index

\[
J = \int_{0}^{\infty} \left( z^{T} Q z + u^{T} R u \right) dt \qquad (2)
\]

via the Riccati equation written as

\[
\hat{A}^{T} P + P \hat{A} - P \hat{B} R^{-1} \hat{B}^{T} P + Q = 0 \qquad (3)
\]

where

\[
z = \begin{bmatrix} x(t) \\ \xi(t) \end{bmatrix}, \quad
\hat{A} = \begin{bmatrix} A & 0 \\ C & 0 \end{bmatrix}, \quad
\hat{B} = \begin{bmatrix} B \\ 0 \end{bmatrix}. \qquad (4)
\]

The vector of controller gains is

\[
K = R^{-1} \hat{B}^{T} P. \qquad (5)
\]

Fig. 1 State feedback controller with feed-forward integral gain



The closed-loop poles of the system (Fig. 1) are the eigenvalues $\lambda$ of the closed-loop matrix, given as

\[
\lambda = \mathrm{eig}\left( \hat{A} - \hat{B} K \right) \qquad (6)
\]

However, the control performance depends entirely on the positive-definite matrices Q and R, which are chosen by trial and error. This method also assumes that there is no plant uncertainty.
In this study, a constrained optimization using metaheuristic algorithms is adopted to search for a set of robust controller gains, so that the plant uncertainty is handled automatically through the stability radius discussed in the next section. In addition, a region for the closed-loop poles is incorporated as an optimization constraint to allow the designer to define the desired time-domain control performance. A numerical sketch of the baseline LQR computation is given below.
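As an illustration of Eqs. (2)-(6), the baseline LQR gains can be computed numerically. The sketch below is a minimal example using SciPy's continuous-time algebraic Riccati solver on the augmented pair (Â, B̂) of Eq. (4); the weighting matrices Q and R are placeholders, not the values used in this paper.

```python
# Minimal sketch of the baseline LQR computation in Eqs. (2)-(6).
# The augmented matrices follow Eq. (4); Q and R are placeholder weights.
import numpy as np
from scipy.linalg import solve_continuous_are

def lqr_with_integral(A, B, C_int, Q, R):
    """Return gains K = [k1..kn, kint] and the closed-loop eigenvalues."""
    n, m = B.shape
    p = C_int.shape[0]
    A_hat = np.block([[A, np.zeros((n, p))],
                      [C_int, np.zeros((p, p))]])   # Eq. (4)
    B_hat = np.vstack([B, np.zeros((p, m))])
    P = solve_continuous_are(A_hat, B_hat, Q, R)    # solves Riccati Eq. (3)
    K = np.linalg.solve(R, B_hat.T @ P)             # Eq. (5): K = R^-1 B_hat^T P
    poles = np.linalg.eigvals(A_hat - B_hat @ K)    # Eq. (6)
    return K, poles
```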

2.2 Stability Radius for Robustness

A tool for measuring system robustness, called the stability radius [26], is presented here. It is the maximum distance to instability; equivalently, a system with a larger stability radius can tolerate larger perturbations. In general, the stability radius can be classified into two types: the complex stability radius and the real stability radius. Compared with the real stability radius, the complex stability radius can handle a wider class of perturbations, including nonlinear, linear time-varying, nonlinear time-varying and nonlinear time-varying-and-dynamic perturbations [27]. The complex stability radius is the quantity maximized in the optimization.
The definition of the complex stability radius is given here. Let $\mathbb{C}$ denote the set of complex numbers, $\mathbb{C}^{-} = \{ z \in \mathbb{C} \mid \mathrm{Re}(z) < 0 \}$, and $\mathbb{C}^{+} = \mathbb{C} \setminus \mathbb{C}^{-}$ the closed right half-plane. Consider a nominal system of the form

\[
\dot{x}(t) = A x(t) \qquad (7)
\]

where A is assumed to be stable. The perturbed open-loop system is assumed to be

\[
\dot{x}(t) = \left( A + E \Delta(t) H \right) x(t) \qquad (8)
\]

where $\Delta(\cdot)$ is a bounded time-varying linear perturbation, and E and H are scaling matrices that define the structure of the perturbations. The perturbation matrix itself is unknown. The stability radius of (8) is defined as the smallest norm of $\Delta$ that destabilizes (7), for the given perturbation structure $(E, H)$.

For the controlled perturbed system in the form (8), let

\[
G(s) = H (sI - A)^{-1} E \qquad (9)
\]

be the "transfer matrix" associated with $(A, E, H)$; then the complex stability radius is given by the following definition.

The complex stability radius $r_c$:

\[
r_c(A, E, H; \mathbb{C}^{+}) = \left( \max_{s \in \partial \mathbb{C}^{+}} \| G(s) \| \right)^{-1}, \qquad (10)
\]

where $\partial \mathbb{C}^{+}$ is the boundary of $\mathbb{C}^{+}$. In other words, the maximum $r_c$ can be achieved by minimizing the $H_{\infty}$ norm of the "transfer matrix" G [27]. A numerical sketch of this computation is given below.
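As a rough numerical illustration of Eq. (10), the sketch below estimates r_c for a stable matrix A by approximating the H-infinity norm of G(s) = H(sI − A)⁻¹E with a frequency sweep along the imaginary axis. The frequency grid is an assumption; a dedicated H-infinity norm routine would give a tighter bound.

```python
# Minimal sketch of the complex stability radius in Eq. (10):
# r_c = 1 / ||G||_inf with G(s) = H (sI - A)^-1 E, A stable.
import numpy as np

def complex_stability_radius(A, E, H, w_grid=None):
    if w_grid is None:
        w_grid = np.logspace(-3, 3, 2000)   # assumed frequency range, rad/s
    n = A.shape[0]
    hinf = 0.0
    for w in w_grid:
        G = H @ np.linalg.solve(1j * w * np.eye(n) - A, E)
        # H-infinity norm approximated by the largest singular value over the grid
        hinf = max(hinf, np.linalg.svd(G, compute_uv=False)[0])
    return 1.0 / hinf
```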

2.3 Gantry Crane System Modelling

Anti-swing gantry crane control is an interesting control problem from our point of view, since the plant is under-actuated, and it has attracted many researchers. In this paper, a model of a lab-scale gantry crane prototype is used. The crane performs planar point-to-point motion, see Fig. 2. The maximum travel distance of the cart/trolley is around 1 m and the length of the cable hanging the payload varies between about 2 and 75 cm. The modelling technique, parameter identification and detailed model derivation are not presented in this paper, to emphasize instead the proposed method and its application.
In short, a simplified linear model of the plant, including the actuator dynamics, has been obtained by experimental identification [28]. The model of the system is given below, with the plant parameters listed in Table 1.

Fig. 2 A schematic diagram for the 2D gantry crane system

Table 1 The parameters for the crane model

Parameters                               Symbol   Value
Identified speed constant (m s⁻¹)        K_A      0.26
Identified time constant (s)             τ        0.09
Gravitational acceleration (m s⁻²)       g        9.81
Payload cable length, nominal (m)        l        0.5

The state vector is defined by $z = \begin{bmatrix} x & \theta & \dot{x} & \dot{\theta} \end{bmatrix}^{T}$, where x denotes the horizontal trolley position and θ denotes the angular swing.

\[
\dot{z}(t) =
\begin{bmatrix}
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & a_{33} & 0 \\
0 & a_{42} & a_{43} & 0
\end{bmatrix} z(t) +
\begin{bmatrix}
0 \\ 0 \\ b_{31} \\ b_{41}
\end{bmatrix} V_m
\qquad (11)
\]

\[
y(t) =
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0
\end{bmatrix} z(t)
\]

where

\[
a_{33} = -1/\tau, \quad a_{42} = -g/l, \quad a_{43} = 1/(l\tau), \quad
b_{31} = K_A/\tau, \quad b_{41} = -K_A/(l\tau)
\]
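For reference, the sketch below assembles the nominal state-space matrices of Eq. (11) from the Table 1 parameters. The signs follow the stable trolley dynamics and hanging-pendulum linearization described above; the variable names are illustrative.

```python
# Minimal sketch of the nominal crane model in Eq. (11), using Table 1 values.
# State z = [x, theta, x_dot, theta_dot]; input is the motor voltage Vm.
import numpy as np

K_A, tau, g, l = 0.26, 0.09, 9.81, 0.5   # Table 1 (nominal cable length)

a33, a42, a43 = -1.0 / tau, -g / l, 1.0 / (l * tau)
b31, b41 = K_A / tau, -K_A / (l * tau)

A = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, a33, 0.0],
              [0.0, a42, a43, 0.0]])
B = np.array([[0.0], [0.0], [b31], [b41]])
C = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
```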

3 Robust Control Design Using Multiswarm Particle


Swarm Optimization Without Velocity (MPSOWV)

The proposed multiswarm particle swarm optimization without velocity


(MPSOWV) used to solve the robust control design problem is presented in this
section. The multiple search operators and diversity maintenance schemes incor-
porated into both current and memory swarm evolutions of MPSOWV are first
described. Next, the constrained optimization problem of robust feedback control
and the constraint handling method employed are elaborated. The overall opti-
mization framework of MPSOWV is summarized at the end of this section.

3.1 Current Swarm Evolution of MPSOWV

A multiswarm scheme is proposed in the current swarm evolution to enhance population diversity by partitioning the MPSOWV population into multiple subswarms, enabling different directional information to be used to guide the search of the members of different subswarms. Suppose that the main population of MPSOWV with population size N is represented as $U = [X_1, \ldots, X_i, \ldots, X_N]$. The main population U is partitioned into Z subswarms, where $U_{sub,z} = [X_1, \ldots, X_i, \ldots, X_{N^z}]$ refers to the z-th subswarm with subswarm size $N^z$ for z = 1, …, Z and $N^z = \lfloor N/Z \rfloor$. To produce each z-th subswarm $U_{sub,z}$, a reference point $K_z$ is randomly generated and the i-th particle whose personal best position $P_{best,i}$ has the closest Euclidean distance to $K_z$ is obtained. The remaining $N^z - 1$ particles whose personal best positions are closest (in Euclidean distance) to $P_{best,i}$ are identified to create $U_{sub,z}$, and these particles are then eliminated from U. The multiswarm scheme is repeated until all particles are partitioned into Z subswarms.
Z subswarms.
Denote $P_{best,i}^{z}$ and $\Delta P_{i}^{z}$ as the personal best position and the difference vector containing unique directional information used to guide the search of the i-th particle in the z-th subswarm, while $G_{best}$ refers to the global best position. A modified search operator is then incorporated into the current swarm evolution of MPSOWV to update the new position $X_{i}^{z}$ of each i-th particle in the z-th subswarm as follows:

\[
X_{i}^{z} = c_1 r_1 P_{best,i}^{z} + c_2 r_2 G_{best} + c_3 r_3 \Delta P_{i}^{z} \qquad (12)
\]

where $c_1$, $c_2$ and $c_3$ are acceleration coefficients and $r_1$, $r_2$ and $r_3$ are uniformly distributed random numbers between 0 and 1. For each z-th subswarm, different exemplars are considered to generate the difference vector $\Delta P_{i}^{z}$, i.e.,

\[
\Delta P_{i}^{z} =
\begin{cases}
P_{best,a}^{z} - P_{best,b}^{z}, & \text{for } z = 1 \\
P_{SubBest,z-1} - P_{best,c}^{z}, & \text{for } z = 2, \ldots, Z
\end{cases}
\qquad (13)
\]

where $P_{best,a}^{z}$, $P_{best,b}^{z}$ and $P_{best,c}^{z}$ are the personal best positions of randomly selected particles in the z-th subswarm with $a \neq b \neq c \neq i$, and $P_{SubBest,z-1}$ is the personal best position of the best member of the previous (z − 1)-th subswarm. As shown in (13), the difference vector $\Delta P_{i}^{z}$ plays a crucial role in enhancing the diversity of each z-th subswarm by facilitating information exchange within and between the subswarms. The fitness value of the updated $X_{i}^{z}$ is evaluated and then compared with those of $P_{best,i}^{z}$ and $G_{best}$; the latter two solutions are updated if the former is superior, otherwise the new position is discarded. A sketch of this update step is given below.

3.2 Memory Swarm Evolution of MPSOWV

Once the current swarm evolution of MPSOWV is completed, all subswarms $U_{sub,z} = [X_1, \ldots, X_i, \ldots, X_{N^z}]$ for z = 1, …, Z are merged to form the main population $U = [X_1, \ldots, X_i, \ldots, X_N]$, where $U = U_{sub,1} \cup \ldots \cup U_{sub,z} \cup \ldots \cup U_{sub,Z}$. A memory swarm evolution is proposed as an alternate learning phase to enhance the search robustness of MPSOWV. Two modified search operators are introduced to evolve the personal best positions of MPSOWV further by leveraging the useful information of other swarm members. During the memory swarm evolution, a random e-th particle is selected from the population and its personal best fitness $f(P_{best,e})$ is compared with that of the i-th particle, i.e., $f(P_{best,i})$.
A differential evolution (DE) based search operator that aims to enhance the exploration search is introduced in the memory swarm evolution of MPSOWV to further evolve $P_{best,i}$ of the i-th particle if $f(P_{best,i})$ is superior to $f(P_{best,e})$. Denote f, g and h as the indices of particles randomly selected from the population, with associated personal best positions $P_{best,f}$, $P_{best,g}$ and $P_{best,h}$, respectively, where $f, g, h \in [1, N]$ and $f \neq g \neq h \neq i$. For each i-th particle, the d-th dimension of its new personal best position, denoted as $P_{best,i,d}^{new}$, is then determined as:

\[
P_{best,i,d}^{new} =
\begin{cases}
P_{best,f,d} + \gamma \left( P_{best,g,d} - P_{best,h,d} \right), & \text{if } r_a > 0.5 \\
P_{best,i,d}, & \text{otherwise}
\end{cases}
\quad \text{if } P_{best,i} \text{ is better than } P_{best,e}
\qquad (14)
\]

where ra 2 ½0; 1 and c 2 ½1;1 are two uniformly distributed random numbers.
If f Pbest;i is inferior than f Pbest;e , a search operator motivated by the teaching
learning based optimization (TLBO) is used in the memory swarm evolution to
promote the exploitation search of MPSOWV. Given the information carried by the
personal best positions of two randomly selected f-th and g-th particles, the new
personal best position of each i-th particle, i.e., Pnew
best;i , are then updated as:

   
best;i ¼ Pbest;i þ rb Pbest;f  Pbest;i þ rc Pbest;g  Pbest;h ; if Pbest;i is worse than Pbest;e
Pnew
ð15Þ

where rb ; rc 2 ½0; 1 are two uniformly distributed random numbers. The fitness of
updated Pnew
best;i is evaluated and compared with those of Pbest;i and Gbest . The latter
two solutions are updated if the former one is superior. Else, the former solution is
discarded.
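A compact sketch of the two memory-swarm operators is given below. It assumes minimization (smaller fitness is better), draws γ once per particle and applies the r_a test per dimension; these are interpretive choices and placeholder names, not details confirmed by the paper.

```python
import numpy as np

def memory_swarm_update(P_best, fitness, i, rng=np.random.default_rng()):
    """Propose a new personal best for particle i following Eqs. (14)-(15).
    `fitness` holds f(P_best) for every particle; smaller is better here."""
    N, D = P_best.shape
    e = rng.integers(N)                                   # random comparison particle
    f, g, h = rng.choice([j for j in range(N) if j != i], size=3, replace=False)
    if fitness[i] <= fitness[e]:
        # P_best,i is better than P_best,e: DE-like exploration operator, Eq. (14)
        gamma = rng.uniform(-1.0, 1.0)
        mask = rng.random(D) > 0.5                        # dimension-wise r_a test
        P_new = np.where(mask, P_best[f] + gamma * (P_best[g] - P_best[h]), P_best[i])
    else:
        # P_best,i is worse than P_best,e: TLBO-like exploitation operator, Eq. (15)
        rb, rc = rng.random(2)
        P_new = P_best[i] + rb * (P_best[f] - P_best[i]) + rc * (P_best[g] - P_best[h])
    return P_new
```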

Fig. 3 A wedge region in the complex plane for closed-loop pole placement

3.3 Constrained Optimization of Robust Feedback Control

Assume that the optimal controller gain vector $K$ consists of $n$ state feedback gains and one integral feedback gain, denoted as $k = [k_1, \ldots, k_n]$ and $k_{int}$, respectively. Let $S$ and $F$ be the problem search space and the feasible region, respectively, where $F \subseteq S$. The desired transient response characteristic of the control system is represented by the damping ratio $\zeta$ and settling time $t_s$. As shown in Fig. 3, the wedge region $w$ in the complex plane is specified using two parameters: $\theta = \cos^{-1}(\zeta)$ and the transient stability margin $q$.

Let $X \in S$ be the solution vector of MPSOWV associated with the controller gains $K = [k_1, \ldots, k_n, k_{int}]$, which produces the closed-loop eigenvalues $\lambda_v$, where $v = 1, \ldots, n+1$. The optimal robust controller gains $K$ aim to maximize the complex stability radius $r_c(X)$ while fulfilling all constraints. For convenience, a negative sign is assigned to $r_c(X)$ when formulating the objective function $f(X)$:

$$f(X) = -r_c(X) \qquad (16)$$

A constraint function $u(X)$ is used to check the feasibility of solution $X$ based on the locations of all eigenvalues $\lambda_v(X)$ with respect to the wedge region $w$. An operator $d_v(\cdot)$ is used in Fig. 3 to measure the outer distance of each $\lambda_v(X)$ from $w$. Any $\lambda_v(X)$ that lies within $w$ is assigned $d_v(\lambda_v(X)) = 0$. The constraint function $u(\cdot)$ associated with a solution $X \equiv K = [k_1, \ldots, k_n, k_{int}]$ is then formulated as:

$$u(X) = \sum_{v=1}^{n+1} \max \left( 0, \, d_v(\lambda_v(X)) \right) \qquad (17)$$

From (17), if all $\lambda_v(X)$ lie within $w$ for $v = 1, \ldots, n+1$, the total constraint violation becomes $u(X) = 0$, implying that all constraints are satisfied. The robust feedback controller design is formulated as a constrained minimization problem as follows:

$$\min f(X) \quad \text{s.t.} \quad u(X) = 0 \qquad (18)$$
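The wedge-region constraint can be made concrete with a small sketch that sums a simple outer-distance penalty over the closed-loop eigenvalues. The closed-loop construction A − BK, the particular distance measure d_v and all names below are illustrative assumptions; the paper does not specify its exact distance operator.

```python
import numpy as np

def wedge_violation(K, A, B, zeta=0.7, q=2.0):
    """Constraint u(X) of Eq. (17): total penalty for closed-loop eigenvalues
    lying outside the wedge Re(s) <= -q, |Im(s)| <= -Re(s)*tan(theta),
    with theta = arccos(zeta). A simple additive distance is used here."""
    theta = np.arccos(zeta)
    Acl = A - B @ K.reshape(1, -1)          # illustrative state-feedback closed loop
    total = 0.0
    for lam in np.linalg.eigvals(Acl):
        d_margin = max(0.0, lam.real + q)                               # right of Re = -q
        d_angle = max(0.0, abs(lam.imag) + lam.real * np.tan(theta))    # outside the cone
        total += d_margin + d_angle
    return total

# toy usage with a random 5-state system (not the gantry crane model)
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5)); B = rng.normal(size=(5, 1)); K = rng.normal(size=5)
print(wedge_violation(K, A, B))   # 0.0 would mean all poles lie inside the wedge
```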

3.4 Dynamic-Objective Method as Constraint Handling

The dynamic-objective method (DOM) [29] is used as the constraint handling method of MPSOWV. The original constrained minimization problem of (18) is reformulated as an unconstrained bi-objective minimization problem $F(X)$. The first objective $u(X)$ aims to drive the solution into the feasible region, while the second objective $f(X)$ emphasizes minimizing the negative value of the complex stability radius, i.e.,

$$\min F(X) = \left( u(X), \, f(X) \right) \qquad (19)$$

Essentially, DOM allows each MPSOWV particle to adjust its objective based on the feasibility of its solution as determined by (17). If the particle is infeasible, i.e., $u(X) > 0$, the evaluation of $f(X)$ is skipped to reduce computational cost. A feasible particle begins to minimize $f(X)$ once $u(X) = 0$. If the particle flies out of the feasible region while minimizing $f(X)$, it switches back to minimizing $u(X)$. Mathematically, DOM can be represented as:

$$F(X) = \begin{cases} f(X), & \text{if } u(X) = 0 \\ u(X), & \text{otherwise} \end{cases} \qquad (20)$$
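A minimal sketch of the DOM switch in (20) is shown below; passing f as a deferred callable mirrors the idea of skipping the f(X) evaluation for infeasible particles. The helper names (including complex_stability_radius) are hypothetical.

```python
def dom_objective(u_val, f_eval):
    """Dynamic-objective method, Eq. (20): infeasible particles minimise the
    constraint violation u(X); the costly objective f(X) = -r_c(X) is only
    evaluated once u(X) = 0. `f_eval` is a zero-argument callable."""
    if u_val > 0.0:
        return u_val          # still outside the feasible region
    return f_eval()           # feasible: minimise f(X) = -r_c(X)

# illustrative combination with the wedge constraint sketched earlier
# (complex_stability_radius is a hypothetical helper, not a named routine from the paper):
# F = dom_objective(wedge_violation(K, A, B), lambda: -complex_stability_radius(K, A, B))
```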

3.5 Overall Framework of MPSOWV

The overall framework of MPSOWV in solving the robust feedback controller


design problem is summarized in Fig. 4. Initially, all MPSOWV particles are
randomly generated. The updated current and personal best positions of each par-
ticle are obtained via the current swarm evolution and memory swarm evolution,
respectively. The search process is repeated until the termination criteria are met.
Notable differences can be observed between the proposed MPSOWV and the original PSO. First, each PSO particle consists of velocity and position components, while MPSOWV discards the velocity component to enhance its convergence speed. Second, the learning process is applied only to the current velocity and current position of each PSO particle, while two layers of learning are applied to the current and personal best positions of each MPSOWV particle. Finally, a fully connected topology is used in PSO, while the multiswarm concept is used to divide the MPSOWV particles into different subpopulations.

Algorithm: MPSOWV
1: Randomly initialize the current positions of all particles;
2: Evaluate the fitness value F(X) of each particle using Eqs. (16) to (20);
3: Update the personal best positions of all particles and the global best position of the population;
4: while the termination criteria are not met do
/*Current Swarm Evolution*/
5: Divide the population into Z subswarms;
6: for z = 1 to Z do
7: Update the current positions of all subswarm members using Eqs. (12) and (13);
8: Evaluate the F(X) of all subswarm members using Eqs. (16) to (20);
9: Update the personal best positions and the global best position;
10: end for
/*Memory Swarm Evolution*/
11: Merge all subswarms to create a main swarm;
12: for i = 1 to N do
13: Update the personal best position of the i-th particle using Eqs. (14) or (15);
14: Evaluate the F(X) of all new personal best positions using Eqs. (16) to (20);
15: Update the personal best positions and the global best position;
16: end for
17: end while

Fig. 4 The pseudocode used to describe the complete framework of MPSOWV

4 Results and Analysis

4.1 Simulation Settings

The performance of MPSOWV is compared with three peer algorithms, namely particle swarm optimization (PSO) [30], differential evolution (DE) [31] and the improved teaching-learning-based optimization with the social character of particle swarm optimization (TLBO-PSO) [25]. The parameter settings of all algorithms are summarized in Table 2. The population size used by all algorithms to solve the gantry crane control problem is set as N = 50. All algorithms are simulated independently 20 times using MATLAB 2017a on a personal computer with an Intel® Core i7-7500 CPU @ 2.70 GHz to reduce the random discrepancy of stochastic optimization.

Table 2 The parameter settings of all compared algorithms

Algorithm    Parameter settings
PSO          ω: 0.9 → 0.4; c1 = c2 = 2.05
DE           F ∈ [0, 2]; CR = 0.1
TLBO-PSO     Tf ∈ {1, 2}; PMUT = 0.6; ε = 0.01; α = 0.05
MPSOWV       K = 5; c1 = c2 = c3 = 4.1/3

4.2 Performance Evaluation of MPSOWV

The solutions of feedback controller gains $X \equiv K = [k_1, k_2, k_3, k_4, k_{int}]$ obtained by the proposed MPSOWV in the 20 independent runs are presented in Table 3 together with the associated objective function values $f(X) = -r_c(X)$. A smaller $f(X)$ is more desirable because it implies a larger complex stability radius $r_c(X)$, and hence tolerance of greater perturbation.
The mean feedback controller gains produced by all algorithms over the 20 runs are also reported in Table 4, together with the best, median, worst, mean and standard deviation (SD) values of $f(X)$, to compare the robustness of the resulting systems. The lowest values of $f(X)$ and SD, produced by the best algorithm, are highlighted in bold. From Table 4, it is evident that the feedback controller optimized by the proposed MPSOWV can tolerate the most perturbation because the best, median, worst and mean values of $f(X)$ obtained are the lowest among all compared algorithms. When compared with PSO, DE and TLBO-PSO, the proposed MPSOWV has successfully improved the system robustness by 37.333%, 1.622% and 0.4595%, respectively. Notably, the proposed MPSOWV also has the smallest SD value among all compared algorithms, implying its excellent consistency in producing a robust feedback controller. The excellent optimization results achieved by

Table 3 Optimized feedback controller gains obtained by MPSOWV in 20 runs


Run no. k1 k2 k3 k4 kint f(X)
1 49.7133 −28.4840 20.9585 7.9230 −33.0395 −3.4354
2 49.9102 −27.5377 20.7262 7.8367 −33.6227 −3.4327
3 49.7223 −28.0927 20.8181 7.8653 −33.3087 −3.4336
4 49.9920 −28.3055 20.9896 7.9348 −33.3778 −3.4352
5 48.9368 −27.2774 20.3405 7.6741 −32.8000 −3.4305
6 49.3204 −27.3466 20.4417 7.7152 −33.3360 −3.4301
7 49.5983 −27.4277 20.6058 7.7861 −33.3212 −3.4321
8 49.1540 −27.3593 20.4115 7.7021 −33.0570 −3.4304
9 49.9081 −27.5030 20.6985 7.8192 −33.7084 −3.4317
10 49.8285 −28.4221 20.9629 7.9263 −33.2093 −3.4351
11 49.1000 −27.1163 20.3405 7.6734 −33.0219 −3.4298
12 49.9424 −27.3514 20.6855 7.8241 −33.7090 −3.4321
13 49.6535 −27.0586 20.4984 7.7460 −33.5722 −3.4302
14 49.9556 −28.0212 20.8902 7.9008 −33.4701 −3.4346
15 49.9907 −27.6790 20.8158 7.8762 −33.5554 −3.4340
16 49.7511 −27.2604 20.5853 7.7801 −33.6150 −3.4311
17 49.5218 −27.1218 20.4733 7.7293 −33.4195 −3.4300
18 49.9762 −27.6877 20.8153 7.8758 −33.5363 −3.4341
19 49.8075 −28.2683 20.9059 7.9019 −33.2694 −3.4345
20 49.9504 −27.4282 20.7013 7.8234 −33.7138 −3.4318

Table 4 Mean feedback controller gains and f(X) statistics obtained by all compared algorithms in 20 runs


PSO DE TLBO-PSO MPSOWV
k1 41.3199 48.1204 49.0899 49.6867
k2 −44.1955 −25.2592 −27.3229 −27.6374
k3 19.5017 18.9552 20.2702 20.6832
k4 3.5026 6.9348 7.6205 7.8157
kint −26.9835 −34.7460 −33.4352 −33.3832
Best f(X) −2.7169 −3.4118 −3.4314 −3.4354
Median f(X) −2.5780 −3.3963 −3.4283 −3.4321
Worst f(X) −2.1381 −3.1399 −3.2236 −3.4298
Mean f(X) −2.4994 −3.3777 −3.4168 −3.4325
SD 1.82E−01 5.90E−02 4.57E−02 1.97E−03

MPSOWV prove that the proposed modifications are beneficial in achieving a better balance between the exploration and exploitation searches of the algorithm. In particular, the multiswarm scheme introduced in the current swarm evolution of MPSOWV is crucial for maintaining population diversity and facilitating information exchange within and between subswarms, and hence can tackle the premature convergence issue. The two additional search operators introduced in the memory swarm evolution of MPSOWV also offer different exploration and exploitation strengths to all particles in order to improve their robustness in addressing complex optimization problems.
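The quoted improvement percentages can be reproduced from the mean f(X) values in Table 4, assuming they are computed as the relative change of the mean complex stability radius with respect to each baseline; the short check below makes that assumption explicit.

```python
# mean f(X) values taken from Table 4 (|f| equals the mean complex stability radius)
mean_f = {"PSO": -2.4994, "DE": -3.3777, "TLBO-PSO": -3.4168, "MPSOWV": -3.4325}

for base in ("PSO", "DE", "TLBO-PSO"):
    gain = (abs(mean_f["MPSOWV"]) - abs(mean_f[base])) / abs(mean_f[base]) * 100
    print(f"vs {base}: {gain:.4f}%")
# prints roughly 37.33%, 1.62% and 0.46%, matching the reported 37.333%, 1.622% and 0.4595%
```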
Table 5 shows the control performance of the system with the nominal values of the plant parameters, i.e. with $l = 0.5$ m and $K_A = 0.26$. The control performance here is measured in terms of overshoot and settling time for the cart positioning ($x$), and max/min amplitude and settling time (using a ±0.01 rad bound criterion) for the angular swing (payload swing, $\theta$). The results show that the anti-swing controllers optimized with the four different algorithms used in this paper do not differ significantly in terms of overshoot and settling time. This is expected since the simulation is performed for the nominal plant. The more important concern is the robustness test, in which the plant's parameters are varied, as shown in the next section.

Table 5 Anti-swing control performance for the feedback controllers shown in Table 4

Controller   Cart positioning                       Angular swing
             Overshoot (%)   Settling time (s)      Max/min amplitude (rad)   Settling time (s), ±0.01 rad bound
PSO          0               3.35                   0.09 (-)                  3.16
DE           0               3.20                   0.12 (-)                  3.00
TLBO-PSO     0               3.46                   0.11 (-)                  3.12
MPSOWV       0               3.61                   0.11 (-)                  3.07

4.3 Anti-swing Control Results and Robustness Test

In this section, the optimized controller gains are applied to anti-swing control of the gantry crane model presented in the earlier section. The objective of anti-swing gantry crane control is to move the payload as fast as possible without causing excessive angular swing, i.e. the controller must suppress the payload angular swing caused by sudden acceleration of the cart/trolley that carries the payload.
In order to achieve a settling time of about $t_s = 2$ s for the cart positioning, the necessary parameter to set is $q = 2$ in the wedge region (see Fig. 3). The damping ratio is set for normal control system performance, i.e. $\zeta = 0.7$. The robustness test is carried out by varying the cable length ($l$) and $K_A$, which is related to the payload mass variation, in the crane model.
As mentioned, the plant uncertainty is due to variations of the payload mass and of the cable length carrying the payload, which represent real-world situations. These variations perturb the nominal parameters $l$ and $K_A$ listed in Table 1; therefore, uncertainties appear in rows 3–4 and columns 2–3 of the state matrix. Since the closed-loop state matrix increases from order four (the open-loop system) to order five due to the additional state introduced in (4), the closed-loop scale matrices increase accordingly. The scale matrices defining the perturbation structure for the closed-loop system are $E_{cl}$ and $H_{cl}$, whose diagonal elements in rows 3–4 of $E_{cl}$ and in columns 2–3 of $H_{cl}$ are respectively equal to one.
$$E_{cl} = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \qquad H_{cl} = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
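For reference, the scale matrices can be written out directly in code. The sketch below builds E_cl and H_cl and forms one structured perturbation of the form A_cl + E_cl·Δ·H_cl, which is the usual setting for stability-radius analysis; the sample A_cl and the size of Δ are placeholders, not the crane model.

```python
import numpy as np

# scale matrices for the 5th-order closed-loop system: rows 3-4 of E_cl and
# columns 2-3 of H_cl carry the uncertain entries (1-based row/column numbering)
E_cl = np.zeros((5, 5)); E_cl[2, 2] = 1.0; E_cl[3, 3] = 1.0
H_cl = np.zeros((5, 5)); H_cl[1, 1] = 1.0; H_cl[2, 2] = 1.0

def perturbed_closed_loop(A_cl, delta):
    """Structured perturbation A_cl + E_cl @ delta @ H_cl (stability-radius form)."""
    return A_cl + E_cl @ delta @ H_cl

# toy usage: a random nominal matrix and a small random perturbation (illustration only)
rng = np.random.default_rng(1)
A_cl = rng.normal(size=(5, 5))
delta = 0.2 * rng.normal(size=(5, 5))
print(np.linalg.eigvals(perturbed_closed_loop(A_cl, delta)))
```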

Figures 5, 6, 7 and 8 show the robustness test results, in which 50 random responses are recorded. These 50 random responses are generated by varying the values of $l$ and $K_A$ from their nominal values. In this robustness test simulation, the variations of these parameters are set to 50% and 20%, respectively, of their nominal values of $l = 0.5$ m and $K_A = 0.26$.
As can be seen from the results, the anti-swing control performance is in general achieved: the desired cart position is reached while the angular swing is suppressed, albeit with different quality. In the case of the feedback controller optimized by PSO, there is a worst-case parameter variation that leads to instability of the controller, as shown in Fig. 5, while in Fig. 6 the controller optimized by DE shows slightly better robustness. In Figs. 7 and 8, both anti-swing controllers optimized by TLBO-PSO and MPSOWV are able to achieve good robust performance despite the variations of the parameters $l$ and $K_A$. These results further support the evaluation of the optimization results presented in the earlier section.

Fig. 5 Random response for the anti-swing controller optimized by PSO

Fig. 6 Random response for the anti-swing controller optimized by DE



Fig. 7 Random response for the anti-swing controller optimized by TLBO-PSO

Fig. 8 Random response for the anti-swing controller optimized by MPSOWV



Table 6 Robust stability measure for the modeled uncertainty

Controller   Tolerance of the modeled uncertainty (%)
PSO          98.9
DE           100
TLBO-PSO     107
MPSOWV       120

The settling time for cart positioning in all cases in Figs. 5, 6, 7 and 8 is around 3 s. For the TLBO-PSO and MPSOWV optimized controllers (Figs. 7 and 8), the settling time of the payload angular swing is 3.5 s.
To further clarify the results, a robustness analysis of the uncertain system using the 'robstab' function in MATLAB was also performed for the four cases. Table 6 shows how much modeled uncertainty each system can tolerate while remaining robustly stable. Furthermore, advanced statistical tests such as the T-test could be performed on these results; however, this is not feasible at this moment and is kept for future study.

5 Conclusions and Recommendation

A robust anti-swing controller design using metaheuristic optimization is proposed in this paper. In particular, an improved version of particle swarm optimization called multiswarm particle swarm optimization without velocity (MPSOWV) is presented and applied effectively to solve this constrained optimization problem, in which a wedge region for pole placement is set as the constraint. Using simple statistical tests, the proposed MPSOWV shows its effectiveness in achieving a better balance between exploration and exploitation searches and in producing more consistent solutions. The controller optimized by MPSOWV shows more robust performance despite parameter changes in the plant, at least when compared to the controllers optimized by standard algorithms such as particle swarm optimization and differential evolution. In addition, the controller optimized by MPSOWV shows slightly better performance than that optimized by TLBO-PSO, as indicated by the slightly larger stability radius achieved, which is used as the fitness function (cost function) during the optimization. Further advanced statistical tests can be performed to clarify the results; this is recommended for future work.

References

1. Natarajan E, Kaviarasan V, Lim WH, Tiang SS, Tan TH (2018) Enhanced multi-objective
teaching-learning-based optimization for machining of Delrin. IEEE Access 6:51528–51546

2. Ang KM, Lim WH, Mat Isa NA, Tiang SS, Wong CH (2020) A constrained multi-swarm
particle swarm optimization without velocity for constrained optimization problems. Expert
Syst Appl 140:112882
3. Solihin MI, Wahyudi, Akmeliawati R (2009) PSO-based optimization of state feedback
tracking controller for a flexible link manipulator. In: SoCPaR 2009 - soft computing and
pattern recognition, pp 72–76
4. Solihin MI, Wahyudi, Legowo A, Akmeliawati R (2010) Comparison of LQR and PSO-based
state feedback controller for tracking control of a flexible link manipulator. In: ICIME 2010 -
2010 2nd IEEE international conference on information management and engineering, vol 3
5. Tijani IB, Akmeliawati R, Legowo A, Iwan M, Muthalif AGA (2011) Robust H-infinity
controller synthesis using multi-objectives differential evolution algorithm (MODE) for
two-mass-spring system. In: 2011 4th international conference on modeling, simulation and
applied optimization, ICMSAO 2011
6. Glover F, Kochenberger GA (2003) Handbook of metaheuristics. Springer, New York
7. Rao SS (2009) Engineering optimization. Wiley, Hoboken
8. Yang XS, Deb S (2010) Engineering optimisation by cuckoo search. Int J Math Model Numer
Optim 1(4):330
9. El-Sherbiny MM (2011) Particle swarm inspired optimization algorithm without velocity
equation. Egypt Inform J 12(1):1–8
10. Ramezani F, Lotfi S (2012) The modified differential evolution algorithm (MDEA). Springer,
Heidelberg, pp 109–118
11. Yao L, Shen J-Y, Lim WH (2016) Real-time energy management optimization for smart
household. In: IEEE International conference on Internet of Things (iThings) and IEEE green
computing and communications (GreenCom) and IEEE cyber, physical and social computing
(CPSCom) and IEEE smart data (SmartData), pp 20–26
12. Solihin MI, Akmeliawati R, Tijani IB, Legowo A (2011) Robust state feedback control design
via PSO-based constrained optimization. Control Intell Syst 39(3):168
13. Solihin MI, Akmeliawati R, Muhida R, Legowo A (2010) Guaranteed robust state feedback
controller via constrained optimization using differential evolution. In: Proceedings - CSPA
2010: 2010 6th international colloquium on signal processing and its applications, pp 1–6
14. Azmi NIM, Yahya NM, Fu HJ, Yusoff WAW (2019) Optimization of the PID-PD parameters
of the overhead crane control system by using PSO algorithm. In: MATEC Web Conference,
vol 255, p 04001
15. Solihin MI, Wen MC, Heltha F, Lye LC (2014) Robust PID controller tuning for 2D gantry
crane using Kharitonov’s theorem and differential evolution optimizer. Adv Mater Res
903:267–272
16. Ab Wahab MN, Nefti-Meziani S, Atyabi A (2015) A comprehensive review of swarm
optimization algorithms. PLoS ONE 10(5):e0122827
17. Jaafar HI, Mohamed Z, Abidin AFZ, Ghani ZA (2012) PSO-tuned PID controller for a
nonlinear gantry crane system. In: IEEE international conference on control system,
computing and engineering, pp 515–519
18. Diep DV, Khoa VV (2014) PID-controllers tuning optimization with PSO algorithm for
nonlinear gantry crane system. Int J Eng Comput Sci 3(6):6631–6635
19. Ang CK, Tang SH, Mashohor S, Arrifin MKAM (2014) Solving continuous trajectory and
forward kinematics simultaneously based on ANN. Int J Comput Commun Control 9(3):253
20. Hong TS, Kit AC, Nia DN, Ariffin MKAM, Khaksar W (2013) Planning for redundant
manipulator based on back-propagation neural network. Adv Sci Lett 19(11):3307–3310
21. Solihin MI, Wahyudi W, Albagul A (2006) Development of soft sensor for sensorless
automatic gantry crane using RBF neural networks. In: 2006 IEEE conference on cybernetics
and intelligent systems, CIS, pp 1–6
22. Alrifaey M, Hong TS, Supeni EE, As’arry A, Ang CK (2019) Identification and prioritization
of risk factors in an electrical generator based on the hybrid FMEA framework. Energies 12
(4):649

23. Ang CK, Tey WY, Kiew PL, Fauzi M (2017) An artificial intelligent approach using fuzzy
logic for sleep quality measurement. J. Mech. Eng. SI 4(2):31–47
24. Rao RV, Patel V (2013) An improved teaching-learning-based optimization algorithm for
solving unconstrained optimization problems. Sci Iran 20(3):710–720
25. Zou F, Chen D, Wang J (2016) An improved teaching-learning-based optimization with the
social character of PSO for global optimization. Comput Intell Neurosci 2016:1–10
26. Hinrichsen D, Pritchard AJ (1986) Stability radii of linear systems. Syst Control Lett 7(1):1–
10
27. Akmeliawati R, Tan CP (2005) Feedback controller and observer design to maximize stability
radius. In: IEEE international conference on industrial technology, pp 660–664
28. Solihin MI, Akmeliawati R, Legowo A (2011) Robust feedback control design using
PSO-based optimisation: a case study in gantry crane control. Int J Mechatron Autom 1
(2):121–131
29. Lu H, Chen W (2006) Dynamic-objective particle swarm optimization for constrained
optimization problems. J Comb Optim 12(4):409–419
30. Kennedy J, Eberhart R (1995) Particle swarm optimization. In: Proceedings of ICNN 1995 -
international conference on neural networks, vol 4, pp 1942–1948
31. Storn R, Price K (1997) Differential evolution – a simple and efficient heuristic for global
optimization over continuous spaces. J Glob Optim 11(4):341–359
Feasibility Analysis of a Hybrid System
for a Health Clinic in a Rural Area
South-Eastern Iraq

Zaidoon W. J. AL-Shammari, M. M. Azizan, and A. S. F. Rahman

Abstract Global warming, fuel depletion and diesel price volatility have prompted
many countries to develop energy produced from renewable resources. The prob-
lem is that many remote areas of Iraq have expanded randomly and the possibility
of connecting electricity requires a lot of money and time. This paper proposes
techno-economically feasible and optimal sizing for each component of a hybrid
system to generate electricity to a health clinic in Zerbattiya, a village located in
southeastern Iraq near the Iranian border. The proposed hybrid system is composed
of PV, WT, batteries, and converters. Homer software was used to model the
system. The average life expectancy of the proposed system is 20 years. Results
showed that the respective optimal values of NPV, NWT, Nconv, NBT, COE, NPC,
O&M, IC, and the renewable factor are 10; 7; 6; 41; US$0.547/kWh; US$72,878;
US$990/yr; US$50,000; and 100%, respectively. Therefore, using the hybrid sys-
tem is justified on technical, economic, and humanitarian grounds.

Keywords Hybrid system · HOMER · Optimal sizing

1 Introduction

The growth of the world population and the development of the industrial sector are
the main reasons for increasing demands for electricity [1]. Non-renewable
resources such as coals, gas, and oil are used in conventional electricity generation
systems. However, the use of these sources incrementally contributes to emissions
which are hazardous to the atmosphere. All these issues encourage nations to look
for alternative sources of energy. In Iraq, despite the availability of fossil fuels, there
are a lot of remote areas with no connection to the electrical grid. In these places,

Z. W. J. AL-Shammari  M. M. Azizan (&)  A. S. F. Rahman


School of Electrical System Engineering, Universiti Malaysia Perlis, Arau, Perlis, Malaysia
e-mail: [email protected]
Z. W. J. AL-Shammari
Technical College/Al-Mussaib, Al-Furat Al-Awsat Technical University, Najaf, Iraq

© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_84

people usually use off-grid diesel generators since they are located far away from
the central grid. Besides the dangerous emissions of diesel fuel, there are difficulties
in transporting diesel over long distances to remote areas. In such cases, the only
alternative to diesel is the use of new clean and silent energy resources [2].
Renewable sources of energy like solar and wind involve many challenges since
their electrical supply is intermittent and unpredictable because they mainly depend
on climate and weather conditions [3]. Thus, an energy storage system is usually
used when using renewable energy sources [4]. Consequent to the shortcomings of
both conventional and renewable energy sources, various types of energy sources
are utilized in order to effectively supplement each other. Accordingly, a
multi-source hybrid renewable energy system, properly designed, sized, and con-
trolled, is likely to achieve better reliability and higher quality as compared to a
single-source energy system [5–8]. The present study aims to investigate the suit-
ability of a hybrid PV/WT/BT power generation system that does not emit any
pollutants via simulation and mathematical modelling to supply electricity for a
remote rural health clinic in Zerbattiya city in southeastern Iraq [9, 10]. Reliable
electrical production at the site has demonstrated the ability to provide high-quality
electricity for vaccine refrigeration, lighting, communications, medical devices,
clean water supply and other essential applications.

2 The Setup and Conditions

2.1 Location and Resources

The selected site is a remote rural clinic in Zerbattiya, a village located in southeastern Iraq near the Iranian border (Fig. 1) [11]. Located at 33°24.2′N latitude and 45°51.7′E longitude, with a surface area of 170 km² and an elevation of 95 m, it has a population of approximately 7,000 [12]. Solar radiation, temperature and wind speed data were obtained from the National Aeronautics and Space Administration (NASA), 2019 [13]. The highest daily radiation is recorded in June, while the lowest is recorded in December. The average annual daily solar radiation and temperature

Table 1 Nomenclature
Nomenclature
COE Cost of energy BT Battery
NPC Net present cost NPV Number of Photovoltaics
IC Initial cost NWT Number of Wind turbines
O&M Operating and maintenance cost NBT Number of Batteries
RF Renewable Factor Nconv Number of converters
PV Photovoltaic HRES Hybrid renewable energy system
WT Wind turbine RE Renewable energy

were measured as 5.14 kWh/m²/d and 22.38 °C, respectively. The annual average daily radiation and temperature for the selected clinic are the multi-year averages for that area (Figs. 2 and 3). The average annual daily wind speed was measured as 5.36 m/s (Fig. 4).

Fig. 1 Geographical location of Zerbattiya

Fig. 2 Average annual solar


daily radiation

Fig. 3 Average annual daily


temperature

Fig. 4 Average annual daily


wind speed

2.2 Clinic Load Analysis

Zerbattiya is not connected to electricity from the national grid [14]. The clinic building comprises the following rooms [15]: waiting room, small pharmacy, nurses' room, doctor's room, administration room, restroom, and treatment room. The required electricity consumption for lighting, medical equipment, and other devices used in the clinic is summarized in Table 2. On the basis of the load information data, the average annual daily energy requirement of this clinic is about 31.54 kWh/d, while the peak requirement is estimated to be 6.21 kW. The small load for outside lighting and some inside devices occurs from 4 pm until 8 am, whereas the majority of the load occurs during the daytime from 8 am to 4 pm. The load profile is shown in Fig. 5.

Table 2 Types of load and energy requirement of a typical clinic in Zerbattiya

N    Load equipment          Quantity   Power (W)   Time of use (h)   Energy required (Wh)
1    Vaccine refrigerator    1          350         24                8,400
2    Light bulbs             5          20          12                1,200
3    Oxygen concentrator     1          100         8                 800
4    Microscope              1          20          6                 120
5    Water heater            1          500         3                 1,500
6    Radio                   1          30          4                 120
7    Vaporizer               1          350         4                 1,400
8    Centrifugal nebulizer   1          500         4                 2,000
9    TV set                  1          100         8                 800
10   Ceiling fan             4          100         8                 3,200
11   Air conditioning        1          1,500       8                 12,000
Total average daily energy load: 31.54 kWh/d
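The daily total in Table 2 follows from multiplying each load's quantity, power and hours of use; the short tally below reproduces the 31.54 kWh/d figure (a plain arithmetic check, not the HOMER load model).

```python
# (quantity, power in W, hours of use per day) for the clinic loads in Table 2
loads = {
    "Vaccine refrigerator":  (1, 350, 24),
    "Light bulbs":           (5, 20, 12),
    "Oxygen concentrator":   (1, 100, 8),
    "Microscope":            (1, 20, 6),
    "Water heater":          (1, 500, 3),
    "Radio":                 (1, 30, 4),
    "Vaporizer":             (1, 350, 4),
    "Centrifugal nebulizer": (1, 500, 4),
    "TV set":                (1, 100, 8),
    "Ceiling fan":           (4, 100, 8),
    "Air conditioning":      (1, 1500, 8),
}
daily_wh = sum(q * p * h for q, p, h in loads.values())
print(daily_wh / 1000, "kWh/d")   # 31.54 kWh/d, as reported
```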

Fig. 5 Load profile

3 System Components

Considering the products available in the Iraqi market, a list of candidate components for the project was developed, and their prices were obtained from different sales agents and manufacturers. The best and most appropriate components were then chosen, taking into account their operational qualities, maintenance, lifetime, capital cost and any variation costs, as described in Table 3. Each of the system's components has its own configuration as well as capital, replacement, and maintenance prices. The hybrid system comprises PV, WT, batteries, and power converters, all of which are necessary to ensure a continuous supply to the load.

3.1 PV Panels

A photovoltaic system is an interconnection of PV modules producing direct-current electricity from solar energy in proportion to the solar radiation and temperature. Solar panels are made of individual solar cells connected together. The details of the PV modules proposed for this system are provided in Table 3 [16].

Table 3 The project’s materials


N Capital cost Replacement cost O&M cost Lifetime Power Type
1 $1,250 $1,250 $10 25 years 1 kW PV
2 $2,000 $2,000 $60 20 years 1.5 kW WT
3 $500 $500 $10 5 years 1 kW Converter
4 $500 $500 $10 10 years 1 kWh Battery

3.2 Wind Turbines

Wind energy is converted into mechanical energy and then into electrical energy
through an electric generator. The energy from wind turbines depends mainly on
the interaction between wind and rotor. The details of the WT that is proposed in
this system are provided in Table 3 [17].

3.3 Batteries

Due to the intermittent nature of solar and wind energy, a power system based on
photovoltaic and wind turbine dictates the necessity of using battery storage
facilities in order to ensure a constant power supply. The battery can store a certain
amount of power with limits on how much it is charged and discharged without
causing any damage. The details of the BT that is proposed in this system are
provided in Table 3 [18].

3.4 Converters

The converter is among the major components of the hybrid system: it converts the DC electricity produced by the PV units into AC electricity and also converts excess AC to DC for storage in the battery, to be utilized in case of a supply shortfall. The details of the converter proposed for this system are provided in Table 3 [19].

3.5 Fundamental Principle and Evaluation Criteria

The modeling concept used by HOMER mainly depends on minimizing costs. Each dispatchable energy source in HOMER is economically represented by two main values: a fixed cost ($/hour) and a marginal cost of energy ($/kWh). These values represent all costs associated with producing energy with that power source in that hour. Based on these values, HOMER searches for the best solution that can cover the electrical loads as well as the operating reserve at the lowest cost. Satisfying the load demand and the operating reserve is regarded as critical for HOMER, meaning that any cost will be accepted to avoid a capacity shortage. On the other hand, if several proposed combinations of the dispatchable sources can equally supply the load demand, then HOMER will choose the lowest-cost combination.
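The dispatch principle described above can be illustrated with a toy merit-order selection: serve the load with the cheapest marginal-cost sources while keeping enough committed capacity for the operating reserve. The sources, costs and the simple selection rule below are illustrative assumptions only and do not represent HOMER's internal algorithm.

```python
# each dispatchable source: (fixed cost $/h when committed, marginal cost $/kWh, capacity kW)
sources = {
    "wind_surplus": (0.00, 0.00, 3.0),
    "battery_bank": (0.02, 0.05, 4.0),
    "generator":    (0.50, 0.30, 6.0),   # hypothetical backup source, for contrast
}

def cheapest_dispatch(load_kw, reserve_kw):
    """Serve a one-hour load in merit order of marginal cost, committing sources
    until both the load and the load-plus-reserve capacity are covered."""
    order = sorted(sources.items(), key=lambda kv: kv[1][1])   # cheapest $/kWh first
    served, committed_cap, cost = 0.0, 0.0, 0.0
    for name, (fixed, marginal, cap) in order:
        if served >= load_kw and committed_cap >= load_kw + reserve_kw:
            break
        take = min(cap, max(0.0, load_kw - served))            # energy actually produced
        served += take
        committed_cap += cap
        cost += fixed + marginal * take                        # fixed $/h plus energy cost
    feasible = served >= load_kw and committed_cap >= load_kw + reserve_kw
    return feasible, round(cost, 3)

print(cheapest_dispatch(load_kw=5.0, reserve_kw=0.5))   # (True, 0.12) for this toy data
```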

4 Scenario

This section discusses the possible scenario for providing electricity to the health
clinic in Zerbattiya using two renewable energy sources. Toward this end, the total
COE and NPC are determined for this scenario. PV, WT, batteries and converters
were the components included in this scenario as shown in Fig. 6.

Fig. 6 System design

Table 4 Detailed simulation results

N   Station                                   Results
1   Number of PV (NPV)                        10
2   Number of wind turbines (NWT)             7
3   Number of batteries (NBT)                 41
4   Number of converters (Nconv)              6
5   Cost of energy (COE)                      0.547 $/kWh
6   Net present cost (NPC)                    72,878 $
7   Operating and maintenance cost (O&M)      990 $/year
8   Initial capital (IC)                      50,000 $
9   Renewable factor (RF)                     100%

Fig. 7 Simulation results

5 Results

The optimal size of each component, along with the best prices, is presented in Table 4; the annual percentage of energy production is shown in Fig. 7 and Table 5, and the computed annualized costs are shown in Table 6. The proposed configuration meets the 31.54 kWh/d demand from the hybrid system using 10 PV, 7 WT, 41 batteries, and 6 converters, with a COE of US$0.547/kWh. The NPC is evaluated at US$72,878; the O&M cost is US$990/year; the IC is US$50,000; and the renewable factor is 100%.

Table 5 Production percentage

N   Production   %
1   PV           48.8
2   WT           51.2
3   Total        100

Table 6 Annualized costs of the solutions

Components   Capital    Replacement   O&M         Salvage      Total ($)
PV           12,500     0             1,158.03    −797.02      12,861.01
WT           14,000     0             4,863.72    0            18,863.72
BT           20,500     11,574.92     4,747.91    0            36,822.83
Converter    3,000      1,272.82      694.82      −637.61      4,330.02
System       50,000     12,847.74     11,464.47   −1,434.63    72,877.58
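The reported COE can be related to the NPC through the capital recovery factor, assuming the usual HOMER-style relation COE = CRF(i, n) · NPC / annual energy served. The discount rate below is an assumed value (it is not stated in this excerpt), so the result is only indicative.

```python
def crf(i, n):
    """Capital recovery factor for annual discount rate i and lifetime n years."""
    return i * (1 + i) ** n / ((1 + i) ** n - 1)

npc = 72_878.0            # US$, from Table 4
daily_load = 31.54        # kWh/d served by the system
lifetime = 20             # years, stated project life
i = 0.06                  # assumed annual discount rate (not given in the paper)

annualized_cost = crf(i, lifetime) * npc
coe = annualized_cost / (daily_load * 365)
print(round(coe, 3), "$/kWh")   # ~0.55 $/kWh, close to the reported 0.547 $/kWh
```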

6 Conclusion

This paper has presented an analysis of the techno-economic feasibility of powering a remote health clinic in a small rural area in Iraq with renewable energy options. The operational behaviors of all the configurations were quantified and investigated. The results of the cost analysis show that the hybrid PV, WT, and battery design maintains high working capacity under the different conditions that could affect the system at any stage of the project's lifetime. For the establishment of a flexible hybrid system, the effects of key factors should be studied because of their direct impact on system performance and design. These factors include solar radiation, temperature, wind speed, and load demand. Fully renewable sources and hybrid designs appear to be the most preferred options from an environmental point of view and may be economically viable in the long term. Therefore, using renewable energy (RE) can enhance access to electricity in remote areas of Iraq while elevating living standards. Furthermore, the contributions of this study can serve as a starting point or a support tool to drive rural electrification initiatives and expedite the planning and implementation of various initiatives.

References

1. Isa MM, Das HM, Tan CW, Yatim AHM, Lau KY (2016) A techno-economic assessment of
a combined heat and power photovoltaic/ fuel cell/battery energy system in Malaysia hospital.
Energy 112:75–90
2. Ali A, Sulaiman MI, Mohd Yasin MN, Azizan MM, Jusoh M, Ahmad Hambali NAM,
Mat MH (2019) Wireless Power Transfer (WPT) optimization using resonant coil. In: AIP
conference proceedings, AIP, Indonesia
3. Wang R, Zhang T (2014) Multi-objective optimal design of hybrid renewable energy systems
using MOEAD. In: 3rd international conference on renewable energy research and
applications, United States, pp 1–5
4. Ogunjuyigbe ASO, Ayodele TR, Akinola OA (2016) Optimal allocation and sizing of PV/
wind/split-diesel/battery hybrid energy system for minimizing life cycle cost, carbon emission
and dump energy of remote residential building. Appl Energy 171:153–171
5. Al-Karaghouli A, Kazmerski LL (2010) Optimization and life-cycle cost of health clinic PV
system for a rural area in southern Iraq using HOMER software. Solar Energy 84(4):710–714
6. Aziz AS, Tajuddin MFN, Adzman MR (2018) Feasibility analysis of PV/wind/battery hybrid
power generation: a case study. Int J Renew Energy Res 8(2):661–671
7. Sadzali MN, Ali A, Azizan MM, Albreem MAM (2017) The security energy encryption in
wireless power transfer. In: AIP conference proceedings, AIP, Thailand
8. Hemeida AM, El-Ahmar MH, El-Sayed AM, Hany MH, Salem A, Esmail MFC, Senjyu T
(2019) Optimum design of hybrid wind/PV energy system for remote area. Ain Shams
Eng J. https://doi.org/10.1016/j.asej.2019.08.005
9. Aziz AS, Tajuddin MFN, Adzman MR, Azmi A, Ramli MAM (2019) Optimization and
sensitivity analysis of standalone hybrid energy systems for rural electrification: a case study
of Iraq. Renewable Energy 138:775–792
10. Darwish AS, Shaaban S, Marsillac E, Mahmood NM (2019) A methodology for improving
wind energy production in low wind speed regions, with a case study application in Iraq.
Comput Ind Eng 127:89–102

11. Wikipedia. https://ar.wikipedia.org/wiki/iraq. Accessed 3 Oct 2019
12. Wikipedia. https://ar.wikipedia.org/wiki/Zerbattiya. Accessed 3 Oct 2019
13. HOMER PRO, NASA surface meteorology and solar energy database. www.homerpro.com. Accessed 30 Aug 2019
14. Ministry of Electricity. https://moelc.gov.iq/. Accessed 4 Oct 2019
15. Ministry of Planning. http://www.mop.gov.iq/. Accessed 4 Oct 2019
16. AliExpress. https://www.aliexpress.com/item/Solar-Panel-1000W-12v-10-Pcs-Lot-Solar-Power. Accessed 10 Oct 2019
17. AliExpress. https://www.aliexpress.com/item/1.5kw-220v-AC-use-wind-turbine-wind-generator-free-shipping-high-efficient. Accessed 10 Oct 2019
18. AliExpress. https://www.aliexpress.com/wholesale?catId=0&initiative_id=AS_20180310220504&SearchText=battery. Accessed 10 Oct 2019
19. AliExpress. https://www.aliexpress.com/wholesale?catId=0&initiative_id=SB_20180310215354&SearchText=converter. Accessed 10 Oct 2019
Optimal Sizing of PV/Wind/Battery
Hybrid System for Rural School
in South Iraq

Zaidoon W. J. AL-Shammari, M. M. Azizan, and A. S. F. Rahman

Abstract Unexpected fluctuations in diesel prices, depletion of fossil fuel reserves,


and global warming have prompted many countries to develop new energy policies
that encourage the use of alternative energy sources. Renewable energy resources
such as solar and wind energy are clean and can be used extensively. Combining
these sources with storage batteries can result in better, cleaner, economical and
more reliable results compared to a single source. In this paper, the techno-economic
feasibility of using an integrated electricity generation system for a remote rural
school in southern Iraq was studied. The HOMER program was used in this study.
The average life expectancy of the proposed system is 20 years. The results show
that the cheapest systems are those with relevant optimal values of 11 PV and 6 WT,
utilizing 49 batteries and 8 converters at a cost of US$0.527 per kWh; NPC rated at
US$80,097; O&M cost of US$1,040/year; IC of US$54,250; and RF of 100%.

Keywords Hybrid system · HOMER · Optimal sizing

1 Introduction

Most rural areas of southern Iraq remain underdeveloped and in chaos after the US
invasion, and electricity is needed. An important challenge in Iraq is to provide
electricity for empowering education services in isolated remote areas [1–4].
Consumption of fossil fuel has several negative effects such as air and coastal pol-
lution, deforestation, and biodiversity deterioration and loss [5]. Thus, globally,
researchers, environmentalists and policy makers are currently searching for potential
renewable energy sources in order to minimize the utilization of conventional energy
sources to reduce emissions [6]. This paper focuses on the techno-economic study of a

Z. W. J. AL-Shammari  M. M. Azizan (&)  A. S. F. Rahman


School of Electrical System Engineering, Universiti Malaysia Perlis, Arau, Perlis, Malaysia
e-mail: [email protected]
Z. W. J. AL-Shammari
Technical College/Al-Mussaib, Al-Furat Al-Awsat Technical University, Najaf, Iraq

© Springer Nature Singapore Pte Ltd. 2021
Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_85

hybrid system consisting of PV, WT, and BT to operate a rural school in Al-Faw. The region enjoys a good amount of solar radiation and wind speed throughout the year. As a result, using only a solar or a wind system is sufficient to meet the load demand during the periods when high wind speed or intense solar irradiation is available [7]. When the renewable energy sources (solar radiation, and wind speed above the cut-in speed) are not available, storage or backup energy sources supply the demanded load. Although these resources can provide abundant and reliable electricity, they have not been largely exploited. Hybrid systems can meet the electrical needs of schools and other social spaces in ways that can positively impact education, ensuring adequate services to the population [8, 9]. The relationship between energy and education is compelling, and as interrelated factors, they largely determine the progress of rural development.

2 Materials and Methods

2.1 Description of the Selected Site and Resources

In this study, a small remote rural school in AL-Faw, Iraq, has been chosen for the feasibility assessment. AL-Faw is located in southern Iraq near the Kuwait border (Fig. 1) [10]. The city is located at 29°56.5′N latitude and 48°26.4′E longitude [11]. The area is not connected to the grid; currently, electrification of the school is achieved by using diesel generators. Solar radiation, temperature and wind speed data were downloaded from the National Aeronautics and Space Administration (NASA), 2019 [12]. The average annual daily solar radiation and temperature were measured as 5.57 kWh/m²/d and 26.15 °C, respectively (Figs. 2 and 3). The average annual daily wind speed was measured as 4.49 m/s (Fig. 4).

Table 1 Nomenclature
Nomenclature
COE Cost of energy BT Battery
NPC Net present cost NPV Number of Photovoltaics
IC Initial cost NWT Number of Wind turbines
O&M Operating and maintenance cost NBT Number of Batteries
RF Renewable Factor Nconv Number of converters
PV Photovoltaic HRES Hybrid renewable energy system
WT Wind turbine RE Renewable energy

Fig. 1 Geographical location


of AL-Faw

Fig. 2 Average annual daily


solar radiation

Fig. 3 Average annual daily


temperature

Fig. 4 Average annual daily


wind speed

2.2 Load Profile

The selected school is small, and the lighting and electrical apparatus in the school do not need a large amount of electrical energy [13]. On the basis of the load information data [14], the average annual daily energy requirement of this school was found to be nearly 35.96 kWh/d, as shown in Table 2, and the peak load was reported as 7.95 kW. The school load profile is shown in Fig. 5.

Table 2 Types of load and energy consumption of a typical school in AL-Faw


N Load equipment Quantity Power (W) Time of use (h) Energy
required (W/h)
1 Lamps 20 20 8 3,200
2 Lamps (out) 4 20 10 800
3 Refrigerator 1 200 8 1,600
4 Water heater 1 500 3 1,500
5 Radio 1 30 2 60
6 TV set 1 100 8 800
7 Ceiling fan 20 100 8 16,000
8 Air conditioning 1 1,500 8 12,000
Total average daily energy load 35.96 kWh/d

Fig. 5 Load profile

3 System Components

3.1 System Description and Details of the Components

The main sources of the proposed hybrid system for the school are solar and wind energy. Solar panels, a wind system, battery units, and a converter are the major components of the system design. Solar panels produce direct-current (DC) electricity in proportion to the solar radiation and temperature. Wind energy is converted into electrical energy through an electric generator; the energy from wind turbines depends mainly on the interaction between the wind and the rotor. The battery can store a certain amount of power, with limits on how much it is charged and discharged without causing any damage. The converter is among the major components of the hybrid system: it converts the DC electricity produced by the PV units into AC electricity and also converts excess AC to DC for storage in the battery, to be utilized in case of a supply shortfall. The specifications of the PV, WT, battery, and converter are depicted in Table 3 [15–18].

Table 3 The materials used in the project


N Capital cost Replacement cost O&M cost Lifetime Power Type
1 $1,250 $1,250 $10 25 years 1 kW PV
2 $2,000 $2,000 $60 20 years 1.5 kW WT
3 $500 $500 $10 15 years 1 kW Converter
4 $500 $500 $10 10 years 1 kWh Battery

3.2 Operating Principle

The PV and WT work to feed the load requirements. When the power production of these energy sources is sufficient to supply the load, the excess power generated charges the batteries until they are full. When the PV and WT are not enough to supply the load, in situations such as low solar radiation or low wind speed, the batteries compensate for the shortfall in energy. During the night hours, the output of the solar panels becomes zero; therefore, the wind turbines and/or batteries meet the load. The system design used in the HOMER software is presented in Fig. 6.
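The operating principle amounts to an hourly energy balance with the battery absorbing any surplus or deficit. The sketch below simulates that rule for one day; the efficiencies, capacities and the toy resource profile are illustrative assumptions, not the HOMER model of the school system.

```python
def simulate_hour(pv_kw, wt_kw, load_kw, soc_kwh, cap_kwh, eff=0.9, dt=1.0):
    """One step of the dispatch rule: renewables serve the load first; any
    surplus charges the battery and any deficit is drawn from it. Returns the
    new state of charge and the unmet load (if the battery runs empty)."""
    balance = (pv_kw + wt_kw - load_kw) * dt        # kWh surplus (+) or deficit (-)
    unmet = 0.0
    if balance >= 0:
        soc_kwh = min(cap_kwh, soc_kwh + balance * eff)   # charge, capped at capacity
    else:
        need = -balance / eff                              # energy to draw from the battery
        drawn = min(soc_kwh, need)
        soc_kwh -= drawn
        unmet = (need - drawn) * eff                       # demand the battery cannot cover
    return soc_kwh, unmet

# toy day: constant 1.5 kW load, solar at midday, light wind at night (illustrative values)
soc = 20.0
for hour in range(24):
    pv = 3.0 if 8 <= hour <= 16 else 0.0
    wt = 1.0 if hour < 8 or hour > 18 else 0.2
    soc, unmet = simulate_hour(pv, wt, 1.5, soc, cap_kwh=49.0)
print("state of charge after 24 h:", round(soc, 1), "kWh")
```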

4 Scenario

This section discusses a possible scenario for providing electricity to a typical


school in AL-Faw using two renewable energy sources. Toward this end, the total
COE and NPC are determined for this scenario. PV, WT, batteries and converters
were the components included in this scenario as shown in Fig. 6.

5 Simulation and Results

5.1 Optimization Results

The feasibility of supplying electricity to a rural school using a hybrid renewable energy system (HRES) depends on the availability of energy resources at the location and on the total cost of the system. In the HOMER software, an hourly time-series simulation of each feasible system over a full year was performed in order to evaluate operational characteristics such as the annual load served, yearly electricity production, and renewable fraction. HOMER identifies the best feasible system configuration and the size of each component capable of meeting the load demand at the lowest NPC. Table 4 presents a summary of the important electrical output results, the annual percentage of energy production is shown in Fig. 7 and Table 5, and the computed annualized costs are shown in Table 6. It was found that an energy system consisting of 11 PV, 6 WT, 49 batteries, and 8 converters is the cheapest design, with a COE of US$0.527 per kWh, an NPC evaluated at US$80,097, an operating and maintenance cost of US$1,040/yr, an initial capital of US$54,250, and a renewable factor of 100%.

Fig. 6 System design

5.2 Feasibility of the Proposed Energy System

The development context in southern Iraq is an important feature in the case study
analysis in this research paper. The promising results of the proposed hybrid sys-
tems utilized in this study can be applied in many rural areas of southern Iraq since
the availability of solar and wind resources is very similar across that area. PV, WT,
and batteries can be integrated together to meet the demands. This would reduce the
use of the depleted oil supply and the harmful emissions related to fossil fuel
utilization. The results also indicate that for small scale demands, it is not eco-
nomically viable to extend the national grid to the remote rural areas.

Table 4 Detailed simulation results

N   Station                                   Results
1   Number of PV (NPV)                        11
2   Number of wind turbines (NWT)             6
3   Number of batteries (NBT)                 49
4   Number of converters (Nconv)              8
5   Cost of energy (COE)                      0.527 $/kWh
6   Net present cost (NPC)                    80,097 $
7   Operating and maintenance cost (O&M)      1,040 $/year
8   Initial capital (IC)                      54,250 $
9   Renewable factor (RF)                     100%

Fig. 7 Simulation results

Table 5 Production percentage

N   Production   %
1   PV           61.1
2   WT           38.9
3   Total        100

Table 6 Annualized costs of the solutions

Components   Capital    Replacement   O&M         Salvage      Total ($)
PV           13,750     0             1,273.83    −876.72      14,147.11
WT           12,000     0             4,168.90    0            16,168.90
BT           24,500     13,833.44     5,674.33    0            44,007.78
Converter    4,000      1,697.10      926.42      −850.15      5,773.36
System       54,250     15,530.54     12,043.49   −1,726.87    80,097.15

5.3 Strengths and Weaknesses of the Proposed Hybrid


System

In this section, the outcomes of the technical, economic, social, and environmental
analyses are discussed.
The strengths of this system are its low operation and maintenance costs, the fact that no fuel transport infrastructure is required, and its safety and environmental friendliness. Hybrid energy systems also offer better reliability and higher quality compared with a single energy source. The weakness of this system is the high capital cost of the PV and WT installations.

6 Conclusion

A techno-economic feasibility study to find the optimal configuration of a hybrid system for the electrification of a rural school in Iraq is presented in this paper. The results of the cost analysis show that the hybrid PV, WT, and battery design possesses high working capacity under the different conditions that could affect the system at any stage of the project's lifetime. The proposed hybrid energy system is clean and emits no pollutants, while an off-grid diesel system causes considerable air pollution in addition to other negative effects such as noise and the difficulty of transporting fuel to remote areas. Therefore, it can be concluded that the use of a hybrid renewable energy system is a potentially viable choice for supplying electricity to the remote rural area of Al-Faw, Iraq. Furthermore, the contributions of this study can serve as a starting point or a support tool to drive rural electrification initiatives and expedite the planning and implementation of various initiatives.

References

1. Hemeida AM, El-ahmar MH, El-sayed AM, Hany MH, Salem A, Esmail MFC, Senjyu T
(2019) Optimum design of hybrid wind/PV energy system for remote area. Ain Shams
Eng J. https://doi.org/10.1016/j.asej.2019.08.005
2. Al-karaghouli A, Kazmerski LL (2010) Optimization and life-cycle cost of health clinic PV
system for a rural area in southern Iraq using HOMER software. Solar Energy 84(4):710–714
3. Aziz AS, Mohammad F, Mohd R (2018) Feasibility analysis of PV/wind/battery hybrid power
generation: a case study. Int J Renew Energy Res. 8(2):661–671
4. Saleh A, Mohammad FN, Azralmukmin A, Makbul AMR (2019) Optimization and sensitivity
analysis of standalone hybrid energy systems for rural electrification: a case study of Iraq.
Renew Energy 138:775–972
5. Azuwa A, Azizan MM, Mahmoud AM (2017) The security energy encryption in wireless
power transfer. In: AIP conference proceedings, September. https://doi.org/10.1063/1.5002436
6. Ogunjuyigbe ASO, Ayodele TR, Akinola OA (2016) Optimal allocation and sizing of PV/
wind/split-diesel/battery hybrid energy system for minimizing life cycle cost, carbon emission
and dump energy of remote residential building. Appl Energy 171:153–171
7. Salam A, Sabry S, Erika M, Nazar MME (2019) A methodology for improving wind energy
production in low wind speed regions, with a case study application in Iraq. Comput Industr
Eng 127:89–102
8. Normazlina M, Himadry S, Chee W, Yatim AHM (2016) A techno-economic assessment of a
combined heat and power photovoltaic/fuel cell/battery energy system in Malaysia hospital.
Energy 112:75–90
9. Ali A, Mohamad IS, Mohd NMY, Azizan MM, Muzammil H, Hambali NAM, Ahmad, MMH
(2019) Wireless power transfer (WPT) optimization using resonant coil. In: AIP conference
proceedings, July. https://doi.org/10.1063/1.5118135
10. Wikipedia Homepage. https://ar.wikipedia.org/wiki/iraq. Accessed 3 Oct 2019
11. Wikipedia Homepage. https://ar.wikipedia.org/wiki/AL-Faw. Accessed 3 Oct 2019
12. HOMER PRO, NASA surface meteorology and solar energy database. www.homerpro.com. Accessed 30 Aug 2019
13. Ministry of Electricity Homepage. https://moelc.gov.iq/. Accessed 4 Oct 2019
14. Ministry of Planning Homepage. http://www.mop.gov.iq/. Accessed 4 Oct 2019
15. AliExpress Homepage. https://www.aliexpress.com/item/Solar-Panel-1000W-12v-10-Pcs-Lot-Solar-Power. Accessed 10 Oct 2019
16. AliExpress Homepage. https://www.aliexpress.com/item/1.5kw-220v-AC-use-wind-turbine-wind-generator-free-shipping-high-efficient. Accessed 10 Oct 2019
17. AliExpress Homepage. https://www.aliexpress.com/wholesale?catId=0&initiative_id=AS_20180310220504&SearchText=battery. Accessed 10 Oct 2019
18. AliExpress Homepage. https://www.aliexpress.com/wholesale?catId=0&initiative_id=SB_20180310215354&SearchText=converter. Accessed 10 Oct 2019
The Use of Gypsum and Waste Gypsum
for Electrical Grounding Backfill

Amizatulhani Abdullah, Nurmazuria Mazelan,


Mohd Yuhyi Mohd Tadza, and Rahisham Abd Rahman

Abstract Large quantities of gypsum arise from the production of titanium dioxide and from construction waste plasterboard. Due to strict legislation, alternative applications for both gypsums are desperately required. Gypsum-based compounds have been shown to improve the electrical conductivity of materials. In this study, both red gypsum (RG) and plasterboard gypsum were investigated for use as grounding enhancement backfill materials. The investigation aimed to develop a gypsum-based grounding enhancement material with low resistivity. Tests included determination of the geotechnical properties, plasticity characteristics and electrical resistivity corresponding to different states and water contents. The results showed that the geotechnical properties of RG differ from those of white gypsum. RG is classified as extremely highly plastic, whereas gypsum is classified as highly plastic; the plasticity index (PI) of RG was found to be 240% greater due to surplus Fe ions. When mixed with water, both materials can exist in four different states: solid, semi-solid, plastic and liquid. Surprisingly, the electrical resistivity of gypsum is lower under wet conditions irrespective of the state. However, for the long term and from a practical perspective, RG is the better material for grounding purposes, as the plasticity of gypsum was found to be short-lived, hardening almost instantaneously due to an exothermic reaction. Once hardened, the resistivity of gypsum behaves similarly to that of its dry state.

Keywords Gypsum · Titanium dioxide · Red gypsum · Plasticity · Grounding · Electrical resistivity

A. Abdullah · N. Mazelan · M. Y. M. Tadza
Faculty of Civil Engineering Technology, Universiti Malaysia Pahang, Lebuhraya Tun Razak, 26300 Gambang, Kuantan, Malaysia
e-mail: [email protected]
R. A. Rahman
Faculty of Electrical and Electronic Engineering, Universiti Tun Hussein Onn, 86400 Parit Raja, Batu Pahat, Johor, Malaysia

© Springer Nature Singapore Pte Ltd. 2021 1213


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_86

1 Introduction

Lightning strikes affect building structures by producing high electric field on the
surface and around the structure [1]. Currently, several methods have been pro-
posed to improve the lightning performance of transmission lines, including uti-
lizing unbalanced insulation, adding extra insulation, adding coupling overhead
ground wires, reducing protective angles of overhead shield wires, decreasing
grounding resistance of tower grounding devices, and installing line surge arresters
on transmission lines [2].
Although various approaches and safety measures have been considered, soil
type is still the primary factor to determine the grounding resistance. The main
problem depends entirely on the soil type, as different soil possesses different
behaviour and characteristic which include soil resistivity, ionization, and the level
of corrosive environment [3]. Electrical resistivity of the soil can be considered as a
proxy for the spatial and temporal variability of many other soil physical properties
(e.g. structure, water content, or fluid composition) [4].
As mentioned, soil resistivity is the key factor that determines the resistance of a
grounding electrode and the depth to which it must be driven to obtain low ground
resistance [5]. In fact, obtaining a low level of impedance in rocky and sandy soils is
somewhat impossible; therefore, many researchers have introduced several techniques to
reduce and maintain the grounding resistance at a low level [3]. In these cases, ground
enhancement materials or backfill materials were proposed to enhance the grounding system
to attain the required ground electrode resistance [6].
A good ground enhancement material should provide low earth resistance over a
long period with little variation in resistivity value [6]. Bentonite is one of the
suitable examples of backfill materials for decreasing and maintaining a low
grounding resistance of electrodes for a long time due to its high water absorption
and retention tendency [6]. According to research by Fukue et al. [7], the resistivity
of bentonite is high when the water content is low; however, when the water content in
bentonite is more than 40%, the resistivity is as low as 3 Ωm.
Recently, researchers have taken the initiative to study the possibility of using
waste products in grounding systems [3]. Several types of industrial waste products
have been identified for reducing grounding impedance, such as
fly ash, drilling mud and other industrial wastes [8–10]. Interestingly, biological waste
has also been tested and performed as a good grounding material [13]. The most
commonly used grounding material is Marconite, which is an engineered
waste material derived from carbon and metallurgy wastes [11]. In a recent study,
[18] noted that the use of waste gypsum from the titanium dioxide industry also showed
promising results as a ground enhancement material.
Currently, red gypsum (RG) is regarded as an industrial waste and is commonly
disposed of. In some cases, RG is sold as an agricultural soil conditioner, but this
application uses only a small proportion of the material produced due to the strict
regulatory regime in some countries. Due to the rising cost of disposal, it is essential

to find an alternative approach to reutilise RG [16]. In recent years, the applicability


of RG as a backfill material has been investigated, and RG is considered a useful waste in
the civil engineering field [17].
Tadza et al. [18] explored the use of RG for electrical grounding applications and
showed that under wet conditions, the resistivity of RG is low (i.e. 11 Ωm). In
comparison with pure gypsum, the resistivity of RG remained lower, with little
variation in resistivity value over a long period. Surprisingly, RG was found to be
unaffected by the hardening and solidification process that occurs in pure gypsum.
Similar to bentonite, RG has good plasticity and water retention characteristics
that make it a favourable material to be used for electrical grounding.
Information on the use of gypsum or any gypsum-based material as a grounding
material appears to be limited, if not unavailable. In particular, the long-term practical use of
gypsum-based materials as alternative materials for grounding applications has not
been well understood. In this study, the geotechnical properties of both gypsum and
RG were investigated to assess their use as grounding enhancement materials.
The objectives of the study were (i) to investigate the plasticity
characteristics of both gypsum and RG and (ii) to investigate the performance
of both materials as grounding enhancement materials.

2 Test Method

2.1 Material

Gypsum. Gypsum is a sulphate-based mineral made up of hydrated calcium
sulphate (CaSO4·2H2O). Depending upon the size of the gypsum crystals, gypsum
soil layers can have a powdery or a sandy appearance. Naturally, gypsum consists
of about 21% water by weight; however, this amount of water contributes about
50% by volume. Due to its abundance and unique physical and chemical properties,
gypsum is widely used as a construction material in many parts of the world [19].
Exposure of gypsum to heat causes it to lose most of its water and form calcium
sulphate hemihydrate (2CaSO4·H2O). When mixed with water, gypsum forms a
paste that dries out and sets to form a harder material.
In engineering, gypsum is used in the construction of plasterboards, drywalls and
finishing for walls and ceilings. It is commonly added to Portland cement to prevent
instantaneous hardening or flash setting [20]. Furthermore, gypsum is used for soil
improvements and is used in agriculture as a soil conditioning agent [21]. Apart
from that, RG is now being utilized as backfill material [22]. In this study, gypsum
was sourced from a local manufacturer.
Red Gypsum. Alternatively, red gypsum (RG) is a waste material derived from
the production of titanium dioxide (TiO2). RG is produced globally from the same
raw material and through similar processes. Titanium is extracted from the ilmenite mineral
(titanium iron oxide, FeTiO3) through an acid cracking process using
sulphuric acid. The subsequent neutralization process leads to the generation of a by-product,
RG [23]. The red colour is imparted by the iron from the ilmenite ore (FeTiO3) [24]. As
compared to normal gypsum, RG has varying states of hydration due to the
presence of iron oxide (i.e. 3–35%) and some other trace elements. Some minor
differences in composition were noted due to the presence of impurities or other attached
elements, notably from the ilmenite ore. Waste RG was sourced from Venator
Materials Corporation, located in Kemaman, Terengganu.

2.2 Characterization of Materials Used

The test methods carried out are listed in Table 1. General preparation of the RG
samples involved pulverising and sieving into a powder form passing 425 µm.
On the other hand, gypsum samples were obtained in powder form straight from
the manufacturer. The samples were then kept in sealed bags prior to being tested.
Most of the test methods were conducted following BS 1377 (1990). The chemical
properties were characterized using the cation exchange capacity (CEC). A simplified
method suggested by [25], in which an ammonium hydroxide solution was used to raise
the pH value to 7, was adopted for this purpose.

2.3 Mixed Design

In developing a suitable mix, the samples were mixed with varying amounts of water
to reach different states. As with other plastic materials, both gypsum and RG can exist in
four different states, namely solid, semi-solid, plastic and liquid. The samples were
carefully mixed with deionized water until each state was reached and then tested
for resistivity.

Table 1 Geotechnical and chemical properties test methods

Geotechnical properties     Testing method
Specific gravity, Gs        Density bottle (small pyknometer) method (BS 1377: Part 2: 1990: 8.3)
Liquid limit, LL            BS 1377: Part 2: 1990: 4.3
Plastic limit, PL           BS 1377: Part 2: 1990: 5.3
Shrinkage limit, SL         Standard Test Method for Shrinkage Factors of Soils by the Wax Method (ASTM D4943-08)
Specific surface area       Wet technique, EGME (BS 4359-1:1984)

Chemical properties         Testing method
Cation exchange capacity    Ammonium acetate method (Chapman, 1965; Lavkulich)

2.4 Electrical Resistivity Measurements

A Tinker & Rasor soil resistivity meter was used to measure the resistivity of both
gypsum-based materials under dry and wet conditions by adopting the Wenner
four-point approach. For dry conditions, powder specimens were carefully placed
directly into the soil-box apparatus, whereas under wet conditions, the specimens were
prepared at different water contents corresponding to different states before being
transferred into the soil box. In this study, wet specimens were considered at the
semi-solid, plastic and liquid states, each prepared at a different water content. The
conversion from a soil-box reading to resistivity is illustrated in the sketch below.
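To make the conversion explicit, the following sketch assumes a rectangular soil box and applies ρ = R·A/L; the box dimensions and the 500 Ω reading are hypothetical, since the actual apparatus geometry is not reported here.

```python
def soil_box_resistivity(resistance_ohm, cross_section_m2, electrode_spacing_m):
    """Resistivity (ohm.m) from a soil-box reading: rho = R * A / L."""
    return resistance_ohm * cross_section_m2 / electrode_spacing_m

# Hypothetical numbers: a 0.04 m x 0.04 m cross-section box with inner
# potential electrodes 0.1 m apart and a meter reading of 500 ohm.
rho = soil_box_resistivity(500.0, 0.04 * 0.04, 0.1)
print(f"resistivity = {rho:.1f} ohm.m")   # -> 8.0 ohm.m
```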

2.5 Impulse Breakdown Voltage Test

Dixon and Mood proposed the up-and-down method based on an estimation of the
normal distribution of 50% breakdown voltages [26, 27]. Figure 1 shows the setup
of the modulator training set with the point and plane electrodes at the measuring spark
gap tool. In this method, the initial voltage is set and then increased in steps of
3%–6% until the first breakdown occurs, following the up-and-down
method as detailed in the BS EN 60060 standard [28]. The voltage is then reduced
step by step, using the same fixed step amplitude, until breakdown no longer
occurs. This procedure was repeated for a total of 10 impulses in order to
achieve a substantial breakdown certainty. The average value of the applied volt-
ages is taken as the 50% breakdown voltage. The initial voltage level for all gap
distances in this study was set to 30 kV. The procedure is sketched below.
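A simplified illustration of the up-and-down procedure is given below; the breakdown test itself is represented by a hypothetical breaks_down() function and the fixed step size is assumed, so this is a sketch of the logic rather than the actual laboratory routine.

```python
import random

def up_down_50pct(breaks_down, initial_kv=30.0, step_kv=1.5, shots=10):
    """Simplified up-and-down (Dixon-Mood) estimate of the 50% breakdown voltage.

    After each impulse the voltage is lowered one step if breakdown occurred
    and raised one step otherwise; the mean of the applied voltages
    approximates the 50% breakdown level.
    """
    applied, v = [], initial_kv
    for _ in range(shots):
        applied.append(v)
        v = v - step_kv if breaks_down(v) else v + step_kv
    return sum(applied) / len(applied)

# Hypothetical specimen model: breakdown occurs above ~43 kV with some scatter.
u50 = up_down_50pct(lambda v: v + random.gauss(0.0, 2.0) > 43.0)
print(f"estimated 50% breakdown voltage = {u50:.1f} kV")
```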

3 Results and Discussion

3.1 Geotechnical Properties

The geotechnical and chemical properties of RG and gypsum are presented in


Table 2. Referring to Table 2, distinct differences in the geotechnical properties
were noted between the two materials. For instance, the specific gravity of RG was
3.61 as compared to 2.31 for gypsum. This higher value may be attributed to
the presence of iron; iron-rich soils usually have a high specific gravity, in the range
of 2.75 to 3.0 or even higher (ASTM D 854-92), whereas calcium carbonate normally has a
specific gravity value of about 2.7. In the case of RG, a
greater CEC value was attained owing to its higher surface area [25]. Similar to highly plastic clays such as
bentonite, the higher CEC value was also manifested in the consistency limits (i.e. LL,
PL and SL) of RG [29]. This resulted in higher plasticity characteristics compared
to gypsum. The plasticity index (PI) values were 33.69 and 10 for RG and gypsum,
respectively.

[Figure: single-stage DC voltage test circuit diagram]
Fig. 1 Single-stage DC Voltage Test Set-up used in this study: 1 HV Test Transformer, 2 Rectifier, 3 Charging Resistor, 4 Smoothing Capacitor, 5 Connecting Rod, 6 Measuring Resistor, 7 Sphere Gap, 8 Wavetail Resistor, 9 Wave Front Resistor, 10 Load Capacitor, 11 Material

Based on the geotechnical, plasticity and chemical composition of the materials, it


is expected that RG and gypsum can perform well as grounding materials. These
characteristics indicate that both RG and gypsum may have good ion (i.e. high CEC)
and water retention characteristics (i.e. high LL), which are ideal parameters for
grounding purposes. At the microscopic level, the water available within the pores of
the gypsums would act as interparticle bridging and enhance the flow of electrical current
[14, 15, 30]. This is crucial for dissipating the impulse voltage generated during a
lightning strike. However, the amount of water, which determines the state of the samples,
was found to be crucial for maintaining the resistivity behavior [12].
The plasticity index showed that RG is extremely highly plastic (PI = 33.69),
whereas gypsum is highly plastic (PI = 10). Thus, both materials can exist in four
different states depending upon the water content (see Table 2). Ideally, a good

Table 2 Geotechnical and chemical properties of red gypsum and gypsum

Geotechnical properties                   Red Gypsum   Gypsum
Specific gravity, Gs                      3.163        2.31
Liquid limit, LL (%)                      91.97        58
Plastic limit, PL (%)                     58.28        48
Shrinkage limit, SL (%)                   17.50        36
Plasticity index, PI (%)                  33.69        10
Specific surface area, B (m²/g)           814.72       17
Cation exchange capacity (meq/100 g)      87.63        10.00

grounding material should be (i) highly conductive and/or (ii) have high water
retention characteristics [14]. However, under long-term application, changes in the
environment would affect the amount of water stored within the material, and hence
may affect the resistivity and performance of the material as a grounding medium.
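As a quick consistency check on the limits reported in Table 2, the plasticity index follows directly as PI = LL − PL; a minimal sketch using the tabulated values:

```python
# Consistency check of the plasticity index in Table 2: PI = LL - PL.
limits = {"Red Gypsum": (91.97, 58.28), "Gypsum": (58.0, 48.0)}  # (LL %, PL %)
for material, (ll, pl) in limits.items():
    print(f"{material}: PI = {ll - pl:.2f} %")   # 33.69 % and 10.00 %
```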

3.2 Resistivity Behavior

Wetting and drying of the material under different weather conditions would also
affect the state the material is in at the time of a lightning strike [34]. Figure 2 shows the
differences between gypsum and RG tested under wet and dry conditions. Under
dry conditions, both gypsum and RG showed very high resistivity values
(i.e. >100,000 Ωm). Interestingly, under wet conditions, the resistivity
behavior changed with the state in which the samples were formed. For instance,
when the material was in its liquid state, the resistivity values reduced significantly
to 8.3 and 11 Ωm for gypsum and RG, respectively. However, when the samples
were in their plastic state, the resistivity increased slightly to 9.5 and 18.7 Ωm.
Similarly, the resistivity increased further when the water content reached the
semi-solid state. Under all wet conditions, irrespective of the state, the resistivity
of gypsum showed much lower values compared to RG, indicating that
gypsum would perform better as a grounding enhancement material when a direct
comparison is made between the two materials in a given state.
Surprisingly, gypsum could not retain the moisture over the long term and dried
out with time. Although the resistivity value of gypsum was initially low, it
was found to increase with time as the water dissipated (dried out) and the
gypsum hardened. On the other hand, RG did not harden with time and main-
tained its resistivity value over a long period of time.
Figure 3 shows the changes in the resistivity value of both RG and gypsum with
elapsed time. The resistivity value of gypsum was found to increase significantly
within a short amount of time (i.e. 5 min). On the other hand, the resistivity of RG
remained stable and somewhat unchanged. It was observed that RG remained wet
throughout the testing period, as it did not dry out as gypsum did.
When mixed with water, gypsum initially turned into a plastic slurry and, with time,
tended to lose its plasticity characteristics [31]. In this process, the water content
decreased gradually due to the chemical reaction between water and hemihydrate
gypsum, hydration and evaporation. At the end of the process, the gypsum condensed
and its particles crystallized. As water rapidly evaporated, a large number of pores was
generated, with a porosity of about 50–60% [32]. As gypsum hardened, the flow
of current was impeded. Furthermore, pore development and increased porosity
caused the electrical resistivity of gypsum to increase significantly within a short
period of time. Within minutes, the resistivity value of gypsum reverted back to its
original value (i.e. >10,000 Ωm).

[Figure: electrical resistivity (Ωm, log scale from 1 to 100,000) versus water content (%) for gypsum and red gypsum]
Fig. 2 Evolution of electrical resistivity with water content and different state

Fig. 3 Evolution of electrical resistivity with elapsed time



On the contrary, the much lower resistivity value maintained by RG was primarily due to the
presence of other impurities and a higher availability of cations. Ciżman et al. [33] showed
that a large Fe content is able to increase the electrical conductivity.
When the electrical conductivity is higher, the resistivity is lower. This enables
electrical current to flow through the specimens and improves the
grounding function of RG. The lack of water in gypsum after some time proved to be
its main drawback as a grounding material in the long term, and gypsum
would not be a viable material for this purpose. In other words, RG is the more suitable
material for long-term grounding applications.

3.3 Breakdown Voltage

Figures 4, 5, 6 and 7 show the breakdown characteristics when impulse was


injected up to 60 kV. The breakdown characteristics for gypsum and RG in the solid
state (shown in Figs. 4 and 6) show breakdown at about 43.12 and 42.30 kV impulse
voltage, respectively. There is an oscillation of rising and falling voltage with
declining amplitude as time progresses. Figure 4 shows the gypsum voltage rising rapidly
from 0–0.77 µs and dropping until 6.77 µs. The solid state of RG in Fig. 6 shows the
voltage increasing over 0–0.55 µs and decreasing until 6.06 µs.
On the other hand, Figs. 5 and 7 show the breakdown characteristics of gypsum
and RG in the semi-solid state. It is interesting to note that the voltage cut-offs were

[Figure: breakdown voltage (kV) versus time (µs)]
Fig. 4 Breakdown voltage of Gypsum (Solid)



[Figure: breakdown voltage (kV) versus time (µs)]
Fig. 5 Breakdown voltage of Gypsum (Semi-solid)

[Figure: breakdown voltage (kV) versus time (µs)]
Fig. 6 Breakdown voltage of Red Gypsum (Solid)

observed to be lower than their dry counterparts (33.36 and 35.00 kV, respec-
tively). The much greater water content at the semi-solid state lowered the breakdown
value for both materials. The breakdown voltage for gypsum at semi-solid state increased
from 0–0.19 µs, whereas that of RG increased from 0–0.25 µs. Later, it was observed
that the gypsum and RG voltages dropped significantly at about 4.44 and 4.19 µs,
respectively. Interestingly, at both the plastic and liquid states, no breakdown was
observed for either material. This indicates that the materials became highly con-
ductive and allowed the impulse to pass freely.

[Figure: breakdown voltage (kV) versus time (µs)]
Fig. 7 Breakdown voltage of Red Gypsum (Semi-solid)

Tables 3 and 4 show the respective 50% breakdown voltages. For each mate-
rial, shots of impulse were fired during the up-and-down method to determine the
50% breakdown voltage.
Surprisingly, it was noted that gypsum has a higher breakdown value than
RG irrespective of the state. Repeated experiments, and also a further experiment
conducted, showed that gypsum indeed exhibits a higher breakdown voltage compared
to RG. The availability of more cations within RG assisted in lowering the
breakdown voltage and provided a passage for current to flow. Among the backfill
materials under the dry condition, gypsum has the highest breakdown voltage, followed
by RG, while in the semi-solid condition gypsum has the lowest breakdown voltage. Gypsum
and RG at the plastic and liquid states could not be measured because of the high conductivity of
the water. Based on the results, gypsum performs better than RG, but gypsum dries
within 5 min and reverts to its solid-state condition. A low breakdown voltage is a
good property from the perspective of the performance of backfill materials. A backfill
material with a low breakdown voltage undergoes breakdown or ionisation relatively more
easily, thus temporarily raising its conductivity in the event of a lightning strike [6].

Table 3 50% breakdown voltage of gypsum

Material state    Water content (%)    50% breakdown voltage (kV)    Standard deviation (kV)
Solid             –                    43.12                         3.33
Semi-solid        36                   33.36                         2.32
Plastic limit     48                   –                             –
Liquid limit      59                   –                             –

Table 4 50% breakdown voltage of red gypsum

Material state    Water content (%)    50% breakdown voltage (kV)    Standard deviation (kV)
Solid             –                    42.00                         2.14
Semi-solid        22                   35.00                         1.33
Plastic limit     68                   –                             –
Liquid limit      90                   –                             –

4 Conclusion

Based on the findings of this study, the following conclusions were drawn:
The plasticity characteristics of both gypsum-based materials play a crucial role
in determining their performance as grounding materials. Both materials can perform
as grounding materials when wet; however, the amount of water should be less than
their own plastic limits (i.e. plastic state). Due to the exothermic reaction and
evaporation occurring in gypsum, it tends to lose its plasticity and water retention
characteristics, which in turn affects the resistivity value. As this process
is time dependent and occurs rapidly, gypsum would not make a good grounding
material. On the other hand, this behaviour was not observed in RG. Owing to its high plasticity
and stability, RG was found to perform better as a geoelectrical
grounding material compared to gypsum. Furthermore, the lower impulse
breakdown value exhibited by red gypsum irrespective of its state confirmed this aspect.

Acknowledgements The authors wish to extend their gratitude to Universiti Malaysia Pahang
and the Ministry of Education for funding the research under fundamental research scheme FRGS/
1/2017/TK05/UMP/02/2 (RDU170133) and RDU 170343.

References

1. Omar MF, Haris HCM, Munim WNWA (2016) Development of energy harvesting method
using shock absorber. Int J Eng Technol 2(3):401–408
2. Youping T, Jinliang H, Rong Z (2006) Lightning impulse performances of grounding devices
covered with low-resistivity materials. IEEE Trans Power Deliv 21(3):1706–1713
3. Azmi A, Ahmad NA, Yiew LK, Abdul-Malek Y (2019) The use of enhancement material in
grounding system: a review. Indones J Electr Eng Comput Sci 13(2):453–460
4. Samouëlian A, Cousin I, Tabbagh A, Bruand A, Richard G (2005) Electrical resistivity survey
in soil science: a review. Soil Tillage Res 83(2):173–193
5. Igboama W, Ugwu N (2011) Fabrication of resistivity meter and its evaluation. Am J Sci Ind
Res 2(5):713–717
6. Lim SC, Gomes C, Kadir MZAA, Nourirad G, Malek ZA (2015) Behaviour of backfill
materials for electrical grounding systems under high voltage conditions. J Eng Sci Technol
10(6):811–826

7. Fukue M, Minato T, Horibe H, Taya N (1999) The micro-structures of clay given by


resistivity measurements. Eng Geol 54(1–2):43–53
8. Gomes C, Ab Kadir MZA, Kottachchi CL, Lim SC (2014) Industrial wastes and natural
substances for improving electrical earthing systems. Int J Electr Eng 21(2):39–47
9. Nazar NSM, Thanakodi S, Muhammad H (2017) New waste material to enhance the
performance of grounding system. Telkomnika 15(4)
10. Rahimi S, Siddiqua S (2017) Relationships between degree of saturation, total suction, and
electrical and thermal resistivity of highly compacted bentonite. J Hazard Toxic Radioact
Waste 22(2):04017025
11. Dărab PC, Turcu A, Constantin PICĂ, Pavel SG, Beleiu H (2018) Reducing earth grounding
electrical resistance by using metallurgy industrial waste as backfill materials. Acta Technica
Napocensis-Series: Appl Math Mech Eng 61(1)
12. Egashira M, Izumi T, Yoshimoto N, Morita M (2016) Determining water content in AC for
double-layer capacitor electrodes. J Power Sources 326:635–640
13. Anyasi FI, Obinabo EC (2007) Electrical properties of biological wastes used as effective soil
conditioners for electrical earthing. Council Sci Ind Res 66:873–879
14. Tadza MYM, Mohamad D, Tripathy S, Rahman RA, Ismail MAM (2019) Bentonite and
marconite for electrical grounding applications from geotechnical engineering perspective. In:
AIP conference proceedings, vol 2129, no 1. AIP Publishing, p 020078
15. Tadza MYM, Anuar THHT, Yahaya FM (2019) Investigation on electrically conductive
aggregates as grounding compound produced by marconite. Civil Environ Eng Rep (CEER) 3
(30):86–96
16. Hughes PN, Glendinning S, Manning DA, White ML (2011) Use of red gypsum in soil
mixing engineering applications. Proc Inst Civil Eng-Geotech Eng 164(3):223–234
17. August AE, Noble BC, Tooze JF (2013) Red gypsum in civil engineering applications,
EP1474367A1, European Patent Office
18. Tadza MYM, Mazelan N, Yusri NF, Abdullah A (2019) Some geotechnical properties of pure
and waste gypsum for geoelectrical grounding applications. Civil Environ Eng Rep (CEER)
30(3):97–106
19. Bhamidipati R (2016) Use of laboratory geophysical and geotechnical investigation methods
to characterize gypsum rich soils. MSc dissertation, University of Kentucky
20. Samson G, Phelipot-Mardelé A, Lanos C (2017) Thermal and mechanical properties of gypsum–
cement foam concrete: effects of surfactant. Eur J Environ Civil Eng 21(12):1502–1521
21. Tang J, Bullard JW, Perry LN, Feng P, Liu J (2018) An empirical rate law for gypsum powder
dissolution. Chem Geol 498:96–105
22. Rahman M, Ghataora G (2011) Use of waste gypsum for trench backfill. Int J Geotech Eng 5
(4):405–413
23. Azdarpour A, Karaei MA, Hamidi H, Mohammadian E, Honarvar B (2018) CO2
sequestration through direct aqueous mineral carbonation of red gypsum. Petroleum 4
(4):398–407
24. Fauziah I, Zauyah S, Jamal T (1996) Characterization and land application of red gypsum: a
waste product from the titanium dioxide industry. Sci Total Environ 188(2–3):243–251
25. Tadza MYM, Azmi NSM, Mustapha R, Desa ND, Samuding K (2017) Malaysian alternative
to international reference bentonite buffer in underground nuclear waste repository. In: AIP
conference proceedings, vol 1799, no 1. AIP Publishing, p 030002
26. Liu Q, Wang ZD (2013) Breakdown and withstand strengths of ester transformer liquids in a
quasi-uniform field under impulse voltages. IEEE Trans Dielectr Electr Insul 20(2):571–579
27. Hauschild W, Mosh W (1992) Statistical techniques for high-voltage engineering. The
Institution of Engineering and Technology, UK
28. BS EN 60060 - 1 (2010) High-voltage test techniques. Part 1: General definitions and test
requirements: British Standard Institution Std.
29. Tripathy S, Tadza MYM, Thomas HR (2014) Soil-water characteristic curves of clays. Can
Geotech J 51(8):869–883
30. Mitchell JK, Soga K (2005) Fundamentals of soil behavior, vol 3. Wiley, Hoboken

31. Zhang H (2011) Building materials in civil engineering, 1st edn. Woodhead Publishing,
Sawston
32. Schilling CH (2001) Colloid casting in encyclopedia of materials: science and technology,
2nd edn. Elsevier, Amsterdam
33. Ciżman A, Rysiakiewicz-Pasek E, Krupiński M, Konon M, Antropova T, Marszałek M
(2017) The effect of Fe on the structure and electrical conductivity of sodium borosilicate
glasses. Phys Chem Chem Phys 19(34):23318–23324
34. Lim SC, Gomes C, Ab Kadir MZA (2013) Characterizing of bentonite with chemical,
physical and electrical perspectives for improvement of electrical grounding systems. Int J
Electrochem Sci 8(9):11429–11447
Energy-Efficient Superframe Scheduling
in Industrial Wireless Networked
Control System

Duc Chung Tran, Rosdiazli Ibrahim, Fawnizu Azmadi Hussin,


and Madiah Omar

Abstract In industrial wireless control and monitoring applications, wireless


sensor and actuator network (WSAN) nodes are usually deployed in the field; thus, stringent
requirements on the nodes' battery life-time are demanded. This is to extend the
continuous servicing period of field instruments. Recent related works have pro-
posed several communication approaches to improve overall battery life-time by
reducing the nodes' power consumption. However, in the case of wireless control
systems for process plants, the importance of the communication superframe needs to
be considered. Therefore, this paper describes an adaptive superframe approach
taking into consideration the process's response and characteristics to improve the
nodes' life-time. From the simulation results obtained, a significant saving (up to
approximately 50%) in energy consumption can be achieved.

Keywords Superframe · Battery · Life-time · Industrial wireless control · WirelessHART

1 Introduction

Wireless sensor actuator network (WSAN) is an emerging technology that is applied in
various fields. The key advantages that lead to the popularity of WSANs are the ease of
wireless node development, placement and reconfiguration [1, 2]. The recent
advancement of both wireless and electronic technologies has allowed the
deployment of wireless nodes (WNs) in industrial environments [3–6]. Their
application is mainly for real-time remote monitoring of process parameters such as
temperature, equipment vibration, pressure, liquid level, torque, heat flow profiles,
and other important measurements [3, 7, 8]. However, due to factors like delay

D. C. Tran
FPT University, Hoa Lac Hi-Tech Park, Hanoi 155300, Vietnam
e-mail: [email protected]; [email protected]
R. Ibrahim · F. A. Hussin · M. Omar
Universiti Teknologi PETRONAS, 32610 Bandar Seri Iskandar, Perak, Malaysia

© Springer Nature Singapore Pte Ltd. 2021 1227


Z. Md Zain et al. (eds.), Proceedings of the 11th National Technical Seminar on
Unmanned System Technology 2019, Lecture Notes in Electrical Engineering 666,
https://doi.org/10.1007/978-981-15-5281-6_87

uncertainty [9, 10, 42], their control application is limited and still under on-going
research [11–13]. In addition, another major concern for using WNs in control is
their batteries' life-time constraint [14–16].
A battery-powered WN has a limited service duration. This duration depends on
multiple factors such as the battery's characteristics (discharge current, initial
capacity), ambient temperature, etc. For improving the battery's life-time, apart from
improving its characteristics, an energy-efficient management approach can be applied.
The latter has an advantage over the former since it does not rely at all on the
physical characteristics of the battery. Instead, it deals with how the battery is used
in specific applications. This approach can be realized through optimizing the superframe,
providing efficient routing protocols with regard to constrained route metrics [17]
and improving the reliability of network-wide communication [18]. These works have
mainly focused on the wireless network and communication aspects and have not
explored the control aspect. Therefore, this research proposes an approach to
improve the battery life-time of WNs used in a typical WSAN, shown in Fig. 1. In this
figure, the controller is connected to a wireless gateway to send control signals to and receive
feedback signals from the remote plant. The plant is connected to a battery-powered
WN to transmit signals to and receive signals from the gateway connected to the controller.
The significance of this work lies in (i) a methodology to improve the battery
life-time for a typical first-order process plant using an energy-efficient adaptive
superframe scheme; and (ii) an analysis of the simulation results obtained by the pro-
posed approach.
The remainder of this paper is organized as follows. Section 2 presents a review
of battery life-time monitoring and extension techniques. Section 3 outlines the
proposed methodology for improving the battery life-time using an adaptive super-
frame. Numerical results and analysis are presented in Sect. 4, and finally Sect. 5
concludes this paper.

Fig. 1 Typical wireless sensor actuator network (WSAN)



2 Industrial Wireless Node Life-Time

2.1 Life-Time Requirement

Industrial wireless technologies such as WirelessHART, ISA100 Wireless have


enabled wireless control and monitoring applications in process plants [10, 11, 19–
22]. In industrial wireless plants, field instruments are usually powered using bat-
teries for simplicity of deployment. Each battery has a limited capacity, which prevents
the respective instrument from operating indefinitely and results in a limited life-time.
Currently, it is expected that a battery-powered field device should have a life-time
of 4 to 10 years [8, 23, 41]. This significantly depends on the superframe and the
ambient temperature [24]. Since controlling the temperature in an open-space envi-
ronment such as that at industrial plants is impossible, it is important to monitor
and improve the life-time of wireless instruments so that a proper maintenance plan
can be scheduled.

2.2 Life-Time Monitoring

Life-time estimation for the lithium-thionyl chloride battery used in wireless pressure
and temperature transmitters at different update rates and temperatures was briefly
reported by Blevins et al. [8]. However, the specification of the battery and the life-time
estimation procedures were not detailed [8, 19, 25]. On the other hand, for esti-
mating the power consumption of a WN, the MC13192 transceiver's specification was
loaded into the UnibsFramework models. In the case of no interference with an IEEE
802.11b wireless network, the average current consumption for the actuator and sensor
nodes was 212 and 226 µA respectively [11]. The power calculation in this work
was based only on the transceiver; that of the microcontroller (MCU), a significant
power-consuming module in a WN circuit, was not considered. A more detailed
energy consumption analysis of the WirelessHART TDMA protocol based on the OMNeT
++ simulator was presented in [25]. The analysis was performed for an IEEE
802.15.4-compliant ChipCon CC2420 transceiver and an 8-MHz MSP430 MCU. In
order to carry out the simulation, it was assumed that the MCU and transceiver
were active or asleep at the same time to reflect event-driven or reactive systems.
The MCU had two operational states and a nominal supply voltage of 3.3 V, while the
transceiver had four states. The power consumption per state of each device was
preloaded into the physical layer model for obtaining the energy consumption. It was often
calculated on the basis of per-node transactions [24] and timeslots, i.e., 10 ms [25].
The burst message superframe was randomly selected from only three values: 1, 30,
and 60 s. In addition, to achieve control performance comparable to a traditional
wired control system (i.e., Foundation Fieldbus), one should configure the superframe
to be a multiple of timeslots [26].

Looking into the energy consumption of a WN, recent works [27–29] have used
energy consumption models with respect to the travelling distance of the message.
Verma et al. [27] only studied the residual capacity of an AA-type battery
using linear and service-life estimator models, while Dohare et al. [28] paid attention
to the network's life-time with respect to its coverage. The specialty of the energy con-
sumption model in the work of Tao et al. [30] was that it considered the relaying
energy dissipation of a WN when transmitting a message through various inter-
mediate relays in the network before the message reached its destination. Hou and
Zheng [31] studied the nonlinearity of the battery model for energy consumption
estimation of a WN; however, the paper lacked a detailed battery life-time calculation. Both
fundamental issues in WSANs, namely full network coverage and life-time, were
studied by Wang et al. [32]. However, in general, ensuring network connectivity
requires either more WNs to be deployed or an increase in their transmission range
by means of higher transmission power. Both result in an increase in the overall network's
power consumption, which shortens its life-time.
Similar to a cellular transceiver [33], the transceiver in an industrial WN is considered
one of the most power-consuming modules [34, 35]. The energy cost function of
the transceiver involved only two terms: active energy to maintain the network
interface card and transmit energy for transmitting and receiving wireless data. In the
time division multiple access (TDMA) scheme for industrial wireless networks,
each node is scheduled to update its status at a specific timeslot, i.e., strictly within
10 ms (or a multiple of 10 ms) for WirelessHART. A fixed communication schedule
(superframe) in a wireless network can result in an increase in wasted energy con-
sumption by the WNs. A clear example is an industrial process plant
whose response is approaching the desired setpoint. In this situation, it is not necessary
to keep updating the process response frequently, as insignificant changes with respect to the
setpoint will be observed. Therefore, the superframe needs to be updated dynamically to
allow WNs to enter the inactive period more often. This means the communication time
can be shortened and thus less energy is consumed.

2.3 Life-Time Extension

Several possible ways to extend the life-time of wireless nodes are optimizing the elec-
tronic circuit, improving network reliability, and using a renewable power source as a
secondary power supply.
Optimizing the electronic circuit is one of several suitable approaches to improve the
overall life-time of a WN's battery [23, 36, 37]. However, it can only be done at
manufacturer sites. The end-users have only constrained choices when selecting WNs that
meet their requirements for installation location, voltage supply range, power con-
sumption, and other instrument specifications. Therefore, the application of this
approach is very limited from the end-users' or process plant operators' perspectives.
Reliable communication helps to prevent unnecessary retransmissions by a faulty
node, thus reducing its energy consumption [18]. However, this approach is not
without the price of deploying more WNs to improve the overall network's reliability
[38]. Therefore, extending the life-time of one node can increase the deployment cost
by adding another redundant node.
At present, equipment installed at industrial process plants is often not yet
equipped with renewable energy sources [8]. In addition, not every place in a
process plant is available for renewable energy harvesting. Hence, the application of
this approach in this context is limited.
It is seen that there is an inevitable demand for extending the battery life-time of
WNs in industrial process plants. Due to the limitations of the aforementioned
approaches, this paper proposes an adaptive superframe scheduling for the WN to
significantly reduce energy consumption, thus improving its battery life-time. This is
in contrast with the current practice at industrial wireless plants, in which the WNs'
communications are pre-configured. In order to facilitate the mathematical derivation
of the proposed approach, a list of the symbols used in this work is presented in
Table 1.

Table 1 List of symbols

Symbol     Meaning
ε          Feedback error
ε%         Percent feedback error
ε%nn       New percent feedback error
Δε%nn      Difference in percent feedback error
Ef         Fixed superframe energy consumption
Ea         Adaptive superframe energy consumption
Ev         Energy saving through adaptive superframe
Et         Transmit energy
Er         Receive energy
Es         Sleep energy
Enn        Energy consumption to reach setpoint
fb         Benefit function
k          Number of superframes
K          Process gain
M          Step input magnitude
N*         Non-zero integer number set
n          Number of timeslots
nn         Number of new timeslots
Pt         Transmit power
Pr         Receive power
Ps         Sleep power
Pnn        Power consumption to reach setpoint
t          Current time instant
tt         Transmitting duration
tr         Receiving duration
ts         Sleep duration
τ          Process time constant
tre        Remaining run-time to reach setpoint
te         End-time to reach setpoint
T0         WirelessHART timeslot
Ts         Superframe
Tsn        New superframe
U(s)       Process input transfer function
y(t)       Process output in time domain
Y(s)       Process output transfer function

3 Methodology

3.1 Timeslot, Superframe

A timeslot is defined as a period in which a complete round-trip direct communi-
cation with an acknowledgment is completed between two devices in a wireless network.
For the WirelessHART standard, the timeslot for such communication is 10 ms [39].
Although communication between the devices can take place in such a short period
of time, in a real scenario, due to the internal clock drift of each device, the communi-
cation timeslot allocated for the communication can exceed 10 ms.
A collection of multiple timeslots forms a superframe (Fig. 2). In addition, a
fixed superframe schedule results in an unchanged periodic communication
between two nodes. However, an adaptive superframe schedule can vary the
superframe length to adapt to the communication needs of the network. This
approach is useful for preserving battery life-time, as the wireless instrument is not
required to update its information to the control center continuously; the com-
munication with the control center is aperiodic.
The relationship between superframe, sleep duration, transmitting duration, and
receiving duration is realized from Fig. 2. In this figure, given a preset superframe,
the following relationship can be established:

Ts = tt + tr + ts    (1)

where Ts is the superframe period, tt is the transmitting duration, tr is the receiving duration, and ts is the sleep duration.

Fig. 2 Timeframe relationship in a single update period
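As a small numerical illustration of (1) with a superframe built from WirelessHART timeslots, the sketch below uses assumed values for the number of slots and the transmit/receive durations; these figures are for illustration only.

```python
# Illustration of Eq. (1) with the 10 ms WirelessHART timeslot; the split of the
# superframe into transmit, receive and sleep durations below is assumed.
T0 = 0.010            # WirelessHART timeslot [s]
n = 100               # assumed number of timeslots per superframe
Ts = n * T0           # superframe length: 1.0 s
tt, tr = 0.004, 0.004 # assumed transmit and receive durations [s]
ts = Ts - tt - tr     # the remainder of the superframe is spent sleeping
print(f"Ts = {Ts:.3f} s, sleep fraction = {ts / Ts:.1%}")   # 99.2% asleep
```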

3.2 The Need for Adaptive Superframe

The current practice in industrial monitoring applications is to preset the value of


the superframe Ts, with multiple redundant transmissions, during the network configuration
stage to improve the network's communication reliability. With this approach, the WN
will be active during the entire superframe. As a result, multiple redundant trans-
missions occur and the active duration increases. This results in reduced WN battery
life-time. The superframe is definable in the process control system for ease of cus-
tomizing the network-wide configuration for a specific application's needs.
Considering the coexistence of WirelessHART and other wireless local area net-
works (WLAN) in an industrial environment, it was suggested that the superframe
length should be 100 ms plus a randomized offset uniformly distributed in the range
of [−5 ms, +5 ms] to avoid communication interference [11]. In order to achieve
control performance comparable to that of wired Foundation Fieldbus, the
WirelessHART network superframe should be scheduled to allow at least one
retransmission trial for communication between the source and destination nodes [26].
Usually at process plants, the superframes are preset when deploying WNs to the
field. An interesting question would be: should the fixed superframe be necessary
for the process plant at all times? A stable process plant usually has a stable con-
trolled variable as well. Hence, fixing the superframe length will result in unnec-
essary redundant sensing or controlling information being routed in the network.
Similar feedback information results in an insignificant control action to the process
plant and vice versa. Therefore, firstly, it wastes energy in carrying
out the task. Secondly, it results in unnecessary control actions being taken by the
actuator. These lead to the need for the development of an adaptive superframe
scheduling mechanism that ensures the process value is sampled adaptively based on
actual needs.

3.3 Wireless Node Energy Consumption Reduction


Through Adaptive Superframe Scheduling

In this paper, a first-order continuous-time system with transfer function described


in (2) is considered.

Y(s)/U(s) = K/(τs + 1)    (2)

where K is the process gain, τ is the process time constant, Y(s) is the process output transfer
function, and U(s) is the process input transfer function.
The plant response to an M-magnitude step input is expressed as:

Y(s) = KM/(s(τs + 1))    (3)

Expressing the process time constant τ as a multiple of the superframe Ts, one
has

τ = kTs    (4)

where k ∈ N*.
Based on [40], as a rule of thumb, for ensuring controllability of the process
plant, k must satisfy the following condition:

k ≥ 4    (5)

With a fundamental update period T0, set to the WirelessHART timeslot, the
superframe is expressed as

Ts = nT0    (6)

where n ∈ N*.
From (4) and (6) one has

τ = knT0    (7)

Combining (5) and (7), one obtains

1 ≤ n ≤ ⌊τ/(4T0)⌋    (8)

where ⌊·⌋ is the floor operator.



In the time domain, the solution for the plant's output in (3) is

y(t) = KM(1 − e^(−t/τ))    (9)

Hence, based on (7) one has

y(t) = KM(1 − e^(−t/(knT0)))    (10)

Compared to the setpoint, the feedback error is

ε = M − y(t) = M(1 − K + K e^(−t/(knT0)))    (11)

The percent feedback error is

ε% = ε/M = 1 − K + K e^(−t/(knT0))    (12)

The use of the percent feedback error helps to eliminate the importance of the step
input's magnitude when considering the control performance of the system. The
remaining run time to reach the setpoint is

tre = te − t    (13)

where t is the current time instant and te is the end time to reach the setpoint. In a control system,
it is approximately equal to five times the process time constant:

te = 5τ    (14)
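To make (9)–(14) concrete, the short sketch below evaluates the percent feedback error at a few time instants; the plant parameters (K = 1, τ = 10 s, T0 = 10 ms, k = 4, hence n = 250) are assumed for illustration and are not taken from the paper.

```python
import math

# Assumed illustrative values: unit process gain, 10 s time constant,
# 10 ms WirelessHART timeslot and the rule-of-thumb k = 4.
K, tau, T0, k = 1.0, 10.0, 0.010, 4
n = int(tau / (k * T0))   # Eq. (7): tau = k*n*T0  ->  n = 250 timeslots
te = 5 * tau              # Eq. (14): approximate time to reach the setpoint

def percent_error(t):
    """Percent feedback error of Eq. (12) for a step input."""
    return 1.0 - K + K * math.exp(-t / (k * n * T0))

for t in (0.0, tau, 2 * tau, te):
    print(f"t = {t:5.1f} s   error = {100 * percent_error(t):5.1f} %   remaining = {te - t:5.1f} s")
```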

At this point, the controller has two options: to retain the update period or to
reschedule a new update period for the WN. If the second option is selected, a
question that should be addressed is: at which superframe is the benefit function (fb)
maximized? Let nn be the new number of fundamental periods forming the new
superframe. The condition on the superframe in (8) still holds:

1 ≤ nn ≤ ⌊τ/(4T0)⌋    (15)

In this paper, the benefit function fb is defined as the amount of remaining
percent feedback error over the amount of energy consumed to reach the setpoint, as
described in (16):

fb = Δε%nn / Enn    (16)

where Δε%nn is the difference in percent feedback error between the new and
previous superframes and Enn is the expected energy consumption to reach the setpoint [mJ], which
typically consists of three energy components: the transmit energy Et, receive
energy Er and sleep energy Es [mJ].
These energy components are calculated using (17):

Et = Pt tt
Er = Pr tr    (17)
Es = Ps ts = Ps (nn T0 − tt − tr)

where Pt is the power consumption during transmit mode, tt is the transmitting dura-
tion, Pr is the power consumption during receive mode, tr is the receiving duration, Ps is the
power consumption during sleep mode, and ts is the sleep duration.
The new percent feedback error is

ε%nn = 1 − K + K e^(−(t + nn T0)/(k nn T0))    (18)

Hence,

fb = K (e^(−(t + nn T0)/(k nn T0)) − e^(−t/(k n T0))) / (Pt tt + Pr tr + Ps (nn T0 − tt − tr))    (19)

Based on (19), one can assess and find the most suitable value of nn at which fb is
highest. Similar to (6), the new superframe is then updated as

Tsn = nn T0    (20)

Overall, this section has presented a new method to calculate and update the
superframe for communication with the WN. A sketch of this selection step is given below.
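A minimal sketch of the selection step just described is shown below. The plant values, the per-superframe transmit/receive durations and the use of the absolute error difference are simplifying assumptions; the power figures are derived from the node currents in Table 2 at a 3.6 V supply, not from the paper's simulation code.

```python
import math

# Assumed illustrative parameters (see lead-in note above).
K, tau, T0, k = 1.0, 10.0, 0.010, 4
n = int(tau / (k * T0))                                  # current timeslots per superframe
Pt, Pr, Ps = 3.6 * 9.7e-3, 3.6 * 4.5e-3, 3.6 * 0.8e-6    # transmit/receive/sleep power [W]
tt, tr = 0.004, 0.004                                    # assumed transmit/receive durations [s]

def benefit(t, nn):
    """Benefit function of Eq. (19): percent-error reduction per mJ for candidate nn."""
    d_err = abs(math.exp(-(t + nn * T0) / (k * nn * T0)) - math.exp(-t / (k * n * T0))) * K
    e_mj = 1e3 * (Pt * tt + Pr * tr + Ps * (nn * T0 - tt - tr))   # Eq. (17) energies, in mJ
    return 100 * d_err / e_mj

def new_superframe(t):
    """Pick Tsn = nn*T0 maximising fb, subject to the bound of Eq. (15)."""
    nn_max = int(tau / (4 * T0))
    nn_best = max(range(1, nn_max + 1), key=lambda nn: benefit(t, nn))
    return nn_best * T0

print(f"suggested superframe at t = 5 s: {new_superframe(5.0):.2f} s")
```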

4 Numerical Results and Analysis

The overall WN power consumption and the simulation results on control performance
and the energy saved are presented in this section, with the basic node specifications given in Table 2. The important characteristic of a
WN such as the DC9003A-C from Linear Technology is that it supports an ultra-low sleep
current of 0.8 µA and a low active current of 1.3 mA. The mote supports radio
transmission and reception at two signal strengths, 0 dBm and 8 dBm, with
transmit currents of 5.4 mA and 9.7 mA respectively, while the receive current is
constant at 4.5 mA.

Table 2 Wireless node basic specifications

Item                        Value     Unit
Operating voltage           3.6       V
Transmit current (0 dBm)    5.4       mA
Transmit current (8 dBm)    9.7       mA
Receive current             4.5       mA
Active current              1.3       mA
Deep sleep current          0.0008    mA
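From these figures, a back-of-the-envelope per-state energy for a single 10 ms WirelessHART timeslot can be obtained as E = V·I·t; this is only an illustration, since the simulations in this section use the full communication schedule rather than a single slot.

```python
# Per-state energy for one 10 ms WirelessHART timeslot, E = V * I * t,
# using the Table 2 figures (illustration only).
V, slot = 3.6, 0.010
currents_ma = {"transmit @ 0 dBm": 5.4, "transmit @ 8 dBm": 9.7,
               "receive": 4.5, "active": 1.3, "deep sleep": 0.0008}
for state, i_ma in currents_ma.items():
    energy_mj = V * (i_ma * 1e-3) * slot * 1e3
    print(f"{state:17s}: {energy_mj:.5f} mJ per slot")
```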

4.1 Wireless Node Power Consumption

Figures 3 and 4 present the scheduled superframe and fb of the process plant for the
wireless node, under the condition that the node is in active mode when it is not
transmitting or receiving signals. In addition, the transmit power is at the default of
0 dBm. It is seen from Fig. 3 that the superframe is adaptive and its length is
converging. During the transition of the process to the new setpoint, the superframe
has a longer length. This is because the response of the plant is predictable given a

[Figure: superframe length Ts (s) versus time t (s)]
Fig. 3 Adaptive superframe



[Figure: benefit function (%/mJ) versus time t (s)]
Fig. 4 Benefit function of adaptive superframe

specific control signal. In addition, during this rise-time duration, the process plant
has not yet reached its desired setpoint; thus, a lower update rate is needed. After five
times the process time constant, the superframe length reduces significantly and
stays around or below 2 s. Although more updates are required during the
settling state of the plant, this is done through the proposed adaptive superframe
technique based on the fb value. This means the updates are scheduled to achieve the
highest fb given the system's characteristics.
In Fig. 4, fb is highest (about 0.22% error reduction per mJ spent) during the
transition of the plant to the new setpoint. The value decreases significantly after
about two process time constants. This is because after this period, the plant
output has reached about two-thirds of the desired setpoint, thus significantly
reducing the error with respect to the setpoint. The closer to the setpoint, the less fb the system
can achieve. As seen from the figure, fb converges to a value of approximately
0.02%/mJ. Overall, the changes in fb are in line with those of the superframe.

4.2 Plant Performance

Based on Fig. 5, the plant response of the adaptive superframe approach is similar
to that of the fixed superframe approach. This means the plant with the adaptive
superframe is able to achieve good setpoint tracking performance from a control
perspective. In addition, the key difference between the two approaches is that during
the transition to the new setpoint, the adaptive superframe approach does not need
to periodically communicate with the WN at the plant. This results in up to 50%
energy saving (i.e., 236.21 mJ, see Table 3) compared with the energy spent (i.e.,
485.95 mJ) in the fixed superframe approach.

4.3 System’s Performance at Different Operating Conditions

In order to further illustrate the effectiveness of the proposed adaptive superframe


approach, the system's performance under different operating conditions is pre-
sented in Table 3. In general, when moving from low transmit power to high
transmit power, the energy spent by the wireless node increases for both the fixed and

[Figure: plant response versus time t (s) for dynamic Ts and fixed Ts]
Fig. 5 Plant responses



Table 3 System's performance at different operating conditions (energies in mJ)

Operational mode when not communicating    Ef        Ea        Ev
Transmit power = 0 dBm
  Active                                   485.95    249.74    236.21
  Deep sleep                               18.29     1.22      17.07
Transmit power = 8 dBm
  Active                                   493.77    251.48    241.29
  Deep sleep                               26.10     1.68      24.42

adaptive superframe approaches. The amount of energy saved by using the adaptive
superframe is most significant if the deep sleep mode is used. In this mode, the
system can save 93.33% and 93.56% of the energy for 0 dBm and 8 dBm transmission
powers respectively, while about 50% energy saving can be achieved when the
WN's mode is active.
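The quoted percentages follow directly from Table 3 as the ratio Ev/Ef; a quick check using the tabulated values:

```python
# Energy-saving percentages implied by Table 3: saving = Ev / Ef.
cases = {"0 dBm, active":     (485.95, 236.21),
         "0 dBm, deep sleep": (18.29, 17.07),
         "8 dBm, active":     (493.77, 241.29),
         "8 dBm, deep sleep": (26.10, 24.42)}
for name, (ef, ev) in cases.items():
    print(f"{name:18s}: {100 * ev / ef:.2f} % saved")
```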

5 Conclusion

This paper has presented a method to adaptively adjust the superframe of a wireless
networked control system. The adaptive mechanism is accom-
plished by maximizing the benefit function, which represents the amount of setpoint
error reduction over the amount of energy consumed by the wireless node. This is to
significantly reduce the energy consumption of the field device, thus improving its battery
life-time. The presented results reveal that the amount of energy saved using the
adaptive superframe is approximately 50% or more when the node is in either active or sleep
mode while not communicating with other nodes.

Acknowledgements The authors acknowledge the support of FPT University Hanoi, Vietnam;
and Universiti Teknologi PETRONAS (UTP), Bandar Seri Iskandar, Malaysia through the funding
of the Fundamental Grant: 015LC0-045.

References

1. Cecilio J, Furtado P (2014) Architecture for uniform (re)configuration and processing over
embedded sensor and actuator networks. IEEE Trans Ind Inform 10(1):53–60
2. Chen J, Cao X, Cheng P, Xiao Y, Sun Y (2010) Distributed collaborative control for
industrial automation with wireless sensor and actuator networks. IEEE Trans Ind Electron 57
(12):4219–4230
3. Grilo AM, Chen J, Diaz M, Garrido D, Casaca A (2014) An integrated WSAN and SCADA
system for monitoring a critical infrastructure. IEEE Trans Ind Inform 10(3):1755–1764

4. Kyusakov R, Eliasson J, Delsing J, van Deventer J, Gustafsson J (2013) Integration of


wireless sensor and actuator nodes with it infrastructure using service-oriented architecture.
IEEE Trans Ind Inform 9(1):43–51
5. Tran CD, Ibrahim R (2019) Building ambient temperature measurement using industrial
wireless mesh technology. In: IEEE student conference on research and development
(SCOReD), pp 146–151
6. Yoo SE, Chong PK, Kim D, Doh Y, Pham ML, Choi E, Huh J (2010) Guaranteeing real-time
services for industrial wireless sensor networks with IEEE 802.15.4. IEEE Trans Ind Electron
57(11):3868–3876
7. Linderman LE, Jo H, Spencer BF (2015) Low-latency data acquisition hardware for real-time
wireless sensor applications. IEEE Sens J 15(3):1800–1809
8. Blevins T, Chen D, Nixon M, Wojsznis W (2015) Wireless control foundation - continuous
and discrete control for the process industry, 4th edn. International Society of Automation
(2015)
9. Drieberg M, Asirvadam VS, Zheng FC (2014) Accurate delay analysis in prioritised wireless
sensor networks for generalized packet arrival. IEEE Wirel Commun Lett 3(2):205–208
10. Saifullah A, Xu Y, Lu C, Chen Y (2015) End-to-end communication delay analysis in
industrial wireless networks. IEEE Trans Comput 64(5):1361–1374
11. Ferrari P, Flammini A, Rizzi M, Sisinni E (2013) Improving simulation of wireless networked
control systems based on wirelessHART. Comput Stand Interfaces 35(6):605–615
12. Chung TD, Ibrahim RB, Asirvadam VS, Saad NB, Hassan SM (2016) Adopting EWMA filter
on a fast sampling wired link contention in wirelessHART control system. IEEE Trans
Instrum Meas 65(4):836–845
13. Ascorti L, Savazzi S, Galimberti S (2014) Cloud-based wirelessHART networking for critical
industrial monitoring and control. In: IEEE international conference on industrial informatics
(INDIN), vol 12, pp 362–369
14. Dondi D, Bertacchini A, Brunelli D, Larcher L, Benini L (2008) Modeling and optimization
of a solar energy harvester system for self-powered wireless sensor networks. IEEE Trans Ind
Electron 55(7):2759–2766
15. Chung TD, Ibrahim R, Asirvadam VS, Saad N, Hassan SM (2017) WirelessHARTTM: filter
design for industrial wireless networked control systems, 1st edn. CRC Press, Boca Raton
16. Magno M, Boyle D, Brunelli D, Popovici E, Benini L (2014) Ensuring survivability of
resource-intensive sensor networks through ultra-low power overlays. IEEE Trans Ind Inform
10(2):946–956
17. Lu T, Zhu J (2013) Genetic algorithm for energy-efficient QoS multicast routing. IEEE
Commun Lett 17(1):31–34
18. Nobre M, Silva I, Guedes LA (2014) Reliability evaluation of wirelessHART under faulty
link scenarios. In: IEEE international conference on industrial informatics (INDIN), vol 12,
pp 676–682
19. Nawaz F, Jeoti V (2016) Performance assessment of wirelessHART technology for its
implementation in dense reader environment. Computing 98(3):257–277
20. Chung TD, Ibrahim RB, Asirvadam VS, Saad NB, Hassan SM (2015) Simulation of
wirelessHART networked control system with packet dropout. In: Asian control conference
(ASCC), vol 10, pp 1–6
21. Chen D, Nixon M, Han S, Mok AK, Zhu X (2014) WirelessHART and IEEE 802.15.4e. In:
IEEE international conference on industrial technology (ICIT), vol 1, pp 760–765
22. Tran CD, Ibrahim R, Asirvadam VS, Saad N, Sabo Miya H (2018) Internal model control for
industrial wireless plant using wirelessHART hardware-in-the-loop simulator. ISA Trans
75:236–246
23. Torfs T, Sterken T, Brebels S, Santana J, van den Hoven R, Spiering V, Bertsch N, Trapani D,
Zonta D (2013) Low power wireless sensor network for building monitoring. IEEE Sens J 13
(3):909–915
24. Zand P, Mathews E, Havinga P, Stojanovski S, Sisinni E, Ferrari P (2014) Implementation of
wirelessHART in the NS-2 simulator and validation of its correctness. Sensors 14(5):8633
1242 D. C. Tran et al.

25. Khader O, Willig A (2013) An energy consumption analysis of the wireless HART TDMA
protocol. Comput Commun 36(7):804–816
26. Zhu X, Lin T, Han S, Mok A, Chen D, Nixon M, Rotvold E (2012) Measuring wirelessHART
against wired fieldbus for control. In: IEEE international conference on industrial informatics,
vol 10, pp 270–275
27. Verma VK, Singh S, Pathak NP (2015) Optimized battery models observations for static,
distance vector and on-demand based routing protocols over 80.211 enabled wireless sensor
networks. Wirel Pers Commun 81(2):503–517
28. Dohare U, Lobiyal DK, Kumar S (2014) Energy balanced model for lifetime maximization in
randomly distributed wireless sensor networks. Wirel Pers Commun 78(1):407–428
29. Asharioun H, Asadollahi H, Wan T-C, Gharaei N (2015) A survey on analytical modeling and
mitigation techniques for the energy hole problem in corona-based wireless sensor network.
Wirel Pers Commun 81(1):161–187
30. Tao M, Lu D, Yang J (2012) An adaptive energy-aware multi-path routing protocol with load
balance for wireless sensor networks. Wirel Pers Commun 63(4):823–846
31. Houand R, Zheng M (2014) Packet-based nonlinear battery energy consumption optimizing
for WSNS nodes. IEICE Electron Express 11(9):167
32. Wang H, Huang Y, Roman HE (2014) Some fundamental results on complex network
problem for large-scale wireless sensor networks. Wirel Pers Commun 77(44):2927–2943
33. Zhang L, Zhang X, Qu K, Ren L, Deng J, Zhu K (2015) Green and cooperative dash in
wireless D2D networks. Wirel Pers Commun 84(3):1797–1816
34. Rahhal JS (2012) Wireless MIMO sensor network with power constraint WLS/BLUE
estimators. Wirel Pers Commun 63(2):447–457
35. Zeng M, Lee J-G, Choi G-S, Lee J-A (2009) Intelligent sensor node based a low power ecg
monitoring system. IEICE Electron Express 6(9):560–565
36. Magno M, Jelicic V, Srbinovski B, Bilas V, Popovici E, Benini L (2016) Design,
implementation, and performance evaluation of a flexible low-latency nanowatt wake-up
radio receiver. IEEE Trans Ind Inform 12(2):633–644
37. Lin T-H, Kaiser WJ, Pottie GJ (2004) Integrated low-power communication system design for
wireless sensor networks. IEEE Commun Mag 42(12):142–150
38. Silva I, Guedes LA, Portugal P, Vasques F (2012) Reliability and availability evaluation of
wireless sensor networks for industrial applications. Sensors 12(1):806
39. IEC (2010) Industrial communication networks - wireless communication network and
communication profiles - wirelessHART, 1st edn. International Electrotechnical Commission
40. Seborg DE, Edgar TF, Mellichamp DA (2004) Process dynamics and control, 2nd edn. Wiley,
New York
41. Tran DC, Ibrahim R, Bingi K, Battery’s life-time estimation of industrial WirelessHART
sensor actuator node. Arab J Sci Eng
42. Chung TD, Ibrahim RB (2020) Industrial wireless networked control system with
dynamically tuned EWMA filter. In: 2020 5th International Conference on Intelligent
Information Technology (ICIIT), pp 1–7
Design of Two Axis Solar Tracker Based
on Optoelectrical Tracking Using
Hybrid FuGA Controller

Imam Abadi, Erma Hakim Setyawan, and D. R. Pramesrani

Abstract One of the solutions to optimize the output power of a solar panel is to add a solar tracker system to the panel. A solar tracker is a system that moves a solar panel to follow the direction of the sun so that the panel stays perpendicular to it. The solar tracker system consists of sensors, a controller, DC motors and the solar panel. The two-axis active solar tracker uses four LDR sensors, representing the North, South, East and West directions, to capture the light intensity received by the solar panel. The panel moves in two directions: yaw and pitch. This research uses a fuzzy controller whose parameters are optimized by a genetic algorithm. The test results show that the hybrid fuzzy-genetic algorithm control system can be applied to the active two-axis solar tracker, with the best performance index obtained using 3 membership functions. The pitch angle at 70° has a rise time (tr) of 0.23 s, settling time (ts) of 1.05 s, maximum overshoot (Mov) of 0%, and steady-state error (ess) of 0.02%, while the yaw angle at 80° has tr = 5.37 s, ts = 22.16 s, Mov = 0% and ess = 0.1%. The increase in energy efficiency produced by the mobile active solar tracker is 62.15%.

Keywords Active solar tracker · Fuzzy logic controller · Genetic algorithm · Solar tracker

I. Abadi · E. H. Setyawan · D. R. Pramesrani
Engineering Physics Department, Faculty of Industrial Technology, Sepuluh Nopember Institute of Technology, Surabaya 60111, Indonesia
e-mail: [email protected]


1 Introduction

The consumption of renewable energy such as solar, geothermal, hydro, wind and biomass energy is increasing as a replacement for fossil energy. Solar energy is the alternative with the highest potential among the various renewable energies in Indonesia. This is because Indonesia is located on the equator, between 6° North and 11° South latitude and between 95° and 141° East longitude, so it receives high solar radiation throughout the year, with an average daily irradiation of 4.8 kWh/m2 [1].
Solar energy can be utilized after it is converted into electrical energy; a technology capable of converting solar energy into electrical energy, called a solar cell, is therefore required. Solar cells produce electrical power that is directly proportional to the amount of solar radiation they receive. The changing position of the sun limits the amount of solar radiation received by the solar cells, so the electric power production is not optimal. Solar cell systems are therefore developed further by adding a solar tracker. A solar tracker is a system that drives solar panels to follow the movement of the sun so that they are always perpendicular to it, optimizing the output power generated by the solar cells. Solar tracker systems can be classified according to several criteria [2]. The first classification is determined by the number of axes of rotation, i.e. single-axis and two-axis solar trackers. A single-axis solar tracker tracks the sun from East to West using two Light Dependent Resistor (LDR) sensors, whereas a two-axis solar tracker can track the movement of the sun from East to West and from North to South using four LDR sensors [3]. The other classification is determined by its activity, i.e. active and passive solar trackers. An active solar tracker uses LDRs as sensors to detect the position of the sun, whereas a passive solar tracker relies on calculation of the sun's position.
Previous research by Huang et al. [4] on a two-axis solar tracking system with a fuzzy controller concluded that the sensors used in a solar tracker with fuzzy logic control work more accurately than with traditional control. In that research, the two fuzzy parameters, the fuzzy rules and the membership functions, were determined from the literature and human experts, so the parameters used were not optimal. In this research we therefore develop a solar tracker system using a combination of fuzzy logic control and a genetic algorithm (GA). In this system the genetic algorithm serves to find the optimal membership functions, so that the expected solar cell efficiency increases.

2 Methodology

Fuzzy Control Based on Genetic Algorithm


In this paper, the GA serves as an optimization method applied to fuzzy logic control. The fuzzy parameter optimized by the GA is the range of its membership functions. The fuzzy-genetic algorithm flow chart can be seen in Fig. 1 [7].
The first step is to determine the fuzzy membership functions, and then the objective function and the GA parameters. In this system the objective function is the Mean Square Error (MSE), which is calculated using the equation below.

$$\mathrm{MSE} = \frac{1}{m}\sum_{i=1}^{m}\left(\hat{y}_i - y_i\right)^2 \qquad (1)$$

where $m$ is the number of samples, $y_i$ the desired value, and $\hat{y}_i$ the output value.
After determining the objective function and the GA parameters, the fuzzy system is optimized using the GA. The optimization result (the membership function ranges) is then checked against the criteria; if it meets them, the result is applied to the FIS, otherwise the optimization step is repeated.
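As an illustration of this loop, the following minimal Python sketch evolves a vector of membership-function range parameters against the MSE of Eq. (1). The function simulate_tracker is a hypothetical stand-in for the controller-plus-plant simulation described in the next subsections, and a real-coded GA with illustrative bounds is used for brevity, whereas the paper encodes each gene with 20 bits (Table 3).

```python
import random

# Minimal sketch of the fuzzy-GA loop (illustrative only). 'simulate_tracker'
# is a hypothetical stand-in: it takes a vector of membership-function range
# parameters and returns the sampled tracker response for a given setpoint.

def mse(desired, actual):
    # Objective function of Eq. (1).
    return sum((d - a) ** 2 for d, a in zip(desired, actual)) / len(actual)

def fitness(mf_ranges, setpoint, simulate_tracker):
    response = simulate_tracker(mf_ranges)
    return mse([setpoint] * len(response), response)

def optimize_mf(simulate_tracker, setpoint, n_genes=6,
                pop_size=200, generations=100, pc=0.8, pm=0.01):
    # Real-coded variant for brevity; the paper encodes each gene with 20 bits.
    pop = [[random.uniform(-90.0, 90.0) for _ in range(n_genes)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=lambda ind: fitness(ind, setpoint, simulate_tracker))
        elite = scored[:pop_size // 10]                    # elitism: keep the best
        children = []
        while len(children) < pop_size - len(elite):
            p1, p2 = random.sample(scored[:pop_size // 2], 2)
            cut = random.randrange(1, n_genes) if random.random() < pc else n_genes
            child = p1[:cut] + p2[cut:]                    # one-point crossover
            child = [g + random.gauss(0.0, 1.0) if random.random() < pm else g
                     for g in child]                       # mutation
            children.append(child)
        pop = elite + children
    return min(pop, key=lambda ind: fitness(ind, setpoint, simulate_tracker))
```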
Fig. 1 Fuzzy-genetic algorithm flow chart

System Modeling
The model of the active solar tracker system consists of the models of its constituent components, i.e. the LDR sensor model, the DC motor model, and the solar panel (PV) model.
LDR Sensor Model
The two-axis active solar tracker uses 4 LDR sensors, one each for the East, West, North and South directions. The North and South LDR sensors are used to track the altitude position of the sun, while the azimuth position of the sun is tracked using the West and East LDR sensors. The LDR sensor converts the light intensity into a resistance. The resistance output of the LDR sensor needs to be converted into a voltage using a voltage divider circuit before it can be used as the controller input. In the voltage divider circuit the LDR sensor is connected in series with a 1000 Ω resistor. The relationship between the LDR input and output is given by the following equation [6]:

$$\log R = -\frac{3}{4}\log I + 5 \qquad (2)$$

The effective light intensity received by the solar panels can be written as:

$$I_{\mathrm{eff}} = I_{\max}\sin\theta \qquad (3)$$

where $I_{\max}$ is the maximum intensity of sunlight received by the LDR sensor.
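To make Eqs. (2) and (3) concrete, the minimal sketch below converts an incidence angle into an LDR divider voltage. The 5 V divider supply, the output taken across the fixed resistor, and the value I_max = 100000 lux are assumptions for illustration only; only the 1000 Ω series resistor and Eqs. (2)-(3) are taken from the text. With these assumptions the computed voltages come out close to the 4.87-4.92 V setpoints used in Sect. 3.

```python
import math

R_FIXED = 1000.0    # series resistor in the divider, from the text (ohm)
V_SUPPLY = 5.0      # assumed divider supply voltage
I_MAX = 100000.0    # assumed maximum sunlight intensity (lux)

def ldr_resistance(intensity_lux):
    # Eq. (2): log R = -(3/4) log I + 5
    return 10.0 ** (5.0 - 0.75 * math.log10(intensity_lux))

def effective_intensity(incidence_deg, i_max=I_MAX):
    # Eq. (3): I_eff = I_max * sin(theta)
    return i_max * math.sin(math.radians(incidence_deg))

def ldr_voltage(incidence_deg):
    # Assumed wiring: the output is measured across the fixed resistor,
    # so the voltage rises as the light intensity increases.
    r_ldr = ldr_resistance(effective_intensity(incidence_deg))
    return V_SUPPLY * R_FIXED / (R_FIXED + r_ldr)

v_70deg = ldr_voltage(70.0)   # roughly 4.9 V under the assumptions above
```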


DC Motor Model
The active two-axis solar tracking system uses 2 DC motors to move the solar panel in the pitch and yaw directions. The DC motor used here is a 12 V DC motor. The DC motor parameters used for the modeling, i.e. the back e.m.f. constant, torque constant, motor inertia, motor resistance, motor inductance, and viscous friction coefficient, were obtained by experiment and are listed in Table 1.
A DC motor consists of two parts, an electrical and a mechanical one. The DC motor model is derived from Kirchhoff's voltage law and Newton's second law:

$$V_a(t) = R_a I_a(t) + L_a \frac{dI_a(t)}{dt} + K_b\,\omega(t) \qquad (4)$$

$$K_T I_a(t) = J_m \frac{d\omega(t)}{dt} + B_m\,\omega(t) \qquad (5)$$

Taking the Laplace transform of (4) and (5) gives:

$$I_a(s) = \frac{1}{R_a + L_a s}\left(V_a(s) - K_b\,\omega(s)\right) \qquad (6)$$

Table 1 DC motor parameters

  Parameter                           Quantity
  Back e.m.f. constant (Ke)           27.17 Vs/rad
  Torque constant (Kt)                27.17 Nm/A
  Motor resistance (R)                9.33 Ω
  Motor inductance (L)                0.009 H
  Motor inertia (Jm)                  242.697 Nms2
  Viscous friction coefficient (Bm)   0.615 Nms/rad

$$\omega(s) = K_T I_a(s)\,\frac{1}{J_m s + B_m} \qquad (7)$$

The relationship between the angle and the angular velocity is:

$$\theta(s) = \frac{1}{s}\,\omega(s) \qquad (8)$$

The DC motor can thus be modeled from (6), (7) and (8) with the parameters obtained, as shown in Fig. 2.

Fig. 2 DC motor model
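The block diagram of Fig. 2 is not reproduced here, but the following minimal sketch integrates Eqs. (4), (5) and (8) directly with the Table 1 parameters (forward Euler, open loop, 12 V step). It is an illustrative stand-in under those assumptions, not a reproduction of the authors' Simulink model.

```python
# Open-loop simulation of the DC motor model of Eqs. (4)-(8) with the Table 1
# parameters, using forward-Euler integration (illustrative only).

KE, KT = 27.17, 27.17        # back-e.m.f. constant (Vs/rad), torque constant (Nm/A)
R, L = 9.33, 0.009           # armature resistance (ohm), inductance (H)
JM, BM = 242.697, 0.615      # rotor inertia (Nms^2), viscous friction (Nms/rad)

def simulate(v_a=12.0, t_end=5.0, dt=1e-4):
    i_a = omega = theta = 0.0
    trace = []
    for k in range(int(t_end / dt)):
        di = (v_a - R * i_a - KE * omega) / L      # Eq. (4)
        domega = (KT * i_a - BM * omega) / JM      # Eq. (5)
        i_a += di * dt
        omega += domega * dt
        theta += omega * dt                        # Eq. (8): angle from angular velocity
        trace.append((k * dt, theta, omega))
    return trace

response = simulate()   # angle and angular velocity for a 12 V step input
```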
Solar Panel Model
The data used to model the solar panel are the radiation and temperature, obtained by experiment, and the solar panel parameters in Table 2, obtained from the datasheet. The experiment was carried out for 11 h, from 06.00 to 17.00 WIB, with measurements every 30 min [8–10]. The solar panel is modeled based on the single-diode equivalent circuit, which consists of 4 components: a light source (photocurrent), a diode, the PV internal resistance (Rpv), and a shunt resistor.
The load current can be calculated according to Kirchhoff's current law as:

$$I = I_{ph} - I_d - I_{sh} \qquad (9)$$

Ideally $I_{sh}$ is zero because $R_{sh} \to \infty$; therefore Eq. (9) can be rewritten as follows:

$$I = I_{ph} - I_d = I_{ph} - I_o\left[\exp\!\left(\frac{V + I R_{pv}}{V_T}\right) - 1\right] \qquad (10)$$

$I$ and $V$ can also be represented as functions of radiation and temperature as shown below:



Table 2 Solar panel parameters

  Parameter                                           Quantity
  Short circuit current (Isc)                         9 A
  Open circuit voltage (Voc)                          36.8 V
  Maximum power current (Imp)                         8.34 A
  Maximum power voltage (Vmp)                         30 V
  Temperature coeff. of short circuit current (μ)     0.00053 A/°C
  Temperature coeff. of short circuit voltage (φ)     −0.0036 V/°C
  PV internal resistance (Rpv)                        0.15603 Ω
  Maximum power (Wp)                                  250 W
$$I = \frac{S}{S_{ref}}\,\mu\,(T - T_{ref}) + \left(\frac{S}{S_{ref}} - 1\right) I_{sc} + I_{mp} \qquad (11)$$

$$V = \varphi\,(T - T_{ref}) - R_{pv}\,(I - I_{mp}) + V_{mp} \qquad (12)$$

The reference values are $S_{ref} = 1000\ \mathrm{W/m^2}$ and $T_{ref} = 25\ ^\circ\mathrm{C}$, whereas $R_{pv}$ can be calculated using (13):

$$R_{pv} = \frac{\varphi_{ref}\,\ln\!\left(1 - \dfrac{I_{mp}}{I_{sc}}\right) + V_{oc} - V_{mp}}{I_{mp}} \qquad (13)$$

$$\varphi_{ref} = \frac{2V_{mp} - V_{oc}}{\dfrac{I_{sc}}{I_{sc} - I_{mp}} + \ln\!\left(1 - \dfrac{I_{mp}}{I_{sc}}\right)} \qquad (14)$$

The solar panel model can be constructed from the equations above, as shown in Fig. 3.

Fig. 3 Solar panel model
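A minimal numerical sketch of Eqs. (11)-(14) with the Table 2 parameters is given below; it is illustrative only and is not the authors' Simulink implementation. With these parameters Eq. (13) gives Rpv of about 0.156 Ω, consistent with Table 2, and at the reference conditions the model returns the maximum power point.

```python
import math

# Numerical sketch of the simplified PV model of Eqs. (11)-(14) with the
# Table 2 parameters (illustrative only).

ISC, VOC = 9.0, 36.8          # short-circuit current (A), open-circuit voltage (V)
IMP, VMP = 8.34, 30.0         # maximum-power current (A) and voltage (V)
MU, PHI = 0.00053, -0.0036    # temperature coefficients (A/degC, V/degC)
S_REF, T_REF = 1000.0, 25.0   # reference irradiance (W/m^2) and temperature (degC)

PHI_REF = (2 * VMP - VOC) / (ISC / (ISC - IMP) + math.log(1 - IMP / ISC))   # Eq. (14)
RPV = (PHI_REF * math.log(1 - IMP / ISC) + VOC - VMP) / IMP                 # Eq. (13)

def pv_output(S, T):
    # Eq. (11): panel current as a function of irradiance S and temperature T
    i = (S / S_REF) * MU * (T - T_REF) + (S / S_REF - 1.0) * ISC + IMP
    # Eq. (12): panel voltage
    v = PHI * (T - T_REF) - RPV * (i - IMP) + VMP
    return i, v, i * v

# RPV evaluates to about 0.156 ohm (Table 2 lists 0.15603 ohm); at reference
# conditions the model returns the maximum power point (8.34 A, 30 V, ~250 W).
i_mp, v_mp, p_mp = pv_output(1000.0, 25.0)
```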



Membership Function Modification

The control system designed in this research is a fuzzy-genetic algorithm controller. The controller has two inputs, the error and the delta error, and its output is a PWM signal. This research uses a Takagi-Sugeno fuzzy system.
1. Input Membership Function
The variation considered in this paper is the number of membership functions, i.e. 3 or 5 membership functions for each of pitch and yaw.
a. 3 input membership functions
In this variation, each fuzzy input has 3 membership functions, namely N, Z, and P (Figs. 4 and 5).

Fig. 4 3 membership functions of error and delta error pitch

Fig. 5 3 membership functions of error and delta error yaw

Fig. 6 5 membership functions of error and delta error pitch

Fig. 7 5 membership functions of error and delta error yaw

b. 5 input membership functions
In this variation, each fuzzy input has 5 membership functions, namely NB, NS, Z, PS, and PB (Figs. 6 and 7).
2. Fuzzy Membership Function Modification with GA
The GA parameter values, which include the population size, number of generations, elitism, crossover probability, mutation probability and number of bits, were determined by trial and error. The GA parameters are shown in Table 3; a sketch of the bit decoding is given after the table. After the optimization using the GA, new membership functions are obtained, as presented in Figs. 8 and 9.
Design of Two Axis Solar Tracker Based on Optoelectrical Tracking … 1251

Table 3 Genetic algorithm parameters

  Parameter               3 MF             5 MF
                          Pitch    Yaw     Pitch    Yaw
  Population number       200      200     200      200
  Generation number       100      100     100      100
  Elitism                 0.9      0.9     0.9      0.9
  Crossover probability   0.8      0.8     0.8      0.8
  Mutation probability    0.01     0.01    0.01     0.01
  Number of bits          20       20      20       20
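The paper encodes each optimized parameter with 20 bits (Table 3) but does not state the decoding bounds; the sketch below shows one common way such a gene could be decoded, with the [-90°, 90°] bounds being an assumption for illustration.

```python
def decode_gene(bits, lo=-90.0, hi=90.0):
    # Decode a binary gene (list of 0/1) into a real value in [lo, hi];
    # the bounds are illustrative, the paper does not state them.
    value = int("".join(str(b) for b in bits), 2)
    return lo + (hi - lo) * value / (2 ** len(bits) - 1)

gene = [1, 0] * 10               # one 20-bit gene, as in Table 3
breakpoint_deg = decode_gene(gene)   # membership-function breakpoint candidate
```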

Fig. 8 3 input membership functions for, a Pitch, b Yaw

Fig. 9 5 input membership functions for, a Pitch, b Yaw

Table 4 Rule base of fuzzy-GA with 3 MF for pitch

  e \ Δe   N       Z    P
  N        CCWF    S    CWF
  Z        CCWS    S    CWS
  P        CCWF    S    CWF

Table 5 Rule base of fuzzy-GA with 3 MF for yaw

  e \ Δe   N       Z       P
  N        CWF     CWS     CWF
  Z        S       S       S
  P        CCWF    CCWS    CCWF

Table 6 Rule base of fuzzy-GA with 5 MF for pitch

  e \ Δe   NB      NS      ZE      PS      PB
  NB       CWF     CWS     CWS     CWF     CWF
  NS       CCWS    CWS     CWS     CWS     CWF
  ZE       S       S       S       S       S
  PS       CCWF    CCWS    CCWS    CCWS    CCWS
  PB       CCWF    CCWF    CCWS    CCWS    CCWF

Table 7 Rule base of fuzzy-GA with 5 MF for yaw

  e \ Δe   NB      NS      ZE   PS     PB
  NB       CCWF    CCWF    S    CWS    CWF
  NS       CCWF    CCWS    S    CWS    CWS
  ZE       CCWS    CCWS    S    CWS    CWS
  PS       CCWS    CCWS    S    CWS    CWF
  PB       CCWF    CWS     S    CWF    CWF

3. Rule Base
The fuzzy-GA control system with 3 membership functions uses 9 rules and the fuzzy-GA control system with 5 membership functions uses 25 rules [7] (Tables 4, 5, 6 and 7); a minimal inference sketch based on these rules is given after this list.
4. Output Membership Function
The output of this control system is a PWM signal in the range −255 to 255. A negative value indicates the direction of DC motor rotation. The five fuzzy output categories are CCWF (Counter Clockwise Fast), CCWS (Counter Clockwise Slow), S (Stop), CWS (Clockwise Slow), and CWF (Clockwise Fast), as shown in Fig. 10.

Fig. 10 Fuzzy output
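For illustration, the sketch below performs one Takagi-Sugeno inference step for the pitch controller using the 3-MF rule base of Table 4. The triangular membership breakpoints and the output singleton PWM values are assumptions (in the paper the MF ranges are the quantities tuned by the GA), and the min t-norm with weighted-average defuzzification is the usual Sugeno choice, assumed here; only the rule table itself comes from Table 4.

```python
# One Takagi-Sugeno inference step for the pitch controller (illustrative).

def tri(x, a, b, c):
    # Triangular membership function with vertices a <= b <= c.
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def grades(x, span):
    # N, Z, P memberships over an assumed symmetric range [-span, span].
    return {"N": tri(x, -2 * span, -span, 0.0),
            "Z": tri(x, -span, 0.0, span),
            "P": tri(x, 0.0, span, 2 * span)}

RULES = {("N", "N"): "CCWF", ("N", "Z"): "S", ("N", "P"): "CWF",
         ("Z", "N"): "CCWS", ("Z", "Z"): "S", ("Z", "P"): "CWS",
         ("P", "N"): "CCWF", ("P", "Z"): "S", ("P", "P"): "CWF"}   # Table 4

PWM = {"CCWF": -255, "CCWS": -128, "S": 0, "CWS": 128, "CWF": 255}  # assumed singletons

def pitch_pwm(error, delta_error, span=45.0):
    e, de = grades(error, span), grades(delta_error, span)
    num = den = 0.0
    for (le, lde), label in RULES.items():
        w = min(e[le], de[lde])          # rule firing strength (min t-norm)
        num += w * PWM[label]
        den += w
    return num / den if den else 0.0     # weighted-average (Sugeno) defuzzification

pwm = pitch_pwm(error=20.0, delta_error=-5.0)
```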

3 Simulation and Experimental Results

3.1 Setpoint Testing

The performance of the designed controllers in the setpoint tests is measured according to several indicators, i.e. rise time (tr), settling time (ts), maximum overshoot (Mov), and steady-state error (ess); a sketch of how these indicators can be extracted from a sampled response is given below.
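The exact definitions used for these indicators are not stated in the paper; the minimal sketch below assumes the common 10-90% rise-time and ±2% settling-band conventions.

```python
# Extract step-response indicators from sampled time/angle arrays
# (assumed 10-90 % rise time and +/-2 % settling band).

def step_indices(t, y, setpoint):
    t10 = next((ti for ti, yi in zip(t, y) if yi >= 0.1 * setpoint), t[0])
    t90 = next((ti for ti, yi in zip(t, y) if yi >= 0.9 * setpoint), t[-1])
    # settling time: last instant the response is outside the +/-2 % band
    ts = max((ti for ti, yi in zip(t, y)
              if abs(yi - setpoint) > 0.02 * setpoint), default=t[0])
    overshoot = max(0.0, (max(y) - setpoint) / setpoint * 100.0)
    ess = abs(setpoint - y[-1]) / setpoint * 100.0
    return {"tr": t90 - t10, "ts": ts, "Mov": overshoot, "ess": ess}
```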
1. Pitch Setpoint Test with 3 Membership Functions
The pitch setpoint test is performed using setpoints of 30, 50 and 70°. The output angle response of the pitch motor is shown in Fig. 11.
The figure shows that fuzzy-GA with 3 membership functions for the pitch angle has a longer rise time than fuzzy, but a faster settling time. In addition, the average error produced by fuzzy-GA at each setpoint, 1.14%, is smaller than that of fuzzy. The graph also shows that the fuzzy-GA controller has no overshoot, which means that it is able to reach and follow the setpoint without oscillation.

Fig. 11 Response of pitch angle with 3 membership functions for, a 30°, b 50°, c 70°

Fig. 12 Response of output voltage of LDR with 3 membership functions for, a 4.8729 V,
b 4.9071 V, c 4.92 V

Another variable to consider in the setpoint test, besides the angle, is the output voltage of the LDR. The setpoint values in this test are the voltages generated by the altitude angles. The response of the LDR output voltage is shown in Fig. 12.
The response graph shows that fuzzy-GA with 3 membership functions in the LDR setpoint test has a faster rise time and settling time than the fuzzy controller, with time differences of 0.047 and 0.037 s respectively. The average error produced by the fuzzy-GA controller at each setpoint, 0.028%, is smaller than that of the fuzzy controller.
2. Yaw Setpoint Test with 3 Membership Functions
This setpoint test is performed using the fuzzy-GA controller with 3 membership functions and 2 step input variations, 30 and 80°. Figure 13 shows the response of the yaw angle.
Based on Fig. 13, fuzzy-GA with 3 membership functions for yaw has a longer rise time than fuzzy, but a faster settling time. In addition, the average error produced by fuzzy-GA at each setpoint, 0.515%, is smaller than that of fuzzy.
As in the pitch test, besides the angle, the output voltage of the LDR is also measured. The LDR setpoint test uses two voltages generated by the azimuth angles; the output voltages of the LDR generated by the azimuth angles 30 and 80° are 4.9151 and 4.7274 V respectively. Figure 14 shows the response of the West LDR output voltage.

Fig. 13 Response of yaw angle with 3 membership functions for, a 30°, b 80°

Fig. 14 Response of output voltage of LDR with 3 membership functions for, a 4.9151 V,
b 4.7274 V

Figure 14 shows that the fuzzy-GA controller has a longer rise time but reaches steady state faster than the fuzzy controller, with an average settling time difference over the setpoints of 6.26 s. In addition, the error produced by the fuzzy-GA controller at each setpoint is smaller than that of the fuzzy controller, by 0.002% for 4.9151 V and 0.01% for 4.7274 V.
3. Pitch Setpoint Test with 5 Membership Functions
In this pitch setpoint test we give input angles representing the altitude of the sun, i.e. 30, 50 and 70° (Fig. 15).

Fig. 15 Response of pitch angle with 5 membership functions for, a 30°, b 50°, c 70°

The graph shows that fuzzy-GA with 5 membership functions for the pitch angle has a faster settling time than fuzzy. The average error produced by the fuzzy-GA controller at each setpoint is 1.16%, which is smaller than that of the fuzzy controller.
As in the pitch test with 3 membership functions, the next step after analyzing the angle response is to analyze the LDR voltage response. The setpoints used to test the LDR output voltage are the voltages converted from the angles 30, 50 and 70°. Figure 16 shows the response of the LDR output voltage.
Figure 16 shows that the fuzzy-GA controller with 5 membership functions in the LDR setpoint test has a faster rise time than the fuzzy controller at 4.92 V and a longer one at 4.8729 and 4.9071 V. The graph also shows that the average error produced by the fuzzy-GA controller at each setpoint, 0.052%, is smaller than that of the fuzzy controller.
4. Yaw Setpoint Test with 5 Membership Functions (Fig. 17)
Figure 17 shows that fuzzy-GA with 5 membership functions for yaw has a faster rise time and settling time than fuzzy. The two controllers differ in rise time and settling time by 0.01 and 2.95 s respectively at 30°, and by 4.14 and 5.47 s at 80°.
The angles 30 and 80° are converted into voltages of 4.9151 and 4.7274 V using the East LDR. These voltages are used as setpoints in the West LDR setpoint test to generate the LDR voltage response shown in Fig. 18.
Fig. 16 Response of output voltage of LDR with 5 membership functions for, a 4.8729 V, b 4.9071 V, c 4.92 V

Fig. 17 Response of yaw angle with 5 membership functions for, a 30°, b 80°

Tracking Testing
The tracking test is divided into two parts, an altitude angle tracking test and an azimuth angle tracking test. The tracking test is carried out for 11 h (660 min), from 06.00 to 17.00 WIB, using the fuzzy-GA controller. The results of the altitude and azimuth tracking tests can be seen in Figs. 19 and 20.
After analyzing the tracking responses, the output voltage of the South LDR is analyzed for the altitude angle tracking and that of the West LDR for the azimuth angle tracking. The South and West LDR output voltage responses during the tracking test can be seen in Figs. 21 and 22.
From the results of the altitude and azimuth tracking tests we can infer that the designed fuzzy-GA controller can be applied to the active two-axis solar tracker system.
Fig. 18 Response of output voltage of LDR with 5 membership functions for, a 4.9151 V, b 4.7274 V

Fig. 19 The result of altitude tracking

Fig. 20 The result of azimuth tracking

Fig. 21 Voltage of LDR response for altitude tracking

Fig. 22 Voltage of LDR response for azimuth tracking

Performance Testing
Performance tests were carried out to determine the improvement in PV performance obtained with the active two-axis solar tracker, by measuring the voltage, current and output power of the PV. The increase in energy efficiency produced by the active two-axis solar tracker system was also calculated. The results of the performance tests are shown in Figs. 23, 24 and 25.

Fig. 23 Output voltage of mobile solar tracker and fixed PV

Fig. 24 Output current of mobile solar tracker and fixed PV

Fig. 25 Output power of mobile solar tracker and fixed PV
The mobile solar tracker has a maximum voltage of 29.88 V, maximum current of 8.255 A, and maximum power of 246.66 W, whereas for the fixed PV the maximum voltage, current, and power are 29.78 V, 7.61 A, and 226.61 W respectively, so it can be said that the mobile solar tracker system using the hybrid fuzzy-genetic algorithm controller increases the output of the PV. From the power produced over the test period, the increase in energy efficiency achieved by the mobile solar tracker is calculated to be 62.15%.

4 Conclusion

Based on the results, the following conclusions can be drawn. The parameters used in the active two-axis solar tracker system were obtained from experiment and datasheets: the sunlight intensity (lux) for the LDR sensors; the voltage constant (Ke) 27.17 Vs/rad, torque constant (Kt) 27.17 Nm/A, viscous friction coefficient (Bm) 0.615 Nms/rad, motor inertia (Jm) 242.697 Nms2, motor resistance (R) 9.33 Ω, and motor inductance (L) 0.009 H for the DC motors; and the solar radiation, temperature and PV specifications for the solar panel. The fuzzy-GA controller can be implemented on the active two-axis solar tracker system, with the best performance index obtained using 3 membership functions. The pitch angle at 70° has a rise time (tr) of 0.23 s, settling time (ts) of 1.05 s, maximum overshoot (Mov) of 0%, and steady-state error (ess) of 0.02%, while the yaw angle at 80° has tr = 5.37 s, ts = 22.16 s, Mov = 0%, and ess = 0.1%. The designed fuzzy-GA controller is able to increase the energy efficiency of the PV by 62.15% over 11 h.

Acknowledgements The authors wish to thank and acknowledge the support for this work from
the Indonesia Endowment Fund for Education (LPDP).

References

1. Tharakan P (2015) Summary of Indonesia's energy sector assessment. ADB papers on Indonesia
2. Tudorache T, Kreindler L (2010) Design of a solar tracker system for PV power plants, vol 7
3. Dhanabal R, Bharathi V, Ranjitha R, Ponni A, Deepthi S, Mageshkannan P (2013) Comparison of efficiencies of solar tracker systems with static panel single-axis tracking system and dual-axis tracking system with fixed mount. Int J Eng Technol 5:1925–1933
4. Huang CH, Pan HY, Lin KC (2015) Development of intelligent fuzzy controller for a two-axis solar tracking system
5. Sendoya-Losada DF, Murcia DF, Garcia SO (2017) Design and implementation of a photovoltaic solar tracker using fuzzy control for Supercolombiana University. ARPN J Eng Appl Sci 12

6. Abadi I, Musyafa A, Soeprijanto A (2015) Design and implementation of active two axes solar tracking system using particle swarm optimization based fuzzy logic controller. Int Rev Model Simul (I.RE.MO.S.) 8:640–652
7. Setyawan EH, Abadi I, Kusumawarni SA (2019) Estimation of hourly solar radiation on horizontal surface using GAMF (genetic algorithm modified fuzzy) (case study in Surabaya). IOP Conf Ser Mater Sci Eng 588:012024
8. Abadi I, Musyafa A, Baskoro KD, Fitriyanah DW (2019) Design and implementation of mobile active two axes solar tracking with reflector based on particle swarm fuzzy controller. Int Rev Model Simul (I.RE.MO.S.) 12:113
9. Fitriyanah DW, Abadi I (2019) Fuzzy logic control design of mobile PV using bacterial foraging optimization. In: Proceedings of the 2018 international seminar on intelligent technology and its applications (ISITIA)
10. Abadi I, Imran C, Faza N (2019) Smart solar tracking system based on fuzzy PI controller for maximizing the power of PV. In: AIP conference proceedings
