
Studies in Computational Intelligence 784

J. K. Mandal
Devadutta Sinha Editors

Intelligent
Computing
Paradigm:
Recent Trends
Studies in Computational Intelligence

Volume 784

Series Editor
Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
The series “Studies in Computational Intelligence” (SCI) publishes new develop-
ments and advances in the various areas of computational intelligence—quickly and
with a high quality. The intent is to cover the theory, applications, and design
methods of computational intelligence, as embedded in the fields of engineering,
computer science, physics and life sciences, as well as the methodologies behind
them. The series contains monographs, lecture notes and edited volumes in
computational intelligence spanning the areas of neural networks, connectionist
systems, genetic algorithms, evolutionary computation, artificial intelligence,
cellular automata, self-organizing systems, soft computing, fuzzy systems, and
hybrid intelligent systems. Of particular value to both the contributors and the
readership are the short publication timeframe and the world-wide distribution,
which enable both wide and rapid dissemination of research output.
The books of this series are submitted for indexing to Web of Science,
EI-Compendex, DBLP, SCOPUS, Google Scholar and SpringerLink.

More information about this series at http://www.springer.com/series/7092


J. K. Mandal · Devadutta Sinha

Editors

Intelligent Computing
Paradigm: Recent Trends

Editors

J. K. Mandal
Department of Computer Science and Engineering
University of Kalyani
Kalyani, West Bengal, India

Devadutta Sinha
Department of Computer Science and Engineering
University of Calcutta
Kolkata, India

ISSN 1860-949X ISSN 1860-9503 (electronic)


Studies in Computational Intelligence
ISBN 978-981-13-7333-6 ISBN 978-981-13-7334-3 (eBook)
https://doi.org/10.1007/978-981-13-7334-3

Library of Congress Control Number: 2019934794

© Springer Nature Singapore Pte Ltd. 2020


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Preface

This volume entitled “Intelligent Computing Paradigm: Recent Trends” is an


extended version of some selected papers from CSI-2017 along with papers submitted
independently by the authors against the call for papers. There are eight
chapters in this volume: The first chapter deals with the classification of library
resources in a recommender system. The second and fourth chapters deal
with wireless sensor networks, studying the behavioral change and the clustering of
nodes. The third chapter deals with AI-based disease prediction using reliability, whereas the fifth
chapter deals with product prediction and recommendation. The sixth chapter deals
with reliability and area minimization of VLSI power grid networks. The seventh
chapter deals with the detection of forest cover changes from remote sensing data.
The last chapter deals with the identification of malignancy from cytological
images.
The chapters were reviewed as per the norms of the Studies in Computational
Intelligence book series. Based on the reviewers' comments, the chapters were
modified by the authors, and the modified chapters were examined to confirm that
the comments had been incorporated. Finally, eight chapters were selected for publication in this special issue,
Intelligent Computing Paradigm: Recent Trends, under the book series Studies in
Computational Intelligence, Springer.
On behalf of the editors, we would like to thank the authors for
contributing chapters to this special issue. We would also like to express our sincere
gratitude to the reviewers for reviewing the chapters.
We hope this special issue will serve as good material on state-of-the-art research.

Jyotsna Kumar Mandal
University of Kalyani
Kalyani, West Bengal, India

Devadutta Sinha
Calcutta University
Kolkata, West Bengal, India

April 2019

Contents

Improved Hybrid Approach of Filtering Using Classified Library


Resources in Recommender System . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Snehalata B. Shirude and Satish R. Kolhe
A Study on Collapse Time Analysis of Behaviorally Changing
Nodes in Static Wireless Sensor Network . . . . . . . . . . . . . . . . . . . . . . . . 11
Sudakshina Dasgupta and Paramartha Dutta
Artificial Intelligent Reliable Doctor (AIRDr.): Prospect of Disease
Prediction Using Reliability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Sumit Das, Manas Kumar Sanyal and Debamoy Datta
Bacterial Foraging Optimization-Based Clustering in Wireless
Sensor Network by Preventing Left-Out Nodes . . . . . . . . . . . . . . . . . . . 43
S. R. Deepa and D. Rekha
Product Prediction and Recommendation in E-Commerce
Using Collaborative Filtering and Artificial Neural Networks:
A Hybrid Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Soma Bandyopadhyay and S. S. Thakur
PGRDP: Reliability, Delay, and Power-Aware Area Minimization
of Large-Scale VLSI Power Grid Network Using Cooperative
Coevolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Sukanta Dey, Sukumar Nandi and Gaurav Trivedi
Forest Cover Change Analysis in Sundarban Delta Using Remote
Sensing Data and GIS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
K. Kundu, P. Halder and J. K. Mandal
Identification of Malignancy from Cytological Images
Based on Superpixel and Convolutional Neural Networks . . . . . . . . . . . 103
Shyamali Mitra, Soumyajyoti Dey, Nibaran Das, Sukanta Chakrabarty,
Mita Nasipuri and Mrinal Kanti Naskar

About the Editors

J. K. Mandal is former Dean of the Faculty of Engineering, Technology and


Management, and Senior Professor at the Department of Computer Science &
Engineering, University of Kalyani, India. He has obtained his Ph.D. (Eng.) from
Jadavpur University. Professor Mandal has co-authored six books: Algorithmic
Design of Compression Schemes and Correction Techniques—A Practical
Approach; Symmetric Encryption—Algorithm, Analysis and Applications: Low
Cost-based Security; Steganographic Techniques and Application in Document
Authentication—An Algorithmic Approach; Optimization-based Filtering of
Random Valued Impulses—An Algorithmic Approach; and Artificial Neural
Network Guided Secured Communication Techniques: A Practical Approach; all
published by Lambert Academic Publishing, Germany. He has also authored more
than 350 papers on a wide range of topics in international journals and proceedings.
Twenty-three scholars have been awarded the Ph.D. degree under his supervision. His profile is
included in the 31st edition of Marquis Who's Who in the World published in 2013.
The Government of West Bengal, India, conferred on him the 'Siksha Ratna' award as an
outstanding teacher in 2018. His areas of research include coding theory, data and
network security, remote sensing and GIS-based applications, data compression,
error correction, visual cryptography and steganography, and distributed and shared
memory parallel programming. He is a Fellow of the Institution of Electronics and
Telecommunication Engineers, and a member of IEEE, ACM, and the Computer Society
of India.

Prof. Dr. Devadutta Sinha graduated with honors in Mathematics from


Presidency College and completed his postgraduation in Applied Mathematics and
then in Computer Science. He completed his Ph.D. in the field of Computer Science
at Jadavpur University in 1985. He started his teaching career at the Department of
Computer Engineering at BIT Mesra Ranchi, then at Jadavpur University and
Calcutta University, where he was a Professor at the Department of Computer
Science and Engineering. He also served as Head of the Department of Computer
Science and Engineering, and Convener of the Ph.D. Committee in Computer
Science and Engineering and in Information Technology at the University of


Calcutta. He also served as Vice-Chairman of the Research Committee in Computer


Science and Engineering, West Bengal University of Technology. During his
career, he has written a number of research papers in national and international
journals and conference proceedings. He has also written a number of expository
articles in periodicals, books and monographs. His research interests include software
engineering, parallel and distributed algorithms, bioinformatics, computational
intelligence, computer education, mathematical ecology and networking. He
has total teaching/research experience of more than 38 years. He was also on the
editorial boards of various journals and conference proceedings and served in
different capacities in the program and organizing committees of several national
and international conferences. He was Sectional President, Section of Computer
Science, Indian Science Congress Association for the year 1993–94. He is an active
member of a number of academic bodies in various institutions. He is a fellow and
senior life member of CSI and has been involved in different activities including
organization of different computer/IT courses. He is also a Computer Society of
India Distinguished Speaker.
Improved Hybrid Approach of Filtering
Using Classified Library Resources
in Recommender System

Snehalata B. Shirude and Satish R. Kolhe

Abstract The goal of the planned library recommender system is to provide the needed
library resources quickly. The important phases to perform are building and
updating user profiles and searching for the proper library resources. The proposed system
uses a hybrid approach for filtering available books on different subjects, research
journal articles, and other resources. Content-based filtering evaluates the user profile
against the available library resources. The generated results satisfy the users' needs.
The system can generate satisfactory recommendations, since in the dataset most of the
entries for books and research journal articles are rich with keywords. This richness is
achieved by referring to the abstract and the TOC (table of contents) while adding records of
research journal articles and books, respectively. Collaborative filtering computes
recommendations by searching for users with similar interests. Finally, recommendations
generated with the hybrid approach are provided to the active user.
To simplify and improve the outcome of the recommendation process, the available
records are categorized into distinct classes. The distinct classes are
defined in ACM CCS 2012. The classifier is the output of relevant machine learning
methods. The paper discusses the improvement in results achieved by the hybrid approach due to
the use of classified library resources.

Keywords Improvement in library recommender system · Filtering · Hybrid approach · Classification · ACM CCS 2012 · Machine learning

1 Introduction

Handling a large amount of information and finding the user's preferences without asking the
user directly are the challenges driving the improvement of the digital library recommender

S. B. Shirude (B) · S. R. Kolhe


School of Computer Sciences, North Maharashtra University, Jalgaon, India
e-mail: [email protected]
S. R. Kolhe
e-mail: [email protected]


system. Searching for proper library resources that satisfy the user's need is a major step
in the implementation of such a system. Providing quick and relevant recommendations
to the user of the system is the basic objective [1]. Since this process requires making
decisions and taking actions according to perception, an agent-based framework
is fitting for the implementation of the system [2]. The agent's filtering task is much like
the task of taking out a specific book/journal of interest from a
bookshelf arranged in a physical library. Therefore, if the library records in the dataset
of the digital library are classified into the right categories, it will be easy for an agent to find the
right resource. The implementation of classification into the 14 classes defined
in ACM CCS 2012 is explained in paper [3]. Due to the use of classified
library resources while implementing the filtering task, the system provides
relevant recommendations. The process combines the results from both content-based
and collaborative techniques, which is the hybrid approach. This paper is
divided into six sections. The detailed literature study related to the classification
can be obtained from paper [3], as the present work is an extension of it. Library
resources classification is given in Sect. 2. This explains the architecture of the pro-
posed system, dataset prepared, and novel classifier built using different machine
learning techniques. Section 3 describes the way recommendations generated are
improved by the application of a hybrid approach, which makes use of classified
library resources for the filtering step. Section 4 discusses the results of classification
and the use of the hybrid approach. Conclusions about the experiment and possible future
scope are provided in Sect. 5.

2 The Framework of the System

The proposed system, along with the implemented library resources classifier, is
described below [3].

2.1 Proposed System

Figure 1 explains the processes to be performed in the proposed library recommender
system [1].
The taxonomy of such systems includes two phases, namely, profile generation
and maintenance, and profile exploitation. Representing, generating, and learning the user
profile, and relevance feedback are tasks included in profile generation and maintenance.
Information filtering, matching the user profile with the item and with other
profiles, and adapting the user profile are the tasks included in the profile exploitation phase
[1]. The flow starts with registering a user, which creates a basic user
profile. The primary responsibility of the profile agent is keeping the user profile implicitly
updated. The dataset includes important entities such as all library records and
user profiles in XML form. To satisfy the main purpose, which is to recommend

Fig. 1 Proposed framework of the recommender system

books/research papers to the logged-in user according to his or her choice, the system
needs to exploit the user profile of the logged-in user; then searching is required to generate
the results/recommendations. The hybrid approach has been implemented by
the proposed system. Similarity measures are essential to calculate the resemblance
between the user profile and the library resources while implementing filtering [4].
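As an illustration of such a similarity measure, the sketch below computes the cosine similarity between a user profile and a library record, each represented as a weighted keyword vector. It is only a minimal, hypothetical example in Python; the keyword weights shown are made up, and the system's actual profile representation and similarity measure are those described in [1, 4].

```python
import math

def cosine_similarity(profile, resource):
    """Cosine similarity between two keyword -> weight dictionaries."""
    common = set(profile) & set(resource)
    dot = sum(profile[k] * resource[k] for k in common)
    norm_p = math.sqrt(sum(w * w for w in profile.values()))
    norm_r = math.sqrt(sum(w * w for w in resource.values()))
    if norm_p == 0 or norm_r == 0:
        return 0.0
    return dot / (norm_p * norm_r)

# hypothetical keyword weights extracted from a user profile and a book record
user_profile = {"machine": 0.9, "learning": 0.8, "classification": 0.6}
book_record = {"machine": 0.7, "learning": 0.7, "xml": 0.2}
print(round(cosine_similarity(user_profile, book_record), 3))
```

A resource whose keyword vector points in nearly the same direction as the user's interest vector receives a similarity close to 1 and is therefore ranked higher in the recommendations.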

2.2 Dataset Preparation

The prepared dataset includes profiles of registered users (XML form),
books/research journal records (XML form), and ACM CCS 2012 (SKOS form).
Figures 2, 3 and 4 show samples of them, respectively [1].

2.2.1 User Profiles

See Fig. 2.

2.2.2 Library Resources

See Fig. 3.

Fig. 2 Sample user profile generated by the system

Fig. 3 Sample library resources added in the system in XML form



Fig. 4 Ontology used

2.2.3 Ontology

See Fig. 4.

2.3 Library Resources Classifier

A novel classifier is built using various machine learning techniques. The results
obtained are analyzed and compared. It is found that the PU learning approach, which
uses the Naïve Bayes algorithm, outperformed the others. The details are given in papers [1, 3].
The extension to this work, which improves the recommendations using the hybrid
approach, is explained in the next section.
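For readers who want a concrete picture of the classification step, the following is a minimal sketch of training a Naïve Bayes classifier on keyword-rich library records labeled with ACM CCS 2012 classes, using scikit-learn. The records, class names, and pipeline here are made up for illustration; the authors' actual classifier and its PU-learning variant are detailed in [1, 3].

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# hypothetical keyword-rich records (built from TOC/abstract text)
# labeled with an ACM CCS 2012 top-level class
records = [
    "neural networks supervised learning classification",
    "database query optimization indexing transactions",
    "routing protocols wireless sensor networks energy",
]
labels = ["Computing methodologies", "Information systems", "Networks"]

classifier = make_pipeline(TfidfVectorizer(), MultinomialNB())
classifier.fit(records, labels)

print(classifier.predict(["energy efficient clustering in sensor networks"]))
```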

3 Improved Hybrid Approach to Generate


Recommendations

This section describes the application of the hybrid approach to the classified library
resources to generate recommendations. This approach merges results from content
as well as collaborative filters. The algorithm given in Sect. 3.1 explains the use of
the hybrid approach.

3.1 Algorithm

HybridFilter (UserProfile, ClassifiedLibResources)
// UserProfile is the active user's profile in XML form; ClassifiedLibResources are the
// results produced by the Naïve Bayes classifier
Begin
    conRecom ← ContentFilter (UserProfile, ClassifiedLibResources)
        // ContentFilter is the content-based filter generating recommendations conRecom
    colRecom ← CollabFilter (UserProfile, ClassifiedLibResources)
        // CollabFilter is the collaborative filter generating recommendations colRecom
    hybridRecom ← CombineResults (conRecom, colRecom)
        // CombineResults merges the results from both filters
End of HybridFilter

ContentFilter (UserProfile, ClassifiedLibResources)
// ContentFilter is the content-based filter generating recommendations conRecom
Begin
    intUser ← ProfileAgent (SI, UserProfile)
        // SI is the information searched for the user in text format and UserProfile is
        // the active user profile. ProfileAgent returns the interests of the user by
        // reading the user profile
    intKeywordU ← RepresentVect (intUser)
        // intKeywordU is a vector of keywords representing the interests of the user
    intKeywordL ← RepresentVect (ClassifiedLibResources)
        // intKeywordL is a vector of keywords representing the classified library resources
    conRecom ← MeasureSim (intKeywordU, intKeywordL)
        // Similarity is measured between the classified library resources and the user profile
    return conRecom
End of ContentFilter

CollabFilter (UserProfile, ClassifiedLibResources)
// CollabFilter is the collaborative filter generating recommendations colRecom
Begin
    distanceU ← ProfileAgent (Concepts, Users)
        // ProfileAgent returns a matrix giving the distance of users with the concepts
        // in ACM CCS 2012
    SimUsers ← RetrieveSimUsers (UserProfile, distanceU)
        // From the distances returned by ProfileAgent, retrieve similar users having
        // interests like the active user
    ratedLibResources ← IdentifyRatings (SimUsers, ClassifiedLibResources)
        // Rated library resources used by the similar users are generated
    colRecom ← MeasureSim (UserProfile, ratedLibResources)
        // Similarity is measured between the rated library resources and the user profile
    return colRecom
End of CollabFilter


HybridFilter is mainly performing the task of combining recommendations generated
by content and collaborative filters. ContentFilter is basically an agent generating
recommendations by matching all available library resources with the interests of the
active user. This agent refers to the profile agent, whose job is to identify
the interests of the user. The profile agent performs subtasks such as collecting information
about the interests of the user, removing noise, assigning weights, and updating the profile.
CollabFilter is basically an agent generating recommendations by identifying similar
users and referring to the library resources they rated.
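A minimal sketch of the HybridFilter combination step is shown below. Here the content-based and collaborative scores are assumed to be per-resource similarity values on a common scale, and they are merged by a simple weighted average; the actual CombineResults strategy used by the authors may differ, and the resource identifiers and scores are illustrative only.

```python
def combine_results(con_recom, col_recom, alpha=0.5, top_n=10):
    """Merge content-based and collaborative scores (resource_id -> score)
    with a weighted average and return the top_n recommendations."""
    resources = set(con_recom) | set(col_recom)
    hybrid = {
        r: alpha * con_recom.get(r, 0.0) + (1 - alpha) * col_recom.get(r, 0.0)
        for r in resources
    }
    return sorted(hybrid.items(), key=lambda item: item[1], reverse=True)[:top_n]

# hypothetical similarity scores produced by ContentFilter and CollabFilter
con_recom = {"book:123": 0.82, "journal:77": 0.40}
col_recom = {"book:123": 0.65, "book:980": 0.71}
print(combine_results(con_recom, col_recom, alpha=0.6, top_n=3))
```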

4 Results and Discussion

For evaluating the improved results, the dataset contains 25 different users and approximately
500 different book records plus 205 research journal articles.

4.1 Results of Improved Hybrid Approach

Results of the improved hybrid approach described in Sect. 3 are given below:
Table 1 gives a comparison among the content-based, collaborative, and
improved hybrid approaches using classified library resources.
Precision, recall, and F1 are calculated using the numbers of relevant recommended,
relevant not recommended, irrelevant recommended, and irrelevant not recommended
items from the dataset of classified library records of books and journals. The values
show that the improved hybrid approach provides more relevant recommendations

Table 1 Evaluation for 25 different users. For each user, the three values under each
approach are Precision, Recall, and F1 (all in %)

Users  Content-based filter     Collaborative filter     Improved hybrid approach
       Precision Recall F1      Precision Recall F1      Precision Recall F1
User1 87.50 77.78 82.35 85.71 69.23 76.60 93.10 87.10 90.00
User2 75.00 37.50 50.00 73.33 61.11 66.67 90.00 90.00 90.00
User3 53.57 83.33 65.22 57.69 71.43 63.83 86.36 90.48 88.37
User4 42.42 77.78 54.90 38.71 75.00 51.06 80.77 91.30 85.71
User5 90.00 90.00 90.00 81.82 81.82 81.82 81.82 81.82 81.82
User6 33.33 75.00 46.15 45.45 71.43 55.56 76.92 66.67 71.43
User7 44.00 84.62 57.89 50.00 84.62 62.86 75.00 80.00 77.42
User8 40.00 57.14 47.06 40.00 57.14 47.06 81.25 81.25 81.25
User9 60.87 70.00 65.12 50.00 60.00 54.55 81.58 81.58 81.58
User10 60.00 47.37 52.94 64.71 57.89 61.11 78.57 84.62 81.48
User11 79.31 79.31 79.31 75.00 75.00 75.00 84.62 84.62 84.62
User12 58.33 75.00 65.63 58.82 71.43 64.52 78.38 78.38 78.38
User13 58.62 65.38 61.82 60.00 66.67 63.16 77.27 85.00 80.95
User14 61.54 76.19 68.09 64.29 75.00 69.23 84.62 75.86 80.00
User15 72.73 72.73 72.73 73.53 73.53 73.53 92.31 85.71 88.89
User16 63.64 77.78 70.00 64.71 78.57 70.97 80.77 91.30 85.71
User17 68.00 85.00 75.56 62.07 72.00 66.67 89.29 86.21 87.72
User18 60.00 64.29 62.07 55.56 62.50 58.82 81.82 75.00 78.26
User19 63.64 63.64 63.64 55.56 55.56 55.56 87.10 77.14 81.82
User20 69.57 72.73 71.11 66.67 70.00 68.29 72.73 85.71 78.69
User21 66.67 60.00 63.16 62.50 55.56 58.82 70.00 75.00 72.41
User22 64.71 78.57 70.97 60.00 75.00 66.67 76.19 84.21 80.00
User23 68.97 86.96 76.92 66.67 85.71 75.00 83.33 86.96 85.11
User24 63.16 80.00 70.59 58.82 76.92 66.67 71.43 83.33 76.92
User25 72.97 75.00 73.97 72.97 75.00 73.97 86.36 90.48 88.37
Average 63.14 72.52 66.29 61.78 70.32 65.12 81.66 83.19 82.28

satisfying the needs of users in comparison with the content-based filter alone and the
collaborative filter alone.
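The metrics in Table 1 follow the usual definitions; a small sketch of how they can be computed from the per-user counts is given below. The counts in the example are illustrative only, not taken from the dataset.

```python
def precision_recall_f1(rel_rec, rel_not_rec, irrel_rec):
    """Precision, recall and F1 from the counts of relevant recommended,
    relevant not recommended, and irrelevant recommended items."""
    precision = rel_rec / (rel_rec + irrel_rec) if rel_rec + irrel_rec else 0.0
    recall = rel_rec / (rel_rec + rel_not_rec) if rel_rec + rel_not_rec else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return precision, recall, f1

# illustrative counts for one user
p, r, f = precision_recall_f1(rel_rec=27, rel_not_rec=4, irrel_rec=2)
print(f"precision={p:.2%} recall={r:.2%} f1={f:.2%}")
```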
The innovation with respect to various aspects can be identified in comparison
with other similar works [5–7]. The aspects include using the SKOS form of ACM
CCS 2012 as the ontology, assigning weights to semantic terms and using them while
matching, and automatically updating profiles.

5 Conclusions and Future Scope

The proposed recommender system for a digital library gives recommendations to


logged-in users by merging the results given by both the content filter and the collaborative filter.
Library records are grouped into common classes so that the filters can refer to them easily at
the time of generating recommendations. This is a similar idea to putting related
library books or journals on a common shelf so that we can find similar ones
in a group. Various machine learning techniques were experimented with, and the results were
analyzed for comparison. The Naïve Bayes classifier that uses the PU learning approach
performed well because no negative documents were present in the dataset. It can
reach 90.05%. The resultant groups of the library resources are employed to further
improve the results/recommendations. These recommendations are generated
deploying content-based filtering, collaborative filtering, and the combination of both methods,
that is, the hybrid approach. It is observed that the hybrid approach applied to classified
library resources improves the recommendation results. Currently, the system
applies the collaborative filter after a number of users have registered to the system, so that
the system does not run into the cold start problem. This problem can also be solved
using other approaches. The approaches experimented with in the evolution of the
recommender system can be applied to various problems like finding experts within
a specific domain.

References

1. Shirude, S.B., Kolhe, S.R.: Agent based architecture for developing recommender system in
libraries. In: Margret Anouncia, S., Wiil, U. (eds.) Knowledge Computing and Its Applications,
pp. 157–181. Springer, Singapore (2018). https://doi.org/10.1007/978-981-10-8258-0_8.
Print ISBN: 978-981-10-8257-3, Online ISBN: 978-981-10-8258-0
2. Montaner, M., López, B., De La Rosa, J.L.: A taxonomy of recommender agents on the internet.
Artif. Intell. Rev. 19(4), 285–330 (2003)
3. Shirude, S.B., Kolhe, S.R.: Classification of Library Resources in Recommender System Using
Machine Learning Techniques. Annual Convention of the Computer Society of India, Springer
Singapore (2018)
4. Shirude, S.B., Kolhe, S.R.: Measuring similarity between user profile and library book.
In: Information Systems and Computer Networks (ISCON), pp. 50–54. IEEE (2014)
5. Morales-del-Castillo, J.M., Peis, E., Herrera-Viedma, E.: A filtering and recommender system
for e-scholars. Int. J. Technol. Enhanc. Learn. 2(3), 227–240 (2010)
6. Porcel, C., Moreno, J.M., Herrera-Viedma, E.: A multi-disciplinar recommender system
to advice research resources in university digital libraries. Expert Syst. Appl. 36(10),
12520–12528 (2009)
7. Hulseberg, A., Monson, S.: Investigating student driven taxonomy for library website design.
J. Electron. Resour. Libr. 361–378 (2012)
8. Vijayakumar, V., Vairavasundaram, S., Logesh, R., Sivapathi, A.: Effective knowledge based
recommender system for tailored multiple point of interest recommendation. Int. J. Web Portals
(IJWP) 11(1), 1–18 (2019)
9. Kaur, H., Kumar, N., Batra, S.: An efficient multi-party scheme for privacy preserving collab-
orative filtering for healthcare recommender system. Futur. Gener. Comput. Syst. (2018)

10. Gunawardana, A., Shani, G.: A survey of accuracy evaluation metrics of recommendation tasks.
J. Mach. Learn. Res. 10, 2935–2962 (2009)
11. Azizi, M., Do, H.: A Collaborative Filtering Recommender System for Test Case Prioritization
in Web Applications (2018). arXiv preprint arXiv:1801.06605
12. Kluver, D.: Improvements in Holistic Recommender System Research (2018)
A Study on Collapse Time Analysis
of Behaviorally Changing Nodes in Static
Wireless Sensor Network

Sudakshina Dasgupta and Paramartha Dutta

Abstract Active participation of clustered nodes in a static Wireless Sensor Network
offers comprehensive relief to the perennial problem arising out of the limited energy reserve. In
this paper, we propose a statistical formulation for lifetime prediction based on
the active and sleep probabilities of the participating sensor nodes in the network.
This approach is able to estimate the collapse time of the entire network. It identifies
two key attributes of the network that might affect the network lifetime. The key
attributes are the node density and the active-sleep transition characteristic of the nodes.
The simulation results further establish the relevance of the analytical study and
assert that the overall network lifetime generally increases as the node density is increased.
But, on the contrary, the comprehensive energy requirement of the network
also increases. A trade-off between these two factors is observed by changing the
active-sleep transition characteristics of the nodes in the network.

1 Introduction

Due to the constraints in operational behavior, self-deployment structure, and limited
computing and communication capabilities, managing node energy is crucial.
Contemporary progress in wireless sensor technology has enabled the evolution of multifunctional,
low-cost, micro-sized sensor nodes that are capable of communicating
over a short span. They are entrusted to sense physical environmental information
such as temperature, humidity, and light intensity, with potential applications in the military,
industrial, scientific, health care, and domestic domains [1]. This aggregated information
needs to be processed locally with limited computation and sent to one or more

S. Dasgupta (B)
Government College of Engineering and Textile Technology, Serampore, India
e-mail: [email protected]
P. Dutta
Department of Computer and System Science, Visva Bharati University,
Santiniketan, India
e-mail: [email protected]


sink nodes through wireless communications. Because of the difficulty of battery
replacement, preservation of energy is a prime challenge. The large number of self-organizing
sensor nodes and the limited operational node energy have emerged as
a major design concern in wireless sensor networks. Grouping sensor nodes into
clusters is an effective option to manage the energy consumption issues in a hierarchical
structure, in order to reduce communication distance and transmission
overhead [2]. In the case where the base station is outside the application
region, the clustered wireless sensor network requires more energy to communicate
with it, since more energy is required for establishing communication between
the cluster head and the base station as well as between the member nodes of a cluster and the
cluster head. Battery utilization of sensor nodes is associated with the lifespan of the sensors.
The simulation considers a static network architecture with thousands
of nodes deployed over the region of interest, where the base station is
outside the application region. But this position of the base station penalizes direct
communication to the base station, so local data aggregation is used instead.
Therefore, clustering is significant to facilitate the deployment and functioning of
wireless sensors. This architecture helps to reduce comprehensive power consumption
in the network. This scalable feature of sensor nodes helps to balance the network
load. The energy depletion of the cluster head is much faster than that of the other
member nodes of the cluster.
In this paper, we have addressed a probabilistic formulation of the collapse time of
sensor nodes with respect to the change of operational modes between the active and sleep
states, so that the collapse time of the entire network can be predicted. This prediction
results in a scheduling methodology among the nodes that carefully avoids an
overloaded state. As sensors can behaviorally adjust their mode of action
between the active and sleep states, a minimum number of nodes covers
the application area according to their sensing capacity, leaving the cluster alive.
As a result, the cluster head might not be overloaded with additional redundant
information from overlapping service regions of sensor nodes. Numerical results show
the operational behavior of the sensor nodes with different probabilities of transition from active
to sleep, justifying the stability of the network. Mathematical analysis provides the
node dependence of the entire network with respect to the collapse time of sensor nodes,
which determines the requirement of clustering of the concerned region space.
The rest of the paper is structured as follows. Section 2 introduces a summarized
brief of some existing works of relevance. Section 3 describes the proposed
method and mathematical analysis. Sections 4 and 5 contribute theoretical analysis
and simulation results, respectively. Section 6 draws the conclusion of the work.

2 Literature Survey

Of late, Wireless Sensor Networks have become popular due to their exceptional capabilities
across a substantial range of domains. WSNs are used to address innumerable challenges in
the field of health and medicine by monitoring and transmitting data to a base station.

Therefore, these low-cost and lightweight sensor nodes are key devices for a monitoring
system. But the short lifespan of these battery-driven devices happens to be the
bottleneck in ensuring a long-lasting monitoring service. Thus, energy is a critical
concern for sensor lifetime. The hierarchical structure of the clustering mechanism has
been applied to sensor networks to enhance the sustainability of network performance
with reduced energy consumption. This approach results in considerable preservation
of network liveliness and also decreases interference among the participating sensor
nodes for channel access in a dense network. To overcome the organizational design
limitations, a number of routing algorithms have been reported in the literature [3].
Replacement of batteries and recharging is not possible in an adverse environment.
To maintain a long network lifetime, the nodes can be scheduled according to
the changes of operational modes between the sleep state and the active state and vice versa.
Therefore, prediction of the collapse time of sensor nodes is one of the major concerns
in WSNs. Foudil Mir, Ahcène Bounceur, and Farid Meziane have outlined a novel
approach for lifetime forecasting in WSNs using statistical regression analysis, but the
simulation was done only on smaller networks [4]. Kewei Sha and Weisong Shi [1]
have proposed an IQ model and a traditional model with respect to the remaining energy of each
sensor node as well as of the entire network, but the authors do not provide any
consideration of sleep/active dynamics in the mathematical analysis. In [3], Rukpakavong
proposed a technique conducted with different battery models on real hardware and
software platforms in WSNs. The author explored several factors of sensor nodes
such as battery type, discharge rate, etc., but there was no consideration of duty
scheduling of sensor nodes on a real testbed. Rodrigues in [2] highlighted a technique,
demonstrated by their developed software, to evaluate the state of charge of
an analytical battery model with respect to time at different temperatures. Here, the
authors face some problems in implementation with complex analytical models on
low-capacity hardware platforms. Stefano Abbate proposed a software-based framework
for run-time monitoring of energy consumption by an energy tracker module in a
WSN application [5]. This technique does not require any additional hardware cost
but adds code size and computational cost. Yunxia Chen and Qing Zhao explored
the functional behavior of each node, capitalizing on its remaining energy information
and the channel state information, in a Medium Access Control protocol [6]. The proposed
max–min approach escalates the minimum residual energy across the network
in each transmission. Here, the achieved results demonstrated improvements in
the network lifetime performance with the size of the network.
Our contribution in this paper is to bridge the gap between estimating the
sensor network lifetime and maximizing it.
Here, we use a probabilistic approach that establishes the node dependence of the
entire network with respect to the collapse time of sensor nodes. These calculations help
predict when the network will collapse as well as when it needs to be reorganized
into clusters in order to survive. The proposed work offers a direction for the
prediction of the lifetime of the network.

3 Proposed Work

Consider a scenario having N nodes with active behaviors P1, P2, P3, . . . , PN in
sequence. It is quite understandable that the network will crumble once all
N nodes have transferred to sleep mode. Therefore, network failure is subject to the number
of active nodes among all the nodes at a given point of time. Nodes have been
initiated with the same initial energy after deployment. Nodes
are permitted to communicate in a single-hop manner. The motivation of the paper
is to analyze the transition characteristics of sensor nodes with respect to the expected
collapse or breakdown time. This changing behavior of sensor nodes leads to an estimate of
the expected lifetime of the sensor network. In this context, our proposed approach
probabilistically tries to determine when the next turn of clustering is required.
This might, in turn, avoid unwanted clustering in each round of the algorithm and
hence is capable of preserving network energy.
Let us consider

B = Breakdown time of the System

Breakdown time or collapse time might be assigned a positive integer value
randomly. The transition characteristic of the ith node is given by the transition
matrix (rows and columns ordered Active, Sleep)

    Pi = ( pi   1 − pi )
         ( 0    1      )

Consider Bi = shortest time duration in which the ith node falls asleep, 1 ≤ i ≤ N.
Since the breakdown or collapse takes place only when all the N nodes in
the system attain the sleep state, the collapse time B is the maximum of all the
individual Bi, where 1 ≤ i ≤ N:

B = Max_{1 ≤ i ≤ N} {Bi}     (1)

Bi follows a geometric distribution with parameter 1 − pi, where 1 ≤ i ≤ N:

Bi ~ Geometric(1 − pi), pi ∈ (0, 1), 1 ≤ i ≤ N

Accordingly, the probability mass function of Bi will be

P(Bi = li) = pi^(li − 1) (1 − pi), li ≥ 1, 1 ≤ i ≤ N

A small effort derives that

P(Bi ≤ li) = 1 − pi^li, li ≥ 1, 1 ≤ i ≤ N     (2)



which is the probability distribution function of Bi. Because of independence,

P(B ≤ l) = ∏_{i=1}^{N} (1 − pi^l), l ≥ 1

Now, the event {B ≤ l} is equivalent to the mathematical intersection of the events
{Bi ≤ l} for all i, 1 ≤ i ≤ N. This is valid for all l ≥ 1:

{B ≤ l} = ⋂_{i=1}^{N} {Bi ≤ l}, l ≥ 1

Therefore,

P(B ≤ l) = ∏_{i=1}^{N} (1 − pi^l), l ≥ 1     (3)

Now,

P(B = l) = P(B ≤ l) − P(B ≤ l + 1)
         = ∏_{i=1}^{N} (1 − pi^l) − ∏_{i=1}^{N} (1 − pi^(l+1)), l ≥ 1
         = ∏_{i=1}^{N} (1 − pi^l) pi, l ≥ 1     (4)

Now the expected breakdown time of the entire system is

E(B) = Σ_{l≥1} l P(B = l) = Σ_{l≥1} (∏_{i=1}^{N} pi) l [∏_{i=1}^{N} (1 − pi^l)]     (5)

If identical node behavior is assumed, i.e., pi = p for all i, 1 ≤ i ≤ N, then it can be
shown by means of the requisite mathematical derivation that

E(B) = p^N Σ_{m≥1} m (1 − p^m)^N
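The model above can also be checked numerically. The sketch below is a minimal illustration in Python, not the authors' MATLAB implementation: it simulates the absorbing active-to-sleep behavior of N identical nodes and compares the Monte Carlo estimate of the collapse time with the expectation obtained from the distribution P(B ≤ l) = (1 − p^l)^N of Eq. (3); the function names and the (N, p) pairs are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_collapse_time(N, p, trials=20000):
    """Monte Carlo estimate of E(B). Each node stays active with probability p
    in every slot and, once asleep, never wakes up (the transition matrix Pi),
    so its sleep time is Geometric(1 - p); B is the maximum over the N nodes."""
    sleep_times = rng.geometric(1.0 - p, size=(trials, N))
    return sleep_times.max(axis=1).mean()

def expected_collapse_time(N, p, l_max=200000):
    """E(B) from P(B <= l) = (1 - p**l)**N (Eq. (3) with identical nodes),
    using the tail-sum identity E(B) = sum_{l>=0} P(B > l)."""
    l = np.arange(l_max)
    return float(np.sum(1.0 - (1.0 - p ** l) ** N))

if __name__ == "__main__":
    for N, p in [(50, 0.80), (100, 0.90), (200, 0.95)]:
        print(N, p,
              round(simulate_collapse_time(N, p), 2),
              round(expected_collapse_time(N, p), 2))
```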

4 Theoretical Analysis

The mathematical analysis demonstrates the expected breakdown time of a sensor
network consisting of N nodes with identical transition probability pi. It establishes
the fact that such a system in equilibrium is expected to break down after a stipulated
amount of time. The expected breakdown time of such a system might be predicted,
as it reaches a peak value for a certain value of pi. The transition probability pi
might be varied to delay the overall breakdown time of such a network.

5 Simulation Result

The implementation is carried out in MATLAB to evaluate the performance of our


algorithm. There are 200 sensor nodes distributed across the simulation area of 300 m
× 300 m. The base station is positioned outside the simulation area. The sensor nodes
are comparable in quality. The transmission radius of each sensor node is set to 10 m.
Packets are generated at a constant rate of 1 packet per unit time. It is also considered
that the sensor nodes are static and mobility is restrained. The first graph plots p along the
horizontal axis and E(B) along the vertical axis, for different choices of N. As N is
made larger, the curve is found to shift toward the right, i.e., the curvature becomes
steeper. The value of p at which E(B) assumes its maximum value is 0.9930. Very
surprisingly, this maximizing value of p is observed to be independent of the choice
of N. The height of the peak, however, is found to vary with different choices of N.
From Fig. 1 as p is increased, i.e., the activation probability of nodes, the expected
collapse time also increases. Again it is shown that the convergence of the optimum
value of E(B), i.e., the highest energy consumption of the network is found to
be faster. Accordingly, another experiment was conducted to explore as to what
exactly is the inherent relation being maintained between N and the height of the
corresponding peak. This is reflected in Fig. 2. Here N is taken along horizontal axis,
whereas the corresponding maximum peak value is considered along the vertical axis.

Fig. 1 Transition characteristics P versus expected collapse time E(B)

Fig. 2 Number of nodes N versus maximum energy consumption Y(max)

Fig. 3 Maximum value of E(B) versus number of nodes N with transition probability 0.80

The nature of this curve also is very interesting. From Fig. 2, it is observed
that as N, i.e., the number of nodes in the WSN, increases, the value of ymax (highest
energy consumption) decreases. This shows that in a sparse (when the number of
nodes is less) WSN environment, most of the nodes have to be kept active during
data gathering, hence would consume maximum energy. But as N increases (dense
network), due to the proper scheduling of the nodes, overall energy consumption
gets reduced.
Therefore, it is observed that with minimum node density, the overall energy
consumption of the network is maximum, which is reflected in Fig. 2. As this
minimum number of nodes takes the responsibility of covering the entire domain,
the energy depletion becomes faster for the entire region space.

Fig. 4 Maximum value of E(B) versus number of nodes N with transition probability 0.90

In Fig. 2, as N increases, E(B) decreases. The height of the peak, i.e., the optimum value of E(B),
is varied with different values of N. So the final observation is that whenever the
node density is minimum, the convergence rate toward the optimum value of E(B)
becomes smooth. Whenever the node density becomes maximum, the convergence
rate toward the optimum value of E(B) is much steeper. In a dense network, there
are high chances of redundant information, which leads to more energy consumption
(cumulative energy of all nodes).
Therefore, in this situation with high node density, the network needs to be clustered
earlier. From Figs. 3, 4 and 5, it might be seen that the expected breakdown time of
the network E(B) increases with the increased number of nodes but is independent of N
over a certain range of N values. The experiment has been performed with different
values of p, that is, the probability of a node remaining active. It may also be noted
that in Fig. 3, E(B) remains fixed for N = 45 to 100, signifying that in some cases the
energy consumption of the network remains unaffected over a specific range of N
(node density). All this together demonstrates the effect of the number of nodes in the
system on the presumed breakdown time.
It is prominent from Fig. 6 that the expected breakdown time E(B) effectively
increases with the increasing number of nodes N using the random and max–min
protocols [6]. The max–min protocol operates on two physical layer parameters,
namely, the CSI and REI of individual sensors. The random protocol, in contrast, randomly
chooses a sensor node for communication; it utilizes neither CSI nor REI. By exploiting
CSI and REI, the max–min protocol assumes an extra burden for transmission. But
with a maximized transition probability of 0.85, our proposed technique outperforms
the network lifetime performance of the existing techniques. Although the overall energy
requirement of the network increases as the network goes from sparse to dense,
the overall network lifetime benefits.
Fig. 5 Maximum value of E(B) versus number of nodes N with transition probability 0.95

Fig. 6 Comparison of the network lifetime of the proposed approach with the pure opportunistic max–min technique

6 Conclusion

In this report, a conceptual skeleton along with essential inferences has been offered.
The inference establishes the dependence of the lifespan of the WSN on node density
and transition probability. It is observed that for a certain range of node density, the
expected breakdown time of the entire network maintains uniformity. This behavior
of the sensor network defines the scalability of the system, as this balancing nature
improves the functional performance of the system. A comparative study of some
existing protocols on communication capability is provided to analyze their expected
lifetime in a real scenario.

References

1. Sha, K., Shi, W.: Modeling the lifetime of wireless sensor networks. Sens. Lett. 3, 110 (2005)
2. Rodrigues, L.M., Montez, C., Budke, G., Vasques, F., Portugal, P.: Estimating the lifetime of
wireless sensor network nodes through the use of embedded analytical battery models. J. Sens.
Actuator Netw. 6(8) (2017)
3. Rukpakavong, W., Guan, L., Phillips, L.: Dynamic node lifetime estimation for wireless sensor
networks. IEEE Sens. J. 14(5), 1370–1379 (2014)
4. Mir, F., Bounceur, A., Meziane, F.: Regression analysis for energy and lifetime prediction in
large wireless sensor networks. In: INDS'14 Proceedings of the 2014 International Conference
on Advanced Networking Distributed Systems and Applications, pp. 1–6 (2014)
5. Abbate, S., Avvenuti, M., Cesarini, D., Vecchio, A.: Estimation of energy consumption for
TinyOS 2.x-based applications. Procedia Comput. Sci. 10, 1166–1171. Elsevier (2012)
6. Chen, Y., Zhao, Q.: On the lifetime of wireless sensor networks. IEEE Commun. Lett. 9(11),
976–978 (2005)
Artificial Intelligent Reliable Doctor
(AIRDr.): Prospect of Disease Prediction
Using Reliability

Sumit Das, Manas Kumar Sanyal and Debamoy Datta

Abstract Presently, diagnosis of disease is an important issue in the field of health
care using Artificial Intelligence (AI). Doctors are not always present, and sometimes,
although doctors are available, people are not able to afford them due to financial
issues. If only basic information like blood pressure, age, etc. is known at that moment,
how can the disease be predicted without knowing any symptoms? And if people know
the symptoms, how can the disease be predicted from them? We look into both of these
aspects, propose algorithms, and implement them for the welfare of society. The
proposed algorithms are capable of classifying diseased people and healthy people
in an efficient manner. In this work, the authors also link the concepts of probability
with fuzzy logic and describe how to interpret them. Then, we can consider a human
being as a kind of machine, and we know that any machine can be described by a
parameter called reliability; but the definition of classical reliability, if used in the case of a
human being, fails miserably. The aim of this paper is to make a bridge among fuzzy
logic, probability, and reliability.

Keywords Gini coefficient · Reliability · Disease prediction algorithm · BMI · SVM · AIRDr

S. Das (B)
Information Technology, JIS College of Engineering, Kalyani 741235, India
e-mail: [email protected]
M. K. Sanyal
Department of Business Administration, University of Kalyani, Kalyani 741235, India
e-mail: [email protected]
D. Datta
Electrical Engineering, JIS College of Engineering, Kalyani 741235, India
e-mail: [email protected]


1 Introduction

This paper utilizes the concept of reliability, which is defined for artificially intelligent
machines. Here the authors consider that human beings are in fact a kind of machine
to which the concept of reliability can be applied. In a machine, some parts
are more reliable than other parts. Consequently, for
this purpose we redefine reliability in a form suitable for the analysis of disease.
Further, the work combines the concepts of fuzzy logic and reliability to arrive at better
results. Currently, it is seen that although artificial neural networks are suitable in
many applications for classification as well as regression tasks, the concept
of the support vector machine is more intuitive for this purpose, and both of these are
used for comparison on our transformed data, which is described in
the methodology section. The definition has been modified to include two critical
parameters, BMI (body mass index) and age. It uses another parameter, which
is statistical in nature, the Gini coefficient, in both algorithms, Disease Prediction
Algorithm 1 (DPA1) and Disease Prediction Algorithm 2 (DPA2); diabetes is used as the disease
for the current investigation. The Pima Indian dataset was used for
the implementation of the second algorithm. The proposed methodology would enable
efficient, reliable diagnosis of the disease, and thereby the lives of many people would
be saved. The authors also visualize the data with a bee swarm plot and try to judge
the quality of the data that would be used. This paper also points out the
various pros and cons of the proposed method. At the outset we do a background
study, and then we describe the methodology that we follow. This part is divided into
two subsections, where one section presents the description of DPA1 for classification
and how a query can be made from the user end. The next sections present the analysis and
description of DPA2 and the results.

2 Literature Survey

All the people in this world experience disease, more or less, in their daily lives.
Patients' lives are at risk when the practitioners are unable to make a proper diagnosis.
This is where Artificial Intelligence (AI) comes into play for automating the
diagnosis [1]. An expert system was designed to evaluate the range of symptoms by
the application of Artificial Neural Network (ANN) and regression techniques [2]. In
Rio de Janeiro city, a paper focuses on Pulmonary Tuberculosis (TB) by training an ANN
model for the purpose of classification. In that case, a Multilayer Perceptron (MLP)
was applied, and for group assignment self-organizing feature maps were used for
the patients with high complexity [3]; the seven variables such as gender category,
patient age, tendency of cough, loss of weight, sweating at night, and sitophobia are
very crucial in addition to the test data. In the inspiring paper, for achieving accuracy,
Centripetal-Accelerated Particle Swarm Optimization (CAPSO) was used for
advanced learning in the ANN [4]. Newton's law of motion and PSO were applied
for the enhancement of the algorithm. In these studies, the concept of data mining
was applied to enhance the diagnosis [5]. Advance diagnosis of deadly diseases is
achieved by ANN and regression techniques [6]. The literature survey provides a
thorough process of encapsulating the concepts of statistical methods, data mining, and
regression to empower the diagnosis system more accurately. Consequently, after the
literature survey, the authors draw the following objectives:
• To improve the method of classification of disease from traditional methods that
only utilizes ANN or SVM.
• To propose two algorithms for disease prediction and analyze their effectiveness.
• To propose a connection between probability and membership function [7].
• To propose a fuzzy membership function that changes with time.
The most important paper, by Shanker (1996), applied the method of feature
selection to illustrate that variables such as BMI, glucose, and age of the PIMA INDIANS
dataset determine the rate of classification [8]. Thus, this paper
takes inspiration from their results, and we define a variable R called reliability
and propose a hypothesis: “Increased reliability means the person will be healthy”.

3 Methodologies

Now, the terms vague or fuzzy are used for the assessment of disease by interpreting
the knowledge of symptoms. For example, in the statement “I am feeling EXTREMELY COLD”,
the patient expresses the linguistic term EXTREME. Accordingly, the
symptoms pair with linguistic variables such as very, extremely, etc. Thus, a fuzzy
membership function is used to describe accurately the occurrence of disease, which
is inherently associated with two probabilistic outcomes, that a disease is there or not.
Consequently, the perception is that there may be a relationship between probability and
fuzzy membership. So, based on the above arguments, P ∝ μ, i.e., P = kμ. Here, k is a
constant; if the patient has the disease then μ = 1 and P = 1, which
implies k = 1.

3.1 Gini Coefficient and Reliability in Primitive Health

The concentration measured by the Gini coefficient is obtained from the Lorenz curve [9]. The
concentration is zero only when the quantity under investigation is evenly distributed, and it is
one in the extreme case.

G = 1 − (1/n) Σ_{i=1}^{n} (vi−1 + vi)     (1)

Fig. 1 Structure of the data after adding the reliability


vi = (Σ_{j=1}^{i} xj) / (Σ_{j=1}^{n} xj),   ui = i/n,   for i = 0, ……, n     (2)

Here, the linguistic variables (observations) are arranged as 0 < x1 < x2 < ··· < xn,
and the Lorenz curve is obtained by plotting u along the x-axis and v along the y-axis.
Here, G is the degree of randomness. For a particular group of people,
the BMI is large and there may be disease in some part of the population. If G = 1,
then it means some of the patients in the population have the disease, and the sampled
population is stored in a vector termed BMI. The concept of reliability is illustrated
as follows:

R′ = (1 − G) ∗ Age ∗ BMI     (3)

where G is the average Gini coefficient over the entire range of variables, i.e., it is the
average of the Gini coefficients of the columns of the data shown in Fig. 1. We call this
variable the primed reliability; if we take its negative, we get the reliability, which we
denote by R.
The part (1 − G) in the equation measures how evenly the values are distributed
over the ages and the BMI fractions. Taking the negative of R′ gives R. As we
will see clearly in the result section, a high R in the plot shows the presence
of disease; so a high R means the person has the disease, while R′, its negative,
indicates how reliable the person is. We will stick to the two notations introduced
here.

Fig. 2 Hypothetical contour plot of MF

The authors also fuzzify the other variables so that these variables work like
symptoms, i.e., the membership function produces 1 for a particular variable when
the patient absolutely has the disease. As described in the algorithm of Sect. 3.2
from line 12, we need to query from the user but we automate the process in this
paper in such a way that we do not need to query from the user. This is equivalent
to asking how much high glucose level do you feel? In addition, answering it to get
the membership function in the algorithm. We describe this process in our result
and analysis section. Afterward all the variables are multiplied with R, as a result
in the xy plot of any variable the top right corner would always represent diseased
population. This would also be clear in the result and analysis section. Although
numerous results have been used, we describe only those that are startling and new.

3.2 Modified Fuzzy Function

The probability of disease occurrence is considered a function of the linguistic variable and is further refined here. It is assumed that the human body is an automated repair system in which disease is recovered from by continuous repair. That is why the Membership Function (MF) changes with time; the MF is a function of x and t. The patient can visit the doctor many times; t starts at zero and is updated at every subsequent visit. The interplay of the three variables x, μ, and t is depicted in Fig. 2.
According to Fig. 2, the MF is retraced with time, which can be mathematically represented as M = ∬ ∇μ dx dt. The speculation is that M is an irrecoverable function with a finite integral value. The gradient of μ needs to be very large, as can be interpreted from the contour plot, where μ1 < μ2 < μ3.
Test-Case 1:
For the black arrow in Fig. 2, the value of μ can be obtained from x and t. If μ takes fractional values such as 0.1, 0.2, and 0.3, it indicates that for a large variation of (x, t) the variation of μ is very small, which suggests that μ is approximately constant over time while the gradient is large.

Fig. 3 Interpretation of M (Case 1: the volume when the membership is nearly constant; Case 2: the volume when the membership is varied)

Test-Case 2:
Similarly, for the red (inner) arrow, the gradient is small.
Interpretation of M: The observation is that M rises finitely when μ is increased. Two constants, k and ϕ, are considered, where ϕ varies from patient to patient but is constant for a specific patient, while the value of M varies (Fig. 3).
The value depends entirely on the weighted volume, which is depicted in Fig. 3.
∬ (∇μ ∗ ds) ∝ ∭ ϕ dv    (4)

∬ (∇μ ∗ ds) = k ∗ ∭ ϕ dv    (5)

or, according to Gauss's divergence theorem,

∭ ∇·(∇μ) dv = k ∗ ∭ ϕ dv    (6)

or

∇²μ = k ∗ ϕ    (7)

With the initial conditions:

μ(5, 0) = 1
μ(0, 0) = 0
lim_{t→∞} μ(x, t) = ∞

Hence, the authors derived the equation of partial differentiation of MF.



3.3 Result of Partial Differentiation

The nonhomogeneous equation is converted into a homogeneous one by using the substitution

μ(x, t) = f(x, t) + (1/2) ∗ k ∗ ϕ ∗ x²    (8)

The resulting equation is ∇²f = 0.
The above Laplace equation can be solved by separation of variables, and the derived solution is

μ(x, t) = ((1 − (25/2) ∗ k ∗ ϕ) / sin(5p)) ∗ sin(px) ∗ e^(pt) + (1/2) ∗ k ∗ ϕ ∗ x²    (9)

Equation 9 satisfies the boundary conditions. Its physical interpretation is considered in [10]. This paper has linked probability and the MF; it also speculates that if a patient visits the doctor frequently, then the recovery from disease is only partial.
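To make the time-varying membership function concrete, the following is a minimal Python sketch (not part of the original work) that evaluates Eq. 9. The parameter values p = 0.5, k = 1, and ϕ = 1 are those assumed later in the algorithm, and the check μ(5, 0) = 1 follows directly from the boundary condition.

import math

def membership(x, t, p=0.5, k=1.0, phi=1.0):
    # Eq. (9): mu(x, t) = ((1 - 25*k*phi/2) / sin(5p)) * sin(p*x) * exp(p*t) + (1/2)*k*phi*x^2
    return ((1.0 - 12.5 * k * phi) / math.sin(5 * p)) * math.sin(p * x) * math.exp(p * t) \
           + 0.5 * k * phi * x ** 2

print(membership(5, 0))   # boundary condition mu(5, 0) = 1 (up to floating-point error)
print(membership(2, 0))   # membership value for a milder symptom score at the first visit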
The range of the radial basis function is 0–1, and the same holds for the MF; hence the product of the two also ranges from 0 to 1. The input data and training are organized in the following algorithm.
Disease Prediction Algorithm 1(DPA1):
1. FOR: i = 1 to n
(a) WRITE “The age of patient”;
(b) Age = (age);
(c) A(i,0) = age;
(d) Compute BMI
(e) BMI(i, 0) ← (21.9 + 1.63) + (0.29 + 0.06) ∗ age − (0.0028 + 0.0008) ∗ age²;
(f) END
2. Compute the Gini-coefficient
3. G = (Gini-coefficient)
4. FOR: i = 1 to n
(a) Input-value1(i,0) ← {BMI(i,0)*(1−G)*A(i,0)};
(b) Train-value1(i,0)←{1};
(c) END
5. State the Net
6. N1 = newrb(Input-value1,Train-value1,0.5)
7. //where 0.5 is spread after taking the samples.
8. FOR: i = 1 to n

(a) A1 = (age);  
(b) A2 ← (21.9 + 1.63) + (0.29 + 0.06) ∗ A1 − (0.0028 + 0.0008) ∗ A1²;
(c) A3 = G;
(d) A = (A1*A2*A3);
(e) R = sim(N1, A);
(f) END
9. Compute μ
10. Symp = {‘symp1’,’symp2’,’symp3’};
11. //3 symptoms (symp) are assumed
12. FOR: i = 1 to n
(a) “Feelings of patient, symp(i,0)”
(b) “READ 5 for extreme symp”
(c) “READ 4 for very much symp”
(d) “READ 3 for moderate symp”
(e) “READ 2 for somewhat symp”
(f) “READ 1 for a little bit symp”
(g) “READ 0 for no feelings of symp”
(h) X = (Numeric feeling of symp);
(i) P = 0.5
(j) K = 1
(k) ϕ = 1
(l) meu(i, 0) ← ((1 − (25/2) ∗ K ∗ ϕ) / sin(5 ∗ P)) ∗ sin(P ∗ X) ∗ e^(P∗t) + (1/2) ∗ K ∗ ϕ ∗ X²
(m) END
13. //Generate a perceptron
14. Input-value2 = {1,1,1,1,0.4,0.3,0.2,0.1,0,0.6,0.7,0.8,0.9};
15. Train-value2 = {1,1,1,1,0,0,0,0,0,1,1,1,1}
16. U = newp(Input-value2,Train-value2)
17. //Feed the samples
18. FOR: i = 1 to n
(a) L = meu*R;
(b) Result = sim(U,L)
(c) If(Result ==1)
i. WRITE “The likelihood of disease”
ii. WRITE “meu”
iii. END
(d) ELSE
i. WRITE“Healthy patient”
ii. END
19. //Compute α
20. FOR: i = 1 to n
(a) α(i, 0) = 1 / ((BMI / age) − 1)

Fig. 4 The generated perceptron

Fig. 5 The generated radial basis network

(b) T(i,0) = {1}


(c) END
21. // The α for healthy persons and the perceptron output is about 1
22. // The perceptron net for placebo existence
23. NPlacebo = newp(α, T)
24. For an unknown α, if the net produces 1 it implies no placebo; otherwise, a placebo effect is indicated.
The results obtained were good, but the entire process has been further improved to achieve highly accurate results, as described in the next section (Figs. 4 and 5).
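For readers who wish to experiment with DPA1 outside MATLAB, the following is a hedged Python sketch of its core pipeline. SciPy's Rbf interpolant and scikit-learn's Perceptron are used only as stand-ins for the newrb/newp networks of the original algorithm, and the ages and symptom scores are illustrative assumptions.

import numpy as np
from scipy.interpolate import Rbf
from sklearn.linear_model import Perceptron

def gini(values):
    # Eq. (1)-(2): Gini coefficient from the Lorenz curve of the sorted values
    x = np.sort(np.asarray(values, dtype=float)); n = len(x)
    v = np.concatenate(([0.0], np.cumsum(x) / x.sum()))
    return 1.0 - (v[:-1] + v[1:]).sum() / n

ages = np.array([22.0, 35.0, 47.0, 60.0])            # assumed sample ages
bmi = (21.9 + 1.63) + (0.29 + 0.06) * ages - (0.0028 + 0.0008) * ages ** 2
G = gini(bmi)

# Steps 4-8: reliability "net" -- an RBF interpolant trained to output 1 (stand-in for newrb)
x1 = bmi * (1 - G) * ages
rel_net = Rbf(x1, np.ones_like(x1))
R = rel_net(x1)

# Steps 9-12: membership from symptom scores via Eq. (9) with p = 0.5, k = 1, phi = 1, t = 0
def membership(x, p=0.5, k=1.0, phi=1.0, t=0.0):
    return ((1 - 12.5 * k * phi) / np.sin(5 * p)) * np.sin(p * x) * np.exp(p * t) + 0.5 * k * phi * x ** 2

symptom = np.array([0.0, 1.0, 3.0, 5.0])             # assumed "feeling" scores on the 0-5 scale
meu = np.clip(membership(symptom), 0.0, 1.0)         # clip to the 0-1 range expected by the perceptron

# Steps 13-18: perceptron on meu*R (stand-in for newp)
train_in = np.array([1, 1, 1, 1, 0.4, 0.3, 0.2, 0.1, 0, 0.6, 0.7, 0.8, 0.9]).reshape(-1, 1)
train_out = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1])
clf = Perceptron().fit(train_in, train_out)
print(clf.predict((meu * R).reshape(-1, 1)))         # 1 = likelihood of disease, 0 = healthy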

4 Result and Analysis

4.1 A Brief Description of Dataset Used for Analysis

In this paper, the PIMA INDIANS dataset has been used [11]. In this dataset, all the subjects are females at least 21 years old.
Description of variables:
• Pregnancies: Number of times pregnant.
• Glucose: Plasma glucose concentration over a 2-hour interval.
• Blood Pressure: Diastolic blood pressure.
• Skin Thickness: Skinfold thickness.
• Insulin: Serum insulin over a 2-hour interval.
• BMI: Weight in kg/(height in m)².
• AGE.
• Diabetes pedigree: It yields the knowledge of diabetes history and genetic influence
as well as risk factors [12].
• Outcome: Either 1 or 0, 1 indicating sick and 0 indicating healthy.

4.2 Analysis of the Techniques of Disease Prediction


Algorithm 2 (DPA2)

Steps followed for the analysis:


• Visualize data by bee swarm plot.
• Plot reliability against a variable to show a clear separation of two types of out-
comes and interpret it.
• To demonstrate the effect of traditional methods like clustering, multilayered per-
ceptron, and KNN on the data
• To propose, implement, and test the Prediction Algorithm 2 (DPA2).
• To further improve the DPA2 and devise a method called z-database method.

4.2.1 Visualization of Data by Bee Swarm Plot

Before the analysis, the data are visualized. Although the box plot is the most popular choice, this paper uses a different approach: the bee swarm plot [13]. This plot shows the number of observations falling in a particular range of a variable. For example, consider Fig. 6. Since more observations lie in the range 0–1 for the diabetes pedigree function, many points appear in that region; as these plots resemble a swarm of bees, they are called bee swarm plots. These plots reveal a very important characteristic of the data used in our

Fig. 6 Showing the bee swarm plot of variable for different outcomes

Fig. 7 Showing the bee swarm plot of different variables for the two outcomes

analysis: there are more data points corresponding to outcome 0. Hence, any model built from these data will tend to classify more sample inputs as outcome 0; this does not mean the model is inaccurate, but rather that data of sufficient quality are not available. Even so, the results below show that the model performs better than expected.
Key points obtained from Figs. 6 and 7 are as follows:
• More observations lie in category 0, which means most of the people belong to the healthy category.
• Most observations for the variable age lie in the range 20–70, and mostly in the group 20–40, as most of the points in Fig. 6 lie in this range.

Fig. 8 Showing the normalized value of R vs. diabetes pedigree function

• Most observations for the diabetes pedigree function lie in the range 0–1, and mostly in the group 0–0.6, as most of the points in Fig. 6 lie in this range.
• BMI lies between 20 and 40; most observations lie in the range 20–35 for outcome 0, while for outcome 1 they are shifted slightly into the range 25–45. This becomes clear if one locates the center of the cluster of points, which is simply the mean of the observations, as is evident from Fig. 7.
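As a quick way to reproduce this kind of visualization, the sketch below uses seaborn's swarmplot on the PIMA dataset; the CSV file name and the chosen variable are assumptions for illustration.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv("diabetes.csv")     # assumed local copy of the PIMA INDIANS dataset [11]
sns.swarmplot(data=df, x="Outcome", y="DiabetesPedigreeFunction", size=2)
plt.title("Bee swarm plot of diabetes pedigree function by outcome")
plt.show()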

4.2.2 Plots of Reliability Against a Variable

To test our hypothesis, the authors imported the PIMA INDIANS dataset [11]. This
dataset contains records of BMI, age, diabetes pedigree function, and the outcome.
If the outcome is 1, it means the person is diabetic and if the outcome is 0, it means
the person is healthy. The R as shown in our plot is actually the reliability that we
have defined.

R = −(1 − G) ∗ Age ∗ BMI

Before applying the techniques, the values of R were normalized to zero mean and unit variance, as is evident from the figure above. A higher R means the person is likely to be healthy. Thus, any machine learning algorithm can use this quantity to distinguish the patterns in the dataset more easily. Here, G is the average Gini coefficient of all the variables.
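The following is a minimal Python sketch (with assumed column names from the Kaggle copy of the dataset) showing how G and the reliability R of this section can be computed and standardized.

import numpy as np
import pandas as pd

def gini(values):
    # Eq. (1)-(2): Gini coefficient of a single column via the Lorenz curve
    x = np.sort(np.asarray(values, dtype=float))
    n = len(x)
    v = np.concatenate(([0.0], np.cumsum(x) / x.sum()))
    return 1.0 - (v[:-1] + v[1:]).sum() / n

df = pd.read_csv("diabetes.csv")                             # assumed local copy of the PIMA dataset [11]
features = df.drop(columns=["Outcome"])
G = np.mean([gini(features[c]) for c in features.columns])   # average Gini over all variables

R = -(1 - G) * df["Age"] * df["BMI"]                         # reliability, as defined above
df["R"] = (R - R.mean()) / R.std()                           # normalize to zero mean, unit variance
print(df[["Age", "BMI", "R", "Outcome"]].head())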
Key observations from plots of Figs. 8 and 9 are as follows:

Fig. 9 Plot of our data frame in pictorial format

• In Fig. 9, in the plots of the R-value against all other variables (the small boxes in the last column), it is seen that higher R-values correspond to the healthy people, represented by the points in red.
• The last row of Fig. 9 plots R on the y-axis against all other variables; for example, the first plot of the last row has glucose on the x-axis and the R-value on the y-axis, and again higher R-values correspond to the healthy population.

4.2.3 Demonstration of Traditional Methods on the Data

The authors first analyze the above problem using established techniques such as clustering. Here, the Mclust package in R is used. Two clusters are clearly visible, and this outcome was expected from the preceding arguments. The figure shows a mixture model, a probabilistic model representing subpopulations within an overall population. The mixture distribution relates the characteristics of the overall population to those of the subpopulations and is used for statistical inference. The mixture model assigns subpopulation identities to individual observations, which is the idea behind unsupervised learning or clustering [14] (Fig. 10).
The Gaussian mixture model is generated from a finite number of Gaussian distributions [15]. The iterative expectation–maximization algorithm is used to find the maximum likelihood estimates of the parameters [10].
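As a rough Python counterpart of the Mclust analysis, the following sketch fits a two-component Gaussian mixture on R and the diabetes pedigree function with scikit-learn (the library referenced in [15]); the data frame df with the added R column is assumed from the earlier sketch.

import pandas as pd
from sklearn.mixture import GaussianMixture

X = df[["DiabetesPedigreeFunction", "R"]].values   # two variables, as in Fig. 11
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gmm.predict(X)

# crude check of how well the two mixture components line up with the true outcomes
print(pd.crosstab(labels, df["Outcome"]))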
Thus, Fig. 11 clearly shows that, using the R-value and only limited information, good predictions are obtained with the developed theory. As the confusion matrix also shows, the model does a good job at classification

Fig. 10 Clustering method applied to the above plot

Fig. 11 Showing our clustering results where R and diabetes pedigree function have been used

using only the values of R and the diabetes pedigree function. Let us see how KNN behaves.
Using only two variables, the results are classified fairly accurately, as shown in Fig. 12 (see also Fig. 13).
In statistics, the Residual Sum of Squares (RSS), also known as the Sum of Squared Residuals (SSR), measures how far the predictions deviate from the empirical data [13]. Figure 14 shows that our multilayered perceptron model converges and is stable after only a few iterations.
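A short Python sketch of the KNN step is given below; the number of neighbors and the 70/30 split are illustrative assumptions, and df with the R column is assumed from the earlier sketch.

from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix

X = df[["DiabetesPedigreeFunction", "R"]]
y = df["Outcome"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
print(confusion_matrix(y_te, knn.predict(X_te)))   # rows: true outcome, columns: predicted outcome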

Fig. 12 Confusion matrix generated after classification using KNN. FALSE is 1 and TRUE
means 0

Fig. 13 Diagrammatic representation of the multilayer perceptron generated

4.2.4 Analysis of DPA2

The authors execute the entire proposed methodology, referred to as the z-database method. After adding a column of reliability values, the other variables are first fuzzified with a bell membership function. For the fuzzification, the value of each variable at which the membership function should produce a membership value of one is determined by the following procedure:

Fig. 14 Performance of the model that created

1. i = NULL
2. j = NULL
3. cal = NULL
4. # here diabetes is the data frame having 10 variables,
5. # out of which only 8 variables are used for fuzzification
6. Diabetes = diabetes   # storing it in another data frame
7. for(j in 1:8)
8. {
9.   for(i in 1:nrow(diabetes))
10.  {
11.    if(diabetes[i,9] == 1)
12.    {
13.      cal[i] = diabetes[i,j]
14.    }
15.  }
16.  c = mean(cal, na.rm = T)
17. }
18. # if a variable has the value c, it implies the membership function produces 1
Now, the ninth column is the outcome column. After using this value of c as a parameter for generating the bell membership function, the plots in Figs. 15, 16, 17, 18, 19, 20, and 21 were obtained.
As is clear from the above figures, beyond a certain x-value the membership function gives a value of 1, and when the value 1 is obtained it can be said that the outcome variable would be 1. The reliability values are multiplied with each

Fig. 15 Plot of MF age with x-axis

Fig. 16 Plot of MF blood pressure with x-axis

Fig. 17 Plot of MF diabetes pedigree with x-axis



Fig. 18 Plot of MF insulin levels with x-axis

Fig. 19 Plot of MF BMI levels with x-axis

Fig. 20 Plot of MF number of pregnancies with x-axis



Fig. 21 Plot of MF skin thickness with x-axis

Fig. 22 Plot of our final transformed data frame for analysis

column to generate the z function, i.e., z = R*c where c is the variable we are
transforming.
The reliability is multiplied with all the columns, and the results are normalized such that the maximum value is mapped to 1 and the minimum value is mapped to zero. After that, the resultant data frame is plotted.
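The sketch below illustrates, in Python, one way to carry out this fuzzification and z-transformation. The generalized bell membership function and its width/slope parameters a and b are assumptions (the paper only specifies the centre c, the mean of each variable over the diseased records), the constant 0.3 stands in for the average Gini coefficient G, and the sketch multiplies R with the fuzzified values.

import numpy as np
import pandas as pd

def bell_mf(x, a, b, c):
    # generalized bell membership function: 1 / (1 + |(x - c)/a|^(2b))
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

df = pd.read_csv("diabetes.csv")                  # assumed local copy of the PIMA dataset [11]
R = -(1 - 0.3) * df["Age"] * df["BMI"]            # reliability; 0.3 is an assumed value for G

z = pd.DataFrame({"Outcome": df["Outcome"], "R": R})
for col in df.columns.drop("Outcome"):
    c = df.loc[df["Outcome"] == 1, col].mean()    # value at which the MF should output 1
    fuzz = bell_mf(df[col], a=df[col].std(), b=2, c=c)
    zcol = R * fuzz                               # z = R * fuzzified variable
    z[col] = (zcol - zcol.min()) / (zcol.max() - zcol.min())   # map to [0, 1]
print(z.head())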
After all these transformations, the resultant plot can be seen in Fig. 22, with the startling fact that there is a clear separation of the outcome values in each panel. Given any new value, its corresponding outcome can easily be identified; that is the power of the method. Even a single plot is enough: points lying in the top right corner always belong to healthy people. Hence, a linear hyperplane is able to separate the healthy ones accurately.
Main observations from the plot of Fig. 22 are given below:
• All points lying in the top right corner of any plot from the data frame that is
generated belongs to the category of healthy people.

Fig. 23 Showing the SVM that we created using a simple linear kernel

Fig. 24 Showing the confusion matrix on training and test datasets

• A straight line can easily separate the two classes of observations.
• The plot of the diabetes pedigree function versus R is a straight line.
• The plot of pregnancies versus R is also a straight line, indicating that an increasing number of pregnancies increases the reliability of the female (Figs. 23, 24).
The authors partitioned the transformed data frame into 70% for training and 30% for testing. As can be seen from the confusion matrix, there is a tendency to classify a number of 1's as 0's. The reason, as shown by the bee swarm plot, is that the data inherently contain more observations with outcome 0. With better quality data containing a nearly even number of observations of both outcomes, this method could outperform the other methods considered so far.
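A compact Python sketch of this final step (70/30 split plus a linear-kernel SVM on the transformed data) is shown below; it assumes the transformed frame z built in the previous sketch.

from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

X = z.drop(columns=["Outcome"])
y = z["Outcome"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

svm = SVC(kernel="linear").fit(X_tr, y_tr)
print(confusion_matrix(y_tr, svm.predict(X_tr)))   # confusion matrix on the training set
print(confusion_matrix(y_te, svm.predict(X_te)))   # confusion matrix on the test set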

5 Conclusion

The main hypothesis, the concept of reliability, was successfully applied to the classification of disease. The proposed algorithms suggest that the approach could be applied in the field of medical diagnosis to some extent, and the z-database method described here can be extended to other classification tasks involving different types of disease. Any trained technician can handle the model, and it is more reliable than an ordinary practitioner's speculation. It yields confidence in diagnosis because it uses probabilistic reasoning.
Findings of the work: The proposed Disease Prediction Algorithm 1 (DPA1) predicts the disease accurately, provided the user answers the queries made by the system. The Disease Prediction Algorithm 2 (DPA2), proposed within the z-database method, classifies the presence of disease very accurately provided age, BMI, and the other parameters are known. The plots of Figs. 8 and 9 indicate that high R-values represent the absence of disease. The theoretical justification of time-varying membership functions for the prediction of disease is analyzed so that the system can serve society as AIRDr. The future scope of this paper is to use reliability to predict diseases other than diabetes.

References

1. Das, S., et al.: AI doctor: an intelligent approach for medical diagnosis. In: Industry Interactive
Innovations in Science, Engineering and Technology. Lecture Notes in Networks and Systems,
vol. 11. Springer, Singapore (2017)
2. Adebayo, A.O., Fatunke, M., Nwankwo, U., Odiete, O.G.: The design and creation of a malaria
diagnosing expert system. School of Computing and Engineering Sciences, Babcock Univer-
sity, P.M.B.21244 Ikeja, Lagos, Nigeria
3. https://www.springerprofessional.de/development-of-two-artificial-neural-network-m
4. Beheshti, Z., Shamsuddin, S.M.H., Beheshti, E., Yuhaniz, S.S.: Enhancement of artificial neural
network learning using centripetal accelerated particle swarm optimization for medical diseases
diagnosis, vol. 18(11), pp 2253–2270 (2014)
5. Yacout, S.: Logical analysis of maintenance and performance data of physical assets, ID34.
D.Sc., PE, ÉcolePolytechnique de Montréal. 978-1-4577-1851-9/12/$26.00. IEEE (2012)
6. Das S., Sanyal M.K., Datta D.: Advanced diagnosis of deadly diseases using regression and
neural network. In: Social Transformation—Digital Way. CSI 2018. Communications in Com-
puter and Information Science, vol. 836. Springer, Singapore (2018)
7. Das, S., et al.: AISLDr: artificial intelligent self-learning doctor. In: Intelligent Engineering
Informatics. Advances in Intelligent Systems and Computing, vol. 695. Springer, Singapore
(2018)
8. Hung, M.S., Hu, M.Y., Shanker, M.S., Patuwo, B.E.: Estimating posterior probabilities in
classification problems with neural networks. Int. J. Comput. Intell. Organ. 1(1), 49–60 (1996)
9. Gini coefficient. In: Wikipedia, The Free Encyclopedia. Retrieved 16:52, September 9, 2017,
from https://en.wikipedia.org/w/index.php?title=Gini_coefficient&oldid=798388811 (2017)
10. Wikipedia contributors. Expectation–maximization algorithm. In: Wikipedia, The Free Ency-
clopedia. Retrieved 07:53, June 30, 2018, from https://en.wikipedia.org/w/index.php?title=
Expectation%E2%80%93maximization_algorithm&oldid=847180107 (2018)
11. https://www.kaggle.com/uciml/pima-indians-diabetes-database

12. Smith, J.W., Everhart, J.E., Dickson, W.C., Knowler, W.C., Johannes, R.S.: Using the ADAP
algorithm to forecast the onset of diabetes mellitus. In Symposium on Computer Applications
in Medical Care, pp. 261–265 (1988)
13. www.cbs.dtu.dk/~eklund/beeswarm/
14. Wikipedia contributors. Mixture model. In Wikipedia, The Free Encyclopedia. Retrieved
08:21, June 30, 2018, from https://en.wikipedia.org/w/index.php?title=Mixture_model&
oldid=847267104 (2018)
15. Gaussian mixture models—scikit-learn 0.19.1… (n.d.). Retrieved from www.scikit-learn.org/
stable/modules/mixture.html
Bacterial Foraging Optimization-Based
Clustering in Wireless Sensor Network
by Preventing Left-Out Nodes

S. R. Deepa and D. Rekha

Abstract The primary aim of Wireless Sensor Network (WSN) design is achieving
maximum lifetime of network. Organizing sensor nodes into clusters achieves this
goal. Further, the nodes which do not join any cluster consume high energy in trans-
mitting data to the base station and should be avoided. There is a need to optimize
the cluster formation process by preventing these left-out nodes. Bacterial Foraging
Optimization (BFO) is one of the potential bio-inspired techniques, which is yet to
be fully explored for its opportunities in WSN. Bacterial Foraging Algorithm for
Optimization (BFAO) is used in this paper as an optimization method for improving
the clustering performance in WSN by preventing left-out node’s formation. The
performance of BFAO is compared with the Particle Swarm Optimization (PSO)
and LEACH. The results show that the BFAO performance is better than PSO and
LEACH in improving the lifetime of network and throughput.

Keywords Wireless sensor networks · Bacterial foraging algorithm · Particle


swarm optimization · Clustering · Routing protocol

1 Introduction

WSN has many sensor nodes characterized by limited resources, e.g., (i) power,
(ii) processing capability, (iii) less internal storage, and (iv) restricted transmis-
sion/reception capacity. The capabilities of sensors are further limited by the power
supply, bandwidth, processing power, and malfunction [1]. With such characteristics,
the deployment of WSN was seen in various commercial applications, e.g., moni-
toring habitats in forests, inventory tracking, location sensing, military, and disaster
relief operations [2]. The communication in WSN is characterized by three types of

S. R. Deepa (B) · D. Rekha


SCSE, VIT, Chennai, India
e-mail: [email protected]
D. Rekha
e-mail: [email protected]


routing protocols, e.g., flat, hierarchical, and location based [3, 4]. Many stochastic
algorithms such as LEACH, LEACH-C, PEGASIS, HEED, etc. are applied to solve
the WSN clustering problem [5]. In the majority of cases, the clustering process is found to be the most significant source of energy consumption. The theoretical background of WSN clearly defines clustering and routing, but very little literature attempts to bridge the gap between routing and a dynamic clustering process. This is because, in order to design a dynamic and adaptive form of routing along with clustering, the limited capabilities of the sensors must support such an algorithm implementation. With the existing routing and clustering schemes, the present state of the sensors is not robust enough to sustain a better network lifetime.
Some nodes are left out during cluster formation and do not join any cluster. Such nodes use higher power in transmitting data and may die over time, reducing the network lifetime. This problem calls for an optimization-based approach that can balance the trade-off between communication performance and computational efficiency. Such optimization concepts are usually built on a sophisticated mathematical approach and tend to be iterative in exploring better solutions to a problem. At present, there is growing discussion of the computational benefits of bio-inspired optimization techniques, which could assist in designing a dynamic process under real-time constraints and offer a strong outcome. Studies in this direction use approximate and heuristic algorithms involving swarm intelligence [6]; Bacterial Foraging Optimization (BFO) [7] and similar methods are employed to reduce the time complexity of combinatorial optimization problems. Although there are a good number of studies applying PSO and ACO to optimize WSN performance, there is little standard work on optimizing WSN clustering using BFAO. Here, a simple clustering optimization algorithm based on BFAO is introduced in which energy efficiency is prioritized. Section 2 discusses prior research on clustering techniques, followed by the problem and motivation in Sect. 3. The proposed research methodology is discussed in Sect. 4, followed by the system model in Sect. 5. The obtained results are then discussed, followed by a summary of the implementation.

2 Previous Works on WSN Clustering Problem

This section briefs the existing research work on improving clustering performance in WSN. The probabilistic, distributed, energy-aware LEACH routing approach is given by Heinzelman [8]. LEACH has low overhead and saves energy by rotating the CH role after a certain number of rounds. The disadvantage of the LEACH protocol is that low-energy nodes may be chosen as CH by its CH election protocol, which decreases the network lifetime. LEACH was later improved to LEACH-C, in which the base station (BS) assigns a predefined number of CHs during the setup phase depending on a residual energy criterion. Heuristic bio-inspired

optimization algorithms using swarm intelligence have been applied to obtain the global minimum communication cost. Swarm intelligence is the collective performance of tasks exhibited by colonies of ants, herds of sheep, swarms of bees, bats, colonies of bacteria, etc. PSO was proposed in [9]. The paper [10] applies the PSO algorithm to WSN cluster formation, and the paper [11] applies the PSO method for clustering in WSN to prevent residual nodes in the network. BFAO was proposed by Passino in 2002 [7]. The next section outlines the problems associated with the existing techniques and the motivation for this research work.

3 Problem and Motivation

The problems associated with the existing studies are as follows. The majority of existing approaches to improving clustering performance depend upon the cluster head selection strategy, which is normally based on a higher residual energy criterion. However, there may be a large number of dynamic and hidden attributes that significantly affect the selection process, and these have never been appropriately investigated. Nodes that do not join any cluster consume more energy during data communication due to the exchange of control packets; eventually such nodes die, leading to network death. There is therefore a need to prevent left-out node formation. Swarm intelligence offers a significant platform to model such latent attributes and thereby bring the dynamic behavior of the environment into the evaluation of WSN. However, existing swarm intelligence techniques are quite iterative in principle, and the majority of researchers have focused on distributed scenarios, which have not yet achieved better clustering performance. Moreover, there is no standard model reported that uses BFAO for enhancing clustering performance in WSN, which poses one of the major research gaps. Hence, the proposed study is “To design a centralized BFAO algorithm that could significantly improve the clustering performance in WSN and thereby enhance network longevity.”
A closer look at existing swarm intelligence algorithms shows that, if they are made to work in a centralized manner, they are more likely to be affected by traffic density and eventually cannot sustain the load. The BFAO algorithm does not lead to such nonlinearity problems. The probability of BFAO offering superior convergence is higher compared with other analytical models presented to date. Another important fact is that BFAO imposes a quite minimal computational burden and exhibits an enhanced capability to handle multiple objective functions. However, such potential features have been little investigated in WSN, which motivates the proposed study of using BFAO to enhance clustering performance.

4 Research Methodology

The proposed research work considers analytical modeling, with emphasis on the adoption of undirected graph theory, for constructing a novel clustering algorithm. The prime idea is to construct a centralized BFAO algorithm that can offer better clustering processing. Here, BFAO is centralized as well as hierarchical, which will ultimately provide a base for any distributed routing technique. The potential advantage of a centralized clustering scheme is that it ensures regular updates and seamless monitoring of any criterion (e.g., distance, energy, memory, algorithm execution, etc.) at a single instance.
The proposed system introduces a simple and novel clustering technique using
BFAO by preventing left-out node formation, which is also associated with the energy
modeling. The proposed system performs energy modeling using standard first-order
radio energy modeling that allows simple and easier inference of the energy perfor-
mance of the sensor nodes in every state of its data forwarding operations.
As the proposed system considers clustering as a discrete optimization problem
therefore, selection of CH is also considered to be very crucial in enhancing network
lifetime. The CHs are taken as the variables for the optimization. 2 × N coordinates
in N nodes, xi ∀ i = 1, 2, . . . , N, are the variables for optimizing clusters. The 2
× N-dimensional search space (P) spans N nodes deployed in a two-dimensional
region ranging from 0 to Xmax in the x-coordinate and 0 to Ymax in the y-coordinate. The set of solutions is given by the position of the ith bacterium, θi = (x1x, x1y, . . . , xNx, xNy). The objective function, called the nutrient function f in the BFAO algorithm, assigns a positive cost to each candidate solution θi ∈ P. The optimal solution θopt satisfies f(θopt) ≥ f(θ), ∀θ ∈ P. The objective function f is given by

f = α ∗ (ECHavg / EOavg) + (1 − α) ∗ (DOavg / DCHavg)    (1)

where α = 0.5, EOavg , ECHavg are the average values of residual energy for ordinary
nodes and CHs, respectively, and DOavg is the average value of distance for ordinary
nodes to BS, and DCHavg is the average distance of CHs to BS.
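A small Python sketch of this nutrient function is given below; the node coordinates, residual energies, and the choice of CH indices are illustrative assumptions.

import numpy as np

def nutrient(residual_energy, dist_to_bs, ch_idx, alpha=0.5):
    # Eq. (1): f = alpha * E_CHavg / E_Oavg + (1 - alpha) * D_Oavg / D_CHavg
    mask = np.zeros(len(residual_energy), dtype=bool)
    mask[ch_idx] = True
    e_ch_avg = residual_energy[mask].mean()
    e_o_avg = residual_energy[~mask].mean()
    d_ch_avg = dist_to_bs[mask].mean()
    d_o_avg = dist_to_bs[~mask].mean()
    return alpha * e_ch_avg / e_o_avg + (1 - alpha) * d_o_avg / d_ch_avg

# toy example: 10 nodes in a 100 m x 100 m field, BS at (110, 110)
rng = np.random.default_rng(0)
pos = rng.uniform(0, 100, size=(10, 2))
energy = np.full(10, 0.5)
dist = np.linalg.norm(pos - np.array([110.0, 110.0]), axis=1)
print(nutrient(energy, dist, ch_idx=[2, 7]))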
The percentage of nodes to become CHs, popt, is selected a priori; each node gets the chance to be selected as CH once in 1/popt rounds, which constitutes an epoch. The hierarchical routing protocol clusters the sensor nodes during each round of communication. Each communication round has two phases: the setup phase, to obtain the optimal routing path, and the steady phase, in which ordinary nodes transmit sensing data to the BS via the CHs. All the aggregated data are forwarded by the CHs to the BS. At the start, the BS runs the centralized optimization algorithm to choose the CHs along the minimum communication energy path. During the clustering process in the setup phase, the BS partitions the sensor nodes into Ordinary nodes (O) and CHs. In the steady phase, the nodes send the sensing data to their respective CH, which forwards it to the BS. The energy consumed is the sum of the energies of the setup and steady phases. In this work, nodes are considered to be in the transmission range of the BS, i.e., the node distance

to BS is less than threshold distance, d0 . The proposed system applies the following
operation in the cluster setup phase:
• CH selection phase at the BS;
• Assignment phase of assigning the ordinary nodes to the respective cluster; and
• The routing path of the CHs depends on the data load corresponding to the nodes in
cluster and distance from the BS; BS will find a multi-hop routing path to achieve
load balancing of the CHs.
• The BS transmits a message consisting of Node_ID (1 to Ntotal), Node_type (CH or ordinary node), Cluster_ID (1 to K), and Routing_Node_ID (a set of Node_IDs). In this work, all the nodes are considered to be within the radio (transmission) range of the base station.
– Every node receives this message.
The total setup-phase energy consumed by the sensor nodes, Esetup, equals the sum of Esetup_ON, the energy consumed by ordinary nodes, and Esetup_CH, the energy consumed by CHs. The energy Esetup is consumed in one round of the setup phase by all nodes in the region of interest to receive the la bits of the assignment message from the BS.

Esetup = la · Eelec · Ntotal    (2)

1. Energy expenditure in the steady phase, Esteady: Ordinary nodes transmit their information to the BS through the route sent by the base station in the setup phase. Hence, the energy spent during the steady phase is expressed as the sum of the total energy of the CHs and of the ordinary nodes in the active state.

Esteady = Total_ECH + Total_EO    (3)

2. Total energy expenditure in one round of communication: The communication cost, Jround, in one round of transmission of sensory information from ordinary nodes to the BS via the CHs is the sum of the total energy expenditure by the CHs and the ordinary nodes.

Jround = la · Ntotal · Eelec + Total_ECH + Total_EO    (4)

Hence, the proposed system introduces a significant enhancement in the clustering process, where the energy factor is split into the various sub-energies involved in clustering and communication. BFAO sorts the nodes according to a fitness factor and permits cluster formation only on the basis of the highest fitness value. It does this by considering neighboring sensors, and each consecutive sensor from the sorted list constructs a cluster by including the nodes in its transmission range. This operation continues until every cluster is formed. The next section outlines the designed system model.

5 Proposed System Model

In the current work, a centralized BFO algorithm is used to optimize cluster formation within the WSN; centralized clustering algorithms are suitable for managing many nodes. It uses the nutrient function defined in Eq. 1. The standard radio energy model [9] is used to calculate the energy consumed in each round of communication. Each ordinary node in the cluster transmits l bits of sensory data to the CH.
The amplification energy for transmission is εamp_fs = 0.01e−9 J/bit/m² in the free-space model (d < d0) and εamp_mp = 1.3e−15 J/bit/m⁴ in the multi-path model (d > d0), where d is the distance between nodes.
The electronics energy for transmission and reception is Eelec = 50.0e−9 J/bit.
The threshold distance is d0 = (εamp_fs / εamp_mp)^0.5 = 87.7 m.
The Ntotal nodes are partitioned into k clusters; each cluster contains nj ordinary nodes and one CH, so Ntotal = Σ_{j=1}^{k} (nj + 1). Equations 5–13 describe the energy consumption of CHs and ordinary nodes in one round of transmission of sensory data.

5.1 Modeling Energy Consumption for CH

The CH in cluster Cj receives l bits of data from each of the nj ordinary nodes in the cluster, so a total of l · nj bits are received by the CHj. Therefore, the energy depleted by each CH due to reception is given by the following equation:

E_RxCHj(l · nj) = l · nj · Eelec    (5)

The CHj transmits the information of the nj nodes of cluster Cj and also the data from its own sensory region, so a total of (l + 1) · nj bits are transmitted by the CHj. The energy depleted by the CHj in transmitting (l + 1) · nj data bits over a distance of Dj meters is given by the following equation:

E_TxCHj((l + 1) · nj, Dj) = (Eelec + εamp · Dj²) · (l + 1) · nj    (6)

The energy used by the k CHs for transmission, reception, and amplification can be empirically expressed as

ECH = Σ_{j=1}^{k} [(2 · l + 1) · nj · Eelec + εamp · Dj² · (l + 1) · nj]    (7)

(a) The sum of consumed energy of q CHs in one round of transmission of data for
D j ≤ d0 (nearer nodes) is


Total_ECH = Eelec · (2 · l + 1) · (Ntotal − q) + Σ_{j=1}^{k} εamp_fs · (l + 1) · nj · Dj²    (8)

(b) Total energy consumed by q CHs in one round of transmission of data for
D j > d0 (far nodes) is


Total_ECH = Eelec · (2 · l + 1) · (Ntotal − q) + Σ_{j=1}^{k} εamp_mp · (l + 1) · nj · Dj⁴    (9)

5.2 Energy Consumption by Member Nodes

The cluster Cj has nj ordinary nodes, and each sends l bits of data to the CHj, with node i at a distance dij from the CH. An ordinary node in cluster Cj loses energy EON(l, dij) during transmission and amplification of the l bits of data to the CHj. Equation 10 gives the total amount of energy depleted by the nj sensors in cluster Cj:

E_OTxCj = Σ_{i=1}^{nj} (Eelec + εamp · dij²) · l    (10)

The total energy consumed by all ordinary sensors in the region is equal to the energy consumed during transmission and amplification, as there is no reception of data.

EO = Σ_{j=1}^{k} Σ_{i=1}^{nj} (Eelec + εamp · dij²) · l    (11)

The total energy consumed by the (Ntotal − q) ordinary nodes is:

(a) for dij ≤ d0,

Total_EO = Σ_{j=1}^{k} Σ_{i=1}^{nj} (Eelec + εamp_fs · dij²) · l    (12)

(b) for dij > d0,

Total_EO = Σ_{j=1}^{k} Σ_{i=1}^{nj} (Eelec + εamp_mp · dij⁴) · l    (13)

The communication cost, J, in one round of transmission of sensory data from ordinary nodes to the BS via the CHs is the sum of the total energy expenditure by the CHs and the ordinary nodes.

J = Total_ECH + Total_EO    (14)

The abovementioned equation for computing total energy consumption is applicable to all the member nodes irrespective of their individual membership under each CH. The mechanism is further optimized by using the BFAO algorithm, a global search method that mimics the foraging behavior of bacteria to obtain a near-optimal solution.
method mimicking the foraging behavior of bacteria to obtain a near optimal solu-
tion.
The clustering problem is transformed into optimization problem through two
techniques. The first one is the encoding strategy and criterion function. The encoding
technique represents the candidate solution as a particle in PSO, or a bacterial in
BFAO, or a chromosome in GA. The clustering property is evaluated by the criterion
function, called as the fitness function in PSO and nutrient function in BFAO. The
nutrient function is the weighted sum of the objective functions to be optimized in the
problem. The foraging E.coli bacteria are utilized to resolve optimization problem.
The position of each bacterium contains the candidate solutions to the problem.
The bacterium position is updated according to three mechanisms, Nc chemotactic
(swimming and tumbling) steps, Nre reproduction steps,  and Ned elimination
 and
dispersal steps. The position of ith bacterium, θi = θi (n)|n = 1, . . . , p having p
elements is updated in the p-dimensional search space at any time in the loop of
j = 1, . . . , Nc , m = 1, . . . , Ns , k = 1, . . . , Nre and l = 1, 2, . . . , Ned .
In Fig. 1, the flowchart of BFAO is depicted. Bacteria are initially positioned at
several points in the P search space. Their positions are updated to find the minimum
of fitness function f. Each chemotactic step, j, generates a unit length random direction
function ϕ(j) due to tumbling. The cost function at the new position at the beginning of
each chemotactic step is updated by adding the cost, Jcc , due to cell-to-cell signaling to
the previous objective function, f. It increases the cost function by releasing attractant
chemicals of depth dattractant = 0.1 and width, wattractant = 0.2 and repellant chemicals
of height h repellant = 0.1 and width, wrepellant = 10 [8].
The objective function, f, changes by Jcc because of swarming in case of nutrient or
noxious environment. The nutrient environment in case of maximization of objective
function will attract other bacteria at the peak position of the function. The bacterial
will make longer swim steps and swim up the nutrient gradient to make a swarm.
The noxious environment in case of minimization of objective function will attract
the bacteria around the valley. The bacteria will climb down the noxious gradient
with shorter swimming step and more tumbling. The bacteria then swim a distance
denoted by step size C, in the direction of tumbling. Positions of bacteria are updated
during the jth chemotactic step by θi(j + 1, k, l) = θi(j, k, l) + C(i, m) × ϕ(j), where
C(i, m) represents the swimming step of the ith bacterium in the mth swimming step.
If the objective function at the new position is increased, then the bacterium
takes a swimming step. The new position is calculated, and the objective function is

Fig. 1 Process flow diagram of proposed BFAO

evaluated. If the objective function increases, the bacteria swim and reach the max-
imum number of steps Ns. If the objective function does not improve, then the
bacterium takes the next chemotactic step from j + 1 to j + 2 and tumbles with the
corresponding function ϕ(j + 2). The bacterium reproduces after Nc chemotactic
steps. The health of bacterium is represented by the sum of the costs in the chemo-
tactic steps. So, the bacteria that are stagnant in the local maximum and do not swim
are replaced by the dynamic bacteria. These dynamic bacteria have taken many
swimming steps and are in the correct direction of reaching the global maximum.
The health function is considered to sort bacteria. Half of the healthiest bacteria are
allowed to reproduce so that the total number of bacteria remains the same. Max-
imum four generations are replicated from the reproduction steps. The elimination
and dispersal step deletes a certain number of bacteria randomly by comparing with

the elimination-dispersal probability Ped , and reallocates the same number of bacte-
ria randomly. This will widen the search space of bacteria. Bacteria are allowed to
undergo the elimination-dispersal phase once while searching for the global maxi-
mum.
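The following Python sketch illustrates the basic chemotaxis loop (tumble, then swim while the nutrient improves) on an arbitrary objective; the step size, step counts, and the simple quadratic test function are assumptions, and swarming, reproduction, and elimination–dispersal are omitted for brevity.

import numpy as np

def chemotaxis(f, theta, C=0.1, Nc=50, Ns=4, rng=np.random.default_rng(0)):
    # one bacterium: Nc chemotactic steps of tumble-and-swim, maximizing the nutrient f
    best = f(theta)
    for _ in range(Nc):
        phi = rng.standard_normal(theta.shape)
        phi /= np.linalg.norm(phi)            # unit-length random tumble direction
        for _ in range(Ns):                   # swim while the nutrient keeps improving
            candidate = theta + C * phi
            if f(candidate) > best:
                theta, best = candidate, f(candidate)
            else:
                break
    return theta, best

# toy nutrient function with a maximum at (3, -1)
f = lambda x: -np.sum((x - np.array([3.0, -1.0])) ** 2)
print(chemotaxis(f, theta=np.zeros(2)))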

5.3 Proposed Clustering Using BFAO

The BFAO algorithm is divided into the following steps to be implemented for
clustering of WSN.
• Initialization: At the beginning, Ntotal nodes with 2 × Ntotal coordinates are randomly deployed in the given region, and all nodes are given initial energy E0. S bacteria are assigned S candidate solutions for the coordinates of the q CHs; each bacterium contains 2 × q random coordinates within the given region.
• CH Selection: The 2 × q numbers of coordinates are obtained from the positions
of bacteria and are assigned as the CHs. The localization operator assigns the
coordinates to the nodes. The nodes closest to these coordinates are assigned as
CHs, provided they satisfy residual energy criteria. The residual energy criteria
should be more than average residual energy of the total nodes.
– Formation of CHj (j = 1, 2, . . . , q)
– Find out the distance Dj of CHj from the BS
• Cluster formation: Sensor nodes are partitioned into ordinary node and CH by
assigning ordinary nodes to the cluster depending on their proximity to the respec-
tive CHs.
– Assign ordinary nodes, Oi (i = 1, 2, . . . , Ntotal − q), to the cluster Cj (j = 1, 2, . . . , q) whose CHj is at the shortest distance.
– In cluster Cj, nj ordinary nodes are found.
– Find the distance dij of Oi (i = 1, 2, . . . , nj) from CHj.
• Communication energy calculation: Objective function also called as nutrient
function, f, is calculated for each bacterium.
• Update of nutrient function with swarming cost: The swarming cost Jcc is added
to the nutrient function, f, to obtain the new nutrient function for each bacterium.
• Update of positions by BFAO Algorithm: Updation of each bacterium position
is done according to the bacteria foraging algorithm. The new CHs are assigned at
the end of the chemotactic stage, reproduction, elimination, and dispersal stages
for each round of communication.
The proposed system applies the optimization on the basis of the cost factor involved in the communication process. The nodes are sorted according to fitness value; the node with the highest fitness constructs a cluster by including the nodes in its transmission range, and the next node in the sorted list likewise forms a cluster from the nodes in its transmission range, until all the clusters are formed.
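A compact Python sketch of the CH-selection and cluster-formation steps is given below. Mapping a bacterium's coordinates to the nearest node with above-average residual energy and assigning ordinary nodes to the nearest CH follow the description above; the node data are illustrative assumptions.

import numpy as np

def select_chs(bacterium, node_pos, node_energy):
    # map each (x, y) pair encoded in the bacterium to the closest node
    # whose residual energy is above the population average (CH candidates)
    coords = bacterium.reshape(-1, 2)
    eligible = node_energy >= node_energy.mean()
    chs = []
    for c in coords:
        d = np.linalg.norm(node_pos - c, axis=1)
        d[~eligible] = np.inf
        chs.append(int(np.argmin(d)))
    return chs

def form_clusters(node_pos, chs):
    # assign every ordinary node to the nearest CH
    d = np.linalg.norm(node_pos[:, None, :] - node_pos[chs][None, :, :], axis=2)
    return np.argmin(d, axis=1)        # index into chs for every node

rng = np.random.default_rng(1)
pos = rng.uniform(0, 100, size=(20, 2))
energy = rng.uniform(0.2, 0.5, size=20)
bacterium = rng.uniform(0, 100, size=2 * 3)   # candidate coordinates for q = 3 CHs
chs = select_chs(bacterium, pos, energy)
print(chs, form_clusters(pos, chs))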

Table 1 Simulation results using BFAO


Nodes | Alive nodes (after 500 rounds) | Residual energy, J (after 500 rounds) | Throughput, bits (after 500 rounds) | Variance (after 500 rounds) | First node dead (round) | Half of the nodes dead (round)
50 | 50 | 128881 | 10 × 10^7 | 0.00547 | 538 | 876
100 | 100 | 322255 | 1 × 10^7 | 0.01972 | 738 | 1713

Table 2 Comparison of clustering algorithms: first node dead and half of the nodes dead (in communication rounds)
Clustering algorithm | First node dead (100 nodes) | Half of the nodes dead (100 nodes) | First node dead (50 nodes) | Half of the nodes dead (50 nodes)
LEACH | 65 | 249 | 110 | 218
PSO | 178 | 1319 | 197 | 1298
BFAO | 738 | 1713 | 538 | 876

Table 3 Comparison of LEACH, PSO, and BFAO for 50 nodes and 500 communication rounds
Clustering algorithm | Alive nodes | Residual energy (J) | Throughput (bits) | Variance
LEACH | 6 | 0.7279 | 2.6 × 10^7 | 0.00207
PSO | 38 | 132653 | 14.7 × 10^7 | 0.02884
BFAO | 50 | 132855 | 15 × 10^7 | 0.00547

6 Result Analysis

The assumptions for the numerical simulation are similar to those of LEACH [8]. The nodes and the BS are stationary. The BS is placed at (110 m, 110 m), which is outside the deployment region of the sensors. The nodes are deployed randomly, and all sensor nodes are homogeneous. Every sensor node sends a fixed amount (2000 bits) of data in each round. The simulation is performed with the optimal percentage of nodes to become CH per round per epoch, popt, set to 5%. There are 100 homogeneous nodes deployed in a region of 100 m × 100 m. Nodes have an initial energy of 0.5 J, and the size of each packet is 2000 bits. The epoch consists of 1/popt = 20 rounds, so on average five nodes become CH in each round, and each node becomes CH once in 20 rounds.
Table 1 depicts the results of simulation considering 50 nodes and 100 nodes in
the network. The analysis is done by giving importance to the network longevity that
is calculated with respect to the first node death as well as half of the node death.
Table 2 shows that BFAO is efficient in improving network lifetime compared to
LEACH and PSO.
Table 3 depicts that BFAO is efficient in terms of residual energy, alive nodes, and
throughput compared to LEACH and PSO.

Fig. 2 Alive nodes analysis

The following graphs show the simulation results considering 100 nodes deployed in the network.
The outcomes of the proposed study show that BFAO offers greater network sustainability with better communication performance in terms of the number of alive nodes and the throughput. Figure 2 shows that the number of alive nodes declines quickly for PSO and LEACH, whereas it degrades slowly for the proposed BFAO technique. The graph shows that the BFAO-based approach sustains a larger number of alive nodes as the simulation rounds increase, owing to its optimization technique. A similar trend can be seen in Fig. 3, where the throughput of the presented approach improves significantly as the simulation rounds increase. LEACH ceases to perform by about the 700th round, PSO sustains until about the 1400th round, and BFAO exceeds 1600 rounds. The reasons for these trends are as follows:
(i) The centralized positioning of the base station in LEACH leads to a faster rate of energy consumption due to the increasing traffic load from each cluster to the base station. (ii) PSO explores the global optimum well and produces energy-efficient routes after clustering; however, the process is highly iterative, and beyond about the 800th round the CHs become overloaded, causing many nodes to die. The proposed system solves this problem of faster node death by ensuring a highly distributed form of behavior in its optimization stages. This leads to an even distribution of network traffic among the nodes, leading

Fig. 3 Throughput analysis

to a slower rate of depletion. A closer look at the throughput curve shows that LEACH collapses very soon, PSO offers more than a 30% enhancement compared to LEACH, and the proposed system offers much better throughput than PSO, as it is not affected by network size and converges well irrespective of the network condition.
In Fig. 4, the residual energy remains higher for BFAO because messages are routed along the global minimum-energy path. A closer look shows that the residual energy of LEACH falls steeply, while PSO and the proposed system show a smooth decline with increasing rounds. The consistency of PSO is good up to about the 600th round, after which it deteriorates owing to energy consumption. In contrast, the proposed BFAO remains consistent up to the maximum simulation round. Hence, the proposed clustering offers a better network lifetime.
Figure 5 shows the variance of the residual energy of the nodes. PSO includes many recursive steps, which increases the energy demand on the sensor nodes. At the same time, the synchronization between local and global optima in PSO is imperfect, which results in higher fluctuation while transmitting data. The energy fluctuation for both LEACH and PSO is quite intermittent, while the proposed BFAO algorithm shows much smaller fluctuation. The BFAO algorithm takes care of the load-balancing problem and hence is more robust than the LEACH algorithm.

Fig. 4 Residual energy analysis

Figure 6 shows that BFAO outperforms PSO and LEACH in reducing the number of left-out nodes. The outcome shows that the proposed system allows more sensor motes to participate in data aggregation, whereas PSO and LEACH cause more node depletion owing to energy exhaustion, resulting in progressively fewer clustered nodes; for this reason, the number of left-out nodes is higher for LEACH as well as PSO. The proposed study is intentionally not compared with the latest variants of PSO or LEACH, as the various variants use unique schemes that do not offer directly comparable outcomes; hence the proposed system is compared only with the generic versions of PSO and LEACH, which are also the basis of any new variant. The performance of the proposed algorithm is expected to be nearly uniform when it is exposed to a sparse or dense network, with only negligible deviation in the outcome for a different test environment. Since the proposed system is evaluated using randomly positioned nodes and the simulation iterations involve different deployment configurations, the outcome of the proposed system is well justified and applicable to both dense and sparse networks.

Fig. 5 Analysis of energy variance

Fig. 6 Left-out node comparison among different protocols (y-axis: number of nodes; bars for BFAO, PSO, and LEACH)

7 Conclusion

The Bacterial Foraging Algorithm for Optimization (BFAO) is proposed in this paper
to optimize cluster formation in WSN and reduce the number of left-out nodes to
enhance network lifetime. The left-out nodes consume more energy for data trans-
mission to base station and eventually die, thereby reducing the network lifetime.
The prevention of left-out node formation is done by using BFAO. This reduction is

done by considering the fitness of every node in deciding the cluster heads by BFAO.
The simulation results show that BFAO outperforms in reducing the left-out nodes,
increasing throughput, and improving the network lifetime.

References

1. Zhang, Y., Laurence Yang, T., Chen, J. (eds.): RFID and Sensor Networks: Architectures, Proto-
cols, Security, and Integrations. Wireless Networks and Mobile Communications, pp. 323–353.
CRC Press, Boca Raton, Fl (2009)
2. Kahn, J.M., Katz, R.H., Pister, K.S.J.: Next century challenges: scalable coordination in sen-
sor networks. In: MobiCom1999: Proceedings of the 5th Annual ACM/IEEE International
Conference on Mobile Computing and Networking, New York, USA, pp. 271–278 (1999)
3. Kulik, J., Heinzelman, W.R., Balakrishnan, H.: Negotiation-based protocols for disseminating
information in wireless sensor networks’. Wirel. Netw. 8,169–185 (2002)
4. Subramanian, L., Katz, R.H.: An architecture for building self configurable systems. In: Mobi-
HOC 2000: Proceedings of First Annual Workshop on Mobile and Ad Hoc Networking and
Computing, Boston, MA, pp. 63–73 (2000)
5. Banerjee, S., Khuller, S.A.: Clustering scheme for hierarchical control in multi-hop wireless
networks. In: IEEE INFOCOM 2001. Proceedings of Conference on Computer Communica-
tions; Twentieth Annual Joint Conference of the IEEE Computer and Communications Society,
Anchorage, AK, vol. 2, pp. 1028–1037 (2001)
6. Wang, X., Li, Q., Xiong, N., Pan, Y.: Ant colony optimization-based location-aware routing
for wireless sensor networks. In: Li, Y., Huynh, D.T., Das, S.K., Du, D.Z. (eds.) Wireless
Algorithms, Systems, and Applications, WASA 2008. Lecture Notes in Computer Science, vol
5258. Springer, Heidelberg (2008)
7. Passino, K.M.: Biomimicry of bacterial foraging for distributed optimization and control. IEEE
Control Syst. Mag. 22, 52–67 (2002)
8. Heinzelman, W.B., Chandrakasan, A.P., Balakrishnan, H.: An application specific protocol
architecture for wireless microsensor networks. IEEE Trans. Wireless Commun. 1, 660–670
(2002)
9. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: ICNN 1995: Proceedings of Inter-
national Conference on Neural Networks, vol. 4, pp. 1942–1948 (1995)
10. Guru, S., Halgamuge, S., Fernando, S.: Particle swarm optimisers for cluster formation in
wireless sensor networks. In: Proceedings of International Conference on Intelligent Sensors,
Sensor Networks and Information Processing, pp. 319–324 (2005)
11. Rejina Parvin, J., Vasanthanayaki, C.: Particle swarm optimization-based clustering by preventing residual nodes in wireless sensor networks. IEEE Sens. J. 15, 4264–4274 (2015)
Product Prediction and Recommendation
in E-Commerce Using Collaborative
Filtering and Artificial Neural Networks:
A Hybrid Approach

Soma Bandyopadhyay and S. S. Thakur

Abstract In modern society, online purchasing through popular websites has become a new trend, driven by the rapid growth of E-commerce business. These E-commerce systems cannot provide one-to-one recommendation; as a result, customers find it hard to decide which products they may purchase. The main concern of this work is to increase product sales while ensuring that the system satisfies the needs of regular customers. This paper presents an innovative approach using collaborative filtering (CF) and artificial neural networks (ANN) to generate predictions that may help students with their future requirements. In this work, the buying pattern of students who are going to face campus interviews has been taken into consideration; in addition, the buying pattern of alumni for luxury items was also considered. The recommendation has been done for these products, and the results generated by the approach are quite interesting.

Keywords Predictions · Recommender system · E-Commerce · Collaborative


filtering · Nearest neighbors · Artificial neural networks

1 Introduction

As recommendation has become a part of day-to-day life, we rely on external infor-


mation before taking any decision about an artifact of interest. After getting user’s
preferences, if accurate prediction algorithm is applied personalized recommenda-
tion can be done more correctly [1, 2].
According to Cosley et al. and Ziegler et al., it is essential to publish the infor-
mation related to recommendation so that system designers and users can utilize

S. Bandyopadhyay (B) · S. S. Thakur


MCKV Institute of Engineering, Howrah, West-Bengal, India
e-mail: [email protected]
S. S. Thakur
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2020 59


J. K. Mandal and D. Sinha (eds.), Intelligent Computing Paradigm: Recent
Trends, Studies in Computational Intelligence 784,
https://doi.org/10.1007/978-981-13-7334-3_5
60 S. Bandyopadhyay and S. S. Thakur

the same in future recommendation purpose [3, 4]. Though algorithmic accuracy is
important, it is not only adequate tool to analyze the accuracy of any recommendation
systems. Over a prolonged period, content-based and hybrid recommender systems
have been developed to provide recommendation [5]. It has been observed that fea-
tures, ratings, etc. are taken into consideration in case of content-based recommender
systems. Sometimes the reviews of customers are on various products that are also
taken to develop proper recommendation system [6].
In Sect. 1, literature survey related to the work was mentioned and the remaining
part of the paper is organized as follows: In Sect. 2, discussion on details related to
memory-based collaborative filtering technique has been done. Neural network pre-
liminaries and classification approach and related algorithm are explained in Sect. 3.
The focus on the proposed work was mentioned in Sect. 4. The implementation of
the work using collaborative filtering and ANN has been explained in Sect. 5. The
experimental results are discussed in Sect. 6. We conclude this paper with discussion
on future work in Sect. 7.

2 Memory-Based Collaborative Filtering Technique

Nowadays, memory-based CF uses either the entire database or a sample of user–item


database to generate prediction and recommendation for an active user (new user)
in E-commerce website. Both user-based and item-based collaborative filtering
approaches are used to find target user’s nearest neighbor [7].

2.1 Similarity Computation

In memory-based CF algorithms, similarity computation between item and user is a


crucial task. In order to find similarity, either weight wi, j or wu,v is calculated between
two items i and j or between two users u and v, respectively. Here, two items i and j
are the items to which the users have rated in similar manner. Similarly, u and v are
the users having same preferences of items or who have created the same items.
Correlation-Based Similarity. In this case, wu,v between two users u and v or
wi, j between two items i and j are computed by Pearson correlation (PC) or other
correlation method. To get accurate result, first the co-rated cases are isolated, and
the PC is computed between users u and v using the following equation:
   
i∈I r u,i − r̄ u r v,i − r̄ v
wu,v =   2   2 (1)
i∈I r u,i − r̄ u i∈I r v,i − r̄ v
Product Prediction and Recommendation in E-Commerce … 61

Here, i ∈ I summations are over the items for which both the users u and v have
given rating. r̄u denotes the average rating of the co-rated items of the uth user.
Equation 2 is used to determine PC for item-based CF algorithm.
   
u∈U r u,i − r̄i r u, j − r̄ j
wi, j =   2   2 (2)
u∈U r u,i − r̄ i u∈U r u, j − r̄ j

In this case, the set of users who rated both items i and j is denoted by u ∈ U, ru,i
denotes the rating of user u on item i, and r̄i is the average rating of the ith item by
the same users.
Vector Cosine-Based Similarity. In this case, instead of documents, either users
or items are used and instead of word frequencies ratings are used. Vector cosine
similarity between items i and j is given by
  i · j
wi, j = cos i, j =     (3)
  
i  ∗  j 

Adjusted Cosine Similarity. The measurement of adjusted cosine similarity where


M k,x denotes the rating of user k on item ix , and M̄k represents the average rating
value of user k on all items.
m
  (Mk,x − M̄k ) × (Mk,y − M̄k )
sim i x , i y =  k=1 m (4)
m
k=1 (Mk,x − M̄k ) k=1 (Mk,y − M̄k )
2 2

2.2 Computing Prediction

In the neighborhood-based CF approach, by finding similarity with active user, a


subset of nearest neighbors is chosen. Then, a weighted aggregate of their ratings
is computed which is used for generating predictions in future. After computing the
similarity between items, a set of k most similar items to the target item are selected
and a predicted value for the target item is generated. The weighted sum measurement
is used as follows:
k   
j=1 Ma, j sim i j , i t
Ma,t = k   (5)
j=1 sim i j, i t

Here, M a,t represents the prediction value of target user Ua on target item it . Only
the k most similar items are used to generate the prediction. Similarly, we compute
item-based similarities on the user–item matrix M [7].
62 S. Bandyopadhyay and S. S. Thakur

3 Neural Network Preliminaries and Classification


Approach

Artificial neural network (ANN) is a widely used as computation model that can pro-
cess information. The information from available dataset is taken by the input nodes,
and the summation of input is taken and activation function is used for producing
output of the neuron. Multilayer perceptron (MLP) is a popular algorithm that takes
back propagation error as input of the model. If correct parameters are available for
a particular dataset at the time of training, any algorithm may work well but lot of
testing is required when any algorithm is applied on new dataset [8].

3.1 Feedforward Neural Network (FFNN)

In FFNN, information moves from the input nodes to the output nodes in forward
direction through hidden nodes. In this case, the weight denotes the knowledge of
the network and weighted sum of the input is calculated which is used as new input
values for the next layer. This process is an iterative process until it goes through all
layers and finally provides the output.

3.2 Backpropagation Algorithm (BPA)

BPA is used to measure how each neuron in the network contributed to the overall
error after the processing of data. In this method, each weight is adjusted in proportion
to its contribution in overall error. If the error of each weight can be minimized, good
prediction result can be achieved.

3.3 Algorithm

Step 1: Create a neural network (NN).


(a) Set hidden unit equal to 1.
(b) Randomly initialize all weights.
Step 2: Train the artificial neural network by using training dataset.
Step 3: Compute error function on valid dataset.
Step 4: Calculate efficiency.
Step 5: Check error function and efficiency are within range.
Step 6: Stop the process if error function is acceptable else add one hidden
unit to the hidden layer and go to step 2.
Product Prediction and Recommendation in E-Commerce … 63

4 Proposed Work

This work presents a method which helps to analyze customers’ buying pattern and to
find customers’ future requirement. FFNN and BPA have been applied, for training
the neural network (NN) for future prediction. The weights are generated randomly
for each neuron, and final input is obtained using activation function as function. The
block diagram of the proposed system is shown in Fig. 1.
Initially, a survey was done which comprises 20 student’s dataset and question-
naire are explained to them. Then the students of different engineering institutes
were provided a form using Google form and were requested to participate in data
collection. Incomplete dataset was removed using normalization. The total number
of dataset available in the database is 1050. The customers are students of same age
group varying from 21 years to 23 years, who are going to participate in campus
interviews.
Another survey form was designed for the students who have already been placed
and joined some organization. They were asked to rate different items which they
would like to purchase in near future. Data of 915 alumni were taken who have
rated 10 different items which they wish to purchase. It has been observed that
there was a high probability among the users who want to purchase iPhone. In this
work, our major focus is for offline rating of luxurious items keeping in mind both
the middle-class and high-income groups. For rating data, variances in user rating
styles were taken into consideration. Here, we compare the different offline item-
based collaborative filterings in the context of two different datasets. The datasets
are of the students who have passed in the year 2015 and 2016. Out of collected 915
user–item datasets, only 15 user’s rating’s datasets have been shown in Table 1.
Another form was designed, and feedback was taken from the pass-out students,
1 year after their joining in the company, and was used to measure the accuracy of
prediction. Standard mean absolute error (MAE) between the prediction and actual
purchased data of 1 year has been taken for consecutive 2 years. The purchased data
of 665 alumni were collected and used in this work.

Fig. 1 Block diagram of the proposed system


64

Table 1 User–item datasets with ratings


I1 I2 I3 I4 I5 I6 I7 I8 I9 I10
iPhone Laptop HDD Digital camera Motorcycle Jewelry Refrigerator Air conditioner LED television Car

User1 5 4 3 2 1
User2 4 5 2 3 1
User3 5 4 3 4 1
User4 1 5 2 3 4
User 5 2 3 1 5 4
User6 1 4 3 5 2
User 7 1 5 4 3 2
User 8 2 5 1 4 3
User 9 4 5 3 2 1
User 10 5 1 2 4 3
User 11 3 1 5 2 4
User 12 4 2 5 3
User 13 4 3 5 2 1
User 14 1 5 4 3 2
User 15 2 1 5 3 4
S. Bandyopadhyay and S. S. Thakur
Product Prediction and Recommendation in E-Commerce … 65

5 Implementation of the Work Using Collaborative


Filtering and ANN

Initially, a dataset has been prepared based on the feedback of students participated
in survey. The dataset was divided into two parts: one is for training and other is
for testing purposes. At first, 70% of the data have been used for training purposes
and 30% of data were used for testing purposes, and the accuracy was found to be
74.5%. In the second case from the same dataset, 60% of the data were used for
training purposes and 40% of the data were used for testing purposes. In this case,
accuracy was 72.8%. In the last case, the dataset was partitioned into two parts where
50% of the datasets were used for training purposes and 50% of datasets were used
for testing purposes. In this case, accuracy was 71.6%. Table 2 clearly shows the
accuracy in % with different sizes of training and test dataset.
Table 3 shows the prediction about how many numbers of alumni will purchase
different items like iPhone, digital camera, etc. It also shows the actual purchase
data of the alumni and mean absolute error. To get the optimal results, there is a
requirement of changing the number of neurons and number of iterations.

Table 2 Dataset used for training and testing purpose


Training dataset (%) Test dataset (%) Accuracy in %
70 30 74.5
60 40 72.8
50 50 71.6

Table 3 Predicted and actual purchased data


Item name Predicted purchased data Actual purchased data Error MAE
iPhone 452 478 26 23.6
Laptop 134 98 36
HDD 35 22 13
Digital 292 246 46
camera
Motor cycle 65 42 23
Jewelry 105 90 15
Refrigerator 76 54 22
Air 206 233 27
conditioner
LED 167 145 22
television
Car 13 07 06
66 S. Bandyopadhyay and S. S. Thakur

Table 4 Total number of different attributes for input, predicted outputs, and actual output
Items Attributes Input(X) Predicted O/P(Ŷ) Actual O/P(Y) Error
1 Shirt 150 160 170 10
2 Trouser 132 140 150 5
3 Salwar 80 87 90 3
4 Shoe 55 41 50 9
5 Socks 43 32 40 8
6 Watch 13 15 10 1
7 Tie 32 27 25 8
8 Belt 18 9 15 6
9 Blazer 12 17 15 5
10 Cosmetics 45 50 52 2

ANN was applied to the database of 1050 students collected from different engi-
neering institutes for the products they have purchased before campus interviews.
The results are shown in Table 4.

6 Experimental Results

The results of collaborative filtering show that 68% of the placed students have shown
interest and rated iPhone as the first product they want to purchase after their joining
in an organization within 1 year, followed by digital camera which was opted by 44%

Fig. 2 Plot of actual versus predicted output


Product Prediction and Recommendation in E-Commerce … 67

of the students. 31% said that they would like to purchase air conditioner. Feedback
data show that 72% of the alumni bought iPhone, 37% of them bought digital camera,
and 35% bought air conditioner within a year. In Fig. 2, the plot shows the difference
between actual and predicted outputs. From Table 4 and Fig. 2, it is to be noted that
the probability of purchasing of shirt and trouser is maximum, while the purchasing
of shirt and trouser is maximum, while the purchasing of blazer, watch, belt, or tie
is minimum.

7 Conclusion and Future Work

At present, customers’ expectations are high as cost and quality are concerned and
at the same time manufacturers may compromise on profits as to competitors, due
to the dynamic business scenario. The performance of the model was found to be
good. In future, the same model can be used for prediction and recommendation,
with large database, i.e., newly added items.

References

1. Burke, R.: Hybrid recommender systems: survey and experiments. User Model. User-Adap. Int.
12, 331–370 (2002)
2. McNee, S., Riedl, J., Konstan, J.: Being accurate is not enough: how accuracy metrics have
hurt recommender systems. In: 24th International Conference Human Factors in Computing
Systems, Montréal, Canada, pp. 1097–1101 (2006)
3. Cosley, D., Lam, S., Albert, I., Konstan, J., Riedl, J.: Is seeing believing?: how recommender sys-
tem interfaces affect users’ opinions. In: SIGCHI Conference on Human Factors in Computing
Systems, Ft. Lauderdale, FL, pp. 585–592 (2003)
4. Ziegler, C., McNee, S., Konstan, J., Lausen, G.: Improving recommendation lists through topic
diversification. In: 14th International World Wide Web Conference, Chiba, Japan, pp. 22–32
(2005)
5. Huang, Z., Chung, W., Chen, H.: A graph model for E commerce recommender systems. J. Am.
Soc. Inform. Sci. Technol. 55(3), 259–274 (2004)
6. Liu, Z.B., Qu, W.Y., Li, H.T., Xie, C.S.: A hybrid collaborative filtering recommendation mech-
anism for P2P networks. Futur. Gener. Comput. Syst. 26(8), 1409–1417 (2010)
7. Paul, D., Sarkar, S., Chelliah, M., Kalyan, C., Nadkarni, P.P.S.: Recommendation of high-quality
representative reviews in E-commerce. In: Proceedings of the Eleventh ACM Conference on
Recommender Systems, Como, Italy, pp 311–315 (2017)
8. Baha’addin, F.B.: Kurdistan engineering colleges and using of artificial neural network for
knowledge representation in learning process. Int. J. Eng. Innov. Tech. 3(6), 292–300 (2013)
PGRDP: Reliability, Delay,
and Power-Aware Area Minimization
of Large-Scale VLSI Power Grid
Network Using Cooperative Coevolution

Sukanta Dey, Sukumar Nandi and Gaurav Trivedi

Abstract Power grid network (PGN) of a VLSI system-on-chip (SoC) occupies a


significant amount of routing area in a chip. As the number of functional blocks is
increasing in an SoC chronologically, the need of the hour is to have more power
lines in order to provide adequate power connections to the extra-added functional
blocks. Therefore, to accommodate more functional blocks in the minimum area
possible, the PGN should also have minimum area. Minimization of the area can be
achieved by relaxing few power grid constraints. In view of this, due to the resistance
of the PGN, it suffers from considerable reliability issues such as voltage drop noise
and electromigration. Further, it also suffers from the interconnect delay and power
dissipation due to its parasitic resistances and capacitances. These PGN constraints
should be relaxed up to a certain limit, and the area minimization should be done
accordingly. Therefore, in this paper, we have considered an RC model of the PGN
and formulated the area minimization for PGN as a large-scale minimization problem
considering different reliability, delay, and power-aware constraints. Evolutionary
computation-based cooperative coevolution technique has been used to solve this
large-scale minimization problem. The proposed method is tested on industry-based
power grid benchmarks. It is observed that significant metal routing area of the PGN
has been reduced using the proposed method.

Keywords Area minimization · Cooperative coevolution · Delay ·


Evolutionary computation · Large-scale optimization · Power grid networks ·
Reliability · VLSI

S. Dey (B) · S. Nandi


Department of CSE, IIT Guwahati, Amingaon, North Guwahati 781039, Assam, India
e-mail: [email protected]
S. Nandi
e-mail: [email protected]
G. Trivedi
Department of EEE, IIT Guwahati, Amingaon, North Guwahati 781039, Assam, India
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2020 69


J. K. Mandal and D. Sinha (eds.), Intelligent Computing Paradigm: Recent
Trends, Studies in Computational Intelligence 784,
https://doi.org/10.1007/978-981-13-7334-3_6
70 S. Dey et al.

1 Introduction

With the advancement of VLSI technology, the number of functional blocks in a


system-on-chip (SoC) is increasing constantly. These functional blocks are powered
by the power grid network (PGN) which is connected to the power pads of the chip.
In order to accommodate more number of functional blocks in an SoC, it is necessary
to have more power lines so that the extra-added functional blocks can be connected
to power. As PGN occupies a significant amount of the routing area, therefore, it
is desirable to have a lesser area of the PGN so that extra power connections can
be accommodated in the same area or in the minimum area of the chip. Hence, the
minimization of the area of the PGN is important. However, the minimization of the
PGN can be achieved by relaxing different reliability constraints, time-delay con-
straints, and power dissipation constraints. Generally, different kinds of reliability
issues may occur in the PGN due to the parasitic effects such as resistances, capaci-
tances, and inductances of the metal lines of the PGN. Voltage drop noise is one of
the major reliability issues occurred in the PGN. Due to the voltage drop noise, some
functional blocks may not get the required voltage which can malfunction the chip.
Electromigration is another major reliability issue due to which electrons transfer
its momentum to the metal atoms and as a result voids and hillocks form in metal
lines which may create open circuit or short circuit some part of the power grid.
Also, due to unequal voltage drop noises across the PGN and the length of the metal
lines of the power grid, some unwanted delays may occur. Moreover, due to the high
current through the PGN, the power dissipation by the grid also becomes signifi-
cant. In order to minimize the delay, the power dissipation, and the reliability issues,
designers overdesigned the PGN. Overdesigning it increases the area overhead and
reduces the yield of the chip. Hence, in this paper, we are trying to minimize the
metal area of the PGN considering different reliability and delay constraints of the
PGN. The significant findings of this research paper contain the following:
• The metal routing area minimization of the PGN is constructed as a large-scale
variables (also known as the large-scale problem) optimization problem. We have
used a simple DC load model or a steady-state model of PGN for the construction
of the problem.
• For the area minimization problem of the PGN different reliability, time-delay,
and power dissipation constraints are considered.
• The area minimization problem is solved by cooperative coevolution-based meta-
heuristics which are an optimization method for large-scale problems.
• The proposed minimization approach is able to minimize the total metal routing
area.
• To showcase the applicability of the method, different standard benchmark circuits
of PGN are used to test the proposed scheme.
The arrangement of the paper is presented as follows. Section 2 comprises the
necessary preliminary information required to understand the paper. The problem
formulation for the area minimization is constructed in Sect. 3, which also contains
PGRDP: Reliability, Delay, and Power-Aware Area … 71

a discussion about the different reliability, delay, and power-aware constraints of


PGN. Section 4 contains the cooperative coevolution scheme and its adaptation for
solving the metal routing area minimization problem of PGN. The results obtained
from different experiments using the PGN benchmark circuits are listed in Sect. 5.
At last, in Sect. 6, the conclusion of the paper is given.

2 Preliminaries

Analysis of power grid network (PGN) is an active area of research; one such recent
work is [1]. In literature, several works on PGN optimization exist, where the metal
area was minimized constructing it as a two-phase optimization problem. Tan et al.
[2] solved the PGN optimization problem with the help of linear programming. He
uses sequence of linear programming in order to optimize the two-phase optimiza-
tion problem of PGN. Wang and Chen [3] solved the same two-phase optimization
problem of PGN with the help of sequential network simplex algorithm. Further-
more, it is observed in the literature that there are several other works on the PGN
metal routing area minimization of the PGN. Zeng and Li [4] have come up with a
scheme for PGN wire sizing formulating it to be a two-step optimization problem
with the help of locality-driven partitioning-based solution of PGN. Zhou et al. [5]
have solved electromigration-lifetime-constrained power grid optimization by mini-
mizing the area using a sequence of a linear programming method. It is important to
perform the area minimization of the PGN considering timing delay and power dissi-
pation constraints along with the reliability constraints which have not been reported
yet in the literature. Also, power grid area minimization problem has not been solved
using evolutionary computation-based metaheuristics, which has been proved to be
an effective approach to solve the complex optimization problem. Recent work of [6]
showed that evolutionary computation-based metaheuristics can be used for power
grid optimization which has successfully able to minimize the IR drop noise with
an overhead of PGN metal routing area. In this work, we proposed a scheme to
minimize the metal routing area of the whole PGN by varying the widths of each of
the PGN metal fragments (edges) using evolutionary optimization technique. Here,
the metal routing area minimization problem of PGN is constructed in the form of
large-scale optimization problem. Our proposal also includes different reliability,
delay, and power-aware constraints while minimizing the metal area of the PGN.
And finally, the large-scale metal routing area minimization problem is solved using
cooperative coevolution-based metaheuristics.

2.1 Power Grid Network Model

An illustration of the PGN along with its functional blocks is shown in Fig. 2. In this
paper, the metal lines of the PGN are modeled as the RC circuit. The resistive (R)
72 S. Dey et al.

Fig. 1 π model of metal R


segment of the PGN
C/2 C/2

Fig. 2 An illustration of
floor plan is shown along
with its PG network (metal
lines) and the functional
blocks

Fig. 3 RC equivalent model


of the PGN
Vdd Vdd

Vdd Vdd

and capacitive (C) elements are considered in the modeling of the PGN as these two
circuit elements generate significant IR drop noises (voltage drop noise), signal delay,
and power dissipations. Therefore, only RC elements are considered in this work.
The equivalent R and C value of a metal segment for π model of an interconnect is
shown in Fig. 1. An RC model of the PGN is considered here for the metal routing
area minimization problem, which is shown in Fig. 3.
For modeling the currents drawn by the underlying standard cells of the chip, DC
load is being used which is connected to the ground pads from the power network as
shown in Fig. 3. Similarly, for the ground network, DC loads are connected from the
PGRDP: Reliability, Delay, and Power-Aware Area … 73

ground network to the Vdd pads. The vias connecting different layers of the metals
are considered to have zero resistance since it is considered that vias have very low
resistance. C4 bumps which are used for Vdd and ground connections are assumed
to have no inductances. Basically, any sort of parasitic effects due to inductances is
not contemplated in this work. To find all the node voltages and edge currents, it is
necessary to represent the PGN as mathematical model. Hence, the RC model of the
PGN is represented as system of equations, i.e.,

GV(t) + CV (t) = I(t), (1)

where G matrix indicates the conductances of the metal lines of the PGN, and the
capacitances connected to the ground at each of the nodes of PGN constitute the
C matrix. Similarly, I(t) is formed by the current sources connected to the grounds
and V(t) is the representation of the node voltages vector, and V (t) denotes the
first-order derivative of V(t). By solving this system of equations, node voltages
and edge currents can be obtained, which will be required for evaluating different
reliability, delay, and power constraints for minimization of area of the PGN. Here,
KLU-based solver is employed to determine the solutions of system of equations
after discretizing it with Backward Euler approximation technique [7]. For the area
minimization problem, evolutionary computation-based metaheuristics is used which
is described in Sect. 2.2.

2.2 Evolutionary Computation

Evolutionary computation is based on the biological evolution of the species. Dif-


ferent complex optimization problems can be solved using the evolutionary-based
metaheuristics. To perform the optimization of a mathematical function which is also
known as a cost function, initial solutions are generated randomly and evaluated for
the cost function. Then, the candidate solutions generated in new generations are
stochastically selected which has a greater chance of producing an optimal cost for
the cost function. And in this way, the process going on iteratively until the solutions
(or cost) does not saturate. The point where solutions are saturated is considered as
the optimal point, and corresponding solutions are considered as optimal solutions.
As minimization of the area for the large-scale PGN is a complex problem, evolution-
ary computation is used here. The problem is expressed in the form of a large-scale
problem. Henceforth, the process for minimization of the cost function is achieved
using cooperative coevolution-based metaheuristics which are described in the later
sections.
74 S. Dey et al.

3 Problem Formulation and Constraints

3.1 Cost Function for Metal Area Minimization

Here, we consider our PGN as a graph G = {V, E} with all the nodes of the PGN
as vertices set V = {1, 2, . . . , n} and all the branches of the PGN as edges set E =
{1, 2, . . . , b} for the DC load model of the PGN. A pictorial representation of metal
lines of 3 × 3 PGN is shown in Fig. 4.
If l (length) and w (width) are the dimensions of a single metal fragment of
the PGN which exhibits resistance R, then area covered by the metal fragment is
expressed as A:
A = lw (2)

For the same metal fragment, if it exhibits a sheet resistance ρ Ω/ which is generally
considered to be constant for the same layer of the metals, then the resistance of the
metal fragment is analytically expressed as

ρl
R= , (3)
w
For a current of I A across the metal line, the voltage drop (IR drop) across the metal
line can be defined by
Vir = I R
ρl (4)
=I
w

Fig. 4 A pictorial
representation of metal lines
of 3 × 3 PGN
ith metal line

li

wi
PGRDP: Reliability, Delay, and Power-Aware Area … 75

From the Elmore delay model of a metal line, the delay occurred across a metal line
can be represented by
Tdelay = RC (5)

Also the power dissipation of a metal line is represented by the following:

Pdiss = I Vir
(6)
= I2R

Our aim in this paper is the minimization of the metal routing area of the PGN
maintaining IR drop (Vir ) within an acceptable limit, without having significant
delay (Tdelay ), with having less power dissipation (Pdiss ), and also subject to other
reliability constraints mentioned in the Sect. 3.2. Hence, for the entire PGN containing
b metal wire fragments (or edges), the total metal routing area is expressed as given
below:
b
Atotal = li wi (7)
i=1

A large PGN will have a large value of b, which makes (7) a cost function containing
large tally of decision-making variables. In view of this, as the cost function of (7)
has to be minimized, hence this cost function is termed as large-scale minimization
problem, where wi makes the variables set w = (w1 , w2 , . . . , wb ) for i = 1, 2, . . . , b.
Here, li is considered to be a constant for the cost function (Eq. (7)) and the value of
li is imported from the PGN netlist in order to evaluate the cost function. Therefore,
the cost function is expressed as a large-scale total metal routing area minimization
problem with b number of variables and is constructed as follows:

P : minimi ze Atotal , (8)


wi ∈W

subject to the constraints mentioned in Sect. 3.2.

3.2 Reliability, Delay, and Power-Aware Constraints

3.2.1 IR Drop Constraints

From (4), the IR drop restriction is established by the expression given below:

li∈E
C1 : |Ii∈E |ρ ≤ξ (9)
wi∈E
76 S. Dey et al.

The inequality given above should be strictly obeyed for all the ith edges of the
PGN. ξ is the highest value of tolerance of IR drop noise permitted between two con-
nected vertices of the PGN. Basically, ξ is the maximum allowable voltage difference
between two consecutive nodes of the PGN.

3.2.2 Metal Line Area Constraint

In order to limit our design in a confined area, the total metal routing area occupied
by the metal lines of the PGN should be limited to Amax :


b
C2 : li wi ≤ Amax (10)
i=1

3.2.3 Current Density Constraint

The maximum current density of the metal lines of the PGN should be limited to
Im , in order to avoid degradation of the metal lines due to electromigration-based
reliability issues.
Ii∈E
C3 : ≤ Im (11)
wi∈E

3.2.4 Metal Line Width Constraint

The design of the metal lines should follow the design rules of the given CMOS
technology nodes and should follow the minimum width design rules in order to
avoid any design rule violations. The metal width constraint can be represented as
follows:
C4 : wi∈E ≥ wmin (12)

3.2.5 Current Conservation Constraint

At all the n vertices of the PGN, Kirchhoff’s Current Law (KCL), or the current
conservation constraint must be observed, which is represented as follows:


K
C5 : I ji + Ix = 0 ∀ j ∈ V (13)
i=1

where the symbol K denotes neighboring vertices tally around the vertex j and the
symbol Ix represents DC load current of the PGN model which is placed at all nodes
to the ground.
PGRDP: Reliability, Delay, and Power-Aware Area … 77

3.2.6 Time-Delay Constraint

A metal interconnect is modeled as RC element. However, the capacitance of a


interconnect can further be classified by plate capacitance Ci plate , fringe capacitance
Ci f ringe , and sidewall capacitance Cisidewall . Among these Ci plate has a simple mathe-
matical expression which is
εli wi
Ci plate = (14)
t

Therefore, time delay of each of the metal lines of the PGN should be with in ζ.
Using Ri = ρlwii and (14), we get

C6 : Tdelay = Ri (Ci plate + Ci f ringe + Cisidewall ) ≤ ζ


ρεli2 ρli Ci f ringe ρli Cisidewall (15)
= + + ≤ζ
t wi wi

3.2.7 Power Dissipation Constraint

The power dissipation of a metal interconnect of the PGN should be limited by ψ.

C7 : Pdiss = Ii Viir ≤ ψ
= Ii2 Ri ≤ ψ
(16)
ρli Ii2
= ≤ψ
wi

Proposition 1 Minimization of area of the PGN is dependent on reliability, delay,


and power-aware constraints

Proof From Eq. (7), we know that Atotal depends upon width (wi ) of each of the
metal interconnects.
Atotal ∝ wi (17)

Therefore, to minimize the area, we have to reduce the wi . However, most of the
constraints mentioned in Sect. 3.2 are directly or inversely proportional to wi

1
Constraints ∝ wi or (18)
wi

and reducing wi will surely affect these constraints. Therefore, minimization of the
area depends on the reliability, delay, and power-aware constraints. 
78 S. Dey et al.

4 Proposed Minimization Scheme

4.1 Basic Cooperative Coevolution Scheme

Cooperative coevolution (CC) is a decomposition scheme used in evolutionary com-


putation which embraces the divide-and-conquer technique in order to find optimum
solutions for optimization problems containing a large tally of variables. A complex
mathematical problem having a large tally of variables is decomposed into small
subcomponents using the divide-and-conquer approach. In order to incorporate the
divide-and-conquer, the CC scheme divides a large problem with n decision variables
into small subcomponents. Once the small subcomponents are created, then each of
the subcomponents undergoes optimization with help of a standard evolutionary opti-
mization process in a periodic manner. Evolutionary optimization algorithms try to
mimic the biological evolution process. It generates a pool of population correspond-
ing to a subcomponent of the large-scale problem. These populations represent the
candidate solutions of the subcomponent. These individuals undergo different genetic
processes naming mutation, crossover, and selection to find the optimum solutions
for the instances of the certain subcomponent. Consequently, the cooperative evalua-
tion of all the individuals in a pool of subpopulation is carried out by proper selection
of the current individual and also the best individual from the remaining individuals
of the subpopulation as described by [8]. The working of cooperative coevolution
scheme is expressed in a concise way in Algorithm 1.

Algorithm 1: The working of Cooperative Coevolution Scheme


Input: The cost function f , lower bound xmin , upper bound xmax , frequency of
decision variables n.
Output: Optimimum value of the cost function f and numerical values of
corresponding decision variables x1 , x2 , . . . , xn values.
1 subcomponents ← grouping( f, xmin , xmax , n) /*grouping based variable
decomposition*/;
2 population_arr ← random(population_size,n); /*Optimization stage*/;
3 for j ← 1 to size(subcomponents) do
4 group_number_var ← subcomponents[j];
5 subpopulation_arr ← population_arr[:,group_number_var];
6 subpopulation_arr ←
evolutionary_optimizer(best,subpopulation_arr,FE);
7 population_arr[:,group_number_var] ← subpopulation_arr;
8 (best,best_val)←min(population_arr);

Proposition 2 Cooperative coevolution scheme can generate near-optimal solu-


tions if the main optimizer generates the near-optimal solutions.

Proof Suppose if we consider f (x1 , x2 , . . . , xn ) as an objective function or the cost


function having n decision variables xi ∈ R ∀i ∈ n. Now if we want to decompose the
PGRDP: Reliability, Delay, and Power-Aware Area … 79

n variables with the help of cooperative coevolution scheme by employing random


grouping of the decision variables in such a way that each of the decomposed groups
contain s decision variables. The decomposition of n variables will create t = ns
number of groups or subcomponents of the main objective function. We can interpret
this as t instances of the cost function with each subcomponent containing n decision
variables each. Now, each of the t instances of subcomponents will go through an
optimization process with the help of a standard optimizer. The co-adaptation of
the near-optimal values will be done using random grouping strategy, to obtain the
global near-optimum of the cost function f . 
Potter and De Jong [9] first used genetic algorithm in cooperative coevolution
(CC). In a large-scale optimization problem, CC was first used by [10]. They have
proposed a fast method for evolutionary programming computation with the use of
cooperative coevolution. Van den Bergh [11] in their work for the first time introduced
CC into particle swarm optimization. Shi and Li [12] and Yang and Yao [13] have also
tested the performance of differential evolution (DE) by incorporating CC scheme
into it. An enhanced variant of DE is proposed by [14] which is named as self-adaptive
neighborhood search-based differential evolution scheme (SaNSDE). The merit of
this proposed self-adaptive DE is it self-adapts its mutation strategy, the crossover
rate CR, and the scaling factor F. Yang and Yao [14] also confirmed that the self-
adaptive version of DE works considerably good in comparison to the other versions
of DE schemes. Yang and Yao [13] further used the self-adaptive DE (SDE) along
with the CC scheme (CC-SDE) for the mathematical problems with large number
of decision variables and obtained exceedingly good results. In order to increase the
solution accuracy of the large-scale cost functions, it is observed that the random
grouping of the variables while decomposing gives the best results as discovered by
[15]. Although our area minimization problem is separable in nature, still random
grouping-based strategy is used as this strategy has been proved to be statistically
good grouping strategy in a large number of variables environment. Therefore, CC-
SDE is accommodated here to determine the optimum solutions of the total metal
routing area minimization of PGN.

4.2 Metal Area Minimization Using CC-SDE

The total metal routing area minimization algorithm for PGN exercising CC-SDE is
presented in Algorithm 2. Initially, to find all the edge currents and all node voltages
of the PGN, the grid analysis of PGN is performed with the help of the KLU-based
matrix solver [16]. In order to power grid analysis, backward Euler-based discretiza-
tion approach is used [7] to discretize (1) and KLU solver is used subsequently to
solve the linearized system of equations of the PGN. All the required parameters are
initialized for the SDE. Search space T is created considering all the constraints men-
tioned in Sect. 3.2 to restrict the search space of the cost function interior to the area
of validation. Subsequently, for the problem P, initially cooperative coevolution-
based scheme is used to decompose a large number of variables of the problems into
80 S. Dey et al.

Algorithm 2: Metal Area minimization using CC-SDE


Input: The cost function P, widths (wi ), lengths (li ), number of edges (b), number of
vertices (n). All the data are extracted from the power grid netlist.
Output: Optimum metal widths of the metal lines with decreased total metal routing
area.
1 The reliability, delay, and power-aware constraints C1 , C2 , . . . , C7 mentioned in Sect.
3.2 are incorporated to generate a search space T .;
2 while inside search space T do
3 Initialization is done for the initial parameters of CC-SDE;
4 Decomposition of the b variables in t subcomponents is done using random
grouping strategy;
5 For optimizing the subcomponents, SDE optimization algorithm is used;
6 The subcomponents are co-adapted by randomly grouping the best solutions of the
subcomponents.;
7 Optimum widths of the metal lines of the PGN are evaluated corresponding to the
minimized total metal routing area and the model parameters are updated.;

smaller instances of the subcomponents. The decomposed variables are collected


randomly in a number of smaller groups to make different subcomponents. Once the
subcomponents are formed, the minimization of each of the subcomponents is done
individually, with the help of SDE optimizer. Eventually, after all the subcomponents
are minimized individually, again arbitrary grouping-based co-adaptation scheme is
employed to reach the global near-optimum value of the cost function. In this way,
the total metal routing area of P is minimized. Optimized edge widths are deter-
mined corresponding to the minimized metal routing area of the PGN with the help
of the cost function P.

5 Experimental Results

For the implementation of all the algorithms used in this paper, MATLAB program-
ming language is used. The experiments are accomplished on a computer with Linux
operating system and with 64GB memory for validation of the proposed schemes.
IBM PGN benchmarks [17] are used to showcase the area minimization results which
are listed in Table 1.

Table 1 Power grid benchmark circuits data [17]


Benchmark circuits #Nodes(n) #Edges(b) Edge resistance limits (in Ω)
ibmpg2 127238 208325 (0,1.17]
ibmpg3 851584 1401572 (0,9.36]
ibmpg4 953583 1560645 (0,2.34]
ibmpg5 1079310 1076848 (0,1.51]
ibmpg6 1670494 1649002 (0,17.16]
ibmpgnew1 1461036 2352355 (0,21.6]
PGRDP: Reliability, Delay, and Power-Aware Area … 81

IBM power grid benchmarks are the industry standard power grid benchmarks
available which are extracted from the IBM processors, and which are widely used
in the research of the power grid simulation. The node counts (n) and the edge counts
(b) for different six IBM power grid benchmarks are listed in Table 1. Also, resistance
values of all the edges are listed in Table 1. We also assumed different capacitances
similar to the transient IBM power grid benchmarks [17] for time-delay constraint
estimation. The power grid benchmarks do not contain length and width informa-
tion, for which appropriate dimensions of the metal segments are considered with a
minimum width of the metal lines be 0.1 µm. The metal interconnects’ sheet resis-
tance is considered to be 0.02 Ω/. Power grid analysis is performed on the PGN
benchmarks with the help of KLU solver. The edge currents and node voltages of
the PGN are obtained from power grid analysis results, which can be used for the
evaluation of the reliability, delay, and power dissipation constraints. Subsequently,
all the constraints are evaluated and a search space T is constructed. The algorithm
looks for optimum values of width within the search space T , corresponding to
the minimized area of the PGN. The experiments were performed for the six IBM
benchmarks circuits ibmpg2 to ibmpgnew1. The area minimization is done for the
benchmark circuits using Algorithm 2. Before and after minimization results are
given in Table 2. It is clear from Table 2 that the proposed scheme is able to mini-
mize the metal routing area of the PGN significantly. For the ibmpg2, we have got
25.85% reduction in area. This shows that the metal routing area can be minimized
for an overdesigned PGN without violating reliability, time-delay, and power-aware
constraints. In order to obtain a variation of width before and after minimization,
resistance and metal width budgeting is done for ibmpg4 circuit. Resistance budget
for the ibmpg4 circuit is shown in Fig. 5 which shows different values of the resis-
tances which are present in ibmpg4 circuit. From Fig. 5, we have got the metal width
budget for ibmpg4 circuit before minimization which is shown in Fig. 6. Comparing
Figs. 6 and 7, we can see that after minimization of the area for ibmpg4, the widths
have been reduced significantly. In our experiments, we used fitness evaluation (FE)

Table 2 Comparison of metal routing area for IBM PGN benchmarks before and after minimization
procedure
Benchmark circuits Total area (mm2 ) Area reduced (%)
Before minimization After minimization
ibmpg2 97.73 72.46 25.85
ibmpg3 931.95 744.62 20.10
ibmpg4 279.97 230.21 17.77
ibmpg5 575.73 505.83 12.14
ibmpg6 754.85 662.38 12.25
ibmpgnew1 799.97 719.01 10.12
82 S. Dey et al.

Fig. 5 Resistance budget for 105


ibmpg4 circuit according to 3
the benchmark circuit data
2.5

Number of branches
2

1.5

0.5

0
0 0.5 1 1.5 2 2.5
Branch resistances (in Ohm)

Fig. 6 Metal width budget × 105


for ibmpg4 circuit before 3
minimization
2.5
Number of branches

1.5

0.5

0
0 10 20 30 40
Metal width of the branches (in µm)

Fig. 7 Metal width budget × 104


14
for ibmpg4 circuit after
minimization 12
Number of branches

10

0
0 10 20 30 40
Metal width of the branches (in µm)

value as 106 . One of the important reasons behind using this numerical value is
that our proposed scheme (Algorithm 2) furnishes the best result with respect to the
convergence, for the numerical FE value of 106 .
PGRDP: Reliability, Delay, and Power-Aware Area … 83

6 Conclusion

The paper manifests a scheme for minimizing the metal routing area of the VLSI
PGN. The cost function of metal area minimization problem is expressed in the form
of large-scale optimization problem. For the minimization procedure, an evolutionary
computation-based minimization approach using cooperative coevolution scheme
is proposed in this paper which is used for the metal routing area minimization.
Reliability, time-delay, and power-aware constraints are considered as a part of the
minimization process in order to define the search space of the cost function. Different
PGN benchmarks are used to demonstrate the applicability of the algorithm. Results
on PGN benchmarks show significant reduction of the metal routing area without
violating reliability, delay, and power-aware constraints.

References

1. Dey, S., Nandi, S., Trivedi, G.: Markov chain model using Lévy flight for VLSI power grid
analysis. In: Proceedings of VLSID, pp. 107–112 (2017)
2. Tan, S.X.D., Shi, C.J.R., Lee, J.C.: Reliability-constrained area optimization of VLSI
power/ground networks via sequence of linear programmings. IEEE TCAD 22(12), 1678–
1684 (2003)
3. Wang, T.Y., Chen, C.P.: Optimization of the power/ground network wire-sizing and spacing
based on sequential network simplex algorithm. In: Proceedings of ISQED, pp. 157–162 (2002)
4. Zeng, Z., Li, P.: Locality-driven parallel power grid optimization. IEEE TCAD 28(8), 1190–
1200 (2009)
5. Zhou, H., Sun, Y., Tan, S.X.D.: Electromigration-lifetime constrained power grid optimization
considering multi-segment interconnect wires. In: Proceedings of ASP-DAC, pp. 399–404
(2018)
6. Dey, S., Nandi, S., Trivedi, G.: PGIREM: reliability-constrained IR drop minimization and
electromigration assessment of VLSI power grid networks using cooperative coevolution. In:
Proceedings of ISVLSI (2018)
7. Butcher, J.C.: Numerical Methods for Ordinary Differential Equations. Wiley (2016)
8. Potter, M.A., Jong, K.A.D.: Cooperative coevolution: an architecture for evolving coadapted
subcomponents. Evol. Comput. 8(1), 1–29 (2000)
9. Potter, M.A., De Jong, K.A.: A CC approach to function optimization. In: Proceedings of
PPSN, pp. 249–257 (1994)
10. Liu, Y., Higuchi, T.: Scaling up fast evolutionary programming with CC. Proceedings of CEC,
vol. 2, pp. 1101–1108 (2001)
11. Van den Bergh, F.: A cooperative approach to PSO. IEEE TEC 8(3), 225–239 (2004)
12. Shi, Y.J., Li, Z.Q.: Cooperative co-evolutionary DE for function optimization. In: Advances in
Natural Computation, p. 428. Springer (2005)
13. Yang, Z., Yao, X.: Large scale evolutionary optimization using CC. Elsevier Inf. Sci. 178(15),
2985–2999 (2008a)
84 S. Dey et al.

14. Yang, Z., Yao, X.: Self-adaptive DE with neighborhood search. In: Proceedings of CEC, pp.
1110–1116 (2008b)
15. Omidvar, M.N., Yao, X.: Cooperative co-evolution for large scale optimization through more
frequent random grouping. In: IEEE CEC, pp. 1–8 (2010)
16. Davis, T.A.: KLU, a direct sparse solver for circuit simulation problems. ACM TOMS 37(3),
36 (2010)
17. Nassif, S.R.: Power grid analysis benchmarks. In: Proceedings of ASP-DAC, pp. 376–381
(2008)
Forest Cover Change Analysis
in Sundarban Delta Using Remote
Sensing Data and GIS

K. Kundu, P. Halder and J. K. Mandal

Abstract The present study deals with change detection analysis of forest cover in
Sundarban delta during 1975–2015 using remote sensing data and GIS. Supervised
maximum likelihood classification techniques are needed to classify the remote sens-
ing data, and the classes are water body, barren land, dense forest, and open forest.
The study reveals that forest cover areas have been increased by 1.06% (19.28 km2 ),
5.80% (106.82 km2 ) during the periods of 1975–1989 and 1989–2000, respectively.
The reversed tendency has been observed during 2000–2015, and its areas have been
reduced to 5.77% (111.85 km2 ). The change detection results show that 63%–80%
of dense forest area and 66%–70% of open forest area have been unaffected during
1975–2015 and 1975–2000, respectively, while during the interval 2000–2015, only
36% of open forest area has been unaltered. The overall accuracy (86.75%, 90.77%,
88.16%, and 85.03%) and kappa statistic (0.823, 0.876, 0.842, and 0799) have been
achieved for the year of 1975, 1989, 2000, and 2015 correspondingly to validate
the classification accuracy. Future trend of forest cover changes has been analyzed
using the fuzzy logic techniques. From this study, it may be concluded that in future,
the forest cover area has been more declined. The primary goal of this study is to
notify the alteration of each features, and decision-maker has to take measurement,
scrutinize, and control the natural coastal ecosystem in Sundarban delta.

Keywords Change detection · Remote sensing data · Dense forest · Open forest ·
Fuzzy logic · Sundarban

K. Kundu (B)
Department of Computer Science and Engineering, Government College of Engineering & Textile
Technology, Serampore, Hooghly, India
e-mail: [email protected]
P. Halder
Department of Computer Science and Engineering, Purulia Government Engineering College,
Purulia, West Bengal, India
e-mail: [email protected]
J. K. Mandal
Department of Computer Science and Engineering, University of Kalyani, Kalyani, Nadia, West
Bengal, India
e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2020 85
J. K. Mandal and D. Sinha (eds.), Intelligent Computing Paradigm: Recent
Trends, Studies in Computational Intelligence 784,
https://doi.org/10.1007/978-981-13-7334-3_7
86 K. Kundu et al.

1 Introduction

Sundarbans (India and Bangladesh) is the biggest continuous tidal halophytic man-
grove [1] forest delta in the globe. Its net area is approximately 10,000 km2 . Among
these areas, 40% of the areas are present in India and the remaining areas belong to
Bangladesh. It is the largest complex intertidal region where the three major rivers
(Ganges, Brahmaputra, and Meghna) meet the Bay of Bengal. It is enclosed by the
Hooghly River in West, on the east by Ichamati-Kalindi-Raimangal, on the south
by Bay of Bengal, and on the north by Dampier Hodge line. Out of 102 islands in
Sundarban delta, 48 islands are reserved for forest and remaining islands are reserved
for the human settlement. Along the shoreline region it creates halophytic mangrove
forest. It was stated as a biosphere reserve in the year of 1989 by United Nations
Educational and Scientific Co-operation (UNESCO), and in the year of 1987, it was
declared as world heritage place by International Union for Conservation of Nature
(IUCN).
Forests are very important natural resource on the earth. It does construct a nat-
ural blockage to guard along the riverbank and inland areas from natural calamities
(hurricanes, cyclones, and tsunamis) [2]. It plays an important role in ecological safe-
guard such as soil conservation, enlarged biodiversity [3], and avoidance of climate
changes. It is also utilized for widespread national-level economic growth through
providing such as timber of industry and construction and source of medicine.
Mangrove forests were declined in various region of the world. In present decades,
rigorous and swift deforestation [4] has led to increasing global notifications for
defended managing of forest assets. The region has enormous pressure due to raising
various factors like settlement area, agricultural growth, huge wood removal, weather
change, industrialization, and urbanization. The main causes of forest degradation are
infringement, unlawful cutting of tree, forest fire and climate change. Forest lands
cover areas translation into agricultural land, livelihood, farmland, and improper
industrialization. The deforestation outcomes consist of fall down the biodiversity,
impacts on climate, loss of a vital sink for atmospheric carbon dioxide, and negative
effects on the regular livelihoods of stifling region peoples [5]. Tropical deforestation
is accountable for enormous species destruction and effects on biological diversity.
There are mainly two ways that affect the biological biodiversity such as habitat
demolition and separation of the previously proximate forest into forest fragments.
During 2000–2005, the worldwide total forest cover reduction was 0.6% per year.
The usual ground observation techniques of forest cover monitoring [6] are too
much more arduous, inadequate, time-consuming process. There are some drawbacks
in normal ground verification method; the remote sensing tool is very important in
supervising the forest, and it is also able to find out the deforestation patterns and
movement of tropical forests. The primary data source like remote sensing data
for change detection analysis plays a key role in the present scenario [7]. Remote
Forest Cover Change Analysis in Sundarban … 87

sensing data yield more accurate measurements than conventional techniques. The
digital characteristics of remote sensing data are allowed to classification, com-
patibility with geographic information systems and advanced computer analysis.
The digitally stored remotely sensed data offer a remarkable prospect; they help to
study the present forest cover changes [8]. Recently, IKONOS, Quick Bird, and IRS
are used for forest cover change detection analysis with high degree of accuracy
while their spatial resolution is high and the cost is also very high. The lower cost
image which is freely available from the Internet is used for examining land cover,
land use, and landscape ecology, instead of high cost high-resolution image using
medium and coarse resolution satellite data (MSS, TM, ETM+) with tolerable levels
of precision [9].
India is a developing country and largely density populated nation in the earth. It
has restricted number of natural resources. In current decades, numerous studies in
India have been analyzed forest cover alteration in Sundarban with remote sensing
data and GIS. In India, most of the forests are resides in the Sundarban region. During
the period of 1970–1990, the mangrove forests areas were increased by 1.4%, while
they were reduced by 2.5% during 1990–2000 in this region (India and Bangladesh).
Mangrove forests areas are increased due to regrowth, plantation, and aggradations.
Around 0.42% of mangrove forest areas were lost during 1975–2006 due to illegal
cutting trees, enlarged settlement, augmented agricultural land, increased aquatics
farm, etc. It was also notified that entire mangrove forests areas in South Asia were
more deforested in compared with the reforested during the period 2000–2012 [10].
Remote sensing data are classified into various classification techniques such as
on-screen visual interpretation, supervised, and unsupervised classifications [11]. To
determine the various characteristics of the land cover features, different classification
methods are used. It has been examined that among these various classification
methods, on-screen classification procedure is more superior to other methods. As
per accuracy, it is divulged that band ratio of supervised classification is healthier in
compared with the other classification methods. In a recent scenario, it has been seen
that for the past few years, the earth’s temperature is rising because of unplanned
industrialization, deforestation, etc. Normal biodiversity has been changed due to
the climate changes of an environment. The changes of climate directly or indirectly
impact on the Sundarban estuary along the river bank or shoreline. As a result, along
the shoreline of Bay of Bengal, the sea level has been raised, increased downstream
salinity, and regular occurrence of natural distrusters such as cyclones, storms, etc.
Recently, the status and distribution of various mangrove species have been discussed
[12] in the Indian Sundarban region, and it is also observed that mangrove vegetation
areas are more declined. The main purpose of this study is to explore the changes
in various features and how much areas are converted into various features. By this
inspection, the policy-makers can able to take decision to coordinate, manage, and
sustain the normal biodiversity in the Sundarban delta.
88 K. Kundu et al.

Fig. 1 Geographical position of the study area

2 Study Area

The present study area is located in the South 24 Parganas district of West Bengal state,
India, as shown in Fig. 1. It lies between latitudes 21°40′00″N and 22°30′00″N and longitudes
88°32′00″E and 89°4′00″E. The study area covers 3038.68 km2, and the whole region is covered
by reserve forest. The area is bounded by the Bay of Bengal on the south, the India–Bangladesh
border on the east, the Thakuran River on the west, and the Basanti block on the north. The
Sundarban region hosts rich natural biodiversity with varied flora and fauna. More than 27
mangrove species, 40 species of mammals, 35 groups of reptiles, and 260 bird species reside
in this region. Wildlife species that exist in the vicinity include the Indian python, the
man-eating Royal Bengal tiger, spotted deer, sharks, macaque monkeys, crocodiles, and wild
boar. The forests are dominated by three main tree groups: Sundri, Goran, and Gewa. Additional
species that build up the forest assembly include Avicennia, Xylocarpus, Sonneratia, Bruguiera,
Rhizophora, and Nypa palm. The region contains various islands such as Gana, Baghmara,
Bhangaduni, Mayadwip, Chandkhali, Harinbangha, and Holiday Island. It is usually flooded by
diurnal tides.

Table 1 Detailed descriptions of the Landsat images

Satellite type | Sensor | No. of bands | Date of acquisition | Path and row | Spatial resolution (m)
Landsat 3 | MSS | 4 | 05.12.1975 | p-148, r-45 | 60
Landsat 5 | TM | 7 | 03.01.1989 | p-138, r-45 | 30
Landsat 7 | ETM+ | 8 | 17.11.2000 | p-138, r-45 | 30
Landsat 7 | ETM+ | 8 | 25.11.2015 | p-137, r-45 | 30

Table 2 Description of the forest cover types defined in this study

Class | Description
Water body | Area covered by open water such as rivers, lakes, small channels, etc.
Barren land | Area covered by wetland, bare river beds, degraded land, newly cleared land, etc.
Dense forest | Tree density with canopy cover of 50% and above
Open forest | Tree density with canopy cover of less than 50%

3 Materials and Methods

3.1 Data Source

In this paper, four multispectral Landsat satellite scenes were obtained from the Earth
Explorer website (https://earthexplorer.usgs.gov/), where they are freely available. Details
of the Landsat images are presented in Table 1; as can be seen, all the images were acquired
in nearly the same season, and all were cloud free and unambiguous. The Multispectral
Scanner (MSS) image was obtained from the Landsat 3 satellite with a spatial resolution of
60 m and includes four spectral bands: green (0.5–0.6 µm), red (0.6–0.7 µm), near-infrared
(NIR) (0.7–0.8 µm), and near-infrared (NIR) (0.8–1.1 µm). The Thematic Mapper (TM) image
was acquired from the Landsat 5 satellite and contains seven spectral bands (blue, green, red,
near-infrared (NIR), shortwave infrared 1, thermal, and shortwave infrared 2) with a spatial
resolution of 30 m except for the thermal band. The Landsat 7 satellite carries the Enhanced
Thematic Mapper Plus (ETM+) sensor, which consists of eight spectral bands (blue, green, red,
near-infrared (NIR), shortwave infrared 1, thermal, shortwave infrared 2, and panchromatic)
with a spatial resolution of 30 m except for the thermal and panchromatic bands. A topographic
map (79 C/9) at a scale of 1:50,000 was collected from the Survey of India and used for
georeferencing.

3.2 Methodology

The four Landsat multispectral scenes were preprocessed to extract the significant
information from the satellite images. In this article, the image preprocessing operations
were performed with TNTmips Professional 2017 software. The images were geometrically
corrected and radiometrically calibrated, and drop lines and systematic striping or banding
were removed from the collected images. Histogram equalization was applied to improve image
quality. Each image was then digitized and its boundary line obtained. A layer stacking tool
was used to integrate three bands (bands 2, 3, and 4) into a single layer and to generate a
false color composite (FCC). The nearest-neighbor algorithm was applied for resampling the
dataset. The WGS84 datum and the UTM zone 45 N projection were then selected for mapping
purposes. Finally, the study area was cropped out to carry out the present work.
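As a rough illustration of this band stacking and enhancement step (not the authors' actual TNTmips workflow), the sketch below stacks three Landsat bands into an FCC and applies histogram equalization; the file name and band indices are placeholder assumptions.

```python
# Illustrative sketch only: band stacking and histogram equalization for an FCC.
# File path and band numbers are hypothetical placeholders.
import numpy as np
import rasterio
from skimage import exposure

with rasterio.open("landsat_scene.tif") as src:        # hypothetical multiband GeoTIFF
    nir, red, green = src.read(4), src.read(3), src.read(2)

def equalize(band):
    """Histogram-equalize one band and rescale to 0-255."""
    eq = exposure.equalize_hist(band.astype("float32"))
    return (eq * 255).astype("uint8")

# False color composite: NIR, red, green displayed as R, G, B.
fcc = np.dstack([equalize(nir), equalize(red), equalize(green)])
print(fcc.shape)   # (rows, cols, 3)
```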
In this study area, the forest cover image was classified into four classes: water body,
barren land, dense forest, and open forest. The maximum likelihood classification technique
was used to classify the images of the four years, with distinct colors assigned to the
different features (water body, barren land, dense forest, and open forest). After
classification of each image, the area of each feature was evaluated in km2, as shown in
Table 3. Accuracy assessment was carried out to verify the classification correctness through
the overall accuracy and the kappa statistic. For change detection analysis, the classified
raster images of two years (e.g., 1975–1989, 1989–2000, and 2000–2015) were combined into a
single raster file, which was then converted into vector format. From the vector file, the
areas of the various change features were obtained to determine how much area remained
unchanged and how much area changed. Figure 6 demonstrates the changes in the various
features (unchanged area, increased forest area, decreased forest area, increased barren
land area, decreased barren land area, and water body), represented in different colors.
Figure 2 depicts the detailed flowchart of the methodology.
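As a minimal sketch of this raster-combination idea (not the exact TNTmips procedure; the class codes and pixel area below are illustrative assumptions), the change matrix between two classified images can be obtained by cross-tabulating class labels pixel by pixel:

```python
# Minimal sketch: cross-tabulate two classified rasters to obtain a change matrix.
# Class codes (0-3) and the 30 m pixel size are illustrative assumptions.
import numpy as np

CLASSES = ["Water body", "Barren land", "Dense forest", "Open forest"]
PIXEL_AREA_KM2 = (30 * 30) / 1e6           # one 30 m Landsat pixel in km^2

def change_matrix(class_t1, class_t2):
    """Area (km^2) moving from class i at time 1 to class j at time 2."""
    mat = np.zeros((len(CLASSES), len(CLASSES)))
    for i in range(len(CLASSES)):
        for j in range(len(CLASSES)):
            mat[i, j] = np.sum((class_t1 == i) & (class_t2 == j)) * PIXEL_AREA_KM2
    return mat

# Toy example with two tiny "classified images".
t1 = np.array([[0, 2], [3, 3]])
t2 = np.array([[0, 3], [3, 2]])
print(change_matrix(t1, t2))
```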

4 Results

4.1 Classification

In this article, the forest cover areas are classified into four classes: water body, barren
land, dense forest, and open forest. Table 2 gives the detailed description of the forest
cover types defined in this study. The images are categorized with the maximum likelihood
classifier (MLC) algorithm, a supervised classification technique. This technique requires a
training dataset: the training sets are used to recognize the forest cover classes in the
whole image, and each pixel is assigned to a particular class

Table 3 Summary of forest cover class area (in km2) and percentage of area in 1975, 1989, 2000, and 2015

Class name | 1975 Area | 1975 % | 1989 Area | 1989 % | 2000 Area | 2000 % | 2015 Area | 2015 %
Water body | 1016.89 | 33.47 | 993.11 | 32.69 | 902.11 | 29.69 | 1141.93 | 37.58
Barren land | 210.17 | 6.92 | 214.62 | 7.06 | 199.76 | 6.57 | 71.58 | 2.36
Dense forest | 595.62 | 19.60 | 701.87 | 23.10 | 725.40 | 23.87 | 658.39 | 21.67
Open forest | 1215.79 | 40.01 | 1128.82 | 37.15 | 1211.41 | 39.87 | 1166.57 | 38.39
Total | 3038.34 | 100 | 3038.42 | 100 | 3038.68 | 100 | 3038.47 | 100

Table 4 Total forest area in km2 for the years 1975, 1989, 2000, and 2015

Forest class | 1975 (area in km2) | 1989 (area in km2) | 2000 (area in km2) | 2015 (area in km2)
Dense forest | 595.62 | 701.87 | 725.40 | 658.39
Open forest | 1215.79 | 1128.82 | 1211.41 | 1166.57
Total forest area | 1811.41 | 1830.69 | 1936.81 | 1824.96

[Flowchart stages: Landsat multi-temporal satellite images (MSS-1975, TM-1989, ETM+-2000, ETM+-2015); geometric correction; boundary digitization; supervised classification; accuracy assessment; forest cover analysis; change detection analysis and change detection map; future prediction using fuzzy logic]

Fig. 2 Flowchart of the methodology



Table 5 Total forest area change (in km2) and percentage of area change for the years 1975–1989, 1989–2000, and 2000–2015

 | 1975–1989 | 1989–2000 | 2000–2015
Total forest area change (in km2) | 19.28 | 106.12 | −111.85
Percentage of forest area change | 1.06 | 5.80 | −5.77

according to the probability of it fitting into a specific class. MLC relies on two important
components, the mean vector and the covariance matrix, which are estimated from the training
dataset. The classification outcomes show that MLC is a robust method, superior to the other
methods considered, with minimal chances of misclassification. Table 3 clearly shows that the
water body area declined during 1975–2000 but increased during 2000–2015 because of global
warming and the effects of rising sea level. During 1975–1989, the barren land area increased
slightly due to a fall in rainfall and water body extent, but it was gradually reduced during
1989–2015, which is caused by rising sea level. The dense forest area increased significantly
during 1975–2000, although it declined during 2000–2015 due to deforestation. The open forest
area was depleted during 1975–1989, increased marginally during 1989–2000, and showed a
declining trend from 2000 to 2015. Table 3 summarizes the forest cover class areas (in km2)
and percentages for 1975, 1989, 2000, and 2015. Figure 3 depicts the four forest cover classes
(water body, barren land, dense forest, and open forest) in 1975, 1989, 2000, and 2015. From
these observations, denser forest areas exist in the southeast, south, and east of the
Sundarban, and dense forest and open forest areas have increased along the river banks and
shoreline. Figure 4 presents the year-wise forest cover class areas, and Fig. 5 the year-wise
total forest areas. Table 4 reports the total forest area (dense plus open forest) for 1975,
1989, 2000, and 2015. From Table 5, it can be seen that during 1975–2000 the forest area
increased by approximately 6.86% (125.4 km2), whereas during 2000–2015 it declined by around
5.77% (111.85 km2).
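For readers unfamiliar with MLC, the following sketch (an illustration under Gaussian class-model assumptions, not the authors' TNTmips implementation) shows how per-class mean vectors and covariance matrices estimated from training pixels drive pixel-wise assignment; the training samples below are toy values.

```python
# Illustrative Gaussian maximum likelihood classifier for multiband pixels.
# Training samples and band values are toy assumptions.
import numpy as np
from scipy.stats import multivariate_normal

def fit_mlc(train_pixels):
    """train_pixels: dict class_name -> (n_samples, n_bands) array."""
    return {c: (x.mean(axis=0), np.cov(x, rowvar=False)) for c, x in train_pixels.items()}

def classify(pixels, params):
    """Assign each pixel (n_pixels, n_bands) to the class with maximum likelihood."""
    classes = list(params)
    scores = np.column_stack([
        multivariate_normal.logpdf(pixels, mean=m, cov=S) for m, S in params.values()
    ])
    return [classes[i] for i in scores.argmax(axis=1)]

rng = np.random.default_rng(0)
train = {"water": rng.normal(20, 3, (50, 3)), "forest": rng.normal(80, 5, (50, 3))}
print(classify(np.array([[22.0, 19.0, 21.0], [78.0, 83.0, 79.0]]), fit_mlc(train)))
```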

4.2 Forest Covers Change Analysis

In this study, the forest cover areas are primarily classified into four classes, namely
water body, barren land, dense forest, and open forest. Table 6 presents the percentage change
in each class during 1975–1989, 1989–2000, and 2000–2015. During 1975–2015, the net water body
area increased by approximately 12.30%, whereas during 1975–2000 it declined by around 11.5%,
while from 2000 to 2015 it has significantly

Fig. 3 Maximum likelihood classification results of a 1975, b 1989, c 2000, and d 2015


Fig. 4 Forest covers area for the years of 1975, 1989, 2000, and 2015


Fig. 5 Forest area for the years of 1975, 1989, 2000, and 2015

Table 6 Percentage of forest cover area change during 1975–1989, 1989–2000, and 2000–2015

Class | 1975–1989 | 1989–2000 | 2000–2015
Water body | −2.34 | −9.16 | 26.58
Barren land | 2.12 | −6.92 | −64.17
Dense forest | 17.84 | 3.35 | −9.24
Open forest | −7.15 | 7.32 | −3.70

increased by about 26.58%. The water body area has increased because of rising sea level due
to the effects of global warming. During 1975–2015, the barren land area decreased by about
64.17% because of the growing water body, deforestation, etc., whereas during 1975–1989 it
increased marginally by around 2.12%, and it declined during 1989–2015. The dense forest area
increased by about 10.54% during 1975–2015, while it declined significantly by approximately
9.24% during 2000–2015. The open forest area was depleted by about 4.05% during 1975–2015,
whereas during 1989–2000 it increased significantly by about 7.32%. Overall, the forest area
increased by approximately 1.09% during 1975–2015.
Tables 7, 8 and 9 show that 84–98% of water bodies and 63–81% of dense forest areas underwent
no change during 1975–2015, while the reverse trend was observed for barren land, whose
unchanged area is only 23–34%. For open forest, 66–70% of the area remained unchanged during
1975–2000 and 36% remained fixed during 2000–2015. The major changes in area occurred at the
outer edge or near the shoreline because of anthropogenic and natural forces. Moreover, during
1975–2000, 15–32% of barren land was transformed into water bodies because of rising sea level
or variations in tidal inundation at the time of satellite image acquisition, 35–51% of barren
land was converted into open forest through new plantation, 15–34% of dense forest was
transformed into open forest because of deforestation, and 21–46% of open forest was converted
into dense forest due to regrowth. During 1975–2000, 4–13% of the water bodies were converted
into open forest because of new plantation programs on the shoreline or along the river

Table 7 Forest cover change matrix (area in km2 and percentage) during 1975–1989

1975–1989 | Water body (area, %) | Barren land (area, %) | Dense forest (area, %) | Open forest (area, %)
Water body | 935.37, 91.98 | 44.91, 4.42 | 3.38, 0.33 | 31.55, 3.10
Barren land | 32.92, 15.66 | 71.19, 33.87 | 19.24, 9.15 | 86.66, 41.23
Dense forest | 4.82, 0.81 | 8.68, 1.46 | 378.95, 63.62 | 202.89, 34.06
Open forest | 17.95, 1.48 | 88.85, 7.31 | 300.51, 24.72 | 808.13, 66.47

Table 8 Forest cover change matrix (area in km2 and percentage) during 1989–2000

1989–2000 | Water body (area, %) | Barren land (area, %) | Dense forest (area, %) | Open forest (area, %)
Water body | 839.92, 84.57 | 22.52, 2.27 | 1.62, 0.16 | 129.04, 12.99
Barren land | 45.01, 20.97 | 82.06, 38.24 | 10.37, 4.83 | 76.11, 35.46
Dense forest | 7.64, 1.09 | 10.74, 1.53 | 477.62, 68.05 | 205.87, 29.33
Open forest | 9.48, 0.84 | 84.22, 7.46 | 234.78, 20.80 | 800.11, 70.88

Table 9 Forest cover change matrix (area in km2 and percentage) during 2000–2015

2000–2015 | Water body (area, %) | Barren land (area, %) | Dense forest (area, %) | Open forest (area, %)
Water body | 888.88, 98.53 | 1.92, 0.21 | 0.76, 0.08 | 7.79, 0.86
Barren land | 42.18, 21.12 | 47.16, 23.61 | 20.32, 10.17 | 102.33, 51.23
Dense forest | 13.83, 1.91 | 5.68, 0.78 | 586.94, 80.91 | 111.44, 15.36
Open forest | 194.98, 16.10 | 17.46, 1.44 | 558.06, 46.07 | 436.45, 36.03

bank, while during 2000–2015 the opposite tendency was seen: 16% of the open forest area was
transformed into water bodies due to erosion along the coastline and rising sea level. High
turnover occurs between forest and barren land areas because of encroachment, erosion,
aggradation, and forest rehabilitation programs. The major erosion occurred in the southern
region of the Sundarban delta, as shown in Fig. 6a–d.

4.3 Future Trend Analysis Using Fuzzy Logic

This study clearly indicates that the Sundarban mangrove forest area has declined, though not
uniformly over the period 1975–2015. The main causes of forest degradation are natural and
human pressures, decreased fresh water supply, changes in coastal erosion, increased salinity,
industrial pollution, rising sea level, natural disasters, improper planning and management,
increasing man–animal conflicts, etc. In fuzzy set theory, the membership grade of an

Fig. 6 Change detection results of a 1975–1989, b 1989–2000, c 2000–2015, and d 1975–2015

element varies continuously from 0 to 1. In this fuzzy logic system, two parameters are taken
as inputs, rising sea level (RSL) and climate change (CC), and one output is considered, forest
cover change (FCC). Figure 7 depicts the block diagram of the fuzzy logic system. First, the
two input parameters (RSL and CC) are fuzzified; the fuzzy rules are then fed into the fuzzy
inference engine, the heart of the system, where the two inputs are processed and the output
FCC is obtained. For the input–output parameters (RSL, CC, and FCC), the fuzzy logic system is
categorized into three intervals such

Fig. 7 Fuzzy inference system

Fig. 8 Membership function for low, medium, and high

as low, medium, and high. Fuzzy membership values between 0 and 1 for the low,
medium, and high are shown in Fig. 8. The fuzzy-based rules are given below:
IF (RSL is low) AND (CC is low) THEN (FCC is low).
IF (RSL is high) AND (CC is low) THEN (FCC is medium).
IF (RSL is low) AND (CC is high) THEN (FCC is medium).
IF (RSL is high) AND (CC is high) THEN (FCC is high).
IF (RSL is high) OR (CC is high) THEN (FCC is high).
Whenever rising sea level (RSL) and climate change (CC) are both very high, the change in
forest area (FCC) may be high, i.e., the forest area may decline more. If either rising sea
level (RSL) or climate change (CC) is very high, the forest cover change (FCC) may be high,
whereas if the changes in the two input parameters are very low, the reverse trend is obtained
for the output parameter. If one of the two input parameters (RSL and CC) is low and the other
high, the change in the output variable (FCC) will be moderate. In recent years, it has been
observed that natural calamities are increasing rapidly (rising sea level, more storms,
cyclones, hurricanes, higher temperature, etc.), resulting in degradation of the forest cover
area. From this study, it may be concluded that the forest degradation tendency will increase
further in the future.
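A minimal sketch of this Mamdani-style min–max inference is given below; the triangular membership functions, output centers, and defuzzification step are illustrative assumptions rather than the membership functions actually used in the study.

```python
# Minimal Mamdani-style sketch of the RSL/CC -> FCC rules.
# Membership functions and output centers are illustrative assumptions.
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    return max(min((x - a) / (b - a + 1e-9), (c - x) / (c - b + 1e-9)), 0.0)

def fuzzify(x):
    """Grades of a normalized input (0..1) in the low/medium/high sets."""
    return {"low": tri(x, -0.5, 0.0, 0.5),
            "medium": tri(x, 0.0, 0.5, 1.0),
            "high": tri(x, 0.5, 1.0, 1.5)}

def infer_fcc(rsl, cc):
    r, c = fuzzify(rsl), fuzzify(cc)
    # Rule strengths: AND -> min, OR -> max, as in the rules listed above.
    low = min(r["low"], c["low"])
    medium = max(min(r["high"], c["low"]), min(r["low"], c["high"]))
    high = max(min(r["high"], c["high"]), max(r["high"], c["high"]))
    # Crude centroid defuzzification with assumed output centers 0.2 / 0.5 / 0.8.
    w = np.array([low, medium, high])
    return float((w * np.array([0.2, 0.5, 0.8])).sum() / (w.sum() + 1e-9))

print(infer_fcc(0.9, 0.8))   # both inputs high -> FCC close to "high"
```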

Table 10 Confusion matrix for the year 1975 (columns: reference data)

Class | Water body | Barren land | Dense forest | Open forest | Total | User accuracy
Water body | 29 | 0 | 0 | 0 | 29 | 100%
Barren land | 2 | 31 | 0 | 3 | 36 | 86.11%
Dense forest | 0 | 0 | 37 | 2 | 39 | 94.87%
Open forest | 1 | 3 | 9 | 34 | 47 | 72.34%
Total | 32 | 34 | 46 | 39 | 151 |
Producer accuracy | 90.63% | 91.18% | 80.43% | 87.18% | |
Overall classification accuracy = 86.75%, Kappa statistic = 0.823

4.3.1 Accuracy Assessment

Image classification accuracy was assessed through the error matrix, a widely used
quantitative technique relating the reference data and the classification results. In the
confusion (error) matrix, the columns represent the ground truth data (field observations,
visual interpretation, or Google Earth data), the rows represent the classes of the classified
image being evaluated, and the cells contain the number of pixels for every possible
combination of ground truth and classified image. The diagonal cells give the number of
correctly identified pixels, while the off-diagonal cells give the misclassified pixels. The
overall classification accuracy is the ratio of the sum of the diagonal elements to the total
number of elements used in the classification. The kappa statistic is another measure of
accuracy: it expresses how much better the classification is than an assignment by chance, and
it ranges from 0 to 1, with higher values indicating a more accurate classification. To obtain
the correctness of the forest cover maps, confusion matrices (error matrices) were produced.
The confusion matrices for the 1975, 1989, 2000, and 2015 images are presented in Tables 10,
11, 12 and 13, respectively. Overall accuracy, producer accuracy, user accuracy, and the kappa
statistic were computed to scrutinize the classification exactness. The overall accuracy of the
1975 image is 86.75% (with a kappa statistic of 0.823), of the 1989 image 90.77% (kappa 0.876),
of the 2000 image 88.16% (kappa 0.842), and of the 2015 image 85.03% (kappa 0.799).
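These two measures can be computed directly from a confusion matrix, as the sketch below illustrates using the counts of Table 10 (the code itself is only an illustration, not part of the original study).

```python
# Overall accuracy and kappa statistic from a confusion matrix
# (rows: classified image, columns: reference data), using the Table 10 counts.
import numpy as np

cm = np.array([[29, 0, 0, 0],
               [2, 31, 0, 3],
               [0, 0, 37, 2],
               [1, 3, 9, 34]], dtype=float)

n = cm.sum()
overall_accuracy = np.trace(cm) / n

# Chance agreement estimated from the row and column marginals.
p_e = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / n ** 2
kappa = (overall_accuracy - p_e) / (1 - p_e)

print(f"overall accuracy = {overall_accuracy:.4f}, kappa = {kappa:.3f}")
# Reproduces the reported 86.75% accuracy and kappa of about 0.82.
```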

Table 11 Confusion matrix for the year 1989 (columns: reference data)

Class | Water body | Barren land | Dense forest | Open forest | Total | User accuracy
Water body | 23 | 2 | 0 | 0 | 25 | 92%
Barren land | 1 | 31 | 2 | 0 | 34 | 91.18%
Dense forest | 0 | 0 | 27 | 3 | 30 | 90%
Open forest | 1 | 1 | 2 | 37 | 41 | 90.24%
Total | 25 | 34 | 31 | 40 | 130 |
Producer accuracy | 92% | 91.18% | 87.10% | 92.5% | |
Overall classification accuracy = 90.77%, Kappa statistic = 0.876

Table 12 Confusion matrix for the year 2000 (columns: reference data)

Class | Water body | Barren land | Dense forest | Open forest | Total | User accuracy
Water body | 39 | 4 | 0 | 0 | 43 | 90.70%
Barren land | 2 | 30 | 3 | 0 | 35 | 85.71%
Dense forest | 0 | 2 | 29 | 1 | 32 | 90.63%
Open forest | 0 | 1 | 5 | 36 | 42 | 85.71%
Total | 41 | 37 | 37 | 37 | 152 |
Producer accuracy | 95.12% | 81.08% | 78.38% | 97.30% | |
Overall classification accuracy = 88.16%, Kappa statistic = 0.842

Table 13 Confusion matrix for the year 2015 (columns: reference data)

Class | Water body | Barren land | Dense forest | Open forest | Total | User accuracy
Water body | 47 | 3 | 1 | 0 | 51 | 92.16%
Barren land | 3 | 25 | 5 | 0 | 33 | 75.76%
Dense forest | 1 | 4 | 37 | 1 | 43 | 86.05%
Open forest | 0 | 2 | 5 | 33 | 40 | 82.5%
Total | 51 | 34 | 48 | 34 | 167 |
Producer accuracy | 92.16% | 73.53% | 77.08% | 97.06% | |
Overall classification accuracy = 85.03%, Kappa statistic = 0.799

5 Conclusions

The present study reveals that the Sundarban forest area increased by about 6.86% during
1975–2000, although not uniformly over the period: it increased by 1.06% during 1975–1989 and
by 5.80% during 1989–2000. During 2000–2015 the opposite trend was observed and the area
declined by around 5.77%, although these outcomes are subject to some inaccuracy related to
natural atmospheric conditions, which were not identical when the images were collected. The
main causes of the forest decline are frequently occurring storms, a decline in fresh water
supply, rising sea level, submergence of the coastal region, human activity, etc. The study
shows that the forest area has been gradually declining along the shoreline in the southern
region of the Sundarban delta, while the forest region has increased along the small channels
on the northern side of the delta. The change detection results signify that 63–80% of the
dense forest area and 66–70% of the open forest area underwent no change during 1975–2015 and
1975–2000, respectively, although during 2000–2015 only about 36% of the open forest area
remained unaffected. The study finds that some forest areas have been converted to barren land
and water bodies. The overall classification accuracy is more than 85%, which indicates that
the classification results are good. In the future, by the year 2030, the forest area is
projected to decline by around 2% of its 1975 net area because of rising sea level, global
warming, and deforestation worldwide. Therefore, supervision, planning, and execution are
immediately needed to preserve the natural coastal ecosystem in the Sundarban region.

Acknowledgement This research activity has been carried out in the Dept. of CSE, University
of Kalyani, Kalyani, India. The authors acknowledge the support provided by the DST PURSE
Scheme, Govt. of India at the University of Kalyani.

References

1. Ghosh, A., Schmidt, S., Fickert, T., Nüsser, M.: The Indian Sundarban mangrove forests:
history, utilization, conservation strategies and local perception. Diversity 7, 149–169 (2015)
2. Alongi, D.M.: Mangrove forests: resilience; protection from tsunamis; and responses to global
climate change. Estuar. Coast. Shelf Sci. 76, 1–13 (2008)
3. FSI: India State of Forest Report 2011. Forest Survey of India, Ministry of Environment and
Forests, Dehradun (2011)
4. Jha, C.S., Goparaju, L., Tripathi, A., Gharai, B., Raghubanshi, A.S., Singh, J.S.: Forest frag-
mentation and its impact on species diversity: an analysis using remote sensing and GIS.
Biodivers. Conserv. 14, 1681–1698 (2005)
5. Giri, C., Pengra, B., Zhu, Z., Singh, A., Tieszen, L.L.: Monitoring mangrove forest dynamics
of the Sundarbans in Bangladesh and India using multi-temporal satellite data from 1973 to
2000. Estuar. Coast. Shelf Sci. 73, 91–100 (2007)
6. Pan, Y., Birdsey, R.A., Fang, J., Houghton, R., Kauppi, P.E., Kurz, W.A., et al.: A large and
persistent carbon sink in the world's forests. Science 333(6045), 988–993 (2011)
7. Giri, C., Long, J., Sawaid Abbas, R., Murali, M., Qamer, F.M., Pengra, B., Thau, D.: Distribution
and dynamics of mangrove forests of South Asia. J. Environ. Manage. 148, 1–11 (2014)

8. Ostendorf, B., Hilbert, D.W., Hopkins, M.S.: The effect of climate change on tropical rainforest
vegetation pattern. Ecol. Model. 145(2), 211–224 (2001)
9. Giriraj, A., Shilpa, B., Reddy, C.S.: Monitoring of Forest cover change in Pranahita Wildlife
Sanctuary, Andhra Pradesh, India using remote sensing and GIS. J. Environ. Sci. Technol. 1(2),
73–79 (2008)
10. Jayappa, K.S., Mitra, D., Mishra, A.K.: Coastal geomorphological and land-use and land cover
study of Sagar Island, Bay of Bengal (India) using remotely sensed data. Int. J. Remote Sens.
27(17), 3671–3682 (2006)
11. Mitra, D., Karmekar, S.: Mangrove classification in Sundarban using high resolution multi
spectral remote sensing data and GIS. Asian J. Environ. Disast. Manage 2(2), 197–207 (2010)
12. Giri, S., Mukhopadhyay, A., Hazra, S., Mukherjee, S., Roy, D., Ghosh, S., Ghosh, T., Mitra,
D.: A study on abundance and distribution of mangrove species in Indian Sundarban using
remote sensing technique. J Coast Conserv. 18, 359–367 (2014)
Identification of Malignancy
from Cytological Images Based
on Superpixel and Convolutional Neural
Networks

Shyamali Mitra, Soumyajyoti Dey, Nibaran Das, Sukanta Chakrabarty, Mita Nasipuri and Mrinal Kanti Naskar

Abstract This chapter explores two methodologies for the classification of cytology images
into benign and malignant. Heading toward automated analysis of the images to reduce human
intervention, the chapter first draws on the history of automated CAD-based screening systems
for a better understanding of the roots of the evolving image processing techniques in the
analysis of biomedical images. Our first approach introduces a clustering-based technique to
segment the nucleus region from the rest of the image; after segmentation, nuclei features are
extracted, based on which classification is done using some standard classifiers. The second
perspective suggests the usage of deep-learning-based techniques such as ResNet and
InceptionNet-v3, where classification is done with and without segmented images but without
any handcrafted features. The analysis provides results in favor of CNNs, whose average
performance is found to be better than the existing results using the feature-based approach.

Keywords Cytology · FNAC · Superpixel-based segmentation · ResNet50 · InceptionNet-V3 · Random crop · Random horizontal flip

S. Mitra · M. K. Naskar
Department of Electronics and Telecommunication Engineering, Jadavpur University, Kolkata,
India
e-mail: [email protected]
M. K. Naskar
e-mail: [email protected]
S. Dey · N. Das (B) · M. Nasipuri
Department of Computer Science and Engineering, Jadavpur University, Kolkata, India
e-mail: [email protected]; [email protected]
S. Dey
e-mail: [email protected]
M. Nasipuri
e-mail: [email protected]
S. Chakrabarty
Theism Medical Diagnostics Centre, Dumdum, Kolkata, India
e-mail: [email protected]

1 Introduction

In recent times, the incidence rate of cancer has reached an alarming level. Cancer causes
abnormal overgrowth of cells from the originating site. Normally, a cell divides to form new
cells that replace old and worn out ones, which helps in the healing process. In cancer,
however, cells divide indefinitely and more rapidly, violating the intricate control system of
cell division. Mutations in genes cause the cells to proliferate and invade other tissues and
cells. Though genetic factors are the leading cause of cancer for an individual, various other
factors accelerate the process of acquiring the disease: exposure to external agents like
plastics, heavy metals, radiation, and toxic chemical compounds, intake of junk and processed
foods, and the overall lifestyle are additive factors for abnormal cell division.
There are various imaging techniques to detect lumps or masses, such as magnetic resonance
imaging (MRI), X-ray (plain film and computed tomography (CT)), ultrasound (US), optical
imaging, etc. But these imaging techniques cannot analyze the tissue at the cellular level.
For that reason, cytology is popularly used to detect abnormality in cell structure. In
cytology, cancer is diagnosed by expert cytotechnologists by examining cell samples taken
either through biopsy or cytology. Cytology has various advantages compared to biopsy, which
are stated in the cytology section below. As our current discussion is restricted to
cytology-based diagnosis, we will discuss only cytology-based research on automated
computer-based system design.
In cytology, the nucleus plays a vital role in detecting abnormalities in cells. In some
cases, nuclei are highly overlapped, so it is very hard to segment the nucleus regions. Most of
the techniques available in the literature segment the nucleus region before classification
into benign and malignant. With the onset of deep-learning-based techniques, it is possible to
classify the images without prior segmentation, but such techniques are not yet fully
successful for cytology images, especially breast cancer images. So, in this chapter, we
discuss two methodologies for the classification of benign and malignant cells. In the first
approach, classification is done by segmenting the nuclei from the rest using a
superpixel-based approach; the segmented images are then classified with standard classifiers
based on some extracted features. In the second approach, we use the same set of segmented
images for classification with a deep-learning-based approach, without extracting any
handcrafted features. We have observed that the two approaches are almost comparable, with the
deep-learning-based approach somewhat better.

2 Cytology: A Brief Overview

The human body is composed of innumerable cells, and each cell contains a nucleus and a
cytoplasm. The nucleus carries the genetic imprint, the DNA, which undergoes mutation when a
certain disease is acquired. The change is observed in various

attributes of the nucleus and cytoplasm, such as shape, size, color, texture, and nature, and
by observing these criteria under the microscope, malignant and premalignant conditions can be
diagnosed. The branch of medical science that deals with the examination of cells to detect
the presence of any abnormal cellularity is known as cytology. Cytology can be categorized
into various types based on the originating site. Normally, a benign cell exhibits a
well-defined pattern with a regular nuclear contour. On the other hand, a malignant nucleus is
characterized by irregular nuclear contours with varied shapes and sizes. Thus, nuclei have
the utmost diagnostic value from the clinical point of view.
There are various modalities by which a cellular specimen can be collected by the
pathologist. One of the most common techniques is fine needle aspiration (FNA). It is a
diagnostic procedure to sample cells from a cyst or a palpable mass, or from a region detected
by other imaging techniques like X-ray [1], ultrasound [2], or mammography [3]. A thin needle
(normally 23–25 gauge) is inserted through the skin and a sufficient amount of cells is taken
out for investigation. This procedure was first performed successfully at Maimonides Medical
Center, United States, in 1981. It was soon realized that it is a relatively faster, safer,
cheaper, noninvasive, much less painful, and trauma-free diagnostic process compared to
surgical biopsy [4]. Complications are very rare and include a mildly sore area and minor
hemorrhage. It is extensively used in the detection and diagnosis of breast lumps, liver
lesions, renal lesions, ovarian masses, soft tissue masses, pulmonary lesions, thyroid nodules,
subcutaneous soft tissue masses, salivary gland lesions, lymph nodes, etc. [5]. Many techniques
such as bacterial culture, immunocytochemistry, cytogenetics, polymerase chain reaction, etc.
are possible from FNAC.
There are two types of tests commonly used in cytology to detect malignancy, as follows:
Screening Test: This test is normally done before the actual onset of the disease, that is,
when symptoms are not yet perceivable, and it is recommended for those in the high-risk zone.
Screening at regular intervals can help to diagnose the disease at a premature stage, when it
can be readily treated with the best possible outcome.
Diagnostic Test: This test is usually performed when symptoms actually start to develop and
is prescribed for patients who are found to suffer from a serious ailment.
Both tests can be performed manually or via automatic computer-assisted design (CAD)-based
systems. A manual test is normally performed under human jurisdiction, which involves human
effort, time, and fatigue. Examining more than hundreds of slides per day is a cumbersome
task, and it requires expert cytotechnologists, who are in short supply. A manual diagnostic
procedure is shown in Fig. 1.
Therefore, automatic CAD-based systems are designed which can process thousands of specimens
per minute without much human intervention. Needless to mention, they save the time and energy
of the cytotechnologists by assisting them technically in various ways.
• It can work as a prescreening system to distinguish between a normal and, more often, an
abnormal specimen. Thus, the system should show more false positive

Fig. 1 Diagram of a manual screening system

cases to make sure that no single malignant specimen requiring further investigation by
doctors is overlooked. Thus, this system can work as an assistant to the doctors by
eliminating the need to assess the normal specimens, saving a significant amount of the
doctors' time and energy and improving the efficiency of the diagnosis process.
• An automated system can be run in parallel with the conventional manual screening
procedure. The screening process thus becomes two-fold, so that the chances of faulty
diagnosis and false negative cases are reduced to a greater extent, which is of the utmost
importance from the diagnostic point of view. An automatic diagnostic system is shown in
Fig. 2.
Though cytology has numerous advantages, there are a few issues that are yet to be resolved:
• It cannot localize a neoplastic lesion to an exact anatomic location.
• It cannot distinguish between preinvasive and invasive cancer.
• It is unable to distinguish reactive from dysplastic cells.
• It may not be able to determine the tumor type.
• The cellular structure of the specimen is influenced by the experience of the specimen
collector and by the technique and instruments used.
• False negative cases are more common with FNA.

Fig. 2 Diagram of an automatic screening system

2.1 Automation as a Growing Concept in Cytology

Automation is a pertinent solution in the present context of an increasing number of cancer
cases, aimed at reducing the workload of cytotechnologists. Various designs developed during
1970–1990, like CERVISCAN [6], BioPEPR [7], LEYTAS [8], and Xcyt [9], laid a strong foundation
for today's era of automation-assisted screening systems.
In cytology, some automated devices work as an adjunct to the human interface to reduce the
workload, as described below:
Automated Slide Preparation Devices: Two systems have received FDA approval for automated
preparation of slides, the ThinPrep Processor and the AutoCyte Prep. The ThinPrep Processor
5000 uses ThinPrep technology for the preparation of slides and can produce roughly 35
slides/hour. AutoCyte Prep also uses liquid-based technology, but the slide preparation
technique of the two systems is different.
Computerized Screening Systems: These systems work on the principle of underlying image
processing algorithms, where an output or processed image is produced corresponding to an
input cytology image. Image processing algorithms segment the region of interest, i.e., the
nuclei, from the rest and extract features which can classify the image into benign and
malignant. A few systems focus on the portion of the slide carrying abnormalities and reduce
the time by examining only a small portion of the slide. Computerized microscopes like
CompuCyte's M Pathfinder and Accumed International's AcCell 2000 are highly appreciated for
this purpose.

3 Literature Survey

Segmentation is always a challenging problem for extracting the region of interest in
biomedical images because of their diverse and complicated nature, and several researchers
have approached this problem in various ways. In the present work, as already stated, we
classify the specimen images in two ways. Before diving into the present work, let us look at
some significant state-of-the-art methods that were implemented successfully and are also
helpful in understanding the context of the present work.
An adaptive-thresholding-based segmentation method was proposed by Zhang et al. [10] for
automatic segmentation of cervical cells. They explored a concave-point-based algorithm to
delineate overlapping nuclei with reduced computational cost. Zhang et al. [11] used a local
graph cut technique to segment nuclei and a global graph cut to segment the cytoplasm of
free-lying cells.
Zhao et al. [12] invoked a superpixel-based Markov random field (MRF) segmentation technique
whose labeling process delineates the nucleus, cytoplasm, and other components of the cell.
Li et al. [13] proposed spatial K-means clustering to initially classify the image into three
clusters: nucleus, background, and cytoplasm. Final segmentation is done using a radiating
gradient vector flow (RGVF) snake model. RGVF uses a new edge-map computation method to locate
ambiguous and incomplete boundaries but fails to delineate overlapping cells. Thanatip
Chankong et al. [14] proposed patch-based fuzzy C-means (FCM) clustering to segment the
nucleus, cytoplasm, and background.
Hough-transform-based techniques have been used extensively in contemporary works. To segment
breast FNAC images, George et al. [15] proposed a Hough-transform-based technique: circular
structures were detected first, Otsu's thresholding was used to eliminate the false circles
created during the process, and a marker-controlled watershed transform then accurately drew
the nucleus boundary. For classification, 12 features were extracted, and four different
classifiers (MLP, PNN, LVQ, and SVM) were used to classify the images into benign and
malignant with tenfold cross-validation. A marker-controlled watershed proposed by Xiaodong
Yang et al. [16] also segments the nuclei region from the rest.
W. N. Street et al. [9] proposed a system called Xcyt [9] to perform screening tests for
breast cancer, using the Hough transform to detect circle-like shapes and an active contouring
technique to detect nucleus boundaries. Hrebien et al. [17] also used a Hough-transform-based
technique followed by an automatic nuclei localization method using a (1 + 1) search strategy;
nuclei segmentation is done using watershed, an active contour model, and the grow-cut
algorithm. However, they failed to address overlapping nuclei, and another drawback was the
generation of false circles, which was not resolved later.
Garud et al. [18] proposed a deep-CNN-based classification approach for breast cytology
samples. Experiments were conducted with eightfold cross-validation on 37 cytopathology
samples using the GoogLeNet architecture. Regions of interest (ROIs) were extracted manually
from the cell samples, and the GoogLeNet architecture was trained

to classify these breast FNAC region of interest (ROIs) and achieved the mean recog-
nition accuracy of 89.7%.
An artificial neural network (ANN)-based diagnosis process was proposed by Dey et al. [19] to
classify lobular carcinoma cases. The dataset consisted of 64 images (40 training, 8
validation, and 16 test images). Using HE-stained slides, automated image morphometry was
applied to study nuclear features like area, diameter, perimeter, etc. The network consisted
of 34 input nodes in the first layer, 17 nodes in the hidden layer, and a 3-class output
layer.
ANN-based classification of cytology images was proposed by Isha et al. [20] to classify
breast precancerous lesions. The dataset consisted of 1300 precancerous cases collected from
Penang General Hospital and Hospital Universiti Sains Malaysia, Kelantan, Malaysia, for
training and testing. A hybrid multilayered perceptron network was used to classify the
images.
An automated detection technique for cervical cell nuclei was proposed by Braz et al. [21]
using a convolutional neural network (CNN). For the experiment, they used the overlapping
cervical cytology image segmentation challenge (ISBI 2014) dataset. Square patches were
extracted from the training images such that the central pixel belonged to the target classes,
and rectified linear units were used in the convolutional and fully connected layers.
Tareef et al. [22] suggested deep-learning-based segmentation and classification of
Papanicolaou-smeared cervical cytological images. The image patches were generated by the
simple linear iterative clustering process, and the diagnosis was done by a superpixel-based
convolutional neural network.

4 Present Work

As indicated earlier, the present work consists of two different approaches, and both are
validated on the same dataset. In both approaches, nucleus segmentation is performed. As
noted, segmentation is the most crucial and vital problem in image analysis, the main reason
being the complex and varied nature of the images. In cytological images, the region of
interest is the nucleus, where most of the abnormalities are registered, so accurate
delineation of the nucleus is required to extract the meaningful content. Various algorithms
have been proposed for accurate segmentation of the nucleus. In the first approach, we segment
the nuclei in the images using a combination of clustering algorithms. After segmenting the
region of interest, i.e., the nuclei, features like compactness, eccentricity, area, and
convex area of the nuclei are extracted, and based on the extracted features classification is
done using MLP, k-NN, AdaBoost, SVM, and random forest classifiers. In the second approach,
after segmenting the nuclei, classification is done using a deep learning approach without
using the feature set. In the next sections, we discuss both approaches in detail.

Fig. 3 Sample images of benign tumors

5 Dataset Description

The experiment was performed on FNAC-based cytological images collected from the pathology
center "Theism Medical Diagnostics Centre, Dumdum, West Bengal." 100 cytological images were
collected, consisting of 50 malignant and 50 benign samples. The images were captured at
5-megapixel resolution using an Olympus microscope at 40X optical zoom (Figs. 3, 4).

6 First Approach: Classification of the Images with the Help of Standard Classifiers

In this approach, we invoke feature-based classification using standard classifiers such as
K-NN and MLP. The outline of this first process is shown in Fig. 5. The images are represented
in the RGB color space using Eq. 1:

I_RGB = (F_R, F_G, F_B)    (1)

Fig. 4 Sample images of malignant tumors

Fig. 5 A block diagram of the first approach

where F_R(x, y), F_G(x, y), and F_B(x, y) are the intensities of the red, green, and blue
channel pixels at (x, y), respectively. The image is then split into its red, green, and blue
channels, and anisotropic diffusion [23] is applied individually on the three channels to
remove noise. Anisotropic diffusion blurs or removes noise selectively within regions while
leaving the edges unblurred, which is very advantageous for further processing of the images.
It is performed using the following equation:

F_t = div(c(x, y, t) ∇F) = c(x, y, t) ΔF + ∇c · ∇F    (2)

where the symbol "∇" denotes the gradient (the difference in intensity between
nearest-neighboring pixels) and c is the diffusion coefficient, a function g of the local
gradient, defined either by g(x) = e^(−(x/k)^2) or by g(x) = 1/(1 + (x/k)^2).
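A minimal numpy sketch of this Perona–Malik style diffusion is shown below; the iteration count, step size, conductance parameter k, and the simplified boundary handling are illustrative assumptions, not the tuned settings used in this work.

```python
# Minimal Perona-Malik anisotropic diffusion sketch for one image channel.
# Iterations, step size and conductance parameter k are illustrative choices;
# boundaries are handled crudely by wrap-around (np.roll).
import numpy as np

def anisotropic_diffusion(img, niter=20, k=30.0, lam=0.2):
    img = img.astype(np.float64).copy()
    for _ in range(niter):
        # Finite differences toward the four neighbors.
        dn = np.roll(img, -1, axis=0) - img
        ds = np.roll(img, 1, axis=0) - img
        de = np.roll(img, -1, axis=1) - img
        dw = np.roll(img, 1, axis=1) - img
        # Conductance g(x) = exp(-(x/k)^2): small across strong edges, so edges stay sharp.
        cn, cs = np.exp(-(dn / k) ** 2), np.exp(-(ds / k) ** 2)
        ce, cw = np.exp(-(de / k) ** 2), np.exp(-(dw / k) ** 2)
        img += lam * (cn * dn + cs * ds + ce * de + cw * dw)
    return img

smoothed = anisotropic_diffusion(np.random.rand(64, 64) * 255)
print(smoothed.shape)
```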
Median-based SLIC [24] is now used to segment the noise-removed image. 2000 superpixels are
formed from the image; superpixels whose area is ≤10 pixels are removed, and the features are
determined empirically using the median color value.
The labeled image thus produced is subjected to a spatial DBSCAN [25] based clustering method
to identify the high-density regions in the image. Let the new labeled image corresponding to
the new clustered regions be I_db-scan. The labeled image is then used to detect boundaries,
producing I_boundary, which is binarized to give the binary image I_Binary. This is inverted
to make the labels more distinct and is denoted by I′ = [1 1 … 1]_{m×n} − (I_Binary)_{m×n}.
Morphological erosion of the image I′ is then performed with a structuring element s:
I′ ⊖ s = {z | s_z ⊆ I′}, where s_z is the translation of s by z. Let I″ ≡ I′ ⊖ s. The mean
intensity of each connected component of the image I″ is then calculated as

MI = (1/n) Σ_{j=1}^{n} f(x_j, y_j),

where n is the number of pixels in the component and f(x_j, y_j) is the intensity value of the
pixel at (x_j, y_j). The mean intensity value of each connected component is assigned to all
of its pixels: F1(x, y) = MI_i^red for all (x, y) belonging to the ith connected component of
the red channel.
Similarly,

F2(x, y) = MI_i^green
F3(x, y) = MI_i^blue.

Thus,

I_contour = F1 ∪ F2 ∪ F3    (3)

I_contour is split into its red, green, and blue channels; each channel is converted to gray
level, and the channels are finally merged. Fuzzy C-means clustering [26] then partitions the
image, and 15 clusters are chosen to form a new clustered image I_fcm. Irrelevant details are
removed using connected component analysis: for the ith connected component, CC_i(x, y) = 0 if
the area of CC_i is ≤700 pixels. Morphological erosion and dilation are applied to remove
unwanted objects, and finally the original pink-colored nuclei are overlaid on the masked
background.
To isolate the overlapped nuclei, an entropy-based superpixel [27] algorithm is introduced,
resulting in 750 superpixels, and the overlapping portions are separated based on superpixel
labels using the entropy rate.

Fig. 6 A block diagram of the superpixel-based segmentation approach

Table 1 Statistical information on performance of different classifiers [28]

Classifier | | Precision | Recall | F-measure | Accuracy (%)
K-NN | Class #1 | 1 | 0.813 | 0.897 | 91
 | Class #2 | 0.85 | 1 | 0.919 |
 | Weighted average | 0.923 | 0.909 | 0.908 |
MLP | Class #1 | 0.93 | 0.875 | 0.903 | 91
 | Class #2 | 0.889 | 0.941 | 0.914 |
 | Weighted average | 0.91 | 0.909 | 0.909 |

The overlapping regions are separated at the maximum value of the entropy-based objective
function [27]. Thus, the segmented image is produced with only deep pink-colored nuclei,
independent of each other (Fig. 6).
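A rough sketch of the superpixel-plus-clustering idea is given below, using scikit-image SLIC and scikit-learn DBSCAN as stand-ins; the parameter values (n_segments, eps, min_samples) are illustrative and not the tuned values described above.

```python
# Rough sketch: SLIC superpixels, median-color features, and DBSCAN grouping.
# Parameter values are illustrative stand-ins, not the tuned pipeline settings.
import numpy as np
from skimage.segmentation import slic
from sklearn.cluster import DBSCAN

def superpixel_cluster(rgb_image):
    # About 2000 superpixels, as in the pipeline described above.
    labels = slic(rgb_image, n_segments=2000, compactness=10, start_label=0)
    n_sp = labels.max() + 1
    # Median RGB color of each superpixel serves as its feature vector.
    feats = np.array([np.median(rgb_image[labels == i], axis=0) for i in range(n_sp)])
    # Density-based clustering of superpixel features (e.g., nuclei vs. background).
    groups = DBSCAN(eps=10.0, min_samples=5).fit_predict(feats)
    # Map every pixel back to the cluster id of its superpixel.
    return groups[labels]

img = (np.random.rand(120, 160, 3) * 255).astype(np.uint8)   # placeholder image
print(np.unique(superpixel_cluster(img)))
```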

6.1 Experimental Results of the First Approach [28]

The experiment was performed with threefold cross-validation using the K-NN and MLP
classifiers, and both achieved an average maximum recognition accuracy of 91%. For more
details, see Table 1.

7 Second Approach: Classification of the Images with the Help of CNN

In this section, we introduce deep-learning-based techniques for the identification of benign
and malignant cells. Deep-learning-based techniques are nowadays used heavily for computer
vision related tasks; the success rate highly depends on the architecture of the developed
network and the number of image samples in the database. Most of the successful networks, such
as AlexNet [29], ResNet [30], and InceptionNet [31], are used mainly for natural image
recognition. On the other hand, benign and malignant cell identification from cytological
images is very difficult because there are no hard distinguishing factors between them, and
another major challenge of cytological images is the limited database. Here, we first segment
the cytological images based on superpixels; the segmented images are then used for training
and testing with deep learning architectures such as ResNet and InceptionNet-V3. We have
observed that the CNNs perform better on segmented images than on raw images, and the average
performance is better than the existing result using the feature-based approach.
We use deep learning, more specifically convolutional neural networks, for the identification
of benign and malignant cells. To do so, we use the superpixel-based image segmentation
technique to segment nuclei from the cytoplasm, and the segmented regions are then classified
using CNNs.
Dataset preparation for deep-learning-based classification: The 100 segmented images (50
benign and 50 malignant samples), previously segmented by the superpixel-based approach, are
divided randomly in the ratio 3:2:1 to make the train, test, and validation sets. The
validation set is used to select an appropriate deep learning model during training: the best
model is selected depending on the validation set, and it gives an estimate of the tuned
model. In the proposed method, two types of deep learning networks are used for classification
of the images, ResNet50 and InceptionNet-V3. The predefined architectures of these two
networks are used for training. Detailed descriptions of the architectures are given in the
following two subsections.

7.1 Network Architecture Description of ResNet50

The ResNet architecture, introduced by Microsoft Research Asia in 2015, is now popularly used
in the computer vision domain. In the ResNet50 architecture, each two-layer block of ResNet34
is replaced with a three-layer bottleneck block (Fig. 7 and Table 2).

Fig. 7 Architecture of ResNet50

Table 2 Parameters of the network architecture [32]

Layer name | Layers | Output size
Conv1 | 7 × 7, 64, stride 2 | 112 × 112
Conv2_x | 3 × 3 max pool, stride 2; [1 × 1, 64; 3 × 3, 64; 1 × 1, 256] × 3 | 56 × 56
Conv3_x | [1 × 1, 128; 3 × 3, 128; 1 × 1, 512] × 4 | 28 × 28
Conv4_x | [1 × 1, 256; 3 × 3, 256; 1 × 1, 1024] × 6 | 14 × 14
Conv5_x | [1 × 1, 512; 3 × 3, 512; 1 × 1, 2048] × 3 | 7 × 7
 | Average pool, 2-way fc, softmax | 1 × 1
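A brief sketch of how such a pretrained ResNet50 can be retargeted to the two-class (benign/malignant) problem in PyTorch is shown below; this is only an assumed adaptation, not the authors' exact code.

```python
# Sketch: adapting an ImageNet-pretrained ResNet50 to 2 output classes.
import torch.nn as nn
from torchvision import models

model = models.resnet50(pretrained=True)        # pretrained backbone
model.fc = nn.Linear(model.fc.in_features, 2)   # replace the 1000-way head with 2 classes
print(model.fc)
```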

7.2 Network Architecture Description of InceptionNet-V3

InceptionNet-V3 is a deep neural network available as a pretrained model in the PyTorch
environment. The Inception module was first trained by Google Inc. on the ImageNet dataset of
1000 classes, but in our proposed work the Inception module is trained for our binary-class
dataset of cytology images. The Inception-v3 architecture is an upgraded version of
Inception-v1 and Inception-v2. In Inception-v2, a 7 × 7 convolution is factorized into three
3 × 3 convolutions, and Inception-v2 is itself a slight update of Inception-v1. There are
three Inception modules at the 35 × 35 feature-map size (Table 3).
Inception-v3 has the same architecture as Inception-v2 with some minor changes: the
batch-normalized auxiliary classifier is added, i.e., the fully connected layer of the
auxiliary classifier is also batch-normalized (Fig. 8).

Table 3 Parameters of the network architecture [33]

Layer name | Input size
Conv (convolution layer) | 299 × 299 × 3
Conv | 149 × 149 × 32
Conv padded | 147 × 147 × 32
Pool (pooling layer) | 147 × 147 × 64
Conv | 73 × 73 × 64
Conv | 71 × 71 × 80
Conv | 35 × 35 × 192
3 × Inception | 35 × 35 × 288
5 × Inception | 17 × 17 × 768
2 × Inception | 8 × 8 × 1280
Pool | 8 × 8 × 2048
Linear layer | 1 × 1 × 2048
Softmax | 1 × 1 × 2

Fig. 8 Architecture of InceptionNet-V3 (Image Source https://hackathonprojects.files.wordpress.com/2016/09/74911-image03.png)
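Analogously, the pretrained Inception-v3 available in torchvision (which expects 299 × 299 inputs) can be retargeted to two classes; whether the auxiliary classifier head is also replaced is an assumption of this sketch.

```python
# Sketch: adapting pretrained Inception-v3 (299x299 inputs) to 2 output classes.
import torch.nn as nn
from torchvision import models

model = models.inception_v3(pretrained=True, aux_logits=True)
model.fc = nn.Linear(model.fc.in_features, 2)                       # main head
model.AuxLogits.fc = nn.Linear(model.AuxLogits.fc.in_features, 2)   # auxiliary head
print(model.fc, model.AuxLogits.fc)
```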

7.3 Training Process

First, the images in the training set are randomly flipped horizontally, and the flipped
images are cropped randomly, so an image of dimension 960 × 1280 is cropped randomly to
224 × 224 (for ResNet50) or 299 × 299 (for InceptionNet-V3); in the PyTorch implementation,
the input images must be resized to these dimensions. These randomly cropped images are used
to train the networks. The transformations are applied at runtime, so the cropped portions
change randomly at each training epoch. For these experiments, an NVIDIA GeForce GTX 970 GPU
system with 1664 CUDA cores and 4 GB RAM is used. The batch size, number of epochs, and
learning rate of the model are set to 8, 200, and 0.0001, respectively, for the present work.
Among the different optimization techniques, the ADAM optimizer is used for both networks, and
the training loss is calculated by the negative log-likelihood estimation method. On a 4 GB
GPU, the training time for InceptionNet-V3

Fig. 9 The experimentation module for Experiment-1 of second approach

Table 4 The results of classification accuracy of Experiment-1

Neural network model | Test 1 accuracy (%) | Test 2 accuracy (%) | Test 3 accuracy (%) | Average accuracy (%)
InceptionNetV3 | 90 | 95 | 90 | 91.67
ResNet50 | 85 | 90 | 85 | 86.67

is approximately 1.5 h and for ResNet50 is approximately 1 h. The developed training


architecture for Experiment 1 is shown in Fig. 9.
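A condensed sketch of this augmentation and training setup is given below (for the ResNet50 case); the dataset folder layout is a placeholder assumption, and the loop omits validation-based model selection for brevity.

```python
# Sketch of the augmentation and training setup described above (PyTorch, ResNet50 case).
# The dataset path/folder layout is a placeholder assumption.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

train_tf = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(224),               # 299 for InceptionNet-V3
    transforms.ToTensor(),
])
train_set = datasets.ImageFolder("data/train", transform=train_tf)   # hypothetical path
loader = DataLoader(train_set, batch_size=8, shuffle=True)

model = models.resnet50(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.NLLLoss()                       # negative log-likelihood on log-probabilities

for epoch in range(200):
    for images, labels in loader:
        optimizer.zero_grad()
        log_probs = nn.functional.log_softmax(model(images), dim=1)
        loss = criterion(log_probs, labels)
        loss.backward()
        optimizer.step()
```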

7.4 Classification Result

Experiment-1:
First, the 20 test images (10 benign and 10 malignant) are transformed by random horizontal
flip and then cropped randomly to the dimension 224 × 224, and they are then predicted by the
best-trained models of ResNet50 and InceptionNet-V3. We tested three times using these trained
models (Figs. 10, 11 and Table 4).

Fig. 10 A diagram of using deep learning with corresponding segmented input images

Table 5 Statistical information of Experiment-2

Metric | Inception-v3 (Benign) | Inception-v3 (Malignant) | ResNet-50 (Benign) | ResNet-50 (Malignant)
Precision | 1 | 0.8 | 0.9 | 0.8
Recall | 0.83 | 1 | 0.82 | 0.89
F-measure | 0.91 | 0.89 | 0.86 | 0.84
Accuracy | 90% (Inception-v3) | | 85% (ResNet-50) |

Experiment-2:
From the best-trained models, the class-specific probability distributions of the test
samples are extracted. Since the test images are transformed by random horizontal flip and
then randomly cropped, a distinct set of class-specific probability values is obtained for the
test samples at each testing. Here, five tests are conducted and the average probability
values are calculated.
Let p_i^B(k) and p_i^M(k) be the probability values of class-1 (benign) and class-2
(malignant), respectively, of the kth test sample in the ith test, where i = 1(1)5 and
k = 1(1)20. The resultant probability value of the kth test sample is

p_B(k) = (1/m) Σ_{i=1}^{m} p_i^B(k)  and  p_M(k) = (1/m) Σ_{i=1}^{m} p_i^M(k),

where p_B(k) and p_M(k) denote the average probability values of class-1 and class-2,
respectively, of the kth sample and m is the number of test cases.
The test samples are then predicted from these averaged probability values (i.e., if a test
sample has the highest probability value for class-k, the sample is assigned to class-k)
(Table 5).
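This test-time averaging can be sketched as follows; the model, the transform, and the number of passes m are assumed to be set up as in the training sketch above.

```python
# Sketch: averaging class probabilities over m randomly augmented test passes.
import torch

@torch.no_grad()
def averaged_prediction(model, image, transform, m=5):
    """`image` is a PIL image; `transform` applies the random flip/crop and ToTensor.
    Returns the class index with the highest mean probability."""
    model.eval()
    probs = []
    for _ in range(m):
        x = transform(image).unsqueeze(0)        # a new random crop/flip each pass
        probs.append(torch.softmax(model(x), dim=1))
    mean_prob = torch.stack(probs).mean(dim=0)   # the averaged p_B(k), p_M(k)
    return int(mean_prob.argmax(dim=1))
```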

Fig. 11 Graph plot of train loss (in red color) versus validation loss (in blue color) per epoch
(training on segmented images) of Experiment-1

Table 6 Experiment results of Experiment-3

Neural network | Test-1 (%) | Test-2 (%) | Test-3 (%) | Average accuracy (%)
ResNet50 | 80 | 75 | 85 | 80
Inception-v3 | 85 | 75 | 70 | 76.67

Experiment-3:
The raw RGB images are transformed by a random horizontal flip and then randomly cropped to 224 × 224 for ResNet50 and 299 × 299 for InceptionNet-V3 (Fig. 12 and Table 6).
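As a small illustrative sketch, the per-network preprocessing for the raw RGB images could be selected as below; the size table, the resize margin, and the helper name are assumptions, not part of the chapter.

from torchvision import transforms

INPUT_SIZE = {"resnet50": 224, "inception_v3": 299}   # crop size expected by each backbone

def raw_rgb_transform(backbone: str) -> transforms.Compose:
    size = INPUT_SIZE[backbone]
    return transforms.Compose([
        transforms.Resize(size + 32),        # keep a margin so the random crop varies
        transforms.RandomHorizontalFlip(),
        transforms.RandomCrop(size),
        transforms.ToTensor(),
    ])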

Fig. 12 Training loss (red) versus validation loss (blue) per epoch in Experiment-3

8 Conclusion

In this chapter, we discussed the advantages and challenges of cytological images and explored two techniques for their automatic recognition. The first is a traditional feature-based approach, in which segmentation is performed prior to feature extraction and the segmented nuclei are classified on the basis of the extracted features. Segmentation uses different clustering-based methods, fine-tuned with entropy-based superpixels. This approach yields a maximum recognition accuracy of 91% under threefold cross-validation using MLP and K-NN classifiers. The second approach explores classification with deep learning, in particular CNNs, through two experiments. Experiment-1 deals with the raw dataset, where testing performance is poor; Experiment-2 investigates the CNNs on the segmented data and shows a marked improvement in recognition accuracy. Both experiments in the second approach are free of handcrafted features. Two popular CNN architectures, ResNet50 and InceptionNet-V3, are used to classify the data, and a probability-based combination of the two networks is also explored. The average performance of InceptionNet-V3 is better than that of the other methods, and its highest recognition accuracy of 95% on one set of segmented data is slightly higher than that of the first approach. Including more samples may improve the performance of the deep learning modules significantly.

