
International Virtual Conferences on
AITC & CSSP - 2023

© Hinweis Research, 2023


International Virtual Conferences on
AITC - 2023 and CSSP - 2023

Chief Editors

Dr. Janahanlal PS
Viswajyothi College of Engineering, India

Dr. Yogesh Chaba


Guru Jambheshwar University of Science & Technology, India

Hinweis Research
Hinweis Research comprises the world's most distinguished engineers, scientists, and academicians, covering the entire
spectrum of scientific disciplines.

© Hinweis Research, 2023


Copyright

Published by
Hinweis Research
KP 7/581, Kazhakkuttam, Thiruvananthapuram, Kerala, India-695301

Proceedings of the
Joint International Conferences on AITC and CSSP 2023
ISBN: 978-81-958173-5-1

Copyright © AITC and CSSP 2023 Organizers.


All rights reserved.
This Proceedings Book, or parts thereof, may not be reproduced in any form or by any means,
electronic or mechanical, including photocopying, recording or any information storage and
retrieval system now known or to be invented, without written permission from the Publisher,
Editor(s) or from the Organizers.

© Hinweis Research, 2023


Committees
Honorary Chair
Dr. R Rajesh, Bharathiar University, India
Dr. Wan Abdul Rahim Wan Mohd Isa, Universiti Teknologi MARA, Malaysia
Dr. Marina Binti Yusoff, Universiti Teknologi MARA, Malaysia

Technical Chair
Dr. Phiroj Shaikh, Don Bosco Institute of Technology, India

General Chair
Dr. Mukta Dhopeshwarkar, Dr Babasaheb Ambedkar Marathwada University, India
Dr. Shuzlina Binti Abdul Rahman, Universiti Teknologi MARA, Malaysia

General Co-Chair
Dr. Sonali B. Kulkarni, Dr.Babasaheb Ambedkar Marathwada University, India
Dr. Rekha. K S, The National Institute of Engineering, India

Publicity Chair
Dr. Pingkun Yan, Philips Research North America
Dr. Savan K Patel, Ganpat University, India
Dr. C Namrata Mahender, Dr Babasaheb Ambedkar Marathwada University, India

Publicity Co-Chair
Prof. Dacheng Tao, NTU, Singapore
Dr. Amlan Chakrabarti, University of Calcutta, India
Dr. Sonali B. Kulkarni, Dr.Babasaheb Ambedkar Marathwada University, India

Program Committee Chair


Dr. Mustafa, Anadolu University, Turkey
Dr Deepak Laxmi Narasimha, University of Malaya, Malaysia
Dr. N. Nagarajan, Anna University, Coimbatore, India
Prof. Akash Rajak, Krishna Institute of Engg. & Tech., UP, India
Dr. Javed Vassilis Khan, Academy for Digital Entertainment, The Netherlands

International Advisory Committee


Dr. Pawel Hitczenko, Drexel University, USA
Dr. Kristian J. Hammond, Northwestern University, USA
Dr. Long Que, Louisiana Tech University, USA
Dr. Peter Vadasz, Northern Arizona University, USA

Program Committee Members


Dr. Shu-Ching Chen, Florida International University, USA

Dr. T.S.B.Sudarshan, BITS Pilani, India
Dr. Habibollah Haro, Universiti Teknologi Malaysia
Dr. V. K. Bhat, SMVD University, India
Dr. Keivan Navi, Shahid Beheshti University, Tehran
Table of Contents
1. Encoder-Decoder Approach toward Vehicle Detection 1-6
Pushan Deb, Sunny Kumar and Preethi N

2. Home Automation with Node MCU & Firebase using Internet of Thing (IoT) 7-13
Aarya Pawar, Pratham Khinvsara, Revant Pund, Tushar Raikar,
Rishikesh Dayma and Nishant Kulkarni

3. Open-Source Workforce Administration System using Django 14-22


Ritika Rastogi, Riya Gupta and Ayushi Agarwal

4. E-Learning: Research and Applications 23-28


Shantanu Sharma, Shreeyanshi Gautam and Ayushi Agarwal

5. Medical Chat Bot for Ambulance During Emergency Situations 29-32


Dhyaneshwaran J, Farrel Deva Asir J and Saranya S

6. A Comprehensive Study on Time Series Analysis in Healthcare 33-42


Karthick Myilvahanan J, Nivetha K, Krishnaveni A, Mohana Sundaram N
and Santosh R

7. Detecting and Isolating Black-Hole Attacks in Manet using Timer based 43-49
Baited Technique
Paramjit and Saurabh Charya

8. Design of Wideband Band Stop Filter using Signal Interference Technique 50-54
Madhukumar Patnala, Bachu Munideepika, Vetti Pavithra, Totthuku Sunil
and Nallapothula Sreenivasulu

9. Block Chain-based E-Voting System using Smart Contract 55-60


Priya Shelke, Suruchi Dedgaonkar, Nilesh Gopale, Rohit Desai,
Ninad Deogaonkart and Nachiket Joshi

10. Phishing E-Mail Detection and Blocking it based on the Header Elements 61-66
Sulaiman Awadh Ali Obaid Maeli and Ajay U Surwade

11. IoT based Weather Detecting System 67-71


Priyanshi Patil, Nikhil Patil, Pratham Patil, Mohit Patil and Pratap Patil

12. Automatic Depression Level Detection 72-76


Rupali Umbare, Vedant Bhamre, Danish Tamboli, Trushant Jadhav and
Pranav Kurle

13. Cryptocurrency Price Prediction by Integrating Optimization Mechanism to 77-84
Machine Learning
Deepak Nandal and Pankaj

14. A Comprehensive Study on Current Trends in Unsupervised Machine 85-92


Learning Algorithms and Challenges in Real World Applications
Velvadivu P, Sujithra M and Priyadharshini R

15. Variable Selection Methods, Comparison and their Applications in Machine 93-101
Learning: A Review
Kirti Thakur, Harish Kumar and Snehmani

16. Devanagari Characters Recognition: Extracting Best Match for Photographed 102-109
Text
Neelam Chandolikar, Swati Shilaskar, Vaishali Khupase and Mansi Patil

17. Dental Biometrics Segmentation on Panoramic X-Ray Images using 110-115


Computational Intelligence Approach
Sujithra M, Rathika J, Velvadivu P, Abinanda P, Gayathri G and Rekha V S

18. Credit Risk Analysis of Loans using Social media Information 116-121
Halkarnikar P P, Khandagale H P and Amol Dhakne

19. Blockchain Enabled Marksheets and Degree Certificates 122-129


Sharon Christa and Tanusha Mittal

20. Design of a Miniaturized Microstrip Antenna using Slots on the Radiating 130-135
Patch for Wireless Applications
Susmita Bala, Biplab Bag, Sushanta Sarkar and ParthaPratim Sarkar

21. Cloud-based Resource Distribution using a Blockchain Approach 136-139


Radha T Deoghare and Sapana A Kolambe

22. Detecting Human Emotion by Text Classification 140-147


Brunda U, Palakuru Akhilesh and Kalaiselvi K

23. Effects of Integration of Electric Vehicle Charging Stations into the Grid 148-154
Deepti Jagyasi and Ramchandra Adware

24. Enhancement of Accuracy and Performance of Deep Learning System for 155-162
Intrusion Detection System
Abhishek Kajal and Vaibhav Rana

25. Real-Time Remote General Healthcare Clinic 163-167


Yashwanth M, Yoga Verma V and Saranya S
26. Detection of Varicose Superficial Venous Thrombophlebitis in Vein using 168-173
MSNN Algorithm
Shiyam R, Srividya R and Saranya S

27. Email Automation and Database Management 174-177


Umakant Tupe, Shubham Ghalme, Kanchan Shelke and Rutuja Kadam

28. A Novel Approach with Deep Learning Method with Effective Storage 178-184
Security in Hybrid Clouds
Vijay Prakash, Aditya Tripathi, Shashank Saxena and Arshad Ali

29. Video Surveillance Fire Detection System using CNN Algorithm 185-189
Tupe U L, Lakhan Jadhav, Shivanand Koli, Prasad Kulkarni and
Mayur Gaikwad

30. Smart Time Table Generation using Artificial Intelligence 190-195


Sukhwant Kour Siledar and Vijaya B Musande

31. Review of AI/ML in Software Defined Network from Past to Present 196-206
Raghavendra Kulkarni

32. Preprocessing and Segmentation of Retinal Blood Vessels in Fundus Images 207-216
using U-Net
Sudha Abirami R and Suresh Kumar G

33. COVID-19 Tracker 217-222


Malarvizhi N, Arun Kumar Dash, Aswini J and Manikanta V

34. Smart Vision Goggles for Blind People 223-230


Manne Sowmya, Swathi N, Raja Sri A, Sai Prakash M and Jayadeep K

35. Data Science based Recommendation System -An Application of Computer 231-236
Science
Zeba Khan and Abdul Rahman

36. Mobile Malware Attacks, Classification, Propagation, Analysis, Detection, 237-243


Challenges and Future Directions – A Survey
Senthilkumar B, Sujithra M and Mani Barathi S P S

37. A Survey on Hyperspectral Sensing Techniques for Identification of Fake 244-253


Pharmaceuticals Medicines
Pravin V Dhole, Vijay D Dhangar, Sulochana D Shejul and
Bharti W Gawali

38. Electrical Design of Off-Road Electric Vehicle 254-259


Lipika Nanda, Nikita Lahon, Arjyadhara Pradhan, Babita Panda,
Chitralekha Jena and Sourav Kumar Satpathy
39. PM based Eddy Current Braking for Automobile Applications 260-268
Vinayak C Magadal and Mrityunjaya Kappali

40. Skin Disease Identification using online and Offline Data Prediction using 269-275
CNN Classification
Minakshi M Sonawane, Ali Albkhrani, Bharti W. Gawali, Ramesh R Manza
and Sudhir Mendhekar

41. Automatic Rail Track Inspection System 276-281


Chandrasekhar D, Sumalatha A, Poojitha K, Neeraj Manikanta Sai G and
Kavya K

42. Automatic Industrial Gas Leakage Detection and Control System 282-289
Teja Sai Ethesh, Swathi N, Sai Vamsi Reddy, Anusha N and
Yashwanth Krishna A

43. Polymer Conducting Nanocomposite Film to Improve Electromagnetic 290-294


Compatibility of Electronic Devices
Vikas Rathi, Brijesh Prasad, Varun Mishra, Hemant Singh Pokhariya and
Himanshu Pal

44. Floating Sun Tracking Solar Panel 295-302


Anjani Pujitha PSVL, Umamaheswari K, Karthik A, Harshavardhan J and
Gopi Sivanadh D V S

45. Application of Grey Wolf Optimization Algorithm for Improving Inertia 303-308
Constant Selection in Wind Farm Deployments
Deepesh Bhati and Sandeep Bhongade

46. Negative Emotion Detection using ECG and HRV Features 309-316
Sindhu N and Jerritta S
Grenze International Journal of Engineering and Technology, June Issue

Encoder-Decoder Approach toward Vehicle


Detection
Pushan Deb1, Sunny Kumar2 and Dr Preethi N3
1-3
Department of Data Science, Christ University Lavasa, Pune, India
Email: [email protected], [email protected], [email protected]

Abstract—Vehicle detection algorithms run on deep neural networks. A problem arises when the vehicle scale keeps
changing: detections may be false, or sometimes missing entirely, especially when the object size is tiny.
Algorithms such as CNN, Fast R-CNN, and Faster R-CNN then have a high probability of missed detection. To tackle
this situation, the YOLOv3 algorithm is used. In the codec module, a multi-level feature pyramid is added to
resolve multi-scale vehicle detection problems. The experiment was carried out with the KITTI dataset and showed
high accuracy in several environments, including on tiny vehicle objects. YOLOv3 was able to meet the application
demand, especially in traffic surveillance systems.

Index Terms— Surveillance video, vehicle detection, codec, convolutional neural network,
YOLOv3, moving object detection and tracking.

I. INTRODUCTION
Road accidents and crime are increasing day by day, and an intelligent road monitoring system is becoming the
need of the hour. Vehicles need to be identified by their license plates so that further investigation can be made
on a specific vehicle to identify the driver and provide proper evidence to law enforcement [1]. Because of the
increasing number of network cameras, locally produced visual data, and internet users, it is difficult yet
essential to analyze a large amount of background data at once. Moving object detection (MOD) is a technique
for extracting dynamic foreground elements, such as moving pedestrians or automobiles, from video frames while
removing the background that isn't moving. Due to the recent success of convolutional neural networks (CNNs),
there is a great deal of interest in deep learning-based object detection algorithms, and numerous models
have achieved cutting-edge results [2]. In particular, as opposed to hand-crafted methodologies, deep
learning techniques that utilize proposal generation methods like MultiBox, DeepBox, and region proposal networks
(RPNs) can provide fewer candidates of superior value.
The YOLOv3 algorithm is a model pre-trained on the COCO dataset, with over 80 classes that it can detect at an
mAP of 69% under sunny weather conditions. As YOLOv3 is pre-trained, detection is very fast and can be
implemented in live traffic conditions. As shown in Fig. 1, the distance of an object and the apparent size of the
vehicle are inversely related. This may lead to incorrect detection, or even no detection, in some contexts.
This paper improves the YOLOv3 network to address this problem. Given that detectors like SSD, YOLOv3, and
FPN all use feature pyramid structures at the detection stage, this study proposes a novel multi-level feature
pyramid structure, introduced into the codec module, to recognise vehicle targets at various scales. The multilayer
features retrieved by the backbone network are first merged into base features. These base features are then
transmitted through the codec module, emerging from its decoder layer as the features of the detection object. We
finally combine the multi-level features of the backbone network with the equivalent scales at the decoder layer to
produce a feature pyramid for target identification. In Section II, related works are explained along with the main
algorithms. In Section III, the proposed methodologies and the various methods applied are explained. In Section
IV, the conclusion and further work are given.

II. RELATED WORKS


A. Fully Supervised Object Detection
The fully supervised technique, which employs bounding-box annotation for object recognition, may be split into
two primary categories: single-stage detectors and multiple-stage detectors. The most popular single-stage object
detection technique is YOLO [1], which uses a Darknet architecture for real-time (30 fps) grid-based
object recognition and predicts the centre of the object as well as the width and height of the bounding boxes for
each grid cell. For a later version of YOLO, also known as YOLO9000, the author proposed Darknet-19, which,
inspired by ResNet [3], offers quicker detection (100 fps) by adding feature concatenation from the preceding
layer. However, YOLO and YOLOv2 have trouble detecting tiny objects. For YOLOv3 [4], the author suggested
Darknet-53, which is a bit slower (45 fps) than YOLOv2 but significantly more effective at identifying small
objects, since it applies detection at three separate scales. Similar to YOLO, the single-shot multibox detector
(SSD) is a grid-based real-time (59 fps) detection technique.
B. Object Counting
Object counting is counting the items in a scene that are visible to a model. Numerous techniques for counting
items are available, based on clustering, detection, or regression. Counting methods based on clustering are
frequently used; these methods usually follow an unsupervised learning pipeline that divides features into
categories based on an object's appearance. For instance, Tu et al. [13] employed expectation-maximization to
count persons based on their shoulders and faces. Rabaud and Belongie published a counting method based on
observing feature points over time. Since they often need a video sequence to do so, clustering-based algorithms
find it difficult to follow feature points in a still picture.
C. The Combined Loss Function
Mean-squared-error regression and binary cross-entropy classification are combined to create the final loss
function for training.
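As an illustration, a minimal PyTorch sketch of such a combined objective might look as follows; the tensor names
and the plain unweighted sum are assumptions rather than the authors' exact formulation.

import torch.nn.functional as F

def combined_loss(class_logits, class_targets, count_preds, count_targets):
    # binary cross-entropy over the multi-label classification head
    cls_loss = F.binary_cross_entropy_with_logits(class_logits, class_targets)
    # mean squared error over the instance-count regression head
    reg_loss = F.mse_loss(count_preds, count_targets)
    return cls_loss + reg_loss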
D. Bounding Box Formation Technique
The preceding subsections discussed the structure of the proposed network and its training loss function. This
subsection describes how a bounding box is created; a short sketch of the procedure follows the list below. Class
activation maps (CAMs) and regression activation maps (RAMs) are used during the inference phase. As previously
mentioned, two more 1×1×N convolutional layers are added in order to generate the CAMs and RAMs for the test
picture, where 1×1 signifies the kernel size and N is the number of classes.
1) The predictions from the classification output are thresholded to check whether a class instance is
present. Empirically, the best value of this threshold is 0.5, since multi-label categorization is
frequently binary.
2) If a class has a value greater than the threshold, the regression predictions are used to calculate the
number of class instances for an item. At this point, further thresholding is carried out in order to
determine the precise (integer) number of instances. If the regression forecast for the class's instance
count is larger than 0.5, the class is considered to have one instance (a threshold chosen based on the
standard mathematical rounding convention, in which values less than 0.5 are mapped to zero and
values greater than 0.5 are mapped to one).
3) Once the number of occurrences of each class in the picture has been established, the regression
activation map (a 28×28 grid) is normalized by the number of objects in order to eliminate the noise
blobs. The largest of the remaining blobs is then chosen as the proper single-cell center. Because RAMs
are learned by global average pooling, this RAM filtering method works well: pooling produces high
activations for the region corresponding to the visual characteristic of the class with the highest level
of activation, and this one-cell activation is taken as the center of the object.
4) The next step is to create the bounding box using the thresholded CAMs and then perform non-maximum
suppression (we used 1.0, 0.9999, 0.999, 0.99, and 0.9 for each set of classes). The extents of the objects
are found in the CAMs, and they can be scattered across the map as multiple little portions, or they can
overlap. There are four types of class activations that can affect how the bounding box develops, based
on the number of instances of each existing class:
• If a class has exactly one instance, there is just one instance of the item in the related CAM (a 28×28
grid). All of the activated cells are counted. The locations of the topmost and leftmost activated cells
on the vertical and horizontal axes of the grid are taken as the top-left corner of the bounding box,
and the locations of the bottommost and rightmost as the bottom-right corner, regardless of the shape
of the region in the CAM (i.e., whether it is unitary or fragmented in the CAM's grid).
• If there are several instances of a class and the number of instances equals the number of distinct
regions in the CAMs, then each centre has a corresponding instance represented by a connected area
in the CAM. Each region's bounds are determined by the minimum and maximum indices of its grid
positions.
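The following NumPy/SciPy sketch illustrates steps 1-4 and the first two activation cases above; the 0.5
thresholds follow the text, while the array names and the CAM threshold value are illustrative assumptions.

import numpy as np
from scipy import ndimage

def boxes_from_cams(scores, counts, cams, cam_thresh=0.9):
    # scores: (N,) class probabilities; counts: (N,) predicted instance counts;
    # cams: (N, 28, 28) class activation maps, one per class
    boxes = {}
    for c in range(len(scores)):
        if scores[c] <= 0.5:                  # step 1: class-presence threshold
            continue
        n_inst = int(np.round(counts[c]))     # step 2: round to an integer count
        if n_inst < 1:
            continue
        mask = cams[c] >= cam_thresh          # step 4: threshold the CAM
        if not mask.any():
            continue
        labels, n_blobs = ndimage.label(mask)
        if n_inst == 1:
            # single instance: one box spanning all activated cells
            ys, xs = np.nonzero(mask)
            boxes[c] = [(xs.min(), ys.min(), xs.max(), ys.max())]
        elif n_blobs == n_inst:
            # one connected region per instance: box each region separately
            boxes[c] = []
            for b in range(1, n_blobs + 1):
                ys, xs = np.nonzero(labels == b)
                boxes[c].append((xs.min(), ys.min(), xs.max(), ys.max()))
    return boxes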

Fig.1. Vehicle Detection Model Network Structure.[3]

III. METHODOLOGY
A. YOLOv3
YOLOv3 (You Only Look Once, version 3) is a real-time object detection system that recognizes particular
objects in videos, live feeds, or still photos. To find objects, YOLO employs features learned by a deep
convolutional neural network.
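Purely for illustration, YOLOv3 inference can be run with OpenCV's DNN module roughly as below; the file names
and thresholds are placeholders, and this generic sketch is not necessarily the authors' exact pipeline.

import cv2
import numpy as np

# load the pre-trained Darknet weights and network definition (placeholder paths)
net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
out_names = net.getUnconnectedOutLayersNames()

img = cv2.imread("traffic.jpg")
h, w = img.shape[:2]
# 416x416 input, scaled to [0, 1], BGR->RGB, no cropping
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)

for out in net.forward(out_names):        # one output per detection scale
    for det in out:                       # det: [cx, cy, bw, bh, obj, class scores...]
        scores = det[5:]
        cls, conf = int(np.argmax(scores)), float(np.max(scores))
        if conf > 0.5:
            cx, cy, bw, bh = det[:4] * np.array([w, h, w, h])
            print(cls, conf, cx - bw / 2, cy - bh / 2, bw, bh)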
Darknet-53: Darknet-19 is the network architecture that YOLOv2 uses. It has 24 layers in total, including 19
convolutional layers (hence the name Darknet-19) and 5 max-pooling layers. Because fine-grained information is
lost during input down-sampling, YOLOv2 is not particularly good at recognizing small targets. In order to get
low-level features, YOLOv2 uses identity mapping to connect feature maps from the preceding layer. YOLOv3
instead uses the deeper Darknet-53 backbone.
Detection at Three Scales: YOLOv3 generates predictions at three precisely defined scales, at which the input
image size is down-sampled by 32, 16, and 8, respectively. The 82nd layer is in charge of making the initial
prediction: the network lowers the visual resolution through the first 81 layers until the 81st layer's stride is
32. Starting with a 416×416 image, the resulting feature map is 13×13. Here we employ a 1×1 detection kernel to
produce a detection feature map with dimensions of 13×13×255. The layer-79 feature map is then up-sampled by a
factor of two to a dimension of 26×26 after passing through several convolutional layers, and this feature map is
thoroughly concatenated with the feature map of layer 61.
Anchor Boxes: YOLOv3 uses 9 anchor boxes altogether, three at each scale. If you train YOLO on your own dataset,
you must create 9 anchors using k-means clustering.
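A sketch of this anchor-generation step with scikit-learn's k-means is given below; real YOLO tooling typically
clusters with a 1 - IoU distance rather than the Euclidean distance used here, and the function name is
hypothetical, so this conveys the idea only.

import numpy as np
from sklearn.cluster import KMeans

def compute_anchors(box_wh, k=9):
    # box_wh: (N, 2) widths and heights of the ground-truth boxes
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(box_wh)
    anchors = km.cluster_centers_
    # sort by area so the three smallest anchors go to the finest detection scale
    return anchors[np.argsort(anchors.prod(axis=1))]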
Additional Bounding Boxes: YOLOv3 predicts more bounding boxes than YOLOv2 for input photos of the same size.
At YOLOv2's native resolution of 416 × 416, for instance, 13 × 13 × 5 = 845 boxes are predicted: five boxes in
each grid cell, using five anchor points. The predictions are given below [5]:
b_x = \sigma(t_x) + c_x    (i)
b_y = \sigma(t_y) + c_y    (ii)
b_w = p_w e^{t_w}    (iii)
b_h = p_h e^{t_h}    (iv)
where (t_x, t_y, t_w, t_h) are the network outputs, (c_x, c_y) is the offset of the grid cell, and (p_w, p_h)
are the anchor (prior) dimensions.
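In code, equations (i)-(iv) amount to the following small NumPy helper, with names mirroring the symbols above:

import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    bx = sigmoid(tx) + cx      # (i)  box centre x, offset within cell (cx, cy)
    by = sigmoid(ty) + cy      # (ii) box centre y
    bw = pw * np.exp(tw)       # (iii) width, scaled from anchor prior pw
    bh = ph * np.exp(th)       # (iv)  height, scaled from anchor prior ph
    return bx, by, bw, bh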
Softmax Abandoned: YOLOv3 assigns multiple labels to objects found in photos. Previously, YOLO applied a softmax
to the class scores and took the class with the greatest score as the class of the object enclosed by the bounding
box. This was changed in YOLOv3, which instead uses independent logistic classifiers per class.
Loss Function
Loss = \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} [(x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2]
+ \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} (2 - w_i \times h_i) [(w_i - \hat{w}_i)^2 + (h_i - \hat{h}_i)^2]
- \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} [\hat{C}_i \log(C_i) + (1 - \hat{C}_i) \log(1 - C_i)]
- \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{noobj} [\hat{C}_i \log(C_i) + (1 - \hat{C}_i) \log(1 - C_i)]
- \sum_{i=0}^{S^2} \mathbb{1}_{i}^{obj} \sum_{c \in classes} [\hat{p}_i(c) \log(p_i(c)) + (1 - \hat{p}_i(c)) \log(1 - p_i(c))]    (v)
Here S^2 is the number of grid cells, B the number of boxes per cell, \mathbb{1}_{ij}^{obj} indicates that box j
of cell i is responsible for an object, C_i is the predicted objectness, p_i(c) the predicted class probability,
and hatted quantities are the ground truth.
B. Fast R-CNN
By categorizing object proposals with a deep ConvNet, the region-based convolutional network approach (R-CNN)
provides excellent object detection accuracy. On the other hand, R-CNN has substantial disadvantages: training
involves several stages. First, a ConvNet is fine-tuned on the object proposals using log loss. Then SVMs are
fitted to the ConvNet features; these SVMs take the place of the softmax classifier that was trained through
fine-tuning and act as the object detectors. In a third training step, bounding-box regressors are learnt. Fast
R-CNN streamlines this into a single-stage training procedure.
C. Faster R-CNN
Faster R-CNN is a step up from Fast R-CNN. As the name suggests, Faster R-CNN is faster than Fast R-CNN, thanks
to the region proposal network (RPN). The RPN is a fully convolutional network that produces proposals with
various sizes and aspect ratios; it tells the detection network (Fast R-CNN) where to look.
Anchor boxes were proposed in this study as an alternative to pyramids of images (several instances of the same
image at various scales) or pyramids of filters (i.e., multiple filters with varied sizes). An anchor box is a
reference box with a specific scale and aspect ratio. The same region can be covered at multiple sizes and aspect
ratios if there are many reference anchor boxes; this may be compared to a pyramid constructed of reference anchor
boxes. The detection of objects with different scales and aspect ratios is made possible by mapping each region to
a distinct reference anchor box.
D. Mask R-CNN
Mask R-CNN, sometimes written Mask RCNN, is the most advanced convolutional neural network (CNN) for instance
and picture segmentation. Faster R-CNN, a region-based convolutional neural network, served as the foundation
for Mask R-CNN. Knowing the idea of image segmentation is a prerequisite for understanding how Mask R-CNN
operates: in computer vision, the technique of dividing a digital image into several parts (sets of pixels, also
known as image objects) is called image segmentation. This segmentation locates objects and boundaries (lines,
curves, etc.).
E. Pyramid Network
With the accuracy and speed of the pyramid notion in mind, a feature extractor known as the Feature Pyramid
Network (FPN) was developed. It replaces the feature extractor of detectors like Faster R-CNN for object
detection and creates several feature-map layers (multi-scale feature maps) with higher-quality information than
the traditional feature pyramid.
Dataset: The datasets used were the KITTI dataset, the COCO dataset, and our own data collected from live city
traffic. The KITTI dataset consists of snippets obtained from a 360-degree camera, recorded on highways and rural
roads around Karlsruhe. The dataset almost reads as if the video were surveillance footage.
Analysis: The pictures shown beneath the traffic footage are quite similar to real images actually taken on the
road by a car-mounted camera and included in the KITTI data collection. Under the three distinct KITTI difficulty
settings, as shown in Table I, the AP of the algorithm utilized in this work is 95.04%, 92.39%, and 87.51%,
respectively. These outcomes outperform the original YOLOv3 by 2.49%, 3.68%, and 9.73%, respectively, because the
bottom convolutional feature map in the original YOLOv3 detection model is up-sampled only once by the top stage
before feature stitching.

TABLE I. AVERAGE PRECISION AT THREE DIFFICULTY LEVELS ON THE KITTI DATASET

Algorithm        AP Easy (%)   AP Moderate (%)   AP Hard (%)   Time
R-CNN            32.23         26.04             20.93         -
Faster R-CNN     87.90         79.11             70.19         142
YOLOv3           95.04         92.39             87.51         34

Fig 3.a & 3.b. P-R Diagram in YOLOv3 and Faster R-CNN in three difficulties

Fig.4. P-R Diagram in R-CNN in three different Difficulties


The above graphs show that the model worked with higher accuracy and speed with the YOLOv3 model than with
Faster R-CNN or R-CNN. The self-collected images also showed the same results with the YOLOv3 model. The
difficulty levels set for the images were easy, moderate, and hard, as shown in Figs. 3a and 3b. The difficulties
differed according to several properties, such as rainy, sunny, cloudy, dark, bright, or distorted images. At all
difficulty levels, YOLOv3 gave very high accuracy.

Fig 5a & 5b. Vehicle Detection YOLOv3

IV. CONCLUSION
In this study, the YOLOv3 network model is applied to the problem of vehicle recognition in traffic surveillance
videos. It was shown that during the actual detection phase, small-scale vehicles frequently go undetected. To
efficiently and effectively construct multi-scale features that can adapt to the identification of multi-scale
target vehicles, we present a unique feature pyramid module built on the basis of YOLOv3 and based on encoding
and decoding. After testing on the KITTI dataset, the impact has increased: good detection results have been
reached for vehicle targets of various sizes, especially for the identification of very small targets. The
accuracy is significantly better than that of the original YOLOv3 algorithm and can better meet the requirements
of practical applications.

REFERENCES
[1] Bingxin Hou et al., "A Fast Lightweight 3D Separable Convolutional Neural Network with Multi-Input
Multi-Output for Moving Object Detection," 2021.
[2] Yiping Gong et al., "Context-Aware Convolutional Neural Network for Object Detection in VHR Remote Sensing
Imagery," 2020.
[3] Feng Hong et al., "A Traffic Surveillance Multi-Scale Vehicle Detection Object Method Based on
Encoder-Decoder," 2020.
[4] Hong-Mei Sun and Rui-Sheng Jia, "Finding every car: a traffic surveillance multi-scale vehicle object
detection method," Springer Science+Business Media, LLC, part of Springer Nature, 2020.
[5] L. Shi, F. Zhang, J. Xia, J. Z. Xie and R. Liu, "Identifying Damaged Buildings in Aerial Images Using the
Object Detection Method," Remote Sens., vol. 13, 4231, 2021.
[6] Xiu-Zhi Chen, Chieh-Min Chang, Chao-Wei Yu and Yen-Lin Chen, "A Real-Time Vehicle Detection System under
Various Bad Weather Conditions Based on a Deep Learning Model without," published 9 October 2020.
[7] Dinesh Rajan, Brett Story and Xinxiang Zhang, "Night Time Vehicle Detection and Tracking by Fusing Sensor
Cues from Autonomous Vehicle," May 2020.
[8] Kun Wang and Maozhen Liu, "YOLOv3-MT: A YOLOv3 using multi-tracking for vehicle visual detection,"
30 April 2021, published online 4 June 2021.
[9] Xiaotao Shao, Caike Wei and Yan Shen, "Feature Enhancement on CycleGAN for Nighttime Vehicle Detection,"
November 27, 2020, accepted December 15, 2020.
[10] Mohammed Rabah and Ali Rohan, "Heterogeneous Parallelization for Object Detection and Tracking in UAVs,"
February 7, 2020, accepted February 19, 2020, date of current version March 11, 2020.
[11] Giha Yoon, Gen-Yong Kim and Hark Yoo, "Implementing Practical DNN-Based Object Detection Offloading
Decision for Maximizing Detection Performance of Mobile Edge Devices," date of publication October 8, 2021.
[12] Ye Tao, Zho Zongyang and Chai Xinghua, "Low-altitude small-sized object detection using lightweight
feature-enhanced convolutional neural network," Journal of Systems Engineering and Electronics, vol. 32, no. 4,
August 2021, pp. 841-853.
[13] P. Tu, T. Sebastian, G. Doretto, N. Krahnstoever, J. Rittscher and T. Yu, "Unified crowd segmentation,"
Computer Vision.

Grenze International Journal of Engineering and Technology, June Issue

Home Automation with Node MCU & Firebase using


Internet of Thing (IoT)
Aarya Pawar1, Pratham Khinvsara2, Revant Pund3, Tushar Raikar4, Rishikesh Dayma5 and Nishant Kulkarni6
1-6
Dept. of Mechanical Engg, Vishwakarma Institute of Technology, Pune-37
Email: [email protected], {[email protected], [email protected]}

Abstract—The Internet of Things is made up of objects with unique identities that are linked to one another
online. The idea is simply to connect and keep an eye on numerous sensors and pieces of equipment via the
Internet, which is widely used in this new era. This paper primarily presents a general overview of IoT-based
sensing and monitoring systems, which leverage databases and software to construct smart, automated household
appliances around a Node MCU board; an Android OS smartphone is used for remote control over the internet. The
Node MCU, the core part and brain of this system, can serve as an interface between a wide range of hardware
parts and the real-time database. The system provides many cutting-edge switching features that turn connected
household equipment such as lights and fans on and off by sensing and analyzing data. The cloud-based
notification element is another part of this system architecture. The main feature of the system is that it can
also be controlled from remote areas, which can contribute towards energy saving.

Index Terms— Node MCU, Flutter, Sensors, Voice Control, Firebase.

I. INTRODUCTION
In the IoT (Internet of Things), devices communicate with each other, and IoT devices can share content based on
function control in a predefined manner. This project focuses on the use of the cloud to operate home appliances
over the internet, even from remote areas.

Fig. 1 Illustration of IoT

This article explains how home appliances can be controlled in a secure way even from remote areas. A single
device can be linked to several other devices.
Wireless fidelity (Wi-Fi) technology is used to connect the network; the frequency band that has been formally
agreed upon [3] is 2.4 GHz. The Node MCU used in this system must be connected to the same Wi-Fi network as the
other appliances. Fig. 1 above represents the flowchart of the system. The project consists of a Flutter app,
which is platform-independent and allows the user to change the states held in a real-time database provided by
Firebase. A connection is established between the Node MCU and the Firebase database, where the changes made in
the database are reflected.

II. RELATED WORK


Emerging technologies are playing a significant role in automating human existence, now and in the future. In
our fast-paced society, people are enamored with the internet and automated technology, and they are largely
reliant on them. As a result, automated houses or smart homes have become a buzzword, and their adoption is
expanding fast. Smart homes should have a secure connection as well as communication with physical devices over
the internet. We learned a lot from excellent studies in papers on smart home automation and the various designs
used in the literature. Some of the existing designs that were utilized are detailed here. Kumar Mandala
implemented home automation in two ways in his work, employing Bluetooth and Ethernet. Arduino is used there to
programme and control numerous gadgets. Bluetooth is a short-range communication technology; as a result, with
smart home automation that uses Bluetooth, one can only activate the gadgets from within a distance of 10-20 m
of the home. This constraint has been solved in the Ethernet-based architecture. That study only described how to
operate various electrical gadgets in the home using mobile applications and did not contain any security
aspects.

III. PROPOSED SYSTEM


As the cost of items in our daily lives rises owing to technological advancements, a little notion known as the
smart home project is launched to minimize costs and inconvenience while also providing energy-saving solutions.
A smart house is capable of controlling the home even when the owner is not present [6]. The IoT system may be
constructed by combining an MCU with additional components such as a DC motor, an L293D driver, and USB cables,
together with software such as Firebase, to operate various household appliances such as fans, lights, and lamps.
It provides status updates on the mobile app [7]. Data is uploaded to the cloud via Firebase, where it may be
stored and accessed as required. The suggested system and its operation are explained in detail in Fig. 2.

Fig. 2 Schematic of Home automation using node MCU

Various components like the DC motor, L293D, etc. are connected to the Node MCU microcontroller. These
components help in acquiring data from the surroundings, including the state of the appliances, and this
collected data is sent from the microcontroller to the database. Users can access as well as update this data
anytime, from anywhere, using an Android app developed with Flutter, which uses the Dart programming language.
Using this application, the states of the appliances in the database can be updated. The microcontroller (Node
MCU) fetches the data from the database and reflects the updated state in the Arduino program. This program then
executes or operates the various appliances based on the updated conditions in the database. The state of the
appliances can be updated using buttons or voice commands through the application.
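The firmware here is described as an Arduino program; to keep all examples in one language, the sketch below
shows an equivalent MicroPython polling loop on the ESP8266, with the Wi-Fi credentials, database URL, JSON keys,
and pin numbers all assumed for illustration.

import time
import network
import urequests
from machine import Pin

lamp = Pin(5, Pin.OUT)    # relay/LED on GPIO5 (D1) - assumed wiring
fan = Pin(4, Pin.OUT)     # L293D enable on GPIO4 (D2) - assumed wiring

# join the 2.4 GHz Wi-Fi network mentioned above (placeholder credentials)
wlan = network.WLAN(network.STA_IF)
wlan.active(True)
wlan.connect("HOME_WIFI", "password")
while not wlan.isconnected():
    time.sleep(0.5)

DB_URL = "https://example-home.firebaseio.com"   # placeholder Firebase project URL

while True:
    # read the appliance states written by the app, e.g. {"lamp": 1, "fan": 0}
    resp = urequests.get(DB_URL + "/appliances.json")
    states = resp.json()
    resp.close()
    lamp.value(states.get("lamp", 0))
    fan.value(states.get("fan", 0))
    time.sleep(1)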

IV. IMPLEMENTATION SETUP
Components required:
Hardware requirements:
1) Node MCU ESP8266
2) DC motor
3) L293D
4) Male-to-female jumper wires
5) Connecting wires
6) Breadboard
7) LED
8) Fan
9) Mobile phone
10) Power supply
Software requirements:
1) Android Flutter app
A. Node MCU ESP8266
The Node MCU ESP8266 is an open-source Internet of Things (IoT) platform that comprises both software and
hardware. It is powered by Espressif's ESP8266 Wi-Fi System-on-Chip (SoC), and the hardware is contained in the
ESP-12 module. The ESP8266 combines a 32-bit CPU, an antenna, switches, filters, a power amplifier, power
management modules, and standard digital peripheral interfaces in a small and simple package [8]. The ESP8266 is
a low-cost microprocessor built around a Tensilica Xtensa LX106 core and is used in a variety of IoT
applications. This microcontroller is connected to the Wi-Fi network continuously, which helps in fetching the
states of the appliances from the database. The different appliances are connected to this controller.

Fig.3 ESP8266 Module


B. DC Motor
A DC motor is a type of rotary electric machine that transforms direct-current (DC) electrical energy into
mechanical rotation. The most prevalent forms are based on magnetic-field forces. We have used a DC motor to
create a prototype of the rotating fan. The motor was controlled using an L293D motor driver: when a high input
is given, the motor rotates at a certain default speed in a particular direction, and the direction of rotation
can be reversed by swapping the high-voltage and low-voltage inputs.

Fig.4 DC Motor

C. L293D
The L293D is a standard motor-driver IC that allows a DC motor to be driven in either direction. It is a 16-pin
integrated circuit that can operate two DC motors in any direction at the same time: a single L293D IC may drive
two DC motors, both small and large. It is based on the H-bridge concept: an H-bridge circuit enables voltage to
flow in either direction. Because voltage must change direction in order to rotate the motor clockwise or
anticlockwise, H-bridge ICs are ideal for controlling a DC motor. Fig. 5 shows the L293D motor driver. We have
used an L293D motor driver to control a DC motor to which we have attached a fan; this prototypes the switching
of a fan's on/off states.

Fig .5 – L293D Motor Driver

D. Breadboard
To build a prototype, mount our components in one place for connection purposes, and create a proper circuit, we
have used a breadboard. Many electrical components in electronic circuits can be coupled by placing their leads
or terminals into the holes and then connecting them with wires as needed. The breadboard contains metal strips
beneath it that link the holes on the top of the board.

Fig. 6- Breadboard

E. LED
The light-emitting diode (LED) is a common standard light source in electronics. It can be used in various
applications, such as mobile phones or large advertising billboards, and LEDs are widely used in devices that
display time and various other types of data. Aviation, illumination, fairy lights, car headlights, marketing,
general lighting, traffic lights, camera flashes, lit wallpaper, horticulture grow lights, and medical supplies
are just a few of the uses for LEDs.

Fig .7- LEDs

F. Android Flutter App


We designed our own application using the development platform provided by Flutter. The application is connected
to the Firebase server, through which a connection for transmission and reception is established between the
microcontroller and the user, as shown in Fig. 11. This Android Flutter application can be developed by the user.
The user must sign in to the application to access the states of all the home appliances and switch their
respective states on or off through the app.
The Android Flutter app is in charge of generating an interface for user interaction, providing a dashboard from
which the user may operate. The app is connected to the hardware, and the connection between them is facilitated
via the Firebase server. The hardware components can communicate with the server with the help of commands
defined in the Firebase libraries.

Fig. 8 Representation of project using Android Mobile App

G. Firebase
Google Firebase is application development software provided by Google that enables developers to develop iOS,
Android, and web apps. Firebase offers capabilities for measuring statistics, reporting and troubleshooting app
problems, and generating marketing and product experiments.
By giving safe access to the database directly from client-side code, the Firebase Realtime Database enables you
to develop complex, collaborative apps. Data is saved locally, and real-time events continue to fire even when
the user is offline, providing a highly responsive experience.

Fig. 9– Firebase
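The app itself is written in Dart/Flutter; as a language-neutral illustration of the write path, the snippet
below uses plain Python and the Realtime Database REST API to flip an appliance state. The URL and key names are
assumptions matching the NodeMCU sketch above.

import requests

DB_URL = "https://example-home.firebaseio.com"   # placeholder project URL

def set_state(appliance, on):
    # PUT /appliances/<name>.json with 1 or 0; the NodeMCU polls the same path
    requests.put(f"{DB_URL}/appliances/{appliance}.json", json=1 if on else 0)

set_state("fan", True)   # the polling loop on the NodeMCU then switches the fan on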

V. RESULTS AND DISCUSSIONS


The main objective of developing the home automation system was to operate home appliances from remote locations
using a secure smartphone application, developed using the Dart language and the Flutter IDE, which also allows
cross-platform usage. A fine working Flutter application was successfully developed which is not limited to
controlling the appliances using buttons but also allows the user to control them by giving voice commands as
input. All the members of a family can share this Android Flutter app, so that when one person in the family
switches a device (a fan, a light, or any other appliance), the change is visible to everyone. From this project
we can confidently say that a home automation system can be made with readily and cheaply available components
and can be used to operate multiple appliances such as lamps, televisions, and the whole house lighting system.
It is evident that this can be built easily, as the components required are so small that they can be packed into
a container. A home automation framework is extremely important and useful in this emerging digital world. As
innovation progresses so rapidly, ordinary people have ready access to capabilities that can do some really
astonishing things. By interconnecting the user's gadgets with the internet, Wi-Fi, or other advanced gadgets,
the user's home becomes progressively effective at heating, cooling, lighting, and, well, running.

Fig .10– Real-time Database created by Google firebase

Fig. 11 – On/Off Buttons in Android Flutter App for turning on/off lamp, fan, light

Fig. 12– Button for Voice Command for turning Home appliance on/off

Fig .13 – Smart Home Automation Setup with Overall Functioning

VI. CONCLUSION
With the help of IoT technology, we get a basic idea of how home appliances can be controlled. As long as the
user is connected to the internet, this prototype will help the user control his or her home appliances
irrespective of location. The GUI system we have created allows the user to easily control the appliances using
a smartphone connected to the internet; whenever any change occurs, the user is notified immediately and can
control the appliances using the provided GUI interface. The microcontroller we have used is the Node MCU, which
acts as an interface between the components and the user. The Node MCU is connected to several appliances such
as lights, fans, and lamps. To establish an application layer for connection with a remotely located user, we
have used a micro web server. Communication between the user and the appliances is possible through the
internet. Notifications are delivered to users via the Android Flutter app, which is installed on their
smartphones. Users may remotely operate or automate household appliances by utilising components such as the
NodeMCU, DC motor, L293D, Firebase, and so on. Together, these components help in building a remotely
controllable smart home automation system through which the lights and fan can be switched on and off.

REFERENCES
[1] J. Lertlakkhanakul, J.W.Choi and M. Y.Kim, Building Data Model and Simulation Platform for Spatial Interaction
Management in Smart Home, Automation in Construction, Vol. 17, Issue 8, November 2008, pp. 948- 957

[2] A. R. Al-Ali and M. AL-Rousan, Java-based Home Automation System, IEEE Transactions on Consumer Electronics,
Vol. 50, No. 2, May 2004
[3] R. J. C. Nunes and J. C. M. Delgado, An Internet Application for Home Automation, 10th Mediterranean
Electrotechnical Conference, MeleCon 2000, Vol. I, pp. 298-301
[4] D. H. Stefanov and Z. Bien, The Smart House for Older Persons and Persons with Physical Disabilities: Structure,
Technology Arrangements, and Perspectives, IEEE Trans- actions On Neural Systems And Rehabilitation Engineering,
Vol. 12, No. 2, June 2004, pp. 228-250
[5] C. Douligeris, Intelligent Home Systems, IEEE Communications Magazine,Vol. 31, Issue 10, October 1993, pp. 52-61
[6] Y.-J. Mon, C.-M. Lin and I. J. Rudas, Wireless Sensor Network (WSN)Control for Indoor Temperature Monitoring,
Acta Polytechnica Hungarica,Vol. 9, No. 6, 2012, pp. 17-28
[7] E. N. Ylmaz, Education Set Design for Smart Home Applications, Computer Applications in Engineering Education,
Vol. 19, Issue 4, December 2006, pp.
[8] N. Sriskanthan and Tan Karand. “Bluetooth Based Home Automation System”. Journal of Microprocessors and
Microsystems, Vol. 26, pp.281-289, 2002.
[9] E. Yavuz, B. Hasan, I. Serkan and K. Duygu. “Safe and Secure PIC Based Remote Control Application for Intelligent
Home”. International Journal of Computer Science and Network Security, Vol. 7, No. 5, May 2007.
[10] Amul Jadhav, S. Anand, Nilesh Dhangare, K.S. Wagh “Universal Mobile Application Development (UMAD) On Home
Automation” Marathwada Mitra Mandal’s Institute of Technology, University of Pune, India Network and Complex
Systems ISSN 2224-610X (Paper) ISSN 2225-0603 (Online)Vol 2, No.2, 2012
[11] Rana, Jitendra Rajendra and Pawar, Sunil N., Zigbee Based Home Automation (April 10, 2010). Available at SSRN:
http://ssrn.com/abstract=1587245 http://dx.doi.org/10.2139/ssrn.1587245
[12] R.Piyare, M.Tazi “ Bluetooth Based Home Automation System Using Cell Phone”, 2011 IEEE 15th International
Symposium on Consumer Electronics.

Grenze International Journal of Engineering and Technology, June Issue

Open-Source Workforce Administration System using


Django
Ritika Rastogi1, Riya Gupta2 and Ayushi Agarwal3
1-2
ABES Engineering College, CSE Department, Ghaziabad, India
Email:[email protected], [email protected]
3
Assistant Professor, CSE Department, ABES Engineering College, Ghaziabad, India
Email:[email protected]

Abstract—This paper focuses on workforce management using the Django framework. This open-source software
application has widespread use, especially in small-scale organizations that cannot afford expensive software.
Every organisation, public or commercial, uses an information system to keep information about its employees.
However, it has been discovered that many small-scale enterprises in India still utilise paper and pen to
preserve records. Even though there are many sophisticated technology systems that can perform this function,
they are all too expensive for these low-level industries to afford. This paper addresses developing a method to
handle their challenges at a lower cost. Our workforce administration system has four views, namely HR, Employee,
Team Lead, and Fresher, based on the different categories of users of this software application. The several
functionalities in these four views make this framework not only employee-friendly but also help build a bond
between the company and the staff by ensuring smooth interaction between the two.

Index Terms— Open-Source, Analytics, Secure, User-friendly interface, CRUD Operations,


Cost Efficient, Interactive Dashboard.

I. INTRODUCTION
Employees are an organization's most important resource for growth and seamless operation. The documentation
previously completed to manage personnel was onerous and demanded a lot of time in addition to extra labor. It
could also result in conflicts. For instance, if the information of any employee was required, manual searching
would be necessary, which would take a lot of time. The information in this type of system was not secure, and
the registers' contents could be easily altered. As a result, a system was required to automate everything,
including the monitoring of attendance, the tracking of existing projects inside the companies, and the
methodical training of new hires. There are several systems on the market that can assist in carrying out these
tasks, but they are highly expensive and occasionally need to be handled by skilled experts. However, we are
going to open-source this programme so that it may readily operate on any machine after a few installations, and
using it would not require a lot of technical knowledge.
The system would include different portals for human resources, employees, trainees, and team leads. The human
resource portal, for example, would offer the ability to view employees, change their information, add new
hires, examine staff attendance, check the status of the organization's ongoing projects, and more. Similar to
this, the team lead, trainee, and staff would each
have unique features available in their respective portals. The organization would benefit from this
application's contribution to efficient operation. Additionally, this programme would be much more secure than
the software currently available in the market. We require a thorough system study before we can design this
system: it calls for a thorough understanding of how the sector operates, and we need to properly comprehend
databases and construct them so that they can meet all of the system's requirements. Users should have access to
a suitable GUI when using the software. The whole system has mainly four types of users: HR, Employee, Team Lead,
and Fresher, and we have assigned functionalities accordingly in each view. Some primary features of the HR
(admin) view are viewing the employee list and employee details, and executing other CRUD operations on them.
The Employee view includes reviewing personal information and making any necessary changes, marking attendance,
requesting leave, etc. Similarly, the Fresher view has features such as viewing their information and making any
necessary changes, marking their participation in training, submitting leave requests, and downloading reference
materials and study guides. Lastly, the Team Lead view provides functionalities such as viewing team details,
assigning daily tasks to each team member, and uploading any necessary documents or plans for the current
project.
This system will hence not only allow the users to perform basic tasks but will also help the organization
coordinate better with its employees and achieve positive results. It not only solves the issue of better
management of the employees but will also give the managers and owners of small-scale industries a
multi-functional dashboard to manage their business efficiently and at a minimal cost.
A. Motivation
The management of employees is still done manually on paper in many firms, especially small ones that cannot
afford the cost of sophisticated software. To keep track of attendance, record contact information, and so on,
these organizations maintain a variety of registers. If they need to look up information for a specific day,
employee, etc., it becomes exceedingly challenging. Furthermore, this kind of system makes it simple to mark
proxy attendance, which makes it difficult for businesses to manage their workforces and time-consuming to look
up information about them. As a result, we sought to create an application system that could quickly and simply
run on their machines after a few installations. This would also be affordable and would help automate the
system.
B. Scope
The project's primary goal is to develop a workforce administration system. The software application provides
four views, namely:
1. HR
2. Employee
3. Team Lead
4. Fresher
The system's capabilities include keeping each employee's information, adding new hires, and removing those who
leave the company. The system will also keep track of the employees' daily attendance as well as their absences.
Moreover, along with these functionalities, the system will make it easy for project teams and leads to interact
with each other and will make the work on a project and its progress very transparent. Thus, this application
will not only make the administration of employees easy but will also increase the performance rate of the
employees, as they are offered bonuses and coupons for good performance, which can be redeemed easily through
the dashboard. We have made use of Vroom's expectancy theory here, which states that people tend to put in
greater effort when they believe that their efforts will increase their performance, which in turn helps them
gain rewards and recognition. Thus, it can be clearly stated that rewarding employees for good performance
increases their productivity, motivates their peer employees, and develops a healthy work environment.

II. LITERATURE REVIEW


This section summarizes the papers that we followed for research and understanding.
While the literature review of previous works in Table I provides valuable information regarding existing
employee management systems, there are certain aspects that we analyzed for improvement and incorporated into
our proposed workforce administration system using Django.

TABLE I. LITERATURE SURVEY RELATED TO THE PROBLEM DEFINED

1) Employee Management System - Rishabh Bajpai, December 2020, International Journal for Modern Trends in
Science and Technology, 6(12): 225-234, 2020.
Techniques used: cloud-based data storage; Firebase handles security and provides free support for email
authentication; compatible with both Android and iOS.
Advantages: on-time salary calculation in just a click, helping strengthen the employer-employee relationship;
cheaper and easy to use; errorless calculations; prevents malpractice by employees.
Limitations: since labourers might not have smartphones, and many would not know how to use Android, it was a
great task to make a system that can be used widely, else it will not be useful; it is also difficult for
companies to use the system properly, because any mistake cannot be corrected later.

2) Employee Management System - Madya Ansari, Maviya Shaikh, Ansari Abdul Basit and Jigna Waghela, February
2018, International Journal of Scientific & Engineering Research, Volume 9, Issue 2.
Techniques used: HTML and CSS at the front end; PHP at the backend, with scripting languages such as JavaScript
and AJAX; MySQL for the database.
Advantages: time saving due to digital management in software, with very little manual intervention; secure data
storage; proper management of employee resources, leading to profit enhancement.
Limitations: has only 4 modules recording employee data in the database (workdays, salary, and provident fund
calculation); the employee view has very few functionalities, mostly view-only.

3) Employee Management System - Mr. Pratik Udayshankar Singh, Mr. Hemant Singh Fartyal, Mr. Khan Abdul Ahad
Zubair and Prof. Akshata Laddha, May 2019, International Research Journal of Engineering & Technology.
Techniques used: the application is actually a suite of applications developed using PHP; HTML, CSS, and PHP at
the front end and Microsoft SQL Server at the back end.
Advantages: transparency for all users of the system; less paper use and removal of redundancy; less prone to
errors.
Limitations: has 2 views only (HR and employee), although other staff members might want access to multiple
functionalities; restricted to limited members of the organization.

4) Administration in Employee Management System - Mohammed Eshteiwi Ahmouda Shafter and Prof. (Dr.) A. K. Singh,
January 2020, Journal of Emerging Technologies and Innovative Research, Volume 7, Issue 1.
Techniques used: the application has been developed using the C language; file handling is used to store and
retrieve data.
Advantages: a user-friendly system; records various details of the employees; each employee can update their own
details but is authenticated based on administrator authorization.
Limitations: the areas of concern are system reliability and the storage of data, along with the operations that
need to be performed.

Firstly, we have tried to make this system user-friendly by keeping its UI simple enough that it can be used
even by users who are not highly skilled with technology.
Secondly, we have made our system more diverse by increasing the number of views to four, which provides more
functional capabilities to the different types of users: HR, Employee, Team Lead, and Fresher.
Lastly, since we have used Django in building our system, the system is more reliable and secure.

III. METHODOLOGY
To create this project, we have used Django, a high-level Python web framework. For HR, team leads and employees, there are separate portals with varying rights and functionalities. For instance, the HR can add or delete employees, while employees are not given these functionalities. Django requires a minimum of:
-4GB RAM
-an Intel Core i3 processor
-Windows 7 or later
The architecture of the system mainly consists of the following parts:
i. Frontend - This is the interface with which the users interact. It is kept simple, attractive and user-friendly so that users can easily access the various services provided.
ii. Backend - The backend of any application is what goes on behind the scenes. It consists of APIs, servers, operating systems, databases and more, all of which come together to ensure that correct information is served to the user as quickly as possible. It is the backbone of the website and is responsible for fetching the information to be displayed on the front end; it responds to requests made by the user and serves the required information.
iii. Database - It is an organized collection of structured data. It is responsible for storing the information entered by the user, as well as the data displayed on the front end of the application after being fetched from here.

Figure 1. Architecture of the system

The various technologies used for implementing the application are discussed below:
1). Django - It is a high-level Python web framework used to create websites. It has many ready-to-use features, such as a user login and authentication system and database connectivity, and it supports various databases. The database used in this project is Django's default database, db.sqlite3. Django also promotes the re-usability of components through features like template inheritance. Django follows the MVT architecture: Model, View and Template.
i. Model - The data that we want to display on the frontend of the website, or the data that we want to store in the database, is handled with the help of models.
ii. View - It is responsible for handling requests from the user. On receiving a request, it renders the associated content.
iii. Template - It is an HTML file that contains the layout of the webpage to be rendered.
Some of the features of Django that make it so popular are:
1. High security
2. Rapid development
3. High scalability
4. SEO optimized
5. Thoroughly tested
A Django project is composed of different apps (modules) that provide functionality to the project. Each app has to be included in the settings.py file of the project.
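To make the MVT flow concrete, the sketch below shows a minimal model, a view that fetches it, and the app registration in settings.py. All names (the employees app, the Employee fields, employee_list.html) are illustrative assumptions, not the exact code of this system.

# models.py - the Model describes the data stored in db.sqlite3
from django.db import models

class Employee(models.Model):
    name = models.CharField(max_length=100)
    department = models.CharField(max_length=50)
    position = models.CharField(max_length=50)

# views.py - the View handles a request and renders a Template
from django.shortcuts import render
from .models import Employee

def employee_list(request):
    employees = Employee.objects.all()  # fetched from the database
    return render(request, "employee_list.html", {"employees": employees})

# settings.py - each app has to be included here
INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "employees",  # the app containing the model and view above
]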

2.) HTML, CSS and JavaScript - HTML is used to provide structure to the website, CSS is used for styling it, and JavaScript is used to provide interactivity. In this application, we have used two CSS frameworks:
i. Bootstrap - It is one of the most extensively used HTML, CSS and JavaScript frameworks. It is open-source and free to use, and it follows a mobile-first approach. It helps make the website fully responsive and has various built-in classes that can easily manipulate the styling of the webpage.
ii. Materialize CSS - It is a UI component library based on Google's Material Design, whose main goal is a unified user experience across products and platforms. We have used it along with Bootstrap to make the dashboards more interactive and to follow dashboard conventions.
This system offers four views with different functionalities, as discussed below:
A. HR View
i. View the Employees List and their details, and execute CRUD actions on them (a sketch of one such action appears after this list).
ii. View the Organization's different departments.
iii. Take a look at the projects the company is working on and keep track of their progress.
iv. Verify any employee's attendance record.
v. Distribute discounts and bonuses to various staff.
vi. Send out crucial notifications.
vii. Examine employee feedback.
viii. Approve leaves.
ix. View issues and questions and address them.
x. View fresh applications on the job portal of the organization and decide whether to accept or reject
them.
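
Below is a minimal sketch of how one of the CRUD actions above (removing an employee) could look in a Django view. The names (delete_employee, Employee, employee_list) are illustrative assumptions, not the exact code of this system, and a real deployment would also check that the logged-in user actually holds the HR role.

# A hypothetical HR-only CRUD action (illustrative names):
from django.contrib.auth.decorators import login_required
from django.shortcuts import get_object_or_404, redirect
from .models import Employee

@login_required
def delete_employee(request, employee_id):
    # Look up the record or return 404, then remove it and go back to the list
    employee = get_object_or_404(Employee, pk=employee_id)
    employee.delete()
    return redirect("employee_list")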

Figure 2. Use Case Diagram for HR View

B. Employee View
i. Review their information and make any necessary changes.
ii. Mark their attendance.
iii. Request leave.
iv. Verify the day's assignment.
v. View discounts and rewards and redeem them.

Figure 3. Use Case Diagram for Employee View

C. Team Lead View

Figure 4. Use Case Diagram for team lead view


i. View information
ii. Assign daily tasks to each team member.
iii. Upload any necessary documents for the current project.
D. Fresher's View
i. View their information and make any necessary changes.
ii. Verify their participation in training.
iii. Submit a request for training leave.
iv. Download reference materials and study guides.

Figure 5. Use Case Diagram for freshers view

IV. RESULTS & DISCUSSIONS


1. On successful installation of the software, the user can log in using their credentials to access the portal.

Figure 6. Login View of the System

2. When the HR of the company logs in, they are directed to the following page, where they can take the several actions discussed under the functionalities of the HR view.

Figure 7. HR Dashboard

3. Below is a screenshot of the portal's detailed employee view. It helps the system admin or HR see all the details of an employee, such as the employee code, department, position, personal details, etc. The HR also has the option to edit these details or remove any employee.

Figure 8. HR accessing the details of an employee

4. Every functionality in the dashboard has a different purpose. For instance, the ongoing-projects functionality in the HR Dashboard helps the organization see all the ongoing projects, their progress, the project lead and the details of the team members.

Figure 9. Ongoing Project Functionality of the HR Dashboard

Test Cases: Some of the test cases that we have tested the system against are:
Test Case 1: We tested whether it is possible to log in using wrong credentials.

Figure 10. Trying to login into the system using wrong credentials

The test case has passed: it is not possible to log in using wrong credentials. Thus, the system is secure, and only a person with the right credentials can log in and perform the various operations.
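
This kind of check can also be automated with Django's built-in test client; the sketch below (with hypothetical usernames and passwords) asserts that login fails with wrong credentials and succeeds with right ones.

from django.contrib.auth.models import User
from django.test import TestCase

class LoginTest(TestCase):
    def setUp(self):
        User.objects.create_user(username="hr_admin", password="right-pass")

    def test_wrong_credentials_are_rejected(self):
        # client.login returns False when authentication fails
        self.assertFalse(self.client.login(username="hr_admin", password="wrong-pass"))

    def test_right_credentials_are_accepted(self):
        self.assertTrue(self.client.login(username="hr_admin", password="right-pass"))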
Test Case 2: Now, we are testing whether we are able to delete the details of an employee from the system.

Figure 11. Testing if we are able to delete an employee. Figure 12. The employee has been deleted successfully.

Test Case 3: Now we are testing whether we are able to add a new department.

Figure 13. Testing if we are able to add a new department in the HR Dashboard. Figure 14. The new department has been added successfully.

V. CONCLUSION
Thus, we have created a system that will help organizations manage their workforce efficiently, reducing the problems faced by organizations earlier. This system will also ensure transparency in the organization and bridge the communication gap between employees and employers, as employees will be able to raise their queries directly through the portal. In comparison to the other systems, this system makes use of analytics, which makes it easier to analyze data and take important decisions based on it.

REFERENCES
[1] Bajpai, Rishabh. (2020). Employee Management System. International Journal for Modern Trends in Science and Technology, 6(12), 225-234. doi:10.46501/IJMTST061242.
[2] Singh, P., Fartyal, H., Zubair, K. A. A., & Laddha, A. (2019). Employee Management System. International Research Journal of Engineering and Technology (IRJET), 6.
[3] Punia, R., Panwar, S., Kamra, R., & Gupta, R. (2020). Voice Based Employee Management System Using AWS and Alexa. International Journal of Innovative Research in Computer Science & Technology (IJIRCST), ISSN 2347-5552.
[4] Administration in Employee Management System. (2020). Journal of Emerging Technologies and Innovative Research (JETIR), 7(1), 125-132. ISSN 2349-5162.

Grenze International Journal of Engineering and Technology, June Issue

E-Learning: Research and Applications


Shantanu Sharma1, Shreeyanshi Gautam2 and Ayushi Agarwal3
1-3ABES Engineering College, Ghaziabad, Uttar Pradesh, India

Abstract— The goal of educational institutions should be to find effective ways to provide new and efficient learning opportunities based on their environment, student characteristics, teacher preparation, economic crisis, and advancing technology, in an effort to make learning more efficient, equitable, and innovative in higher education. This study identifies the need for, and the possibility of, developing new online courses in order to engage and motivate students in accordance with their demands (e-learning, blended learning, mobile learning).
This paper presents the insights and harmonizes the acquired knowledge since the
implementation of a mobile learning solution, reviews a few definitions of e-learning, explores
the motivating factors behind its development, and provides an overview of the circumstances
in which e-learning is a perfect option, as well as the main e-learning component types. It is
concluded that e-learning is a useful tool for the growth of the Indian educational sector.
Considering the idea of e-learning and examining the different types of e-learning are the key
objectives of this research work.

Index Terms— Electronic learning (e-learning); education; learning 2.0; higher education; success; learning; internet; survey; educational development; formal e-learning; informal e-learning.

I. INTRODUCTION
E-learning is a crucial activity for every nation's advancement. Everyone involved in this new era is contemplating growth, and we will achieve the desired outcomes if it is adequately planned. This research report examines the state of e-learning in India. Because of the great ease of use and accessibility, navigation, interaction, and user-friendly interface design of e-learning in contrast to traditional learning, student satisfaction rates rise with continuous use. It has been determined that professional courses offered by higher-education educators use e-learning more than the national average, while few non-professional courses are taught using e-learning. In recent years, there has been an increase in research into how effectively e-learning performs. This is primarily due to greater opportunities for integrating IT and learning, but also because political and social focus on "what works" in learning is expanding.

II. LITERATURE REVIEW


[1] The use of any new technologies or applications in the service of learning or learner support has been deemed to be the operational definition of e-learning, according to Laurillard's (2006) research. According to Marc Prensky's study, particular learning activities are more effective for teaching various learning outcomes. He asserts that everyone learns:
a) attitudes through restriction, criticism, and practice;
b) creativity through play;
c) information through drill, questions, associations, and memories;
d) language through imitation, social interaction, and impression;
e) reasoning through examples, puzzles, and dilemmas from the real world.

III. WHAT IS E-LEARNING?


Internet-enabled learning is known as e-learning.
[2] E-learning is the process of delivering a wide range of solutions to facilitate learning and enhance
performance using computer and Internet technology.
E-learning is the transfer of knowledge, skills, and information made possible by computers and networks. Web-
based learning, computer-based learning, virtual learning possibilities, and digital collaboration are all examples
of e-learning processes and applications.
It elevates the standard of learning and instruction. It addresses the demands and learning preferences of learners
while also enhancing the efficacy and efficiency of the educational process. It
enhances flexibility and user-accessibility for engaging students in the learning process.
E-learning incorporates a networked population of scholars, content providers, and experts as well as the
delivery of content in a variety of formats and administration of the learning process. E-learning offers everyone
involved in the learning process clear accountability, faster learning at lower costs, and expanded access.
Organizations which use e-learning give their staff the skills to use change as an advantage in today's fast-paced
world. As people may now take charge of their own lifelong learning by removing barriers of time, distance, and
socioeconomic status, e-learning will be the great equalizer in this new era.

IV. WHY DEVELOP E-LEARNING?


E-learning is utilized by numerous institutions and businesses since it can be more economical and efficient than conventional learning. Developing e-learning is more expensive than preparing classroom materials and training instructors, especially when highly interactive strategies are used. However, e-learning delivery costs are significantly cheaper than those for classroom space, instructor time, participant travel, and lost productivity from missing work to attend classes. Additionally, e-learning engages students who find it challenging to attend traditional classroom training because they are:
 occupied with family or work
 geographically remote, with little access to travel resources, time, or both
 located in volatile regions and consequently limited in their mobility for security reasons
 less able to attend class due to cultural or religious convictions
 experiencing communication issues (e.g. foreign or shy learners)
A. Can any kind of talent be developed through e-learning?
A training/learning programme aims to improve several kinds of abilities, including:
 cognitive abilities, which include knowledge, comprehension, and the ability to follow directions and apply new ideas to circumstances in order to solve issues;
 interpersonal abilities (such as those required for active listening, speaking, presenting, and bargaining); and
 psychomotor abilities, which entail learning bodily senses and motions (e.g. playing sports or driving a car).
B. How does e-learning tackle a variety of domains?
Since the cognitive domain is the one most suited to online learning, the majority of e-learning courses are
designed to enhance cognitive abilities. Because thinking abilities can only be learnt and developed "by doing
them," they may require more interactive e-learning activities within certain cognitive areas. To alter attitudes
and actions, for instance, interactive games might be used in conjunction with suitable feedback.
C. E-learning is a wise choice if
 there is a large amount of material to be delivered to a big number of students;
 the student body is spread across geographically dispersed places;
 learners have limited mobility;
 learners have less time to devote to learning because of other jobs;
 the goal of learning is to acquire uniform background information about a given subject;
 learners like studying at their own speed and are highly motivated to pick up new skills;
 the course emphasizes long-term training requirements rather than urgent ones;
 the information needs to be available whenever it is necessary to get it.

V. QUALITY OF E-LEARNING
The quality of an e-learning course/training is increased by:
 learner-centered content: the curriculum for online learning should be precise, pertinent, and tailored to each learner's needs, responsibilities, and roles in both their professional and personal lives; information, knowledge, and skills should be offered to this end.
 granularity: to aid the assimilation of new information and to give the student flexibility in their learning schedule, e-learning content has to be broken down into manageable chunks.
 engaging content: in order to create an engaging and inspiring learning environment, innovative use of instructional approaches, methods, and tactics is required.
 interactivity: frequent learner-teacher interaction is essential to keep students' interest, encourage learning, and create a positive learning environment.
 personalization: self-paced courses must be adjusted to take into account the needs and interests of the student, and tutors and facilitators in instructor-led courses need to be able to monitor each learner's development and performance.

VI. HOW IS THE EFFECTIVENESS OF E-LEARNING DEFINED AND MEASURED?


According to several studies, when suitable teaching methods and approaches are used during the learning time, e-learning may be just as successful as face-to-face training, if not more so.
Employee feedback - In the first place, it is crucial to get employee feedback that will allow us to gauge how effective the training and learning were. Staff members must be asked whether the course/training was a worthwhile experience and whether they approved of the subjects, resources, curriculum, and teachers' methods of instruction. Another study looked at how e-learning affected students' behavior, attitude, and academic achievement. The results revealed that learners using e-learning had mean scores that were statistically significantly higher than those using traditional teaching approaches. The effectiveness of e-learning as a tool or approach to improve the delivery of teaching and foster the development of learning abilities through transfer of learning was well received by instructors and students. The primary takeaway is that e-learning may be regarded as one of the best current approaches and methods for use in teaching and learning.

VII. WHAT MAKES E-LEARNING SOLUTIONS EFFECTIVE?


[5] All of the abstracts utilized in this study were coded for whether the e-learning was effective, ineffective, or somewhat effective, if this was mentioned or implied in the abstract. This was the case for 61 of the 111 abstracts analyzed, as illustrated in Figure 1.

Figure 1 shows the effectiveness of e-learning

Only 10% (6/61) of the studies are labeled as "not successful," which calls into question the validity of the classifications, given the difficulties and issues that e-learning must overcome. A closer examination of the abstracts reveals that many of the empirical studies and research on efficacy were carried out by scholars who seemed to have an interest in e-learning's success. Due to this "effectiveness bias," the reviewed literature does not currently support an examination of whether e-learning solutions are particularly successful.
As seen in Figure 2, this model depicts the crucial elements (in gray) that the review found to be important in determining how effective e-learning is according to various definitions.

Figure 2. Flow chart

There must be a positive and helpful learning atmosphere. The amount of motivation of the person(s) using the e-learning (artifact) affects how long they use it, and prior online or professional experience tends to have a generally beneficial effect on efficacy. Program designs that include peer and instructor interaction, as well as opportunities for practice, improve the efficacy of online learning. This model illustrates how the important variables that affect efficacy interact; however, many studies in the literature review do not take into account the wide range of definitions.

VIII. INDIAN E-LEARNING


Understanding the idea and principles of e-learning in its entirety, as well as examining and analyzing the many types of e-learning, are among the goals of this study. Additionally, it highlights several points of view on the contrast between conventional classroom learning and online learning. This essay attempts to provide some answers by examining the benefits of both learning styles while taking into account the limitations of the scenario, and it intends to analyze the benefits and drawbacks of both learning styles in depth. A secondary research approach is used to achieve this objective: books, magazines, journals, and published materials relating to e-learning were used to gather secondary data for the study.
A. Indian Education Situation
For a long time, India's old educational system and practices were viable, but as educational demands change and a global education standard comes into force, the Indian educational system is being forced to make several adjustments, as seen during the Covid-19 pandemic. Though at a slower rate than in other nations, the idea of e-learning is undoubtedly growing in favor throughout the country. The Indian Constitution aims to meet the country's educational demands, particularly for its many distinct communities and cultures, and it commits to providing high-quality education to all citizens. The development of the full person and the fostering of inherent potential qualities and features are what the different educational categories (elementary education, secondary school, higher education, adult education, and technical and vocational education) are all about. The objective of education in rural regions may be achieved in India through the use of e-learning, which can also inspire students to pursue further education and empower women. Education must adapt to the new demands of the modern world, including the need to produce a workforce with global competency; the world is changing and evolving quite quickly.
B. Concept and Aspect of e-Learning in India
The term "e-learning," sometimes known as "digital learning," refers to the use of a computer to offer all or a
portion of a course or programme, whether it be at a school, college, as part of training, or as a full distance
learning course. E-learning is essentially a method of accessing educational materials outside of a traditional
classroom by using electronic technology. It frequently refers to a fully delivered course, programme, or degree.
"Education is what remains after one has forgotten what one has learned in school." —Albert Einstein.
Although Einstein may have meant his remarks as a joke, they represent the truth that successful and high-
quality education is ongoing and always changing as technology advances. In actuality, during the past several
decades, there have been many changes to how education is seen. Learning has changed from being

characterized by the old classroom paradigm to learning that is immediate, online, self-driven, and available
whenever a student desires. There have been several turning points in the path of education in India. In plain
English, e-learning is defined as education that is provided online, over the internet, and includes a variety of
formats and styles, including remote learning, computerized electronic learning, online learning, and internet
learning.
C. E-Learning and Government
The government can employ e-learning in a variety of ways, including the following:
 Effective policy and rule communication may aid the government.
 It can raise awareness among individuals about various programmes and goals.
 It will offer citizens a public forum for communication or education, according to their demands.
 Both unstructured and semi-structured information may be managed by it.
 It can carry out government policy.
The government may benefit from an effective e-learning system in many ways. For instance, it may offer a learning portal centered on public-private partnership (PPP) policies, rules, and regulations. Meaningful and worthwhile education of the populace through e-learning can also help a government become more open in its governance.
D. E-Learning and Higher Education
Studies conducted on a worldwide scale indicate that, after the United States, India has the second-highest
number of students enrolled in online courses, with more than 1,55,000 coming from the nation. 32% of the over
1.2 million students globally are from the United States, while 15% are from India (but keep in mind that both
nations' populations are large). There is an increasing need in higher education to develop an e-learning
programme in which all components of a course are controlled through a standardized user interface throughout
the whole school. Many of these programmes have been launched in our nation; students must attend orientation
sessions at institutions, but the course material is distributed online. The majority of colleges do provide online
advising and registration, e-counseling, and student newspapers, among other online learning assistance and
services. E-learning has the capability to overcome India's rural areas' lack of access to professors and teachers
with the necessary qualifications. Live online coaching, streaming films, and virtual classrooms are a few of the
answers that e-learning may provide to these issues. E-learning is the greatest choice even though there is no
replacement for efficient and well-organized classroom instruction.

IX. CONCLUSION
According to the results of our study, we draw the essential conclusion that the fast expansion of internet connectivity is a key driver of the development of e-learning. Online learning will become more efficient and educational quality will rise thanks to a solid internet infrastructure with a wide range of regional and international actors. E-learning improves both the standard of education and the state of the economy in emerging nations like India.
Our research article addressed the following research queries: How are e-learning's efficacy metrics determined? How is the effectiveness of e-learning measured? Why are e-learning programmes so effective? The benefits of taking into account, and making clear, how these notions are utilized in study and practice were emphasized in this article. The bulk of studies, according to this survey, employed quantitative and comparative methods. This paper argues that utilizing only quantitative measurements to satisfy preset learning objectives prevents practitioners and academics from uncovering unexpected and unintentional potential causes of error. Open-ended qualitative survey questions can significantly increase the validity of such methods [8]. The environment in which the e-learning solution was employed and deployed, as well as the users of the artifact, were taken into account while categorizing these elements.

REFERENCES
[1] Sharma, R. C., & Mishra, S. (2013). International Handbook on e-Learning, Vol. 2.
[2] Harden, R. M., & Hart, I. R. (2002). An international virtual medical school (IVIMEDS): The future for medical education. Medical Teacher, 24, 261-267.
[3] Laurillard, D. (2006). E-learning in higher education. Changing Higher Education: The Development of Learning and Teaching, 71-84.
[4] Government of India. (2011). Census Report, 223.
[5] Ministry of Human Resource Development, Government of India. (2014). Annual Report, 2013-2014.
[6] Chandra, S. (2014). E-learning prospects and challenges. International Journal of Research in Finance & Marketing, 4(10).
[7] Shinde, S. P., & Deshmukh, V. P. Web-based education in schools: A paradigm shift in India.

Grenze International Journal of Engineering and Technology, June Issue

Medical Chat Bot for Ambulance During Emergency Situations

Dhyaneshwaran J1, Farrel Deva Asir J2 and Dr. S. Saranya3
1-3Department of Electronics and Communication, Easwari Engineering College
Email: [email protected], [email protected], [email protected]

Abstract—India is among the nations with the greatest populations, and health misinformation due to overcrowding has continued to be one of India's biggest problems. Death occurs every minute as a result of unforeseen and unplanned situations, and saving a life is both fortunate and good. Using the stretcher's embedded microcontrollers and sensors, a smart, intelligent healthcare system will be created. In the event of an accident, it will determine the condition of the victim and send that information to the hospital, in addition to sending a server-based alert to the closest police station to avoid any potential legal issues. If this process is followed, critical care units in hospitals can have their physical requirements improved before patients arrive, potentially saving many lives.

I. INTRODUCTION
One of the biggest problems facing humanity is its health. The previous 10 years have seen a lot of attention
focused on the healthcare industry. The main objective was to create a reliable method for patient monitoring
that would let medical professionals keep a watch on patients who might be being treated in a hospital or going
about their everyday lives as usual. Due to recently enhanced technology, patient monitoring systems have
become one of the most significant developments. Currently, a more contemporary strategy is required.
In the conventional method, the main role is played by healthcare experts. They must go to the hospital ward to
provide the required diagnostics and guidance. There are two fundamental issues with this strategy. First, the
patient must always have a healthcare provider nearby, and second, the patient must spend some time being
admitted to the hospital with biomedical equipment at their bedside.
To tackle these two difficulties, patients receive education and information on diagnosing and preventing
illnesses. The second requirement is for a patient monitoring system (PMS) that is dependable and easily
accessible. We can employ technology more wisely to improve the aforementioned situation.

II. RELATED WORKS


Guanqun Zhang, Amber C. Cottrell, Isaac C. Henry, and Devin B. McCombie (2016) developed continuous blood pressure monitoring in the arteries, utilizing separate wearable sensors that might provide a different and maybe more useful evaluation of the vascular condition of a patient. An appealing method for continuously monitoring blood pressure is pulse wave velocity (PWV). However, because of the confounding impact of the pre-ejection period (PEP), PWV innovations that rely on the timing between the ECG and a distal PPG give inaccurate estimations in mobile patients. In this study, the ViSi Mobile was introduced: a portable, continuously-reading blood pressure device that can track and measure PEP changes. PEP is determined using precordial vibrations captured by an accelerometer attached to the patient's sternum. The effectiveness of the PEP measures was evaluated on test participants who experienced postural shifts and patient activity. The accuracy of CBP in active patients may be improved, according to the results [1].
Radhika K. A., Raksha B., and Pruthviraj U. (2018) worked on one of the main forces behind the growth of robotic technology: the combination of automated frameworks with the Internet of Things (IoT). In modern technology, robots are controlled by smartphones; in their method, a Driving Force GT joystick that offers accurate remote control is used instead, so a robotic vehicle is operated remotely under manual control. A vehicle that is controlled by a joystick is more efficient than one that is web- or smartphone-driven because of the speed control. In this piece, a robotic automobile dubbed Pibot is built with a Raspberry Pi as the base controller; using a USB joystick, real-time system and vehicle operations are remotely controlled from the ground station. For security purposes, the Pibot's camera transmits real-time video to a website through an HTTP server. Python programming is used to control the car from the remote station, utilizing consumer connection interaction (through strong Wi-Fi) [2].
A method to conserve the power of the probe circuit while meeting application requirements was created by Floriano De Rango and Domenico Barletta in 2016. Energy is another critical issue that must be addressed while developing network technologies for WSNs, and current studies have resulted in new guidelines designed specifically for sensing devices where energy monitoring is a key component. The "Internet of Things" (IoT) is a cutting-edge digital approach in which several intelligent gadgets connected to the World Wide Web take part in knowledge exchange and group decision-making. Integration of Internet-connected devices for detecting or controlling energy consumption entails the integration of all types of energy-intensive equipment, including outlets, lightbulbs, air conditioners, etc. The system may occasionally be able to connect with the utility provider, which helps achieve equilibrium between energy consumption and production or, more generally, optimize overall energy consumption. To facilitate power connectivity between connected devices and IoT infrastructures, various new trends and problems are identified in this study. To save energy and extend the system's lifetime, communication between devices is examined. By adjusting various communication settings, such as data size, the device's radio-frequency interfaces and Internet access are examined in a variety of scenarios to determine the ideal setup for the gadget and its effect on device lifetime [3].
Electricity is a basic human requirement that is frequently utilized for home, commercial, and farming purposes; by this logic, energy waste costs the nation millions of dollars. Technology-based solutions, like the Internet of Things, make it possible to integrate the physical and digital worlds to manage and/or monitor resources, including, in this case, energy use. Additionally, the advancement of micro- and nanoelectronics has made possible the creation of communication devices like the XBee, which is extensively used for monitoring and controlling operations and makes it possible to swiftly and effectively create wireless sensor networks while using very little energy. By creating hardware and software solutions, the demonstrated prototype takes advantage of these benefits. Through a scalable and flexible platform employing XBee technology and a specially designed protocol for data exchange between the four modules that make up the system, it enables remote monitoring of power use in a home. Results are shown, demonstrating how accurate the prototype is when compared to readings from a typical electricity meter. This was developed by Darwin Alulema and Mireya Zapata in 2018 [4].
The last few years have seen a variety of technological advancements that could offer methods for obtaining and processing data to identify significant clinical features of patients for medical care. The goal is to highlight the key functions of a voice-activated, intelligent expander prototype and a method for monitoring vital signs in individuals with disabilities or restricted mobility. The data-processing hardware, integrated with the data-collection software to create a global system, consists of a Raspberry Pi and an Arduino. Since this system was created using readily available technology and a fusion of several functions to provide a workable solution, its main contribution is to simplify medical treatment for patients. This was developed by Wilmer Calle and Manuel Eduardo Flores Moran in 2018 [5].

III. COMPONENTS
The prototype was created using the ESP32. A fingerprint sensor, a heartbeat sensor, and a breathing sensor were also utilized. To reflect the best quality-price ratio, components were chosen from those sold at nearby retailers. The fingerprint sensor is used to find the patient information to communicate to the doctor.

Figure 1: Block Diagram

IV. DESIGN
The ESP32 family comprises low-cost, low-power system-on-a-chip microcontrollers with built-in Bluetooth and Wi-Fi. The family integrates power amplifiers, low-noise receive amplifiers, RF baluns, antenna switches, and power-management modules, in addition to the Tensilica Xtensa LX6 dual- and single-core CPUs, the Tensilica Xtensa LX7 dual-core microprocessor, and the single-core Tensilica Xtensa LX7 processor. The ESP32 was designed and built by Espressif Systems, a Chinese business located in Shanghai, and is produced by TSMC using their 40 nm process.
Excellent in functionality, the optical biometric fingerprint reader may be used in a range of finished goods, including automobile door locks, safe deposit boxes, attendance, and access control. The R305 fingerprint module has a direct connection to a microcontroller UART, or to a PC through a MAX232 or USB-serial converter; a TTL UART interface is also possible.
The respiration sensor measures diaphragmatic breathing in real-world or simulated biofeedback applications like stress reduction and relaxation training. This sensor measures breathing frequency in addition to displaying the relative depth of breathing.
A digital output of the heartbeat is created when a finger is positioned on the heartbeat sensor; it uses the finger to modulate the light. When a heartbeat is detected, the beat LED blinks in time with each one. This digital output may simply be linked to a microcontroller to calculate the beats-per-minute (BPM) rate.
The readings are sent to the Internet using the ESP32 controller. The data in the cloud is kept in encrypted format for basic security, and to view the details an OTP is generated for the registered email ID. The data is available with timestamps. Once the website has verified the OTP, the message is sent to the registered mobile number.
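
A minimal sketch of this sensing-and-upload loop is given below in MicroPython (Python for microcontrollers such as the ESP32). It assumes a MicroPython-flashed board whose firmware bundles urequests; the pin number, Wi-Fi credentials and server URL are placeholders, and the actual prototype's firmware may differ.

import time
import network
import urequests                 # HTTP client bundled with MicroPython firmware
from machine import ADC, Pin

pulse = ADC(Pin(34))             # heartbeat sensor on an ADC-capable pin (placeholder)
pulse.atten(ADC.ATTN_11DB)       # allow the full 0-3.3 V input range

wlan = network.WLAN(network.STA_IF)
wlan.active(True)
wlan.connect("SSID", "PASSWORD") # placeholder credentials
while not wlan.isconnected():
    time.sleep(0.5)

while True:
    reading = pulse.read()       # raw 12-bit sample from the sensor
    # Push the timestamped sample to the cloud endpoint (placeholder URL)
    urequests.post("http://example.com/vitals",
                   json={"pulse_raw": reading, "ts": time.time()}).close()
    time.sleep(5)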

Figure 2a: Control Unit. Figure 2b: Send a text message.

V. LIMITATIONS OF THE PRODUCT


The transformer that lowers the alternating-current voltage to the necessary DC output level is connected to the AC mains voltage, typically 220 V RMS. A full-wave rectified voltage created by a diode rectifier is then filtered by a basic capacitor filter to produce a DC voltage. In most cases, there are some ripples, or AC voltage variations, in the final DC voltage. A regulator circuit reduces these ripples and maintains the same DC value despite changes at the point of common coupling to the output voltage or in the input DC voltage. This voltage regulation is often accomplished using one of the widely used voltage-regulator IC chips.

Figure 3: Live patient monitoring

VI. FURTHER IMPROVEMENTS


The current prototype has room for improvement. If, for some reason, the doctor is not in the ambulance, the paramedics can still communicate with the doctor via a chatbot created in Python. The physician communicates with the paramedics through the chatbot and gives them instructions on the necessary procedure.
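
The paper only names the chatbot idea; one hypothetical shape for such a relay, a simple keyword-matching loop in Python, is sketched below. The keywords and canned instructions are illustrative, not the authors' implementation.

# Minimal rule-based chatbot: maps a paramedic's message to a canned
# instruction, escalating when no keyword matches (all content illustrative).
RESPONSES = {
    "bleeding": "Apply firm, direct pressure with a sterile dressing.",
    "cardiac": "Begin CPR: cycles of 30 compressions to 2 rescue breaths.",
    "fracture": "Immobilize the limb; do not attempt to realign the bone.",
}

def reply(message: str) -> str:
    text = message.lower()
    for keyword, instruction in RESPONSES.items():
        if keyword in text:
            return instruction
    return "Escalating to the on-call doctor; please describe the vital signs."

if __name__ == "__main__":
    while True:
        print("Bot:", reply(input("Paramedic: ")))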

VII. CONCLUSION
Due to the significance of health care services to our society, automating them relieves human workers of stress and makes measurement easier. Patients are more likely to trust this system because of its transparency. The doctor may assess the patient's current condition while continuously monitoring it, which helps in deciding which treatment is most suitable.

REFERENCES
[1] M. E. Mlinac, M. C. Feng, "Assessment of activities of daily living, self-care, and independence", Archives of Clinical Neuropsychology, Vol. 31, Issue 6, Aug 2016.
[2] V. Pasku et al., "Magnetic Field Based Positioning Systems", IEEE Comm. Surveys & Tutorials, Mar 2017.
[3] K. Nguyen, Z. Luo, "Dynamic route prediction with the magnetic field strength for indoor positioning", Int. Journal of Wireless and Mob. Computing, Vol. 12, Issue 1, Jan 2017.
[4] A. Alarifi et al., "Ultra wideband indoor positioning technologies: Analysis and recent advances", Sensors (Basel), vol. 16, no. 5, May 2016.
[5] E. Wang, M. Wang, Z. Meng, X. Xu, "A Study of WiFi-Aided Magnetic Matching Indoor Positioning Algorithm", Journal of Computer and Comm., 5.03, 2017.
[6] A. R. Jimenez and F. Seco, "Event-driven Real-time Location-aware Activity Recognition in AAL Scenarios", in Proc. 12th Int. Conf. on Ubiq. Computing and Amb. Intell., UCAmI 2018, 4-7 Dec 2018, Punta Cana, Dominican Republic.

Grenze International Journal of Engineering and Technology, June Issue

A Comprehensive Study on Time Series Analysis in Healthcare
J. Karthick Myilvahanan1, Nivetha K2, Krishnaveni A3, Dr. Mohana Sundaram N4, Dr. Santosh R5
1-3
New Horizon College of Engineering/ISE, Bengaluru, India
Email: [email protected], [email protected], [email protected]
4-5
Karpagam Academy of Higher Education/CSE, Coimbatore, India
Email: [email protected], [email protected]

Abstract—There has been a lot of interest in time series forecasting in recent years. Deep neural networks have shown their effectiveness and accuracy in various industries, and for these reasons deep learning is currently one of the most extensively used machine-learning approaches for dealing with massive volumes of data. Statistical modeling includes forecasting, which is used for decision-making in various fields; the goal of forecasting is to predict time-varying variables based on their past values. Developing models and techniques for trustworthy forecasting is an important part of the forecasting process. As part of this study, a systematic mapping investigation and a literature review are used. Time-series researchers have relied on ARIMA approaches for decades, notably the autoregressive integrated moving average model, but the need for stationarity makes this method somewhat rigid. Forecasting methods have improved and expanded with the introduction of computers, ranging from stochastic models to soft computing. Conventional approaches may not be as accurate as soft computing. In addition, the volume of data that can be analyzed and the efficiency of the process are two of the many benefits of using soft computing.

Index Terms— ARIMA, Deep Learning, Healthcare, Survey, Time Series.

I. INTRODUCTION
In time series data, a process is observed at predefined intervals and a predetermined sample rate. Developing rules from data and generating predictions about future values based on current observations are at the heart of time series analysis. There has been an increase in the use of time-series observation data across various industries and fields, and the amount of time series data being produced is rising. Forecasting time series data is one of the most prominent academic fields: meteorological and weather forecasting, industrial production forecasting, and stock trend forecasting have all benefited from its implementation. It can assist decision-makers in avoiding danger and making better choices. Traditional time series forecasting methods based on probability and statistics have succeeded in various fields, including meteorology, economics, and more. With the introduction of data science in health care, time series forecasting algorithms face substantial issues from the influx of large, non-linear time series data that follow various distribution patterns. Outstanding results have been achieved by applying deep and machine learning to very sophisticated algorithms for forecasting time-series data. This article aims to categorize the many approaches for predicting time series that are currently available.
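
To make the "predict future values from current observations" framing concrete, the toy snippet below (illustrative only, with made-up numbers) turns a univariate series into the (lagged-window, next-value) pairs that most machine-learning forecasters consume.

import numpy as np

def make_windows(series, window=4):
    # Turn a 1-D series into (lagged-window, next-value) training pairs
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = np.array(series[window:])
    return X, y

# Example: hourly heart-rate readings (toy numbers)
hr = [72, 75, 74, 78, 80, 79, 83, 85]
X, y = make_windows(hr, window=4)
# X[0] = [72, 75, 74, 78] is used to predict y[0] = 80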

II. BACKGROUND STUDY
A. A Survey on Health Care Time Series Analysis Using the ARIMA Model
In this research area, Abdul Jalil Niazai et al. [2] looked at the current and projected prevalence of COVID-19 in Afghanistan. COVID-19 pandemic data from Afghan patients were gathered from 22 March to 24 June 2020 and modeled with the time-series (ARIMA) approach. Three ARIMA models were created, and the ones with the lowest AIC values were chosen, such as ARIMA(0,2,2) for recovered cases and ARIMA (0/2) for death cases.
Pratyaksa, Hans, et al. [44] used daily data from Prof. Soeparwi Veterinary Hospital to forecast the quantity of povidone-iodine antiseptic consumed, building an ARIMA forecasting model in R software. Results show that ARIMA(1,0,1) works well with the historical data; the model coefficients were AR(1) = -0.6463, MA(1) = 0.9436, and intercept/constant = 14.6796. The chosen model's MAE value was found to be 24.769. By developing better resource-management policies and forecasting future medicine demand, stakeholders and pharmacy departments may benefit from medication consumption prediction.
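
For reference, fitting an ARIMA(1,0,1), the order reported above, takes only a few lines with the statsmodels package; the series below is a synthetic placeholder, not the hospital's records.

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
# Synthetic daily consumption series standing in for the real data
consumption = 15 + 0.05 * rng.normal(0, 3, 120).cumsum() + rng.normal(0, 2, 120)

train, test = consumption[:100], consumption[100:]
fit = ARIMA(train, order=(1, 0, 1)).fit()  # AR(1) + MA(1) with a constant
forecast = fit.forecast(steps=len(test))

mae = np.mean(np.abs(forecast - test))     # mean absolute error, as in [44]
print(f"MAE over the 20 held-out days: {mae:.3f}")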
B. A Survey on Health Care Time-Series Analysis Using Deep Learning
Divya Gupta et al. [13] show that cutting-edge medical technologies such as artificial intelligence (AI), distributed ledger technology (DLT), and robots, all part of the current ubiquitous healthcare paradigm, may help enhance diagnostic procedures as well as patient care. The sheer volume of data makes it difficult for medical personnel to make timely decisions and implement new technologies, even though certain devices may aid in illness prevention, fitness promotion, and remote assistance in emergencies. A CNN model was trained using the original IoMT-based WESAD mental health dataset. Scalability difficulties were addressed by restricting intra-cluster similarity computations by time and reducing processing costs in the k-medoid model; these researchers used the k-medoid approach to summarize the WESAD dataset. Data clustering reduced the execution time by half.
Fabien Viton et al. [15] An increasing number of researchers are looking at how to explain DL models,
particularly in the healthcare industry. The work may be made easier by using visual-based solutions. This
paper's CNN models for multivariate TS issues were explained using heat maps. A visual representation of the
relevance of each TS variable over time was presented in addition to the forecasts. The research focused on
predicting in-hospital mortality as a practical healthcare application. Using heat maps allows one to see which factors have the most impact, and when they matter most, throughout an ICU stay. The model
predictions are seen to be better supported by this visual explanation. Our next step is to study the heat map
output with healthcare professionals interacting with patients in the intensive care unit (ICU).
Kristoffer Wickstrom et al. [30] Deep ensemble strategies for explainable CNNs are presented here. The
suggested strategy was tested using both synthetic and real-world data. According to the findings, clinical time
series may benefit from using deep ensembles to identify key traits, and modeling the uncertainty in relevance
scores can help to give more accessible and reliable explanations for the results. New thresholding methods have
been presented and tested. In this work, a single thresholding approach was explored; however, utilizing other
thresholding strategies is an exciting area for future research. Using the results of this study, it is feasible to
construct more reliable and precise decision-support systems than those based on deep learning.
C. Multivariate Time-series Analysis in Healthcare
Kale, David C., et al. [28] show a rising need for fast and reliable time series comparisons, especially in healthcare. According to experts, no single time-series similarity metric performs consistently well across all datasets and situations, and most of these approaches also have problems with large time-series datasets. Hashing a time series with kernelized hashing is flexible and consistent regardless of the format or distance measure. With KLSH and arbitrary time series similarity measures, an efficient and reliable framework is provided for searching for time series that are similar to each other.
Ordonez, P., et al. [40] provided interactive, animated, multivariate time-series visualizations, customized to an individual patient and tailored to the physicians' baselines and thresholds. The visualizations were created to help physicians rapidly and efficiently identify important changes in a patient's state of health. In the study, the visualizations were at most as accurate as conventional techniques for detecting PDA, hence these authors did not reject the null hypothesis. This is acceptable given the innovative interface, despite a lower overall level of trust in the visualization; however, the research found that around two-thirds of the providers preferred to employ the conventional way following training and before usage.
Yanke Hu et al. [63] Although RNNs' slow processing speed is typically neglected, new research has shown that
RNN-based strategies may be useful in numerous time series applications. By converting vital signs into a (0, 1)
vector and treating the problem as a computer vision problem, this article proposes a novel method for the
multivariate time series classification challenge in healthcare.
Zina M. et al. [65] described a Voronoi diagram-based approach for detecting outliers in time series data. The technique has several major benefits. First, outliers are dealt with by considering the multivariate nature of the data. Second, because the authors may choose whether or not to use a parametric model (such as a regression model, as in this paper), it is versatile in extracting relevant characteristics for separating outliers from non-outliers. Finally, Voronoi diagrams reveal the underlying geometric connection of the data points. According to experimental data, the MVOD technique can accurately, sensitively, and robustly identify outliers in a multivariate time series.
D. Multi-Dimensional Time Series Analysis
Dai, Xiangfeng, et al. [12] provide three different ways of applying the HASF approach for finding patterns in fragmented time series. This technique has proven to be resilient because of the HASF's capacity (a) to retain the underlying trend, (b) to cope with the non-stationarity and heteroscedasticity of data, and (c) to represent the relevance of data samples that remain after deleting nearby data.
Dugast, Mael, et al. [14] note that an early decision-making aid for Emergency Department administrators is crucial for financial and public-health reasons. When respiratory-like infections spread, ED admissions increase, and sometimes the clinical signs of these disorders can be recognized; in particular, doctors link RSV to bronchiolitis symptoms in children. It is thus possible to address an RSV epidemic by analyzing the temporal series of bronchiolitis admissions to pediatric emergency rooms. There is a need for a new and unique method for recognizing early on the start of epidemic-related abnormal arrivals in emergency departments (EDs) and calculating the maximum number of arrivals, which indicates how soon the epidemic will fade away. Detrended Fluctuation Analysis (DFA) was utilized to obtain the admissions time-series variability, and the authors applied the persistent homology technique. To get the best DFA parameter value, a multi-objective optimization problem is solved.

TABLE I: COMPARATIVE ANALYSIS OF VARIOUS MACHINE LEARNING TECHNIQUES IN HEALTH CARE


Author Name | Techniques Used | Merits | Demerits
Abbasi et al. (2011) | FRN | Efficiently enables the inclusion of extended sets of heterogeneous features | Irrespective of the feature subset sizes
Chen & Tseng (2011) | Product reviews | An effective information quality framework | Classification problem
Jiang et al. (2011) | Twitter sentiment classification | Incorporates target-dependent features; good performance | Target-independent features
Xu et al. (2011) | Mining reviews from travel blogs | Uses three supervised machine learning algorithms | Slightly improves accuracy
Sobkowicz et al. (2012) | Opinion formation framework | Content analysis of social media and socio-physical system modeling | Accuracy
Liu et al. (2012) | Word-based Translation Model (WTM) | Extracts opinion targets and generates a global measure | The problem of error propagation in traditional bootstrap-based methods
Kamal et al. (2012) | A rule-based system and opinion mining system | Identifies candidate feature-opinion pairs from review documents and product features | Opinions that are related either directly or indirectly
Lin et al. (2012) | JST, Reverse-JST and LDA | Detects sentiment and topic simultaneously from text | Weakly supervised nature of JST and no labeled documents
Moraes et al. (2013) | Document-level sentiment analysis, SVM and ANN | Achieved better levels of classification accuracy | Unbalanced data contexts
Bagheri et al. (2013) | A novel unsupervised and domain-independent model | Learns multiword aspects via a bootstrapping iterative algorithm and pruning methods | Accuracy
Kalaivani & Shunmuganathan (2013) | Performance comparison for sentiment classification | Good accuracy | Compares various sentiment classification approaches
Hai et al. (2014) | Opinion features from online reviews | IEDR | Domain-specific and independent corpus
Stavrianou & Brun (2015) | Product reviews, NLP and fine-grained data like opinions | Improving the recommendations system | Reviews only a particular product
Agarwal et al. (2015) | Concept extraction protocol | Semantic relations between words in natural language and the ConceptNet ontology | Basic problem of SA
Agarwal & Mittal (2016) | Machine learning protocols for SA | BoW representations | High dimensionality of feature space
Ahmad et al. (2016) | Sentence-level lexicon-based domain-independent sentiment classification technique | Text-level corpus-based machine learning techniques | The problem of domain portability

Garg, Bindu, et al. [19] describe new techniques that produce the highest accuracy with the lowest mean square error among all related forecasting work. The pioneering dynamic computational algorithm may be used to accurately and reliably estimate and anticipate the frequency of outpatient visits in any tertiary-care hospital. Health care planning, allocation, and management might benefit from the model presented in this article, and a decision support system for healthcare institutions may be developed using the design of the suggested technique. Such a decision-assistance system could significantly impact healthcare service efficiency. The suggested model may be improved with a genetic algorithm in the future to cope with multi-dimensional time-series data.
Gunnarsdottir, Kristin, et al. [20] suggest that an efficient technique to categorize sepsis in intensive care units (ICUs) may be to use a generalized linear model. The concept is that the probability model is updated each time a new measurement is taken and is utilized by ICU physicians to better understand their patient's clinical status. Instead of only considering demographic factors, these authors found that including physiological time-series signals improved classification accuracy and specificity. The authors were constrained by the number of patients in the MIMIC II database that could be included in this investigation. Even though these findings are early, they show that a GLM can be used to monitor sepsis in real time.
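
As a toy illustration of the idea (not the authors' MIMIC II pipeline), a GLM such as logistic regression can emit a refreshed sepsis probability whenever a new vitals measurement arrives; all numbers below are made up.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy training data: rows are [heart_rate, resp_rate, temperature]
X = np.array([[80, 16, 36.8], [125, 28, 38.9], [72, 14, 36.5],
              [118, 30, 39.4], [90, 18, 37.0], [130, 32, 40.1]])
y = np.array([0, 1, 0, 1, 0, 1])      # 1 = sepsis episode (made-up labels)

glm = LogisticRegression().fit(X, y)  # a GLM with a logit link

# Each new bedside measurement yields an updated probability
new_vitals = np.array([[112, 26, 38.6]])
print("P(sepsis) =", glm.predict_proba(new_vitals)[0, 1])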
Liu, Bo, et al. [34] present MDLats, a method and system for discovering motifs in large-scale time series. It utilizes the RP algorithm and the ED to swiftly and precisely locate motifs, combining the benefits of both approximate and exact techniques, and Hadoop is used to construct a production-level system. MDLats' ECG classification findings and real-world use in healthcare prove its usefulness. In the future, these authors want to apply MDLats in various other areas, including air pollution, social networks, and logistics optimization, as well as other areas where systems operate on a large scale.
E. A Survey on Time Series Analysis on General Health Care
Almeida, Rui Jorge, et al. [6] present a straightforward method for obtaining medical data summaries in descriptive linguistic form. The proposals incorporate categorical data and clearly show disparities between patients with distinct class labels from linguistic summary protoforms. These authors propose summarizing data in a new differential form based on a numerical criterion to compare linguistic summaries. Multiple occurrences were detected in the same individuals over long periods in the reviewed data set. Linguistic-feature summaries are proposed that offer chronological context for the quantification of characteristics and time.
Baldassano, Steven, et al. [8] establish a configurable platform that alerts ICU caregivers to crucial occurrences in real time. This platform developed and clinically applied open-source techniques for identifying defective EEG electrodes, tracking burst suppression ratios, and detecting problems in neuromonitoring data. These authors showed how the platform can improve ICU workflow, ease the strain on nurses, enhance research data quality, and calculate clinically significant trending indicators. Medical data analytics may greatly impact patient care, and this study provides a framework for understanding a broad range of ICU data streams.
Biem, A et al. [9] introduce STAM, a domain-agnostic, multi-component, generic time-series analysis and management system, and exhibit its capabilities via experiments on a real-time, large-scale anomaly detection application and generated test cases. The STAM system is created with a specific emphasis on well-defined qualities. STAM is a general plug-and-play system: it can handle multi-dimensional time-series data of nearly any size. The user inputs the data source, and the system executes the processing. STAM stresses simplicity of use; the system needs minimal adjustment and minimal settings from the user to start, apart from adding data sources. It also offers sophisticated user control (e.g., sensitivity modification and parameter selection).

TABLE II: COMPARATIVE ANALYSIS OF DIFFERENT HEALTH CARE DATA

Type of Model | Size of Data | Patient Age | Accuracy
Multiple Regression | 1M | > 65 | 0.66
LassoLarsIC-AIC | 2.3 million | All | 0.72
Decision Tree Regression | 2.3 million | All | 0.76

Cao, Xi Hang, et al. [10] present methods for learning continuous-time linear dynamical systems (LDS) from multivariate time series (MTS) with different types of imperfection, such as restricted time points and uneven sample intervals. These authors used a support vector machine model for classification tasks and a sophisticated LDS kernel formulation, and demonstrated that the suggested technique is successful and superior to other approaches, based on the outcomes of three diagnostic tasks with varying degrees of imperfection.
Hajihashemi, Zahra and Popescu, Mihail [22] identify elderly people at the highest risk of deterioration and adverse events using the methods in this research. Automated in-home monitoring systems can use this computational approach to track the health trends of older persons and notify healthcare practitioners so that they may take action before things worsen. The features of TSW are discussed in depth, followed by the findings of the suggested approach's performance on a benchmark dataset.
The research of Helander, Elina, et al. [23] was constrained because it relied on data from just two healthy patients; however, some findings were consistent with previous research. Behavioral weight loss participants self-monitored, on average, 28 percent of weekdays and 17 percent of weekend days, and these two individuals showed a reduction in self-monitoring frequency over the weekend. Weight fluctuates during the day and over time: an average daily weight fluctuation of 2% to 3% may be regarded as typical, with within-day weight fluctuations being more frequent than day-to-day fluctuations.
Hochstein, Axel, et al. [25] Using static Bayesian network theory, these authors develop the concept of
probabilistic event networks, which describe the relationships between regime shifts in time and their locations.
These authors demonstrate how RSVAR inference and learning algorithms must be altered to take higher-order
regime dynamics into account.
Lehman, Li-Wei H. et al. [32] This research aims to see whether the SLDS framework can be used to follow the
evolution of patients' health status over time. An ICU patient cohort was studied using the framework during the
first 24 hours of hospitalization. These authors found that the vital sign dynamics of patients who did not survive
their hospital stays evolved differently from those who did. As patients' health improves or worsens, the
distribution of their vital sign dynamics likewise changes. These findings confirm their hypothesis.
Liu, Junjian et al. [35] Using an ontology, these authors developed a real-time monitoring system to monitor patient
care flow and compare it to the specified CP treatment requirements. The suggested system's monitoring data is
organized and stored in a database, making it easy for the computer to handle and evaluate.
Mei, Jiangyuan, et al. [37] Computer vision and pattern recognition applications rely on accurately measuring and categorizing multivariate time series (MTS). A new method for measuring MTS similarity is presented: the local distance between MTS frames is first determined using the Mahalanobis distance, and dynamic time warping (DTW) is then used to discover the best route to align MTS that are out of synchronization or have various lengths. Once this is done, the difference between two MTS may be derived for classification and clustering purposes. Learning the Mahalanobis function for the MTS dataset is another major issue in the proposed MDDTW metric; LogDet divergence-based metric learning with triplet constraints was developed in this study for the MTS case. The technique was tested on several well-known datasets, and the results showed that the recommended strategy was dependable and accurate. One issue with the suggested framework is that it is computationally inefficient.
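A minimal sketch of a Mahalanobis distance-based DTW follows; the metric matrix M is assumed to be learned elsewhere (the authors' LogDet-divergence metric learning is not reproduced here), and the function names are illustrative.

```python
# A hedged sketch of MDDTW: Mahalanobis local distances between frames
# of two multivariate series, combined by the classic DTW recursion.
# M is an assumed, pre-learned positive semi-definite metric matrix.
import numpy as np

def mddtw(X, Y, M):
    """X: (n, d), Y: (m, d) multivariate series; M: (d, d) PSD metric."""
    n, m = len(X), len(Y)
    # Local Mahalanobis distances between every pair of frames
    diff = X[:, None, :] - Y[None, :, :]                  # (n, m, d)
    local = np.sqrt(np.einsum('nmd,de,nme->nm', diff, M, diff))
    # Standard DTW dynamic program over the local-distance matrix
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = local[i-1, j-1] + min(D[i-1, j], D[i, j-1], D[i-1, j-1])
    return D[n, m]

# With M = identity, MDDTW reduces to ordinary Euclidean DTW.
X = np.random.randn(50, 3); Y = np.random.randn(60, 3)
print(mddtw(X, Y, np.eye(3)))
```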
According to Pealat, Clement et al. [41], respiratory viruses significantly affect emergency departments (EDs) in France each winter. To manage this, it is critical to have the means to monitor the passage of these viruses across the patient population, which is exactly what is discussed in this paper.
Penfold, Robert B., and Zhang, Fang [42] present interrupted time series analysis as a simple yet effective method for evaluating policies and programs. Despite its drawbacks, few statistical techniques are as well designed or as effective at communicating an intervention's effect to an audience.
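For readers unfamiliar with the technique, a minimal segmented-regression sketch of an interrupted time series is shown below; the data are synthetic, and the intervention point and coefficients are illustrative assumptions.

```python
# A minimal sketch of interrupted time series (segmented regression):
# a level-and-slope change model fit by ordinary least squares.
# The series, intervention point and effect sizes are synthetic.
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(48)                 # months
intervention = 24                 # policy start (assumed)
post = (t >= intervention).astype(float)
y = 10 + 0.2*t - 3*post + 0.1*post*(t - intervention) + rng.normal(0, 1, 48)

# Design matrix: intercept, pre-existing trend, level change, slope change
X = np.column_stack([np.ones_like(t), t, post, post * (t - intervention)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print("level change:", beta[2], "slope change:", beta[3])
```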
Pierleoni, Paola, et al. [43] When used for Parkinson's disease diagnosis and treatment, this new system
presented in this research offers qualities that make it an excellent choice for ambulatory and home monitoring.
Tremors may be classified using a basic IMU device and a set of algorithms that can measure the intensity of
their symptoms in real-time using the UPDRS scale. The system also provides a mechanism for annotating
illness severity and progression to the neurologist. The report, thus, is unaffected by subjective judgments, such
as those made by medical experts who alternate in subsequent analyses of a patient.
Rusanov, Alexander, et al. [47] use data-driven time-series clustering to identify people with controlled and uncontrolled diabetes. The strategy may also apply to other illnesses or long-term medical conditions.
Sana Imtiaz et al. [48] develop an online system to predict a user's food habits and health statistics using a fitness monitoring app or a wearable device. To this end, these authors have developed and deployed a pipeline that can reliably forecast user behavior and utilize commonalities between people to increase model performance while ensuring data privacy. Assuming that the dataset and characteristics are consistent, the reported predictions deviate from the ground truth by less than 0.025 percent.
Shi, Yong, et al. [50] research Home Health Care logistics optimization, which is of major importance since transportation expenses are one of the main kinds of spending in the business.

According to the research of Sindhu Shantha Nair et al. [52], ethical considerations must be taken into account when applying these dimensions to the healthcare business. Healthcare businesses must practice these aspects to the letter to achieve a competitive edge, and these fields may benefit from more education and training to improve productivity. These perceptions and characteristics of the organizational pedestal contribute morally to healthcare excellence if they are understood, recognized, and acted upon.
Stylianides, Nikolas, et al. [55] The suggested solution's usability and cost-effectiveness are shown in this paper's
assessment scenario. As discussed earlier, many research institutions have formed common repositories to share
medical data. Researchers may access the data they need to analyze in databases or binary files.
Vascu, Todor, et al. [57] model sensor data streams using semantics. WSN and WBAN data streams demand adaptable architectures based on multiagent systems to meet the needs of real-time vital signs monitoring. The system creates semantically driven sensor data streams using the suggested approach; attributes such as timestamps, vital signs, and values define each reading.
Yandong Zheng et al. [61] provide an efficient and privacy-preserving forward algorithm, which they then employ to build a healthcare monitoring system. A collection of mutually orthogonal matrices was the first thing these authors offered, and a strategy for building one was presented shortly after.
Zhang, Ying, et al. [64] show a consistent correlation between physiological and clinical events using their method for synchronized data collection and clinical annotation. Even though hardware capabilities might affect its performance, the system gathered and evaluated patient monitoring algorithms in real time at the bedside.

III. CONCLUSION
Two things are anticipated in the future growth of technology and healthcare. First, the rising sophistication of computerization and software development will be a potent combination for forecasting: various complicated procedures and strategies that are now only imaginable may soon be realized and used in real circumstances. The volume of data is growing, along with its variance and degrees of complexity, and the number of healthcare departments requiring large-scale data forecasting will expand. In the future, there will be rapid growth in the combined use of these two technologies, forecasting and data mining. Developing new forecasting methodologies, such as those based on soft computing, offers various benefits: they may produce more accurate forecasting results than conventional approaches, with more efficient processes. A thorough evaluation of existing time series forecasting approaches is expected to serve as a direction for future classification and analysis research in the field.

REFERENCES
[1] Abbasi, A, France, S, Zhang, Z & Chen, H 2011, „Selecting Attributes for Sentiment Classification Using Feature
Relation Networks‟, IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 3, pp.447-462.
[2] Abdul Jalil Niazai; Abdullah Zahirzada; Mohammad Akbar Shahpoor; Abdul Rahman Safi; 2020 IEEE International
Conference on Advent Trends in Multidisciplinary Research and Innovation (ICATMRI)
[3] Agarwal, B & Mittal, N 2016, „Machine Learning Approach for Sentiment Analysis‟. In Prominent Feature Extraction
for Sentiment Analysis (pp. 21-45). Springer International Publishing.
[4] Agarwal, B, Poria, S, Mittal, N, Gelbukh, A & Hussain, A 2015, „Concept-Level Sentiment Analysis with Dependency-
Based Semantic Parsing: A Novel Approach‟. Cognitive Computation, vol. 7, no. 4, pp. 487-499.
[5] Ahmad, S, Kundi, FM, Tareen, I & Asghar, MZ 2016, „Lexical Based Semantic Orientation of Online Customer
Reviews and Blogs‟. arXiv preprint arXiv:1607.02355.
[6] Almeida, Rui Jorge; Lesot, Marie-Jeanne; Bouchon-Meunier, Bernadette; Kaymak, Uzay; Moyse, Gilles (2013).IEEE
International Conference on Fuzzy Systems (FUZZ-IEEE) - Linguistic summaries of categorical time series for septic
shock patient data.
[7] Bagheri, A, Saraee, M & De Jong, F 2013, „Care More About Customers: Unsupervised Domain-Independent Aspect
Detection for Sentiment Analysis of Customer Reviews‟. Knowledge-Based Systems, vol. 52, pp. 201-213.
[8] Baldassano, Steven; Gelfand, Michael; Bhalla, Paulomi Kadakia; Hill, Chloe; Christini, Amanda; Wagenaar, Joost; Litt,
Brian; Roberson, Shawniqua Williams; Balu, Ramani; Scheid, Brittany; Bernabei, John; Pathmanathan, Jay; Oommen,
Brian; Leri, Damien; Echauz, Javier (2020). IRIS: A Modular Platform for Continuous Monitoring and Caretaker
Notification in the Intensive Care Unit. IEEE Journal of Biomedical and Health Informatics.
[9] Biem, A.; Feng, H.; Riabov, A. V.; Turaga, D. S. (2013). Real-time analysis and management of big time-series data.
IBM Journal of Research and Development, 57(3), 8:1–8:12.
[10] Cao, Xi Hang; Han, Chao; Obradovic, Zoran (2018). IEEE International Conference on Healthcare Informatics (ICHI) -
Learning a Dynamic-Based Representation for Multivariate Biomarker Time Series Classifications.

[11] Chen, CC & Tseng, YD 2011, „Quality Evaluation of Product Reviews Using an Information Quality Framework‟,
Decision Support Systems, vol. 50, no. 4, pp. 755-768.
[12] Dai, Xiangfeng; Bikdash, Marwan (2017). Trend Analysis of Fragmented Time Series for health Apps: Hypothesis
Testing Based Adaptive Spline Filtering Method with Importance Weighting.
[13] Divya Gupta; M. P. S. Bhatia; Akshi Kumar; (2021). Resolving Data Overload and Latency Issues in Multivariate Time-Series IoT Data for Mental Health Monitoring.
[14] Dugast, Mael; Bouleux, Guillaume; Mory, Olivier; Marcon, Eric (2018). Improving Health Care Management Through
Persistent Homology of Time-Varying Variability of Emergency.Department Patient Flow. IEEE Journal of Biomedical
and Health Informatics.
[15] Fabien Viton;Mahmoud Elbattah;Jean-Luc Guerin;Gilles Dequen; (2020). Heatmaps for Visual Explainability of CNN-
Based Predictions for Multivariate Time Series with Application to Healthcare . 2020 IEEE International Conference on
Healthcare Informatics (ICHI).
[16] Ferenti, Tamas (2017).IEEE 30th Neumann Colloquium (NC) - Biomedical applications of time series analysis.
[17] Fujita, Hamido (2017).IEEE 15th International Symposium on Intelligent Systems and Informatics (SISY) - Data
analytics for cloud healthcare and risk predictions based on ensemble classifiers and subjective projection.
[18] Qu, Gang; Cui, Shengnan; Tang, Jiafu (2014). The 26th Chinese Control and Decision Conference (2014 CCDC) -
Time series forecasting of medicare fund expenditures based on historical data.
[19] Garg, Bindu; Beg, M. M. Sufyan; Ansari, A. Q. 2012 Annual Meeting of the North American Fuzzy Information
Processing Society (NAFIPS) - A new computational fuzzy time series model to forecast several outpatient visits.
[20] Gunnarsdottir, Kristin; Sadashivaiah, Vijay; Kerr, Matthew; Santaniello, Sabato; Sarma, Sridevi V.2016 38th Annual
International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) - Using demographic and
time-series physiological features to classify sepsis in the intensive care unit.
[21] Hai, Z, Chang, K, Kim, JJ & Yang, CC 2014, „Identifying Features in Opinion Mining Via Intrinsic and Extrinsic
Domain Relevance‟, IEEE Transactions on Knowledge and Data Engineering, vol. 26, no. 3, pp. 623-634.
[22] Hajihashemi, Zahra; Popescu, Mihail (2015). A Multi-dimensional Time Series Similarity Measure with Applications to
Eldercare Monitoring. IEEE Journal of Biomedical and Health Informatics.
[23] Helander, Elina; Pavel, Misha; Jimison, Holly; Korhonen, Ilkka (2015). 37th Annual International Conference of the
IEEE Engineering in Medicine and Biology Society (EMBC) - Time-series modeling of long-term weight self-
monitoring data.
[24] Hirano, Shoji; Tsumoto, Shusaku (2017). 6th International Conference on Informatics, Electronics and Vision & 2017
7th International Symposium in Computational Medical and Health Technology (ICIEV-ISCMHT) - Towards
knowledge discovery from heterogeneous time-series medical databases
[25] Hochstein, Axel; Hyung-Il Ahn; Ying Tat Leung; Denesuk, Matthew 2014 International Conference on Prognostics and
Health Management - Switching vector autoregressive models with higher-order regime dynamics Application to
prognostics and health management.
[26] Jiang, L, Yu, M, Zhou, M, Liu, X, & Zhao, T 2011, „Target-Dependent Twitter Sentiment Classification. In Proceedings
of the 49th Annual Meeting of the Association for Computational Linguistics. vol. 1, pp. 151-160.
[27] Kalaivani, P & Shunmuganathan, KL 2013,Sentiment Classification of Movie Reviews By Supervised Machine
Learning Approaches‟, Indian Journal of Computer Science and Engineering (IJCSE), vol. 4, no. 4, pp. 285-292.
[28] Kale, David C.; Gong, Dian; Che, Zhengping; Liu, Yan; Medioni, Gerard; Wetzel, Randall; Ross, Patrick (2014). IEEE
International Conference on Data Mining - An Examination of Multivariate Time Series Hashing with Applications to
Health Care.
[29] Kamal, A, Abulaish, M & Anwar, T 2012, „Mining Feature-Opinion Pairs and Their Reliability Scores From Web
Opinion Sources‟, In Proceedings of the 2nd International Conference on Web Intelligence Mining and Semantics
ACM, pp. 15.
[30] Kristoffer Wickstrom;Karl Oyvind Mikalsen;Michael Kampffmeyer;Arthur Revhaug;Robert Jenssen;
(2021). Uncertainty-Aware Deep Ensembles for Reliable and Explainable Predictions of Clinical Time Series. IEEE
Journal of Biomedical and Health Informatics.
[31] Lavergne, M. Ruth; Law, Michael R.; Peterson, Sandra; Garrison, Scott; Hurley, Jeremiah; Cheng, Lucy; McGrail,
Kimberlyn (2017). Effect of incentive payments on chronic disease management and health services use in British
Columbia, Canada: Interrupted time series analysis.
[32] Lehman, Li-Wei H.; Nemati, Shamim; Adams, Ryan P.; Moody, George; Malhotra, Atul; Mark, Roger G. 2013 35th
Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) - Tracking
progression of the patient state of health in critical care using inferred shared dynamics in physiological time series.
[33] Lin, C, He, Y, Everson, R & Rüger, S 2012, „Weakly Supervised Joint Sentiment-Topic Detection From Text‟,
Knowledge and Data Engineering, IEEE Transactions, vol. 24, no. 6, pp. 1134-1145.
[34] Liu, Bo; Li, Jianqiang; Chen, Cheng; Tan, Wei; Chen, Qiang; Zhou, MengChu (2015). Efficient Motif Discovery for
Large-Scale Time Series in Healthcare. IEEE Transactions on Industrial Informatics.
[35] Liu, Junjian; Huang, Zhengxing; Lu, Xudong; Duan, Huilong (2014). 7th International Conference on Biomedical
Engineering and Informatics - An ontology-based real-time monitoring approach to a clinical pathway.

[36] Liu, K, Xu, L & Zhao, J 2012, „Opinion Target Extraction Using Word- Based Translation Model‟, In Proceedings of
the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language
Learning, Association for Computational Linguistics, pp. 1346-1356.
[37] Mei, Jiangyuan; Liu, Meizhu; Wang, Yuan-Fang; Gao, Huijun (2015). Learning a Mahalanobis Distance-Based
Dynamic Time Warping Measure for Multivariate Time Series Classification. IEEE Transactions on Cybernetics.
[38] Moraes, R, Valiati, JF & Neto, WPG 2013, „Document-Level Sentiment Classification: An Empirical Comparison
Between SVM and ANN‟, Expert Systems with Applications, vol. 40, no. 2, pp. 621-633.
[39] Nickerson, Paul; Baharloo, Raheleh; Wanigatunga, Amal A.; Manini, Todd D.; Tighe, Patrick J.; Rashidi, Parisa
(2017). Transition Icons for Time Series Visualization and Exploratory Analysis. IEEE Journal of Biomedical and
Health Informatics.
[40] Ordonez, P.; Oates, T.; Lombardi, M. E.; Hernandez, G.; Holmes, K. W.; Fackler, J.; Lehmann, C. U.
(2012). Visualization of multivariate time-series data in a neonatal ICU.
[41] Pealat, Clement; Bouleux, Guillaume; Cheutet, Vincent (2019). IEEE EMBS International Conference on Biomedical &
Health Informatics (BHI) - Extracting Most Impacting Emergency Department Patient Flow By Embedding Laboratory-
confirmed and Clinical Diagnosis on The Stiefel Manifold.
[42] Penfold, Robert B.; Zhang, Fang (2013). Use of Interrupted Time Series Analysis in Evaluating Health Care Quality
Improvements.
[43] Pierleoni, Paola; Palma, Lorenzo; Belli, Alberto; Pernini, Luca (2014). IEEE-EMBS International Conference on
Biomedical and Health Informatics (BHI) - A real-time system to aid clinical classification and quantification of tremor
in Parkinson's disease.
[44] Pratyaksa, Hans; Permanasari, Adhistya Erna; Fauziati, Silmi; Fitriana, Ida (2016). 1st International Conference on
Biomedical Engineering (BIOMED) - ARIMA implementation to predict the amount of antiseptic medicine usage in a
veterinary hospital.
[45] Rajaei, Rasoul; Shafai, Bahram; Ramezani, Amin (2017). IEEE High-Performance Extreme Computing Conference
(HPEC) - A top-down scheme of descriptive time series data analysis for a healthy life: Introducing a fuzzy amended
interaction network.
[46] Roberts, Lauren; Michalak, Peter; Heaps, Sarah; Trenell, Michael; Wilkinson, Darren; Watson, Paul (2018). IEEE 14th
International Conference on e-Science (e-Science) - Automating the Placement of Time Series Models for IoT
Healthcare Applications.
[47] Rusanov, Alexander; Prado, Patric V.; Weng, Chunhua (2016). IEEE International Conference on Healthcare
Informatics (ICHI) - Unsupervised Time-Series Clustering Over Lab Data for Automatic Identification of Uncontrolled
Diabetes.
[48] Sana Imtiaz; Sonia-Florina Horchidan; Zainab Abbas; Muhammad Arsalan; Hassan Nazeer Chaudhry; Vladimir
Vlassov; (2020). Privacy-Preserving Time-Series Forecasting of User Health Data Streams. 2020 IEEE International
Conference on Big Data (Big Data).
[49] Shamsuddin, Rittika; Maweu, Barbara M.; Li, Ming; Prabhakaran, Balakrishnan (2018). IEEE International Conference
on Healthcare Informatics (ICHI) - Virtual Patient Model: An Approach for Generating Synthetic Healthcare Time
Series Data.
[50] Shi, Yong; Boudouh, Toufik; Grunder, Olivier (2017). A hybrid genetic algorithm for a home health care routing
problem with a time window and fuzzy demand. Expert Systems with Applications.
[51] Shukla, Shubhangu; Singh, Pulkit; Neopane, Narayan; Rishabh, (2019). 2019 4th International Conference on
Information Systems and Computer Networks (ISCON) - Health Care Management System Using Time Series Analysis.
[52] Sindhu Shantha Nair;Kennedy Andrew Thomas;Smritika S. Prem; (2021). The organizational pedestal of quality of care
climate in health care excellence. Zeitschrift für Evidenz, Fortbildung und Qualität im Gesundheitswesen.
[53] Sobkowicz, P, Kaschesky, M & Bouchard, G 2012, „Opinion Mining in Social Media: Modeling, Simulating and
Forecasting Political Opinions in the Web‟, Government Information Quarterly, vol. 29, no. 4, pp. 470-479.
[54] Stavrianou, A & Brun, C 2015, „Expert Recommendations Based on Opinion Mining of User‐Generated Product
Reviews‟, International Journal of Computational Intelligence, vol. 31, no. 1, pp. 165-183.
[55] Stylianides, Nikolas; Dikaiakos, Marios; Gjermundrod, Harald; Theodoros, (2012). 2012 IEEE 12th International
Conference on Bioinformatics & Bioengineering (BIBE) - Intensive Care Cloud: Exploiting cloud infrastructures for
near real-time vital sign analysis in intensive care medicine.
[56] Takeuchi, H.; Mayuzumi, Y.; Kodama, N. (2011). Annual International Conference of the IEEE Engineering in
Medicine and Biology Society - Analysis of time-series correlation between weighted lifestyle data and health data.
[57] Vascu, Todor; Frincu, Marc; Negru, Viorel (2016). International Symposium on innovations in Intelligent systems and
Applications (INISTA) - Energy-efficient sensors data stream model for real-time and continuous vital signs monitoring.
[58] Vong, Keovessna; Rasmequan, Suwanna; Chinnasarn, Krisana; Harfield, Antony (2015). 8th Biomedical Engineering
International Conference (American) - Empirical modeling for dynamic visualization of ICU patient data streams.
[59] Wickramasinghe, Asanga; Ranasinghe, Damith C.; Fumeaux, Christophe; Hill, Keith D.; Visvanathan, Renuka
(2016). Sequence Learning with Passive RFID Sensors for Real-Time Bed-egress Recognition in Older People. IEEE
Journal of Biomedical and Health Informatics.
[60] Xu, K, Liao, SS, Li, J & Song, Y 2011, „Mining Comparative Opinions From Customer Reviews for Competitive
Intelligence‟, Decision Support Systems, vol. 50, no. 4, pp. 743-754.

[61] Yandong Zheng; Rongxing Lu; Songnian Zhang; Yunguo Guan; Jun Shao; Hui Zhu; (2022). Toward Privacy-Preserving Healthcare Monitoring Based on Time-Series Activities Over Cloud. IEEE Internet of Things Journal.
[62] Yang, Chengliang; Delcher, Chris; Shenkman, Elizabeth; Ranka, Sanjay 2018. IEEE 20th International Conference on e-
Health Networking, Applications and Services (Healthcom) - Clustering Inter-Arrival Time of Health Care Encounters
for High Utilizers.
[63] Yanke Hu;Raj Subramanian;Wangpeng An;Na Zhao;Weili Wu; (2020). Faster Healthcare Time Series Classification for
Boosting Mortality Early Warning System . 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems
(IROS).
[64] Zhang, Ying; Silvers, Christine Tsien; Randolph, Adrienne G. 2007 29th Annual International Conference of the IEEE
Engineering in Medicine and Biology Society - Real-Time Evaluation of Patient Monitoring Algorithms for Critical
Care at the Bedside.
[65] Zina M. Ibrahim; Daniel Bean; Thomas Searle; Linglong Qian; Honghan Wu; Anthony Shek; Zeljko Kraljevic; James
Galloway; Sam Norton; James T Teo; Richard JB Dobson; (2022). A Knowledge Distillation Ensemble Framework for
Predicting Short- and Long-Term Hospitalization Outcomes From Electronic Health Records Data . IEEE Journal of
Biomedical and Health Informatics.
[66] Zwilling, Chris E.; Wang, Michelle Yongmei (2014). IEEE Healthcare Innovation Conference (HIC) - Multivariate
Voronoi outlier detection for time series.
[67] Zikos, D. and Ostwal, D., "A Platform based on Multiple Regression to Estimate the Effect of in-Hospital Events on Total
Charges," in Healthcare Informatics (ICHI), 2016 IEEE International Conference on, 2016, pp. 403-408.

Grenze International Journal of Engineering and Technology, June Issue

Detecting and Isolating Black-Hole Attacks in MANET using Timer Based Baited Technique
Paramjit1 and Saurabh Charya2
1-2 OSGU/CSE, Hisar, India
Email: [email protected], [email protected]

Abstract—A mobile ad hoc network (MANET) is a wireless system that has no established infrastructure; its nodes are able to communicate without a central authority. MANETs are ideal for emergency circumstances, vehicle networks, and military activities. However, the MANET's flexibility makes it vulnerable to attacks such as the black hole attack, one of the most common threats to MANET. In this attack, an unauthorized node claims to have the best path to a target node, causing data packets to be misdirected and then dropped. Several countermeasures have been proposed to date. An overview of black hole attack prevention measures and a conclusion are presented in this work.

Index Terms— Cooperative Black Hole Attack, Black Hole Attack, Malicious Node, Packets.

I. INTRODUCTION
This short-term network consists only of mobile nodes, each capable of sending and receiving data by itself, without the aid of fixed connections. Using multi-hop communication, MANET nodes can exchange information: there is no direct data link between a destination node and a source node if the two are not within communication range (Goyal, P., Parmar, V., & Rishi, R., 2011). A MANET has a dynamic topology, since nodes can join and disconnect quickly, and this dynamic topology makes the network more vulnerable to a wide range of attacks. As a result, building such a network and ensuring its route stability is extremely difficult. Various forms of malicious attacks are carried out against MANETs; however, our primary focus is on black hole attacks (Gerhards-Padilla, Aschenbruck and Martini, 2010). The healthcare industry demands round-the-clock monitoring, which includes both routine updates and real-time emergency alerts sent across the network; the main problem in these situations is that attacker nodes cause unnecessary delays with potentially disastrous effects, producing both traffic jams and network delays. In a black hole assault, the attacking node pretends to have the quickest pathway to the target node. Using this strategy, the attacker's node generates a bogus route, and all traffic is diverted to that node (Percher et al., 2004). As a result, the attacker node is able to intercept all of the packets that are sent to or from the designated destination (Santhakumar and Prabha, 2017). To find and separate black hole nodes inside a MANET, the Timer Based Baited Technique (TBBT) includes both timers and baiting. Through the use of a bait message, this strategy enhances the ability to detect black holes: false-ID baiting is used to find the network's black-hole nodes. This method, on the other hand, increases network latency while decreasing throughput (Yasin and Abu Zant, 2018). The Counter & Timer Based Baited Technique (CTBM) for splitting black hole attacks in MANET is presented to address these issues. The baiting message, the non-neighbor reply, and the retort are all part of this strategy; the network's black hole nodes are distinguished by their performance in each of these three key areas. A minimal sketch of the bait idea appears below.
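The following sketch is a rough illustration only: the Node class, FAKE_DEST and BAIT_TIMEOUT are hypothetical simplifications, not the TBBT protocol messages.

```python
# A hedged sketch of the bait idea behind TBBT: the source broadcasts
# an RREQ for a fake, nonexistent destination address; only a black-hole
# node, which claims a route to *any* destination, answers with an RREP.
import time

FAKE_DEST = "00:00:00:00:99"   # assumed nonexistent node id
BAIT_TIMEOUT = 2.0             # seconds to wait for bait replies (assumed)

class Node:
    def __init__(self, nid, malicious=False):
        self.nid, self.malicious = nid, malicious
    def on_rreq(self, dest):
        # An honest node has no route to the fake destination;
        # a black-hole node falsely advertises the "best" route.
        return self.malicious

def bait_detection(nodes):
    suspects, deadline = [], time.time() + BAIT_TIMEOUT
    for n in nodes:
        if time.time() < deadline and n.on_rreq(FAKE_DEST):
            suspects.append(n.nid)   # replied to bait -> isolate
    return suspects

nodes = [Node(i, malicious=(i == 3)) for i in range(6)]
print("isolated black-hole nodes:", bait_detection(nodes))
```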

II. RELATED WORK


The black hole attack is among the most damaging attacks on MANETs; it is also known as a packet drop attack, owing to the network's open medium and dynamic topology. Black hole nodes reveal themselves during the route discovery step, when a sender node does not yet hold an appropriate path to the receiving node (Sarma, K. J., Sharma, R., & Das, R., 2014). Black hole assaults can be classed into three sorts: single attacks, multiple attacks, and collaborative attacks; as the names imply, a single node or several cooperating nodes may participate in the attack (Nakayama et al., 2009). Network nodes can be readily assaulted by collaborative operations such as the black hole, grey hole and jellyfish attacks, which are the most significant attacks that drop packets before transmission (Bala, Bansal and Singh, 2009). The Improved Cooperative Bait Detection technique is employed to defend against such collaborative attacks, detecting malignant nodes in MANETs under active black hole and jellyfish assaults (Woungang, Dhurandher, Obaidat and Peddi, 2013; Sherif, A., Elsabrouty, M., & Shoukry, A., 2013). An anti-black-hole method detects black hole nodes by assessing a suspicion value from RREQ and RREP messages; if a node's value exceeds the threshold, the node is declared a black hole within the network. A routing-information table is further employed to determine, and thereby lessen, cooperative black hole and grey hole attacks: the table records an attacker node's previous adverse behavior, including grey hole conduct, so that a cooperative black and grey hole attack can be noticed and eliminated at both the sender and intermediate nodes (Patil and Kshirsagar, 2020). In black hole and grey hole attacks, the attackers purposefully interrupt data connectivity by delivering erroneous routing data; in an Ad hoc On-demand Distance Vector (AODV)-based approach, an intermediate node identifies the attacker node delivering bogus routing information, and routing packets are used to carry routing data and to spread information about malicious nodes (Kumar and Kumar, 2015). Finally, a behavioral and node-monitoring scheme over AODV detects the grey hole assault: behavioral irregularities in the data forwarded by a grey hole node are recognized, and a block message identifying the grey hole node is sent to every entering node to avoid the assault (PratapSingh, Pal Singh and Singh, 2013).
A. Black hole attack
Black hole attacks take two forms: single and cooperative (group) attacks (Baadache and Belmehdi, 2017), classified according to the number of attacking nodes.
B. Single Black Hole Attack
In this attack, a single node uses the routing protocol to falsely claim that it is the nearest route to a target, attracting the data packets of other nodes and then dropping them (Baadache and Belmehdi, 2017). MANETs are vulnerable even to a single black hole attack. Figure 1 shows a single black hole attack.

Figure 1. Representation of source and destination nodes (source: Kaur and Kaur, 2017)

A source node is shown as node 1 in Fig. 1, while a destination node is shown as node 4. When the RREQ packets from the source node are received, Node 3, which is malicious, answers that its path to the target node is the shortest (Baadache and Belmehdi, 2017). The malicious node is then responsible for the loss of data packets; in the context of MANET, such a malicious node is referred to as a "black hole."
C. Cooperative Black hole attack
During a cooperative black hole attack, numerous malicious nodes work together to break the routing protocol specifications. Figure 2 depicts a cooperative black hole attack (Baadache and Belmehdi, 2017).

Figure 2. Nodes with FReq (source: Kaur and Kaur, 2017)

In Fig. 2, the source node is designated by the letter A, while the destination node is denoted by the letter D. Nodes B1 and B2 are collaborating malicious nodes. The source node transmits a Further Request (FReq) to B2 over routes other than through B1; since B2 is the claimed next hop of node B1, this checks whether node B2 really has a route to the target (Baadache and Belmehdi, 2017). However, data packets are dropped by rogue node B1, despite the fact that both B1 and B2 produce Further Reply (FRep) packets claiming to have the most secure and fastest route.

III. DIFFERENT DETECTION SCHEMES FOR SINGLE BLACK HOLE


A. Watchdog Mechanism
The watchdog mechanism discovers nodes that are misbehaving. It keeps a buffer that holds recently sent data packets; by listening to all of its neighboring nodes, the watchdog verifies that data packets are forwarded to the next node on the path (Baadache and Belmehdi, 2017). Whenever a node fails to forward data packets, the network identifies it as a malicious, or black hole, node. The watchdog has less overhead and a lower End-to-End Delay than other solutions. A simplified sketch of the bookkeeping follows.
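In the sketch below, FAILURE_THRESHOLD and the event methods are illustrative assumptions, not a protocol implementation.

```python
# A hedged sketch of the watchdog idea: after forwarding a packet, keep
# it in a buffer and overhear the channel; if the next hop never
# retransmits it before a failure threshold is reached, flag that
# neighbor as a misbehaving (black-hole) node.
from collections import defaultdict

FAILURE_THRESHOLD = 3   # assumed tolerance before flagging a node

class Watchdog:
    def __init__(self):
        self.buffer = {}                    # packet id -> next hop
        self.failures = defaultdict(int)    # next hop -> missed count
        self.blacklist = set()
    def sent(self, pkt_id, next_hop):
        self.buffer[pkt_id] = next_hop      # remember what we expect to overhear
    def overheard(self, pkt_id):
        self.buffer.pop(pkt_id, None)       # neighbor forwarded it: all good
    def timeout(self, pkt_id):
        hop = self.buffer.pop(pkt_id, None)
        if hop is not None:
            self.failures[hop] += 1
            if self.failures[hop] >= FAILURE_THRESHOLD:
                self.blacklist.add(hop)     # treat as black-hole node

wd = Watchdog()
for i in range(3):
    wd.sent(i, "node7"); wd.timeout(i)
print(wd.blacklist)   # {'node7'}
```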
B. Time-based Threshold detection Scheme
This algorithm, which is based on the original AODV (a loop-free routing protocol), exhibits a high degree of reactivity. Following receipt of the initial request, a timer is started in the Timer Expired Table to collect requests from other nodes in the network; the arrival time and a threshold value are used to store information about packet sequence and timeouts (Baadache and Belmehdi, 2017). The simulation tool of choice here is GloMoSim (Global Mobile Simulator). A higher PDR can be reached with minimal end-to-end delay and overhead; one plausible reading of the timer-and-threshold logic is sketched below.
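The window and threshold values in this sketch are assumed, and the use of hop count as the plausibility test is one interpretation of the scheme, not its published parameters.

```python
# A hedged sketch of a timer-based RREP collection with a threshold
# test: replies arriving within a timer window are screened, and a
# reply claiming an implausibly good route is discarded as suspicious.
import time

TIMER_WINDOW = 0.5   # seconds to collect RREPs after the first (assumed)
THRESHOLD = 2        # minimum plausible hop count for a route (assumed)

def collect_rreps(rrep_source):
    """rrep_source yields (node_id, hop_count) tuples as replies arrive."""
    accepted, start = [], time.time()
    for node_id, hops in rrep_source:
        if time.time() - start > TIMER_WINDOW:
            break                            # timer in the Timer Expired Table ran out
        if hops < THRESHOLD:                 # implausibly short route
            print(f"discarding suspicious RREP from {node_id}")
        else:
            accepted.append((node_id, hops))
    return accepted

print(collect_rreps(iter([("n3", 1), ("n5", 3), ("n8", 4)])))
```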
C. Resource Efficient Accountability
Resource Efficient Accountability (REAcT) is based on a random audit scheme and is designed to be efficient. Whenever performance between a source node and the destination degrades, the REAcT strategy is activated to prevent further degradation (Baadache and Belmehdi, 2017). It is divided into three phases: (i) the Inspection phase, (ii) the Hunt phase, and (iii) the Identification phase. Once a packet drop ratio (PDR) degradation is detected by the destination node, the source node is notified via feedback from the end point. The source node then selects audit nodes, which search for evidence against the attacker node and afterwards locate the attacker node's position.
When compared to a traditional routing scheme, the REAcT scheme minimizes overload; however, the delay is greater because it relies on the reactive dynamic source routing proposed protocol (RDSRP). There are some disadvantages to using REAcT (Baadache and Belmehdi, 2017).

First, it is meant for non-cooperative attacks, and it is unsuccessful against a black hole, since the black hole node transmits phony proof to the audit node, rendering the detection ineffective. Second, the attacker node's origin is not recorded, since the behavioral proof in REAcT only stores information about transmitted packets, not nodes.
D. Neighborhood based Routing Recovery System
The routing recovery scheme is based on the AODV protocol and identifies black holes in the neighborhood. Routing recovery protocols are used to identify the assault and build the correct path (Baadache and Belmehdi, 2017); a Modify-Route-Entry control message may be sent to the destination node if the paths are not the same. This method achieves a high detection rate while requiring less time to detect; however, the detection fails when an attacker creates a forged RREP packet.

IV. DIFFERENT DETECTION SCHEME FOR COOPERATIVE BLACK HOLE


A. Hybrid Routing Scheme
Bait-DSR combines Watchdog and DSR, while the hybrid routing protocol combines both reactive and proactive routing techniques. The RREP field in DSR contains information about other nodes' RREPs (Zant, 2017), which enables the source node to track down the intruder node. Bait-DSR designates a node as a black hole if its dropped-packet count surpasses the threshold value (Zant, 2017). A comparison is made between Bait-DSR and simulations of Watchdog and DSR: its PDR is ninety percent higher than that of DSR and Watchdog (Zant, 2017).
B. Hash-based Scheme
This is a hash-based method for generating proofs of node behavior that incorporates information about the data traffic flowing along a routing path (Zant, 2017). Auditing techniques are used to prevent, and fight against, assaults such as black hole and gray hole attacks in this evolving scheme, with REAcT serving as the underlying audit mechanism. In this approach, an additional audit node is required, set up by the source node (Zant, 2017). All packets are sent through the auditing node, and a random number is appended to the tail of each one. The hash value is computed from the received packet and the random number generated by the intermediate node (Zant, 2017); after receiving a packet from an intermediate node, the audit node is able to continue the auditing process. A simplified sketch of such a behavior proof follows.
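In the sketch below, the message format and the choice of SHA-256 are assumptions made for illustration; the scheme's actual encoding is not specified here.

```python
# A hedged sketch of a hash-based behavior proof: each forwarding node
# appends a random nonce to the packet and reports hash(packet || nonce)
# to an audit node, which can later check that a claimed forwarding
# event matches the packet actually observed on the path.
import hashlib, os

def behavior_proof(packet: bytes) -> tuple[bytes, str]:
    nonce = os.urandom(8)                        # random number added to the tail
    digest = hashlib.sha256(packet + nonce).hexdigest()
    return nonce, digest                         # digest is sent to the audit node

def audit(packet: bytes, nonce: bytes, claimed: str) -> bool:
    return hashlib.sha256(packet + nonce).hexdigest() == claimed

pkt = b"DATA:src=A,dst=D,seq=42"
nonce, proof = behavior_proof(pkt)
print(audit(pkt, nonce, proof))    # True -> node really handled the packet
```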

V. RESULTS
A. Single Black-Hole Attack
Because of the packet dropping brought on by the black-hole attack, throughput was at its lowest when a single black hole was present. When no black-hole attack is present in the network, native AODV achieved the greatest throughput. When a black-hole attack is present, the throughput of TBBT is higher than that of native AODV, but it is lower than native AODV when there is no black-hole assault. The suggested TBBT improves resilience by discarding any replies from unidentified nodes that claim to have a quicker pathway to the target node than any other node, which results in a reduction in throughput. Additionally, the location of a black-hole attack is crucial, since it can be situated on the route that travels the shortest distance from source to destination.

Figure 3: Results of Throughput versus the number of nodes (source: Yasin and Abu Zant, 2018)

Figure 4: Results of average End-to-End Delay versus the number of nodes (source: Yasin and Abu Zant, 2018)

Figure 5: Results of PDR versus the number of nodes (source: Yasin and Abu Zant, 2018)

B. Cooperative Black hole attack


Cooperative black-hole attacks can drive throughput to zero, because multiple black-hole nodes can make it impossible for the source node and the destination node to link at all. Throughput for TBBT AODV is therefore reduced as the network's black-hole node count rises. The potential black hole's location on the path connecting the source node and the end node, together with the fact that TBBT discards any response from unidentified nodes, causes the decrease in throughput.

Figure 6: Results of Throughput against the number of black-hole nodes (source: Yasin and Abu Zant, 2018)

Figure 7: Results of average End-to-End Delay against the number of black-hole nodes (source: Yasin and Abu Zant, 2018)

Figure 8: Results of PDR against the number of black-hole nodes (source: Yasin and Abu Zant, 2018)

C. Comparison with Other Proposed Models


We validated our proposed model against the alternatives discussed previously using two distinct data sets. We label the first comparison model PAODV. As discussed previously, and unlike other proposed defenses, PAODV cannot be circumvented by a smart black-hole node. We simulated TBBT using the same number of nodes as PAODV (15-50). TBBT tripled its Throughput while simultaneously decreasing its End-to-End Delay by 22.31 percent. TBBT is more effective than PAODV in terms of throughput, but PAODV is more effective in terms of end-to-end delay, as demonstrated by these two experiments. The second comparison model is referred to as DAODV. We conducted simulations of TBBT using the same metrics as DAODV, in which the mobility of nodes can range from 0 to 10, and drew conclusions from these simulations. TBBT showed a 3.78 percent increase in End-to-End Delay and a 15.60 percent reduction in Throughput when compared to native AODV without a black-hole assault. In the presence of a black-hole attack, TBBT reduced End-to-End Delay by 9.04% and increased Throughput by 542.85% in contrast to native AODV. This model offers the best End-to-End Delay, but its Throughput is certainly not the best available. If no black-hole nodes exist in the network, native AODV throughput in a static topology is 151,529; otherwise, it is 14,346. We believe this is significant enough to bring up. In the presence of a black hole, TBBT's throughput of 143,476 approaches that of native AODV. This is because there is little variation in the topology, and TBBT will not discard a packet if it originates from a node it already knows about; this means there are no replies from unidentified nodes.

VI. CONCLUSION
Black-hole attacks are one of the most serious dangers to MANET. To keep the network from collapsing, black-hole nodes must be identified and isolated. This work surveyed techniques for identifying and shutting down black holes, which should be considered when developing protocols or methods against black-hole attacks. The TBBT technique uses timers and baiting to improve black-hole identification while preserving End-to-End Delay, Throughput, and Packet Delivery Ratio. In simulation, the Throughput and Packet Delivery Ratio of the proposed technique were determined to be almost identical to native AODV. We hope to improve this model's throughput and packet delivery ratio while reducing overall latency.

REFERENCES
[1] Baadache, A. and Belmehdi, A., 2017. Solution for Black Hole and Cooperative Black Hole Attacks in Mobile Ad Hoc
Networks. Egyptian Computer Science Journal (ISSN 1110-2586), vol. 41, no. 1, January 2017.
[2] Bala, A., Bansal, M. and Singh, J., 2009. Performance analysis of MANET under blackhole attack. In 2009 First
International Conference on Networks & Communications (pp. 141-145). IEEE.
[3] Gerhards-Padilla, E., Aschenbruck, N. and Martini, P., 2010. TOGBAD-an approach to detect routing attacks in tactical
environments. Security and Communication Networks, 4(8), pp.793-806.
[4] Goyal, P., Parmar, V., & Rishi, R., 2011. Manet: vulnerabilities, challenges, attacks, application. IJCEM International
Journal of Computational Engineering & Management, 11(2011), 32-37.
[5] Kaur, R. and Kaur, A., 2017. Technique for Detection and Isolation of Black Hole Attack in MANETs. International
Journal of Computer Applications, 174(4), pp.22-25.
[6] Kumar, V. and Kumar, R., 2015. An Adaptive Approach for Detection of Blackhole Attack in Mobile Ad hoc
Network. Procedia Computer Science, 48, pp.472-479.
[7] Nakayama, H., Kurosawa, S., Jamalipour, A., Nemoto, Y. and Kato, N., 2009. A Dynamic Anomaly Detection Scheme
for AODV-Based Mobile Ad Hoc Networks. IEEE Transactions on Vehicular Technology, 58(5), pp.2471-2481.
[8] Patil, A. and Kshirsagar, D., 2020. Blackhole attack detection and prevention by real time monitoring. In 2013 Fourth
International Conference on Computing, Communications and Networking Technologies (ICCCNT) (pp. 1-5). IEEE.
[9] Percher, J., Puttini, R., Mé, L., Sousa, d., Jouga, B. and Albers, P., 2004. A fully distributed IDS for MANET. In
Proceedings. ISCC 2004. Ninth International Symposium on Computers And Communications (IEEE Cat. No.
04TH8769) (Vol. 1, pp. 331-338). IEEE.
[10] PratapSingh, H., Pal Singh, V. and Singh, R., 2013. Cooperative Blackhole/ Grayhole Attack Detection and Prevention
in Mobile Ad hoc Network: A Review. International Journal of Computer Applications, 64(3), pp.16-22.
[11] Santhakumar, R. and Prabha, N., 2017. Resource Allocation In Wireless Networks By Channel Estimation And Relay
Assignment Using Data-Aided Techniques. International Journal of MC Square Scientific Research, 9(3), pp.40-47.
[12] Sarma, K. J., Sharma, R., & Das, R., 2014. A survey of black hole attack detection in manet. In 2014 International
Conference on Issues and Challenges in Intelligent Computing Techniques (ICICT) (pp. 202-205).
[13] Sherif, A., Elsabrouty, M., & Shoukry, A., 2013. A novel taxonomy of black-hole attack detection techniques in mobile
Ad-hoc network (MANET). In 2013 IEEE 16th International Conference on Computational Science and Engineering
(pp. 346-352). IEEE.
[14] Woungang, I., Dhurandher, S., Obaidat GE, M. and Peddi, R., 2013. A DSR-based routing protocol for mitigating
blackhole attacks on mobile ad hoc networks. Security and Communication Networks, 9(5), pp.420-428.
[15] Yasin, A. and Abu Zant, M., 2018. Detecting and Isolating Black-Hole Attacks in MANET Using Timer Based Baited Technique. Wireless Communications and Mobile Computing, 2018, pp. 1-10.
[16] Zant, A., 2017. Detection and Prevention of Wormhole Attack in MANET: A Review. International Journal of Science and Research.

Grenze International Journal of Engineering and Technology, June Issue

Design of Wideband Band Stop Filter using Signal Interference Technique
Mr. Madhukumar Patnala1, Ms. Bachu Munideepika2, Ms.Vetti Pavithra3, Mr.Totthuku Sunil4 and
Mr. Nallapothula Sreenivasulu5
1 Assistant Professor (SL), Department of Electronics and Communication Engineering, Sree Vidyanikethan Engineering College (Autonomous), A. Rangampet, Tirupati
Email: [email protected]
2-5 UG Students, Department of Electronics and Communication Engineering, Sree Vidyanikethan Engineering College (Autonomous), A. Rangampet, Tirupati
Email: {munideepikab, pavithrav228, suneety8008, seeseenivas093}@gmail.com

Abstract—A wideband band stop filter based on the signal interference technique has been designed. The proposed filter uses a coupled line in transmission path 1 and a stepped impedance modified pi-type transmission line with open circuited stubs in transmission path 2. The filter is designed at a frequency of 0.9 GHz. The simulated 3 dB bandwidth is 0.8 GHz, and the insertion loss is well below -18 dB to -20 dB from 0.2 GHz to 2.14 GHz. Using path 2, the ABCD parameters of the filter are found, and the resulting equations are solved to obtain the positions of the zeros.

Index Terms— Signal Interference Technique, Coupled Line, Open Circuited Stub, Fractional
Bandwidth, Insertion loss and ABCD parameters.

I. INTRODUCTION
The performance of a wireless system degrades if there is interference from existing communication systems. In modern high data rate wireless communication applications, there is great demand for compact wideband band stop filters for effective suppression of spurious signals, and the advent of Microwave Integrated Circuits has increased that demand. Wideband band stop filters with large fractional bandwidth and low insertion loss are essential in GSM, Zigbee, and WLAN applications. Several methods [1-6] have been reported for designing wideband band stop filters. A rectangular microstrip open loop resonator with a stub loaded resonator as building blocks, offering a tunable, low-space-occupying wideband band stop filter (BSF) for wireless applications, was reported [2]. A wideband BPF with an open coupled line in path 1 and a transmission line in path 2, offering reduced size and sharp selectivity with six transmission zeros based on the signal interference technique, was reported [3]. Highly selective fifth-order wideband band-pass filters (BPFs) with multiple transmission zeros based on signal-interaction concepts have been proposed; transmission paths comprising a shorted stub and a pair of open coupled lines are utilized to realize signal transmission from Port 1 to Port 2 [4]. The impact of electromagnetic coupling in parallel-conductor inhomogeneous transmission lines was considered, and it was demonstrated that the characteristics of various coupled-line circuits embedded in an inhomogeneous dielectric (for example, the suspended substrate) differ greatly from those in a homogeneous medium [5]. A triple wideband band pass filter (TWB-BPF) with compact size, distinguishable band-to-band segregation and numerous transmission zeros (TZs) has also been presented; the proposed TWB-BPF depends on a multi-mode resonator (MMR), driven by the even- and odd-mode analysis method [6]. A branch line resonator composed of shunt stubs and open stubs provides multiple independent transmission zeros, with the open stub lengths designed to equal a quarter wavelength near the unwanted frequency; a remnant stub is deployed to establish coupling between adjacent resonators, and zeros can be planned at different frequencies to reduce objectionable coupling between the stubs [7]. Multiple topologies have thus been proposed in the literature for wideband band stop filters, yet there is still a requirement for compact, highly selective, low insertion loss wideband band stop filters for wireless applications.
In this article, we propose a topology with an open coupled line in transmission path 1 and a stepped impedance modified pi-type transmission line with open circuited stubs in transmission path 2. The signal interference technique is used to obtain a wideband band stop filter with sharp rejection characteristics.
The takeaways of the proposed work are as follows:
a) A band stop filter with two zeros and six poles is obtained at an operational frequency of 0.9 GHz.
b) The 3 dB bandwidth of the filter is 0.8 GHz.
c) The insertion loss is well below -18 dB to -20 dB from 0.2 GHz to 2.14 GHz.
The design theory and analysis of the proposed wideband band stop filter are presented first, followed by the design procedure and simulation results.

II. PROPOSED WIDEBAND BAND STOP FILTER AND ITS ANALYSIS


Fig. 1 below describes the proposed filter design. The filter comprises an open coupled line in transmission path 1 and a stepped impedance modified pi-type transmission line with open circuited stubs in transmission path 2. Here ZE and Zo are the even- and odd-mode impedances of the coupled line, Z1 and Z2 are the impedances of the first and second transmission lines respectively, and Zs is the impedance of the open circuited stubs. θc, θ1, θ2 and θs are the electrical lengths of the corresponding lines. The two zeros in the rejection band obtained in the simulated result using Ansoft software are found theoretically, derived, and proved to exist.

Fig. 1. Proposed Filter Topology

A. Design Equations
In the proposed filter design, the ABCD parameter matrices of the elements along path 2 are:

$$M_1 = \begin{bmatrix} \cos\theta_1 & jZ_1\sin\theta_1 \\ \dfrac{j\sin\theta_1}{Z_1} & \cos\theta_1 \end{bmatrix} \quad (1)$$

$$M_2 = \begin{bmatrix} 1 & 0 \\ \dfrac{j\tan\theta_s}{Z_s} & 1 \end{bmatrix} \quad (2)$$

$$M_3 = \begin{bmatrix} \cos\theta_2 & jZ_2\sin\theta_2 \\ \dfrac{j\sin\theta_2}{Z_2} & \cos\theta_2 \end{bmatrix} \quad (3)$$

Equations (4) and (5) repeat (1) and (2) for the mirrored transmission-line and stub sections of the modified pi network. Cascading these matrices yields the overall B parameter of path 2, expressed through three auxiliary variables x1, x2 and x3 (Eqs. (6)-(9)). Equating the resulting transmission coefficient S21 (Eq. (10)) to zero (Eqs. (11)-(12)) and substituting the variables R1-R5 (Eqs. (13)-(17)) reduces the zero condition to a cubic polynomial (Eqs. (18)-(19)).
MATLAB coding was done to obtain the roots of this polynomial, out of which only one is real; using this real root, the positions of the two zeros were found. The frequencies at which the zeros exist are obtained from Eq. (20), where i = 1, 2, f0 = 0.9 GHz, θ0 = 90° and θ2 = 180° − θ1:

$$f_i = \frac{\theta_i}{\theta_0}\, f_0 \quad (20)$$

Two zeros are therefore shown to reside at 0.67 GHz and 1.12 GHz respectively, in agreement with the simulation results.

Fig. 2. Snapshot of Ansoft Circuit Schematic for the proposed topology where ZE=125 Ω, Zo=120 Ω, Z1=90 Ω, Z2=107 Ω, Zs=40 Ω

III. FILTER DESIGN AND RESULTS
The circuit simulation of the proposed model was performed using Ansoft Designer SV and is reported in Fig. 3. After the circuit simulation, a 3D model was constructed in HFSS to perform full-wave simulations; the proposed filter's 3D model is exhibited in Fig. 4. Optimization was performed to refine the results. Data tables of both results were collected and plotted in the same graph using MATLAB so that they could be compared, as reported in Fig. 5. The comparison shows that the circuit simulation graph and the full-wave simulation graph are in good agreement.

Fig. 3. Simulated Results of S-Parameters

Fig. 4. 3D Model of Proposed Wideband band stop filter

Fig. 5. Comparison of Full Wave and Circuit Simulation Results

IV. CONCLUSIONS
A compact filter response is recorded with a wideband band stop filter configuration utilizing microstrip transmission lines with open circuit stubs based on the signal interference scheme. The proposed model utilizes an open coupled line in path 1 and a transmission line in path 2. Due to the superposition of the two transmission paths, a band stop filter with two zeros and six poles is obtained at an operational frequency of 0.9 GHz. The locations of the two transmission zeros within the rejection band, at 0.67 GHz and 1.12 GHz respectively, were analysed and proved using the design equations. Theoretical and simulated analyses were compared and found to be in good agreement.
The proposed filter is to be manufactured on a Teslin sheet using an adhesive copper sheet. The future work involves generating a DXF file from the 3D model in HFSS; fabrication is then to be done on a Teslin sheet by pasting the adhesive copper sheet onto the design. Two connectors are to be attached on both sides, making a connection between the ground plane and the design; these connectors are used for measuring and obtaining results. Measurements are to be made using a Vector Network Analyser (VNA), with the two ports of the device connected to the two connectors, and calibration of the VNA should be performed before measuring. Various bending effects can then be studied on the fabricated prototype.

REFERENCES
[1] J.-S. Hong and M. J. Lancaster, "Microstrip Filters for RF/Microwave Applications," New York: Wiley, Apr. 2004.
[2] Ahamad A. Ibrahim, Omar K. El Shafey, et al., "Compact and wideband microwave band stop filter for wireless applications," Analog Integrated Circuits and Signal Processing (Springer), Vol. 104, No. 3, Sep. 2020, pp. 243-250.
[3] Kanaparthi V. Phani Kumar and S. S. Karthikeyan, "Compact, high selectivity and wideband band pass filter with multiple transmission zeros," AEU - International Journal of Electronics and Communications, Jul. 2018.
[4] Wen Jie Feng, Wen Quan Che, Yu Mei Chang, Su Yang Shi, and Quan Xue, "High Selectivity Fifth-Order Wideband Bandpass Filters With Multiple Transmission Zeros Based on Transversal Signal-Interaction Concepts," IEEE Transactions on Microwave Theory and Techniques, Vol. 61, No. 1, Jan. 2013.
[5] George I. Zysman and A. Kent Johnson, "Coupled transmission line networks in an inhomogeneous dielectric medium," IEEE Transactions on Microwave Theory and Techniques, Vol. 17, No. 10, Oct. 1969.
[6] Yang Xiong, LiTian Wang, and Li Gong, "Compact tri-wideband band stop filter with transmission zeros," ETRI Journal, Oct. 2018.
[7] P. H. Deng and J.-T. Tsai, "Design of microstrip cross-coupled band pass filter with multiple independent designable transmission zeros using branch-line resonators," IEEE Microwave and Wireless Components Letters, Apr. 2013, pp. 249-251.


Block Chain-based E-Voting System using Smart Contract

Dr. Priya Shelke1, Suruchi Dedgaonkar2, Nilesh Gopale3, Rohit Desai4, Ninad Deogaonkart5 and Nachiket Joshi6
1-6
Dept. of Information Technology, Vishwakarma Institute of Information Technology, Pune, India.
Email: {priya.shelke, suruchi.dedgaonkar, nilesh.21910766, rohit.21910786, ninad.21910774,
nachiket.21910799}@viit.ac.in

Abstract—Block chain is presenting new chances to develop new categories of digital services.
Even though research on the topic is still in its early stages, it has mostly focused on the
technical and legal challenges rather than utilizing this ground-breaking concept and creating
better digital services. The study offers a novel block-chain based E-voting system that tackles
some of the drawbacks of current systems and assesses some of the well-known block chain
frameworks in order to build a block-chain based E-voting system. It has long been challenging
to develop a secure electronic voting system that maintains the fairness and privacy of paper
ballots while ensuring the transparency and flexibility afforded by electronic systems in current
electoral schemes. We concretely assess the technology's potential by explaining a case study,
including the election process and the implementation of a blockchain-based application that
improves security and reduces the cost of holding national elections. A block chain is a type of
distributed database that allows all data to be shared among all network users. By definition, a block chain system has several benefits that suit an electronic voting system: it is independent of any centralized server, and its distributed architecture guarantees high performance and availability. Since each participant holds the full data, the protocol lets them verify each block uploaded to the chain. In the electronic voting system we propose, double-envelope encryption is combined with block chain technology. Voting is the mechanism for turning the people's opinions into action in order to better manage the system, yet traditional elections have not pleased either the people or the government in recent years. They are not completely secure, because ballots are easily tampered with; they raise concerns about transparency and voter safety; and counting the votes takes far too long.

Index Terms— Voting, Block chain, Hashing of Fingerprint, Smart-Contract, Mining, Merkle tree.

I. INTRODUCTION
Democratic voting is an important and rigorous mechanism in all regions. Countries typically vote using traditional paper ballots, mechanical devices, and electronic voting systems [1]. However, voting increasingly requires new digital technology. Digital voting uses electronic voting machines, and there are two types: electronic voting, where voters use devices to vote at a voting center, and I-voting, which is done remotely through a software interface. The essential requirements for determining whether a democratic process is legitimate include accuracy, resilience against illegal conduct, efficiency, stability, and transparency of the
voting process. Digital voting methods can increase accuracy, confidentiality, and integrity while requiring fewer financial and human resources. Simply put, they make sure that the votes cast and the outcomes are accurate [2]. Digital voting has certain disadvantages as well: false voting and other unethical voting procedures can undermine its benefits of cost-cutting and faster results. In fact, unauthorized users could impair smart or IoT (Internet of Things) systems by materially altering votes or vote tallies in order to gain an advantage. Block chain technology is a decentralized ledger that preserves a consistent perception of reality. Block chain is a peer-to-peer networking platform and a mutual, tamper-proof ledger that has been utilized in cryptocurrencies like Bitcoin and Ethereum. In this setting, user anonymity is safeguarded by public or private key identification, and numerous models built on the block chain provide security and anonymity. Although block chain technology offers security, privacy, accountability, and durability, speed and scalability are the main implementation problems. Our team is developing a digital voting architecture that includes a smart contract to address the challenges of block chain voting adoption and to ensure authentication, transparency, anonymity, accuracy, and autonomy, as well as uniqueness, integrity, and mobility. Based on the voter's information, a hash will be created in our system and placed in the chain [8]. Voters can remain anonymous because the data is hashed before being kept in the block chain, which also allows for scalability. Any modification to the hashed information will be quickly noticed. Smart contracts on the chain guarantee privacy and security, and they select miners to speed up transactions; candidate miners are judged by many factors, including data transfer and power consumption. Once each block has completed its voting procedure, the final vote count from the previous block can be evaluated simply, which reduces the amount of time spent counting.

II. RELATED WORK


A. Blockchain
Blockchain is a decentralized data management system in which data are spread across a peer-to-peer (P2P)
network and subsequently stored in an encoded chain of blocks [3]. The electronic Bitcoin system put out by
Satoshi Nakamoto is where the concept of blockchain was born [4].
These are some of the main characteristics of blockchain:
1. Keep the consensus mechanism in place, i.e., demand proof of work (POW) down the entire chain.
2. Put information into the blocks as a ledger.
3. Network-wide synchronization of the entire ledger.
4. Provides data Decentralization [5].
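These characteristics can be made concrete with a minimal Python sketch (illustrative only, not a production ledger): each block embeds the hash of its predecessor, so any tampering with stored data breaks the chain and is detected during verification.

import hashlib
import json
import time

def block_hash(block):
    # SHA-256 over the block's canonical JSON serialization.
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def make_block(data, prev_hash):
    # A ledger entry linked to its predecessor through prev_hash.
    return {"timestamp": time.time(), "data": data, "prev_hash": prev_hash}

chain = [make_block("genesis", "0" * 64)]
for record in ("vote-record-1", "vote-record-2"):
    chain.append(make_block(record, block_hash(chain[-1])))

# Network-wide verification: each stored prev_hash must match the
# recomputed hash of the preceding block.
valid = all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
            for i in range(1, len(chain)))
print("chain valid:", valid)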

Figure 1. Flow of Transaction in Blockchain

Modern cryptocurrencies were developed from 2008 onwards using the blockchain concept in conjunction with a number of other technologies and computing ideas. In these digital currencies, cryptosystems, rather than a centralized system or source, are used to secure data. Bitcoin was the first such blockchain-based cryptocurrency [6]. A cryptographic address is associated with information that represents electronic cash on the Bitcoin blockchain. Users of Bitcoin can digitally agree to move ownership of that data to a different user, with a public record of the transfer on the Bitcoin blockchain enabling every connected user to independently confirm the authenticity of the exchanges. A dispersed group of individuals stores, maintains, and cooperatively manages the Bitcoin blockchain. Blockchain deployments are typically developed with a specific objective or purpose in mind; examples include distributed ledger systems between businesses, cryptocurrencies, and smart contracts (software loaded on a blockchain and executed by the machines running that chain). A continual stream of innovations has been made in the ledger technology sector, and as the market landscape changes, new platforms are frequently unveiled.
B. Voting methods
In a democracy, there are several ways to cast a ballot. Many nations have switched from straightforward
elections with ballots on paper to computerized voting machines over time. The following are the most popular
techniques [7]:
1. Ballot Paper
2. Vote over Internet
3. Electronic voting Machine
4. Biometric voting Machine
 Ballot Paper: Writing down your preference on a piece of paper and submitting it is one of the most basic election mechanisms; this is referred to as ballot-paper voting. The ballot lists the names of the candidates and the parties running in the election, with a column left vacant for voters to express their preference. The guidelines are quite basic: mark your chosen candidate with a cross, fold the paper, and place it in the ballot box. The ballot box must be taken care of and monitored by the election authority in charge of overseeing the election [7].
 Vote over Internet: With such a method, we can vote from the convenience of our own homes. Unlike other internet-based services, however, such platforms have received harsh criticism for the way they operate. Administrations have repeatedly asked teams of cybersecurity specialists from around the world to evaluate their voting platforms, and the testing teams have frequently concluded that not only can they alter the vote total, but they can also erase all evidence that they were ever in the system, leaving no digital fingerprint that could identify them [8]. This has sparked considerable division among the populace. Justice for everyone is a key democratic principle, and the right to vote is one of the major democratic celebrations. Therefore, it is the responsibility of the government to make sure that the public has complete faith in the system and that any problems are resolved.
 Electronic voting Machine: The electoral authority in charge of overseeing the election initializes the EVM by pressing the Ballot button on the control unit. When a voter presses the button corresponding to the candidate he wishes to support, the voting machine turns on an LED to indicate that the vote was recorded correctly, and the machine then locks itself. It can only be unlocked for the next voter when the person in charge presses the Ballot button again, which prevents a single person from casting numerous votes.
 Biometric voting Machine: Voting systems based on fingerprints have also been devised, in which a biometric machine helps identify the voter. Numerous investigations have established the uniqueness of each person's fingerprint, so it can be used to identify the voter when casting a vote, and the entire process concludes quickly and painlessly. The government must establish a nationwide scheme for voter registration, whereby everyone is fingerprinted and a final list of all eligible voters is compiled. In conclusion, because voters can cast ballots from any location in the world, we may run into a similar authentication issue while designing a blockchain-based system; thus, to ensure that the voters who cast their votes are authentic, we may utilize biometric sensors or even face recognition software [7].
C. EVM
Voting is done using electronic machines in a centralized system, in which voting data can be easily changed and there is no way for voters to check the accuracy of their vote. The authors of [9] describe a voting system based on block chain in which each EVM is directly connected to other EVMs in a network. This method has three parts: peer verification of transactions, chain-manipulation detection, and fingerprint authentication. It remains subject to DoS (Denial-of-Service) attacks and eavesdropping.
D. Blockchain Based E-voting
Due to the rise in popularity of cryptocurrencies, blockchain, the technology that underpins them, is receiving increased attention from researchers, and numerous e-voting methods have been implemented in conjunction with blockchain.
Electronic voting is the term used to describe voting that relies on electronic hardware or software. Such systems may support a wide range of tasks, from starting the voting process to storing the votes, and they run on many kinds of devices, including computers, mobile devices, and kiosks at voting offices [10].
Agora, an end-to-end verifiable blockchain-based voting system, was created to provide polling services for governments and organizations. Agora used tokens to identify eligible voters in elections, and each eligible voter received a token from an institution. However, such methods lean on trusted outside parties to oversee the voting process, and a third party could collaborate with candidates to manipulate the election. Additionally, the public can access and see the data kept in the blockchain. This compromises the security and dependability of using block chain for voting.
Blockchain-based voting protocols incorporate smart contracts and encryption algorithms to address this issue; the trusted third party is typically replaced by smart contracts. The Open Vote Network, for instance, was enabled by smart contracts. The following procedures are part of e-voting systems: the first stage is voter registration (registration); on election day, officials check voters' IDs (verification and authentication); people who are qualified to vote may then do so (casting and collation), and each vote must be verified and encrypted. The votes' correctness, confidentiality, and anonymity must all be ensured, and votes cannot be modified in any manner. The final step is to add up all the votes in accordance with the design (counting and display of results). The majority of e-voting applications use centralized government control. Such systems have a number of shortcomings and perceived hazards: for instance, there are no standards for electronic voting systems, and they suffer from security and reliability risks, fraud and hacking vulnerabilities, expensive machine costs, and insecure transaction storage.

Figure 2. Block-chain based E-Voting System

E. E-voting Systems process


• Setup: Enter the security parameters or values, then generate the private (or public) key pair used to encrypt (or decrypt) subsequent operations.
• Register: Given identifiers as IDs, generate the secret (or public) passcode as an output [11].
• Vote: After establishing a vote element or component, the electors compute the cipher text and the matching approval.
o Credible: This authenticates the integrity of a vote on the election server, taking the vote as input.
o Validate Vote: Following the voting phase, voters may submit requests for the contracts they voted on at polling time; by entering the public parameters, voter status, and privacy information, they can confirm the reported outcomes, which are returned as either legitimate or untrue.
o Counting up: After all votes have been cast and verified, the results are tallied, with the necessary secret key as input and the polled box element as output. The system returns False if the result is incorrect.
o Verify: When the public parameters of the publicity phase are entered, a vote is authenticated as having been cast in accordance with the ballot's outcome in a valid and accurate manner [12].

III. PROPOSED VOTING MECHANISM
A. Data Management of the system
A huge amount of data is generated during the election process, so data should be collected in a systematic manner. As shown, our system employs two types of storage: database storage and cloud storage.
B. Voter Registration
• Each person must visit their local voter registration office and submit the required documentation in
order to cast a valid ballot.
• A set of public and private keys will be created using a key creation method.
• In the block chain network, voters are identified by their public key. The secret key is sent to the voter's mobile phone number; they can use this private key to cast a ballot and take part in the voting system.
• The hash is generated from the voter's submitted fingerprint using a fingerprint hash-generation algorithm. The whole procedure of creating hashes from data provided by voters is shown [2].
• To create a new hash value, the created hash will be coupled with the voter's other information.
• The final hash value will serve as the voter's entry in the voter list held in the block chain's genesis block, where hash value = membership proof.
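A minimal sketch of this registration step is given below (not the authors' implementation; the field names and the key derivation are simplifying assumptions, with a random token standing in for a real asymmetric key pair):

import hashlib
import secrets

def membership_proof(fingerprint_bytes, voter_info):
    # Hash the fingerprint first, then couple it with the voter's other
    # information to form the final hash stored in the genesis block.
    fp_hash = hashlib.sha256(fingerprint_bytes).hexdigest()
    return hashlib.sha256((fp_hash + voter_info).encode()).hexdigest()

private_key = secrets.token_hex(32)   # sent to the voter's mobile number
public_key = hashlib.sha256(private_key.encode()).hexdigest()  # placeholder derivation, not real asymmetric crypto

proof = membership_proof(b"<fingerprint template>", "name|NID|2000-01-01")
print(public_key[:16], proof[:16])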
C. Voting with a smart contract
The following are the functions of smart contracts running on the block chain:
• Voter Verification:
- Using an internet-connected device, the voter logs into the voting system with their private key.
- Submit your fingerprint, NID, and other data.
- In the genesis block, smart contracts created on the block chain compare the information provided by
the valid voter to the information submitted.
- If the data matches, a list of candidates is shown to the voter. [13]
• Make a Casted vote block:
- The voter selects one of the candidates from the list and votes.
- Use a digital signature to sign the vote, then send the transaction to the smart contract (SC).
- For each vote cast, the smart contract creates a VID to identify the vote.
- Increase the number of votes cast for the chosen candidate, and make a block containing the transactions made by the voter, together with their VID and candidate vote number [2].
• Selection of Miner:
- A miner selection algorithm is run by the SC. In Bitcoin, all miners compete to produce the hash of the block first in order to prevent record interference, which requires a significant amount of processing power. The proposed voting architecture instead chooses a miner based on heuristics extrapolated from that miner's accomplishments: the SC weighs factors such as node capacity, energy use, and delay [5].
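The smart-contract behaviour described above can be sketched as follows (a Python stand-in for on-chain logic, with hypothetical data; the miner heuristic simply combines the factors named in the text):

import hashlib
import secrets

class VotingContract:
    def __init__(self, genesis_voter_list, candidates):
        self.voters = set(genesis_voter_list)   # membership proofs from registration
        self.cast = set()                       # proofs that have already voted
        self.tally = {c: 0 for c in candidates}

    def cast_vote(self, proof, candidate):
        # Voter verification against the genesis-block voter list.
        if proof not in self.voters or proof in self.cast:
            raise ValueError("ineligible voter or duplicate vote")
        self.cast.add(proof)
        self.tally[candidate] += 1
        # The VID identifies the vote without revealing the voter.
        return hashlib.sha256((proof + secrets.token_hex(8)).encode()).hexdigest()

    @staticmethod
    def select_miner(nodes):
        # Heuristic: favour high node capacity, low energy use and low delay.
        return max(nodes, key=lambda n: n["capacity"] - n["energy"] - n["delay"])

sc = VotingContract({"proof-a", "proof-b"}, ["Alice", "Bob"])
vid = sc.cast_vote("proof-a", "Alice")
miner = VotingContract.select_miner([{"id": 1, "capacity": 9, "energy": 2, "delay": 1},
                                     {"id": 2, "capacity": 5, "energy": 1, "delay": 1}])
print(vid[:16], sc.tally, miner["id"])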

IV. CONCLUSION
Ensuring vote security is a challenge for many nations. To guarantee voter participation and validity, the security of vote data, and the accurate counting of votes, a smart contract-based block chain voting system can be developed. To cut down on computational expenses, this technique has the SC provide voter authentication and participate in the choice of a miner in the block chain. Additionally, it promptly counts the votes, speeding up the election process. Citizens can vote using smart devices from anywhere thanks to this mechanism, which will help raise the number of voters needed to establish democracy in any nation. The objective of this project is to develop an encryption method that will further increase the security of our system in the future.

ACKNOWLEDGMENT
The authors wish to thank the anonymous reviewers for their useful suggestions that helped in improving the
quality of this paper. We would also like to thank Vishwakarma Institute of Information Technology, Pune for
supporting this work.

REFERENCES
[1] F. Hjálmarsson, G. K. Hreiðarsson, M. Hamdaqa and G. Hjálmtýsson, “Blockchain-based e-voting system,” in 2018
IEEE 11th international conference on cloud computing (CLOUD), 2018.
[2] B. Yu, J. K. Liu, A. Sakzad, S. Nepal, R. Steinfeld, P. Rimba and M. H. Au, “Platform-independent secure blockchain-based voting system,” in Information Security: 21st International Conference, Guildford, UK, 2018.
[3] K. M. Khan, J. Arshad and M. M. Khan, “Investigating performance constraints for blockchain based secure e-voting
system,” Future Generation Computer Systems, pp. 13--26, 2020.
[4] A. Alam, S. Z. U. Rashid, M. A. Salam and A. Islam, “Towards blockchain-based e-voting system,” 2018 international
conference on innovations in science, engineering and technology (ICISET), pp. 351--354, 2018.
[5] Fatrah, S. El Kafhali, A. Haqiq and K. Salah, “Proof of concept blockchain-based voting system,” in Proceedings of the
4th International Conference on Big Data and Internet of Things, 2019.
[6] P. Baudier, G. Kondrateva, C. Ammi and E. Seulliet, “Peace engineering: The contribution of blockchain systems to the
e-voting process,” Technological Forecasting and Social Change, vol. 162, p. 120397, 2021.
[7] Y. Soni, L. Maglaras and M. A. Ferrag, “Blockchain based voting systems,” in European Conference on Cyber Warfare
and Security, 2020.
[8] M. a. P.-M. A. Pawlak and N. Kryvinska, “Towards the intelligent agents for blockchain e-voting system,” Procedia
Computer Science, vol. 141, pp. 239--246, 2018.
[9] Li, J. Xiao, X. Dai and H. Jin, “AMVchain: authority management mechanism on blockchain-based voting systems,”
Peer-to-peer Networking and Applications, vol. 14, pp. 2801--2812, 2021.
[10] R. Taş and Ö. Ö. Tanrıöver, “A systematic review of challenges and opportunities of blockchain for E-voting,”
Symmetry, vol. 12, no. 8, p. 1328, 2020.
[11] J.-H. Hsiao, R. Tso, C.-M. Chen and M.-E. Wu, “Decentralized E-Voting Systems Based on the Blockchain
Technology,” in Advances in Computer Science and Ubiquitous Computing: CSA-CUTE 17, 2018.
[12] Y. Abuidris, R. Kumar, T. Yang and J. Onginjo, “Secure large-scale E-voting system based on blockchain contract using
a hybrid consensus model combined with sharding,” Etri Journal, vol. 43, no. 2, pp. 357--370, 2021.
[13] U. Jafar, M. J. A. Aziz and Z. Shukur, “Blockchain for electronic voting system—review and open research challenges,”
Sensors, vol. 21, no. 17, p. 5874, 2021.


Phishing E-Mail Detection and Blocking it based on the Header Elements

Sulaiman Awadh Ali Obaid Maeli1 and Ajay U Surwade2
1-2
School of Computer Sciences, Kavayitri Bahinabai Chaudhari North Maharashtra University, Jalgaon, INDIA
Email: [email protected], [email protected]

Abstract—E-mail is one of the most important modern official means of communication, with high reliability, which is the reason for its widespread popularity. However, this does not make it safe from threats and attacks. The major threats to email are spamming and phishing, which cause substantial financial losses to their victims. In this paper, we describe a filter based on analyzing the email header elements and their characteristics, extracting the most important features, and testing them against a set of rules and conditions that can detect and block phishing email messages. This filter was tested on five standard datasets containing spam and phishing emails, using header information only, and achieved an overall average accuracy of about 96.31 percent.

Index Terms— Email header, features extraction, Phishing emails, Black-lists, White-lists.

I. INTRODUCTION
Email is the most important and best way to communicate between companies, institutions, and offices, despite the wide spread of modern means of communication such as social media, owing to its ease of use, strength of protection, and reliability. But it is not without problems and defects that threaten that protection, and the most prominent of these threats is the phishing attack: the fraudulent act of pretending to be a reliable entity in a communication in order to obtain confidential user information (such as usernames, passwords, bank account information, or credit card information) [1]. Due to the increase in phishing attacks and the significant financial losses they cause to individuals and companies, much research has appeared on studying and blocking email phishing attacks.
In this paper, we analyze the elements of the email header and their various properties to create rules and conditions that can classify an email message as phishing or non-phishing. An email, like normal postal mail, contains two parts: a header part and a body part. Fig. 1 shows the header elements in an email message.

Figure 1. Message Headers in the E-mail

The header represents the envelope of the e-mail, containing information such as the sender's and receiver's email addresses, the email subject, the message's journey across the various servers, cc, bcc, etc. [2]. It consists of fields such as the following:
i. From: contains Email sender information like name and email address of the sender.
ii. To: contains the email address(es) of the receiver(s) of the email, which may be delivered to a single
recipient or several recipients. It's a mandatory field. The message has to have at least one recipient's
address.
iii. Subject: contains information about message content.
iv. Received: contains information about the message journey, with details of the mail transmission servers through which it has travelled. It can be used to track the message's path.
v. Reply-To: includes an email address that is immediately inserted in the "To" field when the user replies
to the email message.
vi. Return-Path: includes the sender's details, like their email and a link to reply to them. It is added by the
server that delivers the message to the recipients.
vii. Message ID: a special identifier given to each message by the host when the message is created. It is divided into two parts, local and domain, which are separated by an at-sign and enclosed in angle brackets: "<" local-part "@" domain-part ">" [3].
The email has another part, called the body, which contains the content to be read by the recipient; it could be text, an image, an attachment, or a combination [4]. Phishers and spammers often forge this header information, so changes in these fields are important features.
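These fields can be extracted with Python's standard email package; a minimal sketch (the input file name is assumed):

from email import policy
from email.parser import BytesParser

# Parse a raw RFC 822 message and pull out the header fields used as features.
with open("sample.eml", "rb") as fh:
    msg = BytesParser(policy=policy.default).parse(fh)

features = {
    "From": msg["From"],
    "Reply-To": msg["Reply-To"],        # None when the field is absent
    "Return-Path": msg["Return-Path"],
    "Message-ID": msg["Message-ID"],
}
print(features)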

II. RELATED WORK


Tianda Yang, Kai Qian, Dan Chia-Tien Lo, K. Al Nasr and Ying Qian (2015) combined different filtering techniques, showing that in hard cases email header meta-features can be used to correctly classify spam; in their experiments, a Naive Bayes filter incorporates these meta-features [5].
Ankit Kumar Jain & B. B. Gupta (2016) proposed an approach for phishing web page detection that checks the legitimacy of a webpage not present in a whitelist using hyperlink features. This approach also detects various types of phishing attacks, like DNS poisoning, zero-hour attacks, etc. [6].
D. Kaur and S. Kalra (2016) developed hybrid methods to detect phishing attacks by using whitelists with
another technique to check and classify URLs as phishing or non-phishing [7].
Ghogare, Pramod, Surwade, Ajay and Patil, Manoj (2018) devised an approach for spam classification using
feature selection and extracting sender email from the message header and used it for classification [8].
Omar Abahussain & Yousef Harrath (2019) have checked incoming mail and examined the email name and
email address in the From field, as well as searching for URLs in the email content, extracting the domain name,
and comparing it with known domain addresses on the blacklist [9].
T. Krause, R. Uetz and T. Kretschmann (2019) presented a new approach based on meta data for spam
classification using a static set of engineered features with automatically extracted features from header data
only, without analyzing an email’s body [10].
Thashina Sultana, K. A. Sapnaz, Fathima Sana, and Mrs. Jamedar Najath (2020) proposed a model for detecting
spam email and adding the IP address of the spam sender to the blacklist [11].
Anchit Bijalwan (2020) suggested the blacklists method for network traffic to filter infected packets. This
technique is used to detect malware and botnets. He has described packet filtering procedure; all suspicious IP
addresses are classified as blacklisted [12].
Kulkarni, Priti, Saini, Jatinderkumar, and Acharya, Haridas (2020) investigated the header attributes of emails using five different feature selection techniques and five different machine learning classifiers [13].

Ajay U. Surwade (2020) developed Origin based Filter which blocks phishing e-mail by extracting header part
information of e-mails using Blacklist approach [14].
Youness Mourtaji, Mohammed Bouhorma, Daniyal Alghazzawi, Ghadah Aldabbagh, and Abdullah Alghamdi
(2021) developed a solution based on a hybrid rule-based approach that extracts features from six different
methods, including the blacklisting method, which checks the domain name against two antiviruses blacklists
that consider this domain blacklisted [15].

III. METHODOLOGY
This work is based on analyzing the important header elements to extract significant features and design an origin-based filter with a set of rules and conditions, described in Section IV, that classifies an email as phishing or non-phishing. The methodology adopted for this origin-based filter (OBF) is shown in Figure 2.

Figure 2. Structure of the OBF Filter

The features such as ‘From’, ‘Reply-To’, ‘Return-Path’ and ‘Message-ID’ as shown in Table-1, are extracted
from the standard datasets such as Enron [16], Public Phishing Corpus [17], SPAM Archive [18],
CSDMC2010_SPAM [19] and Spam Assassin [20].
According to the RFC822 protocol, the message header has been extracted and four standard fields selected for
features extraction as shown in table I:

TABLE I. STANDARD HEADER FIELDS WITH EXTRACTED FEATURES

Field        | Extracted Features                                                                      | Probability Value
From         | invalid address (wrong or fake mail address)                                            | 0 or 1
Reply-To     | mismatching domain names between “Reply-To” and “From” addresses                        | 0 or 1
Return-Path  | mismatching domain names between “Return-Path” and “From” addresses                     | 0 or 1
Message-ID   | invalid address, or mismatching domain names between “Message-ID” and “From” addresses  | 0 or 1

The Python code is developed using the rules mentioned in Section IV and is tested with the standard datasets mentioned above. This architecture classifies emails as phishing or non-phishing. The accuracy of classification has been calculated and is reported in Tables II to VI. The IP addresses or domain names of emails classified as phishing are extracted and stored as the blacklists; similarly, the IP addresses or domain names of emails classified as non-phishing are extracted and stored in the whitelists.

IV. IMPLEMENTATION DETAIL


The following rules and conditions for classification have been set after carrying out the detail analysis of the
standard datasets which are described with necessary description:
a) From field is invalid mail (wrong or fake mail address).

A regular expression is used to validate the From field with various email address formats.
b) From field and Reply-To field have different domain names.
FROM != REPLY-TO
c) From field and Return-Path field have different domain names.
FROM != RETURN-PATH
d) Reply-To or Return-Path or both are empty.
len(REPLY-TO) == 0 OR len(RETURN-PATH) == 0 OR (len(REPLY-TO) == 0 AND len(RETURN-PATH) == 0)
e) From field and Message-ID have different domain names.
from_domain != message_id_domain
Each of the conditions above takes one probability value: "0" means false and "1" means true. The value "1" (i.e., true) classifies the email as phishing, while the value "0" (i.e., false) classifies it as non-phishing. The decision-making condition is given below:
If ((FROM != RETURN-PATH) or (FROM != REPLY-TO) or (len(REPLY-TO) == 0) or (len(RETURN-PATH) == 0) or ((len(REPLY-TO) == 0) and (len(RETURN-PATH) == 0)) or (from_domain != message_id_domain))
Then classify the email as a Phishing Email, extract its IP addresses or domain names, and store them in the Blacklists.
Else, classify the email as a Non-Phishing Email, extract its IP addresses or domain names, and store them in the Whitelists.
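A compact Python sketch of these rules is given below. It is illustrative only: the regular expression and the domain helper are simplified assumptions rather than the exact implementation used in the experiments.

import re

ADDR_RE = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")   # simple From-field validator

def domain(addr):
    # Domain part of an address, tolerating angle brackets.
    return addr.rsplit("@", 1)[-1].strip("<> ").lower() if addr else ""

def is_phishing(h):
    frm = h.get("From", "") or ""
    reply_to = h.get("Reply-To", "") or ""
    return_path = h.get("Return-Path", "") or ""
    message_id = h.get("Message-ID", "") or ""
    rule_a = not ADDR_RE.match(frm)                           # invalid From
    rule_b = bool(reply_to) and domain(frm) != domain(reply_to)
    rule_c = bool(return_path) and domain(frm) != domain(return_path)
    rule_d = len(reply_to) == 0 or len(return_path) == 0      # empty fields
    rule_e = domain(frm) != domain(message_id)
    return rule_a or rule_b or rule_c or rule_d or rule_e

print(is_phishing({"From": "alice@example.com",
                   "Reply-To": "alice@example.com",
                   "Return-Path": "<alice@example.com>",
                   "Message-ID": "<12345@example.com>"}))     # -> False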
The results collected during experiments are reported in next sections.

V. RESULT AND DISCUSSION


Tables II to VI present the results collected after classification, reported with the help of confusion-matrix counts.

TABLE II. EMAIL CLASSIFICATION IN ENRON DATASET

Folder Name | Phishing | Non-Phishing | Total | Accuracy in %
BG          | 9402     | 598          | 10000 | 94.02
GP          | 13719    | 0            | 13719 | 100
SH          | 9256     | 13           | 9269  | 99.85

TABLE III. EMAIL CLASSIFICATION IN PUBLIC PHISHING CORPUS DATASET

Folder Name | Phishing | Non-Phishing | Total | Accuracy in %
Phishing 0  | 398      | 16           | 414   | 96.13
20051114    | 411      | 27           | 438   | 93.83
Phishing 2  | 1398     | 25           | 1423  | 98.24
Phishing 3  | 2225     | 54           | 2279  | 97.63

TABLE IV. EMAIL CLASSIFICATION IN SPAM ARCHIVE DATASET

Folder Name | Phishing | Non-Phishing | Total | Accuracy in %
01/2020     | 2528     | 115          | 2643  | 95.64
02/2020     | 6780     | 715          | 7495  | 90.46

TABLE V. EMAIL CLASSIFICATION IN CSDMC2010_SPAM DATASET

Folder Name | Phishing | Non-Phishing | Total | Accuracy in %
Spam        | 1315     | 63           | 1378  | 95.42

TABLE VI. EMAIL CLASSIFICATION IN SPAM ASSASSIN DATASET

Folder Name      | Phishing | Non-Phishing | Total | Accuracy in %
20030228_spam    | 486      | 14           | 500   | 97.20
20030228_spam_2  | 1360     | 37           | 1397  | 97.30
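The overall figure of 96.31% quoted in the abstract is the mean of the twelve per-folder accuracies reported above, which can be checked directly:

# Per-folder accuracies from Tables II-VI.
accuracies = [94.02, 100, 99.85,            # Enron
              96.13, 93.83, 98.24, 97.63,   # Public Phishing Corpus
              95.64, 90.46,                 # SPAM Archive
              95.42,                        # CSDMC2010_SPAM
              97.20, 97.30]                 # Spam Assassin
print(round(sum(accuracies) / len(accuracies), 2))   # -> 96.31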

The results for the ‘Spam Archive’ dataset, shown in Table IV, include the minimum accuracy, 90.46%, for the folder ‘02/2020’. This folder contains a total of 7495 spam emails, all of which should have been classified as phishing, but only 6780 were classified as phishing and 715 were misclassified as non-phishing.
In the results for the ‘Public Phishing Corpus’ dataset, shown in Table III, the folder ‘20051114’ contains a total of 438 phishing emails, all of which should have been classified as phishing, but only 411 were classified as phishing and 27 were misclassified as non-phishing.
In the results for the Enron dataset, shown in Table II, the folder ‘BG’ contains a total of 10000 spam emails which should be classified as phishing, but only 9402 were classified as phishing while 598 were classified as non-phishing.
In the results for the ‘CSDMC2010_SPAM’ dataset, shown in Table V, the folder ‘Spam’ contains a total of 1378 spam emails which should be classified as phishing, but only 1315 were classified as phishing while 63 were classified as non-phishing.
These misclassifications suggest that the features extracted from these emails are not sufficient and that more features are needed for accurate classification. We therefore need to investigate other features alongside the existing ones so that accuracy can be improved; in future, we plan to investigate other header features, such as the ‘Subject’ field of the email.

VI. CONCLUSIONS AND FUTURE WORK


A phishing e-mail is a real threat to individuals and entities that must be detected and blocked before it reaches
its target. In this paper, a filter has been designed based on the characteristics of some elements of the message
header, extracting the features, and then classifying the message according to the conditions and rules that detect
and block phishing mail. The experiment has been carried out on some standard datasets such as Enron, Public
Phishing Corpus, SPAM Archive, CSDMC2010_SPAM, and Spam Assassin and has achieved 96.31% average
accuracy of classification.
In the future, we plan to increase accuracy by selecting more elements from the header part and investigating features that can further improve the classification results, concentrating in particular on the ‘Subject’ field of the email header, which may help us increase the accuracy of phishing classification.

REFERENCES
[1] Tak, Gaurav & Ojha, Gaurav. (2013). Multi-Level Parsing Based Approach Against Phishing Attacks with the Help of
Knowledge Bases. International Journal of Network Security & Its Applications. 5. 10.5121/ijnsa.2013.5602.
[2] P. Mishra, E. S. Pilli and R. C. Joshi, "Forensic Analysis of E-mail Date and Time Spoofing," 2012 Third International
Conference on Computer and Communication Technology, 2012, pp. 309-314, doi: 10.1109/ICCCT.2012.69.
[3] Hamid, I. R. A., Abawajy, J., & Kim, T. H. (2013). Using feature selection and classification scheme for automating
phishing email detection. Studies in informatics and control, 22(1), 61-70.
[4] Beaman, C., & Isah, H. (2022). Anomaly Detection in Emails using Machine Learning and Header Information. arXiv.
https://doi.org/10.48550/arXiv.2203.10408.
[5] Tianda Yang, Kai Qian, Dan Chia-Tien Lo, K. Al Nasr and Ying Qian, "Spam filtering using Association Rules and
Naïve Bayes Classifier," 2015 IEEE International Conference on Progress in Informatics and Computing (PIC), 2015,
pp. 638-642, doi: 10.1109/PIC.2015.7489926.
[6] Jain, A.K., Gupta, B.B. A novel approach to protect against phishing attacks at client side using auto-updated whitelist.
EURASIP J. on Info. Security 2016, 9 (2016). https://doi.org/10.1186/s13635-016-0034-3 .
[7] Davneet Kaur and Sheetal Kalra, “Five-tier barrier anti-phishing scheme using hybrid approach”, Information Security
Journal-A Global Perspective, 2016, DOI: 10.1080/19393555.2016.1215573.
[8] Ghogare, Pramod & Surwade, Ajay & Patil, Manoj. (2018). Effective E-mail Spam Filtering Using Origin Based
Information. International Journal of Computer Sciences and Engineering. 6. 359-362. 10.26438/ijcse/v6i11.359362.
[9] O. Abahussain and Y. Harrath, "Detection of Malicious Emails through Regular Expressions and Databases," 2019
International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT), 2019,
pp. 1-5, doi: 10.1109/3ICT.2019.8910291.
[10] T. Krause, R. Uetz and T. Kretschmann, "Recognizing Email Spam from Meta Data Only," 2019 IEEE Conference on
Communications and Network Security (CNS), 2019, pp. 178-186, doi: 10.1109/CNS.2019.8802827.
[11] Thashina Sultana, K A Sapnaz, Fathima Sana, Jamedar Najath, 2020, Email based Spam Detection, INTERNATIONAL
JOURNAL OF ENGINEERING RESEARCH & TECHNOLOGY (IJERT) Volume 09, Issue 06 (June 2020).
[12] Anchit Bijalwan, “Botnet Forensic Analysis Using Machine Learning”, Hindawi’s Journal of Security and
Communication Networks, Volume2020, pp:1-9, February-2020. Article ID 9302318,
https://doi.org/10.1155/2020/9302318.
[13] Kulkarni, Priti & Saini, Jatinderkumar & Acharya, Haridas. (2020). Effect of Header-based Features on Accuracy of
Classifiers for Spam Email Classification. International Journal of Advanced Computer Science and Applications. 11.
10.14569/IJACSA.2020.0110350.

[14] A. U. Surwade, "Blocking Phishing e-mail by extracting header information of e-mails",(2020), International
Conference on Smart Innovations in Design, Environment, Management, Planning and Computing (ICSIDEMPC),
2020, pp. 151-155, doi: 10.1109/ICSIDEMPC49020.2020.9299596.
[15] Mourtaji, Youness & Bouhorma, Mohammed & Alghazzawi, Daniyal & Aldabbagh, Ghadah & Alghamdi, Abdullah.
(2021). Hybrid Rule-Based Solution for Phishing URL Detection Using Convolutional Neural Network. Wireless
Communications and Mobile Computing. 2021. 1-24. 10.1155/2021/8241104.
[16] http://nlp.cs.aueb.gr/software_and_datasets/Enron-Spam/index.html. Last accessed on 10 September, 2022.
[17] https://academictorrents.com/details/a77cda9a9d89a60dbdfbe581adf6e2df9197995a Last accessed on 14 October 2022.
[18] http://untroubled.org/spam/ Last accessed on 28 September 2022.
[19] https://github.com/jdwilson4/Intro-to-MachineLearning/tree/master/Data/SPAMData Last accessed on 12 November
2021.
[20] https://spamassassin.apache.org/old/publiccorpus/ Last accessed on 2 October 2022.


IoT based Weather Detecting System


Priyanshi Patil1, Nikhil Patil2, Pratham Patil3, Mohit Patil4 and Pratap Patil5
1-5
Vishwakarma Institute of Technology, Pune, India
Email: [email protected], [email protected] , [email protected], mohit,[email protected],
[email protected]

Abstract—The proposed system is a weather-detecting project with IoT technology implemented in it. The system is linked to a webpage to which data is provided by the various sensors used in the system for different purposes. The website analyzes the data, and the weather conditions are displayed on the screen along with a graphical representation of the data; the data is also stored in a database for future reference. The device senses several parameters, namely temperature, humidity, atmospheric pressure, sound, light intensity, and the carbon dioxide and carbon monoxide levels in the surrounding air.

Index Terms— Weather, Humidity, Temperature, Pressure, Intensity, Detecting.

I. INTRODUCTION
This is a simple weather detecting system powered by an Arduino UNO; it detects environmental parameters such as temperature, humidity, barometric pressure, air and sound quality, and light intensity. The device has IoT (Internet of Things) technology applied in it. Weather detection has always been a standout topic for meteorologists.
Weather conditions need to be monitored frequently, on a daily basis, as they play a significant role in meteorology. Weather conditions are also important for many other activities, like farming, scheduling the export of goods, flight schedules, and various other outdoor activities. Forecasts depend on environmental parameters like temperature, humidity, wind, etc., so to serve this purpose one needs to measure these factors.
The proposed device has different sensors for different environmental parameters, namely a DHT11 for temperature and humidity, a BMP180 for altitude and atmospheric pressure, an MQ135 and an MQ07 for carbon dioxide and carbon monoxide levels respectively, an FC04 sound sensor, and an LDR for light intensity. The project is linked to a website which fetches all the data from the sensors, analyzes it accordingly, and presents the output as graphs of the data. The data is also always stored in the website's database for future reference.
Methodology/Experimental
Block Diagram
Below is the block diagram of the project; the functional diagram represents the main concept of the project. The project is powered by an Arduino Uno, to which various sensors are attached: an FC04 acoustic sensor, an MQ135 for carbon dioxide and air quality, a DHT11 for temperature and humidity, a BMP180 for altitude and barometric pressure, and an MQ07 for carbon monoxide levels in the air. In addition, an LDR is used to measure light intensity.

The I2C module allows multiple devices to be connected to each other using just two wires. The readings are shown on a 16x2 LCD. In addition, an ESP8266 Wi-Fi module is used to provide internet connectivity with the various system applications and to supply data for subsequent jobs: the website shows the output to the user on the screen with a graph of the data and also stores it in the database.
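The server side of this pipeline can be sketched with the Python standard library alone (illustrative only; the endpoint, port, and JSON field names are assumptions, with the ESP8266 POSTing one JSON reading at a time):

import json
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer

db = sqlite3.connect("weather.db")
db.execute("CREATE TABLE IF NOT EXISTS readings "
           "(ts DATETIME DEFAULT CURRENT_TIMESTAMP, payload TEXT)")

class SensorHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # e.g. {"temperature": 27.4, "humidity": 61, "pressure": 1006.2}
        body = self.rfile.read(int(self.headers["Content-Length"]))
        reading = json.loads(body)            # rejects non-JSON payloads
        db.execute("INSERT INTO readings (payload) VALUES (?)",
                   (json.dumps(reading),))
        db.commit()
        self.send_response(200)
        self.end_headers()

HTTPServer(("0.0.0.0", 8000), SensorHandler).serve_forever()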

II. IMPLEMENTATION SETUP


The different components used in the project are:
1)DHT 11
The DHT11 is a low-cost digital sensor for temperature and humidity detection. This sensor can be easily interfaced with any microcontroller, like an Arduino or Raspberry Pi, to measure humidity and temperature instantly.

2)BMP180
BMP180 is a high-precision sensor designed for consumer applications. Air pressure is nothing but the weight of
air acting on everything. Air has weight and wherever there is air its pressure will be felt. The BMP180 sensor
detects this pressure and provides this information as a digital output.

3)MQ135
The MQ135 gas sensor is used in air quality monitoring equipment and is suitable for detecting or measuring
NH3, NOx, Alcohol, Benzene, Smoke, and CO2. The MQ135 sensor module comes with a digital pin that allows
this sensor to work even without a microcontroller and is very useful when you are just trying to detect a specific
gas.

68
4)MQ07
The MQ07 gas sensor is highly sensitive to carbon monoxide. The sensor can be used to detect different gases
containing CO, it is inexpensive and suitable for different applications.

5)ESP 8266
The ESP8266 is a low-cost WiFi module of the ESP series that you can use to control your electronic projects from anywhere in the world. It has a built-in microcontroller and 1 MB of flash memory, and it can connect to WiFi networks.
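On the device side, a reading loop of the kind described above might look as follows, assuming MicroPython firmware on the ESP8266 (the pin number and server address are assumptions, and urequests comes from micropython-lib):

import time
import dht
import machine
import urequests

sensor = dht.DHT11(machine.Pin(2))       # DHT11 data line on GPIO2

while True:
    sensor.measure()
    reading = {"temperature": sensor.temperature(),   # degrees Celsius
               "humidity": sensor.humidity()}         # percent RH
    urequests.post("http://192.168.1.10:8000", json=reading).close()
    time.sleep(60)                                    # one reading per minute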

III. RESULTS AND DISCUSSIONS


The result is a weather detection system that detects the weather based on various environmental parameters like humidity, temperature, barometric pressure, sound, and light. The system is linked to a device via IoT, which provides a graphical representation of the readings, and the data is stored in the database.

IV. FUTURE SCOPE
This system is very useful for farmers, who can ensure high crop yield and reduce weather-related risk through IoT; in particular, it is useful when drastic changes in the environment take place. In the future, we may add different types of sensors, such as earthquake detection, light, and rain level sensors. We may also add machine learning and artificial intelligence algorithms to predict future weather and its effect on the environment.

V. CONCLUSION
This system is used to detect environmental parameters such as temperature, humidity, pressure, air quality, etc., with sensors collecting data from the environment. By implementing sensors in the system, we can capture the live state of the environment, and the results of the collected data are then displayed on the screen via Wi-Fi. We can use this device to monitor a specific room or place where environmental parameters need to be watched, and the accuracy of the model can be checked against real data. The main purpose of this device is to make the system beneficial and useful.

ACKNOWLEDGMENT
"DESH VIT, Pune", "Principal VIT, Pune", "HOD, DESH" sincerely thank you for giving us valuable advice
and suggestions in this project. "P.P Musale" would like to thank you for your valuable advice, enthusiastic
attitude and support throughout our project. We are fortunate to work under your leadership.

REFERENCES
[1] Dhanashree S. Medhekar, Mayur P. Bote, Shruti D. Deshmukh, “The Heart Disease Prediction System Using Naïve Bayes”, International Journal of Advanced Research in Science, Technology and Engineering, Vol. 2, Issue 3, March 2013, pp. 1-5
[2] Amruta A. Taksande, PS Mohod, “Application of Data Mining in Weather Forecasting Using Regular Model Growth
Algorithms”, International Journal of Science and Research(IJSR), Volume 4 Issue 6, June 2015
[3] Mary Nsabagwaa, Maximus Byamukamab, Emmanuel Kondelaa, “Towards a Powerful and affordable Automated
Weather Station”

[4] Mehrnoosh Torabi, Sattar Hashemi, “ Data Mining Models for Weather Forecasting”, The 16th CSI International
Symposium on Artificial Intelligence and Signal Processing (AISP 2012), IEEE, pp 579-584
[5] Mr. Sunil Navadia, Mr. Jobin Thomas, Mr. Pintukumar Yadav, Ms. Shakila Shaikh, “ Weather Forecasting: A new
approach for Measuring & Analysing weather data”
[6] P. Sushmitha, G.Sowmybala “Design and Implementation of Weather Monitoring and Controlling System”,
International Journal of Computer Applications.
[7] T.R.V Anandharajan G.Abhishek Hariharan, K.K.Vignajeth, R.Jijendiran “ Monitoring the weather with Artificial
Intelligence.
[8] Raj Kumar, Shiva Prakash, “Performance and Parametric Analysis of IoT Motes with Different Network Topologies”


Automatic Depression Level Detection


Prof. Rupali Umbare1, Vedant Bhamre2, Danish Tamboli3, Trushant Jadhav4 and Pranav Kurle5
1-5
Department of Information Technology, JSPM'S Rajarshi Shahu College of Engineering, Pune, MH-India
Email: [email protected]

Abstract— According to physiological research, there are a variety of variances in both speech and facial movements between healthy and depressed people. On the basis of this information, we offer Multimodal Attention Feature Fusion (MAFF) and a novel Spatio-Temporal Attention (STA) network, which are utilized to obtain a multimodal representation of depression signals in order to predict an individual's degree of depression. Concretely, we first segment the speech amplitude spectrum and the video into predetermined lengths and submit them to the STA network, which focuses on the audio and video frames that indicate depression while integrating attentional processing of spatial and temporal information. The audio and video segment-level features are taken from the output of the STA network's final fully connected layer. In order to capture the changes in every aspect of the audio and video segment-level features and summarize them at the audio and video feature level, this study also provides an eigen evolution pooling approach. The MAFF is then used to create a multimodal representation composed of complementary modal data, which is fed into a support vector regression predictor to determine the severity of the depression. The utility of our strategy is illustrated by experimental findings on the AVEC2013 and AVEC2014 depression databases.

Index Terms— Convolutional Neural Network, Multimodal Audio/Video Segment-Level Depression Detection Feature, Machine Learning.

I. INTRODUCTION
Depression is a condition that causes people to have extremely low moods and an inability to engage in typical social interactions. More gravely, depression can also cause behaviors that contribute to self-harm and suicide; as a result, depression is projected to overtake heart disease as the second biggest cause of death by 2030. Fortunately, early diagnosis and therapy can help people recover quickly. However, the diagnostic process is typically challenging and heavily dependent on doctors, which can prevent some patients from receiving timely, effective therapy. Finding a system for automatically diagnosing depression is therefore vital to help clinicians work more effectively. New algorithms create new opportunities for automatic depression identification models, and they can lead to model improvement by increasing accuracy in forecasting depressed clients. According to physiological research, depressive patients' speech and facial movements differ slightly from those of healthy people.

II. REVIEW OF LITERATURE


[1] Yanfei Wang et al. focused on multiple instance learning, with sampling, slicing, Long Short-Term Memory, and Multiple Instance Learning as methods for feature manipulation. Depression is detected by binary classification using the Support Vector Machine algorithm. The approach is able to detect depression symptoms in various small video clips, and a maximum accuracy of 81.06% is achieved with this learning technique.
[2] Sri Harsha Dumpala et al. offer a multi-task learning framework for enhancing depression severity prediction based on the acoustic aspects of brief speech audio recordings, together with sentiment and emotion embeddings. The suggested multi-task training with regression and classification improves the assessment of the level of depression, according to experimental findings. Additionally, they demonstrated that, compared to two separate networks, a multi-task CNN achieves higher sentiment and emotion classification performance. When paired with acoustic characteristics, the sentiment-emotion embeddings in this multi-task CNN considerably enhanced the accuracy of estimating depression severity. These enhancements imply that the suggested methods may be applied to creating clinical applications.
[3] Anastasia Pampouchidou et al. note that although many related algorithms are documented in the literature, automatic depression evaluation still needs considerable development compared to current practice. The ability to differentiate between various types of depression, and how MDD differs from other mood disorders, is just one clinical research concern that has to be addressed methodically. Further research is necessary to understand individual variation brought on by concomitant personality disorders or traits, as well as the impact of culture and ethnicity. Intriguingly, even though such data can be useful in understanding ongoing emotional responses, physiological activity measured through EMG, BVP, skin conductance, and respiration has not been included in the reviewed multimodal studies, with the exception of those that recorded heart rate using a non-contact, facial-video-based system.
[4] Yuan Gong et al. note that, as a common mental condition, major depressive disorder must be accurately diagnosed in order to provide targeted intervention and care. Participants in the challenge they address are asked to use audio, video, and text from an interview lasting between 7 and 33 minutes to build a model that forecasts the severity of depression. It is difficult to find, collect, and keep crucial temporal details for such lengthy interviews, because doing so otherwise loses the majority of temporal facts. As a result, they suggest a novel topic-modeling-based method for context-aware analysis. Their tests demonstrate that the suggested strategy outperforms the challenge baseline and the context-unaware method by a wide margin across all criteria. By analyzing the features chosen by the machine learning algorithm, they also found that their approach can discover a variety of temporal features that have an underlying relationship with depression and build models on them, a task that is challenging for humans to complete.
[5] Asif Salekin et al. note that although depression and social anxiety disorder are relatively widespread, many people who are depressed or anxious choose not to seek counselling. The majority of current tests for these disorders are based on client self-report and clinician judgement, making them cumbersome to perform, vulnerable to subjective bias, and unavailable to patients who have difficulty accessing therapy. The
development of methods for identification, assessment, prevention, and therapy might benefit from objective
indicators of depression and social anxiety. For the purpose of identifying symptomatic persons and state affect
from extensive spoken audio data, we provide a weakly supervised learning system. We specifically present
NN2Vec, a novel feature modelling approach that makes use of the innate relationship between voice states and
symptoms/affective states. Additionally, we offer a novel MIL modification of the BLSTM classifier, known as
BLSTM-MIL, in order to comprehend the temporal dynamics of vocal states in weakly labelled data. We tested
our framework using spontaneous audio speech recordings from 105 participants, including speakers who were
very socially uncomfortable.
[6] A technique for speech-based depression detection using deep convolutional neural networks was developed by Karol Chlasta et al. After analyzing five network topologies, ResNet-34 and ResNet-50 were found to deliver the best classification results. The results suggest a workable new method that uses audio
spectrograms and quick voice samples for initial screening of depressive individuals. The potential for the
spectrograms to generate CNN learnable features was found. This held true despite the challenge of utilizing
voice as a predictor of depressed symptoms. We think that the solution's use of 15-second sample intervals
helped to reduce the effect of noise. Our system can be used independently or as a part of a more complex,
hybrid, or multi-modal strategy.
[7] Richard Caruana et al. present multitask learning (MTL) in connectionism, a proven approach for training
artificial neural networks with many outputs. In actuality, MTL in connectionism can be seen in the traditional
NETtalk application and earlier work on giving suggestions to neural networks. We present an empirical
example that demonstrates how MTL can still enhance generalization performance even when the similarity
between numerous tasks is challenging to learn and recognize. We go on to show that this progress cannot be
attributed to anything other than a shared inductive bias resulting from the similarity of the tasks. We provide a
brief explanation of how to create multitask decision trees from the top down in order to demonstrate the
generality of the MTL methodology. Decision trees are not typically used to learn several tasks; therefore, this is
noteworthy. By doing this, a system is created that generalizes particular conceptual clustering techniques,
enhancing their applicability in fields where the separation between features (information that will become
available in the future) and classes (objects we wish to forecast) must be maintained.
[8] Robert J. McAulay et al. developed an analysis/synthesis method that was used to analyze speech that was
both clear and interfered with in various ways. In every instance, natural-sounding, high-quality synthetic speech
was produced. The technique may also be applied to the parametric representation of non-speech sounds, such as
music and particular marine biological noises. Finally, it is important to keep in mind that tools used to change
the width of the analysis window are essential for high-quality speech reconstruction, in addition to updating the
average pitch. It is vital to remember that, despite the use of the frequency analysis window, no voicing decisions
are made during the analysis and synthesis process.

III. METHODOLOGY

Figure 1: Methodology for gathering of data

1. Gathering of Data: Gathering of data is required at the first stage, as we need to create a data-set which can be
used to analyze and build a better working model. The data will be collected in the form of audio and video, and
it will be collected in sufficient quantity to create a better working model.
2. Pre-Processing of Data: The steps involved in pre-processing of data are data cleaning, feature selection, and
data transformation (a rough sketch appears after this list).
Data cleaning is the process of removing or fixing the missing or incorrect data stored in the database; whenever
we create a data set it is bound to contain some errors, and cleaning those errors gives better and more accurate
results.
Feature selection is the process of picking appropriate features from the data-set, which in turn directs the way
in which our model is influenced.
Data transformation is a process where the needs and behavior of the algorithm are taken into account and the
data is changed accordingly, for example in its structure or format.
3. Projection/Prediction of Data: This step refers to the output after the model is trained on the provided data
set; we predict facial gestures from the video data set and the tone/audio notes of the voice collected from the
audio data set, making predictions based on the previous data.
4. Tools needed for Data Visualization: Data visualization tools provide an easy way to create visuals for huge
data sets. Statistics of the data which are generally not visible are clearly presented, and underlying patterns can
be easily uncovered.
The flow chart is shown in the following figure.
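As a rough illustration of the pre-processing stage described in step 2, the sketch below uses pandas and scikit-learn; the file name features.csv, the label column, and the choice of 20 features are hypothetical assumptions, not details taken from this work.

```python
# Minimal pre-processing sketch (hypothetical file and column names).
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("features.csv")            # audio/video features plus a "label" column (assumed)

# 1. Data cleaning: drop duplicates, fill missing numeric values with the median.
df = df.drop_duplicates()
df = df.fillna(df.median(numeric_only=True))

X, y = df.drop(columns=["label"]), df["label"]

# 2. Feature selection: keep the 20 features most associated with the label.
X_sel = SelectKBest(f_classif, k=20).fit_transform(X, y)

# 3. Data transformation: rescale features to zero mean and unit variance.
X_ready = StandardScaler().fit_transform(X_sel)
```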

IV. SYSTEM ARCHITECTURE

Figure 2: System architecture for our model

Algorithms Used:
A. SVM
The acronym SVM stands for "Support Vector Machine", a supervised machine learning algorithm that has
the ability to create regression and classification models that perform well on both linearly and non-linearly
separable data. The SVM algorithm performs classification with the help of a margin. The objective of the
algorithm is to find the boundary which most accurately separates the data points in n-dimensional space; this
boundary is called the hyperplane.
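As a minimal illustration (not the authors' implementation), an SVM classifier in scikit-learn on synthetic data might look as follows; the RBF kernel and C value are illustrative assumptions.

```python
# Minimal SVM classification sketch with scikit-learn on synthetic data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# An RBF kernel handles non-linearly separable data; C trades margin width for errors.
clf = SVC(kernel="rbf", C=1.0).fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```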

Figure 3: Support Vector Machine Representation

B. CNN

Figure 4: Convolutional Neural Network Representation


CNN stands for Convolutional Neural Network, a particular kind of DNN (Deep Neural Network) applied to the
analysis of visual imagery. This algorithm uses a technique called convolution, a mathematical operation
performed on two functions that creates a third function showing how the shape of one is modified by the other.
The main objective of the CNN is to reduce the complexity of images so that those images can be processed
easily while not losing any of their features, which then gives us better predictions.
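A minimal Keras sketch of such a network is given below; the input shape, layer sizes and two-class output are assumptions for illustration, not the architecture used in this paper.

```python
# Minimal CNN sketch: convolution + pooling shrink the image while learning
# features, then a dense softmax layer classifies (all shapes are assumptions).
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(64, 64, 1)),          # e.g. a 64x64 grayscale frame or spectrogram
    layers.Conv2D(16, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),              # reduces spatial size, keeps salient features
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(2, activation="softmax"),    # two classes assumed for illustration
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
```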

V. CONCLUSION
According to physiological research, facial and verbal activity differ only subtly between depressed and
healthy people. In light of this reality, we develop a multimodal spatiotemporal representation paradigm for
automatic depression level identification. The suggested STA network focuses on frames relevant to depression
detection in addition to integrating temporal information. In addition, by removing redundant information
between modalities, the suggested MAFF method enhances the quality of the multimodal representation.
Experimental results on AVEC2013 and AVEC2014 show that our approach performs well in terms of detection.
Human speech is a sophisticated combination of words and feelings. Every word might mean something
different depending on the context in which it is used. Every user will have a different mental state, making it
challenging to understand their input. Their feelings can help us grasp what they are going through even better.
This also aids in making a schedule and choosing therapies. If the data is accessible, we also plan to apply this
approach to identify other diseases. To increase detection accuracy, we will partition the various tasks and train
separate models in the future.

REFERENCES
[1] "Automatic Depression Detection Via Facial Expression Using Multiple Instance Learning," IEEE 17th International
Symposium on Biomedical Imaging (ISBI), 2020, by Yanfei Wang, Jie Ma, Bibo Hao, Pengwei Hu, Xiaoqian Wang,
Jing Mei, and Shaochun Li
[2] Estimating severity of Depression From Acoustic Features and Embedding of Natural Speech | IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP) | 978-1-7281-7605-5/20/$31.00 2021 IEEE, Sri
Harsha Dumpala, Sheri Rempel, Katerina Dikaios, Mehri Sajjadian, Rudolf Uher, Sageev Oore.
[3] Automatic Assessment of Depression Based on Visual Cues: A Systematic Review, by Anastasia Pampouchidou,
Panagiotis Simos, Kostas Marias, Fabrice Meriaudeau, Fan Yang, Matthew Pediaditis, and Manolis Tsiknakis, IEEE
Transactions on Affective Computing, 2017.
[4] "Topic Modeling Based Multi-modal Depression Detection," Proceedings of the 7th Annual Workshop on Audio/Visual
Emotion Challenge, Yuan Gong and Christian Poellabauer, 2017, pp. 69–76.
[5] A Weakly Supervised Learning Framework for Detecting Social Anxiety and Depression, Proceedings of the ACM on
interactive, mobile, wearable and ubiquitous technologies, vol. 2, no. 2, pp. 1–25, 2018. A. Salekin, J. W. Eberle, J. J.
Glenn, B. A. Teachman, and J. J. Stankovic.
[6] "Automated speech- based screening of depression using deep convolutional neural networks," Procedia Computer
Science, vol. 164, pp. 618–628 (2019). K. Chlasta, K. Wok, and I. Krejtz.
[7] Multitask learning: A knowledge-based source of inductive bias, R. Caruana, Proceedings of the ICML, 1993.
[8] Speech Analysis/Synthesis Based on a Sinusoidal Representation by Robert J. McAulay, VOL. ASSP-34, NO. 4,
AUGUST 1986.


Cryptocurrency Price Prediction by Integrating Optimization Mechanism to Machine Learning
Dr Deepak Nandal1 and Pankaj2
1 Asst. Prof., Computer Science and Engineering, Guru Jambheshwar University of Science and Technology, Hisar,
Haryana 125001, India
Email: [email protected]
2 Research Scholar, Computer Science and Engineering, Guru Jambheshwar University of Science and Technology, Hisar,
India
Email: [email protected]

Abstract—Research is currently being done to predict the future value of cryptocurrency.
The possibility of using a Python-based approach has been explored by the scientific community
as a means of realizing this aim. In predictive analytics, it is standard practice to draw the
training and testing sets from the same dataset. Traditional studies have been slowed
down by problems with precision and efficiency. This research makes use of optimization and
the Python programming language to provide a versatile prediction model with little
implementation time. The dataset size is decreased when classification is performed in Python,
which shortens the training period. Eliminating extraneous information also improves the
performance of the trained model. Because of this change, we aim to develop a system that is
both adaptable and extensible. To put it another way, such a system would help cryptocurrency
investors make better decisions when buying and selling cryptocurrency. Using several factors,
the study's results have significantly influenced Bitcoin price forecasts. Such analyses often
guide the investment choices of many fund managers and private investors. A flexible and
scalable strategy has been developed for determining an appropriate script's ideal value.
As trading platforms progress, investors will need a mechanism to choose which currency to
purchase at any given moment according to market circumstances.

Index Terms— Machine learning, Cryptocurrency, PSO, Accuracy, F1 score, Recall,
Precision.

I. INTRODUCTION
Cryptocurrency is a form of virtual currency that is encrypted to prevent forgery and double spending. The
networks behind many cryptocurrencies are completely decentralized, and blockchain technology is crucial to
them: global computer networks keep a general ledger. Cryptocurrencies are digital assets that run on a
decentralized system. Numerous industries, including finance and law, are expected to be shaken up by
blockchain and related technology, according to experts. Faster monetary transactions are one of the main
advantages of cryptocurrency; the inconsistency of prices and the high cost of transactions are two major
constraints. When using cryptography to safeguard digital or virtual money, simplicity in maintaining and
managing the cryptographic information is a primary concern, as it helps avoid complexity and reduce data
processing time. Information in this system may be accessed through bitcoin by individuals with certain
permissions. The present research focuses on the following objectives:

1. Estimating cryptocurrency's potential using data from the coin market cap
2. Keeping a tally of crypto assets over the course of a year to determine their best selling price
3. Developing an original method for forecasting optimal or appropriate pricing
4. Easing the burden on the investor by giving them the best possible pricing

[Figure: prediction factors (basic and high-dimensional) feed a cryptocurrency price prediction block, which predicts using statistical techniques and machine learning techniques]
Figure 1. Cryptocurrency Price Prediction using machine learning

A. Machine learning
Understanding and developing 'learning' techniques, or methods that use data to enhance performance on some
set of tasks, is the focus of ML, a subfield of computer science. It's considered a kind of AI. In order to generate
predictions or judgments without being explicitly programmed, machine learning algorithms construct a model
using sample data. This data is referred to as training data. In many fields, including health, email filtering, voice
recognition, and computer vision, traditional algorithms would be too time-consuming or costly to design. This
is where machine learning algorithms come in.
B. Particle Swarm Optimization
PSO was first suggested by Kennedy and Eberhart in 1995. Scientists who study social behaviour believe that
members of a travelling school of fish or flock of birds "may profit from the experience of all other members."
When one bird in a flock goes out in search of food, the others may benefit from the information it gathers by
hearing from it about the best spots to eat. In this context, "best" means the best solution in a high-dimensional
problem space where several solutions exist.
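A minimal numpy sketch of the PSO update described above follows; the inertia and acceleration coefficients (0.7 and 1.5) and the sphere test function are illustrative assumptions, not values from this paper.

```python
# Minimal PSO sketch: particles share the best position found so far (gbest).
import numpy as np

def pso(f, dim=2, n_particles=30, iters=100, lo=-5.0, hi=5.0):
    rng = np.random.default_rng(0)
    x = rng.uniform(lo, hi, (n_particles, dim))        # particle positions
    v = np.zeros_like(x)                               # particle velocities
    pbest, pbest_val = x.copy(), np.apply_along_axis(f, 1, x)
    gbest = pbest[pbest_val.argmin()]                  # best position of the whole swarm
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        # inertia + pull toward personal best + pull toward global best
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
        x = x + v
        val = np.apply_along_axis(f, 1, x)
        better = val < pbest_val
        pbest[better], pbest_val[better] = x[better], val[better]
        gbest = pbest[pbest_val.argmin()]
    return gbest, pbest_val.min()

print(pso(lambda p: (p ** 2).sum()))                   # minimizes the sphere function
```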
C. LSTM
LSTM has been recognized as a prominent artificial RNN and is widely used in the field of deep learning. One
of LSTM's distinguishing features is its capacity for connectivity and feedback, in contrast with a regular
feed-forward neural network. It can therefore process not only individual pieces of information, such as images,
but also entire sequences of data, such as audio or video files. LSTM networks are deemed appropriate for
classification tasks and are considered a type of RNN. Aside from the regular units, LSTM also supports some
unique ones: a single "memory cell" makes up an entire LSTM unit, and these memory cells can keep
information stored for very long periods of time. Due to LSTM's improved customization options, users are
increasingly switching over from plain RNNs. LSTMs can control the inflow and distribution of inputs based on
learned weights, which allows for adaptability in output management and enables productive outcomes.
D. Crypto currency
Cryptocurrency, often known as crypto, crypto-currency, or just crypto, is a kind of digital money meant to
function as a means of exchange on a decentralized network, rather than a centralized one backed by a
government or a bank. A digital ledger is a database that keeps track of who owns currencies and when. It uses
encryption to prevent unauthorized access to the database and ensure the integrity of all transactions and coin
ownership transfers. Cryptocurrencies, despite their name, are not regarded to be currencies in the classic sense.
E. Role of Machine Learning in Crypto currency
Predicting cryptocurrency prices using machine learning is the best option available. In order to make a
reasonably accurate prediction, the model needs to satisfy a number of criteria. Daily and 5-minute-interval price
predictions for Bitcoin are made using a wide variety of ML models, such as LDA, LR, RF, XGBoost, SVM,
DT, QDA, and KNN. When it comes to blockchain and cryptocurrencies, the uses of machine learning go far
beyond price prediction. By streamlining the back-end processes of crypto trading and mining, ML has the
potential to address the security problems in this technology through deep learning and reinforcement learning.

II. LITERATURE REVIEW
Various studies have been conducted to determine how best to predict the price of cryptocurrencies. A.
Cheung [1] observed that, as of 2013, cryptocurrency had received almost no attention from the academic
community, and set out to fill this gap in prior knowledge. As part of research on the occurrence of Bitcoin
bubbles, a newly created tool that is quite effective at spotting bubbles was deployed. There have been a number
of brief bubbles in the cryptocurrency market since 2010, but the three largest bubbles all occurred in the years
between 2011 and 2013, lasting between six and six-and-a-half months each and ultimately leading to the
downfall of Mt Gox. Numerous studies have shown that the hazards associated with Bitcoin may be mitigated.
In 2019, GARCH-in-mean models were used by J. Liu [3] to investigate the link between volatility and returns
of the dominant cryptocurrencies and the ripple effects of the cryptocurrency market. According to E. Bouri [4]
in 2020, all three cryptocurrencies studied had considerable jump activity in their return series. These numbers
suggest that the existence of one cryptocurrency boom raises the probability that subsequent cryptocurrency
booms will occur as well. Conversely, co-jumping refers to jumping in sync with other traders to maximize
volume. In 2020, N. Akbulaev [5] studied the theoretical and practical connections between Bitcoin and
Ethereum. Expanding the scope of previous studies on the fundamental properties of Bitcoin and Ethereum and
the correlations between their values has allowed for a better understanding of recent trends in the industry. The
values of Bitcoin and Ethereum were shown to be correlated, and this connection might be leveraged to mitigate
risk when trading cryptocurrencies on exchanges like Gemini. The authors of [6] looked at whether Bitcoin is a
means of exchange or an asset, as well as its present and potential future applications. Their research
demonstrates that Bitcoin's statistical properties are distinct from those of conventional asset classes like stocks,
bonds, and commodities, and this holds true in both stable and volatile financial environments. Speculation,
rather than use as a means of trade or currency, is the most common use of Bitcoin, according to data collected
from Bitcoin accounts. S. Corbet [7] published a study in 2018 that looked at the temporal and frequency
connections between three major cryptocurrencies and many different types of financial assets. Many indicators
point to the fact that these assets are distinct from monetary and material prosperity. The data suggests that
Bitcoin investments may provide diversification advantages for short-term traders. The interconnectedness of
things may change over time as a result of shocks to the financial system from outside the nation. In 2013, E.
Turkedjiev [9] used an ANN to provide short-term stock value predictions, particularly for financial institutions.
The nonlinearity of artificial neural networks (ANNs) makes them useful for analyzing stock market time-series
data. The Hong Kong Through Train and the QDII, both introduced in 2007, also had a substantial effect on the
price gaps between A and H shares in 2010. Several legislative proposals have also been made with the goal of
narrowing the gap between A- and H-stock prices. According to L. Guoyi [10], the total equity and GDP,
earnings after tax per share, and market index are all significant elements that affect a bank's stock price. To
determine whether this information has a relationship to the end-of-day share prices of the banks in question, a
test model is employed for analysis and verification. According to S.-B. Ho [14], in order to have a good general
learning machine, one needs a machine that can solve a broad variety of problems quickly in a dynamic
environment.

III. PROBLEM STATEMENT


There is an immediate need for a scalable and flexible methodology to estimate the value of cryptocurrencies.
The research will take into account the results of previous studies on Deep Learning and Machine Learning. We
want to look at the performance and computational expense of a traditional deep learning system. Price
predictions for popular cryptocurrencies including Bitcoin, Litecoin, Ethereum, Waves, and BTT should be
evaluated, and existing techniques should be compared to those that have been proposed. In order to suggest a
new method for predicting the prices of ETH, BTC, LTC, & WAVE cryptocurrencies, research on existing
methods and problems in this area is required.

IV. PROPOSED WORK


A. LSTM and Its Training Mechanism
The "net" network that has been trained is saved in the system so that it may be tested again later. Two LSTM
layers were used in the implementation, which resulted in a trained network. During the training process, the
proposed model makes use of not one, but two LSTM layers before resorting to a drop out layer. Seventy percent
of the data set is used for training purposes, while the remaining thirty percent is used for testing. The LSTM-
dependent neural network is trained based on the features. Training duration is affected by a number of
parameters, one of which is batch size. Accuracy is improving thanks in large part to the hidden layers and

79
dropout layers. Once a dataset is obtained, characteristics are chosen to use in the training process. Then, a 12
hidden layer LSTM1 layer and a 5 hidden layer LSTM2 layer are implemented, with the training/testing ratio
determined. Over fitting is fixed by dropout layers, and after that a fully connected layer and a softmax layer are
utilized. Decisions on potential intrusions may be made with the use of a classification operation.
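A hedged Keras sketch of the stack just described (two LSTM layers with 12 and 5 hidden units, dropout, a fully connected layer and softmax) is shown below; the input shape, dropout rate and three-class output are assumptions, not values reported in the paper.

```python
# Sketch of the described network; shapes and hyperparameters are assumptions.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(30, 5)),                 # 30 time steps, 5 price features (assumed)
    layers.LSTM(12, return_sequences=True),      # LSTM1 with 12 hidden units
    layers.LSTM(5),                              # LSTM2 with 5 hidden units
    layers.Dropout(0.2),                         # dropout counters overfitting
    layers.Dense(3, activation="softmax"),       # overvalued / undervalued / normal
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# Train on the 70% split and evaluate on the remaining 30%, e.g.:
# model.fit(X_train, y_train, batch_size=32, epochs=50)
# model.evaluate(X_test, y_test)
```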

[Figure: flow chart showing dataset capture, PSO-based optimization and data filtering, a 70/30 train/test split, sequence input and word-embedding layers, two LSTM layers with dropout, a fully connected layer, an activation function, and final classification into overvalued, undervalued and normal prices]

Figure 2. Process flow for proposed work

B. Research Methodology
The cryptocurrency dataset is captured using a Python script, and PSO is applied over the dataset. PSO supports
obtaining an optimized price in order to inform investors of the best prices. The dataset is then filtered
considering the optimized value, and a machine learning approach is used for training.
The proposed objectives include a study on the establishment of records of Bitcoin pricing. In the current study,
categorization of cryptocurrency prices is recommended on the basis of undervalued, overvalued, and typical
pricing. Researchers may now evaluate their findings with the help of the accuracy parameters they obtained.

V. RESULT AND DISCUSSION


The training operation is performed after filtering the optimized dataset in the case of BTC, Ethereum, Polygon
MATIC, LTC and WAVE.

[Figure: flow chart of the proposed work: start; get dataset of cryptocurrency prices; data preprocessing; data filtering; get the optimized price and filter the dataset; use the filtered, optimized dataset for training; initialize the machine learning model (batch size, epochs, hidden layers); perform training and testing; classify prices as overvalued, undervalued or normal; get the accuracy parameters and evaluate; stop]

Figure 3. Process flow of proposed work

Figure 7. Global optimized price
Figure 8. Factor plot of global optimized price by weekend/weekday

Error and accuracy report after training of the LSTM model:


 Train Mean Absolute Error: 0.10702336612861828
 Train Root Mean Squared Error: 0.16077715963351358
 Test Mean Absolute Error: 0.022237375378608704
 Test Root Mean Squared Error: 0.022237375378608704
 Train Accuracy: 0.8929766338713817
 Test Accuracy: 0.9777626246213913
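For reference, the reported MAE and RMSE figures can be computed from predictions as in the following numpy sketch; the toy arrays are purely illustrative.

```python
# How mean absolute error and root mean squared error are computed.
import numpy as np

def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

y_true = np.array([1.0, 0.8, 0.6])   # illustrative scaled prices
y_pred = np.array([0.9, 0.85, 0.55])
print(mae(y_true, y_pred), rmse(y_true, y_pred))
```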

Figure 9. Simulation of training loss and testing loss

TABLE I. ANALYSIS OF ACCURACY OF CRYPTOCURRENCY PRICE PREDICTION


Date BTC ETHEREUM WAVE LTC MATIC
10/9/2022 98.5317806 94.4814959 98.0737862 98.6103608 88.4602016
10/10/2022 99.783436 92.6781234 98.0465749 98.2668051 89.6173091
10/11/2022 96.9286016 97.6433739 98.5264403 98.670375 97.0969601
10/12/2022 97.7737 97.2695299 98.3237916 98.451252 99.5937368
10/13/2022 95.92523 96.4562918 98.3094838 98.1136046 99.5400505

Figure 10. Comparison of accuracy for different cryptocurrencies

VI. CONCLUSION

The cryptocurrency industry has developed a complex cryptographic infrastructure to oversee its many

operations. The issues that arise from manually handling a cryptocurrency's administration are addressed and
avoided in this project. Information such as users, crypto holders, author ids, and author biographies is being
managed as part of the research. The research sphere is expansive. When handling data, this system took a
number of factors into account. The simulation results show that the suggested technique is more precise than
previous methods, with the proposed LSTM model providing accuracy above 97%.

FUTURE SCOPE
Due to the rapid growth in the popularity of cryptocurrency, it has become essential for investors to make
investment decisions considering overvalued, undervalued and normal values. The present research has focused
on the optimized value of cryptocurrency and proposed an efficient machine learning approach. Such research
could also play a significant role in predicting best prices in the stock market. Thus, the present research would
contribute toward different investment options.

REFERENCES
[1] A. Wai-K. Cheung, E. Roca, and J.-J. Su, “Crypto-currency bubbles: an application of the Phillips–Shi–Yu (2013)
methodology on Mt. Gox bitcoin prices,” Applied Economics, vol. 47, no. 23. Informa UK Limited, pp. 2348–2358,
Feb. 04, 2015 [Online]. Available: http://dx.doi.org/10.1080/00036846.2015.1005827

[2] J. Carrick, “Bitcoin as a Complement to Emerging Market Currencies,” Emerging Markets Finance and Trade, vol. 52,
no. 10. Informa UK Limited, pp. 2321–2334, Aug. 02, 2016 [Online]. Available:
http://dx.doi.org/10.1080/1540496X.2016.1193002
[3] J. Liu and A. Serletis, “Volatility in the Cryptocurrency Market,” Open Economies Review, vol. 30, no. 4. Springer
Science and Business Media LLC, pp. 779–811, Aug. 24, 2019 [Online]. Available: http://dx.doi.org/10.1007/s11079-
019-09547-5
[4] E. Bouri, D. Roubaud, and S. J. H. Shahzad, “Do Bitcoin and other cryptocurrencies jump together?,” The Quarterly
Review of Economics and Finance, vol. 76. Elsevier BV, pp. 396–409, May 2020 [Online]. Available:
http://dx.doi.org/10.1016/j.qref.2019.09.003
[5] N. Akbulaev, I. Mammadov, and M. Hemdullayeva, “Correlation and Regression Analysis of the Relation between
Ethereum Price and Both Its Volume and Bitcoin Price,” The Journal of Structured Finance, vol. 26, no. 2. Pageant
Media US, pp. 46–56, Apr. 29, 2020 [Online]. Available: http://dx.doi.org/10.3905/jsf.2020.1.099
[6] D. G. Baur, K. Hong, and A. D. Lee, “Bitcoin: Medium of exchange or speculative assets?,” Journal of International
Financial Markets, Institutions and Money, vol. 54. Elsevier BV, pp. 177–189, May 2018 [Online]. Available:
http://dx.doi.org/10.1016/j.intfin.2017.12.004
[7] S. Corbet, A. Meegan, C. Larkin, B. Lucey, and L. Yarovaya, “Exploring the dynamic relationships between
cryptocurrencies and other financial assets,” Economics Letters, vol. 165. Elsevier BV, pp. 28–34, Apr. 2018 [Online].
Available: http://dx.doi.org/10.1016/j.econlet.2018.01.004
[8] P. Alagidede and T. Panagiotidis, “Stock returns and inflation: Evidence from quantile regressions,” Economics Letters,
vol. 117, no. 1. Elsevier BV, pp. 283–286, Oct. 2012 [Online]. Available:
http://dx.doi.org/10.1016/j.econlet.2012.04.043
[9] E. Turkedjiev, M. Angelova, and K. Busawon, “Validation of Artificial Neural Network Model for Share Price UK
Banking Sector Short-Term Trading,” 2013 UKSim 15th International Conference on Computer Modelling and
Simulation. IEEE, Apr. 2013 [Online]. Available: http://dx.doi.org/10.1109/UKSim.2013.31
[10] W. Kang and R. A. Ratti, “Oil shocks, policy uncertainty and stock market return,” Journal of International Financial
Markets, Institutions and Money, vol. 26. Elsevier BV, pp. 305–318, Oct. 2013 [Online]. Available:
http://dx.doi.org/10.1016/j.intfin.2013.07.001
[11] S.-H. Kim and D. Kim, “Investor sentiment from internet message postings and the predictability of stock returns,”
Journal of Economic Behavior &amp; Organization, vol. 107. Elsevier BV, pp. 708–729, Nov. 2014 [Online].
Available: http://dx.doi.org/10.1016/j.jebo.2014.04.015
[12] T. Hendershott and M. S. Seasholes, “Liquidity provision and stock return predictability,” Journal of Banking &amp;
Finance, vol. 45. Elsevier BV, pp. 140–151, Aug. 2014 [Online]. Available:
http://dx.doi.org/10.1016/j.jbankfin.2013.12.021
[13] X. Li, X. Huang, X. Deng, and S. Zhu, “Enhancing quantitative intra-day stock return prediction by integrating both
market news and stock prices information,” Neurocomputing, vol. 142. Elsevier BV, pp. 228–238, Oct. 2014 [Online].
Available: http://dx.doi.org/10.1016/j.neucom.2014.04.043
[14] S.-B. Ho, “Deep thinking and quick learning for viable AI,” 2016 Future Technologies Conference (FTC). IEEE, Dec.
2016 [Online]. Available: http://dx.doi.org/10.1109/FTC.2016.7821605
[15] T. A. Tang, L. Mhamdi, D. McLernon, S. A. R. Zaidi, and M. Ghogho, “Deep learning approach for Network Intrusion
Detection in Software Defined Networking,” 2016 International Conference on Wireless Networks and Mobile
Communications (WINCOM). IEEE, Oct. 2016 [Online]. Available: http://dx.doi.org/10.1109/WINCOM.2016.7777224
[16] R. McKenna, S. Herbein, A. Moody, T. Gamblin, and M. Taufer, “Machine Learning Predictions of Runtime and IO
Traffic on High-End Clusters,” 2016 IEEE International Conference on Cluster Computing (CLUSTER). IEEE, Sep.
2016 [Online]. Available: http://dx.doi.org/10.1109/CLUSTER.2016.58
[17] W. LIU, “Machine Learning Algorithms and Applications: A Survey,” International Journal of Computer Science and
Information Technology for Education, vol. 3, no. 1. Global Vision Press, pp. 37–46, May 30, 2018 [Online]. Available:
http://dx.doi.org/10.21742/IJCSITE.2018.3.1.07
[18] S. Wawre et al., “Sentiment Classification using Machine Learning Techniques,” International Journal of Science and
Research (IJSR), vol. 5, no. 4, 2016.
[19] C. Yin, Y. Zhu, J. Fei, and X. He, “A Deep Learning Approach for Intrusion Detection Using Recurrent Neural
Networks,” IEEE Access, vol. 5. Institute of Electrical and Electronics Engineers (IEEE), pp. 21954–21961, 2017
[Online]. Available: http://dx.doi.org/10.1109/ACCESS.2017.2762418
[20] S. Sendra, A. Rego, J. Lloret, J. M. Jimenez, and O. Romero, “Including artificial intelligence in a routing protocol using
Software Defined Networks,” 2017 IEEE International Conference on Communications Workshops (ICC Workshops).
IEEE, May 2017 [Online]. Available: http://dx.doi.org/10.1109/ICCW.2017.7962735
[21] D. Zhang, X. Han and C. Deng, "Review on the research and practice of deep learning and reinforcement learning in
smart grids," in CSEE Journal of Power and Energy Systems, vol. 4, no. 3, pp. 362-370, September 2018, doi:
10.17775/CSEEJPES.2018.00520.
[22] F. Wang, L. Duan, and J. Niu, “Optimal Pricing of User-Initiated Data-Plan Sharing in a Roaming Market,” IEEE
Transactions on Wireless Communications, vol. 17, no. 9. Institute of Electrical and Electronics Engineers (IEEE), pp.
5929–5944, Sep. 2018 [Online]. Available: http://dx.doi.org/10.1109/TWC.2018.2851578

[23] X. Qiao, D. Shi, and F. Xu, “Optimal pricing strategy and economic effect of product sharing based on the analysis of
B2C sharing platform,” 2019 16th International Conference on Service Systems and Service Management (ICSSSM).
IEEE, Jul. 2019 [Online]. Available: http://dx.doi.org/10.1109/ICSSSM.2019.8887720
[24] Arti, K. P. Dubey and S. Agrawal, "An Opinion Mining for Indian Premier League Using Machine Learning
Techniques," 2019 4th International Conference on Internet of Things: Smart Innovation and Usages (IoT-SIU),
Ghaziabad, India, 2019, pp. 1-4, doi: 10.1109/IoT-SIU.2019.8777472.
[25] R. Bhowmik and S. Wang, “Stock Market Volatility and Return Analysis: A Systematic Literature Review,” Entropy,
vol. 22, no. 5. MDPI AG, p. 522, May 04, 2020 [Online]. Available: http://dx.doi.org/10.3390/e22050522.
[26] X. Shen, G. Wang, and Y. Wang, “The Influence of Research Reports on Stock Returns: The Mediating Effect of
Machine-Learning-Based Investor Sentiment,” Discrete Dynamics in Nature and Society, vol. 2021. Hindawi Limited,
pp. 1–14, Dec. 31, 2021 [Online]. Available: http://dx.doi.org/10.1155/2021/5049179.


A Comprehensive Study on Current Trends in Unsupervised Machine Learning Algorithms and Challenges in
Real World Applications
Dr. P. Velvadivu1, Dr. M. Sujithra2 and R. Priyadharshini3
1-2 Assistant Professor, Department of Computing - Data Science, Coimbatore Institute of Technology
Email: {velvadivu, sujithra}@cit.edu.in
3 MSc Data Science, Department of Computing, Coimbatore Institute of Technology
Email: [email protected]

Abstract—In this technology-fueled world, everyone is using smart devices, electronic gadgets,
wireless products etc., and huge amounts of data are being generated, collected, and stored in
databases. To efficiently process and intelligently analyze this huge amount of data, knowledge
of a subfield of Artificial Intelligence, namely Machine Learning (ML), is required. Various
types of machine learning and its algorithms have been introduced to handle real-world
scenarios. This paper presents a comprehensive survey of the methodologies, techniques,
algorithms, applications, and challenges of unsupervised machine learning, and of how
unsupervised learning techniques can be helpful in real-world business and environments.
Thus, this study’s key contribution is explaining the principles of different unsupervised
machine learning techniques and their applicability in various real-world application domains,
such as cybersecurity systems, smart cities, healthcare, e-commerce, agriculture, and many more.

Index Terms— Machine learning, Unsupervised learning, clustering, feature selection and
feature extraction.

I. INTRODUCTION
Very large amounts of data and information are collected and stored from various sources like mobile phones,
personal computers, sensors, cameras, satellites, log files, health care trackers, bioinformatics, and
human-generated data like social media, where enormous numbers of photos, videos and audio files are
uploaded daily to the internet. Roughly 2.5 quintillion bytes of data are generated every day. Intelligently
collecting, processing and analyzing huge volumes of data, and developing corresponding smart gadgets and
automated applications, requires the knowledge of Artificial Intelligence and Machine Learning. Machine
learning allows software applications and programs to automatically make accurate predictions without being
explicitly programmed. Machine learning is the most important field of Data Science. In the real world, humans
learn through experience, while machines work based on human instructions; machine learning is the field
where the machine automatically learns by experience as a human does. The role of machine learning is to
learn, improve performance by experience, and predict things with the best accuracy. Machine learning has been
classified into supervised, unsupervised and reinforcement learning.

A. Steps followed in Machine learning process
 Gathering Data: Raw data comes from Excel, Access or text files. This step (collecting past data) forms a
strong foundation for future learning.
 Data Preparation: Raw data from any source contains missing values, irrelevant data, outliers etc. This
step involves data cleaning, normalization, dimensionality reduction, treatment of outliers and methods
to remove irrelevant data.
 Choosing a model: Before choosing a model, there is a need to identify which type of machine learning
problem the problem statement is. Choosing the right machine learning model under the appropriate
machine learning category plays an important role in future prediction.
 Training: Normally, 70% of the data is used for training and the remaining 30% of the dataset is used
for testing or evaluation. Training the machine learning model helps the model to understand the dataset
and learn on its own.
 Evaluation: The 30% testing dataset is used to test the machine learning algorithm and check how well
the algorithm has been trained, based on performance measures like accuracy, precision, recall etc.
 Hyperparameter tuning: A hyperparameter is a parameter whose value is set before the actual training
process begins.
 Prediction: The trained and tuned model is finally used to make predictions on new, unseen data (the
sketch after this list condenses these steps).
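As a rough, hedged condensation of the listed steps into one scikit-learn sketch: the synthetic dataset, the Random Forest model and the parameter grid below are illustrative choices, not prescriptions from this survey.

```python
# The ML process steps condensed into a single scikit-learn sketch.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, random_state=0)          # 1. gathering data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3)     # 70/30 split
pipe = make_pipeline(StandardScaler(), RandomForestClassifier())   # 2-3. preparation + model choice
grid = GridSearchCV(pipe, {"randomforestclassifier__n_estimators": [50, 100]})  # 6. tuning
grid.fit(X_tr, y_tr)                                               # 4. training
print("accuracy:", grid.score(X_te, y_te))                         # 5. evaluation
print(grid.predict(X_te[:3]))                                      # 7. prediction
```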
B. Machine learning types
1) Supervised learning
In supervised learning, the model is trained on labelled data. In the training phase, the input data is trained and
tested against the target attribute. Supervised learning is also called task-oriented because it mainly focuses on a
task and feeds more data to the algorithm until it predicts and performs accurately. Supervised learning has been
classified into two types, classification and regression. Some of the algorithms are K Nearest Neighbor, Random
Forest, Decision Tree, Support Vector Machine, Logistic Regression etc.
2) Unsupervised learning
It mainly focuses on identifying underlying trends, patterns, and insights from the dataset. Here, unsupervised
learning models are trained using an unlabeled dataset and automatically extract patterns, facts and figures
without any supervision. Unsupervised learning is mainly classified into two types, clustering and association.
Several algorithms implement these two types, namely agglomerative hierarchical clustering and K-means
clustering (clustering), and FP-Growth and the Apriori algorithm (association). Some applications are
recommendation systems, identity management etc.
3) Reinforcement learning
A feedback-based reinforcement learning agent trains automatically using learning, feedback, and previous
experience. The agent is rewarded if it takes a right action in the environment, and penalized if it takes any
wrong action. Many applications have been developed using reinforcement learning techniques, such as gaming
technology, robotics, self-driving cars etc.
4) Difference between machine learning types

TABLE I. DIFFERENCE BETWEEN MACHINE LEARNING TYPES

Learning: Supervised models learn from labeled data (a task-oriented approach); unsupervised models learn
from unlabeled data (a data-driven approach); reinforcement models learn from reward or penalty (an
environment-driven approach).
Types: Supervised: classification or regression. Unsupervised: clustering and association rules. Reinforcement:
classification and control.
Algorithms: Supervised: Random Forest, Decision Tree, K Nearest Neighbor. Unsupervised: K-Means
clustering, DBSCAN, Principal Component Analysis. Reinforcement: formalized using the Markov Decision
Process.
Applications: Supervised: medical diagnosis, spam detection. Unsupervised: recommendation systems, customer
segmentation. Reinforcement: robotics, video games.

II. UNSUPERVISED MACHINE LEARNING
Unsupervised learning, as the name itself suggests, is not guided by any supervision. This type of learning
automatically extracts knowledge, underlying hidden patterns and data groupings from the dataset without
human intervention. It groups objects or items based on similarities. Unsupervised machine learning deals with
unlabeled datasets where there is no target output tagged to the corresponding input. Hence unsupervised
learning is helpful in real-life scenarios, because not all real-world problems come with an input and output
pattern.
A. Steps involved in Unsupervised learning techniques:
 Gathering unlabeled data: In unsupervised machine learning, gathering raw unlabeled data is the
important part, from which insights and trends are found without supervision.
 Interpretation: The raw input data is interpreted to find the hidden patterns and trends.
 Algorithm: A suitable algorithm is then applied, such as a clustering algorithm or association rules.
 Processing: The data points are divided into groups called clusters based on similarity, which is
measured using Euclidean or cosine distance.
 Output: When a new data point arrives, the algorithm pushes the data point into the most similar group
and gives the predicted output based on similarity, without any supervision.
B. Advantages of using Unsupervised learning
 Unsupervised learning helps to solve problems without human intervention.
 It automatically learns from the data, discovers underlying patterns, and groups items or objects based
on similarities.
 It is less complex compared with supervised machine learning, because supervised learning involves
human intervention: someone has to understand the input data and label it.
C. Disadvantages of using Unsupervised learning
 It provides less accurate results because it has no labeled data and the machine must discover the
underlying new patterns and relationships automatically, hence somewhat lower accuracy compared to
supervised machine learning.
 Evaluating an unsupervised machine learning model is quite difficult compared to a supervised
model.
D. Types of Unsupervised learning technique
In unsupervised learning we have input data but no corresponding output data. Given an image dataset, the
algorithm does not know the input features and is not trained on the images provided. The unsupervised model
must learn on its own and perform the task by clustering or grouping the images based on similarities.
Unsupervised learning has been classified into two types,
 Clustering
 Association
E. Clustering
Clustering is the grouping of objects based on similarities. It groups the given data points so that objects with
more similarity remain in the same group and objects with less similarity move to other clusters. It can be
helpful in marketing sectors or industries, where customers are grouped based on their behavior. Clustering has
been used in a wide range of applications like e-commerce sites, cybersecurity, health care analytics, behavioral
analytics etc. Many clustering algorithms have been introduced; the most popular and widely used clustering
algorithms in machine learning are,
 K-Means Clustering
 Agglomerative hierarchical clustering
 DBSCAN Clustering
F. Association Rules
Association rule mining is a type of unsupervised learning method which is used to find relationships between
objects in large databases. Association rule mining makes for an effective marketing strategy. Market basket
analysis is one example of association rule mining, since it finds relationships between the items purchased by
customers. If a person buys product ‘x’, then he/she might buy product ‘y’. If a customer buys product ‘x’ but
not product ‘y’, marketing agents can target such typical customers and cross-sell the items to them. Association
rule mining finds frequent items, pairs, associations etc. from relational, transactional or any other kind of
database.
It has been divided into two parts,
 Antecedent
 Consequent
“If a customer purchases the product bread, then he is likely to buy jam”
 Antecedent: It is found in the dataset; it is “bread” in the above statement.
 Consequent: It is found in combination with the antecedent; it is “jam” in the above statement.
The relationship is described by two parameters, “support” and “confidence”. Support indicates how many
times the if/then relation occurs in the dataset, whereas confidence refers to the number of times these if/then
relationships are found to be true.
There are different types of algorithms in association rule mining,
 Apriori algorithm
 FP – Growth algorithm

III. CLUSTERING ALGORITHM


A. K – Means Clustering
It is one of the most widely used clustering algorithms in unsupervised machine learning. K-Means clustering is
an iterative procedure where the data points are grouped into K clusters. Each data point belongs to one single
cluster. It groups data points based on similarity; the similarity between data points is determined by calculating
the distance between them. The distance between a data point and a cluster’s centroid (initially a random data
point selected among all data points) can be calculated in many ways,
 Squared Euclidean distance
 Manhattan distance
 Cosine distance
 Correlation distance
Similarity can be measured using any of these distance-based measures, and the choice is completely
application specific.
1. Working of K-Means clustering
 Initially, determine the k value, where ‘k’ denotes the number of cluster centroids among all data
points.
 Cluster centroids: Randomly select k data points; that is, if k = 5, randomly choose 5 data points
from the groups of data.
 After selecting the centroids, calculate the distance between each data point and each centroid using
any of the distance-based measures.
 The minimum distance between a centroid and the corresponding data points forms a cluster.
 Similarly, calculating distances for all the data points, each data point joins the cluster whose centroid
is at minimum distance, thus forming k clusters.
 Since this is an iterative procedure, there are two repeated steps: assigning the data points and
updating the clusters.
 Updating clusters occurs again and again until no data point moves from one cluster to another.
 Finally, the data points are grouped into k clusters based on similarity (a minimal sketch of the
clustering follows this list).
 There are two ways of selecting the k value, (a) the Elbow Method and (b) the Silhouette Method.
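A minimal scikit-learn sketch of K-Means on synthetic blobs; assuming four clusters purely for illustration:

```python
# Minimal K-Means clustering sketch on synthetic data.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=42)
km = KMeans(n_clusters=4, n_init=10, random_state=42).fit(X)
print(km.cluster_centers_)   # final centroids after the iterative updates
print(km.labels_[:10])       # cluster assignment of the first ten points
```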
2. Elbow Method
The elbow method is used to identify the optimal number of clusters k in the dataset. It determines whether the
selected k value provides an optimal grouping of the data points. To understand it, initially fix k = 1; this forms
one single cluster to which all the data points belong. Similarly, fix the k value as 2; the data points are then
divided into two clusters. As the k value increases, the distance between data points and their cluster centroid
decreases. In this example the distance decreases rapidly up to k = 3; when k is 3 or above, the distance between
data points and centroids becomes minimal and stable. So selecting the k at which the decrease levels off, here
around 3, gives an optimal number of clusters (a sketch follows).
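A short sketch of the elbow method, plotting K-Means inertia (the sum of squared point-to-centroid distances) against k on synthetic data:

```python
# Elbow method sketch: look for the bend in the inertia curve.
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=42)
ks = range(1, 10)
inertias = [KMeans(n_clusters=k, n_init=10, random_state=42).fit(X).inertia_ for k in ks]
plt.plot(ks, inertias, marker="o")
plt.xlabel("k"); plt.ylabel("inertia")
plt.show()
```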
3. Silhouette Method
It is used to find the accurate separation of the k clusters in the dataset. It is calculated using the formula

s(o) = (b(o) - a(o)) / max{a(o), b(o)}

where
s(o) = the silhouette coefficient of data point ‘o’,
b(o) = the average distance between data point ‘o’ and the data points of the other clusters,
a(o) = the average distance between data point ‘o’ and the other data points in the same cluster.
The silhouette coefficient ranges over [-1, 1]. If the silhouette coefficient is,
 1: It is the best coefficient value, and the selected number of clusters k correctly groups all
the data points.
 -1: It denotes the worst k value selection; the data points are not grouped correctly.
 0: It indicates overlapping clusters.
So the silhouette value should be as high as possible and close to 1. Compared to the elbow method, the
silhouette method gives the best separation of clusters between the data points (a sketch follows).
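A short sketch of the silhouette method using scikit-learn's silhouette_score on synthetic data; the range of k values is illustrative:

```python
# Silhouette method sketch: pick the k whose mean silhouette score is highest.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=4, random_state=42)
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    print(k, round(silhouette_score(X, labels), 3))
```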
B. Agglomerative Hierarchical Clustering
Agglomerative hierarchical clustering is a type of hierarchical clustering and a widely used clustering algorithm.
Initially, every data point is considered a single cluster. The Euclidean distance (or any other distance measure)
is calculated between each pair of data points or clusters. The pair of clusters at minimum distance is combined
into one single cluster. Similarly, each cluster’s distances are recalculated and the closest clusters merge, finally
forming one single cluster or k clusters.
1. Working of Agglomerative hierarchical clustering
 All the data points are considered as single clusters.
 The proximity matrix is computed for all the data points.
 Data points are grouped based on minimum distance.
 The proximity matrix is updated after merging the clusters.
 Similarly, the process is repeated until one single cluster (or k clusters) remains.
 Agglomerative hierarchical clustering can be visualized using a chart called a “dendrogram” (a sketch
follows this list).
 Initially, if there are 5 data points, say A, B, C, D, E, each data point is considered a single cluster.
 The distance between each pair of data points is calculated.
 The minimum distances are found between A and B, and between D and E.
 Each pair forms a single cluster, say AB and DE.
 Again, the distances between AB, C and DE are calculated.
 The minimum distance is found between AB and C; they form a single cluster.
 Similarly, distance calculation is carried out for all the data points, and finally they form a single
cluster ABCDE.
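A minimal sketch of the merges described above, drawn as a dendrogram with scipy; Ward linkage is an illustrative choice of merge criterion:

```python
# Agglomerative clustering sketch: linkage() merges the closest clusters
# repeatedly, and dendrogram() visualizes the merge tree.
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=20, centers=3, random_state=1)
Z = linkage(X, method="ward")
dendrogram(Z)
plt.show()
```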
C. DBSCAN Clustering
There is a drawback in k-means clustering and agglomerative hierarchical clustering: they fail to form clusters
of arbitrary shape. DBSCAN helps to overcome this issue. DBSCAN groups data based on high density. The
most interesting part of DBSCAN is that it is robust to outliers and can easily detect noise among the data
points. In k-means clustering there is a need to determine the k value to form the number of clusters, but in
DBSCAN there is no need to specify k or the number of clusters. It requires two parameters,
• Epsilon: the radius of the circle to be formed around each data point.
• Min-points: the minimum number of points required inside the radius of the circle.
 DBSCAN classifies data points into three types, (a) core point, (b) border point, (c) noise. Consider
min-points = 3: a data point with at least 3 points inside its circle (including itself) is represented as a
“core point”. If the number of points inside the circle is fewer than 3 but greater than 1, the point is a
“border point”. If the circle contains only the point itself, it is noise. DBSCAN uses a Euclidean
distance-based measure to calculate the distance between points. To use the DBSCAN technique it is
very important to select the epsilon and min-points values. Their selection follows some criteria: the
minimum points value should be one greater than the number of dimensions, Min Points = Dimensions + 1.
The epsilon value can be decided using an elbow graph or k-distance graph; the point of maximum
curvature can be selected as epsilon for a more accurate value. Hence, the DBSCAN approach can be
more useful for clustering-related problems than k-means or hierarchical clustering, although k-means
and hierarchical clustering remain useful for some applications. Density-based problems can be solved
using the DBSCAN algorithm (a sketch follows).
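A minimal scikit-learn DBSCAN sketch on two-moon data, whose arbitrary cluster shapes defeat k-means; the eps and min_samples values are illustrative:

```python
# DBSCAN sketch: eps is the circle radius, min_samples the minimum point count;
# points labelled -1 are the detected noise.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)   # arbitrary shapes
db = DBSCAN(eps=0.2, min_samples=5).fit(X)
print("clusters:", len(set(db.labels_) - {-1}))
print("noise points:", np.sum(db.labels_ == -1))
```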

IV. ASSOCIATION RULE MINING


A. Apriori Algorithm
The Apriori algorithm is used in market basket analysis to find relationships between two products, items, or
objects, and to find the frequent itemsets among the candidate k-itemsets. If a person buys a product ‘x’, then
he/she might buy product ‘y’ in combination with x. The main goal of the Apriori algorithm is to find
association rules between items. It is also called frequent pattern mining, which finds the frequent items
purchased by customers. Hence, this is helpful for marketing companies or industries to understand customer
behavior and cross-sell items to customers. The Apriori algorithm operates on databases with large numbers of
transactions. The three main components used in the Apriori algorithm are support, confidence, and lift.
1. Working of Apriori algorithm
 Consider, for example, a minimum support count of 2 and a confidence of 60%.
 The dataset contains a transaction id and the itemset purchased by a customer.
 If there are 5 items, say A, B, C, D, E, and initially k = 1, find the number of transactions for each
item.
 If the support count for an item is less than the minimum support count 2, prune that item.
 Next, check with k = 2: combine 2 items and count the transactions containing each pair of items.
Again, if the support count for a pair is less than the minimum support count, prune it. Similarly,
the process is repeated until no itemset with the minimum support count is left over.
 Finally, generate rules for the resulting itemsets and check the confidence value of each rule against
the given confidence value, i.e., 60%. If the confidence value is less than the given value, reject that
pair; the itemsets with maximum confidence are selected as frequent item pairs purchased by
customers (a sketch using a third-party package follows this list).
 This gives insight into the customers, enhances their purchasing experience, and helps marketing
sectors gain more profit.
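A minimal sketch of these steps using the third-party mlxtend package; the tiny bread/jam/milk table and the thresholds are illustrative, and the exact API may vary slightly between mlxtend versions:

```python
# Apriori sketch on a tiny one-hot transaction table (illustrative data).
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

transactions = pd.DataFrame(
    [[1, 1, 0], [1, 1, 1], [0, 1, 1], [1, 0, 1]],   # each row is one basket
    columns=["bread", "jam", "milk"],
).astype(bool)

frequent = apriori(transactions, min_support=0.5, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.6)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```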
B. FP - Growth
There is a drawback in the Apriori algorithm: since it must scan the database again and again to find the
candidate k-itemsets, the algorithm is slow. To overcome this issue, FP-Growth was introduced. It is also called
the frequent pattern growth algorithm.
o Working of FP-Growth
 Consider the support value to be 3. The items with equal or greater support are sorted in
descending order.
 An ordered itemset is created by comparing the actual itemset with the sorted items (items with
support count of at least 3) and keeping only those items, creating a separate set.
 All the ordered itemsets are then inserted into a trie data structure.
 The next step is to find the conditional pattern base using the trie data structure.
 After that, identifying the conditional frequent patterns is important for identifying frequent item
pairs; the frequent patterns are found using the conditional frequent patterns.
 Finally, the confidence value for each frequent itemset is calculated and compared with the given
confidence value, and if/then rules for the frequent itemsets are identified using the confidence
value (see the sketch below).
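In mlxtend, fpgrowth can be used as a drop-in replacement for apriori, avoiding the repeated database scans; this reuses the transactions table from the previous sketch:

```python
# FP-Growth sketch: same inputs and outputs as apriori, different algorithm.
from mlxtend.frequent_patterns import fpgrowth

frequent = fpgrowth(transactions, min_support=0.5, use_colnames=True)
print(frequent)
```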

V. APPLICATIONS OF UNSUPERVISED LEARNING


There are many real-world applications that make use of unsupervised machine learning, enhancing the user
experience by using any of these algorithms.
 Recommender system
 Customer segmentation
 Targeted Marketing
 Identity management, Item categorization

 In Genetics and Anomaly Detection
 Image, speech and pattern recognition

VI. CHALLENGES AND RESEARCH DIRECTIONS


Unsupervised machine learning has many benefits: it is helpful for real-world problems, useful for enhancing
the user experience, and thus helpful in building end-to-end applications and software. Even so, unsupervised
learning is challenging, since it does everything automatically without human intervention. Some of the
challenges of unsupervised techniques are,
 Computational complexity due to the large volume of data.
 It may give inaccurate results; inaccuracy must be addressed because it is an important
parameter when the technique is applied to real-world problems.
 Even though it automatically predicts or groups data based on some criteria, human
intervention is needed to validate the results.
 It may take a long time to train on the data points.
 Collecting data in specific domains like IoT, cybersecurity, agriculture or network traffic is
not straightforward.
 In-depth investigation is required while collecting real-world data.
But when all these challenges are solved successfully, unsupervised learning gives great advantages in business
environments, such as profit for marketing entities and an enhanced user experience through more accurate
results: for example, a recommender system, which recommends things similar to what the user likes most,
helps companies understand their customers and track their behavior and activities. The success of a machine
learning based solution in a specific domain or application needs good-quality data and appropriately chosen
algorithms. Hence, effectively handling features, processing, and maintaining the dataset well leads to the best
performance by the machine learning model, with high accuracy, and leads to building effective and intelligent
applications.

VII. CONCLUSION
Unsupervised learning is a powerful tool which can be used on large databases. Various kinds of applications
have been developed using unsupervised learning techniques, and a variety of algorithms exist to solve
problems and give more accurate results. There are both advantages and disadvantages to unsupervised
learning; when the challenges it faces are successfully solved, it brings more profit to companies, strengthens
the relationship between companies and their customers, and improves the user experience. It automatically
solves problems by grouping data points based on similarity without any human intervention: less human
intervention and a more automatic process with fairly good results is unsupervised machine learning. Hence,
this paper describes a complete survey of unsupervised machine learning with its applications, algorithms and
the challenges faced in real-world scenarios.

REFERENCES
[1] A. Toshniwal, K. Mahesh and R. Jayashree, "Overview of Anomaly Detection techniques in Machine Learning," 4th
International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I- SMAC), pp.808-815, 2020, doi:
10.1109/I-SMAC49090.2020.9243329.
[2] Li, N., Shepperd, M., & Guo, Y., "A systematic review of unsupervised learning techniques for software defect
prediction," Information and Software Technology, Vol.122, pp.106287, 2020.
[3] Rodrigues, J., Belo, D., & Gamboa, H., "Noise detection on ECG based on agglomerative clustering of morphological
features," Computers in biology and medicine, Vol.87, pp.322-334, 2017.
[4] Shakeel, P. M., Baskar, S., Dhulipala, V. S., & Jaber, M. M., "Cloud based framework for diagnosis of diabetes mellitus
using K-means clustering," Health information science and systems, Vol.6, No.1, pp.1-7, 2018.
[5] M. Sujithra, P. Velvadivu, J. Rathika, R. Priyadharshini and P. Preethi, "A Study on Psychological Stress of
Working Women In Educational Institution Using Machine Learning," 2022 13th International Conference on
Computing Communication and Networking Technologies (ICCCNT), Kharagpur, India, 2022, pp. 1-7, doi:
10.1109/ICCCNT54827.2022.9984460.
[6] Liang, Z., Wang, C., Duan, Z., Liu, H., Liu, X., & Ullah Jan Khan, K., "A Hybrid Model Consisting of Supervised and
Unsupervised Learning for Landslide Susceptibility Mapping" Remote Sensing, Vol.13, No.8, pp.1464, 2021.
[7] Kuang, H., Qiu, Y., Li, R., & Liu, X., "A hierarchical K-means algorithm for controller placement in SDN- based WAN

architecture," 2018 10th International Conference on Measuring Technology and Mechatronics Automation (ICMTMA),
IEEE, pp.263-267, February 2018.
[8] Addagarla, S. K., & Amalanathan, A., "Probabilistic Unsupervised Machine Learning Approach for a Similar Image
Recommender System for E-Commerce," Symmetry, Vol.12, No.11, pp.1783, 2020.
[9] S. Siddharth, R. Darsini and M. Sujithra, "Sentiment analysis on twitter data using machine learning algorithms in
python", Int. J. Eng. Res. Comput. Sci. Eng., vol. 5, no. 2, pp. 285-290, 2018.
[10] Acar, E., & Yener, B., "Unsupervised multiway data analysis: A literature survey," IEEE transactions on knowledge and
data engineering, Vol.21, No.1, pp.6-20, 2008.

Grenze International Journal of Engineering and Technology, June Issue

Variable Selection Methods, Comparison and their Applications in Machine Learning: A Review

Kirti Thakur1, Harish Kumar2 and Snehmani3
1,2 University Institute of Engineering (UIET), Panjab University, Chandigarh, India
Email: [email protected]
3 Defence Geoinformatics Research Establishment (DGRE), DRDO Chandigarh, India

Abstract—In the past decade, the availability of voluminous and high-dimensional datasets has grown immensely, as continuous technological innovation fosters new ways to gather and analyze data. Feature selection has therefore become a challenging task in application areas such as text classification, data analysis, prediction and information retrieval. Knowledge extraction using machine learning models usually depends on the quality and quantity of the data they operate on. Feature selection is one of the core concepts for identifying and removing irrelevant as well as redundant information that may impact accuracy or have no impact on the results. Feature selection methods are discussed in this review paper along with their practical limitations. Subsequently, the workflow for solving a feature selection problem is elaborated together with the feature selection methods. Many surveys and empirical assessments of feature selection have been performed in areas such as classification, prediction, regression, and clustering.

Index Terms— feature selection, feature ranking, filter, hybrid, wrapper, embedded.

I. INTRODUCTION
The abrupt growth in the volume and dimensionality of datasets is problematic for algorithms and leads to high computational cost and memory usage. Mobile phones, social media, cameras, infrastructure-less wireless networks, weblogs, radio frequency identification (RFID) readers, internet search and web-based data are a few of the many information-gathering sources. According to the IDC Digital Universe study, "By 2020, around 40 zettabytes will be the size of the digital universe" [1]. Apart from the structured data in traditional datasets, a large volume of unstructured and semi-structured data is present.
In the late 20th century, feature selection emerged as an important technique for selecting the most relevant, significant, and important features. Variable selection is a part of feature engineering, also known as 'feature selection' (or attribute selection). It is the process of identifying and removing irrelevant or less vital features to achieve better accuracy and enhance the performance of the model [2]. The identification or extraction of the most relevant features is also known as dimensionality reduction [1]. The advantage of feature selection is that no information about the importance of a single feature is lost. However, when the original features are very diverse and a small set of features is required, the removal of some features may lead to a loss of information. The drawback of feature extraction, in contrast, is that the linear combinations of the original features are not interpretable, and the information about the contribution of each feature is often lost. Feature extraction often decreases the feature-space size without losing much information. The choice between feature extraction and selection methods depends entirely on the data type of the application domain [3].

“The objective of feature selection is three-fold: improving the prediction performance of the predictors,
providing faster and cost-effective predictors, and providing a better understanding of the underlying process that
generated the data.” [2]

II. LITERATURE REVIEW


The feature selection method consists of four critical steps: feature subset creation, subset assessment, stopping criterion, and outcome validation. The objective of feature selection is to select a subset, e.g., b = {f2, f4, f8, f9}, from the complete set of input features, e.g., a = {f1, f2, f3, f4, ..., fn}, where n is the total number of features in a dataset. The subset 'b' can predict the output with improved accuracy and reduced computational cost, comparable to the performance of the set 'a'. Not all the features in a dataset are always relevant, and redundancy in the features may not lead to potential results. A smaller number of highly relevant features gives better generalization with less training and testing time. L. Ladha et al. and N. Krishnaveni et al. in [4] listed the foremost benefits of performing feature selection before modelling the data:
 Reduced dimensionality of the feature space,
 Lower storage requirements,
 Removal of redundant, irrelevant, and noisy data,
 Overfitting reduction,
 Accuracy improvement of the resulting model,
 Training time reduction,
 Improved interpretation through complexity reduction,
 Performance improvement, to gain predictive accuracy.
According to the literature, many novel methods based on local and global relations have been implemented. Genetic Algorithms and Support Vector Machines are the most used methods; on the other hand, many authors consider high dimensionality a crucial concern to address [5]. Feature selection algorithms (FSA) are classified into filter, wrapper, and hybrid feature selection methods [6]. Based on data type, FSA are broadly categorized into four groups: similarity-based, sparse-learning-based, information-theoretical-based, and statistical-based methods [7]. In 2013, a fast feature selection algorithm using affinity propagation clustering, named 'Sequential Feature Selection' (SFS), was proposed; it provides high accuracy and was applied separately to each cluster of a dataset [8]. In 2014, a novel Naïve Bayes-based hybrid algorithm was proposed to minimize the computational complexity of feature selection using filter and wrapper algorithms [9]. On the other hand, unsupervised and multivariate filter-based feature selection methods were proposed for analyzing the redundancy and relevance of features [10].
In [11], four feature selection methods are compared: decision trees, an entropy measure for ranking features, estimation of distribution algorithms, and the bootstrapping algorithm; the study also shows that the elimination of noise is vital in the classification process. A bi-objective version outperforms Particle Swarm Optimization, Ant Colony Optimization, and Genetic Algorithms as an optimization technique for ensemble systems using a filter-based feature selection approach [12]. The wrapper-based approach has been used by many researchers for feature selection with different optimization techniques, such as the Ant Colony Optimization algorithm [13], a hybrid search method with particle swarm optimization [14], and Harmony Search [15].
The output of feature selection algorithms varies along several dimensions, such as ranking versus subset, supervised versus unsupervised tasks, and the underlying principle (filter, wrapper, or embedded). Feature selection methods must be enhanced to reduce redundant data for large-scale data analytics [4]. Various studies demonstrate that many algorithms are available for feature selection and that each algorithm behaves differently for different types of datasets. Therefore, analysis is required to find a suitable algorithm for feature selection.
This section provided an introduction to feature selection and a brief review of feature selection algorithms. In the following Section 3, different feature selection algorithms are discussed to check the situation-based suitability of feature selection techniques for classification/prediction and clustering algorithms. In Section 4, feature selection algorithms are compared on different merits such as efficiency, computational cost, and feature dependencies. Various application and technical domains for feature selection are covered in Section 5. Finally, Section 6 concludes the paper with a discussion of advanced issues.

III. FEATURE SELECTION METHODS AND LIMITATIONS
The feature selection problem has been studied for many years by the statistics and machine learning communities. With emerging data mining research, more attention has been given to feature selection techniques. Feature selection, also known as subset selection, is a pre-processing technique used in machine learning to increase learning accuracy by removing irrelevant features [16]. A taxonomy of feature selection techniques for different datasets is shown in Fig. 1.
[Figure: feature selection selects a subset from the original set using filter, wrapper, embedded or hybrid methods; applicable to microarray data, text mining, weather forecasting, image processing, sentiment analysis, etc.]

Fig 1. Taxonomy of feature selection techniques for different datasets

Complete (exhaustive) search, sequential search, exponential search and randomized search are the most common search strategies used for feature selection. A wide range of application areas (text analysis, microarray data analysis, climate change prediction, digital image processing, sentiment analysis, etc.) uses different types of feature selection algorithms (filter, wrapper, embedded and hybrid methods).
A. Filter Methods
Filter methods are the most generic approach among the four and work irrespective of the data modelling algorithm. The optimal feature set is selected by analysing general characteristics of the dataset. In the literature, filter-class methods are either univariate (evaluating a single feature using a ranking) or multivariate (evaluating an entire feature subset using a search strategy). Filter methods cannot be applied universally to different knowledge discovery operations and are thus classified for regression, classification, or clustering [6], [10], [17]–[32]. The filter approach to feature subset selection is shown in Fig. 2.

[Figure: dataset → entire feature set → feature ranking / subset selection → selecting the best feature subset → machine learning algorithm → classifier → performance evaluation]

Fig 2. Filter method

According to [33], filter methods are faster and have a lower computational cost than wrapper and embedded methods, but their reliability in classification problems is lower. As a result, this class of methods is popular with both academicians and industry practitioners. Based on our survey analysis, Table I summarises filter methods into two categories, along with application utility and references.

TABLE I. MOST COMMON FILTER METHODS

Univariate filter class:
  Classification: Information Gain (IG) [17]; Gain Ratio [18]; Chi-square [18]; Fisher Score (F-Score) [19]; Symmetrical uncertainty [20]
  Clustering: ReliefC [21]
  Classification, Clustering: Laplacian Score (L-Score) and Spectral feature selection (SPEC) [22]
  Regression: Correlation [22]
  Classification, Regression: Relief and ReliefF [23]; ANOVA / Term Variance / Variance Threshold [24]; Count / frequency based or Count Vectorizer [25]; Gini Index (GI) [26]

Multivariate filter class:
  Classification: Fast Correlation based filter (FCBF) [22]; RRVSACO: Relevancy-Redundancy Variable Selection Ant Colony [10]; RSM: Random subspace (an ensemble) method [27]; Relevance-redundancy feature selection (RRFS) [28]
  Classification, Regression: Correlation based feature selection (CFS) [20]; Minimal redundancy-maximal-relevance (mRMR) [6]
  Clustering: Variable selection using sparse cluster analysis [29]; LFSBSS: Localized Feature Selection Based on Scatter Separability [30]; MCVS: Multi-Cluster Variable Selection [22]; Variable weighting K-means [31]; Graph Clustering Ant Colony (GCACO) [32]; Graph Clustering with Node Centrality (GNCC) [31]
  Classification, Clustering: Unsupervised Feature Selection Ant Colony (UFSACO) [10]
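As a minimal illustration of a univariate filter (the dataset and the choices of the chi-square score and k are ours, not taken from the surveyed papers), scikit-learn's SelectKBest ranks each feature independently and keeps the top-scoring ones:

from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_iris(return_X_y=True)              # small example dataset with non-negative features
selector = SelectKBest(score_func=chi2, k=2)   # keep the 2 highest-scoring features
X_new = selector.fit_transform(X, y)
print(selector.scores_)                        # per-feature chi-square scores (the ranking)
print(X_new.shape)                             # (150, 2)

Because the ranking never consults the downstream classifier, this runs in a single pass over the data, which is what gives filters their speed advantage.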

B. Wrapper Methods
In 1996, Ron Kohavi and George H. John proposed the wrapper procedure to decrease irrelevant features [34]. In wrapper methods, a black box is used as a predictor to evaluate each variable subset. Wrapper methods perform better when the number of features is small; in contrast, they are expensive for huge feature sets owing to their high computational cost, and the feature selection process becomes slow when each candidate subset is evaluated with a trained classifier [16]. Another drawback is a higher risk of overfitting: if the classifier model's learning rate on the data is too high, it provides poor generalization [4], [6], [9]. The induction technique (also known as the 'black box') has been used to represent supervised problems. Each training instance is characterised by a feature vector and a class label, as shown in Fig. 3. Further, ranking is used for eliminating irrelevant features [34].
In 2015, Diao R. and Shen Q. suggested that, to form a hybrid approach with intellectual properties for feature selection, the development of a meta-framework may be beneficial for dynamically identifying suitable algorithms [35]. According to [14], Genetic Algorithms (GA), Particle Swarm Optimization (PSO), and Ant Colony Optimization (ACO) are the most widely used heuristic methods for the variable selection problem. It has been observed that, owing to its cogent structure, GA is preferable even though PSO and ACO provide higher accuracy on similar tasks. According to the literature survey, wrapper algorithms can be categorized into sequential selection and heuristic search. Table II summarizes wrapper methods, along with application utility and references.

[Figure: dataset → entire feature set → search strategy / subset generation → selecting the best feature subset → induction algorithm → classifier → performance evaluation]

Fig 3. Wrapper method

Wrappers perform feature selection based on the performance of the modelling algorithm (the black box). For classification tasks, feature subsets are evaluated based on classifier performance, whereas for clustering they are evaluated based on the performance of the clustering algorithm [13], [36]–[42]. When implementing wrappers, subset generation depends on the search strategy, as in filter methods. The model evaluation steps are repeated for each subset until all features are ranked. According to the literature, wrapper performance is better than that of filters because subsets are evaluated with the real modelling algorithm. In general, any combination of search strategy and modelling algorithm can be used for variable selection, while modelling methods such as Naïve Bayes, linear support vector machines and Extreme Learning Machines work best with wrappers for greedy search problems.

TABLE II. MOST COMMON WRAPPER METHODS

Sequential / greedy selection algorithms:
  Classification: Sequential backward selection and sequential forward selection [38]
Global / random / heuristic selection algorithms:
  Classification, Clustering: Ant Colony Optimization (ACO) and Genetic Algorithm (GA) [39]; Particle Swarm Optimization (PSO-SVM) [42]; Artificial Bee Colony (ABC) [40]; Random mutation hill-climbing [36]; Simulated annealing (SA) [41]
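As a minimal wrapper sketch (the estimator, dataset and n_features_to_select are illustrative assumptions), scikit-learn's SequentialFeatureSelector performs a greedy forward search in which every candidate subset is scored by cross-validating a real classifier:

from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
# forward selection: grow the subset one feature at a time,
# scoring each candidate subset by cross-validated classifier accuracy
sfs = SequentialFeatureSelector(KNeighborsClassifier(n_neighbors=3),
                                n_features_to_select=2, direction='forward')
sfs.fit(X, y)
print(sfs.get_support())   # boolean mask of the selected features

The repeated model training in the inner loop is exactly the computational cost the text attributes to wrappers.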

C. Embedded Methods
The filter methods have a major drawback: they are independent of the classifier, which results in worse performance than wrapper methods; wrappers, however, have a high computational cost. A midway solution is the use of embedded or hybrid methods, both of which use the principle of the classifier to generate criteria for ranking the most optimal features [43]. Embedded methods are robust and effective when dealing with high-dimensional datasets and have a lower risk of overfitting compared with wrappers [44]. These methods first train a machine learning model, then derive feature importances, and finally remove non-important features using the resulting feature subset. Fig. 4 illustrates the embedded variable selection methodology.
Embedded methods consider not only feature dependencies, via the relationship between input and output features, but also search features locally, which allows local discrimination. In other words, they use an independent criterion to find optimal subsets of a known cardinality; the final optimal subset is then selected among these optimal subsets using the learning algorithm with the best accuracy level. Various types of decision tree algorithms are used by different embedded methods, such as CART, C4.5, random forest, multinomial logistic regression, and its variants [45].

[Figure: dataset → entire feature set → produce a feature subset → ML algorithm + classifier performance → selecting the best feature subset → performance evaluation]

Fig 4. Embedded method

Among the different embedded methods, Support Vector based Recursive Feature Elimination (SVM-RFE) is widely used [46], whereas the regularization method, also known as penalization, is one of the most common embedded types of feature selection. Among regression-based embedded methods, LASSO and RIDGE regression are mostly used for hyperspectral data to reduce overfitting through inherent correction [47], [48]. According to the literature survey, the most used embedded algorithms are summarised in Table III, along with application utility and references [45], [47]–[52].

TABLE III. MOST COMMON EMBEDDED METHODS

Tree based methods / decision tree algorithms:
  Classification, Regression, Clustering: Classification and Regression Trees (CART) [45]
  Classification: ID3 [50]; Random Forest [51]
  Clustering: C4.5 Decision Trees [49]
Regularization / regression methods:
  Regression: LASSO regression (L1 regularization) [47]; RIDGE regression (L2 regularization) [48]; Elastic nets (L1/L2 regularization) [52]
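A minimal embedded-selection sketch (the L1 penalty strength alpha and the dataset are illustrative assumptions): a Lasso model is trained once, and the features whose coefficients were driven to zero during training are dropped with scikit-learn's SelectFromModel:

from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

X, y = load_diabetes(return_X_y=True)
# the L1 penalty shrinks the coefficients of weak features to exactly zero
lasso = Lasso(alpha=0.5).fit(X, y)
selector = SelectFromModel(lasso, prefit=True)
print(selector.get_support())     # features with non-zero coefficients survive
X_reduced = selector.transform(X)

Here selection happens inside model training itself, which is the defining property of embedded methods.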

D. Hybrid Methods
Hybrid methods are developed to combine the advantages of filter and wrapper methods [44]. These methods are sequential: the first step is commonly a filter method that reduces the number of features and removes redundant ones; a wrapper method is then applied to select the desired number of features from the reduced set for optimal feature selection [53]. Any combination of the classical methods is possible, such as filter-filter, filter-wrapper, and filter-filter-wrapper, where the feature subset output by one method is provided as input to the next. Independent tests and performance evaluation functions are used by this approach for feature subset selection. Thus, it helps to improve efficiency and accuracy for high-dimensional datasets at a better computational cost [54]. The hybrid feature selection layout is shown in Fig. 5. Several methodologies have been developed using hybrid methods, such as hybrid ant colony optimization, fuzzy random forest-based feature selection, the mixed gravitational search algorithm and hybrid genetic algorithms.

[Figure: dataset → entire feature set → feature pre-selection (filter method) → optimal feature selection (wrapper or filter) → classifier → selecting the best feature subset → performance evaluation]

Fig 5. Hybrid method
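A minimal hybrid sketch (the thresholds k and n_features_to_select and the dataset are illustrative assumptions): a cheap univariate filter first prunes the feature set, and a wrapper then refines the reduced set, chained in one scikit-learn Pipeline:

from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, SequentialFeatureSelector, f_classif
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline

X, y = load_iris(return_X_y=True)
pipe = Pipeline([
    ('filter', SelectKBest(f_classif, k=3)),              # fast pre-selection stage
    ('wrapper', SequentialFeatureSelector(                # expensive refinement stage
        KNeighborsClassifier(), n_features_to_select=2)),
    ('model', KNeighborsClassifier()),
])
pipe.fit(X, y)
print(pipe.score(X, y))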

IV. COMPARISON
All of the above methods have been widely used by researchers for different applications. The performance of feature selection methods depends strongly on the dimensionality of the dataset; thus, new methods for feature selection are always needed. Table IV summarises and compares the feature selection methods along with their merits and demerits [9], [13], [14], [37], [43], [44].

V. APPLICATIONS DOMAINS FOR FEATURE SELECTION


In this section, application and technical domains for feature selection are briefly discussed. Feature selection problems aim at selecting a subset of variables that describes the data with maximum accuracy. The chosen subset must be small and contain only the information required for the given application. In the literature, feature selection problems are often tackled using search strategies, where the assessment of a specified subset is performed
by an appropriate function (filter methods) or directly by the execution of a data mining tool (wrapper methods) [6], [22], [53].

TABLE IV. COMPARISON OF FILTER, WRAPPER, EMBEDDED AND HYBRID METHODS

Filter methods: high computational efficiency; no interaction with the modelling algorithm; low computational cost for large datasets; faster execution than wrapper methods; less prone to overfitting; mostly ignore feature dependencies and consider each feature separately; feature subsets are evaluated using statistical tests.
Wrapper methods: better results than filters, but computationally expensive for huge datasets; dependent on the performance of the modelling algorithm; high computational cost for large datasets, so they work better for small datasets; slower execution than filter methods; high risk of overfitting if the data learning rate is too high; consider feature dependencies; cross-validation methods are used for assessment.
Embedded methods: performance degrades if the number of irrelevant features in the target set is high; dependent on the performance of the modelling algorithm; lower computational cost and faster execution than wrapper methods; generally used to reduce overfitting and least prone to it; identification of a small set of features may be problematic; cross-validation methods are used for assessment.
Hybrid methods: better computational complexity than wrapper methods; dependent on the performance of the modelling algorithm; more flexible and robust against high-dimensional data; higher performance than filter methods; overcome the demerits of wrappers through an enhanced search algorithm and are thus least prone to overfitting; depend on the combination of feature selection methods used; cross-validation methods are used for assessment.
In Table V, feature selection methods are summarized with respect to application domains and evaluation metrics. There is no ideal solution for a particular problem statement, and it is difficult to draw conclusions until a significant number of test situations have been efficiently addressed.

TABLE V. SUMMARY OF FEATURE SELECTION METHODS BY VARIOUS APPLICATION DOMAINS

Bioinformatics [13], [55]: Methods: Information Gain, Chi-square, t-Statistics, Gain Ratio, Symmetrical Uncertainty, ReliefF, Gini Index, Max Minority, Sum of Variance. Metrics: Accuracy, Stability, AUC. Best performing: Information Gain, Symmetrical Uncertainty, Chi-square and ReliefF.
Text classification [56], [57]: Methods: Probability ratio, Bi-normal Separation, Balanced Accuracy, F1 Measure, Information Gain, Power, Random, Correlation, Chi-square. Metrics: Recall, Accuracy, Precision, and F-measure. Best performing: Information Gain and Bi-normal Separation.
Clustering [21], [58]: Methods: Information Gain, Iterative Feature Selection, Document Frequency, Chi-square, Entropy-based ranking. Metrics: Entropy and Precision. Best performing: Iterative Feature Selection.
Rule induction [15], [59]: Methods: RIPPER, induction algorithms, pruning, LEM1, LEM2, AQ, LERS, multi-strategy approach. Metrics: Accuracy, Recall, and Precision. Best performing: RIPPER and the multi-strategy approach.
System monitoring [60], [61]: Methods: Distance, entropy, wrapper (SVM/NN), global geometric similarity scheme, correlation, principal component analysis, frequency-domain analysis. Metrics: Accuracy. Best performing: overall, all methods.
Image recognition [40], [62]: Methods: Genetic search, K-means, Relief, Sequential Floating Forward or Backward Selection (SFFS/SFBS), different combinations of all, and random search. Metrics: Accuracy and MSE. Best performing: Relief, SFFS or SFBS and their combinations.

VI. CONCLUSION
In this paper, various strategies are reviewed in the context of feature selection. Some algorithms select variables without computing redundancy, and some of these also do not consider performance and accuracy; other algorithms do not consider the existence of noisy data when picking features. According to the literature, if the computational time is excessive, the benefit of the learning process becomes negligible. Filter methods can be used on huge datasets with many features because they are faster; however, they have little effect on accuracy. Wrapper approaches pick the best features with high precision, but their computational cost is high. Some hybrid solutions attempt to address the shortcomings of both methodologies. The objective of this study is to provide an in-depth comprehension of feature selection: a high-dimensional dataset containing irrelevant, insignificant, and unimportant features prevents effective modelling and may produce less accurate and less understandable results, or may fail to achieve the desired results. Based on this research, an efficient unified framework for variable selection, applicable to datasets of any size with minimal computing cost and the highest accuracy, is still required.

REFERENCES
[1] Gantz John and Reinsel David, "THE DIGITAL UNIVERSE IN 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East," Dec. 2012. https://www.cs.princeton.edu/courses/archive/spring13/cos598C/idc-the-digital-universe-in-2020.pdf (accessed Nov. 09, 2022).
[2] I. Guyon and A. Elisseeff, “An Introduction to Variable and Feature Selection,” Journal of Machine Learning Research,
vol. 3, pp. 1157–1182, 2003, Accessed: Nov. 09, 2022.
[3] A. Janecek, W. Gansterer, M. Demel, and G. Ecker, “On the Relationship Between Feature Selection and Classification
Accuracy,” in Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge
Discovery at ECML/PKDD, Sep. 2008, pp. 90–105.
[4] Radha V and N Krishnaveni N, “Feature Selection Algorithms for Data Mining Classification: A Survey,” Indian J Sci
Technol, vol. 12, no. 6, pp. 1–11, Feb. 2019,
[5] T. Li, Y. S. Ho, and C. Y. Li, “Bibliometric analysis on global Parkinson’s disease research trends during 1991–2006,”
Neurosci Lett, vol. 441, no. 3, pp. 248–252, Aug. 2008, Accessed: Nov. 10, 2022.
[6] J. Tang, S. Alelyani, and H. Liu, “Feature selection for classification: A review,” Data Classification: Algorithms and
Applications, pp. 37–64, Jan. 2014,
[7] J. Li et al., “Feature selection: A data perspective,” ACM Comput Surv, vol. 50, no. 6, Dec. 2017,
[8] K. Zhu and J. Yang, “A cluster-based sequential feature selection algorithm,” Proceedings - International Conference
on Natural Computation, pp. 848–852, 2013, Accessed: Nov. 10, 2022.
[9] Z. Zeng, H. Zhang, R. Zhang, and Y. Zhang, “A Hybrid Feature Selection Method Based on Rough Conditional Mutual
Information and Naive Bayesian Classifier,” ISRN Applied Mathematics, vol. 2014, pp. 1–11, Mar. 2014, Accessed:
Nov. 14, 2022.
[10] S. Tabakhi and P. Moradi, “Relevance–redundancy feature selection based on ant colony optimization,” Pattern
Recognit, vol. 48, no. 9, pp. 2798–2811, Sep. 2015, Accessed: Nov. 10, 2022.
[11] V. H. Medina Garcia, J. Rodriguez Rodriguez, and M. A. Ospina Usaquén, “A comparative study between feature
selection algorithms,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence
and Lecture Notes in Bioinformatics), vol. 10943 LNCS, pp. 65–76, 2018,
[12] L. E. A. Laura Emmanuella and A. M. de Paula Canuto, “Filter-based optimization techniques for selection of feature
subsets in ensemble systems,” Expert Syst Appl, vol. 41, no. 4, pp. 1622–1631, Mar. 2014, Accessed: Nov. 10, 2022.
[13] T. Tekin Erguzel, C. Tas, and M. Cebi, “A wrapper-based approach for feature selection and classification of major
depressive disorder–bipolar disorders,” Comput Biol Med, vol. 64, pp. 127–137, Sep. 2015, Accessed: Nov. 10, 2022.
[14] M. M. Javidi and N. Emami, “A hybrid search method of wrapper feature selection by chaos particle swarm
optimization and local search,” Turkish Journal of Electrical Engineering and Computer Sciences, vol. 24, no. 5, pp.
3852–3861, Jan. 2016,
[15] S. Das, P. K. Singh, S. Bhowmik, R. Sarkar, and M. Nasipuri, “A Harmony Search Based Wrapper Feature Selection
Method for Holistic Bangla Word Recognition,” Procedia Comput Sci, vol. 89, pp. 395–403, Jan. 2016, Accessed: Nov.
10, 2022.
[16] Chandrashekar Girish and Sahin Ferat, “A survey on feature selection methods,” Computers & Electrical Engineering,
vol. 40, no. 1, pp. 16–28, Jan. 2014, Accessed: Nov. 10, 2022.
[17] Hoque N., Bhattacharyya D. K., and Kalita J. K., “MIFS-ND: A mutual information-based feature selection method,”
Expert Syst Appl, vol. 41, no. 14, pp. 6371–6385, Oct. 2014, Accessed: Nov. 10, 2022.
[18] I. H. Witten, E. Frank, M. A. Hall, and C. J. Pal, “Data Mining: Practical Machine Learning Tools and Techniques,”
Data Mining: Practical Machine Learning Tools and Techniques, pp. 1–621, Nov. 2016, Accessed: Nov. 10, 2022.
[19] Duda R.O., Hart P.E., and Stork D.G., “Pattern Classification,” Journal of Classification 2007 24:2, vol. 24, no. 2, pp.
305–307, Sep. 2007,
[20] L. Yu and H. Liu, “Feature Selection for High-Dimensional Data: A Fast Correlation-Based Filter Solution,” in : Proc.
20th International Conference on Machine Learning (ICML-2003), 2003, pp. 856–863.
[21] M. Dash and Y. S. Ong, “RELIEF-C: Efficient feature selection for clustering over noisy data,” Proceedings -
International Conference on Tools with Artificial Intelligence, ICTAI, pp. 869–872, 2011, Accessed: Nov. 10, 2022.
[22] S. Alelyani, J. Tang, and H. Liu, Chapter: Feature Selection for Clustering: A Review. Chapman and Hall/CRC, 2018.
[23] M. Robnik-Šikonja and I. Kononenko, “Theoretical and Empirical Analysis of ReliefF and RReliefF,” Machine
Learning 2003 53:1, vol. 53, no. 1, pp. 23–69, Oct. 2003,
[24] F. Ahmed and M. L. Gavrilova, “Two-Layer Feature Selection Algorithm for Recognizing Human Emotions from 3D
Motion Analysis,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and
Lecture Notes in Bioinformatics), vol. 11542 LNCS, pp. 53–67, 2019,
[25] H. Nematzadeh, R. Enayatifar, M. Mahmud, and E. Akbari, “Frequency based feature selection method using whale
algorithm,” Genomics, vol. 111, no. 6, pp. 1946–1955, Dec. 2019, Accessed: Nov. 10, 2022.
[26] H. Liu, M. Zhou, X. S. Lu, and C. Yao, “Weighted Gini index feature selection method for imbalanced data,” ICNSC
2018 - 15th IEEE International Conference on Networking, Sensing and Control, pp. 1–6, May 2018,
[27] C. Lai, M. J. T. Reinders, and L. Wessels, “Random subspace method for multivariate feature selection,” Pattern
Recognit Lett, vol. 27, no. 10, pp. 1067–1076, Jul. 2006, Accessed: Nov. 10, 2022.

[28] A. J. Ferreira and M. A. T. Figueiredo, “An unsupervised approach to feature discretization and selection,” Pattern
Recognit, vol. 45, no. 9, pp. 3048–3060, Sep. 2012, Accessed: Nov. 10, 2022.
[29] D. M. Witten and R. Tibshirani, “A framework for feature selection in clustering,” J Am Stat Assoc, vol. 105, no. 490,
pp. 713–726, Jun. 2010,
[30] Y. Li, M. Dong, and J. Hua, “Localized feature selection for clustering,” Pattern Recognit Lett, vol. 29, no. 1, pp. 10–18,
Jan. 2008,
[31] D. S. Modha and W. S. Spangler, “Feature Weighting in k-Means Clustering,” Machine Learning 2003 52:3, vol. 52,
no. 3, pp. 217–237, Sep. 2003,
[32] P. Moradi and M. Rostami, “Integration of graph clustering with ant colony optimization for feature selection,” Knowl
Based Syst, vol. 84, pp. 144–161, Aug. 2015, Accessed: Nov. 10, 2022.
[33] Saptarsi Goswami and Amlan Chakrabarti, “Feature Selection: A Practitioner View,” International Journal of
Information Technology and Computer Science(IJITCS), vol. 6, no. 11, pp. 66–77, 2014,
[34] R. Kohavi and G. H. John, “Wrappers for feature subset selection,” Artif Intell, vol. 97, no. 1–2, pp. 273–324, Dec.
1997, Accessed: Nov. 10, 2022.
[35] R. Diao and Q. Shen, “Nature inspired feature selection meta-heuristics,” Artificial Intelligence Review 2015 44:3, vol.
44, no. 3, pp. 311–340, Jan. 2015,
[36] D. B. Skalak, “Prototype and Feature Selection by Sampling and Random Mutation Hill Climbing Algorithms,”
Machine Learning Proceedings 1994, pp. 293–301, Jan. 1994, Accessed: Nov. 10, 2022.
[37] J. C. Cortizo and I. Giraldez, “Multi criteria wrapper improvements to Naive Bayes learning,” Lecture Notes in
Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics),
vol. 4224 LNCS, pp. 419–427, 2006,
[38] K. Z. Mao, “Orthogonal forward selection and backward elimination algorithms for feature subset selection.,” IEEE
Trans Syst Man Cybern B Cybern, vol. 34, no. 1, pp. 629–34, Feb. 2004,
[39] H. R. Kanan, K. Faez, and S. M. Taheri, “Feature selection using Ant Colony Optimization (ACO): A new method and
comparative study in the application of face recognition system,” Lecture Notes in Computer Science (including
subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 4597 LNCS, pp. 63–76,
2007,
[40] M. Schiezaro and H. Pedrini, “Data feature selection based on Artificial Bee Colony algorithm,” EURASIP Journal on
Image and Video Processing 2013 2013:1, vol. 2013, no. 1, pp. 1–8, Aug. 2013,
[41] S. W. Lin, Z. J. Lee, S. C. Chen, and T. Y. Tseng, “Parameter determination of support vector machine and feature
selection using simulated annealing approach,” Appl Soft Comput, vol. 8, no. 4, pp. 1505–1512, Sep. 2008, Accessed:
Nov. 10, 2022.
[42] Chung-Jui Tu, Li-Yeh Chuang, Jun-Yang Chang, and Cheng-Hong Yang, “Feature selection using PSO-SVM,” IAENG
Int J Comput Sci, vol. 33, no. 1, 2007,
[43] I. S. Oh, J. S. Lee, and B. R. Moon, “Hybrid genetic algorithms for feature selection,” IEEE Trans Pattern Anal Mach
Intell, vol. 26, no. 11, pp. 1424–1437, Nov. 2004, Accessed: Nov. 10, 2022.
[44] das Sanmay, “Filters, Wrappers and a Boosting-Based Hybrid for Feature Selection,” in Proceedings of the Eighteenth
International Conference on Machine Learning (ICML’01), 2001, pp. 74–81.
[45] M. Sandri and P. Zuccolotto, “Variable Selection Using Random Forests,” Data Analysis, Classification and the
Forward Search, pp. 263–270, Aug. 2006,
[46] Maciej Serda et al., “Sparse multinomial logistic regression via Bayesian L1 regularisation,” Uniwersytet śląski, vol. 7,
no. 1, pp. 209–216, 2007,
[47] R. Muthukrishnan and R. Rohini, “LASSO: A feature selection technique in predictive modeling for machine learning,”
2016 IEEE International Conference on Advances in Computer Applications, ICACA 2016, pp. 18–20, Mar. 2017,
Accessed: Nov. 10, 2022.
[48] M. Imani and H. Ghassemian, "Ridge regression-based feature extraction for hyperspectral data," International Journal of Remote Sensing, vol. 36, no. 6, pp. 1728–1742, Mar. 2015, doi: 10.1080/01431161.2015.1024894.
[49] S. W. Lin and S. C. Chen, “Parameter determination and feature selection for C4.5 algorithm using scatter search
approach,” Soft Computing 2011 16:1, vol. 16, no. 1, pp. 63–75, May 2011,
[50] S. Nizamani, N. Memon, U. K. Wiil, and P. Karampelas, “Modeling Suspicious Email Detection using Enhanced
Feature Selection,” Dec. 2013,
[51] B. H. Menze et al., “A comparison of random forest and its Gini importance with standard chemometric methods for the
feature selection and classification of spectral data,” BMC Bioinformatics, vol. 10, Jul. 2009, Accessed: Nov. 10, 2022.
[52] A. Destrero, S. Mosci, C. de Mol, A. Verri, and F. Odone, “Feature selection for high-dimensional data,” Computational
Management Science 2008 6:1, vol. 6, no. 1, pp. 25–40, Apr. 2008,
[53] Muhammad Shakil Pervez and Dewan Md. Farid, “Literature Review of Feature Selection for Mining Tasks,”
International Journal of Computer Application, vol. 116, no. 21, pp. 30–33, 2015.
[54] Veerabhadrappa and L. Rangarajan, “Bi-level dimensionality reduction methods using feature selection and feature
extraction,” Int J Comput Appl, vol. 4, no. 2, pp. 33–38, 2010.
[55] M. Yousef, A. Kumar, and B. Bakir-Gungor, “Application of biological domain knowledge based feature selection on
gene expression data,” Entropy, vol. 23, no. 1. 2021.

[56] R. K. Palacharla and V. K. Vatsavayi, “A novel filter based multivariate feature Selection technique for text
classification,” J Theor Appl Inf Technol, vol. 99, no. 18, 2021.
[57] O. M. Alyasiri, Y. N. Cheah, and A. K. Abasi, “Hybrid Filter-Wrapper Text Feature Selection Technique for Text
Classification,” in International Conference on Communication and Information Technology, ICICT 2021, 2021.
[58] K. Golalipour, E. Akbari, S. S. Hamidi, M. Lee, and R. Enayatifar, “From clustering to clustering ensemble selection: A
review,” Engineering Applications of Artificial Intelligence, vol. 104. 2021.
[59] A. Adla and S. T. Zouggar, “Performance Assessment of Random Forest Induction Methods,” in 2021 International
Conference on Decision Aid Sciences and Application, DASA 2021, 2021.
[60] M. Tiboni, C. Remino, R. Bussola, and C. Amici, “A Review on Vibration-Based Condition Monitoring of Rotating
Machinery,” Applied Sciences (Switzerland), vol. 12, no. 3. 2022.
[61] A. N. Anggraeni, K. Mustofa, and S. Priyanta, “Comparison of Filter and Wrapper Based Feature Selection Methods on
Spam Comment Classification,” IJCCS (Indonesian Journal of Computing and Cybernetics Systems), vol. 15, no. 3,
2021,
[62] P. Jayapriya and K. Umamaheswari, “Performance analysis of two-stage optimal feature-selection techniques for finger
knuckle recognition,” Intelligent Automation and Soft Computing, vol. 32, no. 2, 2022.

Grenze International Journal of Engineering and Technology, June Issue

Devanagari Characters Recognition: Extracting Best Match for Photographed Text

Neelam Chandolikar1, Swati Shilaskar2, Vaishali Khupase3 and Mansi Patil4
1 Department of Information Technology and MCA, Vishwakarma Institute of Technology, Pune, India
Email: [email protected]
2-4 Department of Electronics and Telecommunication, Vishwakarma Institute of Technology, Pune, India
Email: [email protected], [email protected], [email protected]

Abstract—Devanagari script is used by a large share of the people in India. Some script-specific structural characteristics of Devanagari make the character recognition problem more challenging. Many OCR tools are available for printed or handwritten Devanagari script recognition. In these systems the input is given in the form of images of the script, which can be scanned or photographed. But the existing systems are not robust: they give unexpected results when the input to the system is not ideal, that is, when the image is rotated or tilted or has illumination variance. Our goal is to build a robust OCR system for printed Marathi highlighted text in which variations with respect to font, size, orientation and illumination are allowed. This paper proposes appropriate image transformation techniques to obtain a robust Devanagari character recognition system.

Index Terms— Devanagari, PCA, Tesseract, Levenshtein edit distance, OCR, Perspective
transform, Sauvola Thresholding, Highlighted word.

I. INTRODUCTION
Humans have a highly developed sense for several pattern recognition tasks; one such task we perform easily every day is recognizing written text. Humans develop their reading and writing skills in their first few years of education, and when they grow up they can easily recognize text even if it is printed in different styles, sizes, fonts and orientations. Even broken, distorted and misspelled words can be recognized by humans, and all this is possible through past experience. A great deal of research has found that the reading skill of computers is still far behind that of humans. In this paper, the goal is to recognize highlighted Marathi words. An image of a highlighted printed or handwritten word is taken as input. The image can be scanned or photographed using a smart-phone or web camera. Many OCR systems for Devanagari script recognition presently exist.
Existing systems have certain limitations: they cannot work on tilted images, images captured at different angles of rotation, or images with illumination variance. Tilt angle and rotation angle are not the same. The tilt angle is the angle made by the camera with the plane; it is the same as the elevation angle, whose zero is at the horizon. If it is non-zero, the effects due to tilt are observed in the image and may not give correct results. The rotation angle is the azimuthal angle; its axis of rotation is perpendicular to the plane. Existing systems may give unexpected results if the angle of rotation is non-zero. Also, if there is uneven illumination or shading, existing systems may fail to recognize the word correctly. So, a methodology is proposed to transform the image
appropriately to nullify the effects due to the tilt and rotation angles. Our system also handles, to some extent, the effects of variation in background light and uneven illumination. Thus, the proposed system achieves the desired robustness with respect to the above-mentioned variations.
The original objective of our OCR system was to provide an appropriate, user-friendly input method for a Devanagari knowledge search engine. The search engine we are developing is to be used by primary school students. The search queries are given to the search engine as a photographed image of Devanagari text, taken with a smart-phone. This way of submitting queries is much easier and more user-friendly than the traditional approach of typing Marathi letters on a keyboard; typically, it is quite tedious to input composite Marathi characters via a keyboard. Though the main objective of our OCR system is to provide a user-friendly input method for our search engine, it can easily be adapted to any other application that requires robust Devanagari character recognition, such as digitization of old documents and texts, digitization of forms for banks, post offices, or other government organizations, recognition of handwritten names/amounts on cheques, etc.

II. RELATED WORK


The quality of the image is a very important factor for text recognition. Image quality is degraded by uneven illumination and when the image is captured in different orientations. Many researchers have developed techniques to improve the quality of such degraded images. Huimin Lu et al. worked on a shadow removal method for text recognition [1]; it uses binarization of images for better performance. Taeyoung Kim et al. proposed a PCA-based computation of an illumination-invariant space, which helps remove shadow effects from an input color image [2]. H. El Bahi et al. worked on an offline character recognition system for images captured by camera phones; they analyzed and compared different thresholding methods to avoid illumination effects and, as a result, chose the Sauvola thresholding method [3]. Sam S. Tsai et al. worked on image matching using visual text features of images captured by camera phones; this work includes a word-distance matching method to demonstrate false matches [4].
Annmaria Cherian et al. used the Hough transform to correct the orientation of a perspective input image so that the text could be recognized by an SVM classifier [5]. Vidula T. V. et al. proposed SURF (Speeded Up Robust Features) for perspective-distorted images, which is faster than the SIFT (Scale Invariant Feature Transform) method [6]. J. Sauvola et al. proposed a technique for image binarization in which a hybrid approach adapts to defective images with changes in illumination, noise and resolution [7]. Yash Gaurav et al. presented a deep convolutional neural network method for the classification of input images [8]. Shalini Puria et al. proposed a Devanagari character classification model based on SVM, which recognizes printed and handwritten text; they presented a unique preprocessing method for handling the shirorekha of Indian scripts [9]. Tripathy et al. presented an SVM-based method for Devanagari character recognition using OCR, applied to the Bangla Devanagari script [10]. Agarwal et al. presented a comprehensive survey of machine learning methods for handwritten Devanagari character recognition [11].

III. PROBLEM DEFINITION


To build a robust OCR system for printed Marathi highlighted text in which variations with respect to font, size, orientation and illumination are allowed.

IV. INNOVATIVE CONTENT


This research offers a unique method to improve the performance of OCR. The goal of this work is to recognize a highlighted Marathi word captured by a mobile phone, web camera or any other scanning device. The advantage of this work is that it is rotation invariant, tilt invariant and illumination invariant, with a maximum accuracy above 98%.

V. PROPOSED METHODOLOGY
In the proposed methodology, we focus on recognizing Marathi highlighted words in any orientation, i.e. tilt or rotation, and under different illumination effects. For recognizing the highlighted script the Tesseract API is used. In the case of the Marathi language, Tesseract fails to recognize the script correctly if the tilt and rotation angles are non-zero or there is light variance. This paper focuses on improving these Tesseract API limitations.
Figure 1 shows the block diagram of the proposed system, which consists of different phases, beginning with an input printed-text image with a highlighted word, pre-processing, rotation invariance, tilt invariance, illumination invariance, Tesseract implementation, finding the best match for the Tesseract output using the Levenshtein distance, and the final recognized text. The block diagram of the proposed system is as follows:

Figure 1. Block diagram of the proposed system

A. Pre-processing
Pre-processing steps (as shown in Figure2 ) is applied on the input image to remove the noise from it and also to
minimize the variations in the character styles. The scanned document sometimes has Salt and Pepper noise or
Shaded areas. This noise must be filtered during preprocessing step. Sometimes image contains some black
spots. To remove these black spots and noise along with black shade at the edges, filtering has been done. Here,
we have used Median filter to remove high frequency components that cause noise in the image.

Figure 2. Overall flow of pre-processing

 Input Image: The highlighted printed text is captured by a mobile phone, and that captured image is the input to our system. The captured image contains one or more highlighted words; recognition of these highlighted words is the goal of this project.
 Masking: The highlighted part is masked here. For this, the lower and upper bounds of the colour are found with the help of the BGR values of the particular colour, and a bitwise AND operation is performed to extract only the coloured part (a minimal OpenCV sketch of these steps follows this list).
 (RGB) Colour image to grayscale image: The input is a colour text image; in the preprocessing phase the image is converted to a grayscale image.
 Thresholding: Thresholding is also known as binarization. A certain threshold value is set, which converts the pixels to black and white: if a pixel value is above the threshold, the pixel is converted to white, and if the pixel value is below the threshold, the pixel is converted to black. The quality of the binarized image depends on the value of the threshold.
 Canny Edge Detection: Canny edge detection is a process to extract significant structural information from the image and reduce the amount of data to be processed.
 Boundary Tracing or Contour Detection: Contour detection finds the boundary of the area of interest using edges. It identifies the connected components of an image and stores their pixel values in an array. Contours are found by traversing the rows of the already-filtered image: the contour detection algorithm searches for a foreground pixel, marks it, and stores it in an array; similarly, it finds all the neighbourhood pixels. This process continues until all the pixels of the image have been stored, or it continues searching in the next row.
B. Rotation Invariance
To make the system rotation invariant, we first need to find the angle by which the image is rotated and then compensate for the rotation. This is implemented using the steps shown in Figure 3.

Figure 3. Block diagram to make the system rotation invariant

1. Mask the highlighted portion, as already discussed in the masking step of the pre-processing part.
2. Crop the masked image to obtain only the highlighted portion. This is required because the masked highlighted portion has a black background, which may give an improper dataset in step 3.
3. Classify the pixels into background pixels and foreground pixels. The black pixels of the text are the foreground pixels and all others are background pixels.
4. Apply Principal Component Analysis to the foreground pixels obtained in the previous step. A brief explanation of PCA is given after step 5.
5. Rotate the image by the negative of the angle obtained in step 4.
Principal Component Analysis (PCA) [12] is used for finding the direction of maximum variance, i.e. the direction in which the data is most spread out. To find this direction, the eigenvectors of the covariance matrix associated with the dataset are calculated. The eigenvector corresponding to the largest eigenvalue gives the vector in the direction of maximum variance. In this paper, the coordinates of the pixels lying inside the contour of the highlighted word form the dataset.
The implementation of PCA proceeds as follows:
I. An n×2 data matrix is formed from the coordinates of the n pixels lying inside the contour of the highlighted word:
   Data = [x_1 y_1; x_2 y_2; ...; x_n y_n]
II. Find the covariance matrix of the data matrix:
   Cov_matrix = [var(x) cov(x, y); cov(x, y) var(y)]
   where cov(x, y) = (1/(n-1)) Σ (x_i - x̄)(y_i - ȳ) and var(x) = (1/(n-1)) Σ (x_i - x̄)²
III. Find the eigenvalues and eigenvectors of the covariance matrix.
IV. Find the eigenvector (v_x, v_y) corresponding to the largest eigenvalue. The direction of this eigenvector is obtained as
   angle(θ) = tan⁻¹(v_y / v_x)
The value of this angle is the angle made by the highlighted word with the X axis. If it is not zero, then in order to make the word parallel to the X axis we rotate the image by its negative: rotate angle = 0 - angle.
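A minimal NumPy sketch of steps I-IV (foreground_pixels stands in for the n×2 array of coordinates inside the contour; the sample values are made up):

import numpy as np

# assumed n x 2 array of foreground (x, y) pixel coordinates
foreground_pixels = np.array([[10, 12], [20, 21], [30, 33], [40, 41]], dtype=float)
cov = np.cov(foreground_pixels, rowvar=False)   # 2 x 2 covariance matrix (step II)
eigvals, eigvecs = np.linalg.eigh(cov)          # eigen-decomposition (step III)
v = eigvecs[:, np.argmax(eigvals)]              # eigenvector of the largest eigenvalue (step IV)
angle = np.degrees(np.arctan2(v[1], v[0]))      # direction of maximum variance
rotate_angle = -angle                           # compensate by rotating back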
Figure 4 shows the input image, its HSV image, the masked image, the median-filtered image, the grayscale image, the application of binary thresholding, contour detection, the rotated image, the median filter applied to the rotated image, and the final image.

Figure 4. (a) Input image, (b) HSV image, (c) Masked image, (d) Median-filtered image, (e) Grayscale image, (f) Binary thresholding, (g) Contour detection, (i) Rotated image, (j) Median filter on rotated image, (k) Final image

C. Tilt Invariant
When photos are taken at a slightly tilted angle, the highlighted word is not properly visible. To make the word properly visible, the perspective transform method is used.
Implementation of the Perspective Transform
To make the system tilt invariant, the perspective transform plays an important role. In the transformed image the letters are straight rather than slanted, which makes this algorithm very useful for OCR.
Initially, the contour of the highlighted word is found. For the perspective transform, we need to define the region of interest, which is in the form of a rectangle. The coordinates of the vertices of the rectangle are such that the top-left point has the smallest (x+y) sum, the bottom-right has the largest (x+y) sum, the top-right has the smallest (x-y) difference and the bottom-left has the largest (x-y) difference. These points are then placed in a consistent order. The height and width of the rectangle enclosing the highlighted word can be determined from the vertices obtained above. The first point, (0, 0), in the list of points indicates the top-left corner; the second point, (maxWidth - 1, 0), is the top-right corner; (maxWidth - 1, maxHeight - 1) gives the bottom-right corner; and (0, maxHeight - 1) gives the bottom-left corner. These points are defined in a consistent ordering representation. A top-down view of the image is obtained using the cv2.getPerspectiveTransform function, which requires two arguments, rect and dst: rect is the list of the four region-of-interest points in the original image and dst is the list of transformed points. cv2.getPerspectiveTransform returns the actual transformation matrix M, which is then applied with cv2.warpPerspective: passing the matrix M, the image, and the height and width of the output image to cv2.warpPerspective yields the warped image, which is our top-down view. Figure 5 shows the steps for making the system tilt invariant.

Figure 5. (k) Tilted image, (l) ROI shown by rectangle, (m) Warped image
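A minimal sketch of this four-point transform (rect is assumed to already hold the four corner points, as a NumPy array in the consistent order described above):

import cv2
import numpy as np

def four_point_warp(image, rect):
    # rect: corners ordered top-left, top-right, bottom-right, bottom-left
    (tl, tr, br, bl) = rect
    maxWidth = int(max(np.linalg.norm(br - bl), np.linalg.norm(tr - tl)))
    maxHeight = int(max(np.linalg.norm(tr - br), np.linalg.norm(tl - bl)))
    dst = np.array([[0, 0], [maxWidth - 1, 0],
                    [maxWidth - 1, maxHeight - 1],
                    [0, maxHeight - 1]], dtype='float32')
    M = cv2.getPerspectiveTransform(rect.astype('float32'), dst)
    return cv2.warpPerspective(image, M, (maxWidth, maxHeight))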

D. Illumination Invariant
To make the system illumination invariant, the Sauvola thresholding method is used.
Sauvola thresholding
Sauvola thresholding is a local thresholding technique, useful for text recognition when the background of the image is not uniform [7]. In this method a threshold is calculated for every pixel using the formula mentioned below, which involves the mean and standard deviation of the local neighbourhood defined by a window centred on the pixel. The local threshold value
will be calculated by the following equation:
T(x, y) = m(x, y) · (1 - k · (1 - s(x, y) / R))

where k is a constant equal to 0.5, and R denotes the dynamic range of the standard deviation s (R = 128 for grayscale documents).
Algorithm (a Python sketch of the original pseudocode; the local window statistics are computed here with SciPy's uniform box filter):

import numpy as np
from scipy.ndimage import uniform_filter

def mean_std(image, window_size):
    # local mean m and standard deviation s of every pixel, over a
    # window_size x window_size neighbourhood centred on that pixel
    img = image.astype(float)
    m = uniform_filter(img, window_size)
    s = np.sqrt(np.maximum(uniform_filter(img ** 2, window_size) - m ** 2, 0))
    return m, s

def threshold_sauvola(image, window_size, k=0.5, r=128):
    # per-pixel Sauvola threshold T(x, y), returned as an (N, M) ndarray
    m, s = mean_std(image, window_size)
    return m * (1 - k * (1 - s / r))

The mean_std function above is used by the Sauvola threshold: the mean and standard deviation of each pixel of the image are calculated and returned using its neighbourhood. Here, the neighbourhood is defined by a rectangular window of size w × w, where the window size w should be an odd integer value such as 3, 5, 7, and so on. The parameter window_size determines the size of the window that contains the surrounding pixels. Sauvola thresholding is applied to the array, and the threshold value T is calculated using the formula given in the algorithm, where m(x, y) is the mean of pixel (x, y), s(x, y) is the standard deviation of pixel (x, y), k weights the effect of the standard deviation, and R is the maximum standard deviation of the grayscale image.
This algorithm compensates for the illumination effects of the image: even if the image is captured under different light variations, Sauvola thresholding preserves the information contained in the image. Figure 6 shows two images: image (n) has an illumination effect, and image (o) is obtained by applying Sauvola thresholding, which is very useful for the OCR system to recognize the text correctly.

Figure 6. (n) Image with illumination effect, (o) Sauvola-thresholded image

E. Text Recognition
Text recognition is the most important task of any OCR system. Various OCR systems are available, but they are not capable of producing correct output when there is variation in rotation, tilt or illumination. So the aim of this project is to make a robust system in which variations with respect to rotation, tilt and illumination are allowed; the main focus of this work is on these three aspects.
This text recognition system is implemented using the Tesseract OCR engine, maintained by Google, which supports more than 110 languages [13] and uses a Long Short Term Memory neural network trained on text data. It converts an image into text with 98% accuracy, but when variations in rotation, tilt or lighting are present its accuracy starts decreasing. So we tried to overcome these problems using PCA to make the system rotation invariant, perspective transform to make it tilt invariant, and Sauvola thresholding to make it illumination invariant, so that the recognition rate increases above 98%.
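A minimal sketch of this recognition step using the pytesseract wrapper is shown below; it assumes Tesseract and its Marathi ('mar') language data are installed, and the input file name is a placeholder.

import cv2
import pytesseract

# Image assumed already corrected for rotation, tilt and illumination.
img = cv2.imread("preprocessed.png")
text = pytesseract.image_to_string(img, lang="mar")  # Marathi language pack
print(text)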
F. Extraction of Best match
The goal of this work is to make a robust OCR system that recognizes a word even if it is distorted, misspelled or has some characters deleted; it should be recognized in its correct form, and this is done by finding the best match to the infected word. For this, a list of words is stored in a text file. The word obtained from the previous step is searched in that text file, and if a match is found for that particular word, the best match is treated as the recognized text. This increases the accuracy of our system.
To find the best match for the detected word, the Levenshtein distance method is used. The Levenshtein distance [14] is a string distance measure used for measuring the difference between two sequences of characters. Informally, the Levenshtein distance between two strings is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one string into the other. This edit distance method helps us match a word or string in its infected form with its original form.
The Levenshtein edit distance method is implemented in the following order:
 Minimum length of the two words
 Actual Levenshtein edit distance between the words
 Length of subset string match, starting from the first letter
In this project the Levenshtein distance method is used to find the best match to a recognized Marathi word from the Marathi keywords, in order to increase accuracy. The keyword list is stored in a text file beforehand. A recognized word found in infected form is corrected using the Levenshtein edit distance method, as shown in Table I.
Mathematically, the Levenshtein distance between two strings a and b is given by lev(a, b) = lev_{a,b}(|a|, |b|), where

lev_{a,b}(i, j) = max(i, j)                                        if min(i, j) = 0
lev_{a,b}(i, j) = min( lev_{a,b}(i - 1, j) + 1,
                       lev_{a,b}(i, j - 1) + 1,
                       lev_{a,b}(i - 1, j - 1) + 1_{(a_i ≠ b_j)} )   otherwise

TABLE I. LEVENSHTEIN EDIT DISTANCE OBSERVATION

Incorrect Word        Correct Word
अशुद                  अशु ी
िव यता                िव ता
नैस रक                नैसविगक
अनधा ् य              अ धा
Here 1_{(a_i ≠ b_j)} is the indicator function, equal to 0 when a_i = b_j and equal to 1 otherwise, and lev_{a,b}(i, j) is the distance between the first i characters of a and the first j characters of b.
Note that the first element in the minimum corresponds to deletion (from a to b), the second to insertion, and the third to match or mismatch, depending on whether the respective symbols are the same.
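The recurrence above translates directly into a small dynamic-programming routine; the sketch below also adds a best-match lookup over a keyword file, with the file name and the sample word as placeholders.

def levenshtein(a, b):
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                                  # i deletions
    for j in range(n + 1):
        d[0][j] = j                                  # j insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1  # indicator term
            d[i][j] = min(d[i - 1][j] + 1,           # deletion
                          d[i][j - 1] + 1,           # insertion
                          d[i - 1][j - 1] + cost)    # substitution or match
    return d[m][n]

def best_match(word, keywords):
    # Keyword with the smallest edit distance to the recognized word.
    return min(keywords, key=lambda k: levenshtein(word, k))

with open("keywords.txt", encoding="utf-8") as f:    # placeholder keyword file
    keywords = [line.strip() for line in f if line.strip()]
print(best_match("infected_word", keywords))         # placeholder input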

VI. PRACTICAL RESULTS AND ANALYSIS


In this section the current results are presented. The figure below shows the output for an image containing the highlighted Marathi printed text ‘भारत’. The system becomes rotation invariant by implementing PCA: even if the image is rotated during scanning or capturing, the system maintains its accuracy. In this work, if the recognized word is infected, the Levenshtein edit distance method is used to extract the best-matching word from the list stored in a text file or Marathi dictionary. Implementation of the perspective transform makes the system tilt invariant, and the effect of uneven illumination is removed using the Sauvola algorithm.
The perspective transform and Sauvola algorithm work efficiently and give good results. The practical results of each stage are shown in the following figures: Figure 7 shows the rotation invariant result, Figure 8 the tilt invariant result and Figure 9 the illumination invariant result. The final recognized word is shown in Figure 10.

Figure 7.Rotation Invariant Result

Figure 8.Tilt Invariant Result

Figure 9.Illumination Invariant Result

Fig.10.Recognized Marathi Text

The experiment was done on 100 images. The experimental analysis shows that the proposed method works fine with long words, i.e. words having more than two letters, for which it gives 98% accuracy; the accuracy is reduced by illumination effects on the image.

VII. CONCLUSION
In the context of the current project we plan to use the character recognition system for a knowledge search engine whose users are school children, for whom handling the complicated Devanagari keyboard might be difficult, and for whom a transliteration facility for entering Marathi words on an English keyboard is also difficult. So we allow the user to enter the search query as a captured image in which the region of interest is highlighted by a marker, so that the user can acquire more information through the knowledge search engine. We use the character recognition system to recognize the input, convert it into digital form and pass it on to our knowledge search engine. This feature enables users to enter search queries in an easy and user-friendly manner. The developed character recognition system has several standalone applications as well, such as digitization of documents and automating systems in which the ability to recognize text/numbers plays a crucial role (e.g. recognizing the amounts written on cheques, the addresses written on envelopes, or names, addresses and phone numbers written on forms), to list a few. The underlying algorithms and techniques we aim at developing are applicable to all these applications in general. This approach can be used for multilingual character recognition as well.

REFERENCES
[1] Huimin Lu, Baofeng Guo, Juntao Liu and Xijun Yan, “A Shadow Removal Method for Tesseract Text Recognition”, 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI 2017).
[2] Taeyoung Kim, Yu-Wing Tai and Sung-Eui Yoon, “PCA Based Computation of Illumination Invariant Space for Road Detection”, 2017 IEEE Winter Conference on Applications of Computer Vision.
[3] H. El Bahi, Z. Mahani, A. Zatni and S. Saoud, “A robust system for printed and handwritten character recognition of images obtained by camera phone”, WSEAS Transactions on Signal Processing, Volume 11, 2015.
[4] Sam S. Tsai, Huizhong Chen, David Chen, Vasu Parameswaran, Radek Grzeszczuk and Bernd Girod, “Visual Text Features for Image Matching”, 2012 IEEE International Symposium on Multimedia.
[5] Annmaria Cherian and Sebastein, “Automatic Localization and Recognition of Perspectively Distorted Text in Natural Scene Images”.
[6] Vidula T. V. and Vrinda V. Nair, “A Robust Performance Evaluation Scheme for Rectification Algorithms in Camera Captured Document Images”, 2014 ICCSC, Trivandrum.
[7] J. Sauvola and M. Pietikainen, “Adaptive document image binarization”, Pattern Recognition 33(2), pp. 225-236, 2000. DOI:10.1016/S0031-3203(99)00055-2.
[8] Y. Gurav, P. Bhagat, R. Jadhav and S. Sinha, “Devanagari Handwritten Character Recognition using Convolutional Neural Networks”, 2020 International Conference on Electrical, Communication, and Computer Engineering (ICECCE), 2020, pp. 1-6, doi: 10.1109/ICECCE49384.2020.9179193.
[9] Shalini Puri and Satya Prakash Singh, “An efficient Devanagari character classification in printed and handwritten documents using SVM”, Procedia Computer Science 152 (2019), pp. 111-121.
[10] Tripathy, Nilamadhaba, Tapabrata Chakraborti, Mita Nasipuri and Umapada Pal, “A scale and rotation invariant scheme for multi-oriented character recognition”, 2016 23rd International Conference on Pattern Recognition (ICPR), pp. 4041-4046, IEEE, 2016.
[11] Agrawal, Mimansha, Bhanu Chauhan and Tanisha Agrawal, “Machine Learning Algorithms for Handwritten Devanagari Character Recognition: A Systematic Review”, vol. 7 (2022), pp. 1-16.
[12] Ding, Chris, Ding Zhou, Xiaofeng He and Hongyuan Zha, “R1-PCA: rotational invariant L1-norm principal component analysis for robust subspace factorization”, Proceedings of the 23rd International Conference on Machine Learning, pp. 281-288, 2006.
[13] Badla, Sahil, “Improving the efficiency of Tesseract OCR Engine”, 2014.
[14] Haldar, Rishin and Debajyoti Mukhopadhyay, “Levenshtein distance technique in dictionary lookup methods: An improved approach”, arXiv preprint arXiv:1101.1232, 2011.

Grenze International Journal of Engineering and Technology, June Issue

Dental Biometrics Segmentation on Panoramic X-Ray


Images using Computational Intelligence Approach
Dr. M. Sujithra1, Ms. J. Rathika2, Dr. P. Velvadivu3, Abinanda.P4, Gayathri.G5 and Rekha.V.S6
1-3
Assistant Professor, Department of Computing – Data Science, Coimbatore Institute of Technology, Coimbatore
Email: [email protected], [email protected], [email protected]
4-6
MSC Data Science, Department of Computing – Data Science, Coimbatore Institute of Technology, Coimbatore
Email: [email protected], [email protected]

Abstract—Dental biometrics is a new field of study in the sector of biometric identification. This technique can sometimes be used instead of the usual fingerprint biometric identification. In most cases, dental biometric credentials come in handy when analyzing the details of a deceased person, where comparing the dental biometrics of the person before and after death can help explain the cause of death and justify the person's identity after death. Neural networks have the potential to learn such structure thanks to their complex architecture. Here, X-ray copies of the dental teeth structure of an individual are fed to the network, and with the help of an object detection platform such as OpenCV2 the neural network detects the teeth structure and can visualize it from the X-ray of that individual alone. As the network is now able to see the teeth, it can learn many crucial details from the picture it understands; interpretations about the teeth contours and the number of teeth are made along the way. The proposed methodology has better accuracy than the relevant fuzzy clustering methods. Appropriate parameter values for the algorithm are also suggested.

Index Terms— Dental Biometric, UNet, OpenCV2, image processing.

I. INTRODUCTION
With the greater availability of medical digital data, expanding processing power and advances in artificial intelligence, computer-aided diagnosis (CAD) has made tremendous progress during the previous two decades. CAD systems that aid radiologists and physicians in decision-making are used to solve a variety of medical issues, notably breast and colon cancer identification, lung disease classification, and brain lesion identification. Digital radiography's growing popularity encourages more research in the field. Radiographic image processing has now become a major target of automation in dentistry, as radiographic information is a vital aspect of diagnosis, dental health monitoring and treatment planning. Several investigations have been done in the last decade to address the problem of teeth detection. Several pixel-level techniques for tooth detection have been suggested that were based on classic computer vision techniques like thresholding, histogram equalization etc. With enough recall (sensitivity) one can help the computer distinguish the teeth. On CT images too, various techniques have been utilized, and a manual method to place coordinates surrounding each tooth has been developed.

A. Teeth Analysis
There are basically two processes in tooth numbering: segmentation and classification. Features such as the width-to-height ratio and crown size are extracted from the segmented teeth, whereas the wavelet Fourier descriptor is used to capture the geometry of the teeth. Models like support vector machines (SVMs), sequential-based algorithms and feedforward neural networks (NNs) are used to characterize the teeth.
B. Convolutional Neural Networks (CNN)
CNNs are used in this study for tooth detection and tooth numbering. CNNs are a common type of deep feedforward neural network design, frequently used for image recognition. CNNs have been the most popular NN for the past two decades, but the real revolution in deep learning came after the design of the AlexNet architecture, which significantly outperformed other teams in the ImageNet Visual Recognition challenge. Since then, CNNs have seen rapid development; they are currently used in a wide range of applications and represent a cutting-edge solution to various computer vision challenges.

II. DATASET
From January 2016 to March 2017, 1574 panoramic radiograph samples were randomly selected from the X-rays provided by Reutov Stomatological Clinic, Russia. The database does not include any other features such as gender, age or time. The tooth detection and identification models were trained on the training group, while the software's performance was verified on the testing group. The XG-3 Sirona Orthophos X-ray machine (Sirona Dental Systems GmbH, Bensheim, Germany) was used to capture all panoramic radiographs. Ground truth annotations for the images were provided by five radiology professionals with varied levels of experience; the experts were instructed to draw bounding boxes around all teeth on the high-resolution panoramic radiographs. Due to the skewness in the data collection, completely anonymized data is used. The Steklov Institute of Mathematics in St. Petersburg, Russia, made a formal decision that the use of radiographic material for this work was exempt from ethics committee or IRB approval.

III. SYSTEM ANALYSIS


The technique demonstrated here uses panoramic radiographs as input. To identify the borders of the teeth, the teeth detection module examines the radiograph; the panoramic radiograph is then cropped using the predicted bounding boxes. The model classifies each cropped region using the FDI notation, and heuristics are then used to produce the final teeth numbers. The system outputs the bounding box coordinates and matching tooth numbers for each detected tooth in the image. The diagram shows the entire architecture and workflow.

Figure.1: Process of Teeth Identification System

IV. MODELS
A. Deep Learning Models
The suggested approach makes use of deep learning techniques. Deep learning enables a computer programme to extract and learn attributes from the input data so as to generalize to previously unseen examples. Deep learning techniques stand out because they can learn directly from raw input data, such as the pixels of pictures, without the requirement for manual feature engineering [1]. One of the most popular deep learning techniques for image recognition is the deep CNN. To efficiently represent and learn hierarchical features at various levels of abstraction, CNN designs take advantage of unique properties of image data, such as spatial relationships between objects; see LeCun et al. for a thorough description of deep learning techniques.
B. UNET Model Neural Network
The encoder and decoder are the two essential parts of the introduced technology. The encoder comprises several convolution layers followed by pooling layers and is used to extract the image's features. The second part, the decoder, utilizes transposed convolutions to enable precise localization; the result is again a fully convolutional network of joined layers.
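A minimal encoder-decoder sketch in Keras is given below to make this description concrete; the depths, filter counts and input size (512x512, as used later in the paper) are illustrative assumptions, not the authors' exact configuration.

from tensorflow.keras import layers, Model

def tiny_unet(shape=(512, 512, 1)):
    inp = layers.Input(shape)
    # Encoder: convolutions extract features, pooling downsamples.
    c1 = layers.Conv2D(16, 3, activation="relu", padding="same")(inp)
    p1 = layers.MaxPooling2D()(c1)
    c2 = layers.Conv2D(32, 3, activation="relu", padding="same")(p1)
    p2 = layers.MaxPooling2D()(c2)
    b = layers.Conv2D(64, 3, activation="relu", padding="same")(p2)
    # Decoder: transposed convolutions upsample; skip connections restore detail.
    u1 = layers.Conv2DTranspose(32, 2, strides=2, padding="same")(b)
    c3 = layers.Conv2D(32, 3, activation="relu", padding="same")(
        layers.concatenate([u1, c2]))
    u2 = layers.Conv2DTranspose(16, 2, strides=2, padding="same")(c3)
    c4 = layers.Conv2D(16, 3, activation="relu", padding="same")(
        layers.concatenate([u2, c1]))
    out = layers.Conv2D(1, 1, activation="sigmoid")(c4)  # per-pixel tooth mask
    return Model(inp, out)

model = tiny_unet()
model.compile(optimizer="adam", loss="binary_crossentropy")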
C. Teeth Detection
The Faster R-CNN model is used in the teeth detection method. Faster R-CNN arose from the Fast R-CNN infrastructure, which in turn used the R-CNN methodology (Region-based CNN). It is challenging to find the areas of interest in object detection; R-CNN offered a unified strategy for both region-of-interest proposal generation and object localization. Fast R-CNN, which streamlined the pipeline and optimized computation, improved R-CNN's performance, and finally a much more sophisticated CNN-based method was presented by Faster R-CNN. Faster R-CNN is made up of two parts: the object detector and the region proposal network (RPN). The RPN proposes regions of interest, i.e. teeth in this case. The object detector makes use of these proposals to better localize and categorize the objects. Both modules produce feature maps, condensed versions of the source image, using the underlying CNN convolution layers. In contrast to standard computer vision algorithms, which demand hand-engineering of the features, the features are derived during the training phase.
By moving a window over the feature map and creating potential bounding boxes called "anchors" at each window point, the RPN creates region proposals. The RPN employs a regressor to narrow the bounding box and determines the likelihood that each anchor contains an object or background. The top N-ranked region proposals are then sent to the object detection network [2]. The object detector generates the final bounding box coordinates for a two-class detection task after refining the class score of a region to determine whether it is a tooth or background. Model weights pre-trained on the ImageNet data set were used to initialize the base CNN. All of the CNN's layers were fine-tuned because the data set is sizable enough and differs enough from ImageNet. The learning rate was initially set at 0.001, with exponential decay following.
D. Teeth Numbering
The teeth are numbered by a convolutional architecture called VGG-16. The model was trained to estimate tooth numbers using the two-digit notation. This module classifies the teeth using the output of the teeth detection module: it crops the teeth based on the predicted bounding boxes, and each cropped image is then assigned a two-digit tooth number by the VGG-16 CNN. The classifier produces a set of confidence scores for each of the 32 classes, estimating the likelihood that each bounding box contains one of the 32 possible tooth numbers. The classified data is then processed by a custom heuristic algorithm to enhance the prediction results; this is done so that each tooth number appears only once.
As for teeth detection, the weights learned on the ImageNet dataset are used to initialize the CNN model. Cropped images were created for training based on the annotated X-rays [4]. The cropping process was modified to include nearby structures, which increased the CNN's prediction quality by providing context. To further increase the variety of the data set, the images were augmented. The CNN was trained with batches of 64 images. The dental numbering module is developed in Python, with the Keras library and TensorFlow serving as the backend.
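A hedged Keras sketch of such a numbering classifier is shown below: VGG-16 initialized with ImageNet weights and a 32-way softmax head for the two-digit tooth numbers; the input size and dense-layer width are illustrative assumptions.

from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, Model

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
x = layers.Flatten()(base.output)
x = layers.Dense(256, activation="relu")(x)
out = layers.Dense(32, activation="softmax")(x)  # 32 possible tooth numbers
clf = Model(base.input, out)
clf.compile(optimizer="adam", loss="categorical_crossentropy",
            metrics=["accuracy"])
# clf.fit(cropped_teeth, labels, batch_size=64)  # 64-image batches, per the text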

V. RESULTS AND DISCUSSION


Figure 2 shows a single input X-ray image from the large set of input X-ray images loaded into the program. This image is not the raw version of the input but the processed, clear image of the input given to the program.
Figure 3 shows another view of the same input image displayed in Figure 2, but with a renewed sense of perspective. Here, the detected region of the input image is filled with yellow color, and through this one can see that the program, using YOLO, successfully detects the teeth in the X-ray image loaded into the program.

Figure.2: Clear Image of Input X-ray

Figure.3: Another view of the same image after computer vision processing

Figure.4: UNet model operations

Figure 4 above shows the operations of the UNet model implemented by the program. The UNet model primarily makes sure that the image already detected by the YOLO program is understood further. The images loaded into the model are expected in dimensions (512, 512) so that the model performs better.

Figure.5: F1_Score of the Model

Figure 5 gives the F1 score of the UNet model implemented previously. The score implies that the model is very good at detecting the teeth in the X-ray images given to it; the model scored 95% on the F1 score calculated for it.
Figure 6 displays the successfully predicted mask, i.e. the teeth, of the test images by the fitted model. The model now reads the provided X-ray images containing the teeth, and this figure is the proof that it detects the teeth successfully [5].
Figure 7 displays a contoured X-ray image of the test images given to the model. Contours are used here as a change in the view of the X-ray image: only the boundary of the detected teeth is marked, providing a new perspective. Like the previous figure, this one is evidence that the model can detect the teeth in the X-ray image.

Figure.6: Successful predicted mask

Figure.7: Contoured X-ray image

Figure.8: Detailed Detection of the teeth

Figure 8 displays a detailed detection view of the teeth X-ray image given to the model as a test image. The pixel details of every tooth detected by the model are displayed next to it. This can significantly help technicians, as these details are computed easily by the model once it detects and understands the X-ray image of the teeth given to it.

Figure.9: Total teeth detected

This figure gives the total number of teeth that the model has detected so far in the test X-ray image given to it. Note that people usually have about 28 to 32 teeth, depending on whether some or all wisdom teeth have sprouted; in some cases people may have fewer teeth because some have fallen out. All these cases make it more important to know the number of teeth a person has from the provided X-ray image.

VI. CONCLUSION
The Dental Biometric System's UNet model-based design is particularly effective at identifying the biometrics of the teeth. Panoramic radiographs can be a very effective tool to support patients' diagnosis and to define a treatment plan for them. The use of segmentation models to detect teeth and their exact limits can be of paramount importance for eliminating a task that is quite susceptible to human failure. Biometric traits can be used for authentication and personal security [3, 6]. It is possible to collect even the pixel details and measurements of the teeth shown in the panoramic X-ray photographs; these findings are extremely helpful for forensic and dental research. Based on the F1 score, this model provides 95% accuracy. This study can be further refined by ensuring that the UNet model also detects additional information about the teeth, such as their type and any surgical identification in them. The results obtained in this work are satisfactory and present paths for a better and more effective dental segmentation process.

REFERENCES
[1] Yetis, A.D., Yesilnacar, M.I. and Atas, M., “A machine learning approach to dental fluorosis classification”, Arabian Journal of Geosciences, 14(2):1-12, 2021.
[2] L. Megalan Leo and T. Kalapalatha Reddy, “Learning compact and discriminative hybrid neural network for dental caries classification”, Microprocessors and Microsystems, vol. 82, Article ID 103836, 2021.
[3] C. Muramatsu, “Tooth detection and classification on panoramic radiographs for automatic dental chart filing: improved classification by multi-sized input data”, Oral Radiology, vol. 37, no. 1, pp. 13-19, 2021.
[4] M. Sujithra and G. Padmavathi, “Next generation biometric security system: An approach for mobile device security”, Proc. CCSEIT, pp. 371-381, 2012.
[5] Fariza, A., Arifin, A.Z., Astuti, E.R. and Kurita, T., “Segmenting tooth components in dental x-ray images using Gaussian kernel-based conditional spatial Fuzzy C-Means clustering algorithm”, International Journal of Intelligent Engineering and Systems, 2019.
[6] M. Sujithra and G. Padmavathi, “An Improved PCA based Zero Crossing Feature Extraction for Real-Time Biometric Iris Authentication in Low Power Resource Constrained Mobile Devices”, International Journal of Applied Engineering Research, 2015.

Grenze International Journal of Engineering and Technology, June Issue

Credit Risk Analysis of Loans using Social media


Information
P.P.Halkarnikar1, H.P. Khandagale2 and Amol Dhakne3
1,3
Dr.D.Y.Patil Institute of engineering, Management & Research, Pune, India.
Email: [email protected]
2
Department of Technology, Shivaji University, Kolhapur, India.
Email: [email protected]

Abstract—The core business of the banking sector is sanctioning loans to individuals and industries. Credit risk analysis of these borrowers gives some guarantee of regular repayment of the loan; healthy business firms repay their loans regularly, thereby increasing the bank's return on investment. It is possible to increase the accuracy of credit risk calculation using current technology like Big Data and various analytical tools. In our approach, along with traditional parameters like profit/loss, financial history, the financial status of directors and cash flow, we also include unformatted data such as news and informal information in the analysis. This information can be classified as positive, negative or regular, and can be collected using Big Data techniques from websites, news sites, government agencies and external agencies. It is used to construct credit scoring models and to predict the borrower's creditworthiness and default risk. Given the uncertainty associated with judging the credit of a borrower, it is necessary to add new tools and methods to achieve maximum correctness. Our approach of using Big Data analysis tools on informal sources available on the internet will increase the accuracy of finding good borrowers for banks.

Index Terms— Financial Analysis, Credit risk, Big Data, Data mining.

I. INTRODUCTION
The banking industry deals with capital flow and the risk associated with it. The overall performance and profit of a bank depend upon the repayment of the loans distributed to different sectors. A bank distributes loans to individuals and to businesses. An individual's credit can be calculated based on his income, tax paid, savings and assets, but for business firms it is a complicated process. Many banks now use automated tools for risk calculation and credit determination. These tools take into consideration profit/loss, sales history, the financial status of promoters, cash flow and other parameters, from which the bank calculates the credit level of the firm. The bank's success greatly depends upon its credit decisions. Banks are exposed to different kinds of risk, but the most challenging is credit risk. The performance of loan contracts affects the profitability and stability of a bank's growth and development. The extent to which a borrower uses the credit facility efficiently greatly impacts the firm's repayment ability and performance, which in turn affects the lending institutions. Credit risk is the loss of the bank's profit when the customer does not adhere to his or her loan repayment commitment. Financial institutions face problems with loan proposals because of continuous
changes in the business environment, credit regulations, marketing strategies and the competition in the business itself. The objective of credit scoring is to help credit providers quantify and manage the financial risk involved in providing credit so that they can make better lending decisions quickly and more objectively.
Various statistical and machine-learning techniques have been used in the past to model company credit and bond ratings. The present analysis depends upon statistical figures gathered from various sources of business. Sanctioning of loans requires the use of large and varied data, along with substantial processing time for a large number of variables. The development of Big Data technology has encouraged researchers to add non-statistical information as one of the parameters for credit risk analysis and default prediction. Today credit risk concerns various stakeholders such as institutions, customers, regulatory bodies, depositors and investors. A lack of attention to changes in economic or other circumstances of a business, which can lead to a deterioration in its credit standing, threatens a bank's sustainability. Credit risk is a topic of interest for finance communities, researchers and the banking sector.
The objective of this paper is to determine how Big Data techniques can be used to develop a valid and useful mechanism for analyzing credit risk and estimating the allowance for credit default. The statistical data collected by the bank is additionally supported by data collected from social sites. This helps loan distribution with less risk of failure. Loan applicants are classified into two categories: good credit and bad credit. A good-credit business is likely to repay the debt, whereas a bad-credit business is likely to default. An analysis of credit risk can indirectly indicate whether a bank's credit-granting policies are proper.

II. RELATED WORK


Kwaku D. Kessey, in his paper, discussed the increase in nonperforming assets in the banking sector of Ghana [1]. The author discusses the challenges of risk management in the changing scenario of technology and automation. Poor risk calculation using traditional methods has resulted in many bad debts for the banking sector. The study is limited to Ghana, but it reveals that a changed evaluation process is the need of the time for the banking sector. Proper portfolio design for credit risk analysis and the use of the latest technology in risk calculation are highlighted in this paper. The primary data required is taken from the bank, and secondary data is collected from firms' websites and annual reports. Trend analysis is applied to previous years to understand company growth.
Hamid Eslami Nosratabadi et al. used different data mining tools for credit risk analysis of loans [2]. Different data mining techniques like KNN, decision trees and others are applied to loan parameters using the Clementine data mining software. After applying the data mining tools, loans are classified into three groups: bad, medium and good. A fuzzy expert system is proposed for better risk prediction; the authors propose fuzzification of the risk classes for loan analysis. Asrin Karimi develops similar techniques in [3].
Ghatge and Halkarnikar proposed automated analysis of bank parameters for finding credit risk in sanctioning loans to firms [4]. They proposed an artificial neural network technique for credit risk analysis. The ANN technique is an advanced data mining tool with self-learning capability: the parameters selected for credit risk are automatically weighted according to the training set. The work in that paper is extended here to incorporate unformatted data collected from websites and social media into the calculation of credit risk. Khaled Alzeaideen also proposes an artificial neural network approach for credit risk analysis in his article [9].
Sudhakar M et al. propose the use of data mining in the banking field for prediction of credit risk for loans [6]. In their paper credit risk is calculated using the well-known Weka data mining tool. The decision tree technique is used for classification and prediction of customers, implemented in two stages; the CIBIL credit score maintained by an external organization is also considered in the risk calculation. In the first stage the tree is built and pruned; in the second stage the loan's risk is predicted and the proposal is recommended for selection or rejection. Sudhamathy G applies a similar approach for risk prediction [7], using R for tree development; he has used this tree-structure approach to predict possible risk in sanctioning individual loans [10].
Somayeh Moradi et al. proposed a dynamic model for credit analysis which is trained monthly for prediction [11]. They also added fluctuating politico-economic factors to the dynamic model; these factors work along with the financial parameters set by the bank, making robust prediction possible. For the model they considered the Iranian banking system, and fuzzy rules were formed to take all internal and external factors into consideration.
Statistical methods are considered in many of the papers we referred to [3][5][8]; these papers provided valuable parameters used for credit risk analysis. Looking at the present state of credit risk analysis, data mining techniques are becoming popular. Salihu, Armend et al. provided a comprehensive survey of different data mining algorithms used for credit analysis [12]; the advantage of using modern tools is highlighted in that paper. The data generated by social sites is unformatted and is termed Big Data, and Big Data tools are currently used to analyze data generated by social sites and the internet for credit risk analysis. Wenshuai Wu described the advantage of using Big Data analysis for credit risk calculation [13]. The complex methods and parameters involved in risk calculation always need advanced tools, and the author focuses on modern tools and future trends in this complex process. The volume of data is huge; hence independent research is carried out on the effect of social information on risk calculation in our proposed system. The information and news provided by social sites are valuable signals of the financial state of firms and their business domain. If this information is utilised in the calculation of credit risk, a bank may avoid a possible loss to its assets.

III. DATA COLLECTION AND METHODOLOGY


A. Methods of Data Collection
Financial analysis of a potential borrower begins with an understanding of the firm, its business, its sales figures, and its key risks and success factors. Financial ratios are calculated from these values, and some qualitative variables are also derived from the available data. For the development of our proposed model, data was collected from a nationalized bank, restricted to commercial loans. These loans were disbursed to various firms over the past five years, so their performance during this period is known. The decision to sanction a loan is based on the perception of the sanctioning persons, who check the facts presented to them at the time of the loan application. No tool other than the personal interview at the time of sanctioning is used to collect informal information about the business domain. The data collected from the bank is as under:
1) History/Application of the Borrower.
2) Financial Statements/Balance sheet for 3 years before loan sanctioning.
3) Account Statements for each financial year.
B. Data Collection from Social sites
The data required for analysis is also collected from other sites. The following sources are used, filtered to a specific firm name and period:
• CIBIL site.
• Web site of the firm.
• Facebook responses.
• Twitter responses.
• News site of financial newspapers.
All the information is filtered through Big Data tools, limited to a given firm and period. The data collected from these sources is of non-structured type, so it needs more complicated processing to bring it into a useful format. Information collected from these sources is processed using natural language processing and keyword clustering algorithms. Based on keywords, the contents of the sites are graded as positive, negative or normal. The status and quantity of these grades influence the final decision based on the weights assigned to these parameters. The final decision about the status of the loan is calculated by adding these social site parameters to the regular class decided by the facts presented to the bank.
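An illustrative sketch of this grading step is given below, using NLTK for tokenization; the keyword sets and input sentence are placeholders, not the actual lexicon used by the bank.

from nltk.tokenize import word_tokenize  # requires NLTK's 'punkt' data

POSITIVE = {"profit", "growth", "expansion", "award"}  # placeholder keywords
NEGATIVE = {"loss", "fraud", "default", "strike"}      # placeholder keywords

def grade(text):
    # Tokenize, then count positive and negative keyword hits.
    tokens = [t.lower() for t in word_tokenize(text)]
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "normal"

print(grade("Firm reports strong profit and growth this quarter"))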
C. Probable Methods of Data Analysis
Normal parameters are derived by the bank from the previous balance sheet. The analysis of the firm's balance sheet is done on the risk factors listed in Table I.
D. Credit Scoring
Credit scoring is defined as a statistical method used to predict the probability that a loan applicant will default or become delinquent. This helps determine the amount of credit that should be granted to a borrower. Credit scoring can also be defined as a systematic method for evaluating credit risk that provides a consistent analysis of the factors determined to cause or affect the level of risk. The objective of credit scoring is to help credit providers quantify and manage the financial risk involved in providing credit so that they can make better lending decisions quickly [4]. Credit scoring helps increase the consistency of the loan application process and allows automation of the lending process. Based on the consumer's credit scores, financial institutions are also able to determine the credit limits to be set for business firms. For the calculation of the credit score, the parameters discussed above are considered with different weights assigned by the bank. The pattern of repayment of previous or existing loans is also considered; this is called behavioral scoring.

TABLE I. TRADITIONAL RISK FACTOR ANALYSIS

SR. NO.  RISK FACTORS                      VARIABLES              DESCRIPTION
1        Leverage and Solvency Indicators  Capital                Funds raised by the firm/borrower
                                           Net worth              Capital + reserves
                                           Debt Equity ratio      The proportion between the firm's total debt and total equity
2        Liquidity Indicators              Current Liabilities    Creditors, loans to be repaid within one year, provisions for taxes and expenses
                                           Current Assets         Cash in hand and bank balances, inventory of the firm
                                           Current Ratio          Measures the proportion of a party's current assets to its current liabilities, and thus gives a measure of the short-term liquidity of the firm
3        Profitability Indicators          Sales                  Sale of goods by the firm
                                           Profit                 Total sales minus total expenses
                                           Net Profit             Profit after depreciation of building/machinery and furniture
                                           Profit to sales ratio  The percentage of profit to total sales for the year

The deductive credit scoring system awards points (weights) to particular relevant attributes of the credit parameters, and the weighted attribute values are aggregated into a total score. The relevant attributes and their weights are determined by the credit decision-makers based on their experience. A cut-off on the score is used by the bank for rejecting the loan or determining the risk associated with the credit sanctioned.
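A minimal sketch of this deductive scoring is shown below; the attribute names, weights and cut-off are illustrative assumptions, since the paper leaves them to the bank's policy.

WEIGHTS = {"current_ratio": 25, "debt_equity": 20,
           "profit_to_sales": 30, "repayment_history": 25}

def credit_score(points):
    # points maps each attribute to the fraction (0..1) of its weight earned.
    return sum(WEIGHTS[a] * p for a, p in points.items())

score = credit_score({"current_ratio": 0.8, "debt_equity": 0.6,
                      "profit_to_sales": 0.7, "repayment_history": 1.0})
print(score, "accept" if score >= 70 else "reject")  # assumed cut-off of 70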

IV. PROPOSED SYSTEM


The proposed system is based on present parameters and on parameters collected from the internet. Data mining techniques are used to find the credit score using the financial and internal parameters of the firm, while Big Data tools are used to capture information from social sites and websites. The websites used by the Big Data tools are government sites and company websites; changes in government rules or in business patterns that may affect the profitability of the firm are detected. Specific keywords based on the firm's business are selected to filter the information from social sites, and the collected information is classified as positive, negative or normal. The accumulated score is then normalised by the weights assigned by the bank. Both blocks give accumulated credit scores used to decide the firm's credit risk. The same system can be used for predicting credit default based on current parameters collected from both stages. The details are shown in Figure 1.

Figure 1. The architecture of Proposed System

The process flow diagram is shown in Figure 2. Basic steps of data mining are not shown in the figure, but data cleansing and attribute pruning are necessary for proper application of data mining for prediction. Different algorithms can be implemented here; we used the decision tree technique for the analysis of the regular parameters. Similarly, proper keywords suitable to the firm's business are selected so that the Big Data tools used for filtering data from social sites give a correct score. The Natural Language Toolkit is used for tokenization of the text collected from social sites; keywords are separated and different clustering techniques are applied to classify the information into positive, negative and normal. These classes are given different weights by the bank depending upon its policies and processes. The total credit score calculated from the two stages is presented to decide on the loan. Our system is capable of indicating the status of a loan even after it is sanctioned, as it continuously monitors the internet for information: any change in domain policies or company business news is picked up by the proposed system to predict the condition of the sanctioned loan in the future.

Figure 2. Flowchart of Proposed System

V. THE IMPLEMENTATION AND RESULT


The dataset collected from the nationalized bank is taken as input to the system. It contains 43 instances: 30 cases of sanctioned loan proposals and 13 cases of rejected loan proposals. Each case is characterized by 13 decision attributes, of which 10 are numerical and 3 categorical. From the given dataset, 14 instances are used as training samples and the remaining 29 as testing samples.
For the first step of data mining, the C4.5 decision tree classifier is used to classify loan proposals based on their credit score. In the second stage, social site information around the firm's name is collected using Hadoop MapReduce. The combined credit score is used to classify loan proposals as either good or bad credit. The results are compared with those of the existing system, as shown in Table II; the proposed system shows more correct results than the existing system.
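As a hedged sketch of the first stage, an entropy-based decision tree from scikit-learn can stand in for C4.5 (scikit-learn does not ship C4.5 itself) on the 14/29 split described above; the random feature matrix below merely stands in for the 13 bank attributes.

import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((43, 13))      # placeholder for the 13 decision attributes
y = rng.integers(0, 2, 43)    # placeholder labels: good (1) / bad (0) credit

X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=14, random_state=0)            # 14 training, 29 testing
tree = DecisionTreeClassifier(criterion="entropy")  # information gain, as in C4.5
tree.fit(X_train, y_train)
print(tree.score(X_test, y_test))  # accuracy on the 29 test samples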

TABLE II. COMPARISON OF RESULTS USING THE CONFUSION MATRIX

Existing System        Proposed System
   14    6                18    2
    6    3                 4    5
VI. CONCLUSIONS
Credit risk is a major risk parameter in the banking sector. Wrong calculation of the credit score leads to credit defaulters, and irregular repayment leads to poor health of the bank, which affects the interests of the stakeholders of the financial sector. For a good and healthy economy, it is essential to have a strong banking sector, and banks are expected to give loans after proper analysis of the firm's financial health. In this paper we have proposed a system which builds on the present automatic system using data mining techniques and adds social site information. The system takes informal and non-structured data into account for analysis; this data works as a sensor for detecting the health of the firm and its business sector. Using this system the bank can decide the credit limit, interest rate and repayment capacity of the firm. The implementation using decision trees and Hadoop shows good results in the primary work carried out by us; the correctness of the system can be judged by evaluating it further using different evaluation metrics. This system not only helps the bank during the sanctioning process of loans but also helps monitor the firm's capability to repay the loan during the loan period.

REFERENCES
[1] Kwaku D. Kessey, “Assessing credit risk management practices in the banking industry of Ghana: processes and challenges”, Global Journal of Management and Business Research, Vol. 15 Issue 6 Version 1.0, pp. 201-212, 2015.
[2] Hamid EslamiNosratabadi, SanazPourdarab and Ahmad Nadali, “A new approach for labeling the class of bank credit
customers via classification method in data mining”, International Journal of Information and Education Technology,
Vol. 1, No. 2, pp 151- 156, June 2011.
[3] Asrin KARIMI, “Evaluation of the Credit Risk with Statistical analysis”, International Journal of Academic Research in
Accounting, Finance and Management Sciences, Vol. 4, No.3, pp. 206–211, July 2014.
[4] Ms. A. R. Ghatge, Mr. P. P. Halkarnikar, “Estimation of credit risk for business firms of nationalized bank by neural
network approach”, International Journal of Electronics and Computer Science Engineering, Vol. 2, No. 3, pp. 828-
834, 2012.
[5] Maubi Andrew Mokaya and Dr. Ambrose Jagongo, “Corporate loan portfolio diversification and credit risk management among commercial banks in Kenya”, International Journal of Current Business and Social Sciences, Vol. 1, Issue 2, pp. 81-111, 2014.
[6] Sudhakar M and Dr. C. V. K Reddy, “Two step credit risk assessment model for retail bank loan applications using decision tree data mining technique”, International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), Vol. 5 Issue 3, pp. 705-718, March 2016.
[7] Sudhamathy G., “Credit Risk Analysis and Prediction Modelling of Bank Loans Using R”, International Journal of Engineering and Technology (IJET), Vol. 8 No. 5, pp. 1954-1966, Oct-Nov 2016.
[8] Naoyuki Yoshino and Farhad Taghizadeh-Hesary,“A comprehensive method for the credit risk assessment of small and
medium-sized enterprises based on asian data”, ADBI Working Paper Series, December 2018.
[9] Khaled Alzeaideen, “Credit risk management and business intelligence approach of the banking sector in Jordan”,
Cogent Business & Management, 6:1, 1675455, DOI: 10.1080/23311975.2019.1675455, 2019.
[10] Anchal Goya, Ranpreet Kaur, “Loan Prediction Using Ensemble Technique”, International Journal of Advanced
Research in Computer and Communication Engineering, Vol. 5, Issue 3, March 2016.
[11] Somayeh Moradi, Farimah Mokhatab Rafiei, “A dynamic credit risk assessment model with data mining techniques:
evidence from Iranian banks”, Financial Innovation, 2019.
[12] Salihu, Armend and Shehu, Visar, “A Review of Algorithms for Credit Risk Analysis”, Proceedings of the ENTRENOVA - ENTerprise REsearch InNOVAtion Conference, IRENET - Society for Advancing Innovation and Research in Economy, Zagreb, Vol. 6, pp. 134-146, 10-12 September 2020.
[13] Wenshuai Wu, “Credit Risk Measurement, Decision Analysis, Transformation and Upgrading for Financial Big Data”,
Hindawi Complexity, Vol. 2022, Article ID 8942773, https://doi.org/10.1155/2022/8942773, 2022.

Grenze International Journal of Engineering and Technology, June Issue

Blockchain Enabled Marksheets and Degree Certificates


Sharon Christa1 and Tanusha Mittal2
1-2
Graphic Era Deemed to be University, Dehradun, India
Email: (sharonchrista.cse,tanushamittal.cse)@geu.ac.in

Abstract—Certificates and grade reports are crucial records for anyone applying for a job or seeking higher education, because they act as identity verification. The traditional paper-based certificate method makes obtaining such a crucial document highly time-consuming and expensive. A digital certificate is a document issued by a trusted authority that may be used to demonstrate authenticity. Technology development has also made the practice of producing fraudulent report cards and diplomas possible, and document fraud and forgery have gone undetected due to a lack of anti-forgery mechanisms. E-documents use digital signatures to enable authentication, integrity and non-repudiation; however, forgery is possible when the key itself is compromised. In order to prevent certificate fraud and guarantee the security, legitimacy and secrecy of diplomas, blockchain technology is deployed. Blockchain ensures the correctness and trustworthiness of information and allows for quick authentication of degree certificates.

Index Terms— Blockchain, Digital Signatures, Digital Markcard, E-document.

I. INTRODUCTION
Universities provide certificates to graduates to demonstrate their qualifications once they successfully complete the chosen course. These marksheets and degree certificates are the crucial records needed to apply for jobs and further education, so validation and verification of documents have grown in importance. It is important to confirm that a graduate's diploma is authentic and that the holder is the rightful owner with the appropriate authorization [1]. Traditional paper certificates need a lot of time and money, and they are susceptible to fraud brought on by blunders and forgery. Paper mark sheets involve a long procedure, offer little flexibility, and are not environmentally friendly. Moreover, the forging of certificates has increased as a result of the availability of sophisticated and affordable technologies; both the credential bearer and the university that granted the certificate are put in danger as a result [6]. This study suggests a system that uses blockchain technology to digitalize both the production of degree certificates and their verification. The markcards are protected from fabrication and falsification thanks to the immutable nature of blockchain technology [11].
More than any other invention this century, blockchain technology will significantly influence the way we live in the future; anyone who cannot comprehend it will soon feel left behind in a technologically advanced world that increasingly resembles magic. The development of various techniques to verify academic records such as degree markcards has left people confused about which architecture is best for identifying real and forged markcards [12]. As blockchain has the very interesting property of immutability, this project uses a permissionless blockchain, Ethereum, as the platform to build a system that issues and validates degree certificates [16]. Traditional paper-based degrees and mark cards are susceptible to fraud due to typos and forgeries. Markcards

made of paper have a long processing time, little flexibility, high cost, and are not environmentally friendly. Students seeking jobs right after graduation, students going to institutions abroad to pursue higher education, and recruiters are all having difficulty validating marksheets, because bogus marksheets and degree credentials are widely available [13][14]. Therefore, the goal of this study is to suggest a system that makes certificate fraud impossible.
A. Objectives of Blockchain
1. Digitalizing mark cards and degree certificates using blockchain technology.
2. Making markcards easy to check and verify online: students can obtain their certificates online and submit them to the recruiter during the interview process, and the recruiter can validate the certificate using the digital mark cards platform.
3. Ensuring digitalized mark cards are free from forgery and falsification and are secured using blockchain technology.
4. Reducing time consumption and cost through digitalization with blockchain technology.
5. Providing confidentiality, integrity, non-repudiation and authentication for digitalized degrees and mark cards.

II. RELATED WORK


The research paper “Paper-based Document Authenticating using Digital Signature and QR Code” by M. Warasart and P. Kuacharoen gives a brief understanding of implementing paper-based document authentication. The integrity of the text message and the author of the document can be verified with the use of a digital signature and QR code. The model can be either automatic or semi-automatic [1]. When the OCR is not accurate and the user is required to visually compare the text message on the paper with the one obtained from the QR code, the model is said to be semi-automatic; however, this method does provide convenience for the user when dealing with a large number of documents [2].
The research paper “Using Blockchain as a tool for tracking and verification of official degrees: business model” by Oliver, Miquel; Moreno, Joan; Prieto, Gerson; Benitez, David gives brief knowledge of the verification of degree certificates from a business perspective [3]. It presents two financial models in which the price of the service is balanced between the employer and the graduate, as they are the main stakeholders of that service: students demand easy-to-check and low-cost proof of certification, while employers, when recruiting, demand quick and trustworthy verification of degrees [4][15].
The research paper “A Graduation Certificate Verification Model via Utilization of the Blockchain Technology” by Osman Ghazali and Omar S. Saleh provides theoretical knowledge of blockchain technology for issuing and verifying academic certificates [6]. The fundamental idea of using blockchain for issuing and verifying academic records closes the gaps and difficulties in existing systems. The paper covers hashes, public/private key cryptography, digital signatures, peer-to-peer networks and proof of work, and explains how these elements are used to formulate the block in two main processes: issuing a digitally signed academic certificate and verifying the academic certificate [5].
The research paper “CredenceLedger: A Permissioned Blockchain for Verifiable Academic Credentials” by R. Arenas and P. Fernadez gives knowledge of using a permissioned blockchain for verifying academic records [7]. It describes how a permissioned blockchain can be applied to a specific educational use case: decentralized verification of academic credentials. CredenceLedger is a system that stores compact data proofs of digital academic credentials in a blockchain ledger that are easily verifiable for education stakeholders and interested third-party organizations [8].
The research paper “A Permissioned Blockchain-Based System for Verification of Academic Records” by Ahmed Badr, Laura Rafferty, Quassy H. Mahmoud, Khalid Elgazzar and Patrick C.K. Hun gives us the idea of implementing our system for verification of degree markcards [9]. Its main focus is leveraging blockchain in the education domain by verifying academic records using Hyperledger Fabric; the various challenges in sending and receiving transcripts between universities, and the difficulties recruiters face in verifying academic records, are solved using Hyperledger Fabric [10].

III. DESIGN OF EXPERIMENT/ MATERIAL METHODS


A. Algorithm
Step 1: User logs in.

Step 2: Admin user logged in?
Yes, Go to Next Step.
No, Go to Step 9.
Step 3: Display dashboard, View Certificate and Issue Certificate for all the students list.
Step 4: Logout
Yes, end the process. Stop,
No, Go to next step.
Step 5: Click on view certificate.
Step 6: Certificate available?
Yes, Click on View Certificate Go to Next Step.
No, Click on Issue Certificate Go to Step 8.
Step 7: Get url from Certificate Table & Prompt Display window to download certificate. Go to Step 2.
Step 8: Update CertificateAvailable True in Database and call enroll(id,certificateHash) to update into
blockchain. Go to Step 2.
Step 9: Company user logged in?
Yes, Go to next step.
No, Go to Step 16.
Step 10: Display dashboard, Validate Certificate for students list.
Step 11: Logout
Yes, end the process. Stop,
No, Go to next step.
Step 12: Click on Validate Certificate, Upload the Certificate File & find md5sum of image.
Step 13: ImageHash == blockChainStoredHash
Yes, Go to Next Step.
No, Display Failed message Go to Step 9.
Step 14: Display message: Validation Success & Display Profile Page.
Step 15: Click on dashboard. Go to Step 9.
Step 16: Logged in as Student User.
Step 17: Logout
Yes, end the process. Stop,
No, Go to next step.
Step 18: Request for certificate
Step 19: Certificate Issued by University?
Yes, Display the profile page. Go to Next Step,
No, wait for the certificate to be issued.
Step 20: Logout
Yes, end the process. Stop,
No, Go to next step.
The modules are as follows:
1. User Interface Design: After the user logs in to the user interface, the first page visible is the dashboard. In the dashboard, the admin user can view certificates, issue certificates, and see the list of students and companies. A student user can request and view his certificate, whereas a company user can validate and view the certificate of a candidate. In general, the user interface has Student List, Company List, View Certificate, Issue Certificate, Request Certificate and Validate Certificate modules; these modules are visible based on the role of each user in the digital certificate system.
2. Verification: Verification has steps to verify data from database using database connector and
blockchain using web3 connector. Database connectors used to update Certificate availability in User
table and Certificate URL in the Certificate table. Web3 Connector used to add student details to block
chain and get student details from the blockchain. The purpose of the verification is to verify whether
the uploaded certificate is valid or not using the connectors.
3. Server: In project, firebase database is used for storing the data. As shown in figure, User table stores
student and user details like name, email id, usn number etc., each user is categorized using user role
column. User role 1 is set for admin and 2 & 3 is set for company & student respectively. In the project,
firebase database is used for storing the data. As shown in the figure, User ta student and user details
like name, email id, usn number etc., each user is categorized using user role column. User role 1 is set
for admin and 2 & 3 is set for company & student respectively. Certificate table stores the data URL

124
details. In project, firebase database is used for storing the data. As shown in the figure, User ta student
and user details like name, email id, usn number etc., each user is categorized using user role column.
User role 1 is set for admin and 2 & 3 is set for company & student respectively. Certificate table stores
the data url details for the image.
4. Blockchain: Ganache is personal blockchain for Ethereum development, which can be used to deploy
contracts, develop your applications and run tests. Records are stored in terms of blocks, each Ganache
is personal blockchain for Ethereum development, which can be used to deploy contracts, develop your
applications and run tests. Records are stored in terms of blocks, each of the records contains usn and
certificate hash.
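To illustrate the web3 connector described above, the sketch below (using the web3.py library against a local
Ganache instance) shows how an enroll(usn, certificateHash) call might be issued. The contract address and ABI
are placeholders, not the project's actual deployment, and the contract is assumed to expose an
enroll(string, string) function.

from web3 import Web3

# Connect to the local Ganache chain (default RPC endpoint).
w3 = Web3(Web3.HTTPProvider("http://127.0.0.1:8545"))

# Placeholder address/ABI for the assumed certificate contract.
CONTRACT_ADDRESS = "0x0000000000000000000000000000000000000000"
CONTRACT_ABI = [{
    "name": "enroll", "type": "function", "stateMutability": "nonpayable",
    "inputs": [{"name": "usn", "type": "string"},
               {"name": "certificateHash", "type": "string"}],
    "outputs": [],
}]
contract = w3.eth.contract(address=CONTRACT_ADDRESS, abi=CONTRACT_ABI)

def enroll(usn, certificate_hash):
    # Push the student's USN and certificate hash into the blockchain;
    # Ganache's first unlocked account pays for the transaction.
    tx_hash = contract.functions.enroll(usn, certificate_hash).transact(
        {"from": w3.eth.accounts[0]})
    return w3.eth.wait_for_transaction_receipt(tx_hash)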
As shown in Fig. 4.3 below, the data flow diagram for digital certification consists of three main entities: Admin,
Student/Company and Blockchain. When the admin clicks on the View Certificate button, the system first checks
in the User table whether the certificate-available flag is true; if it is true, the certificate URL is fetched from the
Certificate table and passed to the admin user for download. If no certificate is available for a particular student,
an Issue Certificate button is shown to the admin user instead. When the admin user clicks on Issue Certificate,
the system puts the image URL into the Certificate table, updates the certificate-available column in the User
table to true, and also pushes the details to the blockchain through the enroll method with the USN and
certificate hash. In the case of a company user, clicking on Validate compares the hash of the newly uploaded
certificate with the hash stored in the blockchain for that particular student. If the certificate is valid, the profile
information of the student is displayed. In the case of a student user, the student requests the certificate and waits
for it to be issued; once it is issued, he is able to view and download the certificate.
B. Sequence Diagram
When the user clicks on the login button, validate-user is called to validate the user's email and password. The
email and password are verified against the database; when the credentials match, the login succeeds and the
dashboard displays the list of students.

Fig. 1: Data Flow Diagram of Admin Fig. 2: Data Flow Diagram of Admin

Fig. 3: Login Page Fig 4: New User Registration

Fig. 5: Admin Profile Fig. 6: University/ Company Details

Fig. 7: Student Profile Fig. 8: Verification status Page after the validation of marksheet

Fig. 9: Verification Status Page for an altered marklist Fig. 10: Verification Data Flow

Fig. 11: User Validation Data Flow Fig. 12: University/ Company Certificate Validation Data Flow

The graphic below illustrates the four significant modules that make up the module flow diagram. These
modules are combined to fulfil the needs of the proposed project. The modules are:
1. User Interface Design
2. Server
3. Blockchain
4. Verification
The User Interface Design module includes the front-end web application that enables registration for students
and universities through simple form filling. Every time a user registers, the database of universities and students
is updated. A student may request a certificate only from the specific university they attend. If the user is a
legitimate student, the university computes the hash of the certificate and uploads the requested document to the
web server; the student can then download their diploma. Because only valid transactions are committed to the
blockchain, it stores the certificate's hash rather than the complete file. The certificate produced by the student
may later be hashed using the same process, and prospective employers and other universities can compare the
result to the hash stored in the blockchain for that specific certificate. The markscard is legitimate if the two
hashes are identical; otherwise, it is invalid.

IV. DEVELOPMENT AND TESTING OF THE PROPOSED SYSTEM


The project is stored in the certificate-validation folder. When the project executes, it leads to a sign-in page
with two fields, email id and password, which the user provides during registration (sign-up). A new user can
follow the link to the sign-up page, which collects the required information. By default, role 3 (student) is set,
whereas admin has role 1 and company has role 2.

Fig. 13: Architecture Design

Fig. 14: Modules in the proposed architecture

The admin user is directed to the dashboard comprising the student list, company list and a logout option. Under
the student list, the action column is
either empty, for a user who has just registered but not requested any certificate, or, if the user has requested a
certificate, contains an Issue Certificate button that is clicked to select that student's certificate from the college
database and upload it. If a certificate has already been issued to a student, only a View Certificate button is
visible. The admin can add new companies by navigating to the company list. A company user is directed to a
dashboard comprising the student list, where each row is either empty, for a student who does not yet have a
certificate, or contains a Validate Certificate button if the student has one. When a student user logs into the
system, he sees all the details given during registration along with a certificate row containing either a View
Certificate button, if the certificate has already been issued, or a Request Certificate button. When a student takes
his certificate to a company, the company can simply log into the system and click the Validate Certificate
button for that student, which redirects to a validation page where the certificate given by the student is uploaded
via the cloud. If the certificate is the same as the one issued by the admin/college, a certificate-valid message
pops up in the same window; otherwise a certificate-invalid message is shown, indicating that the certificate is
either forged or altered.

TABLE 1: TEST CASES

V. CONCLUSIONS
Any record stored in the blockchain repository cannot be changed, owing to the immutable nature of blockchain
technology; this offers security, integration, and authentication. Online access to digital degree certificates
reduces costs and saves time. The ability to receive a replica of the original markcards or certificates online, in
the event of loss or damage to the originals, offers a great deal of flexibility. Security is provided by the fact that
the papers kept in the blockchain repository cannot be changed or removed. The main beneficiaries of this
system are universities, students, and recruiters, because it simplifies the process of creating and verifying
certificates and marksheets. In conclusion, the suggested model prevents certificate fraud and falsification, and
employers may be confident that they will receive accurate information from the blockchain repository. Digital
certification also has huge scope in all sectors of education, such as medicine, engineering, pharmacy, law, etc.,
which can adopt this certification and save a lot of manual work. Digital certification is essential for online
courses: since it prevents forgery, online course certifications will be more easily accepted by organisations. For
example, a company-user creation option through the online portal helps directly, rather than the university
creating the account itself; the university then only has to accept an organisation's request and allow it to use
digital certification if it meets certain criteria. Using digital certificates, much more can be done. We may also
incorporate online tests, so that students receive their certification immediately after the test is approved. Digital
certificates have significant future scope: in the current situation, where social distancing has become a very
important aspect of life, degree certificates and markscards can be obtained online.

REFERENCES
[1] M. Warasart and P. Kuacharoen, “Paper-based Document Authenticating using Digital Signature and QR Code,” no.
Iccet, 2012.
[2] Z. Chen, “Anti-Counterfeit Authentication System of Printed Information Based on A Logic Signing Technique”.
[3] Oliver Miquel, Moreno Joan, Prieto Gerson, Benitez, David (2018): “Using Blockchain as a tool for tracking and
verification of official degrees: business model”, 29th European Regional Conference of the ITS.
[4] Juliana Nazare, Kim Hamilton Duffy, J. Philipp Schmidt, “Digital Certificate Project,” MIT Media Labs, 2015.
[5] Stephen Thompson, “The Preservation of Digital Signatures on the Blockchain,” University of British Columbia
iSchool Student Journal, vol. 3 (Spring 2017).
[6] Jayashri, N., Rampur, V., Gangodkar, D., Abirami, M., Balarengadurai, C., & Kumar, A. (2023). Improved block chain
system for high secured IoT integrated supply chain. Measurement: Sensors, 25, 100633.
[7] Osman Ghazali and Omar S. Saleh, “A Graduation Certificate Verification Model via Utilization of the Blockchain
Technology”, e-ISSN: 2289-8131 vol. 10 no. 3-2.
[8] X. Technologies, “Blockchain imperative for educational certificates,” Xanbell Technologies, 2017.
[9] MIT Media Lab Learning Initiative and Learning Machine, “Digital Certificates Projects.” [Online]. Available:
http://certificates.media.mid.edu/.
[10] R. Arenas and P. Fernadez, “CredenceLedger: A Permissioned Blockchain for Verifiable Academic Credentials,” in
IEEE International Conference on Engineering, Technology and Innovation (ICE/ITMC), Stuttgart, Germany, 2018.
[11] Ahmed Badr, Laura Rafferty, Quassy H. Mahmoud, Khalid Elgazzar, Patrick C.K. Hung “A Permissioned Blockchain-
Based System for Verification of Academic Records” in IEEE 2019.

[12] Sharma, I., & Sharma, S. (2022, November). Blockchain Enabled Biometric Security in Intemet-of-Medical-Things
(IoMT) Devices. In 2022 International Conference on Augmented Intelligence and Sustainable Systems (ICAISS) (pp.
971-979). IEEE.
[13] Neethu Gopal and Vani V Prakash, “Survey on Blockchain Based Digital Certificate System,” IRJET, vol. 5, issue 11,
Nov 2018.
[14] Nitin Kumavat, Swapnil Mengade, Dishant Desai, Jesal Varolia, “Certificate Verification System using Blockchain,”
IJRASET, vol. 7, issue IV, Apr 2019.
[15] Tyagi, S., Ansari, N., Bisht, D., Kumar, R., Memoria, M., Awasthi, M., ... & Gupta, A. (2022, May). Role of IOT and
Blockchain in Achieving a Vision of Metropolitan’s Digital Transformation. In 2022 International Conference on
Machine Learning, Big Data, Cloud and Parallel Computing (COM-IT-CON) (Vol. 1, pp. 752-757). IEEE.
[16] T. Keerthana, R. Tejaswini, V. Yamini, K. Hemapriya, “Integration of Digital Certificate Blockchain and Overall
Behavioural Analysis using QR and Smart Contract”, IJRESM vol. 2, Issue-3, March 2019.


Design of a Miniaturized Microstrip Antenna using Slots on the Radiating Patch for Wireless Applications
Susmita Bala1, Biplab Bag2, Sushanta Sarkar3 and ParthaPratim Sarkar4
1 Dept. of Electronics, Vidyasagar University, Midnapore, India
Email: [email protected]
2 Dept. of Electrical Engineering, Murshidabad Institute of Technology, India
3-4 DETS, University of Kalyani, Kalyani, India
Email: [email protected], [email protected], [email protected]

Abstract—This paper proposes a miniaturized microstrip antenna (MMA). The design consists
of a microstrip antenna having a radiating patch with modified U slots. A PTFE dielectric
substrate is used to design the antenna, which is energised through a co-axial probe feed.
HFSS software is used to model the design of the MMA. The initial patch antenna without
slots resonates at 6.65 GHz; after etching slots on the radiating patch, the MMA resonates at
3.85 GHz, so 66.46% miniaturization is achieved. The MMA provides a maximum gain of
about 5.54 dBi at 3.85 GHz. This design may be used for wireless applications.

Index Terms— Miniaturization, Reflection co-efficient, Gain, Radiation Pattern, Modified U-slots, PTFE.

I. INTRODUCTION
The design of miniaturized antennas has emerged as one of the most important issues for modern broadcasting
systems due to rapidly expanding wireless networks. Owing to favorable characteristics including compactness,
inexpensive manufacture, structural simplicity, and effective compatibility with small electronic apparatus,
microstrip antennas are the most suitable candidates for wireless applications [1-2]. For the purpose of shrinking
the antenna's size, slots of various sizes and shapes have been inserted into the radiating patch, the ground plane,
or both [3,4]. S. Islam et al. report a small antenna for RFID applications [5]; the antenna uses slots of various
sizes to accomplish 32% compactness. Employing slots in a multiband patch antenna reduced its size by 30% in
[6]. M. S. M. Ali et al. [7] demonstrate a small dual-band patch antenna with two mirror-image L-shaped slots,
two slits and a square slot, offering 41.2% compactness. A comparison study of a compact equilateral triangular
patch antenna with various slot shapes reports a best compactness of 43.47% in [8]. A small microstrip antenna
for mobile communication achieves 46.13% compactness by employing two irregular rectangular slots at the
patch's edge [9]. By utilizing open-end meandering slots in the ground plane, a rectangular microstrip antenna
reports 83% compactness [10]. Using an H-shaped slot on the radiating patch and a U and L slot combination on
the ground plane, a compactness of 86% is reported [11]. A circular patch antenna adds an open-ended slot to
the radiating patch to reach 86.5% compactness [12]. A miniaturization of 50% has been reached by using only
a defected ground structure in [13]. Koch fractal geometry was applied to a square patch to reach 45% size
reduction in [14]. In [15], miniaturization was achieved by altering not just the iteration but also the number of
segments on the patch boundary while keeping the iteration fixed. Complementary split-ring resonators were
used for size reduction of a patch in [16]. Mandelbrot fractal geometry reduces the area by 58.5% in [17]. The
insertion of slots in various directions was proposed in [18] for miniaturization. A 33% size reduction has been
achieved by using slits in a slot antenna in [19]. A slotted microstrip antenna providing 67% size reduction was
reported in [20]. A CPW-fed slot antenna achieves miniaturization using a spiral ring resonator in [21].
In this paper, a miniaturized microstrip antenna (MMA) has been designed using slots. A miniaturization of
66.46% has been achieved using only U slots on the radiating patch. This antenna may be useful in wireless
applications.

II. ANTENNA GEOMETRY


The layout of the proposed miniaturized antenna (MMA) is depicted in Figure 1. The MMA is modeled on a
PTFE substrate with dielectric constant (εr) = 2.5, height = 1.6 mm and loss tangent = 0.002. The antenna is
modeled in HFSS software and excited by a co-axial probe feed. The evolution of the MMA, shown in Figure 2,
consists of three steps: Ant_A, Ant_B and Ant_C. The dimensions of Ant_A are 13 mm and 17.75 mm; Ant_A
is given in Figure 2(x). An extended U-like slot is placed on Ant_A, and this modified patch is named Ant_B.
After this, another U-like slot is placed on Ant_B; this modification is named Ant_C (Figure 2(z)) and is
considered the MMA. The design parameters of the MMA are given in Table I. The small black box marks the
feed location of the MMA. The width of each section of the slots is fixed at 0.4 mm. The reflection co-efficient
at the different steps of the MMA is given in Figure 3, and the resonant frequency and its magnitude are given
in Table II.
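As a rough sanity check on these dimensions (not part of the authors' design procedure), the textbook
half-wavelength estimate f_r ≈ c/(2L√εr) [22] can be evaluated for both patch dimensions; the simulated
6.65 GHz lies between the two zeroth-order values, which is plausible once fringing and feed effects are
accounted for.

import math

c = 3.0e8      # speed of light (m/s)
er = 2.5       # PTFE dielectric constant
for L_mm in (17.75, 13.0):
    L = L_mm * 1e-3
    f = c / (2 * L * math.sqrt(er))   # f_r ≈ c / (2 L sqrt(er))
    print(f"L = {L_mm} mm -> f_r ≈ {f / 1e9:.2f} GHz")
# Prints ≈ 5.34 GHz and ≈ 7.30 GHz for the two dimensions.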

TABLE I: DESIGN PARAMETERS OF THE PROPOSED ANTENNA [MM]

a b c d e f
50 50 13 17.75 10.6 7.6
g h i j k l
10.6 7.6 8 6.6 1 1.4

Figure.1: The layouts of the MMA Figure.2(x-z): The evolution of the MMA (Ant_A to Ant_C)

TABLE II: COMPARISON INFORMATION TABLE REGARDING DIFFERENT STEPS OF THE PROPOSED ANTENNA
Steps | Resonant Frequency | Reflection co-efficient | Gain | Compactness (%)
Ant_A | 6.65 GHz | -26.23 dB | 6.64 dBi | Not applicable
Ant_B | 3.93 GHz | -7.9 dB | 5.45 dBi | Not applicable
Ant_C | 3.85 GHz | -17.99 dB | 5.54 dBi | 66.46%

III. RESULT AND DISCUSSION


This section presents the simulated results of the MMA. Figure 6 shows the reflection co-efficient (S11) of the
MMA, from which a resonant frequency of 3.85 GHz is obtained. The gain of the MMA is 5.54 dBi at the

Figure.3: Comparison reflection co-efficient versus frequency plot for ANT_A to ANT_C

Figure 4(a): Surface current distribution (J_surface) at 6.65 GHz of the basic patch antenna (Ant_A)
Figure 4(b): Surface current distribution (J_vector) at 6.65 GHz of the MMA
Figure 5(a): Surface current distribution (J_vector) at 3.85 GHz of the MMA
Figure 5(b): Surface current distribution (J_vector) at 6.65 GHz of the MMA

resonant frequency of 3.85 GHz. The plot of the gain is given in Figure 7. The radiation patterns (E and H
plane) of the MMA are given in Figure 8 and Figure 9. In this article, the initial rectangular patch antenna
(Ant_A) without slots resonates at 6.65 GHz; after placing slots on the patch (Ant_C), the frequency shifts from
6.65 GHz to 3.85 GHz.
Figure 4(a) shows the surface current distribution of Ant_A, and its vector current distribution is shown in
Figure 4(b). Figure 5(a) shows the surface current distribution of the proposed antenna, and its vector current
distribution is shown in Figure 5(b). Red indicates maximum current density. Figure 4(b) shows that the current
flows from left to right across the entire radiating patch, but in Figure 5(b) the current path becomes meandered
due to the presence of the slots. As the current path lengthens, the effective electrical length increases and the
resonant frequency decreases, so the frequency shifts from a higher to a lower range. This phenomenon is called
miniaturization or compactness. The compactness of the proposed antenna has been calculated using the
following equations (1-8).

Perimeter of Ant_A resonating at 6.65 GHz = 2 × (17.75 + 13) mm = 61.5 mm (1)
Perimeter of the proposed antenna to resonate at 3.85 GHz = (61.5 × 6.65)/3.85 = 106.22 mm (2)
Therefore, length + breadth of the proposed antenna = 106.22/2 = 53.11 mm (3)
Length of the proposed antenna, scaled from the initial antenna = (17.75/30.75) × 53.11 = 30.65 mm (4)
Breadth of the proposed antenna, scaled from the initial antenna = (13/30.75) × 53.11 = 22.45 mm (5)
Area of Ant_A = 17.75 × 13 mm² = 230.75 mm² (6)
Area of the proposed antenna = 30.65 × 22.45 mm² = 688.1 mm² (7)
Compactness = {(688.1 - 230.75)/688.1} × 100% = 66.46% (8)
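Equations (1)-(8) can be checked directly. The short Python snippet below follows the paper's own scaling
assumption (perimeter inversely proportional to resonant frequency); it prints ≈ 66.48%, which matches the
reported 66.46% up to the rounding of the intermediate values.

# Reproducing equations (1)-(8): scale Ant_A so it would resonate at
# 3.85 GHz instead of 6.65 GHz, then compare areas.
L0, W0 = 17.75, 13.0                          # Ant_A dimensions (mm)
p0 = 2 * (L0 + W0)                            # (1) perimeter = 61.5 mm
p1 = p0 * 6.65 / 3.85                         # (2) scaled perimeter ≈ 106.22 mm
half = p1 / 2                                 # (3) length + breadth ≈ 53.11 mm
L1 = (L0 / (L0 + W0)) * half                  # (4) ≈ 30.65 mm
W1 = (W0 / (L0 + W0)) * half                  # (5) ≈ 22.45 mm
area0 = L0 * W0                               # (6) 230.75 mm^2
area1 = L1 * W1                               # (7) ≈ 688.1 mm^2
compactness = (area1 - area0) / area1 * 100   # (8) ≈ 66.46 %
print(f"compactness ≈ {compactness:.2f} %")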

Figure.6: Reflection co-efficient plot of the MMA Figure.7: Gain plot of the MMA

IV. COMPARISON TABLE


In this section, a comparison of the proposed antenna with previously reported antennas is presented (Table III).
Comparing substrate area, miniaturization technique and percentage of miniaturization, it can be concluded that
the proposed MMA provides very good results.

Figure 8: The radiation pattern (E plane) of the proposed MMA Figure 9: The radiation pattern (H plane) of the proposed
MMA

V. CONCLUSION
This article presents a miniaturized microstrip antenna obtained solely by inserting slots on the radiating patch.
The HFSS simulation tool is used to model and simulate the proposed antenna. The design is very simple,
provides 66.46% miniaturization, and may be useful in wireless applications.

TABLE III: COMPARISON TABLE
Ref | Substrate area | Technique to achieve miniaturization | % of Miniaturization
[5] | 37.7×28.4 | Slots on the patch | 32%
[6] | 24×24 | Slots on the patch | 30%
[8] | 120×120 | Slots on the patch | 43.47%
[9] | 24×30 | Slots on the patch | 46.13%
[13] | 22×20 | DGS on the ground plane | 50%
[14] | 60×60 | Koch fractal geometry | 45%
[19] | 45×25 | Slots and slits | 33%
Proposed MMA | 50×50 | Slots on the patch | 66.46%

REFERENCES
[1] J. S. Kuo and K. L. Wong, “A compact microstrip antenna with meandering slots in the ground plane,” Microwave Opt.
Technol. Lett. vol 29, pp. 95-97, April 20, 2001.
[2] C. L. Tang, H. T. Chen, and K. L. Wong, “Small circular microstrip antenna with dual frequency operation,” Electron.
Lett. Vol 33, pp. 1112–1113, June 19, 1997.
[3] H. Malekpoor and S. Jam, “Design of a multi-band asymmetric patch antenna for wireless applications”, Microwave
Opt. Tech. Lett. vol. 55, pp. 730–734, April, 2013.
[4] U. Kiran, V. R. M, R. M. Yadahalli, P. V. Hunagund and S. F. Farida, “Microstrip-line-fed rectangular microstrip
antenna with open end meandering slots in the ground plane for compact broadband operation” Microwave Opt Technol
Lett. Vol 49, pp. 824 – 827, April, 2007.
[5] K. L. Wong and K. P. Yang, “Compact dual-frequency microstrip antenna with a pair of bent slots,” Electron. Lett. Vol
34, pp. 225–226, Feb. 5, 1998.
[6] S. Islam and M. Latrach, “Design construction and testing of a compact size patch antenna for RFID applications”
Microwave Opt. Tech. Lett. vol. 55, pp. 2920–2925, 2013.
[7] M. S. M. Ali, S. K. A. Rahim, M. I. Sabran, M. Abedian, A. Eteng and M. T. Islam, “Dual band miniaturized
microstrip slot antenna for WLAN applications,” Microwave Opt. Tech. Lett., vol. 58, pp. 1358–1362, June, 2016.
[8] S. Dasgupta, B. Gupta and H. Saha, “Compact equilateral triangular patch antenna with slot loading” vol. 56, pp. 268–
274, February, 2014.
[9] S. Chatterjee, U. Chakraborty, I. Sarkar, P. P. Sarkar, and S. K. Chowdhury, “A compact microstrip antenna for mobile
communication” India Conference (INDICON), Annual IEEE, 17-19 December, 2010, Kolkata, India.
[10] U. Kiran, V. R. M, R. M. Yadahalli, P. V. Hunagund and S. F. Farida, “Microstrip-line-fed rectangular microstrip
antenna with open end meandering slots in the ground plane for compact broadband operation” Microwave Opt Technol
Lett. Vol 49, pp. 824 – 827, April, 2007
[11] S. I. H. Shah, S. Bashir, A. Altaf, and S. D. H. Shah, “Compact multiband microstrip patch antenna using defected
ground structure (DGS)” XIXth International Seminar/Workshop on Direct and Inverse Problems of Electromagnetic
and Acoustic Wave Theory (DIPED), 22-25 September, 2014, Tbilisi, Georgia.
[12] K. Mondal, L. Murmu, and P. P. Sarkar, “Investigation on compactness, bandwidth and gain of circular microstrip patch
antenna” Devices for Integrated Circuit , 23-24 March, 2017, Kalyani, India.
[13] Hanae Elftouh, Naima A. Touhami, Mohamed Aghoutane, Safae El Amrani, Antonio Tazon and Mohamed Boussouis,
“Miniaturized Microstrip Patch Antenna with Defected Ground Structure” Progress In Electromagnetics Research C,
vol. 55, pp. 25–33, 2014.
[14] Il-Kwon Kim, Jong-Gwan Yook and Han-Kyu Park, “Fractal-shape small size microstrip patch antenna” Microwave and
Optical Technology Letters, vol. 34, no. 1, July 5 2002.
[15] Jeevani Jayasinghe, Omar Saraereh, Rajas Khokle and Karu Esselle, “Design and analysis of m-segment fractal
boundary antennas” vol.61, issue 9, pp. 2119-2125, 2019.
[16] Yang Cai, Zuping Qian, Wenquan Cao and Yingsong Zhang, “Research on the half complementary split-ring resonator
and its application for design” Microwave and Optical Technology Letters, vol. 57, no. 11, November 2015.
[17] D. R. Minervino, A. G. D'Assuncao and C. Peixeiro, “Mandelbrot fractal microstrip antennas,” Microwave and
Optical Technology Letters, vol. 58, no. 1, January 2016.
[18] Jai Mangal and L Abhinav Varma, “A Miniaturized rectangular slotted patch antenna for WiFi frequency range
applications” IEEE 2nd international conference on applied electromagnetic, signal processing and communication
(AESPC), Bhubaneswar, India, 26-28 November, 2021.
[19] Ziyang Li, Leilei Liu, Pinyan Li and Jian Wang, “Miniaturized design of CPW-Fed slot antennas using slits” 2017 Sixth
Asia-Pacific Conference on antenna and Propagation (APCAP), Xi'an, China, 16-19 October 2017.

[20] Sudipta Das, Parimal Chowdhury, Arindam Biswas, Partha Pratim Sarkar, and Santosh Kumar Chowdhury, “Analysis of
a Miniaturized Multiresonant Wideband Slotted Microstrip Antenna With Modified Ground Plane”, IEEE antennas and
wireless propagation letters, vol. 14, pp. 60-63, 2015
[21] Biswarup Rana, Soumen Banerjee, Priyasha Chatterjee, Ritam Banerjee, Rituparna Basak, “Design of a CPW-Fed Spiral
Ring-Loaded Miniaturized Slot Antenna” International Conference and Workshop on Computing and Communication
(IEMCON), Vancouver, BC, Canada, 15-17 October 2015.
[22] Balanis, C.A., “Antenna Theory: Analysis and Design”, John Wiley & Sons, Inc, 1997.


Cloud-Based Resource Distribution Using a Blockchain Approach
Radha T. Deoghare1 and Mrs. Sapana A. Kolambe2
1-2 Department of Information Technology, PCCOE, Assistant Professor
Email: [email protected], [email protected]

Abstract—It is difficult to allocate and track resources among several entities. This is
particularly true for complex and ever-changing systems, such as those seen in cloud
computing, software engineering, and the Internet of Things (IoT). Providing safe access
control is crucial to the success of such a system; in particular, the safe, adaptable, and
granular handover of privileges from one entity to another is required. Here, we introduce a
blockchain-based multi-organizational delegation system in which smart contracts on the
blockchain specify how the consortium's member organizations interact and how their shared
resources are divided up.

Index Terms— Resource Allocation, IoT, blockchain, Security.

I. INTRODUCTION
Cloud computing is a method of remote, scalable resource provisioning that uses utility-based computing
models. Parallel computing, grid computing, and distributed computing are all realized on the cloud [1]. Users
can access a shared pool of resources in the cloud and use them as needed using an "on-demand" model [2].
Users can use cloud services whenever and wherever they like thanks to the cloud's powerful computing
capabilities and massive storage capacity. IT assets such as databases, servers, communication devices,
networks, and software systems are housed in a cloud data center. As more customers use the cloud, more
servers and other hardware are needed to meet demand, and the creation of more physical nodes increases data
center power usage. Today, 2% of all electricity used in the world goes toward powering data centers; by 2030,
projections show this will reach 8%. Data centers have three major power consumers: servers, data center
networks, and cooling systems. The network uses 10%-25% of the energy, the cooling systems 15%-30%, and
the servers 40%-55% [3].
Computing resources such as RAM, CPU, network, and storage are provided by IaaS (Infrastructure as a
Service), and their use is typically governed by Service Level Agreements (SLAs). Resource usage also drives
energy requirements, and the inefficient use of resources is one cause of a data center's energy insufficiency [4].
Even at 10% CPU utilization, with the workload light, energy consumption can be more than 50% of the
maximum. This is where IaaS's virtualization techniques come into play, helping maximize the usefulness of any
given cloud's assets [5]. Due to the
shared resources made available by virtualization, VMs can take the place of PMs in processing user requests.
Separating virtual machines (VMs), moving VMs, and merging VMs are all examples of what may be done
using virtualization. VM migration is a method for moving active virtual machines from one physical host to
another. Consolidating virtual machines (VMs) that were previously spread across multiple hosts into a smaller
number of hosts saves power by either shutting down the freed hosts or placing them into hibernation [6].
Virtual Machine Placement (VMP) is the method by which VMs are assigned to hosts; a powerful VMP method
is required to improve energy productivity and maximize use of available resources [7]. As an optimization
problem, VMP is NP-hard [8].
In this study, we combine the power of the genetic algorithm (GA) with that of the random forest (RF) algorithm
to create a novel and effective hybrid VMP strategy. Our goal is to keep the load spread across a number of
physical computers while decreasing the data center's energy consumption. One of the most important factors in
determining how effective the proposed solution is, is how well it makes use of the hardware's available
resources. This study aims to reduce the execution time, average start time, and average finish time required by
the cloud, as well as the waiting time and request completion time. The proposed approach also aims to speed up
iterative metaheuristic algorithms such as GA, ACO and PSO by cutting down the time needed to identify the
best solution. The idea is to use the optimal solutions found by the metaheuristic to train a machine learning
model, which can then forecast a near-optimal solution in constant time, bypassing the need for evolutionary
processes to find the global best answer [9].
The genetic algorithm is a metaheuristic method used to locate a near-optimal answer. First, using the mapping
between virtual and physical machines as a training dataset, the GA creates an optimal schedule for resource
allocation. Next, the GA-created dataset is used to train the random forest algorithm, which then assigns virtual
machines to physical machines based on its classification. The RF's classification accuracy can be evaluated
against the GA-obtained data sample. A sketch of this GA-to-RF hand-off is given below.
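The hand-off can be sketched as follows. This is a minimal illustration using scikit-learn, with randomly
generated stand-ins for the GA's output (VM demand vectors mapped to the PM indices the GA chose); it is not
the authors' implementation.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Stand-in for the GA stage's output: one row of VM demands
# (cpu, ram, storage, bandwidth) mapped to the chosen PM index.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(500, 4))   # synthetic VM demand vectors
y = (X.sum(axis=1) * 3).astype(int)        # synthetic GA placements (PM ids)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)                   # learn the GA's placement policy

# The trained forest now predicts a placement in (near-)constant time,
# bypassing a fresh evolutionary search for each new VM request.
print("accuracy vs. held-out GA placements:", rf.score(X_test, y_test))
print("predicted PM for a new VM:", rf.predict([[0.2, 0.5, 0.1, 0.3]])[0])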
A. Problem Statement
In this article, we take a look at the current state of blockchain-based cloud security and analyze its many
benefits and drawbacks. The effectiveness of a smart contract can be measured in terms of its robustness,
security, stability, and practicality. The model's capacity to generalize has proven superior. However, this
paradigm needs to be implemented locally at each cloud data center. Attribute encryption using ciphertext has
improved retrieval efficiency and validated the integrity of the data at rest. But the proposed model has poorer
search efficiency. BIoTHR ensures the confidentiality of cloud data while providing advantageous pricing and
accessibility. The model does not make use of low-power Internet of Things gadgets. As far as usability, safety,
privacy, and reaction speed are concerned, EACMS ensures the best possible results in every category. When
compared to traditional medical systems, this concept has proven to be far more effective. Contrarily, it
necessitates a more capacious memory system. When it comes to delay and throughput, DBDH performs at its
best, and it also offers top-notch security. However, this paradigm is vulnerable to attacks that happen in real-
time. In terms of latency, throughput, and resources, the modified Merkle Tree data structure excels. Resource
consumption, latency, transaction response time, and throughput are all analyzed to guarantee the proposed
model performs well. However, as the system's user base grows, so does its latency. Safeguarding private health
information in less time than conventional methods, the timestamped algorithm is a significant improvement
over the alternatives. This methodology does not, however, process or provide privacy for small data fragments.
Smart contracts have been shown to increase data integrity and privacy by providing security and access control.
For smart contracts that rely heavily on locally stored data, another approach is necessary. Researchers might use
this analysis as support for proposing a novel blockchain-based approach for cloud data security.
B. Contributions
The growing popularity of blockchain technology offers a potential answer to the cloud computing resource
management issue. It ensures users' data security in the cloud computing environment [14] while also
cryptographically guaranteeing the irreversible and unforgeable features of the data. Additional identifying
attributes of the alliance chain members are shared by cloud service providers and cloud computing environment
customers.
In order to better integrate the blockchain system into the cloud computing network architecture, the
fundamental objective of this work is to propose a cloud computing resource contribution model based on an
alliance chain. A solution to the cloud computing resource management challenge is found in the application of
blockchain's incentive and disincentive mechanisms to encourage nodes to actively contribute to the pool of
available computing resources. By recording the resource-contribution behavior of cloud nodes and the degree of
satisfaction upon task completion in blockchain form, a tamper-resistant evaluation system is created, which can
address issues such as malicious negative reviews and review brushing (fake reviews) in real-world applications.

II. RESEARCH METHODOLOGY
We propose automatic software cloud resource allocation utilizing a permissioned blockchain: a policy-based,
autonomic middleware that enables self-adaptiveness for data management in clouds. The proposal combines
three highly sought-after elements: (i) software cloud resource allocation is monitored in real time, and the
collected and aggregated metrics (such as write latency, read latency, uptime, free memory, etc.) are secured on a
blockchain for optimal privacy and integrity; (ii) data management decisions are made based on which cloud
service is best suited to satisfy the service level agreements (SLAs), and data is transmitted securely; and (iii) the
cloud storage setup is automatically re-configured (based on simple, reusable, and extendable configuration
policies), meaning that a human operator is no longer needed to monitor and manually re-configure the cloud
storage setup's security. A toy illustration of hash-chaining the collected metrics is given below.
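A toy illustration of element (i) follows: aggregated metric records are chained by hash, so tampering with any
stored sample is detectable. This is only a sketch of the idea, not the middleware's actual storage layer, and the
metric names are invented for the example.

import hashlib, json, time

def make_block(metrics, prev_hash):
    # Each record embeds the hash of its predecessor, so altering any
    # earlier metrics record breaks every back-link that follows it.
    block = {"timestamp": time.time(), "metrics": metrics, "prev": prev_hash}
    block["hash"] = hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()).hexdigest()
    return block

chain = [make_block({"genesis": True}, "0" * 64)]
for sample in ({"write_latency_ms": 12, "uptime_s": 86400},
               {"read_latency_ms": 4, "free_memory_mb": 2048}):
    chain.append(make_block(sample, chain[-1]["hash"]))

# Integrity check: verify each block's back-link to its predecessor.
for prev, cur in zip(chain, chain[1:]):
    assert cur["prev"] == prev["hash"], "chain broken - metrics were altered"
print("metrics chain intact:", len(chain), "blocks")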

Figure 01: System model

III. WORKING
Each node in a distributed ledger maintains a chain of records called a "blockchain." S. Nakamoto proposed the
consensus mechanism for the Bitcoin network. Each block in the blockchain, with the exception of the initial
(genesis) block, includes the hash of the prior block, as illustrated in Figure 2. The former block is always
produced ahead of the latter, and each block contains transactions, which are logs of actions taken on the
blockchain, such as the transfer of assets. Figure 02 further elucidates the technique by which a blockchain is
created. As shown in the diagram, in step one a user at Node0 signs the transaction with his private key. The
digital signature improves security and data integrity, and the transaction can be tracked using the user's public
key. Afterward, Node0's immediate neighbors (Node1 and Node2) receive the transaction broadcast.
Node1 and Node2 ensure that the broadcast transaction follows the transaction protocol before broadcasting it to
Node3 and Node4. If the transaction does not follow the protocol, it is dropped.

Figure 02: A blockchain network

Each network should make clear to all participants, before the blockchain is even created, what protocol will be
used for transactions. The transaction protocol's primary goal is to maintain network order in the blockchain. A
toy sketch of the sign-and-verify step is given below.
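The sign-and-verify step from Figure 02 can be illustrated with the ecdsa Python package; the transaction
payload and node names below are invented for the example.

import hashlib
from ecdsa import SigningKey, SECP256k1

# Step one of Figure 02: the user at Node0 signs the transaction.
sk = SigningKey.generate(curve=SECP256k1)   # user's private key
vk = sk.get_verifying_key()                 # public key, shared with peers
tx = b"transfer asset-42 from Node0 to Node3"
signature = sk.sign(hashlib.sha256(tx).digest())

# Neighbours (Node1, Node2) check the signature before rebroadcasting;
# a forged or altered transaction would fail verification and be dropped.
assert vk.verify(signature, hashlib.sha256(tx).digest())
print("transaction signature valid - broadcast continues")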

IV. CONCLUSION
This paper first examines the privacy and security concerns surrounding edge-computing-enabled IoT, before
describing the features of blockchains that make them ideal for IoT applications. A common framework was
suggested for all Internet of Things (IoT) use cases that involve blockchain technology and edge computing, and
the entire lifecycle of a transaction was laid out in detail under the proposed framework. Additionally, the edge
computing resource allocation problem was addressed by developing a smart contract in a private blockchain
network that utilized the Asynchronous Advantage Actor-Critic reinforcement learning algorithm. In particular,
the efficiency of the suggested method improves on state-of-the-art edge computing resource allocation
techniques by catering to various service users and differentiating between their Quality of Service (QoS) needs;
this is an example of how AI and blockchains can work together. Simulation results were presented to
demonstrate the efficiency of the proposed resource allocation system for edge computing. Joint optimization of
blockchain settings and edge computing resource allocation is something we plan to investigate in future work.

REFERENCES
[1] A. Al-Fuqaha, M. Guizani, M. Mohammadi, M. Aledhari, and M. Ayyash, “Internet of things: A survey on enabling
technologies, protocols, and applications,” IEEE Communications Surveys & Tutorials, vol. 17, no. 4, pp. 2347–2376,
2015.
[2] M. M. Rathore, A. Ahmad, A. Paul, and S. Rho, “Urban planning and building smart cities based on the internet of things
using big data analytics,” Computer Networks, vol. 101, pp. 63–80, 2016.
[3] C. Qiu, X. Wang, H. Yao, J. Du, F. R. Yu, and S. Guo, “Networking integrated cloud-edge-end in IoT: A blockchain-
assisted collective Q-learning approach,” IEEE Internet of Things Journal, 2020.
[4] C. Qiu, F. R. Yu, H. Yao, C. Jiang, F. Xu, and C. Zhao, “Blockchain-based software-defined industrial internet of
things: A dueling deep Q-learning approach,” IEEE Internet of Things Journal, vol. 6, no. 3, pp. 4627–4639, 2018.
[5] P. Garcia Lopez, A. Montresor, D. Epema, A. Datta, T. Higashino, A. Iamnitchi, M. Barcellos, P. Felber, and E. Riviere,
“Edge-centric computing: Vision and challenges,” ACM SIGCOMM Computer Communication Review, vol. 45, no. 5,
pp. 37–42, 2015.
[6] J. Du, L. Zhao, J. Feng, and X. Chu, “Computation offloading and resource allocation in mixed fog/cloud computing
systems with min-max fairness guarantee,” IEEE Transactions on Communications, vol. 66, no. 4, pp. 1594–1608, 2018.
[7] S. Shen, Y. Han, X. Wang, and Y. Wang, “Computation offloading with multiple agents in edge-computing–supported
iot,” ACM Transactions on Sensor Networks (TOSN), vol. 16, no. 1, pp. 1–27, 2019.
[8] X. Wang, Y. Han, V. C. Leung, D. Niyato, X. Yan, and X. Chen, “Convergence of edge computing and deep learning: A
comprehensive survey,” IEEE Communications Surveys & Tutorials, vol. 22, no. 2, pp. 869–904, 2020.
[9] X. Wang, C. Wang, X. Li, V. C. Leung, and T. Taleb, “Federated deep reinforcement learning for internet of things with
decentralized cooperative edge caching,” IEEE Internet of Things Journal, 2020.


Detecting Human Emotion by Text Classification


U Brunda1, Palakuru Akhilesh2 and Dr. K. Kalaiselvi3
1-2 Final year, SRM Institute of Science and Technology, Kattankulathur-603203, Chennai, Tamil Nadu, India
3 Assistant Professor, SRM Institute of Science and Technology, Kattankulathur-603203, Chennai, Tamil Nadu, India
Email: [email protected], [email protected], [email protected]

Abstract— Nowadays, it is fairly usual to share moments on social media. By communicating
thoughts, ideas, and enjoyable experiences over text, we can express our feelings without
needing many words. To investigate people's opinions, sentiments, and emotions, businesses
may target YouTube as an abundant source of data. Emotion analysis often makes a deeper
comprehension of an author's feelings possible. Almost all projects evaluating Telugu social
media have focused on classifying expressions as positive, negative, or neutral. In this paper,
we categorize expressions into groups based on the emotions of happiness, anger, fear, disgust,
and sadness. Various approaches have been used for other languages to automatically
recognize textual emotions, but few of them were based on deep learning. We describe the
system we used to classify the feelings expressed in Telugu YouTube comments. For sentence
classification tasks, our model includes an XLM-RoBERTa and a Multilingual BERT that were
specifically trained on our dataset using pre-trained word vectors. We contrast the outcomes of
our method with those of other machine learning techniques. Our deep learning architecture is
an end-to-end network with word, phrase, and document vectorization procedures. The
proposed deep learning strategy was tested on the Telugu YouTube comments dataset, and the
results were promising compared to more traditional machine learning methods.

I. INTRODUCTION
As social media has become more popular, internet users can now voice their opinions on a wide range of
subjects. Social networking sites are increasingly being used for a variety of activities, such as the advertising of
products, the sharing of news, and the recognition of achievements.
Emotion analysis, often known as opinion mining, is the study of how to infer from textual data how individuals
feel about a particular thing, person, or organization.
Market analysis, e-commerce, social media monitoring, and many more areas are examples of contemporary
applications for emotion analysis. Telugu is the fifteenth most frequently spoken language in the world, with
more than 75 million native speakers. The creation of a technique for Telugu text emotion analysis will benefit
several people and organizations.
Everyday life brings us into contact with a variety of events, which leads to the formation of opinions regarding
those occurrences. A person's emotions are strong feelings they have in reaction to their circumstances or
interpersonal relationships. Emotion has a big impact on consumer decision-making in many different areas,
such as e-commerce, restaurants, movies, interests, and satisfaction with a service or a product. Additionally, it affects our
health! Users can now voice their opinions about a comment, picture, or event using Facebook's replies feature,
which has just undergone some changes. These reactions include angry, happy, love, and surprise.

Examples for emotion analysis are given below:
1. Nenu puttina sanvatsaralake ma nanna chanipoyadu.. Ela untadho kuda teledu ("My father died within a few
years of my birth... I do not even know what he looks like") – SAD
2. Story chandalanga undi ("The story is terrible") – ANGRY
3. Ee chitram 200 gross nu datavacchu ("This movie can cross a gross of 200") – TRUST
4. Cinema vfx chusi nenu ascharyapoya ("I was amazed when I saw the movie's VFX") – SURPRISE
5. Chala twist lu unnanduna nenu cinemani aatranga chudalanukuntunnanu ("Since there are many twists, I am
eager to watch the movie") – ANXIETY

In academic circles, emotional analysis is seen as a kind of higher, more developed version of sentiment
analysis. Sentiment analysis is used to classify texts (posts, words, or documents) as neutral, positive, or
negative. Emotional analysis, on the other hand, is a more extensive and in-depth investigation of user emotions
with the goal of examining the psychology of various user behaviors and illuminating deeper human emotional
meanings including anger, disgust, trust, grief, delight, and surprise.
English is well resourced in the field of emotion detection, with accessible datasets and dictionaries, in contrast
to Telugu, which has a dearth of resources.
In this study, we look into automatic emotion recognition for the Telugu language using Multilingual BERT in
four steps: word, sentence, and document vectorization, and classification. Showing the performance and
precision that deep learning has attained so far, we also compare this methodology to other machine learning
techniques. We applied our techniques to analyze user sentiment in the YouTube comments dataset.

II. RELATED WORK


A. Naila Aslam (2022) Sentiment Analysis and Emotion Detection on Cryptocurrency Related Tweets Using
Ensemble LSTM-GRU Model
The methodological approaches taken were Random Forest, Decision Trees, KNN, and SVM, and the metrics
employed to evaluate the output were Precision, Accuracy, F1 score, and Recall. One of the problems with the
model is that balancing the dataset through random under-sampling diminishes performance, since there is less
training data.
B. A. Majeed (2022) Emotion Detection in Roman Urdu Text Using Machine Learning
This research develops detection of human emotions in Roman Urdu sentences on a dataset of a specific size,
mapped to six different classes of emotions. Methods such as KNN, Decision Tree, SVM, and Random Forest
were used. The final result shows KNN as the best model, with a better F-measure score than the other
approaches.
C. T. Balomenos (2022) Emotion Analysis in Man-Machine Interaction Systems
This paper extracts emotions from related image sequences using an advanced intelligent rule-based system. It
helps the MMI deal with specific emotion states such as frustration and anger.
D. Abdullah (2022) Multimodal Emotion Recognition using Deep Learning
This paper reviews emotion recognition from multimodal signals, which offer higher accuracy than unimodal
solutions. This improves the understanding of physiological signals and emotional awareness.
E. Omkar Gokhale, Shantanu Patankar, Onkar Litake, Aditya Mandke, Dipali Kadam (2022) Emotion analysis
in Tamil
This is an overview of the shared task on emotion analysis in Tamil. The task is split into two parts: in one,
social media Tamil comments are annotated with the best-suited emotions, and in the other, fine-grained
emotions are annotated for the social media comments in Tamil. The metrics used for evaluating models are
Precision, Recall and Micro average.

F. Linjian Li (2021) A Novel Emotion Lexicon for Chinese Emotional Expression Analysis on Weibo: Using
Grounded Theory and Semi-Automatic Methods
The methodology used ALO and SC-LIWC, and the metrics used to evaluate the output were Precision, Recall,
and F1. The downsides of the model are that only users from China's Weibo were surveyed for the dataset, and
that the lexicon did not include the strength of the relationship between each word and its corresponding
emotion category.
G. Chang Liu, Taiao Liu, Shuojue Yang, and Yajun Du (2021) Individual Emoticon Recognition Approach
Combined Gated Recurrent Unit with Emotion Distribution Model
This paper proposes a model called semantic emoticon emotion recognition (SEER). First, the input text is
divided into four categories using an emotion dictionary and emoticons. Second, it is processed by a
bidirectional gated recurrent unit (Bi-GRU) network with an emotion-vector-capturing attention mechanism.
Third, an emoticon distribution model is constructed to obtain emotion vectors from various social network
data. Fourth, the emoticon emotion characteristics in the text are combined with the text's semantic emotional
components using various fusion weights, based on the type of input short message. Finally, depending on the
resulting emotion vector, the short text emotions are divided into six categories.
H. Bharathi Raja Chakravarthi (2021) Dataset for identification of homophobia and transphobia in multilingual
YouTube comments
This paper describes the process of building the dataset, a qualitative analysis of the data, and inter-annotator
agreement. In addition, baseline models are created for the dataset.
I. Ferdous Ahmed (2020) Emotion Recognition from Body Movement
The methodologies applied were SVM, LDA, GNV, DT, and KNN, and the metrics used to evaluate the output
were F-score, p-score, and Accuracy. A limitation of the model is a marginal drop in performance across the
board in action-independent cases.
J. Zishan Ahmad, Raghav Jindal, Asif Ekbal and Pushpak Bhattacharyya (2020) Borrow from rich cousin:
transfer learning for emotion detection using cross-lingual embedding. Expert Systems with Applications
This paper maps emotions in disaster-domain sentences in Hindi, for which a dataset was created. The models
used are CNN and Bi-LSTM (Bi-Directional Long Short-Term Memory). For Hindi emotion categorization, the
neural networks are trained on the available datasets, and the weights are then adjusted using one of four transfer
learning techniques.
K. Zhenzhong Lan, Mingda Chen, Piyush Sharma, and Radu Soricut (2019) ALBERT: A Lite BERT for
Self-supervised learning of language representations
For BERT to use less memory and train more quickly, two parameter-reduction strategies are provided. Detailed
empirical data demonstrates that the suggested methods produce models that scale far better than the original
BERT. A self-supervised loss that emphasises modelling inter-sentence coherence is also employed, and it
consistently helps tasks that require multi-sentence inputs.
L. Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary (2019) Unsupervised
cross-lingual representation learning at scale
Using more than two terabytes of filtered CommonCrawl data, a Transformer-based masked language model is
trained on 100 different languages. On a number of cross-lingual benchmarks, including +14.6% average
accuracy on XNLI, +13% average F1 score on MLQA, and +2.4% F1 score on NER, the model, called XLM-R,
greatly surpasses multilingual BERT (mBERT). XLM-R excels on low-resource languages, with XNLI accuracy
increasing by 15.7% for Swahili and 11.4% for Urdu over earlier XLM models. The trade-offs between (1)
positive transfer and capacity dilution and (2) the performance of high- and low-resource languages at scale are
among the important aspects that must be considered in order to accomplish these advantages, and a thorough
empirical study of these factors is also given.
M. Stephen Merity, Nitish Shirish Keskar, and Richard Socher (2019) An analysis of neural language modeling
at multiple scales
A model architecture and training method are provided that, when applied to the WikiText-103 data set, achieve
state-of-the-art performance while being roughly twice as fast as an NVIDIA cuDNN LSTM-based model, using
the Quasi-Recurrent Neural Network (QRNN), longer sequences within batches, and softmax with weight tying.
N. Jeremy Howard and Sebastian Ruder (2019) Universal language model fine-tuning for text classification
Strategies essential for fine-tuning a language model are described, and Universal Language Model Fine-tuning
(ULMFiT) is proposed, a powerful transfer learning method that may be used for any NLP application. On six
text classification tasks, the approach greatly exceeds the state of the art, lowering the error on most datasets by
18–24%. Furthermore, with only 100 labeled instances, it matches the performance of training from scratch on
100x more data. The pre-trained models and code are publicly available.
O. Vinay Kumar Jain, Shishir Kumar, and Steven Lawrence Fernandes (2019) Extraction of emotions from
multilingual text using intelligent text processing and computational linguistics
Every emotion word in a tweet is significant for decision-making, hence an efficient pre-processing technique is
utilized to maintain the significance of multilingual emotional words. The Naive Bayes algorithm and Support
Vector Machine (SVM) are used to classify tweet sentiments in fine detail.

III. PROPOSED METHODOLOGY


A model was pre-trained with a masked language modeling (MLM) objective on the top 104 languages with the
largest Wikipedias, which made maximum accuracy possible and helped ensure trustworthy outcomes. Because
the model is case-sensitive, it can distinguish between distinct varieties of English.
The Hugging Face team wrote the model card for this model, since the team that released BERT did not provide
one.
The XLM-RoBERTa model is a transformers model that has been pre-trained on a large multilingual corpus in
an unsupervised fashion, i.e., exclusively on raw texts with no human labeling of any kind (which explains why
it can use a significant amount of publicly available data), with an automatic procedure generating inputs and
labels from those texts. More precisely, it was pre-trained with the following objectives:
Masked language modeling (MLM): taking a sentence as input, the model randomly masks 15% of the words,
runs the entire masked sentence through the network, and has to predict the masked words. This differs from
traditional recurrent neural networks (RNNs), which usually see the words one after another, and from
autoregressive models like GPT, which internally mask the future tokens; MLM allows the model to look at the
sentence in both directions at once, so it obtains a bidirectional representation of the text.
While the model is being trained, the Next Sentence Prediction (NSP) method combines two masked sentences
into a single input. This improves the model's capacity for learning. As a result, the learning efficiency of the
model is increased. This is carried out in order to enable the model to produce predictions that are more accurate.

Sometimes the two sentences correspond to sentences that were next to each other in the original text, and sometimes they do not; the model then has to predict whether or not the two sentences followed one another in the text. In this way the model learns an internal representation of the languages in the training set, from which features useful for downstream tasks can be extracted: if you have a dataset of labeled sentences, for instance, you can train a standard classifier using the features the model produces as inputs. A sketch of this feature-extraction idea follows.
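As a minimal sketch of this idea (assuming a handful of labelled sentences; the sentences and labels below are hypothetical placeholders), pooled hidden states from the pre-trained encoder can feed a standard scikit-learn classifier:

import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
encoder = AutoModel.from_pretrained("xlm-roberta-base")

sentences = ["I am very happy today", "This is disgusting"]  # placeholders
labels = [0, 1]  # hypothetical emotion labels

with torch.no_grad():
    batch = tokenizer(sentences, padding=True, return_tensors="pt")
    # Mean-pool the final hidden states into one feature vector per sentence.
    features = encoder(**batch).last_hidden_state.mean(dim=1).numpy()

clf = LogisticRegression().fit(features, labels)
print(clf.predict(features))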
A. Algorithm: XLM-RoBERTa
1. Import XLMRobertaTokenizer and XLMRobertaForSequenceClassification from transformers.
2. The model is named xlm-roberta-base.
3. The tokenizer is loaded as XLMRobertaTokenizer.from_pretrained(MODEL_TYPE).
4. The module is downloaded completely (100 percent).
5. Check the size of the vocabulary.
6. Verify whether the special tokens are present.
7. Model inputs are given as
input_ids (type: torch tensor)
attention_mask (type: torch tensor)
labels (type: torch tensor)
8. The first input is 'input_ids'. These represent the sentences as sequences of tokens.
9. The second is 'token_type_ids'.
10. The third is 'attention_mask'. It has the same length as 'input_ids' and tells the model which tokens in 'input_ids' are real and which are padding.
11. A '1' indicates a token or a special word, and a '0' indicates padding.
12. The third input also consists of 'labels'.
13. A tokenizer is used to create XLM-RoBERTa input for both one and two input sentences.
14. The sequence of tokens is decoded.
15. The truncated tokens are returned in a list called overflowing_tokens.
16. Data is loaded.
17. Folds are created as required for training and testing.
18. The required folds are displayed.
A minimal code sketch of steps 1 to 7 is given after this list.
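The sketch below illustrates steps 1 to 7 under stated assumptions: the placeholder comment, the max_length of 32, and the ordinal label are ours, and the five labels follow the dataset description in Section IV.

import torch
from transformers import XLMRobertaTokenizer, XLMRobertaForSequenceClassification

MODEL_TYPE = "xlm-roberta-base"                                   # step 2
tokenizer = XLMRobertaTokenizer.from_pretrained(MODEL_TYPE)       # steps 3-4
model = XLMRobertaForSequenceClassification.from_pretrained(
    MODEL_TYPE, num_labels=5)

print("vocab size:", tokenizer.vocab_size)                        # step 5
print("special tokens:", tokenizer.all_special_tokens)            # step 6

# Step 7: model inputs as torch tensors (a Telugu comment would go
# here; this placeholder is only illustrative).
encoding = tokenizer("example comment", return_tensors="pt",
                     padding="max_length", truncation=True, max_length=32)
labels = torch.tensor([2])                    # hypothetical intensity label
outputs = model(input_ids=encoding["input_ids"],
                attention_mask=encoding["attention_mask"],
                labels=labels)
print("training loss:", outputs.loss.item())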

IV. IMPLEMENTATION
In this section we outline the data that was used, as well as our methodology, for recognising emotions in Telugu YouTube videos using the deep learning approach XLM-RoBERTa.
The three modules of the project for implementation are:
a. Dataset creation
b. Training dataset
c. Testing dataset
A. Dataset Creation
A dataset of Telugu YouTube comments was created and used to train the model on an ordinal classification task based on the intensity of feeling: given a comment together with an emotion E, the comment must be classified into one of five ordinal classes of intensity for E. The dataset contains comments for each of the emotions rage, fear, joy, disgust, and sadness, for a total of one thousand comments.
The dataset was split into two sets: 500 comments made up the training set and 100 comments the testing set. The test dataset was used only to evaluate the trained model and give an indication of how well it performs. The training set was used to train the classifier and to optimise its parameters; the model was never given access to the test dataset.

Fig 1: Block diagram

Fig 2: Created dataset

B. Training Dataset
Data pre-processing is done first. Because our dataset was in Telugu, we had to perform some specialised pre-processing in order to train the dataset effectively. The steps we followed were:
i. Standardise the writing of characters that can be written in several different ways into a normal form.
ii. Remove all diacritics.
iii. Remove all punctuation marks.
iv. Remove repeated characters: when describing an action such as laughing ("Hahaha") or amazement ("Wow", "Oh no"), YouTube users frequently repeat a character to emphasise and accentuate their meaning. We assumed that a word can contain at most two consecutive instances of a repeating character and removed all further occurrences.
In addition, we have the option of including a step that removes stop words, such as prepositions and conjunctions, from the input text. A sketch of these pre-processing steps follows.
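The following sketch shows one possible implementation of steps ii to iv; the exact normalization rules for Telugu script (where blindly stripping combining marks would also strip vowel signs) would need more care, so this only illustrates the pattern.

import re
import string
import unicodedata

def preprocess(text):
    # ii. Remove diacritics: drop combining marks after NFD decomposition.
    #     (For Telugu this rule must be restricted so vowel signs survive.)
    text = "".join(ch for ch in unicodedata.normalize("NFD", text)
                   if unicodedata.category(ch) != "Mn")
    # iii. Remove all punctuation marks.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # iv. Cap any run of a repeated character at two occurrences.
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    return text.strip()

print(preprocess("Soooo good!!! Hahahaa"))   # -> "Soo good Hahahaa"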
C. Testing the Dataset
The dataset is divided into two parts: 80 percent of the data is used for training and 20 percent for testing. The trained model is tested with different algorithms, namely XLM-RoBERTa and Multilingual BERT. To test the dataset, the Python libraries necessary for executing the Colab code and loading the dataset as a pandas data frame are imported. We used seaborn's countplot to count the various emotions. The task is to find the machine learning algorithm with the best accuracy. A sketch of this set-up is shown below.
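In the sketch that follows, the file name telugu_comments.csv and the column name emotion are hypothetical placeholders for our dataset.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split

df = pd.read_csv("telugu_comments.csv")      # hypothetical file name

sns.countplot(x="emotion", data=df)          # count of mapped emotions
plt.show()

# 80/20 split, stratified so every emotion appears in both parts.
train_df, test_df = train_test_split(df, test_size=0.20, random_state=42,
                                     stratify=df["emotion"])
print(len(train_df), "training comments,", len(test_df), "testing comments")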

V. RESULTS & DISCUSSION
The model is trained using the XLM-RoBERTa algorithm and the Multilingual BERT algorithm with around 600 Telugu sentences mapped to the emotions happy, neutral, disgust, and anxiety. The model achieves an accuracy of 77 percent with XLM-RoBERTa and 53 percent with Multilingual BERT.

Fig 3: Graphical representation of the count of the mapped dataset

Fig 4: Accuracy of 77 percent using the XLM-RoBERTa algorithm

Fig 5: Model with 53 percent accuracy for the Multilingual BERT algorithm

VI. CONCLUSION
This study attempted to classify comments made on social media. We applied the XLM-RoBERTa and Multilingual BERT strategies. With a macro-averaged F1 score of 0.77, against 0.53 for Multilingual BERT, the XLM-RoBERTa method outscored all other models. Overall, the models identify emotions like anxiety, happiness, neutrality, and disgust well, but they are far less accurate in classifying more complex emotions like fear, rage, and melancholy. To enhance the performance of the models, alternative strategies, such as genetic-algorithm-based ensembling, can be tested in the future.

REFERENCES
[1] Emotion Analysis in Man-Machine Interaction Systems – T.Balomenos 2022
https://link.springer.com/chapter/10.1007/978-3-540-30568-2_27
[2] Multimodal emotion recognition using deep learning SMSA -Abdullah 2022
https://scholar.google.com/scholar?cluster=11062434886599925582&hl=en&as_sdt=0,5
[3] Emotion Analysis in Tamil - Omkar Gokhale, Shantanu Patanka, Onkar Litake, Aditya Mandke, Dipali Kadam 2022
https://scholar.google.com/scholar?cluster=11062434886599925582&hl=en&as_sdt=0,5
[4] Emotion detection in roman Urdu text using machine learning - A Majeed 2022
https://dl.acm.org/doi/abs/10.1145/3417113.3423375?casa_token=fS0ijtmLfAIAAAAA:a0hgzLiIfWlAYnp3E5x5fvxZ9
TAX GgYBBZ_XmDBI0xiY0NU1nfJvK5xXkwfMGPTguPBNBwrfb4GmjBQ
[5] Sentiment Analysis and Emotion Detection on Cryptocurrency Related Tweets Using Ensemble LSTM-GRU Model -
NAILA ASLAM, FURQAN RUSTAM, ERNESTO LEE, PATRICK BERNARD WASHINGTON, AND IMRAN
ASHRAF 2022 https://ieeexplore.ieee.org/abstract/document/9751065
[6] A Novel Emotion Lexicon for Chinese Emotional Expression Analysis on Weibo: Using Grounded Theory and Semi-
Automatic Methods - LIANG XU, LINJIAN LI, ZEHUA JIANG, ZAOYI SUN, XIN WEN, JIAMING SHI, RUI SUN,
AND XIUYING QIAN 2021 https://ieeexplore.ieee.org/abstract/document/9139939
[7] Emotion Recognition from Body Movement - FERDOUS AHMED, A. S. M. HOSSAIN BARI, AND MARINA L.
GAVRILOVA 2020 https://ieeexplore.ieee.org/abstract/document/8945309
[8] Individual Emotion Recognition Approach Combined Gated Recurrent Unit with Emoticon Distribution Model -
CHANG LIU,TAIAO LIU, SHUOJUE YANG,AND YAJUN DU 2021
https://ieeexplore.ieee.org/abstract/document/9597507
[9] Borrow from rich cousin: transfer learning for emotion detection using cross lingual embedding. Expert Systems with
Applications - Zishan Ahmad, Raghav Jindal, Asif Ekbal, and Pushpak Bhattachharyya. 2020.
https://www.sciencedirect.com/science/article/pii/S0957417419305536?casa_token=kEwcg0YM4DIAAAAA:Gxkl7Wj
7_hbtk7vdEPDEwzd7eqgnW_-4xRCl5c8PxV0GRulYhpHcieOkW895-482sC5rtYWEyiOO
[10] Dataset for identification of homophobia and transophobia in Multilingual youtube comments. - Bharathi Raja
Chakravarthi, Ruba Priyadharshini, Rahul Ponnusamy, Prasanna Kumar Kumaresan, Kayalvizhi Sampath, Durairaj
Thenmozhi, Sathiyaraj Thangasamy, Rajendran Nallathambi, and John Phillip McCrae 2021
https://arxiv.org/abs/2109.00227
[11] ALBERT: A Lite BERT for Self-supervised learning of language representations. - Zhenzhong Lan, Mingda Chen,
Sebastian Goodman, Kevin Gimpel, Piyush Sharma, and Radu Soricut 2020
/group?id=ML_Reproducibility_Challenge/2020
[12] Unsupervised cross-lingual representation learning at scale. Alexis Conneau, Kartikay Khandelwal, Naman Goyal,
Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, and Veselin
Stoyanov. 2019. https://arxiv.org/abs/1911.02116
[13] An analysis of neural language modeling at multiple scales. - Stephen Merity, Nitish Shirish Keskar, and Richard
Socher. 2019. https://arxiv.org/abs/1803.08240
[14] Universal language model fine-tuning for text classification. - Jeremy Howard and Sebastian Ruder. 2019.
https://arxiv.org/abs/1801.06146
[15] Extraction of emotions from multilingual text using intelligent text processing and computational linguistics. - Vinay
Kumar Jain, Shishir Kumar, and Steven Lawrence Fernandes. 2019.
https://www.sciencedirect.com/science/article/pii/S1877750317301035?casa_token=xGYROGWQ9aIAAAAA:O-
oKcOApy1QWNbdRDt6pGb-XK8bus-y1sWsk2SCdQGmDYxUs6ycXsePgqUht0qlwpwtDgwxaUzJZ


Effects of Integration of Electric Vehicle Charging Stations into the Grid
Deepti Jagyasi and Prof. Ramchandra Adware
Department of Electrical Engineering, GH Raisoni College of Engineering, Nagpur, India
Email: [email protected], [email protected]

Abstract— One of the best ways to address urgent sustainability issues like global warming, the depletion of fossil fuel reserves, and greenhouse gas emissions is to use electric vehicles (EVs). By reducing environmental damage and lowering the emissions that contribute to climate change, incorporating electric vehicles into the distribution system will benefit public health. A microgrid is a small power network that coordinates groups of loads, distribution generators, powerful software solutions, and other grid-connected devices so that they function as a single controlled entity. Finally, the study describes various ways to mitigate the effects of electric vehicles on distribution system power quality and to eliminate harmonics.

Index Terms— Integration of electric vehicles, micro grid, power distribution systems,
ecological damage, harmonics.

I. INTRODUCTION
To ensure a sustainable future, issues like global warming, the depletion of fossil fuel reserves, and emissions of greenhouse gases (GHGs) require immediate attention. The primary driver of rising fossil fuel consumption and greenhouse gas emissions is the rapid rise in global energy consumption. As a consequence of these issues, the renewable energy sector has carried out substantial research into where and how to replace traditional fossil fuels and lessen environmental problems. The electrification of the transportation sector is seen as a promising solution because transport is one of the largest contributors to rising pollutant emissions. The microgrid (MG) is regarded as the smartest option for optimal operation because of the power grid's rapid expansion and the intricate structure of distributed energy sources over long distances. For a long time, electric vehicles were put on hold and restricted to golf carts and delivery trucks because of the ready availability of fossil fuels, advancements in combustion technology, and the ease with which internal combustion engines can be used. Even now, EV penetration remains low due to concerns about cruising range, deteriorating batteries, a lack of charging infrastructure, and high initial costs. Although electric vehicles have a significant impact on the reduction of air pollution, they may harm the quality of the grid's power supply. Electric vehicles use rechargeable batteries to power their electric motors and store energy. EV battery chargers employ power electronic devices to transform the grid's AC voltage into DC voltage, and their non-linearity has an impact on power quality: nonlinear loads have a detrimental effect on it. The microgrid has become a key research area in smart grid structure and operation; it mainly integrates different environmentally friendly power sources using innovative advances such as power-electronics-based technology. EVs can have a negative impact on energy systems, particularly power quality, but they can also significantly reduce CO2

emissions and reliance on hydrocarbon-based generators. By reducing the emissions that contribute to climate change and environmental damage, research into incorporating electric vehicles into the distribution system will improve public health. EVs may cause problems including power factor deterioration, voltage imbalance, voltage variations, and harmonics or interharmonics in distribution systems. Nonlinear loads like electric vehicle battery chargers distort the harmonic frequency content, and charging many electric vehicles at once results in harmonic distortion and poor power quality. Power transformer performance is harmed by harmonics as power losses rise and output power decreases. Power cables, capacitors, relays, and switch-based power electronic devices can all be affected by harmonic distortion. During the charging process, EV battery charging stations produce a great deal of harmonic distortion. In order to resolve these issues in the power distribution system, voltage and current harmonics must therefore be thoroughly investigated.
A. EV Chargers' Harmonics
The harmonics produced when multiple cars charge simultaneously are not simply the sum of the harmonics produced by a single charger. The energy storage system of an EV charging station is made up of three main parts: the battery, the software, and the power conversion system. Some EV charger converters differ in that they generate harmonic voltages rather than harmonic currents; because harmonic voltages do not have the negative effect that harmonic currents do, they are not included in the analysis [1]. The charging current and the charging voltage are virtually unrelated. The charger has three stages, with chargers connected to end nodes, front nodes, and nodes that change at random. Taking these charger levels to represent 10 percent, 20 percent, and 30 percent of the total load on the feeder, the charger is connected to the feeder's end node for the front-node case and to the source power supply for the end-node case. The supplier's test system was run with 6, 12, and 18 chargers each connected to a variable node.
B. EV Charging Station Harmonics
The harmonics of EV battery chargers depend on their controlling features, and their operation is contingent on those components. The adoption of electric vehicles, which can significantly increase loading, creates both opportunities and risks for the performance of the power system. These chargers and their current and voltage characteristics affect the design and flexibility of the energy system.

Figure 1: EV battery charger

C. EVs and Harmonics
The three types of EVs are the battery electric vehicle (BEV), the hybrid electric vehicle (HEV), and the fuel cell electric vehicle (FCEV) [2]. Power quality is significantly affected by non-linear loads like electric vehicle charging stations, which cause voltage and current harmonics. In power systems, voltage and current harmonics cause a variety of issues, including damage to equipment, overheating transformers, a low power factor, and narrow voltage profiles.
D. AC Microgrids
There are three types of microgrid systems: AC, DC, and hybrid AC/DC. In the AC system, all DGs, consumers, and storage are permanently connected to the AC busbar network, with or without converters. AC generators like diesel, wind, and micro turbines can typically be connected directly to the AC busbar without an inverter, while DC power sources like batteries, energy storage systems (ESS), and PV systems require DC/AC inverters and are connected to the AC busbar through them. AC MGs present many problems, and these networks have complicated control and timing issues; nevertheless, they are still the most widespread today. In an AC MG, the three-phase AC bus serves as the power connection point between the MG and the primary power grid. Installing the common connection point between the microgrid and the main power grid is simple, and a fast switch serves as the disconnect point. Under normal conditions the DG powers the load, and any extra power generated is sent to the power grid; the AC microgrid receives the necessary power from the main grid if the DG's output power falls below the load demand. A significant detail to mention is that the power quality guidelines of AC microgrids are managed in line with ordinary power distribution systems and modes of operation.

Figure 2: AC microgrid

E. DC Microgrids
The majority of microgrid generators produce DC that must be converted to AC in order to interface with today's AC grids. Because some devices require DC power to function, conversion back to DC is then required at the system's end; this DC-AC-DC energy transformation in an AC MG leads to energy loss. Using high-DC-voltage operation as a benchmark, the DC MG aims to solve this AC MG issue. By reducing the number of converters in a single MG system, the DC-MG structure, in contrast to the AC-MG, can provide significant energy savings, since fewer converter stages are needed to connect the distributed energy sources, storage devices, and loads. It is evident that DC MGs are less likely to cause power quality issues than AC-distributed grids and are better suited for residential distribution systems. The DC MG eliminates the need for DG synchronization and ensures that control depends chiefly on the DC bus voltage, removing several control challenges in the microgrid. Additionally, primary control is made much simpler by the absence of reactive power flow management [3]. Many modern devices use direct current, so the power supply is unaffected. The switching phase transients must be properly controlled in the MG to prevent device destruction; the power quality issue in this situation therefore warrants further examination.

Figure 3: DC microgrid

III. EVS' INTEGRATION INTO THE GRID


A. Overview of EV Technology
EV technology is utilized in hybrid electric vehicles (HEVs) and plug-in hybrids (PHEVs). The battery electric vehicle was the first EV technology to enter the modern automotive market. The electric drive system of an EV supplies the necessary power to the EV motors while the vehicle is in motion; while the vehicle is parked and plugged in, the EV charging system supplies the battery with energy from the grid. The controller, power converter, battery pack, and electric machine are the essential components of an electric vehicle's electric propulsion system [4]. By letting owners feed stored energy back to the grid, electric vehicles can also actively participate in the electricity market. Electric vehicles can be controlled as auxiliary service providers to the grid through mechanisms known as V2G and G2V, which

control their discharge to and charge from the grid. When it comes to balancing power on the grid in unidirectional V2G, EV batteries are regarded as switchable loads. A single battery in an electric car is insufficient to affect the grid; instead, EV aggregators, which manage a large number of electric vehicles, act as a link between individual electric vehicles and market participants. Electric vehicles can connect to third-party aggregators on their own or as part of a fleet across cities or regions. Microgrids can receive fewer communication signals through EV aggregators; as a result, market operators can benefit from EV aggregators' ability to lower complexity and mitigate cybersecurity risks.

Figure 4: EV electric propulsion

B. Effects of EV Integration into the Grid
Electric vehicle grid integration may affect the power system's power quality. The vast majority of the studies conducted so far have examined how EV integration affects power quality parameters such as voltage profile, voltage unbalance, power losses, and harmonics. In light of their growing popularity, numerous efforts have been made to investigate the impact of electric vehicle grid integration. The effects of electric vehicle grid integration on voltage profiles, harmonics, power losses, and grid stability are thoroughly examined in this study. Electric vehicle penetration in the power grid has a significant impact on electricity prices as well [5]. The stability of the grid's voltage may suffer as a result of the integration of electric vehicles; the location, prevalence, and charging time of electric vehicles all play a role in this. Uncertainty regarding EV connection points, their prevalence, and the duration of connections and disconnections raises load requirements.

Figure 5: EV Utilization

C. Effects of EV Integration on the Stability of the Grid
The capacity of a power system to return to a steady-state operating point following a fault is known as power system stability. The significance of stability studies is demonstrated by the numerous reports of outages caused by system instability. While charging from the grid, EVs appear as non-linear loads with characteristics different from typical loads and can stress the power system. Additionally, it is challenging to predict this new load's behaviour due to uncertainties regarding EV connection points, charging time, and duration [6]. As a result, the power system's stability may become uncertain if electric vehicles are charged in large numbers.

Figure 6: EV and grid interaction

 Voltage stability impacts: Voltage stability refers to the power grid's capacity to maintain the voltage on all buses at acceptable levels after a disturbance. Grid voltage stability can be significantly affected by variations in load demand and load characteristics; after system contingencies, the power system may become more unstable if the load model's exponent alpha is negative. A comprehensive examination of the effect of EV penetration on grid voltage stability was carried out for a test distribution network of 43 buses with interconnected EV charging stations [7]. The weakest bus of the 43-bus test distribution network has a significantly smaller loading margin when an EV charging station is integrated, and the weakest buses' loading margins continue to decline as EV integration grows. The power grid's voltage stability is also affected by the location of EV charging stations. The exponential load model used in such studies can be written as

P/P0 = a (V/V0)^alpha + b

where P0 and V0 are the nominal power and voltage; a small sketch of this model is given after this list.

Figure 7: Graph between Load Model and Loading Margin

 Frequency stability impacts: The frequency may deviate from the permissible range if there is an imbalance between the power grid's load demand and its generation. Frequency stability refers to a power system's capacity to maintain acceptable frequencies following a power system failure [8]. The grid's charging load demand will rise dramatically as more electric vehicles are sold, necessitating more power generation to keep the frequency within acceptable limits.
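A small sketch of the exponential load model quoted above (the coefficients a, b and the voltage points are illustrative, not values from [7]) shows why a negative alpha is destabilizing: the load drawn rises as the bus voltage falls.

import numpy as np

def load_power(v_pu, alpha, a=1.0, b=0.0, p0=1.0):
    """Exponential load model: P/P0 = a * (V/V0)**alpha + b."""
    return p0 * (a * v_pu**alpha + b)

for alpha in (-0.5, 0.0, 1.0, 2.0):
    p = load_power(np.array([0.95, 1.00, 1.05]), alpha)
    print(f"alpha={alpha:+.1f}: P/P0 at 0.95/1.00/1.05 pu ->", np.round(p, 3))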
D. Power Quality Impacts of EV Integration
Investments in power generation, demand, prices, and emissions will all rise with increasing EV penetration. The economic effects of incorporating electric vehicles into electricity markets have been the subject of numerous studies, which examine how EV integration affects load profiles, energy prices, operating costs, and ancillary services.

IV. HARMONIC AND SUPERHARMONICS CHARACTERISTICS


A. Characteristics of Harmonics
Sources of harmonic pollution in a microgrid are highly penetrated and decentralized throughout the network [9]. Traditional local harmonic mitigation has the disadvantage of being difficult to implement and expensive to maintain.
B. Characteristics of Superharmonics
Supraharmonic (SH) emissions have increased as a result of efforts to reduce low-frequency harmonics in inverter output current and improve the power factor. SH originates in the inverter circuit, so it is sent to the grid whenever the inverter is running; the inverter itself can serve as an SH measurement and mitigation device when it is not producing power. A sketch of how harmonic content can be quantified from a sampled current waveform is given below.
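As an illustration of how such harmonic content can be quantified, the sketch below synthesizes a distorted 50 Hz charger current (the 5th- and 7th-harmonic amplitudes are made up, though these orders are typical of rectifier loads) and computes its total harmonic distortion (THD) from an FFT.

import numpy as np

fs, f0 = 10_000, 50                       # sample rate and fundamental (Hz)
t = np.arange(0, 0.2, 1 / fs)             # ten fundamental cycles
i = (np.sin(2*np.pi*f0*t)                 # fundamental
     + 0.20*np.sin(2*np.pi*5*f0*t)        # illustrative 5th harmonic
     + 0.14*np.sin(2*np.pi*7*f0*t))       # illustrative 7th harmonic

spectrum = 2 * np.abs(np.fft.rfft(i)) / len(t)
freqs = np.fft.rfftfreq(len(t), 1 / fs)

fund = spectrum[np.argmin(np.abs(freqs - f0))]
harm = [spectrum[np.argmin(np.abs(freqs - h*f0))] for h in range(2, 40)]
thd = np.sqrt(sum(a**2 for a in harm)) / fund
print(f"THD = {100*thd:.1f} %")           # about 24 % for these amplitudes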

V. TECHNIQUES FOR MITIGATION


A. Harmonic Mitigation
The overall effect of EV chargers and harmonic distortion can be reduced by keeping odd harmonics to a minimum. The following techniques can be used to lessen the harmonic distortion caused by EV chargers' odd-harmonic current distortion; a sizing sketch for a single-tuned passive filter follows this list.
 Low-pass harmonic filters: A low-pass filter can be created from a series inductor with a capacitor (and damping resistor) connected in parallel across the output. Low-pass filters are regarded as the best and most efficient method for reducing harmonics in power systems [10]. Due to their simplicity, economy, low maintenance, and high reliability, they are primarily utilized in power transmission and distribution networks.

Figure 8: Low pass filter

 Active filter: An active filter injects a current spectrum that is in antiphase with the harmonic current spectrum of the nonlinear load. Fed directly into the system in real time, the active filter's compensating current effectively eliminates the network harmonics. By continuously providing reactive power, both capacitive and inductive, active filters can also improve the power factor in addition to suppressing harmonic currents.

Figure 9: Active filter

 Hybrid control techniques: Grid-connected solar energy systems typically employ hybrid control technology. Passive control strategies have progressed to recover reactive power and avoid harmonic distortion by utilizing the current control loop of an attached inverter. This approach is designed to eliminate harmonic distortion by using a real-time control loop to generate and track the harmonic content [11]. The objective of the control engineering is to model the energy of damped injections physically and to locate system settings that produce the appropriate response.
B. Superharmonics Mitigation
 To improve the power quality of PV and battery sources in an MG, dynamic voltage restorers (DVRs) are used to deal with voltage sags and swells. A fuzzy-logic-based DVR has been used to overcome sag and swell in an MG-connected mesh [12], and model predictive control (MPC) has been used to improve DVR performance in dealing with sag and swell in an MG made of PV, a supercapacitor, and a battery.
 Other tools for addressing power quality issues include the STATCOM and the SVC. Where distributed renewable energy sources are highly prevalent in MGs, voltage fluctuations have been mitigated by STATCOMs [13]. In power system MGs, STATCOMs have also been utilized to reduce voltage fluctuation and compensate for reactive power.

VI. CONCLUSION & FUTURE SCOPE
With significant progress in the mobility sector, a new generation of high-efficiency electric vehicles is gaining popularity. Electric vehicles should be used instead of gasoline vehicles because of the advantages of battery charging: reduced oil pollution, high efficiency, and dependability. However, one of the problems energy providers face is poor power quality when charging EV batteries; EV battery chargers are counted among nonlinear loads because of power electronic components such as rectifiers. The MG, in contrast, is a novel power grid that can be used to fulfil future energy demands in the direction of green power and smart grids. In an MG, some power sources primarily use renewable energy; however, RES output is unstable and weather-dependent, and numerous power electronics devices are required. Therefore, a crucial factor in the expansion of the MG is the availability of PQ criteria, measurement, and mitigation strategies.

REFERENCES
[1] Pinilla, J.T.M., 2022. Hosting Capacity: A Tool to Modernize the Grid and to Contribute to the Integration of Distributed
Energy Resources in Colombia. Global Journals of Research in Engineering, 22(B1), pp.33-40.
[2] Kharrazi, A., Sreeram, V. and Mishra, Y., 2020. Assessment techniques of the impact of grid-tied rooftop photovoltaic
generation on the power quality of low voltage distribution network-A review. Renewable and Sustainable Energy
Reviews, 120, p.109643.
[3] Xingang, Y., Aiqiang, P., Guangzheng, Y., Chenyang, L. and Yangxiu, Y., 2019, May. Supraharmonics measurement
algorithm based on CS-SAMP. In 2019 IEEE Innovative Smart Grid Technologies-Asia (ISGT Asia) (pp. 160-164).
IEEE.
[4] Li, H., Lv, C. and Zhang, Y., 2019, July. Research on new characteristics of power quality in distribution networks. In
2019 IEEE International Conference on Power, Intelligent Computing and Systems (ICPICS) (pp. 6-10). IEEE.
[5] Shabbir, N., Kütt, L., Jarkovoi, M., Iqbal, M.N., Rassõlkin, A. and Daniel, K., 2021. An overview of measurement
standards for power quality.
[6] Tavakoli, A., Saha, S., Arif, M.T., Haque, M.E., Mendis, N. and Oo, A.M., 2020. Impacts of grid integration of solar PV
and electric vehicle on grid stability, power quality and energy economics: A review. IET Energy Systems Integration,
2(3), pp.243-260.
[7] Hamadi, A., Alam, M.S. and Arefifar, S.A., 2021. Analyzing the Impact of Electric Vehicle Charging Stations on Power
Quality in Power Distribution System. 2021 SAE Technical Paper, (2021-01-0199).
[8] Bouzelata, Y., Kurt, E., Uzun, Y. and Chenni, R., 2018. Mitigation of high harmonicity and design of a battery charger
for a new piezoelectric wind energy harvester. Sensors and Actuators A: Physical, 273, pp.72-83.
[9] Suman, S., Chatterjee, D. and Mohanty, R., 2022. Development of improved harmonic compensation technique for PV-
wind hybrid distributed generator connected to microgrid. Electric Power Systems Research, 210, p.108071.
[10] Town, G., Taghizadeh, S. and Deilami, S., 2022. Review of Fast Charging for Electrified Transport: Demand,
Technology, Systems, and Planning. Energies, 15(4), p.1276.
[11] Kurt, E., Cottone, F., Uzun, Y., Orfei, F., Mattarelli, M. and Özhan, D., 2017. Design and implementation of a new
contactless triple piezoelectrics wind energy harvester. international journal of hydrogen energy, 42(28), pp.17813-
17822.
[12] Alkahtani, A.A., Alfalahi, S.T., Athamneh, A.A., Al-Shetwi, A.Q., Mansor, M.B., Hannan, M.A. and Agelidis, V.G.,
2020. Power quality in microgrids including supraharmonics: Issues, standards, and mitigations. IEEE Access, 8,
pp.127104-127122.
[13] Naqvi, S.A.H., 2020. Active Power Measurement in Energy Metering (Master's thesis).
[14] Iqbal, M.N., Measurement Based Approach for Residential Customer Stochastic Current Harmonic Modelling.
[15] Yamahata, C., Stranczl, M., Sarajlic, E., Krijnen, G.J. and Gijs, M.A., 2012. Temporally aliased video microscopy: an
undersampling method for in-plane modal analysis of microelectromechanical systems. Journal of
Microelectromechanical systems, 21(4), pp.934-944.
[16] Lim, J., Wang, P., Shaw, S., Gong, H., Armacost, M., Liu, C., Do, A., Heydari, P. and Nenadic, Z., 2022. Artifact
propagation in subdural cortical electrostimulation: Characterization and modeling. Frontiers in neuroscience, 16.
[17] Wang, X.K., Hao, Z. and Tan, S.K., 2013. Vortex-induced vibrations of a neutrally buoyant circular cylinder near a
plane wall. Journal of Fluids and Structures, 39, pp.188-204.
[18] Mirsafian, S., 1996. Forced vibration of two beams joined with a nonlinear rotational joint. University of Kansas.
[19] Niu, Y., Yang, T., Yang, F., Feng, X., Zhang, P. and Li, W., 2022. Harmonic analysis in distributed power system based
on IoT and dynamic compressed sensing. Energy Reports, 8, pp.2363-2375.
[20] Pond, T.L. and Martin, C.R., 2020. Electrical characteristics of the oxyfuel flame while cutting steel. Experimental
Thermal and Fluid Science, 112, p.109985.
[21] Chen, H. and Konofagou, E.E., 2014. The size of blood–brain barrier opening induced by focused ultrasound is dictated
by the acoustic pressure. Journal of Cerebral Blood Flow & Metabolism, 34(7), pp.1197-1204.
[22] Wang, K.W. and Harne, R.L., 2017. Harnessing bistable structural dynamics: for vibration control, energy harvesting
and sensing. John Wiley & Sons.
[23] Halterman, K., Valls, O.T. and Wu, C.T., 2015. Charge and spin currents in ferromagnetic Josephson junctions. Physical
Review B, 92(17), p.174516.


Enhancement of Accuracy and Performance of Deep Learning System for Intrusion Detection System
Dr. Abhishek Kajal (Asst Professor) and Vaibhav Rana (Research Scholar)
Department of Computer Science and Engineering,
Guru Jambheshwar University of Science and Technology,
Hisar, Haryana - 125001, India
Email: [email protected], [email protected]

Abstract— In the post-pandemic scenario, most studies have realized that more work on accuracy in IDS is needed, taking prior study in the field into account. Furthermore, various factors affect the time consumed during training operations in the existing literature. Conventional studies have offered only a few solutions for effective intrusion detection. When used, the insights and recommendations from this study will have a significant effect on the strategy employed to reliably predict intrusions. Taking the training model into consideration, the current research ought to provide a flexible and scalable approach to intrusion detection. The proposed model will train on a large dataset, increasing the likelihood that it will provide accurate results. Future research should continue utilizing the same paradigm in order to enhance IDS detection. The findings have significant implications for improving intrusion forecasting.

Index Terms— Intrusion Detection System, Deep Learning, Accuracy, Performance.

I. INTRODUCTION
The increased frequency of cyber assaults may be directly linked to the rising popularity of online resources. Passwords, credit card numbers, and other sensitive information sent via a network are susceptible to attack from both within and outside the system. An attack may be carried out either manually or automatically by the aggressor. The effectiveness and ferocity of these assaults are only increasing, and this class of cybercriminals has become tougher to stop. Cybercriminals, or cyber attackers, are the nefarious individuals responsible for these types of data breaches. Individuals or groups with deep domain experience in the field may sometimes propose novel, flexible, and reliable intrusion detection systems (IDS).
A. Background
Intrusion detection is a topic that will be explored in depth in this study. Although IDS studies have been
conducted over decades, scholars continue to worry about how reliable their findings are. Multiple machine
learning strategies would be used to enhance the IDS's detection capabilities. This research would examine the
state of the art in intrusion detection systems in order to pave the way for future developments in the field. For
the purpose of security analysis, researchers may think about using an RNN-based LSTM model. A filtering
system would be used to enhance precision and efficiency. Furthermore, the suggested IDS model's performance
will be compared to that of the standard model.

B. Intrusion Detection System
In this context, "IDS" refers to an Intrusion Detection System. The fundamental aims of such systems are the
detection and classification of intrusions, attacks, and other data-stealing activities. This system is used on the
network and the host side, and it operates fully automatically in both environments. Both network-based
intrusion detection systems (NIDS) and host-based intrusion detection systems (HIDS) exist.

Figure 1. Intrusion detection system

An IDS is analogous to a burglar alarm. A home's lock system, for instance, is one defense against intruders, but a burglar alarm will generate noise ("ring the alarm") to alert the homeowners that the lock system has been compromised and an intruder is attempting to enter the residence. Furthermore, IDS is greatly aided by firewalls and routers, which allow for near-instantaneous data transmission.
C. Taxonomy of IDS
The IDS classification is shown in Figure 2. Intrusion detection systems may also be categorized by the type of protected system, taking the area as the source of data: host-based IDS (HIDS) and network-based IDS are two families of IDS programs that leverage information gleaned from a single host (system) and from a network segment, respectively.
Using a modem placed in an organization's private network, external users might access the intranet without detection by the firewall. An IPS is a network threat prevention system that analyses network traffic flows to identify and prevent vulnerability exploitation. Network-based (NIPS) and host-based (HIPS) are the two forms of prevention systems; they monitor network traffic and take steps to safeguard networks and systems. False positives and false negatives are the IPS's problem: in an intrusion detection system, a false positive occurs when an alert is triggered despite the fact that there was no attack, while a false negative is an attack that fails to raise an alert. A single point of failure, missed updates, and encrypted communication might all be at risk if inline operations were used. An IDS monitors the actions taking place in a system or network; it might be a piece of hardware or an application on a computer. It keeps tabs on any suspicious behaviour that may have occurred on a network or system and makes a substantial contribution to the guarantee of data security. It is one of the most cutting-edge tools for spotting all kinds of network threats with pinpoint accuracy. A network-based system analyses activity such as the amount of traffic, IP addresses, service ports, and protocols to determine the network's health. Network traffic is monitored by intrusion detection systems (IDS) to look for unusual behaviour, and warnings are sent out as soon as such activity is identified.

This is referred to as network-capable software: it performs a full scan of the system to look for potentially dangerous activity or policy violations.

Figure 2. IDS classification

A variety of components make up an intrusion detection system. One component is the set of sensors that generate security events and trigger the intrusion detection system; there is also a console. During routine operations, intrusion detection systems look for signs of known attacks or deviations. Deviations and anomalies are forwarded up the stack and examined at the protocol and application layers.

Figure 3. IDS detection and prevention system [17]

Intrusion Prevention System: Intrusion detection systems have long been explored as a safety measure. In the IoT system, they work mostly at the network layer. Designed for IoT-dependent intelligent systems, an IDS must be able to function with very limited processing capacity; a fast reaction time is required, and it must handle a large amount of data quickly.
D. Machine Learning
ML refers to algorithms through which software applications predict outputs accurately without being explicitly programmed. Machine learning algorithms use previous records as input for the prediction of fresh output values. Fraud detection, spam filtering, malware threat detection, BPA, and predictive maintenance are all possible uses of machine learning. A common way to classify classical machine learning is by how an algorithm improves the accuracy of its predictions; unsupervised learning and supervised learning are the two most common methods. Scientists select the algorithm according to the data they want to make predictions about.
Working of supervised machine learning: an algorithm is trained using labeled inputs and intended outcomes.
Working of unsupervised machine learning: data does not need to be labeled when using unsupervised machine learning (ML) techniques; the algorithms seek patterns in unlabeled data in order to divide it into manageable chunks for further analysis. A small sketch contrasting the two approaches appears below.
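The toy sketch below (scikit-learn, synthetic blobs) contrasts the two: the supervised model needs the labels y, while the clustering step never sees them.

from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier
from sklearn.cluster import KMeans

X, y = make_blobs(n_samples=200, centers=2, random_state=0)

# Supervised: labelled inputs and intended outcomes are required.
clf = RandomForestClassifier(random_state=0).fit(X, y)
print("supervised training accuracy:", clf.score(X, y))

# Unsupervised: no labels; patterns are sought in the raw data.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("cluster sizes:", [int((km.labels_ == k).sum()) for k in (0, 1)])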

Uses of machine learning: machine learning is now being used in a wide range of fields. Among its many uses is Facebook's News Feed recommendation engine. The recommendation engine may begin to prioritize posts from a certain group if a member often pauses to read them; the engine works behind the scenes to reinforce the member's online habits. The News Feed will be adjusted if the member's reading habits change and he or she stops keeping up with posts from that particular group in the following weeks.

Figure 4. Working of ML at a glance (supervised ML: binary classification, choice from a variety of possible solutions, prediction of continuous values; unsupervised ML: clustering, anomaly detection, grouping sets of elements in a dataset)

E. Deep Learning
Computers are taught to learn by example; this is what is known as "deep learning" in machine learning. Autonomous vehicles rely on deep learning to identify things like stop signs and pedestrians, among other things. Deep neural networks (DNNs) are used for prediction and classification, CNNs are used for prediction, recognition, and vision, and RNNs and LSTMs are used for prediction and classification. The present research makes use of an RNN & LSTM for IDS detection and classification.
Figure 5. Role of DNN, CNN, RNN & LSTM (DNN: prediction and time series prediction; CNN: computer recognition and vision; RNN & LSTM: classification)

F. Long Short-Term Memory (LSTM)
The usage of LSTMs, a particular sort of recurrent neural network, can considerably assist several tasks; such results are almost exclusively achieved with recurrent neural networks. LSTMs explicitly avoid the long-term dependency problem: remembering information for long periods of time is practically their default behaviour, not something they struggle to learn. Every recurrent neural network has a basic structure, a repeating series of neural network modules. LSTMs also have this chain-like structure, but the repeating module is different: instead of a single neural network layer, there are four, each interacting in a special way. A minimal sketch of an LSTM-based classifier of the kind used here is given below.
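The following Keras sketch shows the kind of LSTM-based classifier this research envisages; the window length, the 41 features (as in NSL-KDD records), the layer sizes, and the random stand-in data are illustrative assumptions, not the final architecture.

import numpy as np
import tensorflow as tf

timesteps, n_features = 10, 41     # e.g. 41 features per NSL-KDD record

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(timesteps, n_features)),
    tf.keras.layers.LSTM(64),                        # recurrent feature extractor
    tf.keras.layers.Dense(1, activation="sigmoid"),  # normal vs. intrusion
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

# Random stand-in data, only to show the training call signature.
X = np.random.rand(128, timesteps, n_features).astype("float32")
y = np.random.randint(0, 2, size=(128, 1))
model.fit(X, y, epochs=1, batch_size=32, verbose=0)
model.summary()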

II. LITERATURE REVIEW


To undertake the research titled "Role of Machine Learning in Building Intrusion Detection System", a detailed, comprehensive, and in-depth study of IDS, ML, and LSTM was carried out. A brief description of the prior research articles is provided below:

Analysis by M. Tavallaee et al. [1] of the KDD CUP 99 data set was completed in 2009. J. Martens and I. Sutskever [2] focused on learning recurrent neural networks with hessian-free optimization in 2011. A new intrusion detection method was unveiled by M. Sheikhan et al. [3] in 2012; they employed a reduced-size RNN, using a feature-grouping-based technique. S. Revathi et al. [4] presented a full investigation of the NSL-KDD dataset in 2013, in which various machine learning algorithms were used to identify intrusions. W. Li et al. [5] studied the most recent intrusion detection systems under development in 2014; their system was based on KNN classification algorithms, and a wireless sensor network mechanism was devised. Information extraction and automated learning algorithms were surveyed in 2016 by A. L. Buczak et al. [6], who concentrated on intrusion detection approaches that combined information extraction with machine learning. Deep learning was cited by A. Javaid et al. [7] in 2016; their efforts were dedicated to the development of a more effective intrusion detection system. Classification algorithms for network traffic were examined by Bo Dong et al. [8] in 2016. They concluded that a variety of approaches could be implemented with freely available information packages, put those approaches into practice, and identified the best method for detecting intrusions from this collection of realistic examples. At the time, deep learning was the best option because of its forecasting ability, and deep learning approaches were already being used in areas such as structural identification. Monitoring security events provided data for intrusion detection analysis, which was used to determine the current state of the network; existing intrusion detection approaches that used automated learning showed enhanced accuracy and efficiency. Deep learning was also suggested by T. A. Tang et al. [9] in 2016; the goal of their method was to identify network intrusions, with software-defined networking as the focus of the study. Chuanlong Yin et al. [10] presented a model and technique for a neural-network-based identification system in 2017. They also assessed the efficacy of the design in binary and multi-class settings, along with other factors that affect accuracy, such as the number of neurons and the impact of different learning rates; NSL-KDD was utilized as the dataset. Using the RNN-IDS classification model, they discovered that the data could be represented accurately. Compared to other automated learning approaches, the classification model was significantly more efficient and accurate, intrusion detection was more accurate with their architecture, and it provided an up-to-date research approach for detecting intrusions. Analysis of data pre-processing was carried out in 2017 by N. Paulauskas et al. [11], who examined the impact of pre-processing data on IDS methods using the NSL-KDD dataset. In 2017, P. S. Bhattacharjee et al. [12] proposed an IDS, also utilizing the NSL-KDD data collection. R. A. R. Ashfaq [13] worked on a fuzziness-based semi-supervised learning approach for intrusion detection systems in 2017. Sara A. Althubiti et al. [14] put a detection system in place in 2018, using the Coburg Intrusion Detection data package and Long Short-Term Memory (LSTM) deep learning methods. Their research yielded an accuracy of around 85 percent, which was deemed acceptable; to meet their assessment criteria, their LSTM outputs were compared with state-of-the-art approaches using a variety of measures. Meira, Jorge [15] published comparative results with unsupervised techniques in 2018; their research played a significant role in the detection of cyber-attack novelty. Kolli [16] in 2018 focused on Cyber Situational Awareness (CSA) for PTC, considering a distributed IDS system. Clotet [17] in 2018 considered a real-time anomaly-based IDS for cyber-attack detection, working at the industrial process level of critical infrastructures. Intrusion detection was designed by Peisong Li et al. [18] using an enhanced DBN and GA in 2019. Iterative development of DBN network topologies yielded diverse network structures for different assaults, including low-frequency attacks. To provide intrusion detection, a DBN should be created with an optimized network layout; a genetic algorithm can generate any number of hidden layers, and the neurons in each hidden layer develop in a similar manner. Detection speed was achieved by minimizing system complexity to the maximum degree feasible, and this technique could improve an intrusion detection system's performance. Arul [19] made use of an ANN in IDS-based research in 2019. Khraisat [20] surveyed intrusion detection systems in 2019, considering techniques, datasets, and challenges related to IDS. R. Vinayakumar [21] introduced a deep learning approach to implement an intelligent IDS in 2019. Many alternative approaches to automated learning, including SVM, DT, and random forest, were employed by Qusay H. Mahmoud et al. [22] in 2020; using the recent IoTID20 information package, new IDS techniques in IoT networks could be supported. Y. Zhou [23] proposed an efficient intrusion detection system in 2020, based on feature selection and an ensemble classifier. Y. J.
Chew [25] considered decision trees in 2020, with sensitive pruning in a network-based IDS. Song, Yajie & Bu [26] proposed a novel intrusion detection model in 2020.

TABLE 1. LITERATURE REVIEW

Ref | Author | Year | Topic | Methodology | Shortcoming
[8] | B. Dong and X. Wang | 2016 | Comparison of deep learning method to traditional methods for network intrusion detection | Deep learning | Lack of flexibility and accuracy
[10] | Chuanlong Yin, Yuefei Zhu | 2017 | A deep learning approach for intrusion detection using recurrent neural networks | Deep learning | Limited scope
[11] | N. Paulauskas | 2017 | Analysis of data pre-processing influence on intrusion detection using NSL-KDD dataset | Data preprocessing and intrusion detection | Time consumption and complexity
[14] | Althubiti, Sara | 2018 | LSTM for anomaly-based network intrusion detection | LSTM | Time consumption
[18] | P. Li and Y. Zhang | 2019 | A novel intrusion detection method for Internet of Things | Machine learning | Lack of accuracy
[19] | A. Arul Anitha | 2019 | Artificial neural network based intrusion detection system for Internet of Things | Artificial neural network | Performance issues
[20] | A. Khraisat, I. Gondal | 2019 | Survey of intrusion detection systems: techniques, datasets and challenges | Data processing, classifier, machine learning | Lack of technical work
[21] | R. Vinayakumar | 2019 | Deep learning approach for intelligent intrusion detection system | Deep learning | Performance and accuracy issues
[22] | Ullah, Imtiaz | 2020 | A scheme for generating a dataset for anomalous activity detection in IoT networks | Activity detection scheme | Lack of smart solution
[23] | Y. Zhou, G. Cheng | 2020 | Building an efficient intrusion detection system based on feature selection and ensemble classifier | Feature selection, ensemble classifier | Lack of feasibility
[25] | Y. J. Chew | 2020 | Decision tree with sensitive pruning in network-based intrusion detection system | Decision tree | Slow and outdated approach
[24] | A. Kajal et al. | 2020 | A hybrid approach for cyber security: improved intrusion detection system using ANN-SVM | Genetic Algorithm (GA) and Artificial Bee Colony (ABC) algorithm for feature selection; Artificial Neural Network (ANN) with Support Vector Machine (SVM) as classifier | Complex; scope for accuracy enhancement
[26] | Song, Yajie | 2020 | A novel intrusion detection model with the support of a fusion of network and device states for communication-based train control systems | Fusion of network and device states | Complicated to implement in real life

III. PROBLEM STATEMENT


Taking into account the findings of prior studies in IDS, it has become clear that further effort is needed to
improve accuracy. Additionally, a number of variables have been shown to affect how much time is needed for
each step of the training process. In terms of effective intrusion detection, the solutions supplied by traditional
research are inadequate.

IV. NEED OF RESEARCH


Although there have been numerous studies conducted on IDS, it has been noted that improving the reliability of
IDS detection remains a significant obstacle. The current state of the art in IDS detection and categorization is
inefficient and might benefit from the introduction of a deep learning technique that can achieve the same or
better results in less time.

V. SCOPE OF RESEARCH
The approach utilized to reliably forecast intrusions will be significantly influenced by the thoughts and recommendations related to this study. Taking the training model into consideration, the latest research ought to provide a flexible and scalable method of detecting intrusions. Since the proposed model will use a large dataset for training, its overall accuracy should improve. To make progress in IDS detection, further research must use the same paradigm. The research will have significant implications for improving the ability to foresee intrusions.


REFERENCES
[1] M. Tavallaee, E. Bagheri, W. Lu, and A. A. A. Ghorbani, ‘‘A detailed analysis of the KDD CUP 99 data set,’’ inProc.
IEEE Symp. Comput. Intell.Secur. Defense Appl., Jul. 2009, pp. 1–6.
[2] J. Martens and I. Sutskever, ‘‘Learning recurrent neural networks with hessian-free optimization,’’ presented at the 28th
Int. Conf. Int. Conf.Mach. Learn., Bellevue, WA, USA, Jul. 2011, pp. 1033–1040.
[3] M. Sheikhan, Z. Jadidi, and A. Farrokhi, ‘‘Intrusion detection using reduced-size RNN based on feature
grouping,’’NeuralComput. Appl.,vol. 21, no. 6, pp. 1185–1190, Sep. 2012.
[4] S. Revathi and A. Malathi, ‘‘A detailed analysis on NSL-KDD dataset using various machine learning techniques for
intrusion detection, ’’Int. J. Eng.Res. Technol., vol. 2, pp. 1848–1853, Dec. 2013.
[5] W. Li, P. Yi, Y. Wu, L. Pan, and J. Li, ``A new intrusion detection system based on KNN classi_cation algorithm in
wireless sensor network,''J. Elect. Computer. Eng., vol. 2014, Jun. 2014, Art. no. 240217.
[6] L. Buczak and E. Guven, ‘‘A survey of data mining and machine learning methods for cyber security intrusion
detection,’’IEEECommun.Surveys Tuts., vol. 18, no. 2, pp. 1153–1176, 2nd Quart., 2016.
[7] Javaid, Q. Niyaz, W. Sun, and M. Alam, ‘‘A deep learning approach fornetwork intrusion detection system,’’ presented
at the 9th EAI Int. Conf.Bio-inspired Inf. Commun. Technol. (BIONETICS), New York, NY, USA,May 2016, pp. 21–
26
[8] B. Dong and X. Wang, “Comparison deep learning method to traditional methods using for network intrusion
detection,” in Proc. IEEE ICCSN,2016, pp. 581–585
[9] T. A. Tang, L. Mhamdi, D. McLernon, S. A. R. Zaidi, and M. Ghogho,‘‘Deep learning approach for network intrusion
detection in soft-ware defined networking,’’ inProc. Int. Conf. Wireless Netw. MobileCommun. (WINCOM), Oct. 2016,
pp. 258–263.
[10] Chuanlong Yin, Yuefei Zhu, "A Deep Learning Approach for Intrusion Detection Using Recurrent Neural Networks," IEEE Access, received September 5, 2017, accepted October 5, 2017, date of publication October 12, 2017, date of current version November 7, 2017.
[11] N. Paulauskas and J. Auskalnis, ‘‘Analysis of data pre-processing influence on intrusion detection using NSL-KDD
dataset,’’ inProc. Open Conf.Elect., Electron. Inf. Sci. (eStream), Apr. 2017, pp. 1–5.
[12] P. S. Bhattacharjee, A. K. M. Fujail, and S. A. Begum, ‘‘Intrusion detection system for NSL-KDD data set using
vectorised fitness function in genetic algorithm,’’Adv. Comput. Sci. Technol., vol. 10, no. 2, pp. 235–246, 2017.
[13] R. A. R. Ashfaq, X.-Z. Wang, J. Z. Huang, H. Abbas, and Y.-L. He,‘‘Fuzziness based semi-supervised learning
approach for intrusion detec-tionsystem,’’Inf. Sci., vol. 378, pp. 484–497, Feb. 2017.
[14] Althubiti, Sara & Jones, Eric & Roy, Kaushik. (2018). LSTM for Anomaly-Based Network Intrusion Detection. 1-3.
10.1109/ATNAC.2018.8615300.
[15] Meira, Jorge. (2018). Comparative Results with Unsupervised Techniques in Cyber Attack Novelty Detection.
Proceedings. 2. 1191. 10.3390/proceedings2181191.
[16] Kolli, Satish& Lilly, Joshua &Wijesekera, Dusminda. (2018). Providing Cyber Situational Awareness (CSA) for PTC
Using a Distributed IDS System (DIDS). V001T03A004. 10.1115/JRC2018-6142.
[17] Clotet, Xavier &Moyano, José & León, Gladys.(2018). A real-time anomaly-based IDS for cyber-attack detection at the
industrial process level of Critical Infrastructures.International Journal of Critical Infrastructure Protection.23.
10.1016/j.ijcip.2018.08.002.
[18] P. Li and Y. Zhang, "A Novel Intrusion Detection Method for Internet of Things," 2019 Chinese Control And Decision
Conference (CCDC), Nanchang, China, 2019, pp. 4761-4765, doi: 10.1109/CCDC.2019.8832753.
[19] A. Arul Anitha and L. Arockiam, “ANNIDS: Artificial neural network based intrusion detection system for internet of
things,” Int. J. Innov. Technol. Explor. Eng., vol. 8, no. 11, pp. 2583–2588, 2019, doi: 10.35940/ijitee.K1875.0981119.

161
[20] A. Khraisat, I. Gondal, P. Vamplew, and J. Kamruzzaman, “Survey of intrusion detection systems: techniques, datasets
and challenges,” Cybersecurity, vol. 2, no. 1, 2019, doi: 10.1186/s42400-019-0038-7.
[21] R. Vinayakumar, M. Alazab, K. P. Soman, P. Poornachandran, A. Al-Nemrat, and S. Venkatraman, “Deep Learning
Approach for Intelligent Intrusion Detection System,” IEEE Access, vol. 7, no. c, pp. 41525–41550, 2019, doi:
10.1109/Access.2019.2895334.
[22] Ullah, Imtiaz, and Qusay H. Mahmoud. "A Scheme for Generating a Dataset for Anomalous Activity Detection in IoT
Networks." Canadian Conference on Artificial Intelligence. Springer, Cham, 2020.
[23] Y. Zhou, G. Cheng, S. Jiang, and M. Dai, “Building an Efficient Intrusion Detection System Based on Feature Selection
and Ensemble Classifier,” Comput. Networks, p. 107247, 2020, doi: 10.1016/j.comnet.2020.107247.
[24] A. Kajal and S. Nandal, “A Hybrid Approach for Cyber Security: Improved Intrusion Detection System using ANN-
SVM,” Indian Journal of Computer Science and Engineering, vol. 11, no. 4, pp. 412-425, 2020, doi:
10.21817/indjcse/2020/v11i4/201104300.
[25] Y. J. Chew, S. Y. Ooi, K. S. Wong, and Y. H. Pang, “Decision Tree with Sensitive Pruning in Network-based Intrusion
Detection System,” Lect. Notes Electr. Eng., vol. 603, pp. 1–10, 2020, doi: 10.1007/978-981-15-0058-9_1.
[26] Song, Yajie& Bu, Bing & Zhu, Li. (2020).A Novel Intrusion Detection Model Using a Fusion of Network and Device
States for Communication-Based Train Control Systems.Electronics. 9. 181. 10.3390/electronics9010181.


Real-Time Remote General Healthcare Clinic


Yashwanth M1, Yoga Verma V2 and Dr. S. Saranya3
1-3
Department of Electronics and Communication, Easwari Engineering College,
Email: [email protected], [email protected], [email protected]

Abstract— Distributing medicines to people located in remote areas is a daunting task for the Government, and the Real-Time Remote General Healthcare Clinic helps meet this requirement. The model holds basic and emergency medicines, which are monitored and can be refilled. We can consider it a computerized drug stock-room system that people can access easily in an emergency without having to travel to a pharmacy. The machine could easily be set up in remote regions such as highways, deserts, and tribal areas. It is a microcontroller- and motor-based system that dispenses a drug when the user requests it via the input panel; drug storage data can be obtained remotely, and based on this information, refilling the machine is hassle-free.

I. INTRODUCTION
Some groups in India are dying because they live far from centres of care and medicine is not available in time. Most need regular access to the most basic medicines. This is due to the level of network coverage required, and as a result there is a growing shortage of funds to staff these posts. Problems arise when the need for medicine is urgent but pharmacies are not open or medicines are not available, for instance at night. In remote areas and areas with low public turnover, the availability of medicines within reach of patients is a fundamental issue. Government accountability, prudent choice, adequate public sector support, productive distribution structures, control of costs and commitments, and drug delivery by current and future prescribers all contribute to improving the enduring quality of healthcare in India, as does education in the culture of proper medicine use.
In the current situation, where we want all devices to be programmed, this model will be of great help to public well-being. This framework is fully controlled by a 16-bit microcontroller. To improve efficacy and patient well-being, such frameworks provide potential components for computer-controlled storage, management, observation, and tracking.
The World Health Organization notes that countries with lower social finance classes have lower levels of well-being, and health inequality is tightly linked to levels of social status. Health has all the characteristics of being one of the anchor points connecting education and access to data. Competent electronic health checks with clever billing frameworks for remote areas are another idea, and a very useful one in an agricultural country like India, where healthcare is virtually rudimentary. Dispensing machines allow prescriptions to be stored securely in permission-controlled payment units and the use of opiates and other controlled substances to be tracked electronically. The framework is customizable, as new capabilities can be added or current ones modified according to requirements.

II. RELATED WORKS


International Situation – According to work done by Chi-Sheng Shih in 2016, an intelligent medicine dispenser was developed in which the dispenser is fully programmed, with the exception of setting up activities and retrieving individual parts from the medicine compartment. The suggested construction and operation improve on current connections and correspondences between parts that rely on broader collaboration [1].
A study by Sarika Oundhakar in 2017 provides information on the equipment and innovations behind most of the ubiquitous vending machinery. It makes sense to use these vending machines to reduce reliance on labour and magnify their effectiveness [2].
In 2012, Mukund proposed a framework that works for all sizes of pills and containers. The system also had a schedule that could be changed over 31 days for 21 drugs, and the station broadcasts four warnings a day. In addition, a program gradually changes the number of doses as needed [3].
A work by S. Gayathri in 2015 aimed to monitor patients in remote regions using wearable sensors. Here, the framework includes temperature, strain, heart rate, and acceleration sensors. Information is monitored and put to use by a microcontroller. Additionally, the framework uses a GSM modem to send data from the wearable device to the expert via SMS [4].
Referring to a work of Varun Vaid in 2014, he proposed a procedural plan to improve vending machines. These vending machines have a variety of applications and can be deployed anywhere and used by anyone [5].
E-health checking with a superior management framework for remote areas is powered by an Arduino implementation. The planning and execution of this framework combine both management and observation parts to achieve better adaptation.
National Status - A 2018 study by Rajendra Prasad P showed that efficient e-health monitoring using a smart remote-area dispensing system can be used to plan and dispense a variety of well-selected medications. The framework contains a pill container used by an Arduino to allocate medicine with the help of the appropriate switch. Sanitary pads and maternity units are added for the well-being of women. Monitoring sensors such as temperature, heart rate, and blood pressure sensors are present to really see the patient's condition. If readings exceed normal, an emergency button can be pressed to summon assistance [6].
A 2017 Vishal Tank study provides an adaptable, easy, and crude solution for extending basic healthcare to all locations at a very secure price. Machines can be adapted to any area or environment with minor changes in equipment and programming. This machine includes a smart drug unit that sends a top-up notification message to the nearest chemist when the number of drug strips falls below a certain level [7].
In 2017, according to Mahaveer Penna's research, an automated drug dispensing machine was planned and implemented to further develop medical care in remote areas by caring for patients with essential conditions such as fever and migraine. This frame utilizes state-of-the-art specialized perspectives such as embedded frameworks and Arduino to administer the medicine expected by the patient, according to the patient's wishes, through a keyboard interface, which has achieved great results in improving medical care. It also consolidates medical development information into one place with simple trial-and-error justification [8].

III. COMPONENTS
The prototype is built around an ESP32. The other components used in the model are DC motors, switches, and IR sensors. Components are selected from those available in local stores to give the most appropriate quality-price ratio. A DC motor is used to drive the trays and deliver the desired drug. A gear set converts the motor's high speed into high torque; this is important for the design, as the main job is to open the tray with the product on it.
A medical vending machine should be designed as a service to deliver medicines on special request. This model requires the vending machine to dispense commonly used non-prescription medications on demand. The working principle of this method is illustrated in the block diagram.

Figure 1: Block diagram

The block diagram above fully describes how the project works. The whole project is divided into two units, the control unit and the delivery unit. The control unit consists of the parts necessary for data processing, such as the controller and input switches, and the delivery unit consists of motors, infrared sensors, the chat-bot, etc., all attached to the delivery tray.
The main component of the model is the ESP32 microcontroller. It connects all peripherals and programmatically controls all processes in the system. Panels are used as inputs for the controller. Separately, an infrared sensor is connected to the controller to monitor and update the availability of medicines in the medicine tray. Push buttons in the system allow product selection; each medicine becomes available at the push of a button.
DC motors are responsible for drug delivery by opening and closing the shell. Pressing a specific button turns on the motor for the respective drawer, opening it with the help of a gear set and closing it after a specified time as a precaution. Patients/consumers have access to medicines during business hours.
The prototype has a chat-bot feature that recommends medications appropriate to a patient's symptoms. This chat-bot uses embedded C and is trained with a certified list of symptoms and medication records.
In addition, the availability of medicines in the trays is monitored and continuously updated to the cloud via the internet using the ESP32 controller. The data in the cloud is kept in encrypted format for basic security, and viewing the details requires an OTP sent to the registered email id. The data is available with time stamps. A simplified sketch of the dispensing logic is given below.
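Purely as an illustration, the following MicroPython-style sketch shows how such button-to-motor dispensing logic might look on an ESP32. The pin numbers, timing, and stock-reporting behaviour are our assumptions, not details from the paper.

# Hypothetical MicroPython sketch of the dispensing logic (illustrative only).
# Pin numbers, timings, and the stock check are assumptions.
from machine import Pin
import time

BUTTON = Pin(4, Pin.IN, Pin.PULL_UP)   # product-selection push button (pressed -> 0)
MOTOR = Pin(16, Pin.OUT)               # DC motor driver for one tray
IR_SENSOR = Pin(17, Pin.IN)            # IR sensor watching the tray stock

OPEN_TIME_S = 3                        # tray stays open for a fixed time

def tray_has_stock():
    # Assumed: the IR sensor reads low when a medicine strip blocks the beam.
    return IR_SENSOR.value() == 0

def dispense():
    MOTOR.value(1)                     # open the tray via the gear set
    time.sleep(OPEN_TIME_S)
    MOTOR.value(0)                     # close again as a precaution

while True:
    if BUTTON.value() == 0 and tray_has_stock():
        dispense()
        # In the real model the new stock level would now be pushed to the
        # cloud (encrypted, with a time stamp) over the ESP32's Wi-Fi.
    time.sleep(0.1)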

Figure 2a: Control unit Figure 2b: Delivery unit

Figure 3: Chat-bot

IV. LIMITATIONS OF THE PRODUCT


One main limitation of the vending machine is the required supply voltage setting for its components, which is determined by the input voltage each element needs, and that varies. The main board runs at up to 15V, and the DC motors, which also draw a considerable amount of current, require an input voltage of 10-30V, so we cannot connect the machine straight to the mains 220V power supply. In the prototype, the control board and the motors are powered by individual power supplies, which can be considered impractical.

V. PROBLEMS AND POSSIBLE SOLUTIONS

One problem observed was the IR sensor generating a high output value. On inspection, the cause of this malfunction was the bright lighting in the environment where it was used; the IR sensor therefore works best in dark conditions.

Another problem is that, since multiple components are used, there was a shortage of pins for connecting them, which forced multiple compromises and also reduces the availability of pins for additional components in the future.

Figure 4: Live stock monitoring

Figure 5: The final model

VI. FURTHER IMPROVEMENTS


In the current prototype, there is a vast number of improvements that can be made. The first is the lack of user authentication. Since we are dealing with medicines, authentication is a must to prevent overdoses or other undesirable events, and it can be added. A QR code method may be a good option, with each person having their own QR code; when it is scanned, the patient will be identified.
The power supply, as mentioned above, consumes very little power considering the machine's size, and the main advantage of the proposed vending machine is its remote location, so the system can be completely off-grid. A better option would be to use solar panels and batteries to store the extra power generated, for running without the sun.
For pharmaceuticals, the issue of dosing must be addressed. To do this, we can use Young's rule, a formula that calculates the dosage for different age groups using age and the adult dosage as parameters. Depending on the patient's age, the system will then provide the right drug at the right dosage, as sketched below.
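As a small illustration, Young's rule (child dose = adult dose x age / (age + 12)) could be computed as follows; the function name and the example numbers are ours, not the paper's.

# Illustrative sketch of Young's rule for paediatric dosing.
def youngs_rule(age_years: float, adult_dose_mg: float) -> float:
    """Child dose = adult dose * age / (age + 12)."""
    if age_years <= 0 or adult_dose_mg <= 0:
        raise ValueError("age and adult dose must be positive")
    return adult_dose_mg * age_years / (age_years + 12)

# Example: a 6-year-old and a 500 mg adult dose -> 500 * 6/18, about 166.7 mg
print(round(youngs_rule(6, 500), 1))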
Patient monitoring is not done by the model; sensors such as temperature and blood pressure sensors could be used in the future to monitor patients and recommend better drugs to them.

VII. CONCLUSION
In this real-time remote general medicine clinic project, a basic model of a medication machine with inventory monitoring and consultation software (a chat-bot) was created. Its potential has not been fully realized yet, but there are many additions that can be made to the model, so this project shows how to make a remote medication dispenser and what can be built on top of it. Ideas include tethering the chat-bot to the hardware, physically monitoring patients to prescribe better dosages, and software improvements for better counselling.

REFERENCES
[1] Chi-Sheng Shih, Nurmiza Binti Othman, and Ong Pek Ek, "Pill dispenser with alarm via smart phone notification," IEEE 5th Global Conference on Consumer Electronics, 2016.
[2] Sarika Oundhakar, International Journal of Engineering Technology Science and Research, December 2017.
[3] S. Mukund and N. K. Srinath, "Design of automatic medication dispenser," 2012.
[4] S. Gayathri, N. Raj Kumar, and V. Vinothkumar, "Human health monitoring system using wearable sensors," International Research Journal of Engineering and Technology, vol. 2, no. 8, June 2015.
[5] Varun Vaid, "Comparison of different attributes in modeling a FSM based vending machine in two different styles," 2014.
[6] P. Rajendra Prasad, N. Narayan, S. Gayathri, and S. Ganna, "An efficient e-health monitoring with smart dispensing system for remote areas," 2018 3rd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT), 2018.
[7] V. Tank, S. Warrier, and N. Jakhiya, "Medicine dispensing machine using Raspberry Pi and Arduino controller," 2017 Conference on Emerging Devices and Smart Systems (ICEDSS), 2017.


Detection of Varicose Superficial Venous Thrombophlebitis in Vein using MSNN Algorithm
Shiyam R1, Srividya R2 and Dr.Saranya S3
1-3
Department of Electronics and Communication, Easwari Engineering College, Chennai, India
Email: [email protected], [email protected], [email protected]

Abstract— A doctor uses the diagnosis of medical images to predict whether a person suffers from damaged tissues or organs. Therefore, object detection and image classification of medical images have received considerable attention in the medical field. This paper puts forward a varicose vein detection algorithm based on varicose superficial venous thrombophlebitis images and a multi-scale neural network (MSNN) algorithm. The varicose vein diagnosis system has better accuracy and performance due to the uniqueness of the leg vein images. The proposed system uses multi-scale technology, pre-processing the images and extracting features by feature extraction, to create a detection system for varicose disease with high performance.

Index Terms— Vascular endothelial cells, inflammation, multi-scale neural network, sclerotherapy, endovascular laser therapy.

I. INTRODUCTION
Biometrics is an automated recognition system used to recognize individuals by means of physical appearance. A vein is a blood-carrying vessel that returns blood from every part of the body. Varicose veins appear as dark blue or purple veins which are swollen and enlarged. Digital medicine is a category in the medical field that combines prescription medication with an ingestible sensor component.
Lau et al. [1] proposed a deep learning algorithm that achieves a good recognition system by training with several images of patients' skin damage. Mohammed et al. [2] used a 3-dimensional convolutional network to classify and recognize patients with Alzheimer's disease. Kawahara et al. [3] used a CNN structure for obtaining a structural connectivity map from DTI via MRI scans. Yuan [4] used a multi-dimensional CNN algorithm to classify skin tissue damage, working with each CNN model at a different image resolution. Al-Antari et al. [5] used a modified neural network combined with a support vector machine to detect breast tumors, suggesting that the given model has improved performance in breast image classification.
Varicose veins in the lower limbs are the most common disease of the outer vessels. Around 23% of the people in the world suffer from varicoses in their legs and knees, and at present the number of people having varicose veins has exceeded 25 million worldwide. In China, the prevalence of varicoses has exceeded 8%. In addition to affecting esthetics, varicose veins can also cause complications, such as bleeding in the lower limbs, superficial vein swelling, inability to walk long distances, and reduced work capacity.
Varicoses are generally enlarged and swollen valves that typically appear on the legs and feet of the patient's body. This condition happens because the flow of blood is against gravity, and it occurs after prolonged standing. Varicose veins cause aching pain in the legs and feet and discomfort through loss of joint function, which can lead to circulatory problems.
The main problem facing doctors today is the difficulty of accessing veins for intravenous drug administration. Several problems occur when veins are incorrectly detected, such as injuries, lesions, thrombus, etc. Hence a non-surgical varicose diagnosis system based on MRI imaging using the MSNN algorithm has been developed, which classifies the varicose veins in the given dataset.
Thrombophlebitis is a blood clot in a leg or foot vein that causes inflammation of the superficial veins and affects the surface of the exterior skin. If a blood clot grows inside a swollen part of the vein, the vein appears puffed up and intensified.
Deep vein thrombophlebitis (DVT) appears due to a blood clot formed in the deep veins of the human body, usually in the lower limbs. Varicosis constitutes and acts as a primary cause of many diseases, especially of the lower limbs.

II. RELATED WORKS


Varicoses are broadened veins in the hypodermis tissues of the lower legs, and they are quite easily visible to the naked eye. In today's world, with the improvement of society and changes in people's lifestyles, varicoses have become a universal disease. Most people with varicose veins don't detect any changes in the superficial layer of the skin, so they don't give enough attention to the changes in the lower limbs of their body caused by varicose veins. Prolonged obstruction of the superficial valves of the lower limbs can increase the severity of varicose disease. Spider veins are a mild type of varicose veins which are smaller in size, appear red or blue in colour, and frequently appear on the lower legs. Varicoses can be prevented by regularly exercising, maintaining a healthy weight, and wearing loose clothes that are not tight to the body. Fig. 1 shows the comparison of normal and varicose veins.

Figure 1 Figure 2

The first work was done by Lau et al. [6], in which a network algorithm successfully achieved a good detection system using 100 MRI images of skin damage. Kawahara et al. [7] applied an MSNN structure that connects the structural connectivity map obtained from DTI and MRI images. The main advantage of the MSNN algorithm is that it makes use of spatial coherence, that is, it gives equal importance to all the edges of the network, and it is independent of geometrical transformations. The treatments used for varicose vein disease are injection treatment (sclerotherapy), laser treatment, and vein surgery. Surgery is considered one of the more painful methods, as it involves intervening by tubal ligation and also pulling out the veins; this causes the patient massive pain, and the patient takes a long period of time to recover from the surgery. Sclerotherapy is a methodology performed by injecting medicine into the blood vessels or lymph vessels, which makes the swelling shrink. This causes enormous pain in the swollen area, as it is difficult to find the affected vein exactly. Sclerotherapy is commonly used in the treatment of spider veins, the mild version of varicose veins, and this procedure is a non-surgical treatment as it only requires an injection. Complications of varicose veins include draining of blood, skin changes, rash, ulcers, infection, bleeding, and blood clots. Endovascular varicose laser surgery is a method used to treat varicose veins by penetrating the body with a laser, producing heat which shrinks the varicose veins. A model was designed to examine the bandwidth of the ultrasound devices and various values of the return speed; the conclusion of the simulation is the bandwidth in which necrosis in the valves of the blood vessels can be determined.
A. Multi-scale Neural Network
Specialists and various research results that have greatly advanced the development of medical image analysis favor the use of deep convolutional neural networks on medical images. However, because of the absence of open large-scale datasets, most research using deep convolutional neural networks is carried out on restricted small datasets, and the robustness of the network is not excellent. Fig. 2 represents the multi-scale neural network.
Furthermore, deep convolutional neural networks use supervised training strategies. To further improve the network's generalization performance, it is necessary to deepen the network hierarchy or look for a more sensible network structure. As the depth of the network grows, more labeled training data is required for adaptation. This paper combines elements of the GoogLeNet network structure, VGG, and the literature to build a lightweight deep convolutional neural network with improved feature extraction. The letter "C" denotes the convolution layer, "M" denotes the MFM activation layer, "P" denotes max pooling, and "Fc" denotes a fully connected layer.
B. Inception Layer
GoogLeNet is a kind of neural network that uses the inception module for training and evaluating the given dataset. GoogLeNet utilizes multiple filter sizes within the inception layer, together with occasional max pooling that halves the resolution of the images. There are some differences between GoogLeNet and other similar neural network architectures: GoogLeNet uses 1x1 convolution layers and also global mean pooling. Fig. 3 shows the inception module structure; a hedged sketch of such a module is given below.
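As an illustration of the idea only, the following sketch (using the tf.keras API, with filter counts chosen arbitrarily by us) shows how an inception-style module with parallel 1x1, 3x3, and 5x5 convolutions plus pooling might be assembled; it is not the authors' exact network.

# Hedged sketch of an inception-style module (filter counts are arbitrary).
import tensorflow as tf
from tensorflow.keras import layers

def inception_module(x, f1=64, f3=96, f5=32, fp=32):
    # Parallel branches over the same input, all padded to the same spatial size.
    b1 = layers.Conv2D(f1, (1, 1), padding="same", activation="relu")(x)
    b3 = layers.Conv2D(f3, (1, 1), padding="same", activation="relu")(x)
    b3 = layers.Conv2D(f3, (3, 3), padding="same", activation="relu")(b3)
    b5 = layers.Conv2D(f5, (1, 1), padding="same", activation="relu")(x)
    b5 = layers.Conv2D(f5, (5, 5), padding="same", activation="relu")(b5)
    bp = layers.MaxPooling2D((3, 3), strides=(1, 1), padding="same")(x)
    bp = layers.Conv2D(fp, (1, 1), padding="same", activation="relu")(bp)
    # Concatenate the branch outputs along the channel axis.
    return layers.Concatenate()([b1, b3, b5, bp])

inp = layers.Input(shape=(224, 224, 3))
out = inception_module(inp)
model = tf.keras.Model(inp, out)   # one module only, for illustration

The 1x1 convolutions before the larger filters reduce the channel count, which is the design choice that keeps such modules lightweight.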
C. Activation Function
An activation function is a module that produces small outputs for small inputs and large values once the input surpasses a threshold limit, introducing non-linearity into the network. Examples of activation functions include ReLU, Sigmoid, the step function, and Leaky ReLU. The mathematical representation of the activation function is given in Fig. 4.
D. ReLU
ReLU is a rectified function whose output is zero if the input value is less than zero and equals the input value if the input is greater than or equal to zero. Fig. 5 shows the equation and Fig. 6 shows the graph of the ReLU function; a small numerical sketch follows.
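For concreteness, a minimal NumPy sketch of the activation functions named above (our own illustration, not code from the paper):

# Minimal NumPy definitions of common activation functions.
import numpy as np

def relu(x):
    return np.maximum(0.0, x)          # 0 for x < 0, x otherwise

def leaky_relu(x, alpha=0.01):
    return np.where(x >= 0, x, alpha * x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))        # [0.  0.  0.  1.5]
print(leaky_relu(x))  # [-0.02 -0.005 0. 1.5]
print(sigmoid(x))     # values in (0, 1)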

Figure 3 Figure 4

Figure 5

Figure 6

III. PROPOSED WORK

Varicosis is classified into many types based on the severity of the disease in the patient's body. In this paper, each part of the image segmentation pipeline is developed to detect varicoses through image pre-processing and feature extraction, with the segmented regions finally overlaid on the original images. Using the algorithm, the classification results determine the efficiency of detecting the infected veins. Figs. 7 and 8 show the block diagrams of the training and testing phases of the network.

IV. WORKING
A. Image Preprocessing and Feature Extraction
The images present in the dataset are pre-processed and the features are extracted using the mean pixel value method. These images are then separated into a training set and a testing set.
B. Neural Network Training
After pre-processing and feature extraction of the images in the dataset, the MSNN algorithm is trained using the training set. After training, the neural network is evaluated and validated on the testing set. This is done by matching the testing set against the original images, and the accuracy of the MSNN algorithm is measured. The accuracy on the testing set tends to increase with the number of training iterations. The efficiency and error test curves of the MSNN model in maximum aggregate sampling mode are shown. A hedged sketch of this train-and-evaluate flow appears below.
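Purely as an illustration of the preprocess/train/evaluate flow described above (the random data, image shape, and tiny CNN are our assumptions, not the authors' model; only the 30-epoch setting is taken from the paper):

# Hedged sketch: mean-pixel preprocessing, train/test split, train, evaluate.
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

# Hypothetical data: N leg-vein images and binary labels (varicose or not).
images = np.random.rand(200, 64, 64, 3).astype("float32")
labels = np.random.randint(0, 2, size=200)

# Mean pixel value method: centre each image on its own mean pixel.
images -= images.mean(axis=(1, 2, 3), keepdims=True)

X_train, X_test, y_train, y_test = train_test_split(
    images, labels, test_size=0.2, random_state=42)

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(64, 64, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=30, verbose=0)   # 30 epochs, as in the paper
loss, acc = model.evaluate(X_test, y_test, verbose=0)
print(f"test accuracy: {acc:.3f}")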
C. Network Performance Comparison
The number of training parameters of the MSNN model is related to the depth of the network and the number of filters in each layer. A deeper network and a wider network width can capture more features and improve the network's ability to represent them. The network parameters of NIN and GoogLeNet are smaller than those of VGG: network training time is shorter, resource overhead is lower, and network execution efficiency is higher. At the same time, the MSNN model constructed in this article introduces a depthwise-separable convolutional layer, so the model's parameter count is the smallest and the network runs the fastest, not only on a desktop computer with a graphics card but also on a portable mobile device.
Combined with image multi-scale technology, it can still achieve high classification accuracy while reducing the parameters of the deep convolutional neural network model.

Figure 7 Figure 8

V. RESULTS
The proposed system's accuracy over 30 epochs (150 iterations) of the MSNN model is shown in Fig. 9. The model is also used to predict test samples of the dataset, and the confusion matrix of the predicted values is displayed in Fig. 10.

Figure 9 Figure 10

VI. CONCLUSION
There is a positive correlation between vascular endothelial cell inflammation and varicose veins of the lower extremities. Therefore, this paper uses vascular endothelial cells as the research object to construct a deep convolutional neural network for lower-limb varicose veins and improve classification and recognition accuracy. The accuracy of the presented MSNN work comes close to meeting the acceptable error levels that would be required of a system classifying a patient's leg image.
Compared with existing deep convolutional neural network models, the network improves feature extraction ability and has the characteristic of running fast.

REFERENCES
[1] R. Panneer Selvi, R. Sasikumar, S. Deva Priya, and C. Jeganathand, "Real-Time Epidemiology of Varicose Veins and Chronic Venous Disease Prediction Using Decision Tree Algorithm," Turkish Journal of Computer and Mathematics Education, vol. 12, no. 9, 2021.
[2] Krishnarani M., Malini V., Rathna P., Sharmila A., and Vaizhnavi G., "Analysis of Varicose Veins in Lower Limbs through Multiscalar CNN," IJESC, vol. 10, no. 4, 2020.
[3] Ruizong Zhu, Huiping Niu, Ningning Yin, Tianjiao Wu, and Yapei Zhao, "Analysis of varicose veins of lower extremities based on vascular endothelial cell inflammation images and multiscale deep learning," November 2019.
[4] "Detection of Diseases Using Machine Learning Image Recognition Technology in Artificial Intelligence," Jian Huang, Jing Li, Zheming
[5] Li, Zhu Zhu, Chen Shen, Guoqiang Qi, Gang Yu, Hindawi Computational Intelligence and Neuroscience, vol. 2022, Article ID 5658641.
[6] Suma V. R., Amog Shetty, Rishab F. Tated, and Sunku Rohan, "CNN based Leaf Disease Identification and Remedy Recommendation System,"
[7] Proceedings of the Third International Conference on Electronics Communication and Aerospace Technology [ICECA 2019], IEEE Conference Record #45616, IEEE Xplore ISBN: 978-1-7281-0167-5.
[8] Al-Antari M. A., Al-Masni M. A., Mun-Taek C., et al., "A fully integrated computer-aided diagnosis system for digital X-ray mammograms via deep learning detection, segmentation, and classification," International Journal of Medical Informatics, 2018, 117: 44-54.
[9] Lau H. K., Chang J. W., Daut N., et al., "Exploring Edge-Based Segmentation Towards Automated Skin Lesion Diagnosis," Advanced Science Letters, 2018, 24(2): 1095-1099.
[10] Shen W., Zhou M., Yang F., et al., "Multi-scale Convolutional Neural Networks for Lung Nodule Classification," Inf Process Med Imaging, 2015, 24: 588-599.
[11] Haxhe J., De Maeseneer M., and Schoevaerdts J., "Le traitement des varices en Belgique: où allons-nous? Impact des nouvelles technologies sur les pratiques médicales," Phlébologie, 2010, 63(2): 9-14.
[12] R. Kenneth, "User Authentication via Keystroke Dynamics: an Artificial Immune System Based Approach," in Proceedings of 5th International Conference on Information Technology, 2011.


Email Automation and Database Management


Prof. Umakant Tupe1, Shubham Ghalme2, Kanchan Shelke3 and Rutuja Kadam4
1-4
JSPM Rajarshi Shahu College of Engineering, Pune, India
Email: [email protected], [email protected], [email protected],
[email protected]

Abstract— Senders and recipients both put a lot of work into managing their email, and perhaps some of this work can be automated. In order to determine (i) the kind of automated email handling users desire and (ii) the types of data and computation required to support that automation, we conducted a mixed-methods need-finding study. We organised the needs through a design workshop, ran a poll to further understand those categories, then categorised the email automation software already on the market to determine which criteria have been satisfied. Our findings point to the necessity of a richer data model for rules, additional attention-management options, use of the context of internal and external emails, complex processing like response aggregation, and sender-friendly features. We created a framework for producing short stories to better explore our findings. An efficient information system gives users accurate, timely, and pertinent information that they may use when making judgments for current operations as well as long-term strategic planning. To ensure that the decisions made are the right ones, the decision-making process must be supported by timely and pertinent data and information. Information is created as a result of data processing, which information systems perform using information technology. Data management is required to ensure that the information is the right information, at the right time, accurate, and pertinent. Data is the building block of information and is gathered in a database. To collect the necessary information on the academic organization of an institution, for instance, a university must create an academic database that at the very least includes student, lecturer, course, room, and schedule data. Therefore, a basic understanding of databases and database management systems is required in order to build a successful database. Database management solutions are used to organize the massive volumes of data that businesses use on a regular basis. Managers need to be able to swiftly and readily discover certain facts so that they can make decisions. The company divides the complete data collection into a series of linked data tables; by reducing data repetition, these linked small collections of data will ultimately improve data consistency and accuracy. Most businesses today employ databases with a relational structure. An automated email is any message that is automatically sent from your email service provider (ESP) in response to a particular user's actions (or inactions) on your website or web app. You may use automated mail to provide one-to-one communications to customers in real time, enhancing their engagement, loyalty, and retention.

Index Terms— SQL, SMTP, Python, Pandas, NumPy, E-mail, Automation etc.

I. INTRODUCTION
In the 51 years since its creation, email has developed into not just a commonplace instrument for individual and group communication but also a place for keeping track of assignments, activities, and personal information. Email is increasingly intimately associated with work as a result, and for many individuals email takes up the majority of their workday. Due to the burden it generates, there has always been a need to automate many aspects of email processing; since at least 1990, programs have allowed writers to create regular-expression scripts to filter email into certain folders. Richer automation functionalities have developed over time. For instance, Boomerang enables users to postpone receiving emails so they can be received again at a later time. Because different apps provide different automation possibilities, a user is forced to manage all of their requests manually, rely on a human assistant, or juggle a collection of 3rd-party plugins. A small sketch of such regular-expression filtering is given below.
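As an illustration of the kind of regular-expression filtering mentioned above (the rules and folder names here are invented by us, not taken from the paper):

# Illustrative regex-based email filtering into folders (rules are made up).
import re

RULES = [
    (re.compile(r"invoice|payment due", re.I), "Finance"),
    (re.compile(r"unsubscribe|newsletter", re.I), "Newsletters"),
    (re.compile(r"\bmeeting\b|\bagenda\b", re.I), "Meetings"),
]

def classify(subject: str, body: str) -> str:
    text = subject + " " + body
    for pattern, folder in RULES:
        if pattern.search(text):
            return folder
    return "Inbox"  # default folder when no rule matches

print(classify("Invoice #42 payment due", ""))  # -> Finance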

II. MOTIVATION
Our initial promise is to find the best solution and overcome current industry issues. Being a sponsored project, it will improve our profile. Our approach will lessen manual labour, give us first exposure to industry, and maximise effectiveness.

III. PROBLEM DOMAIN


This project's primary method of operation is a database management system that sends automatic emails to the non-fillers after comparing and segmenting the data from the GST portal. The automated processes will take the place of the manual ones. By automating processes, we can cut back on expenses, time, and waste.

IV. PROBLEM DEFINITION


The organization's immediate challenge was to separate the data by contrasting two databases based on specific criteria. When the organisation receives a lot of data from the GST portal, we must divide it into two groups, "fillers" and "non-fillers", based on the data already in the organisation. After segregation, we must automatically email the non-fillers with their transaction data and a reminder, in the organization's letter format, to pay their debts as soon as feasible. This project tries to overcome a real-time difficulty encountered by industry.

V. STATEMENT
Automated emails are issued to non-fillers asking them to pay their invoices as soon as possible.

VI. REVIEW OF LITERATURE


Database management systems (DBMSs) are pieces of software that make it simple for businesses to consolidate data, manage data effectively, and give applications access to data. Several DBMSs, such as Microsoft Access, IBM DB2, Postgres, Oracle, and Microsoft SQL Server, are highly popular. The primary purpose of a DBMS is to store data; users are not required to understand how data is stored or processed. Each piece of data entered must be transformed into a predetermined structure and format as part of the DBMS's role in data presentation and transformation.
Email users are both senders and recipients, and systems provide automation capabilities addressing both roles. For instance, email clients provide two types of reminders: email users can set a reminder on a message to get back to it later (reminder as a recipient) and remind their recipients in order to solicit responses (reminder as a sender).
Implementation / Methodology
Data will be compared under the parameters set forth by the company. After comparison, the data segregates into two types:
The matched data from both datasets fall under the fillers category.
The unmatched data from both datasets fall into the non-fillers category.
All transaction information for a single consumer that falls into the non-filler category is then gathered, and these transactions are tracked in separate Excel files keyed by GSTIN number. After that, each non-filler is sent an automated email by obtaining their email address from the organization's database. A sketch of this comparison step follows.
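A minimal pandas sketch of this comparison and segregation, assuming hypothetical file and column names such as GSTIN (ours, not the organization's actual schema):

# Hedged sketch: segregate fillers vs non-fillers by comparing two datasets.
import pandas as pd

portal = pd.read_excel("gstr2a_portal.xlsx")        # data from the GST portal
register = pd.read_excel("company_register.xlsx")   # organization's own records

# Outer-style merge on GSTIN; the indicator column tells us where each row matched.
merged = register.merge(portal, on="GSTIN", how="left", indicator=True)

fillers = merged[merged["_merge"] == "both"]           # matched in both datasets
non_fillers = merged[merged["_merge"] == "left_only"]  # missing from the portal

# One Excel file of transactions per non-filler, keyed by GSTIN.
for gstin, txns in non_fillers.groupby("GSTIN"):
    txns.to_excel(f"non_filler_{gstin}.xlsx", index=False)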

VII. PROBLEM SOLVING


The problem will be resolved in two phases.
Phase 1: Importing the organization's pre-existing dataset for the non-fillers category.
The organisation has filtered out this dataset by using manual filters on an Excel sheet.
All transaction information for a single consumer in the non-filler category is gathered.
These transactions are tracked in separate Excel files keyed by GSTIN number.
Each non-filler is sent an automated email by obtaining their email address from the organization's database.
Phase 2: Importing the GSTR2-A company register and portal datasets.
Data comparison on the terms set forth by the business.
Data division into two categories:
The matched data from both datasets fall under the fillers category.
The unmatched data from both datasets fall into the non-fillers category.
Exporting the filtered-out non-filler dataset onto a separate Excel sheet for use in the following phase.
A sketch of the automated mailing step appears below.
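The mailing step could look like the following smtplib sketch; the SMTP host, credentials, and message wording are placeholders, not the organization's actual letter format:

# Hedged sketch of sending the automated reminder emails (details are placeholders).
import smtplib
from email.message import EmailMessage

def send_reminder(recipient: str, gstin: str, attachment_path: str) -> None:
    msg = EmailMessage()
    msg["Subject"] = f"Payment reminder for GSTIN {gstin}"
    msg["From"] = "[email protected]"
    msg["To"] = recipient
    msg.set_content("Please clear your pending invoices at the earliest.")
    with open(attachment_path, "rb") as f:   # attach the transaction sheet
        msg.add_attachment(f.read(), maintype="application",
                           subtype="octet-stream", filename=f"{gstin}.xlsx")
    with smtplib.SMTP("smtp.example.com", 587) as server:
        server.starttls()                    # encrypt the session
        server.login("user", "password")     # placeholder credentials
        server.send_message(msg)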

VIII. CONCLUSION
Users want their email management to be more automated, according to our research. The results of the three different need-finding probes we carried out consistently pointed to a few common categories of email needs: capturing richer data models and internal and (time-varying) external context; using them for recipients to manage attention and for senders to lessen recipient load; and automated content processing to, for example, aggregate replies to an invitation or extract attached photos into a relevant storage location.
We also observed workarounds that repurpose the characteristics of email today: across numerous probes, users were seen marking emails as unread to act as a reminder to revisit and read them.

IX. FUTURE SCOPE


A graphical user interface that is interactive and simple to understand saves time and money while increasing accuracy. The solution is reusable (i.e., the scripts can be used for any similar form of automation) and flexible (i.e., easy to adapt for any future updates). The organization's manual tasks will be replaced by automated ones that meet business automation standards.

ACKNOWLEDGEMENT
Perseverance, inspiration and motivation have always played a key role in the success of any venture. At this level of understanding it is difficult to grasp the wide spectrum of knowledge without proper guidance and advice; hence we take this opportunity to express our sincere gratitude to our respected project guide, who evolved in us an interest in working on and selecting an entirely new idea for project work. He has been keenly co-operative and helpful to us in sorting out all the difficulties. We are also grateful to our classmates and cohort members, especially our office mates, for their editing help, late-night feedback sessions, and moral support. Thanks should also go to the librarians, research assistants, and study participants from the university, who impacted and inspired us. We would also like to thank our HOD and Principal for their continuous advice and support.



A Novel Approach with Deep Learning Method with Effective Storage Security in Hybrid Clouds
Vijay Prakash1, Aditya Tripathi2, Shashank Saxena3 and Arshad Ali4
1-4
iNurture-TMU
[email protected], [email protected], [email protected], [email protected]

Abstract— Platforms, data storage, and IT services are delivered over the Internet in cloud computing, a contemporary computing technology. Task management is crucial for effective scheduling and affects the overall effectiveness of cloud computing environments, due to the full availability of resources and the significant number of tasks assigned to them. In cloud environments, security is a crucial concern in addition to timing. Since cloud computing services go beyond data archiving and backup, supporting data dynamics through the most popular types of data manipulation, like block modification, insertion, and deletion, is also crucial for practical use. Prior attempts to ensure remote data integrity have not always supported public auditability or dynamic data manipulation, but this document accomplishes both. We first identified the challenges and potential security concerns of direct extension with fully dynamic data updates from prior work, and then we seamlessly incorporated these two crucial features into the protocol design. In particular, we enhance existing proof-of-storage models by modifying the conventional Merkle hash tree structure for block tag authentication to achieve effective data dynamics, and we demonstrate how to construct an elegant validation scheme on top of it. A variety of techniques have been put into practice to secure cloud data storage [1], though the safety analysis method described in [1] is not a practical technique. This work uses the new idea of smart card authentication to provide security for cloud data storage; smart card authentication is an effective method for making data storage in the cloud more secure. We implemented this prototype in accordance with the CPDP scheme within the virtualization framework of a cloud-based storage service, with the Hadoop Distributed File System (HDFS) illustrated in Figure 5 as an example.
HDFS is a distributed, scalable, and portable file system [14]. Its architecture is made up of NameNodes and DataNodes, where NameNodes translate filenames to a collection of block indices and DataNodes hold actual data blocks. The NameNode's index hash table and metadata must be integrated in order to support the CPDP scheme, provide query services based on hash values or index hash records, and implement the verification protocol.

Index Terms— cloud computing, cloud security, hybrid clouds, public verifiability, and storage
security.

I. INTRODUCTION
Hybrid clouds effectively offer dynamic scaling of services and data transfers by integrating a variety of private
and public cloud services. For instance, a customer can combine data from various private or public providers
into one backup file or archive (see Figure 1). As an alternative, a service can take data from other services that
are located in a private cloud and store it in its own storage, creating a hybrid cloud.

A Provable Data Possession (PDP) scheme based on the public cloud offers a publicly accessible remote interface for exploring and managing vast amounts of data. The benefit of the suggested PDP scheme is that its performance is comparable to that of the hybrid cloud and it can meet specific bandwidth requirements, but capacity cannot keep up over time. Consider the hybrid cloud storage service depicted in Figure 1 as a solution to this problem. It consists of three distinct entities, including an organization that manages and offers storage services through the use of large amounts of computing power and storage space, and trusted third parties (TTPs) that keep track of and offer data retrieval services for customer review information.
The new paradigm of data storage in the "cloud" is seen as a promising service platform for the Internet, but it raises a number of challenging design issues that have a big impact on the overall security and performance of an organization. The most significant issue with cloud data storage is data authentication on unreliable servers.
For example, a storage service provider may choose to conceal data failures from its clients in order to benefit from them. More importantly, by not keeping or consciously deleting data files that they routinely share with their customers, service providers can save money and disk space. When a significant amount of electronic data is offloaded and the customer has constrained resource capacity, a solution must be found that lets the customer mitigate the issue by performing routine consistency checks without using a local copy of the data files.
Several solutions have been put forth among various protection models and schemes to address this issue. In all of this work, a lot of time and effort is put into designing solutions that satisfy various requirements, among them high system efficiency, stateless verification, unlimited query use, and retrievability of data.
All of the methods presented so far can be divided into two groups based on the role of the verifier in the model: private verifiability and public verifiability. Although a system with private verifiability can achieve higher system efficiency, it means that only the customer (the data owner) can challenge the cloud server, whereas public verifiability does not.
With public verifiability, customers can contract out service evaluations to independent external auditors (TPAs) without using their own computing resources; in the cloud, the client itself cannot always be relied upon to verify integrity. The addition of public verifiability to validation protocols, which are anticipated to play a more significant role in achieving economies of scale for cloud computing, seems more balanced in real-world applications. The discussion of performance is similar: when performing validation, validators should not need the original data.

II. RELATED WORK


Yan Zhu, Huaixi, et al. (2010) [1] use provable data possession to guarantee data integrity. Their data possession scheme for the hybrid cloud is helpful and auditable; it offers service scalability and data migration, and it collectively gathers and stores customer data. Because operating costs are lower, communication complexity can be reduced to a minimum.
Qian Wang et al. [2] introduce a new scheme that enables remote data integrity checking and auditability with dynamic data manipulation. This work first pinpoints specific scaling issues and potential security concerns with fully dynamic data updates. Through manipulation of the conventional Merkle Hash Tree (MHT) structure, which is used to validate block tags, they achieve efficient data dynamics and enhance retrievability models. This is a very effective and safe technique [2].
A model-driven allocation framework was put forth in 2012 by Tekin Bicer, David Chiu, and Gagan Agrawal [3]. Data-intensive applications running in hybrid cloud environments can benefit from this technique's support for time- and cost-efficient execution. With a 3.6 percent error rate, it can meet implementation deadlines, stick to budgetary restrictions, and shorten application run times.
Using currently available cloud computing organisms, Haoming Liang, Wenbo Chen, and Kefu Shi [4] proposed a method for analyzing programming and task scheduling models. The programming process, its modification process, and the flow of replacing services and resources are all explained with the help of examples.
In 2010, Ravi Sandhu, Raj Boppana, and Ram Krishnan put forth a fresh idea for integrating mission-driven presentation, pliability, and security policies into the computing and communication infrastructure by integrating hooks and supporting protocols into the cloud. This methodology can effectively address the twin issues of cloud security and accessibility [5].
A dynamic user-integrated cloud computing architecture was introduced in 2011 by Guannan Hu and Wenhao Zhu [5]. This model expands the capabilities of cloud computing data centers by actively integrating clients with storage and computing capability; services are offered to other users through client cooperation with the data center [5]. In order to better meet the practical learning needs of lifelong learners, Xiang Li, Jing Liu, Jun Han, and Qian Zhang described the design of a micro-learning platform architecture constructed with cloud computing expertise, detailing the layered structural design and intent of the cloud-based micro-learning platform [6].
Xinwen Zhang, Anugeetha Kunjithapatham, Simon Gibbs, Joshua Schiffman, and Sangoh Jeong proposed a solution for authentication and secure session management between weblets running on the device side and weblets in the cloud in 2009 [7]. It allows cloud weblets to access sensitive user data through external web services and offers protected migration. In business environments, it enables application integration between private and public clouds [8].
In 2010, Yan Zhu, Huaixi, et al. [9] suggested a hybrid cloud data possession plan that facilitates data migration and service scalability. This opens up possibilities in which several cloud service providers collaborate to store and manage customer data; less overhead and simpler communication are the outcomes of this plan.
Qian Wang et al. [10] recommend a protocol that outlines the challenges and potential security concerns of direct extension with fully dynamic data updates before demonstrating how to create complex validation schemes for error-free integration. This affects block tag validation using the conventional Merkle Hash Tree (MHT) structure [10].
Arash Nourian and Muthucumaru Maheswaran proposed a new image coding scheme in 2012 that transforms images using chaos maps, chosen after random masking, to enhance image privacy while still enabling the cloud to carry out some types of computation.
Jia Yu, Rajkumar Buyya, and Kotagiri Ramamohanarao presented a method in 2008 for allocating the proper resources to workflow tasks so as to complete their execution and enable each user to perform the desired function. It attempts to enhance the existing workflow scheduling algorithms that have been created and used by various grid projects [13].
Luis Mendonça and Henrique Santos published research findings and test results in 2012 that defined an efficient set of traffic parameters able to model both normal and abnormal network behavior and expose anomalous, coordinated behavior, focusing on the detection of botnet movements. The detection framework model was also tested with actual traffic gathered at the edge of the University of Minho campus network [15].
A new security load-balancing architecture based on multilateral security (LBMS) was proposed by Pengfei Sun, Qingni Shen, Ying Chen, Zhonghai Wu, and Cong Zhang in 2011. This architecture selects the ideal physical machine from a security standpoint by automatically migrating tenant VMs during peak loads. The method is built on CloudSim, a cloud computing simulator, and the design tries to prevent potential attacks when VMs move to physical machines for load balancing.
A new hybrid scheme that combines anomaly and signature detection with honeypots was put forth by Pragya Jain and Anjali Sardana in 2012. To enable real-time system capabilities, the first stage uses signature-based detection of known worm attacks. Anomaly detectors can quickly spot deviations from the norm at the second level, and honeypots are used to identify zero-day attacks at the top level. By utilizing honeypots together with both detectors, it provides the advantages of a resource-efficient honey farm; regulators route data traffic to the proper honeypots [18].

III. PROBLEM STATEMENT


The issue is that numerous techniques have been used to store dynamic data and a variety of authentication methods have been used to protect it. Although public verifiability and data dynamics are each already implemented in cloud environments, it is still inefficient to provide both at the same time for remote data authentication, and these techniques have not been combined with Merkle-hash-tree storage of the data. A technique was therefore put in place to provide a joint PDP scheme that supports dynamic scaling across multiple storage servers.

IV. PROPOSED METHODOLOGY


Smart cards can support a wide range of applications used by many different communities, including electronic (digital) signatures, email and data encryption, virtual private network (VPN) authentication, password management, biometric authentication, and secure wireless network access. There are several variations of smart card technology, and in the proposed system they are also used to connect Hadoop and unstructured data.
Smart cards come as plastic cards (with contact or contactless communication), USB devices, or protected components that can be embedded in mobile phones and other devices. The following diagram displays the proposed method's architectural overview.
The suggested approach here uses smart cards to provide security while implementing the hybrid cloud concept.
A. Authentication using a smart card
Configuration:

Module 1: Key Generation.
Choose a number of data files.
Save the information as a Merkle hash tree (see the sketch following the module descriptions).
Keep a count of the files.
Create a private key to facilitate authentication.
Module 2: Key Assignment.
Assign a key to each file.
Encrypt each file with its corresponding key.
Keep the key and the file information in a hash table, since using an index to access data is a simple process.
Only the data index is searched, not the entire database, so lookup is fast.
Module 3: Cloud Server Data Storage.
The encrypted files are saved to different locations on the cloud server.
The requestor possesses only the appropriate keys.
The requester sends these keys to an outside verifier (TPA).
The TPA uses these keys to validate data later on.
The original data, however, is hidden from the TPA, which uses only cryptographic signature schemes for verification checks.
Module 4: Integrity Check.
Decrypt all of the cloud server's files.
Combine all the files.
Verify the data size; it must be identical to that of the initial data.
If data loss occurs as a result of a file's technical difficulties, the corresponding encrypted file is restored to this location.
The entire file is kept encrypted so that security is never compromised.
Module 5: Data Dynamics.
At runtime, this module executes a few operations on the cloud server:
Data modification.
Data insertion.
Data deletion.
Module 6: Batch Audit.
Many users keep their files on cloud servers.
In a batch system, each user validates his own data.
To accomplish this, a number of scheduling and priority algorithms (e.g., bottleneck and deadlock handling) are employed, so the auditing time is very low.
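As a minimal illustration of the Merkle hash tree storage referred to in Module 1 (and of the recompute-and-compare integrity check of Module 4), the following Python sketch builds a tree over file blocks and verifies the root. The block contents are placeholders; a full scheme would also store authenticated paths and signed roots.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(blocks: list[bytes]) -> bytes:
    # Leaf level: hash every data block.
    level = [sha256(b) for b in blocks]
    # Hash adjacent pairs until a single root digest remains.
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])        # duplicate last node on odd levels
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

# Placeholder file blocks stored on the cloud server.
blocks = [b"block-0", b"block-1", b"block-2", b"block-3"]
root = merkle_root(blocks)

# Later integrity check: recompute the root over the retrieved blocks
# and compare it with the stored root value.
assert merkle_root(blocks) == root
```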
To determine whether the evidence passes an integrity check, a verifier (TTPA) runs a verification algorithm. If the validation succeeds, it returns TRUE; otherwise, it returns FALSE. Finally, the user receives a thorough test report. In the system description, we discussed the use of the TTPA for hybrid cloud data integrity verification. Our scheme consists of three main client-side phases, called the initialization stage, the key generation stage, and the tag generation stage.
Step 1:
The user (or data owner, DO) first obtains a certificate during the KeyGen stage by sending a request to the CA (Certificate Authority) together with his name and public key. The format of certificates issued by the CA is as follows:
C(DO) = [ID(DO), verDO, sigCA(ID(DO), verDO)]
where sigCA(ID(DO), verDO) denotes the CA's digital signature, produced using the CA's own signing algorithm.
Step 2:
The data owner uses the verification algorithm verCA[C(DO)] to confirm the certificate.
Step 3:
The TTPA and the cloud server must receive this certificate in a secure manner, because a perpetrator might pose as a trustworthy user and acquire a valid certificate containing public key and identity information. In order to prevent this, the consumer gives the cloud servers and verifiers access to the certification authority's verification algorithm verCA[C(DO)]. The cloud server additionally obtains a certificate:
C(DO) = [ID(DO), verDO, sigCA(ID(DO), verDO)] ----- (9)
C(CS) = [ID(CS), verCS, sigCA(ID(CS), verCS)] ----- (10)
Step 4:
During the exchange of communications between the cloud server and the TTPA, a safe, authenticated channel is created using a session key, and the two parties communicate using the same session key. The cloud server signs the generated proof (P) with its private key (SK), creating a signature, and then sends (P, the signature, C(DO)) to the TTPA.
Step 5:
After receiving the signature, the TTPA first verifies the certificate using the verification algorithm. If the certificate is valid, the response is checked using its public key (PK); otherwise, the consistency check stops.
Step 6:
In the following step, the proof reply is decrypted using the public key (PK). This is done after the certificate of authenticity has been verified. The authenticity of the server is then established by verifying the signature.
Step 7:
Finally, the TTPA checks the proof by running the verification algorithm. Doing so prevents active attackers from altering the data in any way.
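As a minimal sketch of the sign-and-verify exchange in Steps 4 to 7, the following Python code uses RSA-PSS from the cryptography package. The proof bytes are placeholders, and the certificate checks of Steps 5 and 6 are abbreviated to a single signature verification.

```python
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes
from cryptography.exceptions import InvalidSignature

# Cloud server key pair (SK, PK); in the scheme, PK is carried in C(CS).
sk = rsa.generate_private_key(public_exponent=65537, key_size=2048)
pk = sk.public_key()

proof = b"P: response to the TTPA's integrity challenge"  # placeholder

# Step 4: the cloud server signs the proof P with its private key SK.
pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                  salt_length=padding.PSS.MAX_LENGTH)
signature = sk.sign(proof, pss, hashes.SHA256())

# Steps 5-7: the TTPA verifies the signature with the server's public key,
# establishing server authenticity before checking the proof itself.
try:
    pk.verify(signature, proof, pss, hashes.SHA256())
    print("TRUE: proof accepted")
except InvalidSignature:
    print("FALSE: proof rejected")
```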

V. RESULT ANALYSIS

TABLE I. PROTECTION OF THE PROPOSED WORK AGAINST VARIOUS ATTACKS

Attack type                          Prevented
Recurring (replay) attacks           Y
Attacks involving identity theft     Y
Insider dangers                      Y
External assaults                    Y
Eavesdropping                        Y
Identity theft                       Y
Attacks using passwords              Y

The attacks prevented by the proposed scheme are summarized in Table 1.

TABLE II. FIRST FACTOR AUTHENTICATION ANALYSIS

Number of bits in token    Number of bits in secret value    Time taken
64                         216                               24.5 sec

An analysis of first factor authentication is presented in Table 2. Here, the number of bits used to generate the secret value depends on the number of bits used to create the token.

TABLE III. MEMORY COMPARISON OF THE SUGGESTED METHOD

Storage / scheme    Our scheme    R. Song et al.
Smart card          640 bits      480 bits
Server              320 bits      640 bits

Table 3 shows that the proposed method reduces the load on the server, because the server only stores its own private key.

VI. CONCLUSION
In any environment, security is crucial to the transmission of data from sender to receiver. When data is consistent, it is kept in a single place across many nodes. The cloud concept was introduced for data storage and data integrity. When multiple clouds are implemented and data is stored dynamically, it is typically possible to check nodes from one cloud to another and to back up dynamic data. Adding unstructured data to Hadoop, however, remains challenging. By authenticating nodes from one cloud to another, we use the hybrid cloud concept to dynamically store data exposed from any node in any cloud; this paper describes the idea and works towards realizing it. Data ownership is meant to be dynamic, so whenever a client of one node wants to access data kept in another cloud, it first has to approve public access to the data. The suggested method improves authentication, lessens the chance of eavesdropping, and guards against a number of attacks such as DoS and replay attacks.

VII. REFERENCES
[1] Yan Zhu, Huaixi Wang, Zexing Hu, Gail-Joon Ahn, Hongxin Hu, and Stephen S. Yau, “Efficient Provable Data Ownership for Hybrid Clouds”, Proceedings of the 17th ACM Conference on Computer and Communication Security (CCS '10), pp. 756-758, 2010.
[2] Qian Wang, Cong Wang, Jin Li, Kui Ren, and Wenjing Lou, “Enabling Public Verifiability and Data Dynamics for Storage Security in Cloud Computing”, Proceedings of the 14th European Conference on Research in Computer Security (ESORICS '09), pp. 355-370, 2009.
[3] Tekin Bicer, David Chiu, and Gagan Agrawal, “Time and Cost Sensitive Data-Intensive Computing on Hybrid Clouds”, 2012 IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 636-643, May 2012.
[4] Haoming Liang, Wenbo Chen, and Kefu Shi, “Cloud Computing: Programming Models and
[5] Guannan Hu and Wenhao Zhu, “A Dynamic User-integrated Cloud Computing Architecture”, Proceedings of the 2011 International Conference on Innovative Computing and Cloud Computing (ICCC '11), pp. 36-40, 2011.
[6] Xiang Li, Jing Liu, Jun Han, and Qian Zhang, “Architecture Design of Microlearning Platform Based on Cloud Computing”, Proceedings of the 2011 International Conference on Innovative Computing and Cloud Computing (ICCC '11), pp. 80-83, 2011.
[7] Zhang, Joshua Schiffman, Simon Gibbs, Anugeetha Kunjithapatham, and Sangoh Jeong, “Securing Elastic Applications on Mobile Devices for Cloud Computing”, Proceedings of the 2009 ACM Workshop on Cloud Computing Security (CCSW '09), pp. 127-134, 2009.
[8] Qian Wang, Cong Wang, Jin Li, Kui Ren, and Wenjing Lou, “Enabling Public Verifiability and Data Dynamics for Storage Security in Cloud Computing”, Proceedings of the 14th European Conference on Research in Computer Security (ESORICS '09), pp. 355-370, 2009.
[9] Arash Nourian and Muthucumaru Maheswaran, “Towards Privacy-Enhanced Limited Image Processing in the Cloud”, Proceedings of the 9th Middleware Doctoral Symposium of the 13th ACM/IFIP/USENIX International Middleware Conference (MIDDLEWARE '12), Article No. 5, 2012.
[10] Jean Bacon, David Evans, David M. Eyers, Matteo Migliavacca, Peter Pietzuch, and Brian Shand, “Enforcing End-to-End Application Security in the Cloud”, Proceedings of the ACM/IFIP/USENIX 11th International Conference on Middleware (Middleware '10), pp. 293-312, 2010.
[11] Jia Yu, Rajkumar Buyya, and Kotagiri Ramamohanarao, “Workflow Scheduling Algorithms for Grid Computing”, Springer Berlin Heidelberg, ISSN 1860-949X, pp. 173-214, 2008.
[12] Jon Oberheide, Kaushik Veeraraghavan, Evan Cooke, Jason Flinn, and Farnam Jahanian, “Virtualized In-Cloud Security Services for Mobile Devices”, Proceedings of the First Workshop on Virtualization in Mobile Computing (MobiVirt '08), pp. 31-35, 2008.
[13] Luís Mendonça and Henrique Santos, “Botnets: A Heuristic-Based Detection Framework”, Proceedings of the Fifth International Conference on Security of Information and Networks (SIN '12), pp. 33-40, 2012.
[14] Alex Kantchelian, Justin Ma, and Ling Huang, “Robust Detection of Comment Spam Using Entropy Rate”, Proceedings of the 5th ACM Workshop on Security and Artificial Intelligence (AISec '12), pp. 59-70, 2012.
[15] Pengfei Sun, Qingni Shen, Ying Chen, Zhonghai Wu, and Cong Zhang, “POSTER: LBMS: Load Balancing based on Multilateral Security in Cloud”, Proceedings of the 18th ACM Conference on Computer and Communication Security (CCS '11), pp. 861-864, 2011.
[16] Shiuan-Tzuo Shen, Hsiao-Ying Lin, and Wen-Guey Tzeng, “An Effective Integrity Check Scheme for Secure Erasure Code-Based Storage Systems”, IEEE Transactions on Reliability, vol. 64, no. 3, September 2015.


Video Surveillance Fire Detection System using CNN Algorithm
Prof. U. L. Tupe1, Lakhan Jadhav2, Shivanand Koli3, Prasad Kulkarni4 and Mayur Gaikwad5
1-5JSPM's Rajarshi Shahu College of Engineering/Information Technology, Pune, India
Email: [email protected]

Abstract— Fires in public places such as shopping malls, hospitals, and train stations can endanger both people and resources, and this has been a major concern for the past few decades. Stopping these accidents should be a priority. Various techniques exist to achieve this, but they have loopholes. To overcome those loopholes, we have developed a model that detects fire in images and video frames and, as soon as it detects fire, sends an alert message to the nearest fire station and related authorities. The main purpose of this model is to prevent accidents due to fire and minimize the human workload. This paper uses the CNN algorithm to build the system.

I. INTRODUCTION
Fire accidents are one of the biggest threats to industries, social gathering places, hospitals, malls, and various
densely populated areas across the world. These kinds of incidents may cause damage to property and the
environment and pose a threat to human and animal life. According to the recent National Risk Survey Report
[1], fire was ranked third in terms of its impact across various fields, among many other risks. Fire accidents in different countries have resulted in ecological disasters, claimed many lives, and caused billions of dollars in damage. Early detection of fire can therefore save many lives as well as various resources and prevent fire damage. In order to achieve high accuracy and robustness in dense urban areas, detection through local surveillance is both necessary and effective. Traditional fire systems suffered from numerous issues, such as false alarms that signalled fire when there was none, and their maintenance was difficult. The use of sensors in hot, dusty industrial conditions is also not feasible. Thus, detecting fires through surveillance video streaming is one of the most feasible and cost-effective solutions and is suitable for replacing existing systems without large infrastructure installations or investments. Video-based machine learning models, however, make heavy use of existing domain knowledge.
As a result, they must be updated to meet new threats. So our model can detect fire in a video or image frame
and send an alert message as soon as it detects fire. It can be used to detect fires in surveillance videos. Unlike
existing systems, this neither requires special infrastructure for setup, like hardware-based solutions, nor does it
need domain knowledge and prohibitive computation for development.

II. PROPOSED FRAMEWORK


We investigate deep neural networks for fire detection at an early stage of surveillance in the suggested system. For the objective problem, we investigate various deep CNNs while accounting for accuracy, the embedded processing power of CCTV systems, and the frequency of false alarms. The computer vision problems
and applications where CNNs have shown promise include object detection and localization, image
segmentation, super-resolution, classification, and indexing and retrieval. Their hierarchical design, which automatically learns very powerful features from raw data, can be credited for this widespread success.
A typical CNN design is made up of three well-known processing layers: 1) a convolution layer, in which numerous feature maps are created by applying various kernels to the input data; 2) a pooling layer that, in order to achieve some translation invariance and dimensionality reduction, chooses the maximal activation from a small neighbourhood of the feature maps produced by the previous convolution layer; and 3) a fully connected layer that builds a global, high-level representation from the input information. The high-level characteristics of the input data are produced by this layer after a sequence of convolutional and pooling layers. These layers are arranged in a hierarchy, with the output of one layer serving as the input for the following one.
In the convolutional kernels and fully connected layers, the weights of every neuron are adjusted and learned throughout the training period. By modelling the characteristic features of the input training data, these weights can carry out the target classification. Pre-processing refers to all the adjustments made to the raw data in our project before it is provided to the deep learning or machine learning algorithm; a CNN model trained on raw, unprepared images, for instance, would almost certainly produce subpar classification results. The CNN extracts features from the input images, and these extracted feature signals are then used by the fully connected part of the network for classification.
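The layer sequence just described can be sketched with the Keras API as follows; the input resolution, filter counts and layer sizes are illustrative assumptions, not the authors' exact configuration.

```python
from tensorflow import keras
from tensorflow.keras import layers

# A small fire / no-fire classifier: convolution -> pooling blocks
# followed by a fully connected head, as outlined in the text.
model = keras.Sequential([
    layers.Input(shape=(128, 128, 3)),        # RGB frames (assumed size)
    layers.Conv2D(16, 3, activation="relu"),  # convolution layer
    layers.MaxPooling2D(),                    # pooling layer
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),      # fully connected layer
    layers.Dense(1, activation="sigmoid"),    # fire vs. no fire
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
```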

III. RELATED WORK


Every Software Development Life Cycle (SDLC) model begins with an analysis, in which the technologies employed in the project and the team load are specified. SDLC models are one of the fundamental ideas in the software development process. The SDLC is a continuous process that begins when the decision to start the project is made and ends when the system is completely removed from exploitation. No single universal SDLC model exists; here, the waterfall model is used for the suggested system. The steps of the waterfall model that are necessary for creating the suggested system project are shown in Figure 3.1.
The phases of the proposed system's waterfall model are listed below.
A. Requirement Analysis:
The requirement analysis is the most crucial and essential stage of the SDLC (Figure 3.1 shows the waterfall model). The senior members of the team carry it out with input from all stakeholders and subject matter experts (SMEs) in the sector. At this point, planning is also done for the quality assurance requirements and for the identification of project-related risks. A meeting is scheduled with the client by the business analyst and project manager to obtain all the necessary information, such as what the customer wants to construct, who will be the end user, and what the product's goal is.
B. System Design:
System Design is the following phase; it compiles all of the information on requirements, analysis, and software project design. This phase builds on the results of the previous stages, namely requirement collection and client input.
C. Implementation:
The actual development process starts here, and the program code is written. Coding represents the start of design implementation. Programming tools including compilers, interpreters, debuggers, and other similar tools are used to generate and implement the code, and developers must adhere to the coding standards outlined by their management.
D. Testing:
After the code is created, it is tested against the requirements to ensure that the solution addresses the demands gathered during the requirements stage. Unit testing, integration testing, system testing, and acceptance testing are carried out at this level.

E. Deployment:
After the software has been certified and no problems or errors have been reported, it is deployed. The software may then be delivered as is or with proposed improvements, depending on the assessment. The maintenance of the software starts once it has been deployed.
F. Maintenance:
Once the client begins utilising the built system, the real problems surface and periodic problem-solving is required. Maintenance is the process in which the developed product is given continued attention.

IV. PROPOSED WORK


When classifying images, convolutional neural networks have produced results with extremely high accuracy.
Convolutional neural networks are the most widely used deep learning architecture and are quite powerful.

Figure 1. Early flame detection in surveillance videos using a deep CNN.

Figure 2. Operation of the CNN architecture.

This built model will collect data from CCTV or surveillance footage and process it gradually in real time.
Frame by frame, the video is processed, and then the processed frames are fed into the pretrained CNN model.
This pre-trained CNN model will categorise frames into two groups in real time: one with fire and the other
without fire. This pre-trained CNN model may be set up to operate on a distant server using data from various
video surveillance systems. After processing these inputs, the pre-trained CNN model outputs a real-time
prediction on the real-time streaming data. To ensure that no frames are lost, streaming frames are kept in data storage, and as this stored knowledge grows the frames are added to the training data. In this way a rich dataset is produced, and the model is trained using a large number of frames from the dataset. As a result, the model's frame prediction accuracy will grow. Since the architecture of the monitoring system won't need to be changed, this fire detection approach is affordable: the model uses information from existing CCTVs or surveillance systems to predict the presence of fire. This architecture is shown in Fig. 1.
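A sketch of the frame-by-frame inference loop described above, using OpenCV to read a stream and a hypothetical pretrained classifier (here loaded from an assumed file fire_cnn.h5); the stream URL, input size, decision threshold and alert hook are placeholders.

```python
import cv2
import numpy as np
from tensorflow import keras

model = keras.models.load_model("fire_cnn.h5")   # hypothetical pretrained model

def send_alert():
    # Placeholder for notifying the nearest fire station / authorities.
    print("ALERT: fire detected")

cap = cv2.VideoCapture("rtsp://camera/stream")   # CCTV stream URL (assumed)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # Pre-processing: resize to the model's input size and scale to [0, 1].
    x = cv2.resize(frame, (128, 128)).astype(np.float32) / 255.0
    prob = float(model.predict(x[np.newaxis], verbose=0)[0][0])
    if prob > 0.5:                               # assumed decision threshold
        send_alert()
cap.release()
```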

V. CNN
An example of an eager learner algorithm is CNN, one of the deep learning models. The classification performance of CNN is excellent, and it is currently the most effective algorithm for classifying images. CNN has gained enormous popularity for image classification since LeNet, an early deep learning model that classified handwritten digits with very good accuracy. The convolution layer, the ReLU layer, the pooling layer, and the fully connected layer are the four processing layers that make up a convolutional neural network; one layer's output is used as the input for the next layer.
1. The convolution layer, which forms the foundation of CNN, maps several kernels onto the input data and performs a dot product to produce a feature map. 2. The ReLU layer replaces negative values with 0 while leaving other values unchanged. 3. The pooling layer selects the maximum value in a limited area of the feature maps, reducing dimensionality and producing translation invariance. 4. The fully connected layer receives as its input the output of the previous layers and classifies the images using the weights learned from the training data.


VI. ADVANTAGES
1) Easy detection of fire vs. no fire.
2) Improved accuracy.
3) Time saving.
4) Easy to use.
5) User-friendly application.

VII. APPLICATIONS
1) Fire Detection System.
2) Helps in firefighting.

VIII. CONCLUSIONS
In conclusion, an aerial-based forest fire detection approach has been investigated using a sizable archive of recordings of forest fires in a range of scene conditions. The chromatic and motion characteristics of a forest fire are first extracted and then rectified using a rule that highlights the burning region, in order to increase the detection rate. Second, smoke is also extracted using our suggested algorithm, to address the issue of dense smoke that almost completely engulfs the fire. In the actual application of aerial forest fire monitoring, the proposed system framework demonstrates its robustness with a high detection accuracy rate and a low false alarm rate. Building challenging and specific scene-comprehension datasets for fire detection methods, together with in-depth trials, may be the main focus of future research.
Additionally, fire detection systems can be integrated with reasoning theories and information-hiding algorithms to intelligently observe and authenticate the video stream and start the necessary actions in an autonomous manner.

ACKNOWLEDGMENT
Sincere appreciation to Prof. U. L. Tupe and HOD Dr. Nihar Ranjan for their assistance in resolving project-
related issues. A particular thanks to the computer wizards who created lovely libraries that are time-saving.
Additionally, we appreciate the IEEE community's assistance with the use of libraries.

REFERENCES
[1] C. Kao and S. Chang, “An Intelligent Real-Time Fire-Detection Method Based on Video Processing”, IEEE, 2003.
[2] C. Ha, U. Hwang, G. Jeon, J. Cho, and J. Jeong, “Vision-based fire detection algorithm using optical flow”, 2012.
[3] C. E. Premal and S. S. Vinsley, “Image Processing Based Forest Fire Detection using YCbCr Colour Model”, 2014.
[4] N. I. Binti Zaidi, N. A. A. Binti Lokman, M. R. Bin Daud, H. Achmad, and K. A. Chia, “Fire recognition using RGB and YCbCr color space”, 2015.
[5] K. Poobalan and S. Liew, “Fire Detection Algorithm Using Image Processing Techniques”, December 2015.
[6] Khan Muhammad, Jamil Ahmad, Zhihan Lv, Paolo Bellavista, Po Yang, and Sung Wook Baik, “Efficient Deep CNN-Based Fire Detection and Localization in Video Surveillance Applications”, IEEE, 2018.
[7] Oxsy Giandi and Riyanarto Sarno, “Prototype of Fire Symptom Detection System”, IEEE, 2018.
[8] Sneha Wilson, Shyni P Varghese, Nikhil G A, Manolekshmi I, and Raji P G, “A Comprehensive Study on Fire Detection”, 2018.
[9] Jiang Feng and Yang Feng, “Design and experimental research of video detection system for ship fire”, IEEE, 2019.
[10] Ke Chen and Yanying Cheng, “Research on Image Fire Detection Based on Support Vector Machine”, IEEE, 2020.
[11] Huang Hongyu, Kuang Ping, Li Fan, and Shi Huaxin, “An Improved MultiScale Fire Detection Method Based On Convolutional Neural Network”, IEEE, 2020.


Smart Time Table Generation using Artificial Intelligence
Sukhwant Kour Siledar1 and Dr. Vijaya B. Musande2
1M. Tech, Department of Computer Science & Engineering, JNEC, MGM University, Aurangabad
2Professor, Department of Computer Science & Engineering, JNEC, MGM University, Aurangabad
[email protected], [email protected]

Abstract— For any educational institution, be it a small school, a college, or a huge institution like a university, timetabling is a very important and tedious task. It concerns all activities involved in producing a schedule that must be subject to different constraints. Being dependent on various constraints, a timetable is a temporal arrangement of courses, faculty and students, and hence requires changes every now and then. Creating such timetables manually, especially at large-scale institutions like universities, leaves scope for clashes and human errors. To overcome these problems, automated timetable generation using AI can save a lot of administrators' time and manpower. Such an automated system requires various inputs, like course details, infrastructure, available teachers and class strength, and aims to make the most optimized utilization of all these resources in a way that best suits the constraints of college rules. A precise timetable is then chosen from the generated solutions. Timetables are scheduled for various purposes, like organizing lectures, bus schedules, exam timetables and many more. In our proposed model we build a system that can automate timetables to serve various scheduling purposes using an Evolutionary Algorithm (EA) and a Genetic Algorithm (GA), and that can further be integrated with the ERP of institutes for ease of data fetching. This technique makes it possible to create timetables with fewer errors and mistakes in the least time.

Index Terms— Timetable, Generation, Genetic, Optimization, Constraints, Scheduling, Fitness, Chromosome.

I. INTRODUCTION
Timetable is basically a structure which shows the time at which some prescribed event occurs. For educational
institutes, a timetable serves the basic purpose of lecture delivery and is used for scheduling events throughout the day, week, term or year for each batch. It requires the combination of resources like batches of students, classes, instructors, time slots, and days, arranged in a way such that none of the mentioned resources overlap. This practice of mapping events in general (classes/exams) to time slots subject to the constraints is carried out manually in most institutes, requiring a lot of manpower and time. Hence, timetabling gives rise to a scheduling problem that is tedious and that requires a solution in every institute at least once or twice an academic year.
From the above discussion, timetabling is an NP-complete (nondeterministic polynomial time complete) problem, i.e., a problem which has no known efficient way to produce an exact solution. This NP-complete scheduling problem falls in the class of computational problems for which no efficient algorithm that can give an accurate solution has yet been found.

Hence, to provide an efficient solution to the stated problem, we take an adaptive heuristic approach, which generates a set of good solutions from which the most optimized solution is provided as output. In the field of computer science, artificial intelligence is used to implement models for such precision or optimization problems.
The main objective of this paper is to address the timetabling problem (which is an NP-complete scheduling problem) by using AI, resulting in an optimal solution with minimal or no redundancy errors. This model is developed using genetic algorithms, which fall in the class of evolutionary algorithms in AI. It works in an adaptive heuristic searching way for solving constrained optimization problems. The approach is based on natural selection: selected individuals from a population reproduce generations, i.e., pools of many timetables here, and from all these generations the most optimal solution, on the basis of fitness, is given as output. The classical GA that we use in our model is based on Darwin's theory of evolution and the principle of survival of the fittest.

II. RELATED WORK


The constant struggle of designing timetables has been widely observed across small to large-scale educational institutions. While this area is vast, covering the scheduling of regular classes, examinations, bus schedules and many more, various efforts have been made globally to address this problem, and a lot of research has been carried out to propose and implement various methods for automated generation of timetables. As mentioned in the previous section, timetabling is an NP-complete problem, so only attempts at the best possible solutions have been made so far. Optimal solutions were achieved using both traditional and artificially intelligent techniques. This section examines the variety of approaches carried out so far.
The stated timetabling problem was first studied by Gotlieb in 1963, who considered that each lecture consisted of one student group and one teacher, whose combination can be chosen freely. Since then, the problem has been addressed by proposing various models using a variety of methodologies to develop a computational solution which can replace the manual task of scheduling, as discussed in the following part of this section.
Authors Dipti Srinivasan, Tian Hou Seow and Jian Xin Xu (2002) [1] proposed a model for automated timetable generation using multiple-context reasoning. They presented an EA-based approach along with context-based reasoning for creating physical timetables in less computational time, but it is difficult to implement compared with GA and has less accuracy and optimization.
Shengxiang Yang, Member, IEEE, and Sadaf Naseem Jat (2008) [2] defined the university course timetabling problem (UCTP) as a combination of events and time slots. They addressed it using a GA with a Local Search (LS) technique. The GA's guided search strategy creates offspring based on individuals from previous generations, and LS uses its exploitative search to improve the efficiency and quality.
D. Nguyen, K. Nguyen, K. Trieu, and N. Tran (2010) [3] used the Tabu search algorithm to develop a model for solving the timetabling problem. Here the search space comprises the set of feasible solutions. A fundamental "taboo" element exists to set aside non-improving moves, and the search gradually keeps away from getting trapped at local maxima, but this is an expensive approach in terms of evaluating the resources and formulating the problem.
N. M. Hussin and A. Azlan (2013) [4] implemented a graph colouring heuristic method for the scheduling problem. Here the stated problem is represented using graphs that capture the stages of difficulty in the process of scheduling: nodes of the graph denote subjects and edges denote conflicts. With each phase it keeps improving the solutions and ultimately reaches the best solution, but the prolonged time taken to reach this stage makes the method less reliable.
Authors W. F. Mahmudy and R. E. Febrita (2017) [5] used a fuzzy logic method for providing a computational solution to the timetabling problem. This multivalent logic is based on fuzzy set theory, and linguistic variables are used to solve the timetable optimization problem and provide a realistic solution. However, this makes it difficult to evaluate the membership function, and ultimately it is hard to create and calibrate the fuzzy model.
T. Elsaka (2017) [6] used constraint satisfaction modelling, which is based on constraints and variables rather than on an objective function. The two main components here are constraints and data. Constraint programming has statements of constraints that serve as part of the program, which is an advantage of this model. The main hindrances are the amount of time it consumes and the fact that soft constraints are not considered.
Authors K. Y. Junn, J. H. Obit, and R. Alfred (2018) [7] proposed a solution based on GA, which rests on the theory of natural selection. It is an iterative process of creating populations from individuals. Premature convergence can be a drawback if not taken care of during iteration.
D. Apostolou and E. Psarra (2019) [8] proposed an approach on the basis of hybrid Particle Swarm Optimization (PSO) with local search, which is an AI method. The stated problem is provided a solution by integrating PSO with a prototype methodology which creates particles that can upgrade themselves and have their own memory. But unlike GA, it does not have operators like crossover and mutation to avoid premature convergence.
This section explained how others, through various approaches, used intelligence to solve the problem by setting rules, and how a classical genetic algorithm can prioritize these rules dynamically to optimize timetable generation by providing the benefit of a distributed solution and load balancing. It can serve as the best possible way to provide a solution, given constraints and a proper convergence condition.
From the above discussion, many studies have focused on approaching timetable scheduling using AI with diverse techniques. However, these attempts do not keep up with rapidly evolving demands. The well-established relationship between constraints and scheduling is very crucial for our problem in order to provide an optimized solution. In light of this, we also studied the area of genetic algorithms in great depth; researchers have made recent and advanced studies that are compatible with the evolving world.
Indeed, a brief analysis by Sourabh Katoch, Sumit Singh Chauhan and Vijay Kumar (2020) [9] on advances in genetic algorithms and their implementation made a clear differentiation between all the above-discussed attempts and motivated us towards an approach using GA in our problem domain. Further, for GA implementation and mathematical modelling, the analysis made by L. V. Stepanov, A. S. Koltsov, A. V. Parinov and A. S. Dubrovin (2018) [10] was studied, and on its basis our proposed model was developed.

III. METHODOLOGY
A. Proposed System
In our approach we have developed a model for providing a solution to the timetabling problem based on a genetic algorithm in AI; it is based on the method of natural selection and evolution. Here we use basic terminology as follows.
Phenotype refers to the population in the real world, whereas genotype refers to the population in the computational world.
Population generally refers to a set of human beings in the phenotype, but in the genotype it is a set of solutions (here it will be a set of generated timetables; a population is also sometimes referred to as a generation).
A chromosome is an individual solution to the given problem, and a gene specifies one element position in a chromosome.
To understand the implementation of GA and the genetic operators that are responsible for the alteration in the composition of offspring, the following part of this section can be referred to.
Implementation of GA in our developed model is done in the steps below.
Preprocessing
The prerequisite for performing operations in GA is to convert a potential solution into a simple value like a string of real or binary numbers; this helps improve the speed of the algorithm. We have used conversion of data to a binary string, i.e., a string of 0s and 1s. These bits of the string are responsible for the characteristic representation of the solution as well as for algorithm accuracy. Each chromosome string is a sequential arrangement of gene strings.
Initial population
After encoding, the first step of the algorithm is to generate the initial population, which is done by random creation of individuals on the basis of the defined constraints. The larger the population, the better the results. This process can also be described as the selection process, or in GA terms as the reproduction operator.
Evaluation of population
The parameter used for evaluating an individual is known as the fitness of the individual. The fitness of each chromosome is determined within a generation; it is an estimate of how well the solution satisfies the given constraints relative to other solutions.
As we have used binary encoding, the range of the fitness function will be between 0 (worst solution) and 1 (best solution). The fitness function defined in our approach is as follows:
f(a) = (∑ ȧ × h × d) / t

where a = the timetable solution under evaluation, h = hours per day, d = days per week, and t = the total fitness of the generation.
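The following sketch shows one plausible reading of this fitness computation, in which an individual's raw score over the h × d weekly slots is normalized by the generation total t so that values fall between 0 and 1. This is an illustrative interpretation of the formula, not the authors' exact code.

```python
def raw_score(clash_free_slots: int, h: int, d: int) -> float:
    # Fraction of the h*d weekly slots that are free of clashes (assumed measure).
    return clash_free_slots / (h * d)

def fitness(score: float, generation_scores: list[float]) -> float:
    # Normalize by the total fitness t of the generation, giving a value in [0, 1].
    t = sum(generation_scores)
    return score / t if t else 0.0

# Three candidate timetables with 24, 27 and 30 clash-free slots in a
# 6-hours-per-day, 5-days-per-week grid (illustrative numbers).
scores = [raw_score(s, h=6, d=5) for s in (24, 27, 30)]
print([round(fitness(s, scores), 3) for s in scores])
```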
After determining the fitness of all chromosomes in a generation, i.e., after complete evaluation of the population, we need to select chromosomes for further mating so that new generations can be created. This is done using the roulette wheel selection method [9]. The basic principle on which this selection method works is
Selection ∝ fitness
The concept of this technique is to divide a wheel into proportions based on fitness value; each chromosome is then mapped to its proportion. Eventually the wheel is rotated, and the chromosome at which the pointer stops is selected for further reproduction. As the proportions are based on fitness values, the larger proportions will mostly be composed of fitter chromosomes, increasing the probability of the pointer stopping at fitter individuals.
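A minimal Python sketch of the roulette wheel selection just described; the population and fitness values are illustrative.

```python
import random

def roulette_select(population, fitnesses):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    spin = random.uniform(0, total)          # rotate the wheel
    cumulative = 0.0
    for individual, fit in zip(population, fitnesses):
        cumulative += fit                    # this individual's slice
        if spin <= cumulative:
            return individual
    return population[-1]                    # guard against rounding error

# Illustrative population of four timetables with fitness in [0, 1].
population = ["tt1", "tt2", "tt3", "tt4"]
fitnesses = [0.10, 0.25, 0.40, 0.80]         # fitter -> larger slice
parent = roulette_select(population, fitnesses)
```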
Crossover
For mating the selected chromosomes, the crossover operator is used, resulting in new chromosomes that together make a new generation. We have used single-point crossover in our model: the chromosomes chosen with the above selection method are paired, and single-point crossover is performed [9]. In this operation, a random point of a chromosome is selected, and for two parents the genes after this point are swapped, resulting in new offspring with different gene compositions.
Mutation
This genetic operator is very crucial, as it is used to prevent convergence at a very early stage of reproduction. It alters the genes of chromosomes to generate a diverse population. We have implemented Displacement Mutation (DM), which, as the name suggests, operates by displacing genes within a chromosome. This produces diversity, hence reducing the risk of premature convergence.
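Both operators can be sketched on the binary-string encoding described in the preprocessing step; the chromosomes and random positions below are illustrative.

```python
import random

def single_point_crossover(parent_a: str, parent_b: str) -> tuple[str, str]:
    # Pick a random cut point and swap the tails of the two parents.
    point = random.randint(1, len(parent_a) - 1)
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

def displacement_mutation(chromosome: str) -> str:
    # Remove a random gene segment and re-insert it at another position.
    genes = list(chromosome)
    i, j = sorted(random.sample(range(len(genes)), 2))
    segment = genes[i:j]
    del genes[i:j]
    k = random.randint(0, len(genes))
    return "".join(genes[:k] + segment + genes[k:])

a, b = single_point_crossover("10110010", "01101101")
mutated = displacement_mutation(a)
```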
B. Algorithm
Based on the above described implementation of GA in our model, its algorithm can be depicted as in Fig. 1.
C. Constraints
In GA, constraints can be classified into two categories.
Hard constraints mandatorily need to be followed. In our model the hard constraints are:
 Same student must not have two lectures simultaneously.
 Same lecturer must not have two lectures simultaneously.
 One room must be allocated to only one lecture at a time.

Figure 1. Algorithm
Figure 2. System architecture

Soft constraints are constraints that should be satisfied, but it is not guaranteed that they will always be satisfied. In our model the soft constraints considered are:
 Same lecturer must not have two consecutive lectures.
 Fixed slot assignment for particular subject.
D. System Architecture
The system architecture of the developed model, as shown in Fig. 2, uses GA to solve the timetabling problem and generates an optimal solution with minimal or no errors. It has the capability to take various resources (class, subject, teacher details) as input in a very user-friendly manner and to process them into the required output at low cost and in less time.
E. Hardware and Software Requirements
The minimum hardware requirements for implementing the system are Processor of minimum configuration
Pentium IV/Intel I3 core with speed of at least 1.1 GHz, RAM 512 MB (min) and Hard Disk of minimum 20GB.
The output is displayed on standard monitor screen and for data input keyboard and mouse is required.
The software requirements for implementation concern the prerequisites to be installed on the system for proper functioning. In our model, such prerequisite software comprises Java 8 or above with a supporting compiler, the Struts-2 framework, the Apache Tomcat server and a MySQL database. Operating systems supporting our model are Windows 7 and above.
For developing the front end, HTML5, CSS, JavaScript, Bootstrap and AJAX are used, whereas the backend code is in Java.
F. Modules and Interfaces
 Registration and Login: When the user visits the homepage for the first time, they need to register by providing basic details. These are then stored in the user detail database, as seen in the system architecture in Fig. 2. Login is a submodule of this interface, where the user can enter credentials for logging in.
 Input Interface: After login, the user is asked for all the required input details, like the number of slots, batches, and days per week for which the schedule is to be created. The course details along with faculty details are then entered and the submit button is clicked. This directs to the browser page where the optimized timetable is displayed.

IV. RESULT ANALYSIS


The timetable is generated using the genetic algorithm, and this optimized solution satisfies all the defined hard constraints and most of the soft constraints.
In Fig. 3, the snapshot of the console shows the fitness values reached by the chromosomes in various generations. All of the generations are presented on the console itself in the genotype, and on the output screen only the optimized phenotype solution is displayed. Also, by clicking on refresh we are able to display a less optimized solution; this facility is provided for the convenience of a human operator to compare and contrast the given solution with other possible solutions, in case human intervention is needed to alter the produced timetable. The results displayed are free from clashes and are generated in very little time compared with manual creation, which also saves a lot of effort.

V. CONCLUSION
We have implemented a GA in AI for smart timetable generation, which produces an optimal timetable subject to the defined constraints.
The solution cannot always be 100% optimal and suited, as the degree of optimization depends upon the constraints defined. The improvement has been achieved in the intelligent system by the appropriate use of genetic operators. The developed system initially works accurately for schools, as they run classes in a very classical model, and gradually its scope increases to colleges and higher secondary classes, which have more constraints, making the problem more complex; our model deals with this in a user-friendly way. There is future scope to enhance the developed system for producing timetables and scheduling for various purposes like examinations, bus schedules, and many more.

Figure 3. Genotype output on console determining fitness

REFERENCES
[1] Dipti Srinivasan Tian Hou Seow Jian Xin Xu “Automated timetable generation using multiple context reasoning for
university models”, IEEE conference (2002).
[2] Sadaf N. Jat, Shengxiang Yang, “A memetic algorithm for the university course timetabling problem”, 20th IEEE International Conference on Tools with Artificial Intelligence (2008).
[3] K. Nguyen, D. Nguyen, K. Trieu, and N. Tran, “Automating a real-world university timetabling problem with Tabu
search algorithm”, in 2010 IEEE RIVF International Conference on Computing & Communication Technologies,
Research, Innovation, and Vision for the Future (RIVF), (2010).
[4] A. Azlan and N. M. Hussin, “Implementing graph coloring heuristic in construction phase of curriculum-based course
timetabling problem”, in 2013 IEEE Symposium on Computers & Informatics (ISCI), (2013).
[5] R. E. Febrita and W. F. Mahmudy, “Modified genetic algorithm for high school time-table scheduling with fuzzy time
window”, in 2017 International Conference on Sustainable Information Engineering and Technology (SIET), (2017).
[6] T. Elsaka, “Autonomous generation of conflict-free examination timetable using constraint satisfaction modelling”, in
2017 International Artificial Intelligence and Data Processing Symposium (IDAP), (2017).
[7] K. Y. Junn, J. H. Obit, and R. Alfred, “The study of genetic algorithm approach to solving university course timetabling
problem”, in Lecture Notes in Electrical Engineering, Singapore: Springer Singapore, (2018), pp. 454–463.
[8] E. Psarra and D. Apostolou, “Timetable scheduling using a hybrid particle swarm optimization with local search
approach”, in 2019 10th International Conference on Information, Intelligence, Systems and Applications (IISA),
(2019).
[9] Sourabh Katoch & Sumit Singh Chauhan & Vijay Kumar, “A review on genetic algorithm: past, present, and future”,
Multimedia Tools and Applications, Springer (2021).
[10] L V Stepanov et al, “Mathematical modeling method based on genetic algorithm and its applications”, Journal of
Physics: Conference Series (2019).
[11] Peter Brucker, “Scheduling and constraint propagation”, November (2002).
[12] David E. Goldberg “Genetic Algorithm in Search, Optimization and Machine Learning”, (1989).
[13] John McCall, “Genetic algorithms for modelling and optimization”, Journal of Computational and Applied Mathematics
184 (2005).
[14] B. Shirazi, H. Fazlollahatabar and D. Shafiei, “A Genetic Approach to Optimize Mathematical Model of Facilities Relocation Problem in Supply Chain”, Journal of Applied Sciences (2008).
[15] Abdelghany A, Abdelghany K, Azadian F “Airline flight schedule planning under competition”, Comput Oper Res
87:20–39 (2017).
[16] Arkhipov DI, Wu D, Wu T, Regan AC “A parallel genetic algorithm framework for transportation planning and logistics
management”, IEEE Access 8:106506–106515 (2020).
[17] Baker JE, Grefenstette J “Proceedings of the first international conference on genetic algorithms and their applications”,
Taylor and Francis, Hoboken, pp 101–105 (2014).


Review of AI/ML in Software Defined Network from Past to Present
Dr. Raghavendra Kulkarni
Department of Computer Science, School of Science
Gandhi Institute of Technology & Management (GITAM) Deemed to be University
Hyderabad, Telangana, India.
Email: [email protected]

Abstract— Software-Defined Networks (SDN) technology disrupts the traditional network architecture's tight link between the data plane and the control plane, enabling network
resource economy, security, and controllability. In this study, we carried out a systematic
analysis with a specific focus on the application of AI/ML algorithms to enhance SDN functions.
Artificial intelligence (AI) and machine learning (ML) have significant potential in fields such as route planning, network resource management, traffic scheduling, network security and fault detection when paired with the SDN architecture. Networks have become more complicated and challenging to configure, manage, and monitor as a result of these demands. Researchers and operators have recommended using software tools that can monitor and configure networks on demand to make networks more manageable and controllable. From the perspective of ML
algorithms, this study focuses on the applications of traditional AI/ML algorithms in SDN-
based networks. Finally, a discussion and analysis of the potential future development of SDN
concepts in ML algorithms is addressed. We present a summary of the state-of-the-art after
reviewing 1450 publications. Researchers from various domains will find this study useful and
essential in fully understanding the fundamental concerns.

Index Terms— Artificial Intelligence, Machine Learning, Quality of Service, Software-Defined Networks, Systematic Literature Review.

I. INTRODUCTION
The necessity for novel and effective network architectures is made clear by the rising service heterogeneity and
consumption. To ensure quality of service (QoS) and to achieve Service-Level Agreements (SLAs), modern
networks are configured with complicated static rules. The complexity of administration and configuration
procedures tends to rise in multi-vendor setups. The problems with conventional network-centric architectures
are intractable. Consequently, with SDN and network programmability and automation, network-centric
paradigms give way to application-centric paradigms [1]. In order to grant access to and control over the network resources, the SDN controller hosts APIs. However, applications require these APIs to effectively optimize network speed and security, and machine learning (ML) and artificial intelligence (AI) algorithms can support this effort. There have been numerous systematic reviews of SDN studies conducted by academics and business
professionals. The authors of [2] describe different SDN load balancing methods, some of which use AI
algorithms. Using SDN designs, Ray et al. proposed an examination of IoT devices in [3]. In addition, [4]
discusses the difficulty of applying AI/ML in SDNs.

Network management is made easier with the advent of SDN, which also makes it possible to configure
networks efficiently through programming [5]. Through streamlined hardware, software, and management, SDN
can accommodate growth while having lower operational costs [6], and hardware limits on the network design will be eliminated.
Applications for SDN enable centralized management of network policies and regulations. Additionally, they
include a range of features that let administrators use ML techniques to successfully resolve network issues. In
parallel, the SDN architecture implements network management and traffic control based on ML techniques.
From a comprehensive network perspective, controlling network traffic is simple, since the controller has access to all data regarding the physical networks and their operational needs. According to our findings, network traffic classification has been a hot research area for a while. Such research is crucial for choosing the best route configurations, managing network resources, meeting QoS standards, etc. Network security is another crucial application that cannot be underestimated.
Although these studies offer some extremely intriguing viewpoints and a thorough examination of the subject, none of them discuss how AI/ML might be applied to SDNs as a whole; instead they concentrate primarily on particular features like load balancing and intrusion detection systems. Furthermore, no studies could be located that used a rigorous and open selection process in conjunction with a systematic literature review technique. Therefore, the focus of the current work is on how AI/ML may enhance performance and address unique challenges in SDNs.

II. METHODOLOGY
The objective of this study is to compile a collection of publications and analyze them in order to address
different research issues. The following are the research questions:
RQ1. What kind of AI/ML mechanisms are applied to SDN?
RQ2. Can Performance of SDNs be improved by AI/ML?
RQ3. What are the main limitations of using AI/ML in SDNs with respect to Quality of Service (QoS)?
We gathered a number of publications for analysis. An initial batch of papers is assessed using inclusion/exclusion criteria at the beginning of the procedure. In a subsequent iteration, references and citing articles from the papers already included are acquired and examined.
Our search was conducted using the terms "Software Defined Networks," "Artificial Intelligence," and "Machine Learning," as well as their respective acronyms: ("SDN" OR "Software Defined Network") AND ("Artificial Intelligence" OR "AI" OR "Machine Learning" OR "ML")
A. Start Set and Criteria
We submitted the search query to the IEEE Xplore, Google Scholar and Core search engines in order to build the start set. The first five papers from each engine were then chosen.
The following acceptance standards were established before the articles were examined:
- Publication date between 2011 and 2022 (OpenFlow's release date);
- Published in first- or second-quartile peer-reviewed articles (Scimago ranking);
- Written in English;
- Focus on the current themes, such as SDN and AI/ML applications;
- Articles with access that the authors have been given.
With the help of the acceptance criteria, we selected a collection of articles that directly examine AI/ML in SDNs from reputable sources. Only six of the prospective start set's fifteen articles ultimately complied with the requirements: [7]-[12]. A total of 344 articles were omitted based on the year of publication, 112 were excluded because they were not published in conferences or journals with high enough rankings, 14 were unavailable, 296 were duplicates, 4 were not articles, and 582 did not directly address the issues at hand. 98 articles were chosen from this process, which was conducted between August and September 2022.

III. DISCUSSION
The primary objectives of this part are to present our findings, discuss them, and provide an overview of existing
and emerging trends.
A. Application of AI/ML algorithms
The application of AI/ML algorithms in the publications is displayed in Table I. The ranking shows that supervised learning algorithms are second in prevalence to neural network (NN) methods. Other mechanisms, such as self-organizing maps, have been discussed in a number of studies but have not received as much attention.
The popularity of neural networks and deep learning, as well as of RF and DT algorithms, explains why supervised learning techniques are preferred over the others. Unsupervised learning discovers patterns from unlabeled data, whereas supervised learning uses labelled data to adjust model parameters.

TABLE I. SDN-CONCEPT NETWORK PERFORMANCE ANALYSIS AND APPLICATIONS OF AI/ML ALGORITHMS

References | AI/ML Algorithm | Application | Performance analysis
[13] | K-Nearest Neighbours (KNN) | Predis: detects various attacks in addition to DDoS attacks | Easily accessible, highly accurate; incapable of recognizing extremes; calculates features easily; suitable for multiclass classification; time-consuming for large datasets
[14] | Random Forest (RF) | Uses regression to model one VNF's latency distribution | High accuracy
[15], [16] | Decision Tree (DT) | Packet classification; LCD: optimize the ASP; flow classification; inductive inference | Simple to comprehend and implement; data preparation is easy or not required; high speed; too many categories may lead to higher error growth
[17]-[19] | Neural Network (NN) | Collaborative intrusion prevention; predicting the performance of SDN; load balancing | Due to its simple and parallel computational capabilities, it achieved a low overhead; achieved low mean squared error (MSE); improved efficiency and a 19.3% reduction in network latency
[20], [21] | Reinforcement Learning (RL) | Cognitive network management | Manages networks efficiently
[22] | Deep RL | Adaptive multimedia traffic control mechanism | Dynamic coordination of computational, networking, and caching resources
[23], [24] | Deep Q-Learning | Q value-action function approximation | Promotes resilience and scalability
[25], [26], [27] | SVM | Predict link failure; detect DDoS attacks | Reduces the start-up time for identification and classification recognition; lowers the rate of false alarms
[28] | Laplacian SVM | Traffic classification on real Internet data | Similar applicability to supervised learning; only tested in a lab environment; processes synthetic data
In this context, it appears that supervised learning can be used more readily to enhance network decision-making in areas like routing and QoS. RL and unsupervised learning algorithms are outnumbered by supervised learning algorithms, whose use peaked in 2017 and dropped erratically in the years that followed. It is interesting to note that the use of RL has been rising gradually. This suggests a possible application for SDNs in IoT, 5G access networks and automotive networking, dynamic settings for which RL is better suited than unsupervised and supervised learning methods. Based on these findings, it is predicted that RL will be used to train networks and SDN processors to adapt to variations in resource demand and traffic, especially with the introduction of SDNs in increasingly complicated networks.
B. Artificial Intelligence in SDN
Numerous issues, such as resource allocation and admission control [29], have been successfully solved using AI and ML methodologies. However, in the SDN era, AI's role has expanded greatly due to the significant efforts made by the business sector. Many researchers have revealed a significant trend in the scientific community's application of AI methods in SDNs.
C. ML Methods In SDN-Concept Networks
Supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning are the four kinds of ML approaches. Supervised learning algorithms create a mathematical model from a labelled training sample. Unsupervised learning algorithms derive information from unlabeled data. Additionally, when some of the sample inputs lack labels, semi-supervised learning approaches are used to build mathematical models from sparse training data. Numerous classification and prediction problems have been successfully solved by the application of ML methods [30]. In this section, we continue from an algorithmic perspective by presenting several traditional ML techniques used in SDN, also listed in Table I for greater understanding.
1) Supervised Learning in SDN-Concept Networks
Nowadays, numerous applications, including spam detection and object and speech recognition, commonly use supervised learning [31]. The objective is to predict the value of an output from the values of a vector of input variables. In the context of regression approaches, the SDN architecture uses regression to make predictions [32]. The key performance indicator (KPI) for the application and the network metrics are additionally related using multiple linear regressions [33]. Regression algorithm usage in SDN is currently uncommon on the whole, so we focus on introducing the classification techniques in SDN. Logistic regression, SVM, decision trees, KNN, and Naive Bayes are among the most frequently used classification methods.
a) K-Nearest Neighbours (KNN) in SDN-Concept Networks
KNN classifies samples by calculating the distances between feature values, and the classification outcome depends on a relatively limited set of nearby samples. KNN is appropriate for multiclass classification and has been extensively utilized as a classifier in many different fields.
Predis, a computationally straightforward and effective KNN-based detection technique, was proposed by Zhu et al. [13]. Because of its efficient design, it can accurately identify a variety of attack forms in addition to DDoS attacks. KNN, one of the most straightforward ML algorithms, is simple to use, estimates features accurately, and works well for multiclass classification; however, the algorithm takes a long time when the training dataset is huge.
b) Support Vector Machine (SVM) in SDN-Concept Networks
SVM is a generalized linear classifier that uses supervised learning to carry out binary classification. Because both structural and empirical risk minimization are taken into consideration in the optimization problem, SVM is stable. It should be noted that SVM only works for binary classification tasks; multi-class classification tasks are therefore broken down into a number of binary problems. In [26] and [27], SVM is embedded in the controller to identify DDoS attacks. It can distinguish malicious flow entries created by DDoS attack traffic from benign flow entries created by normal traffic. In binary classification problems, SVM has a lower rate of false alarms, and the detection strategy effectively cuts down the time needed to start classification recognition and attack detection. Since SVM is deployed at the SDN controller level, its complexity does not significantly impact the effectiveness of the SDN system.
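The following is a minimal sketch of the same idea: an SVM trained at the controller to separate malicious from benign flow entries. The per-flow statistics and the labelling rule are assumptions for illustration; the real systems in [26] and [27] use richer OpenFlow counters.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.random((500, 4))            # hypothetical per-flow statistics
y = (X[:, 2] > 0.5).astype(int)     # toy rule: 1 = malicious flow entry

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)
svm = SVC(kernel="rbf", C=1.0)      # binary classifier; scikit-learn handles
svm.fit(X_tr, y_tr)                 # multiclass via one-vs-one decomposition
print("test accuracy:", svm.score(X_te, y_te))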
c) Neural Networks (NN) in SDN-Concept Networks
CIPA, a collaborative intrusion prevention architecture built on NNs [17], is, according to the testing findings, more effective than [35] at identifying DDoS flooding attacks. CIPA also succeeds in finding outbreaks of the Witty, Slammer, and Conficker worms. Due to its parallel and straightforward processing capabilities, the system achieved little computational and communication overhead.
A multi-label classification method was suggested by He et al. [36] to estimate global network allocations. The neural network approach outperformed decision trees and logistic regression and reduced algorithm runtime by up to two-thirds. In order to estimate traffic demands off-line for a mobile network operator, Alvizu et al. [34] employed a neural network technique, which reduced the optimality gap to between 0.2% and 0.45%. Additionally, the next configuration time point was predicted off-line using a NN technique.
An intrusion detection system for SDN built on an NN technique was proposed by Abubakar et al. [37]; it made use of the NSL-KDD dataset to obtain a high reliability of 97.3%.
d) Decision Tree (DT) in SDN-Concept Networks
DT is a prediction model that illustrates the relationship between object values and characteristics. It is a tree data structure in which leaf nodes signify categories, branch routes denote potential parameter values, and internal nodes indicate objects. DT is frequently used in data mining to examine data for prediction. Packet classification is its primary use in networks. Well-known methods include HyperCuts [38], HiCuts [39], CutSplit [40], and EffiCuts [41]; PartitionSort [42], which incorporates the advantages of DTs and Tuple Space Search (TSS), was proposed in light of the significantly increased dimensionality and dynamism in SDN. A least-cost-disruptive (LCD) decision tree was developed to resolve trade-offs among good service delivery, adaptation costs, and user disruption levels [43]. The DTs were employed in [44] as a technique for solving the Flow Table Congestion Problem (FTCP). The main advantage of DT over KNN and SVM is that it can be easily implemented and that data preparation is either trivial or not even necessary. However, when there are too many categories, errors may grow more quickly.
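As a simple illustration of tree-based flow classification, the sketch below trains a shallow decision tree on toy header-like fields. Specialized packet classifiers such as CutSplit [40] build custom trees over rule sets rather than using a generic learner, so this is only a conceptual analogue.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(5)
X = rng.integers(0, 1024, size=(600, 4))   # hypothetical header fields
y = (X[:, 0] % 2) ^ (X[:, 3] > 512)        # toy classification rule

tree = DecisionTreeClassifier(max_depth=5) # shallow tree: easy to inspect
tree.fit(X, y)
print("training accuracy:", tree.score(X, y))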
e) Ensemble Learning in SDN-Concept Networks
The objective of a supervised learning method is to develop a stable model that excels in all situations, although in practice this is rarely achievable. In ensemble learning, multiple weak supervised models are combined to create a stronger, more complete supervision model: a group of individual learners is first formed and then merged using a particular approach. Bagging and boosting are two of the main ensemble learning algorithms, although these methods are used less frequently than the traditional methods above. RF (Random Forest) builds a bagging ensemble with DTs as the base learners and is used in numerous settings. The indoor localization model in [45], trained with RF-based cross validation, achieves a high accuracy of 98.3% and performs better than SVM, NN, and KNN. To effectively describe the latency distribution of a single VNF, Lei et al. [46] suggest a random-forest regression prediction approach. Ensemble learning is more accurate than the conventional methods mentioned above but comes with a high level of complexity.
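To make the VNF latency example concrete, the following is a minimal sketch of random-forest regression in the spirit of [46]; the load/latency relationship below is entirely synthetic.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
load = rng.uniform(0, 1, size=(2000, 1))   # hypothetical VNF load level
latency = 5 + 20 * load[:, 0] ** 2 + rng.normal(0, 1, 2000)  # toy latency (ms)

model = RandomForestRegressor(n_estimators=100, random_state=2)  # bagged DTs
model.fit(load, latency)
print("predicted latency at load 0.8:", model.predict([[0.8]]))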
2) Unsupervised Learning in SDN-Concept Networks
In unsupervised learning techniques, the labels of the training samples are unknown. By analyzing training samples without labels, the objective is to discover the fundamental characteristics and laws governing the data, which adds another layer of support for data analysis. The most popular technique is clustering, and K-means is the most basic and well-known algorithm [48]. In an SDN-based WAN design, a controller placement problem is solved using a hierarchical K-means algorithm [49]. There are also studies that contrast or combine supervised and unsupervised learning; the goal of the comparison is to understand each algorithm's benefits and drawbacks. Different supervised and unsupervised learning methods, including Naive Bayes, KNN, K-medoids, and K-means, are used by Barki et al. [47] to categorize traffic as abnormal or normal. Compared to Naive Bayes and KNN, K-means and K-medoids are faster but less accurate. For traffic categorization, the two ML techniques of unsupervised K-means clustering and supervised SVM are investigated in [50].
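The following is a minimal sketch of K-means clustering of unlabeled traffic features into a normal and an abnormal group, in the spirit of [47]; the two-dimensional features are synthetic.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
normal = rng.normal(loc=0.0, scale=1.0, size=(300, 2))    # toy normal flows
abnormal = rng.normal(loc=5.0, scale=1.0, size=(50, 2))   # toy abnormal flows
X = np.vstack([normal, abnormal])                         # no labels available

km = KMeans(n_clusters=2, n_init=10, random_state=3)
labels = km.fit_predict(X)                 # cluster assignment for each flow
print("cluster sizes:", np.bincount(labels))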
3) Semi-Supervised Learning in SDN-Concept Networks
Generally, unsupervised and supervised learning are the two main subcategories of machine learning technology. While unsupervised learning utilizes only unlabeled sample sets, supervised learning employs only labelled sample sets. However, because labelling data is so expensive, real problems often contain far more unlabeled data than labelled data. As a result, semi-supervised learning methods that can be applied to both labelled and unlabeled samples developed quickly. This learning strategy combines unsupervised and supervised learning; it primarily focuses on how to build classification models by employing a large amount of unlabeled samples together with a small amount of labelled data. Semi-supervised learning is used in the same applications as supervised learning [36]. Semi-supervised learning has historically been used to handle synthetic data and has been evaluated only in the lab, whereas [51] performed studies to achieve accurate traffic classification of real Internet data. The QoS parameters may be used to re-route traffic effectively and accomplish resource objectives, and the QoS classifier also uses semi-supervised machine learning to handle traffic from unidentified applications. Its practical importance has not been adequately demonstrated, relatively speaking, and more study is needed to determine the practical benefits of semi-supervised learning.
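One generic way to realize this idea is self-training, sketched below with scikit-learn: a base classifier is fitted on the few labelled samples and then iteratively labels the rest. The data and the 90% unlabeled ratio are assumptions; this is not the specific framework of [51].

import numpy as np
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(4)
X = rng.random((400, 3))
y_true = (X[:, 0] > 0.5).astype(int)
y = y_true.copy()
y[rng.random(400) < 0.9] = -1            # mark ~90% of samples as unlabeled

model = SelfTrainingClassifier(SVC(probability=True))  # wraps a base learner
model.fit(X, y)                          # uses labelled + unlabeled samples
print("accuracy on all data:", model.score(X, y_true))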
4) Reinforcement Learning (RL) in SDN-Concept Networks
In RL, an agent learns reward-guided behaviour through trial and error and rewards from interactions. In SDN-concept networks [52], [53], RL delivers path selection or route optimization and is often used to support reliability and scalability [20]. When delay reduction and throughput maximization are employed as the primary operational and maintenance objectives for DROM [54], the resulting network performance, routing services, and convergence are all significantly improved. The Internet of Vehicles (IoV) environment can be sensed and learned from to adaptively derive an optimal routing policy; SDCoR [55] is the first study to do this and outperforms numerous common IoV protocols. In [52], the number of distinct paths used for contiguous data frames is reduced in order to address the primary challenge of high jitter. It has also been suggested to combine innovative RL research with other technologies for improved performance: to discover the best overlay paths with the least monitoring overhead, for instance, random NNs are combined with RL [21]. For making auto-scaling policy decisions, SRSA [56], an RL-based auto-scaling decision mechanism, was investigated. Furthermore, due to the complex and dynamic network environment, RL with architectural changes is considered.
To effectively manage networks with SON (Self-Organizing Network) capabilities, Daher et al. [57] presented a scalable strategy based on distributed RL. By combining DL with RL, Deep Reinforcement Learning (DRL) accelerates learning and enhances the effectiveness of RL algorithms. DRL has produced outstanding outcomes in both theory and practice; in particular, the Google DeepMind team's DRL-based AlphaGo program is regarded as a significant development in the field of artificial intelligence. Our findings support DRL's claims of some development in SDN-concept networks. DRL for an adaptive multimedia traffic control mechanism was researched by Huang et al. [22]; without using a mathematical model, it can directly govern multimedia flows. Deep Q-Learning (DQL) is specifically employed for the majority of DRL-related activities [58], and in various network circumstances, different DQL approaches can be employed to solve various challenges. He et al. [59] suggested an integrated DQL methodology with SDN that uses a deep Q-network to approximate the Q value-action function. Overall, RL is a significant ML technique that is frequently applied to network-related problems. Keep in mind that it only describes the interaction processes, as opposed to offering a different teaching strategy; additionally, an RL agent can be created from any learning algorithm [60], and RL will be frequently used for analysis and prediction. Figure 1 gives a brief description of machine learning methods [61].

Fig 1. Overview of machine learning approaches
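To make the reward-guided learning concrete, the following is a minimal tabular Q-learning sketch for path selection on a toy four-node topology; the graph, the negative-delay rewards, and the hyperparameters are all assumptions for illustration, not taken from any of the cited systems.

import random

neighbors = {0: [1, 2], 1: [3], 2: [3], 3: []}             # state = node, action = next hop
reward = {(0, 1): -2, (0, 2): -1, (1, 3): -1, (2, 3): -4}  # negative link delays
Q = {(s, a): 0.0 for s, acts in neighbors.items() for a in acts}
alpha, gamma, eps = 0.5, 0.9, 0.1        # learning rate, discount, exploration

for _ in range(2000):                    # episodes: route from node 0 to node 3
    s = 0
    while s != 3:
        acts = neighbors[s]
        if random.random() < eps:
            a = random.choice(acts)                     # explore
        else:
            a = max(acts, key=lambda x: Q[(s, x)])      # exploit best-known hop
        future = max((Q[(a, n)] for n in neighbors[a]), default=0.0)
        Q[(s, a)] += alpha * (reward[(s, a)] + gamma * future - Q[(s, a)])
        s = a

print(Q)   # the agent learns to prefer 0 -> 1 -> 3 (total delay 3 vs. 5)

The learned Q-values encode expected cumulative (negative) delay, so greedy action selection yields the minimum-delay route, which is the essence of the RL routing applications cited above.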

D. Quality of Service in SDN


The network will offer various levels of QoS depending on the offered traffic type, the volume of traffic at any given time, and the traffic's final destination [62]. Using DeepAir, a DRL-based adaptive intrusion response system, Software-Defined Networks (SDN) may successfully defend against cyber attacks; compared to Q-learning, DeepAir can significantly lower the ratio of QoS-violating traffic flows [63]. The SDN controller installs flow rules on the SDN-capable switches to decide the routing paths [64]. However, flow tables are constrained in size due to their high cost and energy consumption. The installation of rules in the SDN nodes' flow tables is impacted by this restriction, and ineffective rule administration can result in a reduction in the network's QoS. In [65], a DRL-based solution to the SDN flow tables' rule insertion issue is provided. The fundamental concept is to remove the rules that are expected to be used less; in this way, the objective of increasing the quantity of flows that the network can handle, and subsequently enhancing network QoS, is accomplished.
Due to the expansion of IoT applications, cloud services, mobility, and video streaming made available to internet users, networking systems are becoming more sophisticated. The problem is that the different QoS demands made by internet users are not satisfied. Thanks to machine learning, artificial intelligence can be integrated into the system to manage complex networks. To obtain a comprehensive network view and install the best routes in the routing/switching devices, the SDN platform is employed. The difficult part is computing the reward value for each choice and dynamically gathering the network parameters. The emulation is carried out on real-time network topologies, and the results are encouraging compared to the conventional link-state algorithm. The learning agent can gradually learn the path thanks to the RL algorithm for path determination (training time). For a single source and destination pair, the testing step takes n-1 comparisons to find the path [66].
Massive traffic utilization has adversely affected the network's QoS, causing congestion and degrading the end-to-end customer satisfaction expressed by QoE, particularly during night-time peak hours. An intelligent multimedia framework is introduced in [67] to exploit the integration of SDN and RL, which allows for the exploration, learning, and exploitation of potential paths for video streaming flows, optimizing users' QoE and the network's QoS.
E. Impact of AI/ML on SDNs
The impact of AI/ML on SDNs is described in Table II. AI facilitates intelligent resource optimization, promotes autonomous network management and control, and increases security. Even though the majority of the techniques are proofs of concept, they amply show how AI/ML may be used to manage QoS and QoE and automate networks. These results imply that SDNs can successfully implement AI/ML algorithms, potentially accommodating both present and future specifications.

TABLE II. IMPACT OF AI/ML ON SDNS

References | AI/ML Algorithms | Use Cases
[68]-[70] | Deep RL | Optimise network resource usage
[71] | Supervised ML | Guarantee traffic based on its QoS requirements
[72] | Matheuristic with ML-based prediction | Enhance autonomous network management and configuration
[73], [74] | Decision Table (DT), Bayesian Network (BayesNet), and Naive Bayes; Neural Network | Improve network security
[75] | Unsupervised ML for networking | Monitor load balancing
[76], [77] | Semi-supervised ML; RF, DT, KNN | Guarantee Quality of Service
[78] | ML and Deep Packet Inspection | High accuracy and applicability of classifier

F. Limitations of applying AI/ML in SDNs

AI/ML techniques are useful in many sectors, but they have major drawbacks. The same holds true for SDNs, as seen in Table III. There is potential for improvement, as many studies (76%) report having trouble putting AI/ML techniques into practice. The three main issues are learning distortion, difficulty in processing large amounts of data without sampling, and locating quality training sets.

TABLE III. LIMITATIONS OF AI/ML ALGORITHMS FOR QOS IN SDN ENVIRONMENTS

References | Method | Advantages | AI/ML Limitations
[79] | SDN-Cloud DDoS attack | Since a multi-controller SDN environment is emerging, attack mitigation at a single controller will result in increased computing overhead; high scalability support | Weak security between controllers and switches; reduces flow timeout duration; time-consuming operation
[80] | AI-aided SD-IoT | High scalability; less complexity; high QoS | Nil
[81] | FSM | Less overhead; does not require frequent flow migrations | Impossible to forecast an absolute transition; not suited for a large-scale environment
[82] | Sway | K-paths are determined; the route is constructed for multimedia traffic | Low throughput; low load-balance rate; low scalability; not suited for heterogeneous devices
[83] | DNN-SDR | Achieves load balance; network dimensionality is reduced | Time-consuming; single point of failure; poor scalability
[84] | FRI | Low packet-loss rate; predicts the best path for traffic | Large delay in best-path selection; low throughput; high traffic overhead
[85] | Robust security (SDN-5G) | Able to reduce security threats; increases flow timeout period | Authentication is time-consuming; searching time for packet assignment is extensive; conventional ECC is utilized to generate keys, reducing robustness and dependability

IV. CONCLUSION
To investigate the application of AI/ML approaches in SDNs, we examined 98 publications (out of a total of 1450). The findings imply that supervised learning algorithms considerably outperform unsupervised learning and reinforcement learning algorithms. According to the majority of studies, NNs are the most effective way to improve intelligence and optimize SDNs. However, in environments where numerous diverse devices compete for network resources (e.g., 5G networks), RL has seen a minor rise in adoption and may begin to see a larger rise in the number of SDN problems it is capable of resolving. Network management, automation, performance, and QoS are all enhanced through supervised and reinforcement learning.
ML is one of the most effective AI technologies for managing and operating autonomous networks because of its capacity to extract information from data, fuelled by the availability of data and the theoretical advancement of ML frameworks. Although there have been some analyses of the problems and difficulties for ML in different SDN-based networks, there is little proof that the applications have succeeded in providing workable management solutions for autonomous networks. In the latter part of this work, we concentrated on SDN network applications using ML techniques, discussed the future directions for this field of study, and noted the primary problems with ML approaches. Although ML has made considerable progress, effective ML remains challenging due to complex patterns and limited availability of training data; because of this, many ML programs frequently perform below expectations. We anticipate that our discussion will serve as a straightforward manual for the advancement of SDN and the creation of a more intelligent network. Researchers with various goals can use this study to better understand the fundamental problems in the subject. Future network design and management will depend heavily on SDN-concept networks using ML techniques in all areas, including resource management, intelligent routing management, network security, flow control, etc. Future research will examine in depth the main issues mentioned in this study. The use of AI/ML in SDN applications increases the potential and worth of these architectures in research and business. Future trends are expected to place an even greater emphasis on AI/ML techniques, as they offer substantial performance improvements.

REFERENCES
[1] Cisco: The Art of Application-Centric Networking. Tech. rep.,
https://www.cisco.com/c/dam/en/us/solutions/collateral/borderlessnetworks/officeextend-solution/cisco td 030513
fin.pdf
[2] M. R. Belgaum, S. Musa, M. M. Alam, M. M. Su'Ud, A Systematic Review of Load Balancing Techniques in Software-Defined Networking. IEEE Access 8 (2020) 98612–98636. https://doi.org/10.1109/ACCESS.2020.2995849
[3] P. P. Ray, N. Kumar, SDN/NFV architectures for edge-cloud oriented IoT: A systematic review. Computer
Communications 169 (2021) 129–153. https://doi.org/10.1016/J.COMCOM.2021.01.018
[4] N. Sultana, N. Chilamkurti, W. Peng, R. Alhadad, Survey on SDN based network intrusion detection system using machine learning approaches. Peer-to-Peer Networking and Applications 12(2) (2018) 493–501. https://doi.org/10.1007/S12083-017-0630-0
[5] K. Benzekki, A. El Fergougui, and A. E. Elalaoui,Software-defined networking (SDN): A survey,Secur. Commun.
Netw., 9(18) (2016) 5803-5833.
[6] S. Sezer, et al., “Are we ready for SDN? Implementation challenges for software-defined networks,” IEEE Commun.
Mag., 51(.7) (2013) 36-43.
[7] J. Xie, F. Richard Yu, T. Huang, R. Xie, J. Liu, C. Wang, Y. Liu, A survey of machine learning techniques applied to
software defined networking (SDN): Research issues and challenges. COMST 2018 21(1)(2019)393–430.
https://doi.org/10.1109/COMST.2018.2866942
[8] Y. Cao, R. Wang, M. Chen, A. Barnawi, AI Agent in Software-Defined Network: Agent-Based Network Service Prediction and Wireless Resource Scheduling Optimization. IEEE Internet of Things Journal 7(7) (2020) 5816–5826. https://doi.org/10.1109/JIOT.2019.2950730
[9] Y. Zhao, Y. Li, X. Zhang, G. Geng, W. Zhang, Y. Sun, "A survey of networking applications applying the software
defined networking concept based on machine learning." IEEE Access 7 (2019) 95397-95417.
[10] M. Latah, L. Toker, Artificial Intelligence Enabled Software Defined Networking: A Comprehensive Overview. IET
Networks 8(2) (2018) 79–99. https://doi.org/10.1049/iet-net.2018.5082
[11] S. Nanda, F. Zafari, C. DeCusatis, E. Wedaa, B. Yang, Predicting network attack patterns in SDN using machine learning approach. NFV-SDN (2016) 167–172. https://doi.org/10.1109/NFV-SDN.2016.7919493
[12] J. C. Ferreira, D. Teixeira & J. Macedo, Systematic literature review of AI/ML in software-defined networks using the snowballing approach (2021).
[13] L. Zhu, X. Tang, M. Shen, X. Du, and M. Guizani, Privacy-preserving DDoS attack detection using cross-domain traffic in Software Defined Networks, IEEE J. Sel. Area Comm., 36(3) (2018) 628-643.
[14] T. H. Lei, Y. T. Hsu, I. C. Wang, and C. H. P. Wen, Deploying QoS-assured service function chains with stochastic prediction models on VNF latency, in Proc. IEEE NFV-SDN, 2017, pp. 1-6.
[15] W. Li, X. Li, H. Li, and G. Xie, “CutSplit: A decision-tree combining cutting and splitting for scalable packet
classification,” in Proc. IEEE INFOCOM, 2018, pp.2645-2653.
[16] Y. Sorrachai, D. James, A. X. Liu, and E. Torng, “A sorted-partitioning approach to fast and scalable dynamic packet classification,” IEEE ACM T. Network, vol. 26, no. 4, pp. 1907-1920, 2018.
[17] Chen, X.F., Yu, S.Z.: ‘CIPA: A collaborative intrusion prevention architecture for programmable network and SDN’,
Computers and Security, 2016,58, pp. 1-19.

[18] Sabbeh, A., Al-Dunainawi, Y., Al-Raweshidy, H.S., Abbod, M.F.: ‘Performance prediction of software defined network
using an artificial neural network’. In Proc. of SAI Computing Conference (SAI), London, UK, July 2016, pp. 80-84.
[19] Chen-Xiao, C., Ya-Bin, X.: ‘Research on load balance method in SDN’, International Journal of Grid and Distributed
Computing, 2016,9,(1), pp. 25-36.
[20] L. S. R. Sampaio, et al., “Using NFV and reinforcement learning for anomalies detection and mitigation in SDN,” in
Proc IEEE ISCC, 2018, pp.432-437.
[21] F. Francois and E. Gelenbe, “Optimizing secure SDN-enabled inter-data centre overlay networks through cognitive routing,” in Proc. IEEE MASCOTS, 2016, pp. 283-288.
[22] X. Huang, T. Yuan, G. Qiao, and Y. Ren, “Deep reinforcement learning for multimedia traffic control in Software
Defined Networking,” IEEE Network, vol.32, no.6, pp. 35-41, 2018.
[23] N. C. Luong, et al., “Applications of deep reinforcement learning in communications and networking: A survey,” arXiv
preprint arXiv:1810.07862, 2018.
[24] Y. He, Z. Zhang, and Y. Zhang, “A big data deep reinforcement learning approach to next generation green wireless
networks,” in Proc IEEE GLOBECOM, 2017, pp. 1-6.
[25] K. Bao, J. D. Matyjas, F. Hu, and S. Kumar, “Intelligent Software-Defined Mesh Networks with link-failure adaptive
traffic balancing,” IEEE TCCN, vol.4, no.2, pp.266-276, 2018.
[26] Y. Yu, L. Guo, Y. Liu, J. Zheng, and Y. Zong, “An efficient SDN-based DDoS attack detection and rapid response platform in vehicular networks,” IEEE Access, vol. 6, pp. 44570-44579, 2018.
[27] D. Hu, P. Hong, and Y. Chen, “FADM: DDOS flooding attack detection and mitigation system in software-defined
networking,” in Proc. IEEE GLOBECOM, 2017, pp.1-7
[28] P. Wang, S. C. Lin, and M. Luo, “A framework for QoS-aware traffic classification using semi-supervised machine
learning in SDNs,” in Proc IEEE SCC, 2016, pp.760-765.
[29] Testolin, A., Zanforlin, M., De Grazia, M.D.F., et al.: ‘A machine learning approach to QoE-based video admission
control and resource allocation in wireless systems. In: Proc. of 13th Annual Mediterranean Ad Hoc Networking
Workshop (MED-HOCNET), Piran, Slovenia, June 2014, pp. 31-38.
[30] S. Nanda, F. Zafari, C. DeCusatis, E. Wedaa, and B. Yang, “Predicting network attack patterns in SDN using machine
learning approach,” in Proc. IEEE NFV-SDN, 2016, pp.167-172.
[31] C. Song, et al., “Machine-learning based threat-aware system in software defined networks,” in Proc. IEEE ICCCN,
2017, pp.1-9.
[32] D. Comaneci, and C. Dobre, “Securing networks using sdn and machine learning,” in Proc. IEEE CSE,2018, pp.194–
200.
[33] H.Z. Jahromi, A.Hines, and D.T. Delanev, “Towards applicationaware networking: ML-based end-to-end application
KPI/QoE metrics characterization in SDN,” in Proc. ICUFN, 2018, pp.126–131.
[34] Alvizu, R., Troia, S., Maier, G., Pattavina, A.: ‘Matheuristic with machine-learning-based prediction for software-defined mobile metro-core networks’, IEEE/OSA Journal of Optical Communications and Networking, 2017, 9, (9), pp. D19-D30.
[35] Gamer, T.: ‘Collaborative anomaly-based detection of large-scale internet attacks’, Computer Networks, 2012,56,(1),
pp. 169-185.
[36] He, M., Kalmbach, P., Blenk, A., Kellerer, W., Schmid, S.: ‘Algorithm-Data Driven Optimization of Adaptive
Communication Networks’. In Proc. of IEEE 25th International Conference on Network Protocols (ICNP), Toronto,
Canada, Oct. 2017, pp. 1-6
[37] Abubakar, A., Pranggono, B.: ‘Machine learning based intrusion detection system for software defined networks’. In
Proc. of Seventh International Conference on Emerging Security Technologies (EST), Canterbury, UK, Sept. 2017, pp.
138-143.
[38] P. Xiao, W. Qu, H. Qi, Y. Xu, and Z. Li, “An efficient elephant flow detection with cost-sensitive in SDN,” in Proc
IEEE INISCom, 2015, pp.24–28.
[39] M. Latah, and L. Toker, “Towards an efficient anomaly-based intrusion detection for software-defined networks,” IET
Networks, vol.7, no.6, pp.453-459, 2018.
[40] W. Li, X. Li, H. Li, and G. Xie, “CutSplit: A decision-tree combining cutting and splitting for scalable packet
classification,” in Proc. IEEE INFOCOM, 2018, pp.2645-2653.
[41] D. Côté, “Using machine learning in communication networks,” J. Opt. Commun. Netw., vol.10, no.10, pp. D100-D109,
2018.
[42] Y. Sorrachai, D. James, A. X. Liu, and E. Torng, “A sorted-partitioning approach to fast and scalable dynamic packet classification,” IEEE ACM T. Network, vol. 26, no. 4, pp. 1907-1920, 2018.
[43] D. Chemodanov, P.Calyam, S.Valluripally, H. Trinh, J. Patman & K. Palaniappan. (2018). On qoe-oriented cloud
service orchestration for application providers. IEEE Transactions on Services Computing, 14(4), 1194-1208.
[44] B. Leng, L. Huang, C. Qiao, and H. Xu, “A decision-tree-based online flow table compressing method in Software
Defined Networks,” in Proc IEEE/ACM IWQoS, 2016.pp.1-2.
[45] R. Gomes, M. Ahsan, and A. Denton, “Random forest classifier in SDN framework for user-based indoor localization,”
in Proc IEEE EIT, 2018, pp.537-542.
[46] Q. Cheng, et al., “Guarding the perimeter of cloud-based enterprise networks: An intelligent sdn firewall,” in Proc IEEE
HPCC,2018, pp.897-902.

[47] L. Barki, A. Shidling, N. Meti, D. G. Narayan, and M. M. Mulla, “Detection of distributed denial of service attacks in
software defined networks,” in Proc IEEE ICACCI, 2016, pp.2576-2581.
[48] P. C. Lin, P. C. Li , and V. L. Nguyen, “Inferring OpenFlow rules by active probing in software-defined networks,” in
Proc IEEE ICACT, 2017, pp.415-420.
[49] H. Kuang, Y. Qiu, R. Li, and X. Liu, “A hierarchical K-means algorithm for controller placement in SDN-Based WAN
architecture,” in Proc ICMTMA, 2018, pp.263-267.
[50] Z. Fan, and R. Liu, “Investigation of machine learning based network traffic classification,” in Proc ISWCS,2017, pp.1-
6.
[51] P. Wang, S. C. Lin, and M. Luo, “A framework for QoS-aware traffic classification using semi-supervised machine
learning in SDNs,” in Proc IEEE SCC, 2016, pp.760-765.
[52] J. Chavula, M. Densmore, and H. Suleman, “Using SDN and reinforcement learning for traffic engineering in
UbuntuNet Alliance,” in Proc ICACCE, 2016, pp.349-355.
[53] S. Sendra, A. Rego, J. Lloret, J. M. Jimenez, and O. Romero, “Including artificial intelligence in a routing protocol using
Software Defined Networks,” in Proc IEEE ICC, 2017, pp.670-674.
[54] C. Yu, J. Lan, Z. Guo, Z. Guo, and Y. Hu, “DROM: Optimizing the routing in Software-Defined Networks with deep
reinforcement learning,” IEEE Access, vol.6, pp.64533-64539, 2018.
[55] C. Wang, L. Zhang, Z. Li, and C. Jiang, “SDCoR: Software Defined cognitive routing for internet of vehicles,” IEEE
Internet Things, vol.5, no.5, pp. 3513 – 3520, 2018.
[56] P. Tang, F. Li, W. Zhou, W. Hu, and L. Yang, “Efficient auto-scaling approach in the telco cloud using self-learning
algorithm,” in Proc IEEE GLOBECOM, 2015, pp.1–6.
[57] T. Daher, S. B. Jemaa, and L. Decreusefond, “Softwarized and distributed learning for SON management systems,” in
Proc IEEE/IFIP NOMS, 2018, pp.1-7.
[58] N. C. Luong, et al., “Applications of deep reinforcement learning in communications and networking: A survey,” arXiv
preprint arXiv:1810.07862, 2018.
[59] Y. He, Z. Zhang, and Y. Zhang, “A big data deep reinforcement learning approach to next generation green wireless
networks,” in Proc IEEE GLOBECOM, 2017, pp. 1-6.
[60] S. C. Lin, I. F. Akyildiz, P. Wang, and M. Luo, “QoS-aware adaptive routing in multi-layer hierarchical software
defined networks: A reinforcement learning approach,” in Proc IEEE SCC, 2016, pp.25-33.
[61] Sultana, N., Chilamkurti, N., Peng, W., Alhadad, R.: Survey on SDN based network intrusion detection system using machine learning approaches. Peer-to-Peer Networking and Applications 12(2), 493–501 (2018). https://doi.org/10.1007/S12083-017-0630-0
[62] Fathy, C., & Saleh, S. N. (2022). Integrating deep learning-based IoT and fog computing with software-defined
networking for detecting weapons in video surveillance systems. Sensors, 22(14), 5075.
[63] Phan, T. V., & Bauschert, T. (2022). DeepAir: Deep reinforcement learning for adaptive intrusion response in software-defined networks. IEEE Transactions on Network and Service Management.
[64] Kim, G., Kim, Y., & Lim, H. (2022). Deep Reinforcement Learning-Based Routing on Software-Defined
Networks. IEEE Access, 10, 18121-18133.
[65] Jiménez-Lázaro, M., Berrocal, J., & Galán-Jiménez, J. (2022, April). Deep Reinforcement Learning Based Method for
the Rule Placement Problem in Software-Defined Networks. In NOMS 2022-2022 IEEE/IFIP Network Operations and
Management Symposium (pp. 1-4). IEEE.
[66] Raikar, M.M., Meena, S.M. (2022). Reinforcement Learning Based Routing in Software Defined Network. In: Rout,
R.R., Ghosh, S.K., Jana, P.K., Tripathy, A.K., Sahoo, J.P., Li, KC. (eds) Advances in Distributed Computing and
Machine Learning. Lecture Notes in Networks and Systems, vol 427. Springer, Singapore. https://doi.org/10.1007/978-
981-19-1018-0_16
[67] Al Jameel, Mohammed, Triantafyllos Kanakis, Scott Turner, Ali Al-Sherbaz, and Wesam S. Bhaya. 2022. "A
Reinforcement Learning-Based Routing for Real-Time Multimedia Traffic Transmission over Software-Defined
Networking" Electronics 11, no. 15: 2441. https://doi.org/10.3390/electronics11152441
[68] Huang, X., Yuan, T., Qiao, G., Ren, Y.: Deep Reinforcement Learning for Multimedia Traffic Control in Software
Defined Networking. IEEE Network 32(6), 35–41 (nov 2018). https://doi.org/10.1109/MNET.2018.1800097
[69] Liu, W.X., Zhang, J., Liang, Z.W., Peng, L.X., Cai, J.: Content Popularity Prediction and Caching for ICN: A Deep
Learning Approach with SDN. IEEE Access 6, 5075–5089 (dec 2017). https://doi.org/10.1109/ACCESS.2017.2781716
[70] Yu, C., Lan, J., Guo, Z., Hu, Y.: DROM: Optimizing the Routing in Software-Defined Networks with Deep Reinforcement Learning. IEEE Access 6, 64533–64539 (2018). https://doi.org/10.1109/ACCESS.2018.2877686
[71] Ganesan, E., Hwang, I., Liem, A. T., & Ab-Rahman, M. S. (2021, June). SDN-enabled FiWi-IoT smart environment
network traffic classification using supervised ML models. In Photonics (Vol. 8, No. 6, p. 201). Multidisciplinary
Digital Publishing Institute.
[72] Alvizu, R., Troia, S., Maier, G., Pattavina, A.: Matheuristic with machine-learning-based prediction for software-defined mobile metro-core networks. Journal of Optical Communications and Networking 9(9), D19–D30 (2017). https://doi.org/10.1364/JOCN.9.000D19
[73] Nanda, S., Zafari, F., Decusatis, C., Wedaa, E., Yang, B.: Predicting network attack patterns in SDN using machine
learning approach. NFV-SDN 2016 pp. 167–172 (may 2017). https://doi.org/10.1109/NFV-SDN.2016.7919493

[74] Azzouni, A., Pujolle, G.: NeuTM: A neural network-based framework for traffic matrix prediction in SDN. NOMS 2018
pp. 1–5 (jul 2018). https://doi.org/10.1109/NOMS.2018.8406199
[75] Usama, M., Qadir, J., Raza, A., Arif, H., Yau, K.L.A., Elkhatib, Y., Hussain, A., Al-Fuqaha, A.: Unsupervised Machine
Learning for Networking: Techniques, Applications and Research Challenges. IEEE Access 7, 65579–65615 (2019).
https://doi.org/10.1109/ACCESS.2019.2916648
[76] Wang, P., Lin, S.C., Luo, M.: A framework for QoS-aware traffic classification using semi-supervised machine learning
in SDNs. SCC 2016 pp. 760–765 (aug 2016). https://doi.org/10.1109/SCC.2016.133
[77] Owusu, A.I., Nayak, A.: An Intelligent Traffic Classification in SDNIoT: A Machine Learning Approach. BlackSeaCom
2020 (may 2020). https://doi.org/10.1109/BLACKSEACOM48709.2020.9235019
[78] Yu, C., Lan, J., Xie, J., & Hu, Y., QoS-aware traffic classification architecture using machine learning and deep packet
inspection in SDNs. Procedia computer science, 131(2018)1209-1216.
[79] K. Bhushan, B. Gupta, Distributed denial of service (ddos) attack mitigation in software defined network (sdn)-based
cloud computing environment, Journal of Ambient Intelligence and Humanized Computing (2018) 1–13.
[80] M. Begovic, S. Causevic, B. Memic & A. Haskovic, AI-aided traffic differentiated QoS routing and dynamic offloading in distributed fragmentation optimized SDN-IoT. Int. J. Eng. Res. Technol., 13(8) (2019) 1880-1895.
[81] F. AL-Tam and N. Correia, “Fractional switch migration in multi-controller software-defined networking,” Comput.
Networks, 157(2019)1–10.
[82] N. Saha, S. Bera, and S. Misra, “Sway: Traffic-Aware QoS Routing in Software-Defined IoT,” IEEE Transactions on
Emerging Topics in Computing, IEEE Computer Society, (2018)1–12.
[83] H. Yao, X. Yuan, P. Zhang, J. Wang, J. Chunxiao, and M. Guizani, Machine Learning Aided Load Balance Routing
Scheme Considering Queue Utilization,IEEE Trans. Veh. Technol.,(2019) 1–1.
[84] I. I. Awan, N. Shah, M. Imran, M. Shoaib, and N. Saeed, An improved mechanism for flow rule installation inband
SDN, J. Syst. Archit., 96(2019)1–19.
[85] I. H. Abdulqadder, D. Zou, I. T. Aziz, B. Yuan & W. Dai, Deployment of robust security scheme in SDN based 5G network over NFV enabled cloud environment. IEEE Transactions on Emerging Topics in Computing, 9(2) (2018) 866-877.


Preprocessing and Segmentation of Retinal Blood Vessels in Fundus Images using U-Net

Mrs. R Sudha Abirami1 and Dr. G Suresh Kumar2
1Research Scholar, 2Assistant Professor
Department of Computer Science and Engineering, Pondicherry University Karaikal Campus, Karaikal
Email: [email protected], [email protected]

Abstract— Deep learning plays an important role today in disease detection and prediction. All deep learning models need to be trained to process the input, extract features, and return prediction results. Before classification and prediction, the given input must be preprocessed and segmented, aided by augmentation. Only with the help of preprocessed images can each model make accurate predictions at higher speeds. This proposed work aims to detect diabetic eye diseases by segmenting the augmented images using U-Net, which is known for its encoder-decoder architecture for down- and up-sampling. The retinal blood vessel is one of the most intricate parts of the eye, and its characteristics indicate whether the eye is affected by diabetic retinopathy. Segmenting the blood vessels therefore helps classify the disease category at an early stage, and U-Net is particularly suited to segmenting medical images. In this paper, we discuss preprocessing eye images from the dataset, segmenting those images using U-Net to extract the retinal blood vessels, and classification based on the segmentation.

Index Terms— Deep Learning, Prediction, Classification, Segmentation, Augmentation, U-Net, Diabetic Eye Diseases.

I. INTRODUCTION
A. Image Processing:
Image processing as the name suggests, means processing images, and many techniques are required to reach the
goal. It is the core domain of computer vision that plays a key role in many real-world examples, such as
robotics, self-driving cars, and object recognition. Image processing allows us to transform and manipulate
thousands of images simultaneously and derive useful insights from them. It has a wide range of applications in
almost every field. The final output may be in the form of either an image or an equivalent feature of that image.
This can be used for further analysis and decision making. An image can be represented as a 2D function F(x, y), where x and y are the spatial coordinates. The amplitude of F at a particular value of (x, y) is known as the intensity of the image at that point. If the x, y, and amplitude values are finite, it can be called a digital image. This is an array of pixels arranged in columns and rows. A pixel is an image element that contains information about intensity and color. Images can also be represented in 3D, where x, y, and z are spatial coordinates and the pixels are arranged in a matrix; this is called an RGB image. There are different types of images: i) RGB image - contains three layers of a 2D image, these layers being the red, green, and blue channels; ii) Grayscale image - contains shades of black and white in only one channel.

B. Segmentation:
Image segmentation is a computer vision task that divides an image into regions by assigning a label to each
pixel in the image. It provides more information about the image than object detection, which draws bounding
boxes around detected objects, or image classification, which assigns labels to objects. Segmentation is useful
and can be used in real-world applications such as medical imaging, clothing segmentation, flood maps, and self-
driving cars. There are two types of image segmentation.
 Semantic Segmentation: Classify each pixel with a label.
 Instance Segmentation: Classifies each pixel to distinguish each object instance.
In Ref. [1], U-Net is a semantic segmentation method originally proposed for medical image segmentation. It was one of the early deep learning segmentation models, and the U-Net architecture (Figure 1) is also used in many GAN variants, such as the Pix2Pix generator.
In Ref. [2], the U-Net is an elegant architecture that solves most of the occurring issues. It uses the concept of
fully convolutional networks for this approach. The intent of the U-Net is to capture both the features of the
context as well as the localization. This process is completed successfully by the type of architecture built. The
main idea of the implementation is to utilize successive contracting layers, which are immediately followed by
the up-sampling operators for achieving higher resolution outputs on the input images.

Figure 1. U-Net Architecture
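To make the contracting/expanding idea concrete, the following is a minimal single-level U-Net-style sketch in Keras; the input size, filter counts, and depth are illustrative assumptions, far smaller than the original architecture of [1].

from tensorflow.keras import layers, Model

def conv_block(x, filters):
    # two 3x3 convolutions, as in the contracting and expanding paths
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

inp = layers.Input((128, 128, 1))      # grayscale fundus patch (assumed size)
c1 = conv_block(inp, 16)
p1 = layers.MaxPooling2D()(c1)         # contracting path: downsample
c2 = conv_block(p1, 32)
u1 = layers.UpSampling2D()(c2)         # expanding path: upsample
m1 = layers.concatenate([u1, c1])      # skip connection preserves localization
c3 = conv_block(m1, 16)
out = layers.Conv2D(1, 1, activation="sigmoid")(c3)  # per-pixel vessel probability

model = Model(inp, out)
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()

The skip connection is the key design choice: it merges high-resolution features from the contracting path into the expanding path, giving both context and precise localization.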

This paper is arranged in the following manner: Chapter II contains the Literature Survey, Chapter III Image Preprocessing, Chapter IV Segmentation, Chapter V the Proposed Method, Chapter VI the Segmentation Results and Discussion, and Chapter VII the Conclusion.

II. LITERATURE SURVEY


In Ref. [4], a typical CNN has a multi-layered structure like feed-forward networks. Unlike feed-forward networks, a CNN can include several convolutional layers with a sub-sampling section. All the parameters were calculated for the CNN after three image processing stages: resizing the image, applying histogram equalization, and applying CLAHE. After the image processing-based enhancement, the classification was made using the CNN. The performance of the introduced method was evaluated using 400 retinal fundus images from the MESSIDOR database. In Ref. [5], the work shows that the green channel of the RGB model exhibits the best contrast between the vessels and the background, while the red and blue channels tend to be noisier. The grey image from the green channel is processed; the retinal blood vessels appear darker in the grey image, which is then inverted so that they appear brighter than the non-vessel background. Salt-and-pepper noise is added in order to represent the presence of noise, and order and median filters are used to remove it. The output of the order filter gives better contrast between the vessels and the background, thereby removing the noise more accurately than the other filters.
In Ref. [6], the study aimed to detect the optic disc. Some of the features of diabetic retinopathy are exudates, hemorrhages, and microaneurysms, and detection and removal of the optic disc plays a vital role in the extraction of these features. This paper focuses on detection of the optic disc using various image processing techniques and algorithms such as Canny edge detection and the Circular Hough Transform (CHT). Retinal images from the IDRiD, Diaret_db0, Diaret_db1, Chasedb, and Messidor datasets were used.
In Ref. [7], the proposed model was trained in three variants: back-propagation NN, Deep Neural Network (DNN), and Convolutional Neural Network (CNN). After testing the models, the CPU-trained neural network gives the lowest accuracy because of its single hidden layer, whereas the deep learning models outperform the NN. The deep learning models are capable of quantifying features such as blood vessels, fluid drip, exudates, hemorrhages, and microaneurysms into different classes. In Ref. [8], glaucoma is a group of conditions in which high pressure inside the eye damages the optic nerve. The vision lost due to glaucoma is irreversible and cannot be regained, so it is very important to detect this disease as early as possible and treat it early to preserve vision. In this paper, the performance of five preprocessing techniques is compared, namely contrast adjustment, adaptive histogram equalization, median filtering, average filtering, and homomorphic filtering. The performances of these techniques are evaluated using Mean Square Error (MSE) and Peak Signal to Noise Ratio (PSNR).
In Ref. [9], proposed automated methods consist of pre-processing, blood vessels extraction, optic disc
segmentation and macula region segmentation. Initially, pre-processing is performed using shade correction and
top-hat transformation for enhancement of dark anatomical structures such as blood vessels and macula/fovea
region. A novel graph cut method is used to extract blood vessels. Then template based matching and
morphological operations are used for detection and extraction of optic disc. Finally, post processing is used for
detection of the macula in retinal images. In Ref. [10], the proposed work was an OD segmentation model for fundus images based on a RetinaNet extension with DenseNet that addresses the vanishing gradient problem, enhances feature propagation, performs deep supervision, strengthens feature reuse, and reduces the number of parameters. The model was developed based on the promising results achieved by RetinaNet and DenseNet in many object detection problems. Combining both models facilitates the reuse of computation through dense connections and improves gradient flow.
In Ref. [11], segmenting the optic disc (OD) is an important and essential step in creating a frame of reference
for diagnosing optic nerve head pathologies such as glaucoma. The main contribution of this paper is in
presenting a novel OD segmentation algorithm based on applying a level set method on a localized OD image.
To prevent the blood vessels from interfering with the level set process, an in-painting technique was applied.
The new automatic eye disease diagnosis system has to be robust, fast, and highly accurate, in order to support
high workloads and near-real-time operation.

III. IMAGE PREPROCESSING


The purpose of pre-processing is to raise the image's quality so that we can analyze it more effectively. Preprocessing allows us to eliminate unwanted distortions and improve specific qualities that are essential for the application we are working on; those characteristics may change depending on the application. There are four different types of image pre-processing techniques, as follows:
1. Pixel brightness transformations / brightness corrections
2. Geometric transformations
3. Image filtering and segmentation
4. Fourier transform and image restoration
The brightness of a pixel is altered by pixel brightness transformations (PBT), which depend on the characteristics of the individual pixel: the value of the output pixel depends only on the value of the matching input pixel.

Enhancing contrast is a crucial component of image processing for both human and machine vision. It is
commonly used in speech recognition, texture synthesis, medical image processing, and many other image/video
processing applications as a pre-processing step.
The most common pixel brightness transform operations are
1. Gamma correction or power-law transform
2. Sigmoid stretching
3. Histogram equalization
Two commonly used point processes are multiplication and addition with a constant:
g(x)=αf(x)+β (1)
The parameters α>0 and β are called the gain and bias parameters and sometimes these parameters are said to
control contrast and brightness respectively.
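A minimal sketch of this gain/bias point operation with OpenCV follows; the file name and parameter values are placeholders.

import cv2

img = cv2.imread("fundus.png")     # hypothetical input image
alpha, beta = 1.5, 20              # gain (contrast) and bias (brightness)
out = cv2.convertScaleAbs(img, alpha=alpha, beta=beta)  # g(x) = alpha*f(x) + beta
cv2.imwrite("fundus_adjusted.png", out)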
A. Histogram Equalization
It is a well-known contrast enhancement technique due to its performance on almost all types of images.
Histogram equalization provides a sophisticated method for modifying the dynamic range and contrast of an
image by altering that image such that its intensity histogram has the desired shape. Unlike contrast stretching,
histogram modeling operators may employ non-linear and non-monotonic transfer functions to map between
pixel intensity values in the input and output images.
The normalized histogram can be represented as,
P(n) = (number of pixels with intensity n) / (total number of pixels)
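A minimal sketch of histogram equalization on a grayscale image with OpenCV follows; the file name is a placeholder.

import cv2

gray = cv2.imread("fundus.png", cv2.IMREAD_GRAYSCALE)
equalized = cv2.equalizeHist(gray)   # remap intensities to flatten the histogram
cv2.imwrite("fundus_equalized.png", equalized)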
B. Image Filtering and Segmentation
The purpose of utilizing filters is to change or improve the qualities of the images and/or to extract important data from them, such as edges, corners, and blobs. A filter is defined by a kernel, a tiny array applied to each pixel and its neighbors within an image. Some of the basic filtering techniques are: i. low-pass filtering (smoothing); ii. high-pass filtering (edge detection, sharpening); iii. directional filtering; iv. Laplacian filtering.
Image segmentation is a widely used method in digital image processing and analysis to divide an image into various parts or regions, often based on the properties of the image's pixels. Foreground and background can be distinguished in an image using segmentation, and pixels can be grouped together according to their similarity in color or shape. Image segmentation is mainly used in face detection, medical imaging, machine vision, and autonomous driving.
C. Fourier Transform
In Ref. [3], the Fourier transform is an important image processing tool used to decompose an image into its sine and cosine components. The output of the transform represents the image in the Fourier (frequency) domain, while the input image is its equivalent in the spatial domain. In Fourier-domain images, each point represents a specific frequency contained in the spatial-domain image. Fourier transforms are used in a variety of applications such as image analysis, image filtering, image reconstruction, and image compression.
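A minimal sketch of moving an image into the frequency domain with NumPy and OpenCV follows; the file name is a placeholder.

import cv2
import numpy as np

gray = cv2.imread("fundus.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
F = np.fft.fftshift(np.fft.fft2(gray))   # centred 2D Fourier spectrum
magnitude = 20 * np.log(np.abs(F) + 1)   # log-magnitude for visualization
cv2.imwrite("fundus_spectrum.png", magnitude.astype(np.uint8))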

IV. IMAGE SEGMENTATION


A. What is Segmentation?
Segmentation is the separation of one or more regions or objects in an image based on a discontinuity or a similarity criterion. In Ref. [21], a region in an image can be defined by its border (edge) or its interior, and the two representations are equivalent.
Segmentation approaches:
• Pixel-based segmentation: each pixel is segmented based on gray-level values, no contextual information, only
histogram. Example: Thresholding.
• Region-based segmentation: considers gray-levels from neighboring pixels by – including similar neighboring
pixels (region growing), – split-and-merge, – or watershed segmentation.
• Edge-based segmentation: Detects and links edge pixels to form contours.
Following are the primary types of image segmentation techniques: (in Figure 2)
1. Edge-Based Segmentation
2. Thresholding Segmentation
3. Region-Based Segmentation
4. Clustering-Based Segmentation

5. Watershed Segmentation

Figure 2. Segmentation approaches

1. Edge-Based Segmentation: Edge-based segmentation is a popular image processing technique that identifies
the edges of various objects in a given image. It helps locate features of associated objects in the image using the
information from the edges. Edge detection helps strip images of redundant information, reducing their size and
facilitating analysis.
Edge-based segmentation algorithms identify edges based on contrast, texture, color, and saturation variations.
They can accurately represent the borders of objects in an image using edge chains comprising the individual
edges.
2. Thresholding Segmentation: Thresholding is the simplest image segmentation method, dividing pixels based on their intensity relative to a given value or threshold. It is suitable for segmenting objects with higher intensity than other objects or backgrounds. The threshold value T can work as a constant in low-noise images; in some cases, it is possible to use dynamic thresholds. Thresholding divides a grayscale image into two segments based on their relationship to T, producing a binary image (a minimal code sketch follows this list).
3. Region-Based Segmentation: Region-based segmentation involves dividing an image into regions with similar
characteristics. Each region is a group of pixels, which the algorithm locates via a seed point. Once the algorithm
finds the seed points, it can grow regions by adding more pixels or shrinking and merging them with other
points.
4. Clustering-Based Segmentation: Clustering algorithms are unsupervised classification algorithms that help
identify hidden information in images. They augment human vision by isolating clusters, shadings, and
structures. The algorithm divides images into clusters of pixels with similar characteristics, separating data
elements and grouping similar elements into clusters.
5. Watershed Segmentation: Watersheds are transformations in a grayscale image. Watershed segmentation
algorithms treat images like topographic maps, with pixel brightness determining elevation (height). This
technique detects lines forming ridges and basins, marking the areas between the watershed lines. It divides
images into multiple regions based on pixel height, grouping pixels with the same gray value. The watershed
technique has several important use cases, including medical image processing. For example, it can help identify
differences between lighter and darker regions in an MRI scan, potentially assisting with diagnosis.
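As referenced in the thresholding item above, the following is a minimal sketch of thresholding with OpenCV, using both a fixed T and Otsu's method as one standard dynamic choice; the constant threshold value and the file name are placeholders.

import cv2

gray = cv2.imread("fundus.png", cv2.IMREAD_GRAYSCALE)
_, fixed = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)   # constant T
t, otsu = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print("Otsu-selected threshold:", t)     # data-driven T for noisier images
cv2.imwrite("fundus_binary.png", otsu)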
B. CNN for Image Segmentation:
In Ref. [12], Convolutional Neural Networks (CNNs) were applied to the semantic segmentation of remote sensing images, and the encoder-decoder CNN structures SegNet (with index pooling) and U-Net were improved to make them suitable for multi-target semantic segmentation of remote sensing images.
A convolutional neural network is a hierarchical model whose input is raw data, such as an RGB image or raw audio data. Convolutional neural networks extract high-level semantic information layer by layer from the raw input by stacking a series of operations such as convolution, pooling, and the mapping of non-linear activation functions. This process is called the "feed-forward operation". The different types of operations in convolutional neural networks are called "layers": convolution operations are convolutional layers and pooling operations are pooling layers. The last layer of a convolutional neural network transforms its target task (classification, regression, etc.) into the objective function. By calculating the error or loss between the predicted value and the real value, the error is propagated backward layer by layer by the back-propagation algorithm to update the parameters of every layer, repeatedly, until the network model converges.
In Ref. [13], over past years, a lot of research has been carried out on retinal blood vessel segmentation for identification of diabetic retinopathy using various machine learning and deep learning models. In this research work, a Convolutional Neural Network (CNN) and CLAHE are applied together to tackle the problem of retinal blood vessel segmentation on the DRIVE dataset. The method comprises pre-processing (grey-scale conversion and CLAHE), feature extraction using morphological features, segmentation, and training and prediction using the CNN. Experimental evaluation shows that the proposed method achieves 98.06% accuracy.
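A minimal sketch of the grey-scale conversion and CLAHE preprocessing step described above follows; the clip limit and tile size are common defaults, not necessarily the values used in [13], and the file name is a placeholder.

import cv2

img = cv2.imread("drive_image.png")              # hypothetical DRIVE image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)     # grey-scale conversion
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(gray)                     # local contrast enhancement
cv2.imwrite("drive_clahe.png", enhanced)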
In Ref. [14], a comprehensive review of the literature covers a broad spectrum of pioneering works for semantic
and instance-level segmentation, including fully convolutional pixel-labeling networks, encoder-decoder
architectures, multi-scale and pyramid-based approaches, recurrent networks, visual attention models, and
generative models in adversarial settings. The authors investigate the similarities, strengths and challenges of
these deep learning models, examine the most widely used datasets, report performances, and discuss promising
future research directions in this area. In Ref. [15], with the advent of neural networks, deep convolutional neural
networks (DCNNs) provide benchmark results on problems related to computer vision. Manifold DCNNs have
been proposed for semantic segmentation, such as U-Net, DeepU-Net, ResUNet, DenseNet, RefineNet, etc. The
general procedure is common to all the models and has three phases: pre-processing, processing and output
generation. The outputs of the processing phase are the masked image and the segmented image. A systematic
critique of the existing DCNNs for semantic segmentation is presented in that paper, and the datasets and the
architectures of the existing models are also discussed with illustrations.
In Ref. [16], the proposed model has 13 layers and uses dilated convolution and max-pooling to extract small
features. The Ghost model deletes duplicated features, simplifies the process, and reduces complexity. The
Convolutional Neural Network (CNN) generates a feature vector map and improves the accuracy of area or
bounding-box proposals. Restructuring is required for healing; as a result, convolutional neural networks
segment the medical images, and the initial region of a segmented medical image can be acquired. The proposed
model gives better results compared to the traditional models, with an accuracy of 96.05%, a precision of 98.2%,
and a recall of 95.78%. In Ref. [17], the authors present a network and training strategy that relies on the strong
use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a
contracting path to capture context and a symmetric expanding path that enables precise localization. They show
that such a network can be trained end-to-end from very few images and outperforms the prior best method (a
sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron
microscopic stacks.
In Ref. [18], after researching various techniques, the authors found that the CNN is one of the most powerful
tools for image segmentation. A detailed analysis of the CNN is also given there, explaining its different layers
and the workings of each layer. CNN technology is seeing a boom in implementation nowadays, making human
life more convenient and less manual; a lot of work has already been done in various fields like communication,
medical tasks, crop monitoring, road transportation, activity detection, and product quality monitoring. In Ref.
[19], a novel attention Gabor network (AGNet) based on deep learning is proposed for medical image
segmentation, capable of automatically paying more attention to edges and consistently improving segmentation
performance. The proposed model consists of two components: the first determines the approximate location of
the organs of interest in the image using convolution filters, and the other highlights salient edge features
intended for a specific segmentation task using Gabor filters. In order to facilitate collaboration between the two
parts, a region attention mechanism based on Gabor maps is suggested. The mechanism improved performance
by learning to focus on the salient regions of the image that are useful for the authors' tasks.
In Ref. [20], fully automatic segmentation of wound areas in natural images is an important part of the diagnosis
and care protocol, since it is crucial to measure the area of the wound and provide quantitative parameters for the
treatment. Various deep learning models have gained success in image analysis, including semantic
segmentation. The manuscript proposes a novel convolutional framework based on MobileNetV2 and connected
component labeling to segment wound regions from natural images. The advantage of this model is its
lightweight, less compute-intensive architecture; performance is not compromised and is comparable to deeper
neural networks. The authors build an annotated wound image dataset consisting of 1,109 foot ulcer images from
889 patients to train and test the deep learning models.

V. PROPOSED METHOD
A. Dataset:
In the proposed method, the U-Net model is trained with the DRIVE dataset. This dataset contains 20 training
images with masks and manual ground truth images. The images were enhanced using preprocessing techniques
such as grayscale transformation, brightness transformation, CLAHE (Contrast Limited Adaptive Histogram
Equalization), and edge extraction by Canny edge detection. The aim of this method is to segment the blood
vessels from the retina for classification of disease severity.
B. Flow Diagram:
The flow diagram (Figure 3) depicts the workflow of the process to extract blood vessels and segment them using
the U-Net model. In the proposed method, images are acquired from the dataset. The RGB image is first
converted into a grayscale image, and the gray image is fed to the CLAHE (Contrast Limited Adaptive Histogram
Equalization) process. The image is then processed with the Canny edge detection method; by this, the retinal
vessels are extracted from the image. The output is taken as the ground truth image for the segmentation process.
The U-Net model is trained to segment the retinal blood vessels from the given image. The model is implemented
in Google Colab on an NVIDIA Tesla K80 GPU, and its performance is measured by comparing the segmented
images with the ground truth.
[Figure: pipeline of Image Acquisition → RGB Image to Gray Image → Histogram Equalization (CLAHE) →
Denoising → Edge Detection Method (Canny) → Ground Truth Image → Train the Model with U-Net →
Segmented Image → Result Evaluation]
Figure 3. Flow Diagram of Proposed Method

C. ALGORITHM:
Input: Ii, images read from the dataset.
Process:
Step 1: Convert the RGB image to a grayscale image.
Step 2: Increase brightness using CLAHE.
Step 3: Denoise the image with the non-local means denoising function.
Step 4: Detect the edges with the Canny edge detector to obtain the ground truth image, GIi.
Step 5: Train the model, Ui = Ii + GIi.
Step 6: Obtain the segmented image, Si.
Output: Retinal blood vessel extraction.
D. CLAHE (Contrast Limited Adaptive Histogram Equalization):
CLAHE operates on small regions of the image, called tiles, rather than the entire image. The neighboring
tiles are then combined using bilinear interpolation to remove the artificial boundaries. This algorithm can be
applied to improve the contrast of images. There are two parameters to be considered (a short OpenCV sketch
follows this list). They are:
i. clipLimit – sets the threshold for contrast limiting; the default value is 40.
ii. tileGridSize – sets the number of tiles per row and column, by default 8×8; it is used when the image is
divided into tiles for applying CLAHE.
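The two parameters map directly onto OpenCV's CLAHE object. The sketch below is illustrative only (the file
names are assumptions); the values shown match the defaults mentioned above:

    import cv2

    gray = cv2.imread("retina.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input

    # clipLimit caps each tile's histogram before equalization;
    # tileGridSize sets the number of tiles per row and column.
    clahe = cv2.createCLAHE(clipLimit=40.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(gray)
    cv2.imwrite("retina_clahe.png", enhanced)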

E. De-noising Images:
In Ref. [22], the basic idea of non-local means (NLM) denoising is to build a pointwise estimate of the image,
where each pixel is obtained as a weighted average of pixels centered at regions that are similar to the region
centered at the estimated pixel. For a given pixel xi in an image x, NLM(xi) denotes the NLM-filtered value.
Let w_{i,j} be the weight of x_j with respect to x_i, which is computed as

    w_{i,j} = \frac{1}{C_i} \exp\left( -\frac{\| x(N_i) - x(N_j) \|^2}{h^2} \right)    (2)

where C_i denotes a normalization factor, h indicates a filter parameter, and N_i and N_j are the neighborhoods
centered at x_i and x_j.
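In practice the filter of Eq. (2) does not need to be written by hand, since OpenCV ships an implementation. The
sketch below is illustrative only (the file names and parameter values are assumptions, not the paper's settings):

    import cv2

    gray = cv2.imread("retina_clahe.png", cv2.IMREAD_GRAYSCALE)

    # Arguments: source, destination (None), filter strength h from Eq. (2),
    # patch (template) window size, and search window size.
    denoised = cv2.fastNlMeansDenoising(gray, None, 10, 7, 21)
    cv2.imwrite("retina_denoised.png", denoised)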
F. Canny Edge Detection Method:
This method involves four major steps (a short OpenCV sketch follows this list):
• Reduce noise using Gaussian smoothing.
• Compute the image gradient using the Sobel filter.
• Apply non-maximum suppression (NMS) to keep only the local maxima.
• Finally, apply hysteresis thresholding, which uses the two threshold values T_lower and T_upper passed to
the Canny() function.
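A minimal OpenCV call covering the last step (illustrative; the threshold values are assumptions to be tuned per
dataset):

    import cv2

    denoised = cv2.imread("retina_denoised.png", cv2.IMREAD_GRAYSCALE)

    # Hysteresis thresholds: gradients above T_upper are strong edges; those
    # between T_lower and T_upper survive only if linked to a strong edge.
    T_lower, T_upper = 50, 150
    edges = cv2.Canny(denoised, T_lower, T_upper)
    cv2.imwrite("ground_truth.png", edges)   # used as the ground truth image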
Figure 4 shows the original image, the preprocessed images, the edge image, and the segmented image.

Figure 4. (a) Original image, (b) Grayscale image, (c) CLAHE image, (d) Canny Edge image and (e) Segmented image.

VI. SEGMENTATION RESULTS


After preprocessing, the U-Net model is trained with images and masks. Original images and ground truth
images are given as training and validation inputs: images are assigned to the x input and masks to the y target
to perform 2D convolutions. The DRIVE dataset contains manually annotated masks that serve as ground truth
images. The model performs well across all epochs; the number of epochs was varied from 5 to 100. The
following table (Table 1) and Figure 5 show the accuracy achieved for each number of epochs (a minimal
training sketch follows).
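For orientation, the sketch below trains a deliberately reduced U-Net in Keras on random stand-in arrays shaped
like image/mask pairs. It is not the authors' architecture or data pipeline, only a minimal illustration of giving
images as x and masks as y:

    import numpy as np
    from tensorflow.keras import layers, Model

    def tiny_unet(shape=(128, 128, 1)):
        """A reduced U-Net: two down blocks, a bridge, two up blocks."""
        inp = layers.Input(shape)
        c1 = layers.Conv2D(16, 3, activation="relu", padding="same")(inp)
        p1 = layers.MaxPooling2D()(c1)
        c2 = layers.Conv2D(32, 3, activation="relu", padding="same")(p1)
        p2 = layers.MaxPooling2D()(c2)
        b = layers.Conv2D(64, 3, activation="relu", padding="same")(p2)
        u2 = layers.concatenate([layers.UpSampling2D()(b), c2])   # skip link
        c3 = layers.Conv2D(32, 3, activation="relu", padding="same")(u2)
        u1 = layers.concatenate([layers.UpSampling2D()(c3), c1])  # skip link
        c4 = layers.Conv2D(16, 3, activation="relu", padding="same")(u1)
        out = layers.Conv2D(1, 1, activation="sigmoid")(c4)       # vessel mask
        return Model(inp, out)

    # Random stand-ins for the DRIVE arrays: x = images, y = binary masks.
    x = np.random.rand(4, 128, 128, 1).astype("float32")
    y = (np.random.rand(4, 128, 128, 1) > 0.5).astype("float32")

    model = tiny_unet()
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    model.fit(x, y, batch_size=2, epochs=5, validation_split=0.25)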

A. Result Comparison:
In previous research works, the authors used Convolutional Neural Network models and the H-minima method
for classification using the DRIVE dataset; segmentation, however, helps the classification task in a better way,
and the U-Net model is specifically meant for bio-medical images. This model predicts segmented images that
closely match the ground truth image, and the edge images were generated automatically. The method is
implemented in Google Colab with a GPU environment. From the results, the proposed model is compared with
existing works on different datasets and with different models, and the proposed method performs well with the
U-Net model. The following table (Table 2) shows a comparison of the datasets used and the accuracy obtained
by other authors.

VII. CONCLUSION
In the proposed method, the DRIVE dataset is used and Canny edge detection with the U-Net model is
implemented. The model uniquely identifies all the edges present in the original eye image, which represent the
blood vessels of the retina. From the identified edges, the U-Net model is trained with images and masks for
various numbers of epochs. Finally, it achieves better accuracy compared to previous works. This work can be
enhanced further by applying different datasets such as IDRiD, Messidor-2, STARE, etc., and the segmentation
can also be done with other models like Attention U-Net and Res U-Net. Owing to its easy access and good
performance, the proposed method accelerates the diagnosis of Diabetic Eye Diseases in the early stages.
REFERENCES
[1] https://pyimagesearch.com/2022/02/21/u-net-image-segmentation-in-keras/
[2] https://blog.paperspace.com/unet-architecture-image-segmentation/
[3] https://www.mygreatlearning.com/blog/introduction-to-image-pre-processing/
[4] D. Jude Hemanth, Omer Deperlioglu, Utku Kose, “An enhanced diabetic retinopathy detection and classification
approach using deep convolutional neural network”, https://doi.org/10.1007/s00521-018-03974-0.
[5] P. Pearline Sheeba, V. Radhamani, “Analyzing and Feature Extraction of Diabetic Retinopathy in Retinal Images”,
International Journal of Engineering Research & Technology (IJERT) http://www.ijert.org, ISSN: 2278-0181.
[6] Priyanka Konatham, Mounika Venigalla, Lakshmi Pooja Amaraneni, K. Suvarna Vani, “Automatic Detection of Optic
Disc for Diabetic Retinopathy”, International Journal of Innovative Technology and Exploring Engineering (IJITEE),
ISSN: 2278-3075 (Online), Volume-9 Issue-7, May 2020.
[7] Suvajit Dutta, Bonthala CS Manideep, Syed Muzamil Basha, Ronnie D. Caytiles1 and N. Ch. S. N. Iyengar,
“Classification of Diabetic Retinopathy Images by Using Deep Learning Models”,
http://dx.doi.org/10.14257/ijgdc.2018.11.1.09.
[8] S. Rathinam, S. Selvarajan, “Comparison of Image Preprocessing Techniques on Fundus Images for Early Diagnosis
of Glaucoma”, International Journal of Scientific & Engineering Research, Volume 4, Issue 12, December 2013,
ISSN 2229-5518.
[9] P. R. Wankhede, K. B. Khanchandani, “Feature Extraction In Retinal Images Using Automated Methods”, International
Journal Of Scientific & Technology Research Volume 9, Issue 03, March 2020 ISSN 2277-8616.
[10] Manal AlGhamdi, “Optic Disc Segmentation in Fundus Images with Deep Learning Object Detector”, Journal of
Computer Science 2020, 16 (5): 591.600 DOI: 10.3844/jcssp.2020.591.600.
[11] Ahmed Almazroa, Weiwei Sun, Sami Alodhayb, Kaamran Raahemifar, Vasudevan Lakshminarayanan, “Optic disc
segmentation for glaucoma screening system using fundus images”, https://doi.org/10.2147/OPTH.S140061.
[12] Muhammad Alam, Jian-Feng Wang, Cong Guangpei, LV Yunrong, Yuanfang Chen, “Convolutional Neural Network for
the Semantic Segmentation of Remote Sensing Images”, Mobile Networks and Applications (2021) 26:200–215,
https://doi.org/10.1007/s11036-020-01703-3.
[13] Arun Kumar Yadav, Arti Jain, Jorge Luis Morato Lara and Divakar Yadav, “Retinal Blood Vessel Segmentation using
Convolutional Neural Networks ”, DOI: 10.5220/0010719500003064.
[14] Shervin Minaee, Yuri Boykov, Fatih Porikli, Antonio Plaza, Nasser Kehtarnavaz, and Demetri Terzopoulos, “Image
Segmentation Using Deep Learning: A Survey”, arXiv: 2001.05566v5 [cs.CV] 15 Nov 2020.
[15] Rishipal Singh , Rajneesh Rani, “Semantic Segmentation using Deep Convolutional Neural Network: A Review”,
International Conference On Intelligent Communication And Computational Research (ICICCR-2020),
https://ssrn.com/abstract=3565919.
[16] Marcelo Zambrano-Vizuete, Miguel Botto-Tobar ,Carmen Huerta-Sua´rez, Wladimir Paredes-Parada, Darwin Patiño
Perez, Tariq Ahamed Ahanger, and Neilys Gonzalez, “Segmentation of Medical Image Using Novel Dilated Ghost Deep
Learning Model”, Computational Intelligence and Neuroscience, https://doi.org/10.1155/2022/6872045.

[17] Olaf Ronneberger, Philipp Fischer, and Thomas Brox, “U-Net: Convolutional Networks for Biomedical Image
Segmentation”, arXiv: 1505.04597v1 [cs.CV] 18 May 2015.
[18] Ravi Kaushik, Shailender Kumar, “Image Segmentation Using Convolutional Neural Network”, International Journal Of
Scientific & Technology Research Volume 8, Issue 11, November 2019, ISSN 2277-8616.
[19] Shaoqiong Huang, Mengxing Huang,Yu Zhang, Jing Chen, Uzair Bhatti, “Medical image segmentation using deep
learning with feature enhancement”, IET Image Processing, doi: 10.1049/iet-ipr.2019.0772.
[20] ChuanboWang, D. M.Anisuzzaman, VictorWilliamson, Mrinal Kanti Dhar, Behrouz Rostami , Jefrey Niezgoda,
SandeepGopalakrishnan, ZeyunYu, “Fully automatic wound segmentation with deep convolutional neural networks”,
https://doi.org/10.1038/s41598-020-78799-w.
[21] https://datagen.tech/guides/image-annotation/image-segmentation/, https://www.v7labs.com/blog/image-segmentation-
guide.
[22] Linwei Fan, Fan Zhang , Hui Fan and Caiming Zhang, “Brief review of image denoising techniques”, Visual Computing
for Industry, Biomedicine, and Art (2019) 2:7 https://doi.org/10.1186/s42492-019-0016-7.
[23] Wang Xiancheng, Li Wei, Miao Bingyi, Jing He, Zhangwei Jiang, Wen Xu, Zhenyan Ji, Gu Hong, Shen Zhaomeng,
“Retina Blood Vessel Segmentation Using A U-Net Based Convolutional Neural Network”, International Conference
on Data Science (ICDS 2018), www.sciencedirect.com.
[24] Boubakar Khalifa Albargathe, S.M., Kamberli, E., Kandemirli, F. et al., “Blood vessel segmentation and extraction
using H-minima method based on image processing techniques”, Multimedia Tools and Applications 80, 2565–2582
(2021), https://doi.org/10.1007/s11042-020-09646-3.

Grenze International Journal of Engineering and Technology, June Issue

COVID-19 Tracker
N Malarvizhi1, Arun Kumar Dash2, J Aswini3 and V Manikanta4
1 Professor, Dept of CSE, Vel Tech Rangarajan Dr Sagunthala R&D Institute of Science and Technology, Chennai
2-4 UG Student, Dept of CSE, Vel Tech Rangarajan Dr Sagunthala R&D Institute of Science and Technology, Chennai
3 Professor, Dept of CSE, Saveetha Engineering College, Chennai
Email: [email protected], [email protected], [email protected], [email protected]

Abstract— Covid-19, commonly known as “the Coronavirus outbreak”, appears to be in the process
of being eliminated completely, but the future of any virus or bacterium can never be known with
certainty. The prevalence of the Coronavirus in our society can be successfully monitored only
through proper tracking and subsequent analysis of the tracked data. The analysis depends on the
experts, but they can analyze only with the tracking data provided to them. The application being
built here eases the work of the analysis experts. It keeps track of the day-to-day Corona cases,
their surges and downfalls, and shows the daily changes in the form of graphs and charts with a
better User Interface that helps not only the experts but also general people visualize the readings
clearly, by minimizing the visible details and making the view more abstract and suitable for
smaller-screen applications. People can check the ups and downs of the covid cases in a pictorial
format through bar charts, pie charts, line graphs and other graphical diagrams. The data for all
the countries is fetched via an API call, which is updated by the API service provider itself. The
fetched data is then processed and visualized in the form of bars and charts. Apart from the daily
cases, users can also view the daily deaths and recoveries, and can additionally check the total
number of Covid cases to date; the total numbers of deaths and recoveries can also be viewed with
ease. Users are given an option to sort the type of data they want to view, i.e., deaths, recoveries,
new cases and total cases, and based on their input the pictorial representation is shown to them.
Mobile phone users always look for an application for their comfort; thus, this application
visualizes data with less detail and in a more abstract way, targeting small-screen users.

Index Terms— User Interface, Covid-19, Visualization, Tracking.

I. INTRODUCTION
The title of this project "Covid-19 Tracker" in simple terms means a record of daily covid cases prevailing in the
country. These records in turn helps the analyst and experts to properly understand the Covid scenario and make
correct decisions for future. Furthermore, the tracking helps every individual using the app to remain up to date
and aware of the future consequences. In the past, the cases turned less predictable when there was a second
wave in the curve plotting the number of active cases and new cases. Though the curve of death cases did not
find a clean wave second time in the graph. These incidents made the cases go unpredictable causing not only
the analysts, but also the general people look into graphs everyday with less understanding of the high level
graphs in mobile devices and great numbers which doesn't explain much.
Though there were many types of visualizations, more detailing and lower level graphical representations lost

Grenze ID: 01.GIJET.9.2.339


© Grenze Scientific Society, 2023
views from mobile users. Mobile phone users always look for an application for their comfort; thus, this
application visualizes data with less detail and in a more abstract way, targeting small-screen users. The data for
all the countries is fetched via an API call which is updated by the API service provider itself. The data fetched
from the API is then processed and filtered, and the filtered data is used to visualize the covid cases, displayed to
the users through pictorial representations in the form of bars and charts. The main purpose of the project is to
provide a better user interface for users to view the covid changes happening daily and act accordingly. Both the
common people and the covid analysts can make use of the app and keep themselves updated and aware.

II. LITERATURE SURVEY


In [1], the author analysed how much time people spend on small displays compared to large displays over
time. Small displays are taken to be mobile phones; the data was gathered from websites that logged API calls
from devices with smaller resolutions and screen sizes, and was visualised over various time periods. The study
showed that people with smaller devices are more likely to check data related to a specific trend but often give
up due to complex data on a smaller screen.
Various visualization techniques are shown in [2], which can be applied online and offline using static data and
which help various professions such as policy makers, scientists, healthcare providers and the general public
understand aspects of the pandemic. These techniques came to the limelight due to the pandemic, and most
people learnt new terms about visualization during the pandemic. In [3], the authors use the retrieved data for
Susceptible Exposed Infectious Recovered (SEIR) predictive modelling. Sentiment analysis is done with news
data by segregating it into negative and positive sentiments, to understand the influence of the news on people's
behaviour both politically and economically.
Audio analysis methods are discussed by the authors in [4], using audio data sets and symptoms of people to
detect the presence of the COVID-19 virus by sampling data and applying an SVM. The data was crowd-sourced
using their application. This audio data helped people identify symptoms, as during the first phase of the
pandemic throat infection was a major symptom of the virus. In [5], the authors discussed community-level
surveillance using an unsupervised symptom-based online search model; correlation and regression of the
relationship between search-frequency time series and confirmed cases or deaths can uncover symptoms or
behaviours related to COVID-19.
The authors in [6] discussed the analysis of mobility data collected by GPS and Wi-Fi, analysing the data of
devices connected through public Wi-Fi via tourism boards and tracing their mobility during the pandemic. The
data helped in tracking people who were possibly in contact with a carrier, using the device footprints logged on
public Wi-Fi networks that were connected at the same time as the disease carrier was using the Wi-Fi, to easily
sort out the contact tracing. This made it possible to track down a few cases and avoid spreading.
In [7], the authors suggested various phases of COVID-19 spread control using automated means with the help
of IoT and the Internet, conducting processes like screening, testing, contact tracing, prediction and sanitation
with the help of robotics and IoT instead of manual methods, to increase accuracy and safety. This further helped
the safety of people by using robots instead of front-line workers, since safety guards and masks do not
guarantee safety in all circumstances.

III. PROPOSED SYSTEM


The architecture of the project is displayed in Figure 1. The system consists of a front end and a back end. The
front end of this system consists of the User Interface, which interacts with the user to get inputs by handling
events from touch and keyboard. The user interface consists of views and view controllers, which are arranged in
particular layouts and in an order to show one over the other with specific overlays. The user interface has to be
filled with data, which is obtained from the back end through data managers that manage the data from the
network and the database. The user interface also handles various screen sizes and resolutions, computing the
size and pixels of every frame for the specific device. Other than colour and layout, the front end needs data to be
filled in, which is dynamic and cannot be hard-coded. The data is handled by the back end. The back end consists
of a data manager which handles the data to be fetched from various APIs and decodes the data into a specific
format which the code can understand. It converts the data into classes and types defined earlier for the user
interface and passes the data in those types to the front end. The front end utilizes this data and updates the User
Interface. The data obtained from the APIs after decoding is also stored in databases for further access, as
network calls take a lot of time to get a response. This data is passed further to the database and to the UI, and
received from the network to the database and so on. Thus, the back end manages the data and sends it to the
front end through a single layer of interaction to avoid multiple points of contact between the front end and the
back end.

Figure 1. Architecture diagram

There are three modules in this application, namely the Domain module, the Data module and the Presentation
module. The presentation module handles the input data received from the user, converts it to function calls and
passes them to the domain module. This module also handles all the views and visualizations presented to the
user, such as buttons, colours, layouts, frames and images. The domain module collects the function calls from
the presentation module and passes them to the data module. The data module collects the data requests from the
domain module and passes back to the domain module the data fetched either from the database or through
network calls. The final data is passed to the presentation module and displayed to the user.
A. Domain Module
The domain module consists of all the types responsible for the application's back end. It contains the entities,
not limited to the types present in the database model; these entities are used throughout the application in place
of structured sets of data. The domain module also holds the implementations of the use cases required for the
functionality of the application. It uses the Data module to fetch data for a specific use case and may or may not
convert it to the specified entities before passing it back to the UI. The UI relies on these use cases to obtain data
without directly interacting with the database or API.
B. Data Module
The data module consists of all network calls, such as API requests and responses, socket management and
database management. The data is obtained from the network and entered into the database tables; these tables
receive data from sockets as well. The formatting of the data happens in this module, and the formatted data is
passed to the Domain module as a response. A sketch of this pattern follows.
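To make the pattern concrete, the sketch below shows a fetch-decode-cache data manager. It is written in Python
for brevity (the app itself is built in Swift), and the endpoint URL, JSON field names and table schema are all
assumptions rather than the authors' actual API or storage layout:

    import json
    import sqlite3
    import urllib.request

    API_URL = "https://example.com/covid/summary"   # hypothetical endpoint

    def fetch_and_cache(db_path="covid.db"):
        # Network call: fetch the raw JSON payload from the API provider.
        with urllib.request.urlopen(API_URL) as resp:
            payload = json.load(resp)

        # Decode into the entity shape the domain module expects, then cache
        # locally so the UI can render without waiting on the network.
        con = sqlite3.connect(db_path)
        con.execute("CREATE TABLE IF NOT EXISTS cases "
                    "(country TEXT PRIMARY KEY, total INT, deaths INT, recovered INT)")
        for row in payload:
            con.execute("INSERT OR REPLACE INTO cases VALUES (?, ?, ?, ?)",
                        (row["country"], row["cases"], row["deaths"], row["recovered"]))
        con.commit()
        con.close()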

C. Presentation Module
The presentation module performs all input and output operations with the user. This module has all the views
which display buttons, text, images, etc. These views are populated with data obtained from the Domain module.
There can be inner views, which can also use the Domain module for obtaining data. The data used here should
be easy to understand, as it interacts directly with the user.
D. Input Design
The input is collected through user interface items such as buttons, selectors, and touch events through the
mobile's screen. The inputs are converted to use cases in the presentation module of the app and sent to the
domain module to collect data as a response to the input events. The inputs are logged using the default logger
present in the development environment and are collected frequently while the application is used. These events
are collected so that, in case the application crashes, the recent events are sent to the developers in the crash
report to analyse the cause of the crash.
E. Output Design

Figure 2. Screenshot of output

The output is displayed as an iOS application on iPhones. The output includes graphical representations of the
covid-19 data in bar graphs, pie charts and half pie charts. The data is also represented in maps showing the
counts inside circular buttons, which on click display the country's case details in a pull-over view that can be
dragged across the screen to view the details. Further, there are buttons to switch the type of data in the map
visualization. A sample screenshot of the output is displayed in Figure 2.

IV. RESULTS AND DISCUSSIONS


The proposed system has many UI visualizations which are simpler and easier to understand for the general
audience, unlike the existing system which provided complex data that was helpful only to experts and analysts.
The application further justifies its need by displaying live data fetched from crowd-sourced sources, whereas on
the web we need to refresh to see new data. Figures 3 and 4 depict the global and daily Covid cases respectively.
Figure 5 shows the performance advantage of the Swift programming language, which is used for iOS
development, over React Native, a leading web development framework. Swift leads in performance for
displaying images, graphs, maps, etc. compared with React Native. Although maps consume more memory than
with React Native, iPhones are capable of handling a high memory load. Swift is designed to overcome the
defects of other development languages by using an LLVM (Low Level Virtual Machine) compiler and the
concept of Automatic Reference Counting, which saves considerable memory in long-running applications
compared with languages like Java and JavaScript, which use a garbage collector that is comparatively slower.

The proposed system presents less detailed data in graphs and charts, allowing users to easily understand the
covid cases by looking at these visualizations on a mobile phone. The system also has performance advantages,
as it processes less data, which reduces the time and space complexity compared with the complex data of
existing systems. This efficiency makes the application better suited to users' needs than the existing system.

Figure 3. Global Covid Cases

Figure 4. Daily Covid Cases

V. CONCLUSION AND FUTURE ENHANCEMENTS
The proposed work is mainly used to track Covid-19 surges. Finding, testing, isolating and treating Covid
patients is an ongoing process that is undoubtedly helping a lot of people, but it is also vital to trace and track the
day-to-day Covid-19 cases to keep a record and derive analysis from the records. The analysis will in turn help
the experts and other individuals create awareness and increase mental satisfaction among people. This model
can be adopted for further visualization use cases such as stocks, HR management, analytics, business
management, project management and health services. All industries which use graphs and other visualizations
in mobile applications can build iOS applications using simple visualization and make the application
comfortable for the user when looking at the data.

REFERENCES
[1] Brehmer, M., Lee, B., Isenberg, P., & Choe, E. K. (2018). Visualizing ranges over time on mobile phones: A task-based
crowdsourced evaluation. IEEE Transactions on Visualization and Computer Graphics, 25(1), 619–629.
[2] Comba, J. L. (2020). Data visualization for the understanding of covid-19. Computing in Science & Engineering, 22(6),
81–86.
[3] Hamzah, F. B., Lau, C., Nazri, H., Ligot, D. V., Lee, G., Tan, C. L., Shaib, M., Zaidon, U. H. B., Abdullah, A. B.,
Chung, M. H. et al. (2020). Coronatracker: Worldwide covid-19 outbreak data analysis and prediction. Bull World
Health Organ, 1(32), 1–32.
[4] Han, J., Brown, C., Chauhan, J., Grammenos, A., Hasthanasombat, A., Spathis, D., Xia, T., Cicuta, P., & Mascolo, C.
(2021). Exploring automatic covid-19 diagnosis via voice and symptoms from crowdsourced data. ICASSP 2021-2021
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 8328–8332.
[5] Lampos, V., Majumder, M. S., Yom-Tov, E., Edelstein, M., Moura, S., Hamada, Y., Rangaka, M. X., McKendry, R. A.,
& Cox, I. J. (2021). Tracking covid-19 using online search. NPJ digital medicine, 4(1), 1–11.
[6] Ribeiro, M., Nisi, V., Prandi, C., & Nunes, N. (2020). A data visualization interactive exploration of human mobility
data during the covid-19 outbreak: A case study. 2020 IEEE Symposium on Computers and Communications (ISCC),
1–6.
[7] Singh, P. K., Nandi, S., Ghafoor, K. Z., Ghosh, U., & Rawat, D. B. (2020). Preventing covid-19 spread using
information and communication technology. IEEE Consumer Electronics Magazine, 10(4), 18–27.

Grenze International Journal of Engineering and Technology, June Issue

Smart Vision Goggles for Blind People


Manne Sowmya1, Dr. N. Swathi2, A. Raja Sri3, M. Sai Prakash4, K. Jayadeep5
1,3,4,5 IV B.Tech., Department of EIE, VR Siddhartha Engineering College, Vijayawada, India
2 Asst. Professor, Department of EIE, VR Siddhartha Engineering College, Vijayawada, India
[email protected], [email protected], [email protected], [email protected],
[email protected]

Abstract— People who are visually impaired face various problems in their daily life. Their
daily activities are greatly restricted by loss of eyesight. They usually travel using blind
navigation systems or by the memories accumulated in their long-term exploration. This
paper presents a new design of assistive smart vision goggles for visually impaired persons. The
main objective of the proposed model is to make the user comfortable carrying the device, which
is designed in a wearable format. The device includes a pair of glasses with an obstacle-detection
module fitted in the center of the goggles, an output device (voice through a headset), a camera
to find obstacles, and text recognition, i.e., it helps the blind read out text by scanning it
through the camera, where the input is given through a switch. For image capturing we use a
Raspberry Pi, and for obstacle detection we use ultrasonic sensors, which can scan at most 5-6
meters within a 30-degree range. Cloud technology is used to identify the object scanned by the
camera. These Smart Vision Goggles for Blind People are portable devices: lightweight, easy to
use, and user-friendly. These glasses can easily guide blind people and help them avoid
obstacles.

Index Terms— Smart goggles, Blind People, Raspberry pi, Ultrasonic Sensors, Text recognition.

I. INTRODUCTION
People who are visually impaired have a decreased ability of visual perception. Blind mobility is one of the major
challenges encountered by visually impaired persons in their daily lives. Their life activities are greatly restricted
by the loss of eyesight. Visually impaired people include those who have completely lost their ability to see and
those who have a partial loss of vision [1]. According to the World Health Organization (WHO), there are around
285 million visually impaired people in the world; among them, 39 million people are totally blind and 246
million people have low vision. Because they require no or only minimal hand use, wearable gadgets are thought
to be the most helpful of all assistive technologies. The head-mounted type is the most common. Their primary
benefit is that, unlike other devices, they naturally point in the direction of the viewer, negating the need for
extra direction cues. This paper highlights a brand-new style of smart glasses that can help with a variety of jobs
while having a modest construction cost. To communicate information to the user, the design makes use of the
latest Raspberry Pi 4 Model B, a camera, and earphones.

II. LITERATURE SURVEY
A. Previous Works and their Limitation
There are already many existing devices that help a blind person in walking. The most common one is a simple
walking stick, with which the blind person may identify obstacles. A few authors came up with a smart cane for
the blind which detects the obstacles in front of them [2]. Still, it has many disadvantages: it cannot detect
obstructions that are hidden but very dangerous for the blind, such as stairs, holes, etc. Shoes that point you in
the right way, which aid the blind, are another example of a smart device in use. With these, the audible
directions on a smartphone no longer need attention while walking somewhere new. Bluetooth allows the shoe
insoles to communicate with the associated app, which is available on iOS, Android, and Windows. A vibration
then notifies you as you walk when you need to turn: as the turn draws closer, the buzz in your left shoe tells you
to turn left, the vibration in your right shoe tells you to turn right, and both feet vibrate at once if you need to
turn around. The shoes cannot yet assist in avoiding obstacles along the path. The Eye Stick, a walking stick with
eyes, is another gadget. A lens is fixed to the bottom of the Eye Stick, and distinctive features, such as traffic
lights, stairs, the subway, and so forth, can be recognized; blind persons then receive each communication via
vibration. These gadgets still have limitations in that they cannot detect obstacles such as poles or other objects
that are directly in front of the user [3].
B. Proposed Model to Overcome the Limitations
In this regard, to overcome these limitations, Smart Vision Goggles for the Blind using a Raspberry Pi are
proposed in this paper. The device is designed in the shape of eyeglasses to provide guidance efficiently,
comfortably and safely. It uses ultrasonic sensors and a webcam placed on the goggles to detect an object and
alert the user. The advantage is that the goggles are easy to carry and provide many features: they identify the
obstacles in front of the person and alert the person with a message through headphones, and they detect the
image in front of the user by scanning through the webcam and give the information to the user through
headphones. The proposed model also provides text reading by scanning the image through the webcam and
converting the text using the Google text-to-speech converter. If the user feels unsafe, a message can be sent to
the user's caretaker's mobile by pressing an input switch.

III. IMPLEMENTATION

Figure 1. Functional diagram of the proposed prototype

The implementation of Smart Vision Goggles for the Blind is achieved with the help of components such as a
Raspberry Pi, ultrasonic sensors, a webcam, and earphones. All these components are placed in a compact and
secure manner which helps the person to carry them easily. The person will be comfortable with the model; he
can use it whenever he needs it and remove it when it is not necessary. These goggles will help blind people
reach their destination independently. The reason these glasses are more reliable and easier to implement is that
they are developed using ultrasonic sensors and a Raspberry Pi, which are commonly available almost
everywhere. The ultrasonic sensors have a specified range (2 cm-4 m) for sensing objects. These sensors detect
the obstacles in front of a person, such as objects, stairs and buildings, and give a voice-over to the person
through headphones. We designed the goggles in such a way that they can identify the object in front of the
person by taking pictures via the camera. The captured image information is converted into voice and provided
to the user through headphones, helping blind people know who or what is in front of them. These goggles also
help the blind read text by scanning a book through the camera. Another feature we have added to this project is
that, if the blind person is feeling insecure, he can press an input switch upon which a message will be sent to the
caretaker of that blind person immediately.
The Raspberry Pi is a low-cost, small-sized computer that is used for running the programs. Ultrasonic sensors
are used to measure distance using ultrasonic waves: the sensor head emits an ultrasonic wave and receives the
wave reflected from the target, and the distance is measured from the time between the emission and the
reception. They have a limited obstacle-detection range, i.e., 2 cm-4 m. The ultrasonic sensors are placed on top
of the model, some at a certain angle, to identify obstacles, ditches, holes and stairs, and alert the user with a
message through the earphones. The functional diagram of the proposed model is shown in Figure 1. The
features such as voicing out obstacles, detecting the image, reading the scanned image, and sending the current
location are programmed on the Raspberry Pi. A specific button is provided to execute all these features, and
pressing the button for different periods enables the corresponding feature. The output of all the features is in
speech format using the gTTS (Google Text-to-Speech) library [4]. A sketch of the distance measurement
follows.
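The distance computation follows directly from the emission-reception timing described above. The sketch below
is illustrative (the BCM pin numbers are assumptions, not from the paper) and reads an HC-SR04 with RPi.GPIO:

    import time
    import RPi.GPIO as GPIO

    TRIG, ECHO = 23, 24                 # assumed BCM pins, not from the paper
    GPIO.setmode(GPIO.BCM)
    GPIO.setup(TRIG, GPIO.OUT)
    GPIO.setup(ECHO, GPIO.IN)

    def read_distance_cm():
        # A 10 microsecond pulse on TRIG starts one measurement cycle.
        GPIO.output(TRIG, True)
        time.sleep(0.00001)
        GPIO.output(TRIG, False)

        # ECHO stays high for the round-trip time of the ultrasonic burst.
        start = end = time.time()
        while GPIO.input(ECHO) == 0:
            start = time.time()
        while GPIO.input(ECHO) == 1:
            end = time.time()

        # distance = elapsed time x speed of sound (34300 cm/s) / 2 (round trip)
        return (end - start) * 34300 / 2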

IV. PROPOSED SYSTEM DESIGN FLOW

Figure 2. Flow diagram of the proposed system

The stepwise procedure to use this prototype model is as follows (a dispatch sketch follows this list).
Step 1: First, connect all the modules and other components to the Raspberry Pi and the microcontroller.
Step 2: Get the option from the user through the input switch.
Step 3: The ultrasonic sensors automatically sense the obstacles in front of the user and alert him by specifying
the distance range, as shown in Figure 5.
Step 4: If the input switch is pressed for less than 5 sec, the image is captured through the camera and the
obstacle is identified using the cloud technology Imagga, as shown in Figure 3.
Step 5: If the input switch is pressed for 5-10 sec, the image is captured and the text present in that image is
extracted using the pyTesseract module and converted to speech using the gTTS library, as shown in Figure 4.
Step 6: If the input switch is pressed for more than 10 sec, a notification is sent to the user's caretaker.
Step 7: All the actions are converted into a resultant text document and given to the gTTS module, which
converts the text into speech that can be heard by the user through earphones.
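The hold-duration dispatch of Steps 4-6 can be written compactly. In the sketch below, the pin number and the
three handler functions (identify_object, read_text_aloud, notify_caretaker) are illustrative placeholders for the
routines described in the steps above, not the authors' code:

    import time
    import RPi.GPIO as GPIO

    BUTTON = 17                                   # assumed BCM pin for the switch
    GPIO.setmode(GPIO.BCM)
    GPIO.setup(BUTTON, GPIO.IN, pull_up_down=GPIO.PUD_UP)

    def press_duration():
        """Return how long (in seconds) the user held the switch."""
        GPIO.wait_for_edge(BUTTON, GPIO.FALLING)  # press (active low)
        pressed_at = time.time()
        GPIO.wait_for_edge(BUTTON, GPIO.RISING)   # release
        return time.time() - pressed_at

    held = press_duration()
    if held < 5:
        identify_object()      # placeholder for Step 4: capture image, query Imagga
    elif held <= 10:
        read_text_aloud()      # placeholder for Step 5: pyTesseract OCR + gTTS
    else:
        notify_caretaker()     # placeholder for Step 6: alert, e.g. via telepot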

Figure 3. Capturing the image and identifying the object. Figure 4. Capturing the image and reading the text

Figure 5. Reading the obstacle distance and warning the user

V. HARDWARE DESCRIPTION


A. Raspberry pi processor:

Figure 6. Raspberry pi 4 Model B

The Raspberry Pi used in our project is the “Raspberry Pi 4 Model B” (shown in Figure 6), with a high-speed
64-bit quad-core processor, 4 GB RAM and dual-band 2.4/5.0 GHz wireless LAN [9]. It is the heart of this
project and is responsible for performing all control actions [8].

B. External Webcam:

Figure 7. Webcam

An external webcam is attached to the goggles; it is used for capturing the images for the optical character
recognition (OCR) feature and the dominant feature [10]. The external webcam used in this prototype is
shown in Figure 7.
C. Ultrasonic Sensors:

Figure 8. HC-SR04 Ultrasonic sensors

Ultrasonic sensors are used to detect obstacles, ditches, and stairs in front of the user. These ultrasonic sensors
work within a particular range of distances [11]. The ultrasonic sensors used in this project are shown in
Figure 8, and their range is 2 cm-4 m.
D. Input Switch:

Figure 9. Input Switch

The input switch provided acts as a hand-held remote through which the user can select a particular feature to
execute and hear the result. The switch is operated based on the period for which the user holds it: if the time is
less than 5 sec, the obstacle is identified by scanning through the webcam; if the time is 5-10 sec, text
recognition is enabled; and if it is more than 10 sec, a message is sent to the respective person who takes care of
the user [12]. The switch used in this project is shown in Figure 9; its power rating is max 50 mA at 24 V DC,
and its operating temperature range lies between -20 and 70 °C.
E. Microcontroller:
The microcontroller used in this project is the ATmega328P, an 8-bit AVR RISC-based microcontroller chip. It
has 32 KB of ISP flash memory with read-while-write capabilities, 1 KB of EEPROM, 23 general-purpose I/O
pins, 2 KB of SRAM (static RAM), 32 general-purpose working registers, etc. [13]. The microcontroller reads
the information from the ultrasonic sensors and passes it to the Raspberry Pi for processing.

VI. SOFTWARE USED
IDE: The IDLE software needs to be installed on the computer for optimal functioning of the
application [14]. The specifications of the software used are:
• Operating System – Windows
• Programming Language – Python3 [15]
• IDE – IDLE
Libraries – requests, time, cv2, speake3, pytesseract, pyttsx3, json, telepot, RPi.GPIO, Imagga.

VII. PYTHON MODULES


A. Voice out results
gTTS (Google Text-to-Speech library): It is used to convert text to speech format. It is vital in
providing voice output for users [5].
B. pyTesseract
Python-tesseract is an OCR tool for python. That is, it will recognize and “read” the text embedded in images
[6].
C. Geocoder
This module is used to find the current location of the user. Geocoding uses spatially explicit reference datasets
(e.g., digital road networks) to identify the location to best match the input address, essential for comparing and
interpolating the address to the range of addresses for each segment of the referenced dataset [7].
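A combined sketch of the first two modules (illustrative only; the camera index and file name are assumptions):
it grabs one webcam frame, OCRs it with pytesseract, and speaks the result with gTTS.

    import cv2
    import pytesseract
    from gtts import gTTS

    cam = cv2.VideoCapture(0)        # assumed camera index
    ok, frame = cam.read()
    cam.release()

    if ok:
        # OCR the captured frame, then synthesize speech from the text.
        text = pytesseract.image_to_string(frame)
        if text.strip():
            gTTS(text=text, lang="en").save("speech.mp3")
            # Play speech.mp3 with any audio player available on the Pi.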

VIII. HARDWARE AND SOFTWARE INTERFACING

Figure 10. Hardware and Software Interfacing Figure 11. Hardware connections

The hardware and software interfacing is shown in Figure 10 and hardware connections are depicted in Figure
11. The USB cable is connected to Raspberry pi for the power supply and earphones are plugged in. After
interfacing, the program is compiled and run, then the respective features will get executed based on input from
the switch and the output is heard through earphones.

IX. RESULTS
The output of the smart vision goggles is observed as shown in Figures 12 to 15. When the user is walking with
the goggles on, if there is any obstacle he hears the message “obstacle in front of you” (if there is any object or
stairs) or “digs in front of you” (if there are downward stairs or a hole), as shown in Figure 12. If the user wants
to know what the obstacle is, he can press the input switch. The features offered are:
Case 1: If he presses it for less than 5 sec, it captures the image and gives the information about the objects
captured in the image to the user through earphones. That information is shown in Figure 14.
Case 2: If he presses it for 5-10 sec, it captures the image, reads the text, and gives that information to the user
through earphones, as shown in Figure 15.
Case 3: Another feature included in the project is that, if the user feels insecure, he can long-press (>10 sec) the
input switch so that a message is sent to his caretaker, as shown in Figure 13.

Figure 12. Obstacle message to the user through earphones Figure 13. A message sent to the caretaker’s mobile

Figure 14. Captured image and Obstacle Information heard by the user

Figure 15. Text captured and text information heard by the user

X. CONCLUSION
Smart Vision Goggles is a project in which we have used a Raspberry Pi to provide assistive care for blind
people. The product helps blind people read newspapers and books, identifies the obstacles in front of the user,
captures images, and can send an alert message to the person who takes care of the user. All these features
provide voice output to the user through headphones and are accessed by the user through the provided input
switch. The proposed model provides efficient results and is easy to use. With this model the user is able to walk
through roads, labs and hallways without any assistance.

XI. FUTURE SCOPE
All the features developed in this product are processed by the Raspberry Pi processor. For further enhancement
of the project, we can add directions to guide the person, help the user know the current date and time, the exact
location where he is, and weather information, and fetch the top 10 headlines of present-day news. All these
features can be processed using the Raspberry Pi. This may help the person know about the things happening in
society without any assistance.

REFERENCES
[1] M. R. Miah and M. S. Hussain, "A Unique Smart Eye Glass for Visually Impaired People," International Conference on
Advancement in Electrical and Electronic Engineering (ICAEEE), pp. 1-4, doi: 10.1109/ICAEEE.2018.8643011, 2018.
[2] Global data of visually impaired from the WHO website: www.who.int/blindness/publications/globaldata/en/.
[3] L. -B. Chen, J. -P. Su, M. -C. Chen, W. -J. Chang, C. -H. Yang and C. -Y. Sie, "An Implementation of an Intelligent
Assistance System for Visually Impaired/Blind People," IEEE International Conference on Consumer Electronics
(ICCE), pp. 1-2, doi: 10.1109/ICCE.2019.8661943, 2019.
[4] K. R. Rani, "An audio aided smart vision system for visually impaired," International Conference on Nextgen Electronic
Technologies: Silicon to Software (ICNETS2), 2017, pp. 22-25, doi: 10.1109/ICNETS2.2017.8067889.
[5] L. Abraham, N. S. Mathew, L. George and S. S. Sajan, "VISION- Wearable Speech Based Feedback System for the
Visually Impaired using Computer Vision," 2020 4th International Conference on Trends in Electronics and Informatics
(ICOEI)(48184), pp. 972-976, doi: 10.1109/ICOEI48184.2020.9142984.
[6] P. P. Singh, S. S. Hegde, R. Varun, V. Hegde, and K. A. S. Devi, “Audio Narration of a Scene for Visually Disabled
using Smart Goggle”, IJRESM, vol. 5, no. 4, pp. 73–75, Apr. 2022.
[7] Du Buf, J.M. Hans; Barroso, João; Rodrigues, João M.F.; Paredes, Hugo; Farrajota, Miguel; Fernandes, Hugo; João
José; Teixeira, Victor; Saleiro, Mário. The SmartVision Navigation Prototype for Blind Users, International Journal of
Digital Content Technology and its Applications, 5, 5, 351-361, 2011.
[8] M. T. Islam, M. Ahmad and A. S. Bappy, "Real-Time Family Member Recognition Using Raspberry Pi for Visually
Impaired People," IEEE Region 10 Symposium (TENSYMP), 2020, pp. 78-81, doi: 10.1109/TENSYMP
50017.2020.9230937.
[9] A. Pardasani, P. N. Indi, S. Banerjee, A. Kamal and V. Garg, "Smart Assistive Navigation Devices for Visually
Impaired People," 2019 IEEE 4th International Conference on Computer and Communication Systems (ICCCS), 2019,
pp. 725-729, doi: 10.1109/CCOMS.2019.8821654.
[10] K. Kumar, R. Patra, M. Manjunatha, J. Mukhopadhyay and A. K. Majumdar, "An electronic travel aid for navigation of
visually impaired persons," Third International Conference on Communication Systems and Networks (COMSNETS
2011), 2011, pp. 1-5, doi: 10.1109/COMSNETS.2011.5716517.
[11] R. Agarwal et al., "Low-cost ultrasonic smart glasses for blind," 8th IEEE Annual Information Technology, Electronics
and Mobile Communication Conference (IEMCON), 2017, pp. 210-213, doi: 10.1109/IEMCON. 2017.8117194.
[12] J. A. Brabyn, "New Developments in Mobility and Orientation Aids for the Blind," in IEEE Transactions on Biomedical
Engineering, vol. BME-29, no. 4, pp. 285-289, April 1982, doi: 10.1109/TBME.1982.324945.
[13] M. P. Agrawal and A. R. Gupta, "Smart Stick for the Blind and Visually Impaired People," Second International
Conference on Inventive Communication and Computational Technologies (ICICCT), 2018, pp. 542-545, doi:
10.1109/ICICCT.2018.8473344.
[14] K. S. P, A. A, G. K and H. S, "Raspberry Pi based Smart Assistance for Visually Impaired People," 2022 3rd
International Conference on Electronics and Sustainable Communication Systems (ICESC), 2022, pp. 1199-1204, doi:
10.1109/ICESC54411.2022.9885412.
[15] M. A. V., L. M. Gladence, R. Velaga and B. Valluri, "Smart Assistive System for Blind People using Raspberry PI,"
2020 International Conference on System, Computation, Automation and Networking (ICSCAN), 2020, pp. 1-8, doi:
10.1109/ICSCAN49426.2020.9262433.

Grenze International Journal of Engineering and Technology, June Issue

Data Science based Recommendation System - An Application of Computer Science
Zeba Khan1 and Dr. Abdul Rahman2
1 Applied College, Jazan University, Department of Computer and Information System, Jazan, Kingdom of Saudi Arabia
2 Presidency University, School of Computer Science and Engineering & Information Science, Rajanukunte, Yelahanka, Bangalore
Email: [email protected], [email protected]

Abstract— The Recommender System (RS) has emerged as the most popular application on e-
commerce websites. In the e-commerce business, collaborative filtering based RS systems
suggest products to customers and find interesting items that users may wish to purchase.
The success rate of any recommendation system depends upon information reliability as well
as the expression of daily-life behavior in a consolidated format. The main task is to produce
the best-ranked list of ‘n’ items for the user's need. Due to their natural structure, Z-numbers
are well suited to producing a recommendation list. To solve real-life problems, Z-numbers
should be incorporated into decision-making models. However, online shopping involves
multicriteria group decision-making (MCGDM), and Z-information presents some difficulties
for MCGDM. Therefore, to enhance the ability of Z-numbers, Complex Fuzzy Sets (CFSs) are
employed. Besides this, entropy, a distance measure, and an aggregation operator are
combined to produce an MCGDM-based ranking of customer preferences.

Index Terms— Fuzzy Set, Linguistic Variables, Z-Numbers, Recommendation System.

I. INTRODUCTION
Data Science is one of the fastest growing fields under the umbrella of Computer Science. Moreover,
Recommendation System is a co-domain of Data Science that is used by various e-commerce websites like
Amazon, Flipkart, Netflix, and so on. It helps in presenting relevant products in front of user to increase the sale
and customer base. However, it is not so easy to plot precise mapping of products and priority or choice of
customer. The main reasons include the limited knowledge of decision-makers, lack of evaluation time, and
insufficient understanding of alternatives. The success rate of any recommendation system depends upon,
information reliability as well as expression of daily life behavior in the consolidated format. Z-numbers have
been proven to be advantageous in revealing uncertain information. In 2011 Zadeh [1] proposed the concept of
Z-numbers. In this article improved Z-numbers are applied to harness the Recommendation System.
Considering that various uncertain data are inevitable in the process of addressing actual problems, uncertain Z-
numbers are to express information and measure information reliability.
Generally accepted, linguistic description is more consistent with the daily expression [1]. However, there are
still some difficulties limiting the applications. Like the calculation process in Z-Numbers is a typical nonlinear
programming problem, which will inevitably lead to the problem of complicated calculation. Therefore, to
enhancing the ability of Z-numbers Complex Fuzzy Sets (CFSs), are employed. Besides this entropy, distance

Grenze ID: 01.GIJET.9.2.341


© Grenze Scientific Society, 2023
measure, and an aggregation operator are combined to produce an MCGDM-based ranking of customer
preferences. Collaborative filtering systems help in automatically locating relevant opinions and aggregating
them to provide recommendations. The main task is to present the best-ranked list of ‘n’ items for the user's
need. Clustering the preferred item set and recommending the prioritized list are the two major tasks of a
collaborative filtering based recommendation system. A preference expressed by a user for an item can be
represented by a triplet (User, Item, Rating). Due to the difficult-to-quantify nature of evaluation, experts intend
to describe the evaluation information with natural language like slightly good, good, very good, extremely good.
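As an illustration of the triplet representation and the ranked-list task, the toy sketch below implements a naive
collaborative filter in Python. The data, the overlap-based scoring and the function names are illustrative
assumptions; the paper's actual method relies on Z-numbers and MCGDM rather than this simple co-rating
heuristic.

    from collections import defaultdict

    # Toy (User, Item, Rating) triplets as described in the text.
    ratings = [("u1", "phone", 5), ("u1", "case", 4),
               ("u2", "phone", 5), ("u2", "charger", 5),
               ("u3", "case", 2)]

    def recommend(target, triplets, n=3):
        by_user = defaultdict(dict)
        for u, i, r in triplets:
            by_user[u][i] = r

        # Score unseen items using ratings from users who co-rated an item.
        scores = defaultdict(float)
        seen = by_user[target]
        for u, items in by_user.items():
            if u == target or not set(items) & set(seen):
                continue
            for item, r in items.items():
                if item not in seen:
                    scores[item] += r

        # Best-ranked list of n items, as the text describes.
        return sorted(scores, key=scores.get, reverse=True)[:n]

    print(recommend("u1", ratings))   # e.g. ['charger']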
Z-numbers have been demonstrated to be more suitable for expressing mundane tasks [2], [3]. Besides this,
interval-valued fuzzy sets and intuitionistic linguistic numbers are applied in [4], [5], [6] to express information
with more reliability. Therefore, Z-numbers attract the attention of researchers in the domains of multifold group
decisions [7], [8], control theory [9], [10], and clustering [11].
However, the calculation process is a typical nonlinear programming problem, which will inevitably lead to the
problem of complicated calculation Aliev et al. [13], [14] defined the operations of discrete and continuous Z-
numbers. Liu et al. [15] suggested the negation of discrete Z-numbers. To further enrich the related research, Z-
numbers have also been tried to combine with other mature theories. In [16] and [17], Z-triplet is utilized to
express the normal cloud model and the trapezium cloud model with linguistic terms.
Z-numbers employed to update the Bayesian network , VIKOR model, and TOPSIS model in [12],[18] and [19]
respectively. In previous study a gap is found to express decision making problem which are based on two or
more criteria. Hence there is requirement of harnessing the improved version of existing Z-Numbers.
The rest of this article is organized as follows. In Section II, some fundamental concepts are introduced along with the transformation method between Z-numbers and CFSs. In Section III, the IVZCFS method is validated using the proposed algorithm, Section IV reports the experiments, and finally Section V concludes this paper.

II. FUNDAMENTALS
A. Z-numbers
Z-numbers are improved to uncertain Z-numbers by utilizing an interval concept instead of a specific concept to characterize the fuzzy constraint and reliability. Typically, a Z-number is composed of A and B, recorded as Z = (A, B). The parameters A and B represent the fuzzy constraint and reliability, respectively. The two components are connected by the underlying possibility distribution (UPD). For example, "User e1 is very certain that product a1 is good based on criterion c1" can be transformed into Z = ("good," "very certain").
Usually, the determination of the UPD is a problem. In 2002, the concept of complex fuzzy sets (CFSs) was defined by Ramot et al. [20]. It is an extension of type-1 fuzzy sets in which the codomain of the membership function is the unit disc. A brief description of CFSs is given in the next subsection.
B. Complex Fuzzy Sets
Assume that S is a CFS in a universe of discourse U. Then $S = \{(x, \mu_S(x)) \mid x \in U\}$, where the membership function is
$\mu_S(x) = r_S(x)\, e^{j\omega_S(x)}$,
with $r_S(x) \in [0, 1]$, $\omega_S(x) \in [0, 2\pi]$, and $j = \sqrt{-1}$.
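For intuition, the following minimal Python sketch (an illustration, not code from the paper) evaluates a complex fuzzy membership value from a given amplitude term and phase term; the sample values 0.7 and pi/3 are arbitrary:

import cmath

def cfs_membership(r, omega):
    # Complex fuzzy membership value r * e^(j*omega); the codomain is the
    # unit disc, so the amplitude term r must lie in [0, 1].
    assert 0.0 <= r <= 1.0 and 0.0 <= omega <= 2 * cmath.pi
    return r * cmath.exp(1j * omega)

mu = cfs_membership(0.7, cmath.pi / 3)  # arbitrary amplitude and phase
print(abs(mu))                          # modulus equals the amplitude term, 0.7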
C. Uncertain Z-Numbers
In [17], Z-numbers were combined with Atanassov's interval-valued intuitionistic linguistic fuzzy sets to formalize "uncertain Z-numbers", e.g.,
IVZ = ((very good, [0.6, 0.8]), (uncertain, [0.7, 0.8])).
The expression can be represented by a pair of bounded intervals IVZ = ([Sa−, Sa+], [Hb−, Hb+]), where Sa− and Sa+ represent the lower and upper bounds of the linguistic constraint, respectively, and Hb− and Hb+ indicate the lower and upper bounds of the reliability, respectively.
The aim of IVZ is to better represent the fuzzy constraint and reliability. An IVCFS $J = [r^-(x), r^+(x)] \cdot e^{j[\omega^-(x), \omega^+(x)]}$ can also achieve this aim: $r^-(x)$ and $r^+(x)$ represent the lower and upper bounds of the amplitude term, respectively, and $\omega^-(x)$ and $\omega^+(x)$ indicate the lower and upper bounds of the phase term, respectively.
The theme of IVZ is that an interval concept is more appropriate than a specific concept to characterize the fuzzy constraint and reliability, e.g., {anticipated budget deficit, between 1.8 and 2.2 million dollars, (likely, very likely)}.

D. Transformation from Uncertain Z-Numbers to IVCFSs
Z-numbers are expressed as Z = (A, B) [1]; to make the information more expressive and reliable, A and B were projected onto intervals rather than specific values in uncertain Z-numbers. To express an IVZ as an IVCFS, the membership variables of [Sa−, Sa+] and the probabilistic variables of [Hb−, Hb+] are represented by $[r^-(x), r^+(x)]$ and $[\omega^-(x), \omega^+(x)]$, respectively, by using Definition 1.

Definition 1: Given two Linguistic Term Sets S and H, where S = {S0, S1, ..., SE | E ∈ N} and H = {H0, H1, ..., HF | F ∈ N}, an uncertain Z-number ([Sa−, Sa+], [Hb−, Hb+]) is transformed into
$J = [r^-(x), r^+(x)] \cdot e^{j[\omega^-(x), \omega^+(x)]}$   (1)
with $r^-(x), r^+(x) \in [0, 1]$ and $\omega^-(x), \omega^+(x) \in [0, 2\pi]$.
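As one concrete reading of Definition 1, the Python sketch below maps the linguistic indices of an uncertain Z-number onto the IVCFS intervals. The linear scalings a/E and 2*pi*b/F are an assumption made for illustration; the definition above only fixes the target ranges [0, 1] and [0, 2*pi]:

import math

def ivz_to_ivcfs(a_lo, a_hi, E, b_lo, b_hi, F):
    # [Sa-, Sa+] indices scaled into the amplitude interval (assumed linear),
    # [Hb-, Hb+] indices scaled into the phase interval (assumed linear).
    r = (a_lo / E, a_hi / E)                                  # in [0, 1]
    omega = (2 * math.pi * b_lo / F, 2 * math.pi * b_hi / F)  # in [0, 2*pi]
    return r, omega

# e.g. constraint between S5 and S6 of S0..S7, reliability between H6 and H7
print(ivz_to_ivcfs(5, 6, 7, 6, 7, 7))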
E. Generalized Entropy
Entropy, an essential concept in thermodynamics and information theory, is usually adopted to measure the chaos of a system. Scholars have proposed many entropy calculation methods for classical fuzzy sets [21]. As far as we know, however, there is no approach to calculate the entropy of Z-numbers from the perspective of CFSs. The generalized entropy $E_g(J)$ is defined as in equation (2):
$E_g(J) = 1 - \frac{1}{2}\left[\, \lvert r^-(x) + r^+(x) - 1 \rvert + \frac{\lvert \omega^-(x) + \omega^+(x) - 2\pi \rvert}{2\pi} \,\right]$   (2)
In the literature, various ordered weighted operators are reported; among them, the entropy weight method (EWM) preserves objectivity. Moreover, EWM offers high reliability and convenience [22]. Therefore, a weight calculation method based on the generalized entropy of equation (2) is employed, and the IVCFS form $J = [r^-(x), r^+(x)] \cdot e^{j[\omega^-(x), \omega^+(x)]}$ of equation (1) is adopted to derive the weight of each criterion. The weight vector {w1, w2, ..., wn} is derived as
$w_i = \dfrac{1 - E_g(J_i)}{\sum_{k=1}^{n} \left[1 - E_g(J_k)\right]}$, where i = 1, 2, ..., n.   (3)
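Under the reconstructed equations (2) and (3), a small Python sketch of the entropy-weight computation could look as follows (toy IVCFS values, not data from the paper):

import math

def generalized_entropy(J):
    # J = ((r_lo, r_hi), (w_lo, w_hi)), following the reconstructed eq. (2)
    (r_lo, r_hi), (w_lo, w_hi) = J
    return 1 - 0.5 * (abs(r_lo + r_hi - 1)
                      + abs(w_lo + w_hi - 2 * math.pi) / (2 * math.pi))

def entropy_weights(Js):
    # Eq. (3): each weight is proportional to 1 - Eg(J_i), normalized to 1
    gaps = [1 - generalized_entropy(J) for J in Js]
    return [g / sum(gaps) for g in gaps]

Js = [((0.4, 0.6), (1.0, 2.0)), ((0.7, 0.9), (4.0, 5.5))]  # toy criteria
print(entropy_weights(Js))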
F. Aggregation Operator
An interval-valued complex fuzzy set power weighted average (IVCFSPWA) operator is proposed for use in the process of information fusion.

Definition 2: Assume that $J_i = [r_i^-(x), r_i^+(x)] \cdot e^{j[\omega_i^-(x), \omega_i^+(x)]}$, i = 1, 2, ..., n. The integrated J is given by equation (4):
$J = \mathrm{IVCFSPWA}(J_1, J_2, \ldots, J_n) = w_1 J_1 \oplus w_2 J_2 \oplus \cdots \oplus w_n J_n$   (4)
where the integration operation is given by equation (5):
$\bigoplus_{i=1}^{n} w_i J_i = \left[1 - \prod_{i=1}^{n}\bigl(1 - r_i^-(x)\bigr)^{w_i},\; 1 - \prod_{i=1}^{n}\bigl(1 - r_i^+(x)\bigr)^{w_i}\right] \cdot e^{\,j\,2\pi\left[\prod_{i=1}^{n}\left(\frac{\omega_i^-(x)}{2\pi}\right)^{w_i},\; \prod_{i=1}^{n}\left(\frac{\omega_i^+(x)}{2\pi}\right)^{w_i}\right]}$   (5)
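A hedged Python sketch of the power-weighted-average operation, under the product forms reconstructed in equation (5):

import math

def ivcfs_pwa(Js, weights):
    # Js: list of ((r_lo, r_hi), (w_lo, w_hi)); weights must sum to 1.
    # Amplitude bounds 1 - prod(1 - r)^w and phase bounds 2*pi*prod(w/2pi)^w
    # follow the reconstructed eq. (5); they are an assumption, not a
    # verbatim transcription of the paper's operator.
    r_lo = 1 - math.prod((1 - r[0]) ** w for (r, _), w in zip(Js, weights))
    r_hi = 1 - math.prod((1 - r[1]) ** w for (r, _), w in zip(Js, weights))
    p_lo = 2 * math.pi * math.prod((om[0] / (2 * math.pi)) ** w
                                   for (_, om), w in zip(Js, weights))
    p_hi = 2 * math.pi * math.prod((om[1] / (2 * math.pi)) ** w
                                   for (_, om), w in zip(Js, weights))
    return (r_lo, r_hi), (p_lo, p_hi)

Js = [((0.4, 0.6), (1.0, 2.0)), ((0.7, 0.9), (4.0, 5.5))]
print(ivcfs_pwa(Js, [0.32, 0.68]))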

G. Distance Measure and Similarity Measure


The distance and similarity measures between J1 and J2 are calculated by equations (6) and (7), respectively:
$d(J_1, J_2) = \frac{1}{4}\left(\lvert r_2^-(x) - r_1^-(x)\rvert + \lvert r_2^+(x) - r_1^+(x)\rvert + \frac{\lvert \omega_2^-(x) - \omega_1^-(x)\rvert}{2\pi} + \frac{\lvert \omega_2^+(x) - \omega_1^+(x)\rvert}{2\pi}\right)$   (6)
$s(J_1, J_2) = 1 - \frac{1}{2}\left(\left\lvert \frac{r_1^-(x) + r_1^+(x)}{2} - \frac{r_2^-(x) + r_2^+(x)}{2} \right\rvert + \left\lvert \frac{\omega_1^-(x) + \omega_1^+(x)}{4\pi} - \frac{\omega_2^-(x) + \omega_2^+(x)}{4\pi} \right\rvert\right)$   (7)
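A sketch of the distance of equation (6) as reconstructed above; the 2*pi normalization of the phase gaps is assumed so that all four terms lie in [0, 1]:

import math

def ivcfs_distance(J1, J2):
    # Reconstructed eq. (6): mean of the four normalized bound differences
    (r1, w1), (r2, w2) = J1, J2
    return 0.25 * (abs(r2[0] - r1[0]) + abs(r2[1] - r1[1])
                   + abs(w2[0] - w1[0]) / (2 * math.pi)
                   + abs(w2[1] - w1[1]) / (2 * math.pi))

ideal = ((1.0, 1.0), (2 * math.pi, 2 * math.pi))  # a positive ideal solution
print(ivcfs_distance(((0.4, 0.6), (1.0, 2.0)), ideal))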

III. PROPOSED SYSTEM


In real-time online shopping, users may either drop the idea of shopping or divert to another e-shopping cart for the following reasons: limited knowledge of decision-makers, lack of evaluation time, and insufficient understanding of other products. This motivates popping up the best-suited products within milliseconds based on Z-information.
Therefore, to satisfy multicriteria decision making in collaborative-filtering-based recommendation, uncertain Z-numbers are used in an improved version in the proposed algorithm.
Steps
(i) Attain the IVCFSs of the evaluation information, as shown in Table I.
(ii) Estimate the weight vector.
(iii) Obtain the collective evaluation information.
(iv) Calculate the positive and negative distances Dp and Dn between the alternatives and the ideal solutions.
(v) Generate a non-increasing (best-first) ordering of products based on matching the customer query; a compact sketch of the whole loop follows below.
The evaluation information is displayed in Tables I, II, and III for attributes c1, c2, c3, and c4.
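Putting the steps together, a compact Python sketch of the ranking loop is given below. It reuses the helper functions sketched in Section II (entropy_weights, ivcfs_pwa, ivcfs_distance, plus the math import); the closeness coefficient Dn/(Dp + Dn) is a standard TOPSIS-style combination assumed here, since the steps above do not spell out how Dp and Dn are merged into one score:

def rank_products(evaluations):
    # evaluations: {product: [per-criterion IVCFS]} (toy structure), step (i)
    pos = ((1.0, 1.0), (2 * math.pi, 2 * math.pi))  # positive ideal solution
    neg = ((0.0, 0.0), (0.0, 0.0))                  # negative ideal solution
    scores = {}
    for product, Js in evaluations.items():
        w = entropy_weights(Js)                     # step (ii)
        J = ivcfs_pwa(Js, w)                        # step (iii)
        dp = ivcfs_distance(J, pos)                 # step (iv)
        dn = ivcfs_distance(J, neg)
        scores[product] = dn / (dp + dn)            # assumed closeness score
    return sorted(scores, key=scores.get, reverse=True)  # step (v)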

TABLE I: EVALUATION INFORMATION FOR E1

   | c4 (lips and chin) | c3 (eyes) | c2 (nose) | c1 (forehead)
f1 | {slightly-large, large, certain, extremely certain} | {small, slightly small, uncertain, slightly uncertain} | {extremely small, very large, slightly certain, certain} | {slightly small, slightly large, extremely uncertain, certain}
f2 | {slightly-large, large, certain, extremely certain} | {small, slightly small, uncertain, slightly uncertain} | {extremely small, very large, slightly certain, certain} | {slightly small, very large, extremely uncertain, very certain}
f3 | {slightly-large, large, certain, extremely certain} | {small, slightly small, uncertain, slightly uncertain} | {extremely small, very large, slightly certain, certain} | {slightly small, slightly large, extremely uncertain, certain}
f4 | {slightly-large, large, certain, extremely certain} | {small, slightly small, uncertain, slightly uncertain} | {extremely small, very large, slightly certain, certain} | {slightly small, slightly large, extremely uncertain, certain}

TABLE II: EVALUATION INFORMATION FOR E2

   | c4 (lips and chin) | c3 (eyes) | c2 (nose) | c1 (forehead)
f1 | {slightly-large, large, certain, extremely certain} | {small, slightly small, uncertain, slightly uncertain} | {extremely small, very large, slightly certain, certain} | {slightly small, slightly large, extremely uncertain, certain}
f2 | {slightly-large, large, certain, extremely certain} | {small, slightly small, uncertain, slightly uncertain} | {extremely small, very large, slightly certain, certain} | {slightly small, very large, extremely uncertain, very certain}
f3 | {slightly-large, large, certain, extremely certain} | {small, slightly small, uncertain, slightly uncertain} | {extremely small, very large, slightly certain, certain} | {slightly small, slightly large, extremely uncertain, certain}
f4 | {slightly-large, large, certain, extremely certain} | {small, slightly small, uncertain, slightly uncertain} | {extremely small, very large, slightly certain, certain} | {slightly small, slightly large, extremely uncertain, certain}

TABLE III: EVALUATION INFORMATION FOR E3

   | c4 (lips and chin) | c3 (eyes) | c2 (nose) | c1 (forehead)
f1 | {slightly-large, large, certain, extremely certain} | {small, slightly small, uncertain, slightly uncertain} | {extremely small, very large, slightly certain, certain} | {slightly small, slightly large, extremely uncertain, certain}
f2 | {slightly-large, large, certain, extremely certain} | {small, slightly small, uncertain, slightly uncertain} | {extremely small, very large, slightly certain, certain} | {slightly small, very large, extremely uncertain, very certain}
f3 | {slightly-large, large, certain, extremely certain} | {small, slightly small, uncertain, slightly uncertain} | {extremely small, very large, slightly certain, certain} | {slightly small, slightly large, extremely uncertain, certain}
f4 | {slightly-large, large, certain, extremely certain} | {small, slightly small, uncertain, slightly uncertain} | {extremely small, very large, slightly certain, certain} | {slightly small, slightly large, extremely uncertain, certain}

IV. EXPERIMENTS
During online shopping, displaying a specific set of products in the first few seconds plays an extremely important role in attracting customer attention. Recommendation of the best-suited products is based on the profile-building features of collaborative filtering. Besides this, producing a list of products in decreasing order of taste may help leave a good impression in the customer's memory. Customers always want to choose the most suitable brand of a product based on their needs. Suppose that a customer wants to select one from four brands a1, a2, a3, and a4. For the sake of prudence, a group of three online users e1, e2, and e3 evaluates the options from four aspects: physical appearance (c1), color combination (c2), performance (c3), and price (c4). Assume that the Linguistic Term Sets (LTSs) are H = {h0 = extremely terrible, h1 = very terrible, h2 = terrible, h3 = slightly terrible, h4 = slightly good, h5 = good, h6 = very good, h7 = extremely good} and S = {s0 = extremely uncertain, s1 = very uncertain, s2 = uncertain, s3 = slightly uncertain, s4 = slightly certain, s5 = certain, s6 = very certain, s7 = extremely certain}.
A. Results
Tables IV, V, and VI contain the evaluation information in the form of IVCFSs. After calculating the weight vector of the criteria for users e1, e2, and e3, the collective evaluation information estimated with the aggregation operator is shown in Table VI. The distances Dp and Dn between the alternatives and the ideal solutions are then estimated. Table VII displays the ranking of all alternatives in decreasing order of preference. Hence, the display order of the face sketch can follow the estimated sequence to achieve a better impression.
TABLE IV INPUTS ARE IN IVCFS FOR E1

c1 c2 c3 c4
a1 (0.43,0.57,0.90,4.49) (0.14,0.86,3.59,4.49) (0.29,0.43,1.79,2.69) (1.79,2.24,4.49,6.28)
a2 (0.14,0.29,1.79,2.69) (0.71,1,3.59,5.38) (0.29,0.57,1.79,4.49) (0.43,0.71,3.59,6.28)
a3 (0.29,0.43,0.90,1.79) (0.71,0.86,0.90,4.49) (0.71,1,2.69,3.59) (0.57,0.71,0.90,1.79)
a4 (0.43,0.57,2.69,4.49) (0.14,0.71,0.90,1.79) (0.86,1,1.79,2.69) (0.43,0.57,2.69,3.59)

TABLE V INPUTS ARE IN IVCFS FOR E2

c1 c2 c3 c4
a1 (0.28,1,2.690,6.28) (0.42,0.57,0.89,4.48) (0.14,0.85,1.79,3.49) (1.39,0.854,3.58,3.58)
a2 (0.14,0.28,1.79,3.58) (0.14,0.28,1.79,2.69) (0.710,1,1.79,2.69) (0.73,1,1.79, 2.62)
a3 (0.14,1,0.89,3.58) (0.28, 0.42, 0.89, 1.79) (0.714,0.85,2.69,1.79) (0.71,0.89,0.90,1.79)
a4 (0.14,0.28, 3.58, 6.28) (0.42,0.57,2.69,4.49) (0.14,0.71,1.79,4.49) (0.14,0.71,2.69,4.48)

TABLE VI INPUTS ARE IN IVCFS FOR E3

c1 c2 c3 c4
a1 (0.43,0.57,0.90,4.49) (0.14,0.86,3.59,4.49) (0.29,0.43,1.79,2.69) (1.79,2.24,4.49,6.28)
a2 (0.14,0.29,1.79,2.69) (0.71,1,3.59,5.38) (0.29,0.57,1.79,4.49) (0.43,0.71,3.59,6.28)
a3 (0.29,0.43,0.90,1.79) (0.71,0.86,0.90,4.49) (0.71,1,2.69,3.59) (0.57,0.71,0.90,1.79)
a4 (0.43,0.57,2.69,4.49) (0.14,0.71,0.90,1.79) (0.86,1,1.79,2.69) (0.43,0.57,2.69,3.59)

TABLE VII RANKING OF ALL BRANDS

a1 a2 a3 a4
e1 0.54 0.66 0.61 0.62
e2 0.65 0.57 0.58 0.44
e3 0.62 0.56 0.61 0.53
Ø 0.62 0.58 0.59 0.55
Ranking a1> a3> a2> a4

V. CONCLUSIONS
Even though the Z-number is a young concept, introduced only in 2011 and still at a preliminary stage, it is an effective tool for developing MCGDM. E-commerce websites use recommendation systems to help online customers purchase daily-life commodities. Therefore, an improved version of Z-numbers, i.e., IVZ, is applied along with CFSs. In this paper, the IVCFS technique is adopted to handle uncertain Z-numbers. The support of a CFS is unrestricted and may include real numbers, daily household items, and accessories; moreover, it provides the feasibility of interpreting the uncertainty and reliability of information expression. Therefore, CFS-based IVZ is used to rank and recommend a certain set of preferred items.
Generalized entropy, a distance measure, and an aggregation operator are applied in the proposed recommendation system, which ultimately produces a ranking of favorite products from high to low. The limitation of the presented work is the high degree of dependency on evaluation information from different sects of users.

REFERENCES
[1] L. A. Zadeh, “A note on Z-numbers,” Inf. Sci., vol. 181, no. 14, pp. 2923–2932, 2011.
[2] G. Beliakov, M. Pagola, and T. Wilkin, “Vector valued similarity measures for Atanassov’s intuitionistic fuzzy sets,”
Inf. Sci., vol. 280, pp. 352–367, 2014.
[3] T. Verma and A. Kumar, “Ambika methods for solving matrix games with Atanassov’s intuitionistic fuzzy payoffs,”
IEEE Trans. Fuzzy Syst., vol. 26, no. 1, pp. 270–283, Feb. 2018.
[4] L. De Miguel, H. Bustince, J. Fernandez, E. Induráin, A. Kolesárová, and R. Mesiar, “Construction of admissible linear
orders for interval-valued Atanassov intuitionistic fuzzy sets with an application to decision making,” Inf. Fusion, vol.
27, pp. 189–197, 2016.
[5] J. Xu, S. P. Wan, and J. Y. Dong, “Aggregating decision information into Atanassov’s intuitionistic fuzzy numbers for
heterogeneous multi-attribute group decision making,” Appl. Soft Comput., vol. 41, pp. 331–351, 2016.
[6] J. Q. Wang, P. Wang, J. Wang, H.-Y. Zhang, and X.-H. Chen, “Atanassov’s interval-valued intuitionistic linguistic
multicriteria group decision-making method based on the trapezium cloud model,” IEEE Trans. Fuzzy Syst., vol. 23, no.
3, pp. 542–554, Jun. 2015.
[7] J. Chai, S. Xian, and S. Lu, “Z probabilistic linguistic term sets and its application in multi-attribute group decision
making,” Fuzzy Optim. Decis. Making, vol. 20, pp. 529–566, 2021.
[8] S. Xian, J. Chai, T. Li, and J. Huang, “A ranking model of Z-mixture numbers based on the ideal degree and its
application in multi-attribute decision making,” Inf. Sci., vol. 550, pp. 145–165, 2021.
[9] F. Sabahi, “Introducing validity into self-organizing fuzzy neural network applied to impedance force control,” Fuzzy
Sets Syst., vol. 337, pp. 113–127, 2017.
[10] S. Razvarz and M. Tahmasbi, “Fuzzy equations and Z-numbers for nonlinear systems control,” Procedia Comput. Sci.,
vol. 120, pp. 923–930, 2017.
[11] R. A. Aliev, W. Pedrycz, B. G. Guirimov, and O. H. Huseynov, “Acquisition of Z-number-valued clusters by using a
new compound function,” IEEE Trans. Fuzzy Syst., vol. 30, no. 1, pp. 279–286, 2022, doi:
10.1109/TFUZZ.2020.3037969.
[12] W. Jiang, Y. Cao, and X. Deng, “A novel Z-network model based on Bayesian network and Z-number,” IEEE Trans.
Fuzzy Syst., vol. 28, no. 8, pp. 1585–1599, Aug. 2020.
[13] R. Aliev, A. Alizadeh, and O. Huseynov, “The arithmetic of discrete Z-numbers,” Inf. Sci., vol. 290, pp. 134–155, 2015.
[14] R. A. Aliev, O. H. Huseynov, and L. M. Zeinalova, “The arithmetic of continuous Z-numbers,” Inf. Sci., vol. 373, pp.
441–460, 2016.
[15] Q. Liu, H. Cui, Y. Tian, and B. Kang, “On the negation of discrete Z-numbers,” Inf. Sci., vol. 537, pp. 18–29, 2020.
[16] H.-G. Peng and J.-Q. Wang, “A multicriteria group decision-making method based on the normal cloud model with Zadeh’s Z-numbers,” IEEE Trans. Fuzzy Syst., vol. 26, no. 6, pp. 3246–3260, Dec. 2018.
[17] H. G. Peng, H. Y. Zhang, J. Q Wang, and L. Li, “An uncertain Z-number multicriteria group decision-making method
with cloud models,” Inf. Sci., vol. 501, pp. 136–154, 2019.
[18] K. W. Shen and J. Q. Wang, “Z-VIKOR method based on a new comprehensive weighted distance measure of Z-
number and its application,” IEEE Trans. Fuzzy Syst., vol. 26, no. 6, pp. 3232–3245, Dec. 2018.
[19] F. Sabahi, “FN-TOPSOS: Fuzzy networks for ranking traded equities,” IEEE Trans. Fuzzy Syst., vol. 25, no. 2, pp. 315–
332, Apr. 2017.
[20] D. Ramot, R. Milo, M. Friedman, and A. Kandel, “Complex fuzzy sets,” IEEE Trans. Fuzzy Syst., vol. 10, no. 2, pp.
171–186, Apr. 2002.
[21] Q. Zhang, Y. Chen, J. Yang, and G. Wang, “Fuzzy entropy: A more comprehensible perspective for interval shadowed
sets of fuzzy sets,” IEEE Trans. Fuzzy Syst., vol. 28, no. 11, pp. 3008–3022, Nov. 2020.
[22] Q. Jia, J. Hu, Q. He, W. Zhang, and E. Safwat, “A multicriteria group decision-making method based on AIVIFSs, Z-
numbers, and trapezium clouds,” Inf. Sci., vol. 566, pp. 38–56, 2021.


Mobile Malware Attacks, Classification, Propagation, Analysis, Detection, Challenges and Future Directions – A Survey
Dr. B Senthilkumar1, Dr. M Sujithra2 and Mani Barathi SP S3
1Associate Professor, Department of Mechanical Engineering, Kumaraguru College of Technology
Email: [email protected]
2Assistant Professor, Department of Computing – Data Science, Coimbatore Institute of Technology
Email: [email protected]
3M.Sc Data Science (Integrated Master’s Degree Program), Department of Computing – Data Science, Coimbatore Institute of Technology

Abstract— The number of Smartphone users is increasing day by day, and mobiles have become
an integral part of society. This is because of the rich variety of mobile devices and essential
applications provided by their manufacturers. The increasing number of mobile devices invites
skilled developers and hackers to develop malware that invades personal and business
information very efficiently. Therefore, mobile devices are an ideal target for various
security issues and data privacy threats in the mobile ecosystem. Threats posed by malware
include leaking of private information, financial loss to users, and system damage. For better
protection, researchers and manufacturers are making great efforts to produce anti-malware
systems with effective detection methods. Why Android is the most targeted platform for
malware developers, why, how, and when malware propagates into the mobile system, how
malware is detected, and which protection mechanisms exist are discussed in this paper.

Index Terms— Detection algorithm, mechanism, malware, machine learning approach, threats,
vulnerabilities.

I. INTRODUCTION
In recent times, the use of mobile devices for both business and personal purposes has increased significantly. Modern tablets and smartphones provide many useful services such as internet browsing, maps, social network clients, and internet banking, in addition to standard mobile functionality including phone calls, SMS, and Bluetooth. The data used and stored in these services is often highly sensitive and therefore desired by attackers. Mobile devices may have become the most popular gadgets, but their security is still a developing domain. It is of rising significance and a cumulative need, yet it remains a comparatively weak area for protecting users' data privacy. Although mobile companies do think about users' security and data privacy, the use of applications from the internet creates complex issues in relation to handling threats and vulnerabilities when securing a user's data privacy.
There are thousands of diverse applications accessible from application stores for each mobile device, and these applications have an extensive range of purposes, including web browsing, entertainment (movies, games, and music), social networking, communication (e-mail, internet messaging), banking, and location-based services.

The motivations for the existence of malware include:
 User Information Stealing
 Publicity of User Data
 Spam SMS
 Optimization of search process
 Ransom

Figure 1: Malware growth in relation to time

Figure 2: Threats and attacks on mobile OS

The security goals to be achieved are confidentiality, integrity, and availability.
 Confidentiality refers to preserving data from unauthorized access and applies to proprietary information and personal privacy security.
 Integrity refers to safeguarding information from unauthorized destruction or modification by attackers and further guarantees the authenticity and non-repudiation of information.
 Availability is characterized as guaranteed access to and utilization of data within the assured time; it ensures that information and data are utilized in a timely manner.

Figure 3: Data protected by the CIA triad

The important aspects of mobile device security and data privacy issues are discussed. Sensitive security issues affecting smartphones, such as malware attacks, vulnerabilities, and threats, are addressed. The classification of malware, malware propagation, and the types of malware analysis techniques are discussed. This survey presents the trusted security countermeasures and various malware detection techniques to help users protect their devices. Research questions for future work are also introduced in this review. This paper is organized as follows. In Section 2, the malware classification is discussed. Section 3 discusses malware propagation. Section 4 presents the taxonomy of malware analysis and detection approaches. Section 5 discusses the malware detection mechanisms. Section 6 presents the performance evaluation techniques. Future directions are described in Section 7, and the conclusion is presented in Section 8.

II. MALWARE CLASSIFICATION


The term malware is an abbreviation of malicious software. Put simply, any piece of software written with the intent of doing harm to data, devices, or people is called malware. Malware classification can be done in several ways to distinguish the unique types from each other. For a better understanding of how they can infect computers and devices, the threat level they pose, and how to protect against them, distinguishing and classifying them becomes imperative. The type of damage malware inflicts can be helpful in categorizing what kind of malicious software you are dealing with. The common types of malware are as follows:
 VIRUS: As in biology, viruses attach themselves to clean files, are contagious, and may infect other clean files. They can spread uncontrollably, damage a system’s core functionality, and delete or corrupt files. They usually appear as an executable file.
 TROJANS: Malware that disguises itself as legitimate software, or is included in legitimate software that has been tampered with, belongs to the trojan family. It creates backdoors in your security to let other malware in and tends to act discreetly.
 SPYWARE: No surprise here: spyware is malware designed to spy on you. It hides in the background and takes notes on what you do online, including your passwords, credit card numbers, surfing habits, and more.
 WORMS: Worms infect entire networks of devices, either local or across the internet, by using network interfaces. Each consecutively infected machine is used to infect more.
 RANSOMWARE: Also called scareware, this kind of malware can lock down your computer and threaten to erase everything unless a ransom is paid to its owner.
 ADWARE: Though not always malicious in nature, particularly aggressive advertising software can undermine your security just to serve you ads, which can give a lot of other malware a way in. Plus, let’s face it: pop-ups are annoying.
 BOTNETS: Botnets are networks of infected computers that are made to work together under the control of an attacker.

Figure 4: Malware types

III. MALWARE PROPAGATION


There are different types of malware propagation:
 REPACKAGING: Malware developers first download a popular app, disassemble it (i.e., generating the source code written in Java), insert their own code carrying a malicious payload within the original code, reassemble the app, and redistribute it in official or third-party app markets.
 UPDATE ATTACKS: The repackaging technique includes the malicious payload within the original app, but that is easier to detect by analyzing the source code. To evade detection, instead of including the malicious payload within the app, malware developers include only an update component which downloads the malicious payload at run time, after the app is installed on the device. Hence, scanning the source code will not detect the malware, as initially there is no malicious code within the app.
 DRIVE-BY DOWNLOADS: This technique applies traditional drive-by downloads to Android devices as well, in which users are enticed to download interesting or attractive apps. For example, Tracker malware has an in-app advertisement library. After clicking on that advertisement link, the user is redirected to a website which displays a message to download an app which can save the battery of the device. However, that downloaded app is malware which subscribes to premium-rate services without the user’s knowledge.

Figure 5: Types of malware propagation

IV. MALWARE ANALYSIS TECHNIQUES


Malware can be analyzed with the help of detection techniques. Malware analysis is the method or process of understanding the code, behavior, and functionality of malware so that the critical impact of an attack can be measured. Static analysis, dynamic analysis, and permission-based analysis are the three broad categories of detection techniques. Static code analysis, taint tracing, and control-flow dependencies are the ways in which static analysis can be done, as depicted in the figure given below. Dynamic analysis considers parameters including network traffic, native code, and user interaction. Permission-based analysis can be done with the help of the permissions specified in the manifest file. As reported in the literature, various techniques exist for the detection of mobile malware.

Figure 7: Malware analysis techniques

 Static Analysis: Static analysis investigates a downloaded app by inspecting its software properties and its source code. It is an inexpensive way to find malicious activities in code segments without executing the application and observing its behavior. Many techniques can be used for static analysis: decompilation, decryption, pattern matching, static system call analysis, etc. The obfuscation and encryption techniques embedded in software make static analysis hard. Static analysis is further divided into two categories, traditionally used by anti-viruses:
- misuse detection
- anomaly detection

 Misuse detection: Misuse detection uses a signature-based approach for the detection of malware, based on security policies and rulesets, by matching signatures. Static analysis makes it possible to extract the data-flow and control-flow dependencies in source code that help to understand the behavior of apps.
 Anomaly detection: Anomaly detection uses machine learning algorithms that learn from known malware and predict unknown malware. Identifying the actions of malware rather than patterns is the most common application of this approach. First, suspicious behavior profiles of applications are constructed, and then observed signatures are matched against a database of normal-behavior applications. By training the network with a classifier such as a support vector machine (SVM), it can distinguish between malicious and normal behavior (a minimal sketch appears after this list).
 Dynamic Analysis: Dynamic analysis executes the application in a secluded environment to keep track of its execution behavior. Various heuristics are considered for monitoring dynamic behavior, including monitoring network activity, file changes, and system call traces. Android applications can be run in the Android SDK's mobile device emulator, which runs on a desktop computer and emulates all software and hardware features except placing phone calls. For testing purposes, the emulator supports Android Virtual Device (AVD) configurations. When applications run on the emulator, they can use all services, such as invoking other applications, accessing the network state, playing audio and video, and storing and retrieving data.
 Permission Analysis: Permissions play a key role while analyzing Android applications. They are listed in the Manifest.xml file when each application is installed. Install-time permissions limit application behavior, give control over privacy, and reduce bugs and vulnerabilities. Users have the right to allow or deny the installation of applications, but they cannot select individual permissions. These permissions are required in Android applications because the use of resources in Android phones is based on this permission set. Some researchers detect malicious behavior of Android applications based on the permissions specified in Manifest.xml.
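As an illustration of the anomaly-detection approach mentioned above (a sketch with toy data, not an implementation from the surveyed literature), a scikit-learn SVM can be trained on hypothetical numeric app features such as permission counts or API-call frequencies:

from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X = [[12, 3, 1], [2, 0, 0], [15, 5, 2], [1, 1, 0],
     [11, 4, 1], [0, 1, 0], [14, 6, 3], [2, 0, 1]]  # toy feature vectors
y = [1, 0, 1, 0, 1, 0, 1, 0]                        # 1 = malicious, 0 = benign

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)
print(clf.predict(X_te))  # predicted labels for the held-out apps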

V. MALWARE DETECTION MECHANISM


A malware detection system is a method to examine whether a program is malicious or non-malicious. The detection system includes two processes:
 Analysis
 Detection
It takes two inputs, one being the signature or behavioral parameters of a given code and the second the program under inspection, and it can then employ its own detection mechanism to decide whether the program is malware or not. Detection approaches can be divided into:
 signature-based detection
 specification-based detection
 behavior-based detection
 application permission analysis
 cloud-based malware detection
 data-mining-based malware detection
 Signature-Based Malware Detection
A pattern-matching approach, as used by commercial antivirus products, is an example of signature-based malware detection: the scanner scans for a sequence of bytes within a program’s code to identify and report malicious code. This approach adopts a syntactic level of code instructions to detect malware by analyzing the code during program compilation. The technique usually covers the complete program code within a short period of time. However, this method has a limitation in that it ignores the semantics of instructions, which allows malware obfuscation during the program’s run-time.
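A toy Python sketch of the byte-signature idea (the signature bytes below are made up for illustration, not real malware patterns):

SIGNATURES = {
    "demo_trojan": bytes.fromhex("deadbeefcafe"),  # hypothetical patterns
    "demo_worm": b"\x4d\x5a\x90\x00\x03",
}

def scan(path):
    # Report every known signature found as a byte substring of the file
    with open(path, "rb") as f:
        data = f.read()
    return [name for name, sig in SIGNATURES.items() if sig in data]

print(scan("suspect.apk"))  # hypothetical file; an empty list means no match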
A. Specification-Based Malware Detection
Specification-based detection makes use of a rule set of what is considered normal in order to decide the maliciousness of a program violating the predefined set of rules. In a specification-based system, there exists a training phase that attempts to learn all the valid behavior of the program or system that needs to be inspected. The main limitation of specification-based systems is that it is very difficult to accurately specify the behavior of the system or program.
B. Behavioral-Based Detection
The behavior-based malware detection system is composed of several applications, which together provide the resources and mechanisms needed to detect malware on the Android platform. Each program has its own specific functionality and purpose in the system, and the combination of all of them creates the behavior-based malware detection system. The Android data mining scripts and applications are responsible for collecting data from Android applications, and the script running on the server is responsible for parsing and storing all collected data.
C. Permission-Based Detection
Applications run in a sandbox environment; however, they need permissions to access certain data. At installation time, the Android platform asks the user to grant or deny permissions for the application based on the activities the application can perform. This overcomes a limitation of the Android platform whereby developers can intentionally omit a permission label on a component: if no label is specified, there is no restriction, since the default policy is to allow.
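For concreteness, the sketch below (illustrative file path; not code from the paper) extracts the requested permissions from a decoded AndroidManifest.xml using only the Python standard library:

import xml.etree.ElementTree as ET

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

def requested_permissions(manifest_path):
    # Collect the android:name attribute of every <uses-permission> element
    root = ET.parse(manifest_path).getroot()
    return [e.get(ANDROID_NS + "name") for e in root.iter("uses-permission")]

perms = requested_permissions("AndroidManifest.xml")  # hypothetical path
print([p for p in perms if p and "SMS" in p])         # e.g. flag SMS access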
D. Cloud-Based Malware Detection
Google Play apps are examined for malware. Bouncer is the service used to automatically examine apps on the Google Play Store for malware. As soon as an application is uploaded, Bouncer checks it and compares it to other known malware, trojans, and spyware. Every application is run in a simulated environment to see whether it will behave maliciously on an actual device.
E. Data-Mining-Based Malware Detection
In data mining methods for detecting malicious executables, a malicious executable is defined as a program that performs a harmful function, such as compromising a system’s security, damaging a system, or obtaining sensitive information without the user’s permission. Such data mining methods detect patterns in large amounts of data, such as byte code, and use these patterns to detect future instances in similar data. These frameworks use classifiers to detect new malicious executables.

VI. PERFORMANCE EVALUATION METRICS


Evaluation metrics play a critical role in achieving the optimal classifier during classification training. Thus, the selection of a suitable evaluation metric is an important key for discriminating and obtaining the optimal classifier. This paper systematically reviews the related evaluation metrics that are specifically designed as discriminators for optimizing generative classifiers. Generally, many generative classifiers employ accuracy as a measure to discriminate the optimal solution during classification training. However, accuracy has several weaknesses: it is less distinctive, less discriminable, less informative, and biased toward majority-class data.

Figure 8: Performance Evaluation Metrics
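For reference, the sketch below computes the usual confusion-matrix metrics that complement plain accuracy on imbalanced malware datasets (the counts are made up for illustration):

def metrics(tp, fp, tn, fn):
    # Precision, recall (detection rate) and F1 from a confusion matrix
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

print(metrics(tp=90, fp=5, tn=880, fn=25))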

VII. FUTURE DIRECTIONS
In terms of security and privacy, smartphone users are not able to figure out the number of attacks on their devices or how much money malicious apps may steal from their accounts. In this survey, we first discussed different types of mobile device vulnerabilities and threats. Secondly, we classified malware and malicious applications, focusing on how the attack is executed and what the target of the attackers is. Finally, we discussed possible malware detection defense mechanisms for mobile device security and suggested some future directions to improve the detection of malicious or abnormally behaving applications before their propagation: using new machine learning techniques for providing real-time behavior analysis and identifying fake apps. Deep learning algorithms can be utilized for feature extraction with more accuracy during malware testing. Mobile OS companies, especially popular ones, should consider more security mechanisms for preventing unpredictable attacks.

VIII. CONCLUSION
Smartphones are becoming increasingly capable in terms of power, sensors, and communication. With the rapid proliferation of smartphone gadgets and feature-rich apps, along with several sensors and connections, the number of malware instances and attacks is rising. Modern smartphones provide many services such as messaging, internet browsing, emailing, and playing games in addition to traditional voice services. With the increase in the number of smartphones on the market, the need for malware analysis is an urgent issue. Malware is a critical threat to a user’s computer system in terms of
 stealing confidential information
 corrupting or disabling the security system.
This survey paper explains some of the technologies used by security researchers to counter these threats. It covers malware types; static, dynamic, and hybrid malware analysis techniques; and malware detection mechanisms. Among the various existing approaches, machine learning methods have shown results with high accuracy in the detection of malicious activities. With this categorization, we want to provide an easy understanding for users and researchers to improve their knowledge about the security and privacy of smartphones.

REFERENCES
[1] Ammar Ahmed E. Elhadi, Mohd Aizaini Maarof and Ahmed Hamza Osman, Malware Detection Based on Hybrid
Signature Behaviour Application Programming Interface Call Graph, American Journal of Applied Sciences 9 (3): 283-
288, 2020, ISSN 1546-9239, 2021, Science Publications
[2] Sujithra, M., Padmavathi, G., (2012): A Survey on Mobile Device Threats, Vulnerabilities, and their Defensive
Mechanism. International Journal of Computer Applications (0975-8887) Volume 56-- No.14
[3] Kirti Mathur, Saroj Hiranwal, A Survey on Techniques in Detection and Analyzing Malware Executables, International
Journal of Advanced Research in Computer Science and Software Engineering, ISSN: 2277 128X, Volume 3, Issue 4,
April 2020.
[4] Sujithra, M.; Padmavathi, G.: Enhanced permission-based malware detection in mobile devices using optimized random
forest classifier with PSO-GA. Res. J. Appl. Sci. Eng. Technol. 12, 732–741 (2016)
[5] Matthew G. Schultz, Eleazar Eskin, Erez Zadok, and Salvatore J. Stolfo, Data Mining Methods for Detection of New Malicious Executables, in Proceedings of the IEEE Symposium on Security and Privacy, 2001, pp. 38-49.
[6] Sujithra M, Padmavathi G, Narayanan S (2015) Mobile device data security: a cryptographic approach by outsourcing
mobile data to cloud. Procedia Computer Science. 47:480–485
[7] Jonathan Joseph Blount, Adaptive rule-based malware detection employing learning classifier systems, M.S. thesis in Computer Science, Missouri University of Science and Technology, 2022.


A Survey on Hyperspectral Sensing Techniques for Identification of Fake Pharmaceutical Medicines
Pravin V Dhole1, Vijay D Dhangar2, Sulochana D Shejul3 and Prof. Bharti W Gawali4
1-4Department of Computer Science and Information Technology, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad-431004 (MS), India.
Email: [email protected], [email protected], [email protected], [email protected]

Abstract— Risky or subpar medications are indeed an issue for healthcare organizations,
especially in middle-income countries with inadequate pharmaceutical care and pharmaceutical
legal frameworks. Substandard pharmaceuticals can lead to major adverse drug reactions, the
emergence of bacterial resistance, and the possibility of therapeutic failure. They can also
raise medical costs and erode people’s trust in medical institutions. In a rapidly
growing world, the study of fake pharmaceutical medicines is crucial. Hyperspectral imaging
(HSI) is a developing imaging technique for use in medicine. The dataset acquired via HSI has
three dimensions, two spatial and one spectral. A summary of the research on
hyperspectral sensing techniques for the identification of fake pharmaceutical medicines is
provided in this research article. This study aims to evaluate the various technologies
available for identifying fake medicines.

Index Terms— Hyperspectral imaging (HSI), Raman hyperspectral imaging, spectroscopy, fake
pharmaceutical medicines, spectral signature, visible to near-infrared region, spectral data
analysis.

I. INTRODUCTION
The impact of fake medicines on society has a global reach. Due to massive distribution paths, unauthorized online pharmacies, and reused materials and packaging, they are difficult to identify. To solve this problem, it is necessary to develop detection techniques for identifying fake medicines as well as various techniques for confirming counterfeiting. For the identification of fake medicine tablets, new challenges arise day by day, motivating the development of techniques that identify fake pharmaceutical products through appearance inspection. Analytical, chromatographic, and spectroscopic identification techniques are rapidly developing and available, but they take more time, require more data, or may impact the environment, so new techniques or methods need to be developed. The prevalence of fake or poor-quality medicines (FM) on sale is increasing, which is quite striking, particularly as no permit is needed for the leading medical products or medications. In rich nations, the primary target is costly lifestyle medications (hormones, steroids, appetite suppressants, medications for premature ejaculation, psychotropic drugs), whereas in underdeveloped regions, daily-life pharmaceuticals (antibiotic, antimalarial, antituberculosis, and antiretroviral tablets) are the objective [1]. Table 1 shows the categories of fake medicines with their definitions. A substandard medicine is an approved medical item that does not meet benchmarks, regulations, or sometimes both, and is often referred to as hazardous, while falsified pharmaceutical ingredients purposefully conceal their origin, name, or even content. Since those standards are contradictory, a sample may only be categorized as either substandard or falsified by the World Health Organization (WHO) [5]. Medicines that lack active pharmaceutical ingredients (APIs), in addition to those that include incorrect substances that may or may not be dangerous, are counterfeit.

TABLE 1: FAKE MEDICINE CATEGORIES

Categories  | Definitions
Substandard | Medications that fall short of quality standards and criteria are known as poor or substandard medications [2].
Falsified   | Falsified medications seem to be imitations of legitimate medications made from bogus components [3].
Counterfeit | Medications that violate copyrights or trademarks are considered counterfeit [3].
Diverted    | The unauthorized transfer of authorized medications from credible sources to the black market is known as prescription medication diverting [4].

The World Health Organization (WHO) has noted that counterfeit medications purposefully conceal their source information and are deliberately misnamed. The differing definitions of the term counterfeit medication across nations have made it difficult to share information among them or truly comprehend the scope of the issue on a worldwide scale [6]. Various categories of counterfeit medications are distinguished, and they can be found using a variety of analytical techniques. Some of the most typical products include those having an inadequate amount of the active ingredient or no active ingredient at all; 15.6% of items are packed wrongly, 21.4% of products are constructed using inappropriate material, and 8.5% of genuine copies of the precise product have significant contamination [7]. Fake medications with a wrong dosage of an active component can cause a variety of medical conditions. A minimal antibiotic treatment may not eradicate the germs but could cause the growth of bacterial resistance. In Cambodia, fake malaria medicines killed 30 people in 2000 [8]. In 1993, more than 100 children in Nigeria were killed as a result of a toxic chemical found in fake cough syrup. Due to the presence of ethylene glycol in cough syrup in place of glycerol, comparable cases occurred in both China and India between 1990 and 2007, as well as in Panama. Around 190,000 people died in 2002 as a result of polyethylene glycol poisoning in paracetamol syrup [9].
Fake medications may also contain an undeclared active component. Recreational medications with potential botanical constituents are a typical target for this kind of fraud. Despite being regarded as healthy, cannabinoids can nonetheless have certain pharmaceutical consequences for the body; therefore, this needs to be considered, and such medications must be used with a prescriber. Four children’s cough syrups, Promethazine oral solution, Kofexmalin baby cough syrup, Makeoff baby cough syrup, and Magrip N cold syrup, have recently been labeled as substandard by the WHO after causing kidney problems and the deaths of about 66 children in the Gambia [10]. According to the literature, fake medicines cover practically all pharmaceutical medicine types, such as antimicrobials and antimalarials [11, 12], as well as erectile dysfunction, herbal, diabetes, and weight-control medicines [13].
A. Hyperspectral Remote Sensing
HRS (Hyperspectral Remote Sensing), also known as imaging spectroscopy, can give good image spectral information. Researchers and scientists have investigated and applied imaging spectroscopy techniques for the detection, identification, and mapping of minerals on land, in waters, and in the atmosphere, based on the characteristics of HRS, which combines imaging with spectroscopy and captures the individual absorption features of materials due to specific chemical bonds in a solid, liquid, or gas. As a result, HRS technology, as an enhanced remote sensing instrument, has been explored for a variety of applications, including geology, geomorphology, and environmental monitoring [65]. Hyperspectral Imaging (HSI) is a technique that examines the entire range of wavelengths instead of identifying only the primary RGB colors (red, green, and blue) in each pixel. The light reaching every pixel is divided into numerous separate wavelength channels and provides additional details about what has been viewed. The unique color signature of an individual object may be recognized using HSI. Unlike other optical technologies that can only detect a single color, hyperspectral imaging can detect the whole color spectrum in each pixel [66].
Standard RGB photographs display only basic contour and color, so they cannot always distinguish between connected component samples that share the same contours and color but differ in their material responses, particularly when it comes to pharmaceutical discrimination. The primary difference between hyperspectral images and standard RGB data is the higher spectral resolution and larger spectral range of hyperspectral data. Figures 1 and 2 demonstrate that the hyperspectral signal contains more information than standard RGB data. Earlier image processing techniques are much less suited to hyperspectral data, since data is considered hyperspectral when it has more than a hundred bands [67].

Hyperspectral data are used for the identification of active pharmaceutical ingredients in medicines. In hyperspectral imaging, a target is described at a large number of distinct wavelengths. In its most basic form, hyperspectral data is a data cube in which the first two dimensions indicate spatial distances and the third represents the spectral wavelength or wavenumber. Hyperspectral imaging is used in various applications like remote sensing, agriculture, and food, and it can also be used in pharmaceutical studies [68]. The advantage of the technique is that it requires little sample material and does not make use of any chemical or solvent, which increases safety, reduces negative environmental impacts, and saves chemical laboratory analysis costs. When the data is captured by hyperspectral image sensors, all the chemical information in a given spectral range is present. Thus, with appropriate data analysis, several quantitative properties can be determined simultaneously from a single scanned image.
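As a minimal illustration of the data-cube view (synthetic values; real cubes would be loaded from ENVI or HDF files instead), a hypercube can be handled in Python as a NumPy array of shape (rows, cols, bands):

import numpy as np

rows, cols, bands = 64, 64, 120
cube = np.random.rand(rows, cols, bands)      # synthetic stand-in cube

pixel_spectrum = cube[10, 20, :]              # spectrum of one spatial pixel
mean_spectrum = cube.reshape(-1, bands).mean(axis=0)  # scene-average spectrum
print(pixel_spectrum.shape, mean_spectrum.shape)      # (120,) (120,)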

II. RELATED WORK


To start, we looked through the existing literature for academic journals, original research, conference papers, and case studies available on platforms such as Google Scholar, IEEE Xplore, and PubMed (Medline). The multidisciplinary character of the research objective, which required a review of the science/health research, was the justification for selecting those resources. Hyperspectral imaging (HSI) is a developing imaging technique used in medicine, particularly in the identification of fake pharmaceuticals. The literature review focuses on the identification of fake medicines using hyperspectral imaging, multispectral imaging, and Raman hyperspectral imaging. Table II shows various applications of hyperspectral imaging in the medical field.
A multimodal PLS-based quantitative study of tablet content, based on an infrared absorption band, was built. The study's aggregate result, with 7 suspect paracetamol tablet tests (12%), is roughly in line with WHO assessments of the amount of inferior or counterfeit pharmaceuticals marketed internationally. The numerical study for the wavelength range 1524-1493 cm-1 revealed no errors [33]. Near-infrared spectroscopy (NIR) is used to quantify the active components in semi-solid pharmaceutical preparations. One study assessed how well six distinct dermatological semi-solid pharmaceutical formulations could be quantitatively inspected using a NIR spectrometer with a restricted wavelength range (1000–1900 nm); the accuracy and variance are highly dependent on the active component and the range of its concentrations [34]. An ASD FieldSpec4 spectrometer was used in a study building a spectral database of common pharmaceutical excipients and active pharmaceutical ingredients. It has a broad spectral range (350-2500 nm), and in its supporting software, derivatives with a derivative gap of 7 were employed as pre-processing approaches; ViewSpecPro software is used to display the output of the spectral database as figures for each sample [31]. Another strategy was applied to real and fake Viagra and Plavix materials: a margin of error of 15 to 24% was used to determine the amount of active medicinal component, and the methodology may be used to provide a first estimate before the application of further quantitative approaches [35]. Phosphodiesterase type 5 inhibitors have a murky, booming online market. Because fake medical products can make the user feel sick, medical practitioners may grow sceptical of them. Authentic Cialis pills and pills from dubious internet pharmacies were examined in one investigation. Using infrared and Raman spectroscopy, it was possible to detect the bogus pills and establish
TABLE II: APPLICATIONS OF HYPERSPECTRAL IMAGING SYSTEMS IN MEDICINE

Spectral range (nm) | Methods | Application | Method of measuring
440-640 | Hyperspectral imaging | Skin cancer [14] | Fluorescence and reflectance
500-600 | Medical hyperspectral imaging (MHSI) | Diabetic foot [15] | Reflectance
450-650 | Hyperspectral imaging | Endoscope [16] | Reflectance
365-800 | Hyperspectral imaging microscopy | Melanoma [17] | Transmission
400-720 | Hyperspectral imaging | Tumor hypoxia and microvasculature [18] | Fluorescence
450-700 | Medical hyperspectral imaging | Breast cancer [19] | Reflectance
400-1000; 900-1700 | Hyperspectral imaging | Intestinal ischemia [20] | Reflectance
450-950 | Hyperspectral imaging | Prostate cancer detection [21] | Reflectance
2500-11,111 | Fourier transform infrared spectroscopy | Breast cancer [22] | -
500-3500 | Raman hyperspectral imaging | Substandard antimalarial tablet [23] | -
1000-2500 | Hyperspectral imaging | Classification of drug tablets [24] | Wavelength
1000-2500 | Short-wave infrared hyperspectral imaging | Detecting counterfeit drugs with identical API composition [25] | Wavelength
350-1050 | Hyperspectral imaging | Detection of counterfeit medicines [26] | Reflectance
1000-2500 | Near-infrared spectroscopy | Counterfeit detection on a large database of pharmaceutical tablets [27] | Reflection
4000-12,000 cm−1 | Near-infrared spectroscopy | Illegal synthetic adulterants in herbal anti-diabetic medicines [28] | Reflection
1730-174 cm−1 | Raman spectroscopy | Identification of counterfeit drug [29] | Wavelength
4000-400 cm−1, 1000-4000 cm−1 | Fourier transform infrared spectroscopy, near-infrared spectroscopy, and Raman spectroscopy | Detection of counterfeit medicines [30]; detection of falsified antimalarial drug [31] | Reflectance
1001-2500 | Hyperspectral non-imaging | Pharmaceutical common excipients [32] | -

their API and excipients [36]. Aripiprazole in bulk and pharmaceutical formulation is detected quantitatively using reversed-phase high-performance liquid chromatography (RP-HPLC); according to the study, recovery experiments and the computation of the percentage return were both used to assess the method's reliability [37]. Fake medications were examined using portable Raman spectroscopy with tailored Local Straight-Line Screening (LSLS) as well as principal component analysis (PCA). An algorithm was used to identify fake drugs mixed with herbal medications; to detect suspicious fake medicines, the LSLS technique was extended to Raman spectroscopy by weighting adjustments developed from false-positive/false-negative ratios [38]. A total of 26 anabolic androgenic steroid (AAS) tablets were found, and the developed technology has been applied to products intended to promote stronger and larger muscles. A UHPLC-MS (ultra-high-performance liquid chromatography-tandem mass spectrometry) technology has been developed and approved for screening and quantifying AAS found in counterfeit medications and supplements [39]. Hyperspectral detection methods have been used to detect fake medicine tablets: by adding different amounts of calcium carbonate, medication powders were modified to mimic fake medications, and a hyperspectral sensor operating in the visual and near-infrared (350–1050 nm) range was utilized; the findings suggest a classification accuracy of greater than 90% [40]. In a study to identify fake drugs, counterfeit drugs were detected using image analysis and processing within the visible near-infrared (400-1000 nm) and short-wave infrared (1000-2500 nm) hyperspectral imaging ranges; original Pfizer Viagra product and imitation pills were compared, and Gray-Level Co-Occurrence Matrix (GLCM) analysis allowed the assessment of the homogeneity of the pill component distribution [41].
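In the spirit of the GLCM analysis cited above [41], the sketch below computes texture homogeneity with scikit-image on a synthetic 8-bit image band (not data from the cited study):

import numpy as np
from skimage.feature import graycomatrix, graycoprops

band = (np.random.rand(64, 64) * 255).astype(np.uint8)  # stand-in tablet image

glcm = graycomatrix(band, distances=[1], angles=[0],
                    levels=256, symmetric=True, normed=True)
print(graycoprops(glcm, "homogeneity"))  # values near 1 mean uniform texture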
A total of 53 formulations from 29 distinct medicinal product families were measured, producing a massive library of spectra. Principal component analysis (PCA), K-Nearest Neighbors (KNN), support vector machines (SVM), and discriminant analysis (DA) were among the chemometric methods used to analyze the data. Near-infrared spectroscopy is used for the identification of pharmaceutical tablets as a rapid investigative tool for counterfeit detection [42]. Near-infrared (NIR) spectroscopy was also used as a quick and easy analytical approach to distinguish fake pharmaceuticals: atorvastatin calcium sesquihydrate (AT) formulations were found in seven different types of brand-name and generic pills, the likelihood of classifying the AT tablet samples into the seven kinds was 100%, and the major excipient combinations determined the PCA and SIMCA (soft independent modeling of class analogy) classification of the AT tablets [43]. From the related work, we studied and analyzed hyperspectral imaging technology, which is used most often to identify fake medicines and active pharmaceutical ingredients (API), with an average spectral range of 450 nm to 1000 nm. We also noted the fake medicine tablets found in internet pharmacies and medicine stores: Paracetamol, Pfizer Viagra, Plavix, and Cialis.
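A minimal sketch of a chemometric pipeline of the kind cited above, PCA for dimensionality reduction followed by a KNN classifier, using synthetic spectra as stand-ins for measured reflectance data:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

n_samples, n_bands = 40, 200
X = np.random.rand(n_samples, n_bands)  # toy reflectance spectra
y = np.repeat([0, 1], n_samples // 2)   # 0 = genuine, 1 = fake

model = make_pipeline(PCA(n_components=10), KNeighborsClassifier(n_neighbors=3))
model.fit(X, y)
print(model.predict(X[:5]))  # predicted classes for the first five spectra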

III. DATA COLLECTION TECHNIQUES


The issue of medicine evaluation has grown significantly across the globe. The importance and difficulty of maintaining medical performance are increasing, which creates demand for faster, smarter ways to meet this demand. Any medicine, pharmaceutical ingredient, or material is characterized by its active pharmaceutical ingredient, and the analysis is all about testing for similar compounds and identifying contaminants. Preparations of pharmaceutical medicines can be analyzed using various tools or methods.
A. Spectroscopic Method
The word spectroscopy describes a wide range of techniques that make use of radiation to learn about the composition and characteristics of substances, and these techniques are used to tackle a wide range of research problems [44]. A variety of analytical issues can be resolved using the spectroscopic approach, which studies the chemical and physical characteristics of materials using radiation. There are different types of spectroscopy: infrared spectroscopy (IRS) and near-infrared spectroscopy (NIRS), mass spectrometry (MS), nuclear magnetic resonance (NMR) spectroscopy, Raman spectroscopy, Fourier transform infrared (FTIR) spectroscopy, etc. For example, NIRS has been widely used in the pharmaceutical sector to identify crude ingredients or Active Pharmaceutical Ingredients (APIs) or to determine relative humidity. NIRS collects details about both chemical and physical factors. The biologically active component of a drug product (tablet, capsule, cream) that produces the intended effect is referred to as the active pharmaceutical ingredient (API) [45]. Raman spectroscopy, which is simple, non-destructive, and information-rich, is an excellent method for the rapid categorization of drug substances. It provides a powerful tool for the analysis and determination of counterfeit medicines when coupled with chemometric techniques [46].
B. Chromatographic Method
Chromatography can be used to separate the components within a mixture. It is a process used in laboratories in
which the mixture is dissolved in a fluid known as the mobile phase, which transports it through a structure that
contains another substance known as the stationary phase. Chromatography can be used for both preparative
and analytical purposes. Preparative chromatography is a type of purification that is used to separate the
components of a mixture for later use. Analytical chromatography is often employed for smaller amounts of
material, although it may also be utilized for pharmaceutical analysis and formulation. For the identification of
fake medications and illicit pharmaceutical formulations, the following chromatography methods have been
used [47].
C. Thin-layer Chromatography
Thin-layer chromatography (TLC) is a method that has the benefits of being inexpensive and simple to use. The
basic idea behind TLC in counterfeit analysis is straightforward: by comparing the result of a reference solution
with that of a test solution applied to the same TLC silica plate, the existence or authenticity of the active
ingredient in a fake or copycat sample is verified.
D. Liquid Chromatography
Liquid chromatography (LC) is utilized in the evaluation and characterization of unlawful pharmaceutical
formulations and fake medications. LC is used in this sector for a variety of purposes in conjunction with
various detectors, both as a quantitative approach and as a method for target analysis (the presence of one or
more recognized substances).
E. Gas Chromatography
Gas chromatography (GC) has been employed to identify and find fake medications. It has been used to verify
the authenticity of essential oils and the presence of residual solvents, volatile components, and unidentified
chemicals or analogues (particularly in the quality assurance of herbal remedies) [48].
F. Hyperspectral Imaging Method
Hyperspectral imaging (HSI) is a developing imaging technique for use in medicine, particularly in disease
detection and image-guided treatment. HSI acquires a hypercube, a collection with three dimensions: two
spatial and one spectral. The remotely sensed spectra yield information on the physiology, morphology, and
composition of the material. Hyperspectral imaging is a blended technique that integrates spectroscopy with
image processing: a three-dimensional (3-D) dataset of spatial and spectral data is produced by gathering
spectral information at each pixel of a two-dimensional (2-D) detector. This set of information is referred to as a
hypercube. Figure 1 compares a hypercube with a red, green, and blue (RGB) picture. The two-dimensional
picture at every wavelength is a part of the three-dimensional database known as the hypercube; the reflectance
curve (spectral signature) of a pixel in the picture is shown in the lower left. An RGB color picture contains
merely three image bands at the red, green, and blue wavelengths; the RGB picture's pixel intensity values are
shown in the bottom right. The source of each spectrum in a sample may be identified with its spatial
information, allowing for a more thorough investigation of how light interacts with the disease, and HSI can
recognize several pathological disorders thanks to the spectral signature of each pixel in the images. In
comparison to multispectral imaging (like red, green, and blue color cameras), HSI covers a continuous region
of the spectrum with more spectral bands (more than a hundred) and greater spectral resolution [49]. Over the
past years, hyperspectral imaging methods have successfully demonstrated their value in a variety of
pharmacological research domains. The technique involves taking pictures of an object at several distinct
wavelengths.

Figure 1: Hypercube 3D dataset and RGB 2D images

A hyperspectral picture is essentially a datacube with two spatial dimensions and one dimension that reflects the
spectral wavelengths. Hyperspectral imaging is used in chemical imaging, which is the process of identifying
and quantifying the chemical components of a sample or product, as well as their dispersal or uniformity.
Whereas hyperspectral imaging may cover any band from the visible to the long-wave infrared, chemical
imaging generally uses the near-infrared (NIR) or short-wave infrared (SWIR) ranges, which carry information
about chemical bonds. The NIR and SWIR spectra of the organic compounds which make up the majority of
pharmaceuticals are distinctive, so chemical compositions inside a material can be identified and quantified
using spectral features. Blend monitoring, tracking tablet manufacture, and spotting fake goods are just a few of
the medical research and quality assurance applications that make use of chemical imaging.

IV. HYPERSPECTRAL DATA ANALYSIS


For hyperspectral data collection and analysis, the steps shown in Figure 2 can be followed.

Figure 2: Workflow steps for the proposed system

A. Data Pre-processing
Image registration and data normalization are the fundamental components of the hyperspectral imaging pre-
processing stage. The literature additionally makes use of the Gaussian function to smooth spectral signatures
and reduce the impact of noise [50]. Data normalization transforms the hyperspectral illumination data into
values that indicate the inherent characteristics of biomaterials, such as absorbance or reflectance.
Normalization is a superior way of preparing data for analysis, and it also minimizes systematic distortion and
image artifacts caused by uneven surface illumination or by redundant data in the sub-bands of the
hyperspectral imagery. Conversion to absorbance or reflectance [51,52] is the most popular pre-processing
technique used on hyperspectral data. The camera's dark-current effect is eliminated by covering the sensor
lens, taking a dark image, and subtracting the dark-image data from the hyperspectral data taken from the area
of interest. To create a white reference image, a white diffuse reflectance target is employed. The relative
reflectance (R) of the hyperspectral data is determined by equation (1). From the corrected data, a spatial region
of interest (ROI) is selected and further pre-processing is applied [53].

R = 100 × (Is − Id) / (Iw − Id) (1)

where Id is the dark image, Iw is the white reference image, and Is is the raw hyperspectral data.
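As an illustration of this calibration step, the following minimal Python sketch (ours, not from the paper; the
cube dimensions and ROI coordinates are hypothetical) applies equation (1) to a raw data cube and then crops a
spatial ROI:

import numpy as np

def relative_reflectance(i_s, i_d, i_w):
    # Equation (1): R = 100 * (Is - Id) / (Iw - Id), computed per pixel and band
    i_s, i_d, i_w = (a.astype(np.float64) for a in (i_s, i_d, i_w))
    return 100.0 * (i_s - i_d) / np.maximum(i_w - i_d, 1e-9)  # avoid divide-by-zero

rows, cols, bands = 64, 64, 120                           # hypothetical cube size
rng = np.random.default_rng(0)
i_d = rng.uniform(90, 110, (rows, cols, bands))           # dark image (lens covered)
i_w = i_d + rng.uniform(3000, 3500, (rows, cols, bands))  # white reference target
i_s = i_d + rng.uniform(500, 2500, (rows, cols, bands))   # raw scene data
r = relative_reflectance(i_s, i_d, i_w)
roi = r[16:48, 16:48, :]                                  # spatial region of interest
print(roi.shape, round(float(roi.mean()), 2))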

B. Feature Selection and Feature Extraction


Finding the most pertinent information within the original dataset and representing it in a lower-dimensional
space are the aims of feature selection and feature extraction. A greater number of spectral bands may make it
feasible to distinguish between more specific classes in hyperspectral databases; however, using too many
spectral bands may reduce classification accuracy because of the curse of dimensionality. Every pixel in
hyperspectral data may be expressed as an N-dimensional vector, where N is the total number of spectral bands.
This pixel-based format has been extensively employed in hyperspectral image processing applications [54].
C. Classification
Pixel-level and subpixel-level hyperspectral data categorization techniques are utilized within the medical field,
depending on the type of pixel information present. Supervised and unsupervised categorization can be done at
the pixel level. Parametric classifiers typically assume a sampling distribution for the data; however, this
assumption is frequently violated in applications [55].
Principal component analysis (PCA): PCA is the most commonly utilized dimensionality-reduction technique
for analyzing medical hyperspectral datasets. While retaining as much of the variation present in the
high-dimensional data as practicable, PCA minimizes redundant information in the bands of hyperspectral
imaging [56]. New remote sensing platforms have made extensive use of principal component analysis; a
historical backdrop of PCA and its mathematical justification are detailed in Gonzalez and Woods (1993). The
overall majority of the research focuses on methods for achieving efficient multispectral data categorization,
with little attention paid to PCA efficiency and its enhancement. Principal component analysis is based on the
observation that adjacent bands in hyperspectral imaging frequently contain data about the object that is nearly
identical and strongly correlated. The analysis transforms the original information to eliminate band
correlations, and the procedure results in the identification of the optimal linear combination of the original
bands that accounts for the variance of pixel values in an image [57]. Put simply, data dimensionality is reduced
using the mathematical transformation of Principal Components Analysis (PCA). As a result, the PCA approach
enables the recognition of patterns in data and their representation in a way that highlights both their similarities
and contrasts, finding patterns and compressing data without losing important information [58].
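As a hedged illustration (ours, not from the paper; the cube is synthetic), the sketch below unfolds a
hyperspectral cube into pixel spectra and applies scikit-learn's PCA, keeping enough components to explain
99% of the spectral variance:

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
cube = rng.normal(size=(64, 64, 120))        # hypothetical calibrated cube

# Unfold to (pixels, bands): each pixel is an N-dimensional spectral vector
pixels = cube.reshape(-1, cube.shape[-1])

pca = PCA(n_components=0.99)                 # keep 99% of the variance
scores = pca.fit_transform(pixels)
print("bands:", pixels.shape[1], "-> components:", scores.shape[1])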
K-Nearest Neighbour (KNN): One of the simplest machine learning algorithms, based on the supervised
approach, is K-Nearest Neighbour. The K-NN method stores all the existing data and classifies a new data point
according to the group of similar points it is closest to. As a result, fresh data may be quickly and accurately
categorized into appropriate categories using the K-NN algorithm. A popular non-parametric technique for
classification in pattern recognition, its core element is that a data point's categorization is decided by the
classifications of its closest K neighbours [59,60].
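A minimal sketch of this idea on synthetic tablet spectra (our construction; the two classes, band count, and k
value are illustrative assumptions, not the paper's data):

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
genuine = rng.normal(0.60, 0.05, (300, 120))   # 120-band reflectance spectra
fake = rng.normal(0.55, 0.05, (300, 120))      # slightly shifted counterfeits
X = np.vstack([genuine, fake])
y = np.array([0] * 300 + [1] * 300)            # 0 = genuine, 1 = counterfeit

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5)      # vote of the 5 nearest spectra
knn.fit(X_tr, y_tr)
print("test accuracy:", knn.score(X_te, y_te))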
Partial Least Squares Regression (PLSR): PLSR is frequently used for quantitative spectrum analysis as well as
for processing reflectance spectroscopic data. It decomposes both the predictor and response variables and
discovers latent components, and it is utilized to build predictive models when there are several highly collinear
predictors. Based on the spectrum, it may be utilized to create a linear prediction model for a sample property;
each spectrum is made up of readings at distinct frequencies. The responses are predicted linearly using the PLS
factors, which are generated as specific linear combinations of the spectral variables. Compared to the classical
multiple regression approach, it produces richer findings [61].
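The sketch below (ours; the spectra, band indices, and component count are synthetic assumptions) shows the
typical PLSR use of predicting a quantity such as API concentration from collinear spectral bands with
scikit-learn:

import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(2)
spectra = rng.normal(size=(200, 120))                  # 200 samples x 120 bands
api = 3.0 * spectra[:, 10] + 1.5 * spectra[:, 55] + rng.normal(0, 0.1, 200)

pls = PLSRegression(n_components=5)                    # 5 latent PLS factors
pls.fit(spectra, api)
pred = pls.predict(spectra[:5]).ravel()                # predicted API content
print(np.round(pred, 2), np.round(api[:5], 2))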
Support Vector Machine (SVM): The aim of SVM, which is founded on statistical learning theory, is to identify
an optimal hyperplane as a decision boundary in a high-dimensional space. In a two-class pattern-recognition
problem where the categories are discrete, the SVM chooses, from an unlimited number of linear decision
boundaries, the one that minimizes the classification error. The decision boundary chosen is therefore the one
that leaves the largest margin, where the margin is defined as the sum of the distances from the hyperplane to
the nearest instances of the two classes [62,63]. The benefits of support vector machines include efficiency in
high-dimensional settings; they remain useful when the number of dimensions exceeds the number of samples,
and they are memory efficient since only a subset of the training points (the support vectors) is used in the
decision function. Standard kernels are available, but custom kernels can also be defined. SVM has some
drawbacks, including the need to avoid over-fitting when selecting kernel functions and regularisation terms if
the number of features is much greater than the sample size [64].
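A hedged example (ours; the RBF kernel, scaler, and data are illustrative choices, not the paper's configuration)
of an SVM pipeline on the same kind of synthetic tablet spectra:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0.60, 0.05, (300, 120)),     # genuine spectra
               rng.normal(0.55, 0.05, (300, 120))])    # counterfeit spectra
y = np.array([0] * 300 + [1] * 300)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
# C trades margin width against training misclassification; the RBF kernel
# handles spectra that are not linearly separable
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))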

V. CHALLENGES
Currently, hyperspectral technologies are used in pharmaceuticals in the areas of active pharmaceutical
ingredient (API) detection, pharmaceutical authenticity verification, medication cluster analysis, and
coating-layer recognition. The majority of pharmaceutical identification work relies on analytical and
spectroscopic methods combined with basic statistical techniques such as principal component analysis (PCA)
and partial least squares regression (PLSR), and data evaluation is usually carried out in experimental settings
rather than in commercial pharmaceutical production processes and other field locations. As a result, there are
many potential applications for hyperspectral technology in pharmaceutical research, yet there are numerous
obstacles to overcome. The main problem at the moment is how to employ hyperspectral technologies for
medication identification in the commercial pharmacy setting with high accuracy and low cost. Continued
development of this kind of technique is being constrained by a shortage of pharmaceutical hyperspectral data
sources. Hence, to ensure the continued development of hyperspectral technologies in this sector, researchers
must not only extend identification techniques toward generally utilized computer vision techniques, but also
need to regularly contribute and incorporate public pharmacological detection datasets.

VI. CONCLUSION

Based on the study, the issue of fake medications has been widely acknowledged, although it is not yet properly
described and adequately handled. According to the literature, the most faked pharmaceutical medicines are
antimicrobials, antimalarials, erectile dysfunction drugs, herbal products, diabetes drugs, and weight-control
products. In this study, we have tried to provide an overview of several approaches for the identification of fake
pharmaceutical medicines, focusing particularly on hyperspectral imaging technology because it is
non-invasive. Hyperspectral imaging acquires three-dimensional picture cubes having two spatial dimensions
and one spectral dimension, and every hypercube pixel may be described by a spectral curve that can reach from
the ultraviolet to the infrared spectrum. The remotely sensed spectra acquired by hyperspectral imaging offer
analytical details on the material's physiology, structure, and composition. As it gives spectral data that can be
utilized to differentiate between authentic and fraudulent medications, hyperspectral sensing simplifies and thus
accelerates the identification of falsified medicines, typically in the 350 nm to 1000 nm visible to near-infrared
range.

REFERENCES
[1] Bottoni P. Fake pharmaceuticals: A review of current analytical approaches. Microchemical Journal. 2019 Sep 1.
[2] Johnston A. Substandard drugs: a potential crisis for public health. British journal of clinical pharmacology.2014.
[3] https://www.ema.europa.eu/en/humanregulatory/overview/public-health-threats/falsified-medicines-overview, accessed on 20/09/2022
[4] Wood D. Drug diversion. Australian prescriber. 2015 Oct;38(5):164.
[5] WHO global surveillance and monitoring system for substandard and falsified medical products
https://apps.who.int/iris/handle/10665/326708, accessed on 20/09/2022
[6] Glass BD. Counterfeit drugs and medical devices in developing countries. Research and Reports in Tropical Medicine.
2014
[7] Dégardin K, Roggo Y, Margot P. Understanding and fighting the medicine counterfeit market. Journal of
pharmaceutical and biomedical analysis. 2014 Jan 18.
[8] Newton PN, Green MD, White NJ. Counterfeit anti-infective drugs. The Lancet infectious diseases. 2006 Sep.
[9] Deisingh AK. Pharmaceutical counterfeiting. Analyst. 2005.
[10] https://timesofindia.indiatimes.com/city/ranchi/vigil-in-state-after-who-bans-4-kid-cough, accessed on 10/10/2022
[11] Björkman-Nyqvist M, Svensson J, Yanagizawa-Drott D. The market for (fake) antimalarial medicine: Evidence from
uganda. Abdul Latif Jameel Poverty Action Lab. 2013 Jun.
[12] Bottoni P. Fake pharmaceuticals: A review of current analytical approaches. Microchemical Journal. 2019 Sep.
[13] Ho HM, Xiong Z, Wong HY, Buanz A. The era of fake medicines: Investigating counterfeit medicinal products for
erectile dysfunction disguised as herbal supplements. International Journal of Pharmaceutics. 2022 Apr.
[14] Kong SG, Martin MET. Hyperspectral fluorescence imaging for mouse skin tumor detection. Etri Journal. 2006 Dec.
[15] Greenman RL, Panasyuk S, Wang X, Lyons TE, Dinh T, Longoria L, Giurini JM, Freeman J, Khaodhiar L, Veves A.
Early changes in the skin microcirculation and muscle metabolism of the diabetic foot. The Lancet. 2005 Nov.
[16] Kester RT, Bedard Real-time snapshot hyperspectral imaging endoscope. Journal of biomedical optics. 2011 May.
[17] Dicker DT, Lerner J, Van Belle P, Guerry, 4th D, Herlyn M, Elder DE, El-Deiry WS. Differentiation of normal skin and
melanoma using high resolution hyperspectral imaging. Cancer biology & therapy. 2006 Aug.
[18] Sorg BS, Moeller BJ, Donovan O, Cao Y, Dewhirst MW. Hyperspectral imaging of hemoglobin saturation in tumor
microvasculature and tumor hypoxia development. Journal of biomedical optics. 2005 Jul.
[19] Panasyuk SV, Yang S, Faller DV, Ngo D, Lew RA, Freeman JE, Rogers AE. Medical hyperspectral imaging to facilitate
residual tumor identification during surgery. Cancer biology & therapy. 2007 Mar.
[20] Akbari H, Kosugi Y, Kojima K, Tanaka N. Detection and analysis of the intestinal ischemia using visible and invisible
hyperspectral imaging. IEEE Transactions on Biomedical Engineering. 2010 May.
[21] Akbari H, Halig L, Schuster DM, Fei B, Osunkoya A, Master V, Nieh P, Chen G. Hyperspectral imaging and
quantitative analysis for prostate cancer detection. Journal of biomedical optics. 2012 Jul.
[22] Kumar S, Desmedt C, Larsimont D, Sotiriou C, Goormaghtigh E. Change in the microenvironment of breast cancer
studied by FTIR imaging. Analyst. 2013.
[23] Frosch T, Wyrwich E, Yan D, Domes C, Domes R, Popp J, Frosch T. Counterfeit and substandard test of the
antimalarial tablet Riamet® by means of Raman hyperspectral multicomponent analysis. Molecules. 2019 Sep.
[24] Kaneko H, Funatsu K. Classification of drug tablets using hyperspectral imaging and wavelength selection with a
GAWLS method modified for classification. International journal of pharmaceutics. 2015 Aug.
[25] Wilczyński S, Koprowski R, Marmion M, Duda P, Błońska-Fajfrowska B. The use of hyperspectral imaging in the
VNIR (400–1000 nm) and SWIR range (1000–2500 nm) for detecting counterfeit drugs with identical API composition.
Talanta. 2016 Nov.
[26] Shinde SR, Bhavsar K, Kimbahune S, Khandelwal S, Ghose A, Pal A. Detection of Counterfeit Medicines Using
Hyperspectral Sensing. In2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology
Society (EMBC) 2020 Jul.
[27] Dégardin K, Guillemain A, Guerreiro NV, Roggo Y. Near infrared spectroscopy for counterfeit detection using a large
database of pharmaceutical tablets. Journal of pharmaceutical and biomedical analysis. 2016 Sep.
[28] Feng Y, Lei D, Hu C. Rapid identification of illegal synthetic adulterants in herbal anti-diabetic medicines using near
infrared spectroscopy. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy. 2014 May.
[29] Lu F, Weng X, Chai Y, Yang Y, Yu Y, Duan G. A novel identification system for counterfeit drugs based on portable
Raman spectroscopy. Chemometrics and Intelligent Laboratory Systems. 2013 Aug.
[30] Sacré PY, Deconinck E, De Beer T, Chiap P, Crommen J, De Beer JO. Comparison and combination of spectroscopic
techniques for the detection of counterfeit medicines. Journal of pharmaceutical and biomedical analysis. 2010 Nov.
[31] Gupta RS, Deshmukh RR, Kshirsagar AV. Spectral Database of Pharmaceutical Common Excipients and Paracetamol
API Using ASD Field Spec 4 Spectrordiometer. Medico-Legal Update. 2021 Apr.
[32] Yabré M, Sakira AK, Bandé M, Goumbri BW, Ouattara SM, Fofana S, Somé TI. Detection of Falsified Antimalarial
Sulfadoxine-Pyrimethamine and Dihydroartemisinin-Piperaquine Drugs Using a Low-Cost Handheld Near-Infrared
Spectrometer. Journal of Analytical Methods in Chemistry. 2022 May.
[33] Lawson G, Ogwu J, Tanna S. Quantitative screening of the pharmaceutical ingredient for the rapid identification of
substandard and falsified medicines using reflectance infrared spectroscopy. PLoS One. 2018 Aug.

[34] Schlegel LB, Schubert-Zsilavecz M. Quantification of active ingredients in semi-solid pharmaceutical formulations by
near infrared spectroscopy. Journal of pharmaceutical and biomedical analysis. 2017 Aug.
[35] Rebiere H, Martin M, Ghyselinck C, Bonnet PA, Brenier C. Raman chemical imaging for spectroscopic screening and
direct quantification of falsified drugs. Journal of Pharmaceutical and Biomedical Analysis. 2018 Jan.
[36] Spálovská D, Pekárek T, Kuchař M, Setnička V. Comparison of genuine, generic and counterfeit Cialis tablets using
vibrational spectroscopy and statistical methods. Journal of Pharmaceutical and Biomedical Analysis. 2021 Nov.
[37] Ahmed N, Shaikh OA. Development and validation of rapid HPLC method for determination of Aripiprazole in bulk
drug and pharmaceutical formulation. Journal of Innovations in Pharmaceutical and Biological Sciences. 2017.
[38] Lu F, Weng X, Chai Y, Yang Y, Yu Y, Duan G. A novel identification system for counterfeit drugs based on portable
Raman spectroscopy. Chemometrics and Intelligent Laboratory Systems. 2013 Aug.
[39] Cho SH, Park HJ, Lee JH, Do JA, Heo S, Jo JH, Cho S. Determination of anabolic–androgenic steroid adulterants in
counterfeit drugs by UHPLC–MS/MS. Journal of Pharmaceutical and Biomedical Analysis. 2015 Jul.
[40] Shinde SR, Bhavsar K, Kimbahune S. Detection of Counterfeit Medicines Using Hyperspectral Sensing. In2020 42nd
Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC) 2020 Jul.
[41] Wilczyński S, Koprowski R. The use of hyperspectral imaging in the VNIR (400–1000 nm) and SWIR range (1000–
2500 nm) for detecting counterfeit drugs with identical API composition. Talanta. 2016 Nov.
[42] Dégardin K, Guillemain A, Guerreiro NV, Roggo Y. Near infrared spectroscopy for counterfeit detection using a large
database of pharmaceutical tablets. Journal of pharmaceutical and biomedical analysis. 2016 Sep.
[43] Hattori Y, Peerapattana J, Otsuka M. Rapid identification of oral solid dosage forms of counterfeit pharmaceuticals by
discrimination using near-infrared spectroscopy. Bio-Medical Materials and Engineering. 2018 Jan.
[44] Sacré PY, Deconinck E, Chiap P, Crommen J, De Beer JO. Comparison and combination of spectroscopic techniques
for the detection of counterfeit medicines. Journal of pharmaceutical and biomedical analysis. 2010 Nov.
[45] Dégardin K, Guillemain A, Guerreiro NV, Roggo Y. Near infrared spectroscopy for counterfeit detection using a large
database of pharmaceutical tablets. Journal of pharmaceutical and biomedical analysis. 2016 Sep.
[46] Neuberger S, Neusüß C. Determination of counterfeit medicines by Raman spectroscopy: systematic study based on a
large set of model tablets. Journal of pharmaceutical and biomedical analysis. 2015 Aug.
[47] Deconinck E, Sacré PY, De Beer JO. Chromatography in the detection and characterization of illegal pharmaceutical
preparations. Journal of chromatographic science. 2013 Sep.
[48] Phillips G. World Congress of Pharmacy and Pharmaceutical Sciences: anticounterfeiting measures. Pharm J. 2003.
[49] Lu G, Fei B. Medical hyperspectral imaging: a review. Journal of biomedical optics. 2014 Jan;19.
[50] Kong SG, Du Z, Martin M, Vo-Dinh T. Hyperspectral fluorescence image analysis for use in medical diagnostics.
InAdvanced Biomedical and Clinical Diagnostic Systems III 2005 Apr.
[51] Sowa MG, Payette JR, Hewko MD, Mantsch HH. Visible-near infrared multispectral imaging of the rat dorsal skin flap.
Journal of biomedical optics. 1999 Oct.
[52] Gillies R, Freeman JE, Cancio LC, Brand D, Hopmeier M, Mansfield JR. Systemic effects of shock and resuscitation
monitored by visible hyperspectral imaging. Diabetes technology & therapeutics. 2003 Nov.
[53] Farrugia J, Griffin S, Valdramidis VP, Camilleri K, Falzon O. Principal component analysis of hyperspectral data for
early detection of mould in cheeselets. Current Research in Food Science. 2021 Jan.
[54] Lu G, Fei B. Medical hyperspectral imaging: a review. Journal of biomedical optics. 2014 Jan.
[55] Lu D, Weng Q. A survey of image classification methods and techniques for improving classification performance.
International journal of Remote sensing. 2007 Mar.
[56] Rodarmel C, Shan J. Principal component analysis for hyperspectral image classification. Surveying and Land
Information Science. 2002 Jun.
[57] Huse VA, Chaudhary D, Gawali BW. Image Processing Approach for Fish Image Analysis–A.
[58] Sonawane MM, Gawali BW, Manza RR, Mendhekar S. Analysis of Skin disease techniques using Smart Phone and
Digital Camera Identification of Skin Disease. Research Journal of Science and Technology. 2022 Jul.
[59] Guo Y, Han S, Li Y, Zhang C, Bai Y. K-Nearest Neighbor combined with guided filter for hyperspectral image
classification. Procedia Computer Science. 2018 Jan.
[60] Gore RD, Nimbhore SS, Gawali BW. Understanding soil spectral signature through RS and GIS techniques.
[61] Vapnik V. The nature of statistical learning theory. Springer science & business media; 1999 Nov 19.
[62] Cristianini N, Shawe-Taylor J. An introduction to support vector machines and other kernel-based learning methods.
Cambridge university press; 2000 Mar 23.
[63] Available at https://scikit-learn.org/stable/modules/svm.html, accessed on 15-11-2022.
[64] Pu R. Hyperspectral remote sensing: fundamentals and practices. CRC Press; 2017 Aug 16.
[65] Vasefi F, MacKinnon N, Farkas DL. Hyperspectral and multispectral imaging in dermatology. In Imaging in
Dermatology 2016 Jan.
[66] Chen SY, Chen YC, Lien CT. A New Application of Hyperspectral Techniques in Drug Classification. In International
Conference on Intelligent Information Hiding and Multimedia Signal Processing 2018 Nov.
[67] Fake medicines: The worldwide industry putting your life in danger [Internet]. Srinath Perur M. 2019 [cited 1 July
2019]. CNN. www.cnn.com/2018/10/30/health/fakemedicinepartner/index.html

Grenze International Journal of Engineering and Technology, June Issue

Electrical Design of Off-Road Electric Vehicle


Lipika Nanda1, Nikita Lahon2, Arjyadhara Pradhan3, Babita Panda4, Chitralekha Jena5 and
Sourav Kumar Satpathy6
1-6
School of Electrical Engineering, KIIT Deemed to be University, Bhubaneswar, Odisha, India
Email: [email protected], [email protected], [email protected], [email protected],
[email protected], [email protected]

Abstract—The study, design, and construction of every electrical system required to create a
fully functional off-road EV are presented in this report. Three main electrical categories were
required, namely critical vehicle systems, e.g. engine control unit integration with engine
sensors and a motor-controlled shifting system, safety systems and the additional designs that
added value to the vehicle, e.g. an electronic driver interface, digitally controlled shifting etc.

Index Terms— Accumulator, BLDC motor, Simulink, Efficiency.

I. INTRODUCTION
The benefits of EVs include increased energy efficiency due to regenerative braking and newer packaging
options, along with a reduction in CO2 emissions. The goal of this work is to increase understanding of the
benefits of electric power trains and how they might be used [1]-[2]. The objective of the paper is to highlight
how different choices affect the vehicle dynamics, how the motors and gears might be configured, and how the
accumulator pack and motor controller might be designed to maximize the performance of the car.
Many experts are looking for alternative energy sources because of the air pollution caused by automobiles. A
promising approach is the electric car, in which an electric drive replaces the combustion engine. To determine
its characteristics, the first step in this research is to model the power flow inside the energy system of the
electric vehicle [3]-[4]. Because electric vehicles depend heavily on the finite amount of electrical energy
provided by the battery, power flow efficiency is a crucial topic and needs to be handled effectively. To ensure
that the amount of electrical energy meets the needs of the electric vehicle, the study tracks the power flow
calculation [5]-[8]. The electrical layout of an off-road electric vehicle is modelled using MATLAB/Simulink
software to obtain the best power flow response of the electric vehicle energy system.

II. SIMULATION DESIGN PARAMETERS


The simulation approach was based entirely on existing Simulink and MATLAB models, which were
configured according to the requirements. The main reason behind this approach was to avoid "reinventing the
wheel", which saved a lot of time and extra effort. Considerable exploratory research was needed to finalize a
step-by-step problem-solving approach for acquiring the required data. Using various graphs and
problem-solving strategies, this topic has been covered in detail, and the power train model of the formula and
off-road EV has been built.
Aiming to resolve the issues with the previous year's vehicle, more emphasis was given to simulating and
calculating new values for the power train department (speed, torque, power) and the electrical department
(battery calculations) to maximize the efficiency of the power train. The team also intended to create a correct
cooling system design simulation and analyze the dynamic response. To do so, several simulations were
conducted to arrive at the right calculations.

III. ELECTRICAL SPECIFICATION


A. Tractive System
The tractive system consists of the accumulator pack, accumulator management system, brake light, LED
indicators, fuse, kill switch, master switch, relays, motor control unit, DC-DC converter, HVD, and wires with
their connectors.
i) Shutdown Circuit
This circuit is responsible for shutting down the tractive system upon failure of any component. It consists of
many sensor-driven automatic as well as manual switches which serve this purpose.
ii) High Voltage System
The HV system consists of the following equipment:
 Li-FePo4(Formula)
 Li-NMC(E-Baja) Accumulator Pack
 BLDC Electric motor
 Kelly Motor Controller
iii) Accumulator Pack
Purpose: Supplies power to the entire tractive system.
Status: Research into improving the efficiency of the existing battery pack by modifying the accumulator
management and charging techniques. For research purposes, the design strategy of the University of
Wisconsin-Madison is being referred to.
Next steps: Modifying the existing pack by adding a cell temperature monitoring module and adding some extra
safety measures to the cell configuration.
iv) Accumulator Specification
 Accumulator voltage: 48 V
 Accumulator capacity: 110 Ah
 Li-ion cell nominal voltage: 3.2 V
 Charge cut-off voltage: 3.6 V
 Discharge cut-off voltage: 2 V
 The accumulator pack consists of a combination of 22 cells in parallel and 16 in series.
 The cells are contained within an enclosure constructed of fibre-reinforced plastic. The cells are
mounted in a non-permanent arrangement so that they are easy to service and access. The accumulator
container is equipped with at least 2 AIRs and 1 fuse. It also includes an AMS with embedded master
and slave modules to monitor cell voltages. The MCU also embeds a thyristor module to regulate the
voltage supply to the motor.
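As a quick sanity check on this configuration (our sketch, not from the paper; the 5 Ah per-cell capacity is
inferred from 110 Ah across 22 parallel strings and is an assumption), the pack-level figures can be computed in
Python as follows:

CELLS_SERIES = 16
CELLS_PARALLEL = 22
CELL_NOMINAL_V = 3.2                                   # V, from the specification
CELL_CAPACITY_AH = 110 / CELLS_PARALLEL                # assumed 5 Ah per cell

pack_nominal_v = CELLS_SERIES * CELL_NOMINAL_V         # 51.2 V (a 48 V-class pack)
pack_capacity_ah = CELLS_PARALLEL * CELL_CAPACITY_AH   # 110 Ah
pack_energy_kwh = pack_nominal_v * pack_capacity_ah / 1000
print(pack_nominal_v, pack_capacity_ah, round(pack_energy_kwh, 2))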
B. Safety Device
i) Accumulator Isolation Relay
Two KILOVAC EV200 AIRs are used in the system, connected to the battery positive and negative terminals
respectively. Each is a normally-open relay rated at 500 A with a 12 V coil and an operating temperature range
of -40 °C to 85 °C. The AIRs are connected in parallel with each other.
ii) Brake System Plausibility Device (BSPD)
The BSPD is non-programmable. It disconnects the AIRs when the throttle exceeds 10% while the brakes are
activated, opening the AIR circuit; the AIRs remain open until a power cycle.
iii) Brake Light
The brake light must be red and visible in sunlight, must be rectangular, triangular, or round, and must be
located between the wheel centre line and the shoulder of the driver.

D. Kelly DC to DC converter Specifications
 Nominal input voltage: DC 48V, 60V, 72V
 Output voltage: DC 13.5V under 70℃ or DC 12.2V above 70℃
 Operating voltage range: 40V-100V
 Output current: 30A
 Output power: 400W
 Operating Temperature Range: -20℃ to 90℃ (case temperature)
 Full-load efficiency: ≥93%
 Ripple coefficient: ≤1%
 Weight: 2.25lbs

Fig. 1 Block diagram of the overall system

The basic Simulink model is derived with respect to the above block diagram. In this configuration the main
blocks are the battery, drive cycle, controller, DC motor, vehicle body, transmission, and feedback. The drive
cycle represents the input of the driver who will ride the vehicle; here the FTP75 drive cycle is used to observe
how the vehicle reacts to it. The main purposes of designing the EV model are to understand the speed of the
vehicle, calculate the SOC discharge rate, and determine the distance travelled by the vehicle. The SOC block in
MATLAB Simulink calculates the SOC, and the distance is measured from the distance and time parameters.
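To make this signal flow concrete, here is a minimal Python sketch (ours, not the Simulink model; the stand-in
drive cycle, vehicle mass, drag, and efficiency numbers are illustrative assumptions) that follows a speed trace,
depletes SOC from the electrical power drawn, and integrates distance:

import numpy as np

DT = 1.0                                   # time step (s)
CAPACITY_AH, PACK_V = 80, 48               # battery capacity (Ah) and voltage (V)
MASS, C_RR, RHO_CDA = 300.0, 0.015, 0.75   # mass (kg), rolling res., drag term

t = np.arange(0, 1000, DT)
v = 8 + 4 * np.sin(t / 60)                 # stand-in drive cycle speed (m/s)

a = np.gradient(v, DT)                     # acceleration demanded by the cycle
force = MASS * a + C_RR * MASS * 9.81 + 0.5 * RHO_CDA * v**2
power = np.maximum(force * v, 0) / 0.85    # battery power (W), 85% drivetrain eff.

soc = 0.9 - np.cumsum(power * DT / 3600) / (CAPACITY_AH * PACK_V)
distance_km = np.sum(v) * DT / 1000        # integrated distance
print(round(distance_km, 2), "km, final SOC", round(float(soc[-1]), 2))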

IV. PROBLEM FORMULATION


For the battery charging current IB, the battery management system monitors the battery voltage, SOC, and
battery temperature. The state of charge is given by

S = Ch / Chnom

where S is the state of charge, Ch is the actual stored "Ah" capacity in the battery, and Chnom is the nominal
"Ah" capacity of the battery. The battery terminal voltage VB is determined by the battery SOC and its
impedance.

Fig. 2: Charging characteristic of lithium battery

 Battery size calculation: battery capacity = (load power × backup time)/voltage = (3000 × 8)/48 = 500 Ah
 Current required to run the load: I1 = load (watt)/voltage = 3000/50 = 60 A
 Battery charging current: I2 = battery size (Ah)/charging time (h) = 500/10 = 50 A
 Total current to run the load: I1 + I2 = 60 + 50 = 110 A
 Angular speed: ω = 2πN/60 = (2 × 3.14 × 2400)/60 = 251 rad/s
 Output power: Pout = T × ω = 12 × 251 ≈ 3000 W
 Input power: Pin = V × I = 48 × 110 = 5280 W
 Efficiency = Pout/Pin ≈ 57%
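The same arithmetic in a short Python sketch (ours; the 8 h backup, 10 h charge, and 50 V figures are read off
the calculation above rather than stated explicitly in the design):

import math

LOAD_W, SYSTEM_V = 3000, 48
BACKUP_H, CHARGE_H = 8, 10                 # assumed sizing durations
MOTOR_RPM, TORQUE_NM = 2400, 12

battery_ah = LOAD_W * BACKUP_H / SYSTEM_V  # 500 Ah
i_load = LOAD_W / 50                       # 60 A (50 V used in the paper)
i_charge = battery_ah / CHARGE_H           # 50 A
i_total = i_load + i_charge                # 110 A

omega = 2 * math.pi * MOTOR_RPM / 60       # 251 rad/s angular speed
p_out = TORQUE_NM * omega                  # ~3016 W mechanical output
p_in = SYSTEM_V * i_total                  # 5280 W electrical input
print(f"efficiency = {p_out / p_in:.0%}")  # ~57%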

V. SIMULINK RESULT ANALYSIS


Current flows from the battery through the DC power converter to drive the DC motor. The controller regulates
the voltage; it acts on the instructions given by the driver input and runs the motor at the required rpm.

Fig. 3: Simplified configuration of electric vehicle

All these components are available in the Simscape Electrical components library, so they need a connection to
the solver configuration block, which solves the simulation. The majority of the blocks are Simscape blocks;
converter blocks translate the physical signals from the Simscape blocks into Simulink signals.
The state of charge was initially set to 90%, and after running for 1000 s the final charge is around 55%. For the
given FTP75 drive cycle with an 80 Ah battery capacity, the vehicle is able to cover a distance of 11.14 km.

Fig. 4: Motor controller design

Fig. 5: Driver input

Fig. 6: State of charge vs time

Fig. 7: Distance travelled vs time

VI. CONCLUSION
The EV model is created in MATLAB Simulink and evaluated for the FTP75 drive cycle; it can be tested for
any drive cycle by selecting the corresponding drive cycle block. The vehicle can travel 11.14 km with an
80 Ah battery capacity and would travel a greater distance if the battery capacity were enhanced; however,
space and cost should be taken into account in a real vehicle.
Modelling the EV prototype system makes building the actual vehicle simpler. The battery life of an EV can be
estimated using this prototype, and the model can be used to assess the efficiency of an EV during startup or
constant-speed operation.

REFERENCES
[1] Kim, S., Chung, S., Shin, W., Lee, J., A study of predicting model of an electrical energy balance for a conventional
vehicle, Proceedings of the 17th World Congress of The International Federation of Automatic Control, Seoul, Korea,
July 6-11, 2008.
[2] Kunzli, N., Public-Health Impact of Outdoor and Traffic-Related Air Pollution: An European Assessment, The Lancet,
Vol. 356,Number 9232, September 2000, pp. 795-801.
[3] Larminie, J., Lowry, J., 2003, Electric Vehicle Technology Explained, John Wiley & Son.
[4] Lustenader, E. L., Guess, R. H., Richter, Turnbull, F. G., Development of a Hybrid Flywheel/Battery Drive System for
Electric Vehicle Applications, IEEE Transactions on Vehicular Technology, Vol. VT-26, May 1977, pp. 135-143.
[5] Patterson, P., Quantifying the Fuel Use and GHG Reduction Potential of EVs and HEVs, Available April 26, 2002,
http://www.ott.doe.gov/pdfs/evsl7.pdf
[6] J. Bauman and M. Kazerani, "A comparative study of Fuel-Cell– Battery, Fuel-Cell–Ultra capacitor, and Fuel-Cell–
Battery–Ultra capacitor Vehicles," in IEEE Transactions on Vehicular Technology, vol. 57, no. 2, pp. 760-769, March
2008.
[7] Drishya.K.Sasi and Jiji. K S, "A survey of bidirectional DC/DC converters for battery storage applications in distributed
generation systems," 2020 International conference on power, Instrumentation, Control and Computing (PICC), Trissur,
India, 2020.
[8] M. Kabalo, B. Blunier, D. Bouquain and A. Miraoui, "State of the art of DC-DC converters for fuel cell vehicles," 2010
IEEE Vehicle Power and Propulsion Conference, Lille, France, 2010, pp. 1-6. [6] S. Miao, F. Wang and X. Ma, "A New
Transformer-less Buck Boost converter with positive output voltage," in IEEE Transactions on Industrial Electronics,
vol. 63, no. 5, pp. 2965-2975, May 2016.

Grenze International Journal of Engineering and Technology, June Issue

PM based Eddy Current Braking for Automobile


Applications
Vinayak C Magadal1 and Mrityunjaya Kappali2
1-2
Dept. of Electrical & Electronics Engineering, KLE Technological Univeristy, Hubballi, India
Email: [email protected], [email protected]

Abstract—Of late, there is overwhelming growth of the e-mobility sector and hence, growing
interest in integrating automotive electronics. In this context, the present paper discusses Eddy
Current Braking (ECB) Systems. Electrification of the braking systems would aid in gaining
electronic control and total integration of electrical and electronic components in an
automobile. Electrical eddy-current braking is a prominent type of braking system used in the
textile, oil rig, and locomotive sectors. The potential of incorporating ECB with existing frictional
braking to form integrated braking systems for automotive braking applications is explored.
The ECB provides retardation, while the frictional components are required to halt the brake disc.
The development of an analytical model and a preliminary hardware model are carried out in the
present paper.

Index Terms— Eddy Current Braking, electrical braking systems, permanent magnets.

I. INTRODUCTION
Brakes have changed considerably from traditional wooden log blocks to present day Automatic braking
systems. For more than 100 years, braking systems have evolved to adapt with improving automotive
capabilities and road conditions. Throughout history, the mechanical friction type of braking mechanism has
been favored, with little to no advocation towards Electrical or electromagnetic braking. With the exponential
growth of e-mobility sector and interest of electrifying automotive components, the focus on electric braking is
gaining prominence.
In the present paper, Eddy Current Braking (ECB) Systems are looked into as a prospect for application in
automotive braking. Integrated ECB systems which incorporate existing frictional braking components and
employ Permanent Magnets (PMs) are explored to fulfil the needs of automotive braking.
A brief review of the evolution of braking systems and the current status of ECB is given in Section II. In Section
III, an ECB is designed using Permanent Magnets incorporating the brake disc. In Section IV, a simple analytical
model is developed to validate the expressions and plot the characteristic graphs of an ECB. In Section V, a
preliminary hardware model is designed to validate the fundamental working principles of an ECB employing
PMs. The results obtained and the inferences observed are elaborated in Section VI, followed by concluding
remarks in Section VII.

II. LITERATURE SURVEY


A. Evolution Of Braking Systems
The earliest brakes were wooden log blocks used on steel rimmed cart wheels. They were primarily used in
steam powered vehicles and horse drawn carriages. However, rubber tyres were introduced by the Michelin
brothers in late 1890s [1], which made wooden braking obsolete. The drum braking systems were a huge
upgrade from wooden braking in terms of usability and braking force. Mechanical drum braking consists of a
cam, brake linkage, brake shoes and a drum anchored to the vehicle's chassis. They were designed to be used in
early rubber wheeled automobiles. Louis Renault, a French automobile pioneer, developed this model around
1902 [2].
A 4-wheel brake system using hydraulics was first used in Model A Duesenberg car in 1921 [3]. In Hydraulic
braking, when a pedal is pressed, fluids are used to transfer the pressure to the brake shoe. Cylinders and tubes
are used to achieve the desired pressure.
The first vehicle to commercially adapt to disc braking was Chrysler Imperial in 1949 [4]. This method employs
calipers with brake pads, which pinches a rotor or disc, mounted on the wheel shaft.
Anti-lock Braking Systems (ABS) were first installed in 1966 in the FF sports sedan produced by Jensen of Great
Britain; ABS was originally meant to be used in airplanes [5]. It prevents vehicles' brakes from locking up. This technology
particularly senses when the brake is about to lock up and then responds by stimulating the valves to reduce
brake pressure.
A production car by Pierce-Arrow was the first to integrate power braking in 1928 [6]. In this method, the
intake-manifold vacuum is used to reduce the magnitude of effort required to apply brakes.
Automatic braking technology has been on the rise since 2006 with Mercedes leading and implementing the
technology in their higher-end models. Automatic Emergency Braking Systems (AEBS) employ short-range
radar and long-range radar that can bring a car to a stop even if the driver does not touch the brake pedal.
Electrical dynamic braking is the use of an electric traction motor as a generator for slowing down a rotor. It is
termed "rheostatic" if the generated electrical power is dissipated as heat in brake grid resistors, and
"regenerative" if the power is returned to the supply line. Other spectrum of Electrical braking is
Electromagnetic brakes, which slow or stop motion using electromagnetic force to apply mechanical resistance
(friction).
B. Eddy Current Braking Systems: Advancements and Current Status
Currently, eddy current brakes are used in various domains like oil rigs, the textile industry, trains, and selected
automotive applications. With respect to the automotive domain, they are employed in some commercial trucks and
buses and are referred to as “electric retarder” or “electromagnetic retarder”. It is an auxiliary braking system to
the basic friction brake system, aiding in scenarios like downhill driving and overheating. It is usually mounted
on the drive-shaft.
Several architectures [7-12] are developed to vary the magnetic field in case of permanent magnets and several
architectures are developed to control excitation in terms of electromagnets. However, architectures which vary
the magnetic field by mechanical movement of the magnet itself have not been studied extensively.
An auxiliary braking system which uses electromagnets (EM) is proposed in [13, 14]. PMs are used as a source
of magnetic field as opposed to EMs used in the previous configurations [15]. The magnetic circuits are designed
to vary magnetic fields. In recent years, permanent magnet technology has evolved, with magnets having
increased energy density and reduced costs. Extensive study using new permanent magnets to use in Eddy
current braking is to be carried out.
Stand-alone and retarder types of braking systems have predominantly been designed [16,17]. Integrated braking
configurations [18], i.e., eddy-current braking which incorporates the already existing components of a traditional
frictional brake, have not been efficiently realized; in particular, including magnets in the existing caliper of the
brake has not been explored extensively.
Simple analytical models [17, 18] and comparatively more complex Finite-Element Analysis (FEA) models
[19,20,21] are among the common simulation models used. The difference between the models in terms of
performance and the parameters affected has been explored; however, ways to reduce the gap between the models
by improving the simpler analytical model are yet to be attempted.
An experimental test set-up is employed to validate the FEA and numerical results in [22]. Models designed not
necessarily for implementation on the vehicle, but to verify the results and equations, are still to be developed; an
effective experimental model to fill the gap between the prototype model and the mathematical results is to be
realized.
Gaps in Research:
At present, ECB is not commercially implemented in passenger vehicles, and prominent research and
development of prototype models for integrated braking systems which make use of existing mechanical
frictional braking components has not been realized.

Keeping these in mind, the scope for further research work is inferred to be:
 Development of an end-to-end model which uses a simple analytical method, taking the new PMs into
consideration.
 Validation of this analytical model's equations and results by a simple working hardware model. The
architecture of mechanically moved magnets to control the braking can be explored, and the setup is to
be realized incorporating the already present components of traditional frictional braking.

III. DESIGN OF ECB


The Eddy Current Braking model employed consists of an aluminum brake disc rotating with velocity vm through
an inhomogeneous magnetic field created by a magnet placed facing the disc. Fig. 1 represents the arrangement of
the brake disc and the magnet.

Fig. 1. Basic ECB Model with brake disc and magnet

In the model, we can employ either Permanent Magnets (PMs) or Electromagnets (EMs) to generate the magnetic
field. In the case of PMs, the air-gap distance is varied to gain control over the braking force, whereas the
excitation current is varied in the case of EMs.
Permanent Magnets
During the rotation of the brake disc with velocity vm, the induced voltage ui in the disc can be calculated using
Faraday's law of induction,

ui = ∮ (v × B) · dl = 0.5 · Bx · l · vm (1)

where l is the width of the disc [23]. The resistances along the axes of the eddy current path, namely R1 and R2,
are calculated from the physical and material properties of the disc as the ratio of path length to conductivity
times cross-sectional area,

R1 = l1 / (κ · A1) (2)
R2 = l2 / (κ · A2) (3)

The total resistance of one eddy current path is

R = 2 · CR · (R1 + R2) (4)

The correction factor CR is used because the eddy currents are modelled as a single eddy current stream.
To take field repression into consideration, the resistances are calculated across the thickness of the brake disc.
Due to the presence of fictive currents in the eddy current paths, the effective thickness teff for the eddy current
paths is calculated as

teff = δ · (1 − e^(−tec/δ)) (5)
δ = √(2 / (ω · μ0 · μr · κ)) (6)

where κ is the conductivity of the eddy current material, δ is the penetration depth, and μr is the relative
permeability.
As eddy currents of only one stream are taken into consideration, an inductance L is assigned to the eddy current
path. The inductance results from the magnetic resistance present in the air gap and is calculated by

L = CL · μ0 · lm (7)

where CL is the inductance fitting parameter and lm is the length of the magnet. The respective reactance can be
calculated by

X = ω · L (8)

Due to the impedance Z = √(R² + X²) and the voltage induced in the disc, the eddy current can be estimated as

Iec = ui / Z (9)

The drag force produced in an eddy current brake can be approximated by

Fd = Nec · (Bx · l · Iec) (10)

where Nec is the number of eddy current paths. All parameters used for the calculations are presented in Table I,
including brake disc and magnet parameter values.
The effective flux density at an air gap distance lg is calculated from the on-axis field of a cylindrical magnet,

Beff = (Bx/2) · [(lm + lg)/√((d/2)² + (lm + lg)²) − lg/√((d/2)² + lg²)] (11)

where d is the disc diameter.
The aim of the design is to find the physical parameters of the ECB which ensure maximum braking torque. Fig.
2 shows the Torque-speed curve of an ECB with its
characteristic values.

TABLE I. PARAMETERS USED FOR PM DISC BRAKING


Parameter Value
Disc thickness (tec) 4 mm
Disc diameter (d) 20 mm
Disc material Aluminium
Length of magnet (l) 80mm
Magnetic Inductance of PM (Bx) 1.2 T
Number of eddy current paths (Nece) 8
Resistance Correction Factor (Cr) 2.02
Inductance Correction Factor (CL) 0.21
Permeability Correction Factor (µ co) 8.2
Fig. 2. Characteristic torque curve of an ECB

Torque axis and Speed axis correspond to Braking Torque of the ECB and Speed of the Brake disc respectively.
It can be observed that the Torque of an ECB increases up to its critical speed VCrit, where it reaches Maximum
torque Tmax. As the speed increases past Critical speed, the torque decreases.

IV. DESIGN OF ANALYTICAL MODEL


A simple analytical model to assess the equations of ECB using permanent magnets is developed. The analytical
model takes disc parameters, magnet parameters and other physical parameters like air gap as input. It calculates
the Braking force and torque generated, with respect to brake disc speed.
The results of the analytical derivations are implemented and computed using MATLAB®. The braking torque
is computed for a broad speed range. Fig. 3 shows the algorithm used for calculations.
After the calculations, the characteristics ‘Braking Force-Speed’ and ‘Torque-speed graph’ are plotted.
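As a companion to the MATLAB implementation, the following minimal Python sketch (ours; it follows the
equations as reconstructed above, and the path resistance value and field-frequency relation are assumptions
rather than values from the paper) evaluates the drag force over a range of disc speeds:

import numpy as np

B_X, L_MAG = 1.2, 0.080          # flux density (T) and magnet length (m), Table I
N_EC, C_R, C_L = 8, 2.02, 0.21   # paths and fitting parameters, Table I
MU_0 = 4e-7 * np.pi
R_PATH = 2 * C_R * 2e-4          # assumed total eddy current path resistance (ohm)

def braking_force(v_m):
    u_i = 0.5 * B_X * L_MAG * v_m          # induced voltage, eq. (1)
    omega = v_m / L_MAG                    # assumed field angular frequency
    x = omega * C_L * MU_0 * L_MAG         # reactance, eqs. (7)-(8)
    z2 = R_PATH**2 + x**2                  # squared impedance magnitude
    i_active = u_i * R_PATH / z2           # force-producing eddy current component
    return N_EC * B_X * L_MAG * i_active   # drag force, eq. (10)

for v in (5.0, 15.0, 30.0, 60.0):          # disc surface speeds (m/s)
    print(f"v = {v:5.1f} m/s -> F = {braking_force(v):8.1f} N")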
Fig. 4 shows the Braking Force vs Speed plot of the Eddy Current Braking system, in which the brake disc speed
vm is plotted against the produced braking force. For our application of an Eddy Current Braking system to
automobiles, the region of the characteristic plot where the braking force increases with speed is taken into
account.
Fig. 5 shows the Braking Torque vs Speed characteristic of the Eddy Current Braking system, in which the brake
disc speed is plotted against the corresponding braking torque. It can be observed that the braking torque increases
in magnitude with increasing speed.
Fig. 6 shows the braking force as a function of different air gaps lg. It can be seen that increasing the air gap
between the permanent magnet and the disc decreases the magnitude of the braking force.

Fig. 3. Algorithm implemented in MATLAB to solve for Braking parameters of ECB

Fig. 4. Braking Force – Speed curve of Eddy Current Brake Fig. 5. Braking Torque – Speed curve of Eddy Current Brake

Fig. 6. Braking force – speed curve for different air-gap distance lg

V. HARDWARE DESIGN AND PRELIMINARY REALIZATION
A preliminary hardware model is designed to test our hypothesis on the fundamental physics and working of an
Eddy Current Braking model employing Permanent Magnets. Neodymium circular disc magnets, which possess
high magnetic flux density, are chosen to decelerate an aluminium disc of dimensions similar to those of an
automotive brake disc. A suitable drive type to actuate the disc is selected and motor sizing is carried out. Fig. 7
shows the CAD model of the designed system.

TABLE II. HARDWARE MODEL SPECIFICATIONS


Disc
Material Aluminum
Dimensions 20 cm OD
Magnet
Type Neodymium disc magnets
Magnetic Induction (B) 1.2 T
Dimensions 50*20*20 mm
Motor
Motor type PMSM Motor
Torque 5-8Nm
Speed 1500 RPM
Type Variable speed type
Drive
Type Direct coupling with shaft
Shaft 8mm aluminum
Bearings Ball bearings
Mountings Weld mountings
Body Cast Iron

Fig. 7. Model of preliminary Braking test system using PM

A preliminary hardware working model is realized to test the theoretical assumptions and to validate the
fundamental understanding of the working of an ECB using PMs.
Due to financial, availability, and time restrictions, some variations have been made from the designed model.
A mild steel disc of 20mm outer diameter was procured instead, and an appropriately light shaft has been used
to enable rotation of the disc. The shaft is directly coupled with a PMSM motor to eliminate any drive losses.
Neodymium circular disc magnets are mounted on a retractable flat member which moves the magnet toward and
away from the disc. Ball bearings are installed at both ends of the shaft to ensure there is no load other than
the disc itself. The whole frame for the model has been constructed with iron square members to ensure strength
and rigidity.

Fig. 8 and Fig. 9 contains the images of realized ECB model using PMs.

Fig. 8 Realized working model of preliminary Braking system using PMs Fig. 9. View of PMs in vicinity of the Brake disc

VI. RESULTS AND DISCUSSIONS


Several test cases have been performed on the set-up to aid the understanding of basic principles related to ECB.
Disc speed has been measured in RPM.
In the first set of experiments, PM is brought in the vicinity of the disc without cutting off the motor power
supply. The readings are taken for two cases, 330 RPM and 860 RPM as initial speeds. In the next set of
experiments, motor power is cut off when the magnets are brought in. Same two cases of speed are measured.
Air gap lg of 2 mm has been fixed throughout the entirety of the experiments undertaken.
Fig. 10 shows case 1, when the disc is initially rotating freely at 330 RPM. When the permanent magnet is
brought in the vicinity of the disc, the speed of the brake disc drops to 280 RPM. Speed of the disc is reduced by
50 RPM and remains constant as the motor is powered throughout the time period.
Fig. 11 shows case 2, where the disc is rotating freely at 860 RPM. When the permanent magnets are brought
closer to the disc, the speed of the brake disc drops to 780 RPM and remains constant as motor is still live.

Fig. 10 Disc speed – Time graphs of initial speed 330 RPM Fig. 11 Disc speed – Time graphs of initial speed 860 RPM

In the next set of experiments, the motor power is cut off to simulate the accelerator pedal being released in an
automobile. Readings of the time taken by the disc to come to a halt, with and without the magnet in its vicinity,
are then taken. This corresponds to the braking effect and the deceleration rate of the brake disc.
Fig. 12 shows the graph of brake disc speed versus time, with and without the magnet in the disc's vicinity. It can
be observed that the disc naturally decelerates at the rate of 11 RPM/s, from 330 RPM to 0 RPM in 0.5 minutes.
When the permanent magnet is brought in to produce brake torque, the disc decelerates at an increased magnitude
of 18.33 RPM/s and takes only 0.3 minutes to stop.
In another case, when the initial speed of the disc is 860 RPM, the disc naturally decelerates at 8.9 RPM/s to
come to a halt in 1.60 minutes. When braking is applied by bringing in the permanent magnets, the disc
decelerates at an increased magnitude of 23.8 RPM/s and takes 0.6 minutes to stop. This is illustrated in Fig. 13.

Fig. 12 Deceleration graph of initial speed 330 RPM Fig. 13 Deceleration graph of initial speed 860 RPM

VII. CONCLUSION
With the exponential growth of the e-mobility sector and the increasing interest in automotive electronics, electric braking is gaining prominence. In this context, the present work was taken up to explore the realization of eddy current braking using permanent magnets for automobile braking applications.
The evolution of braking systems was studied, leading to an exploration of current-state eddy current braking mechanisms. An eddy current brake was designed with permanent magnets, incorporating the brake disc already present in the frictional braking system. A simple analytical model was developed to validate the equations and plot the characteristic graphs of an eddy current braking system. A preliminary hardware system was designed and realized to validate the understanding of the fundamental working of ECB.
Usage of ECBs in passenger vehicles has advantages such as electronic control over the braking mechanism and complete electrical-electronic integration of all the components in an automobile. The presented work thus aims to aid the transition from traditional frictional braking to integrated ECB systems.

ACKNOWLEDGMENT
The authors would like to thank the authorities of KLE Technological University, Hubli - 580031 (INDIA) for their support in carrying out this research work under the Research Experience for Under-graduates (REU) scheme.

REFERENCES
[1] Britannica, T. Editors of Encyclopedia (2015, February 5). Michelin. Encyclopedia Britannica.
[2] Post, W. (2014). Car braking systems. In Fundamentals of Automotive and Engine Technology (pp. 130-141). Springer
Vieweg, Wiesbaden.
[3] Volti, R. (2006). Cars and culture: The life story of a technology. JHU Press
[4] Feeney, B., Guran, A. S., Hinrichs, N., & Popp, K. (1998). A historical review on dry friction and stick-slip phenomena.
[5] Gowda, D., Kumar, P., Muralidhar, K., & BC, V. K. (2020, November). Dynamic analysis and control strategies of an
anti-lock braking system. In 2020 4th International Conference on Electronics, Communication and Aerospace
Technology (ICECA) (pp. 1677-1682). IEEE.
[6] Siegel, I. H. (1965). Independent Inventors: Six Moral Tales. Pat. Trademark & Copy. J. Res. & Ed., 9, 643.
[7] H. Sakamoto, “Design of permanent magnet type compact ECB retarder,” Society of Automotive Engineers #973228,
pp. 19-25, 1997.
[8] Shin Kobayashi, Yukitoshi Narumi (1999). Eddy current reduction braking system (U.S. Patent no. 6237728B1). U.S.
Patent and Trademark Office.
[9] Tohru Kuwahara (1999). Eddy current reduction apparatus (U.S. Patent no. 6,209,688). U.S. Patent and Trademark Office.
[10] Tohru Kuwahara (1999). Permanent magnet type eddy current braking system (U.S. Patent no. 5,944,149). U.S. Patent and Trademark Office.
[11] Jiangyin Intellectual Property Operation Co., Ltd (2012). Permanent magnet disc brake and braking method thereof
(China Patent no. 102979837B). China Patent and Trademark Office.
[12] Jiangsu University (2018). A kind of permanent magnetism double disk brake and its braking method (China Patent no.
108895096B). China Patent and Trademark Office.
[13] Luo, L., Zhai, Q., Li, W., Qian, C., & Liu, H. (2017). Research on an integrated electromagnetic auxiliary disc brake
device for motor vehicles. IEEJ Transactions on Electrical and Electronic Engineering, 12(3), 434-439.

[14] Ji, Y., Wang, J., Xu, Y., Liu, Z., Zhou, Y., & Li, J. (2016). Study on the thermal-magnetic coupling characteristics of
integrated eddy current retarder (No. 2016-01-0185). SAE Technical Paper.
[15] Gay, S. E. (2010). Contactless magnetic brake for automotive applications (Doctoral dissertation, Texas A & M
University).
[16] Ye, L., Li, D., Ma, Y., & Jiao, B. (2011). Design and performance of a water-cooled permanent magnet retarder for
heavy vehicles. IEEE Transactions on Energy Conversion, 26(3), 953-958.
[17] Simeu, E., & Georges, D. (1996). Modeling and control of an eddy current brake. Control Engineering Practice, 4(1),
19-26.
[18] Anwar, S. (2004). A parametric model of an eddy current electric machine for automotive braking applications. IEEE
transactions on control systems technology, 12(3), 422-427.
[19] Gay, S. E., & Ehsani, M. (2005, September). Analysis and experimental testing of a permanent magnet eddy-current brake. In 2005 IEEE Vehicle Power and Propulsion Conference (pp. 10-pp). IEEE.
[20] Gay, S. E., & Ehsani, M. (2006). Parametric analysis of eddy-current brake performance by 3-D finite-element analysis. IEEE Transactions on Magnetics, 42(2), 319-328.
[21] Gay, S. E., & Ehsani, M. (2005, September). Optimized design of an integrated eddy-current and friction brake for automotive applications. In 2005 IEEE Vehicle Power and Propulsion Conference (pp. 290-294). IEEE.
[22] Gay, S. E., & Ehsani, M. (2006). Parametric analysis of eddy-current brake performance by 3-D finite-element analysis.
IEEE Transactions on Magnetics, 42(2), 319-328.
[23] Holtmann, C., Rinderknecht, F., & Friedrich, H. E. (2015, March). Simplified model of eddy current brakes and its use
for optimization. In 2015 Tenth International Conference on Ecological Vehicles and Renewable Energies (EVER) (pp.
1-8). IEEE.


Skin Disease Identification using Online and Offline Data Prediction using CNN Classification

Minakshi M. Sonawane1, Ali Albkhrani2, Bharti W. Gawali3, Ramesh R. Manza4 and Sudhir Mendhekar5
1-4 Department of Computer Science and IT, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad (MS), India
Email: [email protected], [email protected], [email protected], [email protected]
5 Department of Dermatology, Neurology, and Leprosy, Government Medical College, Aurangabad (MS), India
Email: [email protected]

Abstract—In this study, a convolution neural network (CNN) is used to classify images for the detection of skin illness. We collected a database from the Government Medical Hospital in Aurangabad and used the online HAM10000 data. The offline skin disease dataset contains seven classes: Acne, Basal Cell Carcinoma, Psoriasis, Ringworm, Impetigo, Leprosy, and Eczema; the online database contains seven additional categories of skin disease. We used pre-processing techniques, such as resizing images and normalizing the dataset, to improve the model accuracy, and a deep learning algorithm for the classification of the skin diseases in the database. The classifier's overall accuracy is 78%, with an accuracy rate of 82.2%; Acne identification reaches 100% accuracy in training, while testing for it is 97.6% accurate.

Index Terms— CNN, Skin Disease Dataset, Deep Learning, Convolution Neural Network,
Image Processing.

I. INTRODUCTION
The most common ailments nowadays affecting people of all ages are skin diseases and lesions; young children and the elderly, however, have weaker immunity than others. Investigation of the patient's medical history and symptoms, skin scraping, dermoscopic examination, and skin biopsy are the common methods for skin disease diagnosis. But these methods of diagnosis are exhausting, time-consuming, and prone to error, and most of them require an expert dermatologist with superb vision. Medical imaging technologies are sophisticated and trustworthy in diagnosing skin diseases; however, they are often unavailable to people and healthcare institutions in low-resource contexts. In the healthcare sector, digital imaging cameras and sensing platforms have recently emerged as an alternative method of disease diagnosis. The most recent generation of cameras allows high-resolution digital image capture owing to high-definition optics, enormous storage capacity, portability, affordability, and connectivity [1].
Computer-aided diagnosis is important and required because it can analyze different types of skin diseases. The bulk of regularly used algorithms for forecasting skin diseases involve deep learning. This approach will
enable the discovery of informative elements in the observed data patterns, which will considerably improve the performance of even the simplest computational models [2]. The majority of chronic skin conditions, such as impetigo, ringworm, eczema, basal cell carcinoma, and psoriasis, are categorized as serious health problems that have an adverse impact on one's physical, mental, and financial well-being. This dataset contains 2000 dermatoscopic images. A digital camera can be employed in many different settings due to its portability, affordability, and connectivity [3]. The majority of earlier initiatives, which focused on skin disease images and aimed to identify specific body parts, depended on the availability of an online public dataset. The rest of this article is structured as follows: Section II reviews related work on the diagnosis of skin conditions, Sections III and IV describe the dataset and the proposed method, Section V presents the findings and discussion of the results, and Section VI concludes the paper.

II. LITERATURE REVIEW


Over the past ten years, several studies have been published on skin disease. Andre Esteva and Brett Kuprel investigated clinical screening and histological testing to categorize skin cancers at the dermatologist level using deep neural networks. They first showed how to classify a skin illness using a single CNN, and then classified images using two essential binary inputs, where the disease is represented by only a single pixel [4].
Convolutional Neural Networks (CNNs) and Artificial Neural Networks (ANNs) are commonly used techniques in radiological imaging and diagnosis. ANN-based models for early detection of breast cancer through image processing require enormous training and testing effort to reach good performance, which demands considerable computation. Furthermore, in an ANN, as image resolution increases, so does the number of trainable parameters, resulting in massive training effort. For the validation set, such a classifier achieved an accuracy of 89.90%. The K-Nearest Neighbours (KNN) classification model is widely used for forecasting and prediction; this model is likewise divided into training and testing phases, and its accuracy is quite high [5]. However, KNN models are not suitable for large-scale data because prediction can take a long time, and poor performance on high-dimensional datasets with inappropriate feature information may impact the model's accuracy and prediction performance.

III. DATASET DESCRIPTION

TABLE I. LOCAL DATABASE FOR SKIN DISEASE

Sr. No. | Name of Institute/Organization | Database Size | Name of Disease | Resolution | Year
1 | Govt. Hospital (GHATI), Aurangabad | 612 | Acne, Psoriasis, Eczema, Wart, Ringworm, Vitiligo, Skin Cold | 6016*3384 | 2020-2021
2 | https://www.kaggle.com/datasets | 10015 | Melanocytic Nevi, BCC, Benign Keratosis-lesions, Melanoma, Dermatofibroma, Vascular lesion, Akira | 478 x 600 | 2014

We have captured images at the Government Medical Hospital, Aurangabad, under the observation of the dermatology lab. We used a direct-current light source to avoid the flickering effect of alternating current (AC). Additionally, information is withheld unless physicians completely and honestly disclose the objectives behind the collection of their data. Both dermatologists and patients are aware that we are gathering this information only for the study. The data was gathered from every patient, and our study was authorized by the institute's ethics committee.
From Source 1 (Sony HD camera) we have data for seven different skin disease classes, namely Acne, Psoriasis, Eczema, Wart, Ringworm, Vitiligo, and Skin Cold, as mentioned in Table II; these images have a resolution of 6016*3384. Source 2 comprises pictures taken from Kaggle [6]. We have considered skin infection pictures together with the surrounding natural skin. It has been observed that the accuracy yielded by the proposed framework varies with the skin disease.

TABLE II. LOCAL DATABASE FOR SKIN DISEASE

Skin Disease | Train Data (80%) | Test Data (20%)
Acne | 180 | 25
Basal Cell Carcinoma (BCC) | 45 | 10
Psoriasis | 275 | 120
Ringworm | 50 | 21
Impetigo | 45 | 35
Leprosy | 122 | 79
Eczema | 118 | 23
Total | 825 | 313

Figure 1. Skin Disease Images Dataset (sample images of each disease class)

We have additionally gathered pictures from the web. More than 1012 images with a resolution of 678*600 have been downloaded, covering seven different infections: melanocytic nevi, BCC, benign keratosis lesion, melanoma, dermatofibroma, vascular lesion, and Akira. In the initial training stage, characteristic properties of typical image features are isolated and, based on these, a unique representation of each classification category is created for the seven distinct classes. The classes include skin inflammation, acne infection, leg infection, hand infection, subcutaneous dermatitis, lichen simplex, stasis dermatitis, and ulcers [7]. In the testing stage, these feature-space partitions are used to classify image features.

IV. METHODS AND TECHNIQUES

Figure 2. Proposed Methodology

A. Image Acquisition
Images were acquired with the use of two sources: dermatologist photographs from a digital camera (offline disease images) and the Kaggle website (online disease images).
B. Data Preparation
When we gathered our images, they were all of different dimensions; the dataset is diverse in height, width, and size. However, the deep neural classifier needs a uniform dataset for training and testing, so we set the images to 100 x 100 pixels before training our model. Our total image count after augmentation is 3000; we used 2400 images for training and 600 for testing [8].
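As a minimal sketch of this preparation step (folder names are hypothetical; the Pillow library is assumed), every image can be brought to the uniform 100 x 100 size as follows:

from pathlib import Path
from PIL import Image

src = Path("skin_dataset/raw")        # hypothetical input folder, one sub-folder per class
dst = Path("skin_dataset/resized")
dst.mkdir(parents=True, exist_ok=True)

for path in src.glob("**/*.jpg"):
    # Force a common colour space and the uniform 100 x 100 size the
    # classifier expects, then write the result to the output folder.
    img = Image.open(path).convert("RGB").resize((100, 100))
    img.save(dst / path.name)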
C. Convolutional Neural Network
We developed our algorithm based on 825 pre-trained images; we adjusted the final layer and used our dataset as the input. In a CNN, the features computed in successive layers are combined with each other. The simplified framework of the entire process is shown in Figure 2. CNNs are often used in real life for image recognition and natural language processing. Each pixel in the input images is transformed into an element in a matrix, and these matrices form the input layer.

Figure 3. Convolution Model

The image data generator performs augmentation of images in real time while the model is training.
Random transformations can be applied to each training image as it is passed to the model.
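A minimal sketch of such a real-time augmentation pipeline, assuming Keras' ImageDataGenerator (the directory layout and transform ranges are illustrative, not the authors' exact settings):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rescaling normalizes pixels to [0, 1]; the geometric transforms are applied
# randomly to each image as it is drawn, so no augmented copies are stored.
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=20,        # illustrative ranges, not the authors' settings
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
)

train_generator = train_datagen.flow_from_directory(
    "skin_dataset/train",     # hypothetical path: one sub-folder per class
    target_size=(100, 100),   # matches the 100 x 100 preprocessing above
    batch_size=78,            # the classifier's stated batch size
    class_mode="categorical",
)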
Data are distributed among the following layers:
1. Convolutional
2. Pooling
3. Dropout
4. Flatten
5. Dense
Our idea is to build a new CNN model; our model has 13 layers, including the following convolutional layers:
• The first layer has 32 3 × 3 filters and 'linear' as the activation function.
• The second layer has 64 3 × 3 filters and 'linear' as the activation function.
• The third layer has 128 3 × 3 filters and 'linear' as the activation function.
• The fourth layer has 256 3 × 3 filters and 'linear' as the activation function.
The five max-pooling layers each use a 2 × 2 pool size. The parameters of our two dropout layers are 0.3 and 0.4, respectively. A flatten layer also exists in our model. Finally, there are two dense layers, with 'linear' and 'softmax' activation functions; 'softmax' is used to determine the likelihood of the output classes [9].
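The following Keras sketch is an illustrative reconstruction of the architecture described above: the filter counts, 'linear' activations, 2 x 2 pooling, dropout rates of 0.3 and 0.4, flatten layer, and final 'softmax' follow the text, while the intermediate dense width, the three-channel 100 x 100 input, and the placement of the pooling/dropout layers are assumptions.

from tensorflow.keras import layers, models

NUM_CLASSES = 7  # Acne, BCC, Psoriasis, Ringworm, Impetigo, Leprosy, Eczema

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="linear", padding="same",
                  input_shape=(100, 100, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="linear", padding="same"),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.3),
    layers.Conv2D(128, (3, 3), activation="linear", padding="same"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(256, (3, 3), activation="linear", padding="same"),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.4),
    layers.Flatten(),
    layers.Dense(128, activation="linear"),          # assumed width
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.summary()  # 13 layers in total, matching the description

Note that the text mentions five max-pooling layers while only four convolutional blocks are listed; the sketch pairs one pooling layer with each listed block.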
D. Training Model
The Adam optimizer is used to compile our model. For training purposes we use 80% of our dataset, and the remaining 20% is used for testing. Our training dataset consists of 2400 images, so the training set consists of 1920 images and the validation set of 480 images. Our classifier's batch size is 78, and 50 epochs were used to train the model.
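A sketch of the corresponding compile-and-fit step, reusing the model and generator sketches above; the hold-out generator path and the categorical cross-entropy loss are assumptions consistent with a multiclass softmax output.

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Hold-out data is only rescaled, never augmented.
val_datagen = ImageDataGenerator(rescale=1.0 / 255)
val_generator = val_datagen.flow_from_directory(
    "skin_dataset/test",      # hypothetical path for the 20% hold-out
    target_size=(100, 100),
    batch_size=78,
    class_mode="categorical",
)

# Adam optimizer as stated; the loss choice is an assumption.
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

history = model.fit(train_generator,
                    validation_data=val_generator,
                    epochs=50)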

V. PERFORMANCE EVALUATION
Training accuracy is the model's accuracy on the data used to train it, while validation accuracy is the model's accuracy on a held-out sample of data from each class; the figures depict the training and validation accuracy. To evaluate the performance of the proposed model, we conduct a set of experiments comparing the proposed model to several state-of-the-art diagnosis models. A convolutional neural network (CNN) system is used for deep learning [10]. In image processing, such a method is commonly used to classify objects as well as to perform ROI detection and segmentation. A CNN contains a number of layers that detect various features of the input.

Filters are applied to each training image at different resolutions, and the output of each convolution layer is used as the input to the next layer. The CNN algorithm used for disease detection is layer-based. Our aim in using CNN for skin disease detection is to improve the recognition results compared to other classifiers.

Figure 4. Compare Skin Disease Ratio Figure 5. The Ratio of Disease Prediction

The gender-wise disease prediction ratio is shown in Figure 6. The graph shows that infection is more prominent in men overall, while infection on the lower extremities of the body is more visible in women. Some unknown regions also show infection, visible in both men and women, while the acral surface shows the fewest cases, and those mostly in men.

Figure 6. Gender Wise Disease Prediction Ratio

In the online dataset, skin diseases were found most frequently in patients around 45 years old and in those below 10 years old. We observed that the probability of disease is increased among both men and women facing cancerous skin disease; most of the lesions found were melanocytic, followed by dermatofibroma. The age group between 0-75 years is infected the most by melanocytic nevi; on the other hand, people aged 80-90 are affected more by benign keratosis-like lesions. All gender groups are affected the most by melanocytic nevi.
A. Performance Measures
Experiments have been carried out to validate the efficiency of the proposed model. The experiments were carried out on a Core i5, 2.3 GHz processor with 8 GB RAM using Python. Comparisons with other models were conducted to measure classification performance, evaluated in terms of classification sensitivity, specificity, and accuracy from the confusion matrix. The measures are computed using the equations given below, with the following conventions. In this study, the confusion matrix was used to calculate several metrics. This matrix forms four indices: true positive (TP), false positive (FP), false negative (FN), and true negative (TN). TP and TN correspond to the numbers of correctly predicted diseased and normal samples, whereas FP and FN correspond to the numbers of incorrectly predicted diseased and normal samples, respectively.
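With these conventions, the measures take their standard definitions:

Accuracy = (TP + TN) / (TP + TN + FP + FN)
Sensitivity (Recall) = TP / (TP + FN)
Specificity = TN / (TN + FP)
Precision = TP / (TP + FP)
F1-score = (2 × Precision × Recall) / (Precision + Recall)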
Precision, Recall, and F1-score have been determined from our test dataset, which contains 600 pictures. The average precision is 0.76, the average recall is 0.78, and the average F1-score is 0.78; we can therefore say that our classifier is quite acceptable. The classification table is given below. The total accuracy we obtained is 78%.

Figure 7. Confusion Matrix

TABLE IV. CNN MODEL-1, PERFORMANCE MEASURES FOR DETECTION OF OFFLINE SKIN DISEASE

Diagnosis class in the dataset | Precision | Recall | F-score
Acne | 100% | 100% | 100%
Basal Cell Carcinoma (BCC) | 96.01% | 98.6% | 98%
Psoriasis | 95% | 97.9% | 97.5%
Ringworm | 92.6% | 96% | 96%
Impetigo | 92.4% | 97.3% | 96%
Leprosy | 74.4% | 94.23% | 91%
Eczema | 68.6% | 85.5% | 80.2%

Figure 8. Disease Prediction

Table IV shows that the proposed system can produce high accuracy when applied to multiclass skin disease. The results show that the proposed system correctly identifies all Acne, BCC (basal cell carcinoma), and Psoriasis patients with a diagnosis. Finally, Acne skin diagnosis has the highest accuracy compared with the other diseases.

TABLE V. CNN MODEL-1, PERFORMANCE MEASURES FOR DETECTION OF ONLINE SKIN DISEASE

Diagnosis class in the dataset | Precision | Recall | F-score
Melanocytic Nevi | 69% | 69% | 69%
BCC | 50.01% | 50.6% | 51%
Benign Keratosis-lesions (BKL) | 55% | 57.4% | 57.6%
Melanoma (MEL) | 79.6% | 79% | 70%
Dermatofibroma (DF) | 88.4% | 88.3% | 88.3%
Vascular Lesions (VASC) | 74.4% | 78% | 69%
Akira | 68.6% | 85.5% | 80.2%

Figure 9. Performance Accuracy with Offline Dataset

The results in Table V show that the proposed system also achieves high performance in terms of accuracy, specificity, sensitivity, and F-score when using online skin disease data. They show that the proposed system nearly correctly identifies all patients with DF, MEL, and Akira disease. Finally, dermatofibroma skin disease can be diagnosed with accuracy close to that of other state-of-the-art skin diagnosis systems.

VI. CONCLUSION
In this study, we have briefly presented a CNN method for the diagnosis of dermatological disease. We gathered information from internet and original datasets with various photos, including those of skin conditions such as vitiligo, psoriasis, warts, eczema, and skin cold. The other source is online, from which 1012 pictures of seven distinct infections (melanocytic, BCC, benign keratosis lesion, melanoma, dermatofibroma, vascular lesion, and Akira) have been obtained. In terms of identifying skin lesions, we have seen some quite encouraging findings; we identified seven different skin conditions even in photos containing hair. Finally, we ran a statistical analysis to compare performance with the results of our objective investigation. The statistical tests conducted on both datasets' photos to assess performance led to the conclusion that our technique is statistically the best-performing algorithm. As a consequence, when each class was examined independently, our accuracy values in multiclass classification rose, and CNN classification produced findings with varied degrees of accuracy. The method demonstrates skin disease detection accuracy on the online database, with an accuracy rating of 82.2% and a total accuracy of 78%. Acne disease detection accuracy is 100 percent, and test results are 97.6 percent accurate.

REFERENCES
[1] Minakshi M. Sonawane, Bharti W. Gawali, Ramesh R. Manza, Sudhir Mendhekar, "Analysis of skin disease technique using smartphone and digital camera identification of skin disease", Research Article, vol. 4, issue 3, pp. 529-551, July-September 2022.
[2] Minakshi M. Sonawane, Ramdas D. Gore, Bharti W. Gawali, Ramesh R. Manza, Sudhir Mendhekar, "Computer Aided Diagnosis System for Skin Disease Identification", International Conference on IoT-based Control Networks and Intelligent Systems (ICICNIS 2020), pp. 656-667, 2020.
[3] R. J. Hay, N. E. Johns, H. C. Williams, I. W. Bolliger, R. P. Dellavalle, and D. J. Margolis, "The global burden of skin disease in 2010: An analysis of the prevalence and impact of skin conditions", Journal of Investigative Dermatology, vol. 134, no. 6, pp. 1527-1534, 2014.
[4] Sardana K, Mahajan S, Sarkar R, "Spectrum of skin diseases among Indian children", Pediatr Dermatol, 26(1), pp. 6-13, 2009.
[5] Palak Mehta, Bhumika Shah, "Review on Techniques and Steps of Computer Aided Skin Cancer Diagnosis", International Conference on Computational Modeling and Security (CMS 2016). https://doi.org/10.1016/j.procs.2016.05.28.
[6] Online available: https://www.medicalnewstoday.com/articles/154322.
[7] Abraham Getachew Kelbore, Philip Owiti, Anthony J. Reid, Efa Ambaw Bogino, Lantesil Wondewosen and Blen Kassahun Dessu, "Pattern of skin disease in children attending a dermatology clinic in a referral hospital in Wolaita Sodo, southern Ethiopia", BMC Dermatology, https://doi.org/10.1186/s12895-019-0085-5, pp. 3-8, 2019.
[8] Housman TS, Feldman SR, Williford PM, Fleischer AB Jr., Goldman ND, et al., "Skin cancer is among the most costly of all cancers to treat for the Medicare population", J Am Acad Dermatol 48: pp. 425-429, 2003.
[9] Z. Hu and C. S. Yu, "Functional research and development of skin barrier", Chinese Journal of Clinicians, vol. 7, no. 7, pp. 3101-3103, 2013.
[10] A.F. Jerant, J. T. Johnson, C. D. Sheridan, and T. J. Caffrey. "Early detection and treatment of skin cancer", Am Fam
Physician 62 (2): 357–68, 375–6,381–2. PMID 10929700, July 2000.


Automatic Rail Track Inspection System

D. Chandrasekhar1, A. Sumalatha2, K. Poojitha3, G. Neeraj Manikanta Sai4 and K. Kavya5
1, 3-5 IV B.Tech., Department of EIE, VR Siddhartha Engineering College, Vijayawada, India
Email: [email protected], [email protected], [email protected], [email protected]
2 Asst. Professor, Department of EIE, VR Siddhartha Engineering College, Vijayawada, India
Email: [email protected]

Abstract— The need for safety components in contemporary rail systems is increasing as a result of the need to prevent accidents. The presence of impediments on the tracks, whether stationary or mobile, is one of the major factors that might cause significant accidents. This study focuses on one of the most effective techniques for preventing railway accidents caused by track cracks, combined with obstacle identification. The primary goal of this project is to create a method for identifying railway track cracks and notifying nearby stations. The location of track defects is pinpointed by a GPS system. The project provides a solution in the form of an advanced tracking and management system for trains, to improve current railway track inspection and hence the transportation service. The approach is based on a potent blend of mobile computing, an infrared sensor, an Arduino Uno, and the Global System for Mobile Communication (GSM).

Index Terms— Arduino, GPS module, GSM module, IR sensor, Motor driver.

I. INTRODUCTION
Although rail cracks have been identified as the main cause of derailments in the past, there are currently no accessible low-cost automated testing methods available. Because of this problem's significant effects, a practical and affordable solution that can be used on a large scale is needed. Cracks are often assessed manually by experts called keymen. This is accomplished with a track checker, a miniature railroad car designed to inspect the integrity of railroad lines. The early track checkers, also known as track walkers, were simply individuals who walked along the tracks to ensure that they were undamaged (Fig. 1). A contemporary track checker, however, is a compact carriage with wheels that travels on the rails and may either be controlled by an engineer known as a "Track Checker" or be automated. One of the best nondestructive testing (NDT) methods now available for surface and near-surface fault detection is eddy current testing (ECT) (Fig. 2). The most effective ECT devices are powerful enough to measure thin materials precisely and easily identify minute variances.
In order to enhance the inspections, new detection techniques must be created. Even though the government has taken the required precautions to ensure safe journeys, accidents still occur owing to these cracks; thus this study focuses on an effective technique to prevent these mishaps. This study describes an approach for inspecting railway tracks and finding breaks and cracks in them. Tracking has advanced significantly with the development of communication technology, making it easier to monitor items like automobiles. The prior approaches have been replaced with renewed options, which rely on the integration of the Global Positioning System (GPS) with other technologies [1].

Figure 1. Manual inspection Figure 2. Eddy current testing

II. LITERATURE SURVEY


There are several conventional methods for inspecting railroad track cracks. Advanced technologies such as GSM and GPS are playing a major role in wireless inspection of tracks with simple components, viz. a GSM modem, an IR transmitter, and a receiver. An IoT-based railway track crack detection system using an IR sensor was developed, in which the crack information is sent to the host server [1]. A Bluetooth-based [2] system for detecting railway cracks using an IR sensor has also been proposed, but Bluetooth technology fails for long-range communication. [3-5] proposed crack monitoring techniques using GPS, GSM, and GPRS communications with low accuracy, which can be enhanced by using IR sensors. An ultrasonic-sound-based technology was proposed that can be built with a special embedded system, but the geolocation feature is not provided [6]. An image-processing-based technique was proposed to detect cracks; image processing is one of the advanced techniques used for vision-based detection, but even though it is an advanced method it has the disadvantage of limited information about the software requirements [7]. [8-9] proposed an IR transmitter and receiver based technique to monitor rail track cracks, but the location of the detected crack remains to be sent to the nearby station. Various track inspection techniques are being developed by many researchers; they are broadly classified as nondestructive testing, condition monitoring systems, track recording systems, etc. [10]. Among the sensors that come under nondestructive testing, such as cameras and accelerometers, IR sensors are comparatively economical, simple, and low-weight.
After reviewing the different techniques used for crack detection, using IR in particular, it is identified that identification of the location along with the detection of the crack is the need of the hour. The proposed model can detect a rail crack accurately using an IR sensor and sends the geolocation where the crack is detected using the GPS communication system. It also sends a message to the registered mobile number using the GSM module.

III. METHODOLOGY
The key components of the crack detection system are given in the block diagram in Fig. 3. The proposed system includes several technologies, including the previously discussed IR, GPS, and GSM.

Figure 3. Block diagram of crack detection system

The microcontroller is the heart of the system and coordinates crack detection and communication. The GSM module is driven by the microcontroller to transmit text messages containing the current coordinates of the crack, its latitude and longitude as received from the GPS receiver, to the appropriate authority.

IV. HARDWARE COMPONENTS


The hardware elements used in the suggested system are discussed in the following sections.

A. Arduino UNO
It uses an Arduino Uno (R3) board, as seen in Figure 4. With its wide range of shields, it is one of the most popular and most frequently utilized boards. The Italian word "Uno," which means "one," was chosen to mark the introduction of Arduino 1.0, and the Uno board has served as the reference Arduino board ever since. The Arduino UNO is one of the most popular development boards for electronics and robotics. The board has become extremely popular because of its flexibility in connecting numerous robotic components, including sensors, actuators, etc. One of the advantages of the UNO is a USB port through which it can be programmed using the Arduino software; it communicates with operating systems such as Windows or Mac OS without the need for drivers. The Arduino Uno's ATmega328 has a built-in boot loader that enables users to upload new code to it without the need for an external hardware programmer. The UNO R3 Starter Kit is a microcontroller board based on the ATmega328. It includes a 16 MHz crystal oscillator, a USB port, 6 analogue inputs, a reset button, a power jack, and 14 digital input/output pins. You only need to connect it to a computer with a USB cable and power it with an AC-to-DC adapter or battery to get things going; it comes with everything you need to support the microcontroller. The Uno R3 is compatible with all currently available shields and is adaptable to new shields that utilize the extra pins. Arduino shields are simple-to-use boards that may be used to complete a variety of activities quickly.

Figure 4. Arduino UNO R3 board

B. IR Sensor
IR LED and Photodiode are the primary electroniccomponents required to make an infrared detectorcircuit. One
kind of diode is an IR photodiode that can detect light, serve as a source of illumination, and on rare occasions
have a black or dark blue layer on the outside that makes it appear like an LED. As a source of infrared rays, IR
LEDs are the kind of LED that emit light in the infrared range. When no light is shining on it, it has an
extremely high resistance. This set of infrared transmitters andreceivers, also known as an IR TX- RX pair, may
be purchased for not very much money from any respectable electronics parts store Additional components
needed for this sensorinclude a transistor type 2222 and resistances of 330 and 10 ohms .
C. GPS Module
The three major connections required to use the module are presented in Figure 5; through these, the module receives the GPS satellite data via its antenna and transmits it to the controller, which uses it to tag the locations of problems with the rail tracks.

Figure 5. GPS module with Arduino board Figure 6. GSM module with Arduino

D. GSM Module
The GSM module requires a predetermined number, a GPRS shield, and a SIM card to send SMS messages with the GPS coordinate notification to a cell phone.
The general procedure used to deliver SMS is as follows:
 Place the SIM card in the slot provided atop the GSM shield, as indicated in Figure 6.
 Connect the module to the Arduino board.

 Connect external power to the shield using the USB-to-Arduino cable, and gently push the power button to check the power On/Off indication.
 Observe how frequently the network LED blinks; it begins blinking rapidly for a few seconds while searching for the network.
E. Motor Driver
It uses a DC motor driver of type L293D (Figure 7). Two DC motors are connected to and run by this dual H-bridge motor driver integrated circuit, in both clockwise and anticlockwise directions. It operates on the principle of an H-bridge, a type of circuit that allows voltage to flow in either direction, allowing the motor to revolve either clockwise or anticlockwise. Two H-bridge circuits that may independently spin two DC motors are present in a single L293D chip.

Figure 7. L293D pin diagram

Figure 8. System flow chart Figure 9. Proposed model

The input pins on the left (pin 2) and right (pins 15 and 10) control how the motors on the left and right sides rotate. The L293D is intended, in positive-supply applications, to drive inductive loads such as relays, solenoids, and DC and bipolar stepping motors, as well as other high-current/high-voltage loads. The motors are spun according to the inputs applied across the input pins as logic 0 or logic 1. Due to its compactness, the L293D is widely employed in robotic applications for controlling DC motors; its pin layout is shown in Figure 7.

V. SYSTEM SOFTWARE
The Arduino code for the proposed system is written in the C programming language. The program's strategy can be separated into four sections: the first covers the Arduino UNO's input/output addressing; the second moves the motor forward and acquires the sensor data; the third determines whether there is a fracture by analyzing the sensor reading and reports the crack's latitude and longitude as determined by the GPS module; and the fourth sends the coordinates to a predetermined cell phone using the GSM module. Programs built using the Arduino Software (IDE) are referred to as sketches; these sketches are created in a text editor and stored with the .ino file extension. Figure 8 illustrates the flow chart of the proposed system.
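As a language-neutral illustration of the four sections above (the firmware itself is a C sketch), the following Python pseudologic mirrors the same decision flow; all hardware helpers are stand-in stubs, and the phone number, coordinates, and sensor encoding are hypothetical.

import random

def motor_forward():    print("motor: forward")
def motor_stop():       print("motor: stop")
def read_ir():          return random.choice([0, 0, 0, 1])  # 1 simulates the IR beam crossing a gap
def read_gps():         return 16.5062, 80.6480             # fixed demo coordinates
def send_sms(to, text): print(f"SMS to {to}: {text}")

def inspection_loop():
    motor_forward()                     # section 2: keep the trolley moving
    while True:
        if read_ir():                   # section 3: crack detected?
            motor_stop()
            lat, lon = read_gps()       # section 3: latest GPS fix
            send_sms("+91XXXXXXXXXX",   # section 4: alert a predefined number
                     f"Crack detected at lat={lat}, lon={lon}")
            break

inspection_loop()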

VI. SYSTEM IMPLEMENTATION AND RESULTS


A pair of infrared sensor transmitter and receiver assemblies makes up the crack finder of the proposed fracture detecting system. Upon the detection of a crack, a signal is sent to the controller. A GPS receiver is enabled, whose purpose is to receive satellite signals to determine the present location of the device; it uses the most recent latitude and longitude information. A GSM modem is used to transmit the information that has been received. The first stage in execution is to construct the suggested model, which is given in Figure 9; the second is loading the code. The system is a conventional robot that resembles a basic toy. A single IR sensor that can detect cracks in the railway has been mounted on the front of the robot. The motherboard, which encloses the motor driver, GPS, and GSM modules as well as the Arduino UNO board, is principally responsible for the actual detection. The GPS and GSM modules send the information to a defined number along with the geolocation (Figure 10 and Figure 11).

Figure 10. Geolocation images Figure 11. Geolocation images

VII. CONCLUSION
The primary goal of this work is to use a robotic crack-tracing system to replace the manual approach of railroad fracture identification. The proposed automatic rail track inspection system, comprising simple parts including an Arduino board connected to IR sensors, motors, and GSM and GPS modules, was developed and tested on a prototype rail track. The system successfully detected the crack (which was intentionally introduced in the track) and sent an SMS with the geolocation of the crack to the predefined mobile number. Checking the SMS is far more convenient compared to a web-based alert system, as it gives the crack information in a single touch. The developed system is found to be reliable and economical, and it is useful in places where manual inspection is difficult and expensive, such as mountainous, forested, and remote areas. The system has been tested on both its software and hardware sides and is working well; hence the proposed system can be implemented on railroads. The presented system is not only economical but will also save a significant amount of time in contrast to existing techniques. Given that everything is automated, monitoring how well the tracks are maintained carries less chance of error than conventional detection methods. Consequently, it will greatly reduce the likelihood of train accidents.

REFERENCES
[1] Ritika Mukhija, Mr. Rajakumar P, "Railway Management System using IR sensors and Internet of Things Technology", YMER, ISSN: 0044-0477, Volume 21, Issue 5 (May) 2022, pp. 1345-1352.

[2] Richard J. Greene, John R. Yates and Eann A. Patterson, "Crack detection in rail using infrared methods", Opt. Eng. 46, 051013, May 2007.
[3] B. R. Krishna, D. Seshendra, G. Raja, T. Sudharshan, and K. Srikanth, "Railway track fault detection system by using IR sensors and Bluetooth technology," Asian Journal of Applied Science and Technology (AJAST), pp. 82-84, 2017.
[4] Prof. Z. V. Thorat and Nikhil Ranjane, "Automatic Railway Track Crack Detection System Using GSM GPS", in IEEE International Conference, 2020.
[5] Mr. Anand S. Muley, Mr. Siddhant B. Patil, Prof. A. H. Shelar, "Railway Track Crack Detection based on GSM Technique", International Research Journal of Engineering and Technology (IRJET), Volume 04, Issue 01, pp. 1252-1254, Jan 2017.
[6] Jubayer Jalil and M. B. I. Reaz, "Accident detection and reporting system using GPS, GPRS and GSM technology", in IRJET, October 2019.
[7] K. Divya and R. Anjugam, "Railway Safety Monitoring System using Ultrasonic and IR Sensor", in IJSRD - International Journal for Scientific Research & Development, 2018.
[8] Rijoy Paul, Nima Varghese, Unni Menon, Shyam Krishna, "Railway Track Crack Detection", International Journal of Advanced Research and Development, Volume 3, Issue 3 (2018).
[9] Rennu George, Divya Jose, Gokul T. G., Keerthana Sunil, Varun A. G., "Automatic Broken Track Detection Using IR Transmitter and Receiver", International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering, Vol. 4, Issue 4, April 2015, pp. 2301-2304.
[10] Amir Falamarzi, Sara Moridpour, "A Review on Existing Sensors and Devices for Inspecting Railway Infrastructure", Jurnal Kejuruteraan 31(1) 2019: 1-10.
[11] Pravinram, Prasath, Nanda Gopal, Haribabu, "Railway Track Crack Detection Robot using IR and GSM", International Journal for Scientific Research and Development, Vol. 4, Issue 02, 2016, pp. 652-657.
[12] Nanda Kishore, Ruhejadhav, Aishwarya, Pallavi, "Railway Track Crack Detection Using GPS and GSM", International Journal of Innovative Science and Research Technology (IJISRT), Vol 5, Issue 4, pp. 386-389, April 2020.


Automatic Industrial Gas Leakage Detection and Control System

Teja Sai Ethesh1, Dr. N. Swathi2, M. Sai Vamsi Reddy3, N. Anusha4 and A. Yashwanth Krishna5
1, 3-5 IV B.Tech., Department of EIE, VR Siddhartha Engineering College, Vijayawada, India
Email: [email protected], [email protected], [email protected], [email protected]
2 Asst. Professor, Department of EIE, VR Siddhartha Engineering College, Vijayawada, India
Email: [email protected]

Abstract—This project's primary goal is to develop a successfully functioning prototype that can detect the existence of a gas leakage, in this case liquefied petroleum gas (LPG), and control the gas to prevent leaks of great amounts. The prototype should perform immediate data transfer and warning in addition to detection and control. This can be done by implementing an alerting system, such as an alarm via a buzzer, an application via a Wi-Fi module, and SMS via a GSM module. Here, the alerting system is in place to inform nearby residents and industry workers about the leakage. Once a leak has been discovered, the first step is to pinpoint its precise location, which can be done in two different ways. The first way is to check the pressure using a barometric pressure sensor in each region: if there is a difference between the pressure in one region and the previous one, a gas leak has occurred, and the higher the pressure drop, the bigger the leak. The second approach involves using an appropriate gas sensor, in this case the MQ-6, to directly detect the concentration of the leaking gas. This is accomplished by comparing the error between the actual value and the predetermined value of the sensors. Here, we combine both approaches to increase redundancy. When the leak condition is met, an alert is triggered, and the valves at the location of the gas leak are controlled using an Arduino Mega controller. Further repair action is then performed on the damaged pipeline, which results in a shorter response time for damage restoration. In conclusion, this project has offered students the chance to apply theory to solving issues relevant to the engineering scope of work.

Index Terms— Arduino Mega Rev 3, GSM module, ESP8266 Wi-Fi module, Sensor,
HX710B Atmospheric Pressure Sensor Module.

I. INTRODUCTION
Today's industrial accidents and worker fatalities are primarily caused by dangerous gas leakages. For a clearer understanding, consider the recent "VIZAG GAS LEAK" event that happened at LG Polymers on May 7, 2020. This incident involved uncontrolled styrene vapours coming from the boiler tanks. In the immediate aftermath of the tragedy, 12 individuals lost their lives, 585 people required hospital care, and cattle and vegetation were destroyed. This is one of the recent incidents to occur in our nation; in this decade, there have been many other such accidents.
Gas usage causes serious issues in both domestic and commercial settings. The gas that is used could be pricey

or hazardous. Therefore, if the gas that leaks is toxic, it may have negative effects on the workers' and the surrounding community's health, and if the gas is expensive, the industrial management suffers a loss. To avoid these outcomes, in most industries one of the key parts of any safety plan for reducing risks to personnel and plant is the use of early-warning and controlling devices built around gas detectors. These can assist in giving additional time to take corrective or preventative action, and they can also be utilised as a component of an industrial plant's comprehensive, integrated monitoring and safety system. Gas leakage accidents, which are extremely significant and deadly, accompany the oil and gas industry's rapid expansion. Since gas leaks also result in large financial losses, solutions must be found at least to reduce the effects of these accidents. The difficulty lies in creating a prototype of a gadget that can not only detect leaks but also react to them automatically when they happen.

II. LITERATURE SURVEY


A. Previous Works and their Limitations
In the past, many authors came up with ideas to prevent and detect gas leakage. One such idea is presented in "IoT Based Industrial Plant Safety Gas Leakage Detection System" by R. K. Kodali, R. N. V. Greeshma, K. P. Nimmanapalli and Y. K. Y. Borra [1]. Detection can also be done through robotics: Meer Shadman Saeed and Nusrat Alim worked on the "Design and Implementation of a Dual Mode Autonomous Gas Leakage Detecting Robot", in which they implemented a robot that detects gas leakage in small tunnels, vents, and pipelines where humans can hardly gain access [12]. Most of these studies deal only with detection and not with control action, i.e., they cannot prevent the leakage of gas. Later, some authors proposed ideas that deal with detection as well as control. Here, the control action is primarily focused on shutting off the system if there is any leakage, which may be accomplished by installing a control valve at the inlet or by turning off the system's power [2,4]. Due to this, the gas accumulated in the pipes is released into the atmosphere, creating damage and loss to the industry; moreover, the whole process in the industry is shut down until the damage is rectified. Some authors came up with the further idea of an additional control system, a neutralizing-gas system, that decreases the effect of a toxic gas by releasing a neutralizing gas in a fixed proportion, based on the amount of toxic gas detected, to neutralize the leaked gas. Here also the gas is wasted until the damage is repaired, and the cost of installing the extra or additional system (the neutralizing-gas system) is not beneficial for small and medium scale industries.
B. Proposed Model to Overcome the Limitations
To overcome these limitations, this study proposes an IoT-based automatic industrial gas leakage monitoring and control system. It can detect gas leakage at a variety of remote locations with continuous monitoring of the leakage, and is able to control gas leakage via control valves placed at the necessary positions in the pipelines. It generates real-time leakage information that is accessible through the internet and SMS by combining a gas sensor, pressure sensors, and a microcontroller with a Wi-Fi module and a GSM module. This idea of leak prevention can also be implemented on LPG gas cylinders to prevent gas leaks in households [11].

III. PROPOSED SYSTEM


The primary functions of the proposed system are to detect gas leaks, monitor leakage data, and control toxic gas leaks with ON/OFF switching of control valves. The proposed system uses an Arduino Mega microcontroller, which is connected to hardware components such as a gas sensor (MQ-6), pressure sensors (barometric pressure sensors), Wi-Fi and GSM modules, and solenoid valves.
A. Methodology
Figure 1 shows the block diagram of the proposed model, which presents the prototype's overall system information and is divided into three main sections: (i) Inputs, (ii) Alert system, and (iii) Control system. The Input section is composed of the sensor data, pressure and gas concentration, which are given to the microcontroller. The Alert system consists of modules such as the Wi-Fi module, the GSM module, and an LCD. Finally, the Control section contains an IC, the L293D, which delivers the desired control signals to the desired control valves.
The system's fundamental concept is to employ solenoid valves to partition the pipelines into segments or compartments. A pressure sensor and a gas sensor are placed at every segment; having two sensors is intended to boost redundancy, since if any one sensor were damaged, the others would continue to function. The various components of our suggested system are depicted in Figure 2.
Figure 1. Block Diagram of the Proposed model
As shown in Figure 2, we have taken three segments for our prototype. The primary objective of the system is to determine the location of the gas leak, i.e., the precise segment where the gas leak occurred, by utilizing the gas sensors (G1, G2, G3) and pressure sensors (P1, P2, P3). Opening any one of the hand valves (HV1, HV2, or HV3) in this prototype causes a leak at a different position. The attention shifts to the controlling action once the damaged component has been located. Closing the control valves (solenoid valves or air valves) at the beginning and end of the segment where the gas leak is occurring is the primary objective of the control action.
For a better understanding: if a gas leak occurs at segment 2, i.e., by the opening of HV2, then there will be a change in pressure from segment 2 onwards, i.e., at P2 and P3 respectively. We can also observe an exponential rise in the readings of the gas concentration, in parts per million (ppm), at the gas sensor G2 only. The controller determines when to close the valves in relation to the intended set point by using these measurements of pressure change and gas concentration. Since we were using the MQ-6 sensor, whose range is 200 to 10,000 ppm, we set the set point at 300 ppm. Whenever the concentration exceeds the predetermined level, or whenever there is a significant shift in the pressure values from the previous segment, i.e., if the gas leak satisfies the aforementioned requirements, a signal from the Arduino controller is delivered to the solenoid valves S2 and S3, which are located at the inlet and outlet of segment 2 respectively. Additionally, a trigger signal is sent to the alert system, which comprises a GSM module that sends an SMS to factory workers informing them of the location of the gas leak and a Wi-Fi module that transmits continuous monitoring data from the sensors to the ThingView application or the ThingSpeak website over the internet [3]. In addition to these, we also have the most widely used equipment, such as an alarm (buzzer) to alert the industry that there is a leak and an exhaust fan to expel the leaked gas into the atmosphere. Due to this segmentation and closing of the control valves, only a minor amount of gas is released into the atmosphere, where it does not significantly impact the health of the people in and around the industry premises. Therefore, since the position of the leakage in a specific section is known, less time is needed for damage restoration.
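A minimal Python sketch of this per-segment decision rule is given below; the pressure-drop threshold, function names, and printout are illustrative assumptions, while the 300 ppm set point and the example readings follow the text and Table I.

GAS_SETPOINT_PPM = 300      # set point chosen within the MQ-6's 200-10,000 ppm range
PRESSURE_DROP_PA = 50       # assumed threshold for a "significant" pressure drop

def leak_detected(gas_ppm, pressure_now, pressure_prev):
    # Trigger on either condition, mirroring the redundancy of the two sensors.
    return (gas_ppm >= GAS_SETPOINT_PPM
            or (pressure_prev - pressure_now) >= PRESSURE_DROP_PA)

def close_segment(segment):
    # Stand-in for the Arduino driving the L293D to close the solenoid valves
    # at the segment's inlet and outlet (S2 and S3 for segment 2) and firing
    # the buzzer, GSM, and Wi-Fi alerts.
    print(f"Segment {segment}: closing inlet/outlet valves, alert triggered")

# Example with the segment 2 readings later reported in Table I
# (pressure 315 Pa -> 215 Pa, gas concentration 431 ppm):
if leak_detected(gas_ppm=431, pressure_now=215, pressure_prev=315):
    close_segment(2)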
B. Hardware and software
The hardware setup is shown in Figure 3. A brief description of the hardware used is given below.
Arduino Mega 2560 Rev 3
The Arduino Mega 2560 is a microcontroller board based on the ATmega2560. It includes 16 analogue inputs, 4 hardware serial ports (UARTs), a 16 MHz crystal oscillator, 54 digital input/output pins (15 of which can be utilized as PWM outputs), a USB connector, a power jack, an ICSP header, and a reset button. It is used to obtain the sensor data, and it produces the activation signals that operate the solenoid valves and sends the alert signals.

Figure 2. Top view of the proposed prototype

Figure 3. The hardware setup

MQ-6 Sensor
The MQ6 Gas Sensor is a Metal Oxide Semiconductor (MOS) type Gas Sensor that is primarily used to identify
the presence of Butane and LPG in the air. The MQ 6 sensor has a range of 200 to 10,000 ppm [5].
GSM Module
SMS notifications are sent using a SIM900 GSM module when gas is detected. GSM is intended to be a tool for
exchanging information [6].
Piezo buzzer
The buzzer's primary function is to transform the input signal, which is current (less than 15 mA), into sound.
HX710B Atmospheric Pressure Sensor Module
The HX710B atmospheric pressure sensor module, with an altitude resolution of 10 cm, is used. This barometric pressure sensor is best suited to altimeters and variometers. The sensor module can sense 0-40 kPa air pressure and can be used to monitor water level and other air pressures [7].
LCD
The LCD is employed to display the message indicating "gas detected at zone", which is coded in the program to indicate the danger.
Wi-Fi Module
A self-contained SOC with an integrated TCP/IP protocol stack, the ESP8266 Wi-Fi module allows any microcontroller to access a Wi-Fi network. The ESP8266 is capable of offloading all Wi-Fi networking tasks from another application processor or of hosting an application itself [8].
Exhaust fan
Exhaust fans are used for expelling toxic gases in industries.
Solenoid Valves
It is an electromechanical valve that is often used to control the flow of liquid or gas. There are many different
kinds of solenoid valves, but the two most common varieties are direct acting and pilot driven [9].

IC L293D
The L293D IC is a typical motor driver IC which allows a DC motor to be driven in either direction. In this prototype, it is used for opening and closing the solenoid valves [10].
C. Proposed system design flow
The flowchart in Fig. 4 depicts the entire process flow of the suggested system design. The steps of the process flow are briefly discussed below:
 Step 1: First, connect all the modules and other components to the Arduino Mega microcontroller.
 Step 2: Get the sensor data from the respective sensors placed at their respective locations.
 Step 3: Check whether the predetermined condition is satisfied, i.e., whether the gas concentration from the MQ-6 gas sensor is greater than or equal to 300 ppm, or whether there is an exponential change in the pressure values from the pressure sensors.
 Step 4: If the condition is not satisfied, go back to the second step.
 Step 5: If the condition is met, the Arduino finds the location where the leakage occurred with the help of the sensor data.
 Step 6: It then sends a trigger signal to the alert systems, the GSM module and the Wi-Fi module, and also shares the sensor data with them.
 Step 7: It also sends the control signal (CO) to the IC L293D to apply the desired control action at the desired control valve.

Figure 4. Flow chart of the proposed system

IV. EXPERIMENTAL RESULTS


Figure 5 shows the hardware and software setup, with the Arduino Mega, pressure and MQ-6 sensors, control valves, and a PC, used for detection, for indication of the position of the gas leakage in the pipelines on the LCD display, and for the control action taken.

As mentioned earlier in the Methodology section, let us study the same example practically: if a leakage occurs at the second segment (caused by opening the second hand valve), then the sensor parameter changes are as shown in Table I.
Figure 6 shows the LCD display with the pressure and MQ-6 sensor readings; the first line of the display shows the pressure values and the second line gives the gas concentration values. Here A, B, and C indicate segments 1, 2, and 3 respectively. When the gas leakage occurs, the LCD displays the location of the gas leakage (Location 2) according to the pressure and MQ-6 sensors, as shown in Figure 7.

TABLE I: OBSERVATIONS
Parameters Before opening the 2nd hand valve After opening the 2nd hand valve
Pressure at position 1 (A) 320 Pa 215 Pa
Pressure at position 2 (B) 315 Pa 215 Pa
Pressure at position 3 (C) 310 Pa 220 Pa
Gas concentration at Position 1 (A) 0 ppm 0 ppm
Gas concentration at Position 2 (B) 0 ppm 431 ppm
Gas concentration at Position 3 (C) 0 ppm 0 ppm

Figure 5. Proposed prototype model

Figure 6. LCD display showing the gas and pressure sensor readings
Figure 7. LCD display showing the position of leakage

Figure 8. (a) Graphical Representation of pressure (Pa) values in ThingSpeak at segment 1, (b) Graphical Representation of Gas
concentration (ppm) values in ThingSpeak at segment 1

Figure 9. (a) Graphical Representation of pressure (Pa) values in ThingSpeak at segment 2, (b) Graphical Representation of Gas
concentration (ppm) values in ThingSpeak at segment 2

Figure 10. (a) Graphical Representation of pressure (Pa) values in ThingSpeak at segment 3, (b) Graphical Representation of Gas
concentration (ppm) values in ThingSpeak at segment 3

For the alerting system, the SMS alerts and graphical analysis data are sent to the application or website. Figures 8, 9 and 10 give the individual sensor data with respect to time, sent to the ThingSpeak website through the Wi-Fi module so that the data can be monitored in the form of graphs, and a warning is issued when the gas concentration exceeds the limit. The same data can also be seen on a mobile device through the ThingView application. For better understanding, visit the website through the link https://thingspeak.com/channels/243722.
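As an illustration of this upload path, the sketch below pushes one set of readings to the ThingSpeak update endpoint over HTTP; the write API key is a placeholder, and the field-to-sensor mapping is an assumption about the channel configuration, not the authors' ESP8266 firmware.

```python
import requests

WRITE_API_KEY = "XXXXXXXXXXXXXXXX"   # hypothetical channel write key

def push_readings(pressure_pa, gas_ppm):
    resp = requests.get(
        "https://api.thingspeak.com/update",
        params={"api_key": WRITE_API_KEY,
                "field1": pressure_pa,    # assumed pressure field
                "field2": gas_ppm},       # assumed gas-concentration field
        timeout=10,
    )
    return resp.text                      # ThingSpeak returns the new entry id

print(push_readings(215, 431))
```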

Figure 11. SMS sent to the registered mobile with sensor values

The data is also sent to the mobile as an SMS, as shown in Figure 11, using the SIM900A quad-band GSM/GPRS module.

V. CONCLUSION
In terms of the project's goals, the prototype provides an adequate, low-cost solution for preventing toxic gas leaks in industries. The prototype consists of three core parts: detection, alerting and control. The detection part is designed with an MQ-6 sensor and an HX710B atmospheric pressure sensor. The control system is constructed using solenoid valves, an Arduino Mega controller, an exhaust fan and an L293D IC. The alert system is created using an LCD, an ESP8266 Wi-Fi module and a SIM900 GSM module; it is incorporated to increase public awareness of toxic gas leaks, which in turn lowers the number of accidents brought on by such leaks. The integration of these three systems gives accurate results. The prototype's design addresses the limitations of previous works and makes it simple for both small- and large-scale companies to deploy without making significant changes to their existing systems.

FUTURE SCOPE
Other types of sensors, as well as other actuators, can be employed and may perform better than the traditional paradigm. The design also offers flexibility, because other modules may be introduced for the control actions and alerting system without affecting the existing modules.

REFERENCES
[1] R. K. Kodali, R. N. V. Greeshma, K. P. Nimmanapalli and Y. K. Y. Borra, "IOT Based Industrial Plant Safety Gas
Leakage Detection System," 2018 4th International Conference on Computing Communication and Automation
(ICCCA), 2018, pp. 1-5, doi: 10.1109/CCAA.2018.8777463.
[2] S. Z. Yahaya, M. N. Mohd Zailani, Z. H. Che Soh and K. A. Ahmad, "IoT Based System for Monitoring and Control of
Gas Leaking," 2020 1st International Conference on Information Technology, Advanced Mechanical and Electrical
Engineering (ICITAMEE), 2020, pp. 122-127, doi: 10.1109/ICITAMEE50454.2020.9398384.
[3] Yahaya, S. Z., Mohd Zailani, M. N., Che Soh, Z. H., & Ahmad, K. A. (2020). IoT Based System for Monitoring and
Control of Gas Leaking. 2020 1st International Conference on Information Technology, Advanced Mechanical and
Electrical Engineering (ICITAMEE). doi:10.1109/icitamee50454.2020.9398384
[4] H. Paul, M. K. Saifullah and M. M. Kabir, "A Smart Natural Gas Leakage Detection and Control System for Gas
Distribution Companies of Bangladesh using IoT," 2021 2nd International Conference on Robotics, Electrical and
Signal Processing Techniques (ICREST), 2021, pp. 109-114, doi: 10.1109/ICREST51555.2021.9331226.
[5] Ba Thanh Nguyen and Anh Vu Nguyen, “IoT Application for Gas Leakages Monitoring,” International Research
Journal of Advanced Engineering and Science, Volume 5, Issue 4, pp. 51-53, 2020.
[6] M. A. Subramanian, N. Selvam, S. Rajkumar, R. Mahalakshmi and J. Ramprabhakar, "Gas Leakage Detection System
using IoT with integrated notifications using Pushbullet-A Review," Fourth International Conference on Inventive
Systems and Control (ICISC), pp. 359-362, doi: 10.1109/ICISC47916.2020.9171093, 2020.
[7] A. Banik, B. Aich and S. Ghosh, "Microcontroller based low-cost gas leakage detector with SMS alert," 2018 Emerging
Trends in Electronic Devices and Computational Techniques (EDCT), 2018, pp. 1-3, doi:
10.1109/EDCT.2018.8405094.
[8] R. K. Kodali, R. N. V. Greeshma, K. P. Nimmanapalli and Y. K. Y. Borra, "IOT Based Industrial Plant Safety Gas
Leakage Detection System," 2018 4th International Conference on Computing Communication and Automation
(ICCCA), 2018, pp. 1-5, doi: 10.1109/CCAA.2018.8777463
[9] A. Suryana et al., "Detection of Leak Position in Household LPG Distribution Pipes Using Gas Pressure Sensors and Continuity Equation," 6th International Conference on Computing Engineering and Design (ICCED), pp. 1-5, doi: 10.1109/ICCED51276.2020.9415775, 2020.
[10] R. S. Rosli, M. H. Habaebi and M. R. Islam, "Characteristic Analysis of Received Signal Strength Indicator from
ESP8266 WiFi Transceiver Module," 7th International Conference on Computer and Communication Engineering
(ICCCE), pp. 504-507, doi: 10.1109/ICCCE.2018.8539338, 2018.
[11] K. Gavaskar, D. Malathi, G. Ravivarma and A. Arulmurugan, "Development of LPG Leakage Detection Alert and Auto
Exhaust System using IoT," 2021 7th International Conference on Electrical Energy Systems (ICEES), 2021, pp. 558-
563, doi: 10.1109/ICEES51510.2021.9383633.
[12] M. S. Saeed and N. Alim, "Design and Implementation of a Dual Mode Autonomous Gas Leakage Detecting Robot,"
2019 International Conference on Robotics, Electrical and Signal Processing Techniques (ICREST), 2019, pp. 79-84,
doi: 10.1109/ICREST.2019.8644075.


Polymer Conducting Nanocomposite Film to Improve Electromagnetic Compatibility of Electronic Devices
Vikas Rathi1, Brijesh Prasad2, Varun Mishra3, Hemant Singh Pokhariya4 and Himanshu Pal5
1,3,5 Department of Electronics and Communication Engineering, Graphic Era Deemed to be University, Dehradun, India.
Email: [email protected], [email protected], [email protected]
2 Department of Mechanical Engineering, Graphic Era Deemed to be University, Dehradun, India.
Email: [email protected]
4 Department of Computer Science Engineering, Graphic Era Deemed to be University, Dehradun, India.
Email: [email protected]

Abstract—This article presents the analysis of polymer conducting nanocomposites (PCNC) to improve the electromagnetic compatibility of various electronic equipment. The PCNC has been fabricated using reduced graphene oxide (rGo) as the filler and poly(vinylidene fluoride-co-trifluoroethylene) [P(VDF-TrFE)] as the insulating polymer matrix. The simple and efficient method of solution casting was used for fabrication. The obtained nanocomposite films were characterized for surface structure using the Scanning Electron Microscopy (SEM) technique. The dielectric parameters were obtained in the frequency region of 10 kHz to 1 MHz, and the shielding effectiveness was calculated from these dielectric parameters. The developed sheets are cost effective and flexible, and they show excellent EMI shielding effectiveness of around 30 dB for shielding various electronic equipment to enhance their electromagnetic compatibility.

Index Terms— Nanocomposite, Electromagnetic Interference, Solution casting, Shielding effectiveness, Flexibility.

I. INTRODUCTION
Nowadays various electronic devices are used in close vicinity to each other. Electromagnetic interference (EMI), which occurs due to interference between the electric and magnetic fields of electronic devices, can disturb the functioning of a device as well as interrupt the working of other devices in the same confined space, which can be dangerous [1,2,3]. Various studies have shown a rise in the number of calamities due to EMI; these calamities mostly occur in environments where many electronic devices are working at the same time, as noted by Rathi et al. [4]. To tackle this problem, the manufacturers of electronic devices are putting in constant effort to fortify the safety of electronic devices against the effects of electromagnetic (EM) radiation. The solution to this problem is to provide shielding against EM radiation: by shielding a device, its electromagnetic compatibility can be improved. To guard any electronic device, we need to stop the transmission of EM radiation through a shield called an EMI shield.
Electrical conductivity is the most significant requirement of an EMI shield. All metals are very good conductors of electricity, so they are assumed to be a perfect source of protection against EMI waves. But they also have limitations, such as heavy weight, rigidity and a tendency to rust. In today's scenario, an alternative is used: polymer-based conducting composites. These are considered to be the best substitute for EMI shielding over metals [5,6]. Polymer composites are light in weight, flexible, non-corrosive and very cost effective, which makes them a good choice for creating an EMI shield. Conducting polymer composites are fabricated by adding conducting fillers within the insulating polymer matrix. Fillers are added to the insulating polymer matrix using techniques like blending, casting, in-situ polymerization, etc. These conducting fillers then form conducting channels in the polymer matrix that allow electron movement. A conducting polymer composite enhances the ability of the material to shield EM waves, quantified as the EMI Shielding Effectiveness (SE). SE is expressed in dB (SE dB) as the sum of contributions from reflection, absorption and multiple reflections [7,8]. So, it can be concluded that

SE = A + R + CF (1)

This relation helps in understanding the shielding effect. Here, A is the absorption loss, R is the reflection loss, and CF is the loss due to multiple reflections in the shield.
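As a rough illustration of how the A and R terms of (1) behave for a thin conductive sheet, the sketch below uses one common far-field approximation found in standard EMC texts; this is not the authors' network-analyzer procedure, and the sample material values are assumptions.

```python
import math

SIGMA_CU = 5.8e7                     # conductivity of copper, S/m

def shielding_terms(sigma, mu_r, thickness_m, freq_hz):
    """Far-field absorption and reflection losses in dB (textbook form)."""
    sigma_r = sigma / SIGMA_CU       # conductivity relative to copper
    A = 131.4 * thickness_m * math.sqrt(freq_hz * mu_r * sigma_r)
    R = 168.0 - 10.0 * math.log10(mu_r * freq_hz / sigma_r)
    return A, R

# assumed composite: sigma = 10 S/m, mu_r = 1, 0.3 mm film, at 10 kHz
print(shielding_terms(10.0, 1.0, 0.3e-3, 10e3))
```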
Nowadays, to resolve the problems of thickness and flexibility, we need to fabricate flexible polymer conducting nanocomposite (PCNC) films. For the present work, poly(vinylidene fluoride-co-trifluoroethylene) [P(VDF-TrFE)] has been used as the matrix material and reduced graphene oxide (rGo) as the conducting filler. P(VDF-TrFE) has many desirable features: it is flexible, easily processable and light in weight in comparison to other materials [9,10]. It is a ferroelectric copolymer with a large dipole moment and dielectric constant. rGo is used as the filler due to its easy solubility and cost effectiveness; its other main advantages are that it is efficient in making EMI shielding films and capable of forming thin nanocomposites. Different compositions of the P(VDF-TrFE)/rGo composite were fabricated by the solvent casting method with rGo concentrations of 5 wt%, 10 wt% and 15 wt%, giving films with compositions of 95/05, 90/10 and 85/15 respectively. The dielectric properties of the fabricated films were measured as a function of frequency (10 kHz to 1 MHz) with a network analyzer, and the SE of the films was calculated accordingly. Scanning Electron Microscopy (SEM) was used to study the fabricated films' surface morphology, and the mechanical strength of the films was measured to assess their flexibility.

II. EXPERIMENTAL
A. Materials
The PCNC films were developed using P(VDF-TrFE) polymer powder (99.9% pure) purchased from the Piezotech Arkema group; the polymer molecular weight was 200,000 g/mol. rGo (product ID 777684), in powdered form, was used as the conducting filler, and N,N-Dimethylformamide (DMF) was used as the solvent for dissolving the solutes. These items were purchased from Sigma Aldrich, India.
B. Fabrication of Nanocomposite Film
The simple and effective technique of solution casting was used to develop the P(VDF-TrFE)/rGo PCNC. Fig. 1 describes the solution casting process. First, P(VDF-TrFE) is dissolved in DMF in a glass tumbler. The combination of P(VDF-TrFE) and DMF was heated at a temperature of around 80 °C for approximately 2 h with a stirring speed of 400 rpm to obtain a homogeneous solution. After this, rGo in the desired ratio is added to the solution, and the obtained solution was again heated at 50 °C for around 5 h. The acquired uniform solution is transferred into a glass petri dish and heated slowly so that the solvent evaporates. Once the films are completely dry, they can be peeled from the petri dish. The same fabrication process was repeated for rGo concentrations of 5 wt%, 10 wt% and 15 wt%, giving films with P(VDF-TrFE)/rGo compositions of 95/05, 90/10 and 85/15 respectively.

Figure 1. Fabrication process of Composite Film

III. RESULTS AND DISCUSSION
A. Surface morphology of composite films
The SEM images give an idea of the surface morphology of the fabricated PCNC. Fig. 2 shows the SEM images of the fabricated films. Fig. 2(a) shows the morphology of pure P(VDF-TrFE); the spotted grain-like surface structure of the film can be seen, as reported earlier [11,12]. Fig. 2(b) and 2(c) show the SEM images of the 90/10 and 85/15 compositions respectively. The effect of rGo on the surface is evident: the conducting channels can clearly be seen on the surface of P(VDF-TrFE), and as the concentration of rGo increases, the conducting channels increase.

Figure 2. SEM images of (a) pure P(VDF-TrFE); (b-c) SEM images of composition 90/10, 85/15 of P(VDF-TrFE)/rGo respectively

B. Dielectric analysis
The SE of the fabricated PCNC samples was calculated with the help of the dielectric parameters in the frequency range of 10 kHz to 1 MHz. The conductivity (σ) and dielectric constant (ϵʹ) of the fabricated PCNCs are shown in Fig. 3(a) and (b) with respect to frequency. The SE of the films mainly depends on conductivity; it is clear from the graph that the conductivity of the film with the 85/15 P(VDF-TrFE)/rGo ratio is maximum, while the film with the 95/05 composition has the smallest conductivity. The main cause for this is the formation of various conducting channels in the interfacial area of the polymer matrix, which increases the dipole moment. The dielectric constant of the films was also calculated, and it is maximum for the film with 15 wt% of rGo.

Figure 3. (a) Conductivity; (b) Dielectric constant of fabricated films

C. Mechanical Properties
The mechanical property analysis gives an idea of the performance of the fabricated films when they are exposed to stretching or pulling forces before failure [13]. The mechanical properties of the films with composition labels 85/15, 90/10 and 95/05 are depicted in the stress–strain diagram shown in Fig. 4. From the graph we can observe that 85/15 is the most brittle among these films and 95/05 is the most ductile. The toughness of a film depends on the area under its stress–strain curve, which differs between the compositions, so their toughness also differs. The film with the 90/10 composition is the most satisfactory material according to the graph, but the 85/15 composition still shows a flexibility of 0.7 mm and can bear a load of around 1.5 MPa; as the concentration of conducting filler decreases, the flexibility and mechanical strength of the films increase.

Figure 4. Stress–strain diagram of the fabricated films

D. EMI Shielding Analysis


We have already described that SE can be measured with the help of A, R and CF. The total SE of P(VDF-TrFE)/rGo is presented in Fig. 5. We can see from the figure that the SE value increased with increasing rGo content and generally decreased with frequency. The PCNC with 15 wt% rGo filler content showed the maximum SE: a maximum value of 36 dB at 10 kHz was attained for 15 wt% of rGo in the P(VDF-TrFE)/rGo conducting polymer composite. For the film with 5 wt% of rGo the SE is low, around 10 dB over the examined frequency region. Beyond a concentration of 15 wt% of rGo there is no significant increase in SE, so that should be its percolation threshold level.

Figure 5. SE of obtained films

IV. CONCLUSION
In the presented work, thin, cost-effective and flexible composite sheets have been developed for EMI shielding applications. As the filler content in the film composition increases, the SE of the films increases as well. From the dielectric and SE analysis, we can say that the PCNC with 15 wt% of rGo gives the most suitable result, as its SE is the highest, lying between 36 dB and 25 dB over the examined frequency range of 10 kHz to 1 MHz. Its conductivity is also the highest among the fabricated PCNCs, and the mechanical properties of the PCNC with 15 wt% of rGo are satisfactory. It is also clear from the SEM images that the rGo filler is distributed evenly through the P(VDF-TrFE) polymer matrix. The fabricated P(VDF-TrFE)/rGo composite film has very good prospects for use as a flexible EMI shielding film to increase the electromagnetic compatibility of various electronic equipment.

REFERENCES
[1] B. Zhao, C. Zhao, R. Li, S. Hamidinejad and C. Park, “Flexible, Ultrathin, and High-Efficiency Electromagnetic
Shielding Properties of Poly(Vinylidene Fluoride)/Carbon Composite Films,” ACS Appl. Mater. Interfaces, vol. 9, no.
24, pp. 20873−20884, Jun. 2017.

[2] M. Ameli., M. Nofar, S. Wang. and C.B Park. “Lightweight Polypropylene/Stainless-Steel Fiber Composite Foams with
Low Percolation for Efficient Electromagnetic Interference Shielding,” ACS Appl. Mater, Interfaces. vol. 6, no. 14, pp.
11091−11100, July 2014,
[3] D. X. Yan, H. Pang, B. Li, R. Vajtai, L. Xu, P. G. Ren, J.H. Wang and Z. M. Li, “Structured Reduced Graphene
Oxide/Polymer Composites for Ultra - Efficient Electromagnetic Interference Shielding,” Adv. Funct. Mater, vol. 25,
no. 4, pp. 559−566, Jan. 2015.
[4] V. Rathi, V. Panwar, G Anoop, M. Chaturvedi, K. Sharma and B. Prasad. "Flexible, Thin Composite Film to Enhance
the Electromagnetic Compatibility of Biomedical Electronic Devices," in IEEE Trans. Electromagnetic Compatibility,
vol. 61, no. 4, pp. 1033-1041, Aug 2019.
[5] F. Paulis, M. H. Nisanci, A. Orlandi, M. Y. Koledintseva, and J.L. Drewniak, “Design of homogeneous and composite
materials from shielding effectiveness specifications,” IEEE Trans. Electromagnetic Compatibility, vol. 56, no. 2, pp.
343–351, Apr. 2014.
[6] S. H. Lee, D. Kang, and I. K. Oh, "Multilayered Graphene-Carbon Nanotube-Iron Oxide Three-Dimensional Heterostructure for Flexible Electromagnetic Interference Shielding Film," Carbon, vol. 111, pp. 248−257, Jan. 2017.
[7] M. Chen, L. Zhang, S. Duan, S. Jing, H. Jiang, M. Luo, and C. Li, “Highly conductive and flexible polymer composites
with improved mechanical and electromagnetic interference shielding performances,” Nanoscale, vol. 6, no. 7, pp. 3796-
3803, 2014
[8] V. Rathi and V. Panwar, “Electromagnetic interference shielding analysis of conducting composites in near and far field
region,” IEEE Trans. Electromagn. Compat., vol. 99, pp1-7, Jan. 2018.
[9] Q. M. Zhang, V. Bharti, and G. Kavarnos, “Poly (vinylidene fluoride) (PVDF) and its copolymers,” Encyclopedia of
Smart Materials, John Wiley & Sons, New York, pp. 807–825, Jul. 2002.
[10] V. Panwar and G. Anoop “An ionic polymer–metal nanocomposite sensor using the direct attachment of an acidic ionic
liquid in a polymer blend” J. Mater. Chem. C, 7, 9389-9397, Jul.2019
[11] S.W. Hahm, and D.Y. Khang, “Crystallization and microstructure-dependent elastic moduli of ferroelectric P(VDF–
TrFE) thin films,” Soft Matter, vol. 6, no. 22, pp. 5802–5806, 2010.
[12] J. Ryu, K. No. Y. Kim.Y, E. Park, and S. Hong,.“Synthesis and Application of Ferroelectric Poly (Vinylidene Fluoride-
co-Trifluoroethylene) Films using Electrophoretic Deposition,” Scientific Reports, vol. 6, 36176. Nov 2016
[13] H. L. Chen et al., "Predicting mechanical properties of polyvinylidene fluoride/carbon nanotube composites by molecular simulation," Materials Research Express, vol. 4, no. 11, 115025, Nov. 2017.


Floating Sun Tracking Solar Panel


CH. PSVL Anjani Pujitha1, K. Umamaheswari2, A. Karthik3, J. Harshavardhan4 and D. V. S. Gopi Sivanadh5
1, 3-5 IV B.Tech, Department of EIE, VR Siddhartha Engineering College, Vijayawada, India
Email: [email protected], [email protected], [email protected], [email protected]
2 Asst. Professor, EIE Department, VR Siddhartha Engineering College, A.P, India
Email: [email protected]

Abstract—Solar power is the future of renewable power generation. The main problem with solar power generation is that solar panels occupy a large area on rooftops and open spaces, and they are not easy to mount; they are also difficult to install, maintain and clean on a regular basis. Additionally, shifting the solar panels in accordance with the position of the sun may produce up to 40% more solar electricity. Here, we suggest a type of solar panel that may be placed on bodies of water, like lakes and pools, freeing up space on the ground. We also provide a novel sun-tracking floating solar technique for moving the panels in accordance with the position of the sun using LDR sensors, which increases power production, while the floating system in the water prevents the solar panel from overheating. Additionally, water is conserved due to the reduction in evaporation from the water body. In the upcoming 10 years, India proposes the generation of 1 GW and 1.75 GW of solar photovoltaic power from renewable energy sources; as of now, around 5000 MW has been commissioned in different parts of the country under the Jawaharlal Nehru Solar Mission. To meet the target, there is a need to produce more solar energy in a short span of 10 years. Floating solar photovoltaic plants are an emerging form of PV system that floats on water bodies like canals, water reservoirs, lakes and ponds. This paper proposes a prototype of a floating sun-tracking solar panel to increase the production of solar energy using floating solar panels, a Raspberry Pi Pico microcontroller board and the Thonny IDE software.

Index Terms— Raspberry Pi Pico microcontroller board, Thonny IDE, Current Sensor
module, DHT11 temperature and Humidity Sensor.

I. INTRODUCTION
The standard solar panel does not have much efficiency, and its energy production is low. To overcome this problem, we have proposed a floating sun-tracking solar panel. The floating photovoltaic system exploits functions such as cooling, concentrating and tracking, and the outcomes of such systems have shown the important influence of cooling and tracking on system performance. The main advantage is the large amount of solar energy produced compared to rooftop solar panels: because the panel sits on water, the cooling effect of the water keeps it from heating continuously [1]. This increases the efficiency of the solar panel, which in turn leads to greater energy production; more electricity is generated due to the cooling effect of water in floating solar than in terrace roofing systems. The geometry of the given system has been determined with two major aspects in mind [2]. Firstly, the module should cover as much water as possible to reduce water evaporation. Secondly, the size of the module is adapted to the commercially available PV modules in the market.

II. IMPLEMENTATION
Fig. 1 represents the flow diagram of the floating sun-tracking solar panel. The heart of the system is the controller, i.e., the Raspberry Pi Pico (RP2040). A single-axis solar panel is used, which rotates through 180° so that the maximum amount of light is received according to the LDR readings. The panel rotation is done with the help of a servo motor, and MicroPython is used on the Raspberry Pi Pico for rotating the servo motor. The floating of the panel is achieved with the help of vacuum-filled tubes [3]. The solar panel continuously rotates in the direction of the sun with the help of the servo motor; thus, the photovoltaic cells absorb the maximum amount of energy from the sun. Chargeable batteries are used to store energy for future needs. The floating solar panels can be installed at existing power plants, and they keep water bodies fresh and clean while generating renewable energy. A 16×2 LCD is used to display the generated voltage. Continuous rotation of the panel towards the sun causes continuous heating, which reduces the panel's efficiency; to overcome this, we use a floating solar panel, as it continuously dissipates the heat. When compared to rooftop solar panels, the efficiency of the floating solar panel is increased by 35%. A solar panel of 100 watts receiving 10 direct sun-hours per day will generate 1 kWh of energy daily, so the maximum energy produced annually is about 365 kWh [4].

Fig.1 Functional process flow diagram

III. PROPOSED SYSTEM WORKING FLOW

Fig.2 Flow Chart for the Working of Proposed System    Fig.3 RP2040 Raspberry Pi Pico

The flowchart diagram in Fig. 2 depicts the working process of the suggested system design. The steps of the process flow are briefly discussed below, and a minimal code sketch follows the list:
 Step 1: Place the solar panel on a rooftop or in an open area.
 Step 2: Connect the model to an external power supply for rotating the servo motor.
 Step 3: If LDR1 has absorbed more voltage than LDR2, the servo motor rotates 30° clockwise.
 Step 4: If LDR2 has absorbed more voltage than LDR3, the servo motor rotates 60° clockwise.
 Step 5: If LDR3 absorbs the maximum voltage, then the servo motor rotates 90° clockwise.
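A hedged MicroPython sketch of this LDR comparison logic on the Pico is given below; the ADC and PWM pin numbers and the servo pulse-width mapping are assumptions, not the authors' exact wiring or code.

```python
from machine import ADC, PWM, Pin
import time

ldr1, ldr2, ldr3 = ADC(26), ADC(27), ADC(28)   # assumed LDR ADC pins
servo = PWM(Pin(15))                            # assumed servo pin
servo.freq(50)                                  # standard 50 Hz servo frame

def set_angle(deg):
    # map 0-180 degrees to roughly a 0.5-2.5 ms pulse at 50 Hz
    duty = int(1638 + (deg / 180) * (8192 - 1638))
    servo.duty_u16(duty)

while True:
    v1, v2, v3 = ldr1.read_u16(), ldr2.read_u16(), ldr3.read_u16()
    if v3 >= v2 and v3 >= v1:
        set_angle(90)          # Step 5
    elif v2 >= v1:
        set_angle(60)          # Step 4
    else:
        set_angle(30)          # Step 3
    time.sleep(1)
```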

IV. HARDWARE AND SOFTWARE


A. RP2040 Raspberry Pi Pico
The RP2040 Raspberry Pi Pico, shown in Fig. 3, has a dual-core Arm Cortex-M0+ processor with 264 KB of internal RAM, and it supports up to 16 MB of off-chip Flash [5]. It has 40 I/O pins; among them, 26 are multipurpose GPIOs and 8 are ground pins. It has 3 pins for debugging.

Fig.4 Pin configuration of RP2040

Fig. 4 shows the pin configuration of the Raspberry Pi Pico. The Raspberry Pi Pico includes an integrated temperature sensor and low-power sleep and dormant modes. Table I shows the specifications of the Raspberry Pi Pico (RP2040).

TABLE I. RP2040 SPECIFICATIONS


S. No Parameter Specification
1 Microcontroller RP2040
2 Operational voltage range 1.8 V – 5.5 V
3 Processor Dual-core Arm Cortex-M0+
4 SRAM 264 KB
5 Flash Memory 2 MB

B. INA219 Dc voltage and current Sensor Module

Fig.5 INA219 based Dc voltage and current Sensor Module

Fig. 5 shows an INA219-based DC voltage and current sensor module. The CJMCU-219 is a zero-drift, I2C-interface-based bidirectional current/power monitoring module and an essential component of a power monitoring system. It is capable of sensing current, voltage and power, and it transmits the data to the host microcontroller using the I2C bus protocol. The specifications of the CJMCU-219 current sensor module are given in Table II [6]; a minimal register-read sketch is given after the table.

TABLE II. CJMCU-219 SPECIFICATIONS


S. No Parameter Specification
1 Power Input 3 V to 5.5 V
2 Target Voltage +26 V max
3 Current sense resistor 0.1 ohm, 1%, 2 W
4 Bus Voltages 0 to 26 V
5 Compatible interface I2C or SMBus
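The following minimal MicroPython sketch reads the INA219 bus-voltage register over I2C; the bus and pin numbers, and the default 0x40 address, are assumptions, while register 0x02 and its 4 mV/LSB scaling follow the INA219 datasheet.

```python
from machine import I2C, Pin

i2c = I2C(0, scl=Pin(1), sda=Pin(0))    # assumed I2C bus and pins
INA219_ADDR = 0x40                      # default slave address
REG_BUS_VOLTAGE = 0x02

def read_bus_voltage():
    raw = i2c.readfrom_mem(INA219_ADDR, REG_BUS_VOLTAGE, 2)
    value = (raw[0] << 8) | raw[1]
    return ((value >> 3) * 4) / 1000.0  # 4 mV per LSB -> volts

print("Bus voltage:", read_bus_voltage(), "V")
```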

C. LDR Sensor

Fig.6 LDR Sensor

An LDR, or light-dependent resistor, is shown in Fig. 6. It is a kind of resistor whose resistance changes depending on the amount of light falling on its surface, and it is made of a high-resistance semiconductor. These resistors are used in circuits where the presence of light must be sensed. Its operation is based on semiconductivity, and LDRs come with a variety of functions and resistance ranges. When light strikes the device, i.e., when photons hit it, electrons in the semiconductor material's valence band are stimulated into the conduction band. For the electrons to move from the valence band to the conduction band, the incident photons must have an energy larger than the bandgap of the semiconductor material. As a result, when sufficiently energetic light strikes the device, a huge number of charge carriers are produced as more and more electrons are driven into the conduction band [7].
D. DHT-11 Temperature and Humidity Sensor

Fig.7 DHT-11 Temperature and Humidity Sensor

Fig. 7 shows the DHT11 digital sensor for the measurement of humidity and temperature. The sensor is interfaced with the Raspberry Pi Pico. The DHT11 is available both as a bare sensor and as a module; in this prototype we use the DHT11 sensor. The DHT11 measures the surrounding air using a thermistor and a capacitive humidity sensor [8]. Table III shows the specifications of the DHT11 temperature and humidity sensor.

TABLE III. DHT11 SPECIFICATIONS


S. No Parameter Specification
1 Operational Voltage 3.5 V – 5.5 V
2 Operational Current 0.3 mA (measuring), 60 µA (standby)
3 Output data Serial data
4 Temp Range 0 °C – +50 °C
5 Humidity Range 20% – 90%
6 Accuracy ±1 °C & ±1%

E. LCD

Fig.8 16X2 LCD

An LCD (liquid crystal display) is shown in Fig. 8. The working principle of this 16×2 LCD is that it blocks light rather than emitting it. It is an electronic display module used in many applications like mobile phones, calculators and computers. The LCD used here has a 16×2 display with 40 pins [9]. The main advantages of this kind of LCD are that it is inexpensive and simply programmable, and there are no limitations on displaying custom characters.
F. MG995 Servo Motor

Fig.9 MG995 Servo Motor

Fig. 9 shows the MG995, a heavy-duty, reliable servo motor. It is a high-speed actuator with dual bearings, and it is a low-power, cost-effective choice that is feasible for industrial production. A maximum torque of 208 oz-in is delivered at 6 volts, with a maximum rotational speed of 0.13 seconds per 60°. If the voltage is dropped to a minimum of 4.8 volts, it maintains a torque of 180 oz-in and rotates at a speed of 0.17 seconds per 60° [10].
G. Thing Speak IOT Platform
ThingSpeak is an IoT platform used for gathering real-time data such as location, climatic-change information and other device data. In our proposed floating sun-tracking solar panel model, this IoT platform is used to collect the voltage of both the floating solar panel and the static solar panel.
H. Thonny IDE
Fig. 10 shows the interface of the Thonny IDE, the platform used for coding. We use MicroPython for programming the servo motor to rotate in the direction of the sun.

Fig.10 Interface of the Thonny IDE


I. Micro Python
MicroPython is an implementation of the Python 3 programming language. It contains a small subset of the Python standard library and is optimized to run on microcontrollers and in constrained environments.

V. HARDWARE AND SOFTWARE INTERFACING


The software and hardware interfacing connections are depicted in Fig. 11, and Fig. 12 shows the hardware connections. The solar panel is exposed to light after interfacing [11]. The panel then rotates in the direction of the sun with the help of the servo motor. The amount of voltage generated is shown on the LCD display, and the current, voltage and temperature readings are sent to the ThingSpeak cloud using the ESP8266 Wi-Fi module, which is designed specifically for use in Internet of Things (IoT) systems. With a 32-bit processor, some RAM and, depending on the supplier, between 512 KB and 4 MB of memory, the ESP8266 is a complete Wi-Fi system on a chip. This enables the chip to work as a standalone device that can run simple programmes or as a wireless adaptor that adds Wi-Fi functionality to other systems.

Fig.11 Hardware and Software Interfacing    Fig.12 Hardware connection diagram

VI. RESULT
Two solar panels were tested: the first is a static rooftop solar panel, while the other is a floating sun-tracking solar panel. The static solar panel is positioned at a 33-degree angle, since this ensures that its power production is maximised. From 8 AM to 6 PM, the test was conducted continuously across three days, with measurements being made continuously. In this test, a load (a battery of capacity 15-17 V) was used to compute the solar panel's current and voltage. Figures 13(a)-(c) show the voltage, current and temperature readings of the static solar panel, and Figures 13(d)-(f) show the voltage, current and temperature readings of the sun-tracking floating solar panel on the ThingSpeak IoT platform; the readings were taken every 15 minutes from 8:00 AM to 5:00 PM. The results show that the floating sun-tracking solar panel is more efficient than the static solar panel.


Fig. 19 shows the final output voltage displayed on the LCD. The power rating of the solar panel used in the prototype is 5 W, 12 V, which is useful for charging small electronic devices. The floating solar panel generates 25-35% more energy than a rooftop solar panel. The solar panel used in the prototype has an area of 27 × 19 sq. cm and generates a maximum of 11 volts; a standard rooftop solar panel with the same power rating generates a maximum of 9 volts.

Fig.13(a) Graphical representation of Voltage for Static Solar Panel in Thing Speak (b)Graphical representation of Current for Static Solar
Panel in Thing Speak (c)Graphical representation of Temperature for Static Solar Panel in Thing Speak (d) Graphical representation of
Voltage for Floating Sun Tracking Solar Panel in Thing Speak (e) Graphical representation of Current for Floating Sun Tracking Solar
Panel in Thing Speak (f) Graphical representation of Temperature for Floating Sun Tracking Solar Panel in Thing Speak

Fig.19 Final output voltage displayed on the LCD

Table VI shows the real-time statistical data of the prototype floating sun-tracking solar panel. The current, voltage and temperature readings were taken continuously from 8:00 AM to 5:00 PM. The sun-tracking panel rotates towards the direction from which maximum light falls on it: based on the light absorbed by the LDR sensors, the servo motor runs and changes the position of the solar panel, which leads to the absorption of maximum energy.

TABLE VI. REAL TIME STATISTICAL DATA OF FLOATING SUN TRACKING SOLAR PANEL AFTER TESTING
Solar panel position with respect to Sun Tracking Time Temperature (°C) Current (mA) Voltage (V)
30° 8:00 AM 27 193 9
30° 9:00 AM 28 199 10
60° 11:00 AM 32 203 11
60° 12:00 PM 35 260 18
60° 1:00 PM 34 221 13
60° 2:00 PM 33 234 15
90° 4:00 PM 31 203 11
90° 5:00 PM 30 198 10

VII. CONCLUSION
The concept of a floating sun-tracking solar panel is neoteric. In this study, we provide an easy-to-understand explanation of the solar tracking mechanism used to increase the solar energy gain, and we also discuss how inexpensive it is to operate and maintain a solar tracker. The tracking system is used to orient the solar panel towards the sun to produce the maximum amount of energy, while the floating system is used to cool the solar panel, which heats continuously due to sun tracking. The cooling system dissipates the heat absorbed by the solar panel so that it works efficiently and produces more energy compared with standard roof-based solar panels. The floating solar panel generates a maximum voltage of 11 volts, while a rooftop solar panel generates a maximum voltage of 9 volts, for a standard 5 W panel.

FUTURE SCOPE
As renewable energy resources are free, proper management is needed, and we need to discover more technologies for energy production from these free resources. Other types of sensors, as well as solar panels deployed on water reservoirs and lakes, can be employed and may perform better than the traditional paradigm. For maximum absorption of light from the sun, anti-reflective coatings can be used on the solar panel; these eliminate destructive interference of the incident light waves, so the maximum amount of light is transmitted to the solar panel, which increases the amount of energy production.

REFERENCES
[1] A. Sahu, N. Yadav and K. Sudhakar, "Floating photovoltaic power plant: A review," Renewable and Sustainable Energy Reviews, vol. 66, pp. 815-824, 2016.
[2] Sriwirote.B, Noppakant.A & Pothisaran, "Increasing efficiency of an electricity production system from solar energy
with a method of reducing solar panel temperature," International Conference on Applied System Innovation (ICASI),
pp.1-3, 2017.
[3] R. Chowdary, M. A. Aowal, and Rehman. A , "Floating Solar Photovoltaic System: An Overview and their Feasibility at
Kaptai in Rangamati," IEEE International Power and Renewable Energy Conference, pp.1-5, 2020.
[4] D. Mital, B.Saxena and K. V. S. Rao, "Floating solar photovoltaic systems: An overview and their feasibility at Kota in
Rajasthan," International Conference on Circuit, Power, and Computing Technologies (ICCPCT), pp.1-7, 2017.
[5] Y. Bikrat, A. Benlghazi, and D. Moussaid, "A Photovoltaic Wireless Monitoring System," International
Symposium on Advanced Electrical and Communication Technologies (ISAECT), pp. 1-5, 2018.
[6] D. Tukymbekov, M. Nurgaliyev, N. Kuttybay, Y. Nalibayev and G. Dosymbetova, "Intelligent energy efficient street
lighting system with predictive energy consumption," International Conference on Smart Energy Systems and
Technologies (SEST), pp. 1-5, 2019.
[7] F. I. Musthaffa, S. Shakhir, F. F. Musthaffa and A. T. Naif, "Simple design and implementation of solar
tracking system two axis with four sensors for Baghdad city," 9th International Renewable Energy
Congress(IREC), pp.1-5, 2018.
[8] Mondhal.A, M. J. Alih, and Dutta. P, "IoT Enabled Smart Solar Panel Monitoring System Based on Boltuino
Platform, "IEEE International IOT, Electronics and Mechatronics, Conference(IEMTRONICS), pp.1-7, 2022.
[9] C. -Y. Yang, C. -Y. Hsieh, F. -K. Feng and K. -H. Chen, "Highly Efficient Analog Maximum Power Point
Tracking (AMPPT) in a Photovoltaic System," in IEEE Transactions on Circuits and Systems I: Regular
Papers, vol. 59, no. 7, pp. 1546-1556, 2012.
[10] T. Kaur, S. Mahajan, S. Verma, Priyanka and J. Gambhir, "Arduino based low-cost active dual axis solar tracker, "IEEE
1st International Conference on Power Electronics, Intelligent Control and Energy Systems (ICPEICES), pp. 1-5,
2016.
[11] M. S. S. K, V. K. R, and J. R, "Simulation and Implementation of Dual-Axis Solar Tracker with PV Panel for Domestic
loads, "13th IEEE PES Asia Pacific Power & Energy Engineering Conference (APPEEC), pp.1-6, 2021.


Application of Grey Wolf Optimization Algorithm for Improving Inertia Constant Selection in Wind Farm Deployments
Deepesh Bhati1 and Sandeep Bhongade2
1 IPSA IES / Electrical and Electronics Engineering, Indore, India
Email: [email protected]
2 SGSITS / Electrical Engineering, Indore, India
Email: [email protected]

Abstract—This paper presents a Grey Wolf Optimization (GWO) algorithm for doubly fed induction generator (DFIG)-based wind turbine generators, in order to address the system operability challenges that have arisen as a result of the continuous reduction of system inertia caused by the increasing penetration of renewable power generation. The GWO algorithm that has been developed makes it possible for individual DFIG generators to contribute an efficient inertial response. This response helps to stabilize the rate of change of frequency and minimizes large frequency deviations when a disturbance occurs. The DC voltage of the DFIG runs at different levels in accordance with the changes in the inertia constant, to facilitate energy exchange with the associated AC grid. Additionally, the standard control system of the DFIG has been updated to accommodate the implementation of the GWO algorithm. The proposed model utilizes GWO to evaluate optimum values of the inertia constant, which assists in improving output power efficiency levels. Concerning the DFIG, practical challenges such as maximizing active power while minimizing reactive power are examined, and pertinent solutions are offered for a variety of different cases.

Index Terms— Grey Wolf Optimization, Power System, DFIG, Wind, Inertia, Load Frequency
Control.

I. INTRODUCTION
In contrast to fixed-speed machines, where active and reactive power control is not independent, DFIG-based wind turbines are the preferred option for network operators [1]. The turning speeds of conventional wind turbines are fixed; DFIG technology, on the other hand, enables wind turbines to function over a broad range of speeds. The back-to-back converter is affixed to the rotor of the DFIG, and its function is to supply the rotor with currents of varying frequencies in order to obtain the required rotor speeds. Reference [2] demonstrates how a back-to-back converter controller may be used in combination with a DFIG wind turbine to generate electricity, including the dynamic response of the DFIG to fluctuations in wind speed and the process of turbine braking. The power available in the wind in the form of kinetic energy [3, 4], represented by the symbol Pv, is given by (1):

Pv = (1/2)·ρ·A·Vv³ (1)

where Vv represents the average wind speed in the area, A = πR² is the swept area with R the rotor blade length (radius), and ρ represents the density of air in the current area. The power recovered by the wind turbine can be represented via (2):

Pt = (1/2)·ρ·A·Vv³·Cp (2)

The power coefficient Cp is a number that does not have a specific unit of measurement, and it is used to express
how well a wind turbine is able to transform the kinetic energy of the wind into the mechanical energy that may
be used. The power output of the wind turbine is what is used to measure the efficiency of the wind turbine. This
coefficient shifts as a function of the wind speed, the speed of the rotor blades, and the angle at which the pitch is
set [5, 6]. The length of the rotor blades in proposed model of a Wind Turbine with DFIG is set to R = 50 meters,
and the air density is set to = 1.225 kilograms per cubic meter. Both of these settings are in meters per second
squared. Automatic adjustment in pitch angle (β) is done in such a way as to guarantee that the change in Cp as
shown in the “Fig. 1”. The value of Cp will be maximum when β is taken to be zero. Apart from this, the output
power of the turbine will be less if the value of β is different [16].
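As a quick numerical check of (1) and (2) with the paper's parameters (R = 50 m, ρ = 1.225 kg/m³), the short sketch below computes the available and recovered power; the wind speed and Cp values used here are assumed for illustration only.

```python
import math

rho, R = 1.225, 50.0          # air density (kg/m^3) and blade length (m)
A = math.pi * R**2            # swept area, m^2

def wind_power(v, cp=1.0):
    return 0.5 * rho * A * v**3 * cp   # Eqs. (1)-(2), in watts

v = 12.0                      # assumed average wind speed, m/s
print(f"Pv = {wind_power(v) / 1e6:.2f} MW")           # ~8.31 MW available
print(f"Pt = {wind_power(v, cp=0.48) / 1e6:.2f} MW")  # ~3.99 MW recovered
```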

Figure 1. Power coefficient (Cp) as a function of wind speed [16]

One of many DFIG controls may be active at any one time, determined by the zone in which the machine is currently running. After investigating the performance of different models [7, 8, 9], it can be observed that existing models showcase high efficiency for the control of DFIG operations. In this regard, it is feasible to observe that the models [10, 11, 12] that are now in use are either very sophisticated or do not incorporate a significant amount of control aimed at maintaining constant output levels. To address these problems, one possible solution is described in Section II, the proposed GWO algorithm for improving inertia constant selection in wind farm deployments. The proposed model is examined in Section III, where its results are compared with those of previously executed DFIG-based deployments. This paper concludes with a number of in-depth observations on the model that has been presented, as well as suggestions for optimization models that may enhance its performance in a variety of use scenarios.

II. PROPOSED GWO ALGORITHM FOR IMPROVING INERTIA CONSTANT SELECTION IN WIND FARM DEPLOYMENT
After referring to existing DFIG-based control models [13, 14, 15], it was observed that existing models do not use stochastic optimizations, which limits their applicability in real-time use cases. To overcome this limitation, the proposed Grey Wolf Optimization (GWO) algorithm for estimation of the inertia constant for DFIG-based wind turbines is discussed in this text. The algorithm works via the following process:
 Initialize the following parameters:
  Total wolves existing in the model (Nw)
  Total iterations for which the model will be evaluated (Ni)
  Learning rate for the model (Lr)
 Initialize all wolves as 'Delta', and evaluate them in each iteration via the following process:
  If the wolf is currently marked 'Delta', then process it; else go to the next wolf in sequence.
  To process a wolf, generate its internal configuration via the following process:
  Stochastically generate an inertia constant via (3),

Hs = STOCH(0, 1) (3)

where Hs represents the inertia constant, and STOCH indicates a stochastic process that generates numbers in the given range.
  Based on this value of Hs, simulate the model and estimate its fitness via (4),

f = Pactive / (Pactive + Preactive) (4)

where Pactive represents the active power at the output of the model, while Preactive represents the output reactive power level.
  Evaluate the fitness for all wolves, and then estimate the fitness threshold via (5),

fth = (Lr / Nw) · ∑ f (5)

 At the end of each iteration, re-evaluate all wolves via the following process, shown in Fig. 2:
  Mark the wolf 'Alpha' if f > 2·fth (6)
  Mark the wolf 'Beta' if f > fth (7)
  Mark the wolf 'Gamma' if f > Lr·fth (8)
  Else, mark the wolf 'Delta' if, for this configuration, f < fth (9)
Repeat this process for all iterations, and then select the 'Alpha' wolf with the maximum fitness level; a minimal sketch of this loop follows. Due to the selection of the wolf with maximum fitness, active power is increased while reactive power is reduced at the output, which assists in improving circuit efficiency levels. This is advantageous because an excessive quantity of reactive power may cause the components to overheat, which would significantly cut down the equipment's lifetime.
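The minimal Python sketch below illustrates the wolf-marking loop described above; the fitness function is a hypothetical surrogate (the paper evaluates a Simulink DFIG model), and equations (4)-(5) follow the forms given in the text.

```python
import random

N_W, N_I, L_R = 20, 50, 0.5   # wolves, iterations, learning rate (assumed)

def fitness(h_s):
    # surrogate for the simulated DFIG: Pactive / (Pactive + Preactive)
    p_active = 1.0 + h_s * random.uniform(0.5, 1.0)
    p_reactive = 0.3 * (1.0 - h_s) + 0.05
    return p_active / (p_active + p_reactive)

wolves = [{"H": random.random(), "mark": "Delta", "f": 0.0} for _ in range(N_W)]

for _ in range(N_I):
    for w in wolves:
        if w["mark"] == "Delta":          # only Delta wolves re-sample
            w["H"] = random.random()      # Eq. (3)
            w["f"] = fitness(w["H"])      # Eq. (4)
    f_th = L_R * sum(w["f"] for w in wolves) / N_W   # Eq. (5)
    for w in wolves:                      # Eqs. (6)-(9)
        f = w["f"]
        w["mark"] = ("Alpha" if f > 2 * f_th else
                     "Beta" if f > f_th else
                     "Gamma" if f > L_R * f_th else "Delta")

best = max(wolves, key=lambda w: w["f"])  # highest-fitness (Alpha) wolf
print(f"Selected inertia constant H = {best['H']:.4f}, fitness = {best['f']:.4f}")
```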

Figure 2. Flowchart of the grey wolf algorithm

If power quality standards and regulations are not adhered to, the result may be unanticipated shutdowns, power losses, blackouts and even fines. Thus, the efficiency of DFIG-based wind systems is improved by the selection of proper inertia constants. This efficiency is evaluated for different models in the next section of this text.

III. RESULT ANALYSIS & COMPARISON


The proposed model uses the GWO algorithm to evaluate optimum values of the inertia constant, which assists in improving output power efficiency levels. To validate this performance, the model was evaluated on a standard DFIG model, shown in Fig. 3, where a 120 kV source with a 2500 MVA three-phase coupling device is connected to a 30 km line capable of feeding a 25 kV load via grounding transformer sets. The model uses 150 Ohm input source resistances with 50 Ohm load resistors. It also uses a combination of a wind turbine with a drive train to produce the power base for the generators, which drives an asynchronous machine under on-load conditions. The circuit is driven by a 9 MW wind farm that consists of 6 generator units, each having a capacity of 1.5 MW under real-time loads.
The model is validated by modifying the inertia constants under different loads, and the power efficiency was evaluated through (10):

ή = (1/N) · ∑ P(Out) / (P(Out) + R(Out)) (10)

where P(Out) and R(Out) represent the active and reactive power outputs for N different circuit reading iterations.

Figure 3. Simulink model of the DFIG under different conditions

The outputs were obtained for the three-phase voltage across the 575 V grid (Vabc_575), the three-phase current across the 575 V grid (Iabc_575), the active power (P), the reactive power (Q), the three-phase voltage across the 25 kV grid (Vabc_25), and the three-phase current across the 25 kV grid (Iabc_25). These waveforms can be observed in Fig. 4. Based on these readings, the power efficiency levels were evaluated via (10) for the circuit with GWO and without GWO and tabulated in Table I, which represents circuit performance under different simulation instances. Based on these results and Fig. 5, it can be observed that the proposed model improves the power efficiency levels by 8.5% after the application of GWO, which makes it useful for a wide variety of real-time simulation use cases. Due to these advantages, the proposed model is useful for improving power efficiency for different DFIG-based wind farms.

IV. CONCLUSION
In this paper, the optimal value of the inertia constant for a DFIG wind farm has been obtained for different loading conditions, and GWO optimization has been utilized for this purpose. The proposed model is able to enhance the power efficiency levels by 8.5% after the application of GWO, which allows it to be useful for a wide variety of real-time simulation use cases.
Figure 4. Output voltage & current levels for different components

TABLE I. RESULTS FOR DIFFERENT SIMULATION INSTANCES


S. No. Simulation Time (s) Efficiency (%) Without GWO Efficiency (%) With GWO Inertia Constant With GWO
1 1 75.50 86.50 0.0931
2 2 76.80 88.30 0.9723
3 3 77.40 89.40 0.5302
4 4 78.30 90.50 0.7062
5 5 79.25 90.80 0.4057
6 6 80.15 91.20 0.1843
7 7 81.05 91.20 0.8000
8 8 81.95 92.80 0.9557
9 9 82.85 93.57 0.8968
10 10 83.75 94.34 0.5852
11 12 84.65 95.11 0.7640
12 15 85.55 95.89 0.4771
13 18 86.45 96.66 0.4658
14 20 87.35 97.43 0.0976
15 25 88.25 98.20 0.4858


Figure 5. Power efficiency levels for different simulation instances

The model that has been created may thus be successful in boosting the power efficiency of a variety of DFIG-based wind farms as a result of these advantages.

REFERENCES
[1] S. Huang, Q. Wu, Y. Guo and F. Rong, "Hierarchical Active Power Control of DFIG-Based Wind Farm With
Distributed Energy Storage Systems Based on ADMM," in IEEE Transactions on Sustainable Energy, vol. 11, no. 3, pp.
1528-1538, July 2020, doi: 10.1109/TSTE.2019.2929820.
[2] Z. Dong, Z. Li, L. Du, Y. Liu and Z. Ding, "Coordination Strategy of Large-Scale DFIG-Based Wind Farm for Voltage
Support With High Converter Capacity Utilization," in IEEE Transactions on Sustainable Energy, vol. 12, no. 2, pp.
1416-1425, April 2021, doi: 10.1109/TSTE.2020.3047273.
[3] N. Shabanikia, A. A. Nia, A. Tabesh and S. A. Khajehoddin, "Weighted Dynamic Aggregation Modeling of Induction
Machine-Based Wind Farms," in IEEE Transactions on Sustainable Energy, vol. 12, no. 3, pp. 1604-1614, July 2021,
doi: 10.1109/TSTE.2021.3057854.
[4] B. Liu et al., "Impedance Modeling of DFIG Wind Farms With Various Rotor Speeds and Frequency Coupling," in
IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 68, no. 1, pp. 406-410, Jan. 2021, doi:
10.1109/TCSII.2020.2997927.
[5] X. Wang, H. Yu, Y. Lin, Z. Zhang and X. Gong, "Dynamic Equivalent Modeling for Wind Farms With DFIGs Using
the Artificial Bee Colony With K-Means Algorithm," in IEEE Access, vol. 8, pp. 173723-173731, 2020, doi:
10.1109/ACCESS.2020.3024212.
[6] Y. Zhang, C. Klabunde and M. Wolter, "Frequency-Coupled Impedance Modeling and Resonance Analysis of DFIG-
Based Offshore Wind Farm With HVDC Connection," in IEEE Access, vol. 8, pp. 147880-147894, 2020, doi:
10.1109/ACCESS.2020.3015614.
[7] M. Wang et al., "Impedance Modeling and Stability Analysis of DFIG Wind Farm With LCC-HVDC Transmission," in
IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 12, no. 1, pp. 7-19, March 2022, doi:
10.1109/JETCAS.2022.3144999.
[8] K. Sun, W. Yao, J. Fang, X. Ai, J. Wen and S. Cheng, "Impedance Modeling and Stability Analysis of Grid-Connected
DFIG-Based Wind Farm With a VSC-HVDC," in IEEE Journal of Emerging and Selected Topics in Power Electronics,
vol. 8, no. 2, pp. 1375-1390, June 2020, doi: 10.1109/JESTPE.2019.2901747.
[9] J. Shair, X. Xie, J. Yang, J. Li and H. Li, "Adaptive Damping Control of Sub synchronous Oscillation in DFIG-Based
Wind Farms Connected to Series-Compensated Network," in IEEE Transactions on Power Delivery, vol. 37, no. 2, pp.
1036-1049, April 2022, doi: 10.1109/TPWRD.2021.3076053.
[10] J. Liu et al., "Impact of Power Grid Strength and PLL Parameters on Stability of Grid-Connected DFIG Wind Farm," in
IEEE Transactions on Sustainable Energy, vol. 11, no. 1, pp. 545-557, Jan. 2020, doi: 10.1109/TSTE.2019.2897596.
[11] L. M. Castro and E. Acha, "On the Dynamic Modeling of Marine VSC-HVDC Power Grids Including Offshore Wind
Farms," in IEEE Transactions on Sustainable Energy, vol. 11, no. 4, pp. 2889-2900, Oct. 2020, doi:
10.1109/TSTE.2020.2980970.
[12] Y. Wu and P. Zhang, "Online Monitoring for Power Cables in DFIG-Based Wind Farms Using High-Frequency
Resonance Analysis," in IEEE Transactions on Sustainable Energy, vol. 13, no. 1, pp. 378-390, Jan. 2022, doi:
10.1109/TSTE.2021.3113017.
[13] R. Venkateswaran and Y. H. Joo, "Retarded Sampled-Data Control Design for Interconnected Power System With
DFIG-Based Wind Farm: LMI Approach," in IEEE Transactions on Cybernetics, vol. 52, no. 7, pp. 5767-5777, July
2022, doi: 10.1109/TCYB.2020.3042543.
[14] N. Tong et al., "Coordinated Sequential Control of Individual Generators for Large-Scale DFIG-Based Wind Farms," in IEEE Transactions on Sustainable Energy, vol. 11, no. 3, pp. 1679-1692, July 2020, doi: 10.1109/TSTE.2019.2936757.
[15] H. Dong, M. Su, K. Liu and W. Zou, "Mitigation Strategy of Sub synchronous Oscillation Based on Fractional-Order
Sliding Mode Control for VSC-MTDC Systems With DFIG-Based Wind Farm Access," in IEEE Access, vol. 8, pp.
209242-209250, 2020, doi: 10.1109/ACCESS.2020.3038665.
[16] F. Z. Naama et al., "Model and simulation of wind turbine and its associated Permanent Magnet Synchronous Generator," Energy Procedia, pp. 1-10, 2019.


Negative Emotion Detection using ECG and HRV Features
Sindhu N1 and Dr Jerritta S2
1 School of Engineering, Department of Electronics & Communication Engineering, Vels Institute of Science, Technology and Advanced Studies (VISTAS), and Associate Professor, College of Engineering Trivandrum, Kerala
Email: [email protected]
2 Associate Professor, School of Engineering, Department of Electronics & Communication Engineering, Vels Institute of Science, Technology and Advanced Studies (VISTAS)
Email: [email protected]

Abstract—Emotions cause different physical, behavioural and cognitive changes in the human body. Emotions can be positive and negative. Negative emotion is the experience of negative feelings such as anger, frustration, panic, stress and fear. These negative emotions can cause severe health problems, so there is a need for the detection of negative emotions; it will help in improving human health. As these emotions result in changes in various physiological parameters like heart rate, skin temperature, blood pressure, skin conductance, etc., these signals can be used to detect the emotions of a person. These signals are generated by the body during the functioning of various physiological systems, so they cannot be regulated artificially; for this reason, they are a reliable source of such information, and the physiological signal is one of the most important factors in the field of emotion detection. The changes in the signals exhibit certain characteristics which are used to estimate the emotions. This work mainly focuses on building a better model of negative emotion detection for the Typically Developed group using a machine learning approach with the help of the electrocardiogram (ECG) signal. The study was conducted on the DECAF database for the typically developed group, and it focused on extracting the relevant features from both the ECG and HRV signals and then identifying which contributes more towards negative emotion detection. A machine learning model was developed for the typically developed group using db4 as the mother wavelet for feature extraction. The significant features of ECG and HRV were then classified separately using logistic regression, ensemble and support vector machine classifiers. The logistic regression classifier achieved maximum accuracy using HRV data for the typically developed (TD) group.

Index Terms— ECG, negative emotion detection, DWT, machine learning.

I. INTRODUCTION
Emotion is a state of mind that arises spontaneously and is accompanied by physiological changes. Emotion is
made up of three parts: a subjective component that defines how we feel emotions, a physiological component
that describes how our bodies react to emotions, and an expressive component that reflects the human reaction to
each emotion. External motivations, thoughts, and changes in interior feelings are all referred to as emotion.
Emotion recognition has become a vast field of study in cognitive science, engineering, and psychology.
Emotion detection has been used in psychology to comprehend the feelings of persons being counselled. It
is also employed in the medical field to help disabled and elderly persons. Emotions can be positive or negative.
The sensation of negative feelings such as anger, frustration, panic, stress and fear is known as negative
emotion. Negative emotions might lead to serious health issues. As a result, there is a requirement for negative
emotion detection, which will aid in the improvement of human health. The early methods employed were facial
expression and voice processing, but the primary difficulty with these approaches is that they may be readily
hidden, since a person can act and mask the true feelings. Physiological signals can be utilised to determine a
person's emotions, since emotions cause changes in physiological characteristics such as heart rate, skin
temperature, blood pressure, and skin conductance. Because these signals are produced by the body during the
operation of numerous physiological systems, they cannot be intentionally managed. As a result, they are a
trustworthy source for detecting and forecasting such information, and physiological signals are among the most
essential factors in the field of emotion detection. Physiological detection of emotions can perform better than
other techniques, since physiological signals are not under the voluntary control of the human. Emotion can be
modeled with a two-dimensional emotion model consisting of valence and arousal, where valence denotes the
pleasantness or polarity of the emotion stimulus and arousal represents the strength of the emotion.
Since brain activity and emotions are not well mapped for autistic people, EEG cannot be used for emotion
recognition in such cases. In this work, ECG is employed for detecting the emotions. DWT is used for feature
extraction, and the extracted features, after selection, are used as training data for the classifiers. The
significant features of ECG and HRV were then classified separately using logistic regression, ensemble and
support vector machine classifiers. This paper consists of five sections. Section II gives a detailed review of
previous works carried out in this field. Section III describes the proposed system, Section IV gives the results
and discussion, and Section V concludes the work.

II. RELATED WORK


Zi Cheng et al. [2] used various combinations of features extracted from the ECG signal and its derived HRV to
detect negative emotion. Emotions were evoked using 15 standardized film clips. HRV was derived using an
automatic R-peak detection algorithm. They extracted a total of 28 features, including seven linear-derived
features, ten nonlinear-derived features, four time-domain (TD) features and six time-frequency domain (T-F D)
features. Five classifiers, including SVM, Random Forest (RF), k-Nearest Neighbor (kNN), Decision Tree (DT) and
Gradient Boost Decision Tree (GBDT), were also compared. Among all these combinations, the best result was
achieved by using only the 6 time-frequency domain features derived from wavelets with SVM, which showed the
highest accuracy of 79.51% and the lowest time cost of 0.13 ms. Han-Wen Guo et al. [3] used 3-10 minute
video excerpts for eliciting emotions such as wrath, fear, sadness, happiness, and relaxation. They used time-
domain, frequency-domain, Poincaré, and statistical analysis to extract heart rate variability (HRV) components
from an ECG signal. Time-domain analysis extracts characteristics such as the mean, coefficient of variation,
standard deviation of the RR intervals, and standard deviation of successive differences of the RR intervals.
FFT-extracted parameters such as low frequency, high frequency, and the LF/HF ratio were used to perform spectral
analysis of the HRV data. The statistical analysis elements include the kurtosis coefficient, skewness, and
entropy. The length along the line of identity (SD2) and the breadth across this line (SD1) are the Poincaré
properties of the point clouds. PCA was used as the data reduction technique, selecting 5 features as relevant.
These relevant features were then used to classify different emotion states with a support vector machine (SVM).
Using 13 HRV features, the classification accuracy for 2 emotions (negative and positive) was 70.3% and that for
5 emotions was 52%. After feature selection, the accuracy for 2 emotions was 71.4% and that for 5 emotions was 56.9%.
M. S. Goodwin et al. [4] investigated whether previously collected physiological and motion data from a wrist-worn
biosensor can predict aggression toward others in children with ASD. They recorded peripheral physiological and
motion signals from a biosensor worn by 20 youth with ASD and developed prediction models based on ridge-
regularized logistic regression, using time-series feature extraction and a logistic regression classifier. B.
Anandhi et al. [5] analyzed the QRS complex derived from the ECG signal for emotion recognition. A personalized
emotion elicitation protocol was developed for children with ASD, with emotions evoked by audio-visual stimuli,
and the study was conducted on 10 children with ASD. Various digital filters were used for noise removal and
quality improvement. Different linear and nonlinear features were extracted from the QRS complex, and one-way
ANOVA was used for the analysis of these features. Finally, the authors used k-nearest neighbor and ensemble
classifiers for the classification of emotions and achieved an accuracy of 70.5% for children with ASD.

DECAF [1] is a database containing the physiological responses to different emotions elicited in 30 subjects
using 36 movie clips and 40 one-minute music video segments. Different signals, such as MEG, horizontal
electrooculogram (hEOG), ECG, trapezius electromyogram (tEMG) and near-infrared facial videos, were recorded
synchronously.

III. METHODOLOGY
The work focuses on extracting the relevant features from both the ECG and HRV signals and identifying which
contributes more towards negative emotion detection. A machine learning model was developed for the typically
developed group using db4 as the mother wavelet for feature extraction. The significant features of ECG and HRV
were then classified separately using logistic regression, ensemble and support vector machine classifiers. ECG
data for various emotions was taken from DECAF [1]. In this work, the ECG signals corresponding to two different
emotions, happy and sad, are considered. The block diagram is shown in Fig. 1.

Fig. 1. Block Diagram of Negative Emotion Detection System

A. ECG Data
ECG data was taken from the DECAF database [1]. It is a multimodal database containing physiological responses
to various emotions. The emotions were elicited using 36 one-minute movie segments and 40 one-minute music
segments, and the experiment was carried out on 30 healthy subjects. In this work, only the ECG responses to the
one-minute movie segments for the emotions happy and sad were analysed.
B. Pre-Processing
Pre-processing is done to improve the quality of the signals by removing noise, including power line
interference, baseline wander and high-frequency noise [6]. This improves the quality of the negative emotion
detection method. Baseline wander is a low-frequency noise that arises from breathing, the electrodes attached to
the body, or subject movement. It occurs in the frequency range of 0.5 to 0.6 Hz and can cause the amplitude of
the QRS complex to increase significantly. A wavelet-based approach is well suited for removing baseline wander
from ECG signals: the DWT-based method uses high-level decomposition to eliminate the low-frequency components
corresponding to the baseline variation. DWT was performed using Daubechies (db8) as the mother wavelet because
of the similarity of the wavelet function to the shape of the ECG signal [7]. High-frequency noise due to power
line interference was then removed using a 6th-order low-pass Butterworth filter with a cut-off frequency of
50 Hz, since in India the power line frequency is 50 Hz. After noise removal, heart rate variability (HRV) was
derived; it refers to the variation of the time interval between successive heartbeats. Fig. 2 and Fig. 3 show
the raw and corresponding pre-processed signals for the happy emotion, and Fig. 4 and Fig. 5 show the raw and
corresponding pre-processed signals for the sad emotion.
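As an illustration, the pre-processing chain described above could be realized along the following lines in Python. This is a minimal sketch, not the authors' code: the sampling rate fs, the decomposition depth and all helper names are assumptions.

import numpy as np
import pywt
from scipy.signal import butter, filtfilt

fs = 1000  # assumed sampling rate in Hz; DECAF's actual rate may differ

def remove_baseline_wander(ecg, wavelet="db8", level=9):
    # High-level db8 decomposition; the approximation coefficients hold
    # the low-frequency baseline drift, so zero them and reconstruct.
    coeffs = pywt.wavedec(ecg, wavelet, level=level)
    coeffs[0] = np.zeros_like(coeffs[0])
    return pywt.waverec(coeffs, wavelet)[:len(ecg)]

def remove_highfreq_noise(ecg, cutoff=50.0, order=6):
    # 6th-order Butterworth low-pass with a 50 Hz cut-off, applied
    # forward-backward (zero phase) so the QRS morphology is not shifted.
    b, a = butter(order, cutoff, btype="low", fs=fs)
    return filtfilt(b, a, ecg)

# clean = remove_highfreq_noise(remove_baseline_wander(raw_ecg))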

Fig. 2. Raw ECG signal containing happiness of TD group

Fig. 3. Pre-processed ECG signal containing happiness of TD group

C. Feature extraction
Here, features are extracted from both the ECG and the HRV. Different features are extracted to capture the
emotional content in the signal. Feature extraction reduces the redundant data present in the signal and thereby
helps to obtain useful information from it. Different feature extraction techniques have been used in the
literature [8]; in this work, the Discrete Wavelet Transform (DWT) was used.
DWT makes use of a mother wavelet, a single prototype function that is scaled and shifted to decompose the input
signal into frequency sub-bands.

Fig. 4. Raw ECG signal containing sadness of TD group
Fig. 5. Pre-processed ECG signal containing sadness of TD group

DWT decomposes the original signal into approximation and detail coefficients with the help of a low-pass filter
and a high-pass filter. The output of the low-pass filter (LPF) is known as the approximation coefficients and
the output of the high-pass filter (HPF) is known as the detail coefficients. The output of the LPF is again
applied to an HPF and an LPF, which forms the second decomposition level. In this study a 14-level decomposition
is performed, because the emotional content is present in both the low-frequency and high-frequency bands [10].
The detail coefficients from the 11th to the 14th level are used for extracting the various features.
Wavelet transforms make use of mother wavelets. Different wavelets include the Daubechies (db) wavelet, Haar
wavelet, Symlet wavelet, Coiflet wavelet, etc. Daubechies wavelets are orthogonal wavelets characterized by a
maximum number of vanishing moments for a given support length. These wavelets are denoted dbN, where N
represents the order and usually varies from 1 to 8. In this work, the analyses were carried out using the db4
mother wavelet. ECG and HRV features were extracted for negative emotion detection using the two emotions. All
the features extracted from the ECG and HRV data are listed in Table I.
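To make the wavelet feature set concrete, the sketch below computes the Table I statistics on the 11th to 14th detail coefficients of a 14-level db4 decomposition. It is illustrative only; the function and feature names are ours, and the signal is assumed to be long enough to support 14 levels.

import numpy as np
import pywt
from scipy.stats import entropy, kurtosis, skew

def wavelet_features(signal, wavelet="db4", level=14):
    # coeffs = [a14, d14, d13, ..., d1]; keep detail levels d14..d11.
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    details = {14: coeffs[1], 13: coeffs[2], 12: coeffs[3], 11: coeffs[4]}
    feats = {}
    for lvl, d in details.items():
        feats[f"max_d{lvl}"] = np.max(d)
        feats[f"min_d{lvl}"] = np.min(d)
        feats[f"mean_d{lvl}"] = np.mean(d)
        feats[f"median_d{lvl}"] = np.median(d)
        feats[f"std_d{lvl}"] = np.std(d)
        feats[f"mad_d{lvl}"] = np.mean(np.abs(d - np.mean(d)))
        feats[f"range_d{lvl}"] = np.ptp(d)
        feats[f"power_d{lvl}"] = np.sum(d ** 2)
        feats[f"L1_d{lvl}"] = np.sum(np.abs(d))
        feats[f"L2_d{lvl}"] = np.sqrt(np.sum(d ** 2))
        feats[f"kurtosis_d{lvl}"] = kurtosis(d)
        feats[f"skewness_d{lvl}"] = skew(d)
        hist, _ = np.histogram(d, bins=32, density=True)
        feats[f"entropy_d{lvl}"] = entropy(hist + 1e-12)
    hf = feats["power_d11"] + feats["power_d12"]  # HF power: levels 11 and 12
    lf = feats["power_d13"] + feats["power_d14"]  # LF power: levels 13 and 14
    feats["HF_power"] = hf
    feats["LF_power"] = lf
    feats["LF_power_norm"] = lf / (lf + hf)
    feats["HF_power_norm"] = hf / (lf + hf)
    feats["total_power"] = hf + lf
    feats["ratio"] = hf / lf
    return feats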
In addition, the time-domain features of the HRV are also considered. These include the mean R-R interval
difference (meanRR), the root mean square of successive R-R interval differences (RMSSD), the number of
successive R-R intervals that differ by more than 50 ms (NN50), the percentage of successive R-R intervals that
differ by more than 50 ms (pNN50), the standard deviation of the R-R intervals (SD RR), and the standard
deviation of the heart rate (SD HR).

TABLE I. FEATURES EXTRACTED FROM ECG AND HRV

Sl. no. | Feature        | Description
1       | max            | Maximum value of signal in each level
2       | min            | Minimum value of signal in each level
3       | mean           | Mean value of signal in each level
4       | median         | Median value of signal in each level
5       | std            | Standard deviation of signal in each level
6       | mad            | Mean absolute deviation of signal in each level
7       | range          | Range of signal in each level
8       | power          | Power of signal in each level
9       | L1 norm        | L1 norm of signal in each level
10      | L2 norm        | L2 norm of signal in each level
11      | kurtosis       | Kurtosis value of signal in each level
12      | entropy        | Entropy value of signal in each level
13      | skewness       | Skewness value of signal in each level
14      | HF power       | Sum of power of levels 11 and 12
15      | LF power       | Sum of power of levels 13 and 14
16      | LF power norm  | LF power/(LF power + HF power)
17      | HF power norm  | HF power/(LF power + HF power)
18      | total power    | HF power + LF power
19      | ratio          | Ratio of HF power and LF power

D. Feature selection
The extracted ECG and HRV features were then analysed statistically to identify the features that differ
significantly between the happy and sad classes, and only these significant features were carried forward to
classification. The significant features, together with their significance (p) values and class-wise mean values,
are listed in Table II for ECG and Table III for HRV.
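The paper does not name the statistical test used; as one plausible realization, an independent-samples t-test over the per-trial feature matrices could flag the significant features, as sketched below (both the test and the 0.05 threshold are assumptions).

import numpy as np
from scipy.stats import ttest_ind

def select_significant(happy_feats, sad_feats, alpha=0.05):
    # happy_feats, sad_feats: 2-D arrays with one row per trial and one
    # column per feature; returns indices of features with p < alpha.
    _, p_values = ttest_ind(happy_feats, sad_feats, axis=0)
    return np.where(p_values < alpha)[0], p_values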

TABLE II. SIGNIFICANT FEATURES WITH P AND MEAN VALUES FOR ECG

Sl no. | Feature     | Sig. (p) value | Mean value (happy) | Mean value (sad)
1      | mediand11   | 0.044          | -2.3x10^-6         | 9.86x10^-7
2      | kurtosisd11 | 0.034          | 12.933             | 14.235
3      | ratio       | 0.01           | 0.3573             | 0.4241

TABLE III. SIGNIFICANT FEATURES WITH P AND MEAN VALUES FOR HRV

Sl no. | Feature | Sig. (p) value | Mean value (happy) | Mean value (sad)
1      | meand11 | 0.028          | 618.372            | 692.67
2      | L1d14   | 0.012          | -3.5x10^7          | -5.4x10^7
3      | NN50    | 0.000          | 70.504             | 70.90
4      | SD HR   | 0.000          | 26.808             | 28.004

E. Classification
The significant features obtained after feature selection are classified using various machine learning
algorithms. Every machine learning classifier has two phases: a training phase followed by a testing phase. 70%
of the total available data is used for training, and the model is tested on the remaining 30%. The classifiers
assign the significant features to the emotions happiness and sadness for the typically developed group. Here,
three different machine learning models, namely logistic regression, ensemble and SVM, are used for negative
emotion detection.
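A minimal sketch of this protocol with scikit-learn is given below; it is illustrative only, and the random forest merely stands in for the unspecified ensemble model.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def evaluate(X, y):
    # X: significant ECG or HRV features, y: 0 = happy, 1 = sad.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=0)
    models = {
        "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression()),
        "Ensemble (random forest)": RandomForestClassifier(n_estimators=100),
        "SVM": make_pipeline(StandardScaler(), SVC()),
    }
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        y_hat = model.predict(X_te)
        print(name, accuracy_score(y_te, y_hat))
        print(confusion_matrix(y_te, y_hat))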

IV. RESULTS AND DISCUSSION


The features extracted from the ECG signal include time-frequency domain features and frequency domain features.
The time-frequency domain features are the maximum value, minimum value, mean value, median value, standard
deviation, mean absolute deviation, range, power, L1 norm, L2 norm, entropy, kurtosis and skewness of the 11th to
14th decomposition level detail coefficients. The frequency domain features are the high frequency power, low
frequency power, total power, low frequency power norm, high frequency power norm and the ratio of the low
frequency and high frequency powers.
The features extracted from the HRV data include time-frequency domain features, time domain features and
frequency domain features. The time-domain features are the mean R-R interval difference (meanRR), root mean
square of successive R-R interval differences (RMSSD), number of successive R-R intervals that differ by more
than 50 ms (NN50), percentage of successive R-R intervals that differ by more than 50 ms (pNN50), standard
deviation of the R-R intervals (SD RR), and standard deviation of the heart rate (SD HR). The frequency domain
features are the high frequency power, low frequency power, total power, low frequency power norm, high frequency
power norm and the ratio of the low frequency and high frequency powers.
For the typically developed group, the significant ECG features obtained with the db4 analysis are mediand11,
kurtosisd11 and the ratio of the low frequency and high frequency powers. The significant HRV features obtained
with the db4 analysis are meand11, L1d14, NN50 and SD HR.
Logistic regression, ensemble and SVM classifiers were used to classify the signals into the emotions happiness
and sadness. The training phase was carried out on 70% of the feature data set and the testing phase on the
remaining 30%. The confusion matrix of the logistic regression classifier obtained for the db4 analysis using the
HRV features of the typically developed group is shown in Fig. 6. Accuracy measures the proportion of correctly
classified signals with respect to the total number of signals.
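For reference, with TP, TN, FP and FN denoting the true positives, true negatives, false positives and false
negatives read off the confusion matrix, the reported metrics are the standard ones: Accuracy =
(TP + TN)/(TP + TN + FP + FN), Precision = TP/(TP + FP), Recall = TP/(TP + FN), and F1 score =
2 x Precision x Recall/(Precision + Recall).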
The classification results obtained for the typically developed group are given in Table IV and Table V. The
analysis indicates that the HRV data, which captures the variability in heart rate, is an effective indicator for
detecting negative emotions. The logistic regression and ensemble classifiers were found to perform better than
the other classifiers.

Fig. 6. Confusion matrix obtained for Logistic regression using HRV features

TABLE IV. SUMMARY OF RESULTS OBTAINED BY USING ECG FEATURES

Model               | Accuracy | Precision | Recall | F1 score
Logistic Regression | 63.3%    | 60%       | 64.2%  | 62%
Ensemble            | 66.7%    | 66.7%     | 66.7%  | 66.7%
SVM                 | 66.7%    | 66.7%     | 66.7%  | 66.7%

TABLE V. SUMMARY OF RESULTS OBTAINED BY USING HRV FEATURES

Model               | Accuracy | Precision | Recall | F1 score
Logistic Regression | 96.7%    | 93.3%     | 100%   | 96.5%
Ensemble            | 90%      | 86.7%     | 92%    | 87.5%
SVM                 | 86.7%    | 86.7%     | 86.7%  | 86.7%

V. CONCLUSION
Electrocardiogram signals are an effective means of analyzing human emotions. In this work, negative emotion
detection for the typically developed group was performed using different classification models, and the ECG and
HRV features were compared. The HRV features were found to contribute more towards emotion detection, and the
logistic regression and ensemble classifiers showed better performance than the other machine learning algorithms
for the typically developed group.

REFERENCES
[1] Mojtaba Khomami Abadi, Ramanathan Subramanian, Seyed Mostafa Kia, Paolo Avesani, Ioannis Patras, and Nicu
Sebe, "DECAF: MEG-Based Multimodal Database for Decoding Affective Physiological Responses", IEEE Transactions
on Affective Computing, vol. 6, no. 3, July-September 2015.
[2] Z. Cheng, L. Shu, J. Xie and C. L. P. Chen, "A novel ECG-based real-time detection method of negative
emotions in wearable applications", 2017 International Conference on Security, Pattern Analysis, and Cybernetics
(SPAC), Shenzhen, 2017, pp. 296-301.
[3] H. Guo, Y. Huang, C. Lin, J. Chien, K. Haraikawa and J. Shieh, "Heart Rate Variability Signal Features for
Emotion Recognition by Using Principal Component Analysis and Support Vectors Machine", 2016 IEEE 16th
International Conference on Bioinformatics and Bioengineering (BIBE), Taichung, 2016, pp. 274-277.
[4] M. S. Goodwin, C. A. Mazefsky, S. Ioannidis, D. Erdogmus, and M. Siegel, "Predicting aggression to others in
youth with autism using a wearable biosensor," Autism Research, vol. 12, no. 8, pp. 1286-1296, 2019.
[5] Anandhi, B., and S. Jerritta, "Recognition of valence using QRS complex in children with Autism Spectrum
Disorder (ASD)", IOP Conference Series: Materials Science and Engineering, vol. 1070, IOP Publishing, 2021.
[6] P. Chettupuzhakkaran and N. Sindhu, "Emotion recognition from physiological signals using time-frequency
analysis methods", 2018 International Conference on Emerging Trends and Innovations in Engineering and
Technological Research (ICETIETR 2018), pp. 1-5, 2018.
[7] M. Bassiouni, E.-S. El-Dahshan, W. Khalefa, and A.-B. M. Salem, "Intelligent hybrid approaches for human ECG
signals identification," Signal, Image and Video Processing, vol. 12, pp. 941-949, 2018.
[8] L. Shu, J. Xie, M. Yang, Z. Li, Z. Li, D. Liao, X. Xu, and X. Yang, "A review of emotion recognition using
physiological signals", Sensors, vol. 18, no. 7, 2018.
[9] A. Bagirathan, J. Selvaraj, A. Gurusamy, and H. Das, "Recognition of positive and negative valence states in
children with autism spectrum disorder (ASD) using discrete wavelet transform (DWT) analysis of electrocardiogram
signals (ECG)," Journal of Ambient Intelligence and Humanized Computing, vol. 12, no. 1, pp. 405-416, 2021.
[10] M. Murugappan, S. Murugappan, and B. S. Zheng, "Frequency band analysis of electrocardiogram (ECG) signals
for human emotional state classification using discrete wavelet transform (DWT)", Journal of Physical Therapy
Science, vol. 25, no. 7, pp. 753-759, 2013.

Author Index

A
Aarya Pawar 7
Abdul Rahman 231
Abhishek Kajal 155
Abinanda P 110
Aditya Tripathi 178
Ajay U Surwade 61
Ali Albkhrani 269
Amol Dhakne 116
Anjani Pujitha PSVL 295
Anusha N 282
Arjyadhara Pradhan 254
Arshad Ali 178
Arun Kumar Dash 217
Aswini J 217
Ayushi Agarwal 14, 23

B
Babita Panda 254
Bachu Munideepika 50
Bharti W Gawali 244, 269
Biplab Bag 130
Brijesh Prasad 290
Brunda U 140

C
Chandrasekhar D 276
Chitralekha Jena 254

D
Danish Tamboli 72
Deepak Nandal 77
Deepesh Bhati 303
Deepti Jagyasi 148
Dhyaneshwaran J 29

F
Farrel Deva Asir J 29

G
Gayathri G 110
Gopi Sivanadh D V S 295

H
Halkarnikar P P 116
Harish Kumar 93
Harshavardhan J 295
Hemant Singh Pokhariya 290
Himanshu Pal 290

J
Jayadeep K 223
Jerritta S 309

K
Kalaiselvi K 140
Kanchan Shelke 174
Karthick Myilvahanan J 33
Karthik A 295
Kavya K 276
Khandagale H P 116
Kirti Thakur 93
Krishnaveni A 33

L
Lakhan Jadhav 185
Lipika Nanda 254

M
Madhukumar Patnala 50
Malarvizhi N 217
Mani Barathi S P S 237
Manikanta V 217
Manne Sowmya 223
Mansi Patil 102
Mayur Gaikwad 185
Minakshi M Sonawane 269
Mohana Sundaram N 33
Mohit Patil 67
Mrityunjaya Kappali 260

N
Nachiket Joshi 55
Nallapothula Sreenivasulu 50
Neelam Chandolikar 102
Neeraj Manikanta Sai G 276
Nikhil Patil 67
Nikita Lahon 254
Nilesh Gopale 55
Ninad Deogaonkart 55
Nishant Kulkarni 7
Nivetha K 33

P
Palakuru Akhilesh 140
Pankaj 77
Paramjit 43
ParthaPratim Sarkar 130
Poojitha K 276
Pranav Kurle 72
Prasad Kulkarni 185
Pratap Patil 67
Pratham Khinvsara 7
Pratham Patil 67
Pravin V Dhole 244
Preethi N 1
Priya Shelke 55
Priyadharshini R 85
Priyanshi Patil 67
Pushan Deb 1

R
Radha T Deoghare 136
Raghavendra Kulkarni 196
Raja Sri A 223
Ramchandra Adware 148
Ramesh R Manza 269
Rathika J 110
Rekha V S 110
Revant Pund 7
Rishikesh Dayma 7
Ritika Rastogi 14
Riya Gupta 14
Rohit Desai 55
Rupali Umbare 72
Rutuja Kadam 174

S
Sai Prakash M 223
Sai Vamsi Reddy 282
Sandeep Bhongade 303
Santosh R 33
Sapana A Kolambe 136
Saranya S 29, 163, 168
Saurabh Charya 43
Senthilkumar B 237
Shantanu Sharma 23
Sharon Christa 122
Shashank Saxena 178
Shivanand Koli 185
Shiyam R 168
Shreeyanshi Gautam 23
Shubham Ghalme 174
Sindhu N 309
Snehmani 93
Sourav Kumar Satpathy 254
Srividya R 168
Sudha Abirami R 207
Sudhir Mendhekar 269
Sujithra M 85, 110, 237
Sukhwant Kour Siledar 190
Sulaiman Awadh Ali Obaid Maeli 61
Sulochana D Shejul 244
Sumalatha A 276
Sunny Kumar 1
Suresh Kumar G 207
Suruchi Dedgaonkar 55
Sushanta Sarkar 130
Susmita Bala 130
Swathi N 223, 282
Swati Shilaskar 102

T
Tanusha Mittal 122
Teja Sai Ethesh 282
Totthuku Sunil 50
Trushant Jadhav 72
Tupe U L 185
Tushar Raikar 7

U
Umakant Tupe 174
Umamaheswari K 295

V
Vaibhav Rana 155
Vaishali Khupase 102
Varun Mishra 290
Vedant Bhamre 72
Velvadivu P 85, 110
Vetti Pavithra 50
Vijay D Dhangar 244
Vijay Prakash 178
Vijaya B Musande 190
Vikas Rathi 290
Vinayak C Magadal 260

Y
Yashwanth Krishna A 282
Yashwanth M 163
Yoga Verma V 163

Z
Zeba Khan 231