AITC and CSSP 2023 Proceedings
All content following this page was uploaded by Aditya Kumar Tripathi on 25 April 2023.
Chief Editors
Dr. Janahanlal PS
Viswajyothi College of Engineering, India
Hinweis Research
HR comprises the world's most distinguished Engineers, Scientists and Academicians covering the entire
spectrum of scientific disciplines.
Published by
Hinweis Research
KP 7/581, Kazhakkuttam, Thiruvananthapuram, Kerala, India-695301
Proceedings of the
Joint International Conferences on AITC and CSSP 2023
ISBN: 978-81-958173-5-1
Technical Chair
Dr. Phiroj Shaikh, Don Bosco Institute of Technology, India
General Chair
Dr. Mukta Dhopeshwarkar, Dr Babasaheb Ambedkar Marathwada University, India
Dr. Shuzlina Binti Abdul Rahman, Universiti Teknologi MARA, Malaysia
General Co-Chair
Dr. Sonali B. Kulkarni, Dr.Babasaheb Ambedkar Marathwada University, India
Dr. Rekha. K S, The National Institute of Engineering, India
Publicity Chair
Dr. Pingkun Yan, Philips Research North America
Dr. Savan K Patel, Ganpat University, India
Dr. C Namrata Mahender, Dr Babasaheb Ambedkar Marathwada University, India
Publicity Co-Chair
Prof. Dacheng Tao, NTU, Singapore
Dr. Amlan Chakrabarti, University of Calcutta, India
Dr. Sonali B. Kulkarni, Dr.Babasaheb Ambedkar Marathwada University, India
2. Home Automation with Node MCU & Firebase using Internet of Things (IoT) 7-13
Aarya Pawar, Pratham Khinvsara, Revant Pund, Tushar Raikar,
Rishikesh Dayma and Nishant Kulkarni
7. Detecting and Isolating Black-Hole Attacks in Manet using Timer based 43-49
Baited Technique
Paramjit and Saurabh Charya
8. Design of Wideband Band Stop Filter using Signal Interference Technique 50-54
Madhukumar Patnala, Bachu Munideepika, Vetti Pavithra, Totthuku Sunil
and Nallapothula Sreenivasulu
10. Phishing E-Mail Detection and Blocking it based on the Header Elements 61-66
Sulaiman Awadh Ali Obaid Maeli and Ajay U Surwade
15. Variable Selection Methods, Comparison and their Applications in Machine 93-101
Learning: A Review
Kirti Thakur, Harish Kumar and Snehmani
16. Devanagari Characters Recognition: Extracting Best Match for Photographed 102-109
Text
Neelam Chandolikar, Swati Shilaskar, Vaishali Khupase and Mansi Patil
18. Credit Risk Analysis of Loans using Social media Information 116-121
Halkarnikar P P, Khandagale H P and Amol Dhakne
20. Design of a Miniaturized Microstrip Antenna using Slots on the Radiating 130-135
Patch for Wireless Applications
Susmita Bala, Biplab Bag, Sushanta Sarkar and ParthaPratim Sarkar
23. Effects of Integration of Electric Vehicle Charging Stations into the Grid 148-154
Deepti Jagyasi and Ramchandra Adware
24. Enhancement of Accuracy and Performance of Deep Learning System for 155-162
Intrusion Detection System
Abhishek Kajal and Vaibhav Rana
28. A Novel Approach with Deep Learning Method with Effective Storage 178-184
Security in Hybrid Clouds
Vijay Prakash, Aditya Tripathi, Shashank Saxena and Arshad Ali
29. Video Surveillance Fire Detection System using CNN Algorithm 185-189
Tupe U L, Lakhan Jadhav, Shivanand Koli, Prasad Kulkarni and
Mayur Gaikwad
31. Review of AI/ML in Software Defined Network from Past to Present 196-206
Raghavendra Kulkarni
32. Preprocessing and Segmentation of Retinal Blood Vessels in Fundus Images 207-216
using U-Net
Sudha Abirami R and Suresh Kumar G
35. Data Science based Recommendation System -An Application of Computer 231-236
Science
Zeba Khan and Abdul Rahman
40. Skin Disease Identification using online and Offline Data Prediction using 269-275
CNN Classification
Minakshi M Sonawane, Ali Albkhrani, Bharti W. Gawali, Ramesh R Manza
and Sudhir Mendhekar
42. Automatic Industrial Gas Leakage Detection and Control System 282-289
Teja Sai Ethesh, Swathi N, Sai Vamsi Reddy, Anusha N and
Yashwanth Krishna A
45. Application of Grey Wolf Optimization Algorithm for Improving Inertia 303-308
Constant Selection in Wind Farm Deployments
Deepesh Bhati and Sandeep Bhongade
46. Negative Emotion Detection using ECG and HRV Features 309-316
Sindhu N and Jerritta S
Grenze International Journal of Engineering and Technology, June Issue
Abstract—Vehicle detection algorithms run on deep neural networks, but a problem arises when the vehicle scale keeps changing: detections may be false or missed altogether, especially when the object is tiny. Algorithms such as CNN, Fast R-CNN, and Faster R-CNN then have a high probability of missed detection. To tackle this situation, the YOLOv3 algorithm is used, and a multi-level feature pyramid is added in the codec module to resolve multi-scale vehicle detection problems. The experiment was carried out on the KITTI dataset and showed high accuracy across several environments, including tiny vehicle objects. YOLOv3 was able to meet the application demand, especially in traffic surveillance systems.
Index Terms— Surveillance video, vehicle detection, codec, convolutional neural network,
YOLOv3, moving object detection and tracking.
I. INTRODUCTION
Road accidents and crime are increasing day by day, and an intelligent road monitoring system has become the need of the hour. Vehicles need to be identified by their license plates so that a specific vehicle can be investigated further, the driver identified, and proper evidence provided to law enforcement [1]. Because of the growing number of network cameras, locally produced visual data, and netizens, it is difficult yet essential to analyze a large amount of background data at once. Moving object detection (MOD) is a technique for extracting dynamic foreground elements, such as moving pedestrians or automobiles, from video frames while removing the static background. Due to the recent success of convolutional neural networks (CNNs), there is a great deal of interest in deep learning-based object detection algorithms, and numerous models have achieved state-of-the-art results [2]. In particular, as opposed to hand-crafted methodologies, deep learning techniques that use proposal generation methods such as MultiBox, DeepBox, and region proposal networks (RPNs) can provide fewer, higher-quality candidates.
The YOLOv3 algorithm is a model pre-trained on the COCO dataset with 80 classes, which it can detect with an mAP of up to 69% under sunny weather conditions. Because YOLOv3 is pre-trained, detection is very fast and can be deployed in live traffic conditions. As shown in Fig. 1, the apparent size of a vehicle is inversely related to its distance, which may lead to incorrect or missed detection in some contexts.
This paper improves the YOLOv3 network to address this problem. Given that detectors such as SSD, YOLOv3, and FPN all use feature pyramid structures at the detection stage, this study proposes a novel multi-level feature pyramid structure, introduced into the codec module, to recognise vehicle targets of various forms. The multilayer features retrieved by the backbone network were first merged into basic features. The essential
found in CAMs, and they can be scattered across the map as multiple little portions, or they can overlap. There are four types of class activations that can affect how the bounding box develops, based on the number of instances of each existing class:
If a class has exactly one instance, there is just one instance of the item in the related CAM (a 28×28 grid). All of the active cells are counted. The locations of the topmost and leftmost activated cells on the vertical and horizontal axes of the grid are taken as the top-left corner of the bounding box, and the locations of the bottommost and rightmost are taken as the bottom-right corner, regardless of the shape of the region in the CAM (i.e., whether it is unitary or fragmented in the CAM's grid).
If there are several instances of a class and the number of instances equals the number of distinct regions in the CAMs, then each centre has a corresponding instance represented by a connected area in the CAM. Each region's bounds are determined by the minimum and maximum grid indices of its cells.
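The single-instance rule above (the extreme activated cells become the box corners) can be sketched in a few lines of Python. The grid encoding and function name here are illustrative assumptions, not taken from the paper:

```python
def cam_to_bbox(grid):
    """Derive a bounding box from a CAM's activated cells (single-instance case).

    grid: an HxW (e.g. 28x28) list of lists of 0/1 activations.
    Returns (top, left, bottom, right) in grid coordinates, or None
    if no cell is active. The box spans the extreme activated cells
    regardless of whether the region is unitary or fragmented.
    """
    rows = [r for r, row in enumerate(grid) if any(row)]
    cols = [c for c in range(len(grid[0])) if any(row[c] for row in grid)]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))
```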
III. METHODOLOGY
A. YOLOv3
YOLOv3 (You Only Look Once, version 3) is a real-time object detection system that recognizes particular objects in videos, live feeds, or still images. To find objects, YOLO employs features learned by a deep convolutional neural network.
Darknet-53: Darknet-19 is the network architecture that YOLOv2 uses. It has 24 layers in total, including 19 convolutional layers (hence the name Darknet-19) and 5 max-pooling layers. Because several fine-grained details are lost when the input is downsampled, YOLOv2 is not particularly good at recognizing small targets; to retain low-level features, it uses identity mapping to connect feature maps from the preceding layer. YOLOv3 replaces this backbone with the deeper Darknet-53.
Detection at Three Scales: YOLOv3 makes predictions at three scales, obtained by downsampling the input image by factors of 32, 16, and 8, respectively. The 82nd layer is in charge of making the first prediction: the network downsamples the image through the first 81 layers until the 81st layer has a stride of 32, so starting from a 416×416 image the resulting feature map is 13×13. Here, a 1×1 detection kernel produces a detection feature map of size 13×13×255. The layer-79 feature map is then passed through several convolutional layers and upsampled by 2× to 26×26, after which it is concatenated with the feature map from layer 61.
Anchor Boxes: YOLOv3 uses 9 anchor boxes in total, three per scale. If you train YOLO on your own dataset, you must generate these 9 anchors using K-Means clustering.
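That anchor-generation step can be sketched as follows. The YOLO authors cluster ground-truth box sizes using a 1 − IoU distance (computed as if the boxes shared a corner); the plain-mean centroid update and all names below are illustrative assumptions, not the paper's code:

```python
import random

def kmeans_anchors(boxes, k=9, iters=50, seed=0):
    """Cluster (w, h) box sizes into k anchors.

    boxes: list of (width, height) tuples from the training labels.
    Distance is 1 - IoU between corner-aligned boxes, so clustering
    favours anchors with similar shape rather than similar area.
    """
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for w, h in boxes:
            # 1 - IoU for two boxes sharing a top-left corner
            dists = [1 - (min(w, cw) * min(h, ch)) /
                     (w * h + cw * ch - min(w, cw) * min(h, ch))
                     for cw, ch in centroids]
            clusters[dists.index(min(dists))].append((w, h))
        for i, cl in enumerate(clusters):
            if cl:  # keep the old centroid if the cluster emptied
                centroids[i] = (sum(w for w, _ in cl) / len(cl),
                                sum(h for _, h in cl) / len(cl))
    return sorted(centroids)
```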
Additional Bounding Boxes: For input images of the same size, YOLOv3 predicts more bounding boxes than YOLOv2. At YOLOv2's native resolution of 416×416, for instance, it predicts 13 × 13 × 5 = 845 boxes: five boxes in each grid cell, one per anchor. The predictions are given below [5]:
b_x = σ(t_x) + c_x
b_y = σ(t_y) + c_y
b_w = p_w · e^(t_w)
b_h = p_h · e^(t_h)
where (t_x, t_y, t_w, t_h) are the network outputs, (c_x, c_y) is the offset of the grid cell's top-left corner, and (p_w, p_h) are the anchor (prior) dimensions.
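The box-decoding equations above translate directly into code; this is a sketch following the symbols of the equations, not any particular implementation:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Apply the YOLO box equations: (tx, ty, tw, th) are raw network
    outputs, (cx, cy) the grid cell's top-left offset, and (pw, ph)
    the anchor (prior) dimensions."""
    bx = sigmoid(tx) + cx   # sigmoid keeps the centre inside its cell
    by = sigmoid(ty) + cy
    bw = pw * math.exp(tw)  # width/height rescale the anchor
    bh = ph * math.exp(th)
    return bx, by, bw, bh
```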
Softmax Abandoned : YOLOv3 classifies items found in photos using several labels. Previously in YOLO, the
author was accustomed to using softmax level scores and regarded the class of objects encompassed in the
bounding box as having the greatest score. This was altered in YOLOv3.
Loss Function

Loss = Σ_i Σ_j 1_ij^obj [(x_i − x̂_i)² + (y_i − ŷ_i)²]
     + Σ_i Σ_j 1_ij^obj (2 − w_i × h_i) [(w_i − ŵ_i)² + (h_i − ĥ_i)²]
     − Σ_i Σ_j 1_ij^obj [Ĉ_i log(C_i) + (1 − Ĉ_i) log(1 − C_i)]
     − Σ_i Σ_j 1_ij^obj Σ_c [p̂_i(c) log(p_i(c)) + (1 − p̂_i(c)) log(1 − p_i(c))]

where 1_ij^obj indicates that anchor j in cell i is responsible for an object, (x, y, w, h) are the predicted box parameters and hatted symbols their ground-truth values, C is the objectness confidence, and p(c) is the probability of class c; the factor (2 − w_i × h_i) up-weights small boxes.
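To make the four terms of the loss concrete, here is a sketch of the contribution of one cell/anchor whose indicator 1_ij^obj is 1; the dict-based interface is an illustrative assumption, not the paper's implementation:

```python
import math

def cell_loss(pred, target):
    """Loss contribution of one responsible cell/anchor.

    pred and target are dicts with keys x, y, w, h, conf, and p
    (a list of per-class probabilities), mirroring the four terms
    above: coordinate error, size error scaled by (2 - w*h) to
    up-weight small boxes, objectness BCE, and per-class BCE.
    """
    coord = (pred['x'] - target['x'])**2 + (pred['y'] - target['y'])**2
    size = (2 - target['w'] * target['h']) * (
        (pred['w'] - target['w'])**2 + (pred['h'] - target['h'])**2)
    # binary cross-entropy on the objectness confidence
    obj = -(target['conf'] * math.log(pred['conf']) +
            (1 - target['conf']) * math.log(1 - pred['conf']))
    # independent binary cross-entropy per class (YOLOv3 drops softmax)
    cls = -sum(t * math.log(p) + (1 - t) * math.log(1 - p)
               for p, t in zip(pred['p'], target['p']))
    return coord + size + obj + cls
```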
TABLE I. AVERAGE PRECISION IN THREE DIFFERENT DIFFICULTY LEVELS UNDER THE KITTI DATASET

Algorithm Name | AP Easy (%) | AP Moderate (%) | AP Hard (%) | Time
R-CNN          | 32.23       | 26.04           | 20.93       | -
Faster R-CNN   | 87.90       | 79.11           | 70.19       | 142
YOLOv3         | 95.04       | 92.39           | 87.51       | 34
Fig 3.a & 3.b. P-R Diagram in YOLOv3 and Faster R-CNN in three difficulties
The above graphs show that the YOLOv3 model worked with higher accuracy and speed than Faster R-CNN or R-CNN; the self-collected images showed the same results. The images were graded into three difficulty levels (easy, moderate, and hard), as shown in Figs. 3a and 3b, according to properties such as rainy, sunny, cloudy, dark, bright, or distorted scenes. At all difficulty levels, YOLOv3 gave very high accuracy.
Fig 5a & 5b. Vehicle Detection YOLOv3
IV. CONCLUSION
In this study, the YOLOv3 network model is applied to the problem of vehicle recognition in traffic surveillance videos. It was shown that small-scale vehicles frequently go undetected during the actual detection phase. To construct multi-scale features that can adapt to the identification of multi-scale target vehicles efficiently and effectively, we present a unique feature pyramid module built on top of YOLOv3 and based on encoding and decoding. Tested on the KITTI dataset, the method achieves good detection results for vehicle targets of various sizes, especially for the identification of very small targets. Its accuracy is significantly better than that of the original YOLOv3 algorithm, and it can better meet the requirements of practical applications.
REFERENCES
[1] Bingxin Hou et al., "A Fast Lightweight 3D Separable Convolutional Neural Network with Multi-Input Multi-Output for Moving Object Detection," 2021.
[2] Yiping Gong et al., "Context-Aware Convolutional Neural Network for Object Detection in VHR Remote Sensing Imagery," 2020.
[3] Feng Hong et al., "A Traffic Surveillance Multi-Scale Vehicle Detection Object Method Based on Encoder-Decoder," 2020.
[4] Hong-Mei Sun and Rui-Sheng Jia, "Finding Every Car: A Traffic Surveillance Multi-Scale Vehicle Object Detection Method," Springer Science+Business Media, LLC, part of Springer Nature, 2020.
[5] L. Shi, F. Zhang, J. Xia, J. Z. Xie, and R. Liu, "Identifying Damaged Buildings in Aerial Images Using the Object Detection Method," Remote Sens., vol. 13, 4231, 2021.
[6] Xiu-Zhi Chen, Chieh-Min Chang, Chao-Wei Yu, and Yen-Lin Chen, "A Real-Time Vehicle Detection System under Various Bad Weather Conditions Based on a Deep Learning Model without," published 9 October 2020.
[7] Xinxiang Zhang, Dinesh Rajan, and Brett Story, "Night Time Vehicle Detection and Tracking by Fusing Sensor Cues from Autonomous Vehicle," May 2020.
[8] Kun Wang and Maozhen Liu, "YOLOv3-MT: A YOLOv3 Using Multi-Tracking for Vehicle Visual Detection," accepted 30 April 2021, published online 4 June 2021.
[9] Xiaotao Shao, Caike Wei, and Yan Shen, "Feature Enhancement on CycleGAN for Night-Time Vehicle Detection," received 27 November 2020, accepted 15 December 2020.
[10] Mohammed Rabah and Ali Rohan, "Heterogeneous Parallelization for Object Detection and Tracking in UAVs," received 7 February 2020, accepted 19 February 2020, current version 11 March 2020.
[11] Giha Yoon, Gen-Yong Kim, and Hark Yoo, "Implementing Practical DNN-Based Object Detection Offloading Decision for Maximizing Detection Performance of Mobile Edge Devices," published 8 October 2021.
[12] Ye Tao, Zhou Zongyang, and Chai Xinghua, "Low-Altitude Small-Sized Object Detection Using Lightweight Feature-Enhanced Convolutional Neural Network," Journal of Systems Engineering and Electronics, vol. 32, no. 4, pp. 841-853, August 2021.
[13] P. Tu, T. Sebastian, G. Doretto, N. Krahnstoever, J. Rittscher, and T. Yu, "Unified Crowd Segmentation," Computer Vision.
Abstract—The Internet of Things is made up of objects with unique identities that are linked to one another online; the idea is simply to connect and monitor numerous sensors and pieces of equipment via the Internet, which is widely used in this new era. This paper primarily gives a general overview of IoT-based sensing and monitoring systems, which leverage databases and software to construct smart, automated household appliances. The system is built around a Node MCU board, and an Android smartphone is used to control it remotely over the internet. The Node MCU, the core part and brain of this system, can serve as an interface between a wide range of hardware parts and a real-time database. The system provides many cutting-edge switching features that turn connected household equipment such as lights and fans on and off by sensing and analyzing data. The cloud-based notification element is another part of this system architecture. The main feature of this system is that it can also be controlled from remote areas, which can contribute towards energy saving.
I. INTRODUCTION
In the IoT (Internet of Things), devices communicate with each other and can share content based on function control in a predefined manner. This project focuses on using the cloud to operate home appliances over the internet, even from remote areas.
Various components, such as a DC motor and an L293D, are connected to the Node MCU microcontroller. These components help acquire data from the surroundings, including the state of the appliances, and send the collected data from the microcontroller to the database. Users can access and update this data anytime, from anywhere, using an Android app developed in Flutter, which uses the Dart programming language. Through this application, the states of the appliances in the database can be updated; the microcontroller (Node MCU) fetches the data from the database and reflects the updated state in the Arduino program, which then operates the various appliances based on the updated conditions in the database. The state of the appliances can be updated using buttons or voice commands through the application.
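The control flow described above (the app writes appliance states to the database, the microcontroller fetches them and drives the appliances) can be sketched as a pure mapping; the pin numbers and appliance names here are illustrative assumptions, not taken from the paper:

```python
def appliance_actions(db_state):
    """Map the fetched appliance states to GPIO actions.

    db_state is the dict the microcontroller would read from the
    database, e.g. {"fan": True, "light": False}; the firmware loop
    would then set each listed pin to the returned level.
    """
    PIN_MAP = {'light': 5, 'fan': 4, 'lamp': 0}  # hypothetical GPIO numbers
    return [(PIN_MAP[name], 'HIGH' if state else 'LOW')
            for name, state in sorted(db_state.items())
            if name in PIN_MAP]
```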
IV. IMPLEMENTATION SETUP
Components Required
Hardware requirements:
1) Node MCU ESP8266
2) DC motor
3) L293D
4) Male-to-female jumper wires
5) Connecting wires
6) Breadboard
7) LED
8) Fan
9) Mobile phone
10) Power supply
Software requirements:
1) Android Flutter app
A. NODE MCU ESP8266
The Node MCU ESP8266 is an open-source Internet of Things (IoT) platform that comprises both software and hardware. It is powered by Espressif's ESP8266 Wi-Fi SoC (System-on-Chip), and the hardware is based on the ESP-12 module. The ESP8266 combines a 32-bit CPU, an antenna, switches, filters, a power amplifier, power management modules, and standard digital peripheral interfaces in a small and simple package [8]. It is a low-cost microcontroller built around a Tensilica Xtensa LX106 core and is used in a variety of IoT applications. In this project, the microcontroller stays connected to Wi-Fi continuously, which lets it fetch the states of the appliances from the database; the different appliances are connected to this controller.
Fig.4 DC Motor
C. L293D
The L293D is a standard motor driver IC that allows a DC motor to be driven in either direction. It is a 16-pin integrated circuit that can operate two DC motors in any direction at the same time, meaning a single L293D IC may drive two DC motors; it can power small motors as well as fairly large ones. It is based on the H-bridge concept: an H-bridge circuit enables voltage to flow in either direction, and because the voltage across the motor must change direction to rotate it clockwise or anticlockwise, H-bridge ICs are ideal for controlling a DC motor. Fig. 5 shows the L293D motor driver. We have used an L293D motor driver to control a DC motor to which we have attached a fan; this prototypes the switching of a fan's on/off states.
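The H-bridge behaviour described above comes down to the levels on one channel's two input pins. A minimal sketch follows; the IN1/IN2 naming follows common L293D usage, and enable-pin/PWM handling is omitted:

```python
def l293d_inputs(direction):
    """Return (IN1, IN2) levels for one L293D channel.

    Driving IN1 high and IN2 low spins the motor one way; swapping
    them reverses the voltage across the motor and thus the rotation
    direction; both low stops the motor.
    """
    levels = {'cw': (1, 0), 'ccw': (0, 1), 'stop': (0, 0)}
    return levels[direction]
```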
D. Breadboard
To build a prototype, mount our components in one place, and create a proper circuit, we have used a breadboard. Many electrical components can be coupled by placing their leads or terminals into the holes and then connecting them with wires as needed; metal strips underneath the board link the holes on its top.
Fig. 6- Breadboard
E. LED
The light-emitting diode (LED) is a common standard light source in electronics. It is used in applications ranging from mobile phones to large advertising billboards, and mostly in devices that display time or various other types of data. Aviation, illumination, fairy lights, car headlights, marketing, general lighting, traffic lights, camera flashes, lit wallpaper, horticulture grow lights, and medical supplies are just a few of the uses for LEDs.
Fig. 8 Representation of project using Android Mobile App
G. Firebase
Google Firebase is an application development platform provided by Google that enables developers to build iOS, Android, and web apps. Firebase offers capabilities for measuring statistics, reporting and troubleshooting app problems, and running marketing and product experiments.
By giving client-side code safe, direct access to the database, the Firebase Realtime Database enables you to develop complex, collaborative apps. Data is saved locally, and real-time events continue to fire even when the user is offline, providing a highly responsive experience.
Fig. 9– Firebase
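As an illustration, a client can read the appliance states with a single request against the Realtime Database's REST interface, where every node is addressable as <path>.json. The project URL, node layout, and omission of auth below are assumptions for this sketch, not details from the paper:

```python
import json
import urllib.request

FIREBASE_URL = "https://<your-project>.firebaseio.com"  # hypothetical project URL

def parse_states(raw):
    """Decode a node's JSON payload; Firebase returns null for a
    missing node, which we normalize to an empty dict."""
    data = json.loads(raw)
    return data if isinstance(data, dict) else {}

def fetch_states(path="appliances"):
    """One REST read of the appliance states (auth omitted; a real
    deployment needs a token or permissive security rules)."""
    with urllib.request.urlopen(f"{FIREBASE_URL}/{path}.json") as resp:
        return parse_states(resp.read().decode())
```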
Fig. 11 – On/Off Buttons in Android Flutter App for turning on/off lamp, fan, light
Fig. 12– Button for Voice Command for turning Home appliance on/off
VI. CONCLUSION
With the help of IoT technology, we gain a basic idea of how home appliances can be controlled. As long as the user is connected to the internet, this prototype will let them control their home appliances irrespective of location. The GUI we have created allows the user to easily control the appliances using an internet-connected smartphone: whenever a change occurs, the user is notified immediately and can control the appliances through the provided interface. The microcontroller we have used is the Node MCU, which acts as an interface between the components and the user and is connected to several appliances such as lights, fans, and lamps. To establish an application layer for connecting with the remotely located user, we have used a micro web server; communication between the user and the appliances happens over the internet. Notifications are delivered to users via the Android Flutter app installed on their smartphones. Users may remotely operate or automate household appliances by utilising components such as the Node MCU, DC motor, L293D, and Firebase. Together, these components build a remotely controllable smart home automation system through which we can switch the lights and fan on or off.
REFERENCES
[1] J. Lertlakkhanakul, J. W. Choi, and M. Y. Kim, "Building Data Model and Simulation Platform for Spatial Interaction Management in Smart Home," Automation in Construction, vol. 17, issue 8, November 2008, pp. 948-957.
[2] A. R. Al-Ali and M. AL-Rousan, "Java-based Home Automation System," IEEE Transactions on Consumer Electronics, vol. 50, no. 2, May 2004.
[3] R. J. C. Nunes and J. C. M. Delgado, "An Internet Application for Home Automation," 10th Mediterranean Electrotechnical Conference, MELECON 2000, vol. I, pp. 298-301.
[4] D. H. Stefanov and Z. Bien, "The Smart House for Older Persons and Persons with Physical Disabilities: Structure, Technology Arrangements, and Perspectives," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 12, no. 2, June 2004, pp. 228-250.
[5] C. Douligeris, "Intelligent Home Systems," IEEE Communications Magazine, vol. 31, issue 10, October 1993, pp. 52-61.
[6] Y.-J. Mon, C.-M. Lin, and I. J. Rudas, "Wireless Sensor Network (WSN) Control for Indoor Temperature Monitoring," Acta Polytechnica Hungarica, vol. 9, no. 6, 2012, pp. 17-28.
[7] E. N. Ylmaz, "Education Set Design for Smart Home Applications," Computer Applications in Engineering Education, vol. 19, issue 4, December 2006.
[8] N. Sriskanthan and Tan Karand, "Bluetooth Based Home Automation System," Journal of Microprocessors and Microsystems, vol. 26, pp. 281-289, 2002.
[9] E. Yavuz, B. Hasan, I. Serkan, and K. Duygu, "Safe and Secure PIC Based Remote Control Application for Intelligent Home," International Journal of Computer Science and Network Security, vol. 7, no. 5, May 2007.
[10] Amul Jadhav, S. Anand, Nilesh Dhangare, and K. S. Wagh, "Universal Mobile Application Development (UMAD) on Home Automation," Marathwada Mitra Mandal's Institute of Technology, University of Pune, India, Network and Complex Systems, ISSN 2224-610X (paper), ISSN 2225-0603 (online), vol. 2, no. 2, 2012.
[11] Jitendra Rajendra Rana and Sunil N. Pawar, "Zigbee Based Home Automation" (April 10, 2010). Available at SSRN: http://ssrn.com/abstract=1587245 or http://dx.doi.org/10.2139/ssrn.1587245
[12] R. Piyare and M. Tazi, "Bluetooth Based Home Automation System Using Cell Phone," 2011 IEEE 15th International Symposium on Consumer Electronics.
Abstract—This paper focuses on workforce management using the Django framework. This open-source software application is especially suited to small-scale organizations that cannot afford expensive software. Every organisation, public or commercial, uses an information system to keep information about its employees, yet it has been found that many small-scale enterprises in India still use paper and pen to preserve records. Even though there are many sophisticated technology systems that can perform this function, they are all too expensive for these small industries to afford. This paper addresses developing a method to handle their challenges at a lower cost. Our workforce administration system has four views, namely HR, Employee, Team Lead, and Fresher, based on the different categories of users of this software application. The several functionalities in these four views make this framework not only employee-friendly but also help build a bond between the company and the staff by ensuring smooth interaction between the two.
I. INTRODUCTION
Employees are an organization's most important resource for growth and seamless operation. The documentation previously needed to manage personnel was onerous and demanded a lot of time and extra labor, and it could also lead to conflicts. For instance, if the information of any employee was required, a manual search would be necessary and take a lot of time. In this type of system the information was not secure, and the registers' contents could easily be altered. As a result, a system was required to automate everything, including the monitoring of attendance, the tracking of existing projects inside the companies, and the methodical training of new hires. There are several systems on the market that can assist in carrying out these tasks, but they are highly expensive and occasionally need to be handled by skilled experts. We are therefore open-sourcing this programme so that it may readily operate on any machine after a few installations; additionally, using it does not require much technical knowledge.
The system would include different portals for human resources, employees, trainees, and team leads. The human resource portal, for example, would offer the ability to view employees, change their information, add new hires, examine staff attendance, track the status of the organization's ongoing projects, and more. Similarly, the team lead, trainee, and staff would each
TABLE I. LITERATURE SURVEY RELATED TO THE PROBLEM DEFINED
Firstly, we have tried to make this system user-friendly by keeping its UI simple enough that it can be used even by users who are not highly skilled with technology.
Secondly, we have made our system more diverse by increasing the number of views to four, providing more functional capabilities to the different types of users: HR, Employee, Team Lead, and Fresher.
Lastly, since we have used Django to build our system, it is more reliable and secure.
III. METHODOLOGY
To create this project, we have used Django, a high-level Python web framework. There are various portals with varying rights and functionalities for HR, team leads, and employees; for instance, HR can add or delete employees, while employees are not given these functionalities. Django requires a minimum of:
- 4 GB RAM
- an Intel Core i3
- Windows 7 or later
The architecture of the system mainly consists of the following parts: -
i. Frontend-This is the interface with which the users will interact. This is being made simple, attractive
and user-friendly so that the users could easily interact with the various services provided.
ii. Backend-The backend of any application is basically what goes on behind the scenes. It consists of
APIs, servers, operating systems, databases and more all of which come together to ensure that correct
information is served to the user as quickly as possible. It is the backbone of the website and is
responsible for fetching the information which is to be displayed on the front end. It responds to the
requests made by the user and serves them with the required information.
iii. Database-It is an organized collection of structured data. It is responsible for storing the information
entered by the user and for storing the data which is displayed on the front-end of the application after
being fetched from here.
The various technologies used for implementing the application are discussed below: -
1). Django - It is a high-level Python framework used to create websites. It has many ready-to-use features, such as a user login and authentication system and database connectivity, and it supports various databases. The database we have used in this project is Django's default, db.sqlite3. Django also supports the reusability of various components and features, such as template inheritance. Django follows the MVT architecture: Model, View, and Template.
i. Model - The data that we want to display on the frontend of the website, or that we want to store in the database, is defined with the help of models.
ii. View - It is responsible for handling requests from the user, rendering the associated content on receiving a request.
iii. Template - It is an HTML file that contains the layout of the webpage to be rendered.
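The MVT split above can be illustrated without any framework code; the classes and names below only mirror Django's roles (a models.Model subclass, a view function, an HTML template) and are not part of our system:

```python
class Employee:
    """Model: the data we store and display (in Django, a
    models.Model subclass mapped to a database table)."""
    def __init__(self, name, department):
        self.name = name
        self.department = department

def render(template, context):
    """Template: an HTML layout with placeholders (Django loads
    *.html files; str.format stands in for the template engine)."""
    return template.format(**context)

def employee_view(request, employee):
    """View: handles a request and returns the rendered content."""
    template = "<h1>{name}</h1><p>Dept: {department}</p>"
    return render(template, {'name': employee.name,
                             'department': employee.department})
```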
Some of the features of Django that make it so popular are: -
1. High Security
2. Rapid Development
3. High Scalability
4. SEO optimized
5. Thoroughly tested
Django has one project, within which different apps are built that provide functionalities to the project. Each app has to be included in the settings.py file of the project.
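For instance, a settings.py fragment registering the apps might look like this; the app name "workforce" is a hypothetical example, not necessarily the project's actual app name:

```python
# settings.py (fragment): each app must be listed here for Django to
# discover its models, templates, and static files.
INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'workforce',  # our app (hypothetical name)
]
```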
2). HTML, CSS and JavaScript - HTML is used to provide structure to the website, CSS is used for styling it, and JavaScript is used to provide interactivity. In this application, we have used two CSS frameworks:
i. Bootstrap - It is one of the most extensively used HTML, CSS, and JavaScript frameworks. It is an open-source framework and is free to use. It follows a mobile-first approach, helps make the website fully responsive, and has various built-in classes that can easily manipulate the styling of the web page.
ii. Materialize CSS - It is a UI component library based on Material Design, the design language developed by Google to allow a unified user experience across all its products and platforms. We have used it along with Bootstrap to make the dashboards more interactive and to follow dashboard conventions.
This system offers four views with different functionalities as discussed below: -
A. HR View
i. View the Employees List, and their details, and execute CRUD actions on them.
ii. View the Organization's different departments.
iii. Take a look at the projects the company is working on and keep track of their progress.
iv. Verify any employee's attendance record.
v. Distribute discounts and bonuses to various staff.
vi. Send out crucial notifications.
vii. Examine employee feedback.
viii. Approve leaves
ix. View issues and questions and address them.
x. View fresh applications on the job portal of the organization and decide whether to accept or reject
them.
B. Employee View
i. Review their information and make any necessary changes.
ii. Mark their attendance.
iii. Request leave.
iv. Verify the day's assignment.
v. View discounts and rewards and redeem them.
Figure 3. Use Case Diagram for Employee View
Figure 5. Use Case Diagram for freshers view
2. When the HR of the company logs in, he is directed to the following page, where he can take several actions as discussed under the functionalities of the HR view.
Figure 7. HR Dashboard
3. Below is a screenshot of the portal's detailed employee view. It helps the system admin or HR to see all the details of an employee, such as employee code, department, position, and personal details. The HR also has the option to edit these details or remove any employee.
4. Every functionality in the dashboard has a different purpose. For instance, the ongoing-projects functionality in the HR Dashboard helps the organization see all the ongoing projects, their progress, the project lead, and the details of the team members.
Test Cases: Some of the test cases that we have tested the system for are:
Test Case 1: We tested whether it is possible to log in using wrong credentials.
Figure 10. Trying to log into the system using wrong credentials
The test case passed: it is not possible to log in using wrong credentials. Thus, the system is secure, and only a person with the right credentials can log in and perform operations.
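The kind of check exercised by this test case can be sketched in Python. The `register`/`authenticate` helpers, the salted-hash scheme, and the `hr_admin` user below are all hypothetical illustrations, not the portal's actual login implementation:

```python
import hashlib
import hmac
import os

# Hypothetical credential store: username -> (salt, salted SHA-256 hash).
USERS = {}

def _hash(password: str, salt: bytes) -> bytes:
    return hashlib.sha256(salt + password.encode()).digest()

def register(username: str, password: str) -> None:
    salt = os.urandom(16)
    USERS[username] = (salt, _hash(password, salt))

def authenticate(username: str, password: str) -> bool:
    """Return True only for a known user with the matching password."""
    record = USERS.get(username)
    if record is None:
        return False
    salt, stored = record
    # hmac.compare_digest avoids timing side channels in the comparison.
    return hmac.compare_digest(stored, _hash(password, salt))

register("hr_admin", "s3cret!")
print(authenticate("hr_admin", "wrong-password"))  # False: login rejected
print(authenticate("hr_admin", "s3cret!"))         # True
```

Storing only salted hashes, rather than plain passwords, means that a leaked user table does not directly reveal credentials.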
Test Case 2: Next, we tested whether the details of any employee can be deleted from the system.
Figure 11. Testing if we are able to delete an employee Figure 12. Thus, we have deleted an employee successfully
Test Case 3: Next, we tested whether a new department can be added.
Figure 13. Testing if we are able to add a new department in the HR Dashboard Figure 14. Thus, we are able to add a new department successfully
V. CONCLUSION
Thus, we have created a system that will help organizations manage their workforce efficiently, reducing the problems organizations faced earlier. This system will also ensure transparency in the organization and bridge the communication gap between employees and employers, as employees will be able to write their queries directly through the portal. In comparison with other systems, this system makes use of analytics, which makes it easier to analyze data and take important decisions based on it.
Grenze International Journal of Engineering and Technology, June Issue
Abstract— The goal of educational institutions should be to find effective ways to provide new and efficient learning opportunities based on their environment, student characteristics, teacher preparation, economic situation, and advancing technology, in an effort to make learning more efficient, equitable, and innovative in higher education. This paper identifies the need for, and the possibility of, developing new online courses in order to engage and motivate students in accordance with their demands (e-learning, blended learning, mobile learning).
This paper presents insights gained since the implementation of a mobile learning solution, reviews several definitions of e-learning, explores the motivating factors behind its development, and provides an overview of the circumstances in which e-learning is a good option, as well as the main types of e-learning components. It is concluded that e-learning is a useful tool for the growth of the Indian educational sector. The key objectives of this research work are to consider the idea of e-learning and to examine its different types.
I. INTRODUCTION
E-learning is a crucial activity for every nation's advancement. Everyone involved in this new era is contemplating growth, and we will achieve the desired outcomes if it is adequately planned. This research report examines the state of e-learning in India. Because of the great ease of use, accessibility, navigation, interaction, and user-friendly interface design of e-learning in contrast to traditional learning, student satisfaction rates rise with continuous use. It has been observed that few of the professional courses offered by higher-education teachers use e-learning more than the national average, and even fewer non-professional courses are taught using e-learning. In recent years, there has been an increase in research into how effectively e-learning performs. This is primarily due to greater opportunities for integrating IT and learning, but also because political and social focus on "what works" in learning is expanding.
E-learning is a suitable option when:
- The goal of learning is to acquire uniform background information about a given subject.
- Learners like studying at their own speed and are highly motivated to pick up new skills.
- The course emphasizes long-term training requirements rather than urgent ones.
- It is necessary to get the information when needed.
V. QUALITY OF E-LEARNING
The quality of an e-learning course/training is increased by:
learner-centered content: The curriculum for online learning should be precise, pertinent, and tailored to each learner's needs, responsibilities, and roles in both their professional and personal lives; information, knowledge, and skills should be offered to this end.
granularity: To aid the assimilation of new information and to give the student flexibility in their learning schedule, e-learning content has to be broken down into manageable chunks.
engaging content: In order to create an engaging and inspiring learning environment, innovative use of instructional approaches, methods, and tactics is required.
interactivity: To keep students' interest, encourage learning, and create a positive learning environment for them, frequent learner-teacher interaction is essential.
personalization: Self-paced courses must be adapted to the needs and interests of the student. Tutors and facilitators in instructor-led courses need to be able to monitor each learner's development and performance.
Only 10% (6/61) of the studies are labeled as "not successful," which calls into question the validity of the classifications, given the difficulties and issues that e-learning must overcome. A closer examination of the abstracts reveals that many of the empirical studies on efficacy were carried out by researchers who appear to have an interest in e-learning's success. The literature therefore does not currently support examining whether e-learning solutions are particularly successful, due to this "effectiveness bias."
As seen in Figure 2, this model depicts the crucial elements (in gray) that the review found to be important in determining how effective e-learning is according to various definitions.
Figure 2. Flow chart
There must be a positive and helpful learning atmosphere. The motivation of the person(s) using the e-learning (artifact) affects how long they use it, and prior online or professional experience tends to have a generally beneficial effect on efficacy. Program designs that include peer and instructor interaction, as well as opportunities for practice, improve the efficacy of online learning. This model illustrates how the important variables that affect efficacy interact; however, many studies in the literature review do not take the wide range of definitions into account.
characterized by the old classroom paradigm to learning that is immediate, online, self-driven, and available whenever a student desires. There have been several turning points in the path of education in India. In plain English, e-learning is defined as education that is provided online, over the internet, and includes a variety of formats and styles, including remote learning, computerized electronic learning, online learning, and internet learning.
C. E-Learning and Government
The government can employ e-learning in a variety of ways, including the following:
Effective policy and rule communication may aid the government.
It can raise awareness among individuals about various programmes and goals.
According to their demands, it will offer citizens/people a public forum for communication or
education.
Both unstructured and semi-structured information may be managed by it.
It can carry out government policy.
The government may benefit from an effective e-learning system in many ways. For example, it may offer a learning portal centered on public-private partnership (PPP) policies, rules, and regulations. Providing meaningful and worthwhile education to the populace through e-learning can also help a government become more open in its governance.
D. E-Learning and Higher Education
Studies conducted on a worldwide scale indicate that, after the United States, India has the second-highest number of students enrolled in online courses, with more than 155,000 coming from the nation. Of the over 1.2 million online students globally, 32% are from the United States, while 15% are from India (though both nations' populations are large). There is an increasing need in higher education to develop an e-learning programme in which all components of a course are controlled through a standardized user interface throughout the whole school. Many such programmes have been launched in our nation; students must attend orientation sessions at institutions, but the course material is distributed online. The majority of colleges do provide online learning assistance and services such as online advising and registration, e-counseling, and student newspapers. E-learning has the capability to overcome rural India's lack of access to suitably qualified professors and teachers. Live online coaching, streaming films, and virtual classrooms are a few of the answers that e-learning may provide to these issues. Even though there is no replacement for efficient and well-organized classroom instruction, e-learning is the best available choice in its absence.
IX. CONCLUSION
According to the results of our study, we draw the essential conclusion that the fast expansion of internet connectivity is a key driver of the development of e-learning. Online learning will become more efficient and educational quality will rise thanks to a solid internet infrastructure with a wide range of regional and international actors. E-learning improves both the standard of education and the state of the economy in emerging nations like India.
Our research article addressed the following research queries: How are e-learning's efficacy metrics determined? How is the effectiveness of e-learning measured? Why are e-learning programmes so effective? The benefits of taking into account, and making clear, how these notions are utilized in study and practice were emphasized in this article. The bulk of studies, according to this survey, employed quantitative and comparative methods. This paper argues that using only quantitative measurements against preset learning objectives prevents practitioners and academics from uncovering unexpected and unintentional potential sources of error. Open-ended qualitative survey questions can significantly increase the validity of such methods [8]. The environment in which the e-learning solution was employed and deployed, as well as the users of the artifact, were taken into account while categorizing these elements.
REFERENCES
[1] Sharma, R. C., & Mishra, S. (2013). International Handbook on e-Learning, Vol. 2.
[2] Harden, R. M., & Hart, I. R. (2002). An international virtual medical school (IVIMEDS): The future for medical education. Medical Teacher, 24, 261-267.
[3] Laurillard, D. (2006). E-learning in higher education. Changing Higher Education: The Development of Learning and Teaching, 71-84.
[4] Government of India. (2011). Census Report, 223.
[5] Ministry of Human Resource Development, Government of India. (2014). Annual Report, 2013-2014.
[6] Chandra, S. (2014). E-learning prospects and challenges. International Journal of Research in Finance & Marketing, 4(10).
[7] Shinde, S. P., & Deshmukh, V. P. Web-based education in schools: A paradigm shift in India.
Abstract—India is among the nations with the greatest populations. Due to overcrowding, health misinformation has continued to be one of India's biggest problems. Deaths occur every minute as a result of unforeseen and unplanned situations, and saving a life in time is fortunate and good. Using a stretcher's embedded microcontrollers and sensors, a smart, intelligent healthcare system will be created. In the event of an accident, it will determine the condition of the body and send that information to the hospital, in addition to sending a server-based alert to the closest police station to avoid any potential legal issues. If this process is followed, critical care units in hospitals can have their physical requirements improved before patients arrive, potentially saving many lives.
I. INTRODUCTION
One of the biggest problems facing humanity is its health. The previous 10 years have seen a lot of attention
focused on the healthcare industry. The main objective was to create a reliable method for patient monitoring
that would let medical professionals keep a watch on patients who might be being treated in a hospital or going
about their everyday lives as usual. Due to recently enhanced technology, patient monitoring systems have
become one of the most significant developments. Currently, a more contemporary strategy is required.
In the conventional method, the main role is played by healthcare experts. They must go to the hospital ward to
provide the required diagnostics and guidance. There are two fundamental issues with this strategy. First, the
patient must always have a healthcare provider nearby, and second, the patient must spend some time being
admitted to the hospital with biomedical equipment at their bedside.
To tackle these two difficulties, patients receive education and information on diagnosing and preventing
illnesses. The second requirement is for a patient monitoring system (PMS) that is dependable and easily
accessible. We can employ technology more wisely to improve the aforementioned situation.
III. COMPONENTS
The prototype was created using the ESP32, together with a fingerprint sensor, a heartbeat sensor, and a breathing sensor. Components were chosen from those sold at nearby retailers to obtain the best quality-price ratio. The fingerprint sensor is used to retrieve the patient's information to communicate to the doctor.
Figure 1: Block Diagram
IV. DESIGN
The ESP32 family consists of low-cost, low-power system-on-a-chip microcontrollers with Bluetooth and Wi-Fi built in. The family integrates power amplifiers, low-noise receive amplifiers, RF baluns, antenna switches, and power-management modules, in addition to the Tensilica Xtensa LX6 dual- and single-core CPUs, the dual-core Tensilica Xtensa LX7 microprocessor, and the single-core Tensilica Xtensa LX7 processor. The ESP32 was designed by Espressif Systems, a Chinese company located in Shanghai, and is produced by TSMC using its 40 nm process.
The optical biometric fingerprint reader offers excellent functionality and may be used in a range of finished goods, including automobile door locks, safe deposit boxes, attendance systems, and access control. The R305 fingerprint module connects directly to a microcontroller UART, or to a PC through a MAX232 or USB-serial converter; a TTL UART interface is also available.
The respiration sensor measures diaphragmatic breathing in real-world or virtual biofeedback applications such as stress reduction and relaxation training. It measures breathing frequency in addition to displaying the relative depth of breathing.
A digital output pulse is created when a finger is placed on the heartbeat sensor, which works by modulating light through the finger. When a heartbeat is detected, the beat LED blinks in time with each pulse. This digital output may simply be connected to a microcontroller to calculate the beats-per-minute (BPM) rate.
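The BPM calculation described above can be sketched as follows. This is an illustrative helper, not the prototype's firmware; real firmware on the ESP32 would timestamp the sensor's digital rising edges, whereas here the timestamps are supplied directly:

```python
def bpm_from_beats(beat_times_s):
    """Estimate beats per minute from timestamps (in seconds) of detected beats.

    Averages the inter-beat intervals and converts the mean interval
    to a rate per minute.
    """
    if len(beat_times_s) < 2:
        raise ValueError("need at least two beats to estimate a rate")
    intervals = [b - a for a, b in zip(beat_times_s, beat_times_s[1:])]
    avg_interval = sum(intervals) / len(intervals)
    return 60.0 / avg_interval

# Beats detected every 0.8 s correspond to about 75 BPM
print(bpm_from_beats([0.0, 0.8, 1.6, 2.4, 3.2]))  # approximately 75.0
```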
Internet using the ESP32 controller. The data in the cloud is stored in encrypted format for basic security, and to view the details, an OTP is generated and sent to the registered email id. The data is available with timestamps. Once the website has verified the OTP, the message is sent to the registered mobile number.
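The OTP step described above can be sketched with the Python standard library. The helper names, the six-digit format, and the five-minute validity window are assumptions for illustration, not the authors' actual design:

```python
import hmac
import secrets
import time

OTP_TTL_SECONDS = 300  # assumed 5-minute validity window

def issue_otp():
    """Generate a cryptographically random 6-digit OTP and its expiry time."""
    code = f"{secrets.randbelow(10**6):06d}"
    return code, time.time() + OTP_TTL_SECONDS

def verify_otp(submitted, issued_code, expires_at, now=None):
    """Accept the OTP only if it matches and has not expired."""
    now = time.time() if now is None else now
    if now > expires_at:
        return False  # expired
    # Constant-time comparison to avoid leaking digits via timing.
    return hmac.compare_digest(submitted, issued_code)

code, expiry = issue_otp()
print(verify_otp(code, code, expiry))                  # True: correct and in time
wrong = "000000" if code != "000000" else "111111"
print(verify_otp(wrong, code, expiry))                 # False: wrong code
print(verify_otp(code, code, expiry, now=expiry + 1))  # False: expired
```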
Figure 3: Live patient Monitoring
of common coupling to the output voltage or the input DC voltage, a regulator circuit reduces ripples while maintaining the same DC value. This voltage regulation is often accomplished using one of the widely used voltage-regulator ICs.
VII. CONCLUSION
Due to the significance of healthcare services to our society, automating them relieves human workers of stress and makes measurement easier. Patients are more likely to trust this system because of its transparency. The doctor may assess the patient's current condition while continuously monitoring it, which will help decide which treatment is most suitable.
Abstract—There has been a lot of interest in time series forecasting in recent years. Deep neural
networks have shown their effectiveness and accuracy in various industries. It is currently one
of the most extensively used machine-learning algorithms for dealing with massive volumes of
data due to the reasons stated above. Statistical modeling includes forecasting, which is used for
decision-making in various fields. Time-varying variables may be forecasted based on their past
values, which is the goal of forecasting. Developing models and techniques for trustworthy
forecasting is an important part of the forecasting process. As part of this study, a systematic
mapping investigation and a literature review are used. Time-series researchers have relied on
ARIMA approaches for decades, notably the autoregressive integrated moving average model.
But the need for stationarity makes this method somewhat rigid. Forecasting methods have improved and expanded with the introduction of computers, ranging from stochastic models to soft computing. Conventional approaches may not be as accurate as soft computing. In addition, the volume of data that can be analyzed and the efficiency of the process are two of the many benefits of using soft computing.
I. INTRODUCTION
In time series data, a process is observed at predefined intervals and a predetermined sample rate.
Developing rules from data and generating predictions about future values based on current observations are at
the heart of time series analysis. There has been an increase in the use of time-series observation data across
various industries and fields. Furthermore, the amount of time series data being produced is rising. One of the
most prominent academic fields is forecasting time series data. Meteorological and weather forecasting,
industrial production forecasting, and stock trend forecasting have benefited from its implementation. It might
assist decision-makers in avoiding danger and making better choices. Traditional time series forecasting methods
based on chance and statistics have succeeded in various fields, including meteorology, economics, and more.
Time series forecasting algorithms face substantial issues with the influx of large, non-linear time series data that
follow various distribution patterns due to the introduction of data science in health care. Outstanding results
have been achieved by using deep and machine learning to very sophisticated algorithms for forecasting time-
series data. This article aims to categorize the many approaches for predicting time series that are currently
available.
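The classical ARIMA-style workflow mentioned above (difference the series to induce stationarity, then fit an autoregression) can be illustrated with a stripped-down, pure-Python sketch: first differencing plus an AR(1) fit by least squares. A real ARIMA(p, d, q) model would also estimate moving-average terms and an intercept; both are omitted here for brevity:

```python
def difference(series):
    """First-order differencing: the 'I' step used to induce stationarity."""
    return [b - a for a, b in zip(series, series[1:])]

def fit_ar1(series):
    """Least-squares AR(1): x[t] ~ phi * x[t-1] (no intercept, for brevity)."""
    x, y = series[:-1], series[1:]
    phi = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    return phi

def forecast_next(series):
    """One-step forecast of the original series via the differenced AR(1)."""
    d = difference(series)
    phi = fit_ar1(d)
    # Forecast the next increment, then undo the differencing.
    return series[-1] + phi * d[-1]

# Toy series with geometrically decaying increments: 1, +1, +0.5, +0.25, ...
data = [1.0, 2.0, 2.5, 2.75, 2.875]
print(forecast_next(data))  # 2.9375
```

Here the fitted phi is exactly 0.5, so the next increment is forecast as half the last one, recovering the series' decaying pattern.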
Yanke Hu et al. [63] Although RNNs' slow processing speed is typically neglected, new research has shown that
RNN-based strategies may be useful in numerous time series applications. By converting vital signs into a (0, 1)
vector and treating the problem as a computer vision problem, this article proposes a novel method for the
multivariate time series classification challenge in healthcare.
Zina M et al. [65] A Voronoi diagram-based approach for detecting outliers in time series data has been
described. The technique has several major benefits. Outliers are dealt with by considering the multivariate
nature of the data in the first place. Because the authors may choose whether or not to use a parametric model, it
is versatile in extracting relevant characteristics for separating outliers from non-outliers (such as a regression
model, as in this paper). Finally, Voronoi diagrams reveal the underlying geometric connection of the data
points. According to experimental data, the MVOD technique can accurately, sensitively, and robustly identify outliers in a multivariate time series.
D. Multi-Dimensional Time Series Analysis
Dai, Xiangfeng et al. [12]. Researchers in this research provide three different ways of altering hypothesizing the
HASF approach for finding patterns in fragmented time series. This technique has proven to be resilient because
of the HASF's capacity (a) to retain the underlying trend, (b) to cope with the nonstationary and
heteroscedasticity of data, and (c) to represent the relevance of data samples that remain after deleting nearby
data.
Dugast, Mael, et al. [14]: An early decision-making aid for emergency department (ED) administrators is crucial for financial and public-health reasons. These authors note that ED admissions increase when respiratory-like infections spread, and that the clinical signs of these disorders may sometimes be recognized; in particular, doctors link RSV to bronchiolitis symptoms in children. It is thus possible to address an RSV epidemic by analyzing the temporal series of bronchiolitis admissions to pediatric emergency rooms. There is a need for a new method for recognizing, early on, the start of epidemic-related aberrant arrivals in EDs and for calculating the maximum number of arrivals, which indicates how soon the epidemic will fade away. Detrended Fluctuation Analysis (DFA) was utilized to obtain the variability of the admissions time series, and the authors also applied the persistent homology technique. To get the best DFA parameter value, they solve a multi-objective optimization problem.
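Detrended Fluctuation Analysis, as used by Dugast et al., can be sketched as follows. This is a simplified, pure-Python illustration with fixed window sizes and first-order (linear) detrending, not the authors' implementation:

```python
import math
import random

def dfa_alpha(series, window_sizes=(4, 8, 16)):
    """Simplified Detrended Fluctuation Analysis (DFA).

    Steps: integrate the mean-removed series into a profile, split the
    profile into non-overlapping windows, remove a least-squares linear
    trend per window, compute the RMS fluctuation F(n) for each window
    size n, then estimate the scaling exponent alpha as the slope of
    log F(n) versus log n.
    """
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for x in series:
        total += x - mean
        profile.append(total)

    logs_n, logs_f = [], []
    for n in window_sizes:
        sq, count = 0.0, 0
        for start in range(0, len(profile) - n + 1, n):
            seg = profile[start:start + n]
            t = list(range(n))
            tbar = sum(t) / n
            ybar = sum(seg) / n
            slope = sum((ti - tbar) * (yi - ybar) for ti, yi in zip(t, seg)) \
                    / sum((ti - tbar) ** 2 for ti in t)
            for ti, yi in zip(t, seg):
                resid = yi - (ybar + slope * (ti - tbar))
                sq += resid * resid
                count += 1
        logs_n.append(math.log(n))
        logs_f.append(0.5 * math.log(sq / count))  # log of RMS fluctuation

    # Slope of log F(n) vs log n via least squares.
    lbar = sum(logs_n) / len(logs_n)
    fbar = sum(logs_f) / len(logs_f)
    return sum((l - lbar) * (f - fbar) for l, f in zip(logs_n, logs_f)) \
           / sum((l - lbar) ** 2 for l in logs_n)

# Uncorrelated white noise should give alpha near 0.5.
random.seed(0)
noise = [random.gauss(0, 1) for _ in range(1024)]
print(round(dfa_alpha(noise), 2))
```

In an epidemic-monitoring setting, an alpha well above 0.5 in the admissions series would indicate persistent, long-range-correlated behaviour rather than random fluctuation.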
bootstrap based methods
Kamal et al. (2012): a rule-based system and opinion mining system to identify candidate feature-opinion pairs from review documents and product features; covers opinions that are related either directly or indirectly.
Lin et al. (2012): JST, Reverse-JST and LDA, detecting sentiment and topic simultaneously from text; limited by the weakly supervised nature of JST and the lack of labeled documents.
Moraes et al. (2013): document-level sentiment analysis with SVM and ANN, achieving better levels of classification accuracy; limited in unbalanced data contexts.
Bagheri et al. (2013): a novel unsupervised and domain-independent model, with a bootstrapping iterative algorithm and pruning methods, to learn multiword aspects; limited in accuracy.
Kalaivani & Shunmuganathan (2013): compared various sentiment classification approaches, with good accuracy for sentiment classification.
Hai et al. (2014): IEDR, extracting opinion features from online reviews using domain-specific and domain-independent corpora.
Stavrianou & Brun (2015): product reviews with NLP and fine-grained data such as opinions, improving recommendation systems; reviews only a particular product.
Agarwal et al. (2015): a concept-extraction protocol using semantic relations between words in natural language and the ConceptNet ontology; faces the basic problems of sentiment analysis (SA).
Agarwal & Mittal (2016): machine learning protocols for SA with bag-of-words representations; limited by the high dimensionality of the feature space.
Ahmad et al. (2016): a sentence-level, lexicon-based, domain-independent sentiment classification technique combined with text-level, corpus-based machine learning techniques; faces the problem of domain portability.
Garg, Bindu, et al. [19]: New techniques described in this research produce the highest accuracy with the lowest mean square error among related forecasting work. The pioneering dynamic computational algorithm may be used to accurately and reliably estimate and anticipate the frequency of outpatient visits in any tertiary-care
hospital. Health care planning, allocation, and management might benefit from using the model presented in this
article. A decision support system for healthcare institutions may be developed using the design of the suggested
technique. Such a decision assistance system may significantly impact healthcare service efficiency. The
suggested model may be improved using a genetic algorithm in the future to cope with multi-dimensional time-
series data.
Gunnarsdottir, Kristin, et al. [20]: An efficient technique to categorize sepsis in intensive care units (ICUs) may be to use a generalized linear model (GLM). The concept is that the probability model is updated each time a new measurement is taken and is utilized by ICU physicians to better understand their patient's clinical status. Instead of only considering demographic factors, these authors found that including physiological time-series signals improved classification accuracy and specificity. These authors were constrained by the number of patients in the MIMIC II database that could be included in this investigation. Even though these findings are early, they show that a GLM can be used to monitor sepsis in real time.
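A generalized linear model of the kind Gunnarsdottir et al. describe can be illustrated with a toy logistic regression (a GLM with a logit link) trained by gradient descent. The single feature and the data below are synthetic stand-ins, not drawn from MIMIC II:

```python
import math

def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by stochastic gradient descent.

    Logistic regression is the GLM with a Bernoulli outcome and a
    logit link: P(y=1|x) = sigmoid(b + w.x).
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - yi  # gradient of log-loss w.r.t. z
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, xi):
    z = b + sum(wj * xj for wj, xj in zip(w, xi))
    return 1.0 / (1.0 + math.exp(-z))

# Synthetic 1-feature data: a higher physiological risk score maps to label 1.
X = [[0.1], [0.2], [0.3], [0.7], [0.8], [0.9]]
y = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(X, y)
print(predict(w, b, [0.15]) < 0.5, predict(w, b, [0.85]) > 0.5)  # True True
```

The online flavour of the update (one measurement at a time) mirrors the paper's idea of revising the probability estimate as each new measurement arrives.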
Liu, Bo et al. [34] MDLats is a method and system for discovering motifs in large-scale time series presented in
this work. It utilizes the RP algorithm and the ED to swiftly and precisely locate the motifs, combining the
benefits of both approximate and exact techniques. Hadoop is used to construct a production-level system.
MDLats' ECG classification findings and real-world use in healthcare prove its usefulness. In the future, these authors want to apply MDLats in various other areas, including air pollution, social networks, logistics optimization, and other areas where such a system can be used at large scale.
E. A Survey on Time Series Analysis on General Health Care
Almeida, Rui Jorge, et al. [6] A straightforward method for obtaining medical data summaries in descriptive
linguistic form is presented in this paper. Proposals incorporate categorical data and clearly show disparities
between patients with distinct class labels from linguistic summary protoforms. These authors propose
summarizing data in a new differential form based on a numerical criterion to compare linguistic summaries.
Multiple occurrences were detected in the same individuals over long periods in the reviewed data set. Linguistic summaries are proposed that offer chronological context for the quantification of characteristics over time.
Baldassano, Steven, et al. [8] ICU caregivers are alerted to crucial occurrences in real-time using a configurable
platform established in this article. This platform developed and clinically applied open-source techniques for
identifying defective EEG electrodes, tracking burst suppression ratios, and detecting problems in
neuromonitoring data. When it comes to improving ICU workflow, easing the strain on nurses, and enhancing
research data quality, these authors showed how this platform could do all of those things and calculate clinically
significant trending indicators. Medical data analytics may greatly impact patient care, and this study provides a
framework for understanding a broad range of ICU data streams.
Biem, A et al. [9] These authors introduced STAM as a domain-agnostic, multi-component, generic time-series
analysis and management system and exhibited its capabilities via experiments on a real-time, large-scale
anomaly detection application and generated tests. The STAM system is created with a specific emphasis on
well-defined qualities. STAM is a general plug-and-play system. It gives the capacity to handle multi-
dimensional time-series data of nearly any size. The user inputs the data source, and the system executes the
processing. STAM stresses simplicity of use: the system needs minimum adjustment from the user and minimal
settings to start, save for adding data sources. It also offers sophisticated user control (e.g., sensitivity
modification and parameter selection).
Cao, Xi Hang, et al. [10] Methods for learning continuous-time LDS from MTS with different types of
imperfection, such as restricted time points and uneven sample intervals, are presented in this study. These
authors used a support vector machine model for classification tasks and a sophisticated LDS kernel formulation.
These authors demonstrated that their suggested technique is successful and superior to other approaches, based on the outcomes of three diagnostic tasks with varying degrees of imperfection.
Hajihashemi, Zahra and Popescu, Mihail [22]: Elderly people at the highest risk of deterioration and adverse events may be identified using the methods in this research. Automated in-home monitoring systems will use this computational approach to track the health trends of older persons and notify healthcare practitioners so that they may take action before things worsen. The features of TSW are discussed in depth here, followed by the findings on the suggested approach's performance on a benchmark dataset.
This research by Helander, Elina, et al. [23] was constrained because it relied on data from just two healthy subjects. However, some findings were consistent with previous research. The behavioral weight-loss participants were tracked, on average, on 28 percent of weekdays and 17 percent of weekend days, and both individuals showed a reduction in self-monitoring frequency over the weekend. Weight fluctuates during the day and over time; an average daily weight fluctuation of 2% to 3% may be regarded as typical, with within-day weight fluctuations being more frequent than day-to-day fluctuations.
Hochstein, Axel, et al. [25] Using static Bayesian network theory, these authors develop the concept of
probabilistic event networks, which describe the relationships between regime shifts in time and their locations.
These authors demonstrate how RSVAR inference and learning algorithms must be altered to take higher-order
regime dynamics into account.
Lehman, Li-Wei H. et al. [32] investigate whether the SLDS framework can be used to follow the evolution of
patients' health status over time. The framework was applied to an ICU patient cohort during the first 24 hours of
hospitalization. These authors found that the vital-sign dynamics of patients who did not survive their hospital
stays evolved differently from those who did; as patients' health improves or worsens, the distribution of their
vital-sign dynamics likewise changes. These findings confirm their hypothesis.
Liu, Junjian et al. [35], using an ontology, developed a real-time monitoring system that tracks patient care flow
and compares it against the specified CP treatment requirements. The suggested system's monitoring data is
organized and stored in a database, making it easy for the computer to process and evaluate.
Mei, Jiangyuan, et al. [37] note that computer vision and pattern recognition applications rely on accurately
measuring and categorizing motion tracking signals (MTS), and they present a new method for measuring MTS.
In the new technique, the local distance between MTS is first computed using the Mahalanobis distance. DTW is
then used to discover the best alignment path for MTS that are out of synchronization or have different lengths.
Once this is done, the difference between two MTS may be derived for classification and clustering purposes.
Learning the Mahalanobis metric for the MTS dataset is another major issue in the proposed MDDTW measure;
LogDet divergence-based metric learning with triplet constraints was developed in this study for the MTS case.
The technique was tested on several well-known datasets, and the results showed that it was dependable and
accurate. One issue with the suggested framework is that it is computationally inefficient.
According to Pealat, Clement et al. [41], respiratory viruses significantly affect emergency departments (EDs) in
France each winter. To prepare for this, it is critical to have the means to monitor the passage of these viruses
through the patient population, which is exactly what is discussed in their paper.
Penfold, Robert B, and Zhang, Fang [42] describe interrupted time series analysis as a simple yet effective
evaluation of policies and programs. Despite its drawbacks, few statistical techniques are as well designed or as
persuasive in conveying an effect to an audience.
Pierleoni, Paola, et al. [43] present a new system for Parkinson's disease diagnosis and treatment whose
qualities make it an excellent choice for ambulatory and home monitoring. Tremors may be classified using a
basic IMU device and a set of algorithms that measure symptom intensity in real time on the UPDRS scale. The
system also provides the neurologist with a mechanism for annotating illness severity and progression. The
report is thus unaffected by subjective judgments, such as those made by medical experts who alternate in
successive analyses of a patient.
Rusanov, Alexander, et al. [47] use data-driven time-series clustering to identify people with controlled and
uncontrolled diabetes. The strategy may also apply to other illness factors or long-term medical issues.
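As an illustration of this kind of data-driven grouping (not Rusanov et al.'s actual pipeline), one might summarize each patient's lab time series into simple features and cluster them; the feature choice and the plain k-means below are assumptions for the sketch.

```python
def summarize(series):
    """Map a lab-value time series to (mean level, net change) features."""
    return (sum(series) / len(series), series[-1] - series[0])

def kmeans(points, k, iters=50):
    """Plain k-means with deterministic initialization (first k points)."""
    centers = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # assign each point to its nearest center
        for i, p in enumerate(points):
            assign[i] = min(range(k), key=lambda c: sum(
                (a - b) ** 2 for a, b in zip(p, centers[c])))
        # recompute centers as cluster means
        for c in range(k):
            members = [points[i] for i in range(len(points)) if assign[i] == c]
            if members:
                centers[c] = [sum(v) / len(members) for v in zip(*members)]
    return assign, centers
```

Patients whose values stay flat near a healthy level end up in one cluster, while those with high, rising values land in another.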
Sana Imtiaz et al. [48] develop an online system that predicts a user's food habits and health statistics from a
fitness monitoring app or a wearable device. To this end, the authors designed and deployed a pipeline that can
reliably forecast user behavior and exploit commonalities between people to increase model performance while
ensuring data privacy. Assuming that the dataset and features are consistent, their predictions deviate from
reality by less than 0.025 percent.
Shi, Yong, et al. [50] study home health care logistics optimization, which is of major importance since
transportation expenses are one of the main categories of spending in the business.
According to Sindhu Shantha Nair et al. [52], ethical considerations must be taken into account when applying
these dimensions to the healthcare business. Healthcare organizations must practice these aspects to the letter to
achieve a competitive edge, and these fields may benefit from more education and training to improve
productivity. When understood and acted upon, these perceptions and characteristics of the organizational
pedestal contribute morally to healthcare excellence.
Stylianides, Nikolas, et al. [55] demonstrate the suggested solution's usability and cost-effectiveness in the
paper's assessment scenario. As discussed earlier, many research institutions have formed common repositories
to share medical data, and researchers may access the data they need from databases or binary files.
Vascu, Todor, et al. [57] model sensor data streams using semantics. WSN and WBAN data streams demand
adaptable architectures based on multiagent systems to meet the needs of real-time vital-signs monitoring. The
system creates semantically driven sensor data streams using the suggested approach; attributes such as
timestamps, vital signs, and values define each reading.
Yandong Zheng et al. [61] provide an efficient and privacy-preserving forward algorithm, which they then
employ to build a healthcare monitoring system. They first present a collection of mutually orthogonal matrices,
followed shortly by a strategy for constructing one.
Zhang, Ying, et al. [64] show a consistent correlation between physiological and clinical events using their
method for synchronized data collection and clinical annotation. Even though hardware capabilities might
affect its performance, the system gathered data and evaluated patient monitoring algorithms in real time at the
bedside.
III. CONCLUSION
Two developments are anticipated in the future growth of technology and healthcare. First, the rising
sophistication of computerization and software development will be a potent combination for forecasting:
various complicated procedures and strategies that are now only imaginable may soon be realized and used in
real circumstances. The variance and complexity of data are also growing along with its quantity, and the
number of healthcare departments requiring large-scale data forecasting will expand. In the future, there will be
rapid growth in the combined use of forecasting and data mining. There are various benefits to developing new
forecasting methodologies, such as those based on soft computing technologies, which may produce more
accurate forecasting results and more efficient processes than conventional approaches. A thorough evaluation
of existing time series forecasting approaches is expected to serve as a guide for future classification and
analysis research in the field.
REFERENCES
[1] Abbasi, A, France, S, Zhang, Z & Chen, H 2011, 'Selecting Attributes for Sentiment Classification Using Feature
Relation Networks', IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 3, pp. 447-462.
[2] Abdul Jalil Niazai; Abdullah Zahirzada; Mohammad Akbar Shahpoor; Abdul Rahman Safi; 2020 IEEE International
Conference on Advent Trends in Multidisciplinary Research and Innovation (ICATMRI)
[3] Agarwal, B & Mittal, N 2016, 'Machine Learning Approach for Sentiment Analysis'. In Prominent Feature Extraction
for Sentiment Analysis (pp. 21-45). Springer International Publishing.
[4] Agarwal, B, Poria, S, Mittal, N, Gelbukh, A & Hussain, A 2015, 'Concept-Level Sentiment Analysis with Dependency-
Based Semantic Parsing: A Novel Approach'. Cognitive Computation, vol. 7, no. 4, pp. 487-499.
[5] Ahmad, S, Kundi, FM, Tareen, I & Asghar, MZ 2016, 'Lexical Based Semantic Orientation of Online Customer
Reviews and Blogs'. arXiv preprint arXiv:1607.02355.
[6] Almeida, Rui Jorge; Lesot, Marie-Jeanne; Bouchon-Meunier, Bernadette; Kaymak, Uzay; Moyse, Gilles (2013). IEEE
International Conference on Fuzzy Systems (FUZZ-IEEE) - Linguistic summaries of categorical time series for septic
shock patient data.
[7] Bagheri, A, Saraee, M & De Jong, F 2013, 'Care More About Customers: Unsupervised Domain-Independent Aspect
Detection for Sentiment Analysis of Customer Reviews'. Knowledge-Based Systems, vol. 52, pp. 201-213.
[8] Baldassano, Steven; Gelfand, Michael; Bhalla, Paulomi Kadakia; Hill, Chloe; Christini, Amanda; Wagenaar, Joost; Litt,
Brian; Roberson, Shawniqua Williams; Balu, Ramani; Scheid, Brittany; Bernabei, John; Pathmanathan, Jay; Oommen,
Brian; Leri, Damien; Echauz, Javier (2020). IRIS: A Modular Platform for Continuous Monitoring and Caretaker
Notification in the Intensive Care Unit. IEEE Journal of Biomedical and Health Informatics.
[9] Biem, A.; Feng, H.; Riabov, A. V.; Turaga, D. S. (2013). Real-time analysis and management of big time-series data.
IBM Journal of Research and Development, 57(3), 8:1–8:12.
[10] Cao, Xi Hang; Han, Chao; Obradovic, Zoran (2018). IEEE International Conference on Healthcare Informatics (ICHI) -
Learning a Dynamic-Based Representation for Multivariate Biomarker Time Series Classifications.
[11] Chen, CC & Tseng, YD 2011, 'Quality Evaluation of Product Reviews Using an Information Quality Framework',
Decision Support Systems, vol. 50, no. 4, pp. 755-768.
[12] Dai, Xiangfeng; Bikdash, Marwan (2017). Trend Analysis of Fragmented Time Series for Health Apps: Hypothesis
Testing Based Adaptive Spline Filtering Method with Importance Weighting.
[13] Divya Gupta; M. P. S. Bhatia; Akshi Kumar (2021). Resolving Data Overload and Latency Issues in Multivariate Time-
Series IoT Data for Mental Health Monitoring.
[14] Dugast, Mael; Bouleux, Guillaume; Mory, Olivier; Marcon, Eric (2018). Improving Health Care Management Through
Persistent Homology of Time-Varying Variability of Emergency Department Patient Flow. IEEE Journal of Biomedical
and Health Informatics.
[15] Fabien Viton; Mahmoud Elbattah; Jean-Luc Guerin; Gilles Dequen (2020). Heatmaps for Visual Explainability of CNN-
Based Predictions for Multivariate Time Series with Application to Healthcare. 2020 IEEE International Conference on
Healthcare Informatics (ICHI).
[16] Ferenti, Tamas (2017).IEEE 30th Neumann Colloquium (NC) - Biomedical applications of time series analysis.
[17] Fujita, Hamido (2017).IEEE 15th International Symposium on Intelligent Systems and Informatics (SISY) - Data
analytics for cloud healthcare and risk predictions based on ensemble classifiers and subjective projection.
[18] Gang, Qu; Cui, Shengnan; Jiafu, Tang (2014). The 26th Chinese Control and Decision Conference (2014 CCDC) -
Time series forecasting of medicare fund expenditures based on historical data.
[19] Garg, Bindu; Beg, M. M. Sufyan; Ansari, A. Q. 2012 Annual Meeting of the North American Fuzzy Information
Processing Society (NAFIPS) - A new computational fuzzy time series model to forecast several outpatient visits.
[20] Gunnarsdottir, Kristin; Sadashivaiah, Vijay; Kerr, Matthew; Santaniello, Sabato; Sarma, Sridevi V.2016 38th Annual
International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) - Using demographic and
time-series physiological features to classify sepsis in the intensive care unit.
[21] Hai, Z, Chang, K, Kim, JJ & Yang, CC 2014, 'Identifying Features in Opinion Mining Via Intrinsic and Extrinsic
Domain Relevance', IEEE Transactions on Knowledge and Data Engineering, vol. 26, no. 3, pp. 623-634.
[22] Hajihashemi, Zahra; Popescu, Mihail (2015). A Multi-dimensional Time Series Similarity Measure with Applications to
Eldercare Monitoring. IEEE Journal of Biomedical and Health Informatics.
[23] Helander, Elina; Pavel, Misha; Jimison, Holly; Korhonen, Ilkka (2015). 37th Annual International Conference of the
IEEE Engineering in Medicine and Biology Society (EMBC) - Time-series modeling of long-term weight self-
monitoring data.
[24] Hirano, Shoji; Tsumoto, Shusaku (2017). 6th International Conference on Informatics, Electronics and Vision & 2017
7th International Symposium in Computational Medical and Health Technology (ICIEV-ISCMHT) - Towards
knowledge discovery from heterogeneous time-series medical databases
[25] Hochstein, Axel; Hyung-Il Ahn; Ying Tat Leung; Denesuk, Matthew 2014 International Conference on Prognostics and
Health Management - Switching vector autoregressive models with higher-order regime dynamics Application to
prognostics and health management.
[26] Jiang, L, Yu, M, Zhou, M, Liu, X, & Zhao, T 2011, 'Target-Dependent Twitter Sentiment Classification'. In Proceedings
of the 49th Annual Meeting of the Association for Computational Linguistics, vol. 1, pp. 151-160.
[27] Kalaivani, P & Shunmuganathan, KL 2013, 'Sentiment Classification of Movie Reviews By Supervised Machine
Learning Approaches', Indian Journal of Computer Science and Engineering (IJCSE), vol. 4, no. 4, pp. 285-292.
[28] Kale, David C.; Gong, Dian; Che, Zhengping; Liu, Yan; Medioni, Gerard; Wetzel, Randall; Ross, Patrick (2014). IEEE
International Conference on Data Mining - An Examination of Multivariate Time Series Hashing with Applications to
Health Care.
[29] Kamal, A, Abulaish, M & Anwar, T 2012, „Mining Feature-Opinion Pairs and Their Reliability Scores From Web
Opinion Sources‟, In Proceedings of the 2nd International Conference on Web Intelligence Mining and Semantics
ACM, pp. 15.
[30] Kristoffer Wickstrom; Karl Oyvind Mikalsen; Michael Kampffmeyer; Arthur Revhaug; Robert Jenssen
(2021). Uncertainty-Aware Deep Ensembles for Reliable and Explainable Predictions of Clinical Time Series. IEEE
Journal of Biomedical and Health Informatics.
[31] Lavergne, M. Ruth; Law, Michael R.; Peterson, Sandra; Garrison, Scott; Hurley, Jeremiah; Cheng, Lucy; McGrail,
Kimberlyn (2017). Effect of incentive payments on chronic disease management and health services use in British
Columbia, Canada: Interrupted time series analysis.
[32] Lehman, Li-Wei H.; Nemati, Shamim; Adams, Ryan P.; Moody, George; Malhotra, Atul; Mark, Roger G. 2013 35th
Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) - Tracking
progression of the patient state of health in critical care using inferred shared dynamics in physiological time series.
[33] Lin, C, He, Y, Everson, R & Rüger, S 2012, 'Weakly Supervised Joint Sentiment-Topic Detection From Text',
Knowledge and Data Engineering, IEEE Transactions, vol. 24, no. 6, pp. 1134-1145.
[34] Liu, Bo; Li, Jianqiang; Chen, Cheng; Tan, Wei; Chen, Qiang; Zhou, Menchu (2015). Efficient Motif Discovery for
Large-Scale Time Series in Healthcare. IEEE Transactions on Industrial Informatics.
[35] Liu, junjian; Huang, Zhengxing; Lu, Xudong; Duan, Huilong (2014). 7th International Conference on Biomedical
Engineering and Informatics - An ontology-based real-time monitoring approach to a clinical pathway.
[36] Liu, K, Xu, L & Zhao, J 2012, 'Opinion Target Extraction Using Word-Based Translation Model', In Proceedings of
the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language
Learning, Association for Computational Linguistics, pp. 1346-1356.
[37] Mei, Jiangyuan; Liu, Meizhu; Wang, Yuan-Fang; Gao, Huijun (2015). Learning a Mahalanobis Distance-Based
Dynamic Time Warping Measure for Multivariate Time Series Classification. IEEE Transactions on Cybernetics.
[38] Moraes, R, Valiati, JF & Neto, WPG 2013, „Document-Level Sentiment Classification: An Empirical Comparison
Between SVM and ANN‟, Expert Systems with Applications, vol. 40, no. 2, pp. 621-633.
[39] Nickerson, Paul; Baharloo, Raheleh; Wanigatunga, Amal A.; Manini, Todd D.; Tighe, Patrick J.; Rashidi, Parisa
(2017). Transition Icons for Time Series Visualization and Exploratory Analysis. IEEE Journal of Biomedical and
Health Informatics.
[40] Ordonez, P.; Oates, T.; Lombardi, M. E.; Hernandez, G.; Holmes, K. W.; Fackler, J.; Lehmann, C. U.
(2012). Visualization of multivariate time-series data in a neonatal ICU.
[41] Pealat, Clement; Bouleux, Guillaume; Cheutet, Vincent (2019). IEEE EMBS International Conference on Biomedical &
Health Informatics (BHI) - Extracting Most Impacting Emergency Department Patient Flow By Embedding Laboratory-
confirmed and Clinical Diagnosis on The Stiefel Manifold.
[42] Penfold, Robert B.; Zhang, Fang (2013). Use of Interrupted Time Series Analysis in Evaluating Health Care Quality
Improvements.
[43] Pierleoni, Paola; Palma, Lorenzo; Belli, Alberto; Pernini, Luca (2014). IEEE-EMBS International Conference on
Biomedical and Health Informatics (BHI) - A real-time system to aid clinical classification and quantification of tremor
in Parkinson's disease.
[44] Pratyaksa, Hans; Permanasari, Adhistya Erna; Fauziati, Silmi; Fitriana, Ida (2016). 1st International Conference on
Biomedical Engineering (BIOMED) - ARIMA implementation to predict the amount of antiseptic medicine usage in a
veterinary hospital.
[45] Rajaei, Rasoul; Shafai, Bahram; Ramezani, Amin (2017). IEEE High-Performance Extreme Computing Conference
(HPEC) - A top-down scheme of descriptive time series data analysis for a healthy life: Introducing a fuzzy amended
interaction network.
[46] Roberts, Lauren; Michalak, Peter; Heaps, Sarah; Trenell, Michael; Wilkinson, Darren; Watson, Paul (2018). IEEE 14th
International Conference on e-Science (e-Science) - Automating the Placement of Time Series Models for IoT
Healthcare Applications.
[47] Rusanov, Alexander; Prado, Patric V.; Weng, Chunhua (2016). IEEE International Conference on Healthcare
Informatics (ICHI) - Unsupervised Time-Series Clustering Over Lab Data for Automatic Identification of Uncontrolled
Diabetes.
[48] Sana Imtiaz; Sonia-Florina Horchidan; Zainab Abbas; Muhammad Arsalan; Hassan Nazeer Chaudhry; Vladimir
Vlassov (2020). Privacy-Preserving Time-Series Forecasting of User Health Data Streams. 2020 IEEE International
Conference on Big Data (Big Data).
[49] Shamsuddin, Rittika; Maweu, Barbara M.; Li, Ming; Prabhakaran, Balakrishnan (2018). IEEE International Conference
on Healthcare Informatics (ICHI) - Virtual Patient Model: An Approach for Generating Synthetic Healthcare Time
Series Data.
[50] Shi, Yong; Boudouh, Toufik; Grunder, Olivier (2017). A hybrid genetic algorithm for a home health care routing
problem with a time window and fuzzy demand. Expert Systems with Applications.
[51] Shukla, Shubhangu; Singh, Pulkit; Neopane, Narayan; Rishabh, (2019). 2019 4th International Conference on
Information Systems and Computer Networks (ISCON) - Health Care Management System Using Time Series Analysis.
[52] Sindhu Shantha Nair; Kennedy Andrew Thomas; Smritika S. Prem (2021). The organizational pedestal of quality of care
climate in health care excellence. Zeitschrift für Evidenz, Fortbildung und Qualität im Gesundheitswesen.
[53] Sobkowicz, P, Kaschesky, M & Bouchard, G 2012, 'Opinion Mining in Social Media: Modeling, Simulating and
Forecasting Political Opinions in the Web', Government Information Quarterly, vol. 29, no. 4, pp. 470-479.
[54] Stavrianou, A & Brun, C 2015, 'Expert Recommendations Based on Opinion Mining of User-Generated Product
Reviews', International Journal of Computational Intelligence, vol. 31, no. 1, pp. 165-183.
[55] Stylianides, Nikolas; Dikaiakos, Marios; Gjermundrod, Harald; Theodoros, (2012). 2012 IEEE 12th International
Conference on Bioinformatics & Bioengineering (BIBE) - Intensive Care Cloud: Exploiting cloud infrastructures for
near real-time vital sign analysis in intensive care medicine.
[56] Takeuchi, H.; Mayuzumi, Y.; Kodama, N. (2011). Annual International Conference of the IEEE Engineering in
Medicine and Biology Society - Analysis of time-series correlation between weighted lifestyle data and health data.
[57] Vascu, Todor; Frincu, Marc; Negru, Viorel (2016). International Symposium on innovations in Intelligent systems and
Applications (INISTA) - Energy-efficient sensors data stream model for real-time and continuous vital signs monitoring.
[58] Vong, Keovessna; Rasmequan, Suwanna; Chinnasarn, Krisana; Harfield, Antony (2015). 8th Biomedical Engineering
International Conference (American) - Empirical modeling for dynamic visualization of ICU patient data streams.
[59] Wickramasinghe, Asanga; Ranasinghe, Damith C.; Fumeaux, Christophe; Hill, Keith D.; Visvanathan, Renuka
(2016). Sequence Learning with Passive RFID Sensors for Real-Time Bed-egress Recognition in Older People. IEEE
Journal of Biomedical and Health Informatics.
[60] Xu, K, Liao, SS, Li, J & Song, Y 2011, 'Mining Comparative Opinions From Customer Reviews for Competitive
Intelligence', Decision Support Systems, vol. 50, no. 4, pp. 743-754.
[61] Yandong Zheng; Rongxing Lu; Songnian Zhang; Yunguo Guan; Jun Shao; Hui Zhu (2022). Toward Privacy-Preserving
Healthcare Monitoring Based on Time-Series Activities Over Cloud. IEEE Internet of Things Journal.
[62] Yang, Chengliang; Delcher, Chris; Shenkman, Elizabeth; Ranka, Sanjay 2018. IEEE 20th International Conference on e-
Health Networking, Applications and Services (Healthcom) - Clustering Inter-Arrival Time of Health Care Encounters
for High Utilizers.
[63] Yanke Hu; Raj Subramanian; Wangpeng An; Na Zhao; Weili Wu (2020). Faster Healthcare Time Series Classification for
Boosting Mortality Early Warning System. 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems
(IROS).
[64] Zhang, Ying; Silvers, Christine Tsien; Randolph, Adrienne G. 2007 29th Annual International Conference of the IEEE
Engineering in Medicine and Biology Society - Real-Time Evaluation of Patient Monitoring Algorithms for Critical
Care at the Bedside.
[65] Zina M. Ibrahim; Daniel Bean; Thomas Searle; Linglong Qian; Honghan Wu; Anthony Shek; Zeljko Kraljevic; James
Galloway; Sam Norton; James T Teo; Richard JB Dobson (2022). A Knowledge Distillation Ensemble Framework for
Predicting Short- and Long-Term Hospitalization Outcomes From Electronic Health Records Data. IEEE Journal of
Biomedical and Health Informatics.
[66] Zwilling, Chris E.; Wang, Michelle Yongmei (2014). IEEE Healthcare Innovation Conference (HIC) - Multivariate
Voronoi outlier detection for time series.
[67] Zikos, D. and Ostwal, D., "A Platform based on Multiple Regression to Estimate the Effect of in-Hospital Events on Total
Charges," in Healthcare Informatics (ICHI), 2016 IEEE International Conference on, 2016, pp. 403-408.
Grenze International Journal of Engineering and Technology, June Issue
Abstract—A mobile ad hoc network (MANET) is a wireless system that has no established
infrastructure; its nodes are able to communicate without a central authority. MANETs are
ideal for emergency circumstances, vehicular networks, and military activities. However, the
MANET's flexibility makes it vulnerable to attacks, and the black hole attack is one of the most
common threats to MANETs. In this attack, an unauthorized node claims to have the best path
to a target node, causing data packets to be misdirected and then dropped. Several fixes have
been proposed to date. An overview of black hole attack prevention measures and a conclusion
are presented in this work.
Index Terms— Cooperative Black Hole Attack, Black Hole Attack, Malicious Node, Packets.
I. INTRODUCTION
This short-term network consists only of mobile nodes, each capable of sending and receiving data by itself,
without the aid of any fixed infrastructure. Using multi-hop communication, MANET nodes can exchange
information even when a destination node and a source node are not within direct communication range (Goyal,
Parmar and Rishi, 2011). A MANET has a dynamic topology, since nodes can connect and disconnect quickly.
This dynamic topology makes the network more vulnerable to a wide range of attacks; as a result, building such
a network and ensuring its route stability is extremely difficult. Other forms of malicious attacks are carried out
against MANETs as well; however, our primary focus is on black hole attacks (Gerhards-Padilla, Aschenbruck
and Martini, 2010). The healthcare industry demands round-the-clock monitoring, which includes both routine
updates and real-time emergency alerts sent across the network. The main problem in these situations is that
attacker nodes appear, causing unnecessary delays and potentially disastrous effects, including traffic congestion
and network delays. When an attacker launches a black hole attack, the attacking node pretends to have the
quickest path to the target node. Using this strategy, the attacker's node generates a bogus route, and all traffic
is diverted to that node (Percher et al., 2004). As a result, the attacker node is able to intercept all of the packets
sent to or from the designated destination (Santhakumar and Prabha, 2017). To find and isolate black hole
nodes inside a MANET, the Timer Based Baited Technique (TBBT) combines timers and baiting. Through the
use of a bait message, this strategy enhances the ability to detect black holes: false-ID baiting is used to find the
network's black-hole nodes. This method, on the other hand, increases network latency while decreasing
throughput (Yasin and Abu Zant, 2018). The Counter & Timer Based Baited Technique (CTBM) for splitting
black hole attacks in MANETs is presented to address these issues. The baiting message, the non-neighbor
reply, and the retort are all part of this strategy. The network's "black hole nodes" are
Figure 1: Representation of source and destination node (source: Kaur and Kaur, 2017)
A source node is shown as node 1 in Fig. 1, while a destination node is shown as node 4. When the RREQ
packets from the source node are received, node 3 is taken to be malicious, since it replies that its path to the
target node is the shortest (Baadache and Belmehdi, 2017).
A malicious node is responsible for the loss of data packets. In the context of a MANET, such a malicious node
is referred to as a "black hole."
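The false-ID baiting idea behind TBBT can be sketched as follows: broadcast an RREQ for a destination that does not exist, so any node that replies must be fabricating routes. This is a toy model (the Node class and its fields are our assumptions), and it omits TBBT's timer mechanism.

```python
class Node:
    """Toy MANET node: an honest node replies to an RREQ only for
    destinations it actually has a route to; a black-hole node claims
    a route to any destination."""
    def __init__(self, nid, malicious=False):
        self.nid = nid
        self.malicious = malicious
        self.routes = set()   # destination ids this node knows routes to

    def replies_to(self, dest):
        return True if self.malicious else dest in self.routes

def bait_detect(nodes, bait_id="no-such-node"):
    """False-ID baiting: flood an RREQ for a destination that does not
    exist; any node that replies is flagged as a black hole."""
    return [n.nid for n in nodes if n.replies_to(bait_id)]
```

Once flagged, the node can be excluded from route discovery, which is the isolation step the paper describes.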
C. Cooperative Black hole attack
During a cooperative black hole attack, numerous malicious nodes work together to break the routing protocol's
specifications. Figure 2 depicts a cooperative black hole attack (Baadache and Belmehdi, 2017).
In Fig. 2, the source node is designated by the letter A, while the destination node is denoted by the letter D.
Nodes B1 and B2 are collaborating attackers. The source node transmits the Further Request (FReq) to B2 over
routes other than through B1; because B2 is the next hop of node B1, this verifies whether node B2 really has a
route to the target (Baadache and Belmehdi, 2017). However, data packets are dropped by rogue node B1,
despite the fact that both B1 and B2 produce FurtherRep (FRep) packets claiming to have the most secure and
fastest route.
This scheme is meant for non-cooperative attacks, but it is unsuccessful against a black hole, since the black-hole
node transmits phony proof to the audit node, rendering the detection ineffective. Furthermore, the attacker
node's origin is not recorded, since the behavioral proof in React stores information only about transmitted
packets, not about nodes.
D. Neighborhood based Routing Recovery System
The Routing Recovery Scheme is based on the AODV protocol to identify black holes in the neighborhood.
Routing recovery protocols are used to identify the attack and build the correct path (Baadache and Belmehdi,
2017). A Modify-Route-Entry control message is sent to the destination node if the paths are not the same.
This method achieves a high detection rate while requiring little time to detect. When an attacker creates a
forged RREP packet, however, the scheme fails.
V. RESULTS
A. Single Black-Hole Attack
Because of the packet dropping brought on by the black-hole attack, throughput was at its lowest when a single
black hole was present. When no black-hole attack is present in the network, native AODV gave the greatest
throughput. When a black-hole attack is present, the throughput of TBBT is higher than that of native AODV,
but it is lower than native AODV when there is no black-hole attack. The suggested TBBT improves
performance by discarding any replies from unidentified nodes that claim to have a quicker path to the target
node than any other node, which comes at some cost in throughput. Additionally, the location of the black-hole
attack is crucial, since it can be situated on the shortest route from source to destination.
Figure 3: Results of Throughput versus the number of nodes (source: Yasin and Abu Zant, 2018)
Figure 4: Results of average End-to-End Delay versus the number of nodes (source: Yasin and Abu Zant, 2018)
Figure 5: Results of PDR versus the number of nodes (source: Yasin and Abu Zant, 2018)
Figure 6: Results of Throughput against the number of black-hole nodes (source: Yasin and Abu Zant, 2018)
Figure 7: Results of average End-to-End Delay against the number of black-hole nodes (source: Yasin and Abu Zant, 2018)
Figure 8: Results of PDR against the number of black-hole nodes (source: Yasin and Abu Zant, 2018)
VI. CONCLUSION
Black-hole attacks are one of the most serious dangers to MANETs. To keep the network from collapsing,
black-hole nodes must be identified and isolated. This work surveyed techniques for identifying and shutting
down black holes, which should be considered when developing protocols or methods resistant to black-hole
attacks. The TBBT technique uses timers and baiting to improve black-hole identification while preserving
End-to-End Delay, Throughput, and Packet Delivery Ratio. In simulation, the Throughput and Packet Delivery
Ratio of the proposed technique were determined to be almost identical to native AODV. We hope to improve
this model's throughput and packet delivery ratio while reducing overall delay.
REFERENCES
[1] Baadache, A. and Belmehdi, A., 2017. Solution for Black Hole and Cooperative Black Hole Attacks in Mobile Ad Hoc
Networks. Egyptian Computer Science Journal (ISSN-1110-2586)Volume 41– Issue 1, January 2017,.
[2] Bala, A., Bansal, M. and Singh, J., 2009. Performance analysis of MANET under blackhole attack. In 2009 First
International Conference on Networks & Communications (pp. 141-145). IEEE.,.
[3] Gerhards-Padilla, E., Aschenbruck, N. and Martini, P., 2010. TOGBAD-an approach to detect routing attacks in tactical
environments. Security and Communication Networks, 4(8), pp.793-806.
[4] Goyal, P., Parmar, V., & Rishi, R., 2011. Manet: vulnerabilities, challenges, attacks, application. IJCEM International
Journal of Computational Engineering & Management, 11(2011), 32-37.,.
[5] Kaur, R. and Kaur, A., 2017. Technique for Detection and Isolation of Black Hole Attack in MANETs. International
Journal of Computer Applications, 174(4), pp.22-25.
[6] Kumar, V. and Kumar, R., 2015. An Adaptive Approach for Detection of Blackhole Attack in Mobile Ad hoc
Network. Procedia Computer Science, 48, pp.472-479.
[7] Nakayama, H., Kurosawa, S., Jamalipour, A., Nemoto, Y. and Kato, N., 2009. A Dynamic Anomaly Detection Scheme
for AODV-Based Mobile Ad Hoc Networks. IEEE Transactions on Vehicular Technology, 58(5), pp.2471-2481.
[8] Patil, A. and Kshirsagar, D., 2020. Blackhole attack detection and prevention by real time monitoring. In 2013 Fourth
International Conference on Computing, Communications and Networking Technologies (ICCCNT) (pp. 1-5). IEEE.,.
[9] Percher, J., Puttini, R., Mé, L., Sousa, d., Jouga, B. and Albers, P., 2004. A fully distributed IDS for MANET. In
Proceedings. ISCC 2004. Ninth International Symposium on Computers And Communications (IEEE Cat. No.
04TH8769) (Vol. 1, pp. 331-338). IEEE.,.
[10] PratapSingh, H., Pal Singh, V. and Singh, R., 2013. Cooperative Blackhole/ Grayhole Attack Detection and Prevention
in Mobile Ad hoc Network: A Review. International Journal of Computer Applications, 64(3), pp.16-22.
[11] Santhakumar, R. and Prabha, N., 2017. Resource Allocation In Wireless Networks By Channel Estimation And Relay
Assignment Using Data-Aided Techniques. International Journal of MC Square Scientific Research, 9(3), pp.40-47.
[12] Sarma, K. J., Sharma, R., & Das, R., 2014. A survey of black hole attack detection in manet. In 2014 International
Conference on Issues and Challenges in Intelligent Computing Techniques (ICICT) (pp. 202-205).,.
[13] Sherif, A., Elsabrouty, M. and Shoukry, A., 2013. A novel taxonomy of black-hole attack detection techniques in mobile
Ad-hoc network (MANET). In 2013 IEEE 16th International Conference on Computational Science and Engineering
(pp. 346-352). IEEE.
[14] Woungang, I., Dhurandher, S., Obaidat, M. and Peddi, R., 2013. A DSR-based routing protocol for mitigating
blackhole attacks on mobile ad hoc networks. Security and Communication Networks, 9(5), pp.420-428.
[15] Yasin, A. and Abu Zant, M., 2018. Detecting and Isolating Black-Hole Attacks in MANET Using Timer Based Baited
Technique. Wireless Communications and Mobile Computing, 2018, pp.1-10.
[17] Zant, A., 2017. Detection and Prevention of Wormhole Attack in MANET: A Review. International Journal of Science
and Research.
Grenze International Journal of Engineering and Technology, June Issue
Abstract—A wideband band stop filter based on the signal interference technique is presented.
The proposed filter combines a coupled line in transmission path 1 with a stepped-impedance
modified pi-type transmission line loaded with open-circuited stubs in transmission path 2. The
filter is designed at a frequency of 0.9 GHz. The simulated 3 dB bandwidth is 0.8 GHz, and the
insertion loss remains below -18 dB (approaching -20 dB) from 0.2 GHz to 2.14 GHz. The ABCD
parameters of path 2 are derived, and the resulting equations are solved to obtain the positions
of the transmission zeros.
Index Terms— Signal Interference Technique, Coupled Line, Open Circuited Stub, Fractional
Bandwidth, Insertion loss and ABCD parameters.
I. INTRODUCTION
The performance of a wireless system degrades when there is interference from coexisting communication
systems. Modern high-data-rate wireless applications therefore demand compact wideband band stop filters for
effective suppression of spurious signals, a demand reinforced by advances in microwave integrated circuits.
Wideband band stop filters with large fractional bandwidth and low passband insertion loss are essential in GSM,
Zigbee, and WLAN applications. Several methods [1-6] have been reported for designing wideband band stop
filters. A tunable, compact wideband band stop filter (BSF) for wireless applications, built from a rectangular
microstrip open-loop resonator with a stub-loaded resonator as building blocks, was reported in [2]. A wideband
BPF with an open coupled line in path 1 and a transmission line in path 2, offering sharp selectivity with six
transmission zeros based on the signal interference technique, was reported in [3]. Highly selective fifth-order
wideband band-pass filters (BPFs) with multiple transmission zeros based on transversal signal-interaction
concepts were proposed in [4]; there, transmission paths comprising a shorted stub and a pair of open coupled
lines realize signal transmission from Port 1 to Port 2. The effects of electromagnetic coupling in
parallel-conductor inhomogeneous transmission lines were studied in [5], which demonstrated that the
characteristics of coupled-line circuits embedded in an inhomogeneous dielectric (for example, a suspended
substrate) differ greatly from those in a homogeneous medium. A triple wideband band pass filter (TWB-BPF)
with compact size was presented in [6].
A. Design Equations
In the proposed filter design, the ABCD parameters of the transmission-line sections along path 2 are given as:

   T1 = [ cos θ1        jZ1 sin θ1 ]
        [ jY1 sin θ1    cos θ1     ]          → (1)

   Ts = [ 1             0 ]
        [ jYs tan θs    1 ]                   → (2)

   T2 = [ cos θ2        jZ2 sin θ2 ]
        [ jY2 sin θ2    cos θ2     ]          → (3)

where Zi and θi denote the characteristic impedance and electrical length of each section, Yi = 1/Zi, and Eq. (2)
is the ABCD matrix of the shunt open-circuited stub. Cascading these matrices gives the overall ABCD matrix of
path 2, and equating the resulting transmission coefficient to zero yields the transmission-zero frequencies
fi (Eq. (20)).
Two zeros are therefore shown to reside at 0.67 GHz and 1.12 GHz respectively, in agreement with the
simulation results.
Fig. 2. Snapshot of Ansoft Circuit Schematic for the proposed topology where ZE=125 Ω, Zo=120 Ω, Z1=90 Ω, Z2=107 Ω, Zs=40 Ω
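The cascading step described above can be sketched numerically. The snippet below is a minimal illustration, not the authors' code: only the impedance values (Z1 = 90 Ω, Zs = 40 Ω, Z2 = 107 Ω) are taken from Fig. 2, while the line-stub-line topology and equal electrical lengths are our simplifying assumptions.

```python
import numpy as np

Z0 = 50.0  # reference impedance for the S-parameter conversion

def line(Z, theta):
    """ABCD matrix of a transmission-line section, as in Eqs. (1) and (3)."""
    return np.array([[np.cos(theta), 1j * Z * np.sin(theta)],
                     [1j * np.sin(theta) / Z, np.cos(theta)]])

def open_stub(Zs, theta):
    """ABCD matrix of a shunt open-circuited stub, as in Eq. (2)."""
    Y = 1j * np.tan(theta) / Zs  # input admittance of the open stub
    return np.array([[1, 0], [Y, 1]])

def s21(abcd):
    """Convert an ABCD matrix to the transmission coefficient S21."""
    A, B, C, D = abcd.ravel()
    return 2 / (A + B / Z0 + C * Z0 + D)

def path2(theta):
    """Hypothetical path-2 cascade (line - stub - line); impedances from Fig. 2."""
    return line(90, theta) @ open_stub(40, theta) @ line(107, theta)

# sweep the electrical length and locate the stop-band dip of |S21|
thetas = np.linspace(0.1, np.pi - 0.1, 200)
mags = [abs(s21(path2(t))) for t in thetas]
print(min(mags))
```

The sweep shows the deep rejection produced when the open stub presents a near-short at a quarter-wave electrical length; locating the exact zero frequencies would additionally require the path-1 coupled-line response and the actual section lengths.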
III. FILTER DESIGN AND RESULTS
The circuit simulation of the proposed model was performed using Ansoft Designer SV and is reported in
Fig. 3. After the circuit simulation, a 3D model was constructed in HFSS to perform full-wave simulations; the
proposed filter's 3D model is shown in Fig. 4. Optimization was performed to refine the response. Data tables of
both results were exported and plotted on the same graph using MATLAB so that the two responses can be
compared, as reported in Fig. 5. The comparison shows that the circuit-simulation and full-wave simulation
results are in good agreement.
IV. CONCLUSIONS
A compact wideband band stop filter response is obtained using microstrip transmission lines with
open-circuited stubs based on the signal interference technique. The proposed model uses an open coupled line in
path 1 and a transmission line in path 2. Due to the superposition of the two transmission paths, a band stop
response with two zeros and six poles is obtained at an operating frequency of 0.9 GHz. The locations of the two
transmission zeros within the rejection band, at 0.67 GHz and 1.12 GHz respectively, were analysed and verified
using the design equations. Theoretical analysis and simulated results are in close agreement.
The proposed filter is to be fabricated on a Teslin sheet using an adhesive copper sheet. Future work involves
generating a DXF file from the HFSS 3D model; fabrication will then be carried out on the Teslin sheet by
pasting the adhesive copper sheet onto the design. Two connectors are to be attached on either side, making a
connection between the ground plane and the design, and these connectors will be used for obtaining
measurements. Measurements are to be performed with a vector network analyser (VNA), with the two ports of
the instrument connected to the two connectors; the VNA must be calibrated before the results are measured.
Various bending effects can then be studied on the fabricated prototype.
REFERENCES
[1] Hong, J-S. and Lancaster, M.J., "Microstrip Filters for RF/Microwave Applications", New York: Wiley, Apr. 2004.
[2] Ahamad A. Ibrahim, Omar K. El Shafey, et al., "Compact and Wideband microwave band stop filter for wireless
applications", Springer Analog Integrated Circuits and Signal Processing, Vol. 104, No. 3, Sep. 2020, pp. 243-250.
[3] Kanaparthi V. Phani Kumar and S. S. Karthikeyan, "Compact, high selectivity and wideband band pass filter with
multiple transmission zeros", AEU - International Journal of Electronics and Communications, Jul. 2018.
[4] Wen Jie Feng, Wen Quan Che, Yu Mei Chang, Su Yang Shi, and Quan Xue, "High Selectivity Fifth-Order Wideband
Bandpass Filters With Multiple Transmission Zeros Based on Transversal Signal-Interaction Concepts", IEEE
Transactions on Microwave Theory and Techniques, Vol. 61, No. 1, Jan. 2013.
[5] George I. Zysman and A. Kent Johnson, "Coupled transmission line networks in an inhomogeneous dielectric medium",
IEEE Transactions on Microwave Theory and Techniques, Vol. 17, No. 10, Oct. 1969.
[6] Yang Xiong, LiTian Wang, Li Gong, "Compact triple wideband band pass filter with transmission zeros", ETRI
Journal, Oct. 2018.
[7] Deng, P.H. and Tsai, J-T., "Design of microstrip cross-coupled band pass filter with multiple independent designable
transmission zeros using branch-line resonators", IEEE Microwave Wireless Compon. Lett., Apr. 2013, pp. 249-251.
Abstract—Blockchain presents new opportunities to develop new categories of digital services.
Although research on the topic is still in its early stages, it has mostly focused on the technical
and legal challenges rather than on utilizing this ground-breaking concept to create better
digital services. This study offers a novel blockchain-based e-voting system that tackles some of
the drawbacks of current systems, and assesses some of the well-known blockchain frameworks
for building such a system. It has long been challenging to develop a secure electronic voting
system that maintains the fairness and privacy of paper ballots while ensuring the transparency
and flexibility afforded by electronic systems in current electoral schemes. We concretely assess
the technology's potential through a case study covering the election process and the
implementation of a blockchain-based application that improves security and reduces the cost of
holding national elections. A blockchain is a type of distributed database that allows all data to
be shared among all network users. By definition, a blockchain system has several properties
suitable for an electronic voting system: it is independent of any centralized server, and its
distributed architecture guarantees high system performance and availability. Since each
participant holds the full data, they can verify, via the protocol, each block uploaded to the
chain. In the electronic voting system we propose, double-envelope encryption is combined with
blockchain technology. Voting is the mechanism for turning the people's opinions into action in
order to better manage the system. Traditional elections have not pleased either the people or
the government in recent years: they are not completely secure because ballots are easily
tampered with, they raise concerns about transparency and voter safety, and counting the votes
takes far too long.
Index Terms— Voting, Blockchain, Hashing of Fingerprint, Smart Contract, Mining, Merkle
tree.
I. INTRODUCTION
Democratic voting is an important and rigorous mechanism in all regions. Countries typically vote using
traditional paper ballots, mechanical devices, or electronic voting systems [1]. However, voting now calls for new
digital technology. Digital voting uses electronic voting machines, and there are two types of digital voting:
electronic voting, in which voters use devices to cast their votes at a voting centre, and i-voting, which is done
remotely and therefore needs a software interface. The essential requirements for determining whether a
democratic process is legitimate include accuracy, resilience against illegal conduct, efficiency, stability, and
transparency of the voting process.
Modern cryptocurrencies were developed in 2008 using the blockchain concept in conjunction with a number of
other technologies and computing ideas. Rather than relying on a centralized system or source, these digital
currencies use cryptosystems to secure data. Bitcoin was the first such blockchain-based cryptocurrency [6]. A
cryptographic address is associated with information that represents electronic cash on the Bitcoin blockchain.
Users of Bitcoin can digitally agree to move ownership of that data to a different user, with a public record
of this transfer on the Bitcoin blockchain, enabling every connected user to independently confirm the
authenticity of the exchanges. A dispersed group of individuals stores, maintains, and cooperatively manages the
Bitcoin blockchain. Blockchain deployments are typically developed with a specific objective or purpose in
mind; examples include distributed ledger systems between businesses, cryptocurrencies, and smart contracts
(software loaded on a blockchain and executed by the machines running that chain). A continual stream of
innovations has emerged in the ledger-technology sector, and as the market landscape changes, new platforms
are frequently unveiled.
B. Voting methods
In a democracy, there are several ways to cast a ballot. Many nations have switched from straightforward
elections with ballots on paper to computerized voting machines over time. The following are the most popular
techniques [7]:
1. Ballot Paper
2. Vote over Internet
3. Electronic voting Machine
4. Biometric voting Machine
Ballot Paper: Writing down your preference on a piece of paper and submitting it is one of the
most basic election mechanisms, referred to as ballot paper voting. The ballot lists the names of
the candidates and the parties running in the election, with a column left vacant for voters to
express their preference. The guidelines are quite basic: mark your choice of candidate with a
cross, fold the paper, and place it in the ballot box. The ballot box must be taken care of and
monitored by the election authority in charge of overseeing the election [7].
Vote over Internet: With such a method, we can vote from the convenience of our own homes.
Unlike other internet-based services, this platform has received harsh criticism for the way it
operates. Administrations have repeatedly asked teams of cybersecurity specialists from around
the world to evaluate their voting platforms. The testing teams have frequently concluded that
not only can they alter the vote total, they can also erase all evidence that their intrusion ever
took place, leaving no digital fingerprint in the system by which to identify them [8]. This has
sparked considerable division among the populace. Justice for everyone is a key democratic
principle, and the right to vote is one of the major democratic celebrations. It is therefore the
responsibility of the government to make sure that the public has complete faith in the system
and that any problems are resolved.
Electronic voting Machine: The electoral authority in charge of overseeing the election presses
the Ballot button on the control unit, which initializes the EVM. When a voter presses the button
corresponding to the candidate he wishes to support, the voting system turns on an LED to let
him know that the vote was accurately recorded, and the machine then locks itself. Before the
next voter can cast a ballot, the person in charge must press the Ballot button again to release
the lock. This prevents a single person from casting numerous votes.
Biometric voting Machine: Researchers have devised a voting system based on fingerprints, in
which a biometric machine helps identify the voter. Numerous investigations have established
the uniqueness of each person's fingerprint, which can therefore be used to identify the voter
when a vote is cast. The entire process concludes quickly and painlessly. The government must
establish a nationwide scheme for voter registration, whereby everyone is fingerprinted and a
final list of all eligible voters is compiled. In conclusion, because voters can cast ballots from any
location in the world, we may run into a similar authentication issue while designing a
blockchain-based system; thus, to ensure that the voters who cast their votes are authentic, we
may utilize biometric sensors or even face recognition software [7].
C. EVM
Voting is done using electronic machines. This system is centralized, so voting data can easily be changed, and
there is no mechanism in place for voters to check the accuracy of their vote. The authors of [9] describe a
voting system based on blockchain in which each EVM is directly connected to another EVM in a network. This
method has three parts: peer-verified transactions, chain-manipulation detection, and fingerprint authentication.
It remains subject to DoS (denial-of-service) attacks and eavesdropping.
D. Blockchain Based E-voting
Due to the rise in popularity of cryptocurrencies, blockchain, the technology that underpins them, is receiving
increased attention from researchers, and numerous e-voting methods have been implemented in conjunction
with blockchain.
Electronic voting is the term used to describe voting that relies on electronic hardware or software. Such
systems may support a wide range of tasks, from starting the voting process to storing the votes, and run on
many kinds of devices, including computers, mobile devices, and the kiosks seen at voting offices [10].
Agora, an end-to-end verifiable blockchain-based voting system, was created as a polling solution for
governments and organizations. Agora used tokens to identify eligible voters in elections, and each eligible voter
received a token from an institution. However, such methods rely on a trusted third party to oversee the voting
process, and that third party could collaborate with the candidates to manipulate the election. Additionally, the
public can access and see the data kept in the blockchain. Both points compromise the security and
dependability of using blockchain for voting.
Blockchain-based voting protocols incorporate smart contracts and encryption algorithms to address this issue;
the third party is typically replaced by a smart contract, as in the smart-contract-based Open Vote Network
protocol. E-voting systems involve the following stages. The first stage is voter registration (registration). On
election day, officials check voters' IDs (verification and authentication). People who are qualified to vote may
then do so (casting and collation); each vote ought to be verified and encrypted, and the correctness,
confidentiality, and anonymity of the votes must all be ensured so that they cannot be modified in any manner.
The final stage for electronic voting systems is to add up all the votes in accordance with the design (counting
and display of results). The majority of e-voting applications use centralized authority control. Such systems
have a number of shortcomings and perceived hazards: there are no standards for electronic voting systems, and
they face security and reliability risks, fraud and hacking vulnerabilities, expensive machine costs, and insecure
transaction storage.
III. PROPOSED VOTING MECHANISM
A. Data Management of the system
A huge amount of data is generated during the election process, so data should be collected in a systematic
manner. Our system employs two types of storage: database storage and cloud storage.
B. Voter Registration
• Each person must visit their local voter registration office and submit the required documentation in
order to cast a valid ballot.
• A set of public and private keys will be created using a key creation method.
• In the blockchain network, voters are identified by their public key. The private key is sent to the
voter's mobile phone number; they can use this private key to cast a ballot and take part in the voting
system.
• A hash is generated from the voter's submitted fingerprint using a fingerprint hash-generation
algorithm; this is the procedure for creating hashes from the data provided by voters. [2]
• To create a new hash value, the generated hash is combined with the voter's other information.
• The final hash value serves as the voter's entry in the voter list stored in the blockchain's genesis block,
where the hash value acts as the voter's proof of membership.
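The registration steps above can be sketched as follows. This is a minimal illustration, not the paper's implementation: SHA-256 as the hash function, the NID field, and the simulated key pair are our assumptions, since the text does not fix the algorithms.

```python
import hashlib
import secrets

def generate_keypair():
    """Stand-in for the key-creation method. A real system would use an
    asymmetric scheme (e.g. ECDSA); here we only model the two values."""
    private_key = secrets.token_hex(32)
    public_key = hashlib.sha256(private_key.encode()).hexdigest()
    return private_key, public_key

def register_voter(fingerprint: bytes, nid: str, name: str) -> dict:
    # Step 1: hash the submitted fingerprint template
    fp_hash = hashlib.sha256(fingerprint).hexdigest()
    # Step 2: couple the fingerprint hash with the voter's other information
    record = f"{fp_hash}|{nid}|{name}"
    membership_proof = hashlib.sha256(record.encode()).hexdigest()
    # Step 3: the final hash enters the genesis-block voter list
    priv, pub = generate_keypair()
    return {"public_key": pub, "private_key": priv,
            "membership_proof": membership_proof}

entry = register_voter(b"<fingerprint template bytes>", "NID-0001", "A. Voter")
print(entry["membership_proof"])
```

The membership proof is deterministic for a given voter, so the smart contract can later recompute and compare it, while the key pair differs on every registration.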
C. Voting with a smart contract
The following are the functions of smart contracts running on the block chain:
• Voter Verification:
- Using an internet-connected device, the voter logs into the voting system with their private key.
- The voter submits their fingerprint, NID, and other data.
- Smart contracts on the blockchain compare the information submitted by the voter with the
information recorded in the genesis block.
- If the data matches, a list of candidates is shown to the voter. [13]
• Create a cast-vote block:
- The voter selects one of the candidates from the list and votes.
- The vote is signed with a digital signature, and the transaction is sent to the smart contract (SC).
- For each vote cast, the smart contract creates a voter ID (VID) to identify that vote.
- The vote count of the chosen candidate is incremented, and a block is created containing the
transactions made by the voter, together with their VID and the candidate's vote number. [2]
• Selection of Miner:
- A miner-selection algorithm is run by the SC. In Bitcoin, all miners compete to be the first to
generate the block's hash in order to prevent record interference, which requires a significant amount
of processing power. The proposed voting architecture instead chooses a miner based on heuristics
extrapolated from that miner's accomplishments: the SC weighs factors such as node capacity, energy
use, and delay. [5]
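A toy version of the casting flow above can illustrate how each vote becomes a hash-linked block. This is a sketch under simplifying assumptions (our own class and field names; a real deployment would run on a smart-contract platform with digital signatures rather than plain SHA-256 chaining):

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    # hash the block contents deterministically
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

class VoteChain:
    def __init__(self, voter_list_hash: str):
        # the genesis block carries the registered-voter list hash
        genesis = {"index": 0, "prev": "0" * 64, "voter_list": voter_list_hash}
        self.blocks = [genesis]
        self.tally = {}

    def cast_vote(self, vid: str, candidate: str) -> dict:
        """Append a block holding one vote transaction (VID + candidate)."""
        self.tally[candidate] = self.tally.get(candidate, 0) + 1
        block = {"index": len(self.blocks),
                 "prev": block_hash(self.blocks[-1]),
                 "vid": vid, "candidate": candidate,
                 "votes_for_candidate": self.tally[candidate]}
        self.blocks.append(block)
        return block

    def verify(self) -> bool:
        """Any participant can re-check the chain links, as the paper notes."""
        return all(b["prev"] == block_hash(self.blocks[i])
                   for i, b in enumerate(self.blocks[1:]))

chain = VoteChain(voter_list_hash="deadbeef")
chain.cast_vote("VID-1", "candidate_a")
chain.cast_vote("VID-2", "candidate_b")
print(chain.verify())  # True: no block has been tampered with
```

Changing any field of an earlier block breaks every subsequent `prev` link, which is what lets each participant detect chain manipulation.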
IV. CONCLUSION
Ensuring vote security is a challenge for many nations. To guarantee voter participation and validity, the security
of vote data, and the accurate counting of votes, a smart-contract-based blockchain voting system can be
developed. To cut down on computational expense, this technique entails the SC providing voter authentication
and participating in the choice of a miner on the blockchain. Additionally, it promptly counts the votes, speeding
up the election process. Citizens can vote using smart devices from anywhere thanks to this mechanism, which
will help raise voter turnout and strengthen democracy in any nation. The objective of future work on this
project is to develop an encryption method that will further increase the security of our system.
ACKNOWLEDGMENT
The authors wish to thank the anonymous reviewers for their useful suggestions that helped in improving the
quality of this paper. We would also like to thank Vishwakarma Institute of Information Technology, Pune for
supporting this work.
REFERENCES
[1] F. Hjálmarsson, G. K. Hreiðarsson, M. Hamdaqa and G. Hjálmtýsson, “Blockchain-based e-voting system,” in 2018
IEEE 11th international conference on cloud computing (CLOUD), 2018.
[2] B. Yu, J. K. Liu, A. Sakzad, S. Nepal, R. Steinfeld, P. Rimba and M. H. Au, “Platform-independent secure blockchain-based voting system,” in
Information Security: 21st International Conference, Guildford, UK, 2018.
[3] K. M. Khan, J. Arshad and M. M. Khan, “Investigating performance constraints for blockchain based secure e-voting
system,” Future Generation Computer Systems, pp. 13--26, 2020.
[4] A. Alam, S. Z. U. Rashid, M. A. Salam and A. Islam, “Towards blockchain-based e-voting system,” 2018 international
conference on innovations in science, engineering and technology (ICISET), pp. 351--354, 2018.
[5] A. Fatrah, S. El Kafhali, A. Haqiq and K. Salah, “Proof of concept blockchain-based voting system,” in Proceedings of the
4th International Conference on Big Data and Internet of Things, 2019.
[6] P. Baudier, G. Kondrateva, C. Ammi and E. Seulliet, “Peace engineering: The contribution of blockchain systems to the
e-voting process,” Technological Forecasting and Social Change, vol. 162, p. 120397, 2021.
[7] Y. Soni, L. Maglaras and M. A. Ferrag, “Blockchain based voting systems,” in European Conference on Cyber Warfare
and Security, 2020.
[8] M. Pawlak, A. Poniszewska-Marańda and N. Kryvinska, “Towards the intelligent agents for blockchain e-voting system,” Procedia
Computer Science, vol. 141, pp. 239--246, 2018.
[9] Li, J. Xiao, X. Dai and H. Jin, “AMVchain: authority management mechanism on blockchain-based voting systems,”
Peer-to-peer Networking and Applications, vol. 14, pp. 2801--2812, 2021.
[10] R. Taş and Ö. Ö. Tanrıöver, “A systematic review of challenges and opportunities of blockchain for E-voting,”
Symmetry, vol. 12, no. 8, p. 1328, 2020.
[11] J.-H. Hsiao, R. Tso, C.-M. Chen and M.-E. Wu, “Decentralized E-Voting Systems Based on the Blockchain
Technology,” in Advances in Computer Science and Ubiquitous Computing: CSA-CUTE 17, 2018.
[12] Y. Abuidris, R. Kumar, T. Yang and J. Onginjo, “Secure large-scale E-voting system based on blockchain contract using
a hybrid consensus model combined with sharding,” Etri Journal, vol. 43, no. 2, pp. 357--370, 2021.
[13] U. Jafar, M. J. A. Aziz and Z. Shukur, “Blockchain for electronic voting system—review and open research challenges,”
Sensors, vol. 21, no. 17, p. 5874, 2021.
Abstract—E-mail is one of the most important modern official means of communication and is
highly reliable, which is the reason for its widespread popularity. However, this does not make it
safe from threats and attacks. The major threats to email are spamming and phishing, which
cause substantial financial losses to the victims. In this paper, we describe a filter based on
analyzing the email header elements and their characteristics, extracting the most important
features, and testing them against a set of rules and conditions that can detect and block
phishing email messages. This filter was tested on five standard datasets containing spam and
phishing emails, using header information only, and achieved an overall average accuracy of
about 96.31 percent.
Index Terms— Email header, features extraction, Phishing emails, Black-lists, White-lists.
I. INTRODUCTION
Email is the most important and most effective way to communicate between companies, institutions, and
offices, despite the wide spread of modern means of communication such as social media, due to its ease of use,
strong protection, and reliability. But it is not without problems and defects that threaten that protection, and the
most prominent of these threats is the phishing attack: the fraudulent act of pretending to be a reliable entity in a
communication in order to obtain confidential user information (such as usernames, passwords, bank account
information, or credit card information) [1]. Due to the increase in phishing attacks and the significant financial
losses they cause to individuals and companies, much research has appeared on studying and blocking email
phishing attacks.
In this paper, we analyze the elements of the email header and their various properties to create rules and
conditions that can classify an email message as phishing or non-phishing. An email, like normal postal mail,
contains two parts: a header part and a body part. Fig. 1 shows the header elements in the email message.
The header represents the envelope of the email, containing information such as the sender's and receiver's email
addresses, the email subject, the message's journey across the various servers, cc, bcc, etc. [2]. It consists of the
following fields:
i. From: contains Email sender information like name and email address of the sender.
ii. To: contains the email address(es) of the receiver(s) of the email, which may be delivered to a single
recipient or several recipients. It's a mandatory field. The message has to have at least one recipient's
address.
iii. Subject: contains information about message content.
iv. Received: contains information about the message's journey, with details of the mail-transmission
servers through which it has travelled. It can be used to track the message's path.
v. Reply-To: includes an email address that is immediately inserted in the "To" field when the user replies
to the email message.
vi. Return-Path: includes the sender's details, like their email and a link to reply to them. It is added by the
server that delivers the message to the recipients.
vii. Message-ID: a unique identifier given to each message by the host when the message is created. It is
divided into two parts, local and domain, which are separated by an at-sign and enclosed in angle
brackets: "<" local-part "@" domain-part ">" [3].
The email has another part, called the body, which contains the content to be read by the recipient; it could be
text, an image, an attachment, or a combination of these [4]. Phishers and spammers often forge this header
information, so changes in these fields are important features.
Ajay U. Surwade (2020) developed an origin-based filter that blocks phishing e-mail by extracting header
information from e-mails using a blacklist approach [14].
Youness Mourtaji, Mohammed Bouhorma, Daniyal Alghazzawi, Ghadah Aldabbagh, and Abdullah Alghamdi
(2021) developed a solution based on a hybrid rule-based approach that extracts features using six different
methods, including a blacklisting method that checks the domain name against two antivirus blacklists to
determine whether the domain is blacklisted [15].
III. METHODOLOGY
This work is based on analyzing the important header elements to extract significant features and design an
origin-based filter (OBF) with a set of rules and conditions, described in Section IV, that classifies an email as
phishing or non-phishing. The methodology adopted for this origin-based filter is shown in Figure 2.
The features 'From', 'Reply-To', 'Return-Path' and 'Message-ID', shown in Table I, are extracted from the
standard datasets Enron [16], Public Phishing Corpus [17], SPAM Archive [18], CSDMC2010_SPAM [19] and
SpamAssassin [20]. Following the RFC 822 protocol, the message header is extracted and the four standard
fields are selected for feature extraction, as shown in Table I.
Python code implementing the rules in Section IV was developed and tested on the standard datasets mentioned
above. This architecture classifies emails as phishing or non-phishing. The classification accuracy has been
calculated and is reported in Tables II to VI. The IP addresses or domain names of emails classified as phishing
are extracted and stored in the blacklists; similarly, the IP addresses or domain names of emails classified as
non-phishing are extracted and stored in the whitelists.
a) A regular expression is used to validate the From field against various email address formats.
b) The From and Reply-To fields have different domain names:
FROM != REPLY-TO
c) The From and Return-Path fields have different domain names:
FROM != RETURN-PATH
d) The Reply-To field, the Return-Path field, or both are empty:
len(REPLY-TO) == 0 OR len(RETURN-PATH) == 0
OR (len(REPLY-TO) == 0 AND len(RETURN-PATH) == 0)
e) The From field and the Message-ID have different domain names:
from_domain != message_id
Each condition above evaluates to one value: "0" meaning false or "1" meaning true. The value "1" (i.e., true)
classifies the email as phishing, while the value "0" (i.e., false) classifies it as non-phishing. The decision-making
condition is as given below:
If ((FROM != RETURN-PATH) or (FROM != REPLY-TO) or (len(REPLY-TO) == 0) or
(len(RETURN-PATH) == 0) or ((len(REPLY-TO) == 0) and (len(RETURN-PATH) == 0)) or
(from_domain != message_id))
Then classify the email as a phishing email, extract its IP addresses or domain names, and
store them in the blacklists.
Else, classify the email as a non-phishing email, extract its IP addresses or domain names, and store
them in the whitelists.
The results collected during the experiments are reported in the next sections.
TABLE II. EMAIL CLASSIFICATION IN ENRON DATASET
TABLE III. EMAIL CLASSIFICATION IN PUBLIC PHISHING CORPUS DATASET
TABLE IV. EMAIL CLASSIFICATION IN SPAM ARCHIVE DATASET
TABLE V. EMAIL CLASSIFICATION IN CSDMC2010_SPAM DATASET
In the results for the ‘Spam Archive’ dataset, shown in Table IV, the folder ‘02/2020’ has the minimum
accuracy, 90.46%. This folder contains a total of 7495 spam emails that should have been classified as phishing,
but only 6780 were classified as phishing and 715 were misclassified as non-phishing.
In the results for the 'Public Phishing Corpus' dataset, shown in Table III, the folder '20051114' contains
438 phishing emails, all of which should have been classified as phishing; only 411 were classified as
phishing and 27 were misclassified as non-phishing.
In the results for the Enron dataset, shown in Table II, the folder 'BG' contains 10000 spam emails, all of
which should have been classified as phishing; only 9402 were classified as phishing while 598 were
classified as non-phishing.
In the results for the 'CSDMC2010_SPAM' dataset, shown in Table V, the folder 'Spam' contains 1378
spam emails, all of which should have been classified as phishing; only 1315 were classified as phishing
while 63 were classified as non-phishing.
The above misclassifications suggest that the features extracted from these emails are not sufficient and
that more features are needed for accurate classification. We therefore need to investigate additional
features alongside the existing ones so that accuracy can be improved. In future work we plan to
investigate other header features, such as the 'Subject' field of the email.
REFERENCES
[1] Tak, Gaurav & Ojha, Gaurav. (2013). Multi-Level Parsing Based Approach Against Phishing Attacks with the Help of
Knowledge Bases. International Journal of Network Security & Its Applications. 5. 10.5121/ijnsa.2013.5602.
[2] P. Mishra, E. S. Pilli and R. C. Joshi, "Forensic Analysis of E-mail Date and Time Spoofing," 2012 Third International
Conference on Computer and Communication Technology, 2012, pp. 309-314, doi: 10.1109/ICCCT.2012.69.
[3] Hamid, I. R. A., Abawajy, J., & Kim, T. H. (2013). Using feature selection and classification scheme for automating
phishing email detection. Studies in informatics and control, 22(1), 61-70.
[4] Beaman, C., & Isah, H. (2022). Anomaly Detection in Emails using Machine Learning and Header Information. arXiv.
https://doi.org/10.48550/arXiv.2203.10408.
[5] Tianda Yang, Kai Qian, Dan Chia-Tien Lo, K. Al Nasr and Ying Qian, "Spam filtering using Association Rules and
Naïve Bayes Classifier," 2015 IEEE International Conference on Progress in Informatics and Computing (PIC), 2015,
pp. 638-642, doi: 10.1109/PIC.2015.7489926.
[6] Jain, A.K., Gupta, B.B. A novel approach to protect against phishing attacks at client side using auto-updated whitelist.
EURASIP J. on Info. Security 2016, 9 (2016). https://doi.org/10.1186/s13635-016-0034-3 .
[7] Davneet Kaur and Sheetal Kalra, “Five-tier barrier anti-phishing scheme using hybrid approach”, Information Security
Journal-A Global Perspective, 2016, DOI: 10.1080/19393555.2016.1215573.
[8] Ghogare, Pramod & Surwade, Ajay & Patil, Manoj. (2018). Effective E-mail Spam Filtering Using Origin Based
Information. International Journal of Computer Sciences and Engineering. 6. 359-362. 10.26438/ijcse/v6i11.359362.
[9] O. Abahussain and Y. Harrath, "Detection of Malicious Emails through Regular Expressions and Databases," 2019
International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT), 2019,
pp. 1-5, doi: 10.1109/3ICT.2019.8910291.
[10] T. Krause, R. Uetz and T. Kretschmann, "Recognizing Email Spam from Meta Data Only," 2019 IEEE Conference on
Communications and Network Security (CNS), 2019, pp. 178-186, doi: 10.1109/CNS.2019.8802827.
[11] Thashina Sultana, K A Sapnaz, Fathima Sana, Jamedar Najath, 2020, Email based Spam Detection, INTERNATIONAL
JOURNAL OF ENGINEERING RESEARCH & TECHNOLOGY (IJERT) Volume 09, Issue 06 (June 2020).
[12] Anchit Bijalwan, “Botnet Forensic Analysis Using Machine Learning”, Hindawi’s Journal of Security and
Communication Networks, Volume2020, pp:1-9, February-2020. Article ID 9302318,
https://doi.org/10.1155/2020/9302318.
[13] Kulkarni, Priti & Saini, Jatinderkumar & Acharya, Haridas. (2020). Effect of Header-based Features on Accuracy of
Classifiers for Spam Email Classification. International Journal of Advanced Computer Science and Applications. 11.
10.14569/IJACSA.2020.0110350.
[14] A. U. Surwade, "Blocking Phishing e-mail by extracting header information of e-mails",(2020), International
Conference on Smart Innovations in Design, Environment, Management, Planning and Computing (ICSIDEMPC),
2020, pp. 151-155, doi: 10.1109/ICSIDEMPC49020.2020.9299596.
[15] Mourtaji, Youness & Bouhorma, Mohammed & Alghazzawi, Daniyal & Aldabbagh, Ghadah & Alghamdi, Abdullah.
(2021). Hybrid Rule-Based Solution for Phishing URL Detection Using Convolutional Neural Network. Wireless
Communications and Mobile Computing. 2021. 1-24. 10.1155/2021/8241104.
[16] http://nlp.cs.aueb.gr/software_and_datasets/Enron-Spam/index.html. Last accessed on 10 September, 2022.
[17] https://academictorrents.com/details/a77cda9a9d89a60dbdfbe581adf6e2df9197995a Last accessed on 14 October 2022.
[18] http://untroubled.org/spam/ Last accessed on 28 September 2022.
[19] https://github.com/jdwilson4/Intro-to-MachineLearning/tree/master/Data/SPAMData Last accessed on 12 November
2021.
[20] https://spamassassin.apache.org/old/publiccorpus/ Last accessed on 2 October 2022.
Grenze International Journal of Engineering and Technology, June Issue
Abstract—The proposed system performs weather detection using IoT technology. The system is
linked to a webpage that receives data from the various sensors used in the system for
different purposes. The website analyzes the data and displays the corresponding weather
conditions on screen, along with a graphical representation of the data. The data is also
stored in a database for future reference. The device senses several parameters: temperature,
humidity, atmospheric pressure, sound, light intensity, and the carbon dioxide and carbon
monoxide levels in the surrounding air.
I. INTRODUCTION
This is a simple weather detecting system powered by an Arduino UNO; it detects environmental
parameters such as temperature, humidity, barometric pressure, air and sound quality, and light intensity.
The device has IoT (Internet of Things) technology applied to it. Weather detection has always been a
standout topic for meteorologists.
Weather conditions need to be monitored on a frequent, daily basis, as they assume a significant role in meteorology.
Weather conditions are also important for many other activities, such as farming, scheduling the export of goods,
flight schedules, and planning various other outdoor activities. Forecasts depend on environmental parameters
like temperature, humidity, wind, etc. To serve this purpose, one needs to study these factors.
The proposed device has different sensors for detecting different environmental parameters: a DHT11
for temperature and humidity, a BMP180 for altitude and atmospheric pressure, an MQ135 and an MQ07 for
carbon dioxide and carbon monoxide levels respectively, an FC04 sound sensor, and an LDR for light intensity.
The project is also linked to a website, which fetches all the data from the sensors, analyzes it, and
outputs graphs of the data. The data is also always stored in the website's database for future
reference.
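As a rough illustration of the website's analysis step, the sketch below averages a handful of sensor readings the way the page's backend might before plotting or storing them. The field names and values are invented for the example, not taken from the actual project:

```python
from statistics import mean

# One reading per row, as the page might receive them from the board
# (field names and values here are illustrative assumptions).
readings = [
    {"temperature_c": 28.4, "humidity_pct": 61, "pressure_hpa": 1009.2},
    {"temperature_c": 29.1, "humidity_pct": 58, "pressure_hpa": 1008.7},
    {"temperature_c": 28.8, "humidity_pct": 60, "pressure_hpa": 1008.9},
]

def summarize(rows):
    """Average each sensed parameter so the page can report or plot it."""
    keys = rows[0].keys()
    return {k: round(mean(r[k] for r in rows), 2) for k in keys}

summary = summarize(readings)
```

In the real system the rows would come from the database the sensors write to, and the summaries would feed the graphs shown on the page.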
Methodology/Experimental
Block Diagram
Below is the block diagram of the project. The functional diagram represents the main concept of the project.
The project is powered by an Arduino Uno, to which the various sensors are attached: an FC04 acoustic
sensor, an MQ135 for air quality and carbon dioxide, a DHT11 for temperature and humidity, a BMP180 for
altitude and barometric pressure, and an MQ07 for carbon monoxide levels in the air. In addition, an LDR
is used to measure light intensity.
2)BMP180
BMP180 is a high-precision sensor designed for consumer applications. Air pressure is nothing but the weight of
air acting on everything. Air has weight and wherever there is air its pressure will be felt. The BMP180 sensor
detects this pressure and provides this information as a digital output.
3)MQ135
The MQ135 gas sensor is used in air quality monitoring equipment and is suitable for detecting or measuring
NH3, NOx, Alcohol, Benzene, Smoke, and CO2. The MQ135 sensor module comes with a digital pin that allows
this sensor to work even without a microcontroller and is very useful when you are just trying to detect a specific
gas.
4)MQ07
The MQ07 gas sensor is highly sensitive to carbon monoxide. The sensor can be used to detect different gases
containing CO, it is inexpensive and suitable for different applications.
5)ESP 8266
ESP8266 is a low-cost WiFi module of the ESP series that you can use to control your electronic projects
from anywhere in the world. It has a built-in microcontroller and 1 MB of flash, and it can connect to WiFi
networks.
IV. FUTURE SCOPE
This system is very useful for farmers where they can ensure high crop yield and reduce the risk caused by
weather through IoT. In particular, it is useful when drastic changes in the environment take place. In the future,
we may also add different types of sensors, such as earthquake detection sensors, light sensors, and rain
level sensors. We may also add machine learning and artificial intelligence algorithms to predict future
weather and its effect on the environment.
V. CONCLUSION
This system detects environmental parameters such as temperature, humidity, pressure, and air quality
using the sensors deployed in the monitored environment. The collected results are then displayed on
screen via Wi-Fi. We can use this device to monitor any specific room or place where environmental
parameters need to be tracked, and the model's accuracy can be assessed against real data. The main
purpose of this device is to make the system beneficial and useful.
ACKNOWLEDGMENT
We sincerely thank the DESH, VIT Pune; the Principal, VIT Pune; and the HOD, DESH for giving us
valuable advice and suggestions on this project. We would also like to thank P. P. Musale for the valuable
advice, enthusiastic attitude, and support throughout our project. We are fortunate to have worked under your leadership.
REFERENCES
[1] Dhanashree S. Medhekar, Mayur P. Bote, Shruti D. Deshmukh, "The Heart Disease Prediction System Using Naïve
Bayes", International Journal of Advanced Research in Science, Technology and Engineering, Vol. 2, Issue 3,
March 2013, pp. 1-5.
[2] Amruta A. Taksande, PS Mohod, “Application of Data Mining in Weather Forecasting Using Regular Model Growth
Algorithms”, International Journal of Science and Research(IJSR), Volume 4 Issue 6, June 2015
[3] Mary Nsabagwaa, Maximus Byamukamab, Emmanuel Kondelaa, “Towards a Powerful and affordable Automated
Weather Station”
[4] Mehrnoosh Torabi, Sattar Hashemi, “ Data Mining Models for Weather Forecasting”, The 16th CSI International
Symposium on Artificial Intelligence and Signal Processing (AISP 2012), IEEE, pp 579-584
[5] Mr. Sunil Navadia, Mr. Jobin Thomas, Mr. Pintukumar Yadav, Ms. Shakila Shaikh, “ Weather Forecasting: A new
approach for Measuring & Analysing weather data”
[6] P. Sushmitha, G.Sowmybala “Design and Implementation of Weather Monitoring and Controlling System”,
International Journal of Computer Applications.
[7] T.R.V Anandharajan G.Abhishek Hariharan, K.K.Vignajeth, R.Jijendiran “ Monitoring the weather with Artificial
Intelligence.
[8] Raj Kumar, Shiva Prakash, “Performance and Parametric Analysis of IoT Motes with Different Network Topologies”
Abstract—According to physiological research, speech and facial movements differ in various ways
between healthy and depressed people. Based on this observation, we offer Multimodal Attention
Feature Fusion (MAFF) and a novel Spatio-Temporal Attention (STA) network, which are used to obtain a
multimodal representation of depression signals in order to predict individual depression severity.
Concretely, we first segment the speech amplitude spectrum and the video into predetermined lengths
and submit them to the STA network, which focuses on the audio and video frames relevant to depression
detection while integrating attentional processing of spatial and temporal information. The audio and
video segment-level features are acquired from the output of the STA network's final fully connected
layer. To capture the changes in every aspect of the audio and video segment-level features and
summarize them as audio- and video-level features, this study also provides an eigen evolution pooling
approach. The MAFF is then used to create a multimodal representation composed of complementary
modal data, which is fed into a support vector regression predictor to determine the severity of the
depression. The utility of our strategy is illustrated by experimental findings on the AVEC2013 and
AVEC2014 depression databases.
I. INTRODUCTION
Depression is a condition that causes people to have extremely low moods and the inability to engage in typical
social interactions. More gravely, depression can also cause behaviors that contribute to self-harm and
suicide; as a result, depression is projected to overtake heart disease as the second biggest cause of death by
2030. Fortunately, early diagnosis and therapy can help people recover quickly. However, the diagnostic
process is typically challenging and heavily dependent on doctors, which can prevent some patients from
receiving timely, effective therapy. Finding a system for automatically diagnosing depression is therefore
vital to help clinicians work more effectively. The model of automatic depression identification has new
opportunities thanks to new algorithms, which could lead to model improvement by increasing accuracy and
accurately identifying depressed patients. According to physiological research, depressive patients' speech
and facial movements differ slightly from those of healthy people.
attributed to anything other than a shared inductive bias resulting from the similarity of the tasks. We provide a
brief explanation of how to create multitask decision trees from the top down in order to demonstrate the
generality of the MTL methodology. Decision trees are not typically used to learn several tasks; therefore, this is
noteworthy. By doing this, a system is created that generalizes particular conceptual clustering techniques,
enhancing their applicability in fields where the separation between features (information that will become
available in the future) and classes (objects we wish to forecast) must be maintained.
[8] Robert J. McAulay et al. applied an analysis/synthesis method to speech that was both clean and
interfered with in various ways. In every instance, natural-sounding, high-quality synthetic speech was
produced. The technique may also be applied to the parametric representation of non-speech sounds, such as
music and particular marine biological noises. Finally, it is important to keep in mind that, in addition to
updating the average pitch, tools for modifying the width of the analysis window are essential for
high-quality speech reconstruction. It is vital to remember that, despite the use of the frequency analysis
window, no voicing decisions are made during the analysis and synthesis process.
III. METHODOLOGY
1. Gathering of Data: Data gathering is required at the first stage, as we need to create a dataset that can
be used for analysis and for building a better working model. The data will be collected in the form of
audio and video, in sufficient quantity to train a better working model.
2. Pre-Processing of Data: The steps involved in pre-processing the data are: Data Cleaning, Feature Selection,
and Data Transformation.
Data Cleaning is the process of removing or fixing missing or incorrect data stored in the database;
whenever we create a dataset it is bound to have some errors, and cleaning those errors gives better and
more accurate results.
Feature Selection is the process of picking appropriate features from the dataset, through which we can
direct the way our model is influenced.
Data Transformation is the process of changing the data, such as its structure or format, to suit the needs
and behavior of the algorithm.
3. Projection/Prediction of Data: This step refers to the output after the model is trained on the dataset
provided for training, where we predict facial gestures from the video dataset and the tone/audio notes of
the voice from the audio dataset to make predictions based on the previous data.
4. Tools for Data Visualization: Data visualization tools provide a very easy way to create visuals for large
datasets. Statistics of the data that are generally not visible are clearly stated, and the underlying
patterns can be easily uncovered.
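The cleaning and transformation steps above can be sketched in plain Python. The feature names and values, as well as the choice of mean imputation and min-max scaling, are illustrative assumptions for the example, not the paper's exact procedure:

```python
# Toy segment-level feature rows; None marks a missing value
# (feature names and values are invented for illustration).
rows = [
    {"pitch": 210.0, "energy": 0.82, "label": 14},
    {"pitch": None,  "energy": 0.65, "label": 9},
    {"pitch": 190.0, "energy": 0.71, "label": 11},
]

def clean(rows):
    """Data cleaning: fill each missing value with its column mean."""
    cols = [k for k in rows[0] if k != "label"]
    for c in cols:
        vals = [r[c] for r in rows if r[c] is not None]
        col_mean = sum(vals) / len(vals)
        for r in rows:
            if r[c] is None:
                r[c] = col_mean
    return rows

def transform(rows):
    """Data transformation: min-max scale each feature into [0, 1]."""
    cols = [k for k in rows[0] if k != "label"]
    for c in cols:
        lo = min(r[c] for r in rows)
        hi = max(r[c] for r in rows)
        for r in rows:
            r[c] = (r[c] - lo) / (hi - lo) if hi > lo else 0.0
    return rows

prepared = transform(clean(rows))
```

The missing pitch is filled with the column mean (200.0) and then scaled along with the rest, while the label column is left untouched for the predictor.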
The flow chart is shown in the following figure.
IV. SYSTEM ARCHITECTURE
Algorithms Used:
A. SVM
The acronym SVM stands for "Support Vector Machine", a supervised machine learning algorithm that can
create regression and classification models which perform well on both linearly and non-linearly
separable data. The SVM algorithm performs classification with the help of a margin. The objective of the
algorithm is to find the boundary that most accurately separates the data points in n-dimensional space;
this boundary is called the hyperplane.
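The hyperplane decision rule can be illustrated with a toy linear case; the weights below are fixed by hand for the example rather than learned:

```python
def svm_predict(w, b, x):
    """Decision rule of a linear SVM: report which side of the
    hyperplane w . x + b = 0 the point x falls on."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Hyperplane x0 + x1 - 1 = 0 (illustrative weights, not trained here)
w, b = [1.0, 1.0], -1.0
```

A trained SVM would choose w and b to maximize the margin between the two classes; here the rule simply shows how the sign of the score assigns a class.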
B. CNN
V. CONCLUSION
According to physiological research, both facial and verbal activity differ only slightly between depressed and
healthy people. In light of this reality, we develop a multimodal spatiotemporal representation paradigm for
automatic depression level identification. The suggested STA network focuses on frames relevant to depression
detection while also integrating temporal information. In addition, by exploiting the complementary
information between modalities, the suggested MAFF method enhances the quality of the multimodal
representation. Experimental results on AVEC2013 and AVEC2014 show that our approach has decent detection
performance.
Human speech is a sophisticated combination of words and feelings. Every word might mean something
different depending on the context in which it is used. Every user will have a different mental state, making it
challenging to understand their input. Their feelings can help us grasp what they are going through even better.
This also aids in making a schedule and choosing therapies. If the data is accessible, we also plan to apply
this approach to identify other diseases. To increase detection accuracy, we will partition the various
tasks and train separate models in the future.
REFERENCES
[1] "Automatic Depression Detection Via Facial Expression Using Multiple Instance Learning," IEEE 17th International
Symposium on Biomedical Imaging (ISBI), 2020, by Yanfei Wang, Jie Ma, Bibo Hao, Pengwei Hu, Xiaoqian Wang,
Jing Mei, and Shaochun Li
[2] Estimating severity of Depression From Acoustic Features and Embedding of Natural Speech | IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP) | 978-1-7281-7605-5/20/$31.00 2021 IEEE, Sri
Harsha Dumpala, Sheri Rempel, Katerina Dikaios, Mehri Sajjadian, Rudolf Uher, Sageev Oore.
[3] Automatic Assessment of Depression Based on Visual Cues: A Systematic Review, by Anastasia Pampouchidou,
Panagiotis Simos, Kostas Marias, Fabrice Meriaudeau, Fan Yang, Matthew Pediaditis, and Manolis Tsiknakis, IEEE
Transactions on Affective Computing, 2017.
[4] "Topic Modeling Based Multi-modal Depression Detection," Proceedings of the 7th Annual Workshop on Audio/Visual
Emotion Challenge, Yuan Gong and Christian Poellabauer, 2017, pp. 69–76.
[5] A Weakly Supervised Learning Framework for Detecting Social Anxiety and Depression, Proceedings of the ACM on
interactive, mobile, wearable and ubiquitous technologies, vol. 2, no. 2, pp. 1–25, 2018. A. Salekin, J. W. Eberle, J. J.
Glenn, B. A. Teachman, and J. J. Stankovic.
[6] "Automated speech- based screening of depression using deep convolutional neural networks," Procedia Computer
Science, vol. 164, pp. 618–628 (2019). K. Chlasta, K. Wok, and I. Krejtz.
[7] Multitask learning: A knowledge-based source of inductive bias, R. Caruana, Proceedings of the ICML, 1993.
[8] Speech Analysis/Synthesis Based on a Sinusoidal Representation by Robert J. McAulay, VOL. ASSP-34, NO. 4,
AUGUST 1986.
Abstract—Research is being done right now to try to predict the future value of cryptocurrency.
The possibility of using a Python-based approach has been explored by the scientific community
as a means of realizing this aim. In predictive analytics, it is becoming standard practice to use
the same dataset for both training and testing purposes. Traditional studies have been slowed
down by problems with precision and efficiency. This research makes use of optimization and
the Python programming language to provide a versatile prediction model with little
implementation time. Dataset size is decreased when classification is performed in Python,
which shortens the training period. Eliminating extraneous information also improves the
performance of the trained model. Because of this change, we want to develop a system that is
both adaptable and extensible. To put it another way, such a system would help cryptocurrency
investors make better decisions while buying and selling cryptocurrency. Using several factors,
the study's results have significantly influenced Bitcoin price forecasts. Investment choices are
often guided by such analyses for many fund managers and private investors. Scientists have
developed a flexible and scalable strategy for determining an appropriate script's ideal value.
Investors will need a mechanism to choose which currency to purchase at any given moment
according to market circumstances as trading platforms progress.
Index Terms— Machine learning, Crypto currency, PSO, Accuracy, F1 score, recall value,
Precision.
I. INTRODUCTION
Cryptocurrency is a form of virtual currency that is encrypted to prevent forgery and double spending. The
networks behind many crypto currencies are completely decentralized. The blockchain technology is crucial to
these. Global computer networks keep a general ledger. Digital assets include cryptocurrency. It runs on a
decentralized system. Numerous industries, including finance and law, are expected to be shaken up by
blockchain and related technology, according to experts. Faster monetary transactions are one of the main
advantages of cryptocurrency. The inconsistency of prices and the high cost of transactions are two major
constraints. When using cryptography to safeguard digital or virtual money, simplicity in maintaining and
managing the cryptographic information is a primary concern. It is facilitating the avoidance of complexity and
the reduction of data processing time. Information may be accessed in this system through bitcoin and
individuals with certain permissions. Present research is focusing on following objectives:
[Figure: basic and high-dimensional prediction factors feed the cryptocurrency price prediction, which is carried out with statistical techniques and machine learning techniques.]
A. Machine learning
Understanding and developing 'learning' techniques, or methods that use data to enhance performance on some
set of tasks, is the focus of ML, a subfield of computer science. It's considered a kind of AI. In order to generate
predictions or judgments without being explicitly programmed, machine learning algorithms construct a model
using sample data. This data is referred to as training data. In many fields, including health, email filtering, voice
recognition, and computer vision, traditional algorithms would be too time-consuming or costly to design. This
is where machine learning algorithms come in.
B. Particle Swarm Optimization
PSO was first suggested by Kennedy and Eberhart in 1995. Scientists that study social behaviour believe that
individuals of a travelling school of fish or flock of birds "may profit from the experience of all other members."
When one bird in a flock goes out in search of food, the others may benefit from the information it gathers by
hearing about the best spots to eat. In this context, "best" refers to the best position in a high-
dimensional problem space in which several solutions exist.
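A minimal one-dimensional PSO sketch using the standard Kennedy-Eberhart velocity update; the coefficients are common textbook defaults, not values from this study:

```python
import random

def pso(f, n_particles=10, iters=60, lo=-10.0, hi=10.0,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal 1-D particle swarm minimizing f: each particle is pulled
    toward its own best position (pbest) and the swarm's best (gbest)."""
    rng = random.Random(seed)
    x = [rng.uniform(lo, hi) for _ in range(n_particles)]
    v = [0.0] * n_particles
    pbest = x[:]
    gbest = min(pbest, key=f)
    for _ in range(iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # inertia + pull toward personal best + pull toward swarm best
            v[i] = w * v[i] + c1 * r1 * (pbest[i] - x[i]) + c2 * r2 * (gbest - x[i])
            x[i] += v[i]
            if f(x[i]) < f(pbest[i]):
                pbest[i] = x[i]
        gbest = min(pbest, key=f)
    return gbest

best = pso(lambda x: x * x)  # swarm converges toward the minimum at 0
```

In the study's setting the objective would score candidate prices instead of this toy quadratic, but the "profit from the experience of all other members" update is the same.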
C. LSTM
LSTM has been recognized as a prominent artificial RNN and is widely used in the field of deep learning. One
of LSTM's distinguishing features is its capacity for connectivity and feedback, in contrast with a regular
feed-forward neural network. It can process not only individual pieces of information, such as images, but
also entire sequences of data, such as audio or video files. LSTM networks are deemed appropriate for
classification tasks and are considered a type of RNN. Aside from the regular units, LSTM also contains some
unique ones: a single 'memory cell' makes up an entire LSTM unit, and these memory cells can keep
information stored for very long periods of time. Due to LSTM's improved customization options, users are
increasingly switching over from plain RNNs. The gates can control the inflow and distribution of inputs
based on learned weights, which allows for adaptability in output management. Accordingly, LSTM enables
fine-grained control and productive outcomes.
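The gating described above can be sketched as a single scalar LSTM step; the weights here are illustrative placeholders, not trained values:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h, c, W):
    """One step of a scalar LSTM cell: the gates (weights W) control
    what enters, stays in, and leaves the memory cell c."""
    f = sigmoid(W["wf"] * x + W["uf"] * h + W["bf"])    # forget gate
    i = sigmoid(W["wi"] * x + W["ui"] * h + W["bi"])    # input gate
    o = sigmoid(W["wo"] * x + W["uo"] * h + W["bo"])    # output gate
    g = math.tanh(W["wg"] * x + W["ug"] * h + W["bg"])  # candidate value
    c = f * c + i * g          # memory cell keeps long-term information
    h = o * math.tanh(c)       # hidden state fed back at the next step
    return h, c

# Illustrative (untrained) weights applied over a short input sequence
W = {k: 0.5 for k in ("wf", "uf", "bf", "wi", "ui", "bi",
                      "wo", "uo", "bo", "wg", "ug", "bg")}
h, c = 0.0, 0.0
for x in [0.1, 0.4, 0.2]:
    h, c = lstm_step(x, h, c, W)
```

A real model learns the twelve weights per unit and stacks many such units; the point here is only the feedback loop and the gated memory cell the text describes.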
D. Crypto currency
Cryptocurrency, often known simply as crypto, is a kind of digital money meant to function as a means of
exchange on a decentralized network, rather than on a centralized one backed by a government or a bank. A
digital ledger is a database that keeps track of who owns coins and when; it uses encryption to prevent
unauthorized access to the database and to ensure the integrity of all transactions and transfers of coin
ownership. Cryptocurrencies, despite their name, are not regarded as currencies in the classic sense.
E. Role of Machine Learning in Crypto currency
Predicting cryptocurrency prices using machine learning is the best option available. To make a reasonably
accurate prediction, the model needs to satisfy a number of criteria. Daily and 5-minute-interval price
predictions for Bitcoin are made using a wide variety of ML models, including LDA, LR, RF, XGBoost, SVM,
DT, QDA, and KNN. When it comes to blockchain and cryptocurrencies, the uses of machine learning go far
beyond price prediction. By streamlining the back-end processes of crypto trading and mining, ML has the
potential to address the security problems in this technology through deep learning and reinforcement learning.
II. LITERATURE REVIEW
Various studies have been conducted to determine how best to predict the price of cryptocurrencies. In 2013,
A. Cheung [1] observed that the topic had received almost no attention from the academic community, and set
out to fill the prior knowledge gaps. As part of this research on the occurrence of Bitcoin bubbles, a newly
created tool that is quite effective at spotting bubbles was deployed. There have been a number of brief
bubbles in the cryptocurrency market since 2010, but the three largest all occurred between 2011 and 2013,
each lasting from six to six-and-a-half months and ultimately leading to the downfall of Mt Gox. Numerous
studies have shown that the hazards associated with Bitcoin may be mitigated. In 2019, GARCH-in-mean models
were used by J. Liu [3] to investigate the link between volatility and returns of the dominant cryptocurrency
and the ripple effects of the cryptocurrency market. According to E. Bouri [4] in 2020, all three
cryptocurrencies have shown considerable jump activity in their return series. These numbers suggest that
the existence of one cryptocurrency boom raises the probability that subsequent booms will occur as well.
Conversely, co-jumping refers to jumping in sync with other traders to maximize volume. In 2020, N. Akbulaev
[5] studied the theoretical and practical connections between Bitcoin and Ethereum. Expanding the scope of
previous studies on the fundamental properties of Bitcoin and Ethereum and the correlations between their
values has allowed for a better understanding of recent trends in the industry. The values of Bitcoin & Ethereum
were shown to be correlated, and this connection might be leveraged to mitigate risk when trading
cryptocurrencies on exchanges like Gemini. In [6], we looked at whether Bitcoin is a means of exchange or an
asset, as well as its present and potential future applications. Their research demonstrates that Bitcoin's statistical
properties are distinct from those of conventional asset classes like stocks, bonds, and commodities, and this
holds true in both stable and volatile financial environments. Speculation, rather than usage as a means of trade
or currency, is the most common use of Bitcoin, according to data collected from Bitcoin accounts. S. Corbet [7]
published a study in 2018 that looked at the temporal and frequent connections between three major crypto
currencies and many different types of financial assets. [7] Many indicators point to the fact that these
possessions are distinct from monetary and material prosperity. The data suggests that Bitcoin investments may
provide diversification advantages for short-term traders. The interconnectedness of things may change over
time as a result of shocks to the financial system from outside the nation. In 2013, E. Turkedjiev [8] used
an artificial neural network (ANN) to provide short-term stock value predictions, particularly for financial
institutions; the nonlinearity of ANNs makes them useful for analyzing stock market time series. The Hong
Kong Straight Train and the QDII, both introduced in 2007, also had a substantial effect on the price gaps
between A and H shares in 2010. Several legislative proposals have also been made with the goal of narrowing
the gap between A- and H-share prices. According to L. Guoyi [10], total equity and GDP, after-tax earnings
per share, and the market index are all significant elements that affect a bank's stock price. To determine
whether this information has a relationship to the end-of-day share prices of the banks in question, a test
model is employed for analysis and verification. According to S. Beng Ho [11], a good general learning
machine is one that can solve a broad variety of problems quickly in a dynamic environment.
dropout layers. Once a dataset is obtained, characteristics are chosen for use in the training process. Then a
12-hidden-layer LSTM1 layer and a 5-hidden-layer LSTM2 layer are implemented, with the training/testing
ratio determined. Overfitting is addressed by dropout layers, after which a fully connected layer and a
softmax layer are applied. The final decision is then made with the help of a classification operation.
[Flowchart: the labeled dataset is split into 70% training and 30% testing sets; two LSTM layers, each followed by dropout, feed a fully connected layer and an activation function, and the final classification labels each sample as overvalued, undervalued, or normal.]
B. Research Methodology
The cryptocurrency dataset is captured using a Python script, and PSO is applied over the dataset. PSO
supports obtaining an optimized price in order to inform investors of the best prices. The dataset is then
filtered considering the optimized value, and a machine learning approach is used for training.
The proposed objectives include a study of the establishment of records of Bitcoin pricing. In the current
study, classification of cryptocurrency prices into undervalued, overvalued, and normal pricing is
recommended. Researchers may now evaluate their findings with the help of the accuracy parameters obtained.
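One hypothetical way to label prices as overvalued, undervalued, or normal is against a short moving average; the window and threshold below are invented for the example and are not the study's actual criteria:

```python
def label_prices(prices, window=3, band=0.02):
    """Label each price relative to a simple moving average:
    above the band -> overvalued, below -> undervalued, inside -> normal."""
    labels = []
    for t, p in enumerate(prices):
        start = max(0, t - window + 1)
        sma = sum(prices[start:t + 1]) / (t + 1 - start)
        if p > sma * (1 + band):
            labels.append("overvalued")
        elif p < sma * (1 - band):
            labels.append("undervalued")
        else:
            labels.append("normal")
    return labels

labels = label_prices([100, 101, 108, 95, 100])
```

Labels produced this way could serve as the three target classes for the model described above, with the band width tuned to the asset's volatility.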
Start
1. Data preprocessing.
2. Initialize the machine learning model by setting the batch size, epochs, and hidden layers.
3. Classify prices into overvalued, undervalued, and normal categories.
Stop
Figure 7. Global optimized price
Figure 8. Factor plot of global optimized price by weekend/weekday
Figure 9. Simulation of training loss and testing loss
[Figure: price comparison of BTC, Ethereum, WAVES, LTC, and MATIC (y-axis roughly 80-105) over 09-10-2022 to 13-10-2022.]
VI. CONCLUSION
The cryptocurrency industry has developed a complex cryptographic infrastructure to oversee its many
operations. The issues that arise from manually handling a cryptocurrency's administration are addressed and
avoided in this project. Information such as users, crypto holders, author IDs, and author biographies is being
managed as part of the research. The research sphere is expansive. When handling data, this system took into
account a number of factors. The simulation results show that the suggested technique is more precise than
previous methods, and that the proposed LSTM model provides accuracy above 97%.
FUTURE SCOPE
Due to the rapid growth in the popularity of cryptocurrency, it has become essential for investors to take
investment decisions considering overvalued, undervalued, and normal prices. The present research has focused
on the optimized value of cryptocurrency and proposed an efficient machine learning approach. Such research
could also play a significant role in predicting the best prices in the stock market. Thus, the present research
would contribute toward different investment options.
REFERENCES
[1] A. W.-K. Cheung, E. Roca, and J.-J. Su, “Crypto-currency bubbles: an application of the Phillips–Shi–Yu (2013)
methodology on Mt. Gox bitcoin prices,” Applied Economics, vol. 47, no. 23. Informa UK Limited, pp. 2348–2358,
Feb. 04, 2015 [Online]. Available: http://dx.doi.org/10.1080/00036846.2015.1005827
[2] J. Carrick, “Bitcoin as a Complement to Emerging Market Currencies,” Emerging Markets Finance and Trade, vol. 52,
no. 10. Informa UK Limited, pp. 2321–2334, Aug. 02, 2016 [Online]. Available:
http://dx.doi.org/10.1080/1540496X.2016.1193002
[3] J. Liu and A. Serletis, “Volatility in the Cryptocurrency Market,” Open Economies Review, vol. 30, no. 4. Springer
Science and Business Media LLC, pp. 779–811, Aug. 24, 2019 [Online]. Available: http://dx.doi.org/10.1007/s11079-
019-09547-5
[4] E. Bouri, D. Roubaud, and S. J. H. Shahzad, “Do Bitcoin and other cryptocurrencies jump together?,” The Quarterly
Review of Economics and Finance, vol. 76. Elsevier BV, pp. 396–409, May 2020 [Online]. Available:
http://dx.doi.org/10.1016/j.qref.2019.09.003
[5] N. Akbulaev, I. Mammadov, and M. Hemdullayeva, “Correlation and Regression Analysis of the Relation between
Ethereum Price and Both Its Volume and Bitcoin Price,” The Journal of Structured Finance, vol. 26, no. 2. Pageant
Media US, pp. 46–56, Apr. 29, 2020 [Online]. Available: http://dx.doi.org/10.3905/jsf.2020.1.099
[6] D. G. Baur, K. Hong, and A. D. Lee, “Bitcoin: Medium of exchange or speculative assets?,” Journal of International
Financial Markets, Institutions and Money, vol. 54. Elsevier BV, pp. 177–189, May 2018 [Online]. Available:
http://dx.doi.org/10.1016/j.intfin.2017.12.004
[7] S. Corbet, A. Meegan, C. Larkin, B. Lucey, and L. Yarovaya, “Exploring the dynamic relationships between
cryptocurrencies and other financial assets,” Economics Letters, vol. 165. Elsevier BV, pp. 28–34, Apr. 2018 [Online].
Available: http://dx.doi.org/10.1016/j.econlet.2018.01.004
[8] P. Alagidede and T. Panagiotidis, “Stock returns and inflation: Evidence from quantile regressions,” Economics Letters,
vol. 117, no. 1. Elsevier BV, pp. 283–286, Oct. 2012 [Online]. Available:
http://dx.doi.org/10.1016/j.econlet.2012.04.043
[9] E. Turkedjiev, M. Angelova, and K. Busawon, “Validation of Artificial Neural Network Model for Share Price UK
Banking Sector Short-Term Trading,” 2013 UKSim 15th International Conference on Computer Modelling and
Simulation. IEEE, Apr. 2013 [Online]. Available: http://dx.doi.org/10.1109/UKSim.2013.31
[10] W. Kang and R. A. Ratti, “Oil shocks, policy uncertainty and stock market return,” Journal of International Financial
Markets, Institutions and Money, vol. 26. Elsevier BV, pp. 305–318, Oct. 2013 [Online]. Available:
http://dx.doi.org/10.1016/j.intfin.2013.07.001
[11] S.-H. Kim and D. Kim, “Investor sentiment from internet message postings and the predictability of stock returns,”
Journal of Economic Behavior & Organization, vol. 107. Elsevier BV, pp. 708–729, Nov. 2014 [Online].
Available: http://dx.doi.org/10.1016/j.jebo.2014.04.015
[12] T. Hendershott and M. S. Seasholes, “Liquidity provision and stock return predictability,” Journal of Banking &
Finance, vol. 45. Elsevier BV, pp. 140–151, Aug. 2014 [Online]. Available:
http://dx.doi.org/10.1016/j.jbankfin.2013.12.021
[13] X. Li, X. Huang, X. Deng, and S. Zhu, “Enhancing quantitative intra-day stock return prediction by integrating both
market news and stock prices information,” Neurocomputing, vol. 142. Elsevier BV, pp. 228–238, Oct. 2014 [Online].
Available: http://dx.doi.org/10.1016/j.neucom.2014.04.043
[14] S.-B. Ho, “Deep thinking and quick learning for viable AI,” 2016 Future Technologies Conference (FTC). IEEE, Dec.
2016 [Online]. Available: http://dx.doi.org/10.1109/FTC.2016.7821605
[15] T. A. Tang, L. Mhamdi, D. McLernon, S. A. R. Zaidi, and M. Ghogho, “Deep learning approach for Network Intrusion
Detection in Software Defined Networking,” 2016 International Conference on Wireless Networks and Mobile
Communications (WINCOM). IEEE, Oct. 2016 [Online]. Available: http://dx.doi.org/10.1109/WINCOM.2016.7777224
[16] R. McKenna, S. Herbein, A. Moody, T. Gamblin, and M. Taufer, “Machine Learning Predictions of Runtime and IO
Traffic on High-End Clusters,” 2016 IEEE International Conference on Cluster Computing (CLUSTER). IEEE, Sep.
2016 [Online]. Available: http://dx.doi.org/10.1109/CLUSTER.2016.58
[17] W. LIU, “Machine Learning Algorithms and Applications: A Survey,” International Journal of Computer Science and
Information Technology for Education, vol. 3, no. 1. Global Vision Press, pp. 37–46, May 30, 2018 [Online]. Available:
http://dx.doi.org/10.21742/IJCSITE.2018.3.1.07
[18] S. Wawre et al., “Sentiment Classification using Machine Learning Techniques,” International Journal of Science and
Research (IJSR), vol. 5, no. 4, 2016.
[19] C. Yin, Y. Zhu, J. Fei, and X. He, “A Deep Learning Approach for Intrusion Detection Using Recurrent Neural
Networks,” IEEE Access, vol. 5. Institute of Electrical and Electronics Engineers (IEEE), pp. 21954–21961, 2017
[Online]. Available: http://dx.doi.org/10.1109/ACCESS.2017.2762418
[20] S. Sendra, A. Rego, J. Lloret, J. M. Jimenez, and O. Romero, “Including artificial intelligence in a routing protocol using
Software Defined Networks,” 2017 IEEE International Conference on Communications Workshops (ICC Workshops).
IEEE, May 2017 [Online]. Available: http://dx.doi.org/10.1109/ICCW.2017.7962735
[21] D. Zhang, X. Han and C. Deng, "Review on the research and practice of deep learning and reinforcement learning in
smart grids," in CSEE Journal of Power and Energy Systems, vol. 4, no. 3, pp. 362-370, September 2018, doi:
10.17775/CSEEJPES.2018.00520.
[22] F. Wang, L. Duan, and J. Niu, “Optimal Pricing of User-Initiated Data-Plan Sharing in a Roaming Market,” IEEE
Transactions on Wireless Communications, vol. 17, no. 9. Institute of Electrical and Electronics Engineers (IEEE), pp.
5929–5944, Sep. 2018 [Online]. Available: http://dx.doi.org/10.1109/TWC.2018.2851578
[23] X. Qiao, D. Shi, and F. Xu, “Optimal pricing strategy and economic effect of product sharing based on the analysis of
B2C sharing platform,” 2019 16th International Conference on Service Systems and Service Management (ICSSSM).
IEEE, Jul. 2019 [Online]. Available: http://dx.doi.org/10.1109/ICSSSM.2019.8887720
[24] Arti, K. P. Dubey and S. Agrawal, "An Opinion Mining for Indian Premier League Using Machine Learning
Techniques," 2019 4th International Conference on Internet of Things: Smart Innovation and Usages (IoT-SIU),
Ghaziabad, India, 2019, pp. 1-4, doi: 10.1109/IoT-SIU.2019.8777472.
[25] R. Bhowmik and S. Wang, “Stock Market Volatility and Return Analysis: A Systematic Literature Review,” Entropy,
vol. 22, no. 5. MDPI AG, p. 522, May 04, 2020 [Online]. Available: http://dx.doi.org/10.3390/e22050522.
[26] X. Shen, G. Wang, and Y. Wang, “The Influence of Research Reports on Stock Returns: The Mediating Effect of
Machine-Learning-Based Investor Sentiment,” Discrete Dynamics in Nature and Society, vol. 2021. Hindawi Limited,
pp. 1–14, Dec. 31, 2021 [Online]. Available: http://dx.doi.org/10.1155/2021/5049179.
Grenze International Journal of Engineering and Technology, June Issue
Abstract—In this technology-fueled world, everyone is using smart devices, electronic gadgets,
wireless products, etc., and huge amounts of data are being generated, collected, and stored in
databases. To efficiently process and intelligently analyze this huge amount of data, knowledge
of a subfield of Artificial Intelligence, namely Machine Learning (ML), is required. Various
types of machine learning and their algorithms have been introduced to handle real-world
scenarios. This paper presents a comprehensive survey of the methodologies, techniques,
algorithms, applications, and challenges of unsupervised machine learning, and of how
unsupervised learning techniques can be helpful in real-world business and environments.
Thus, this study’s key contribution is explaining the principles of different unsupervised
machine learning techniques and their applicability in various real-world application domains,
such as cybersecurity systems, smart cities, healthcare, e-commerce, agriculture, and many more.
Index Terms— Machine learning, Unsupervised learning, clustering, feature selection and
feature extraction.
I. INTRODUCTION
Very large amounts of data and information are collected and stored from various sources such as
mobile phones, personal computers, sensors, cameras, satellites, log files, health-care trackers, bioinformatics,
and human-generated data such as social media, where enormous numbers of photos, videos, and audio files are
uploaded daily to the internet. Roughly 2.5 quintillion bytes of data are generated every day. Intelligently
collecting, processing, and analyzing huge volumes of data, and developing the corresponding smart gadgets and
automated applications, requires the knowledge of Artificial Intelligence and Machine Learning. Machine learning
allows software applications and programs to make accurate predictions automatically without being explicitly
programmed. Machine learning is one of the most important fields of Data Science. In the real world, humans
learn through experience, while a machine works on human instructions. Machine learning is the discipline where
a machine learns automatically from experience, as humans do: its role is to learn, to improve performance with
experience, and to predict with the best possible accuracy. Machine learning has been classified into supervised,
unsupervised, and reinforcement learning.
Supervised learning: the model or algorithm learns from labeled data; it is also called a task-driven approach.
Unsupervised learning: the model or algorithm learns from unlabeled data; it is also called a data-driven approach.
Reinforcement learning: the model learns based on reward or penalty; it is also called an environment-driven approach.
II. UNSUPERVISED MACHINE LEARNING
Unsupervised learning, as the name itself suggests, is not guided by any supervision. This type of learning
automatically extracts knowledge, underlying hidden patterns, and data groupings from the dataset without human
intervention. It groups objects or items based on similarities. Unsupervised machine learning deals with unlabeled
datasets, where no target output is tagged to the corresponding input. Hence unsupervised learning is helpful in
real-life scenarios, because not all real-world problems come with input and output patterns.
A. Steps involved in Unsupervised learning techniques:
Gathering unlabeled data: In unsupervised machine learning, gathering (raw, unlabeled) data is the
important first step; insights and trends are then found in the data without supervision.
Interpretation: The raw input data is interpreted to find hidden patterns and trends.
Algorithm: A suitable algorithm is then applied, such as a clustering algorithm or association rules.
Processing: Here, the data points are divided into groups called clusters based on similarity, which is
measured using Euclidean or cosine distance.
Output: When a new data point arrives, the algorithm pushes the data point into the most similar group
and gives the predicted output based on similarity, without any supervision.
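The Processing and Output steps above can be sketched as a nearest-centroid assignment using Euclidean distance (the centroids and query points below are made-up values):

```python
import math

def assign_cluster(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean distance)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))

# Hypothetical cluster centres already learned from unlabeled data.
centroids = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
idx = assign_cluster((4.2, 4.8), centroids)
print(idx)   # 1: the new point is nearest to (5, 5)
```

Cosine distance could be substituted for the Euclidean helper without changing the surrounding logic.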
B. Advantages of using Unsupervised learning
Unsupervised learning helps to solve problems without human intervention.
It automatically learns from the data, discovers underlying patterns, and groups items or objects
based on similarities.
It is less complex than supervised machine learning, because supervised learning involves human
intervention: someone has to understand the input data and label it.
C. Disadvantages of using Unsupervised learning
It provides less accurate results, because it has no labeled data and the machine must discover the
underlying patterns and relationships automatically; hence its accuracy is somewhat lower than that of
supervised machine learning.
Evaluating an unsupervised machine learning model is quite difficult compared to evaluating a
supervised model.
D. Types of Unsupervised learning technique
In unsupervised learning we have input data but no corresponding output data. Given an image dataset, for
example, the algorithm knows nothing about the input features and is not trained on the images provided. The
unsupervised model should try to learn on its own and perform the task by clustering or grouping the images
based on similarities. Unsupervised learning has been classified into two types:
Clustering
Association
E. Clustering
Clustering is the grouping of objects based on similarities. It groups the given data points: objects that are
more similar remain in the same group, and objects that are less similar move to other clusters. It can be helpful
in marketing sectors or industries, where customers are grouped based on their behavior. Clustering is used in a
wide range of applications, such as e-commerce sites, cybersecurity, health-care analytics, and behavioral
analytics. Many clustering algorithms have been introduced; the most popular and widely used clustering
algorithms in machine learning are:
K-Means Clustering
Agglomerative hierarchical clustering
DBSCAN Clustering
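As a sketch of the first of these, a bare-bones k-means in plain Python (the 2-D points are toy values, not tied to any dataset in the paper):

```python
import random

def kmeans(points, k, iters=20, seed=42):
    """Plain k-means: alternate an assignment step and a centroid-update step."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)         # pick k initial centres from the data
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                      # assignment: nearest centroid wins
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
            clusters[j].append(p)
        for i, c in enumerate(clusters):      # update: centroid = mean of its cluster
            if c:
                centroids[i] = tuple(sum(col) / len(c) for col in zip(*c))
    return centroids, clusters

pts = [(1, 1), (1.5, 2), (1, 0.6), (8, 8), (9, 11), (8, 9)]
cents, groups = kmeans(pts, 2)
print(cents)
```

On well-separated data like this, the two centroids settle near the two visible groups regardless of the random start in most runs.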
F. Association Rules
Association rules are a type of unsupervised learning method used to find relationships between objects in
large databases. Association rule mining makes for an effective marketing strategy. Market basket analysis is one
example of association rule mining, since it finds relationships between the items purchased by customers. If a
person buys product ‘x’, then he/she might buy product ‘y’. A customer who buys product ‘x’ but not product ‘y’
is a typical customer whom marketing agents target to cross-sell items. Association rule mining finds frequent
items, pairs, associations, etc. from relational, transactional, or any other kind of database.
A rule is divided into two parts:
Antecedent
Consequent
“If a customer purchases the product bread, then he is likely to buy jam.”
Antecedent: the item found in the dataset; it is bread in the above statement.
Consequent: the item found in combination with the antecedent; it is jam in the above statement.
The relationship can be described by two parameters, “Support” and “Confidence”. Support indicates how many
times the if/then relation occurs in the dataset, whereas Confidence refers to the number of times the if/then
relationship has been found to be true.
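Support and confidence can be computed directly from a toy basket dataset (the transactions below are invented for illustration):

```python
# Each transaction is the set of products in one customer's basket.
transactions = [
    {"bread", "jam", "milk"},
    {"bread", "jam"},
    {"bread", "butter"},
    {"milk", "jam"},
    {"bread", "jam", "butter"},
]

def support(itemset):
    """Fraction of transactions containing every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction that
    also contain the consequent."""
    return support(antecedent | consequent) / support(antecedent)

sup = support({"bread", "jam"})
conf = confidence({"bread"}, {"jam"})
print(sup)    # 0.6  (bread and jam co-occur in 3 of 5 baskets)
print(conf)   # 0.75 (of 4 bread baskets, 3 also contain jam)
```

Apriori and FP-Growth use exactly these two measures to prune the search for frequent itemsets.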
There are different types of algorithms in association rule mining,
Apriori algorithm
FP – Growth algorithm
to increase in k value. When k is 3 or above, the distance between each data point and its centroid becomes
minimal and stable. So, selecting a k value of 3 or above can be an optimal solution for identifying the number of clusters.
3. Silhouette Method
It is used to find how well separated the k clusters in the dataset are. It can be calculated using the formula
s(o) = (b(o) − a(o)) / max{a(o), b(o)}
where a(o) is the mean distance from object o to the other objects in its own cluster, and b(o) is the minimum
mean distance from o to the objects of any other cluster.
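A direct computation of s(o) for a single object, using the formula above on two made-up 2-D clusters:

```python
import math

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def silhouette(o, own_cluster, other_clusters):
    """s(o) = (b(o) - a(o)) / max(a(o), b(o)) for a single object o."""
    others = [p for p in own_cluster if p != o]
    a = sum(euclid(o, p) for p in others) / len(others)        # mean intra-cluster distance
    b = min(sum(euclid(o, p) for p in c) / len(c)              # nearest other cluster's mean distance
            for c in other_clusters)
    return (b - a) / max(a, b)

c1 = [(0, 0), (0, 1), (1, 0)]
c2 = [(8, 8), (8, 9)]
s = silhouette((0, 0), c1, [c2])
print(round(s, 3))   # close to 1: the point sits firmly in its own cluster
```

Values near 1 indicate a good separation; values near 0 or below suggest the object lies between clusters or in the wrong one.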
should be one greater than the number of dimensions: MinPts = dimensions + 1. The epsilon value can be
decided using an elbow graph or k-distance graph; the point of maximum curvature can be selected as epsilon to
get a more accurate value. Hence, the DBSCAN approach can be more useful for clustering problems than
k-means or hierarchical clustering, although k-means and hierarchical clustering remain useful for some
applications. Density-based problems can be solved using the DBSCAN algorithm.
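The k-distance computation used to pick epsilon can be sketched as follows (the points and dimensionality are illustrative):

```python
import math

def k_distances(points, k):
    """For each point, the distance to its k-th nearest neighbour.
    Sorted in descending order, these values form the k-distance graph
    whose elbow (point of maximum curvature) suggests a value for epsilon."""
    out = []
    for p in points:
        d = sorted(math.dist(p, q) for q in points if q is not p)
        out.append(d[k - 1])
    return sorted(out, reverse=True)

pts = [(0, 0), (0.2, 0.1), (0.1, 0.3), (5, 5), (5.1, 5.2), (9, 9)]
# 2-D data, so MinPts = dimensions + 1 = 3: inspect the 3-distance graph.
kd = k_distances(pts, 3)
print(kd)
```

Plotting `kd` against point rank and reading off the sharpest bend gives the epsilon that separates dense regions from noise.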
In Genetics and Anomaly Detection
Image, speech and pattern recognition
VII. CONCLUSION
Unsupervised learning is a powerful tool that can be used on large databases. Various kinds of applications
have been developed using unsupervised learning techniques, and a variety of algorithms exist to solve problems
and give more accurate results. Unsupervised learning has both advantages and disadvantages; when its
challenges are successfully solved, it brings more profit to companies, strengthens the relationship between
companies and their customers, and improves user performance, among many other benefits. It automatically
solves problems by grouping data points based on similarity without any human intervention: less human
intervention and a more automatic process with fairly good results is what unsupervised machine learning offers.
Hence, this paper describes a complete survey of unsupervised machine learning, with its applications,
algorithms, and the challenges it faces in real-world scenarios.
REFERENCES
[1] A. Toshniwal, K. Mahesh and R. Jayashree, "Overview of Anomaly Detection techniques in Machine Learning," 4th
International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I- SMAC), pp.808-815, 2020, doi:
10.1109/I-SMAC49090.2020.9243329.
[2] Li, N., Shepperd, M., & Guo, Y., "A systematic review of unsupervised learning techniques for software defect
prediction," Information and Software Technology, Vol.122, pp.106287, 2020.
[3] Rodrigues, J., Belo, D., & Gamboa, H., "Noise detection on ECG based on agglomerative clustering of morphological
features," Computers in biology and medicine, Vol.87, pp.322-334, 2017.
[4] Shakeel, P. M., Baskar, S., Dhulipala, V. S., & Jaber, M. M., "Cloud based framework for diagnosis of diabetes mellitus
using K-means clustering," Health information science and systems, Vol.6, No.1, pp.1-7, 2018.
[5] M. Sujithra, P. Velvadivu, J. Rathika, R. Priyadharshini and P. Preethi, "A Study on Psychological Stress of
Working Women In Educational Institution Using Machine Learning," 2022 13th International Conference on
Computing Communication and Networking Technologies (ICCCNT), Kharagpur, India, 2022, pp. 1-7, doi:
10.1109/ICCCNT54827.2022.9984460.
[6] Liang, Z., Wang, C., Duan, Z., Liu, H., Liu, X., & Ullah Jan Khan, K., "A Hybrid Model Consisting of Supervised and
Unsupervised Learning for Landslide Susceptibility Mapping" Remote Sensing, Vol.13, No.8, pp.1464, 2021.
[7] Kuang, H., Qiu, Y., Li, R., & Liu, X., "A hierarchical K-means algorithm for controller placement in SDN-based WAN
architecture," 2018 10th International Conference on Measuring Technology and Mechatronics Automation (ICMTMA),
IEEE, pp.263-267, February 2018.
[8] Addagarla, S. K., & Amalanathan, A., "Probabilistic Unsupervised Machine Learning Approach for a Similar Image
Recommender System for E-Commerce," Symmetry, Vol.12, No.11, pp.1783, 2020.
[9] S. Siddharth, R. Darsini and M. Sujithra, "Sentiment analysis on twitter data using machine learning algorithms in
python", Int. J. Eng. Res. Comput. Sci. Eng., vol. 5, no. 2, pp. 285-290, 2018.
[10] Acar, E., & Yener, B., "Unsupervised multiway data analysis: A literature survey," IEEE transactions on knowledge and
data engineering, Vol.21, No.1, pp.6-20, 2008.
Grenze International Journal of Engineering and Technology, June Issue
Abstract—In the past decade, the availability of voluminous and high-dimensional datasets has
grown immensely, as continuous technological innovations foster new ways to gather and
analyze data. Feature selection has thus become a challenging task in application areas such as
text classification, data analysis, prediction, and information retrieval. Knowledge extraction
using machine learning models usually depends on the quality and quantity of the data they
operate on. Feature selection is one of the core concepts for identifying and removing irrelevant
as well as redundant information that may harm accuracy or have no impact on the results.
Feature selection methods are discussed in this review paper along with their practical
limitations. Subsequently, the workflow for solving a feature selection problem is elaborated
together with the feature selection methods. Many surveys and empirical assessments of feature
selection have been performed in areas such as classification, prediction, regression, and clustering.
Index Terms— feature selection, feature ranking, filter, hybrid, wrapper, embedded.
I. INTRODUCTION
The abrupt growth in the volume as well as the dimensionality of datasets is problematic for algorithms and leads
to high computational cost and memory usage. Mobile phones, social media, cameras, infrastructure-less wireless
networks, weblogs, radio frequency identification (RFID) readers, internet search, and web-based data are a few
of the information-gathering sources. According to the IDC Digital Universe study, “By 2020, around 40
zettabytes will be the size of the digital universe” [1]. Apart from the structured data in traditional datasets, a
large volume of unstructured and semi-structured data is present.
In the late 20th century, feature selection emerged as an important technique for selecting the most relevant,
significant, and important features. Variable selection, also known as ‘feature selection’ (or attribute selection),
is a part of feature engineering. It is the process of identifying and removing irrelevant or less vital features to
achieve better accuracy and enhance the performance of the model [2]. The identification or extraction of the
most relevant features is also known as a dimensionality reduction technique [1]. The advantage of feature
selection is that no information about the importance of any single feature is lost. However, when the original
features are very diverse and a small set of features is required, the removal of some features may lead to a loss
of information. The drawback of feature extraction, in contrast, is that linear combinations of the original
features are not interpretable, and information about the contribution of each feature is often lost. Feature
extraction often decreases the feature-space size without losing much information. The choice between feature
extraction and selection methods depends entirely on the data type of the application domain [3].
III. FEATURE SELECTION METHODS AND LIMITATIONS
The feature selection problem has been studied for many years by the statistics and machine learning
communities. With emerging data mining research, more attention has been given to feature selection
techniques. Feature selection, also known as subset selection, is a pre-processing technique used in machine
learning to increase learning accuracy by removing irrelevant features [16]. A taxonomy of feature selection
techniques for different datasets is shown in “Fig. 1”.
[Fig. 1: taxonomy of feature selection — a subset is selected from the original set using filter, wrapper, embedded, and hybrid methods; applicable to microarray data, text mining, weather forecasting, image processing, sentiment analysis, etc.]
Complete (exhaustive) search, sequential search, exponential search, and randomized search are the most
common search strategies used for feature selection. A large range of application areas (text analysis, microarray
data analysis, climate-change prediction, digital image processing, sentiment analysis, etc.) uses the different
types of feature selection algorithms (filter, wrapper, embedded, and hybrid methods).
A. Filter Methods
Filter methods are the most generic approach of the four and work irrespective of the data-modelling
algorithm. The optimal feature set is selected by analysing general characteristics of the dataset. In the literature,
filter methods are classed as univariate (evaluating a single feature using ranking) or multivariate (evaluating an
entire feature subset using a search strategy). Filter methods cannot be applied universally across different
knowledge discovery operations and have thus been classified for regression, classification, or clustering [6],
[10], [17]–[32]. The filter approach to feature subset selection is shown in Fig. 2.
According to [33], filter methods are faster and have lower computational cost than wrapper and embedded
methods, but their reliability in classification problems is poorer. Nevertheless, this class of methods is popular
with both academicians and industry practitioners. Based on the survey analysis, Table I summarises filter
methods into two categories, along with application utility and references.
Univariate filter class:
Information Gain (IG) [17]; Gain Ratio [18]; Chi-square [18]; Fisher Score (F-Score) [19]; Symmetrical uncertainty [20] | Classification
Relief and ReliefF [23]; ANOVA / Term Variance / Variance Threshold [24]; Count/ | Classification, Regression
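As a sketch of one univariate filter from the table above, a variance-threshold selector in plain Python (the data and threshold are toy values chosen for illustration):

```python
# Rows are samples; columns are features (toy values).
X = [
    [1.0, 0.0, 3.1],
    [1.0, 1.0, 2.9],
    [1.0, 0.0, 3.0],
    [1.0, 1.0, 3.3],
]

def variance(col):
    m = sum(col) / len(col)
    return sum((v - m) ** 2 for v in col) / len(col)

def variance_filter(X, threshold):
    """Return the indices of features whose variance exceeds the threshold."""
    cols = list(zip(*X))                 # transpose: one tuple per feature
    return [i for i, c in enumerate(cols) if variance(c) > threshold]

kept = variance_filter(X, 0.01)
print(kept)   # [1, 2]: the constant column 0 carries no information and is dropped
```

Being ranking-based and independent of any classifier, this is exactly the kind of cheap, model-agnostic pre-filter the section describes.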
B. Wrapper Methods
In 1996, Ron Kohavi and George H. John proposed the wrapper procedure to decrease irrelevant features [34].
In wrapper methods, a black box is used as a predictor to evaluate the variable subset. Wrapper methods perform
better when the number of features is smaller; in contrast, they are expensive for huge feature sets due to high
computational cost, and the feature selection process becomes slower if each feature is evaluated with a trained
classifier [16]. Another drawback is a higher risk of overfitting: if the classifier model’s data learning rate is too
high, it provides poor generalization [4], [6], [9]. The induction technique (also known as the ‘black box’) has
been used to represent supervised problems. Each training instance is characterised by a feature vector and a
class label, as shown in Fig. 3. Further, ranking is used for the elimination of irrelevant features [34].
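A minimal sketch of sequential forward selection, the simplest wrapper strategy: the black-box score below (`toy_score`) is a hypothetical stand-in for trained-classifier accuracy, not a real induction algorithm:

```python
def forward_selection(features, score, k):
    """Greedy sequential forward selection: repeatedly add the feature
    that most improves the (black-box) score of the current subset."""
    selected = []
    while len(selected) < k:
        best = max((f for f in features if f not in selected),
                   key=lambda f: score(selected + [f]))
        selected.append(best)
    return selected

# Hypothetical scoring function: features 'a' and 'c' are useful,
# 'b' and 'd' are noise, and each extra feature costs a small penalty.
def toy_score(subset):
    return ("a" in subset) * 0.5 + ("c" in subset) * 0.3 - 0.05 * len(subset)

sel = forward_selection(["a", "b", "c", "d"], toy_score, 2)
print(sel)   # ['a', 'c']
```

Replacing `toy_score` with cross-validated accuracy of a real classifier turns this sketch into the wrapper procedure described above, which is also why it becomes expensive for large feature sets.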
In 2015, R. Diao and Q. Shen suggested that, to form a hybrid approach with intellectual properties for feature
selection, the development of a meta-framework may be beneficial for the dynamic identification of suitable
algorithms [35]. According to [14], Genetic Algorithms (GA), Particle Swarm Optimization (PSO), and Ant
Colony Optimization (ACO) are the most widely used of the existing heuristic methods for the variable selection
problem. It has been observed that, due to its cogent structure, GA is preferable even though PSO and ACO
provide higher accuracy for similar tasks. According to the literature survey, wrapper algorithms can be
categorized into sequential selection and heuristic search. In Table II, wrapper methods are summarized along
with application utility and references.
[Fig. 3: wrapper approach — the entire feature set feeds a search strategy for subset generation; each subset is evaluated by the induction algorithm (classifier).]
Wrappers perform feature selection based on the performance of the modelling algorithm (the black box). For
classification tasks, for example, feature subsets are evaluated based on classifier performance, whereas for
clustering they are evaluated based on the performance of the clustering algorithm [13], [36]–[42]. When
implementing wrappers, subset generation depends on the search strategy, just as in filter methods. The model
evaluation steps are repeated for each subset until all features are ranked. According to the literature, wrapper
performance is better than that of filters because subsets are evaluated using real modelling algorithms. In
general, any combination of search strategy and modelling algorithm can be implemented for variable selection;
modelling methods like Naïve Bayes, linear support vector machines, and Extreme Learning Machines work
best with wrappers for greedy search problems.
Wrapper feature selection algorithms:
Sequential / greedy selection algorithms: Sequential backward selection and sequential forward selection [38] | Classification
Global / random / heuristic selection algorithms: Ant Colony Optimization (ACO) and Genetic Algorithm (GA) [39]; Particle Swarm Optimization (PSO-SVM) [42]; Artificial Bee Colony (ABC) [40]; Random mutation hill-climbing [36]; Simulated annealing (SA) [41] | Classification, Clustering
C. Embedded Methods
Filter methods have the major drawback of being independent of the classifier, which results in worse
performance than wrapper methods; wrappers, however, have high computational cost. A midway solution is the
use of embedded or hybrid methods, both of which use the principle of the classifier to generate criteria for
ranking the most optimal features [43]. Embedded methods are robust and effective when dealing with
high-dimensional datasets and have a lower risk of overfitting than wrappers [44]. These methods first train a
machine learning model, then derive feature importances, and finally remove non-important features using the
feature subset. Fig. 4 illustrates the embedded variable selection methodology.
Embedded methods consider not only feature dependencies, via the relationship between input and output
features, but also search features locally, allowing local discrimination. In other words, they use independent
criteria to find optimal subsets of known cardinality; the final optimal subset is then selected from among these
by the learning algorithm with the best accuracy level. Various types of decision tree algorithms are used by different
embedded methods, such as CART, C4.5, random forest, and multinomial logistic regression and its variants [45].
[Fig. 4: embedded feature selection — feature subsets produced from the entire feature set are evaluated jointly by the ML algorithm and classifier performance.]
Among different embedded methods, Support Vector Machines with Recursive Feature Elimination (SVM-RFE) is widely
used [46]. Regularization, also known as penalization, is another common embedded type of feature selection.
Among regression-based embedded methods, LASSO and RIDGE regression are mostly used for hyperspectral
data to reduce overfitting through inherent correction [47], [48]. According to the literature survey, the most
used embedded algorithms are listed in Table III, summarised along with application utility and references
[45], [47]–[52].
Embedded feature selection algorithms:
Tree-based methods / decision tree algorithms: Classification and Regression Trees (CART) [45] | Classification, Regression, Clustering; ID3 [50] and Random Forest [51] | Classification; C4.5 Decision Trees [49] | Clustering
Regularization / regression methods: LASSO regression (L1 regularization) [47]; RIDGE regression (L2 regularization) [48]; Elastic nets (L1/L2 regularization) [52] | Regression
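The regularization-based embedded idea can be sketched with closed-form ridge (L2) regression in NumPy: fit once, then rank features by coefficient magnitude. The data here is synthetic, generated for illustration:

```python
import numpy as np

def ridge_coefficients(X, y, lam):
    """Closed-form ridge (L2) regression: w = (X^T X + lam*I)^{-1} X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
# Only features 0 and 2 actually drive the target; 1 and 3 are noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=200)

w = ridge_coefficients(X, y, lam=1.0)
ranking = np.argsort(-np.abs(w))       # rank features by |coefficient|
print(ranking[:2])                     # features 0 and 2 come out on top
```

With an L1 penalty (LASSO) the noise coefficients would be driven exactly to zero, which is why LASSO is the more aggressive selector of the two.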
D. Hybrid Methods
Hybrid methods were developed to combine the advantages of filter and wrapper methods [44]. These
methods are sequential: the first step is commonly a filter method that reduces the number of features by
removing redundant ones, after which a wrapper method selects the desired number of features from the reduced
set for optimal feature selection [53]. A hybrid can be any combination of classical methods, such as
filter-filter, filter-wrapper, or filter-filter-wrapper, where the feature subset output by one method is provided as
input to the next. Independent tests and performance evaluation functions are used by this approach for feature
subset selection. It thus helps to improve efficiency and accuracy for high-dimensional datasets at a better
computational cost [54]. The hybrid feature selection layout is shown in Fig. 5. Several methodologies have
been developed using hybrid methods, such as hybrid ant colony optimization, fuzzy random-forest-based
feature selection, the mixed gravitational search algorithm, and hybrid genetic algorithms.
[Figure 5. Hybrid feature selection layout: entire feature set → feature pre-selection (filter method) → optimal feature selection (wrapper or filter) → classifier]
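The filter-then-wrapper pipeline described above can be sketched as follows. This is a minimal illustration assuming scikit-learn, with a synthetic dataset and arbitrarily chosen subset sizes (50 → 20 → 8); real applications would tune both stages.

```python
# Minimal filter-then-wrapper hybrid, assuming scikit-learn; the dataset
# and the subset sizes (50 -> 20 -> 8) are invented for the example.
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=50, n_informative=8,
                           random_state=0)

# Stage 1 (filter): a cheap univariate ANOVA F-test prunes 50 features to 20.
filt = SelectKBest(f_classif, k=20).fit(X, y)
X_reduced = filt.transform(X)

# Stage 2 (wrapper): recursive feature elimination searches the reduced
# set, guided by the classifier's own coefficients.
wrapper = RFE(LogisticRegression(max_iter=1000), n_features_to_select=8)
wrapper.fit(X_reduced, y)
print("final subset size:", int(wrapper.support_.sum()))
```

Because the expensive wrapper only ever sees the pre-filtered set, the overall cost stays closer to the filter's while retaining the wrapper's model-aware search.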
IV. COMPARISON
All of the above-mentioned methods have been widely used by many researchers for different applications. The performance of a feature selection method depends heavily on the dimensionality of the dataset; thus, new methods for feature selection are always needed. Table IV summarizes and compares the feature selection methods along with their merits and demerits [9], [13], [14], [37], [43], [44].
TABLE IV. COMPARISON OF FILTER, WRAPPER, EMBEDDED AND HYBRID METHODS
Computational efficiency:
  Filter: high computational efficiency.
  Wrapper: better results than filter, but computationally expensive for huge datasets.
  Embedded: performance degrades if the number of irrelevant features in the target set is high.
  Hybrid: better computational complexity than wrapper methods.
Interaction with the modelling algorithm:
  Filter: no interaction with the modelling algorithm.
  Wrapper, Embedded, Hybrid: dependent on the modelling algorithm's performance.
Cost on large datasets:
  Filter: low computational cost for large datasets.
  Wrapper: high computational cost for large datasets, thus works better for small datasets.
  Embedded: computational cost is lower in comparison to wrapper methods.
  Hybrid: more flexible and robust against high dimensional data.
Execution speed:
  Filter: faster execution than wrapper methods.
  Wrapper: slower execution than filter methods.
  Embedded: faster than wrapper methods.
  Hybrid: higher performance than filter methods.
Overfitting:
  Filter: less prone to overfitting.
  Wrapper: high risk of overfitting if the learning rate is too high.
  Embedded: generally used to reduce overfitting; least prone to overfitting.
  Hybrid: overcomes the demerits of wrappers with an enhanced search algorithm, thus least prone to overfitting.
Feature dependencies:
  Filter: mostly ignores feature dependencies and considers each feature separately.
  Wrapper: considers feature dependencies.
  Embedded: identification of a small set of features may be problematic.
  Hybrid: depends on the combination of different feature selection methods.
Evaluation of feature subsets:
  Filter: statistical tests or an appropriate evaluation function.
  Wrapper: cross-validation, or directly the execution of a data mining tool.
  Embedded, Hybrid: cross-validation methods are used for assessment [6], [22], [53].
In Table V, feature selection methods are summarized with respect to application domains and evaluation metrics. There is no ideal solution for a particular problem statement, and it is difficult to draw conclusions until a significant number of test situations have been efficiently addressed.
VI. CONCLUSION
In this paper, various strategies are reviewed in the context of feature selection. Each algorithm selects variables without computing redundancy, some algorithms also do not consider performance and accuracy, and others do not consider the existence of noisy data when picking features. According to the literature, if the computational time becomes too long, the benefit of the learning process becomes negligible. Filter methods can be used on huge datasets with many features because they are faster; however, this speed does not improve accuracy. Wrapper approaches pick the best features with high precision, but at a high computational cost. Some hybrid solutions have attempted to address the shortcomings of both methodologies. The objective of this study is to provide an in-depth comprehension of feature selection. A high dimensional dataset containing irrelevant, insignificant, and unimportant features prevents effective modelling and may produce less accurate as well as less understandable results, or may fail to achieve the desired results. Based on this research, an efficient unified framework for variable selection is required that works on datasets of any size with minimal computing cost and the highest accuracy.
REFERENCES
[1] J. Gantz and D. Reinsel, “The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East,” IDC, Dec. 2012. https://www.cs.princeton.edu/courses/archive/spring13/cos598C/idc-the-digital-universe-in-2020.pdf (accessed Nov. 09, 2022).
[2] I. Guyon and A. Elisseeff, “An Introduction to Variable and Feature Selection,” Journal of Machine Learning Research,
vol. 3, pp. 1157–1182, 2003, Accessed: Nov. 09, 2022.
[3] A. Janecek, W. Gansterer, M. Demel, and G. Ecker, “On the Relationship Between Feature Selection and Classification
Accuracy,” in Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge
Discovery at ECML/PKDD, Sep. 2008, pp. 90–105.
[4] V. Radha and N. Krishnaveni, “Feature Selection Algorithms for Data Mining Classification: A Survey,” Indian J Sci Technol, vol. 12, no. 6, pp. 1–11, Feb. 2019.
[5] T. Li, Y. S. Ho, and C. Y. Li, “Bibliometric analysis on global Parkinson’s disease research trends during 1991–2006,”
Neurosci Lett, vol. 441, no. 3, pp. 248–252, Aug. 2008, Accessed: Nov. 10, 2022.
[6] J. Tang, S. Alelyani, and H. Liu, “Feature selection for classification: A review,” Data Classification: Algorithms and
Applications, pp. 37–64, Jan. 2014,
[7] J. Li et al., “Feature selection: A data perspective,” ACM Comput Surv, vol. 50, no. 6, Dec. 2017,
[8] K. Zhu and J. Yang, “A cluster-based sequential feature selection algorithm,” Proceedings - International Conference
on Natural Computation, pp. 848–852, 2013, Accessed: Nov. 10, 2022.
[9] Z. Zeng, H. Zhang, R. Zhang, and Y. Zhang, “A Hybrid Feature Selection Method Based on Rough Conditional Mutual
Information and Naive Bayesian Classifier,” ISRN Applied Mathematics, vol. 2014, pp. 1–11, Mar. 2014, Accessed:
Nov. 14, 2022.
[10] S. Tabakhi and P. Moradi, “Relevance–redundancy feature selection based on ant colony optimization,” Pattern
Recognit, vol. 48, no. 9, pp. 2798–2811, Sep. 2015, Accessed: Nov. 10, 2022.
[11] V. H. Medina Garcia, J. Rodriguez Rodriguez, and M. A. Ospina Usaquén, “A comparative study between feature
selection algorithms,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence
and Lecture Notes in Bioinformatics), vol. 10943 LNCS, pp. 65–76, 2018,
[12] L. E. A. Laura Emmanuella and A. M. de Paula Canuto, “Filter-based optimization techniques for selection of feature
subsets in ensemble systems,” Expert Syst Appl, vol. 41, no. 4, pp. 1622–1631, Mar. 2014, Accessed: Nov. 10, 2022.
[13] T. Tekin Erguzel, C. Tas, and M. Cebi, “A wrapper-based approach for feature selection and classification of major
depressive disorder–bipolar disorders,” Comput Biol Med, vol. 64, pp. 127–137, Sep. 2015, Accessed: Nov. 10, 2022.
[14] M. M. Javidi and N. Emami, “A hybrid search method of wrapper feature selection by chaos particle swarm
optimization and local search,” Turkish Journal of Electrical Engineering and Computer Sciences, vol. 24, no. 5, pp.
3852–3861, Jan. 2016,
[15] S. Das, P. K. Singh, S. Bhowmik, R. Sarkar, and M. Nasipuri, “A Harmony Search Based Wrapper Feature Selection
Method for Holistic Bangla Word Recognition,” Procedia Comput Sci, vol. 89, pp. 395–403, Jan. 2016, Accessed: Nov.
10, 2022.
[16] Chandrashekar Girish and Sahin Ferat, “A survey on feature selection methods,” Computers & Electrical Engineering,
vol. 40, no. 1, pp. 16–28, Jan. 2014, Accessed: Nov. 10, 2022.
[17] Hoque N., Bhattacharyya D. K., and Kalita J. K., “MIFS-ND: A mutual information-based feature selection method,”
Expert Syst Appl, vol. 41, no. 14, pp. 6371–6385, Oct. 2014, Accessed: Nov. 10, 2022.
[18] I. H. Witten, E. Frank, M. A. Hall, and C. J. Pal, “Data Mining: Practical Machine Learning Tools and Techniques,”
Data Mining: Practical Machine Learning Tools and Techniques, pp. 1–621, Nov. 2016, Accessed: Nov. 10, 2022.
[19] Duda R.O., Hart P.E., and Stork D.G., “Pattern Classification,” Journal of Classification 2007 24:2, vol. 24, no. 2, pp.
305–307, Sep. 2007,
[20] L. Yu and H. Liu, “Feature Selection for High-Dimensional Data: A Fast Correlation-Based Filter Solution,” in Proc. 20th International Conference on Machine Learning (ICML-2003), 2003, pp. 856–863.
[21] M. Dash and Y. S. Ong, “RELIEF-C: Efficient feature selection for clustering over noisy data,” Proceedings -
International Conference on Tools with Artificial Intelligence, ICTAI, pp. 869–872, 2011, Accessed: Nov. 10, 2022.
[22] S. Alelyani, J. Tang, and H. Liu, Chapter: Feature Selection for Clustering: A Review. Chapman and Hall/CRC, 2018.
[23] M. Robnik-Šikonja and I. Kononenko, “Theoretical and Empirical Analysis of ReliefF and RReliefF,” Machine
Learning 2003 53:1, vol. 53, no. 1, pp. 23–69, Oct. 2003,
[24] F. Ahmed and M. L. Gavrilova, “Two-Layer Feature Selection Algorithm for Recognizing Human Emotions from 3D
Motion Analysis,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and
Lecture Notes in Bioinformatics), vol. 11542 LNCS, pp. 53–67, 2019,
[25] H. Nematzadeh, R. Enayatifar, M. Mahmud, and E. Akbari, “Frequency based feature selection method using whale
algorithm,” Genomics, vol. 111, no. 6, pp. 1946–1955, Dec. 2019, Accessed: Nov. 10, 2022.
[26] H. Liu, M. Zhou, X. S. Lu, and C. Yao, “Weighted Gini index feature selection method for imbalanced data,” ICNSC
2018 - 15th IEEE International Conference on Networking, Sensing and Control, pp. 1–6, May 2018,
[27] C. Lai, M. J. T. Reinders, and L. Wessels, “Random subspace method for multivariate feature selection,” Pattern
Recognit Lett, vol. 27, no. 10, pp. 1067–1076, Jul. 2006, Accessed: Nov. 10, 2022.
[28] A. J. Ferreira and M. A. T. Figueiredo, “An unsupervised approach to feature discretization and selection,” Pattern
Recognit, vol. 45, no. 9, pp. 3048–3060, Sep. 2012, Accessed: Nov. 10, 2022.
[29] D. M. Witten and R. Tibshirani, “A framework for feature selection in clustering,” J Am Stat Assoc, vol. 105, no. 490,
pp. 713–726, Jun. 2010,
[30] Y. Li, M. Dong, and J. Hua, “Localized feature selection for clustering,” Pattern Recognit Lett, vol. 29, no. 1, pp. 10–18,
Jan. 2008,
[31] D. S. Modha and W. S. Spangler, “Feature Weighting in k-Means Clustering,” Machine Learning 2003 52:3, vol. 52,
no. 3, pp. 217–237, Sep. 2003,
[32] P. Moradi and M. Rostami, “Integration of graph clustering with ant colony optimization for feature selection,” Knowl
Based Syst, vol. 84, pp. 144–161, Aug. 2015, Accessed: Nov. 10, 2022.
[33] Saptarsi Goswami and Amlan Chakrabarti, “Feature Selection: A Practitioner View,” International Journal of
Information Technology and Computer Science(IJITCS), vol. 6, no. 11, pp. 66–77, 2014,
[34] R. Kohavi and G. H. John, “Wrappers for feature subset selection,” Artif Intell, vol. 97, no. 1–2, pp. 273–324, Dec.
1997, Accessed: Nov. 10, 2022.
[35] R. Diao and Q. Shen, “Nature inspired feature selection meta-heuristics,” Artificial Intelligence Review 2015 44:3, vol.
44, no. 3, pp. 311–340, Jan. 2015,
[36] D. B. Skalak, “Prototype and Feature Selection by Sampling and Random Mutation Hill Climbing Algorithms,”
Machine Learning Proceedings 1994, pp. 293–301, Jan. 1994, Accessed: Nov. 10, 2022.
[37] J. C. Cortizo and I. Giraldez, “Multi criteria wrapper improvements to Naive Bayes learning,” Lecture Notes in
Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics),
vol. 4224 LNCS, pp. 419–427, 2006,
[38] K. Z. Mao, “Orthogonal forward selection and backward elimination algorithms for feature subset selection,” IEEE Trans Syst Man Cybern B Cybern, vol. 34, no. 1, pp. 629–634, Feb. 2004.
[39] H. R. Kanan, K. Faez, and S. M. Taheri, “Feature selection using Ant Colony Optimization (ACO): A new method and
comparative study in the application of face recognition system,” Lecture Notes in Computer Science (including
subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 4597 LNCS, pp. 63–76,
2007,
[40] M. Schiezaro and H. Pedrini, “Data feature selection based on Artificial Bee Colony algorithm,” EURASIP Journal on
Image and Video Processing 2013 2013:1, vol. 2013, no. 1, pp. 1–8, Aug. 2013,
[41] S. W. Lin, Z. J. Lee, S. C. Chen, and T. Y. Tseng, “Parameter determination of support vector machine and feature
selection using simulated annealing approach,” Appl Soft Comput, vol. 8, no. 4, pp. 1505–1512, Sep. 2008, Accessed:
Nov. 10, 2022.
[42] Chung-Jui Tu, Li-Yeh Chuang, Jun-Yang Chang, and Cheng-Hong Yang, “Feature selection using PSO-SVM,” IAENG
Int J Comput Sci, vol. 33, no. 1, 2007,
[43] I. S. Oh, J. S. Lee, and B. R. Moon, “Hybrid genetic algorithms for feature selection,” IEEE Trans Pattern Anal Mach
Intell, vol. 26, no. 11, pp. 1424–1437, Nov. 2004, Accessed: Nov. 10, 2022.
[44] S. Das, “Filters, Wrappers and a Boosting-Based Hybrid for Feature Selection,” in Proceedings of the Eighteenth International Conference on Machine Learning (ICML’01), 2001, pp. 74–81.
[45] M. Sandri and P. Zuccolotto, “Variable Selection Using Random Forests,” Data Analysis, Classification and the
Forward Search, pp. 263–270, Aug. 2006,
[46] G. C. Cawley, N. L. C. Talbot, and M. Girolami, “Sparse multinomial logistic regression via Bayesian L1 regularisation,” in Advances in Neural Information Processing Systems, vol. 19, 2007, pp. 209–216.
[47] R. Muthukrishnan and R. Rohini, “LASSO: A feature selection technique in predictive modeling for machine learning,”
2016 IEEE International Conference on Advances in Computer Applications, ICACA 2016, pp. 18–20, Mar. 2017,
Accessed: Nov. 10, 2022.
[48] M. Imani and H. Ghassemian, “Ridge regression-based feature extraction for hyperspectral data,” International Journal of Remote Sensing, vol. 36, no. 6, pp. 1728–1742, Mar. 2015.
[49] S. W. Lin and S. C. Chen, “Parameter determination and feature selection for C4.5 algorithm using scatter search
approach,” Soft Computing 2011 16:1, vol. 16, no. 1, pp. 63–75, May 2011,
[50] S. Nizamani, N. Memon, U. K. Wiil, and P. Karampelas, “Modeling Suspicious Email Detection using Enhanced
Feature Selection,” Dec. 2013,
[51] B. H. Menze et al., “A comparison of random forest and its Gini importance with standard chemometric methods for the
feature selection and classification of spectral data,” BMC Bioinformatics, vol. 10, Jul. 2009, Accessed: Nov. 10, 2022.
[52] A. Destrero, S. Mosci, C. de Mol, A. Verri, and F. Odone, “Feature selection for high-dimensional data,” Computational
Management Science 2008 6:1, vol. 6, no. 1, pp. 25–40, Apr. 2008,
[53] Muhammad Shakil Pervez and Dewan Md. Farid, “Literature Review of Feature Selection for Mining Tasks,”
International Journal of Computer Application, vol. 116, no. 21, pp. 30–33, 2015.
[54] Veerabhadrappa and L. Rangarajan, “Bi-level dimensionality reduction methods using feature selection and feature
extraction,” Int J Comput Appl, vol. 4, no. 2, pp. 33–38, 2010.
[55] M. Yousef, A. Kumar, and B. Bakir-Gungor, “Application of biological domain knowledge based feature selection on
gene expression data,” Entropy, vol. 23, no. 1. 2021.
[56] R. K. Palacharla and V. K. Vatsavayi, “A novel filter based multivariate feature Selection technique for text
classification,” J Theor Appl Inf Technol, vol. 99, no. 18, 2021.
[57] O. M. Alyasiri, Y. N. Cheah, and A. K. Abasi, “Hybrid Filter-Wrapper Text Feature Selection Technique for Text
Classification,” in International Conference on Communication and Information Technology, ICICT 2021, 2021.
[58] K. Golalipour, E. Akbari, S. S. Hamidi, M. Lee, and R. Enayatifar, “From clustering to clustering ensemble selection: A
review,” Engineering Applications of Artificial Intelligence, vol. 104. 2021.
[59] A. Adla and S. T. Zouggar, “Performance Assessment of Random Forest Induction Methods,” in 2021 International
Conference on Decision Aid Sciences and Application, DASA 2021, 2021.
[60] M. Tiboni, C. Remino, R. Bussola, and C. Amici, “A Review on Vibration-Based Condition Monitoring of Rotating
Machinery,” Applied Sciences (Switzerland), vol. 12, no. 3. 2022.
[61] A. N. Anggraeni, K. Mustofa, and S. Priyanta, “Comparison of Filter and Wrapper Based Feature Selection Methods on
Spam Comment Classification,” IJCCS (Indonesian Journal of Computing and Cybernetics Systems), vol. 15, no. 3,
2021,
[62] P. Jayapriya and K. Umamaheswari, “Performance analysis of two-stage optimal feature-selection techniques for finger
knuckle recognition,” Intelligent Automation and Soft Computing, vol. 32, no. 2, 2022.
Grenze International Journal of Engineering and Technology, June Issue
Abstract—Devanagari script is used by most people in India. Some script-specific structural characteristics of Devanagari make the character recognition problem more challenging. Many OCR tools are available for printed or handwritten Devanagari script recognition. In these systems the input is given in the form of images of the script, which can be scanned or photographed. But the existing systems are not robust: they give unexpected results when the input to the system is not ideal, that is, when the image is rotated or tilted or has illumination variance. Our goal is to build a robust OCR system for printed Marathi highlighted text in which variation with respect to font, size, orientation, and illumination is allowed. This paper proposes appropriate image transformation techniques to obtain a robust Devanagari Character Recognition System.
Index Terms—Devanagari, PCA, Tesseract, Levenshtein edit distance, OCR, Perspective transform, Sauvola thresholding, Highlighted word.
I. INTRODUCTION
Humans have a highly developed sense for several pattern recognition tasks; one such task we perform easily every now and then is recognizing written text. Humans develop their reading and writing skills in their first few years of education, and when they grow up they can easily recognize text even if it is printed in different styles, sizes, fonts, and orientations. Even broken, distorted, and misspelled words can be recognized by humans, and all this is possible through past experience. A great deal of research has found that the reading skill of computers is still far behind that of humans. In this paper, the goal is to recognize highlighted Marathi words. An image of a highlighted printed or handwritten word is taken as input. The image can be scanned or photographed using a smartphone or web camera. Many OCR systems for Devanagari script recognition presently exist.
Existing systems have certain limitations: they cannot work on tilted images, images captured at different angles of rotation, or in the presence of illumination variance. The tilt angle and rotation angle are not the same. The tilt angle is the angle made by the camera with the plane; it is the same as the elevation angle, whose zero is at the horizon. If it is non-zero, tilt effects are observed in the image and the system may not give correct results. The rotation angle is the azimuthal angle; its axis of rotation is perpendicular to the plane. The existing systems may give unexpected results if the angle of rotation is non-zero. Also, if there is uneven illumination or shading, the existing systems may fail to recognize the word correctly. So, a methodology is proposed to transform the image
V. PROPOSED METHODOLOGY
In the proposed methodology, we focus on recognizing Marathi highlighted words in any orientation, i.e. under tilt or rotation, and under different illumination effects. The Tesseract API is used for recognizing the highlighted script. In the case of the Marathi language, Tesseract fails to recognize the script correctly if the tilt or rotation angle is non-zero or if there is light variance. This paper focuses on improving these limitations of the Tesseract API. Figure 1 shows the block diagram of the proposed system, which consists of different phases, beginning with the input printed text image with a highlighted word, pre-processing, rotation invariant, tilt invariant, illumination
invariant, Tesseract implementation, finding the best match for the Tesseract output using Levenshtein distance, and the final recognized text. The block diagram of the proposed system is as follows:
A. Pre-processing
Pre-processing steps (as shown in Figure 2) are applied to the input image to remove noise and to minimize variations in the character styles. The scanned document sometimes has salt-and-pepper noise or shaded areas. This noise must be filtered during the preprocessing step. Sometimes the image contains black spots; filtering is done to remove these black spots and noise along with the black shade at the edges. Here, we have used a median filter to remove the high frequency components that cause noise in the image.
Input Image: The highlighted printed text is captured by a mobile phone, and that captured image is the input to our system. The captured image contains one or more highlighted words. Recognition of these highlighted words is the goal of this project.
Masking: The highlighted part is masked here. The lower and upper bounds of the highlighter color are found with the help of the BGR values of that particular color, and then a bitwise AND operation is performed to extract only the colored part.
(RGB) Color image to Grayscale image: The input is a color text image. In the preprocessing phase the image is converted to a grayscale image.
Thresholding: Thresholding is also known as binarization. A certain threshold value is set, which converts the pixels to black and white: if a pixel value is above the threshold, the pixel is converted to white, and if it is below the threshold, the pixel is converted to black. The quality of the binarized image depends on the value of the threshold.
Canny Edge Detection: Canny edge detection is a process to extract significant structural information from the image and reduce the amount of data to be processed.
Boundary Tracing or Contour Detection: Contour detection finds the boundary of the area of interest using edges. It identifies connected components of the image and stores their pixel values in the form of an array. The contour can be found by traversing the rows of the already filtered image. The contour detection algorithm searches for a foreground pixel and stores it in an array by marking it; similarly, it finds all the neighboring pixels. This process continues until all the pixels of the image have been stored, or it continues the search in the next row.
B. Rotation Invariance
To make the system rotation invariant, we first need to find the angle by which the image is rotated and then compensate for the rotation. This is implemented using the steps shown in Figure 3.
Figure 3. Block diagram to make the system rotation invariant
1. Mask the highlighted portion as already discussed in the masking step of the pre-processing part.
2. Crop the masked image to get only the highlighted portion. This is required because the masked highlighted portion has a black background, which may give an improper dataset in step 3.
3. Classify the pixels into background pixels and foreground pixels. The black pixels of the text are the foreground pixels, and all others are background pixels.
4. Apply Principal Component Analysis to the foreground pixels obtained in the previous step. A brief explanation of PCA is given immediately after step 5.
5. Rotate the image by the negative of the angle obtained in step 4.
Principal Component Analysis (PCA) [12] is used for finding the direction of maximum variance, i.e. the direction in which the data is most spread out. To find this direction, the eigenvectors of the covariance matrix associated with the dataset are calculated. The eigenvector corresponding to the largest eigenvalue gives the vector in the direction of maximum variance. In this paper, the coordinates of the pixels lying inside the contour of the highlighted word form the dataset.
The PCA implementation steps are as follows:
I. An n x 2 matrix of the dataset is formed, where n is the number of pixels lying inside the contour of the highlighted word.
Data = \begin{bmatrix} x_1 & y_1 \\ x_2 & y_2 \\ \vdots & \vdots \\ x_n & y_n \end{bmatrix}
II. Find the covariance matrix of the data matrix:
Cov\_matrix = \begin{bmatrix} var(x) & cov(x, y) \\ cov(y, x) & var(y) \end{bmatrix}
where cov(x, y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) and var(x) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2.
III. Find the eigenvalues and eigenvectors of the covariance matrix.
IV. Find the eigenvector corresponding to the largest eigenvalue. The direction of this eigenvector (v_x, v_y) is obtained by
angle(\theta) = \tan^{-1}(v_y / v_x)
The value of the angle gives the angle made by the highlighted word with the X axis. If the value is not zero, then in order to make the word parallel to the X axis we need to rotate the image by the negative of the angle: rotation angle = 0 - angle.
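Steps I to IV can be sketched in NumPy as follows. This is an illustrative reconstruction, not the authors' code; a synthetic elongated point cloud stands in for the foreground pixel coordinates of the highlighted word.

```python
# NumPy sketch of the PCA orientation estimate (steps I-IV): the leading
# eigenvector of the covariance matrix of the foreground pixel
# coordinates gives the word's direction of maximum variance.
import numpy as np

def pca_angle(points):
    """points: (n, 2) array of (x, y) pixel coordinates."""
    data = points - points.mean(axis=0)        # center the dataset
    cov = np.cov(data, rowvar=False)           # 2x2 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues ascending
    vx, vy = eigvecs[:, np.argmax(eigvals)]    # principal direction
    # Reduce modulo 180 so the sign of the eigenvector does not matter.
    return np.degrees(np.arctan2(vy, vx)) % 180.0

# Synthetic word-like point cloud rotated by 30 degrees.
rng = np.random.default_rng(0)
pts = rng.normal(size=(500, 2)) * [30.0, 3.0]  # elongated along x
theta = np.radians(30)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
angle = pca_angle(pts @ R.T)
print(f"estimated angle: {angle:.1f} deg")     # close to 30 for this data
```

The image would then be rotated by the negative of this angle, as in step 5.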
Figure 4 shows the input image, its HSV image, then the masked image, followed by the median-filtered image, its grayscale image, the application of binary thresholding, contour detection, the rotated image, the median filter on the rotated image, and the final image.
Figure 4. (a) Input Image, (b) HSV Image, (c) Masked Image, (d) Median Filter Image, (e) Gray Scaled Image, (f) Binary Thresholding, (g) Contour Detection, (i) Rotated Image, (j) Median Filter on Rotated Image, (k) Final Image
C. Tilt Invariant
When the photo is taken at a slightly tilted angle, the highlighted word is not properly visible. The perspective transform method is used to make the word properly visible.
Implementation of the Perspective Transform
The perspective transform plays an important role in making this system tilt invariant. In the transformed image the letters are straight rather than slanted, which is very useful for OCR.
Initially the contour of the highlighted word is found. For the perspective transform we need to define the region of interest, which is in the form of a rectangle. The coordinates of the vertices of the rectangle are such that the top-left point has the smallest (x + y) sum, the bottom-right has the largest (x + y) sum, the top-right has the largest (x - y) difference, and the bottom-left has the smallest (x - y) difference. These points are then placed in a consistent order. The height and width of the rectangle enclosing the highlighted word can be determined from the vertices obtained above. The first point in the list, (0, 0), indicates the top-left corner; the second point, (maxWidth - 1, 0), is the top-right corner; (maxWidth - 1, maxHeight - 1) gives the bottom-right corner; and (0, maxHeight - 1) gives the bottom-left corner. These points are defined in a consistent ordering representation.
A top-down view of the image is obtained using the cv2.getPerspectiveTransform function. It requires two arguments, rect and dst: rect is the list of four region-of-interest points in the original image and dst is the list of transformed points. The cv2.getPerspectiveTransform function returns the actual transformation matrix M, which is then applied by the cv2.warpPerspective function: passing the transformation matrix M, the image, and the height and width of the output image into cv2.warpPerspective results in the warped image, which is our top-down view. Figure 5 shows the steps for making the system tilt invariant.
Figure 5. (k) Tilted Image, (l) ROI image shown by rectangle, (m) Warped Image
D. Illumination Invariant
The Sauvola thresholding method is used to make the system illumination invariant.
Sauvola Thresholding
Sauvola thresholding is a local thresholding technique. It is useful for text recognition where the background of the image is not uniform [7]. In this method a threshold is calculated for every pixel using the formula mentioned below, which involves the mean and standard deviation of the local neighborhood defined by a window centered on the pixel. The local threshold value is calculated by the following equation:
T(x, y) = m(x, y) \cdot \left[ 1 + k \left( \frac{s(x, y)}{R} - 1 \right) \right]
where k is a constant equal to 0.5, and R denotes the dynamic range of the standard deviation s (defined as R = 128 for grayscale documents).
Algorithm:
input: image (an N x M ndarray), window_size (odd int), k (float, default 0.5), r (float or None)

def mean_std(image, window_size):
    m = local mean of each pixel over its window_size x window_size neighborhood
    s = local standard deviation of each pixel over the same neighborhood
    return m, s

def threshold_sauvola(image, window_size, k=0.5, r=None):
    m, s = mean_std(image, window_size)
    T(x, y) = m(x, y) * (1 + k * (s(x, y) / r - 1))
    return T (an N x M ndarray of per-pixel thresholds)
The functions above are used by the Sauvola threshold: the mean and standard deviation of each pixel of the image are calculated and returned using its neighborhood. Here, the neighborhood is defined by a rectangular window of size w x w, where the window size w should be an odd integer (3, 5, 7, ...). The parameter window_size determines the size of the window that contains the surrounding pixels. Sauvola thresholding is then applied to the array: the threshold value T is calculated using the formula given in the algorithm, where m(x, y) is the mean of pixel (x, y), s(x, y) is the standard deviation of pixel (x, y), k weights the effect of the standard deviation, and R is the maximum standard deviation of a grayscale image.
This algorithm compensates for the illumination effects of the image: even if the image is captured under different light variations, Sauvola thresholding preserves the information contained in the image. Figure 6 has two images: image (n) has an illumination effect, and image (o) is obtained by applying Sauvola thresholding, which is very useful for the OCR system to recognize the text correctly.
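A pure-NumPy sketch of Sauvola thresholding is shown below, using integral images to compute the local mean and standard deviation efficiently; the window size and the synthetic test image are illustrative assumptions, not the paper's data.

```python
# Sauvola local thresholding: T = m * (1 + k * (s / r - 1)) per pixel,
# with local mean m and standard deviation s from integral images.
import numpy as np

def sauvola_threshold(image, window_size=15, k=0.5, r=128.0):
    img = image.astype(np.float64)
    pad = window_size // 2
    padded = np.pad(img, pad, mode="reflect")
    # Integral images of values and squared values give each window's
    # sum in O(1), hence the local mean and standard deviation.
    ii = np.cumsum(np.cumsum(padded, axis=0), axis=1)
    ii2 = np.cumsum(np.cumsum(padded ** 2, axis=0), axis=1)
    ii = np.pad(ii, ((1, 0), (1, 0)))
    ii2 = np.pad(ii2, ((1, 0), (1, 0)))
    h, w = img.shape
    n = window_size ** 2
    y0 = np.arange(h)[:, None]
    x0 = np.arange(w)[None, :]
    y1, x1 = y0 + window_size, x0 + window_size
    total = ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]
    total2 = ii2[y1, x1] - ii2[y0, x1] - ii2[y1, x0] + ii2[y0, x0]
    m = total / n
    s = np.sqrt(np.maximum(total2 / n - m ** 2, 0.0))
    return m * (1.0 + k * (s / r - 1.0))

# Thin dark "stroke" on an unevenly lit page: a single global threshold
# would fail across the gradient, while the local threshold adapts.
grad = np.tile(np.linspace(80, 220, 64), (64, 1))
page = grad.copy()
page[20:23, 8:56] *= 0.2
binary = page > sauvola_threshold(page, window_size=15)
print("stroke pixels classified dark:", (~binary[20:23, 12:52]).all())
```

Note that Sauvola assumes strokes thinner than the window; large solid dark regions can be misclassified, which is why the window size matters.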
E. Text Recognition
Text recognition is the most important task of any OCR system. Various OCR systems are available, but they are not capable of producing correct output if there is variation with respect to rotation, tilt, and illumination. The aim of this project is therefore to make a robust system in which variations with respect to rotation, tilt, and illumination are allowed; the main focus of this work is on these three aspects.
This text recognition system is implemented using the Google-backed OCR engine Tesseract, which supports more than 110 languages [13] and uses a Long Short-Term Memory neural network trained on text data to convert images into text. It has 98% accuracy, but when variations in rotation, tilt, or lighting are present, its accuracy starts decreasing. So we tried to overcome these problems using PCA to make the system rotation invariant, the perspective transform to make it tilt invariant, and Sauvola thresholding to make it illumination invariant, so that the recognition rate increases back above 98%.
F. Extraction of Best match
The goal of this work is to make robust OCR system which recognizes the word even if it is distorted, word is
misspelled or some characters in the words are deleted. It should be recognized in its correct form and it is done
by finding best match to the infected word. For this, thelist of words is stored in text file. The word obtained
from previous step is search in that text file, and if there is a match found for that particular word, that best
match will be treated as a recognized text. This will increase the accuracy of our system.
To find the best match for the detected word, the Levenshtein distance method is used. The Levenshtein distance [14] is a string distance measure for quantifying the difference between two sequences. Informally, the Levenshtein distance between two strings is the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one string into the other. This edit distance method lets us match a corrupted word or string with its original form.
The Levenshtein edit distance method is applied in the following order:
1. minimum length of the two words;
2. actual Levenshtein edit distance between the words;
3. length of the substring match starting from the first letter.
In this project the Levenshtein distance method is used to find the best match for a recognized Marathi word among the stored Marathi keywords, which increases accuracy. The keyword list is stored in a text file. A recognized word found in corrupted form is corrected using the Levenshtein edit distance method, as shown in Table I.
Mathematically, the Levenshtein distance between two strings a and b is given by lev_a,b(|a|, |b|), where

lev_a,b(i, j) = max(i, j)                                  if min(i, j) = 0,
lev_a,b(i, j) = min( lev_a,b(i-1, j) + 1,
                     lev_a,b(i, j-1) + 1,
                     lev_a,b(i-1, j-1) + 1_(a_i != b_j) )  otherwise.
TABLE I. LEVENSHTEIN EDIT DISTANCE OBSERVATIONS
Incorrect Word | Correct Word
अशुद | अशु ी
िव यता | िव ता
नैस रक | नैसविगक
अनधा ् य | अ धा
Here 1_(a_i != b_j) is the indicator function, equal to 0 when a_i = b_j and 1 otherwise, and lev_a,b(i, j) is the distance between the first i characters of a and the first j characters of b. Note that the first element in the minimum corresponds to deletion (from a to b), the second to insertion, and the third to a match or mismatch, depending on whether the respective symbols are the same.
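The recurrence above translates directly into a small dynamic-programming routine. The dictionary lookup shown here is a minimal sketch of the best-match step, with the word list as a plain Python list rather than the text file used in the project.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic DP Levenshtein edit distance: minimum number of
    single-character insertions, deletions, and substitutions
    needed to turn string a into string b."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))          # row for lev(0, j) = j
    for i in range(1, m + 1):
        curr = [i] + [0] * n           # lev(i, 0) = i
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution / match
        prev = curr
    return prev[n]

def best_match(word, dictionary):
    """Return the dictionary entry with the smallest edit distance
    to the (possibly distorted) recognized word."""
    return min(dictionary, key=lambda w: levenshtein(word, w))
```

For example, levenshtein("kitten", "sitting") is 3 (two substitutions and one insertion), and a corrupted word like "spel" resolves to "spell" when the dictionary contains it.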
The experiment was carried out on 100 images. The experimental analysis shows that the proposed method works well for long words (words with more than two letters), giving 98% accuracy; the accuracy is reduced mainly by illumination effects on the image.
VII. CONCLUSION
In the context of the current project, we plan to use the character recognition system for a knowledge search engine whose users are school children, for whom handling the complicated Devanagari keyboard might be difficult, as might a transliteration facility for entering Marathi words with an English keyboard. We therefore allow the user to enter a search query as a captured image in which the region of interest is highlighted with a marker, so that the user can acquire more information through the knowledge search engine. The character recognition system recognizes the input, converts it into digital form, and passes it on to the knowledge search engine, enabling users to enter search queries in an easy and user-friendly manner. The developed character recognition system also has several standalone applications, such as digitization of documents and automating systems in which the ability to recognize text or numbers plays a crucial role (e.g., recognizing amounts written on checks, addresses on envelopes, or names, addresses, and phone numbers on forms), to list a few. The underlying algorithms and techniques we aim to develop are applicable to all these applications in general, and the approach can be used for multilingual character recognition as well.
REFERENCES
[1] Huimin Lu, Baofeng Guo, Juntao Liu, Xijun Yan, "A Shadow Removal Method for Tesseract Text Recognition," 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI 2017).
[2] Taeyoung Kim, Yu-Wing Tai, Sung-Eui Yoon, "PCA Based Computation of Illumination Invariant Space for Road Detection," 2017 IEEE Winter Conference on Applications of Computer Vision.
[3] H. El Bahi, Z. Mahani, A. Zatni and S. Saoud, "A robust system for printed and handwritten character recognition of images obtained by camera phone," WSEAS Transactions on Signal Processing, Volume 11, 2015.
[4] Sam S. Tsai, Huizhong Chen, David Chen, Vasu Parameswaran, Radek Grzeszczuk, Bernd Girod, "Visual Text Features for Image Matching," 2012 IEEE International Symposium on Multimedia.
[5] Annmaria Cherian, Sebastein, "Automatic Localization and Recognition of Perspectively Distorted Text in Natural Scene Images."
[6] Vidula T. V., Vrinda V. Nair, "A Robust Performance Evaluation Scheme for Rectification Algorithms in Camera Captured Document Images," 2014 ICCSC, Trivandrum.
[7] J. Sauvola and M. Pietikainen, "Adaptive document image binarization," Pattern Recognition 33(2), pp. 225-236, 2000. DOI: 10.1016/S0031-3203(99)00055-2.
[8] Y. Gurav, P. Bhagat, R. Jadhav and S. Sinha, "Devanagari Handwritten Character Recognition using Convolutional Neural Networks," 2020 International Conference on Electrical, Communication, and Computer Engineering (ICECCE), 2020, pp. 1-6, doi: 10.1109/ICECCE49384.2020.9179193.
[9] Shalini Puria, Satya Prakash Singh, "An efficient Devanagari character classification in printed and handwritten documents using SVM," Procedia Computer Science 152 (2019), pp. 111-121.
[10] Tripathy, Nilamadhaba, Tapabrata Chakraborti, Mita Nasipuri, and Umapada Pal, "A scale and rotation invariant scheme for multi-oriented character recognition," 2016 23rd International Conference on Pattern Recognition (ICPR), pp. 4041-4046, IEEE, 2016.
[11] Agrawal, Mimansha, Bhanu Chauhan, and Tanisha Agrawal, "Machine Learning Algorithms for Handwritten Devanagari Character Recognition: A Systematic Review," vol. 7 (2022), pp. 1-16.
[12] Ding, Chris, Ding Zhou, Xiaofeng He, and Hongyuan Zha, "R1-PCA: rotational invariant L1-norm principal component analysis for robust subspace factorization," Proceedings of the 23rd International Conference on Machine Learning, pp. 281-288, 2006.
[13] Badla, Sahil, "Improving the efficiency of Tesseract OCR Engine," 2014.
[14] Haldar, Rishin, and Debajyoti Mukhopadhyay, "Levenshtein distance technique in dictionary lookup methods: An improved approach," arXiv preprint arXiv:1101.1232, 2011.
Grenze International Journal of Engineering and Technology, June Issue
Abstract—Dental biometrics is a new field of study in the area of biometric identification. This technique can sometimes be used instead of the usual fingerprint biometric identification. In most cases, dental biometric credentials are useful for identifying a deceased person: comparing the dental biometrics recorded before death with those observed after death can help explain the cause of death and establish identity. Neural networks have the potential to learn such patterns through their complex structure. Here, X-ray images of an individual's dental structure are fed to the network, and with the help of an object detection platform such as OpenCV, the neural network detects and visualizes the teeth structure from the X-ray alone. Once the network can see the teeth, it can learn crucial details about them from the image, and interpretations of the tooth contours and the number of teeth are made along the way. The proposed methodology has better accuracy than comparable fuzzy clustering methods. We also suggest appropriate values of the parameters to be used with the algorithm.
I. INTRODUCTION
With the greater availability of digital medical data, expanding processing power, and advances in artificial intelligence, computer-aided diagnosis (CAD) has made tremendous progress during the previous two decades. CAD systems that aid radiologists and physicians in decision-making are used to solve a variety of medical problems, notably breast and colon cancer identification, lung disease classification, and brain lesion identification. Digital radiography's growing popularity encourages more research in the field. Radiographic image processing has now become a major target of automation in dentistry, as radiographic information is a vital part of diagnosis, dental health monitoring, and treatment planning. Several investigations have addressed the problem of teeth detection over the last decade. Several pixel-level techniques for tooth detection have been suggested based on classic computer vision techniques such as thresholding and histogram equalization. With enough recall (sensitivity), these can help the computer distinguish the teeth. On CT images, too, various techniques have been utilized, and a manual method for placing coordinates around each tooth has been developed.
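As an illustration of the classic pixel-level techniques mentioned above, histogram equalization can be written in a few lines of NumPy. This is a generic sketch of the standard technique, not the preprocessing code of this particular system.

```python
import numpy as np

def equalize_histogram(image):
    """Classic histogram equalization for an 8-bit grayscale
    radiograph: remap intensities through the normalized CDF so
    the output spreads over the full 0-255 range."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # first non-zero CDF value
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255).astype(np.uint8)
    return lut[image]                  # apply the lookup table
```

On a low-contrast X-ray this stretches the intensity range, which tends to make tooth boundaries easier for a subsequent detector (or thresholding step) to pick out.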
II. DATASET
From January 2016 to March 2017, 1,574 panoramic radiograph samples were randomly selected from the X-rays provided by the Reutov Stomatological Clinic, Russia. The database does not include any other features such as gender, age, or time. The tooth detection and identification models were trained on the training group, while the software's performance was verified on the testing group. The Sirona Orthophos XG-3 X-ray machine was used to capture all panoramic radiographs (Sirona Dental Systems GmbH, Bensheim, Germany). Ground truth annotations for the images were provided by five radiology professionals with varying levels of experience. The experts were instructed to draw bounding boxes around all teeth on the high-resolution panoramic radiographs. Because of the skewness in the data collection, completely anonymized data is used. The Steklov Institute of Mathematics in St. Petersburg, Russia, made a formal decision that the use of radiographic material for this work was exempt from ethics committee or IRB approval.
IV. MODELS
A. Deep Learning Models
The suggested approach makes use of deep learning techniques. Deep learning enables a computer program to extract and learn attributes from the input data so that it can handle previously unseen examples. Deep learning techniques stand out because they can learn directly from raw input data, such as the pixels of images, without the requirement for manual feature engineering [1]. One of the most popular deep learning techniques for image recognition is the deep CNN. To efficiently represent and learn hierarchical features at various levels of abstraction, CNN designs take advantage of unique properties of image data, such as spatial relationships between objects; see LeCun et al. for a thorough description of deep learning techniques.
B. UNET Model Neural Network
The encoder and decoder are the two essential parts of the introduced architecture. The encoder comprises several convolution layers, each followed by pooling layers, and is used to extract the image's features. The second part, the decoder, uses transposed convolutions to upsample the feature maps back toward the input resolution; it is again a network of joined fully connected layers.
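A quick way to see the encoder/decoder symmetry is to trace the spatial sizes through the network. The helper below is a hypothetical sketch that assumes 2x2 pooling at each encoder stage and 2x2 transposed convolutions at each decoder stage, with the 512x512 input this paper expects.

```python
def unet_shape_trace(size=512, depth=4):
    """Trace spatial sizes through a U-Net-style encoder/decoder.

    Each encoder stage halves the spatial size with 2x2 pooling;
    each decoder stage doubles it back with a 2x2 transposed
    convolution, so the output matches the input resolution.
    """
    encoder = [size // (2 ** i) for i in range(depth + 1)]
    decoder = [encoder[-1] * (2 ** i) for i in range(1, depth + 1)]
    return encoder, decoder
```

With a 512x512 input and four stages, the encoder path shrinks through 512, 256, 128, 64, 32 and the decoder path grows back through 64, 128, 256, 512, which is why the predicted mask has the same resolution as the input X-ray.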
C. Teeth Detection
The Faster R-CNN model is used in the teeth detection method. Faster R-CNN arose from the Fast R-CNN architecture, which in turn built on the R-CNN methodology (Region-based CNN). Finding regions of interest is a challenging part of object detection. R-CNN offered a unified strategy for both region-of-interest proposal generation and object localization. Fast R-CNN streamlined the pipeline and optimized the computation, improving R-CNN performance. Finally, Faster R-CNN presented a much more sophisticated, fully CNN-based method. Faster R-CNN is made up of two parts: the object detector and the region proposal network (RPN). The RPN proposes regions of interest, i.e., teeth in this case. The object detector makes use of these proposals to better localize and categorize the objects. Both modules produce feature maps, which are condensed versions of the source image, using the underlying CNN convolution layers. In contrast to standard computer vision algorithms, which demand hand-engineering of the features, the features are derived during the training phase.
By moving a window over the feature map and creating potential bounding boxes called "anchors" at each window point, the RPN creates region proposals. The RPN employs a regressor to refine each bounding box and estimates the likelihood that each anchor contains an object or background. The top N ranked region proposals are then sent to the object detection network [2]. The object detector refines the class value of each region to determine whether it is a tooth or background and generates the final bounding box coordinates for this two-class detection task. Model weights pre-trained on the ImageNet data set were used to initialize the base CNN. All of the CNN's layers were fine-tuned because the data set is sizable enough and differs enough from ImageNet. The learning rate was initially set at 0.001, with exponential decay following.
D. Teeth Numbering
The teeth are numbered by a convolutional architecture called VGG-16. The model was trained to estimate tooth numbers using the two-digit notation. This module categorizes the teeth using the output of the teeth detection module: based on the predicted bounding boxes, it crops the teeth, and each cropped image is then assigned a two-digit tooth number by the VGG-16 CNN. The classifier outputs a set of confidence scores for the 32 classes, estimating the likelihood that each bounding box contains one of the 32 possible tooth numbers. The classified output is then processed by a custom heuristic algorithm to improve prediction results, ensuring that each tooth position appears only once.
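The two-digit notation over 32 classes corresponds to the FDI tooth-numbering scheme. The mapping below is a hypothetical illustration that assumes class indices are ordered by quadrant and position; the classifier's actual label ordering may differ.

```python
def class_to_fdi(class_index: int) -> str:
    """Map a 0-31 classifier index to an FDI two-digit tooth number.

    FDI notation: first digit is the quadrant (1 = upper right,
    2 = upper left, 3 = lower left, 4 = lower right); second digit
    is the tooth position 1-8 counted from the midline.
    Assumed ordering: indices 0-7 -> quadrant 1, 8-15 -> quadrant 2, etc.
    """
    if not 0 <= class_index <= 31:
        raise ValueError("expected a class index in 0..31")
    quadrant = class_index // 8 + 1
    position = class_index % 8 + 1
    return f"{quadrant}{position}"
```

Under this assumed ordering, index 0 maps to tooth "11" (upper-right central incisor) and index 31 to "48" (lower-right third molar).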
As for teeth detection, the weights learned on the ImageNet dataset are used to initialize the CNN model. Cropped images for training were created from the annotated X-rays [4]. The cropping process was modified to include nearby structures, which improved the CNN's prediction quality by providing context. To further increase the variety of the data set, the images were augmented. The CNN was trained with a batch size of 64. The dental numbering module is developed in Python using the Keras library with TensorFlow as the backend.
Figure 2: Clear image of the input X-ray
The above Figure 4 shows the operations of the UNet model implemented by the program. The UNet model primarily ensures that the image already detected by the YOLO program is understood further. Images loaded into the model are expected at dimensions (512, 512) so that the model performs better.
Figure 5 gives the F1 score of the UNet model implemented above. The score implies that the model is very good at detecting the teeth in the X-ray images given to it; the model scored 95% on the F1 score calculated for this model.
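For binary segmentation masks, the F1 score reported above reduces to the Dice formula, 2*TP / (2*TP + FP + FN). A minimal NumPy sketch:

```python
import numpy as np

def f1_score_masks(pred, truth):
    """F1 (Dice) score between two binary segmentation masks:
    F1 = 2*TP / (2*TP + FP + FN)."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()   # predicted tooth, is tooth
    fp = np.logical_and(pred, ~truth).sum()  # predicted tooth, is background
    fn = np.logical_and(~pred, truth).sum()  # missed tooth pixel
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0
```

A score of 0.95 therefore means the predicted tooth pixels and the ground-truth tooth pixels overlap almost completely, with few spurious or missed pixels.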
Figure 6 displays the successfully predicted masks, i.e., the teeth, for the test images. The model now reads the provided X-ray images containing teeth, and this figure is proof that it detects the teeth successfully [5].
Figure 7 displays a contoured X-ray image from the test images given to the model. Contours are used here as an alternative view of the X-ray image: only the boundaries of the detected teeth are marked, providing a new perspective. Like the previous image, this figure is evidence that the model can detect the teeth in the X-ray image.
Figure 6: Successfully predicted masks
Figure 8 displays a detailed detection view of the teeth X-ray image given to the model as a test image. The pixel details of every tooth detected by the model are displayed next to it. This can significantly help technicians, as these details are now computed easily by the model once it detects and understands the X-ray image of the teeth given to it.
This figure also gives the total number of teeth the model has detected in the test X-ray image. Note that people usually have about 28 to 32 teeth, depending on whether some or all wisdom teeth have erupted; in some cases people may have fewer teeth because some have fallen out. All these cases make it important to determine the number of teeth in a person from the provided X-ray image.
VI. CONCLUSION
The Dental Biometric System's UNET model-based design is particularly effective at identifying the biometrics of the teeth. Panoramic radiographs can be a very effective tool to support patient diagnosis and to define a treatment plan. The use of segmentation models to detect teeth and their exact limits can be of paramount importance for eliminating a task that is quite susceptible to human failure. Biometric traits can be used for authentication and personal security [3, 6]. It is possible to collect even the pixel details and measurements of the teeth shown in the panoramic X-ray photographs. These findings are extremely helpful for forensic and dental research. Based on the F1 score, this model provides 95% accuracy. This study can be further refined by extending the UNET model to detect additional information about the teeth, such as their type and any surgical markers in them. The results obtained in this work are satisfactory and present paths for a better and more effective dental segmentation process.
REFERENCES
[1] Yetis, A.D., Yesilnacar, M.I., Atas, M. (2021). A machine learning approach to dental fluorosis classification. Arabian
Journal of Geosciences, 14(2):1- 12.
[2] L. Megalan Leo and T. Kalapalatha Reddy, “Learning compact and discriminative hybrid neural network for dental
caries classification,” Microprocessors and Microsystems, vol. 82, Article ID 103836, 2021.
[3] C. Muramatsu, “Tooth detection and classification on panoramic radiographs for automatic dental chart filing: improved
classification by multi-sized input data,” Oral Radiology, vol. 37, no. 1, pp. 13–19, 2021.
[4] M. Sujithra and G. Padmavathi, "Next generation biometric security system: An approach for mobile device
security", Proc. CCSEIT, pp. 371-381, 2012.
[5] Fariza A, Arifin AZ, Astuti ER, Kurita T, "Segmenting tooth components in dental x-ray images using Gaussian kernel-based conditional spatial Fuzzy C-Means clustering algorithm," International Journal of Intelligent Engineering and Systems, 2019.
[6] M. Sujithra and G. Padmavathi, "An Improved PCA based Zero Crossing Feature Extraction for Real-Time Biometric Iris Authentication in Low Power Resource Constrained Mobile Devices," International Journal of Applied Engineering Research, 2015.
Abstract—The core business of the banking sector is sanctioning loans to different individuals and industries. Credit risk analysis of these borrowers gives assurance of regular repayment: healthy business firms repay their loans regularly, thereby increasing the bank's return on investment. Current technology such as Big Data and various analytical tools make it possible to increase the accuracy of credit risk calculation. In our approach, along with traditional parameters such as profit/loss, financial history, the financial status of directors, and cash flow, we also include unformatted data such as news and informal information in the analysis. This information can be classified as positive, negative, or normal, and can be collected using Big Data techniques from company websites, news websites, government agencies, and external agencies. It is used to construct credit scoring models and to predict a borrower's creditworthiness and default risk. Given the uncertainty associated with judging a borrower's credit, it is necessary to add new tools and methods to maximize correctness. Our approach of applying Big Data analysis tools to informal sources available on the internet will increase the accuracy of finding good borrowers for banks.
Index Terms— Financial Analysis, Credit risk, Big Data, Data mining.
I. INTRODUCTION
The banking industry deals with capital flow and the risk associated with it. The overall performance and profit of a bank depend upon the repayment of the loans distributed to different sectors. A bank distributes loans to individuals and to businesses. An individual's credit can be calculated from income, tax paid, savings, and assets, but for business firms it is a complicated process. Many banks now use automated tools for risk calculation and credit determination. These tools take into consideration profit/loss, sales history, the financial status of promoters, cash flow, and other parameters, from which the bank calculates the credit level of the firm. The bank's success greatly depends upon its credit decisions. Banks are exposed to different kinds of risk, but the most challenging is credit risk. The performance of loan contracts affects the profitability and stability of a bank's growth and development. The extent to which a borrower uses the credit facility efficiently greatly impacts the firm's repayment ability and performance, which in turn affects the lending institution. Credit risk is the loss of a bank's profit when the customer does not adhere to his or her loan repayment commitment. Financial institutions face problems with loan proposals because of continuous changes in the business environment, credit regulations, marketing strategies, and competition in business
as highlighted by the author. The data generated by social sites is unformatted and is termed Big Data. Big Data tools are currently used to analyze data generated by social sites and the internet for credit risk analysis. Wenshuai Wu described the advantage of using Big Data analysis for credit risk calculation [13]. The complex methods and parameters involved in risk calculation always need advanced tools incorporated into the process, and the author focused on modern tools and future trends in this complex process of credit risk calculation. Because the volume of data is huge, independent research is carried out in our proposed system on the effect of social information on risk calculation. The information and news provided on social sites are valuable and can indicate the financial state of firms and their business domain. If this information is utilized in the calculation of credit risk, a bank may avoid a possible loss to its assets.
TABLE I. TRADITIONAL RISK FACTOR ANALYSIS

SR. NO. | RISK FACTORS | VARIABLES | DESCRIPTION
1 | Leverage and Solvency Indicators | Capital | Funds raised by the firm/borrower
  |  | Net worth | Capital + reserves
  |  | Debt-equity ratio | The proportion between the firm's total debt and total equity
2 | Liquidity Indicators | Current liabilities | Creditors, loans to be repaid within one year, provisions for taxes and expenses
  |  | Current assets | Cash in hand and bank balances, inventory of the firm
  |  | Current ratio | The proportion of a party's current assets to its current liabilities, giving a measure of the short-term liquidity of the firm
3 | Profitability Indicators | Sales | Sale of goods by the firm
  |  | Profit | Total sales minus total expenses
  |  | Net profit | Profit after depreciation of building/machinery and furniture
  |  | Profit-to-sales ratio | The percentage of profit to total sales for the year
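The indicators in Table I are simple arithmetic on balance-sheet figures. The helper below is an illustrative sketch; the function name and argument names are ours, not the paper's.

```python
def traditional_indicators(capital, reserves, total_debt,
                           current_assets, current_liabilities,
                           sales, expenses):
    """Compute the traditional risk indicators from Table I:
    leverage/solvency, liquidity, and profitability figures."""
    net_worth = capital + reserves
    profit = sales - expenses
    return {
        "net_worth": net_worth,
        "debt_equity_ratio": total_debt / net_worth,
        "current_ratio": current_assets / current_liabilities,
        "profit": profit,
        "profit_to_sales_pct": 100.0 * profit / sales,
    }
```

For example, a firm with capital 100, reserves 50, and total debt 300 has a net worth of 150 and a debt-equity ratio of 2.0, which a credit analyst would flag as relatively high leverage.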
The deductive credit scoring system awards points (weights) to particular relevant attributes of the credit parameters. The weighted attribute values are aggregated into a total score. The relevant attributes and their weights are determined by the credit decision-makers based on their experience. The bank applies a cutoff on the score for rejecting the loan or determining the risk associated with the credit sanctioned.
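The deductive scoring step can be sketched as a weighted sum compared against a cutoff; the attribute names, weights, and cutoff below are hypothetical examples of bank-configured values.

```python
def credit_score(attributes, weights, cutoff=60):
    """Deductive credit scoring: each relevant attribute gets a
    bank-assigned weight; the weighted points are aggregated into
    a total score and compared against the bank's cutoff."""
    total = sum(attributes[name] * weights[name] for name in weights)
    return total, ("approve" if total >= cutoff else "reject")
```

The bank tunes both the weights and the cutoff from the decision-makers' experience, so the same applicant scores can yield different decisions under different policies.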
The process flow diagram is shown in Figure 2. Basic data mining steps are not shown in the figure, but data cleansing and attribute pruning are necessary for proper application of data mining for prediction. Different algorithms can be implemented here; we used the decision tree technique for analysis of the regular parameters. Similarly, proper keywords suited to the firm's business are selected so that the Big Data tools used for filtering data from social sites can give a correct score. The Natural Language Toolkit is used for text tokenization of the data collected from social sites. Keywords are separated and different clustering techniques are applied to classify the information into positive, negative, and normal. These classes are given different weights by the bank depending upon its policies and processes. The total credit score is then calculated from the two stages presented and used to decide on loans. Our system is capable of indicating the status of a loan even after it is sanctioned, as it continuously monitors the internet for information: any change in domain policies or company business news is picked up by the proposed system to predict the condition of the sanctioned loan in future.
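The tokenize-and-classify stage can be illustrated with a pure-Python stand-in for the NLTK pipeline; the keyword sets below are hypothetical examples of what a bank might configure per business domain, and the whitespace tokenizer stands in for NLTK's tokenizers.

```python
# Hypothetical per-domain keyword lists a bank might configure.
POSITIVE = {"profit", "growth", "expansion", "award"}
NEGATIVE = {"loss", "fraud", "default", "strike"}

def classify_news(text):
    """Tokenize a news snippet and classify it as positive,
    negative, or normal by counting domain keywords (a simple
    stand-in for the NLTK tokenization + clustering stage)."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "normal"
```

Each class then receives a bank-assigned weight and feeds into the second stage of the total credit score described above.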
[Figure: comparison chart of the Existing System and the Proposed System]
V. CONCLUSIONS
Credit risk is a major risk parameter in the banking sector. Wrong calculation of the credit score leads to credit defaulters, and irregular repayment leads to poor bank health, which affects the interests of stockholders in the financial sector. A strong banking sector is essential for a good and healthy national economy, and banks are expected to give loans only after proper analysis of a firm's financial health. In this paper we have proposed a system that builds on the present automated systems using data mining techniques and adds social site information. The system takes informal and non-structured data into account for analysis; this data acts as a sensor for detecting the health of the firm and its business sector. Using this system, the bank can decide the credit limit, interest rate, and repayment capacity of the firm. The implementation using decision trees and Hadoop shows good results in the primary work carried out by us. The correctness of the system can be judged by evaluating it further using different evaluation metrics. This system not only helps the bank during the loan sanctioning process but also helps monitor the firm's capability to repay the loan during the loan period.
REFERENCES
[1] Kwaku D. Kessey, “Assessing credit risk management practices in the banking industry of ghana: processes and
challenges”, Global Journal of Management and Business Research, Vol. 15 Issue 6 Version 1.0 pp. 201-212, 2015.
[2] Hamid EslamiNosratabadi, SanazPourdarab and Ahmad Nadali, “A new approach for labeling the class of bank credit
customers via classification method in data mining”, International Journal of Information and Education Technology,
Vol. 1, No. 2, pp 151- 156, June 2011.
[3] Asrin KARIMI, “Evaluation of the Credit Risk with Statistical analysis”, International Journal of Academic Research in
Accounting, Finance and Management Sciences, Vol. 4, No.3, pp. 206–211, July 2014.
[4] Ms. A. R. Ghatge, Mr. P. P. Halkarnikar, “Estimation of credit risk for business firms of nationalized bank by neural
network approach”, International Journal of Electronics and Computer Science Engineering, Vol. 2, No. 3, pp. 828-
834, 2012.
[5] Maubi Andrew Mokaya,Dr. Ambrose Jagongo, “Corporate loan portfolio diversification and credit risk management
among commercial banks in kenya”, International Journal of Current Business and Social Sciences, Vol.1, Issue 2,
pp.81-111, 2014.
[6] Sudhakar M, Dr. C. V. K Reddy, "Two step credit risk assessment model for retail bank loan applications using decision tree data mining technique", International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), Vol. 5 Issue 3, pp. 705-718, March 2016.
[7] Sudhamathy G., "Credit Risk Analysis and Prediction Modelling of Bank Loans Using R", International Journal of Engineering and Technology (IJET), Vol. 8 No. 5, pp. 1954-1966, Oct-Nov 2016.
[8] Naoyuki Yoshino and Farhad Taghizadeh-Hesary,“A comprehensive method for the credit risk assessment of small and
medium-sized enterprises based on asian data”, ADBI Working Paper Series, December 2018.
[9] Khaled Alzeaideen, “Credit risk management and business intelligence approach of the banking sector in Jordan”,
Cogent Business & Management, 6:1, 1675455, DOI: 10.1080/23311975.2019.1675455, 2019.
[10] Anchal Goya, Ranpreet Kaur, “Loan Prediction Using Ensemble Technique”, International Journal of Advanced
Research in Computer and Communication Engineering, Vol. 5, Issue 3, March 2016.
[11] Somayeh Moradi, Farimah Mokhatab Rafiei, “A dynamic credit risk assessment model with data mining techniques:
evidence from Iranian banks”, Financial Innovation, 2019.
[12] Salihu, Armend; Shehu, Visar, “A Review of Algorithms for Credit Risk Analysis”, Proceedings of the ENTRENOVA -
ENTerprise REsearch InNOVAtion Conference, , IRENET - Society for Advancing Innovation and Research in
Economy, Zagreb, Vol. 6, pp. 134-146, 10-12 September 2020.
[13] Wenshuai Wu, “Credit Risk Measurement, Decision Analysis, Transformation and Upgrading for Financial Big Data”,
Hindawi Complexity, Vol. 2022, Article ID 8942773, https://doi.org/10.1155/2022/8942773, 2022.
Abstract—Certificates and grade reports are crucial records for anyone applying for a job or seeking higher education because they act as identity verification. The traditional paper-based certificate method makes obtaining such a crucial document highly time-consuming and expensive. A digital certificate, issued by a trusted authority, is a document that may be used to demonstrate authenticity. Technological development has made the production of fraudulent report cards and diplomas possible, and document fraud and forgery have gone undetected due to a lack of anti-forgery mechanisms. E-documents use digital signatures to enable authentication, integrity, and non-repudiation; however, forgery is possible when the key itself is compromised. To prevent certificate fraud and guarantee the security, legitimacy, and secrecy of diplomas, blockchain technology is deployed. Blockchain ensures the correctness and trustworthiness of information and allows for quick authentication of degree certificates.
I. INTRODUCTION
Universities provide certificates to graduates to demonstrate their qualifications once they successfully complete the chosen course. These mark sheets and degree certificates are the crucial records needed to apply for jobs and further education. Validation and verification of documents have grown in importance: it is important to confirm that a graduate's diploma is authentic and that the holder is the rightful owner with the appropriate authorization [1]. Traditional paper certificates need a lot of time and money, and they are susceptible to fraud brought on by blunders and forgery. Paper mark sheets involve a long procedure, offer little flexibility, and are not environmentally friendly. Moreover, the forging of certificates has increased as a result of the availability of sophisticated and affordable technologies, which endangers both the credential bearer and the university that granted the certificate [6]. This study suggests a system that uses blockchain technology to digitalize both the production of degree certificates and their verification. The mark cards are protected from fabrication and falsification by the immutable nature of blockchain technology [11].
More than any other invention this century, blockchain technology will significantly influence the way we live in the future; anyone who cannot comprehend it will soon feel left behind, awakening in a technologically advanced world that increasingly resembles magic. The development of various techniques to verify academic records such as degree mark cards has left people confused about which architecture is best for identifying real and forged mark cards [12]. Because blockchain has the very useful property of immutability, this project uses a permissionless blockchain, Ethereum, as the platform for a system that issues and validates degree certificates [16]. Traditional paper-based degrees and mark cards are susceptible to fraud due to typos and forgeries. Markcards
Step 2: Admin User logged in?
Yes, go to next step.
No, go to Step 9.
Step 3: Display dashboard with View Certificate and Issue Certificate actions for the full student list.
Step 4: Logout?
Yes, end the process. Stop.
No, go to next step.
Step 5: Click on View Certificate.
Step 6: Certificate available?
Yes, click on View Certificate. Go to next step.
No, click on Issue Certificate. Go to Step 8.
Step 7: Get the URL from the Certificate table and prompt a display window to download the certificate. Go to Step 2.
Step 8: Update CertificateAvailable to true in the database and call enroll(id, certificateHash) to record it on the blockchain. Go to Step 2.
Step 9: Company User logged in?
Yes, go to next step.
No, go to Step 16.
Step 10: Display dashboard with a Validate Certificate action for the student list.
Step 11: Logout?
Yes, end the process. Stop.
No, go to next step.
Step 12: Click on Validate Certificate, upload the certificate file, and find the md5sum of the image.
Step 13: ImageHash == blockChainStoredHash?
Yes, go to next step.
No, display a failure message. Go to Step 9.
Step 14: Display message "Validation Success" and display the profile page.
Step 15: Click on the dashboard. Go to Step 9.
Step 16: Logged in as Student User.
Step 17: Logout?
Yes, end the process. Stop.
No, go to next step.
Step 18: Request a certificate.
Step 19: Certificate issued by the university?
Yes, display the profile page. Go to next step.
No, wait for the certificate to be issued.
Step 20: Logout?
Yes, end the process. Stop.
No, go to next step.
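Steps 12 and 13 above reduce to computing the MD5 digest of the uploaded certificate image and comparing it with the hash recorded on the blockchain. A minimal sketch in Python; the file path and stored hash in the usage line are hypothetical:

```python
import hashlib

def md5sum(path: str) -> str:
    """Compute the MD5 digest of a file, reading it in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def validate_certificate(path: str, blockchain_stored_hash: str) -> bool:
    """Step 13: ImageHash == blockChainStoredHash."""
    return md5sum(path) == blockchain_stored_hash
```

Usage would look like `validate_certificate("cert.png", stored_hash)`, where `stored_hash` is the value fetched from the chain for that student's USN.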
The modules are as follows:
1. User Interface Design: After the user logs in, the first page visible is the dashboard. From the dashboard, the admin user can view certificates, issue certificates, and see the lists of students and companies. A student user can request and view his certificate, whereas a company user can validate and view a candidate's certificate. In general, the user interface has Student List, Company List, View Certificate, Issue Certificate, Request Certificate, and Validate Certificate modules, which are visible according to the role each user holds in the digital certificate system.
2. Verification: Verification follows steps to check data in the database through a database connector and in the blockchain through a Web3 connector. The database connector updates certificate availability in the User table and the certificate URL in the Certificate table. The Web3 connector adds student details to the blockchain and retrieves them from it. The purpose of verification is to determine, via these connectors, whether an uploaded certificate is valid.
3. Server: In the project, a Firebase database is used for storing the data. As shown in the figure, the User table stores student and user details such as name, email ID, USN number, etc.; each user is categorized by the user-role column, where role 1 is set for admin and roles 2 and 3 for company and student respectively. The Certificate table stores the data URL details for the image.
4. Blockchain: Ganache is a personal blockchain for Ethereum development, which can be used to deploy contracts, develop applications, and run tests. Records are stored as blocks, and each record contains a USN and a certificate hash.
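The enroll/lookup pair the Web3 connector exposes can be mimicked with an in-memory stand-in. The real system would call the contract deployed on Ganache (e.g. via web3.py), so everything below, including the class and method names, is an illustrative stand-in rather than the project's actual connector:

```python
from typing import Optional

class InMemoryCertificateRegistry:
    """Illustrative stand-in for the Web3 connector's two contract
    operations: enroll(usn, hash) writes a record to the chain and
    get_hash(usn) reads it back for validation."""

    def __init__(self) -> None:
        self._records = {}  # usn -> certificate hash

    def enroll(self, usn: str, certificate_hash: str) -> None:
        """Step 8: push (USN, certificate hash) onto the chain."""
        if usn in self._records:
            raise ValueError(f"certificate already enrolled for {usn}")
        self._records[usn] = certificate_hash

    def get_hash(self, usn: str) -> Optional[str]:
        """Fetch the stored hash for validation (Step 13)."""
        return self._records.get(usn)
```

The same two-method surface keeps the rest of the application unchanged whether the backing store is this dict or the deployed smart contract.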
As shown in Fig. 4.3 below, the data flow diagram for digital certification consists of three main entities: Admin, Student/Company, and Blockchain. When the admin clicks the View Certificate button, the system first checks the User table to see whether the certificate-available flag is true; if it is, the certificate URL is fetched from the Certificate table and passed to the admin for download. If no certificate is available for a particular student, the Issue Certificate button is shown instead. When the admin clicks Issue Certificate, the system puts the image URL into the Certificate table, updates the certificate-available column in the User table to true, and pushes the details to the blockchain using the enroll method with the USN and certificate hash. In the case of a company user, clicking Validate Certificate makes the system compare the hash of the newly uploaded certificate with the hash stored in the blockchain for that particular student; if the certificate is valid, the student's profile information is displayed. In the case of a student user, the student requests the certificate and waits for it to be issued; once issued, he is able to view and download it.
B. Sequence Diagram
When the user clicks the login button, the validate-user routine is called to check the user's email and password against the database. When the credentials match, login succeeds and the dashboard displays the list of students.
Fig. 1: Data Flow Diagram of Admin Fig. 2: Data Flow Diagram of Admin
Fig. 7: Student Profile
Fig. 8: Verification Status Page after validation of the marksheet
Fig. 9: Verification Status Page for an altered marklist
Fig. 10: Verification Data Flow
Fig. 11: User Validation Data Flow
Fig. 12: University/Company Certificate Validation Data Flow
The graphic below illustrates the four significant modules that make up the module flow diagram. These modules are combined to fulfil the needs of the proposed project:
1. User Interface Design
2. Server
3. Blockchain
4. Verification
The user interface design module comprises the front-end web application that lets students and universities register by filling in simple forms. Every time a user registers, the database of universities and students is updated. A student may request a certificate only from the specific university he or she attends. If the requester is a legitimate student, the university computes the hash of the certificate and uploads the requested document to the web server, after which the student can download the diploma. Because only valid transactions are committed to the blockchain, it stores the certificate's hash rather than the complete file. Prospective employers and other universities can hash the certificate produced by the student using the same process and compare the result with the hash stored in the blockchain for that certificate: the markscard is legitimate if the two hashes are identical; otherwise it is invalid.
Fig. 13: Architecture Design
The admin user is directed to a dashboard comprising the student list, company list, and a logout option. Under the student list, the action column is empty for a user who has registered but not requested any certificate; if a user has requested a certificate, the action column contains an Issue Certificate button, which is clicked to select that student's certificate from the college database and upload it. If a certificate has already been issued to a student, only a View Certificate button is visible. The admin can add new companies by navigating to the company list. A company user is directed to a dashboard comprising the student list, which contains a column that is either empty for a student who has no certificate yet or contains a Validate Certificate button if the student has one. When a student user logs into the system, he sees all the details given during registration along with a certificate row containing either a View Certificate button, if the certificate has been issued, or a Request Certificate button. When a student takes his certificate to a company, the company can simply log in, click the Validate Certificate button for that student, and be redirected to a validation page where the certificate supplied by the student is uploaded via the cloud. If the certificate matches the one issued by the admin/college, a "certificate valid" message pops up in the same window; otherwise a "certificate invalid" message indicates the certificate is forged or altered.
TABLE 1: TEST CASES
V. CONCLUSIONS
Any record stored in the blockchain repository cannot be changed, owing to the immutable nature of blockchain technology; the system thus offers security, integration, and authentication. Online access to digital degree certificates reduces costs and saves time, and the ability to obtain a replica of the original markcards or certificates online, in the event of loss or damage to the originals, offers a great deal of flexibility. Security follows from the fact that documents kept in the blockchain repository can be neither changed nor removed. The main beneficiaries of this system are universities, students, and recruiters, because it simplifies the process of creating and verifying certificates and marksheets. In conclusion, the proposed model prevents certificate fraud and falsification, and employers may be confident that they will receive accurate information from the blockchain repository. Digital certification also has huge scope across sectors of education such as medicine, engineering, pharmacy, and law, which can adopt it and save a great deal of manual work. It is essential for online courses: since digital certification prevents forgery, online course certifications will be more readily accepted by organisations. For example, a company-user creation option offered directly through the online portal helps more than the university creating those accounts itself; the university then only accepts an organisation's request and allows it to use digital certification if it meets certain criteria. Much more can be done with digital certificates; an online test could be incorporated so that a student receives the certification immediately after the test is approved. Digital certificates also have considerable future scope: with social distancing now a very important aspect of life, degree certificates and markscards can be obtained online.
REFERENCES
[1] M. Warasart and P. Kuacharoen, “Paper-based Document Authenticating using Digital Signature and QR Code,” no.
Iccet, 2012.
[2] Z. Chen, “Anti-Counterfeit Authentication System of Printed Information Based on A Logic Signing Technique”.
[3] Oliver Miquel, Moreno Joan, Prieto Gerson, Benitez David (2018): "Using Blockchain as a tool for tracking and verification of official degrees: business model," 29th European Regional Conference of the ITS.
[4] Juliana Nazare,Kim Hamilton Duffy, J. Philipp Schmidt “Digital Certificate Project” MIT Media Labs, 2015.
[5] Stephen Thompson “The Preservation of Digital Signatures on the Blockchain” University of British Columbia iSchool
Student Journal ,vol.3 (Spring 2017).
[6] Jayashri, N., Rampur, V., Gangodkar, D., Abirami, M., Balarengadurai, C., & Kumar, A. (2023). Improved block chain
system for high secured IoT integrated supply chain. Measurement: Sensors, 25, 100633.
[7] Osman Ghazali and Omar S. Saleh, “A Graduation Certificate Verification Model via Utilization of the Blockchain
Technology”, e-ISSN: 2289-8131 vol. 10 no. 3-2.
[8] X. Technologies, “Blockchain imperative for educational certificates,” Xanbell Technologies, 2017.
[9] MIT Media Lab Learning Initiative and Learning Machine, “Digital Certificates Projects.” [Online]. Available:
http://certificates.media.mid.edu/.
[10] R. Arenas and P. Fernandez, "CredenceLedger: A Permissioned Blockchain for Verifiable Academic Credentials," in IEEE International Conference on Engineering, Technology and Innovation (ICE/ITMC), Stuttgart, Germany, 2018.
[11] Ahmed Badr, Laura Rafferty, Quassy H. Mahmoud, Khalid Elgazzar, Patrick C.K. Hung “A Permissioned Blockchain-
Based System for Verification of Academic Records” in IEEE 2019.
[12] Sharma, I., & Sharma, S. (2022, November). Blockchain Enabled Biometric Security in Internet-of-Medical-Things (IoMT) Devices. In 2022 International Conference on Augmented Intelligence and Sustainable Systems (ICAISS) (pp. 971-979). IEEE.
[13] Neethu Gopal and Vani V Prakash, “ Survey on Blockchain Based Digital Certificate System,” IRJET, vol. 5, Issue: 11 |
Nov 2018.
[14] Nitin Kumavat, Swapnil Mengade, Dishant Desai, Jesal Varolia, “Certificate Verification System using Blockchain,”
IJRASET vol. 7 Issue IV, Apr 2019
[15] Tyagi, S., Ansari, N., Bisht, D., Kumar, R., Memoria, M., Awasthi, M., ... & Gupta, A. (2022, May). Role of IOT and
Blockchain in Achieving a Vision of Metropolitan’s Digital Transformation. In 2022 International Conference on
Machine Learning, Big Data, Cloud and Parallel Computing (COM-IT-CON) (Vol. 1, pp. 752-757). IEEE.
[16] T. Keerthana, R. Tejaswini, V. Yamini, K. Hemapriya, “Integration of Digital Certificate Blockchain and Overall
Behavioural Analysis using QR and Smart Contract”, IJRESM vol. 2, Issue-3, March 2019.
Grenze International Journal of Engineering and Technology, June Issue
Abstract—This paper proposes a miniaturized microstrip antenna (MMA). The design consists of a microstrip antenna with a radiating patch carrying modified U slots. A PTFE dielectric substrate is used, and a coaxial probe feed energises the MMA. HFSS software is used to model the design. The initial patch antenna without slots resonates at 6.65 GHz; after etching slots on the radiating patch, the MMA resonates at 3.85 GHz, so 66.46% miniaturization has been achieved. The MMA provides a maximum gain of about 5.54 dBi at 3.85 GHz. This design may be used for wireless applications.
I. INTRODUCTION
The design of miniaturized antennas has emerged as one of the most important issues for modern broadcasting systems due to rapidly expanding wireless networks. Owing to favorable characteristics including compactness, inexpensive manufacture, structural simplicity, and good compatibility with small electronic apparatus, microstrip antennas are the most suitable candidates for wireless applications [1-2]. To shrink an antenna's size, slots of various sizes and shapes have been inserted into the radiating patch, the ground plane, or both [3, 4]. S. Islam et al. report a small antenna for RFID applications [5]; slots of various sizes give it 32% compactness. It has been observed that employing slots in a multiband patch antenna reduced its size by 30% [6]. M. S. M. Ali et al. [7] demonstrate a small dual-band patch antenna with two mirror-image L-shaped slots, two slits, and a square slot, offering a compactness of 41.2%. A comparative study of a compact equilateral triangular patch antenna with various slot shapes reports a best compactness of 43.47% [8]. A small microstrip antenna for mobile communication is estimated to achieve 46.13% compactness by placing two irregular rectangular slots at the patch's edge [9]. By utilizing open-ended meandering slots in the ground plane, a rectangular microstrip antenna reports 83% compactness [10]. Using an H-shaped slot on the radiating patch and a U- and L-slot combination on the ground plane, a compactness of 86% is reported [11]. A circular patch antenna adds an open-ended slot to the radiating patch to increase compactness to 86.5% [12]. A miniaturization of 50% has been reached by only using defected ground structure
The dimensional parameters of the MMA (in mm):
a = 50, b = 50, c = 13, d = 17.75, e = 10.6, f = 7.6,
g = 10.6, h = 7.6, i = 8, j = 6.6, k = 1, l = 1.4
Figure.1: The layouts of the MMA Figure.2(x-z): The evolution of the MMA (Ant_A to Ant_C)
TABLE II: COMPARISON INFORMATION REGARDING DIFFERENT STEPS OF THE PROPOSED ANTENNA (columns: Steps, Resonant Frequency, Reflection Co-efficient, Gain, Compactness (%))
Figure.3: Comparison reflection co-efficient versus frequency plot for ANT_A to ANT_C
The MMA gives a resonant frequency of 3.85 GHz. The gain plot is given in Figure 7, and the radiation patterns (E and H plane) of the MMA are given in Figures 8 and 9. In this article, the initial rectangular patch antenna (Ant_A) without slots resonates at 6.65 GHz; after placing slots on the patch (Ant_C), the frequency shifts from 6.65 GHz to 3.85 GHz.
Figure 4(a) shows the surface current distribution of Ant_A and Figure 4(b) its vector current distribution; likewise, Figure 5(a) shows the surface current distribution of the proposed antenna and Figure 5(b) its vector current distribution. Red indicates maximum current density. In Figure 4(b), current flows from left to right across the entire radiating patch, but in Figure 5(b) the current path becomes lengthened and meandered due to the presence of the slots. As the current path increases, the effective electrical length increases and the resonant frequency decreases, so the frequency shifts from a higher to a lower range. This phenomenon is called miniaturization or compactness. The compactness of the proposed antenna has been calculated using equations (1)-(8).
Perimeter of Ant_A, resonating at 6.65 GHz: 2 × (17.75 + 13) mm = 61.5 mm ... (1)
Perimeter of the proposed antenna, to resonate at 3.85 GHz: (61.5 × 6.65)/3.85 = 106.22 mm ... (2)
Total length plus breadth of the proposed antenna: 106.22/2 = 53.11 mm ... (3)
Length of the proposed antenna, in proportion to the initial antenna: (17.75/30.75) × 53.11 = 30.65 mm ... (4)
Breadth of the proposed antenna, in proportion to the initial antenna: (13/30.75) × 53.11 = 22.45 mm ... (5)
Area of Ant_A: 17.75 × 13 mm² = 230.75 mm² ... (6)
Area of the proposed antenna: 30.65 × 22.45 mm² = 688.1 mm² ... (7)
Compactness: {(688.1 - 230.75)/688.1} × 100% = 66.46% ... (8)
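Equations (1)-(8) can be checked numerically. Carrying full precision instead of the paper's rounded intermediates gives a compactness of about 66.5%, matching the quoted 66.46% to within rounding:

```python
# Ant_A patch: 17.75 mm x 13 mm, resonating at 6.65 GHz
L, W = 17.75, 13.0
f_initial, f_target = 6.65, 3.85          # GHz

perimeter_a = 2 * (L + W)                 # eq. (1): 61.5 mm
perimeter_prop = perimeter_a * f_initial / f_target  # eq. (2): ~106.23 mm
half = perimeter_prop / 2                 # eq. (3): ~53.11 mm
length_prop = (L / (L + W)) * half        # eq. (4): ~30.66 mm
breadth_prop = (W / (L + W)) * half       # eq. (5): ~22.45 mm
area_a = L * W                            # eq. (6): 230.75 mm^2
area_prop = length_prop * breadth_prop    # eq. (7): ~688.4 mm^2
compactness = (area_prop - area_a) / area_prop * 100  # eq. (8): ~66.5 %
```

The small discrepancy against 66.46% comes entirely from the two-decimal rounding of the intermediate lengths in equations (4)-(7).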
Figure.6: Reflection co-efficient plot of the MMA    Figure.7: Gain plot of the MMA
Figure.8: The radiation patterns (E plane) of the proposed MMA    Figure.9: The radiation patterns (H plane) of the proposed MMA
V. CONCLUSION
This article presents a miniaturized microstrip antenna obtained solely by inserting slots on the radiating patch. The HFSS software simulation tool is used to model and simulate the proposed antenna. The design is simple and easy to fabricate, provides 66.46% miniaturization, and may be useful in wireless applications.
TABLE III: COMPARISON TABLE
Ref | Substrate area | Technique to achieve miniaturization | % of miniaturization
[5] | 37.7 × 28.4 | Slots on the patch | 32%
[6] | 24 × 24 | Slots on the patch | 30%
[8] | 120 × 120 | Slots on the patch | 43.47%
[9] | 24 × 30 | Slots on the patch | 46.13%
[13] | 22 × 20 | DGS on the ground plane | 50%
[14] | 60 × 60 | Koch fractal geometry | 45%
[19] | 45 × 25 | Slots and slits | 33%
Proposed MMA | 50 × 50 | Slots on the patch | 66.46%
REFERENCES
[1] J. S. Kuo and K. L. Wong, “A compact microstrip antenna with meandering slots in the ground plane,” Microwave Opt.
Technol. Lett. vol 29, pp. 95-97, April 20, 2001.
[2] C. L. Tang, H. T. Chen, and K. L. Wong, “Small circular microstrip antenna with dual frequency operation,” Electron.
Lett. Vol 33, pp. 1112–1113, June 19, 1997.
[3] H. Malekpoor and S. Jam, “Design of a multi-band asymmetric patch antenna for wireless applications”, Microwave
Opt. Tech. Lett. vol. 55, pp. 730–734, April, 2013.
[4] U. Kiran, V. R. M, R. M. Yadahalli, P. V. Hunagund and S. F. Farida, “Microstrip-line-fed rectangular microstrip
antenna with open end meandering slots in the ground plane for compact broadband operation” Microwave Opt Technol
Lett. Vol 49, pp. 824 – 827, April, 2007.
[5] K. L. Wong and K. P. Yang, “Compact dual-frequency microstrip antenna with a pair of bent slots,” Electron. Lett. Vol
34, pp. 225–226, Feb. 5, 1998.
[6] S. Islam and M. Latrach, “Design construction and testing of a compact size patch antenna for RFID applications”
Microwave Opt. Tech. Lett. vol. 55, pp. 2920–2925, 2013.
[7] M. S. M. Ali, S. K. A. Rahim, M. I. Sabran, M. Abedian, A. Eteng and M. T. Islam, "Dual band miniaturized microstrip slot antenna for WLAN applications," Microwave Opt. Tech. Lett., vol. 58, pp. 1358–1362, June 2016.
[8] S. Dasgupta, B. Gupta and H. Saha, “Compact equilateral triangular patch antenna with slot loading” vol. 56, pp. 268–
274, February, 2014.
[9] S. Chatterjee, U. Chakraborty, I. Sarkar, P. P. Sarkar, and S. K. Chowdhury, “A compact microstrip antenna for mobile
communication” India Conference (INDICON), Annual IEEE, 17-19 December, 2010, Kolkata, India.
[10] U. Kiran, V. R. M, R. M. Yadahalli, P. V. Hunagund and S. F. Farida, “Microstrip-line-fed rectangular microstrip
antenna with open end meandering slots in the ground plane for compact broadband operation” Microwave Opt Technol
Lett. Vol 49, pp. 824 – 827, April, 2007
[11] S. I. H. Shah, S. Bashir, A. Altaf, and S. D. H. Shah, "Compact multiband microstrip patch antenna using defected ground structure (DGS)," XIXth International Seminar/Workshop on Direct and Inverse Problems of Electromagnetic and Acoustic Wave Theory (DIPED), 22-25 September, 2014, Tbilisi, Georgia.
[12] K. Mondal, L. Murmu, and P. P. Sarkar, “Investigation on compactness, bandwidth and gain of circular microstrip patch
antenna” Devices for Integrated Circuit , 23-24 March, 2017, Kalyani, India.
[13] Hanae Elftouh, Naima A. Touhami, Mohamed Aghoutane, Safae El Amrani, Antonio Tazon and Mohamed Boussouis,
“Miniaturized Microstrip Patch Antenna with Defected Ground Structure” Progress In Electromagnetics Research C,
vol. 55, pp. 25–33, 2014.
[14] Il-Kwon Kim, Jong-Gwan Yook and Han-Kyu Park, “Fractal-shape small size microstrip patch antenna” Microwave and
Optical Technology Letters, vol. 34, no. 1, July 5 2002.
[15] Jeevani Jayasinghe, Omar Saraereh, Rajas Khokle and Karu Esselle, “Design and analysis of m-segment fractal
boundary antennas” vol.61, issue 9, pp. 2119-2125, 2019.
[16] Yang Cai, Zuping Qian, Wenquan Cao and Yingsong Zhang, “Research on the half complementary split-ring resonator
and its application for design” Microwave and Optical Technology Letters, vol. 57, no. 11, November 2015.
[17] D. R. Minervino, A. G. D'Assuncao and C. Peixeiro, "Mandelbrot fractal microstrip antennas," Microwave and Optical Technology Letters, vol. 58, no. 1, January 2016.
[18] Jai Mangal and L Abhinav Varma, "A Miniaturized rectangular slotted patch antenna for WiFi frequency range applications," IEEE 2nd International Conference on Applied Electromagnetics, Signal Processing and Communication (AESPC), Bhubaneswar, India, 26-28 November, 2021.
[19] Ziyang Li, Leilei Liu, Pinyan Li and Jian Wang, “Miniaturized design of CPW-Fed slot antennas using slits” 2017 Sixth
Asia-Pacific Conference on antenna and Propagation (APCAP), Xi'an, China, 16-19 October 2017.
134
[20] Sudipta Das, Parimal Chowdhury, Arindam Biswas, Partha Pratim Sarkar, and Santosh Kumar Chowdhury, “Analysis of
a Miniaturized Multiresonant Wideband Slotted Microstrip Antenna With Modified Ground Plane”, IEEE antennas and
wireless propagation letters, vol. 14, pp. 60-63, 2015
[21] Biswarup Rana, Soumen Banerjee, Priyasha Chatterjee, Ritam Banerjee, Rituparna Basak, “Design of a CPW-Fed Spiral
Ring-Loaded Miniaturized Slot Antenna” International Conference and Workshop on Computing and Communication
(IEMCON), Vancouver, BC, Canada, 15-17 October 2015.
[22] Balanis, C.A., “Antenna Theory: Analysis and Design”, John Wiley & Sons, Inc, 1997.
Grenze International Journal of Engineering and Technology, June Issue
Abstract—It is difficult to divvy up and keep track of resources amongst several entities. This is
particularly true for complex and ever-changing systems, such as those seen in cloud
computing, software engineering, and the Internet of Things (IoT). Providing safe access
control is crucial to the success of such a system. In particular, the safe, adaptable, and
granular handover of privileges from one entity to another. Here, we introduce a blockchain-
based multi-organizational delegation system. In our system, smart contracts on the blockchain
specify how the consortium's member organizations interact and how their shared resources
are divided up.
I. INTRODUCTION
Cloud computing is a method of remote, scalable resource provisioning that uses utility-based computing models; parallel computing, grid computing, and distributed computing are all realized on the cloud [1]. Users can access a shared pool of resources in the cloud and use them as needed under an "on-demand" model [2], whenever and wherever they like, thanks to the cloud's powerful computing capabilities and massive storage capacity. IT assets such as databases, servers, communication devices, networks, and software systems are housed in a cloud data centre. As more customers use the cloud, more servers and other hardware are needed to meet demand, and the creation of more physical nodes increases data-centre power usage. Today, 2% of all electricity used worldwide goes toward powering data centres; by 2030, projections show this will have reached 8%. Data centres have three major power users: servers, data-centre networks, and cooling systems. The network uses 10%-25% of the energy, the cooling systems 15%-30%, and the servers 40%-55% [3].
Computing resources such as RAM, CPU, network, and storage are provided by IaaS (Infrastructure as a Service), and their use is typically governed by Service Level Agreements (SLAs). Resource usage also drives energy requirements, and inefficient use of resources is one cause of a data centre's energy insufficiency [4]: even at 10% CPU utilization, energy consumption can exceed 50% of the maximum. This is where IaaS virtualization techniques come into play, helping maximize the usefulness of a given cloud's assets [5]. With the shared resources made available by virtualization, VMs can take the place of PMs in processing user requests. Separating virtual machines (VMs), moving VMs, and merging VMs are all possible with virtualization. VM migration is a method for moving active virtual machines from one physical host to
II. RESEARCH METHODOLOGY
We propose automatic software cloud resource allocation utilizing a permissioned blockchain: a policy-based, autonomic middleware that enables self-adaptive data management in clouds. The proposal combines three highly sought-after elements: (i) software cloud resource allocation is monitored in real time, and the collected and aggregated metrics (such as write latency, read latency, uptime, and free memory) are secured on a blockchain for privacy and integrity; (ii) data management decisions are made according to which cloud service is best suited to satisfying the service level agreements (SLAs), and data is transmitted securely; and (iii) the cloud storage setup is automatically re-configured, based on simple, reusable, and extendable configuration policies, so a human operator is no longer needed to monitor and manually re-configure it.
III. WORKING
Each node in a distributed ledger maintains a chain of records called a blockchain; S. Nakamoto proposed the solution to the consensus problem for the Bitcoin network. Each block in the blockchain, with the exception of the initial (genesis) block, includes the hash of the prior block, as illustrated in Figure 2. The former block is always produced ahead of the latter, and each block contains transactions: logs of actions taken on the blockchain, such as the transfer of assets. Figure 2 further elucidates how a blockchain is created. As shown in the diagram, a transaction originating at Node0 is first signed by the user with his private key; the digital signature improves security and data integrity, and the transaction can be traced using the user's public key. Afterward, Node0's immediate neighbors (Node1 and Node2) receive the transaction broadcast.
Node1 and Node2 ensure the broadcast transaction follows the transaction protocol before broadcasting it to Node3 and Node4; a transaction that does not follow the protocol is dropped.
Each network should make clear to all participants, before the blockchain is even created, what protocol will be used for transactions; the transaction protocol's primary goal is to maintain network order in the blockchain.
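The chaining described above, where every block after the genesis block carries the hash of its predecessor, can be sketched minimally; the SHA-256/JSON serialization choice here is an illustrative assumption, not the paper's implementation:

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block's canonical JSON serialization."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def new_block(transactions: list, chain: list) -> dict:
    """Append a block whose prev_hash is the hash of the last block
    (the genesis block carries no predecessor hash)."""
    prev = block_hash(chain[-1]) if chain else None
    block = {"index": len(chain), "prev_hash": prev,
             "transactions": transactions}
    chain.append(block)
    return block

def verify_chain(chain: list) -> bool:
    """Tampering with any block breaks every prev_hash link after it."""
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))
```

Because each `prev_hash` commits to the entire content of the prior block, altering one transaction invalidates the whole suffix of the chain, which is the immutability property both papers rely on.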
IV. CONCLUSION
This paper first examines the privacy and security concerns surrounding edge-computing-enabled IoT, then describes the features of blockchains that make them well suited to IoT applications. A common framework was suggested for all Internet of Things (IoT) use cases that involve blockchain technology and edge computing, and the entire process of a transaction was laid out in detail under the proposed framework. Additionally, the edge-computing resource allocation problem was addressed by developing a smart contract in a private blockchain network that utilized the cutting-edge Asynchronous Advantage Actor-Critic reinforcement learning algorithm. In particular, the suggested method improves on state-of-the-art edge-computing resource allocation techniques by catering to various service users and differentiating between their Quality of Service (QoS) needs; this is an example of how AI and blockchains can work together. Simulation results were presented to demonstrate the efficiency of the proposed resource allocation system for edge computing. Joint optimization of blockchain settings and edge-computing resource allocation is something we plan to investigate further in future work.
REFERENCES
[1] A. Al-Fuqaha, M. Guizani, M. Mohammadi, M. Aledhari, and M. Ayyash, “Internet of things: A survey on enabling
technologies, protocols, and applications,” IEEE Communications Surveys & Tutorials, vol. 17, no. 4, pp. 2347–2376,
2015.
[2] M. M. Rathore, A. Ahmad, A. Paul, and S. Rho, “Urban planning and building smart cities based on the internet of things
using big data analytics,” Computer Networks, vol. 101, pp. 63–80, 2016.
[3] C. Qiu, X. Wang, H. Yao, J. Du, F. R. Yu, and S. Guo, "Networking integrated cloud-edge-end in IoT: A blockchain-assisted collective Q-learning approach," IEEE Internet of Things Journal, 2020.
[4] C. Qiu, F. R. Yu, H. Yao, C. Jiang, F. Xu, and C. Zhao, "Blockchain-based software-defined industrial internet of things: A dueling deep Q-learning approach," IEEE Internet of Things Journal, vol. 6, no. 3, pp. 4627–4639, 2018.
[5] P. Garcia Lopez, A. Montresor, D. Epema, A. Datta, T. Higashino, A. Iamnitchi, M. Barcellos, P. Felber, and E. Riviere,
“Edge-centric computing: Vision and challenges,” ACM SIGCOMM Computer Communication Review, vol. 45, no. 5,
pp. 37–42, 2015.
[6] J. Du, L. Zhao, J. Feng, and X. Chu, “Computation offloading and resource allocation in mixed fog/cloud computing
systems with min-max fairness guarantee,” IEEE Transactions on Communications, vol. 66, no. 4, pp. 1594–1608, 2018.
[7] S. Shen, Y. Han, X. Wang, and Y. Wang, “Computation offloading with multiple agents in edge-computing–supported
iot,” ACM Transactions on Sensor Networks (TOSN), vol. 16, no. 1, pp. 1–27, 2019.
[8] X. Wang, Y. Han, V. C. Leung, D. Niyato, X. Yan, and X. Chen, “Convergence of edge computing and deep learning: A
comprehensive survey,” IEEE Communications Surveys & Tutorials, vol. 22, no. 2, pp. 869–904, 2020.
[9] X. Wang, C. Wang, X. Li, V. C. Leung, and T. Taleb, “Federated deep reinforcement learning for internet of things with
decentralized cooperative edge caching,” IEEE Internet of Things Journal, 2020.
139
Grenze International Journal of Engineering and Technology, June Issue
Abstract— Sharing moments on social media is now commonplace. By communicating thoughts, ideas, and enjoyable experiences through text, we can express our feelings in few words. To investigate people's opinions, sentiments, and emotions, businesses may, for instance, target YouTube as an abundant source of data. Emotion analysis often makes a deeper comprehension of an author's feelings possible. Almost all prior work evaluating Telugu social media has focused on labeling expressions as positive, negative, or neutral. In this paper, we categorize expressions into groups based on the emotions of happiness, anger, fear, disgust, and sadness. Various approaches have been used to automatically recognize textual emotions in other languages, but few of them were based on deep learning. We describe the system we used to classify the feelings expressed in Telugu YouTube comments. For the sentence classification task, our model uses XLM-RoBERTa and Multilingual BERT, fine-tuned on our dataset with pretrained word vectors. We compared the outcomes of our method with those of other machine learning techniques. Our deep learning architecture is an end-to-end network built on word, phrase, and document vectorization procedures. The proposed deep learning strategy was tested on the Telugu YouTube comments dataset, and the results were promising compared to more traditional machine learning methods.
I. INTRODUCTION
As social media has become more popular, internet users can now voice their opinions on a wide range of
subjects. Social networking sites are increasingly being used for a variety of activities, such as the advertising of
products, the sharing of news, and the recognition of achievements.
Emotion analysis, often known as opinion mining, is the study of how to infer from textual data how individuals
feel about a particular thing, person, or organization.
Market analysis, e-commerce, social media monitoring, and many more areas are examples of contemporary
applications for emotion analysis. Telugu is the fifteenth most frequently spoken language in the world, with
more than 75 million native speakers. The creation of a technique for Telugu text emotion analysis will benefit
several people and organizations.
Everyday life brings us into contact with a variety of events, which leads to the formation of opinions regarding those occurrences. A person's emotions are strong feelings they have in reaction to their circumstances or interpersonal relationships. Emotion has a big impact on consumer decision-making in many areas, such as e-commerce, restaurants, movies, interests, and satisfaction with a service or a product. Additionally, it affects our health. Using Facebook's recently revised reactions feature, users can now voice their opinions about a comment, picture, or event; the reactions include angry, happy, love, and surprise.
In academic circles, emotional analysis is seen as a kind of higher, more developed version of sentiment
analysis. Sentiment analysis is used to classify texts (posts, words, or documents) as neutral, positive, or
negative. Emotional analysis, on the other hand, is a more extensive and in-depth investigation of user emotions
with the goal of examining the psychology of various user behaviors and illuminating deeper human emotional
meanings including anger, disgust, trust, grief, delight, and surprise.
The English language has a good reputation in the field of emotion detection, including the accessibility of
datasets and dictionaries, in contrast to Telugu, which has a dearth of resources.
In this study, we investigate automatic emotion recognition for the Telugu language using Multilingual BERT in four steps: word, sentence, and document vectorization, and classification. To display the performance and precision that deep learning has attained so far, we also compared this methodology to other machine learning techniques. We applied our techniques to analyze user sentiment in the YouTube comments dataset.
F. Linjian Li (2021) A Novel Emotion Lexicon for Chinese Emotional Expression Analysis on Weibo: Using
Grounded Theory and Semi-Automatic Methods
The model's limitations include the methodology that was utilized (ALO and SC-LIWC) and the metrics that were used to generate the output (precision, recall, and F1). Only users of China's Weibo were surveyed for this dataset, and the lexicon does not record the strength of the relationship between each word and its emotion category.
G. Chang Liu, Taiao Liu, Shuojue Yang, and Yajun Du (2021)
Individual Emotion Recognition Approach Combined Gated Recurrent Unit with Emoticon Distribution Model.
This paper proposes a model called semantic emoticon emotion recognition (SEER). First, the input text is divided into four categories using an emotion dictionary and emoticons. Second, it is processed by a bidirectional gated recurrent unit (Bi-GRU) network with an emotion-vector-capturing attention mechanism. Third, an emoticon distribution model is constructed to obtain emotion vectors from various social network data. Fourth, the emoticon emotion characteristics in the text are combined with the text's semantic emotional components using fusion weights that depend on the type of input short message. Based on the resulting emotion vector, the short-text emotions are finally divided into six categories.
H. Bharathi Raja Chakravarthi (2021) Dataset for identification of homophobia and transphobia in
multilingual YouTube comments.
This paper describes the process of building the dataset, qualitative analysis of data, and inter-annotator
agreement. In addition, we create baseline models for the dataset.
I. Ferdous Ahmed (2020) Emotion Recognition from Body Movement
The methodologies applied were SVM, LDA, GNV, DT, and KNN. The metrics used to obtain the output were F-score, p-score, and accuracy. The model's limitation is a marginal drop in performance across the board in action-independent cases.
J. Zishan Ahmad, Raghav Jindal, Asif Ekbal and Pushpak Bhattacharyya (2020)
Borrow from rich cousin: transfer learning for emotion detection using cross lingual embedding. Expert Systems
with Applications.
This paper maps disaster-domain sentences in Hindi to emotions, and a dataset of disaster-domain sentences was created. The models used are CNN and Bi-LSTM (Bidirectional Long Short-Term Memory). For Hindi emotion categorization, the neural networks are trained on the available datasets, and the weights are then adjusted using one of four transfer learning techniques.
K. Zhenzhong Lan, Mingda Chen, Piyush Sharma, and Radu Soricut (2019)
ALBERT: A Lite BERT for Self-supervised learning of language representations.
For BERT to use less memory and train more quickly, we provide two parameter-reduction strategies. Detailed
empirical data demonstrates that our suggested methods produce models that scale far better than the original
BERT. We also employ a self-supervised loss that emphasises modelling inter-sentence coherence, and we
demonstrate that it consistently facilitates tasks that require multi-sentence inputs.
L. Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary (2019)
Unsupervised cross-lingual representation learning at scale.
Using more than two terabytes of filtered CommonCrawl data, we train a Transformer-based masked language
model on 100 different languages. On a number of cross-lingual benchmarks, including +14.6% average
accuracy on XNLI, +13% average F1 score on MLQA, and +2.4% F1 score on NER, our model, called XLM-R,
greatly surpasses multilingual BERT (mBERT). Low-resource languages are where XLM-R excels, with
Swahili's XNLI accuracy increasing by 15.7% and Urdu's by 11.4% over earlier XLM models. The trade-offs
between (1) positive transfer and capacity dilution and (2) the performance of high and low resource languages
at scale are among the important aspects that must be considered in order to accomplish these advantages, and
we also give a thorough empirical study of these factors.
M. Stephen Merity, Nitish Shirish Keskar, and Richard Socher (2019)
An analysis of neural language modeling at multiple scales.
We provide a model architecture and training method based on the Quasi-Recurrent Neural Network (QRNN) that, when applied to the WikiText-103 dataset, achieves state-of-the-art performance while being roughly twice as fast as an NVIDIA cuDNN LSTM-based model, using longer sequences within batches and softmax with weight tying.
N. Jeremy Howard and Sebastian Ruder (2019)
Universal language model fine-tuning for text classification.
We describe strategies that are essential for fine-tuning a language model and propose Universal Language
Model Fine-tuning (ULMFiT), a powerful transfer learning method that may be used for any NLP application.
On six text classification tasks, our approach greatly exceeds the state-of-the-art, lowering the error on most
datasets by 18–24%. Furthermore, it matches the performance of training from scratch on 100x more data with
only 100 labeled instances. Our pre-trained models and code are publicly available.
O. Vinay Kumar Jain, Shishir Kumar, and Steven Lawrence Fernandes (2019)
Extraction of emotions from multilingual text using intelligent text processing and computational linguistics.
Every word of emotion in a tweet is significant in decision-making, hence an efficient pre-processing technique has been utilized to maintain the significance of multilingual emotional words. The Naive Bayes algorithm and Support Vector Machine (SVM) are used to classify tweets' sentiments at a fine-grained level.
The sentence pairs may correspond to sentences that were adjacent in the original text, although this is not always the case. The model must then decide whether or not the two sentences appeared directly after one another in the text. If you have access to a dataset with labeled sentences, you can use the features the BERT model produces as inputs to train a standard classifier. The classifier learns from the labeled sentences, and the model develops an internal representation of the languages included in the training set. From this representation of the languages, features that are useful for downstream tasks can then be extracted.
A. Algorithm: XLM-RoBERTa
1. Import XLMRobertaTokenizer and XLMRobertaForSequenceClassification from transformers.
2. The model is named xlm-roberta-base.
3. The tokenizer is created with XLMRobertaTokenizer.from_pretrained(MODEL_TYPE).
4. The module is downloaded completely.
5. Check the size of the vocabulary.
6. Verify whether the special tokens are present.
7. The model inputs are:
input_ids (type: torch tensor)
attention_mask (type: torch tensor)
labels (type: torch tensor)
8. The first input is 'input_ids'. It represents the sentences as sequences of tokens.
9. The second is 'token_type_ids'.
10. The third is 'attention_mask'. It has the same length as 'input_ids' and tells the model which tokens in 'input_ids' are real and which are padding.
11. A '1' indicates a token or special word, and a '0' indicates padding.
12. The inputs also include 'labels'.
13. A tokenizer is used to create XLM-RoBERTa input for both one and two input sentences.
14. The sequence of tokens is decoded.
15. Truncated tokens are returned in a list called overflowing_tokens.
16. The data is loaded.
17. Folds are created as required for training and testing.
18. The required folds are displayed.
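Steps 7–11 above describe the tensors the model consumes. The following is a minimal sketch of what the tokenizer produces, written without downloading xlm-roberta-base; the helper name is hypothetical, and the pad id of 1 (XLM-RoBERTa's `<pad>` id) is an assumption stated here for illustration:

```python
def build_model_inputs(token_id_seqs, pad_id=1, max_len=None):
    """Pad variable-length token-id sequences and build attention masks.

    Mimics the 'input_ids'/'attention_mask' pair described above:
    1 marks a real token, 0 marks padding (assumed pad id: 1, as in XLM-R).
    """
    if max_len is None:
        max_len = max(len(s) for s in token_id_seqs)
    input_ids, attention_mask = [], []
    for seq in token_id_seqs:
        n_pad = max_len - len(seq)
        input_ids.append(list(seq) + [pad_id] * n_pad)
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask
```

In practice the transformers tokenizer returns these fields directly when called with padding enabled; the sketch only makes the 1/0 masking convention of steps 10–11 concrete.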
IV. IMPLEMENTATION
In this part we outline the data that was used, as well as our methodologies, to recognise emotions in Telugu YouTube comments using the deep learning approach XLM-RoBERTa.
The three modules of the project for implementation are:
a. Dataset creation
b. Training dataset
c. Testing dataset
A. Dataset Creation
The dataset of Telugu YouTube comments was provided and used to train the model on an ordinal classification task based on the intensity of feeling: given a comment together with an emotion E, the task is to classify the comment into one of five ordinal classes of intensity of E. Comments were added to the dataset for each of the emotions anger, fear, joy, disgust, and sadness, for a total of one thousand comments.
Our dataset was split into two sets: 500 comments made up the training set, and 100 comments made up the testing set, with 90% of the dataset used for training and 10% for testing. The test dataset was used only to evaluate the created model and give an indication of how well the trained model works. The training dataset was used to fit the classifier and optimise its parameters. The model was not given access to the test dataset.
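The train/test split described above can be sketched as a shuffled partition; this is a hypothetical helper, not the authors' code, and the seed and test fraction are illustrative:

```python
import random

def split_dataset(comments, test_fraction=0.1, seed=42):
    """Shuffle and split comments into disjoint train/test sets.

    With test_fraction=0.1 this reproduces the 90/10 split described
    in the text; the seed fixes the shuffle for repeatability.
    """
    rng = random.Random(seed)
    shuffled = comments[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```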
Fig1: Block diagram
B. Training Dataset
Data preprocessing is done first. Because our dataset was in Telugu, we had to perform some specialized pre-processing in order to identify the most effective pattern for training the dataset. We followed these steps:
i. Standardize the writing of characters that can be written in a variety of different ways into a normal form.
ii. Remove all diacritics.
iii. Remove all punctuation marks.
iv. Remove repeated characters: when describing an action such as laughing, YouTube users frequently repeat a character to highlight and accentuate their meaning ("Hahaha", "Wow", "Oh no", etc.). Assuming that a word needs at most two instances of a repeating character, we eliminated all further occurrences.
In addition, we have the option of including a step that removes stop words from the input text. Stop words include prepositions, conjunctions, and other similar words.
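Steps iii and iv, together with the optional stop-word removal, can be sketched as follows. This is an illustrative helper, not the authors' exact pipeline; the repeated-character regex operates on any Unicode text, including Telugu:

```python
import re
import string

def preprocess(text, stopwords=frozenset()):
    """Apply steps iii and iv plus optional stop-word removal."""
    # Step iii: remove all punctuation marks.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Step iv: collapse any character repeated three or more times
    # down to exactly two occurrences ("Hahahaaaa" -> "Hahahaa").
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    # Optional step: drop stop words.
    tokens = [t for t in text.split() if t not in stopwords]
    return " ".join(tokens)
```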
C. Testing the Dataset
The dataset is divided into two parts: 80 percent of the data is used for training and 20 percent for testing. The trained model is tested with different algorithms, namely XLM-RoBERTa and Multilingual BERT. To test the dataset, the necessary Python libraries for Colab code execution are imported and the dataset is loaded as a pandas data frame. We used seaborn's countplot to count the various emotions. The task is to find the machine learning algorithm with the best accuracy.
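The emotion-counting step can be sketched without the plotting layer; the labels below are hypothetical stand-ins for the dataframe's label column, and seaborn's countplot renders exactly these tallies as a bar chart:

```python
from collections import Counter

# Hypothetical emotion labels as they would appear in the label column
# of the pandas data frame loaded above.
labels = ["happy", "disgust", "happy", "anxiety", "neutral", "happy"]

# Tally each emotion; seaborn.countplot(x="emotion", data=df) draws
# the same counts per category.
emotion_counts = Counter(labels)
```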
V. RESULTS & DISCUSSION
This model is trained using the XLM-RoBERTa and Multilingual BERT algorithms with around 600 Telugu sentences mapped to the emotions happy, neutral, disgust, and anxiety. The model gives an accuracy of 77 percent for the XLM-RoBERTa algorithm and 53 percent for Multilingual BERT.
VI. CONCLUSION
This study attempted to classify comments made on social media. We applied the XLM-RoBERTa and Multilingual BERT strategies, obtaining a macro-averaged F1 score of 0.77 for XLM-RoBERTa and 0.53 for Multilingual BERT; the XLM-RoBERTa method outscored all other models. Overall, the models identify emotions like anxiety, happiness, neutrality, and disgust, but they are far less accurate in classifying more complex emotions like fear, anger, and melancholy. To enhance the performance of the models, alternative strategies, such as genetic algorithm-based ensembling, can be tested in the future.
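The macro-averaged F1 score reported above weights every emotion class equally, regardless of class size. A minimal sketch of the metric follows; it is a generic implementation, not the authors' evaluation script:

```python
from collections import defaultdict

def macro_f1(y_true, y_pred):
    """Macro-averaged F1: per-class F1 scores averaged with equal weight."""
    labels = set(y_true) | set(y_pred)
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1   # predicted p, but it was not p
            fn[t] += 1   # true class t was missed
    f1_scores = []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1_scores.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1_scores) / len(f1_scores)
```

Because rare classes count as much as common ones, a model that ignores the harder emotions (fear, anger, melancholy) is penalized, which is why macro F1 is a fitting summary here.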
REFERENCES
[1] Emotion Analysis in Man-Machine Interaction Systems – T.Balomenos 2022
https://link.springer.com/chapter/10.1007/978-3-540-30568-2_27
[2] Multimodal emotion recognition using deep learning SMSA -Abdullah 2022
https://scholar.google.com/scholar?cluster=11062434886599925582&hl=en&as_sdt=0,5
[3] Emotion Analysis in Tamil - Omkar Gokhale, Shantanu Patanka, Onkar Litake, Aditya Mandke, Dipali Kadam 2022
https://scholar.google.com/scholar?cluster=11062434886599925582&hl=en&as_sdt=0,5
[4] Emotion detection in roman Urdu text using machine learning - A Majeed 2022
https://dl.acm.org/doi/abs/10.1145/3417113.3423375?casa_token=fS0ijtmLfAIAAAAA:a0hgzLiIfWlAYnp3E5x5fvxZ9
TAX GgYBBZ_XmDBI0xiY0NU1nfJvK5xXkwfMGPTguPBNBwrfb4GmjBQ
[5] Sentiment Analysis and Emotion Detection on Cryptocurrency Related Tweets Using Ensemble LSTM-GRU Model -
NAILA ASLAM, FURQAN RUSTAM, ERNESTO LEE, PATRICK BERNARD WASHINGTON, AND IMRAN
ASHRAF 2022 https://ieeexplore.ieee.org/abstract/document/9751065
[6] A Novel Emotion Lexicon for Chinese Emotional Expression Analysis on Weibo: Using Grounded Theory and Semi-
Automatic Methods - LIANG XU, LINJIAN LI, ZEHUA JIANG, ZAOYI SUN, XIN WEN, JIAMING SHI, RUI SUN,
AND XIUYING QIAN 2021 https://ieeexplore.ieee.org/abstract/document/9139939
[7] Emotion Recognition from Body Movement - FERDOUS AHMED, A. S. M. HOSSAIN BARI, AND MARINA L.
GAVRILOVA 2020 https://ieeexplore.ieee.org/abstract/document/8945309
[8] Individual Emotion Recognition Approach Combined Gated Recurrent Unit with Emoticon Distribution Model -
CHANG LIU,TAIAO LIU, SHUOJUE YANG,AND YAJUN DU 2021
https://ieeexplore.ieee.org/abstract/document/9597507
[9] Borrow from rich cousin: transfer learning for emotion detection using cross lingual embedding. Expert Systems with
Applications - Zishan Ahmad, Raghav Jindal, Asif Ekbal, and Pushpak Bhattachharyya. 2020.
https://www.sciencedirect.com/science/article/pii/S0957417419305536?casa_token=kEwcg0YM4DIAAAAA:Gxkl7Wj
7_hbtk7vdEPDEwzd7eqgnW_-4xRCl5c8PxV0GRulYhpHcieOkW895-482sC5rtYWEyiOO
[10] Dataset for identification of homophobia and transphobia in multilingual YouTube comments. - Bharathi Raja
Chakravarthi, Ruba Priyadharshini, Rahul Ponnusamy, Prasanna Kumar Kumaresan, Kayalvizhi Sampath, Durairaj
Thenmozhi, Sathiyaraj Thangasamy, Rajendran Nallathambi, and John Phillip McCrae 2021
https://arxiv.org/abs/2109.00227
[11] ALBERT: A Lite BERT for Self-supervised learning of language representations. - Zhenzhong Lan, Mingda Chen,
Sebastian Goodman, Kevin Gimpel, Piyush Sharma, and Radu Soricut 2020
/group?id=ML_Reproducibility_Challenge/2020
[12] Unsupervised cross-lingual representation learning at scale. Alexis Conneau, Kartikay Khandelwal, Naman Goyal,
Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, and Veselin
Stoyanov. 2019. https://arxiv.org/abs/1911.02116
[13] An analysis of neural language modeling at multiple scales. - Stephen Merity, Nitish Shirish Keskar, and Richard
Socher. 2019. https://arxiv.org/abs/1803.08240
[14] Universal language model fine-tuning for text classification. - Jeremy Howard and Sebastian Ruder. 2019.
https://arxiv.org/abs/1801.06146
[15] Extraction of emotions from multilingual text using intelligent text processing and computational linguistics. - Vinay
Kumar Jain, Shishir Kumar, and Steven Lawrence Fernandes. 2019.
https://www.sciencedirect.com/science/article/pii/S1877750317301035?casa_token=xGYROGWQ9aIAAAAA:O-
oKcOApy1QWNbdRDt6pGb-XK8bus-y1sWsk2SCdQGmDYxUs6ycXsePgqUht0qlwpwtDgwxaUzJZ
Abstract— One of the best ways to address urgent sustainability issues like global warming,
depletion of fossil fuel reserves, and emissions of greenhouse gases is to use electric vehicles
(EVs). By reducing environmental damage and lowering emissions that contribute to climate
change, incorporating electric vehicles into the distribution system will benefit public health. A
microgrid is a small power network that coordinates groups of loads, distributed generators,
powerful software solutions, and other grid-connected elements so that they function as a single
controlled entity. Finally, the study describes various ways to mitigate the effects of electric
vehicles on distribution system power quality and to eliminate harmonics.
Index Terms— Integration of electric vehicles, micro grid, power distribution systems,
ecological damage, harmonics.
I. INTRODUCTION
To ensure a sustainable future, issues like global warming, the depletion of fossil fuel reserves, and emissions of greenhouse gases (GHGs) require immediate attention. The primary driver of rising fossil fuel consumption and greenhouse gas emissions is the rapid rise in global energy consumption. As a consequence of these issues, the renewable energy sector has carried out substantial research to identify ways to replace traditional fossil fuels and lessen environmental problems. The electrification of the transportation sector is seen as a promising solution because transportation is one of the largest contributors to rising emissions of pollutants. The microgrid (MG) is regarded as the smartest option for optimal operation because of the power grid's rapid expansion and the intricate structure of distributed energy sources (DES) over long distances. Historically, electric vehicles were put on hold and restricted to golf carts and delivery trucks due to the ready availability of fossil fuels, advancements in combustion technology, and the ease with which internal combustion engines can be utilized. Today, EV penetration remains low due to concerns about cruising range, deteriorating batteries, a lack of charging infrastructure, and high initial costs. Even though electric vehicles significantly reduce air pollution, they may harm the quality of the grid's power supply. Electric vehicles use rechargeable batteries to power electric motors and store energy. EV battery chargers employ power electronic devices to convert between AC and DC voltage, and this non-linearity has an impact on electricity quality; nonlinear loads have a detrimental effect on power quality. The microgrid has become a key research area in smart grid structures and operation systems. A microgrid principally incorporates various environmentally friendly power sources using innovative technologies such as power electronics. EVs can have a negative impact on energy systems, particularly power quality, but they can also significantly reduce CO2
widespread today. In an AC MG, the three-phase AC bus serves as the power connection point between the MG and the primary power grid. Installing the common connection point between the microgrid and the main power grid is simple; a fast switch serves as the disconnect point. The DG powers the load under normal conditions, and any extra power generated is sent to the power grid. The AC microgrid receives the necessary power from the main grid if the DG's output power falls below the load demand. A significant detail worth mentioning is that AC microgrids' power quality guidelines are governed by conventional power distribution systems and modes of operation.
Figure 2: AC microgrid
E. DC Microgrids
The majority of microgrid generators produce DC that must be converted to AC in order to interface with today's AC grids. Because some devices require DC power to function, conversion back to DC is required at the system's end. This repeated DC-AC-DC energy conversion leads to energy loss, which the DC MG aims to avoid by using high-DC-voltage operation as a benchmark. By reducing the number of converters in a single MG system, the DC-MG structure, in contrast to AC-MG, can provide significant energy savings. Converters of the appropriate class are still needed to connect the DESs, storage devices, and loads. It is evident that DC MGs are less likely to cause power quality issues than AC-distributed grids and are better suited for residential distribution systems. DC MG eliminates the need for DG synchronization and ensures that control depends mainly on the DC bus voltage, removing several control challenges in the microgrid. Additionally, primary control is made much simpler by the absence of reactive power flow management [3]. The power supply is also unaffected because many modern devices use direct current and do not contain power electronics. The switching-phase transients must be properly controlled in the MG to prevent device destruction; thus, the PQ issue in this situation warrants extra examination.
Figure 3: DC microgrid
control their discharge to and charge from the grid. In unidirectional V2G, EV batteries are regarded as switchable loads when it comes to balancing power on the grid. A single battery in an electric car is insufficient to affect the grid; EV aggregators, which represent large numbers of electric vehicles, act as a link between individual electric vehicles and market participants. Electric vehicles can connect to third-party aggregators on their own or as part of a fleet across cities or regions. Microgrids can receive fewer communication signals from EV aggregators; as a result, market operators can benefit from EV aggregators' ability to lower complexity and mitigate cybersecurity risks.
Figure 5: EV Utilization
Voltage stability impacts: The term voltage stability refers to the power grid's capacity to maintain the voltage on all buses at acceptable levels after a disturbance. Grid voltage stability can be significantly impacted by variations in load demand and load characteristics. After system contingencies, the power system may become more unstable if the load model's alpha is negative. A comprehensive examination of the effect of EV penetration on grid voltage stability was carried out for a 43-bus test distribution network with interconnected EV charging stations [7]. The weakest bus of the 43-bus test distribution network has a significantly smaller charging margin when an integrated EV charging station is used, and the weakest buses' charging margins continue to decline as EV integration grows. The power grid's voltage stability is also impacted by the location of EV charging stations. The underlying voltage-dependent load model has the form P/P0 = a(V/V0)^α + b.
Frequency stability impacts: The frequency may deviate from the permissible range if there is an imbalance between the power grid's load demand and its generation. Frequency stability refers to a power system's capacity to maintain acceptable frequencies following a power system failure [8]. The grid's load demand will rise dramatically as more electric vehicles are sold, necessitating more power generation to maintain a frequency within acceptable limits.
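The voltage-dependent load model mentioned under the voltage stability impacts, P/P0 = a(V/V0)^α + b, can be evaluated directly. The sketch below is illustrative; the function name and default parameter values are assumptions, not taken from the cited studies:

```python
def load_power_ratio(v_ratio, alpha, a=1.0, b=0.0):
    """Evaluate P/P0 for the load model P/P0 = a*(V/V0)**alpha + b.

    With a=1 and b=0 this reduces to the exponential load model:
    alpha=0 gives constant power, alpha=2 constant impedance.
    """
    return a * v_ratio ** alpha + b
```

With a negative alpha, a voltage dip (V/V0 below 1) raises the drawn power ratio above 1, which is the destabilizing behaviour the text associates with post-contingency instability.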
D. Power quality impacts of EV integration
Investments in power generation, as well as demand, prices, and emissions, will undoubtedly rise with increasing EV penetration. The economic effects of incorporating electric vehicles into electricity markets have been the subject of numerous studies. The following are examples of how EV integration affects load profiles, energy prices, operating costs, and ancillary services.
for reducing harmonics in power systems [10]. Due to its simplicity, economy, low maintenance, and high reliability, it is primarily utilized in power transmission and distribution networks.
Active filter: An active filter injects a harmonic current spectrum that is out of phase with that of the nonlinear load. When fed directly into the system in real time, the active filter's harmonic current effectively cancels the network harmonics. By continuously providing reactive power, both capacitive and inductive, active filters can improve power factor in addition to suppressing harmonic currents.
Hybrid control techniques: Grid-connected solar energy systems typically employ hybrid control technology. Passive-based control strategies have progressed to recover reactive power and avoid harmonic distortion by utilizing the current control loop of an attached inverter. This approach is designed to eliminate harmonic distortion by using a real-time control loop to generate and track the harmonic content [11]. The objective of the control engineering is to model the energy of the damped injections physically and to locate system settings that produce the appropriate response.
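A standard yardstick for judging how well the filters above suppress harmonics is total harmonic distortion (THD): the RMS of the harmonic components relative to the fundamental. The sketch below is a generic illustration, not code from any of the cited references:

```python
import math

def thd(harmonic_amplitudes):
    """Total harmonic distortion of a waveform.

    harmonic_amplitudes[0] is the fundamental component; the remaining
    entries are the amplitudes of harmonics 2, 3, ... THD is the RMS of
    the harmonics divided by the fundamental.
    """
    fundamental = harmonic_amplitudes[0]
    harmonics = harmonic_amplitudes[1:]
    return math.sqrt(sum(a * a for a in harmonics)) / fundamental
```

An effective passive, active, or hybrid filter lowers the harmonic amplitudes and hence this ratio; grid codes typically bound voltage THD to a few percent.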
B. Supraharmonics Mitigation
To improve the power quality of the PV and batteries in an MG, DVRs are used to deal with voltage sags and swells. A fuzzy logic-based DVR was used to overcome sag and swell in the grid-connected MG [12]. MPC was used to improve DVR performance and handle sag and swell in an MG composed of PV, a supercapacitor, and a battery.
Other tools for addressing power quality issues include the STATCOM and SVC. Due to the high prevalence of DREs in MGs, voltage fluctuations were mitigated by STATCOMs [13]. In power system MGs, STATCOMs were also utilized to reduce voltage fluctuation and compensate for reactive power.
REFERENCES
[1] Pinilla, J.T.M., 2022. Hosting Capacity: A Tool to Modernize the Grid and to Contribute to the Integration of Distributed
Energy Resources in Colombia. Global Journals of Research in Engineering, 22(B1), pp.33-40.
[2] Kharrazi, A., Sreeram, V. and Mishra, Y., 2020. Assessment techniques of the impact of grid-tied rooftop photovoltaic
generation on the power quality of low voltage distribution network-A review. Renewable and Sustainable Energy
Reviews, 120, p.109643.
[3] Xingang, Y., Aiqiang, P., Guangzheng, Y., Chenyang, L. and Yangxiu, Y., 2019, May. Supraharmonics measurement
algorithm based on CS-SAMP. In 2019 IEEE Innovative Smart Grid Technologies-Asia (ISGT Asia) (pp. 160-164).
IEEE.
[4] Li, H., Lv, C. and Zhang, Y., 2019, July. Research on new characteristics of power quality in distribution networks. In
2019 IEEE International Conference on Power, Intelligent Computing and Systems (ICPICS) (pp. 6-10). IEEE.
[5] Shabbir, N., Kütt, L., Jarkovoi, M., Iqbal, M.N., Rassõlkin, A. and Daniel, K., 2021. An overview of measurement
standards for power quality.
[6] Tavakoli, A., Saha, S., Arif, M.T., Haque, M.E., Mendis, N. and Oo, A.M., 2020. Impacts of grid integration of solar PV
and electric vehicle on grid stability, power quality and energy economics: A review. IET Energy Systems Integration,
2(3), pp.243-260.
[7] Hamadi, A., Alam, M.S. and Arefifar, S.A., 2021. Analyzing the Impact of Electric Vehicle Charging Stations on Power
Quality in Power Distribution System. 2021 SAE Technical Paper, (2021-01-0199).
[8] Bouzelata, Y., Kurt, E., Uzun, Y. and Chenni, R., 2018. Mitigation of high harmonicity and design of a battery charger
for a new piezoelectric wind energy harvester. Sensors and Actuators A: Physical, 273, pp.72-83.
[9] Suman, S., Chatterjee, D. and Mohanty, R., 2022. Development of improved harmonic compensation technique for PV-
wind hybrid distributed generator connected to microgrid. Electric Power Systems Research, 210, p.108071.
[10] Town, G., Taghizadeh, S. and Deilami, S., 2022. Review of Fast Charging for Electrified Transport: Demand,
Technology, Systems, and Planning. Energies, 15(4), p.1276.
[11] Kurt, E., Cottone, F., Uzun, Y., Orfei, F., Mattarelli, M. and Özhan, D., 2017. Design and implementation of a new
contactless triple piezoelectrics wind energy harvester. International Journal of Hydrogen Energy, 42(28), pp.17813-17822.
[12] Alkahtani, A.A., Alfalahi, S.T., Athamneh, A.A., Al-Shetwi, A.Q., Mansor, M.B., Hannan, M.A. and Agelidis, V.G.,
2020. Power quality in microgrids including supraharmonics: Issues, standards, and mitigations. IEEE Access, 8,
pp.127104-127122.
[13] Naqvi, S.A.H., 2020. Active Power Measurement in Energy Metering (Master's thesis).
[14] Iqbal, M.N., Measurement Based Approach for Residential Customer Stochastic Current Harmonic Modelling.
[15] Yamahata, C., Stranczl, M., Sarajlic, E., Krijnen, G.J. and Gijs, M.A., 2012. Temporally aliased video microscopy: an
undersampling method for in-plane modal analysis of microelectromechanical systems. Journal of
Microelectromechanical systems, 21(4), pp.934-944.
[16] Lim, J., Wang, P., Shaw, S., Gong, H., Armacost, M., Liu, C., Do, A., Heydari, P. and Nenadic, Z., 2022. Artifact
propagation in subdural cortical electrostimulation: Characterization and modeling. Frontiers in neuroscience, 16.
[17] Wang, X.K., Hao, Z. and Tan, S.K., 2013. Vortex-induced vibrations of a neutrally buoyant circular cylinder near a
plane wall. Journal of Fluids and Structures, 39, pp.188-204.
[18] Mirsafian, S., 1996. Forced vibration of two beams joined with a nonlinear rotational joint. University of Kansas.
[19] Niu, Y., Yang, T., Yang, F., Feng, X., Zhang, P. and Li, W., 2022. Harmonic analysis in distributed power system based
on IoT and dynamic compressed sensing. Energy Reports, 8, pp.2363-2375.
[20] Pond, T.L. and Martin, C.R., 2020. Electrical characteristics of the oxyfuel flame while cutting steel. Experimental
Thermal and Fluid Science, 112, p.109985.
[21] Chen, H. and Konofagou, E.E., 2014. The size of blood–brain barrier opening induced by focused ultrasound is dictated
by the acoustic pressure. Journal of Cerebral Blood Flow & Metabolism, 34(7), pp.1197-1204.
[22] Wang, K.W. and Harne, R.L., 2017. Harnessing bistable structural dynamics: for vibration control, energy harvesting
and sensing. John Wiley & Sons.
[23] Halterman, K., Valls, O.T. and Wu, C.T., 2015. Charge and spin currents in ferromagnetic Josephson junctions. Physical
Review B, 92(17), p.174516.
Grenze International Journal of Engineering and Technology, June Issue
Abstract— In the post-pandemic scenario, most studies have recognized the need for further work
on the accuracy of intrusion detection systems (IDS), taking prior work in the field into account.
Furthermore, the existing literature identifies several factors that increase time consumption
during training. Conventional studies have offered only a few solutions for effective intrusion
detection. When applied, the insights and recommendations from this study will significantly
influence the strategy employed to predict intrusions reliably. Taking the training model into
consideration, the current research aims to provide a flexible and scalable approach to intrusion
detection. The proposed model will be trained on a large dataset, increasing the likelihood of
accurate results. Future research should continue with the same paradigm in order to enhance
intrusion detection. The findings have significant implications for improving IDS forecasting.
I. INTRODUCTION
The increased frequency of cyber attacks can be directly linked to the rising popularity of online
resources. Passwords, credit card numbers, and other sensitive information sent over a network are susceptible
to attack from both inside and outside the system. An attack may be carried out either manually or
automatically by the aggressor, and the effectiveness and ferocity of these attacks are only increasing. This
group of cybercriminals has become tougher to stop. Cybercriminals, or cyber attackers, are the malicious
actors responsible for such data breaches. Individuals or groups with deep domain experience in the field may
propose novel, flexible, and reliable intrusion detection systems (IDS).
A. Background
Intrusion detection is the topic explored in depth in this study. Although IDS research has been
conducted for decades, scholars continue to question how reliable its findings are. Multiple machine
learning strategies are used to enhance the detection capability of the IDS. This research examines the
state of the art in intrusion detection systems in order to pave the way for future developments in the field. For
security analysis, an RNN-based LSTM model is considered. A filtering stage is used to improve precision
and efficiency. Furthermore, the performance of the suggested IDS model is compared with that of the
standard model.
A burglar alarm is a familiar analogy for an IDS. A home's lock system, for instance, is one defense
against intruders. A burglar alarm, however, generates noise ("rings the alarm") to alert the homeowners
that the lock system has been compromised and an intruder is attempting to enter the residence.
Furthermore, an IDS is greatly aided by firewalls and routers, which allow near-instantaneous data
transmission.
C. Taxonomy of IDS
The IDS classification is shown in Figure 2. When the source of data is considered, intrusion detection
systems may also be categorized by the type of system they protect. Host-based IDS (HIDS) and
network-based IDS (NIDS) are the two families of IDS programs, leveraging information gathered from a
single host (system) and from a network segment, respectively.
Using a modem placed in an organization's private network, external users might access the intranet without
detection by the firewall. An intrusion prevention system (IPS) is a network threat prevention system that
analyzes network traffic flows to identify and prevent vulnerability exploitation. Network-based (NIPS) and
host-based (HIPS) are the two forms of prevention systems; they monitor network traffic and take steps to
safeguard networks and systems. False positives and false negatives are the main IPS problems. In an
intrusion detection system, a false positive occurs when an alert is triggered even though there was no attack.
Conversely, a false negative is an attack that fails to raise an alert. A single point of failure, missed updates,
and encrypted communication can all pose risks when inline operation is used. An IDS monitors the actions
taking place in a system or network. It may be a piece of hardware or a software application, depending on
the deployment. It keeps tabs on any suspicious behavior occurring on a network or system and makes a
substantial contribution to guaranteeing data security. It is one of the most effective tools for spotting many
kinds of network threats with high accuracy. A network-based system analyzes activity such as traffic
volume, IP addresses, service ports, and protocols to determine the network's health. Intrusion detection
systems monitor network traffic to look for unusual behavior and send out warnings as soon as suspicious
activity is identified.
Such a tool is referred to as a network-capable software program. It performs a full scan of the system to
look for potentially dangerous activity or policy violations.
An intrusion detection system is made up of a variety of components. Sensors that generate security events
are one component; they drive the intrusion detection system. There is also a console. During routine
operation, intrusion detection systems look for signs of known attacks or deviations. Deviations and
anomalies are forwarded up the stack and examined at the protocol and application layers.
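The false-positive and false-negative behavior described above can be quantified directly from alert logs. A minimal sketch with hypothetical labels (not data from this study):

```python
def ids_metrics(y_true, y_alert):
    """Compute detection metrics for an IDS.

    y_true : 1 = real attack, 0 = benign traffic
    y_alert: 1 = IDS raised an alert, 0 = no alert
    """
    tp = sum(1 for t, a in zip(y_true, y_alert) if t == 1 and a == 1)
    fp = sum(1 for t, a in zip(y_true, y_alert) if t == 0 and a == 1)  # false positive: alert, no attack
    fn = sum(1 for t, a in zip(y_true, y_alert) if t == 1 and a == 0)  # false negative: attack, no alert
    tn = sum(1 for t, a in zip(y_true, y_alert) if t == 0 and a == 0)
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall":    tp / (tp + fn) if tp + fn else 0.0,   # detection rate
        "fpr":       fp / (fp + tn) if fp + tn else 0.0,   # false-alarm rate
    }

# Hypothetical ground truth vs. alerts for eight traffic records
truth  = [1, 1, 1, 0, 0, 0, 0, 1]
alerts = [1, 1, 0, 1, 0, 0, 0, 1]
m = ids_metrics(truth, alerts)
```

Here a false positive (an alert with no attack) raises the false-alarm rate, while a missed attack lowers the recall.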
Intrusion Prevention System: intrusion prevention systems have long been explored as a safety measure. In
an IoT system, an IDS works mostly at the network layer. Designed for IoT-dependent intelligent systems,
such an IDS must be able to function with very limited processing capacity, and fast reaction time is
required. It must also handle a large amount of data quickly.
D. Machine learning
ML refers to algorithms that let software applications predict outputs accurately without being explicitly
programmed for the task. Machine learning algorithms use historical records as input to predict new output
values. Fraud detection, spam filtering, malware threat detection, business process automation (BPA), and
predictive maintenance are all possible uses of machine learning. Classical machine learning is commonly
classified by the way an algorithm improves the accuracy of its predictions. Unsupervised learning and
supervised learning are the two most common methods. Data scientists choose the algorithm based on the
kind of predictions they want to make from the data.
Working of supervised machine learning: the algorithm is trained using labeled inputs paired with the
intended outcomes.
Working of unsupervised machine learning: data does not need to be labeled when unsupervised machine
learning (ML) techniques are used. Their job is to look for patterns in unlabeled data in order to divide it into
manageable groups for further analysis.
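The two paradigms can be illustrated with toy, self-contained learners (the study itself does not use these particular algorithms): a nearest-centroid classifier trained on labeled data, and a small 1-D k-means that groups unlabeled data.

```python
# Supervised learning: train on labeled examples (feature -> label),
# then predict the label of a new input.
def fit_nearest_centroid(X, y):
    """Tiny supervised learner: one centroid (mean) per class label."""
    centroids = {}
    for label in set(y):
        pts = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = sum(pts) / len(pts)
    return centroids

def predict(centroids, x):
    """Assign x to the class with the nearest centroid."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

# Labeled training data: small feature values are "normal", large are "attack"
X_train = [0.1, 0.2, 0.3, 5.0, 5.5, 6.0]
y_train = ["normal"] * 3 + ["attack"] * 3
model = fit_nearest_centroid(X_train, y_train)

# Unsupervised learning: no labels -- group the data into clusters.
def kmeans_1d(xs, k=2, iters=10):
    """Tiny 1-D k-means; returns a cluster index for every point."""
    centers = [min(xs), max(xs)]          # crude initialisation for k = 2
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            groups[min(range(k), key=lambda i: abs(x - centers[i]))].append(x)
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return [min(range(k), key=lambda i: abs(x - centers[i])) for x in xs]

clusters = kmeans_1d([0.1, 0.2, 5.0, 5.4])   # groups points without labels
```

The supervised model needs labels at training time; the k-means routine discovers the same "normal vs. attack" structure from the raw values alone.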
Uses of machine learning: machine learning is now used in a wide range of fields. Among its many uses is
Facebook's News Feed recommendation engine. The recommendation engine may begin to prioritize posts
from a certain group if a member often pauses to read them; behind the scenes, the engine is reinforcing the
member's online habits. The News Feed will be adjusted if the member's reading habits change and he or she
stops keeping up with posts from that group in the following weeks.
Figure: Binary classification and clustering classification.
E. Deep learning
Deep learning is the branch of machine learning in which computers learn by example. Autonomous vehicles
rely on deep learning to identify objects such as stop signs and pedestrians. Deep neural networks (DNNs)
are used for prediction and classification, whereas CNNs are used for prediction, recognition, and vision.
RNNs and LSTMs are used for prediction and classification. The present research makes use of an RNN
with LSTM for intrusion detection and classification.
Figure: Uses of DNN and RNN & LSTM (prediction, time-series prediction, and classification).
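As a sketch of what an LSTM computes internally, the single-step recurrence can be written out with NumPy. This is the generic textbook formulation with random toy weights, not the specific architecture used in this research:

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. The gates (input i, forget f, output o) and the
    candidate g are stacked row-wise in W (input weights, shape 4H x D),
    U (recurrent weights, 4H x H) and b (bias, 4H)."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b                 # (4H,) pre-activations
    i = 1 / (1 + np.exp(-z[0:H]))              # input gate
    f = 1 / (1 + np.exp(-z[H:2*H]))            # forget gate
    o = 1 / (1 + np.exp(-z[2*H:3*H]))          # output gate
    g = np.tanh(z[3*H:4*H])                    # candidate cell state
    c = f * c_prev + i * g                     # new cell state
    h = o * np.tanh(c)                         # new hidden state
    return h, c

# Run a toy sequence of 5 feature vectors through the cell
rng = np.random.default_rng(0)
D, H = 3, 4                                    # input size, hidden size
W = rng.normal(size=(4 * H, D))
U = rng.normal(size=(4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for t in range(5):
    h, c = lstm_step(rng.normal(size=D), h, c, W, U, b)
```

The forget gate lets the cell state carry information across many steps, which is why LSTMs suit sequential inputs such as streams of network-traffic records.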
M. Tavallaee et al. [1] completed an analysis of the KDD CUP 99 data set in 2009. J. Martens and I.
Sutskever [2] focused on learning recurrent neural networks with Hessian-free optimization in 2011. A new
intrusion detection method based on feature grouping was unveiled by M. Sheikhan et al. [3] in 2012; they
employed a reduced-size RNN for the task. S. Revathi et al. [4] carried out a full investigation of the
NSL-KDD dataset in 2013, applying various machine learning algorithms to identify intrusions.

W. Li et al. [5] studied intrusion detection systems under development in 2014. Their system was based on
the KNN classification algorithm, and a wireless sensor network mechanism was devised. Information
extraction and automated learning algorithms were surveyed in 2016 by A. L. Buczak et al. [6]. They
concentrated on data mining and machine learning methods for cyber security intrusion detection, combining
information extraction with machine learning. Deep learning was applied by A. Javaid et al. [7] in 2016;
their efforts were likewise dedicated to developing a more effective intrusion detection system.
Classification algorithms for network traffic were examined by Bo Dong et al. [8] in 2016. They concluded
that a variety of approaches could be implemented on a freely available data set, put those approaches into
practice, and selected the best method for detecting intrusions from this collection of realistic examples. At
the time, deep learning was the strongest option because of its forecasting ability, and deep learning
approaches were already being used in areas such as structural identification. Monitoring security events
provided data for intrusion detection analysis, which was used to determine the current state of the network.
Existing intrusion detection approaches that used automated learning showed enhanced accuracy and
efficiency. Deep learning was also applied by T. A. Tang et al. [9] in 2016; the goal of their method was to
identify network intrusions, with software-defined networking as the focus of the study.

Chuanlong Yin et al. [10] presented a model and technique for a neural-network-based identification system
in 2017. They also assessed the efficacy of the design for both binary and multi-class structures, along with
other factors that affect accuracy, such as the number of neurons and the impact of different learning rates.
NSL-KDD was utilized as the dataset.
Using the RNN-IDS classification model, it was discovered that the data could be represented accurately.
Compared with other automated learning approaches, the classification model was significantly more
efficient and accurate; with their architecture, intrusion detection was more accurate. It provided an
up-to-date research approach for detecting intrusions. An analysis of data pre-processing was carried out in
2017 by N. Paulauskas et al. [11], who examined the impact of pre-processing on IDS methods using the
NSL-KDD dataset. In 2017, P. S. Bhattacharjee et al. [12] proposed an IDS, also using the NSL-KDD data
collection. R. A. R. Ashfaq et al. [13] worked on a fuzziness-based semi-supervised learning approach for
intrusion detection systems in 2017.

Sara A. Althubiti et al. [14] implemented a detection system in 2018 using the Coburg Intrusion Detection
data set. They applied deep learning, specifically Long Short-Term Memory (LSTM). Their research yielded
an accuracy of around 85%, which was deemed acceptable. Their LSTM outputs were compared with
state-of-the-art approaches using a variety of measures to meet the assessment criteria. Meira, Jorge [15]
presented comparative results with unsupervised techniques in 2018; their research played a significant role
in detecting cyber-attack novelty. Kolli [16] in 2018 focused on Cyber Situational Awareness (CSA) for
PTC using a distributed IDS system. Clotet [17] in 2018 considered a real-time anomaly-based IDS for
cyber-attack detection at the industrial process level of critical infrastructures.

An intrusion detection system using an enhanced DBN and a genetic algorithm (GA) was designed by
Peisong Li et al. [18] in 2019. Iterative development of DBN network topologies yielded diverse network
structures for different attacks, including low-frequency attacks. To provide intrusion detection, a DBN was
created with an optimized network layout; the genetic algorithm places no limit on the number of hidden
layers, and the neurons in each hidden layer evolve in a similar manner. Detection speed was achieved by
minimizing system complexity to the greatest degree feasible, and with this technique an intrusion detection
system's performance could be improved. Arul [19] made use of an ANN in IDS-based research in 2019.
Khraisat [20] surveyed intrusion detection systems in 2019, considering the techniques, datasets, and
challenges related to IDS. R. Vinayakumar [21] introduced a deep learning approach to implement an
intelligent IDS in 2019.

Several alternative automated learning approaches, including SVM, decision trees, and random forest, were
employed by Qusay H. Mahmoud et al. [22] in 2020; using the recent IoTID20 data set, new IDS techniques
for IoT networks could be supported. Y. Zhou [23] proposed an efficient intrusion detection system in 2020,
based on feature selection and an ensemble classifier. Y. J. Chew [25] considered decision trees with
sensitive pruning in network-based IDS in 2020. Song, Yajie & Bu [26] proposed a novel intrusion detection
model in 2020.
V. SCOPE OF RESEARCH
The approach utilized to reliably forecast intrusions will be significantly influenced by the ideas and
recommendations arising from this study. Taking the training model into consideration, the present research
aims to provide a flexible and scalable method of intrusion detection. Since the proposed model will be
trained on a large dataset, its overall accuracy should improve. To make progress in intrusion detection,
further research should use the same paradigm. The research will have significant implications for improving
the ability to predict intrusions.
ACKNOWLEDGMENT
The authors wish to thank A, B, C. This work was supported in part by a grant from XYZ.
REFERENCES
[1] M. Tavallaee, E. Bagheri, W. Lu, and A. A. A. Ghorbani, ‘‘A detailed analysis of the KDD CUP 99 data set,’’ in Proc.
IEEE Symp. Comput. Intell. Secur. Defense Appl., Jul. 2009, pp. 1–6.
[2] J. Martens and I. Sutskever, ‘‘Learning recurrent neural networks with Hessian-free optimization,’’ presented at the 28th
Int. Conf. Mach. Learn., Bellevue, WA, USA, Jul. 2011, pp. 1033–1040.
[3] M. Sheikhan, Z. Jadidi, and A. Farrokhi, ‘‘Intrusion detection using reduced-size RNN based on feature
grouping,’’ Neural Comput. Appl., vol. 21, no. 6, pp. 1185–1190, Sep. 2012.
[4] S. Revathi and A. Malathi, ‘‘A detailed analysis on NSL-KDD dataset using various machine learning techniques for
intrusion detection,’’ Int. J. Eng. Res. Technol., vol. 2, pp. 1848–1853, Dec. 2013.
[5] W. Li, P. Yi, Y. Wu, L. Pan, and J. Li, ‘‘A new intrusion detection system based on KNN classification algorithm in
wireless sensor network,’’ J. Elect. Comput. Eng., vol. 2014, Jun. 2014, Art. no. 240217.
[6] A. L. Buczak and E. Guven, ‘‘A survey of data mining and machine learning methods for cyber security intrusion
detection,’’ IEEE Commun. Surveys Tuts., vol. 18, no. 2, pp. 1153–1176, 2nd Quart., 2016.
[7] A. Javaid, Q. Niyaz, W. Sun, and M. Alam, ‘‘A deep learning approach for network intrusion detection system,’’ presented
at the 9th EAI Int. Conf. Bio-inspired Inf. Commun. Technol. (BIONETICS), New York, NY, USA, May 2016, pp. 21–26.
[8] B. Dong and X. Wang, ‘‘Comparison deep learning method to traditional methods using for network intrusion
detection,’’ in Proc. IEEE ICCSN, 2016, pp. 581–585.
[9] T. A. Tang, L. Mhamdi, D. McLernon, S. A. R. Zaidi, and M. Ghogho, ‘‘Deep learning approach for network intrusion
detection in software defined networking,’’ in Proc. Int. Conf. Wireless Netw. Mobile Commun. (WINCOM), Oct. 2016,
pp. 258–263.
[10] Chuanlong Yin and Yuefei Zhu, ‘‘A deep learning approach for intrusion detection using recurrent neural networks,’’
IEEE Access, 2017 (published online Oct. 12, 2017).
[11] N. Paulauskas and J. Auskalnis, ‘‘Analysis of data pre-processing influence on intrusion detection using NSL-KDD
dataset,’’ in Proc. Open Conf. Elect., Electron. Inf. Sci. (eStream), Apr. 2017, pp. 1–5.
[12] P. S. Bhattacharjee, A. K. M. Fujail, and S. A. Begum, ‘‘Intrusion detection system for NSL-KDD data set using
vectorised fitness function in genetic algorithm,’’ Adv. Comput. Sci. Technol., vol. 10, no. 2, pp. 235–246, 2017.
[13] R. A. R. Ashfaq, X.-Z. Wang, J. Z. Huang, H. Abbas, and Y.-L. He, ‘‘Fuzziness based semi-supervised learning
approach for intrusion detection system,’’ Inf. Sci., vol. 378, pp. 484–497, Feb. 2017.
[14] Althubiti, Sara & Jones, Eric & Roy, Kaushik. (2018). LSTM for Anomaly-Based Network Intrusion Detection. 1-3.
10.1109/ATNAC.2018.8615300.
[15] Meira, Jorge. (2018). Comparative Results with Unsupervised Techniques in Cyber Attack Novelty Detection.
Proceedings. 2. 1191. 10.3390/proceedings2181191.
[16] Kolli, Satish & Lilly, Joshua & Wijesekera, Duminda. (2018). Providing Cyber Situational Awareness (CSA) for PTC
Using a Distributed IDS System (DIDS). V001T03A004. 10.1115/JRC2018-6142.
[17] Clotet, Xavier & Moyano, José & León, Gladys. (2018). A real-time anomaly-based IDS for cyber-attack detection at the
industrial process level of Critical Infrastructures. International Journal of Critical Infrastructure Protection, 23.
10.1016/j.ijcip.2018.08.002.
[18] P. Li and Y. Zhang, "A Novel Intrusion Detection Method for Internet of Things," 2019 Chinese Control And Decision
Conference (CCDC), Nanchang, China, 2019, pp. 4761-4765, doi: 10.1109/CCDC.2019.8832753.
[19] A. Arul Anitha and L. Arockiam, “ANNIDS: Artificial neural network based intrusion detection system for internet of
things,” Int. J. Innov. Technol. Explor. Eng., vol. 8, no. 11, pp. 2583–2588, 2019, doi: 10.35940/ijitee.K1875.0981119.
[20] A. Khraisat, I. Gondal, P. Vamplew, and J. Kamruzzaman, “Survey of intrusion detection systems: techniques, datasets
and challenges,” Cybersecurity, vol. 2, no. 1, 2019, doi: 10.1186/s42400-019-0038-7.
[21] R. Vinayakumar, M. Alazab, K. P. Soman, P. Poornachandran, A. Al-Nemrat, and S. Venkatraman, “Deep Learning
Approach for Intelligent Intrusion Detection System,” IEEE Access, vol. 7, no. c, pp. 41525–41550, 2019, doi:
10.1109/Access.2019.2895334.
[22] Ullah, Imtiaz, and Qusay H. Mahmoud. "A Scheme for Generating a Dataset for Anomalous Activity Detection in IoT
Networks." Canadian Conference on Artificial Intelligence. Springer, Cham, 2020.
[23] Y. Zhou, G. Cheng, S. Jiang, and M. Dai, “Building an Efficient Intrusion Detection System Based on Feature Selection
and Ensemble Classifier,” Comput. Networks, p. 107247, 2020, doi: 10.1016/j.comnet.2020.107247.
[24] A. Kajal and S. Nandal, “A Hybrid Approach for Cyber Security: Improved Intrusion Detection System using ANN-
SVM,” Indian Journal of Computer Science and Engineering, vol. 11, no. 4, pp. 412-425, 2020, doi:
10.21817/indjcse/2020/v11i4/201104300.
[25] Y. J. Chew, S. Y. Ooi, K. S. Wong, and Y. H. Pang, “Decision Tree with Sensitive Pruning in Network-based Intrusion
Detection System,” Lect. Notes Electr. Eng., vol. 603, pp. 1–10, 2020, doi: 10.1007/978-981-15-0058-9_1.
[26] Song, Yajie & Bu, Bing & Zhu, Li. (2020). A Novel Intrusion Detection Model Using a Fusion of Network and Device
States for Communication-Based Train Control Systems. Electronics, 9, 181. 10.3390/electronics9010181.
Abstract— Distributing medicines to people located in remote areas is a daunting task for the
Government, and the real-time remote general healthcare clinic helps meet this requirement. The
model stocks basic and emergency medicines, which are monitored and can be refilled. It can be
considered a computerized drug stock-room that people can access easily in an emergency
without having to travel to a pharmacy. The machine can be easily set up in remote regions such
as highways, deserts, and tribal areas. It is a microcontroller- and motor-based system that
dispenses a drug when the user requests it via the input panel; drug storage data can be obtained
remotely, and based on this information, refilling the machine is hassle-free.
I. INTRODUCTION
Some groups in India are dying because medicine is not available to them in time. Most need regular access
to the most basic medicines. This is due to the level of connectivity required and, as a result, a growing
shortage of funds to maintain such outlets. Problems arise when the need for medicine is urgent but
pharmacies are not open or medicines are unavailable in the evening. In remote areas and areas with low
public turnover, the availability of medicines within reach of patients is a fundamental issue. Government
accountability, prudent selection, adequate public-sector support, productive distribution structures, cost
control, and responsible drug delivery by current and future prescribers are all needed to improve the
enduring quality of healthcare in India.
It also educates people in a culture of proper medicine use. In the current situation, where we want all
devices to be automated, this model will be of great help to public health. The framework is fully controlled
by a 32-bit microcontroller (the ESP32). To improve efficacy and patient well-being, such frameworks
provide components for computer-controlled storage, management, observation, and tracking.
The World Health Organization notes that countries with lower socioeconomic classes have poorer health:
health inequality is tightly linked to social status, and health appears to be one of the anchor points
connecting education and access to data. A competent electronic health check with a smart billing framework
for remote areas is another idea that is very useful in an agricultural country like India, where rural
healthcare is rudimentary. Dispensing machines allow prescriptions to be stored securely in
permission-controlled units and the use of opiates and other controlled substances to be tracked
electronically. The framework is customizable, as new capabilities can be added or current ones modified
according to requirements.
III. COMPONENTS
The prototype is built around an ESP32. The other components used in the model are DC motors, switches,
and IR sensors. Components were selected from those available in local stores to give the most appropriate
quality-price ratio. A DC motor is used to drive the trays and deliver the desired drug; a gear set converts its
high speed into high torque. This is important for the design, as the main job is to open the tray holding the
product.
A medical vending machine should be designed as a service that delivers medicines on special request. This
model requires the vending machine to dispense commonly used non-prescription medications on demand.
The working principle of this method is illustrated in the block diagram.
The block diagram above fully describes how the project works. The whole project is divided into two units:
the control unit and the delivery unit. The control unit consists of the parts necessary for data processing,
such as the controller and input switches, and the delivery unit consists of the motors, infrared sensors,
chat-bot, etc., all attached to the delivery tray.
The main component of the model is the ESP32 microcontroller. It connects all peripherals and
programmatically controls all processes in the system. Panels are used as inputs to the controller.
Separately, an infrared sensor is connected to the controller to monitor and update the availability of
medicines in the medicine tray. Push buttons in the system allow product selection; each medicine becomes
available at the push of a button.
DC motors are responsible for drug delivery by opening and closing the drawers. Pressing a specific button
turns on the motor for the respective drawer, opening it with the help of a gear set and closing it after a
specified time as a precaution. Patients/consumers have access to medicines during business hours.
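The button-to-motor flow described above can be sketched as a small state machine. The tray names and stock counts below are hypothetical, and the real system drives GPIO pins on the ESP32 rather than a Python object:

```python
class MedicineDispenser:
    """Sketch of the control-unit logic: a button press opens the matching
    tray (motor) if the IR sensor reports stock, then the tray is closed
    and the stock count updated for the cloud sync."""

    def __init__(self, stock):
        self.stock = dict(stock)   # tray name -> units left
        self.log = []              # events to push to the cloud

    def ir_sensor_reports_stock(self, tray):
        # Stand-in for the real IR reading: stock present if count > 0
        return self.stock.get(tray, 0) > 0

    def press_button(self, tray):
        """Handle one product-selection button press."""
        if not self.ir_sensor_reports_stock(tray):
            self.log.append((tray, "empty"))
            return False
        self.log.append((tray, "motor_open"))    # DC motor opens the drawer
        self.stock[tray] -= 1                    # one unit dispensed
        self.log.append((tray, "motor_close"))   # closed after the timeout
        return True

# Hypothetical initial inventory
m = MedicineDispenser({"paracetamol": 2, "ors": 0})
```

Pressing a stocked tray's button logs an open/close cycle and decrements the count; pressing an empty tray's button only logs the refusal, which is the event a refilling service would watch for.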
The prototype has a chat-bot feature that recommends medications appropriate to a patient's symptoms.
This chat-bot is written in embedded C and is trained with a certified list of symptoms and medication records.
In addition, the availability of medicines in the trays is monitored and continuously updated to the cloud
over the internet using the ESP32 controller. The data in the cloud are kept in encrypted format for basic
security, and to view the details an OTP is generated and sent to the registered email ID.
The data are available with time stamps.
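The paper does not specify which OTP scheme the system uses; one standard choice is an HMAC-based one-time password (HOTP, in the style of RFC 4226), sketched here:

```python
import hmac
import hashlib
import struct

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226 style)."""
    msg = struct.pack(">Q", counter)             # 8-byte big-endian counter
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                   # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# Shared secret known to the server; the counter advances per login attempt
otp = hotp(b"12345678901234567890", counter=0)
```

The server would email the generated code to the registered address and verify the user's entry against the same secret and counter before revealing the time-stamped stock data.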
Figure 3: Chat-bot
A problem observed was that the IR sensor generated a high output value. On inspection, the reason for this
malfunction was the bright lighting in the environment where it was used, so the IR sensor works best in
dark conditions.
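One common mitigation, sketched below under the assumption that the sensor exposes a raw analog reading, is to calibrate the detection threshold against the measured ambient level instead of using a fixed value (the numbers are hypothetical ADC counts):

```python
def calibrate_ir(baseline_samples, margin=1.2):
    """Compute a detection threshold relative to the ambient baseline, so a
    bright environment raises the threshold instead of causing spurious
    'stock present' readings."""
    ambient = sum(baseline_samples) / len(baseline_samples)
    return ambient * margin

def tray_has_stock(reading, threshold):
    # Reflection from a medicine packet pushes the reading above threshold
    return reading > threshold

# Baseline samples taken with the tray empty, under the ambient lighting
thr = calibrate_ir([310, 305, 320, 315])
```

Re-running the calibration whenever the machine is idle would track gradual lighting changes over the day.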
Another problem is that, since multiple components are used, there was a shortage of pins for connecting
them. This forced multiple compromises and also reduces the number of pins available for additional
components in the future.
VII. CONCLUSION
In this real-time remote general medicine clinic project, a basic model of a medication-dispensing machine
with inventory monitoring and consultation software (a chat-bot) was created. Its potential has not been fully
realized yet, and many additions can be made to the model; this project shows how to make a remote
medication dispenser and what can be built on top of it. Ideas include tethering the chat-bot to the hardware,
the ability to physically monitor patients in order to prescribe better dosages, and software improvements for
better counselling.
REFERENCES
[1] Chi-Sheng Shih, Nurmiza Binti Othman, and Ong Pek Ek, "Pill dispenser with alarm via smart phone notification," IEEE
5th Global Conference on Consumer Electronics, 2016.
[2] Sarika Oundhakar, International Journal of Engineering Technology Science and Research, December 2017.
[3] S. Mukund and N. K. Srinath, "Design of Automatic Medication Dispenser," 2012.
[4] S. Gayathri, N. Raj Kumar, and V. Vinothkumar, "Human health monitoring system using wearable sensors," International
Research Journal of Engineering and Technology, vol. 2, no. 8, June 2015.
[5] Varun Vaid, Comparison of different attributes in modeling a FSM based Vending machine in two different styles, 2014.
[6] P. Rajendra Prasad, N. Narayan, S. Gayathri and S. Ganna, "An Efficient E-Health Monitoring with Smart Dispensing
System for Remote Areas," 2018 3rd IEEE International Conference on Recent Trends in Electronics, Information &
Communication Technology (RTEICT), 2018
[7] V. Tank, S. Warrier and N. Jakhiya, "Medicine dispensing machine using raspberry pi and arduino controller," 2017
Conference on Emerging Devices and Smart Systems (ICEDSS), 2017.
Abstract— A doctor uses the diagnosis of medical images to predict whether a person suffers
from damaged tissues or organs. Object detection and image classification of medical images
have therefore received attention in the medical field. This paper puts forward a varicose vein
detection algorithm based on images of varicose superficial venous thrombophlebitis and a
multi-scale neural network (MSNN) algorithm. The varicose vein diagnosis system achieves
better accuracy and performance due to the uniqueness of leg vein images. The proposed system
uses multi-scale technology, pre-processing the images and extracting features to create a
high-performance varicose disease detection system.
Index Terms— Vascular endothelial cells, inflammation, Multi-scale neural network, Sclerotherapy,
Endovascular Laser Therapy.
I. INTRODUCTION
Biometrics is automated recognition used to identify individuals by means of physical characteristics. Veins
are blood vessels that carry blood from every part of the body back to the heart. Varicose veins appear as
dark blue or purple veins that are swollen and enlarged. Digital medicine is a category in the medical field
that combines prescription medication with an ingestible sensor component.
Lau et al. [1] proposed a deep learning algorithm that achieves a good recognition system by training on several images of patients' skin damage. Mohammed et al. [2] used a 3-dimensional convolutional network to classify and recognize patients with Alzheimer's disease. Kawahara et al. [3] used a CNN structure to obtain a structural connectivity map from DTI in MRI scans. Yuan [4] used a multi-dimensional CNN algorithm to classify skin tissue damage, feeding each CNN model a different image resolution. Al-Antari et al. [5] used a modified neural network combined with a support vector machine to detect breast tumors, suggesting that the given model improves the performance of breast image classification.
Varicose veins in the lower limbs are the most common disease of the outer vessels. Around 23% of people in the world suffer from varicoses in their legs and knees, and at present the number of people having varicose veins worldwide has exceeded 25 million. In China, the prevalence of varicoses has exceeded 8%. In addition to affecting esthetics, varicose veins can also cause complications such as bleeding in the lower limbs, superficial vein swelling, inability to walk long distances, and reduced work capacity. Varicoses are generally enlarged and swollen valves that appear on the legs and feet of the patient's body. The condition arises because the blood flow there is against gravity, and it occurs after prolonged standing. Varicose veins cause aching pain in the legs and feet, discomfort, and loss of joint function, which can lead to circulatory problems.
The main problem facing doctors today is the difficult access to veins for intravenous drug administration.
Figure 1 Figure 2
The first work was done by Lau et al. [6], in which a network algorithm successfully achieved a good detection system using 100 MRI images of skin damage. Kawahara et al. [7] applied an MSNN structure that connects the structural connectivity map obtained from DTI and MRI images. The main advantage of the MSNN algorithm is that it makes use of spatial coherence, that is, it gives equal importance to all the edges of the network, and it is independent of geometrical transformations. The treatments used for varicose vein disease are injection treatment (sclerotherapy), laser treatment, and vein surgery. Surgery is considered one of the more painful methods, as it involves ligation and pulling out of the veins; this causes the patient massive pain, and the patient takes a long period of time to recover from the surgery.
Sclerotherapy is performed by injecting medicine into the blood vessels or lymph vessels, which makes the swelling shrink. This causes considerable pain in the swollen area, as it is difficult to locate the affected vein exactly. Sclerotherapy is commonly used in the treatment of spider veins, a mild version of varicose veins, and the procedure is non-surgical as it requires only an injection. Complications of varicose veins include draining of blood, skin changes, rash, ulcers, infection, bleeding, and blood clots. Endovascular laser surgery treats varicose veins by delivering laser energy into the vein, producing heat which shrinks the varicose veins. A model is designed to examine the bandwidth of the ultrasound devices and various values of the return speed; the simulation determines the bandwidth at which necrosis in the valves of the blood vessels can be detected.
A. Multi-scale Neural Network
Specialists and various research results that have greatly advanced medical image research favor the use of deep convolutional neural networks on medical images. However, because of the absence of open large-scale datasets, most research using deep convolutional neural networks is carried out on proprietary small datasets, and the robustness of the networks is not excellent. Fig. 2 represents the multi-scale neural network.
Furthermore, deep convolutional neural networks use supervised training strategies. To further improve network generalization, it is necessary to deepen the network hierarchy or look for a more sensible network structure; as the depth of the network grows, more labeled training data is required for adaptation. This paper combines elements of the GoogLeNet network structure, VGG, and the literature to build a lightweight deep convolutional neural network with improved feature extraction. The letter "C" denotes a convolution layer, "M" denotes the MFM activation layer, "P" denotes max pooling, and "Fc" denotes a fully connected layer.
B. Inception Layer
GoogLeNet is a neural network that uses the Inception model for training and evaluating a given dataset. GoogLeNet utilizes multiple filter sizes in the Inception layer and uses max pooling sparingly, with two filter sizes for halving the resolution of the images. GoogLeNet differs from other similar neural network architectures in that it uses 1x1 convolution layers and global mean pooling. Fig. 3 shows the Inception model structure.
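The benefit of the 1x1 convolutions mentioned above can be seen by counting parameters. The sketch below is an illustration, not the paper's code; the channel sizes 192, 16, and 32 are assumed values chosen only to show the arithmetic of an Inception-style bottleneck:

```python
def conv_params(k, c_in, c_out):
    """Weights in a k x k convolution mapping c_in -> c_out channels (bias omitted)."""
    return k * k * c_in * c_out

# Direct 5x5 convolution: 192 input channels -> 32 output channels.
direct = conv_params(5, 192, 32)  # 5*5*192*32 = 153,600 weights

# Inception-style bottleneck: a 1x1 conv first reduces 192 -> 16 channels,
# then the 5x5 conv operates on the reduced representation.
bottleneck = conv_params(1, 192, 16) + conv_params(5, 16, 32)  # 3,072 + 12,800 = 15,872

print(direct, bottleneck)  # 153600 15872
```

With these assumed sizes the bottleneck cuts the parameter count by roughly a factor of ten, which is why GoogLeNet stays lightweight despite its width.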
C. Activation Function
An activation function is a module that produces small outputs for small inputs and large values once the input surpasses a threshold. Examples of activation functions include ReLU, Sigmoid, the step function, and Leaky ReLU. The mathematical representation of the activation function is given in Figure 4.
D. Relu
ReLU is a rectified function: the output is zero if the input is less than zero, and equals the input if the input is greater than or equal to zero. Fig. 5 shows the equation and Fig. 6 shows the graph of the ReLU function.
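The piecewise definition above can be written directly in code. This is a generic sketch of ReLU and its Leaky variant, not the paper's implementation; the slope alpha=0.01 is a conventional default, not a value from the paper:

```python
def relu(x):
    """ReLU: 0 for negative inputs, identity for inputs >= 0."""
    return x if x >= 0 else 0.0

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: a small slope alpha for negative inputs instead of a hard zero."""
    return x if x >= 0 else alpha * x

print(relu(-3.0), relu(2.5))  # 0.0 2.5
print(leaky_relu(-3.0))       # -0.03
```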
Figure 3 Figure 4
Figure 5
Figure 6
Varicoses are classified into many types based on the severity of the disease in the patient's body. In this paper, each part of the image segmentation pipeline is developed to detect varicoses: image pre-processing, feature extraction, and finally segmenting the detections over the original images. Using the algorithm, the classification results indicate how reliably the infected veins are identified. Fig. 7 and Fig. 8 show the block diagrams of the training and testing stages of the network.
IV. WORKING
A. Image Preprocessing and Feature Extraction
The images in the dataset are pre-processed and features are extracted using the mean pixel value method. The images are then separated into a training set and a testing set.
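The mean-pixel-value step can be sketched as follows. The paper does not give its exact procedure, so this is a minimal illustration: each grayscale image is reduced to its mean intensity, and the feature set is split into training and testing subsets. The 80/20 split, the fixed seed, and the tiny 2x2 sample images are assumptions for the example:

```python
import random

def mean_pixel_value(image):
    """Average intensity of a grayscale image given as a 2-D list of pixel values."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)

def train_test_split(features, train_fraction=0.8, seed=0):
    """Shuffle and split extracted features into training and testing sets."""
    rng = random.Random(seed)
    shuffled = features[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

images = [[[10, 20], [30, 40]], [[0, 0], [0, 4]]]  # two tiny 2x2 "images"
features = [mean_pixel_value(img) for img in images]
train, test = train_test_split(features)
print(features)  # [25.0, 1.0]
```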
B. Neural Network Training
After pre-processing and feature extraction, the MSNN algorithm is trained on the training set. The trained network is then evaluated and validated on the testing set by matching the testing samples against the original images, and the accuracy of the MSNN algorithm is measured. The accuracy on the testing set tends to increase with the number of training iterations. The efficiency and error curves of the MSNN model in maximum aggregate sampling mode are shown.
C. Network Performance Comparison
The amount of training parameters of the MSNN model network is related to the depth of the network and the
number of network filters in each layer. A deeper network layer and a wider network width can capture more
network features and improve the network's ability to represent features. The network parameters of NIN and
Google-Net are smaller than VGG. Network training time is shorter, resource overhead is less, and network
execution efficiency is higher. At the same time, the MSNN model constructed in this article introduces a deeply
separable convolutional layer, the calculation size of the model parameters is the smallest, and the network runs
171
the fastest not only in the network. A desktop computer with a graphics card can run fast and can run quickly on
a portable mobile device.
Combined with image multi-scale technology, it can still achieve high classification accuracy while reducing the
parameters of the deep convolutional neural network model.
Figure 7 Figure 8
V. RESULTS
The accuracy of the proposed MSNN model over 30 epochs (150 iterations) is shown in Fig. 9. The model is also used to predict test samples of the dataset, and the confusion matrix of the predicted values is displayed in Fig. 10.
Figure 9 Figure 10
VI. CONCLUSION
There is a positive correlation between vascular endothelial cell inflammation and varicose veins of the lower
extremities. Therefore, this paper uses vascular endothelial cells as a research object to construct a deep
convolutional neural network for lower limb varicose veins to improve classification and recognition accuracy.
The accuracy of the presented MSNN work comes close to meeting the acceptable error levels that would be required for a system classifying a patient's leg images. Compared with existing deep convolutional neural network models, the network improves feature extraction ability and runs fast.
REFERENCES
[1] R. Panneer Selvi, R. Sasikumar, S. Deva Priya, C. Jeganathand, Real-Time Epidemiology of Varicose Veins and Chronic Venous Disease Prediction Using Decision Tree Algorithm, Turkish Journal of Computer and Mathematics Education, Vol. 12, No. 9, 2021.
[2] Krishnarani M, Malini V, Rathna P, Sharmila A, Vaizhnavi G, Analysis of Varicose Veins in Lower Limbs through Multiscalar CNN, IJESC, Volume 10, Issue No. 4, 2020.
[3] Ruizong Zhu, Huiping Niu, Ningning Yin, Tianjiao Wu, Yapei Zhao, Analysis of varicose veins of lower extremities based on vascular endothelial cell inflammation images and multiscale deep learning, November 2019.
[4] Detection of Diseases Using Machine Learning Image Recognition Technology in Artificial Intelligence, Jian Huang, Jing Li, Zheming
[5] Li, Zhu Zhu, Chen Shen, Guoqiang Qi, Gang Yu, Hindawi Computational Intelligence and Neuroscience, Volume 2022, Article ID 5658641.
[6] Suma V R, Amog Shetty, Rishab F Tated, Sunku Rohan, "CNN based Leaf Disease Identification and Remedy Recommendation System".
[7] Proceedings of the Third International Conference on Electronics Communication and Aerospace Technology [ICECA 2019], IEEE Conference Record #45616; IEEE Xplore ISBN: 978-1-7281-0167-5.
[8] Al-Antari M A, Al-Masni M A, Mun-Taek C, et al. A fully integrated computer-aided diagnosis system for digital X-ray mammograms via deep learning detection, segmentation, and classification[J]. International Journal of Medical Informatics, 2018, 117:44-54.
[9] Lau H K, Chang J W, Daut N, et al. Exploring Edge-Based Segmentation Towards Automated Skin Lesion Diagnosis[J]. Advanced Science Letters, 2018, 24(2):1095-1099.
[10] Shen W, Zhou M, Yang F, et al. Multi-scale Convolutional Neural Networks for Lung Nodule Classification[J]. Inf Process Med Imaging, 2015, 24:588-599.
[11] Haxhe J, De Maeseneer M, Schoevaerdts J. Le traitement des varices en Belgique: où allons-nous? Impact des nouvelles technologies sur les pratiques médicales [The treatment of varicose veins in Belgium: where are we going? The impact of new technologies on medical practice]. Phlébologie 2010;63(2):9-14.
[12] R. Kenneth, "User Authentication via Keystroke Dynamics: an Artificial Immune System Based Approach", in Proceedings of 5th International Conference on Information Technology, 2011.
Abstract— Senders and recipients both put a lot of work into managing their email, and perhaps some of this work can be automated. To determine (i) the kinds of automated e-mail handling users desire and (ii) the types of data and computation required to support that automation, we conducted a mixed-methods need-finding study. We organised the demands through a design workshop, ran a poll to further understand those categories, then categorised the email automation software already on the market to determine which criteria have been satisfied. Our findings point to the necessity of a richer data model for rules, additional attention-management options, use of the context of internal and external emails, complex processing such as response aggregation, and sender-friendly features. We created a framework for producing short stories to better explore our findings. An efficient information system gives users accurate, timely, and pertinent information that they may use when making decisions, both for current operations and for long-term strategic planning. To ensure that the decisions made are the right ones, the decision-making process must be supported by timely and pertinent data and information. Information is created as a result of data processing, which information systems perform using information technology. Data management is required to ensure that information is the right information, at the right time, accurate, and pertinent. Data is the building block of information and is gathered in a database. To collect the necessary information on the academic organization of an institution, for instance, a university must create an academic database that at the very least includes student, lecturer, course, room, and schedule data. Therefore, a basic understanding of databases and Database Management Systems is required in order to build a successful database. Database management solutions are used to organize the massive volumes of data that businesses use on a regular basis. Managers need to be able to swiftly and readily discover certain facts so that they can make decisions. The company divides the complete data collection into a series of linked data tables; by reducing data repetition, these linked small collections of data ultimately improve data consistency and accuracy. Most businesses today employ databases with a relational structure. An automated email is any message that is automatically sent from your email service provider (ESP) in response to a particular user's actions (or inactions) on your website or web app. You may use automated mail to provide one-to-one communications to customers in real time, enhancing their engagement, loyalty, and retention.
Index Terms— SQL, SMTP, Python, Pandas, NumPy, E-mail, Automation etc.
I. INTRODUCTION
In the 51 years since its creation, email has developed into not just a commonplace instrument for individual and
II. MOTIVATION
Our initial promise is to find the best solution and overcome current industry issues. The project will improve our profile, being a sponsored project. Our approach will lessen manual labour, give us first exposure to industry, and maximise effectiveness.
V. STATEMENT
Automated emails are issued to non-fillers, asking them to pay their invoices as soon as possible.
Phase 1: Import the organization's pre-existing dataset for the non-fillers category.
- The organisation has filtered out this dataset by applying manual filters on an Excel sheet.
- Gather all transaction information for each consumer that falls into the non-filler category.
- Keep track of these transactions in separate Excel files using the GSTIN number.
- Send each non-filler an automated email by obtaining their email addresses from the organization's database.
Phase 2: Import the GSTR2-A company register and portal datasets.
- Compare the data on the terms set forth by the business.
- Divide the data into two categories: the data matched across both datasets falls into the fillers category; the unmatched data falls into the non-fillers category.
- Export the filtered-out non-filler dataset onto a separate Excel sheet for use in the following phase.
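Phase 2's matching and drafting steps can be sketched in Python. The function names, the GSTIN sample values, and the use of set intersection are illustrative assumptions; the organisation's actual comparison runs over its own Excel sheets, and sending would go through smtplib rather than the draft-only step shown here:

```python
from email.message import EmailMessage

def split_fillers(register_gstins, portal_gstins):
    """Records present in both datasets are fillers; the rest are non-fillers."""
    register, portal = set(register_gstins), set(portal_gstins)
    fillers = register & portal      # matched in both datasets
    non_fillers = register - portal  # in the register but missing from the portal
    return fillers, non_fillers

def draft_reminder(gstin, address):
    """Build (but do not send) an automated payment-reminder email."""
    msg = EmailMessage()
    msg["To"] = address
    msg["Subject"] = f"Invoice pending for GSTIN {gstin}"
    msg.set_content("Our records show your invoice is unpaid. Please pay at the earliest.")
    return msg

fillers, non_fillers = split_fillers(["G1", "G2", "G3"], ["G2", "G3", "G4"])
print(sorted(non_fillers))  # ['G1']
```

In a real run each drafted message would be handed to an SMTP connection; keeping the drafting separate makes the matching logic easy to test offline.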
VIII. CONCLUSION
Users want their email management to be more automated, according to our research. The results of the three different need-finding probes we carried out consistently pointed to a few common categories of email needs: capturing richer data models and internal and (time-varying) external context, using them for recipients to manage attention and for senders to lessen recipient load, and automated content processing to, for example, aggregate replies to an invitation or extract attached photos into a relevant storage location. Users also want protection from hackers that abuse the characteristics of email today. We discovered people utilizing attention-management techniques that recycle existing email features: for instance, across numerous probes users have been seen marking emails as unread to act as a reminder to revisit and read them.
ACKNOWLEDGEMENT
Perseverance, inspiration and motivation have always played a key role in the success of any venture. At this level of understanding it is difficult to cover the wide spectrum of knowledge without proper guidance and advice; hence we take this opportunity to express our sincere gratitude to our respected Project Guide, who evolved in us an interest in working on and selecting an entirely new idea for project work. He has been keenly co-operative and helpful to us in sorting out all the difficulties. I am also grateful to my classmates and cohort members, especially my office mates, for their editing help, late-night feedback sessions, and moral support. Thanks should also go to the librarians, research assistants, and study participants from the university, who impacted and inspired me. We would also like to thank our HOD and Principal for their continuous advice and support.
REFERENCES
[1] Daniel Gruen, Steven L. Rohall, Suzanne Minassian, Bernard Kerr, Paul Moody, Bob Stachel, Martin Wattenberg, and
Eric Wilcox. 2004. Lessons from the reMail prototypes. In Proceedings of the 2004 ACM conference on Computer
supported cooperative work (CSCW ’04). ACM, New York, NY, USA, 152–161
[2] The Lifetime of Email Messages, Ahmed Hassan Awadallah.2018 History Of Emails.
[3] Database Management System Individual Research Report, Edward Hung 2001 Data Base Management & Data
Comparison
[4] Control System Over Database Management, M.C. Crowley 2005 Automation.
[5] A Study On Data Management Tools, L. Ramanan, M Kumar, K.P.V Ramanakumar 2013 Data Segregation Concepts
[6] Automated Response Suggestion For Emails, Anjuli Kannan, Karol Kurach, Sujith Ravi 2016 Automation of Emails
[7] Gnanaprakash, V., N. Kanthimathi, and N. Saranya., Automatic number plate recognition using deep learning, IOP
Conference Series: Materials Science and Engineering, Vol. 1084. No. 1. IOP Publishing, (2021).
[8] R. Madhanraj, T. Vigneshwaran, N. Palanivel ap, M. Srivarappradhan, Automatic number plate detection in vehicles
using faster R-cnn, Ieee international conference on system, computation, automation and networking (icscan), July
2020.
[9] Sahu, C.K., Pattnayak, S. B., Behera, S. & Mohanty, M. R., A Comparative Analysis of Deep Learning Approach for Automatic Number Plate Recognition, Fourth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), 932-937, (2020).
[10] Gupta, Shally, Rajesh Shyam Singh, and H. L. Mandoria, A Review Paper on Automatic Number Plate Recognition
System, (2020).
[11] Ganta, S., & Svsrk, P., A novel method for Indian vehicle registration number plate detection and recognition using
image processing techniques, Procedia Computer Science, 167, 2623-2633, (2020).
[12] Y. Sukesh, M. Ramana Sai Prasada Varma, S. Usha Kiruthika, Vehicle Number Recognition from Vehicle Images
using CNN, International Journal of Innovative Technology and Exploring Engineering (IJITEE), ISSN: 2278-3075,
Volume-9 Issue-6, April (2020).
Abstract— Cloud computing is a contemporary computing technology in which platforms, data storage, and IT services are delivered over the Internet. Task management is crucial for effective scheduling and affects the overall effectiveness of cloud computing environments, given the full availability of resources and the significant number of tasks assigned. In cloud environments, security is a crucial concern in addition to timing. Since cloud computing services go beyond data archiving and backup, supporting data dynamics through the most popular types of data manipulation, such as block modification, insertion, and deletion, is also crucial for practical use. Prior attempts to ensure remote data integrity have not always provided both public auditability and dynamic data manipulation, but this work accomplishes both. We first identified the challenges and potential security concerns of direct extension with fully dynamic data updates from prior work, and then seamlessly incorporated these two crucial features into the protocol design. In particular, we enhance existing proof-of-storage models by modifying the conventional Merkle hash tree structure for block tag authentication to achieve effective data dynamics, and show how to construct an elegant validation scheme. To secure cloud data storage, a variety of techniques have been put into practice [1]; the safety-analysis method described in [1], however, is not a useful technique. This work uses the new idea of smart-card authentication to provide security for cloud data storage; smart-card authentication is an effective method for making data storage in the cloud more secure. We implemented this prototype in accordance with the CPDP scheme within the virtualization framework of a cloud-based storage service, illustrated in Figure 5 with the Hadoop Distributed File System (HDFS) as an example. HDFS is a distributed, scalable, and portable file system [14]. Its architecture is made up of NameNodes and DataNodes, where NameNodes translate filenames to a collection of block indices and DataNodes hold the actual data blocks. The NameNode's index hash table and metadata must be integrated in order to support the CPDP scheme, provide query services based on hash values or index-hash records, and implement a verification protocol.
Index Terms— cloud computing, cloud security, hybrid clouds, public verifiability, and storage
security.
I. INTRODUCTION
Hybrid clouds effectively offer dynamic scaling of services and data transfers by integrating a variety of private
and public cloud services. For instance, a customer can combine data from various private or public providers
into one backup file or archive (see Figure 1). As an alternative, a service can take data from other services that
are located in a private cloud and store it in its own storage, creating a hybrid cloud.
migration, and collectively gathers and stores customer data. Because operating costs are lower, communication complexity can be reduced to a minimum.
Qian Wang et al. [2] introduce a new scheme that enables remote data integrity auditing with dynamic data manipulation. The work first pinpoints specific scaling issues and potential security concerns with fully dynamic data updates. By manipulating the conventional Merkle Hash Tree (MHT) structure, which is used to validate block tags, they achieve efficient data dynamics and enhance retrievability models. This is a very effective and safe technique [2].
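The MHT used for block-tag validation can be illustrated with a minimal Merkle-root computation. This is a generic sketch with SHA-256, not the exact construction of [2]: the key property is that changing any data block changes the root, which is what lets a verifier detect tampering.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Leaf/internal node hash: SHA-256."""
    return hashlib.sha256(data).digest()

def merkle_root(blocks):
    """Fold leaf hashes pairwise up to a single root hash."""
    level = [h(b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

blocks = [b"block0", b"block1", b"block2", b"block3"]
root = merkle_root(blocks)
tampered = merkle_root([b"block0", b"blockX", b"block2", b"block3"])
print(root != tampered)  # True: modifying one block changes the root
```

A verifier holding only the root can later check any block against it via an authentication path of sibling hashes, which is the mechanism the MHT-based schemes rely on.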
A model-driven allocation framework was put forth in 2012 by Tekin Bicer, David Chiu, and Gagan Agrawal [3]. Data-intensive applications running in hybrid cloud environments can benefit from this technique's support for time- and money-efficient execution. With a 3.6 percent error rate, it can meet implementation deadlines, stick to budgetary restrictions, and shorten application run times.
Using currently deployed cloud computing systems, Haoming Liang, Wenbo Chen, and Kefu Shi [4] proposed a method for analyzing programming and task scheduling models. The programming process, its modification process, and the flow of replacing services and resources are all explained with the help of examples.
In 2010, Ravi Sandhu, Raj Boppana, and Ram Krishnan put forth a fresh idea for integrating mission-driven performance, resilience, and security policies into the computing and communication infrastructure by integrating hooks and supporting protocols into the cloud. This methodology can effectively address the twin cloud security and accessibility issues [5].
A dynamic user-integrated cloud computing architecture was introduced in 2011 by Guannan HU and Wenhao ZHU [5]. This model expands the capabilities of cloud computing data centers by actively integrating clients with storage and computing capability; services are offered to other users through client cooperation with the data center [5]. To better meet the practical learning needs of lifelong learners, Xiang Li, Jing Liu, Jun Han, and Qian Zhang proposed a micro-learning platform architecture constructed with cloud computing expertise; the article describes the layered structural design of the cloud-based micro-learning platform and details its intent [6].
In 2009, Xinwen Zhang, Anugeetha Kunjithapatham, Simon Gibbs, Joshua Schiffman, and Sangoh Jeong proposed a solution for authentication and secure session management between weblets running on the device side and weblets in the cloud [7]. It allows cloud weblets to access sensitive user data through external web services and offers protected migration. In business environments, it enables application integration between private and public clouds [8].
In 2010, Yan Zhu, Huaixi Wang, et al. [9] suggested a hybrid cloud data possession scheme that facilitates data migration and service scalability. This opens up possibilities in which several cloud service providers collaborate to store and manage customer data. Less overhead and simpler communication are the outcomes of this plan.
Qian Wang et al. [10] recommend a protocol that outlines the challenges and potential security concerns of direct extension with fully dynamic data updates before demonstrating how to create complex validation schemes for error-free integration. It relies on block tag validation using the conventional Merkle Hash Tree (MHT) structure [10].
In 2012, Arash Nourian and Muthucumaru Maheswaran introduced a new image coding scheme that transforms images using chaos maps, applied after random masking, to enhance image privacy and enable the cloud to carry out some types of computation.
Jia Yu, Rajkumar Buyya, and Kotagiri Ramamohanarao presented a method in 2008 for allocating the proper
resources to workflow tasks in order to complete their execution and enable each user to perform the desired
function. It makes an effort to enhance the workflow scheduling algorithms currently in use that have been
created and used by various grid projects [13].
Luis Mendonça and Henrique Santos published research findings and test results in 2012 that defined an efficient set of traffic parameters able to model both normal and abnormal network behavior and to reveal anomalous, coordinated activity, focusing on the detection of botnet movements. The detection framework model was also designed and tested with actual traffic gathered at the University of Minho campus edge [15].
A new security load balancing architecture based on multilateral security (LBMS) was proposed by Pengfei Sun, Qingni Shen, Ying Chen, Zhonghai Wu, and Cong Zhang in 2011. This architecture provides the ideal physical machine from a security standpoint by automatically migrating tenant VMs during peak loads. CloudSim, a cloud computing simulator, is the foundation of this method. When VMs move to physical machines for load balancing, this design tries to prevent potential attacks.
A new hybrid scheme that combines anomaly and signature detection with honeypots was put forth by Pragya
Jain and Anjali Sardana in 2012. To enable real-time system capabilities, the first stage used signature-based
detection of known worm attacks. Anomaly detectors can quickly spot deviations from the norm at the second
level. Honeypots are used to identify zero-day attacks at the top level. It provides the advantages of a resource-
efficient honey farm by utilizing honeypots and both detectors. Regulators route data traffic to the proper
honeypots [18].
On cloud servers, many users keep their files, and in a batch system each user validates his own data. To accomplish this, a number of scheduling and priority algorithms are employed (e.g., bottleneck and deadlock handling); therefore, the auditing time will be very low.
To determine whether the evidence passes an integrity check, a verifier (TTPA) runs a verification algorithm: if the validation succeeds, it returns TRUE; otherwise, it returns FALSE. Finally, the user receives a thorough test report. In the system description, we discussed the use of the TTPA for hybrid cloud data integrity verification.
Three main client-side phases make up our scheme: the initialization stage, the key generation stage, and the tag generation stage.
Step 1: During the KeyGen stage, the user (or data owner) first obtains a certificate by sending a request to the CA (Certificate Authority) with his name and public key. The format of certificates issued by the CA is as follows:
C(DO) = [ID(DO), verDO, sigCA(ID(DO), verDO)]
where sigCA(ID(DO), verDO) denotes the CA's digital signature, produced using a proprietary algorithm.
Step 2: The data owner uses the verification algorithm verCA[C(DO)] to confirm permissions.
Step 3: The TTPA and the cloud server must receive this certificate in a secure manner, since a perpetrator might pose as a trustworthy user and acquire a valid certificate containing the user's public key and identity information. To prevent this, the consumer gives cloud servers and verifiers access to the certification authority's verification algorithm verCA[C(DO)]. The cloud server additionally obtains a certificate:
C(DO) = [ID(DO), verDO, sigCA(ID(DO), verDO)] ---(9)
C(CS) = [ID(CS), verCS, sigCA(ID(CS), verCS)] ---(10)
Step 4: During the exchange of communications between the TTPA and the cloud server, a safe, authenticated channel is created using a session key, and the two parties communicate using that same session key. The cloud server signs the generated proof (P) with its private key (SK) and then sends it, together with C(DO), to the TTPA.
Step 5: After receiving the signature, the TTPA first verifies the certificate using the verification algorithm. If the certificate is valid, the response is checked using its public key (PK); otherwise, the consistency check stops.
Step 6: In the next step, the proof reply is decrypted using the public key (PK). This is done after verification of the certificate of authenticity; the authenticity of the server is then established by verifying the signature.
Step 7: Finally, the TTPA checks the proof by running the verification algorithm. By doing this, active attackers are prevented from altering the data in any way.
V. RESULT ANALYSIS
Table 1. Attack prevention
Insider threats: Yes
External attacks: Yes
Eavesdropping: Yes
Identity theft: Yes
The attack preventions proposed are summarized above in Table 1.
An analysis of first-factor authentication is presented in Table 2. Here, the number of bits used to generate the secret value depends on the number of bits used to create the token. The analysis shows that the proposed method reduces the load on the server, because the server stores only its own private key.
VI. CONCLUSION
In any environment, security is crucial to the transmission of data from sender to receiver. When data is consistent, it is kept in a single place across many nodes. The cloud concept was introduced for data storage and data integrity. If multiple clouds are implemented and data is stored dynamically, it is typically possible to check nodes from one cloud to another and back up dynamic data, although adding unstructured data to Hadoop remains challenging. By authenticating a node from one cloud to another, we use the hybrid cloud concept to dynamically store data exposed from any node in any cloud; we describe this idea and are working to make it a reality. Data ownership is meant to be dynamic, so whenever one of a node's clients wants to access data kept in another cloud, it must first approve public access to the data. The suggested method improves authentication, lessens the chance of eavesdropping, and guards against a number of attacks such as DoS and replay attacks.
Grenze International Journal of Engineering and Technology, June Issue
Abstract— Fires in public places such as shopping malls, hospitals, and train stations can endanger both people and resources, and have been a major concern for decades. Preventing these accidents should be a priority. Various techniques exist for this purpose, but each has loopholes. To overcome them, we have developed a model that detects fire in images and video frames and, as soon as it detects fire, sends an alert message to the nearest fire station and the related authorities. The main purpose of this model is to prevent fire accidents and minimize the human workload. This paper uses the CNN algorithm to build the system.
I. INTRODUCTION
Fire accidents are one of the biggest threats to industries, social gathering places, hospitals, malls, and various
densely populated areas across the world. These kinds of incidents may cause damage to property and the
environment and pose a threat to human and animal life. According to the recent National Risk Survey Report
[1], fire was ranked third in terms of its impact across various fields, among many other risks. Fire accidents in different countries have led to ecological disasters, claiming many lives and causing billions of dollars in damage. Detecting fire at an early stage can therefore save many lives and resources and prevent much of the damage it causes. In order to achieve high accuracy and robustness in dense urban areas, detection through local surveillance is both necessary and effective. Traditional fire systems had numerous issues, such as false alarms that rang when there was no fire, and their maintenance was difficult; the use of sensors in hot, dusty industrial conditions is also impractical. Detecting fires through surveillance video streaming is therefore one of the most feasible and cost-effective solutions, suitable for replacing existing systems without large infrastructure installations or investments. Video-based machine learning models, however, rely heavily on existing domain knowledge.
As a result, they must be updated to meet new threats. So our model can detect fire in a video or image frame
and send an alert message as soon as it detects fire. It can be used to detect fires in surveillance videos. Unlike
existing systems, this neither requires special infrastructure for setup, like hardware-based solutions, nor does it
need domain knowledge and prohibitive computation for development.
E. Deployment:
After the software has been certified and no problems or errors have been reported, it is deployed. The software
may then be delivered as is or with proposed improvements in the object portion depending on the assessment.
The maintenance of the software starts once it has been deployed.
F. Maintenance:
Once the client begins using the built system, the real problems surface and periodic problem-solving is required. Maintenance is the process in which the developed product receives this attention.
This built model will collect data from CCTV or surveillance footage and process it gradually in real time.
Frame by frame, the video is processed, and then the processed frames are fed into the pretrained CNN model.
This pre-trained CNN model will categorise frames into two groups in real time: one with fire and the other
without fire. This pre-trained CNN model may be set up to operate on a distant server using data from various
video surveillance systems. After processing these inputs, the pre-trained CNN model outputs a real-time
prediction on the real-time streaming data. To ensure that no frames are lost, streaming frames will be kept in data storage. As this stored knowledge grows, the model can be retrained on it: a rich dataset is produced, the model is trained on a large number of frames from it, and the model's frame prediction accuracy grows as a result. Since the architecture of the monitoring
system won't need to be changed, this fire detection will be affordable. The model uses information from current
CCTVs or surveillance systems to forecast the presence of fire. This architecture is shown in Fig. 1.
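The frame-by-frame pipeline described above can be sketched as a loop. To keep the sketch self-contained, the trained CNN is replaced by a hypothetical stand-in predictor (a brightness threshold on the red channel); the frame shape, the 0.5 threshold, and the alert callback are illustrative assumptions, so only the control flow mirrors the architecture.

```python
import numpy as np

# Sketch of the real-time pipeline: preprocess each frame, store it so no
# frame is lost, classify it, and raise an alert on a "fire" prediction.

def preprocess(frame: np.ndarray) -> np.ndarray:
    return frame.astype(np.float32) / 255.0      # scale pixels to [0, 1]

def predict_fire(frame: np.ndarray) -> bool:
    # Stand-in for the pretrained CNN's predict(): "fire" when the red
    # channel of the preprocessed frame is bright on average.
    return float(frame[..., 0].mean()) > 0.5

def process_stream(frames, on_fire):
    stored = []                                   # frame archive for retraining
    for i, frame in enumerate(frames):
        x = preprocess(frame)
        stored.append(x)                          # keep every frame
        if predict_fire(x):
            on_fire(i)                            # alert the authorities
    return stored

alerts = []
dark = np.zeros((8, 8, 3), dtype=np.uint8)        # no fire
bright = np.full((8, 8, 3), 200, dtype=np.uint8)  # fire-like frame
kept = process_stream([dark, bright, dark], alerts.append)
print(alerts)   # indices of frames that triggered an alert -> [1]
```

In a deployment, `frames` would come from the CCTV stream and `on_fire` would send the alert message; the stored frames feed the retraining loop described above.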
V. CNN
CNN is an example of an eager-learner algorithm and one of the deep learning models. Its classification performance is excellent, and it is currently the most effective algorithm for classifying images. CNNs have enjoyed enormous popularity for image classification since LeNet, which classified handwritten digits with very good accuracy. A convolutional neural network is made up of four processing layers: the convolution layer, the ReLU layer, the pooling layer, and the fully connected layer. One layer's output is used as the input of the next.
1. The convolution layer, which forms the foundation of the CNN, slides several kernels over the input data and computes a dot product at each position to produce a feature map.
2. The ReLU layer replaces negative values with 0 while leaving other values unchanged.
3. The pooling layer selects the maximum value in a small region of each feature map, reducing its size and producing translation invariance.
4. The fully connected layer receives the output of the previous layers as its input and classifies the images using the weights learned from the training data.
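A minimal forward pass through the four layers named above can be written in plain NumPy. The shapes, random weights, and two-class output are arbitrary illustrative values, not the trained model from this paper.

```python
import numpy as np

def conv2d(x, k):
    # Slide kernel k over x; each position is a dot product -> feature map.
    h, w = x.shape
    kh, kw = k.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def relu(x):
    return np.maximum(x, 0)            # negatives become 0, rest unchanged

def max_pool(x, size=2):
    # Keep the maximum of each size x size block (translation invariance).
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

def fully_connected(x, w, b):
    return x.reshape(-1) @ w + b       # flatten, then weighted sum per class

rng = np.random.default_rng(0)
img = rng.random((8, 8))                        # toy grayscale input
kernel = rng.random((3, 3))
feat = max_pool(relu(conv2d(img, kernel)))      # 8x8 -> 6x6 -> 3x3
logits = fully_connected(feat, rng.random((9, 2)), np.zeros(2))
print(logits.shape)   # two class scores, e.g. fire vs. no fire -> (2,)
```

Each function corresponds to one numbered layer above, and the composition shows how one layer's output becomes the next layer's input.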
VI. ADVANTAGES
1) Easily detects whether there is fire or not.
2) Improved accuracy.
3) Time-saving.
4) Easy to use.
5) User-friendly application.
VII. APPLICATIONS
1) Fire Detection System.
2) Helps in Fire Fighting
VIII. CONCLUSIONS
In conclusion, a sizable archive of recordings of forest fires in a range of scene conditions has been used to
investigate an aerial-based forest fire detection approach. The chromatic and motion characteristics of a forest
fire are first extracted, and then they are rectified using a rule to highlight the burning region, in order to increase
the detection rate. Second, smoke is also extracted using our suggested algorithm to address the issue of dense
smoke that almost completely engulfs the fire. In the actual application of aerial forest fire monitoring, the proposed system framework demonstrates its robustness with a high detection accuracy and a low false-alarm rate. Building challenging, scene-specific comprehension datasets for fire detection methods, together with in-depth trials, may be the main focus of future research.
Additionally, fire detection systems can be integrated with reasoning theories and information-hiding algorithms
to intelligently observe and authenticate the video stream and start the necessary actions in an autonomous
manner.
ACKNOWLEDGMENT
Sincere appreciation to Prof. U. L. Tupe and HOD Dr. Nihar Ranjan for their assistance in resolving project-
related issues. A particular thanks to the computer wizards who created lovely libraries that are time-saving.
Additionally, we appreciate the IEEE community's assistance with the use of libraries.
REFERENCES
[1] C. Kao and S. Chang, “An Intelligent Real-Time Fire-Detection Method Based on Video Processing,” IEEE (2003 Int.)
[2] C. Ha, U. Hwang, G. Jeon, J. Cho, and J. Jeong, “Vision-based fire detection algorithm using optical flow,” 2012.
[3] C. E. Premal and S. S. Vinsley, “Image Processing Based Forest Fire Detection using YCbCr Colour Model,” 2014.
[4] N. I. Binti Zaidi, N. A. A. Binti Lokman, M. R. Bin Daud, H. Achmad, and K. A. Chia, “Fire recognition using RGB and YCbCr color space,” 2015.
[5] K. Poobalan and S. Liew, “Fire Detection Algorithm Using Image Processing Techniques,” December 2015.
[6] Khan Muhammad, Jamil Ahmad, Zhihan Lv, Paolo Bellavista, Po Yang, and Sung Wook Baik, “Efficient Deep CNN-Based Fire Detection and Localization in Video Surveillance Applications,” IEEE, 2018.
[7] Oxsy Giandi and Riyanarto Sarno, “Prototype of Fire Symptom Detection System,” IEEE, 2018.
[8] Sneha Wilson, Shyni P Varghese, Nikhil G A, Manolekshmi I, and Raji P G, “A Comprehensive Study on Fire Detection,” 2018.
[9] Jiang Feng and Yang Feng, “Design and experimental research of video detection system for ship fire,” IEEE, 2019.
[10] Ke Chen and Yanying Cheng, “Research on Image Fire Detection Based on Support Vector Machine,” IEEE, 2020.
[11] Huang Hongyu, Kuang Ping, Li Fan, and Shi Huaxin, “An Improved MultiScale Fire Detection Method Based On Convolutional Neural Network,” IEEE, 2020.
I. INTRODUCTION
A timetable is basically a structure that shows the times at which prescribed events occur. For educational institutes, a timetable serves the basic purpose of lecture delivery and is used for scheduling events throughout the day, week, term, or year for each batch. It requires combining resources such as batches of students, classes, instructors, time slots, and days, arranged so that no resource overlaps. This practice of mapping events (classes or exams) to time slots subject to constraints is carried out manually in most institutes, requiring a lot of manpower and time. Hence, timetabling gives rise to a scheduling problem that is tedious and that every institute must solve at least once or twice an academic year.
From the above discussion, timetabling is an NP-complete (nondeterministic polynomial-time complete) problem, i.e., a problem with no known efficient way to derive an exact solution. This scheduling problem falls in the class of computational problems for which no efficient algorithm that always gives an accurate solution has yet been found.
integrating PSO with a prototype methodology creates particles that can update themselves and have their own memory; unlike GA, however, PSO has no crossover and mutation operators to avoid premature convergence.
This section explained how others, through various approaches, used intelligence to solve the problem by setting rules, and how a classical genetic algorithm can prioritize these rules dynamically to optimize timetable generation, providing the benefits of a distributed solution and load balancing. It can serve as the best possible way to provide a solution, given the constraints and a proper convergence condition.
From the above discussion, many studies have approached timetable scheduling using AI with diverse techniques. However, these attempts do not keep pace with rapidly evolving demands. The well-established relationship between constraints and scheduling is crucial for providing an optimized solution to our problem. In light of this, we also studied the area of genetic algorithms in great depth, where researchers have made recent and advanced studies compatible with the evolving world.
Indeed, a brief analysis by Sourabh Katoch, Sumit Singh Chauhan, Vijay Kumar (2020) [9] on advances in
genetic algorithm and their implementation, has made a clear differentiation between all above discussed
attempts and motivated us for an approach using GA in our problem domain. Further for GA implementation and
mathematical modelling, the analysis made by L V Stepanov, A S Koltsov, A V Parinov, A S Dubrovin (2018)
[10] was studied and on its basis our proposed model was developed.
III. METHODOLOGY
A. Proposed System
In our approach we have developed a model for solving the timetabling problem based on the genetic algorithm in AI, which mimics natural selection and evolution. The basic terminology used here is as follows.
Phenotype refers to the population in the real world, whereas genotype refers to the population in the computational world.
In the phenotype, a population is generally a set of human beings; in the genotype, it is a set of solutions (here, a set of generated timetables; a population is also sometimes referred to as a generation).
Chromosome is the individual solution to given problem and a gene specifies one element position in a
chromosome.
To understand the implementation of GA and the genetic operators responsible for altering the composition of offspring, refer to the following part of this section. The implementation of GA in our model proceeds in the following steps.
Preprocessing
The prerequisite for performing GA operations is converting each potential solution into a simple value such as a string of real or binary numbers, which improves the speed of the algorithm. We convert the data to a binary string, i.e., a string of 0s and 1s. These bits represent the characteristics of the solution and affect the algorithm's accuracy. Each chromosome string is a sequential arrangement of gene strings.
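The encoding step can be sketched as follows. The 4-bit gene width and the use of subject indices as gene values are illustrative assumptions, not the paper's exact encoding.

```python
# Each gene (a slot assignment, here a subject index) becomes a fixed-width
# binary substring; a chromosome is the concatenation of its gene strings.

GENE_BITS = 4  # supports subject indices 0..15 (illustrative width)

def encode(slots):
    """Encode a list of subject indices as one binary string."""
    return "".join(format(s, f"0{GENE_BITS}b") for s in slots)

def decode(bits):
    """Recover the subject indices from a chromosome string."""
    return [int(bits[i:i + GENE_BITS], 2)
            for i in range(0, len(bits), GENE_BITS)]

chromosome = encode([3, 0, 7, 12])
print(chromosome)   # '0011000001111100'
```

Round-tripping through `decode` recovers the original slot assignments, which is what lets the genetic operators work purely on bit strings.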
Initial population
After encoding, the first step of the algorithm is to generate the initial population, which is done by randomly creating individuals on the basis of the defined constraints. The larger the population, the better the results tend to be. This process is sometimes also described as the selection process or, in GA terms, the reproduction operator.
Evaluation of population
The parameter used for evaluating an individual is known as fitness of an individual. The fitness of each
chromosome is determined within a generation, which is an estimate of how well the solution satisfies the given
constraints relative to other solutions.
As we have used binary encoding, our range for the fitness function will be between 0 (worst solution) and 1
(best solution). The fitness function defined in our approach is as follows,
f( ) = ∑ … ∗ h ∗ …
where:
d = days per week,
t = total fitness of a generation.
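One common concrete choice, shown here as an illustrative stand-in rather than this paper's exact formula, is to score a candidate timetable by the fraction of constraint checks it passes, which lands in the required [0, 1] range. The constraint checks and timetable layout below are hypothetical.

```python
# Illustrative fitness in [0, 1]: the fraction of constraint checks a
# candidate timetable passes (0 = worst solution, 1 = best solution).

def fitness(timetable, constraints):
    passed = sum(1 for check in constraints if check(timetable))
    return passed / len(constraints)

# Toy timetable for a single timeslot: batch -> (lecturer, room).
# The two checks mirror the hard constraints listed later: no lecturer
# and no room may appear twice in the same timeslot.
no_lecturer_clash = lambda t: len({v[0] for v in t.values()}) == len(t)
no_room_clash = lambda t: len({v[1] for v in t.values()}) == len(t)

good = {"batchA": ("drX", "r1"), "batchB": ("drY", "r2")}
bad = {"batchA": ("drX", "r1"), "batchB": ("drX", "r1")}
print(fitness(good, [no_lecturer_clash, no_room_clash]))  # 1.0
print(fitness(bad, [no_lecturer_clash, no_room_clash]))   # 0.0
```

Soft constraints could be folded in the same way with smaller weights, keeping the score normalized.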
After determining the fitness of all chromosomes in a generation, i.e., after completely evaluating the population, we need to select chromosomes for further mating so that new generations can be created. This is done using the roulette wheel selection method [9], whose basic principle is
Selection ∝ fitness
The idea of this technique is to divide a wheel into proportions based on fitness values, mapping each chromosome onto a suitable proportion. The wheel is then rotated, and the chromosome at which the pointer stops is selected for further reproduction. Because the proportions are based on fitness, the larger slices belong to fitter chromosomes, increasing the probability that the pointer stops at fitter individuals.
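The wheel described above can be sketched directly; the population, fitness values, and function name are illustrative.

```python
import random

# Roulette-wheel selection: each chromosome gets a slice of the wheel
# proportional to its fitness, so fitter individuals are drawn more often.

def roulette_select(population, fitnesses, rng=random):
    total = sum(fitnesses)                 # the generation's total fitness
    spin = rng.uniform(0, total)           # rotate the wheel
    running = 0.0
    for chromosome, fit in zip(population, fitnesses):
        running += fit                     # this chromosome's slice
        if spin <= running:
            return chromosome              # pointer stopped here
    return population[-1]                  # guard against float rounding

random.seed(1)
pop = ["A", "B", "C"]
fits = [0.1, 0.2, 0.7]                     # C owns 70% of the wheel
picks = [roulette_select(pop, fits) for _ in range(1000)]
print(picks.count("C") / 1000)             # roughly 0.7
```

Over many spins the selection frequency of each chromosome converges to its fitness share, which is exactly the Selection ∝ fitness principle.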
Crossover
Now the crossover operator is used to mate the selected chromosomes, resulting in new chromosomes that together make up a new generation. We have used single-point crossover in our model [9]: a random point of the chromosome is selected, and for the two parents the genes after this point are swapped, producing new offspring with different gene compositions.
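On the binary-string chromosomes introduced earlier, single-point crossover is a few lines; the parent strings are illustrative.

```python
import random

# Single-point crossover: pick a random cut point inside the string and
# swap the tails of the two parents to produce two offspring.

def single_point_crossover(p1, p2, rng=random):
    point = rng.randrange(1, len(p1))      # cut strictly inside the string
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

random.seed(0)
c1, c2 = single_point_crossover("00000000", "11111111")
print(c1, c2)                              # e.g. head of one parent, tail of the other
```

Note that no gene is created or destroyed: the two offspring together contain exactly the genes of the two parents, only recombined.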
Mutation
This genetic operator is crucial, as it prevents convergence at a very early stage of reproduction by altering genes of chromosomes to generate a diverse population. We have implemented displacement mutation (DM), which, as the name suggests, operates by displacing genes within a chromosome. This produces diversity and hence reduces the risk of premature convergence.
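Displacement mutation can be sketched as follows; the gene values and random seed are illustrative.

```python
import random

# Displacement mutation (DM): remove a random run of genes from the
# chromosome and reinsert it at another position. Every gene is preserved,
# but the arrangement changes, which is what produces diversity.

def displacement_mutation(genes, rng=random):
    genes = list(genes)
    i = rng.randrange(len(genes))            # start of the displaced run
    j = rng.randrange(i + 1, len(genes) + 1) # end (exclusive) of the run
    segment = genes[i:j]
    rest = genes[:i] + genes[j:]
    k = rng.randrange(len(rest) + 1)         # new insertion point
    return rest[:k] + segment + rest[k:]

random.seed(4)
before = [1, 2, 3, 4, 5, 6]
after = displacement_mutation(before)
print(after)                                 # same genes, new order
```

Because DM only rearranges genes, a mutated chromosome stays a valid permutation of its slots, unlike bit-flip mutation on an encoded string.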
B. Algorithm
Based on the above described implementation of GA in our model, its algorithm can be depicted as in Fig. 1.
C. Constraints
In GA, constraints can be classified into two categories.
Hard constraints must mandatorily be satisfied. In our model the hard constraints are:
A student must not have two lectures simultaneously.
A lecturer must not have two lectures simultaneously.
A room must be allocated to only one lecture at a time.
Soft constraints are constraints that should be satisfied, but it is not guaranteed that they always will be. In our model the soft constraints considered are:
The same lecturer should not have two consecutive lectures.
A fixed slot is assigned for a particular subject.
D. System Architecture
The system architecture of the developed model, shown in Fig. 2, uses GA to solve the timetabling problem and generates an optimal solution with minimal or no errors. It can take various resources (class, subject, and teacher details) as input in a very user-friendly manner and process them into the required output at low cost and in less time.
E. Hardware and Software Requirements
The minimum hardware requirements for implementing the system are Processor of minimum configuration
Pentium IV/Intel I3 core with speed of at least 1.1 GHz, RAM 512 MB (min) and Hard Disk of minimum 20GB.
The output is displayed on standard monitor screen and for data input keyboard and mouse is required.
The software requirements are the prerequisites to be installed on the system for proper functioning. In our model, these are a compiler supporting Java 8 or above, the Struts 2 framework, an Apache Tomcat server, and a MySQL database. The supported operating systems are Windows 7 and above.
The front end is developed with HTML5, CSS, JavaScript, Bootstrap, and AJAX, whereas the backend code is in Java.
F. Modules and Interfaces
Registration and Login: When users visit the homepage for the first time, they need to register by providing basic details, which are stored in the user-detail database shown in the system architecture of Fig. 2. Login is a submodule of this interface, where users enter their credentials.
Input Interface: After login, the user is asked for all required input details, such as the number of slots, batches, and days per week for which the schedule is to be created. The course details and faculty details are then entered, and the submit button is clicked. This directs the user to the browser page where the optimized timetable is displayed.
V. CONCLUSION
We have implemented a GA in AI for smart timetable generation, producing an optimal timetable subject to the defined constraints.
The solution cannot always be 100% optimal, as the degree of optimization depends on the constraints defined; the improvement has been achieved through the appropriate use of genetic operators. The developed system initially works accurately for schools, as they run classes in a very classical model, and gradually extends its scope to colleges and higher secondary classes, which have more constraints and hence a more complex problem, but our model deals with this in a user-friendly way. There is future scope to enhance the developed model to produce timetables and schedules for various purposes such as examinations, bus schedules, and many more.
Figure 3. Genotype output on console determining fitness
REFERENCES
[1] Dipti Srinivasan Tian Hou Seow Jian Xin Xu “Automated timetable generation using multiple context reasoning for
university models”, IEEE conference (2002).
[2] Sadaf N. Jat and Shengxiang Yang, “A memetic algorithm for the university course timetabling problem”, 20th IEEE
International conference on tools with artificial intelligence (2008).
[3] K. Nguyen, D. Nguyen, K. Trieu, and N. Tran, “Automating a real-world university timetabling problem with Tabu
search algorithm”, in 2010 IEEE RIVF International Conference on Computing & Communication Technologies,
Research, Innovation, and Vision for the Future (RIVF), (2010).
[4] A. Azlan and N. M. Hussin, “Implementing graph coloring heuristic in construction phase of curriculum-based course
timetabling problem”, in 2013 IEEE Symposium on Computers & Informatics (ISCI), (2013).
[5] R. E. Febrita and W. F. Mahmudy, “Modified genetic algorithm for high school time-table scheduling with fuzzy time
window”, in 2017 International Conference on Sustainable Information Engineering and Technology (SIET), (2017).
[6] T. Elsaka, “Autonomous generation of conflict-free examination timetable using constraint satisfaction modelling”, in
2017 International Artificial Intelligence and Data Processing Symposium (IDAP), (2017).
[7] K. Y. Junn, J. H. Obit, and R. Alfred, “The study of genetic algorithm approach to solving university course timetabling
problem”, in Lecture Notes in Electrical Engineering, Singapore: Springer Singapore, (2018), pp. 454–463.
[8] E. Psarra and D. Apostolou, “Timetable scheduling using a hybrid particle swarm optimization with local search
approach”, in 2019 10th International Conference on Information, Intelligence, Systems and Applications (IISA),
(2019).
[9] Sourabh Katoch, Sumit Singh Chauhan, and Vijay Kumar, “A review on genetic algorithm: past, present, and future”,
Multimedia Tools and Applications, Springer (2021).
[10] L V Stepanov et al, “Mathematical modeling method based on genetic algorithm and its applications”, Journal of
Physics: Conference Series (2019).
[11] Peter Brucker, “Scheduling and constraint propagation”, November (2002).
[12] David E. Goldberg “Genetic Algorithm in Search, Optimization and Machine Learning”, (1989).
[13] John McCall, “Genetic algorithms for modelling and optimization”, Journal of Computational and Applied Mathematics
184 (2005).
[14] B. Shirazi, H. Fazlollahatabar and D. Shafiei, “A Genetic Approach to Optimize Mathematical Model of Facilities
Relocation Problem in Supply Chain”, Journal of Applied Sciences (2008).
[15] Abdelghany A, Abdelghany K, Azadian F “Airline flight schedule planning under competition”, Comput Oper Res
87:20–39 (2017).
[16] Arkhipov DI, Wu D, Wu T, Regan AC “A parallel genetic algorithm framework for transportation planning and logistics
management”, IEEE Access 8:106506–106515 (2020).
[17] Baker JE, Grefenstette J “Proceedings of the first international conference on genetic algorithms and their applications”,
Taylor and Francis, Hoboken, pp 101–105 (2014).
I. INTRODUCTION
The necessity for novel and effective network architectures is made clear by the rising service heterogeneity and
consumption. To ensure quality of service (QoS) and to achieve Service-Level Agreements (SLAs), modern
networks are configured with complicated static rules. The complexity of administration and configuration
procedures tends to rise in multi-vendor setups. The problems with conventional network-centric architectures
are intractable. Consequently, with SDN's network programmability and automation, network-centric paradigms give way to application-centric paradigms [1]. The SDN controller hosts APIs that grant access to and control over network resources. These APIs, however, leave it to the applications to effectively optimize network speed and security, an effort that machine learning (ML) and artificial intelligence (AI) algorithms can support. There have been numerous systematic reviews of SDN studies by academics and business professionals: the authors of [2] describe different SDN load balancing methods, some of which use AI algorithms; Ray et al. examine IoT devices using SDN designs in [3]; and [4] discusses the difficulty of applying AI/ML in SDNs.
II. METHODOLOGY
The objective of this study is to compile a collection of publications and analyze them in order to address
different research issues. The following are the research questions:
RQ1. What kind of AI/ML mechanisms are applied to SDN?
RQ2. Can Performance of SDNs be improved by AI/ML?
RQ3. What are the main limitations of using AI/ML in SDNs with respect to Quality of Service (QoS)?
We gathered a number of publications for analysis. An initial batch of papers is assessed using inclusion/exclusion criteria at the beginning of the procedure. In a subsequent iteration, the references and citing articles of the selected papers are acquired and examined.
Our search was conducted using the terms "Software Defined Networks," "Artificial Intelligence," and "Machine Learning," as well as their respective acronyms: ("SDN" OR "Software Defined Network") AND ("Artificial Intelligence" OR "AI" OR "Machine Learning" OR "ML")
A. Start Set and Criteria
We submitted the search query to the IEEE Xplore, Google Scholar, and CORE search engines in order to build the start set. The first five papers from each engine were then chosen.
The following acceptance standards were established before the articles were examined:
- Publication date between 2011 and 2022 (OpenFlow's release date);
- Published in first- or second-quartile peer-reviewed articles (Scimago ranking);
- Written in English;
- Focus on the current themes, such as SDN and AI/ML applications;
- Articles to which the authors have access.
With the help of the acceptance criteria, we selected a collection of articles from reputable sources that directly examine AI/ML in SDNs. Only six of the fifteen articles in the prospective start set ultimately met the requirements: [7]-[12]. A total of 344 articles were omitted based on the year of publication, 112 were excluded because they were not published in conferences or journals with high enough rankings, 14 were unavailable, 296 were duplicates, 4 were not articles, and 582 did not directly address the issues at hand. 98 articles were chosen through this process, which was conducted between August and September 2022.
III. DISCUSSION
The primary objectives of this part are to present our findings, discuss them, and provide an overview of existing
and emerging trends.
A. Application of AI/ML algorithms
The application of AI/ML algorithms in the publications is displayed in Table 1. The ranking shows that neural network (NN) methods are the most prevalent, followed by other supervised learning algorithms. Other mechanisms, such as self-organizing maps, have been discussed in a number of studies but have not received as much attention.
The popularity of neural networks and deep learning, as well as RF and DT algorithms, can explain why supervised learning techniques are preferred over the others. Unsupervised learning discovers patterns from unlabeled data, whereas supervised learning uses labelled data to adjust model parameters.
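The labelled/unlabelled distinction can be sketched on the same tiny synthetic dataset. Both models are deliberately minimal and hypothetical (a threshold classifier fitted from the labels versus a 2-means clustering that never sees them); neither is a method from the surveyed papers.

```python
import numpy as np

x = np.array([0.1, 0.2, 0.3, 5.1, 5.2, 5.3])   # two obvious groups
y = np.array([0, 0, 0, 1, 1, 1])               # labels (supervised only)

# Supervised: the labels are used to adjust the model parameter
# (a decision threshold midway between the two labelled class means).
threshold = (x[y == 0].mean() + x[y == 1].mean()) / 2
supervised_pred = (x > threshold).astype(int)

# Unsupervised: 2-means must discover the groups from x alone; y is unused.
centers = np.array([x.min(), x.max()])
for _ in range(10):
    assign = np.abs(x[:, None] - centers[None, :]).argmin(axis=1)
    centers = np.array([x[assign == k].mean() for k in (0, 1)])
unsupervised_pred = assign

print(supervised_pred.tolist(), unsupervised_pred.tolist())
```

Here both paradigms recover the same grouping, but only the supervised model can attach the class *names* from the labels, which is why tasks like attack classification in SDNs typically need labelled training data.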
[17]-[19] Neural Network (NN): collaborative intrusion prevention; predicting the performance of SDN; load balancing. Due to its simple and parallel computational capabilities, it achieved a low overhead; achieved a low mean squared error (MSE); improved efficiency and a 19.3% reduction in network latency.
[20], [21] Reinforcement Learning (RL): cognitive network management. Manages networks efficiently.
[22] Deep RL: adaptive multimedia traffic control mechanism. Dynamic coordination of computational, networking, and caching resources.
[23], [24] Deep Q-Learning: Q value-action function approximation. Promotes resilience and scalability.
[25]-[27] SVM: predict link failure; detect DDoS attacks. Reduces the start-up time for identification and classification recognition; lowers the rate of false alarms.
[28] Laplacian SVM: traffic classification on real Internet data. Similar applicability to supervised learning; only tested in a lab environment; processes synthetic data.
In this context, supervised learning appears more readily applicable to enhancing network decision-making in areas like routing and QoS. Supervised learning algorithms outnumber RL and unsupervised learning algorithms, with a peak in 2017 and an erratic drop in the years that followed. Interestingly, RL has been rising gradually. This suggests a possible application for SDNs in IoT, 5G access networks, and automotive networking (dynamic situations where RL is better suited than supervised and unsupervised learning methods). Based on these findings, it is predicted that RL will be used to train networks and SDN processors to adapt to variations in resource demand and traffic, especially with the introduction of SDNs in increasingly complicated networks.
B. Artificial Intelligence in SDN
Numerous issues, such as resource allocation and admission control [29], have been successfully solved using AI and ML methodologies. However, in the SDN era, AI's role expanded greatly due to the significant efforts made by industry. Many researchers have noted a significant trend in the scientific community's application of AI methods in SDNs.
C. ML Methods In SDN-Concept Networks
Supervised, unsupervised, semi-supervised, and reinforcement learning are the four kinds of ML approaches. Supervised learning algorithms create a mathematical model from a labelled training sample. Unsupervised learning algorithms derive information from unlabelled data. Additionally, when some of the sample input lacks labels, semi-supervised learning approaches are used to build mathematical models from sparse training data. Numerous classification and prediction issues have been successfully solved by the application of ML methods [30]. In this section, we continue from an algorithmic perspective by presenting the traditional ML techniques used in SDN, also listed in Table 1 for greater understanding.
1) Supervised Learning in SDN-Concept Networks
Nowadays, numerous applications, including spam detection and object and speech recognition, commonly use supervised learning [31]. The objective is to predict the value of an output from the values of a vector of input variables. In the context of regression approaches, the SDN architecture uses regression for prediction [32]. The key performance indicator (KPI) of the application and the network metrics are additionally related using multiple linear regression [33]. On the whole, regression algorithms are currently uncommon in SDN, so we focus on introducing the classification techniques. Logistic regression, SVM, decision trees, KNN, and naive Bayes are some of the most frequently used classification methods.
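As a toy illustration of the regression idea (hypothetical data and feature names, not the cited setup of [33]), a multiple linear regression can relate network metrics to an application KPI:

```python
# Illustrative sketch only: invented numbers relating two network
# metrics (mean delay, packet loss) to an application KPI via
# multiple linear regression.
import numpy as np
from sklearn.linear_model import LinearRegression

# columns: [mean delay (ms), packet loss (%)]; rows: observed samples
X = np.array([[10, 0.1], [20, 0.5], [40, 1.0], [80, 2.0]])
kpi = np.array([4.8, 4.2, 3.1, 1.5])  # e.g. a MOS-like quality score

model = LinearRegression().fit(X, kpi)
print(model.predict([[30, 0.8]]))  # predicted KPI for unseen conditions
```

Once fitted, such a model lets a controller estimate the application-level impact of observed network conditions without measuring the KPI directly.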
a) K-Nearest Neighbours (KNN) in SDN-Concept Networks
KNN classifies a sample by calculating the distances between feature values, so the classification outcome depends on a relatively limited set of nearby samples. KNN is appropriate for multiclass classification and has been extensively utilized as a classifier in many different fields.
Predis, a computationally straightforward and effective method that uses KNN as its detection technique, was proposed by Zhu et al. [13]. Because of its efficient design, it can accurately identify a variety of attack forms. KNN, one of the most straightforward ML algorithms, is simple to use, estimates features accurately, and works well for multiclass classification; however, the algorithm takes a long time when the training dataset is huge.
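A minimal sketch of the distance-voting idea (synthetic flow features, not the Predis system itself): KNN separating benign flows from attack flows.

```python
# Hypothetical sketch: KNN classifying flows as benign or attack
# by majority vote among the nearest training samples.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
# invented features: [packets per second, mean packet size (bytes)]
benign = rng.normal([100, 500], 20, size=(50, 2))
attack = rng.normal([900, 60], 20, size=(50, 2))
X = np.vstack([benign, attack])
y = np.array([0] * 50 + [1] * 50)  # 0 = benign, 1 = attack

knn = KNeighborsClassifier(n_neighbors=5)  # vote among 5 nearest flows
knn.fit(X, y)
print(knn.predict([[120, 480], [880, 70]]))  # -> [0 1]
```

Note that prediction cost grows with the training set, which matches the scaling limitation discussed above.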
b) Support Vector Machine (SVM) in SDN-Concept Networks
SVM is a generalized linear classifier that uses supervised learning to carry out binary classification. Because both structural and empirical risk minimization are taken into consideration in the optimization problem, SVM is stable. It should be noted that SVM only works for binary classification tasks; multi-class tasks are therefore broken down into a number of binary sub-problems. In [26], [27], SVM embedded in the controller is used to identify DDoS attacks. It can distinguish malicious flow entries created by DDoS attack traffic from benign flow entries created by normal traffic. For binary classification problems, SVM has a lower rate of false alarms, and the detection strategy effectively cuts down the time needed to start classification recognition and attack detection. Since SVM runs at the SDN controller, its complexity does not significantly affect the effectiveness of the SDN system.
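The binary-separation idea can be sketched as follows (invented features, not the controller implementation of [26], [27]): a linear SVM distinguishing normal from DDoS-like flow entries.

```python
# Hedged sketch: a max-margin linear SVM as a binary classifier
# over two invented per-flow features.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# invented features: [mean flow duration (s), packets per flow]
normal = rng.normal([5.0, 50.0], 1.0, size=(40, 2))
ddos = rng.normal([0.5, 900.0], 1.0, size=(40, 2))
X = np.vstack([normal, ddos])
y = np.array([0] * 40 + [1] * 40)  # 0 = benign, 1 = DDoS

svm = SVC(kernel="linear")  # max-margin separating hyperplane
svm.fit(X, y)
print(svm.predict([[4.8, 55.0], [0.6, 890.0]]))  # -> [0 1]
```

A multi-class task would wrap several such binary classifiers (one-vs-one or one-vs-rest), as the text describes.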
c) Neural Networks (NN) in SDN-Concept Networks
According to the testing findings, CIPA [17] is more effective than [35] at identifying DDoS flooding attacks. CIPA also succeeds in finding outbreaks of the Witty, Slammer, and Conficker worms. Due to its parallel and straightforward processing capabilities, the system achieved little computational and communication overhead.
A multi-label classification method was suggested by He et al. [36] to estimate global network allocations; the neural network approach outperformed decision trees and logistic regression and reduced algorithm runtime by up to two-thirds. To estimate traffic demands off-line for a mobile network operator, Alvizu et al. [34] employed a neural network technique, which reduced the optimality gap to between 0.2% and 0.45%. Additionally, the next configuration time point was predicted off-line using an NN technique.
An intrusion detection system for SDN built on an NN technique was proposed by Abubakar et al. [37]; it made use of the NSL-KDD dataset to obtain a high reliability of 97.3%.
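The NN-based detection idea can be sketched with synthetic stand-in data (not the NSL-KDD set used in [37]): a small feed-forward network labelling connections as normal or intrusion.

```python
# Toy sketch: a small multilayer perceptron as a binary intrusion
# detector over two invented, pre-scaled connection features.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)
normal = rng.normal([0.2, 0.1], 0.05, size=(100, 2))
intrusion = rng.normal([0.9, 0.8], 0.05, size=(100, 2))
X = np.vstack([normal, intrusion])
y = np.array([0] * 100 + [1] * 100)  # 0 = normal, 1 = intrusion

nn = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
nn.fit(X, y)
print(nn.score(X, y))  # training accuracy on this separable toy set
```

In practice, features from a set like NSL-KDD would be encoded and scaled before training, and accuracy would be reported on a held-out split rather than on the training data.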
d) Decision Tree (DT) in SDN-Concept Networks
DT is a prediction model that illustrates the relationship between object values and characteristics. It is a tree data structure where each leaf node signifies a category, each branch denotes a potential parameter value, and each internal node represents an attribute. DT is frequently used in data mining to examine data for prediction; its primary use in networks is packet classification. Well-known methods include HyperCuts [38], HiCuts [39], CutSplit [40], and EffiCuts [41]; PartitionSort [42], which incorporates the advantages of DTs and Tuple Space Search (TSS), was proposed in light of the significantly increased dimensionality and dynamism in SDN. A least-cost disruptive (LCD) decision tree was developed to resolve trade-offs between good service delivery, adaptation costs, and user disruption levels [43]. DTs were employed in [44] as a technique for solving the Flow Table Congestion Problem (FTCP). The main advantage of DT over KNN and SVM is that it can be easily implemented and that data preparation is either trivial or unnecessary. However, when there are too many categories, errors can increase more frequently.
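A toy sketch of the DT idea applied to packet classification (invented header fields and classes; schemes such as CutSplit are far more involved):

```python
# Hedged sketch: a decision tree routing packets to a traffic class
# from two invented header fields.
from sklearn.tree import DecisionTreeClassifier

# features: [destination port, IP protocol number (6=TCP, 17=UDP)]
X = [[80, 6], [443, 6], [53, 17], [22, 6], [123, 17]]
y = ["web", "web", "dns", "ssh", "ntp"]

tree = DecisionTreeClassifier(random_state=0).fit(X, y)
print(tree.predict([[443, 6], [53, 17]]))  # -> ['web' 'dns']
```

Each internal node tests one header field against a threshold, which is exactly the cutting structure the cited packet-classification schemes optimize.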
e) Ensemble Learning in SDN-Concept Networks
The objective of supervised learning is to develop a stable model that excels in all situations, although this is rarely achievable in practice. In ensemble learning, multiple weak supervised models are combined to create a stronger, more complete supervision model: a group of individual learners is first formed and then merged by a particular approach. Bagging and boosting are two of the main ensemble learning algorithms, although these methods are used less frequently than traditional ones. RF (Random Forest) realizes bagging integration, using DT as the base learner, and is applied in numerous settings. The RF-based indoor localization model in [45], trained with cross-validation, achieves a high accuracy of 98.3% and outperforms SVM, NN, and KNN. To effectively describe the latency distribution of a single VNF, Lei et al. [14] suggest a random-forest regression prediction approach. Ensemble learning is more accurate than the conventional methods mentioned above, but it comes with a high level of complexity.
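The bagging idea behind RF can be sketched on invented RSSI-style features (not the indoor-localization dataset of [45]):

```python
# Hedged sketch: Random Forest bags many decision trees, each trained
# on a bootstrap sample, and classifies by majority vote.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(3)
# invented features: received signal strength (dBm) from two APs
room_a = rng.normal([-40, -70], 2, size=(60, 2))
room_b = rng.normal([-75, -45], 2, size=(60, 2))
X = np.vstack([room_a, room_b])
y = np.array(["A"] * 60 + ["B"] * 60)

rf = RandomForestClassifier(n_estimators=50, random_state=0)  # 50 bagged DTs
rf.fit(X, y)
print(rf.predict([[-42, -68], [-74, -47]]))  # -> ['A' 'B']
```

The vote over many de-correlated trees is what trades extra complexity for the accuracy gain the text describes.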
2) Unsupervised Learning in SDN-Concept Networks
The training samples' labels are unknown in unsupervised learning techniques. By analyzing unlabelled training samples, which adds another layer of support for data analysis, the objective is to discover the fundamental characteristics and laws governing the data. The most popular technique is clustering, and K-means is the most basic and well-known algorithm [48]. In an SDN-based WAN design, a controller placement problem is solved using a hierarchical K-means algorithm [49]. There are also works that contrast or combine supervised and unsupervised learning; the goal of the comparison is to understand each algorithm's benefits and drawbacks. Different supervised and unsupervised learning methods, including Naive Bayes, KNN, K-medoids, and K-means, are used by Barki et al. [47] to categorize traffic as abnormal or normal. Compared to Naive Bayes and KNN, K-means and K-medoids are faster but less accurate. For traffic categorization, two ML techniques, unsupervised K-means clustering and supervised SVM, are investigated in [50].
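The clustering idea can be sketched on invented flow features; K-means finds the groups without any labels, and since cluster ids are arbitrary we only check that the groups come apart:

```python
# Hedged sketch: K-means clustering flows into two unlabelled groups.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(4)
# invented features: [bytes transferred, duration (s)]
bulk = rng.normal([1e6, 60], [1e4, 2], size=(50, 2))
interactive = rng.normal([2e3, 0.5], [100, 0.1], size=(50, 2))
X = np.vstack([bulk, interactive])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = km.predict([[9.9e5, 58], [2.1e3, 0.4]])
print(labels[0] != labels[1])  # the two flow types land in different clusters
```

An operator would then inspect each cluster (or attach a handful of labels) to decide which group is "normal" and which is "abnormal", as in the comparison studies above.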
3) Semi-Supervised Learning in SDN-Concept Networks
Generally, supervised and unsupervised learning are the two main subcategories of machine learning technology. While unsupervised learning utilizes only unlabelled sample sets, supervised learning employs only labelled sample sets. However, because labelling data is so expensive, real problems often have far more unlabelled data available than labelled data. As a result, semi-supervised learning methods that can be applied to both labelled and unlabelled samples developed quickly. This learning strategy combines unsupervised and supervised learning: it primarily focuses on how to build classification models by employing a large amount of unlabelled samples and a small amount of labelled data. Semi-supervised learning serves the same applications as supervised learning [36]. Semi-supervised learning has historically only been evaluated in the lab on synthetic data, whereas [51] has done studies to achieve accurate traffic classification of real Internet data. The QoS parameters may be used to reroute effectively and accomplish resource objectives, and the QoS classifier also uses semi-supervised machine learning to handle traffic from unidentified applications. Relatively speaking, its practical importance has not been adequately demonstrated, and more study is needed to determine the practical benefits of semi-supervised learning.
4) Reinforcement Learning (RL) in SDN-Concept Networks
Through trial and error and rewards from interactions, an agent learns reward-guided behaviour; this is referred to as RL. In SDN-concept networks [52], [53], RL delivers path selection or route optimization and is often used to support reliability and scalability [20]. When delay reduction and throughput maximization are employed as the primary operational and maintenance objectives in DROM [54], the resulting network performance, routing services, and convergence are all significantly improved. SDCoR [55] is the first study to sense and learn from the Internet of Vehicles (IoV) environment in order to give an optimal routing policy adaptively, and it outperforms numerous common IoV protocols. In [52], the number of distinct paths used for contiguous data frames is reduced in order to address the primary challenge of high jitter. Some innovative RL research combines RL with other technologies for improved performance; for instance, to discover the best overlay paths with the least amount of monitoring overhead, random NNs are combined with RL [21]. For making auto-scaling policy decisions, SRSA [56], an RL-based auto-scaling decision mechanism, was investigated. Furthermore, due to the complex and dynamic network environment, RL with architectural changes is considered.
To effectively manage networks with SON (Self-Organizing Networks) capabilities, Daher et al. [57] presented a scalable strategy based on distributed RL. By combining DL with RL, Deep Reinforcement Learning (DRL) accelerates learning and enhances the effectiveness of RL algorithms. DRL has produced outstanding outcomes in both theory and practice; in particular, the Google DeepMind team's DRL-based AlphaGo program is regarded as a significant development in the field of artificial intelligence. Our findings likewise indicate some development of DRL in SDN-concept networks. A DRL-based adaptive multimedia traffic control mechanism was researched by Huang et al. [22]; without using a mathematical model, it can directly govern multimedia flows. Deep Q-Learning (DQL) is specifically employed for the majority of DRL-related activities [58], and in various network circumstances, different DQL approaches can be employed to solve various challenges. He et al. [59] suggested an integrated DQL methodology with SDN that uses a deep Q network to approximate the Q value-action function. Overall, RL is a significant ML technique that is frequently applied to network-related problems. Keep in mind that it describes interaction processes rather than offering a distinct training strategy; RL can be built on top of any learning algorithm [60] and is frequently used for analysis and prediction. Figure 1 gives a brief description of machine learning methods [61].
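The trial-and-error loop behind these routing agents can be sketched with tabular Q-learning on an invented 4-node topology (real systems such as DROM [54] use much richer state and deep function approximation):

```python
# Minimal tabular Q-learning sketch for next-hop selection.
import numpy as np

# neighbour list with per-hop delay as cost; node 3 is the destination
links = {0: [(1, 2.0), (2, 8.0)], 1: [(3, 2.0)], 2: [(3, 1.0)]}
Q = np.zeros((4, 4))                 # Q[state, next_hop]
alpha, gamma, eps = 0.5, 0.9, 0.2
rng = np.random.default_rng(6)

for _ in range(500):                 # episodes of trial and error
    s = 0
    while s != 3:
        nbrs = links[s]
        if rng.random() < eps:       # explore a random next hop
            a, cost = nbrs[rng.integers(len(nbrs))]
        else:                        # exploit the best known next hop
            a, cost = max(nbrs, key=lambda nc: Q[s, nc[0]])
        # reward is negative delay; bootstrap only over real neighbours
        best_next = max(Q[a, n] for n, _ in links[a]) if a != 3 else 0.0
        Q[s, a] += alpha * (-cost + gamma * best_next - Q[s, a])
        s = a

# the agent learns that 0 -> 1 -> 3 (delay 4) beats 0 -> 2 -> 3 (delay 9)
print(Q[0, 1] > Q[0, 2])
```

DQL replaces the Q table with a deep network that approximates the Q value-action function, as in the work of He et al. [59].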
automate networks. These results imply that SDNs can successfully implement AI/ML algorithms, potentially accommodating both present and future specifications.
IV. CONCLUSION
To investigate the application of AI/ML approaches in SDNs, we examined 98 publications (out of a total of 1450). The findings imply that supervised learning algorithms are considerably more prevalent than those for unsupervised learning and reinforcement learning. According to the majority of studies, NNs are the most effective way to improve intelligence and optimize SDNs. However, in environments where numerous diverse devices compete for network resources (e.g., 5G networks), RL has seen a minor rise in adoption and may begin to resolve a growing number of SDN problems. Network management, automation, performance, and QoS are all enhanced through supervised and reinforcement learning.
ML is one of the most effective AI technologies for managing and operating autonomous networks owing to its capacity to extract information from data, fuelled by the availability of data and the theoretical advancement of ML frameworks. Although there have been some analyses of the problems and difficulties for ML in different SDN-based networks, there has been little evidence that these applications provide workable management solutions for autonomous networks. We have concentrated on SDN network applications using ML techniques, discussed the future directions for this field of study, and noted the primary problems with ML approaches. Although ML has made considerable progress, effective ML remains challenging due to difficult patterns and a lack of training data, so many ML programs frequently perform below expectations. We anticipate that our discussions will serve as a straightforward manual for the advancement of SDN and the creation of a more intelligent network, and that researchers with various goals can use this study to better understand the fundamental problems in the subject. Future network design and management will depend heavily on SDN-concept networks using ML techniques in all areas, including resource management, intelligent routing management, network security, and flow control. Future research will examine in depth the main issues mentioned in this study. The use of AI/ML in SDN applications increases the potential and worth of these architectures in research and business, and future trends are expected to place an even greater emphasis on AI/ML techniques, as they offer substantial performance improvements.
REFERENCES
[1] Cisco: The Art of Application-Centric Networking. Tech. rep.,
https://www.cisco.com/c/dam/en/us/solutions/collateral/borderlessnetworks/officeextend-solution/cisco td 030513
fin.pdf
[2] M. R. Belgaum, S. Musa, M.M.Alam, M. M.Su’Ud, A Systematic Review of Load Balancing Techniques in Software-
Defined Networking. IEEE Access 8 (2020)98612– 98636. https://doi.org/10.1109/ACCESS.2020.2995849
[3] P. P. Ray, N. Kumar, SDN/NFV architectures for edge-cloud oriented IoT: A systematic review. Computer
Communications 169 (2021) 129–153. https://doi.org/10.1016/J.COMCOM.2021.01.018
[4] N. Sultana, N. Chilamkurti, W. Peng, R. Alhadad, Survey on SDN based network intrusion detection system using
machine learning approaches. Peer-to-Peer Networking and Applications 12(2) (2018) 493–501.
https://doi.org/10.1007/S12083-017-0630-0
[5] K. Benzekki, A. El Fergougui, and A. E. Elalaoui,Software-defined networking (SDN): A survey,Secur. Commun.
Netw., 9(18) (2016) 5803-5833.
[6] S. Sezer, et al., “Are we ready for SDN? Implementation challenges for software-defined networks,” IEEE Commun.
Mag., 51(.7) (2013) 36-43.
[7] J. Xie, F. Richard Yu, T. Huang, R. Xie, J. Liu, C. Wang, Y. Liu, A survey of machine learning techniques applied to
software defined networking (SDN): Research issues and challenges. COMST 2018 21(1)(2019)393–430.
https://doi.org/10.1109/COMST.2018.2866942
[8] Y. Cao, R. Wang, M. Chen, A.Barnawi, AI Agent in Software-Defined Network: Agent-Based Network Service
Prediction and Wireless Resource Scheduling Optimization. IEEE Internet of Things Journal 7(7) (2020) 5816–
5826.https://doi.org/10.1109/JIOT.2019.2950730
[9] Y. Zhao, Y. Li, X. Zhang, G. Geng, W. Zhang, Y. Sun, "A survey of networking applications applying the software
defined networking concept based on machine learning." IEEE Access 7 (2019) 95397-95417.
[10] M. Latah, L. Toker, Artificial Intelligence Enabled Software Defined Networking: A Comprehensive Overview. IET
Networks 8(2) (2018) 79–99. https://doi.org/10.1049/iet-net.2018.5082
[11] S. Nanda, F. Zafari, C.Decusatis, E.Wedaa, B. Yang, Predicting network attack patterns in SDN using machine learning
approach. NFV-SDN, (2017)67–172. https://doi.org/10.1109/NFV-SDN.2016.7919493
[12] J. C. Ferreira, D. Teixeira &J. Macedo, Systematic literature review of AI/ML in software-defined networks using the
snowballing approach (2021).
[13] L. Zhu, X. Tang, M. Shen, X. Du, and M. Guizani, Privacy-preserving DDoS attack detection using cross-domain traffic
in Software Defined Networks, IEEE J. Sel. Area Comm., 36(3) (2018) 628-643.
[14] T.H. Lei, Y.T. Hsu, I. C. Wang, and C. H.P. Wen, Deploying QoS-assured service function chains with stochastic
prediction models on VNF latency, in Proc IEEE NFV-SDN, 2017, pp.1-6.
[15] W. Li, X. Li, H. Li, and G. Xie, “CutSplit: A decision-tree combining cutting and splitting for scalable packet
classification,” in Proc. IEEE INFOCOM, 2018, pp.2645-2653.
[16] Y. Sorrachai, D. James, A. X. Liu, and E. Torng, “A sorted-partitioning approach to fast and scalable dynamic packet
classification,” IEEE ACM T. Network, vol.26, no.4, pp. 1907-1920, 2018.
[17] Chen, X.F., Yu, S.Z.: ‘CIPA: A collaborative intrusion prevention architecture for programmable network and SDN’,
Computers and Security, 2016,58, pp. 1-19.
[18] Sabbeh, A., Al-Dunainawi, Y., Al-Raweshidy, H.S., Abbod, M.F.: ‘Performance prediction of software defined network
using an artificial neural network’. In Proc. of SAI Computing Conference (SAI), London, UK, July 2016, pp. 80-84.
[19] Chen-Xiao, C., Ya-Bin, X.: ‘Research on load balance method in SDN’, International Journal of Grid and Distributed
Computing, 2016,9,(1), pp. 25-36.
[20] L. S. R. Sampaio, et al., “Using NFV and reinforcement learning for anomalies detection and mitigation in SDN,” in
Proc IEEE ISCC, 2018, pp.432-437.
[21] F. Francois and E. Gelenbe, “Optimizing secure SDN-enabled interdata centre overlay networks through cognitive
routing,” in Proc IEEE MASCOTS, 2016, pp.283-288.
[22] X. Huang, T. Yuan, G. Qiao, and Y. Ren, “Deep reinforcement learning for multimedia traffic control in Software
Defined Networking,” IEEE Network, vol.32, no.6, pp. 35-41, 2018.
[23] N. C. Luong, et al., “Applications of deep reinforcement learning in communications and networking: A survey,” arXiv
preprint arXiv:1810.07862, 2018.
[24] Y. He, Z. Zhang, and Y. Zhang, “A big data deep reinforcement learning approach to next generation green wireless
networks,” in Proc IEEE GLOBECOM, 2017, pp. 1-6.
[25] K. Bao, J. D. Matyjas, F. Hu, and S. Kumar, “Intelligent Software-Defined Mesh Networks with link-failure adaptive
traffic balancing,” IEEE TCCN, vol.4, no.2, pp.266-276, 2018.
[26] Y. Yu, L. Guo, Y. Liu, J. Zheng, and Y. Zong, “An efficient SDNbased DDoS attack detection and rapid response
platform in vehicular networks,” IEEE Access, vol. 6, pp. 44570-44579, 2018.
[27] D. Hu, P. Hong, and Y. Chen, “FADM: DDOS flooding attack detection and mitigation system in software-defined
networking,” in Proc. IEEE GLOBECOM, 2017, pp.1-7
[28] P. Wang, S. C. Lin, and M. Luo, “A framework for QoS-aware traffic classification using semi-supervised machine
learning in SDNs,” in Proc IEEE SCC, 2016, pp.760-765.
[29] Testolin, A., Zanforlin, M., De Grazia, M.D.F., et al.: ‘A machine learning approach to QoE-based video admission
control and resource allocation in wireless systems. In: Proc. of 13th Annual Mediterranean Ad Hoc Networking
Workshop (MED-HOCNET), Piran, Slovenia, June 2014, pp. 31-38.
[30] S. Nanda, F. Zafari, C. DeCusatis, E. Wedaa, and B. Yang, “Predicting network attack patterns in SDN using machine
learning approach,” in Proc. IEEE NFV-SDN, 2016, pp.167-172.
[31] C. Song, et al., “Machine-learning based threat-aware system in software defined networks,” in Proc. IEEE ICCCN,
2017, pp.1-9.
[32] D. Comaneci, and C. Dobre, “Securing networks using sdn and machine learning,” in Proc. IEEE CSE,2018, pp.194–
200.
[33] H.Z. Jahromi, A.Hines, and D.T. Delanev, “Towards applicationaware networking: ML-based end-to-end application
KPI/QoE metrics characterization in SDN,” in Proc. ICUFN, 2018, pp.126–131.
[34] Alvizu, R., Troia, S., Maier, G., Pattavina, A.: ‘Matheuristic with machinelearning-based prediction for software-defined
mobile metro-core networks’, IEEE/OSA Journal of Optical Communications and Networking, 2017,9,(9), pp. D19-
D30
[35] Gamer, T.: ‘Collaborative anomaly-based detection of large-scale internet attacks’, Computer Networks, 2012,56,(1),
pp. 169-185.
[36] He, M., Kalmbach, P., Blenk, A., Kellerer, W., Schmid, S.: ‘Algorithm-Data Driven Optimization of Adaptive
Communication Networks’. In Proc. of IEEE 25th International Conference on Network Protocols (ICNP), Toronto,
Canada, Oct. 2017, pp. 1-6
[37] Abubakar, A., Pranggono, B.: ‘Machine learning based intrusion detection system for software defined networks’. In
Proc. of Seventh International Conference on Emerging Security Technologies (EST), Canterbury, UK, Sept. 2017, pp.
138-143.
[38] P. Xiao, W. Qu, H. Qi, Y. Xu, and Z. Li, “An efficient elephant flow detection with cost-sensitive in SDN,” in Proc
IEEE INISCom, 2015, pp.24–28.
[39] M. Latah, and L. Toker, “Towards an efficient anomaly-based intrusion detection for software-defined networks,” IET
Networks, vol.7, no.6, pp.453-459, 2018.
[40] W. Li, X. Li, H. Li, and G. Xie, “CutSplit: A decision-tree combining cutting and splitting for scalable packet
classification,” in Proc. IEEE INFOCOM, 2018, pp.2645-2653.
[41] D. Côté, “Using machine learning in communication networks,” J. Opt. Commun. Netw., vol.10, no.10, pp. D100-D109,
2018.
[42] Y. Sorrachai, D. James, A. X. Liu, and E. Torng, “A sorted-partitioning approach to fast and scalable dynamic packet
classification,” IEEE ACM T. Network, vol.26, no.4, pp. 1907-1920, 2018.
[43] D. Chemodanov, P.Calyam, S.Valluripally, H. Trinh, J. Patman & K. Palaniappan. (2018). On qoe-oriented cloud
service orchestration for application providers. IEEE Transactions on Services Computing, 14(4), 1194-1208.
[44] B. Leng, L. Huang, C. Qiao, and H. Xu, “A decision-tree-based online flow table compressing method in Software
Defined Networks,” in Proc IEEE/ACM IWQoS, 2016.pp.1-2.
[45] R. Gomes, M. Ahsan, and A. Denton, “Random forest classifier in SDN framework for user-based indoor localization,”
in Proc IEEE EIT, 2018, pp.537-542.
[46] Q. Cheng, et al., “Guarding the perimeter of cloud-based enterprise networks: An intelligent sdn firewall,” in Proc IEEE
HPCC,2018, pp.897-902.
[47] L. Barki, A. Shidling, N. Meti, D. G. Narayan, and M. M. Mulla, “Detection of distributed denial of service attacks in
software defined networks,” in Proc IEEE ICACCI, 2016, pp.2576-2581.
[48] P. C. Lin, P. C. Li , and V. L. Nguyen, “Inferring OpenFlow rules by active probing in software-defined networks,” in
Proc IEEE ICACT, 2017, pp.415-420.
[49] H. Kuang, Y. Qiu, R. Li, and X. Liu, “A hierarchical K-means algorithm for controller placement in SDN-Based WAN
architecture,” in Proc ICMTMA, 2018, pp.263-267.
[50] Z. Fan, and R. Liu, “Investigation of machine learning based network traffic classification,” in Proc ISWCS,2017, pp.1-
6.
[51] P. Wang, S. C. Lin, and M. Luo, “A framework for QoS-aware traffic classification using semi-supervised machine
learning in SDNs,” in Proc IEEE SCC, 2016, pp.760-765.
[52] J. Chavula, M. Densmore, and H. Suleman, “Using SDN and reinforcement learning for traffic engineering in
UbuntuNet Alliance,” in Proc ICACCE, 2016, pp.349-355.
[53] S. Sendra, A. Rego, J. Lloret, J. M. Jimenez, and O. Romero, “Including artificial intelligence in a routing protocol using
Software Defined Networks,” in Proc IEEE ICC, 2017, pp.670-674.
[54] C. Yu, J. Lan, Z. Guo, Z. Guo, and Y. Hu, “DROM: Optimizing the routing in Software-Defined Networks with deep
reinforcement learning,” IEEE Access, vol.6, pp.64533-64539, 2018.
[55] C. Wang, L. Zhang, Z. Li, and C. Jiang, “SDCoR: Software Defined cognitive routing for internet of vehicles,” IEEE
Internet Things, vol.5, no.5, pp. 3513 – 3520, 2018.
[56] P. Tang, F. Li, W. Zhou, W. Hu, and L. Yang, “Efficient auto-scaling approach in the telco cloud using self-learning
algorithm,” in Proc IEEE GLOBECOM, 2015, pp.1–6.
[57] T. Daher, S. B. Jemaa, and L. Decreusefond, “Softwarized and distributed learning for SON management systems,” in
Proc IEEE/IFIP NOMS, 2018, pp.1-7.
[58] N. C. Luong, et al., “Applications of deep reinforcement learning in communications and networking: A survey,” arXiv
preprint arXiv:1810.07862, 2018.
[59] Y. He, Z. Zhang, and Y. Zhang, “A big data deep reinforcement learning approach to next generation green wireless
networks,” in Proc IEEE GLOBECOM, 2017, pp. 1-6.
[60] S. C. Lin, I. F. Akyildiz, P. Wang, and M. Luo, “QoS-aware adaptive routing in multi-layer hierarchical software
defined networks: A reinforcement learning approach,” in Proc IEEE SCC, 2016, pp.25-33.
[61] Sultana, N., Chilamkurti, N., Peng, W., Alhadad, R.: Survey on SDN based network intrusion detection system using
machine learning approaches. Peer-to-Peer Networking and Applications 12(2), 493–501 (Jan 2018).
https://doi.org/10.1007/S12083-017-0630-0
[62] Fathy, C., & Saleh, S. N. (2022). Integrating deep learning-based IoT and fog computing with software-defined
networking for detecting weapons in video surveillance systems. Sensors, 22(14), 5075.
[63] Phan, T. V., &Bauschert, T. (2022). Deepair: Deep reinforcement learning for adaptive intrusion response in software-
defined networks. IEEE Transactions on Network and Service Management.
[64] Kim, G., Kim, Y., & Lim, H. (2022). Deep Reinforcement Learning-Based Routing on Software-Defined
Networks. IEEE Access, 10, 18121-18133.
[65] Jiménez-Lázaro, M., Berrocal, J., & Galán-Jiménez, J. (2022, April). Deep Reinforcement Learning Based Method for
the Rule Placement Problem in Software-Defined Networks. In NOMS 2022-2022 IEEE/IFIP Network Operations and
Management Symposium (pp. 1-4). IEEE.
[66] Raikar, M.M., Meena, S.M. (2022). Reinforcement Learning Based Routing in Software Defined Network. In: Rout,
R.R., Ghosh, S.K., Jana, P.K., Tripathy, A.K., Sahoo, J.P., Li, KC. (eds) Advances in Distributed Computing and
Machine Learning. Lecture Notes in Networks and Systems, vol 427. Springer, Singapore. https://doi.org/10.1007/978-
981-19-1018-0_16
[67] Al Jameel, Mohammed, Triantafyllos Kanakis, Scott Turner, Ali Al-Sherbaz, and Wesam S. Bhaya. 2022. "A
Reinforcement Learning-Based Routing for Real-Time Multimedia Traffic Transmission over Software-Defined
Networking" Electronics 11, no. 15: 2441. https://doi.org/10.3390/electronics11152441
[68] Huang, X., Yuan, T., Qiao, G., Ren, Y.: Deep Reinforcement Learning for Multimedia Traffic Control in Software
Defined Networking. IEEE Network 32(6), 35–41 (nov 2018). https://doi.org/10.1109/MNET.2018.1800097
[69] Liu, W.X., Zhang, J., Liang, Z.W., Peng, L.X., Cai, J.: Content Popularity Prediction and Caching for ICN: A Deep
Learning Approach with SDN. IEEE Access 6, 5075–5089 (dec 2017). https://doi.org/10.1109/ACCESS.2017.2781716
[70] Yu, C., Lan, J., Guo, Z., Hu, Y.: DROM: Optimizing the Routing in SoftwareDefined Networks with Deep
Reinforcement Learning. IEEE Access 6, 64533–64539 (2018). https://doi.org/10.1109/ACCESS.2018.2877686
[71] Ganesan, E., Hwang, I., Liem, A. T., & Ab-Rahman, M. S. (2021, June). SDN-enabled FiWi-IoT smart environment
network traffic classification using supervised ML models. In Photonics (Vol. 8, No. 6, p. 201). Multidisciplinary
Digital Publishing Institute.
[72] Alvizu, R., Troia, S., Maier, G., Pattavina, A.: Matheuristic with machinelearning-based prediction for software-defined
mobile metro-core networks. Journal of Optical Communications and Networking 9(9), D19–D30 (sep 2017).
https://doi.org/10.1364/JOCN.9.000D19
[73] Nanda, S., Zafari, F., Decusatis, C., Wedaa, E., Yang, B.: Predicting network attack patterns in SDN using machine
learning approach. NFV-SDN 2016 pp. 167–172 (may 2017). https://doi.org/10.1109/NFV-SDN.2016.7919493
[74] Azzouni, A., Pujolle, G.: NeuTM: A neural network-based framework for traffic matrix prediction in SDN. NOMS 2018
pp. 1–5 (jul 2018). https://doi.org/10.1109/NOMS.2018.8406199
[75] Usama, M., Qadir, J., Raza, A., Arif, H., Yau, K.L.A., Elkhatib, Y., Hussain, A., Al-Fuqaha, A.: Unsupervised Machine
Learning for Networking: Techniques, Applications and Research Challenges. IEEE Access 7, 65579–65615 (2019).
https://doi.org/10.1109/ACCESS.2019.2916648
[76] Wang, P., Lin, S.C., Luo, M.: A framework for QoS-aware traffic classification using semi-supervised machine learning
in SDNs. SCC 2016 pp. 760–765 (aug 2016). https://doi.org/10.1109/SCC.2016.133
[77] Owusu, A.I., Nayak, A.: An Intelligent Traffic Classification in SDNIoT: A Machine Learning Approach. BlackSeaCom
2020 (may 2020). https://doi.org/10.1109/BLACKSEACOM48709.2020.9235019
[78] Yu, C., Lan, J., Xie, J., & Hu, Y., QoS-aware traffic classification architecture using machine learning and deep packet
inspection in SDNs. Procedia computer science, 131(2018)1209-1216.
[79] K. Bhushan, B. Gupta, Distributed denial of service (ddos) attack mitigation in software defined network (sdn)-based
cloud computing environment, Journal of Ambient Intelligence and Humanized Computing (2018) 1–13.
[80] M. Begovic, S. Causevic, B. Memic &A. Haskovic AI-aided traffic differentiated QoS routing and dynamic offloading
in distributed fragmentation optimized SDN-IoT. Int. J. Eng. Res. Technol, 13(8)(2019) 1880-1895.
[81] F. AL-Tam and N. Correia, “Fractional switch migration in multi-controller software-defined networking,” Comput.
Networks, 157(2019)1–10.
[82] N. Saha, S. Bera, and S. Misra, “Sway: Traffic-Aware QoS Routing in Software-Defined IoT,” IEEE Transactions on
Emerging Topics in Computing, IEEE Computer Society, (2018)1–12.
[83] H. Yao, X. Yuan, P. Zhang, J. Wang, J. Chunxiao, and M. Guizani, Machine Learning Aided Load Balance Routing
Scheme Considering Queue Utilization,IEEE Trans. Veh. Technol.,(2019) 1–1.
[84] I. I. Awan, N. Shah, M. Imran, M. Shoaib, and N. Saeed, An improved mechanism for flow rule installation inband
SDN, J. Syst. Archit., 96(2019)1–19.
[85] I. H. Abdulqadder, D. Zou, I. T. Aziz, B. Yuan& W. Dai, Deployment of robust security scheme in SDN based 5G
network over NFV enabled cloud environment. IEEE Transactions on Emerging Topics in Computing, 9(2)(2018) 866-
877.
206
Grenze International Journal of Engineering and Technology, June Issue
Abstract— Deep learning plays an important role today in disease detection and prediction. Every deep
learning model must be trained to process its input, extract features, and return prediction results. Before
classification and prediction, the given input must be preprocessed and segmented, with augmentation applied;
only with preprocessed images can a model make accurate predictions at higher speeds. The proposed work
aims to detect diabetic eye diseases by segmenting the augmented images using U-Net, which is well known for
its encoder-decoder architecture for down- and up-sampling. The retinal blood vessels are among the most
informative structures of the eye: from the appearance of these vessels one can identify whether the eye is
affected by diabetic retinopathy. Segmenting the blood vessels therefore helps to classify the disease at an early
stage, and U-Net is particularly suited to segmenting medical images. This paper discusses preprocessing the
eye images from the dataset, segmenting those images with U-Net to extract the retinal blood vessels, and
classification based on the segmentation.
I. INTRODUCTION
A. Image Processing:
Image processing, as the name suggests, means processing images, and many techniques are required to reach that
goal. It is the core domain of computer vision and plays a key role in many real-world applications, such as
robotics, self-driving cars, and object recognition. Image processing allows us to transform and manipulate
thousands of images simultaneously and derive useful insights from them, and it has applications in almost every
field. The final output may be either an image or an equivalent feature of that image, which can be used for
further analysis and decision making. An image can be represented as a 2D function F(x, y), where x and y are
the spatial coordinates. The amplitude of F at a particular (x, y) is the intensity of the image at that point.
If x, y, and the amplitude values are all finite, the image is called a digital image: an array of pixels
arranged in rows and columns, where each pixel carries intensity and color information. A color image adds a
third dimension for the channels, giving a 3D array; with red, green, and blue channels this is called an RGB
image. Common image types are: i) RGB images, which contain three 2D layers corresponding to the red, green,
and blue channels, and ii) grayscale images, which contain shades of black and white in a single channel.
Figure 1. U-Net Architecture
This paper is organized as follows: Section II contains the literature survey, Section III image preprocessing,
Section IV segmentation, Section V the proposed method, Section VI the segmentation results and discussion, and
Section VII the conclusion.
parameters have been calculated for the CNN after three image-processing stages: resizing the image, applying
histogram equalization, and applying CLAHE. After this image-enhancement stage, classification was performed
with the CNN, and the performance of the method was evaluated on 400 retinal fundus images from the MESSIDOR
database. In Ref. [5], the green channel of the RGB model is shown to exhibit the best contrast between the
vessels and the background, while the red and blue channels tend to be noisier. The grey image taken from the
green channel is processed so that the retinal blood vessels, which appear darker in the grey image, are
inverted to appear brighter than the non-vessel background. Salt-and-pepper noise is added to represent the
presence of noise, and order and median filters are used to remove it. The output of the order filter gives
better contrast between the vessels and the background, removing the noise more accurately than the other
filters.
In Ref. [6], the study aims to detect the optic disc. Among the features of diabetic retinopathy are exudates,
hemorrhages, and microaneurysms, and detecting and removing the optic disc plays a vital role in extracting
these features. The paper focuses on optic disc detection using image-processing techniques and algorithms such
as Canny edge detection and the Circular Hough Transform (CHT). Retinal images from the IDRiD, Diaret_db0,
Diaret_db1, Chasedb, and Messidor datasets were used.
In Ref. [7], the proposed model was trained in three variants: a back-propagation neural network, a Deep Neural
Network (DNN), and a Convolutional Neural Network (CNN). In testing, the CPU-trained neural network gave the
lowest accuracy because of its single hidden layer, whereas the deep learning models outperformed it. The deep
learning models are capable of sorting features such as blood vessels, fluid drip, exudates, hemorrhages, and
microaneurysms into different classes. In Ref. [8], glaucoma is described as a group of conditions in which high
pressure inside the eye damages the optic nerve. Vision lost to glaucoma is irreversible and cannot be regained,
so it is very important to detect and treat the disease as early as possible to preserve vision. The paper
compares the performance of five preprocessing techniques, namely contrast adjustment, adaptive histogram
equalization, median filtering, average filtering, and homomorphic filtering, evaluated using Mean Square Error
(MSE) and Peak Signal-to-Noise Ratio (PSNR).
In Ref. [9], the proposed automated method consists of pre-processing, blood vessel extraction, optic disc
segmentation and macula region segmentation. Initially, pre-processing is performed using shade correction and
top-hat transformation for enhancement of dark anatomical structures such as blood vessels and macula/fovea
region. A novel graph cut method is used to extract blood vessels. Then template based matching and
morphological operations are used for detection and extraction of optic disc. Finally, post processing is used for
detection of macula in retinal images. In Ref. [10], the work proposed was an OD segmentation model from
fundus images based on Retina Net extension with DenseNet that addresses the vanishing gradient problem,
enhances feature propagation, performs deep supervision, strengthens feature reuse and reduces the number of
parameters. The model was developed based on promising results achieved by the Retina Net and the DenseNet
in many object detection problems. Combining both models facilitates the reuse of computation through dense
connections and improves gradient flow.
In Ref. [11], segmenting the optic disc (OD) is an important and essential step in creating a frame of reference
for diagnosing optic nerve head pathologies such as glaucoma. The main contribution of this paper is in
presenting a novel OD segmentation algorithm based on applying a level set method on a localized OD image.
To prevent the blood vessels from interfering with the level set process, an in-painting technique was applied.
The new automatic eye disease diagnosis system has to be robust, fast, and highly accurate, in order to support
high workloads and near-real-time operation.
209
Enhancing contrast is a crucial component of image processing for both human and machine vision. It is commonly
used as a pre-processing step in texture synthesis, medical image processing, and many other image and video
processing applications.
The most common pixel brightness transform operations are:
1. Gamma correction or Power Law Transform
2. Sigmoid stretching
3. Histogram equalization
Two commonly used point processes are multiplication and addition with a constant.
g(x)=αf(x)+β (1)
The parameters α > 0 and β are called the gain and bias parameters; they are sometimes said to control contrast
and brightness, respectively.
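As a concrete illustration, the point operation of Eq. (1) can be applied with NumPy (the gain and bias values here are illustrative, not taken from the paper):

```python
import numpy as np

# Toy 4-pixel grayscale "image" with 8-bit intensities.
f = np.array([10, 50, 100, 200], dtype=np.float64)

alpha, beta = 1.5, 20.0  # gain (contrast) and bias (brightness)

# g(x) = alpha * f(x) + beta, clipped back to the valid 8-bit range.
g = np.clip(alpha * f + beta, 0, 255).astype(np.uint8)

print(g)  # -> [ 35  95 170 255]
```

Note that the last pixel saturates at 255: clipping keeps the result inside the valid intensity range.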
A. Histogram Equalization
It is a well-known contrast enhancement technique due to its performance on almost all types of images.
Histogram equalization provides a sophisticated method for modifying the dynamic range and contrast of an
image by altering that image such that its intensity histogram has the desired shape. Unlike contrast stretching,
histogram modeling operators may employ non-linear and non-monotonic transfer functions to map between
pixel intensity values in the input and output images.
The normalized histogram can be represented as,
P(n) = (number of pixels with intensity n) / (total number of pixels)
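A minimal NumPy sketch of histogram equalization, mapping each intensity through the normalized cumulative histogram (an 8-bit toy example, not the authors' implementation):

```python
import numpy as np

def equalize(img):
    """Map intensities through the normalized cumulative histogram (CDF)."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist) / img.size            # P(intensity <= n)
    lut = np.round(255 * cdf).astype(np.uint8)  # lookup table 0..255
    return lut[img]

# A low-contrast image: all intensities squeezed into [100, 103].
img = np.array([[100, 100, 101, 101],
                [102, 102, 103, 103]], dtype=np.uint8)
out = equalize(img)
print(out.min(), out.max())  # the dynamic range is stretched toward 0..255
```

The four crowded intensity levels are spread far apart after the mapping, which is exactly the contrast-enhancing effect described above.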
B. Image Filtering and Segmentation
The purpose of utilizing filters is to change or improve the qualities of the images and/or to extract important
data from the images, such as edges, corners, and blobs. A kernel, which is a tiny array applied to each pixel and
its neighbors inside a picture, defines a filter. Some of the basic filtering techniques are; i. Low Pass Filtering
(Smoothing), ii. High pass filters (Edge Detection, Sharpening), iii. Directional Filtering, iv. Laplacian Filtering.
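To illustrate kernel-based filtering, here is a naive NumPy sketch of a 3×3 low-pass (mean) filter; the zero padding at the borders is an assumption made for the example:

```python
import numpy as np

def convolve2d(img, kernel):
    """Naive 2-D correlation with zero padding (fine for symmetric kernels)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img.astype(np.float64), ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=np.float64)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

mean_kernel = np.ones((3, 3)) / 9.0       # low-pass (smoothing) kernel
img = np.zeros((5, 5)); img[2, 2] = 9.0   # single bright pixel
smooth = convolve2d(img, mean_kernel)
print(smooth[2, 2])  # -> 1.0: the spike is spread over its 3x3 neighborhood
```

A high-pass (sharpening) kernel would be applied the same way; only the kernel values change.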
Image segmentation, often based on properties of the image's pixels, is a widely used method in digital image
processing and analysis to divide an image into parts or regions. Segmentation can distinguish foreground from
background and group pixels according to their similarity in color or shape. It is mainly used in face
detection, medical imaging, machine vision, and autonomous driving.
C. Fourier Transform
The Fourier transform, as described in Ref. [3], is an important image processing tool used to decompose an
image into its sine and cosine components. The output of the transform represents the image in the Fourier or
frequency domain, while the input image is its spatial-domain equivalent. In a Fourier-domain image, each point
represents a specific frequency
in the spatial-domain image. Fourier transforms are used in a variety of applications such as image analysis,
image filtering, image reconstruction, and image compression.
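A quick NumPy check of these properties: the zero-frequency (DC) term of the 2-D transform equals the sum of all pixel intensities, and the inverse transform recovers the spatial-domain image:

```python
import numpy as np

img = np.arange(16, dtype=np.float64).reshape(4, 4)  # toy spatial-domain image

F = np.fft.fft2(img)                        # frequency-domain representation
assert np.isclose(F[0, 0].real, img.sum())  # DC component = total intensity

recovered = np.fft.ifft2(F).real            # inverse transform back to spatial domain
assert np.allclose(recovered, img)
```

Frequency-domain filtering (e.g. for compression or reconstruction) works by modifying F before applying the inverse transform.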
1. Edge-Based Segmentation: Edge-based segmentation is a popular image processing technique that identifies
the edges of various objects in a given image. It helps locate features of associated objects in the image using the
information from the edges. Edge detection helps strip images of redundant information, reducing their size and
facilitating analysis.
Edge-based segmentation algorithms identify edges based on contrast, texture, color, and saturation variations.
They can accurately represent the borders of objects in an image using edge chains comprising the individual
edges.
2. Thresholding Segmentation: Thresholding is the simplest image segmentation method, dividing pixels based
on their intensity relative to a given value or threshold. It is suitable for segmenting objects with higher intensity
than other objects or backgrounds.
The threshold value T can work as a constant in low-noise images. In some cases, it is possible to use dynamic
thresholds. Thresholding divides a grayscale image into two segments based on their relationship to T, producing
a binary image.
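A minimal NumPy sketch of global thresholding against a constant T (the threshold value is illustrative):

```python
import numpy as np

T = 128  # illustrative global threshold
img = np.array([[ 10, 200],
                [130,  90]], dtype=np.uint8)

# Pixels above T become foreground (255), the rest background (0).
binary = np.where(img > T, 255, 0).astype(np.uint8)
print(binary)  # [[  0 255]
               #  [255   0]]
```

A dynamic threshold would simply replace the constant T with a per-region or per-pixel value computed from the image.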
3. Region-Based Segmentation: Region-based segmentation involves dividing an image into regions with similar
characteristics. Each region is a group of pixels, which the algorithm locates via a seed point. Once the algorithm
finds the seed points, it can grow regions by adding more pixels or shrinking and merging them with other
points.
4. Clustering-Based Segmentation: Clustering algorithms are unsupervised classification algorithms that help
identify hidden information in images. They augment human vision by isolating clusters, shadings, and
structures. The algorithm divides images into clusters of pixels with similar characteristics, separating data
elements and grouping similar elements into clusters.
5. Watershed Segmentation: Watersheds are transformations in a grayscale image. Watershed segmentation
algorithms treat images like topographic maps, with pixel brightness determining elevation (height). This
technique detects lines forming ridges and basins, marking the areas between the watershed lines. It divides
images into multiple regions based on pixel height, grouping pixels with the same gray value. The watershed
technique has several important use cases, including medical image processing. For example, it can help identify
differences between lighter and darker regions in an MRI scan, potentially assisting with diagnosis.
B. CNN for Image Segmentation:
In Ref. [12], Convolutional Neural Networks (CNNs) are applied to the semantic segmentation of remote sensing
images, and the encoder-decoder CNN structures SegNet (with index pooling) and U-Net are improved to make them
suitable for multi-target semantic segmentation of remote sensing images.
A convolutional neural network is a hierarchical model whose input is raw data, such as an RGB image or raw
audio. It extracts high-level semantic information layer by layer from the raw input by stacking operations such
as convolution, pooling, and non-linear activation mappings; this process is called the “feed-forward
operation”. The different types of operations in a convolutional neural network are called “layers”:
convolution operations are convolutional layers and
pooling operations are pooling layers. The last layer of a convolutional neural network expresses its target
task (classification, regression, etc.) as an objective function. The error, or loss, between the predicted
value and the real value is computed and propagated backward layer by layer by the back-propagation algorithm,
updating the parameters of every layer; the forward and backward passes are repeated until the network model
converges.
In Ref. [13], noting that much research has been carried out on retinal blood vessel segmentation for
identifying diabetic retinopathy with various machine learning and deep learning models, a Convolutional Neural
Network (CNN) and CLAHE are applied together to tackle retinal blood vessel segmentation on the DRIVE dataset.
The method comprises pre-processing (grayscale conversion and CLAHE), feature extraction using morphological
features, segmentation, and training and prediction with the CNN. Experimental evaluation shows that the
proposed method achieves 98.06% accuracy.
In Ref. [14], a comprehensive review of the literature has been written, covering a broad spectrum of pioneering
works for semantic and instance-level segmentation, including fully convolutional pixel-labeling networks,
encoder-decoder architectures, multi-scale and pyramid-based approaches, recurrent networks, visual attention
models, and generative models in adversarial settings. We investigate the similarity, strengths and challenges of
these deep learning models, examine the most widely used datasets, report performances, and discuss promising
future research directions in this area. In Ref. [15], with the advent of neural networks, deep convolutional neural
networks (DCNNs) provide benchmarking results in the problems related to computer vision. Manifold DCNNs
have been proposed for semantic segmentation such as U-Net, DeepU-Net, ResUNet, DenseNet, RefineNet, etc.
The general procedure is common for all the models. It has three phases - pre-processing, processing and output
generation. The outputs of the processing phase are the masked image and the segmented image. In that paper, a
systematic critique of the existing DCNNs for semantic segmentation has been manifested. The datasets and the
architectures of the existing models have also been discussed in this paper with illustrations.
In Ref. [16], the proposed model has 13 layers and uses dilated convolution and max-pooling to extract small
features. Ghost model deletes the duplicated features, makes the process easier, and reduces the complexity. The
Convolution Neural Network (CNN) generates a feature vector map and improves the accuracy of area or
bounding box proposals. Restructuring is required for healing; as a result, convolutional neural networks are
used to segment the medical images, and the initial region of a segmented medical image can be obtained. The
proposed model gives better results than the traditional models, with an accuracy of 96.05%, a precision of
98.2%, and a recall of 95.78%. In Ref. [17], a network and training strategy is presented that relies on the
strong use of data augmentation to use the available annotated samples more efficiently. The architecture
consists of a contracting path to capture context and a symmetric expanding path that enables precise
localization. We show that such a network can be trained end-to-end from very few images and outperforms the
prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal
structures in electron microscopic stacks.
In Ref. [18], after researching various techniques, the authors find that the CNN is one of the most powerful
tools for image segmentation. A detailed analysis of the CNN is also given, explaining its different layers and
the workings of each layer. CNN technology is being implemented widely nowadays, making human life more
convenient and less manual, with substantial work already done in fields such as communication, medical tasks,
crop monitoring, road transportation, activity detection, and product quality monitoring. In Ref. [19], a novel
attention Gabor network (AGNet) based on deep learning for medical image segmentation is proposed, capable of
automatically paying more attention to edges and thereby consistently improving segmentation performance. The
proposed model consists of two components.
The first one is to determine the approximate location of the organs of interest in the image using convolution
filters, and the other one is to highlight salient edge features intended for a specific segmentation task using
Gabor filters. In order to facilitate collaboration in between the two parts, a region attention mechanism based on
Gabor maps is suggested. The mechanism improved performance by learning to focus on the salient regions of
the image that are useful for the authors' tasks.
In Ref. [20], fully automatic segmentation of wound areas in natural images is an important part of the diagnosis
and care protocol since it is crucial to measure the area of the wound and provide quantitative parameters in the
treatment. Various deep learning models have gained success in image analysis including semantic
segmentation. This manuscript proposes a novel convolutional framework based on MobileNetV2 and connected
component labeling to segment wound regions from natural images. The advantage of this model is its
lightweight and less compute-intensive architecture. The performance is not compromised and is comparable to
deeper neural networks. They build an annotated wound image dataset consisting of 1,109 foot ulcer images from
889 patients to train and test the deep learning models.
V. PROPOSED METHOD
A. Dataset:
In the proposed method, the U-Net model is trained on the DRIVE dataset, which contains 20 training images with
masks and manually annotated ground truth images. The images were enhanced using preprocessing techniques such
as grayscale transformation, brightness transformation, CLAHE (Contrast Limited Adaptive Histogram
Equalization), and edge extraction by Canny edge detection. The aim of this method is to segment the blood
vessels of the retina for classification of disease severity.
B. Flow Diagram:
The flow diagram (Figure 3) depicts the workflow used to extract the blood vessels and segment them with the
U-Net model. In the proposed method, images are acquired from the dataset and each RGB image is converted to
grayscale. The gray image is processed with CLAHE (Contrast Limited Adaptive Histogram Equalization) and then
with the Canny edge detection method, by which the retinal vessels are extracted; this output is taken as the
ground truth image for the segmentation process. The U-Net model is trained to segment the retinal blood vessels
from the given image. The model is implemented in Google Colab on an NVIDIA Tesla K80 GPU, and its performance
is measured by comparing the segmented images with the ground truth.
Figure 3. Flow diagram: Image Acquisition → RGB to Gray Image → CLAHE → Denoising → Edge Detection (Canny) →
Ground Truth Image → Segmented Image → Result Evaluation
C. ALGORITHM:
Input: Ii, images read from the dataset.
Process:
Step 1: Convert the RGB image to a grayscale image.
Step 2: Increase brightness using CLAHE.
Step 3: Denoise the image with the Non-Local Means denoising function.
Step 4: Detect the edges with the Canny edge detector to obtain the ground truth image, GIi.
Step 5: Train the model: Ui = Ii + GIi.
Step 6: Obtain the segmented image, Si.
Output: extracted retinal blood vessels.
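The preprocessing steps above can be sketched end to end in NumPy. This is a simplified stand-in rather than the authors' implementation: plain global histogram equalization replaces CLAHE, a 3×3 mean filter replaces Non-Local Means, and a finite-difference gradient magnitude replaces the full Canny detector:

```python
import numpy as np

def to_gray(rgb):
    """Step 1: RGB -> grayscale via the usual luminance weights."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def equalize(gray):
    """Step 2 stand-in: global histogram equalization (CLAHE simplified)."""
    g = np.clip(gray, 0, 255).astype(np.uint8)
    cdf = np.cumsum(np.bincount(g.ravel(), minlength=256)) / g.size
    return (255 * cdf)[g]

def denoise(img):
    """Step 3 stand-in: 3x3 mean filter instead of Non-Local Means."""
    p = np.pad(img, 1, mode='edge')
    return sum(p[i:i + img.shape[0], j:j + img.shape[1]]
               for i in range(3) for j in range(3)) / 9.0

def edges(img):
    """Step 4 stand-in: finite-difference gradient magnitude (Canny simplified)."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.hypot(gx, gy)

rgb = np.random.default_rng(0).uniform(0, 255, size=(32, 32, 3))  # dummy image
ground_truth = edges(denoise(equalize(to_gray(rgb))))  # Steps 1-4
print(ground_truth.shape)  # edge map used as the ground truth for U-Net training
```

In the actual pipeline, the edge map produced this way is paired with the input image to train the U-Net (Steps 5-6).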
D. CLAHE (Contrast Limited Adaptive Histogram Equalization):
CLAHE operates on small regions of the image, called tiles, rather than on the entire image; the neighboring
tiles are then combined using bilinear interpolation to remove artificial boundaries. The algorithm can be
applied to improve the contrast of images, and it takes two parameters:
i. clipLimit – sets the threshold for contrast limiting; the default value is 40.
ii. tileGridSize – sets the number of tiles per row and column, 8×8 by default; it controls how the image is
divided into tiles for applying CLAHE.
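The effect of the clip limit can be sketched on a single tile in NumPy: the histogram is clipped at the limit and the excess is redistributed before equalizing. This simplified sketch omits the inter-tile bilinear interpolation and treats the clip limit directly as a per-bin count, so it is an illustration of the idea, not the exact CLAHE algorithm:

```python
import numpy as np

def clipped_equalize(tile, clip_limit):
    """Contrast-limited equalization of one tile (no inter-tile interpolation)."""
    hist = np.bincount(tile.ravel(), minlength=256).astype(np.float64)
    excess = np.maximum(hist - clip_limit, 0).sum()
    hist = np.minimum(hist, clip_limit) + excess / 256  # clip + redistribute
    cdf = np.cumsum(hist) / hist.sum()
    return np.round(255 * cdf).astype(np.uint8)[tile]

rng = np.random.default_rng(1)
tile = rng.integers(100, 110, size=(8, 8)).astype(np.uint8)  # one 8x8 tile
out = clipped_equalize(tile, clip_limit=4)
print(out.min(), out.max())  # contrast is spread, with amplification capped
```

Clipping the histogram is what limits noise amplification in near-uniform tiles, which is the main advantage of CLAHE over plain adaptive histogram equalization.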
E. De-noising Images:
In Ref. [22], the basic idea of Non-Local Means (NLM) is to build a pointwise estimate of the image in which
each pixel is obtained as a weighted average of pixels centered at regions similar to the region centered at the
estimated pixel. For a given pixel xi in an image x, NLM(xi) denotes the NLM-filtered value. Let wi,j be the
weight of xj with respect to xi, computed as
wi,j = (1/Ci) exp(−‖N(xi) − N(xj)‖² / h²) (2)
where N(x) denotes the neighborhood (patch) centered at x, Ci is a normalization factor, and h is a filter
parameter.
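A small NumPy sketch of the weight in Eq. (2) for a 1-D signal with 3-sample patches (the signal and filter parameter are illustrative; the normalization Ci makes the weights at pixel i sum to one):

```python
import numpy as np

def nlm_weights(x, i, h=10.0, radius=1):
    """Weights w_{i,j} of every pixel j w.r.t. pixel i (patch size 2*radius+1)."""
    p = np.pad(x.astype(np.float64), radius, mode='edge')
    patches = np.stack([p[j:j + 2 * radius + 1] for j in range(len(x))])
    d2 = np.sum((patches - patches[i]) ** 2, axis=1)  # ||N(x_i) - N(x_j)||^2
    w = np.exp(-d2 / h ** 2)
    return w / w.sum()                                # C_i normalization

x = np.array([10, 12, 11, 50, 52, 51])  # two flat regions separated by an edge
w = nlm_weights(x, i=0)
nlm_value = np.dot(w, x)  # NLM(x_0): weighted average of similar pixels
print(w.round(3))         # pixels with similar patches dominate the weights
```

Because patches across the edge are dissimilar, their weights are essentially zero, which is why NLM denoises flat regions without blurring edges.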
F. Canny Edge Detection Method:
This method comprises four major steps:
1. Reduce noise using Gaussian smoothing.
2. Compute the image gradient using the Sobel filter.
3. Apply Non-Maximum Suppression (NMS) to keep only the local maxima.
4. Finally, apply hysteresis thresholding with the two threshold values, T_upper and T_lower, that are passed to
the Canny() function.
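The hysteresis step can be sketched in NumPy: magnitudes at or above T_upper are strong edges, and those between T_lower and T_upper are kept only if connected to a strong edge. This simplified version uses iterative 4-connectivity growth; the parameter names mirror the text:

```python
import numpy as np

def hysteresis(mag, t_lower, t_upper):
    """Keep strong edges and any weak edges 4-connected to them."""
    strong = mag >= t_upper
    weak = (mag >= t_lower) & ~strong
    keep = strong.copy()
    changed = True
    while changed:  # grow strong edges into adjacent weak pixels
        grown = keep.copy()
        grown[1:, :] |= keep[:-1, :]
        grown[:-1, :] |= keep[1:, :]
        grown[:, 1:] |= keep[:, :-1]
        grown[:, :-1] |= keep[:, 1:]
        grown &= (strong | weak)
        changed = bool((grown & ~keep).any())
        keep = grown
    return keep

mag = np.array([[ 0, 60, 40,  0],
                [ 0,  0,  0, 40],
                [ 0,  0,  0,  0]], dtype=np.float64)
edges = hysteresis(mag, t_lower=30, t_upper=50)
print(edges.astype(int))
# 60 is a strong edge; the 40 next to it survives; the isolated 40 is dropped
```

This is why hysteresis preserves faint edge segments that continue a strong edge while discarding isolated weak responses.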
Figure 4 shows the original image, the preprocessed image, the edge image, and the segmented image.
A. Result Comparison:
In previous research works, the authors used Convolutional Neural Network models and the H-minima method for
classification on the DRIVE dataset, but segmentation helps the classification task considerably, and the U-Net
model is specifically designed for biomedical images. The model predicts segmented images that closely match the
ground truth image, and the edge images were generated automatically. The method is implemented in Google Colab
with a GPU environment. In the results, the proposed model is compared with existing works on different datasets
and with different models, and the proposed method with the U-Net model performs well. Table 2 compares the
datasets used and the accuracy obtained by other authors.
VII. CONCLUSION
In the proposed method, the DRIVE dataset is used, and Canny edge detection with the U-Net model was
implemented. The model identifies all the edges present in the original eye image, which represent the blood
vessels of the retina. From the identified edges, the U-Net model is trained with images and masks for various
numbers of epochs, and it finally achieves better accuracy than previous works. This work can be further
enhanced by applying other datasets, such as IDRiD, Messidor-2, and STARE, and the segmentation can also be done
with other models, such as Attention U-Net and Res U-Net. Owing to its ease of access and good performance, the
proposed method accelerates the diagnosis of diabetic eye diseases at an early stage.
REFERENCES
[1] https://pyimagesearch.com/2022/02/21/u-net-image-segmentation-in-keras/
[2] https://blog.paperspace.com/unet-architecture-image-segmentation/
[3] https://www.mygreatlearning.com/blog/introduction-to-image-pre-processing/
[4] D. Jude Hemanth, Omer Deperlioglu, Utku Kose, “An enhanced diabetic retinopathy detection and classification
approach using deep convolutional neural network”, https://doi.org/10.1007/s00521-018-03974-0.
[5] P. Pearline Sheeba, V. Radhamani, “Analyzing and Feature Extraction of Diabetic Retinopathy in Retinal Images”,
International Journal of Engineering Research & Technology (IJERT) http://www.ijert.org, ISSN: 2278-0181.
[6] Priyanka Konatham, Mounika Venigalla, Lakshmi Pooja Amaraneni, K. Suvarna Vani, “Automatic Detection of Optic
Disc for Diabetic Retinopathy”, International Journal of Innovative Technology and Exploring Engineering (IJITEE),
ISSN: 2278-3075 (Online), Volume-9 Issue-7, May 2020.
[7] Suvajit Dutta, Bonthala CS Manideep, Syed Muzamil Basha, Ronnie D. Caytiles and N. Ch. S. N. Iyengar,
“Classification of Diabetic Retinopathy Images by Using Deep Learning Models”,
http://dx.doi.org/10.14257/ijgdc.2018.11.1.09.
[8] S. Rathinam, S. Selvarajan, “Comparison of Image Preprocessing Techniques on Fundus Images for Early
Diagnosis of Glaucoma”, International Journal of Scientific & Engineering Research, Volume 4, Issue 12,
December 2013, ISSN 2229-5518.
[9] P. R. Wankhede, K. B. Khanchandani, “Feature Extraction In Retinal Images Using Automated Methods”, International
Journal Of Scientific & Technology Research Volume 9, Issue 03, March 2020 ISSN 2277-8616.
[10] Manal AlGhamdi, “Optic Disc Segmentation in Fundus Images with Deep Learning Object Detector”, Journal of
Computer Science 2020, 16(5): 591-600, DOI: 10.3844/jcssp.2020.591.600.
[11] Ahmed Almazroa, Weiwei Sun, Sami Alodhayb, Kaamran Raahemifar, Vasudevan Lakshminarayanan, “Optic disc
segmentation for glaucoma screening system using fundus images”, https://doi.org/10.2147/OPTH.S140061.
[12] Muhammad Alam, Jian-Feng Wang, Cong Guangpei, LV Yunrong, Yuanfang Chen, “Convolutional Neural Network for
the Semantic Segmentation of Remote Sensing Images”, Mobile Networks and Applications (2021) 26:200–215,
https://doi.org/10.1007/s11036-020-01703-3.
[13] Arun Kumar Yadav, Arti Jain, Jorge Luis Morato Lara and Divakar Yadav, “Retinal Blood Vessel Segmentation
using Convolutional Neural Networks”, DOI: 10.5220/0010719500003064.
[14] Shervin Minaee, Yuri Boykov, Fatih Porikli, Antonio Plaza, Nasser Kehtarnavaz, and Demetri Terzopoulos, “Image
Segmentation Using Deep Learning: A Survey”, arXiv: 2001.05566v5 [cs.CV] 15 Nov 2020.
[15] Rishipal Singh, Rajneesh Rani, “Semantic Segmentation using Deep Convolutional Neural Network: A Review”,
International Conference On Intelligent Communication And Computational Research (ICICCR-2020),
https://ssrn.com/abstract=3565919.
[16] Marcelo Zambrano-Vizuete, Miguel Botto-Tobar, Carmen Huerta-Suárez, Wladimir Paredes-Parada, Darwin Patiño
Perez, Tariq Ahamed Ahanger, and Neilys Gonzalez, “Segmentation of Medical Image Using Novel Dilated Ghost Deep
Learning Model”, Computational Intelligence and Neuroscience, https://doi.org/10.1155/2022/6872045.
[17] Olaf Ronneberger, Philipp Fischer, and Thomas Brox, “U-Net: Convolutional Networks for Biomedical Image
Segmentation”, arXiv: 1505.04597v1 [cs.CV] 18 May 2015.
[18] Ravi Kaushik, Shailender Kumar, “Image Segmentation Using Convolutional Neural Network”, International Journal Of
Scientific & Technology Research Volume 8, Issue 11, November 2019, ISSN 2277-8616.
[19] Shaoqiong Huang, Mengxing Huang, Yu Zhang, Jing Chen, Uzair Bhatti, “Medical image segmentation using deep
learning with feature enhancement”, IET Image Processing, doi: 10.1049/iet-ipr.2019.0772.
[20] Chuanbo Wang, D. M. Anisuzzaman, Victor Williamson, Mrinal Kanti Dhar, Behrouz Rostami, Jeffrey Niezgoda,
Sandeep Gopalakrishnan, Zeyun Yu, “Fully automatic wound segmentation with deep convolutional neural networks”,
https://doi.org/10.1038/s41598-020-78799-w.
[21] https://datagen.tech/guides/image-annotation/image-segmentation/, https://www.v7labs.com/blog/image-segmentation-
guide.
[22] Linwei Fan, Fan Zhang, Hui Fan and Caiming Zhang, “Brief review of image denoising techniques”, Visual
Computing for Industry, Biomedicine, and Art (2019) 2:7, https://doi.org/10.1186/s42492-019-0016-7.
[23] Wang Xiancheng, Li Wei, Miao Bingyi, Jing He, Zhangwei Jiang, Wen Xu, Zhenyan Ji, Gu Hong, Shen Zhaomeng,
“Retina Blood Vessel Segmentation Using A U-Net Based Convolutional Neural Network”, International Conference
on Data Science (ICDS 2018), www.sciencedirect.com.
[24] Boubakar Khalifa Albargathe, S.M., Kamberli, E., Kandemirli, F. et al., “Blood vessel segmentation and
extraction using H-minima method based on image processing techniques”, Multimedia Tools and Applications 80,
2565–2582 (2021), https://doi.org/10.1007/s11042-020-09646-3.
COVID-19 Tracker
N Malarvizhi1, Arun Kumar Dash2, J Aswini3 and V Manikanta4
1 Professor, Dept of CSE, Vel Tech Rangarajan Dr Sagunthala R&D Institute of Science and Technology, Chennai
2-4 UG Student, Dept of CSE, Vel Tech Rangarajan Dr Sagunthala R&D Institute of Science and Technology, Chennai
3 Professor, Dept of CSE, Saveetha Engineering College, Chennai
Email: [email protected], [email protected], [email protected], [email protected]
Abstract— Covid-19, commonly referred to as the coronavirus outbreak, appears to be in the process of dying out
completely, but the future of any virus or bacterium can never be known. The prevalence of the coronavirus in
our society can be monitored successfully only through proper tracking and the subsequent analysis of the
tracked data. That analysis depends entirely on domain experts, but they can only analyze the tracking data
provided to them. The application being built here eases the work of these analysts: it keeps track of the
day-to-day corona cases, their surges and downfalls, and shows the daily changes in the form of graphs and
charts. A clean user interface helps not only the experts but also the general public visualize the readings
clearly, by minimizing visible detail and presenting a more abstract view suited to small-screen applications.
People can check the rises and falls of Covid cases in pictorial form through bar charts, pie charts, line
graphs, and other graphical diagrams. The data for all countries is fetched via an API call and updated by the
API service provider itself; the fetched data is then processed and visualized as bars and charts. Apart from
the daily cases, users can also view the daily deaths and recoveries, check the total number of Covid cases to
date, and easily view the total numbers of deaths and recoveries. Users are given an option to sort the type of
data they want to view, i.e., deaths, recoveries, new cases, or total cases, and the corresponding pictorial
representation is shown to them. Mobile phone users always look for an application built for their comfort, so
this application visualizes the data with less detail and in a more abstract way, targeting small-screen users.
I. INTRODUCTION
The title of this project, "Covid-19 Tracker", in simple terms means a record of the daily Covid cases prevailing in the country. These records in turn help analysts and experts to understand the Covid scenario properly and make sound decisions for the future. The tracking also helps every individual using the app to stay up to date and aware of possible consequences. In the past, case counts became less predictable when a second wave appeared in the curves of active and new cases, even though the curve of deaths did not show a clean second wave. These swings left not only analysts but also the general public poring over graphs every day, with little understanding of dense, high-level charts on mobile devices and large numbers that explain little.
Though there were many types of visualizations, highly detailed, lower-level graphical representations lost
time to get the response. This data is passed onward to the database and to the UI, and received from the network into the database, and so on. Thus the back end manages the data and sends it to the front end through a single interaction layer, avoiding multiple points of contact between front end and back end.
There are three modules in this application: the Domain module, the Data module and the Presentation module. The Presentation module handles the input received from the user, converts it into function calls, and passes them to the Domain module; it also handles all the views and visual elements presented to the user, such as buttons, colours, layouts, frames and images. The Domain module collects the function calls from the Presentation module and passes them to the Data module. The Data module receives data requests from the Domain module and returns responses containing data fetched either from the database or through network calls. The final data is passed to the Presentation module and displayed to the user.
A. Domain Module
The Domain module contains all the types responsible for the application's back end. It holds the entities, not limited to the types present in the database model, which are used throughout the application in place of raw structured data. The Domain module also implements the use cases required for the application's functionality. It relies on the Data module to fetch data for a specific use case, may convert the result into the specified entities, and passes it back to the UI. The UI uses these use cases to obtain data without interacting directly with the database or the API.
B. Data Module
The data module consists of all network calls such as API requests and responses, socket management and
database management. The data is obtained from the network and entered to the database tables. These tables
receive data from sockets also. The formatting of these data happens in this module. The formatted data is passed
to the Domain module as a response.
C. Presentation Module
The Presentation module performs all input and output operations with the user. It contains all the views that display buttons, text, images, and so on; these views are populated with data obtained from the Domain module. Inner views can likewise obtain their data through the Domain module. The data used here should be easy to understand, as it is presented directly to the user.
D. Input Design
Input is collected through user-interface items such as buttons, selectors and touch events on the mobile screen. The inputs are converted into use cases in the Presentation module and sent to the Domain module, which collects data in response to the input events. The inputs are logged using the default logger of the development environment and collected continuously while the application is in use, so that if the application crashes, the recent events can be sent to the developers in the crash report to help analyse the cause.
E. Output Design
The output is displayed as an iOS application in iPhones. The output includes the graphical representations of the
covid-19 data in bar graphs, pie-charts, half pie-charts. The data is also represented in maps showing the counts
inside circular buttons, which on click displays the country's case details in a pull-over view, which can be
dragged across the screen to view the details. Further, there are buttons to switch the type of data in the map
visualization. A sample screenshot of output is displayed in Figure 2.
The proposed system presents less detailed data in its graphs and charts, allowing users to understand the Covid case trends at a glance on a mobile phone. It also has performance advantages: it processes less data, which reduces time and space complexity compared with the complex data handled by existing systems. This efficiency makes the proposed application better suited to users' needs than the existing systems.
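The reduction of detail can be sketched as two small transformations: deriving day-over-day changes from cumulative totals, and thinning the series for a small screen. Shown in Python for illustration only; the figures are invented.

```python
def daily_changes(cumulative):
    """Day-over-day new cases derived from a cumulative case series."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]

def downsample(series, keep_every=7):
    """Keep every k-th reading (e.g. one point per week) so a small
    screen shows the trend without overwhelming detail."""
    return series[::keep_every]

totals = [100, 130, 180, 200, 260, 300, 330, 400]   # invented cumulative totals
print(daily_changes(totals))   # [30, 50, 20, 60, 40, 30, 70]
print(downsample(totals, 3))   # [100, 200, 330]
```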
V. CONCLUSION AND FUTURE ENHANCEMENTS
The proposed work is mainly used to track Covid-19 surges. Finding, testing, isolating and treating Covid patients is an ongoing process that undoubtedly helps many people, but it is also vital to trace and track the day-to-day Covid-19 cases, keep records, and analyse them. The resulting analysis helps experts and individuals alike to build awareness and ease public anxiety. This model can be adopted for other visualization tasks such as stocks, HR management, analytics, business management, project management and health services. Any industry that displays graphs and other visualizations in mobile applications can build an iOS application using simple visualization and keep it comfortable for users viewing the data.
REFERENCES
[1] Brehmer, M., Lee, B., Isenberg, P., & Choe, E. K. (2018). Visualizing ranges over time on mobile phones: A task-based crowdsourced evaluation. IEEE Transactions on Visualization and Computer Graphics, 25(1), 619–629.
[2] Comba, J. L. (2020). Data visualization for the understanding of covid-19. Computing in Science & Engineering, 22(6),
81–86.
[3] Hamzah, F. B., Lau, C., Nazri, H., Ligot, D. V., Lee, G., Tan, C. L., Shaib, M., Zaidon, U. H. B., Abdullah, A. B.,
Chung, M. H. et al. (2020). Coronatracker: Worldwide covid-19 outbreak data analysis and prediction. Bull World
Health Organ, 1(32), 1–32.
[4] Han, J., Brown, C., Chauhan, J., Grammenos, A., Hasthanasombat, A., Spathis, D., Xia, T., Cicuta, P., & Mascolo, C.
(2021). Exploring automatic covid-19 diagnosis via voice and symptoms from crowdsourced data. ICASSP 2021-2021
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 8328–8332.
[5] Lampos, V., Majumder, M. S., Yom-Tov, E., Edelstein, M., Moura, S., Hamada, Y., Rangaka, M. X., McKendry, R. A.,
& Cox, I. J. (2021). Tracking covid-19 using online search. NPJ digital medicine, 4(1), 1–11.
[6] Ribeiro, M., Nisi, V., Prandi, C., & Nunes, N. (2020). A data visualization interactive exploration of human mobility
data during the covid-19 outbreak: A case study. 2020 IEEE Symposium on Computers and Communications (ISCC),
1–6.
[7] Singh, P. K., Nandi, S., Ghafoor, K. Z., Ghosh, U., & Rawat, D. B. (2020). Preventing covid-19 spread using
information and communication technology. IEEE Consumer Electronics Magazine, 10(4), 18–27.
Abstract— People who are visually impaired face various problems in their daily life; their activities are greatly restricted by loss of eyesight. They usually travel using blind-navigation systems or memories accumulated through long-term exploration. This paper presents a new design of assistive smart vision goggles for visually impaired persons. The main objective of the proposed model is user comfort: the device is designed in a wearable format. It comprises a pair of glasses with an obstacle-detection module fitted at the centre, voice output through a headset, a camera to detect obstacles, and text recognition that reads text aloud after scanning it through the camera, with input given through a switch. A Raspberry Pi is used for image capture, and ultrasonic sensors, which can scan at most 5-6 meters over a 30-degree range, are used for obstacle detection. Cloud technology is used to identify the objects scanned by the camera. These smart vision goggles are portable, lightweight, easy to use, and user-friendly, and can guide blind people and help them avoid obstacles.
Index Terms— Smart goggles, Blind People, Raspberry pi, Ultrasonic Sensors, Text recognition.
I. INTRODUCTION
People who are visually impaired have a reduced ability of visual perception. Independent mobility is one of the major challenges they encounter in daily life, as their activities are greatly restricted by the loss of eyesight. Visually impaired people include both those who have lost their sight completely and those with partial loss of vision [1]. According to the World Health Organization (WHO), there are around 285 million visually impaired people in the world, of whom 39 million are totally blind and 246 million have low vision. Because they require little or no use of the hands, wearable gadgets are considered the most helpful of all assistive technologies, and the head-mounted type is the most common. Their primary benefit is that, unlike other devices, they naturally point in the direction the wearer is facing, removing the need for extra direction cues. This paper presents a new style of smart glasses that can help with a variety of tasks at a modest construction cost. To communicate information to the user, the design makes use of the latest Raspberry Pi 4 Model B, a camera, and earphones.
III. IMPLEMENTATION
The implementation of the smart vision goggles for the blind uses components such as a Raspberry Pi, ultrasonic sensors, a webcam, and earphones. All components are placed in a compact and secure manner so the person can carry them easily. The wearer can use the device whenever needed and remove it when it is not necessary; the goggles help blind people reach their destination independently. These glasses are reliable and easy to implement because they are built from ultrasonic sensors and a Raspberry Pi, which are commonly available almost everywhere. The ultrasonic sensors have a specified sensing range (2 cm-4 m). They detect obstacles in front of the person, such as objects, stairs and buildings, and give a voice-over to the person through headphones. We designed the goggles so that they can identify the object in front of the person by taking pictures via the camera. The captured image information is converted into voice and delivered through headphones, helping blind people know who or what is in front of them. The goggles also help the blind read text by scanning a book through the camera. A further feature is that, if the blind person feels insecure, he can press an input switch and a message is immediately sent to his caretaker.
The Raspberry Pi is a low-cost, small-sized computer used to run the programs. Ultrasonic sensors measure distance using ultrasonic waves: the sensor head emits an ultrasonic wave and receives the wave reflected from the target, and the distance is derived from the time between emission and reception. Their detection range is limited to 2 cm-4 m. The ultrasonic sensors are placed on top of the model, some angled to identify obstacles, ditches, holes, and stairs, and the user is alerted with a message through the earphones. The functional diagram of the proposed model is shown in Figure 1. The features, namely voicing out obstacles, detecting the image, reading the scanned image, and sending the current location, are programmed on the Raspberry Pi. A single button is provided to execute all these features: pressing it for different periods executes the corresponding feature. The output of every feature is in speech form using the gTTS (Google Text-to-Speech) library [4].
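The emission-to-reception timing converts to distance with simple arithmetic (speed of sound about 343 m/s, halved for the round trip). The sketch below shows the conversion and range check; the GPIO wiring is only indicated in comments, since the paper does not give pin-level details.

```python
SPEED_OF_SOUND_CM_S = 34300  # ~343 m/s at room temperature, in cm/s

def echo_time_to_distance_cm(round_trip_s):
    """Convert the round-trip time of an ultrasonic pulse to a distance.
    The pulse travels to the obstacle and back, so the path is halved."""
    return round_trip_s * SPEED_OF_SOUND_CM_S / 2

def in_range(distance_cm, lo=2, hi=400):
    """Sensors of this class are only reliable within roughly 2 cm-4 m."""
    return lo <= distance_cm <= hi

# On the device the round-trip time would come from GPIO, roughly:
#   GPIO.output(TRIG, True); sleep(10e-6); GPIO.output(TRIG, False)
#   then timestamp the rising and falling edges on the ECHO pin.
print(echo_time_to_distance_cm(0.01))   # ~171.5 cm
```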
The stepwise procedure to use this prototype model is as follows.
Step 1: Connect all the modules and other components to the Raspberry Pi and the microcontroller.
Step 2: Get the option from the user through the input switch.
Step 3: The ultrasonic sensors automatically sense obstacles in front of the user and alert him, specifying the distance range, as shown in Figure 5.
Step 4: If the input switch is pressed for less than 5 sec, an image is captured through the camera and the obstacle is identified using the Imagga cloud service, as shown in Figure 3.
Step 5: If the input switch is pressed for 5-10 sec, an image is captured, the text present in it is extracted using the pytesseract module, and the text is converted to speech using the gTTS library, as shown in Figure 4.
Step 6: If the input switch is pressed for more than 10 sec, a notification is sent to the user's caretaker.
Step 7: Each action produces a resultant text, which is given to the gTTS module to convert into speech that the user hears through the earphones.
Figure 3. Capturing the image and identifying the object.
Figure 4. Capturing the image and reading the text.
The Raspberry Pi used in our project is the Raspberry Pi 4 Model B (shown in Figure 6), with a high-speed 64-bit quad-core processor, 4 GB RAM and dual-band 2.4/5.0 GHz wireless LAN [9]. It is the heart of this project, responsible for performing all control actions [8].
B. External Webcam:
Figure 7. Webcam
An external webcam is attached to the goggles and is used to capture images for the optical character recognition (OCR) feature and the object-identification feature [10]. The external webcam used in this prototype is shown in Figure 7.
C. Ultrasonic Sensors:
Ultrasonic sensors are used to detect obstacles, ditches, and stairs in front of the user. They work within a particular range of distances [11]. The ultrasonic sensors used in this project are shown in Figure 8; their range is 2 cm-4 m.
D. Input Switch:
The input switch acts as a hand-held remote through which the user selects a particular feature to execute and hears the result. The switch is operated according to how long the user holds it: less than 5 sec identifies the obstacle by scanning through the webcam; 5-10 sec enables text recognition; more than 10 sec sends a message to the user's caretaker [12]. The switch used in this project is shown in Figure 9; its power rating is max. 50 mA at 24 V DC and its operating temperature range is -20 to 70 °C.
E. Microcontroller:
The microcontroller used in this project is the ATmega328P, an 8-bit AVR RISC-based microcontroller. It provides 32 KB of in-system-programmable flash memory with read-while-write capability, 1 KB of EEPROM, 2 KB of SRAM, 23 general-purpose I/O pins, and 32 general-purpose working registers [13]. The microcontroller collects the readings from the ultrasonic sensors and passes them to the Raspberry Pi for processing.
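The paper does not specify the link or message format between the ATmega328P and the Raspberry Pi. Assuming a simple line-oriented serial protocol such as "DIST:<cm>" (purely hypothetical), the Pi-side parsing could look like this:

```python
def parse_distance_line(line):
    """Parse a hypothetical 'DIST:<cm>' line sent by the ATmega328P.
    Returns the distance in cm, or None for malformed input."""
    line = line.strip()
    if not line.startswith("DIST:"):
        return None
    try:
        return int(line[len("DIST:"):])
    except ValueError:
        return None

# On the Pi this would wrap a serial port, e.g. (hypothetical wiring):
#   import serial
#   port = serial.Serial("/dev/ttyUSB0", 9600)
#   cm = parse_distance_line(port.readline().decode())
print(parse_distance_line("DIST:87"))   # 87
```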
V. SOFTWARE USED
The IDLE software needs to be installed on the computer for the application to function properly [14]. The specifications of the software used are:
Operating System – Windows
Programming Language – Python 3 [15]
IDE – IDLE
Libraries – requests, time, cv2, speake3, pytesseract, pyttsx3, json, telepot, RPi.GPIO, Imagga.
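The text-reading feature chains two of the listed libraries, pytesseract and gTTS. A minimal sketch is below; imports are deferred so the sketch only needs the libraries when actually run, and the file paths are hypothetical.

```python
def announcement(text):
    """Phrase spoken through the earphones for the extracted text."""
    return f"The text reads: {text}" if text else "No text was detected"

def image_to_speech(image_path, mp3_path):
    """Capture-to-voice pipeline: OCR the image, then synthesize speech.
    Lazy imports keep this sketch runnable without the hardware stack."""
    from PIL import Image
    import pytesseract
    from gtts import gTTS
    text = pytesseract.image_to_string(Image.open(image_path)).strip()
    gTTS(text=announcement(text)).save(mp3_path)   # saved, then played back
    return text

# Usage on the device (paths are placeholders):
#   image_to_speech("/home/pi/capture.jpg", "/home/pi/speech.mp3")
```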
Figure 10. Hardware and software interfacing.
Figure 11. Hardware connections.
The hardware and software interfacing is shown in Figure 10 and the hardware connections are depicted in Figure 11. A USB cable supplies power to the Raspberry Pi and the earphones are plugged in. After interfacing, the program is compiled and run; the respective features then execute based on input from the switch, and the output is heard through the earphones.
IX. RESULTS
The output of the smart vision goggles is shown in Figures 12 to 15. When the user walks with the goggles on and there is an obstacle, he hears the message "obstacle in front of you" (for an object or stairs) or "ditch in front of you" (for downward stairs or a hole), as shown in Figure 12. If the user wants to know what the obstacle is, he can press the input switch. The features offered are:
Case 1: If he presses it for less than 5 sec, the device captures an image and reports the objects found in it to the user through the earphones. That information is shown in Figure 14.
Case 2: If he presses it for 5-10 sec, the device captures an image, reads the text in it, and speaks that text to the user through the earphones, as shown in Figure 15.
Case 3: Another feature included in the project is that, if the user feels insecure, he can long-press (>10 sec) the input switch so that a message is sent to his caretaker, as shown in Figure 13.
Figure 12. Obstacle message to the user through earphones.
Figure 13. A message sent to the caretaker's mobile.
Figure 14. Captured image and Obstacle Information heard by the user
Figure 15. Text captured and text information heard by the user
X. CONCLUSION
Smart vision goggles is a project in which a Raspberry Pi model provides assistive care for blind people. The product helps blind people read newspapers and books, identifies the obstacles in front of the user, captures images, and can send an alert message to the user's caretaker. All these features deliver voice output through headphones and are accessed by the user through the input switch provided. The proposed model provides efficient results and is easy to use; with it, the user is able to walk along roads and through labs and hallways without any assistance.
XI. FUTURE SCOPE
All the features developed in this product are processed by the Raspberry Pi. As further enhancements, the device could give directions to guide the person, announce the current date and time, report the user's exact location, provide weather information, and read out the top ten headlines of the day's news, all processed on the Raspberry Pi. These additions would help the person keep up with what is happening in society without any assistance.
REFERENCES
[1] M. R. Miah and M. S. Hussain, "A Unique Smart Eye Glass for Visually Impaired People," International Conference on
Advancement in Electrical and Electronic Engineering (ICAEEE), pp. 1-4, doi: 10.1109/ICAEEE.2018.8643011, 2018.
[2] Global data of visually impaired from the WHO website: www.who.int/blindness/publications/globaldata/en/.
[3] L. -B. Chen, J. -P. Su, M. -C. Chen, W. -J. Chang, C. -H. Yang and C. -Y. Sie, "An Implementation of an Intelligent
Assistance System for Visually Impaired/Blind People," IEEE International Conference on Consumer Electronics
(ICCE), pp. 1-2, doi: 10.1109/ICCE.2019.8661943, 2019.
[4] K. R. Rani, "An audio aided smart vision system for visually impaired," International Conference on Nextgen Electronic
Technologies: Silicon to Software (ICNETS2), 2017, pp. 22-25, doi: 10.1109/ICNETS2.2017.8067889.
[5] L. Abraham, N. S. Mathew, L. George and S. S. Sajan, "VISION- Wearable Speech Based Feedback System for the
Visually Impaired using Computer Vision," 2020 4th International Conference on Trends in Electronics and Informatics
(ICOEI)(48184), pp. 972-976, doi: 10.1109/ICOEI48184.2020.9142984.
[6] P. P. Singh, S. S. Hegde, R. Varun, V. Hegde, and K. A. S. Devi, “Audio Narration of a Scene for Visually Disabled
using Smart Goggle”, IJRESM, vol. 5, no. 4, pp. 73–75, Apr. 2022.
[7] Du Buf, J.M. Hans; Barroso, João; Rodrigues, João M.F.; Paredes, Hugo; Farrajota, Miguel; Fernandes, Hugo; João
José; Teixeira, Victor; Saleiro, Mário. The SmartVision Navigation Prototype for Blind Users, International Journal of
Digital Content Technology and its Applications, 5, 5, 351-361, 2011.
[8] M. T. Islam, M. Ahmad and A. S. Bappy, "Real-Time Family Member Recognition Using Raspberry Pi for Visually
Impaired People," IEEE Region 10 Symposium (TENSYMP), 2020, pp. 78-81, doi: 10.1109/TENSYMP50017.2020.9230937.
[9] A. Pardasani, P. N. Indi, S. Banerjee, A. Kamal and V. Garg, "Smart Assistive Navigation Devices for Visually
Impaired People," 2019 IEEE 4th International Conference on Computer and Communication Systems (ICCCS), 2019,
pp. 725-729, doi: 10.1109/CCOMS.2019.8821654.
[10] K. Kumar, R. Patra, M. Manjunatha, J. Mukhopadhyay and A. K. Majumdar, "An electronic travel aid for navigation of
visually impaired persons," Third International Conference on Communication Systems and Networks (COMSNETS
2011), 2011, pp. 1-5, doi: 10.1109/COMSNETS.2011.5716517.
[11] R. Agarwal et al., "Low-cost ultrasonic smart glasses for blind," 8th IEEE Annual Information Technology, Electronics
and Mobile Communication Conference (IEMCON), 2017, pp. 210-213, doi: 10.1109/IEMCON.2017.8117194.
[12] J. A. Brabyn, "New Developments in Mobility and Orientation Aids for the Blind," in IEEE Transactions on Biomedical
Engineering, vol. BME-29, no. 4, pp. 285-289, April 1982, doi: 10.1109/TBME.1982.324945.
[13] M. P. Agrawal and A. R. Gupta, "Smart Stick for the Blind and Visually Impaired People," Second International
Conference on Inventive Communication and Computational Technologies (ICICCT), 2018, pp. 542-545, doi:
10.1109/ICICCT.2018.8473344.
[14] K. S. P, A. A, G. K and H. S, "Raspberry Pi based Smart Assistance for Visually Impaired People," 2022 3rd
International Conference on Electronics and Sustainable Communication Systems (ICESC), 2022, pp. 1199-1204, doi:
10.1109/ICESC54411.2022.9885412.
[15] M. A. V., L. M. Gladence, R. Velaga and B. Valluri, "Smart Assistive System for Blind People using Raspberry PI,"
2020 International Conference on System, Computation, Automation and Networking (ICSCAN), 2020, pp. 1-8, doi:
10.1109/ICSCAN49426.2020.9262433.
Abstract— The recommender system (RS) has emerged as the most popular application on e-commerce websites. In the e-commerce business, collaborative-filtering-based RSs suggest products to customers and find the interesting items the users may wish to purchase. The success of any recommendation system depends on the reliability of its information and on expressing daily-life behaviour in a consolidated format. The main task is to produce the best-ranked list of n items for the user's need. Owing to their natural structure, Z-numbers are well suited to producing a recommendation list, and to solve real-life problems they should be incorporated into decision-making models. Online shopping, however, involves multicriteria group decision-making (MCGDM), and Z-information presents some difficulties for MCGDM. Therefore, to enhance the ability of Z-numbers, Complex Fuzzy Sets (CFSs) are employed, and entropy, a distance measure, and an aggregation operator are combined to produce an MCGDM-based ranking of customer preferences.
I. INTRODUCTION
Data Science is one of the fastest-growing fields under the umbrella of Computer Science, and recommendation systems are a subdomain of Data Science used by e-commerce websites such as Amazon, Flipkart and Netflix. A recommendation system presents relevant products to the user to increase sales and the customer base. It is not easy, however, to map products precisely to a customer's priorities or choices: the main reasons include the limited knowledge of decision-makers, lack of evaluation time, and insufficient understanding of the alternatives. The success of any recommendation system depends on information reliability and on expressing daily-life behaviour in a consolidated format. Z-numbers have proven advantageous in representing uncertain information; Zadeh proposed the concept in 2011 [1]. In this article, improved Z-numbers are applied to strengthen the recommendation system. Since various uncertain data are inevitable in addressing actual problems, uncertain Z-numbers are used to express information and to measure information reliability.
It is generally accepted that linguistic description is more consistent with everyday expression [1]. Some difficulties still limit the applications, however: the calculation process for Z-numbers is a typical nonlinear programming problem, which inevitably leads to complicated computation. Therefore, to enhance the ability of Z-numbers, Complex Fuzzy Sets (CFSs) are employed, together with entropy, a distance measure, and an aggregation operator.
II. FUNDAMENTALS
A. Z-numbers
Z-numbers can be improved to uncertain Z-numbers by using an interval concept, instead of a specific value, to characterize the fuzzy constraint and the reliability. Typically, a Z-number is composed of two parts A and B, written Z = (A, B), where A and B represent the fuzzy constraint and the reliability, respectively. The two components are connected by the underlying possibility distribution (UPD). For example, "User e1 is very certain that product a1 is good based on criterion c1" can be transformed into Z = ("good", "very certain").
Usually the determination of the UPD is a problem. In 2002 the concept of complex fuzzy sets (CFSs) was defined by Ramot et al. [20]. A CFS is an extension of a type-1 fuzzy set in which the codomain of the membership function is the unit disc. A brief description of CFSs is given in the next subsection.
B. Complex Fuzzy Sets
Assume S is a CFS on a universe of discourse U. Then S = {(x, μ_S(x)) | x ∈ U}, where
μ_S(x) = r_S(x) · e^{jω_S(x)},
with r_S(x) ∈ [0, 1], ω_S(x) ∈ [0, 2π], and j = √−1.
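In code, a complex membership grade is just a complex number with a bounded amplitude and phase. A minimal sketch using only the Python standard library (the function name is illustrative):

```python
import cmath

def cfs_membership(r, omega):
    """Complex fuzzy membership grade r * e^(j*omega),
    with amplitude r in [0, 1] and phase omega in [0, 2*pi]."""
    assert 0 <= r <= 1 and 0 <= omega <= 2 * cmath.pi
    return r * cmath.exp(1j * omega)

mu = cfs_membership(0.8, cmath.pi / 2)
print(abs(mu))           # amplitude term, ~0.8
print(cmath.phase(mu))   # phase term, ~pi/2
```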
C. Uncertain Z-Numbers
In [17], Z-numbers were combined with Atanassov's interval-valued intuitionistic linguistic fuzzy sets to formalize "uncertain Z-numbers", e.g.
IVZ = ((very good, [0.6, 0.8]), (uncertain, [0.7, 0.8])).
Such an expression can be represented by a pair of bounded intervals IVZ = ([Sa−, Sa+], [Hb−, Hb+]), where Sa− and Sa+ are the lower and upper bounds of the linguistic constraint, and Hb− and Hb+ are the lower and upper bounds of the reliability, respectively.
The aim of IVZ is to better represent the fuzzy constraint and reliability. IVCFSs, written
S = [r_S^−(x), r_S^+(x)] · e^{j[ω_S^−(x), ω_S^+(x)]},
can also achieve this aim: r_S^−(x) and r_S^+(x) are the lower and upper bounds of the amplitude term, and ω_S^−(x) and ω_S^+(x) are the lower and upper bounds of the phase term, respectively.
The theme of IVZ is that an interval concept is more appropriate than a specific concept to characterize the fuzzy constraint and reliability, e.g. {anticipated budget deficit, between 1.8 and 2.2 million dollars, likely, very likely}.
D. Transformation From Uncertain Z-Numbers to IVCFSs:
Z-numbers are expressed by the pair Z = (A, B) [1]. To make the information more expressive and reliable, A and B are projected onto intervals rather than specific values in uncertain Z-numbers. To express an IVZ as an IVCFS, the membership variables of [Sa−, Sa+] and the probabilistic variables of [Hb−, Hb+] are represented by the amplitude bounds r^−(x), r^+(x) and the phase bounds ω^−(x), ω^+(x), respectively, by using definition (1). The individual evaluations are then combined with an interval-valued complex fuzzy weighted averaging operator:
Z = IVCFWA(z_1, z_2, ..., z_n) = z_1 ⊕ z_2 ⊕ ... ⊕ z_n,    (4)
z_i ⊕ z_j = [1 − (1 − r_i^−)^{w_i}(1 − r_j^−)^{w_j}, 1 − (1 − r_i^+)^{w_i}(1 − r_j^+)^{w_j}] · e^{j2π[∏_k (ω_k^−/2π)^{w_k}, ∏_k (ω_k^+/2π)^{w_k}]}.    (5)
Therefore, to satisfy multicriteria decision making in collaborative-filtering-based recommendation, an improved version of uncertain Z-numbers is used in the proposed algorithm. The steps are:
(i) Obtain the IVCFSs of the evaluation information, as shown in Table I.
(ii) Estimate the weight vector.
(iii) Obtain the collective evaluation information.
(iv) Calculate the positive and negative distances Dp and Dn between the alternatives and the ideal solutions.
(v) Generate a ranked list of products in decreasing order of preference, based on the match with the customer query.
The evaluation information is displayed in Tables I, II and III for attributes c1, c2, c3 and c4.
VI. EXPERIMENTS
During online shopping, the specific set of products displayed in the first few seconds plays an extremely important role in attracting customer attention. Recommendations of the best-suited products are based on the profile-building features of collaborative filtering; beyond this, producing a list of products in decreasing order of taste may help leave a good impression in the customer's memory. Customers always want to choose the most suitable brand of a product based on their needs. Suppose a customer wants to select one of four brands a1, a2, a3 and a4. For the sake of prudence, a group of three online users e1, e2 and e3 evaluates the options from four aspects: physical appearance (c1), colour combination (c2), performance (c3) and price (c4). Assume the Linguistic Term Sets (LTSs) are H = {h0 = extremely terrible, h1 = very terrible, h2 = terrible, h3 = slightly terrible, h4 = slightly good, h5 = good, h6 = very good, h7 = extremely good} and S = {s0 = extremely uncertain, s1 = very uncertain, s2 = uncertain, s3 = slightly uncertain, s4 = slightly certain, s5 = certain, s6 = very certain, s7 = extremely certain}.
A. Results
Table IV,V, and VI consist of evaluation information in the form of IVCFSs. After calculating the weight vector
of criteria for user e1,e2,and e3, estimated collective evaluation information with aggregation operator is shown in
Table VI . Besides these distances between alternatives and ideal solutions distances Dp and Dn are estimated.
Besides this Table VII is displaying ranking of all alternatives in decreasing order of preference. Hence the order
of displaying of face sketch can follow the estimated sequence to achieve better impression.
TABLE IV INPUTS ARE IN IVCFS FOR E1
c1 c2 c3 c4
a1 (0.43,0.57,0.90,4.49) (0.14,0.86,3.59,4.49) (0.29,0.43,1.79,2.69) (1.79,2.24,4.49,6.28)
a2 (0.14,0.29,1.79,2.69) (0.71,1,3.59,5.38) (0.29,0.57,1.79,4.49) (0.43,0.71,3.59,6.28)
a3 (0.29,0.43,0.90,1.79) (0.71,0.86,0.90,4.49) (0.71,1,2.69,3.59) (0.57,0.71,0.90,1.79)
a4 (0.43,0.57,2.69,4.49) (0.14,0.71,0.90,1.79) (0.86,1,1.79,2.69) (0.43,0.57,2.69,3.59)
TABLE V INPUTS ARE IN IVCFS FOR E2
c1 c2 c3 c4
a1 (0.28,1,2.690,6.28) (0.42,0.57,0.89,4.48) (0.14,0.85,1.79,3.49) (1.39,0.854,3.58,3.58)
a2 (0.14,0.28,1.79,3.58) (0.14,0.28,1.79,2.69) (0.710,1,1.79,2.69) (0.73,1,1.79, 2.62)
a3 (0.14,1,0.89,3.58) (0.28, 0.42, 0.89, 1.79) (0.714,0.85,2.69,1.79) (0.71,0.89,0.90,1.79)
a4 (0.14,0.28, 3.58, 6.28) (0.42,0.57,2.69,4.49) (0.14,0.71,1.79,4.49) (0.14,0.71,2.69,4.48)
TABLE VI INPUTS ARE IN IVCFS FOR E3
c1 c2 c3 c4
a1 (0.43,0.57,0.90,4.49) (0.14,0.86,3.59,4.49) (0.29,0.43,1.79,2.69) (1.79,2.24,4.49,6.28)
a2 (0.14,0.29,1.79,2.69) (0.71,1,3.59,5.38) (0.29,0.57,1.79,4.49) (0.43,0.71,3.59,6.28)
a3 (0.29,0.43,0.90,1.79) (0.71,0.86,0.90,4.49) (0.71,1,2.69,3.59) (0.57,0.71,0.90,1.79)
a4 (0.43,0.57,2.69,4.49) (0.14,0.71,0.90,1.79) (0.86,1,1.79,2.69) (0.43,0.57,2.69,3.59)
TABLE VII CLOSENESS DEGREES AND RANKING OF ALTERNATIVES
a1 a2 a3 a4
e1 0.54 0.66 0.61 0.62
e2 0.65 0.57 0.58 0.44
e3 0.62 0.56 0.61 0.53
Ø 0.62 0.58 0.59 0.55
Ranking: a1 > a3 > a2 > a4
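Step (v) applied to the collective closeness degrees (the Ø row above) reproduces the final ranking by a simple descending sort:

```python
# Collective closeness degrees taken from the last row of the ranking table.
collective = {"a1": 0.62, "a2": 0.58, "a3": 0.59, "a4": 0.55}

# Decreasing order of preference, i.e. the order products would be displayed.
ranking = sorted(collective, key=collective.get, reverse=True)
print(" > ".join(ranking))  # a1 > a3 > a2 > a4
```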
VII. CONCLUSIONS
Although the Z-number is a young concept, introduced only in 2011 and still at a preliminary stage, it is an
effective tool for developing MCGDM. E-commerce websites use recommendation systems to help online customers
purchase daily-life commodities. Therefore, an improved version of Z-numbers, the interval-valued Z-number
(IVZ), is applied along with CFS. In this paper, the IVCFS technique is adopted to handle uncertain
Z-numbers. The support of a CFS is unrestricted and may cover real numbers, daily household goods, and
accessories; moreover, it provides the feasibility of expressing both the uncertainty and the reliability of
information. Therefore, CFS-based IVZ is used to rank and recommend a set of preferred items.
Generalized entropy, a distance measure, and an aggregation operator are applied in the proposed
recommendation system, which ultimately produces a ranking of favorite products from high to low. The
limitation of the presented work is the high degree of dependency of the evaluation information on different
sets of users.
REFERENCES
[1] L. A. Zadeh, “A note on Z-numbers,” Inf. Sci., vol. 181, no. 14, pp. 2923–2932, 2011.
[2] G. Beliakov, M. Pagola, and T. Wilkin, “Vector valued similarity measures for Atanassov’s intuitionistic fuzzy sets,”
Inf. Sci., vol. 280, pp. 352–367, 2014.
[3] T. Verma and A. Kumar, “Ambika methods for solving matrix games with Atanassov’s intuitionistic fuzzy payoffs,”
IEEE Trans. Fuzzy Syst., vol. 26, no. 1, pp. 270–283, Feb. 2018.
[4] L. De Miguel, H. Bustince, J. Fernandez, E. Induráin, A. Kolesárová, and R. Mesiar, “Construction of admissible linear
orders for interval-valued Atanassov intuitionistic fuzzy sets with an application to decision making,” Inf. Fusion, vol.
27, pp. 189–197, 2016.
[5] J. Xu, S. P. Wan, and J. Y. Dong, “Aggregating decision information into Atanassov’s intuitionistic fuzzy numbers for
heterogeneous multi-attribute group decision making,” Appl. Soft Comput., vol. 41, pp. 331–351, 2016.
[6] J. Q. Wang, P. Wang, J. Wang, H.-Y. Zhang, and X.-H. Chen, “Atanassov’s interval-valued intuitionistic linguistic
multicriteria group decision-making method based on the trapezium cloud model,” IEEE Trans. Fuzzy Syst., vol. 23, no.
3, pp. 542–554, Jun. 2015.
[7] J. Chai, S. Xian, and S. Lu, “Z probabilistic linguistic term sets and its application in multi-attribute group decision
making,” Fuzzy Optim. Decis. Making, vol. 20, pp. 529–566, 2021.
[8] S. Xian, J. Chai, T. Li, and J. Huang, “A ranking model of Z-mixture numbers based on the ideal degree and its
application in multi-attribute decision making,” Inf. Sci., vol. 550, pp. 145–165, 2021.
[9] F. Sabahi, “Introducing validity into self-organizing fuzzy neural network applied to impedance force control,” Fuzzy
Sets Syst., vol. 337, pp. 113–127, 2017.
[10] S. Razvarz and M. Tahmasbi, “Fuzzy equations and Z-numbers for nonlinear systems control,” Procedia Comput. Sci.,
vol. 120, pp. 923–930, 2017.
[11] R. A. Aliev, W. Pedrycz, B. G. Guirimov, and O. H. Huseynov, “Acquisition of Z-number-valued clusters by using a
new compound function,” IEEE Trans. Fuzzy Syst., vol. 30, no. 1, pp. 279–286, 2022, doi:
10.1109/TFUZZ.2020.3037969.
[12] W. Jiang, Y. Cao, and X. Deng, “A novel Z-network model based on Bayesian network and Z-number,” IEEE Trans.
Fuzzy Syst., vol. 28, no. 8, pp. 1585–1599, Aug. 2020.
[13] R. Aliev, A. Alizadeh, and O. Huseynov, “The arithmetic of discrete Z-numbers,” Inf. Sci., vol. 290, pp. 134–155, 2015.
[14] R. A. Aliev, O. H. Huseynov, and L. M. Zeinalova, “The arithmetic of continuous Z-numbers,” Inf. Sci., vol. 373, pp.
441–460, 2016.
[15] Q. Liu, H. Cui, Y. Tian, and B. Kang, “On the negation of discrete Z-numbers,” Inf. Sci., vol. 537, pp. 18–29, 2020.
[16] H.-G. Peng and J.-Q. Wang, “A multicriteria group decision-making method based on the normal cloud model with
Zadeh’s Z-numbers,” IEEE Trans. Fuzzy Syst., vol. 26, no. 6, pp. 3246–3260, Dec. 2018.
[17] H. G. Peng, H. Y. Zhang, J. Q Wang, and L. Li, “An uncertain Z-number multicriteria group decision-making method
with cloud models,” Inf. Sci., vol. 501, pp. 136–154, 2019.
[18] K. W. Shen and J. Q. Wang, “Z-VIKOR method based on a new comprehensive weighted distance measure of Z-
number and its application,” IEEE Trans. Fuzzy Syst., vol. 26, no. 6, pp. 3232–3245, Dec. 2018.
[19] F. Sabahi, “FN-TOPSIS: Fuzzy networks for ranking traded equities,” IEEE Trans. Fuzzy Syst., vol. 25, no. 2, pp. 315–
332, Apr. 2017.
[20] D. Ramot, R. Milo, M. Friedman, and A. Kandel, “Complex fuzzy sets,” IEEE Trans. Fuzzy Syst., vol. 10, no. 2, pp.
171–186, Apr. 2002.
[21] Q. Zhang, Y. Chen, J. Yang, and G. Wang, “Fuzzy entropy: A more comprehensible perspective for interval shadowed
sets of fuzzy sets,” IEEE Trans. Fuzzy Syst., vol. 28, no. 11, pp. 3008–3022, Nov. 2020.
[22] Q. Jia, J. Hu, Q. He, W. Zhang, and E. Safwat, “A multicriteria group decision-making method based on AIVIFSs, Z-
numbers, and trapezium clouds,” Inf. Sci., vol. 566, pp. 38–56, 2021.
Grenze International Journal of Engineering and Technology, June Issue
Abstract— The number of smartphone users is increasing day by day, and mobiles have become
an integral part of society. This is because of the rich variety of mobile devices and the essential
applications provided by their manufacturers. The increasing number of mobile devices invites
skilled developers and hackers to develop malware that invades personal and business
information very efficiently. Mobile devices are therefore an ideal target for various
security issues and data privacy threats in the mobile ecosystem. Threats posed by malware
include leakage of private information, financial loss to users, and system damage. For better
protection, researchers and manufacturers are making great efforts to produce anti-malware
systems with effective detection methods. This paper discusses why Android is the platform most
targeted by malware developers; why, how, and when malware propagates into the mobile
system; how it is detected; and which protection mechanisms exist.
Index Terms— Detection algorithm, mechanism, malware, machine learning approach, threats,
vulnerabilities.
I. INTRODUCTION
In recent times, the use of mobile devices for both business and personal purposes has increased significantly.
Modern tablets and smartphones provide many useful services, such as internet browsing, maps, social network
clients, and internet banking, in addition to standard mobile functionality including phone calls, SMS, and
Bluetooth. The data used and stored by these services is often highly sensitive and therefore desired by
attackers. Mobile devices may have become the most popular gadgets, but their security is still a developing
domain: it is of rising significance and a cumulative need, yet it remains a comparatively weak area for
protecting users' data privacy. Although mobile companies do consider users' security and data privacy, the
use of applications from the internet creates complex issues in handling threats and vulnerabilities when
securing a user's data privacy.
There are thousands of diverse applications accessible from application stores for each mobile device, and these
applications have an extensive range of purposes, including web browsing, entertainment (movies, games, and
music), social networking, communication (e-mail, instant messaging), banking, and location-based services.
The security goals to be achieved are confidentiality, integrity, and availability.
Confidentiality is the process of preserving data from unauthorized access and applies to proprietary
information and personal privacy.
Integrity refers to safeguarding information from unauthorized destruction or modification by an attacker,
and further guarantees the authenticity and non-repudiation of information.
Availability is characterized as guaranteed access to and utilization of data within the assured time; it
ensures that information and data are available in a timely manner.
This paper discusses the important aspects of mobile device security and data privacy. Sensitive security
issues affecting smartphones, such as malware attacks, vulnerabilities, and threats, are addressed. The
classification of malware, malware propagation, and the types of malware analysis techniques are discussed.
This survey presents trusted security countermeasures and various malware detection techniques to help users
protect their devices. Research questions for future work are introduced in this review. The paper is
organized as follows. Section 2 discusses malware classification. Section 3 discusses malware propagation.
Section 4 presents a taxonomy of malware analysis and detection approaches. Section 5 discusses malware
detection mechanisms. Section 6 presents performance evaluation techniques. Future directions are described
in Section 7, and the conclusion is presented in Section 8.
UPDATE ATTACKS: The repackaging technique includes the malicious payload within the original app, but that is
easier to detect by analyzing the source code. To evade detection, instead of including the malicious
payload within the app, malware developers include only an update component that downloads the malicious
payload at run time, after the app is installed on the device. Hence, scanning the source code will not
detect the malware, because initially there is no malicious code within the app.
DRIVE-BY DOWNLOADS: This technique carries traditional drive-by downloads over to Android devices: users are
enticed to download interesting or attractive apps. For example, the Tracker malware ships an in-app
advertisement library. After clicking on an advertisement link, the user is redirected to a website that
displays a message to download an app that supposedly saves the device's battery. However, the downloaded
app is malware that subscribes to premium-rate services without the user's knowledge.
Static Analysis: Static analysis investigates a downloaded app by inspecting its software properties and its
source code. It is an inexpensive way to find malicious activity in code segments without executing the
application and observing its behavior. Many techniques can be used for static analysis: decompilation,
decryption, pattern matching, static system-call analysis, etc. Obfuscation and encryption techniques
embedded in the software make static analysis hard. Static analysis is further divided into two categories
traditionally used by anti-virus tools:
- misuse detection
- anomaly detection.
Misuse detection: Misuse detection is a signature-based approach that detects malware by matching signatures
against security policies and rule sets. Static analysis makes it possible to follow data-flow and
control-flow dependencies in the source code, which helps in understanding the behavior of apps.
Anomaly detection: Anomaly detection uses machine learning algorithms to learn known malware and predict
unknown malware. Its most common application is identifying the actions of malware rather than its byte
patterns. The procedure first constructs the suspicious behavior of applications, and the observed
signatures are then matched against a database of normal application behavior. By training a classifier such
as a support vector machine (SVM), the system can distinguish between malicious and normal behavior.
Dynamic Analysis: Dynamic analysis executes the application in a secluded environment to keep track of its
execution behavior. Various heuristics are considered for monitoring dynamic behavior, including network
activity, file changes, and system-call traces. Android applications can run in the Android SDK emulator, a
mobile device emulator running on a desktop computer that emulates the software and hardware features of a
device, except for placing actual phone calls. For testing purposes, the emulator supports Android Virtual
Device (AVD) configurations. When applications run on the emulator, they can use all services, such as
invoking other applications, accessing the network state, playing audio and video, and storing and
retrieving data.
Permission Analysis: Permissions play a key role when analyzing Android applications. They are listed in each
application's Manifest.xml file at install time. Install-time permissions limit application behavior, give
some control over privacy, and reduce bugs and vulnerabilities. Users have the right to allow or deny the
installation of an application, but they cannot select individual permissions. These permissions are
required because the use of resources on Android phones is based on this permission set. Some researchers
detect malicious behavior of Android applications based on the permissions specified in Manifest.xml.
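Extracting the declared permissions is straightforward, since the manifest is XML. A minimal sketch with Python's standard library; the "suspicious" watch-list is an illustrative assumption, not a standard set:

```python
import xml.etree.ElementTree as ET

# Namespace Android manifests use for the android: attribute prefix.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# Hypothetical watch-list: permissions that often co-occur in SMS malware.
SUSPICIOUS = {"android.permission.SEND_SMS", "android.permission.RECEIVE_SMS"}

def declared_permissions(manifest_xml: str) -> set:
    """Collect the permission names declared in an AndroidManifest.xml."""
    root = ET.fromstring(manifest_xml)
    return {elem.get(ANDROID_NS + "name")
            for elem in root.iter("uses-permission")}

manifest = """<manifest xmlns:android="http://schemas.android.com/apk/res/android">
  <uses-permission android:name="android.permission.INTERNET"/>
  <uses-permission android:name="android.permission.SEND_SMS"/>
</manifest>"""

perms = declared_permissions(manifest)
flagged = perms & SUSPICIOUS  # non-empty => worth a closer look
```

Permission-based detectors typically feed such sets (or combinations of permissions) into a classifier rather than using a fixed watch-list.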
The main limitation of a specification-based system is that it is very difficult to accurately specify the
behavior of the system or program.
B. Behavioral-Based Detection
The behavior-based malware detection system is composed of several applications, which together provide the
resources and mechanisms needed to detect malware on the Android platform. Each program has its own specific
functionality and purpose in the system, and the combination of all of them forms the behavior-based malware
detection system. Android data mining scripts and applications are responsible for collecting data from
Android applications, and a script running on the server is responsible for parsing and storing all
collected data.
C. Permission –based Detection
Applications run in a sandbox environment; however, they need permissions to access certain data. At
installation time, the Android platform asks the user to grant or deny permissions for the application based
on the activities the application can perform. This is meant to overcome a limitation of the Android
platform whereby developers can intentionally omit the permission label of a component: if no label is
specified there is no restriction, because the default policy is to allow access.
D. Cloud Based Malware Detection
Apps on Google Play are examined for malware. Bouncer is the service used to automatically examine apps on
the Google Play Store for malware. As soon as an application is uploaded, Bouncer checks it and compares it
to other known malware, Trojans, and spyware. Every application is run in a simulated environment to see
whether it would behave maliciously on an actual device.
E. Data Mining Based Malware Detection
Data mining methods for detecting malicious executables define a malicious executable as a program that
performs a harmful function, such as compromising a system's security, damaging a system, or obtaining
sensitive information without the user's permission. These data mining methods detect patterns in large
amounts of data, such as byte code, and use the patterns to detect future instances in similar data. The
framework uses classifiers to detect new malicious executables.
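The byte-pattern idea can be sketched as n-gram feature extraction plus a similarity score against known malware; the sample bytes and the use of Jaccard similarity (rather than a trained classifier) are illustrative assumptions:

```python
def byte_ngrams(data: bytes, n: int = 2) -> set:
    """Extract the set of byte n-grams used as features."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}

def jaccard(a: set, b: set) -> float:
    """Set similarity between two feature sets (0.0 to 1.0)."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Feature sets for a known-malicious sample and a suspect binary.
known_malware = byte_ngrams(b"\x01\x02\x03\x04\x05")
suspect       = byte_ngrams(b"\x01\x02\x03\xff")

score = jaccard(known_malware, suspect)  # higher = more similar byte patterns
```

In the data mining frameworks surveyed, such n-gram features would feed a trained classifier (e.g. naive Bayes or an SVM) instead of a single similarity threshold.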
VII. FUTURE DIRECTIONS
As for security and privacy, smartphone users are not able to gauge the number of attacks on their devices or
how much money malicious apps may steal from their accounts. In this survey, we first discussed different
types of mobile device vulnerabilities and threats. Secondly, we classified malware and malicious
applications, focusing on how an attack is executed and what the attackers target. Finally, we discussed
possible malware detection and defense mechanisms for mobile device security and suggested some future
directions for improving the detection of malicious or abnormally behaving applications before they
propagate: using new machine learning techniques to provide real-time behavior analysis and identify fake
apps, and utilizing deep learning algorithms for more accurate feature extraction during malware testing.
Mobile OS companies, especially popular ones, should consider more security mechanisms to protect against
unpredictable attacks.
VIII. CONCLUSION
Smartphones are becoming popular in terms of computing power, sensors, and communication. With the rapid
proliferation of smartphone gadgets and feature-rich apps using numerous sensors and connections, the number
of malware samples and attacks is rising. Modern smartphones provide many services, such as messaging,
internet browsing, e-mailing, and gaming, in addition to traditional voice services. With the increase in
the number of smartphones on the market, the need for malware analysis is an urgent issue. Malware is a
critical threat to a user's computer system in terms of:
- stealing confidential information
- corrupting or disabling the security system.
This survey paper explains some of the technologies used by security researchers to counter these threats. It
covers malware types; static, dynamic, and hybrid malware analysis techniques; and malware detection
mechanisms. Among the various existing approaches, machine learning methods have shown results with high
accuracy in the detection of malicious activities. With this categorization, we want to provide an easy way
for users and researchers to improve their knowledge about the security and privacy of smartphones.
REFERENCES
[1] A. A. E. Elhadi, M. A. Maarof, and A. H. Osman, “Malware detection based on hybrid signature behaviour application
programming interface call graph,” American Journal of Applied Sciences, vol. 9, no. 3, pp. 283–288, 2012, ISSN
1546-9239, Science Publications.
[2] M. Sujithra and G. Padmavathi, “A survey on mobile device threats, vulnerabilities, and their defensive mechanism,”
International Journal of Computer Applications (0975-8887), vol. 56, no. 14, 2012.
[3] K. Mathur and S. Hiranwal, “A survey on techniques in detection and analyzing malware executables,” International
Journal of Advanced Research in Computer Science and Software Engineering, ISSN 2277-128X, vol. 3, no. 4, Apr. 2013.
[4] M. Sujithra and G. Padmavathi, “Enhanced permission-based malware detection in mobile devices using optimized random
forest classifier with PSO-GA,” Res. J. Appl. Sci. Eng. Technol., vol. 12, pp. 732–741, 2016.
[5] M. G. Schultz, E. Eskin, E. Zadok, and S. J. Stolfo, “Data mining methods for detection of new malicious
executables,” in Proceedings of the IEEE Symposium on Security and Privacy, 2001, pp. 38–49.
[6] M. Sujithra, G. Padmavathi, and S. Narayanan, “Mobile device data security: a cryptographic approach by outsourcing
mobile data to cloud,” Procedia Computer Science, vol. 47, pp. 480–485, 2015.
[7] J. J. Blount, “Adaptive rule-based malware detection employing learning classifier systems,” M.S. thesis, Computer
Science, Missouri University of Science and Technology.
Abstract— Risky or subpar medications are a real issue for healthcare organizations,
especially in middle-income countries with inadequate pharmaceutical care and weak pharmaceutical
legal frameworks. Poor-quality pharmaceuticals can lead to major adverse drug reactions, the
emergence of bacterial resistance, and the possibility of therapeutic failure. They can also
raise medical costs and erode public trust in medical institutions. In a rapidly
growing world, the study of fake pharmaceutical medicines is crucial. Hyperspectral imaging
(HSI) is a developing imaging technique for use in medicine. HSI acquires a dataset with three
dimensions, two spatial and one spectral. This research article provides a summary of the
research on hyperspectral sensing techniques for the identification of fake pharmaceutical
medicines. The study aims to evaluate the various technologies available for identifying fake
medicines.
Index Terms— Hyperspectral imaging (HSI), Raman hyperspectral imaging, spectroscopy, fake
pharmaceutical medicines, spectral signature, visible to near-infrared region, spectral data
analysis.
I. INTRODUCTION
The impact of fake medicines on society has a global reach. Due to massive distribution paths, unauthorized
online pharmacies, and reused materials and packaging, they are difficult to identify. To solve this
problem, it is necessary to develop detection techniques for identifying fake medicines as well as various
techniques for confirming counterfeiting. The identification of fake medicine tablets faces new challenges
day by day, such as developing techniques that identify fake pharmaceutical products by appearance
inspection. Identification techniques such as analytical chemistry, chromatography, and spectroscopy are
developing rapidly, but these techniques take more time, require more data, or may impact the environment,
so new techniques or methods are needed. The prevalence of fake or poor-quality medications on sale is
increasing, which is quite striking, particularly since no permit is needed to sell the leading medical
products. In rich nations the primary targets are costly lifestyle medications (hormones, steroids, appetite
suppressants, medications for premature ejaculation, psychotropic drugs), whereas in underdeveloped regions
daily-life pharmaceuticals (antibiotic, antimalarial, antituberculosis, and antiretroviral tablets) are the
objective [1]. The literature distinguishes several categories of fake medicines, each with its own
definition. Substandard medicines are approved medical items that do not meet quality benchmarks. The World
Health Organization (WHO) has noted that counterfeit medications deliberately and fraudulently mislabel
their identity or source. The differing definitions of the word "counterfeit medication" across nations have
made it difficult to share information among them or to truly comprehend the scope of the issue on a
worldwide scale [6]. Various categories of counterfeit medications are distinguished, and they can be found
using a variety of analytical techniques. Some of the most typical products include those having an
inadequate amount of the active ingredient or no active ingredient at all. 15.6% of items are packed
wrongly, and 21.4% of products are constructed using inappropriate material; 8.5% are genuine copies of the
precise product with significant contamination [7]. Fake medications with a wrong dosage of an active
component can cause a variety of medical conditions. A minimal antibiotic dose may not eradicate the germs
but could instead cause the growth of bacterial resistance. In Cambodia, fake malaria medicines killed 30
people in 2000 [8]. In 1993, more than 100 children in Nigeria were killed by a toxic chemical found in fake
cough syrup. Due to the presence of ethylene glycol in cough syrup in place of glycerol, comparable cases
occurred in both China and India between 1990 and 2007, as well as in Panama. Around 190,000 people died in
2002 as a result of polyethylene glycol poisoning in paracetamol syrup [9].
Another category is fake medications with an undeclared active component. Recreational medications with
potential botanical constituents are a typical target for this kind of fraud. Despite being marketed as
harmless, cannabinoids can nonetheless have pharmaceutical consequences for the body; therefore, such
medications must be used only under a prescriber's supervision. Four children's cough syrups, Promethazine
oral solution, Kofexmalin baby cough syrup, Makeoff baby cough syrup, and Magrip N cold syrup, were recently
labeled as substandard by the WHO after causing kidney problems and the deaths of about 66 children in the
Gambia [10]. According to the literature, fake medicines cover practically all types of pharmaceutical
medicines, such as antimicrobials and antimalarials [11, 12], as well as erectile dysfunction, herbal,
diabetes, and weight-control drugs [13].
A. Hyperspectral Remote Sensing
Hyperspectral Remote Sensing (HRS), also known as imaging spectroscopy, can provide rich spectral information
per image. HRS combines imaging with spectroscopy and captures the individual absorption features of
materials that arise from specific chemical bonds in a solid, liquid, or gas. Based on these
characteristics, researchers and scientists have investigated and applied imaging spectroscopy techniques
for the detection, identification, and mapping of minerals on land, in water, and in the atmosphere. As a
result, HRS technology, as an enhanced remote sensing instrument, has been explored for a variety of
applications, including geology, geomorphology, and environmental monitoring [65]. Hyperspectral Imaging
(HSI) is a technique that examines the entire range of wavelengths instead of identifying only the primary
RGB colors (red, green, and blue) at each pixel. The light reaching every pixel is divided into numerous
separate wavelength channels, providing additional details about what is being viewed. The unique color
signature of an individual object may be recognized using HSI. Unlike other optical technologies that can
only detect a single color, hyperspectral imaging can detect the whole color spectrum in each pixel [66].
Standard RGB photographs capture only basic contour and color, so they cannot distinguish between objects
that share the same contours and colors but differ in their material responses; the responses appear the
same, which matters particularly for pharmaceutical discrimination. A primary difference between
hyperspectral images and standard RGB data is the higher spectral resolution and larger spectral range of
hyperspectral data. Figures 1 and 2 demonstrate that the hyperspectral signal contains more information than
standard RGB data. Earlier image processing techniques are much less suited to hyperspectral data, since
data is considered hyperspectral when it has more than a hundred bands [67].
Hyperspectral data are used for the identification of active pharmaceutical ingredients in medicines.
Describing a target at a high number of distinct wavelengths is what defines hyperspectral imaging. In its
most basic form, hyperspectral data is a data cube in which the first two dimensions indicate spatial
distances and the third represents the spectral wavelength or wavenumber. Hyperspectral imaging is used in
various applications, such as remote sensing, agriculture, and food, and can also be used in pharmaceutical
studies [68]. The advantages of the technique are that it requires little sample material and does not use
any chemical or solvent, which increases safety, reduces negative environmental impact, and saves chemical
laboratory analysis costs. When the data is captured by hyperspectral image sensors, all the chemical
information in a given spectral range is present. Thus, with appropriate data analysis, several quantitative
properties can be determined simultaneously with a single scan.
TABLE II: APPLICATIONS OF HYPERSPECTRAL IMAGING SYSTEMS IN MEDICINE
Spectral range (nm) | Methods | Application | Method of measuring
440-640 | Hyperspectral imaging | Skin cancer [14] | Fluorescence and reflectance
500-600 | Medical hyperspectral imaging (MHSI) | Diabetic foot [15] | Reflectance
their API and excipients [36]. Aripiprazole in pharmaceutical formulations and in bulk is detected
quantitatively using reversed-phase high-performance liquid chromatography (RP-HPLC). According to the
study, recovery experiments and the computation of the percent recovery were both used to assess the
method's reliability [37]. Fake medications were examined using portable Raman spectroscopy with tailored
local methods: Local Straight-Line Screening (LSLS) as well as Principal Component Analysis (PCA). The
algorithm was used to identify fake drugs mixed with herbal medications. To detect suspicious fake
medicines, the LSLS technique was extended to Raman spectroscopy with weighting adjustments developed from
false-positive/false-negative ratios [38]. A total of 26 anabolic androgenic steroid (AAS) tablets were
found; the developed technology has been applied to products intended to promote stronger and larger
muscles. A UHPLC-MS/MS (ultra-high-performance liquid chromatography-tandem mass spectrometry) technology
has been developed and approved for screening and quantifying AAS found in counterfeit medications and
supplements [39]. Hyperspectral detection is a method used to detect fake medicine tablets. By adding
different amounts of calcium carbonate, medication powders were modified to mimic fake medications. For that
study, a hyperspectral sensor operating in the visible and near-infrared (350-1050 nm) range was utilized,
and the findings suggest a classification accuracy of greater than 90% [40]. To identify fake drugs,
counterfeit drugs were detected using image analysis and processing with visible near-infrared (400-1000 nm)
and short-wave infrared (1000-2500 nm) hyperspectral imaging; original Pfizer Viagra and imitation pills
were compared. Gray-Level Co-occurrence Matrix (GLCM) analysis allows for the assessment of the homogeneity
of the pill component distribution [41].
A total of 53 formulations from 29 distinct medicinal product families have been measured, producing a
massive library of spectra. Principal component analysis (PCA), K-Nearest Neighbors (KNN), support vector
machines (SVM), and discriminant analysis (DA) were among the chemometric methods used to analyze the data.
Near-infrared spectroscopy is used for the identification of pharmaceutical tablets as a rapid investigative
tool for counterfeit detection [42]. Near-infrared (NIR) spectroscopy was also used as a quick and easy
analytical approach to distinguish fake pharmaceuticals. Atorvastatin calcium sesquihydrate (AT)
formulations were found in seven different types of brand-name and generic pills, and the likelihood of
classifying the AT tablet samples into the seven kinds was 100%. The major excipient combinations were what
determined the PCA and SIMCA (soft independent modeling of class analogy) classification of the AT tablets
[43]. From the related work, we studied and analyzed hyperspectral imaging technology, which is the
technology most used to identify fake medicines and active pharmaceutical ingredients (API), with an average
spectral range of 450 nm to 1000 nm. We also analyzed the fake medicine tablets found in internet pharmacies
and medicine stores, namely Paracetamol, Pfizer Viagra, Plavix, and Cialis.
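The surveyed chemometric methods (PCA, KNN, SVM, DA) all reduce to matching an unknown spectrum against reference spectra. A standard baseline in hyperspectral analysis, offered here as an illustration rather than a method taken from the cited studies, is the spectral angle mapper; the library spectra below are invented:

```python
import math

def spectral_angle(a, b):
    """Angle (radians) between two spectra; small angle = similar material."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return math.acos(max(-1.0, min(1.0, dot / (na * nb))))

# Hypothetical reference spectra for a genuine tablet and a known fake.
library = {
    "genuine": [0.10, 0.30, 0.55, 0.70],
    "fake":    [0.40, 0.42, 0.45, 0.43],
}

def classify(pixel_spectrum):
    """Assign the library label with the smallest spectral angle."""
    return min(library, key=lambda k: spectral_angle(pixel_spectrum, library[k]))

label = classify([0.12, 0.28, 0.50, 0.72])  # close in shape to "genuine"
```

Because the angle ignores overall brightness, this comparison is robust to illumination differences between scans, which is why it is a common first step before heavier chemometric models.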
as both a quantitative approach and a method for target analysis (the presence of one or more recognized
substances).
E. Gas Chromatography
Gas chromatography (GC) has been employed to identify and detect fake medications. Gas chromatography has
been used to verify the authenticity of essential oils and the presence of residual solvents, volatile components,
and unidentified chemicals or analogues (particularly in the quality assurance of herbal remedies) [48].
F. Hyperspectral Imaging Method
Hyperspectral imaging (HSI) is a developing imaging technique for use in medicine, particularly in the detection of disease through image processing. HSI acquires a hypercube: a dataset with three dimensions, two spatial and one spectral. The remotely sensed spectra captured by an HSI sensor carry information on the biology, structure, and composition of the material. Hyperspectral imaging is a blended technique that integrates spectroscopy with image processing. It produces a three-dimensional (3-D) dataset of spatial and spectral data by gathering spectral information at each pixel of a two-dimensional (2-D) sensor; this dataset is referred to as a hypercube. Figure 1 compares a hypercube with a red, green, and blue (RGB) picture. The hypercube is a three-dimensional dataset in which the slice at each wavelength is a two-dimensional image. The reflectance curve (spectral signature) of a pixel in the picture is shown at the lower left. An RGB color picture contains only three image bands, at the red, green, and blue wavelengths; the pixel intensity values of the RGB picture are shown at the bottom right. Because spatial information identifies the source of each spectrum on the sample, a more thorough investigation of how light interacts with the disease is possible, and the spectral signature of each pixel allows HSI to recognize several pathological disorders. In comparison to multispectral imaging (such as red, green, and blue color cameras), HSI often covers a continuous region of the spectrum with more spectral bands (more than a hundred) and greater spectral resolution [49]. Over the past years, hyperspectral imaging methods have demonstrated their value in a variety of pharmacological research domains. The technique involves taking pictures of an object at several distinct wavelengths.
A hyperspectral picture is essentially a datacube with two spatial dimensions and one spectral dimension. Hyperspectral imaging is used in chemical imaging, the process of identifying and quantifying the chemical components of a sample or product, as well as their distribution and uniformity. Whereas hyperspectral imaging can cover any band from the visible to the long-wave infrared, chemical imaging generally uses the near-infrared (NIR) or short-wave infrared (SWIR) ranges, which carry information about chemical bonds. The NIR and SWIR spectra of the organic compounds that make up the majority of pharmaceuticals are distinctive, so the chemical constituents of a material can be identified and quantified from their spectral features. Blend monitoring, tablet-manufacture tracking, and counterfeit detection are just a few of the pharmaceutical research and quality assurance applications that make use of chemical imaging.
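To make the hypercube structure concrete, the following sketch (assuming NumPy; the array sizes are arbitrary illustrative values, not taken from any dataset in this paper) shows how a spectral signature and a single-band image are extracted from the 3-D datacube:

```python
import numpy as np

# Hypothetical hypercube: 128 x 128 spatial pixels, 224 spectral bands
rows, cols, bands = 128, 128, 224
hypercube = np.random.rand(rows, cols, bands)

# Spectral signature (reflectance curve) of one pixel: a 1-D curve over bands
signature = hypercube[64, 64, :]
print(signature.shape)        # (224,)

# The 2-D image at a single wavelength: one "slice" of the datacube
band_image = hypercube[:, :, 100]
print(band_image.shape)       # (128, 128)
```

An RGB image is the degenerate case of the same layout with only three bands.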
Figure 2: Workflow steps for the proposed system
A. Data Pre-processing
Image recognition and data normalization are the fundamental components of the hyperspectral imaging pre-processing stage. The literature additionally makes use of a Gaussian function to smooth spectral signatures and reduce the impact of noise [50]. Data normalization transforms hyperspectral illumination data into values that represent the inherent characteristics of biomaterials, such as absorbance or reflectance. Normalization is a superior way of preparing data for analysis; it also minimizes systematic distortion and image artifacts caused by uneven surface illumination or by redundant data in the hyperspectral imagery's sub-bands. Absorbance and reflectance [51, 52] are the most popular pre-processing representations used for hyperspectral data. The camera's dark-current effect is eliminated by covering the sensor lens, taking a dark picture, and subtracting the dark-image data from the hyperspectral data taken from the area of interest. To create a white reference image, a white diffuse reflectance target is employed. The relative reflectance (R) of the hyperspectral data is determined by equation (1). From the collected data, a spatial region of interest (ROI) is selected, and further pre-processing is then applied [53].
R = (Is − Id) / (Iw − Id) × 100 (1)
where Id is the dark image, Iw is the white reference image, and Is is the raw hyperspectral data.
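The dark/white reference correction of equation (1) can be sketched as follows (a minimal illustration assuming NumPy; the pixel values are synthetic):

```python
import numpy as np

def relative_reflectance(Is, Iw, Id):
    """Relative reflectance, equation (1): R = (Is - Id) / (Iw - Id) * 100.

    Assumes the white reference Iw differs from the dark image Id everywhere.
    """
    return 100.0 * (Is - Id) / (Iw - Id)

# Synthetic 4x4 single-band frames: raw sample, white reference, dark image
Is = np.full((4, 4), 60.0)   # raw hyperspectral data
Iw = np.full((4, 4), 100.0)  # white diffuse reflectance target
Id = np.full((4, 4), 20.0)   # dark image (lens covered)

R = relative_reflectance(Is, Iw, Id)
print(R[0, 0])   # (60-20)/(100-20)*100 = 50.0
```

In practice the same correction is applied band by band across the whole hypercube before the ROI is selected.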
appropriate categories using the K-NN algorithm, emphasizing both their distinctions and similarities, and finding patterns, compressing them, and reducing their size without losing information [59]. The KNN algorithm is a popular non-parametric technique for classification in pattern recognition. Its core idea is that a data point's category is decided by the categories of its K closest neighbours [60].
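As an illustration of KNN classification of spectra, here is a sketch assuming scikit-learn and synthetic Gaussian-peak "spectra" (not data from any of the cited studies):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Synthetic "spectra": authentic tablets peak near band 20, fakes near band 60
def make_spectra(peak, n, bands=100):
    base = np.exp(-0.5 * ((np.arange(bands) - peak) / 5.0) ** 2)
    return base + 0.05 * rng.standard_normal((n, bands))

X = np.vstack([make_spectra(20, 30), make_spectra(60, 30)])
y = np.array([0] * 30 + [1] * 30)   # 0 = authentic, 1 = counterfeit

# Each new spectrum gets the majority class of its K closest neighbours
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
print(knn.predict(make_spectra(20, 1)))   # classified as authentic: [0]
```

Because KNN is non-parametric, no model of the spectral classes is fitted; all of the training spectra are kept and compared at prediction time.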
Partial Least Squares Regression (PLSR): PLSR is frequently used for quantitative spectrum analysis and for processing reflectance spectroscopic data. It decomposes both the predictor and response variables and discovers latent components. It is used to build predictive models when there are many highly collinear predictors. Based on the spectrum, it can be used to create a linear prediction model for the samples; each spectrum is made up of readings at distinct frequencies. The responses are predicted linearly from the PLS factors, which are generated as specific linear combinations of the spectral range. Compared to the classical multiple regression approach, it produces richer findings [61].
Support Vector Machine (SVM): The aim of SVM, which is founded on statistical learning theory, is to identify an ideal hyperplane as a decision boundary in high-dimensional space. In a two-class pattern-recognition problem with discrete categories, the SVM chooses, from an unlimited number of linear decision boundaries, the one that minimizes the classification error. The decision boundary chosen is therefore the one that leaves the largest margin, where the margin is defined as the sum of the distances from the nearest instances of the two classes to the hyperplane [62, 63]. The benefits of support vector machines include efficiency in high-dimensional settings; they remain useful when the number of dimensions exceeds the number of samples. They are also memory efficient, since only a subset of the training points (the support vectors) is used in the decision function. Standard kernels are available, and custom kernels can also be defined. SVM has some drawbacks, including the need to avoid over-fitting when selecting kernel functions and regularisation terms if the number of features is much larger than the sample size [64].
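The high-dimensional behaviour described above (more features than samples, only support vectors retained) can be illustrated with a small sketch, assuming scikit-learn and synthetic data:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)

# 50 samples with 200 features: more dimensions than samples
X0 = rng.normal(0.0, 0.2, (25, 200))   # class 0, centred at 0
X1 = rng.normal(1.0, 0.2, (25, 200))   # class 1, centred at 1
X = np.vstack([X0, X1])
y = np.array([0] * 25 + [1] * 25)

# Linear kernel; C is the regularisation term discussed above
svm = SVC(kernel="linear", C=1.0).fit(X, y)

# Only a subset of the training points enters the decision function
print(len(svm.support_), "support vectors out of", len(X), "samples")
print(svm.predict(np.full((1, 200), 1.0)))   # classified as class 1
```

Swapping `kernel="linear"` for `"rbf"` (or a custom callable) changes the decision boundary without changing the rest of the code.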
V. CHALLENGES
Currently, hyperspectral technologies in pharmaceuticals are applied in active pharmaceutical ingredient (API) detection, pharmaceutical authenticity verification, medication cluster analysis, and medication coating-layer recognition. Most pharmaceutical identification techniques rely on analytical methods, spectroscopy methods, and basic statistical techniques such as principal component analysis (PCA) and partial least squares regression (PLSR), and data evaluation is usually carried out in an experimental setting; commercial pharmaceutical production processes and other field locations are largely not covered. As a result, there are many potential applications for hyperspectral technology in pharmaceutical research, yet there are numerous obstacles to overcome. The main problem at the moment is how to employ hyperspectral technologies for medication identification in the commercial pharmacy setting with high accuracy and low cost. Continued development of such techniques is being constrained by a shortage of pharmaceutical hyperspectral data sources and a lack of scientific attention. Hence, to ensure the continued development of hyperspectral technologies in this sector, academics must not only extend identification techniques toward generally used computer vision techniques, but also regularly incorporate popular pharmacological detection data.
VI. CONCLUSION
Based on this study, the issue of fake medications has been extensively acknowledged, although it is not yet properly described or adequately handled. We found that the most commonly faked pharmaceuticals are antimicrobials, antimalarials, erectile dysfunction drugs, herbal products, diabetes medication, and weight-control products. In this study, we have tried to provide an overview of several approaches for the identification of fake pharmaceutical medicines, focusing particularly on hyperspectral imaging technology because it is non-invasive. Hyperspectral imaging acquires three-dimensional picture cubes with two spatial dimensions and one spectral dimension, and every hypercube pixel can be described by a spectral curve that can reach from the ultraviolet to the infrared. The remotely sensed spectra acquired by hyperspectral imaging offer analytical details on a material's biology, structure, and composition. Because it gives spectral data that can be used to differentiate between authentic and fraudulent medications, hyperspectral sensing eases processing and thus accelerates the identification of falsified medicines, typically in the 350 nm to 1000 nm visible to near-infrared range.
REFERENCES
[1] Bottoni P. Fake pharmaceuticals: A review of current analytical approaches. Microchemical Journal. 2019 Sep 1.
[2] Johnston A. Substandard drugs: a potential crisis for public health. British journal of clinical pharmacology.2014.
[3] https://www.ema.europa.eu/en/humanregulatory/overview/public-health-threats/falsified medicines-overview accessed
on 20/09/2022
[4] Wood D. Drug diversion. Australian prescriber. 2015 Oct;38(5):164.
[5] WHO global surveillance and monitoring system for substandard and falsified medical products
https://apps.who.int/iris/handle/10665/326708 accessed on 20/09/2022
[6] Glass BD. Counterfeit drugs and medical devices in developing countries. Research and Reports in Tropical Medicine.
2014
[7] Dégardin K, Roggo Y, Margot P. Understanding and fighting the medicine counterfeit market. Journal of
pharmaceutical and biomedical analysis. 2014 Jan 18.
[8] Newton PN, Green MD, White NJ. Counterfeit anti-infective drugs. The Lancet infectious diseases. 2006 Sep.
[9] Deisingh AK. Pharmaceutical counterfeiting. Analyst. 2005.
[10] https://timesofindia.indiatimes.com/city/ranchi/vigil-in-state-after-who-bans-4-kid-cough accessed on 10/10/2022
[11] Björkman-Nyqvist M, Svensson J, Yanagizawa-Drott D. The market for (fake) antimalarial medicine: Evidence from
uganda. Abdul Latif Jameel Poverty Action Lab. 2013 Jun.
[12] Bottoni P. Fake pharmaceuticals: A review of current analytical approaches. Microchemical Journal. 2019 Sep.
[13] Ho HM, Xiong Z, Wong HY, Buanz A. The era of fake medicines: Investigating counterfeit medicinal products for
erectile dysfunction disguised as herbal supplements. International Journal of Pharmaceutics. 2022 Apr.
[14] Kong SG, Martin MET. Hyperspectral fluorescence imaging for mouse skin tumor detection. Etri Journal. 2006 Dec.
[15] Greenman RL, Panasyuk S, Wang X, Lyons TE, Dinh T, Longoria L, Giurini JM, Freeman J, Khaodhiar L, Veves A.
Early changes in the skin microcirculation and muscle metabolism of the diabetic foot. The Lancet. 2005 Nov.
[16] Kester RT, Bedard Real-time snapshot hyperspectral imaging endoscope. Journal of biomedical optics. 2011 May.
[17] Dicker DT, Lerner J, Van Belle P, Guerry, 4th D, Herlyn M, Elder DE, El-Deiry WS. Differentiation of normal skin and
melanoma using high resolution hyperspectral imaging. Cancer biology & therapy. 2006 Aug.
[18] Sorg BS, Moeller BJ, Donovan O, Cao Y, Dewhirst MW. Hyperspectral imaging of hemoglobin saturation in tumor
microvasculature and tumor hypoxia development. Journal of biomedical optics. 2005 Jul.
[19] Panasyuk SV, Yang S, Faller DV, Ngo D, Lew RA, Freeman JE, Rogers AE. Medical hyperspectral imaging to facilitate
residual tumor identification during surgery. Cancer biology & therapy. 2007 Mar.
[20] Akbari H, Kosugi Y, Kojima K, Tanaka N. Detection and analysis of the intestinal ischemia using visible and invisible
hyperspectral imaging. IEEE Transactions on Biomedical Engineering. 2010 May.
[21] Akbari H, Halig L, Schuster DM, Fei B, Osunkoya A, Master V, Nieh P, Chen G. Hyperspectral imaging and
quantitative analysis for prostate cancer detection. Journal of biomedical optics. 2012 Jul.
[22] Kumar S, Desmedt C, Larsimont D, Sotiriou C, Goormaghtigh E. Change in the microenvironment of breast cancer
studied by FTIR imaging. Analyst. 2013.
[23] Frosch T, Wyrwich E, Yan D, Domes C, Domes R, Popp J, Frosch T. Counterfeit and substandard test of the
antimalarial tablet Riamet® by means of Raman hyperspectral multicomponent analysis. Molecules. 2019 Sep.
[24] Kaneko H, Funatsu K. Classification of drug tablets using hyperspectral imaging and wavelength selection with a
GAWLS method modified for classification. International journal of pharmaceutics. 2015 Aug.
[25] Wilczyński S, Koprowski R, Marmion M, Duda P, Błońska-Fajfrowska B. The use of hyperspectral imaging in the
VNIR (400–1000 nm) and SWIR range (1000–2500 nm) for detecting counterfeit drugs with identical API composition.
Talanta. 2016 Nov.
[26] Shinde SR, Bhavsar K, Kimbahune S, Khandelwal S, Ghose A, Pal A. Detection of Counterfeit Medicines Using
Hyperspectral Sensing. In2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology
Society (EMBC) 2020 Jul.
[27] Dégardin K, Guillemain A, Guerreiro NV, Roggo Y. Near infrared spectroscopy for counterfeit detection using a large
database of pharmaceutical tablets. Journal of pharmaceutical and biomedical analysis. 2016 Sep.
[28] Feng Y, Lei D, Hu C. Rapid identification of illegal synthetic adulterants in herbal anti-diabetic medicines using near
infrared spectroscopy. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy. 2014 May.
[29] Lu F, Weng X, Chai Y, Yang Y, Yu Y, Duan G. A novel identification system for counterfeit drugs based on portable
Raman spectroscopy. Chemometrics and Intelligent Laboratory Systems. 2013 Aug.
[30] Sacré PY, Deconinck E, De Beer T, Chiap P, Crommen J, De Beer JO. Comparison and combination of spectroscopic
techniques for the detection of counterfeit medicines. Journal of pharmaceutical and biomedical analysis. 2010 Nov.
[31] Gupta RS, Deshmukh RR, Kshirsagar AV. Spectral Database of Pharmaceutical Common Excipients and Paracetamol
API Using ASD FieldSpec 4 Spectroradiometer. Medico-Legal Update. 2021 Apr.
[32] Yabré M, Sakira AK, Bandé M, Goumbri BW, Ouattara SM, Fofana S, Somé TI. Detection of Falsified Antimalarial
Sulfadoxine-Pyrimethamine and Dihydroartemisinin-Piperaquine Drugs Using a Low-Cost Handheld Near-Infrared
Spectrometer. Journal of Analytical Methods in Chemistry. 2022 May.
[33] Lawson G, Ogwu J, Tanna S. Quantitative screening of the pharmaceutical ingredient for the rapid identification of
substandard and falsified medicines using reflectance infrared spectroscopy. PLoS One. 2018 Aug.
[34] Schlegel LB, Schubert-Zsilavecz M. Quantification of active ingredients in semi-solid pharmaceutical formulations by
near infrared spectroscopy. Journal of pharmaceutical and biomedical analysis. 2017 Aug.
[35] Rebiere H, Martin M, Ghyselinck C, Bonnet PA, Brenier C. Raman chemical imaging for spectroscopic screening and
direct quantification of falsified drugs. Journal of Pharmaceutical and Biomedical Analysis. 2018 Jan.
[36] Spálovská D, Pekárek T, Kuchař M, Setnička V. Comparison of genuine, generic and counterfeit Cialis tablets using
vibrational spectroscopy and statistical methods. Journal of Pharmaceutical and Biomedical Analysis. 2021 Nov.
[37] Ahmed N, Shaikh O. A.Development and validation of rapid HPLC method for determination of Aripiprazole in bulk
drug and pharmaceutical formulation. Journal of Innovations in Pharmaceutical and Biological Sciences. 2017
[38] Lu F, Weng X, Chai Y, Yang Y, Yu Y, Duan G. A novel identification system for counterfeit drugs based on portable
Raman spectroscopy. Chemometrics and Intelligent Laboratory Systems. 2013 Aug.
[39] Cho SH, Park HJ, Lee JH, Do JA, Heo S, Jo JH, Cho S. Determination of anabolic–androgenic steroid adulterants in
counterfeit drugs by UHPLC–MS/MS. Journal of Pharmaceutical and Biomedical Analysis. 2015 Jul.
[40] Shinde SR, Bhavsar K, Kimbahune S. Detection of Counterfeit Medicines Using Hyperspectral Sensing. In2020 42nd
Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC) 2020 Jul.
[41] Wilczyński S, Koprowski R. The use of hyperspectral imaging in the VNIR (400–1000 nm) and SWIR range (1000–
2500 nm) for detecting counterfeit drugs with identical API composition. Talanta. 2016 Nov.
[42] Dégardin K, Guillemain A, Guerreiro NV, Roggo Y. Near infrared spectroscopy for counterfeit detection using a large
database of pharmaceutical tablets. Journal of pharmaceutical and biomedical analysis. 2016 Sep.
[43] Hattori Y,Peerapattana J, Otsuka M. Rapid identification of oral solid dosage forms of counterfeit pharmaceuticals by
discrimination using near-infrared spectroscopy. Bio-Medical Materials and Engineering. 2018 Jan.
[44] Sacré PY, Deconinck E, Chiap P, Crommen J, De Beer JO. Comparison and combination of spectroscopic techniques
for the detection of counterfeit medicines. Journal of pharmaceutical and biomedical analysis. 2010 Nov.
[45] Dégardin K, Guillemain A, Guerreiro NV, Roggo Y. Near infrared spectroscopy for counterfeit detection using a large
database of pharmaceutical tablets. Journal of pharmaceutical and biomedical analysis. 2016 Sep.
[46] Neuberger S, Neusüß C. Determination of counterfeit medicines by Raman spectroscopy: systematic study based on a
large set of model tablets. Journal of pharmaceutical and biomedical analysis. 2015 Aug.
[47] Deconinck E, Sacré PY, De Beer JO. Chromatography in the detection and characterization of illegal pharmaceutical
preparations. Journal of chromatographic science. 2013 Sep.
[48] Phillips G. World Congress of Pharmacy and Pharmaceutical Sciences: anticounterfeiting measures. Pharm J. 2003.
[49] Lu G, Fei B. Medical hyperspectral imaging: a review. Journal of biomedical optics. 2014 Jan;19.
[50] Kong SG, Du Z, Martin M, Vo-Dinh T. Hyperspectral fluorescence image analysis for use in medical diagnostics.
InAdvanced Biomedical and Clinical Diagnostic Systems III 2005 Apr.
[51] Sowa MG, Payette JR, Hewko MD, Mantsch HH. Visible-near infrared multispectral imaging of the rat dorsal skin flap.
Journal of biomedical optics. 1999 Oct.
[52] Gillies R, Freeman JE, Cancio LC, Brand D, Hopmeier M, Mansfield JR. Systemic effects of shock and resuscitation
monitored by visible hyperspectral imaging. Diabetes technology & therapeutics. 2003 Nov.
[53] Farrugia J, Griffin S, Valdramidis VP, Camilleri K, Falzon O. Principal component analysis of hyperspectral data for
early detection of mould in cheeselets. Current Research in Food Science. 2021 Jan.
[54] Lu G, Fei B. Medical hyperspectral imaging: a review. Journal of biomedical optics. 2014 Jan.
[55] Lu D, Weng Q. A survey of image classification methods and techniques for improving classification performance.
International journal of Remote sensing. 2007 Mar.
[56] Rodarmel C, Shan J. Principal component analysis for hyperspectral image classification. Surveying and Land
Information Science. 2002 Jun.
[57] Huse VA, Chaudhary D, Gawali BW. Image Processing Approach for Fish Image Analysis–A.
[58] Sonawane MM, Gawali BW, Manza RR, Mendhekar S. Analysis of Skin disease techniques using Smart Phone and
Digital Camera Identification of Skin Disease. Research Journal of Science and Technology. 2022 Jul.
[59] Guo Y, Han S, Li Y, Zhang C, Bai Y. K-Nearest Neighbor combined with guided filter for hyperspectral image
classification. Procedia Computer Science. 2018 Jan.
[60] Gore RD, Nimbhore SS, Gawali BW. Understanding soil spectral signature through RS and GIS techniques.
[61] Vapnik V. The nature of statistical learning theory. Springer science & business media; 1999 Nov 19.
[62] Cristianini N, Shawe-Taylor J. An introduction to support vector machines and other kernel-based learning methods.
Cambridge university press; 2000 Mar 23.
[63] available on https://scikit-learn.org/stable/modules/svm.html accessed on 15-11-2022
[64] Pu R. Hyperspectral remote sensing: fundamentals and practices. CRC Press; 2017 Aug 16.
[65] Vasefi F, MacKinnon N, Farkas DL. Hyperspectral and multispectral imaging in dermatology. In Imaging in
Dermatology 2016 Jan.
[66] Chen SY, Chen YC, Lien CT. A New Application of Hyperspectral Techniques in Drug Classification. In International
Conference on Intelligent Information Hiding and Multimedia Signal Processing 2018 Nov.
[67] Fake medicines: The worldwide industry putting your life in danger [Internet]. Srinath Perur M. 2019 [cited 1 July 2019]
CNN.www.cnn.com/2018/10/30/health/fakemedicinepartner/ index.html
Grenze International Journal of Engineering and Technology, June Issue
Abstract—The study, design, and construction of every electrical system required to create a fully functional off-road EV are presented in this report. Three main electrical categories were required: critical vehicle systems (e.g. engine control unit integration with engine sensors and a motor-controlled shifting system), safety systems, and additional designs that added value to the vehicle (e.g. an electronic driver interface and digitally controlled shifting).
I. INTRODUCTION
The benefits of EVs include increased energy efficiency due to regenerative braking and newer packaging options, along with a reduction in CO2 emissions. The goal of this work is to increase understanding of the benefits of electric power trains and how they might be used [1]-[2]. The objective of the paper is to highlight how different choices affect the vehicle dynamics, how the motors and gears might be configured, and how the accumulator pack and motor controller might be designed to maximize the performance of the car.
Many experts are looking for alternate energy sources because of the air pollution caused by automobiles. A promising approach is the electric car, a replacement for the combustion engine. The first step in this research is to model the power flow inside the energy system of an electric vehicle in order to determine its characteristics [3]-[4]. Because electric vehicles depend heavily on the finite amount of electrical energy provided by the battery, power flow efficiency is a crucial topic and needs to be handled effectively. To ensure that the amount of electrical energy meets the needs of the electric vehicle, the study tracks the power flow calculation [5]-[8]. The electrical layout of an off-road electric vehicle is modelled in MATLAB/Simulink to obtain the best power flow response of the electric vehicle energy system.
D. Kelly DC to DC converter Specifications
Nominal input voltage: DC 48V, 60V, 72V
Output voltage: DC 13.5V under 70℃ or DC 12.2V above 70℃
Operating voltage range: 40V-100V
Output current: 30A
Output power: 400W
Operating Temperature Range: -20℃ to 90℃ (case temperature)
Full-load efficiency: ≥93%
Ripple coefficient: ≤1%
Weight: 2.25lbs
The basic Simulink model is derived from the following block diagram. In this configuration the main blocks are the battery, drive cycle, controller, DC motor, vehicle body, transmission, and feedback. The drive cycle represents the input of the driver who will ride the vehicle; here the FTP75 drive cycle is used to observe how the vehicle reacts to it. The main purposes of designing the EV model are to understand the speed of the vehicle, calculate the SOC discharge rate, and find the distance travelled by the vehicle. The SOC block in MATLAB Simulink calculates the SOC, and distance is measured from the distance and time parameters.
S = Ch / Chnom
where S is the state of charge, Ch is the actual stored "Ah" capacity in the battery, and Chnom is the nominal "Ah" capacity of the battery. The battery terminal voltage VB is determined by the battery SOC and its impedance.
Battery size calculation:
Capacity = (load power × backup hours) / voltage = (3000 × 8) / 48 = 500 Ah
Current required to run the load: I1 = load (watt) / voltage = 3000 / 50 = 60 A
All these components are available in the Simscape Electrical components library, so they need a connection to the solver configuration block, which helps solve the simulation. The majority of the blocks are Simscape blocks, and converter blocks turn the physical signals from the Simscape blocks into Simulink signals.
Initially the battery was set to 90% charge, and after running for 1000 s the final charge is around 55%.
Distance travelled: for the given FTP75 drive cycle with an 80 Ah battery capacity, the vehicle is able to cover a distance of 11.14 km.
Fig 3: Motor controller design
VI. CONCLUSION
The EV model is created in MATLAB Simulink and its results are obtained for the FTP75 drive cycle; it can be tested for any drive cycle by selecting the drive cycle block. With an 80 Ah battery capacity it can travel 11.14 km, and it would travel further if the battery capacity were enhanced; however, space and cost should be taken into account in a real vehicle.
Modelling the EV prototype system makes building the actual model simpler. The battery life of an EV can be estimated using this prototype, and the model can be used to assess the efficiency of an EV during startup or constant-speed operation.
REFERENCES
[1] Kim, S., Chung, S., Shin, W., Lee, J., A study of predicting model of an electrical energy balance for a conventional
vehicle, Procee-dings of the 17th World Con-gress The International Federa-tion of Automatic Control Seoul, Korea,
July 6-11, 2008.
[2] Kunzli, N., Public-Health Impact of Outdoor and Traffic-Related Air Pollution: An European Assessment, The Lancet,
Vol. 356,Number 9232, September 2000, pp. 795-801.
[3] Larminie, J., Lowry, J., 2003, Electric Vehicle Technology Explained, John Wiley & Son.
[4] Lustenader, E. L., Guess, R. H., Richter, Turnbull, F. G., De-velopment of a Hybrid Flywheel /Battery Drive System for
Elec-tric Vehicle Applications, IEEE Transactions on Vehicular Tech-nology, Vol. VT-26, May 1977, pp.135-143.
[5] Patterson, P., Quantifying the Fuel Use and GHG Reduction Poten-tial of EVs and HEVs, Available April 26, 2002,
http://www.ott. doe.gov/pdfs/evsl7 .pdf
[6] J. Bauman and M. Kazerani, "A comparative study of Fuel-Cell– Battery, Fuel-Cell–Ultra capacitor, and Fuel-Cell–
Battery–Ultra capacitor Vehicles," in IEEE Transactions on Vehicular Technology, vol. 57, no. 2, pp. 760-769, March
2008.
[7] Drishya.K.Sasi and Jiji. K S, "A survey of bidirectional DC/DC converters for battery storage applications in distributed
generation systems," 2020 International conference on power, Instrumentation, Control and Computing (PICC), Trissur,
India, 2020.
[8] M. Kabalo, B. Blunier, D. Bouquain and A. Miraoui, "State of the art of DC-DC converters for fuel cell vehicles," 2010
IEEE Vehicle Power and Propulsion Conference, Lille, France, 2010, pp. 1-6.
[9] S. Miao, F. Wang and X. Ma, "A New Transformer-less Buck-Boost converter with positive output voltage," in IEEE
Transactions on Industrial Electronics, vol. 63, no. 5, pp. 2965-2975, May 2016.
Grenze International Journal of Engineering and Technology, June Issue
Abstract—Of late, there has been overwhelming growth in the e-mobility sector and hence growing interest in
integrating automotive electronics. In this context, the present paper discusses Eddy Current Braking (ECB)
systems. Electrification of the braking system would aid in gaining electronic control and total integration of
electrical and electronic components in an automobile. Electrical eddy-current braking is a prominent type of
braking system used in the textile, oil rig and locomotive sectors. The potential of incorporating ECB with
existing frictional braking to form integrated braking systems for automotive braking applications is explored.
ECBs provide retardation, while frictional components are required to halt the brake disc. The development of an
analytical model and a preliminary hardware model is carried out in the present paper.
Index Terms— Eddy Current Braking, electrical braking systems, permanent magnets.
I. INTRODUCTION
Brakes have changed considerably from traditional wooden log blocks to present day Automatic braking
systems. For more than 100 years, braking systems have evolved to adapt with improving automotive
capabilities and road conditions. Throughout history, the mechanical friction type of braking mechanism has
been favored, with little to no advocation towards Electrical or electromagnetic braking. With the exponential
growth of e-mobility sector and interest of electrifying automotive components, the focus on electric braking is
gaining prominence.
In the present paper, Eddy Current Braking (ECB) Systems are looked into as a prospect for application in
automotive braking. Integrated ECB systems which incorporate existing frictional braking components and
employ Permanent Magnets (PMs) are explored to fulfil the needs of automotive braking.
A brief review of the evolution of braking systems and the current status of ECB is given in Section II. In Section III, an
ECB employing Permanent Magnets is designed around the brake disc. In Section IV, a simple analytical model is
developed to validate the expressions and plot characteristic graphs of an ECB. In Section V, a preliminary
hardware model is designed to validate the fundamental working principles of an ECB employing PMs.
The subsequent results obtained and inferences observed are elaborated in Section VI, followed by concluding remarks
in Section VII.
Keeping these in mind, the scope for further research work is inferred to be:
Development of an end-to-end model which uses a simple analytical method, taking into consideration the new PMs.
Validation of this analytical model's equations and results by a simple working hardware model.
Exploration of an architecture of mechanically moved magnets to control the braking, with the setup realized using components already present in traditional frictional braking.
In the model, either Permanent Magnets (PMs) or Electromagnets (EMs) can be employed to generate the magnetic field. In the case of PMs, the air-gap distance is varied to gain control over the braking force, whereas the excitation current is varied in the case of EMs.
Permanent Magnets
During the rotation of brake disc with velocity vm, the induced voltage ui in the disc can be calculated using
Faraday’s law of induction,
∮ ⃗ × ⃗ =( − )0.5 (1)
where l is the width of the disc [23]. The resistances along axis of eddy current path, namely R1 and R2 is
calculated. Disc physical and material properties are taken into consideration.
= (2)
= (3)
262
where CL is the inductance fitting parameter and lm is the length of the magnet. The respective reactance follows from the inductance as

X = ωL (8)

Due to the impedance Z and the voltage induced in the disc, the eddy current can be estimated as

Ie = ui / Z (9)

The drag force produced in an eddy current brake can then be approximated from the power dissipated in the eddy current paths (equation (10)), where Nec is the number of eddy current paths. All parameters used for the calculations, including the brake disc and magnet parameter values, are presented in Table 1.
The effective flux density at an air-gap distance lg is calculated in equation (11) from the magnet flux density, the air-gap distance, and the disc diameter d.
The aim of the design is to find the physical parameters of the ECB which ensure maximum braking torque. Fig. 2 shows the torque-speed curve of an ECB with its characteristic values. The torque axis and speed axis correspond to the braking torque of the ECB and the speed of the brake disc, respectively. It can be observed that the torque of an ECB increases up to its critical speed VCrit, where it reaches the maximum torque Tmax. As the speed increases past the critical speed, the torque decreases.
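This rise-then-fall behavior can be illustrated with a common single-pole approximation of an eddy-current brake characteristic, T(v) = 2·Tmax·(v/VCrit) / (1 + (v/VCrit)²); both the formula and the parameter values below are illustrative assumptions, not the fitted model of this paper.

```python
def ecb_torque(v, t_max=150.0, v_crit=40.0):
    """Illustrative ECB torque-speed characteristic (N*m).

    Peaks at t_max when v == v_crit and falls off at higher speed.
    t_max and v_crit are assumed example values, not measured ones.
    """
    x = v / v_crit
    return 2.0 * t_max * x / (1.0 + x * x)

if __name__ == "__main__":
    for v in (10.0, 40.0, 100.0, 200.0):
        print(f"v = {v:6.1f}  T = {ecb_torque(v):7.2f}")
```

Evaluating the function before, at, and beyond the critical speed reproduces the qualitative shape described above.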
Fig. 3. Algorithm implemented in MATLAB to solve for Braking parameters of ECB
Fig. 4. Braking Force – Speed curve of Eddy Current Brake Fig. 5. Braking Torque – Speed curve of Eddy Current Brake
V. HARDWARE DESIGN AND PRELIMINARY REALIZATION
A preliminary hardware model is designed to test our hypothesis on the fundamental physics and working of an eddy current braking model employing permanent magnets. Neodymium circular disc magnets, which possess high magnetic flux density, are chosen to decelerate an aluminium disc of dimensions similar to those of an automotive brake disc. A suitable drive type to actuate the disc is designed and motor sizing is carried out. Fig. 7 shows the CAD model of the designed system.
A preliminary working hardware model is realized to test the theoretical assumptions and to validate the fundamental understanding of the working of an ECB using PMs. Owing to constraints including cost, component availability, and time, some variations have been made from the designed model. A mild steel disc of 20mm outer diameter was procured instead, and an appropriately lightweight shaft has been used to enable rotation of the disc. The shaft is directly coupled to a PMSM motor to eliminate any drive losses.
Neodymium circular disc magnets are mounted on a retractable flat member which moves the magnet toward and away from the disc. Ball bearings are installed at both ends of the shaft to ensure there is no load other than the disc itself. The frame of the model has been constructed from iron square members to ensure strength and rigidity.
Fig. 8 and Fig. 9 show images of the realized ECB model using PMs.
Fig. 8 Realized working model of preliminary Braking system using PMs Fig. 9. View of PMs in vicinity of the Brake disc
Fig. 10 Disc speed – Time graphs of initial speed 330 RPM Fig. 11 Disc speed – Time graphs of initial speed 860 RPM
In the next set of experiments, the motor power is cut off to simulate the accelerator pedal being released in an automobile. Readings are then taken of the time the disc takes to come to a halt, both with and without the magnet in its vicinity. This corresponds to the braking effect and the deceleration rate of the brake disc.
Fig. 12 shows the graph of brake disc speed versus time, with and without the magnet in the disc's vicinity. It can be observed that the disc naturally decelerates at a rate of 11 RPM/s, from 330 RPM to rest in about 30 seconds. When the permanent magnet is brought in to produce braking torque, the disc decelerates at an increased magnitude of 18.33 RPM/s and takes only about 18 seconds to stop.
In another case, when the initial speed of the disc is 860 RPM, the disc naturally decelerates at 8.9 RPM/s to come to a halt in about 97 seconds. When braking is applied by bringing in the permanent magnets, the disc decelerates at an increased magnitude of 23.8 RPM/s and takes about 36 seconds to stop. This is illustrated in Fig. 13.
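For constant deceleration the stopping time follows directly from t = v0 / a; a quick check of the four reported cases:

```python
def stop_time(v0_rpm, decel_rpm_per_s):
    """Time (s) for the disc to coast from v0 to rest at constant deceleration."""
    return v0_rpm / decel_rpm_per_s

# (initial speed in RPM, deceleration in RPM/s), per the reported experiments
cases = [
    (330, 11.0),    # natural coast-down
    (330, 18.33),   # with PM braking
    (860, 8.9),     # natural coast-down
    (860, 23.8),    # with PM braking
]
for v0, a in cases:
    print(f"{v0} RPM at {a} RPM/s -> {stop_time(v0, a):.1f} s")
```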
Fig. 12 Deceleration graph of initial speed 330 RPM Fig. 13 Deceleration graph of initial speed 860 RPM
VII. CONCLUSION
With the exponential growth of the e-mobility sector and the increasing interest in automotive electronics, electric braking is gaining prominence. In this context, the present work was taken up to explore the realization of eddy current braking using permanent magnets for automobile braking applications.
The evolution of braking systems was studied, leading to an exploration of the current state of eddy current braking mechanisms. An eddy current brake was designed with permanent magnets, incorporating the brake disc already present as a frictional braking component. A simple analytical model was developed to validate the equations and plot the characteristic graphs of an eddy current braking system. A preliminary hardware system to validate the understanding of the fundamental working of an ECB was designed and realized.
The use of ECBs in passenger vehicles offers advantages such as electronic control over the braking mechanism and complete electrical-electronic integration of all the components in an automobile. The presented work thus aims to aid the transition from traditional frictional braking to integrated ECB systems.
ACKNOWLEDGMENT
Authors would like to thank the authorities of KLE Technological University, Hubli - 580031 (INDIA) for the
support in carrying out this research work under Research Experience for Under-graduate (REU) scheme.
REFERENCES
[1] Britannica, T. Editors of Encyclopedia (2015, February 5). Michelin. Encyclopedia Britannica.
[2] Post, W. (2014). Car braking systems. In Fundamentals of Automotive and Engine Technology (pp. 130-141). Springer
Vieweg, Wiesbaden.
[3] Volti, R. (2006). Cars and culture: The life story of a technology. JHU Press
[4] Feeney, B., Guran, A. S., Hinrichs, N., & Popp, K. (1998). A historical review on dry friction and stick-slip phenomena.
[5] Gowda, D., Kumar, P., Muralidhar, K., & BC, V. K. (2020, November). Dynamic analysis and control strategies of an
anti-lock braking system. In 2020 4th International Conference on Electronics, Communication and Aerospace
Technology (ICECA) (pp. 1677-1682). IEEE.
[6] Siegel, I. H. (1965). Independent Inventors: Six Moral Tales. Pat. Trademark & Copy. J. Res. & Ed., 9, 643.
[7] H. Sakamoto, “Design of permanent magnet type compact ECB retarder,” Society of Automotive Engineers #973228,
pp. 19-25, 1997.
[8] Shin Kobayashi, Yukitoshi Narumi (1999). Eddy current reduction braking system (U.S. Patent no. 6237728B1). U.S.
Patent and Trademark Office.
[9] Tohru Kuwahara (1999). Eddy current reduction apparatus (U.S. Patent no. 6,209,688). U.S. Patent and Trademark Office.
[10] Tohru Kuwahara (1999). Permanent magnet type eddy current braking system (U.S. Patent no. 5,944,149). U.S. Patent and Trademark Office.
[11] Jiangyin Intellectual Property Operation Co., Ltd (2012). Permanent magnet disc brake and braking method thereof
(China Patent no. 102979837B). China Patent and Trademark Office.
[12] Jiangsu University (2018). A kind of permanent magnetism double disk brake and its braking method (China Patent no.
108895096B). China Patent and Trademark Office.
[13] Luo, L., Zhai, Q., Li, W., Qian, C., & Liu, H. (2017). Research on an integrated electromagnetic auxiliary disc brake
device for motor vehicles. IEEJ Transactions on Electrical and Electronic Engineering, 12(3), 434-439.
[14] Ji, Y., Wang, J., Xu, Y., Liu, Z., Zhou, Y., & Li, J. (2016). Study on the thermal-magnetic coupling characteristics of
integrated eddy current retarder (No. 2016-01-0185). SAE Technical Paper.
[15] Gay, S. E. (2010). Contactless magnetic brake for automotive applications (Doctoral dissertation, Texas A & M
University).
[16] Ye, L., Li, D., Ma, Y., & Jiao, B. (2011). Design and performance of a water-cooled permanent magnet retarder for
heavy vehicles. IEEE Transactions on Energy Conversion, 26(3), 953-958.
[17] Simeu, E., & Georges, D. (1996). Modeling and control of an eddy current brake. Control Engineering Practice, 4(1),
19-26.
[18] Anwar, S. (2004). A parametric model of an eddy current electric machine for automotive braking applications. IEEE
transactions on control systems technology, 12(3), 422-427.
[19] Gay, S. E., & Ehsani, M. (2005, September). Analysis and experimental testing of a permanent magnet eddy-current brake. In 2005 IEEE Vehicle Power and Propulsion Conference (pp. 10-pp). IEEE.
[20] Gay, S. E., & Ehsani, M. (2006). Parametric analysis of eddy-current brake performance by 3-D finite-element analysis. IEEE Transactions on Magnetics, 42(2), 319-328.
[21] Gay, S. E., & Ehsani, M. (2005, September). Optimized design of an integrated eddy-current and friction brake for automotive applications. In 2005 IEEE Vehicle Power and Propulsion Conference (pp. 290-294). IEEE.
[22] Gay, S. E., & Ehsani, M. (2006). Parametric analysis of eddy-current brake performance by 3-D finite-element analysis. IEEE Transactions on Magnetics, 42(2), 319-328.
[23] Holtmann, C., Rinderknecht, F., & Friedrich, H. E. (2015, March). Simplified model of eddy current brakes and its use
for optimization. In 2015 Tenth International Conference on Ecological Vehicles and Renewable Energies (EVER) (pp.
1-8). IEEE.
Grenze International Journal of Engineering and Technology, June Issue
Abstract—In this study, a convolutional neural network (CNN) is used to classify images for the detection of skin disease. We collected a database from the government medical hospital in Aurangabad together with the HAM10000 online data. The skin disease dataset contains seven classes, including Acne, Basal Cell Carcinoma, Psoriasis, Ringworm, Impetigo, Leprosy, and Eczema; the online database contributes seven additional categories of skin disease. We used pre-processing techniques such as image resizing and dataset normalization to improve model accuracy, and a deep learning algorithm to classify the skin diseases in the database. The classifier achieves an accuracy rate of 82.2% and an overall accuracy of 78%. Acne identification reaches 100% accuracy in training, while testing for it is 97.6% accurate.
Index Terms— CNN, Skin Disease Dataset, Deep Learning, Convolution Neural Network,
Image Processing.
I. INTRODUCTION
The most common diseases affecting people of all ages nowadays are skin diseases and lesions; young children and the elderly, however, have weaker immunity than others. Investigation of the patient's medical history and symptoms, skin scraping, dermoscopic examination, and skin biopsy are the common methods for skin disease diagnosis. But these methods are exhausting, time-consuming, and prone to error, and most of them require an expert dermatologist with excellent vision. Medical imaging technologies are sophisticated and trustworthy in diagnosing skin diseases; however, they are often out of reach of healthcare institutions in low-resource settings. In the healthcare sector, digital imaging cameras and sensing platforms have recently emerged as an alternative method of disease diagnosis. The most recent generation of cameras allows high-resolution digital image capture thanks to high-definition sensors and enormous storage capacity, and is attractive owing to its portability, affordability, and connectivity [1].
Computer-aided diagnosis is important and required because it can analyze different types of skin diseases. The bulk of regularly used algorithms for predicting skin diseases involve deep learning.
We captured images at the Government Medical Hospital, Aurangabad, under the observation of the dermatology lab. We used a direct-current light source to avoid the flickering effect of alternating current (AC). Additionally, data collection proceeded only after the objectives behind the collection were completely and honestly disclosed to the physicians. Both dermatologists and patients are aware that we are gathering this information only for the study. This data was gathered from every patient, and our study was authorized by the institute's ethics committee.
Source 1 (a Sony HD camera) provides data for seven skin disease classes: Acne, Psoriasis, Eczema, Wart, Ringworm, Vitiligo, and Skin Cold, as listed in Table II. These images have a resolution of 6016×3384. Source 2 consists of pictures taken from Kaggle [6]. We considered skin infection pictures together with natural (healthy) regions. The accuracy yielded by the proposed framework has been observed to vary with the skin disease.
TABLE II. LOCAL DATABASE FOR SKIN DISEASE

Skin Disease                    Train Data (80%)   Test Data (20%)
Acne                            180                25
Basal Cell Carcinoma (BCC)      45                 10
Psoriasis                       275                120
Ringworm                        50                 21
Impetigo                        45                 35
Leprosy                         122                79
Eczema                          118                23
Total                           825                313

(Sample images of the skin disease classes accompany the table.)
We additionally gathered pictures from the web. More than 1012 images with a resolution of 678×600 have been downloaded for seven different conditions: melanocytic nevi, basal cell carcinoma, benign keratosis-like lesions, melanoma, dermatofibroma, vascular lesions, and actinic keratoses (akiec). In the initial training stage, characteristic properties of typical image features are isolated and, based on these, a unique representation of each classification category is created for the seven distinct classes.
The classes in [7] are skin inflammation, acne infection, leg infection, hand infection, dermatitis subcutaneous, lichen simplex, stasis dermatitis, and ulcers. In the testing stage, these feature-space partitions are used to classify image features.
A. Image Acquisition
Images were acquired from two sources: dermatologist photographs taken with a digital camera, and online and offline disease images from the Kaggle website.
Data Preparation
When we gathered our pictures, they all had different dimensions; our dataset is diverse in height, width, and size. However, the deep neural classifier needs a uniform dataset for training and testing, so we set the images to 100 × 100 pixels before training the model. Our total image count after augmentation is 3000. We used 2400 images for training and 600 for testing [8].
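The resizing and normalization step can be sketched as follows (a minimal NumPy version using nearest-neighbour sampling; the real pipeline may use a library resizer):

```python
import numpy as np

def preprocess(img, size=100):
    """Resize an H x W x 3 uint8 image to size x size (nearest neighbour)
    and normalize pixel values to the [0, 1] range."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size    # source row for each output row
    cols = np.arange(size) * w // size    # source column for each output column
    resized = img[rows][:, cols]          # nearest-neighbour sampling
    return resized.astype(np.float32) / 255.0

if __name__ == "__main__":
    raw = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
    x = preprocess(raw)
    print(x.shape)
```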
B. Convolutional Neural Network
We developed our algorithm based on the 825 collected images; we adjusted the final layer and used our dataset as input. In a CNN, the computed features are combined with one another; the simplified framework of the entire process is shown in Figure 2. CNNs are often used in practice for image recognition and natural language processing. Each pixel in an input image is transformed into an element of a matrix, and these matrices together form the input layer.
The image data generator performs augmentation of images in real time while the model is training: random transformations are applied to each training image as it is passed to the model.
The data flow through the following layer types:
1. Convolutional
2. Pooling
3. Dropout
4. Flatten
5. Dense
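To make the convolution and pooling operations concrete, here is a minimal NumPy sketch of what a single-channel convolutional layer and a 2 × 2 max-pooling layer compute (illustrative only, not the trained network):

```python
import numpy as np

def conv2d(x, k):
    """'Valid' 2-D convolution (cross-correlation, as CNNs actually compute)."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def maxpool2x2(x):
    """Non-overlapping 2 x 2 max pooling; halves each spatial dimension."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

if __name__ == "__main__":
    img = np.arange(36, dtype=float).reshape(6, 6)
    feat = conv2d(img, np.ones((3, 3)) / 9.0)   # 3x3 mean filter -> 4x4 map
    print(feat.shape, maxpool2x2(feat).shape)
```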
Our idea is to build a new CNN model; our model has 13 layers, among them five convolutional layers:
• The first layer has 32 3 × 3 filters with a 'linear' activation function.
• The second layer has 64 3 × 3 filters with a 'linear' activation function.
• The third layer has 128 3 × 3 filters with a 'linear' activation function.
• The fourth layer has 256 3 × 3 filters with a 'linear' activation function.
Additionally, each of the five max-pooling layers uses a 2 × 2 window. The parameters of our two dropout layers are 0.3 and 0.4, respectively. The model contains a flatten layer. Finally, there are two dense layers, with 'linear' and 'softmax' activations; 'softmax' is used to determine the likelihood of the output classes [9].
Training Model
The Adam optimizer is used to compile our model. We use 80% of the dataset for training and the remaining 20% for testing. Our training dataset consists of 2400 images, so the training split contains 1920 images and the validation split contains 480 images. Our classifier's batch size is 78, and 50 epochs were used to train the model.
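The split sizes and the resulting number of batches per epoch follow from the stated numbers (an 80/20 split of 2400 images, batch size 78):

```python
import math

total = 2400       # images available for training, per the text
batch_size = 78

train_n = int(total * 0.8)                     # training split
val_n = total - train_n                        # validation split
steps_per_epoch = math.ceil(train_n / batch_size)

print(train_n, val_n, steps_per_epoch)
```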
V. PERFORMANCE EVALUATION
Preparing precision is regarded as the model's accuracy when applied to the data we prepare. The model's
accuracy when applied to a small sample of data from any class is referred to as the approval precision. The
diagram in the illustration depicts the creation and acceptance of exactness. to evaluate the performance of the
proposed model, we conduct a set of experiments by comparing the proposed model to several state-of-art in
diagnosis models. A convolutional neural network (CNN) system is used for Deep Learning [10]. In image
processing, such a method is commonly used to classify the object as well as to perform the ROI detection and
segmentation process. There are a number of layers in CNN to detect various features of the input layer learning.
Filters are applied at different resolutions to every training image, and the output of each convolution layer feeds the next. The CNN algorithm used for disease detection is layer-based. Our aim in using a CNN for skin disease detection is to improve the recognition results compared to other classifiers.
Figure 6 shows the distribution of disease prediction results. In the graph, infection is more prominent in men overall, while infection on the lower extremities of the body is more visible in women. Some unknown body regions also show infection, visible in both men and women, and acral surfaces show the fewest cases, mostly in men.
In the online dataset, skin diseases were found most often in patients around 45 years old and in those below 10. We observed that the probability of having a disease increases for those facing cancerous skin disease; most of the melanocytic lesions were found alongside dermatofibroma disease. The age group between 0 and 75 years is infected most by melanocytic nevi, whereas people aged 80-90 are affected more by benign keratosis-like lesions. All gender groups are affected most by melanocytic nevi.
A. Performance Measures
Experiments have been carried out to validate the efficiency of the proposed model. The experiments were run on a Core i5, 2.3 GHz processor with 8 GB RAM using Python. Comparisons with other models measure classification performance in terms of sensitivity, specificity, and accuracy derived from the confusion matrix; the measures are computed using the equations described below with the following conventions. In this study, the confusion matrix was used to calculate several metrics. This matrix yields four indices: true positive (TP), false positive (FP), false negative (FN), and true negative (TN). TP and TN count the correctly predicted diseased and normal samples, whereas FP and FN count the incorrectly predicted diseased and normal samples, respectively.
Accuracy, precision, recall, and F1-score have been determined from our test dataset, which contains 600 pictures. Our average precision is 0.76, recall is 0.78, and F1-score is 0.78, so we can say that our classifier is quite acceptable. The classification table is given below. The total accuracy obtained is 78%.
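These measures follow the standard confusion-matrix definitions; a small sketch (using a made-up 2 × 2 confusion matrix, not the paper's data):

```python
def metrics(tp, fp, fn, tn):
    """Standard confusion-matrix measures."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                     # also called sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, specificity, accuracy, f1

if __name__ == "__main__":
    # Illustrative counts only
    print(metrics(tp=40, fp=10, fn=10, tn=40))
```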
Figure 7 Confusion Matrix
Table V shows that the proposed system produces high accuracy when applied to multiclass skin disease. The results show that the proposed system correctly identifies all Acne, BCC (basal cell carcinoma), and psoriasis patients. Acne diagnosis has the highest accuracy compared with the other diseases.
Table VI shows that, on the online skin disease data, the proposed system achieves higher accuracy, specificity, sensitivity, and F-score than comparable systems. The results show that the proposed system correctly identifies nearly all patients with DF, MEL, and akiec disease. Dermatofibroma can likewise be diagnosed with accuracy close to other state-of-the-art skin diagnosis systems.
VI. CONCLUSION
In this study, we have presented a CNN method for the diagnosis of dermatological disease. We gathered data from the internet and from original datasets with various photos, including skin conditions such as vitiligo, psoriasis, warts, eczema, and skin cold. From the online source, 1012 pictures of seven distinct conditions were obtained: melanocytic nevi, basal cell carcinoma, benign keratosis-like lesions, melanoma, dermatofibroma, vascular lesions, and actinic keratoses (akiec). In terms of identifying skin lesions, we have seen quite encouraging findings, and we identified seven different skin conditions even in certain hairy photos. Finally, we ran a statistical analysis to compare performance with the results of our objective investigation; the statistical tests conducted on both datasets' photos led to the conclusion that our technique is statistically the best algorithm. As a consequence, when each class was examined independently, our accuracy values in the multiple classifications rose, and CNN classification produced findings with varying degrees of accuracy. On the online database, the method achieves an accuracy rate of 82.2% and a total accuracy of 78%; acne detection accuracy is 100 percent, and its test results are 97.6 percent accurate.
REFERENCES
[1] Minakshi M. Sonawane, Bharti W. Gawali, Ramesh R.Manza, Sudhir Mendhekar, “analysis of skin disease technique
using smartphone and digital camera identification of skin disease”, Research Article, vol. 4,issue. 3, pp. 529–551, July -
September 2022.
[2] Minakshi M. Sonawane, Ramdas D. Gore, Bharti W. Gawali, Ramesh R.Manza, Sudhir Mendhekar, “Computed Aided
Diagnosis System for Skin Disease Identification”, International Conference on IoT-based Control Networks and
Intelligent Systems (ICICNIS 2020), pp. 656-667, 2020.
[3] R. J. Hay, N. E. Johns, H. C. Williams, I.W. Bolliger, R. P. Dellavalle, and D. J. Margolis, “The global burden of Skin
disease in 2010: An Analysis of the prevalence and impact of skin conditions”, 55 J, Investigative Dermatology, vol.
134, no. 6, PP NO.1527_1534, 2014.
[4] Sardana, K., Mahajan, S., & Sarkar, R. (2009). Spectrum of skin diseases among Indian children. Pediatr Dermatol, 26(1), pp. 6-13.
[5] Palak Mehta, Bhumika Shah, “Review on Techniques and Steps of Computer Aided Skin Cancer Diagnosis”
International Conference on Computational Modeling and Security (CMS2016).
https://doi.org/10.1016/j.procs.2016.05.28.
[6] Online available: https://www.medicalnewstoday.com/articles/154322.
[7] Abraham Getachew Kelbore, Philip Owiti, Anthony J. Reid, Efa Ambaw Bogino, Lantesil Wondewosen and Blen Kassahun Dessu, "Pattern of skin disease in children attending a dermatology clinic in a referral hospital in Wolaita Sodo, southern Ethiopia", BMC Dermatology, https://doi.org/10.1186/s12895-019-0085-5, pp. 3-8, 2019.
[8] Housman TS, Feldman SR, Willi ford PM, Fleischer AB Jr., Goldman ND, et al., “Skin cancer is among the most Costly
of all Cancers to treat for the Medicare population”, J Am Acad Dermatol 48: pp. 425_429, 2003.
[9] Z. Hu and C. S. Yu, “Functional research and development of skin barrier”, Chinese Journal of Clinicians, vol. 7, No. 7,
PP NO 3101–3103, 2013.
[10] A.F. Jerant, J. T. Johnson, C. D. Sheridan, and T. J. Caffrey. "Early detection and treatment of skin cancer", Am Fam
Physician 62 (2): 357–68, 375–6,381–2. PMID 10929700, July 2000.
Abstract— The need for safety components in contemporary rail systems is increasing as a result of the need to prevent accidents. The presence of impediments on the tracks, whether stationary or mobile, is one of the major factors that might cause significant accidents. This study focuses on one of the most effective techniques for preventing railway accidents: the identification of track cracks and obstacles. The primary goal of this project is
to create a method for identifying railway track cracks and notifying nearby stations. The
location of track defects is being pinpointed by a GPS system. The project provides a solution
by providing an advanced tracking and management system for trains to improve the current
railway track inspection and hence transportation service. The approach is based on a potent
blend of mobile computing, an infrared sensor, an Arduino Uno and the Global System for
Mobile Communication (GSM).
Index Terms— Arduino, GPS module, GSM module, IR sensor, Motor driver.
I. INTRODUCTION
Although rail cracks have been determined to be the main cause of derailments in the past, there are currently no accessible low-cost automated testing methods available. Because of this problem's significant effects, a practical and affordable solution that can be used on a large scale is needed. Cracks are often assessed manually by experts called keymen. This is accomplished with a track checker, a miniature railroad car designed to inspect the integrity of railroad lines. The early track checkers, also known as track walkers, were simply individuals who walked the tracks to ensure that they were undamaged (Fig. 1). However, a contemporary track checker is a compact carriage with wheels that may either be controlled by an engineer known as a "track checker" or be automated; this vehicle travels on the railroad rails. One of the best nondestructive testing (NDT) methods now available for surface and near-surface fault detection is eddy current testing (ECT) for crack detection (Fig. 2). The most effective ECT devices are powerful enough to measure thin materials precisely and easily identify minute variances.
To enhance the inspections, new detection techniques must be created. Even though the government has taken the required precautions to ensure safe travel, accidents still occur owing to these cracks; this study therefore focuses on an effective technique to prevent these mishaps. It describes an approach for inspecting railway tracks and finding breaks and cracks in them. Tracking has advanced significantly with the development of communication technology, making it easier to monitor items such as automobiles. The prior approaches were replaced with renewed options that rely on integrating technologies such as the Global Positioning System (GPS) [1].
III. METHODOLOGY
The key components of the crack detection system are given in the block diagram of Fig. 3. The proposed system combines several technologies, including the previously discussed IR, GPS, and GSM. The microcontroller is the heart of the system and coordinates crack detection and communication. It drives the GSM module to transmit text messages containing the current coordinates of the crack (its latitude and longitude, received from the GPS receiver) to the appropriate authority.
A. Arduino UNO
The system uses an Arduino Uno (R3) board, as seen in the figure. With a wide range of shields available, it is one of the most popular and frequently utilized boards. The Italian word "Uno", meaning "One", was chosen to mark the introduction of Arduino 1.0; the Arduino Uno board and version 1.0 of the software served as the reference versions going forward. The Arduino UNO is among the most popular development boards for electronics and robotics; it has become extremely popular because of its flexibility in connecting numerous robotic components, including sensors and actuators. One advantage of the UNO is a USB port through which it can be programmed using the Arduino software, communicating with Windows or Mac OS without the need for extra drivers. The Arduino Uno's ATmega328 has a built-in boot loader that enables users to upload new code without an external hardware programmer. The UNO R3 starter kit is a microcontroller board based on the ATmega328. It includes a 16 MHz crystal oscillator, a USB port, 6 analogue inputs, a reset button, a power jack, and 14 digital input/output pins. To get started, you only need to connect it to a computer with a USB cable and power it with an AC-to-DC adapter or battery; it comes with everything needed to support the microcontroller. The Uno R3 is compatible with all currently available shields and is adaptable to new shields that utilize the extra pins. Arduino shields are simple plug-on boards that may be used to complete a variety of activities quickly.
B. IR Sensor
An IR LED and a photodiode are the primary electronic components required to make an infrared detector circuit. An IR photodiode is a type of diode that can detect light; it sometimes has a black or dark-blue coating on the outside that makes it look like an LED. IR LEDs, serving as the source of infrared rays, are the kind of LED that emit light in the infrared range. When no light shines on the photodiode, it has an extremely high resistance. Such an infrared transmitter-receiver set, also known as an IR TX-RX pair, may be purchased inexpensively from any reputable electronics parts store. Additional components needed for this sensor include a 2222-type transistor and resistors of 330 and 10 ohms.
C. GPS Module
Three main connections are required to use the module, as presented in Figure 5: power, the antenna that receives the GPS satellite signals used to locate problems on the rail tracks, and the computer terminal via which the module transmits the GPS satellite data.
Figure 5. GPS module with Arduino board Figure 6. GSM module with Arduino
D. GSM Module
The GSM module requires a predetermined number, a GPRS shield, and a SIM card to send SMS messages with the GPS coordinate notification to a cell phone.
The general procedure used to deliver an SMS is as follows:
Place the SIM card in the slot provided, as indicated in Figure 6, installed atop the GSM shield.
Connect the module to the Arduino board.
Connect external power to the shield and the Arduino via the USB cable, then gently push the power button to check the power on/off indication.
Observe how frequently the network LED blinks; it blinks rapidly for a few seconds while searching for the network.
E. Motor Driver
The system uses a DC motor driver of type L293D (Figure 7), a dual H-bridge motor driver integrated circuit that connects to and runs two DC motors in both clockwise and anticlockwise directions. It operates on the principle of an H-bridge, a type of circuit that allows voltage to flow in either direction, letting the motor revolve either clockwise or anticlockwise. A single L293D chip contains two H-bridge circuits that can independently drive two DC motors. The input pins on the left (pin 2) and right (pins 15 and 10) control how the motors on the left and right sides rotate. The L293D is intended, in positive-supply applications, to drive inductive loads such as relays, solenoids, and DC and bipolar stepping motors, as well as other high-current/high-voltage loads. The motors are spun according to the logic 0 or logic 1 levels applied across the inputs. Due to its compactness, the L293D is widely employed in robotic applications for controlling DC motors. The pin layout of the L293D motor controller is provided below.
V. SYSTEM SOFTWARE
The Arduino code for the proposed system is written in the C programming language. The program can be
separated into four sections: the first covers the Arduino UNO's input/output addressing; the second moves the
motor forward and acquires the sensor data; the third determines whether there is a fracture by analyzing the
sensor reading and obtains the crack's latitude and longitude from the GPS module; and the fourth sends the
coordinates to a predetermined cell phone using the GSM module. Figure 8 illustrates the flow chart of the
proposed system. Programs built using the Arduino Software (IDE) are referred to as sketches. These sketches
are created in a text editor and saved with the .ino file extension.
VII. CONCLUSION
The primary goal of this work is to replace the manual approach to railroad fracture identification with a robotic
crack tracing system. The proposed automatic rail track inspection system, comprising simple parts including an
Arduino board connected to IR sensors, motors, and GSM and GPS modules, was developed and tested on a
prototype rail track. The system successfully detected a crack (which was intentionally put in the track) and sent
an SMS with the geolocation of the crack to the predefined mobile number. Checking the SMS is far more
convenient than a web-based alert system, as it gives information about the crack in a single touch. The developed
system is found to be reliable and economical, and it is useful in places where manual inspection is difficult and
expensive, such as mountainous, forested, and remote areas. The system has been tested on both its software and
hardware sides and is working well; hence the proposed system can be implemented on railroads. The presented
system is not only economical but will also save a significant amount of time in contrast to existing techniques.
Given that everything is automated, tracking how well the tracks are maintained may in the end involve less
chance of error than conventional detection methods. Consequently, it will greatly reduce the likelihood of train
accidents.
Grenze International Journal of Engineering and Technology, June Issue
Abstract—This project's primary goal is to develop a successful working prototype that can
detect the existence of a gas leakage, in this case liquefied petroleum gas (LPG), and control
the gas to prevent large leaks. The prototype should perform immediate data transfer and
warning in addition to detection and control. This is done by implementing an alerting system:
an alarm via a buzzer, an application via a Wi-Fi module, and SMS via a GSM module. Here,
the alerting system is in place to inform nearby residents and industry workers about the
leakage. Once a leak has been discovered, the first step is to pinpoint its precise location. This
can be done in two different ways. The first way is to check the pressure in each region using a
barometric pressure sensor; if there is a difference between the pressure in one region and the
previous one, a gas leak has occurred, and the higher the pressure drop, the bigger the leak.
The second approach involves using an appropriate gas sensor, in this case the MQ-6, to
directly detect the concentration of the leaking gas. This is accomplished by comparing the
error between the actual value and the predetermined value of the sensors. Here, we combine
both approaches to increase redundancy. When the leak condition is met, an alert is triggered,
and valves at the location of the gas leak are controlled using an Arduino Mega controller.
Further repair action is then performed on the damaged pipeline, resulting in a shorter
response time for damage restoration. In conclusion, this project has offered students the
chance to apply theory to solving problems relevant to the engineering scope of work.
Index Terms— Arduino Mega Rev 3, GSM module, ESP8266 Wi-Fi module, Sensor,
HX710B Atmospheric Pressure Sensor Module.
I. INTRODUCTION
Today's industrial accidents and worker fatalities are primarily caused by dangerous gas leakages. Consider the
recent "VIZAG GAS LEAK" event that happened in LG Polymers on May 7, 2020, for a clearer understanding.
This issue involved uncontrolled Styrene vapours coming from the boiler tanks. In the immediate aftermath of
the tragedy, 12 individuals lost their lives, 585 people required hospital care, and cattle and vegetation were
destroyed. This is one of the recent incidents to occur in our nation. In this decade, there have been a lot of other
accidents.
Gas usage causes serious issues in both domestic and commercial settings. The gas that is used could be pricey
Figure 1. Block Diagram of the Proposed model
for our prototype. The primary objective of the system is to determine the location of the gas leak, or the precise
segment where the gas leak occurred, by utilizing gas sensors (G1, G2, G3) and pressure sensors (P1, P2, P3).
Opening any one of the hand valves (HV1, HV2, or HV3) in this prototype causes a leak at different positions.
The attention shifts to controlling action once the damaged component has been located. Closing the control
valves (Solenoid valves or air valves) at the beginning and end of the segment where the gas leak is occurring is
the primary objective of the control action.
For a better understanding, if a gas leak occurs at segment 2, i.e., by opening HV2, then there will be a
change in pressure from segment 2 onwards, i.e., at P2 and P3 respectively. We can also observe an exponential
rise in the readings of the gas concentration in parts per million (ppm), with respect to the leaking gas, at the gas
sensor G2 only. The controller determines when to close the valves in relation to the intended set point by using
these measurements of pressure change and gas concentration. Since we were using the MQ-6 sensor, whose
range is 200 to 10,000 ppm, we set the set point at 300 ppm. Whenever the concentration exceeds the
predetermined level, or whenever there is a significant shift in the pressure values from the previous segment,
i.e., if the gas leak satisfies the aforementioned requirements, a signal from the Arduino controller is delivered to
the solenoid valves S2 and S3, which are located at the inlet and outlet of segment 2 respectively.
Additionally, a trigger signal is sent to the alarm system, which comprises a GSM module that sends an
SMS to factory workers informing them of the location of the gas leak and a Wi-Fi module that transmits
continuous monitoring data from the sensors to the ThingView application or the ThingSpeak website over the
internet [3]. In addition to these, we also have the most widely used equipment, such as an alarm (buzzer) to alert
the industry that there is a leak and an exhaust fan to vent the leaked gas to the atmosphere. Due to this
segmentation and closing of control valves, only a minor amount of gas is released into the atmosphere, where it
does not significantly impact the health of the people in and around the industry premises. Therefore, since the
position of the leakage at a specific section is known, less time is needed for damage restoration.
B. Hardware and software
The hardware setup is shown in Figure 3. A brief description about the hardware used is given below.
Arduino Mega 2560 Rev 3
The Arduino Mega 2560 is a microcontroller board based on the ATmega2560. It includes 54 digital
input/output pins (15 of which can be used as PWM outputs), 16 analogue inputs, 4 hardware serial ports
(UARTs), a 16 MHz crystal oscillator, a USB connection, a power jack, an ICSP header, and a reset button. Here
it is used to obtain sensor data and to produce the activation signals that operate the solenoid valves and send
alert signals.
Figure 2. Top view of the proposed prototype
MQ-6 Sensor
The MQ-6 gas sensor is a Metal Oxide Semiconductor (MOS) type gas sensor that is primarily used to identify
the presence of butane and LPG in the air. The MQ-6 sensor has a range of 200 to 10,000 ppm [5].
GSM Module
SMS notifications are sent using a SIM900 GSM module when gas is detected. GSM is intended to be a tool for
exchanging information [6].
Piezo buzzer
The buzzer's primary function is to transform the input signal, which is current (less than 15 mA), into sound.
HX710B Atmospheric Pressure Sensor Module
The HX710B atmospheric pressure sensor module, with an altitude resolution of 10 cm, is used. This barometric
pressure sensor is best suited to altimeters and variometers. The sensor module can sense air pressure in the
range 0-40 kPa, and it can also be used to monitor water level and other air-pressure applications [7].
LCD
An LCD is employed to display the message indicating "gas detected at zone", which is coded in the program to
signal the danger.
Wi-Fi Module
A self-contained SOC with an integrated TCP/IP protocol stack, the ESP8266 Wi-Fi Module allows any
microcontroller to access your Wi-Fi network. The ESP8266 is capable of offloading all Wi-Fi networking tasks
from another application processor or hosting an application [8].
Exhaust fan
Exhaust fans are used for expelling toxic gases in industries.
Solenoid Valves
It is an electromechanical valve that is often used to control the flow of liquid or gas. There are many different
kinds of solenoid valves, but the two most common varieties are direct acting and pilot driven [9].
IC L293D
The L293D IC is a typical motor driver IC which allows a DC motor to be driven in either direction. In this
prototype, it is used for opening and closing the solenoid valves [10].
C. Proposed system design flow
The flowchart diagram in Fig. 4 depicts the entire hardware connection process of the suggested system design.
Following are some quick discussions of the process flow's steps:
Step 1: Firstly, connect all the modules and other components to the Arduino Mega microcontroller.
Step 2: Get the sensor data from the respective sensors placed at their respective locations.
Step 3: Check whether the predetermined condition is satisfied, that is, whether the gas concentration
from the MQ-6 gas sensor is greater than or equal to 300 ppm or there is an exponential change in
the pressure values from the pressure sensors.
Step 4: If the condition is not satisfied, go back to the second step.
Step 5: If the condition is met, the Arduino finds the location where the leakage occurred with the help
of the sensor data.
Step 6: It then sends a trigger signal to the alert systems, such as the GSM module and Wi-Fi module,
and also shares the sensor data with them.
Step 7: It also sends the control signal (CO) to the L293D IC to apply the desired control action at the
desired control valve.
As mentioned earlier in the Methodology section, let us study the same example practically. That is, if leakage
occurs at the second segment, which is done by opening the second hand valve, then the sensor parameter
changes are as shown in Table 1.
Figure 6 shows the LCD display, which displays the pressure and MQ-6 sensor readings. In the display, the
first line shows the pressure values and the second line gives the gas concentration values. Here A, B, and C
indicate segments 1, 2, and 3 respectively. When the gas leakage occurs, the LCD displays the location of the gas
leakage (Location 2) with respect to the pressure and MQ-6 sensors, as shown in Figure 7.
TABLE I: OBSERVATIONS
Parameters Before opening the 2nd hand valve After opening the 2nd hand valve
Pressure at position 1 (A) 320 Pa 215 Pa
Pressure at position 2 (B) 315 Pa 215 Pa
Pressure at position 3 (C) 310 Pa 220 Pa
Gas concentration at Position 1 (A) 0 ppm 0 ppm
Gas concentration at Position 2 (B) 0 ppm 431 ppm
Gas concentration at Position 3 (C) 0 ppm 0 ppm
Figure 6. LCD display showing the gas and pressure sensors readings Figure 7. LCD display showing the position of leakage
Figure 8. (a) Graphical Representation of pressure (Pa) values in ThingSpeak at segment 1, (b) Graphical Representation of Gas
concentration (ppm) values in ThingSpeak at segment 1
Figure 9. (a) Graphical Representation of pressure (Pa) values in ThingSpeak at segment 2, (b) Graphical Representation of Gas
concentration (ppm) values in ThingSpeak at segment 2
Figure 10. (a) Graphical Representation of pressure (Pa) values in ThingSpeak at segment 3, (b) Graphical Representation of Gas
concentration (ppm) values in ThingSpeak at segment 3
Coming to the alerting system, the SMS and graphical analysis data are sent to the application or website.
Figures 8, 9, and 10 give the individual sensor data with respect to time, which is sent to the ThingSpeak website
through the Wi-Fi module so that the data can be viewed in the form of graphs, and a warning is sent when the
gas leakage exceeds the limit. The same data can also be seen on a mobile device through the ThingView
application. For better understanding, visit the website through the link https://thingspeak.com/channels/243722.
Figure 11. SMS sent to the registered mobile with sensor values
The data is also sent to the mobile as an SMS, as shown in Figure 11, by using the SIM900A Quad Band
GSM/GPRS module.
V. CONCLUSION
In terms of the project's goals, the prototype has given adequate answers for preventing toxic gas leaks in
industries at low cost. The detection, alerting, and control systems make up the prototype's three core parts. The
detection part is designed with an MQ-6 sensor and an HX710B atmospheric pressure sensor. The control system
is constructed using solenoid valves, an Arduino Mega controller, an exhaust fan, and an L293D IC. The alert
system is created using an LCD, an ESP8266 Wi-Fi module, and a SIM900 GSM module. This alert system is
incorporated to increase public awareness of toxic gas leaks, which in turn lowers the number of accidents caused
by them. The integration of these three systems gives accurate results. The prototype's design eliminates the
limitations of previous works and makes it simple for both small and large-scale companies to deploy without
making significant changes to their existing systems.
FUTURE SCOPE
Other types of sensors, as well as other actuators, that may perform better than the traditional paradigm can be
employed. The design also offers flexibility, because additional modules may be introduced for control actions
and alerting without affecting the existing modules.
REFERENCES
[1] R. K. Kodali, R. N. V. Greeshma, K. P. Nimmanapalli and Y. K. Y. Borra, "IOT Based Industrial Plant Safety Gas
Leakage Detection System," 2018 4th International Conference on Computing Communication and Automation
(ICCCA), 2018, pp. 1-5, doi: 10.1109/CCAA.2018.8777463.
[2] S. Z. Yahaya, M. N. Mohd Zailani, Z. H. Che Soh and K. A. Ahmad, "IoT Based System for Monitoring and Control of
Gas Leaking," 2020 1st International Conference on Information Technology, Advanced Mechanical and Electrical
Engineering (ICITAMEE), 2020, pp. 122-127, doi: 10.1109/ICITAMEE50454.2020.9398384.
[3] Yahaya, S. Z., Mohd Zailani, M. N., Che Soh, Z. H., & Ahmad, K. A. (2020). IoT Based System for Monitoring and
Control of Gas Leaking. 2020 1st International Conference on Information Technology, Advanced Mechanical and
Electrical Engineering (ICITAMEE). doi:10.1109/icitamee50454.2020.9398384
[4] H. Paul, M. K. Saifullah and M. M. Kabir, "A Smart Natural Gas Leakage Detection and Control System for Gas
Distribution Companies of Bangladesh using IoT," 2021 2nd International Conference on Robotics, Electrical and
Signal Processing Techniques (ICREST), 2021, pp. 109-114, doi: 10.1109/ICREST51555.2021.9331226.
[5] Ba Thanh Nguyen and Anh Vu Nguyen, “IoT Application for Gas Leakages Monitoring,” International Research
Journal of Advanced Engineering and Science, Volume 5, Issue 4, pp. 51-53, 2020.
[6] M. A. Subramanian, N. Selvam, S. Rajkumar, R. Mahalakshmi and J. Ramprabhakar, "Gas Leakage Detection System
using IoT with integrated notifications using Pushbullet-A Review," Fourth International Conference on Inventive
Systems and Control (ICISC), pp. 359-362, doi: 10.1109/ICISC47916.2020.9171093, 2020.
[7] A. Banik, B. Aich and S. Ghosh, "Microcontroller based low-cost gas leakage detector with SMS alert," 2018 Emerging
Trends in Electronic Devices and Computational Techniques (EDCT), 2018, pp. 1-3, doi:
10.1109/EDCT.2018.8405094.
[8] R. K. Kodali, R. N. V. Greeshma, K. P. Nimmanapalli and Y. K. Y. Borra, "IOT Based Industrial Plant Safety Gas
Leakage Detection System," 2018 4th International Conference on Computing Communication and Automation
(ICCCA), 2018, pp. 1-5, doi: 10.1109/CCAA.2018.8777463
[9] A. Suryana et al., "Detection of Leak Position in Household LPG Distribution Pipes Using Gas Pressure Sensors and
Continuity Equation," 6th International Conference on Computing Engineering and Design (ICCED), pp. 15,
doi: 10.1109/ICCED51276.2020.9415775, 2019.
[10] R. S. Rosli, M. H. Habaebi and M. R. Islam, "Characteristic Analysis of Received Signal Strength Indicator from
ESP8266 WiFi Transceiver Module," 7th International Conference on Computer and Communication Engineering
(ICCCE), pp. 504-507, doi: 10.1109/ICCCE.2018.8539338, 2018.
[11] K. Gavaskar, D. Malathi, G. Ravivarma and A. Arulmurugan, "Development of LPG Leakage Detection Alert and Auto
Exhaust System using IoT," 2021 7th International Conference on Electrical Energy Systems (ICEES), 2021, pp. 558-
563, doi: 10.1109/ICEES51510.2021.9383633.
[12] M. S. Saeed and N. Alim, "Design and Implementation of a Dual Mode Autonomous Gas Leakage Detecting Robot,"
2019 International Conference on Robotics, Electrical and Signal Processing Techniques (ICREST), 2019, pp. 79-84,
doi: 10.1109/ICREST.2019.8644075.
I. INTRODUCTION
Nowadays various electronic devices are used in close vicinity to each other. Electromagnetic interference
(EMI), which occurs due to interference between the electric and magnetic fields of electronic devices, can
disturb the functioning of a device as well as interrupt the working of other devices in the same confined space,
which can be dangerous [1,2,3]. Various studies have shown that there has been a rise in the number of
calamities due to EMI. These calamities are mostly found in environments where many electronic devices are
working at the same time [4], as noted by Rathi et al. To tackle this problem, the manufacturers of electronic
devices are putting in constant effort. They want to fortify the safety features of electronic devices against the
effects of electromagnetic (EM) radiation. The solution to this problem is to provide shielding against EM
radiation. By shielding a device, its electromagnetic compatibility can be improved. To guard any electronic
device, we need to stop the transmission of EM radiation with a shield, called an EMI shield.
Electrical conductivity is the most significant requirement of an EMI shield. We know that all metals are very
good conductors of electricity, so they are assumed to be a perfect source of protection against EMI waves. But
they also have some limitations, such as heavy weight, rigidity, and a tendency to rust. In today's scenario, an
alternative is used,
II. EXPERIMENTAL
A. Materials
The PCNC films were developed using the polymer P(VDF-TrFE) (99.9 % pure) in powder form, purchased
from the Piezotech Arkema group. The polymer molecular weight was 200,000 g/mol. rGO (product ID 777684),
in powdered form, was used as the conducting filler, and N,N-dimethylformamide (DMF) was used as the solvent
to dissolve the solutes and form a homogeneous solution. The items were purchased from Sigma Aldrich, India.
B. Fabrication of Nanocomposite Film
The simple and effective technique of solution casting was used to develop the P(VDF-TrFE)/rGO PCNC. Fig. 1
describes the process of the solution casting method. Firstly, the P(VDF-TrFE) is dissolved in DMF in a glass
tumbler. The combination of P(VDF-TrFE) and DMF was heated at a temperature of around 80 °C for approximately
2 h with a stirring speed of 400 rpm to obtain a homogeneous solution. After this, rGO in different ratios was
added to the solution, and the obtained solution was again heated at 50 °C for around 5 h. The acquired uniform
solution was transferred into a glass petri dish and heated slowly so that the solvent evaporated. Once the films
were completely dried, they could be peeled from the petri dish. The same fabrication process was repeated for
rGO concentrations of 5 wt%, 10 wt%, and 15 wt%, giving films with P(VDF-TrFE)/rGO compositions of
95/05, 90/10, and 85/15 respectively.
III. RESULTS AND DISCUSSION
A. Surface morphology of composite films
SEM images give an idea of the surface morphology of the fabricated CPNC. Fig. 2 shows the SEM images of
the fabricated films. Fig. 2(a) shows the morphology of pure P(VDF-TrFE); we can see the spotted, grain-like
surface structure of the film, as reported earlier [11,12]. Fig. 2(b) and 2(c) show the SEM images of the 90/10
and 85/15 compositions respectively. We can see the effect of rGO on the surface: the conducting channels can
clearly be seen on the surface of the P(VDF-TrFE), and as the concentration of rGO increases, the conducting
channels increase.
Figure 2. SEM images of (a) pure P(VDF-TrFE); (b-c) SEM images of composition 90/10, 85/15 of P(VDF-TrFE)/rGo respectively
B. Dielectric analysis
The SE of the fabricated CPNC samples was calculated with the help of dielectric parameters in the frequency
range of 10 kHz to 1 MHz. The conductivity (σ) and dielectric constant (ϵʹ) of the fabricated CPNCs are shown
in Fig. 3(a) and (b) with respect to frequency. The SE of the films depends mainly on conductivity; it is clear
from the graph that the conductivity of the film with the 85/15 P(VDF-TrFE)/rGO ratio is maximum, and the
film with the 95/05 composition has the smallest conductivity. The main cause of this is the formation of
numerous conducting channels in the interfacial area of the polymer matrix, which increases the dipole moment.
The dielectric constant of the films was also calculated, and it is maximum for the film with 15% rGO.
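For context, shielding effectiveness is commonly expressed in decibels as the sum of reflection, absorption, and multiple-reflection terms; the far-field absorption relation below is a standard textbook expression, not one stated explicitly here, and it indicates why SE rises with conductivity:

```latex
SE_{\mathrm{total}} = SE_R + SE_A + SE_M ,\qquad
SE_A = 8.68\, t\,\sqrt{\pi f \mu \sigma}\ \ \mathrm{dB}
```

where $t$ is the film thickness, $f$ the frequency, $\mu$ the permeability, and $\sigma$ the electrical conductivity; a higher $\sigma$ therefore directly raises the absorption contribution to the SE.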
C. Mechanical Properties
Mechanical property analysis gives an idea of the performance of the fabricated films when they are subjected to
a stretching or pulling force before failure [13]. The mechanical properties of the films with composition labels
85/15, 90/10, and 95/05 are depicted in the stress-strain diagram shown in Fig. 4. From the graph we can observe
that 85/15 is the most brittle among these films and 95/05 is the most ductile. The toughness of each film
depends on the area under its stress-strain curve, which is different for all of them; hence, their toughness values
also differ. The film with composition 90/10 is the most satisfactory material according to the graph, but we can
still see that the 85/15 composition has a flexibility of 0.7 mm and can bear a load of around 1.5 MPa; however,
as the concentration of conducting filler decreases, the flexibility and mechanical strength of the films increase.
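The toughness criterion mentioned above is the standard definition, the area under the stress-strain curve up to failure:

```latex
U_T = \int_0^{\varepsilon_f} \sigma \, d\varepsilon
```

where $\sigma$ is the stress, $\varepsilon$ the strain, and $\varepsilon_f$ the strain at failure; since the films trace different curves in Fig. 4, their toughness values differ accordingly.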
IV. CONCLUSION
In the presented work, thin, cost-effective, and flexible composite sheets have been developed for EMI shielding
applications. As the filler content in the film composition increases, the SE of the films increases as well. From
the dielectric and SE analysis, we can say that the CPNC with 15 wt% rGO gives the most suitable result, as its
SE is the highest, lying between 25 dB and 36 dB over the examined frequency range of 10 kHz to 1 MHz. Its
conductivity is also the highest among the fabricated CPNCs, and the mechanical properties of the CPNC with
15 wt% rGO are satisfactory. It is also clear from the SEM images that the rGO filler is distributed evenly
through the P(VDF-TrFE) polymer matrix. The fabricated P(VDF-TrFE)/rGO composite film has very good
prospects for use as a flexible EMI shielding film to increase the electromagnetic compatibility of various
electronic equipment.
REFERENCES
[1] B. Zhao, C. Zhao, R. Li, S. Hamidinejad and C. Park, “Flexible, Ultrathin, and High-Efficiency Electromagnetic
Shielding Properties of Poly(Vinylidene Fluoride)/Carbon Composite Films,” ACS Appl. Mater. Interfaces, vol. 9, no.
24, pp. 20873−20884, Jun. 2017.
[2] M. Ameli, M. Nofar, S. Wang, and C. B. Park, “Lightweight Polypropylene/Stainless-Steel Fiber Composite Foams with
Low Percolation for Efficient Electromagnetic Interference Shielding,” ACS Appl. Mater. Interfaces, vol. 6, no. 14, pp.
11091−11100, July 2014.
[3] D. X. Yan, H. Pang, B. Li, R. Vajtai, L. Xu, P. G. Ren, J.H. Wang and Z. M. Li, “Structured Reduced Graphene
Oxide/Polymer Composites for Ultra - Efficient Electromagnetic Interference Shielding,” Adv. Funct. Mater, vol. 25,
no. 4, pp. 559−566, Jan. 2015.
[4] V. Rathi, V. Panwar, G Anoop, M. Chaturvedi, K. Sharma and B. Prasad. "Flexible, Thin Composite Film to Enhance
the Electromagnetic Compatibility of Biomedical Electronic Devices," in IEEE Trans. Electromagnetic Compatibility,
vol. 61, no. 4, pp. 1033-1041, Aug 2019.
[5] F. Paulis, M. H. Nisanci, A. Orlandi, M. Y. Koledintseva, and J.L. Drewniak, “Design of homogeneous and composite
materials from shielding effectiveness specifications,” IEEE Trans. Electromagnetic Compatibility, vol. 56, no. 2, pp.
343–351, Apr. 2014.
[6] S. H. Lee, D. Kang, and I. K. Oh, “Multilayered Graphene-Carbon Nanotube-Iron Oxide Three-Dimensional
Heterostructure for Flexible Electromagnetic Interference Shielding Film,” Carbon, vol. 111, pp. 248−257, Jan. 2017.
[7] M. Chen, L. Zhang, S. Duan, S. Jing, H. Jiang, M. Luo, and C. Li, “Highly conductive and flexible polymer composites
with improved mechanical and electromagnetic interference shielding performances,” Nanoscale, vol. 6, no. 7, pp. 3796-
3803, 2014
[8] V. Rathi and V. Panwar, “Electromagnetic interference shielding analysis of conducting composites in near and far field
region,” IEEE Trans. Electromagn. Compat., vol. 99, pp. 1-7, Jan. 2018.
[9] Q. M. Zhang, V. Bharti, and G. Kavarnos, “Poly (vinylidene fluoride) (PVDF) and its copolymers,” Encyclopedia of
Smart Materials, John Wiley & Sons, New York, pp. 807–825, Jul. 2002.
[10] V. Panwar and G. Anoop “An ionic polymer–metal nanocomposite sensor using the direct attachment of an acidic ionic
liquid in a polymer blend” J. Mater. Chem. C, 7, 9389-9397, Jul.2019
[11] S.W. Hahm, and D.Y. Khang, “Crystallization and microstructure-dependent elastic moduli of ferroelectric P(VDF–
TrFE) thin films,” Soft Matter, vol. 6, no. 22, pp. 5802–5806, 2010.
[12] J. Ryu, K. No, Y. Kim, E. Park, and S. Hong, “Synthesis and Application of Ferroelectric Poly(Vinylidene Fluoride-
co-Trifluoroethylene) Films using Electrophoretic Deposition,” Scientific Reports, vol. 6, 36176, Nov. 2016.
[13] H.L. Chen et.al., “Predicting mechanical properties of polyvinylidene fluoride/carbon nanotube composites by molecular
simulation”, Materials Research Express, vol. 4, no. 11, 115025, Nov. 2017.
Abstract—Solar power is the future of renewable power generation. The main problem with
solar power generation is that solar panels occupy a large area on rooftops and open spaces,
and they are not easy to mount; they are also difficult to install, maintain, and clean on a
regular basis. Additionally, shifting the solar panels in accordance with the position of the sun
may produce up to 40% more solar electricity. Here, we suggest a type of solar panel that may
be placed on bodies of water, like lakes and pools, freeing up space on the ground. We also
provide a novel sun-tracking floating solar technique for moving the solar panels in accordance
with the position of the sun using LDR sensors, which increases power production, while the
floating system in the water keeps the solar panel from overheating. Additionally, water is
conserved due to reduced evaporation from the water body. In the upcoming 10 years, India
proposes the generation of 1 GW and 1.75 GW of solar photovoltaic power from renewable
energy sources. As of date, around 5000 MW has been commissioned in different parts of the
country, as per the Jawaharlal Nehru Solar Mission. To meet the target, there is a need to
produce more solar energy in a short span of 10 years. Floating solar photovoltaic plants are an
emerging form of PV system that floats on water bodies like canals, water reservoirs,
lakes, and ponds. This paper proposes a prototype of a floating sun-tracking solar panel to
increase the production of solar energy using floating solar panels, a Raspberry Pi Pico
microcontroller board, and the Thonny IDE software.
Index Terms— Raspberry Pi Pico microcontroller board, Thonny IDE, Current Sensor
module, DHT11 temperature and Humidity Sensor.
I. INTRODUCTION
The standard solar panel is not very efficient, and its energy production is low. To overcome this problem, we
have proposed a floating sun-tracking solar panel. The floating photovoltaic system exploits several functions,
such as cooling, concentrating, and tracking. The outcomes of the system have indicated the important influence
of cooling and tracking on the system's performance. The main advantage is the large amount of solar energy
production compared to rooftop solar panels; since the solar panel is placed on the water, the cooling effect of
the water keeps the panel from heating continuously [1]. This increases the efficiency of the solar panel, which in
turn leads to the production of a large amount of energy. More electricity will be generated due to the cooling
effect of water in floating solar than in terrace roofing systems. The geometry of the given system has been
determined with two major aspects in mind [2]. Firstly, the module should protect as much water
II. IMPLEMENTATION
Fig.1 represents the flow diagram of the floating sun-tracking solar panel. The heart of the system is
the controller, i.e., the Raspberry Pi Pico (RP2040). A single-axis solar panel is used, which rotates
through 180° so that the maximum amount of light is captured by the LDRs. The panel is rotated with
the help of a servo motor, driven by MicroPython running on the Raspberry Pi Pico. The panel floats
with the help of vacuum-filled tubes [3]. The solar panel continuously rotates towards the sun with
the help of the servo motor, so the photovoltaic cells absorb the maximum amount of energy.
Rechargeable batteries are used to store energy for future needs. The floating solar panels can be
installed at existing power plants, and they keep water bodies fresh and clean while generating
renewable energy. A 16×2 LCD is used to display the generated voltage. Continuous rotation of the
panel towards the sun causes continuous heating, which reduces the panel's efficiency; the floating
arrangement overcomes this by continuously dissipating the heat. Compared to rooftop solar panels, the
efficiency of the floating solar panel is increased by 35%. A 100-watt solar panel receiving 10 direct
sun-hours per day will generate 2 kWh of energy, so the maximum energy produced annually is 730 kWh [4].
Fig.2 Flow Chart for the Working of the Proposed System Fig.3 RP2040 Raspberry Pi Pico
The flowchart diagram in Fig.2 depicts the working process of the suggested system design. Following are some
quick discussions of the process flow steps:
Step1: Place the Solar Panel on a roof top or in an open area.
Step2: Connect the model to an external power supply for rotating the servo motor.
Step3: If LDR1 has absorbed more voltage than LDR2, the servo motor rotates 30° clockwise.
Step4: If LDR2 has absorbed more voltage than LDR3, the servo motor rotates 60° clockwise.
Step5: If LDR3 absorbs the maximum voltage, the servo motor rotates 90° clockwise.
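The LDR-comparison steps above can be sketched in plain Python; the function name and the simple largest-reading rule are illustrative, not the authors' exact firmware:

```python
def choose_servo_angle(ldr1, ldr2, ldr3):
    """Map three LDR voltage readings to a clockwise servo angle (degrees).

    Mirrors Steps 3-5: whichever sensor reads the highest voltage
    decides how far the panel rotates. A hypothetical helper, written
    for illustration only.
    """
    if ldr3 >= ldr1 and ldr3 >= ldr2:
        return 90   # Step 5: LDR3 receives the most light
    if ldr2 >= ldr1:
        return 60   # Step 4: LDR2 brighter than LDR1
    return 30       # Step 3: LDR1 brightest

print(choose_servo_angle(0.8, 1.4, 2.1))  # LDR3 dominates -> 90
```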
Fig.4 shows the pin configuration of Raspberry Pi Pico. The Raspberry Pi Pico consists of an integrated
temperature sensor and low power sleep and dormant modes. Table.1 shows the specifications of Raspberry Pi
Pico (RP2040).
Fig.5 shows an INA219-based DC voltage and current sensor module. The CJMCU-219 is a zero-drift,
I2C-interface-based, bidirectional current/power monitoring module and an essential component of a
power monitoring system. It is capable of sensing current, voltage, and power, and it transmits data
to the host microcontroller over the I2C bus. The specifications of the CJMCU-219 current sensor
module are given in Table.2 [6].
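As a rough illustration of how a host interprets the INA219: the chip exposes a signed 16-bit current register whose scale (the "current LSB") is set by the calibration register. The 0.1 mA/bit LSB below is an assumed example value, not a fixed property of the CJMCU-219 module:

```python
def ina219_current_ma(raw_register, current_lsb_ma=0.1):
    """Convert an INA219 current-register reading to milliamps.

    The register is a signed 16-bit value; its scale depends on how
    the calibration register was programmed. The 0.1 mA/bit default
    here is illustrative only.
    """
    if raw_register & 0x8000:        # sign-extend negative readings
        raw_register -= 0x10000
    return raw_register * current_lsb_ma

print(ina219_current_ma(1930))  # -> 193.0 (mA)
```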
C. LDR Sensor
An LDR, or light-dependent resistor, is shown in Fig.6. It is a kind of resistor whose resistance
changes depending on the amount of light falling on its surface, and it is made of a high-resistance
semiconductor. These resistors are used in circuits that must sense the presence of light. Its
operation is based on semiconductivity, and LDRs are available with a variety of characteristics and
resistance values. The electrons in the semiconductor material's
valence band are stimulated to the conduction band when light strikes the object, or when photons strike it. To
cause the electrons to move from the valence band to the conduction band, the incident photons must have an
energy larger than the bandgap of the semiconductor material. As a result, when sufficiently energetic light
impacts the device, a huge number of charge carriers are produced as more and more electrons are driven to the
conduction band[7].
D. DHT-11 Temperature and Humidity Sensor
Fig.7 shows the DHT11, a digital sensor for measuring humidity and temperature, interfaced with the
Raspberry Pi Pico. The DHT11 is available both as a bare sensor and as a module; in this prototype we
use the bare sensor. The DHT11 measures the surrounding air using a thermistor and a capacitive
humidity sensor [8]. Table.3 shows the specifications of the DHT11 temperature and humidity sensor.
E. LCD
An LCD (liquid crystal display) is shown in Fig.8. The working principle of this 16×2 LCD is that it
blocks light rather than emitting it. It is an electronic display module used in many applications
such as mobile phones, calculators, and computers. The LCD used here has a 16×2 display with 40 pins
[9]. The main advantages of this kind of LCD are that it is inexpensive and simply programmable, and
there are no limitations on displaying custom characters.
F. MG995 Servo Motor
Fig.9 shows the MG995, a heavy-duty, reliable servo motor. It is a high-speed actuator with dual
bearings, low power consumption, and low cost, making it feasible for industrial production. It
delivers a maximum torque of 208 oz-in at 6 volts and has a maximum rotational speed of 0.13 seconds
per 60°. If the voltage drops to a minimum of 4.8 volts, it maintains a torque of 180 oz-in and
rotates at 0.17 seconds per 60° [10].
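A quick check of these datasheet figures, converting the seconds-per-60° speed into the sweep time for an arbitrary angle:

```python
def rotation_time(angle_deg, sec_per_60deg):
    """Time for the MG995 to sweep a given angle, from its datasheet
    speed expressed in seconds per 60 degrees."""
    return angle_deg / 60.0 * sec_per_60deg

# A full 90-degree tracking step at 6 V (0.13 s / 60 deg) takes ~0.2 s
print(round(rotation_time(90, 0.13), 3))
```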
G. Thing Speak IOT Platform
ThingSpeak is an IoT platform used for gathering real-time data such as location, climatic-change
information, and other device data. In our proposed floating sun-tracking solar panel model, this IoT
platform is used to collect the voltage of both the floating solar panel and the static solar panel.
H. Thonny IDE
Fig.10 shows the interface of the Thonny IDE, the platform used for coding. We use MicroPython to
program the servo motor to rotate in the direction of the sun.
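A minimal sketch of the angle-to-PWM arithmetic MicroPython uses to position a hobby servo on the Pico. The 500–2500 µs pulse range is a common hobby-servo default, assumed here rather than taken from the MG995 datasheet:

```python
def servo_duty_u16(angle_deg, min_us=500, max_us=2500, period_us=20000):
    """Convert a servo angle (0-180 deg) into the 16-bit duty value
    that MicroPython's machine.PWM.duty_u16() expects at 50 Hz.

    The 500-2500 us pulse range is an assumed default; verify it
    against the servo's datasheet before use.
    """
    pulse_us = min_us + (max_us - min_us) * angle_deg / 180.0
    return int(pulse_us / period_us * 65535)

# On the Pico itself (not runnable on a PC) this value would feed:
#   pwm = machine.PWM(machine.Pin(0)); pwm.freq(50)
#   pwm.duty_u16(servo_duty_u16(90))
print(servo_duty_u16(90))  # mid-position pulse (~1.5 ms)
```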
of the sun with the help of the servo motor. The amount of voltage generated is shown on the LCD
display, and the current, voltage, and temperature readings are taken from the ThingSpeak cloud using
the ESP8266 Wi-Fi module, which is designed specifically for Internet of Things (IoT) systems. With a
32-bit processor, some RAM, and, depending on the supplier, between 512 KB and 4 MB of flash memory,
the ESP8266 is a complete Wi-Fi system on a chip. This enables it to work either as a standalone
device running simple programs or as a wireless adaptor adding Wi-Fi functionality to other systems.
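ThingSpeak accepts channel updates as a plain HTTP GET carrying a write API key and field values. A hedged sketch of building such a request URL follows; the field1/field2/field3 assignment is an assumed channel layout, not fixed by the platform:

```python
from urllib.parse import urlencode

THINGSPEAK_UPDATE = "https://api.thingspeak.com/update"

def thingspeak_update_url(api_key, voltage, current, temperature):
    """Build the ThingSpeak 'update' request URL for one reading.

    field1 = voltage, field2 = current, field3 = temperature is an
    illustrative channel layout; on the device the URL would be
    fetched over the ESP8266's Wi-Fi link.
    """
    params = urlencode({
        "api_key": api_key,
        "field1": voltage,
        "field2": current,
        "field3": temperature,
    })
    return f"{THINGSPEAK_UPDATE}?{params}"

print(thingspeak_update_url("DEMO_KEY", 11, 203, 32))
```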
VI. RESULT
Two solar panels were tested: one a static rooftop solar panel, the other a floating sun-tracking
solar panel. The static solar panel was positioned at a 33-degree tilt, which maximizes its power
production. The test was conducted from 8 AM to 6 PM over three days, with measurements taken
continuously. A load (a battery of capacity 15 V-17 V) was used to compute the solar panel's current
and voltage. Figures 13, 14, and 15 show the voltage, current, and temperature readings of the static
solar panel, and Figures 16, 17, and 18 show those of the sun-tracking floating solar panel in the
ThingSpeak IoT platform; readings were taken every 15 minutes from 8:00 AM to 5:00 PM. The results
show that the floating sun-tracking solar panel is more efficient than the static solar panel.
Fig.19 shows the final output voltage displayed on the LCD. The power rating of the solar panel used
in the prototype is 5 W, 12 V, which is useful for charging small electronic devices. The floating
solar panel generates 25-35% more energy than a rooftop solar panel. The panel used in the prototype
has an area of 27 × 19 sq. cm and generates a maximum of 11 volts, while a standard rooftop solar
panel with the same power rating generates a maximum of 9 volts.
Fig.13(a) Graphical representation of Voltage for Static Solar Panel in Thing Speak (b)Graphical representation of Current for Static Solar
Panel in Thing Speak (c)Graphical representation of Temperature for Static Solar Panel in Thing Speak (d) Graphical representation of
Voltage for Floating Sun Tracking Solar Panel in Thing Speak (e) Graphical representation of Current for Floating Sun Tracking Solar
Panel in Thing Speak (f) Graphical representation of Temperature for Floating Sun Tracking Solar Panel in Thing Speak
Table.6 shows the real-time statistical data of the prototype floating sun-tracking solar panel. The
current, voltage, and temperature readings were taken continuously from 8:00 AM to 5:00 PM. As the
sun moves, the LDR sensor that receives the maximum light drives the servo motor, which changes the
position of the solar panel so that it absorbs the maximum energy.
TABLE VI. REAL TIME STATISTICAL DATA OF FLOATING SUN TRACKING SOLAR PANEL AFTER TESTING
Panel position w.r.t. sun | Time | Temperature (°C) | Current (mA) | Voltage (V)
30° | 8:00 AM | 27 | 193 | 9
30° | 9:00 AM | 28 | 199 | 10
60° | 11:00 AM | 32 | 203 | 11
60° | 12:00 PM | 35 | 260 | 18
60° | 1:00 PM | 34 | 221 | 13
60° | 2:00 PM | 33 | 234 | 15
90° | 4:00 PM | 31 | 203 | 11
90° | 5:00 PM | 30 | 198 | 10
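A derived quantity not reported in the table: the instantaneous output power can be computed from each logged voltage/current pair, e.g.:

```python
def output_power_mw(voltage_v, current_ma):
    """Instantaneous panel output power in milliwatts, computed
    from the logged voltage (V) and current (mA) readings."""
    return voltage_v * current_ma

# Noon reading from Table VI: 18 V at 260 mA
print(output_power_mw(18, 260))  # -> 4680 (mW)
```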
VII. CONCLUSION
The concept of a floating sun-tracking solar panel is neoteric. In this study, we provide an
easy-to-understand explanation of the solar tracking mechanism used to increase solar energy gain,
and we discuss how inexpensive it is to operate and maintain a solar tracker. The tracking system
points the solar panel towards the sun to produce the maximum amount of energy, while the floating
system cools the panel, which heats continuously due to sun tracking. The cooling dissipates the heat
absorbed by the panel so that it works efficiently and produces more energy than standard roof-based
solar panels. For a standard 5 W panel, the floating solar panel generates a maximum voltage of 11
volts, while a rooftop solar panel generates a maximum of 9 volts.
FUTURE SCOPE
Since renewable energy resources are free, proper management is needed, and we need to discover more
technologies for energy production from these free resources. Other types of sensors, as well as
solar panels deployed in water reservoirs and lakes, may perform better than the traditional
paradigm. For maximum absorption of light from the sun, anti-reflective coatings can be applied to
the solar panel; these eliminate destructive interference of the incident light waves, so the
maximum amount of light is transmitted to the solar panel, increasing energy production.
REFERENCES
[1] A. Sahu, N. Yadav and K. Sudhakar, "Floating photovoltaic power plant: A review," Renewable and Sustainable
Energy Reviews, vol. 66, pp. 815-824, 2016.
[2] Sriwirote.B, Noppakant.A & Pothisaran, "Increasing efficiency of an electricity production system from solar energy
with a method of reducing solar panel temperature," International Conference on Applied System Innovation (ICASI),
pp.1-3, 2017.
[3] R. Chowdary, M. A. Aowal, and Rehman. A , "Floating Solar Photovoltaic System: An Overview and their Feasibility at
Kaptai in Rangamati," IEEE International Power and Renewable Energy Conference, pp.1-5, 2020.
[4] D. Mital, B.Saxena and K. V. S. Rao, "Floating solar photovoltaic systems: An overview and their feasibility at Kota in
Rajasthan," International Conference on Circuit, Power, and Computing Technologies (ICCPCT), pp.1-7, 2017.
[5] Y. Bikrat, A. Benlghazi, and D. Moussaid, "A Photovoltaic Wireless Monitoring System," International
Symposium on Advanced Electrical and Communication Technologies (ISAECT), pp. 1-5, 2018.
[6] D. Tukymbekov, M. Nurgaliyev, N. Kuttybay, Y. Nalibayev and G. Dosymbetova, "Intelligent energy efficient street
lighting system with predictive energy consumption," International Conference on Smart Energy Systems and
Technologies (SEST), pp. 1-5, 2019.
[7] F. I. Musthaffa, S. Shakhir, F. F. Musthaffa and A. T. Naif, "Simple design and implementation of solar
tracking system two axis with four sensors for Baghdad city," 9th International Renewable Energy
Congress(IREC), pp.1-5, 2018.
[8] A. Mondhal, M. J. Alih and P. Dutta, "IoT Enabled Smart Solar Panel Monitoring System Based on Boltuino
Platform," IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS), pp. 1-7, 2022.
[9] C. -Y. Yang, C. -Y. Hsieh, F. -K. Feng and K. -H. Chen, "Highly Efficient Analog Maximum Power Point
Tracking (AMPPT) in a Photovoltaic System," in IEEE Transactions on Circuits and Systems I: Regular
Papers, vol. 59, no. 7, pp. 1546-1556, 2012.
[10] T. Kaur, S. Mahajan, S. Verma, Priyanka and J. Gambhir, "Arduino based low-cost active dual axis solar tracker, "IEEE
1st International Conference on Power Electronics, Intelligent Control and Energy Systems (ICPEICES), pp. 1-5,
2016.
[11] M. S. S. K, V. K. R, and J. R, "Simulation and Implementation of Dual-Axis Solar Tracker with PV Panel for Domestic
loads, "13th IEEE PES Asia Pacific Power & Energy Engineering Conference (APPEEC), pp.1-6, 2021.
Grenze International Journal of Engineering and Technology, June Issue
Abstract—This paper presents a Grey Wolf Optimization (GWO) algorithm for doubly fed
induction generator (DFIG)-based wind turbine generators, in order to address the system
operability challenges that have arisen from the continuous reduction of system inertia
caused by the increasing penetration of renewable power generation. The GWO algorithm
makes it possible for individual DFIG generators to contribute an efficient inertial
response. This response helps to stabilize the rate of change of frequency and
minimizes large frequency deviations when a disturbance occurs. The DC voltage of the
DFIG runs at different levels in accordance with changes in the inertia constant, to facilitate
energy exchange with the associated AC grid. Additionally, the standard control system for the
DFIG has been updated to accommodate the implementation of the GWO algorithm. The
proposed model uses GWO to evaluate optimum values of the inertia constant, which
assists in improving output power efficiency. Concerning the DFIG, practical challenges
such as maximizing active power while minimizing reactive power are examined, and pertinent
solutions are offered for a variety of cases.
Index Terms— Grey Wolf Optimization, Power System, DFIG, Wind, Inertia, Load Frequency
Control.
I. INTRODUCTION
In contrast to fixed-speed machines, where active and reactive power control is not independent,
DFIG-based wind turbines are the preferred option for network operators [1]. The turning speeds of
conventional wind turbines are fixed; DFIG technology, on the other hand, enables wind turbines to
operate over a broad range of speeds. A back-to-back converter is attached to the rotor of the DFIG,
and its function is to supply the rotor with currents of varying frequencies in order to obtain the
required rotor speeds. This combination of a back-to-back converter controller with a DFIG wind
turbine can be used to generate electricity [2], and it governs the dynamic response of the DFIG to
fluctuations in wind speed as well as the process of turbine braking. The power that the wind carries
in the form of kinetic energy [3, 4], represented by the symbol Pv, is given by (1),
Pv = (1/2) ρ A Vv^3 (1)
where Vv represents the average wind speed in the area, A = π R^2 with R the rotor blade radius (blade
length), and ρ the density of air at the site. The power recovered by the wind turbine can be
represented via (2),
Pt = Cp (1/2) ρ A Vv^3 = Cp Pv (2)
The power coefficient Cp is a dimensionless number used to express how well a wind turbine transforms
the kinetic energy of the wind into usable mechanical energy; the turbine's power output is what is
used to measure its efficiency. This coefficient varies as a function of the wind speed, the rotor
speed, and the pitch angle [5, 6]. In the proposed model of a wind turbine with DFIG, the rotor blade
length is set to R = 50 meters and the air density to ρ = 1.225 kilograms per cubic meter. The pitch
angle (β) is adjusted automatically to guarantee the variation of Cp shown in "Fig. 1". Cp is maximum
when β is zero; for other values of β, the output power of the turbine is lower [16].
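Equations (1) and (2) with the paper's parameters (R = 50 m, ρ = 1.225 kg/m³) can be checked numerically. The Cp value below is an assumed example, since in practice Cp must be read from the Cp(λ, β) curve:

```python
import math

def wind_power(v, rho=1.225, radius=50.0):
    """Kinetic power Pv (W) carried by the wind through the rotor
    disc, eq. (1): Pv = 0.5 * rho * A * Vv^3 with A = pi * R^2."""
    area = math.pi * radius ** 2
    return 0.5 * rho * area * v ** 3

def turbine_power(v, cp, rho=1.225, radius=50.0):
    """Power recovered by the turbine, eq. (2): Pt = Cp * Pv."""
    return cp * wind_power(v, rho, radius)

# 10 m/s wind with an assumed power coefficient Cp = 0.45
pt = turbine_power(10.0, 0.45)
print(round(pt / 1e6, 2), "MW")
```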
One of several DFIG controls may be active at any one time, and the one used is determined by the
zone in which the machine is currently running. After investigating the performance of different
models [7, 8, 9], it can be observed that existing models achieve high efficiency for control of DFIG
operations. At the same time, the models now in use [10, 11, 12] are either very sophisticated or do
not incorporate a significant amount of control aimed at maintaining constant output levels. One
possible solution to these problems is described in Section 3, titled "Design of a Machine Learning
technique for increasing Inertia constant Selection in Wind Farm installations." The proposed model
is examined in Section 4, where its results are compared with those of previously executed
DFIG-based deployments. The paper concludes with a number of in-depth observations on the model,
as well as suggestions for optimization models that may enhance its performance in a variety of use
scenarios.
II. PROPOSED GWO ALGORITHM FOR IMPROVING INERTIA CONSTANT SELECTION IN WIND FARM DEPLOYMENT
After referring to existing DFIG-based control models [13, 14, 15], it was observed that existing
models do not use stochastic optimizations, which limits their applicability in real-time use cases.
To overcome this limitation, the proposed Grey Wolf Optimization (GWO) algorithm for estimating the
inertia constant of DFIG-based wind turbines is discussed in this text. The algorithm works via the
following process,
Initialize the following parameters,
Total wolves existing in the model (N_w)
Total iterations for which the model will be evaluated (N_i)
Learning rate for the model (L_r)
Initialize all wolves to be ‘Delta’, and evaluate them for each iteration via the following process,
If the Wolf is currently marked as ‘Delta’, then process it, else go to the next wolf in sequence
To process a Wolf, generate its internal configuration via the following process,
Stochastically generate an inertial constant via (3),
Hs = STOCH(0, 1) (3)
Where, Hs represents the inertial constant, and STOCH indicates a stochastic process that generates
numbers in the given range.
Based on this value of Hs, simulate the model, and estimate its fitness via (4),
f = Pactive / Preactive (4)
Where, Pactive represents the active power at the output of the model, while Preactive represents the
output reactive power level.
Evaluate the fitness of all Wolves, and then estimate the fitness threshold via (5),
fth = (Lr / Nw) ∑ fi (5)
At the end of each iteration, re-evaluate all Wolves via the process shown in "Fig. 2",
Mark the Wolf as 'Alpha' if f > 2 ∗ fth (6)
Mark the Wolf as 'Beta' if f > fth (7)
Mark the Wolf as 'Gamma' if f > LW ∗ fth (8)
Else, mark the Wolf as 'Delta' if, for this configuration, f < fth (9)
Repeat this process for all iterations, and then select the 'Alpha' Wolf with the maximum fitness
level. Due to the selection of the Wolf with maximum fitness, active power is increased while
reactive power is reduced at the output, which assists in improving circuit efficiency. This is
advantageous because an excessive quantity of reactive power may cause components to overheat,
significantly reducing the equipment's lifetime.
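The loop above can be sketched as follows; this is a simplified stand-in, not the authors' implementation: the DFIG simulation is replaced by a toy function, fitness is read as the active-to-reactive power ratio of eq. (4), and the mean fitness serves as the threshold of eq. (5):

```python
import random

def gwo_select_inertia(simulate, n_wolves=20, n_iter=30, lw=0.5, seed=1):
    """Simplified sketch of the wolf-marking search described above.

    `simulate` maps an inertia constant Hs in (0, 1) to a pair
    (P_active, P_reactive). The mean-fitness threshold and `lw`
    (standing in for LW in eq. (8)) are illustrative choices.
    """
    rng = random.Random(seed)
    best_hs, best_f = None, float("-inf")
    for _ in range(n_iter):
        # Delta wolves are re-drawn stochastically each pass, eq. (3)
        wolves = [rng.uniform(0.0, 1.0) for _ in range(n_wolves)]
        fitness = []
        for hs in wolves:
            p_active, p_reactive = simulate(hs)
            fitness.append(p_active / p_reactive)       # eq. (4)
        fth = sum(fitness) / len(fitness)               # threshold
        for hs, f in zip(wolves, fitness):
            if f > 2 * fth:
                mark = "Alpha"                          # eq. (6)
            elif f > fth:
                mark = "Beta"                           # eq. (7)
            elif f > lw * fth:
                mark = "Gamma"                          # eq. (8)
            else:
                mark = "Delta"                          # eq. (9)
            # Track the best above-threshold wolf across iterations
            if mark in ("Alpha", "Beta") and f > best_f:
                best_hs, best_f = hs, f
    return best_hs, best_f

# Toy stand-in for the DFIG simulation: active power grows with Hs,
# reactive power is smallest near Hs = 0.6 (purely illustrative).
def toy_simulate(hs):
    return 1.0 + hs, 0.5 + (hs - 0.6) ** 2

print(gwo_select_inertia(toy_simulate))
```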
If power quality standards and regulations are not adhered to, the result may be unanticipated
shutdowns, power losses, and even fines, in addition to possible blackouts. Thus, the efficiency of
DFIG-based wind systems is improved by selecting proper inertia constants. This efficiency is
evaluated for different models in the next section of this text.
Where, P(Out) & R(Out) represent the active and reactive power outputs for N different circuit
reading iterations.
Outputs were obtained for the 3-phase voltage across the 575 V grid (Vabc_575), 3-phase current
across the 575 V grid (Iabc_575), active power (P), reactive power (Q), 3-phase voltage across the
25 kV grid (Vabc_25), and 3-phase current across the 25 kV grid (Iabc_25). These waveforms can be
observed in "Fig. 4". Based on these readings, the power efficiency levels were evaluated via (10)
for the circuit with GWO and without GWO and tabulated in Table 1, which represents circuit
performance under different simulation instances. Based on these results and "Fig. 5", it can be
observed that the proposed model improves the power efficiency by 8.5% after application of GWO,
which makes it useful for a wide variety of real-time simulation use cases. Due to these advantages,
the proposed model is useful for improving power efficiency for different DFIG-based wind farms.
IV. CONCLUSION
In this paper, the optimal value of the inertia constant for a DFIG wind farm has been obtained for
different loading conditions using GWO optimization. Because of this, the proposed model is able to
enhance power efficiency by 8.5% after the application of GWO, which allows it to be useful for a
wide variety of real-time simulation use cases. It is possible that the model that has been
Figure 4. Output voltage & current levels for different component
Figure 5. Power efficiency (η%) versus time in seconds, without GWO and with GWO
created may be successful in boosting the power efficiency of a variety of DFIG-based wind farms as a result of
these advantages.
REFERENCES
[1] S. Huang, Q. Wu, Y. Guo and F. Rong, "Hierarchical Active Power Control of DFIG-Based Wind Farm With
Distributed Energy Storage Systems Based on ADMM," in IEEE Transactions on Sustainable Energy, vol. 11, no. 3, pp.
1528-1538, July 2020, doi: 10.1109/TSTE.2019.2929820.
[2] Z. Dong, Z. Li, L. Du, Y. Liu and Z. Ding, "Coordination Strategy of Large-Scale DFIG-Based Wind Farm for Voltage
Support With High Converter Capacity Utilization," in IEEE Transactions on Sustainable Energy, vol. 12, no. 2, pp.
1416-1425, April 2021, doi: 10.1109/TSTE.2020.3047273.
[3] N. Shabanikia, A. A. Nia, A. Tabesh and S. A. Khajehoddin, "Weighted Dynamic Aggregation Modeling of Induction
Machine-Based Wind Farms," in IEEE Transactions on Sustainable Energy, vol. 12, no. 3, pp. 1604-1614, July 2021,
doi: 10.1109/TSTE.2021.3057854.
[4] B. Liu et al., "Impedance Modeling of DFIG Wind Farms With Various Rotor Speeds and Frequency Coupling," in
IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 68, no. 1, pp. 406-410, Jan. 2021, doi:
10.1109/TCSII.2020.2997927.
[5] X. Wang, H. Yu, Y. Lin, Z. Zhang and X. Gong, "Dynamic Equivalent Modeling for Wind Farms With DFIGs Using
the Artificial Bee Colony With K-Means Algorithm," in IEEE Access, vol. 8, pp. 173723-173731, 2020, doi:
10.1109/ACCESS.2020.3024212.
[6] Y. Zhang, C. Klabunde and M. Wolter, "Frequency-Coupled Impedance Modeling and Resonance Analysis of DFIG-
Based Offshore Wind Farm With HVDC Connection," in IEEE Access, vol. 8, pp. 147880-147894, 2020, doi:
10.1109/ACCESS.2020.3015614.
[7] M. Wang et al., "Impedance Modeling and Stability Analysis of DFIG Wind Farm With LCC-HVDC Transmission," in
IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 12, no. 1, pp. 7-19, March 2022, doi:
10.1109/JETCAS.2022.3144999.
[8] K. Sun, W. Yao, J. Fang, X. Ai, J. Wen and S. Cheng, "Impedance Modeling and Stability Analysis of Grid-Connected
DFIG-Based Wind Farm With a VSC-HVDC," in IEEE Journal of Emerging and Selected Topics in Power Electronics,
vol. 8, no. 2, pp. 1375-1390, June 2020, doi: 10.1109/JESTPE.2019.2901747.
[9] J. Shair, X. Xie, J. Yang, J. Li and H. Li, "Adaptive Damping Control of Sub synchronous Oscillation in DFIG-Based
Wind Farms Connected to Series-Compensated Network," in IEEE Transactions on Power Delivery, vol. 37, no. 2, pp.
1036-1049, April 2022, doi: 10.1109/TPWRD.2021.3076053.
[10] J. Liu et al., "Impact of Power Grid Strength and PLL Parameters on Stability of Grid-Connected DFIG Wind Farm," in
IEEE Transactions on Sustainable Energy, vol. 11, no. 1, pp. 545-557, Jan. 2020, doi: 10.1109/TSTE.2019.2897596.
[11] L. M. Castro and E. Acha, "On the Dynamic Modeling of Marine VSC-HVDC Power Grids Including Offshore Wind
Farms," in IEEE Transactions on Sustainable Energy, vol. 11, no. 4, pp. 2889-2900, Oct. 2020, doi:
10.1109/TSTE.2020.2980970.
[12] Y. Wu and P. Zhang, "Online Monitoring for Power Cables in DFIG-Based Wind Farms Using High-Frequency
Resonance Analysis," in IEEE Transactions on Sustainable Energy, vol. 13, no. 1, pp. 378-390, Jan. 2022, doi:
10.1109/TSTE.2021.3113017.
[13] R. Venkateswaran and Y. H. Joo, "Retarded Sampled-Data Control Design for Interconnected Power System With
DFIG-Based Wind Farm: LMI Approach," in IEEE Transactions on Cybernetics, vol. 52, no. 7, pp. 5767-5777, July
2022, doi: 10.1109/TCYB.2020.3042543.
[14] N. Tong et al., "Coordinated Sequential Control of Individual Generators for Large-Scale DFIG-Based Wind Farms," in
IEEE Transactions on Sustainable Energy, vol. 11, no. 3, pp. 1679-1692, July 2020, doi: 10.1109/TSTE.2019.2936757.
[15] H. Dong, M. Su, K. Liu and W. Zou, "Mitigation Strategy of Sub synchronous Oscillation Based on Fractional-Order
Sliding Mode Control for VSC-MTDC Systems With DFIG-Based Wind Farm Access," in IEEE Access, vol. 8, pp.
209242-209250, 2020, doi: 10.1109/ACCESS.2020.3038665.
[16] F. Z. Naama et al., "Model and simulation of wind turbine and its associated Permanent Magnet Synchronous
Generator," Energy Procedia, pp. 1-10, 2019.
Abstract—Emotions cause different physical, behavioural and cognitive changes in the human
body. Emotions can be positive or negative. Negative emotion is the experience of negative
feelings such as anger, frustration, panic, stress and fear. Because these negative emotions can
cause severe health problems, there is a need to detect them; doing so can help improve human
health. As these emotions change various physiological parameters such as heart rate, skin
temperature, blood pressure and skin conductance, these signals can be used to detect a person's
emotions. The signals are generated by the body during the functioning of various physiological
systems, so they cannot be regulated artificially; for this reason, they are a reliable source for
the detection of such information, and physiological signals are among the most important factors
in the field of emotion detection. Changes in these signals exhibit characteristics that are used
to estimate emotions. This work focuses on building a better model of negative emotion detection
for the Typically Developed group using a machine learning approach based on the
Electrocardiogram (ECG) signal. The study was conducted on the DECAF database for the typically
developed group, extracting the relevant features from both ECG and HRV signals and identifying
which contributes more towards negative emotion detection. A machine learning model was developed
for the typically developed group using db4 as the mother wavelet for feature extraction. The
significant features of ECG and HRV were then classified separately using logistic regression,
ensemble and support vector machine classifiers. The logistic regression classifier achieved the
maximum accuracy using HRV data for the typically developed (TD) group.
Index Terms— ECG, negative emotion detection, DWT, machine learning.
I. INTRODUCTION
Emotion is a state of thought that arises spontaneously and is accompanied by physiological changes. Emotion is
made up of three parts: a subjective component that defines how we feel emotions, a physiological component
that describes how our bodies react to emotions, and an expressive component that reflects the human reaction to
each emotion. External motivations, thoughts, and changes in interior feelings are all referred to as emotion.
Emotion recognition has become a vast field of study in cognitive science, engineering, and psychology.
In psychology, emotion detection was used to comprehend the feelings of persons being counselled. It
DECAF [1] is a database containing the physiological responses to different emotions elicited in 30
subjects using 36 movie clips and 40 one-minute music video segments. Different signals, such as MEG,
horizontal electrooculogram (hEOG), ECG, trapezius electromyogram (tEMG) and near-infrared facial
videos, were recorded synchronously.
III. METHODOLOGY
The work focuses on extracting the relevant features from both ECG and HRV signals, and then
identifying which contributes more towards negative emotion detection. A machine learning model was
developed for the typically developed group using db4 as the mother wavelet for feature extraction.
The significant features of ECG and HRV were then classified separately using logistic regression,
ensemble, support vector machine and k-nearest neighbor classifiers. ECG data for various emotions
were collected from DECAF [1]. In this work, the ECG signals corresponding to two different emotions,
happy and sad, are considered. The block diagram is shown in Fig. 1.
A. ECG Data
ECG data was taken from the DECAF database [1], a multimodal database containing physiological
responses to various emotions. The emotions are elicited using 24 one-minute movie segments and 40
one-minute music segments, and the experiment was carried out on 30 healthy subjects. In this work,
only the ECG signal in response to the one-minute movie segments for the emotions happy and sad was
analysed.
B. Pre-Processing
Pre-processing is done to improve the quality of the signals by removing noise. These noises include power-line
interference, baseline wander and high-frequency noise [6]. This improves the quality of the negative
emotion detection method. Baseline wander is a low-frequency noise that arises from breathing, the electrodes
attached to the body, or subject movement. It occurs in the frequency range of 0.5 to 0.6 Hz. Baseline wander can
cause the amplitude of the QRS complex to increase significantly. A wavelet-based approach is well suited for
removing baseline wander from ECG signals. The DWT-based method uses high-level decomposition to eliminate the
low-frequency components corresponding to the baseline variation. DWT was performed using Daubechies (db8) as
the mother wavelet because of the similarity of the wavelet function to the shape of the ECG signal [7]. High-
frequency noise due to power-line interference was then removed using a 6th-order low-pass Butterworth filter
with a cut-off frequency of 50 Hz, since in India the power-line frequency is 50 Hz. After noise removal, heart
rate variability (HRV) was derived; it refers to the variation of the time interval between successive heartbeats.
Fig. 2 and Fig. 3 show the raw and corresponding pre-processed signals for the happy emotion; Fig. 4 and Fig. 5
show the raw and corresponding pre-processed signals for the sad emotion.
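The pre-processing chain described above (DWT-based baseline removal with a db8 wavelet, followed by a 6th-order low-pass Butterworth filter with a 50 Hz cut-off) can be sketched as follows. The sampling rate and the number of decomposition levels used for the baseline estimate are illustrative assumptions, since the text does not specify them:

```python
import numpy as np
import pywt
from scipy.signal import butter, filtfilt

def remove_baseline_wander(ecg, wavelet="db8", level=9):
    """Estimate and remove baseline wander by zeroing the top-level
    approximation coefficients of a DWT decomposition.
    The decomposition level (9) is an assumed choice; the paper only
    states that a high-level db8 decomposition is used."""
    coeffs = pywt.wavedec(ecg, wavelet, level=level)
    coeffs[0] = np.zeros_like(coeffs[0])  # approximation = low-frequency drift
    return pywt.waverec(coeffs, wavelet)[: len(ecg)]

def lowpass_filter(ecg, fs, cutoff=50.0, order=6):
    """6th-order Butterworth low-pass with a 50 Hz cut-off, as in the text."""
    b, a = butter(order, cutoff / (fs / 2.0), btype="low")
    return filtfilt(b, a, ecg)  # zero-phase filtering avoids QRS distortion
```

A usage sketch: `clean = lowpass_filter(remove_baseline_wander(ecg), fs)` for a raw one-minute segment sampled at `fs` Hz.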
Fig. 3. Pre-processed ECG signal containing happiness of TD group
C. Feature extraction
Here, features are extracted from the ECG and HRV signals to capture their emotional content. Feature
extraction reduces the redundant data present in the signal and thereby helps to obtain useful information
from it. Different feature extraction techniques are used in the literature [8]; here the Discrete Wavelet
Transform (DWT) was used to extract the features.
DWT makes use of a mother wavelet, a single prototype function used to decompose the input signal. The
decomposition relies on scaling and shifting of this wavelet to derive the frequency sub-bands of the input signal.
Fig. 4. Raw ECG signal containing sadness of TD group
Fig. 5. Pre-processed ECG signal containing sadness of TD group
DWT decomposes the original signal into approximation and detail coefficients with the help of a low-pass filter
(LPF) and a high-pass filter (HPF). The output of the LPF is known as the approximation coefficients and the
output of the HPF is known as the detail coefficients. The LPF output is again applied to an HPF and an LPF,
which forms the second decomposition level. In this study a 14-level decomposition is done, because the
emotional content is present in both the low-frequency and high-frequency bands [10]. The detail coefficients
from the 11th to the 14th level are used for extracting the various features.
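The 14-level decomposition described above can be sketched with PyWavelets, assuming the db4 mother wavelet used for feature extraction in this work:

```python
import numpy as np
import pywt

def detail_bands(signal, wavelet="db4", level=14):
    """Decompose the signal into `level` DWT levels and return the
    detail coefficients d1..d14 keyed by name; d11-d14 carry the
    low-frequency sub-bands used for feature extraction in the text."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # coeffs = [cA14, cD14, cD13, ..., cD1]
    return {f"d{level - i + 1}": c for i, c in enumerate(coeffs[1:], start=1)}
```

For a 14-level db4 decomposition without boundary warnings the input should be long enough (roughly 2^14 times the wavelet support), which a full ECG recording easily satisfies.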
The wavelet transform makes use of mother wavelets. Different wavelets include the Daubechies (db) wavelet,
Haar wavelet, Symlet wavelet, Coiflet wavelet, etc. Daubechies wavelets are orthogonal wavelets characterized
by a maximum number of vanishing moments for a given support length. These wavelets are denoted dbN, where N
represents the order of the wavelet; usually N varies from 1 to 8. In this work, the analyses were carried out
using the db4 mother wavelet. ECG and HRV features were extracted for negative emotion detection using the two
emotions. All the features extracted from the ECG and HRV data are listed in Table I.
In addition, time-domain features of HRV are also considered: the mean R-R interval (meanRR), the root mean
square of successive R-R interval differences (RMSSD), the number of successive R-R intervals that differ by
more than 50 milliseconds (NN50), the percentage of successive R-R intervals that differ by more than 50 ms
(pNN50), the standard deviation of R-R intervals (SD RR), and the standard deviation of heart rate (SD HR).
TABLE I. FEATURES EXTRACTED FROM ECG AND HRV

Sl no.  Feature        Description
1       max            Maximum value of the signal in each level
2       min            Minimum value of the signal in each level
3       mean           Mean value of the signal in each level
4       median         Median value of the signal in each level
5       std            Standard deviation of the signal in each level
6       mad            Mean absolute deviation of the signal in each level
7       range          Range of the signal in each level
8       power          Power of the signal in each level
9       L1 norm        L1 norm of the signal in each level
10      L2 norm        L2 norm of the signal in each level
11      kurtosis       Kurtosis value of the signal in each level
12      entropy        Entropy value of the signal in each level
13      skewness       Skewness value of the signal in each level
14      HF power       Sum of the power of levels 11 and 12
15      LF power       Sum of the power of levels 13 and 14
16      LF power norm  LF power/(LF power + HF power)
17      HF power norm  HF power/(LF power + HF power)
18      total power    HF power + LF power
19      ratio          Ratio of HF power to LF power
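The per-level statistics of Table I can be sketched as below. The entropy definition is not stated in the text, so Shannon entropy of the normalized coefficient energies is assumed here, and kurtosis is taken as the raw fourth standardized moment:

```python
import numpy as np

def level_features(c):
    """Statistical features of one DWT coefficient level (Table I, rows 1-13).
    Entropy definition (Shannon entropy of normalized squared coefficients)
    is an assumption; the paper does not specify it."""
    c = np.asarray(c, dtype=float)
    mu, sd = c.mean(), c.std()
    p = c ** 2 / np.sum(c ** 2)             # normalized energy distribution
    p = p[p > 0]
    return {
        "max": c.max(), "min": c.min(), "mean": mu,
        "median": np.median(c), "std": sd,
        "mad": np.mean(np.abs(c - mu)),     # mean absolute deviation
        "range": c.max() - c.min(),
        "power": np.mean(c ** 2),
        "L1": np.sum(np.abs(c)),
        "L2": np.sqrt(np.sum(c ** 2)),
        "kurtosis": np.mean(((c - mu) / sd) ** 4),   # raw, not excess
        "entropy": -np.sum(p * np.log2(p)),
        "skewness": np.mean(((c - mu) / sd) ** 3),
    }
```

Applying this to each of the detail levels d11-d14 yields feature names like `meand11` and `L1d14`, which is how the significant features in Table III appear to be labelled.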
TABLE III. SIGNIFICANT FEATURES WITH P AND MEAN VALUES FOR HRV

Sl no.  Feature  Sig. value  Mean value (happy)  Mean value (sad)
1       meand11  0.028       618.372             692.67
2       L1d14    0.012       -3.5x10^7           -5.4x10^7
3       NN50     0.000       70.504              70.90
4       SD HR    0.000       26.808              28.004
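The text does not describe the feature-selection step itself, only its outcome (the significance values in Table III). One plausible sketch, assuming an independent-samples t-test between the happy and sad groups with a 0.05 threshold; both the test and the threshold are assumptions, not statements from the paper:

```python
import numpy as np
from scipy.stats import ttest_ind

def select_significant(happy, sad, names, alpha=0.05):
    """Keep features whose happy-vs-sad difference is significant.
    `happy` and `sad` are (subjects x features) arrays; the t-test
    and alpha=0.05 are assumed, as the paper only reports p-values."""
    keep = []
    for j, name in enumerate(names):
        _, p = ttest_ind(happy[:, j], sad[:, j])
        if p < alpha:
            keep.append((name, p))
    return keep
```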
F. Classification
The significant features obtained after feature selection are classified using various machine learning
algorithms. Every machine learning classifier has two phases: a training phase followed by a testing phase.
70% of the total available data is used for training and the model is tested using the remaining 30%. The
classifiers are used to classify the significant features into the emotions happiness and sadness for the
typically developed group. Here three different machine learning models, logistic regression, ensemble and
SVM, are used for negative emotion detection.
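The 70/30 evaluation protocol above can be sketched with scikit-learn. The specific ensemble model (a random forest here) and all hyperparameters are assumptions; the paper does not name them:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

def evaluate(X, y, seed=0):
    """Train logistic regression, an ensemble, and an SVM on a
    70/30 split (as in the text) and return test accuracies.
    Random forest as the 'ensemble' is an assumed choice."""
    Xtr, Xte, ytr, yte = train_test_split(
        X, y, test_size=0.30, random_state=seed, stratify=y)
    models = {
        "logistic": LogisticRegression(max_iter=1000),
        "ensemble": RandomForestClassifier(random_state=seed),
        "svm": SVC(),
    }
    return {name: m.fit(Xtr, ytr).score(Xte, yte) for name, m in models.items()}
```

Here `X` would hold the selected significant features per subject and `y` the binary happy/sad labels.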
Fig. 6. Confusion matrix obtained for Logistic regression using HRV features
V. CONCLUSION
Electrocardiogram signals are an effective way of analyzing human emotions. In this work, negative emotion
detection for the typically developed group was performed using different classification models. The study
included a comparison between ECG and HRV features. It was found that the HRV features contribute more towards
emotion detection. Logistic regression and the ensemble classifier showed better performance than the other
machine learning algorithms for the typically developed group.
REFERENCES
[1] M. K. Abadi, R. Subramanian, S. M. Kia, P. Avesani, I. Patras, and N. Sebe, "DECAF: MEG-based multimodal
database for decoding affective physiological responses", IEEE Transactions on Affective Computing, vol. 6,
no. 3, July-September 2015.
[2] Z. Cheng, L. Shu, J. Xie and C. L. P. Chen, "A novel ECG-based real-time detection method of negative
emotions in wearable applications", 2017 International Conference on Security, Pattern Analysis, and
Cybernetics (SPAC), Shenzhen, 2017, pp. 296-301.
[3] H. Guo, Y. Huang, C. Lin, J. Chien, K. Haraikawa and J. Shieh, "Heart rate variability signal features for
emotion recognition by using principal component analysis and support vectors machine", 2016 IEEE 16th
International Conference on Bioinformatics and Bioengineering (BIBE), Taichung, 2016, pp. 274-277.
[4] M. S. Goodwin, C. A. Mazefsky, S. Ioannidis, D. Erdogmus, and M. Siegel, "Predicting aggression to others
in youth with autism using a wearable biosensor", Autism Research, vol. 12, no. 8, pp. 1286-1296, 2019.
[5] B. Anandhi and S. Jerritta, "Recognition of valence using QRS complex in children with Autism Spectrum
Disorder (ASD)", IOP Conference Series: Materials Science and Engineering, vol. 1070, IOP Publishing, 2021.
[6] P. Chettupuzhakkaran and N. Sindhu, "Emotion recognition from physiological signals using time-frequency
analysis methods", 2018 International Conference on Emerging Trends and Innovations in Engineering and
Technological Research (ICETIETR), pp. 1-5, 2018.
[7] M. Bassiouni, E.-S. El-Dahshan, W. Khalefa, and A.-B. M. Salem, "Intelligent hybrid approaches for human
ECG signals identification", Signal, Image and Video Processing, vol. 12, pp. 941-949, 2018.
[8] L. Shu, J. Xie, M. Yang, Z. Li, Z. Li, Liao, X. Yang, "A review of emotion recognition using physiological
signals", Sensors, vol. 18, no. 7, 2018.
[9] A. Bagirathan, J. Selvaraj, A. Gurusamy, and H. Das, "Recognition of positive and negative valence states
in children with autism spectrum disorder (ASD) using discrete wavelet transform (DWT) analysis of
electrocardiogram (ECG) signals", Journal of Ambient Intelligence and Humanized Computing, vol. 12, no. 1,
pp. 405-416, 2021.
[10] M. Murugappan, S. Murugappan, and B. S. Zheng, "Frequency band analysis of electrocardiogram (ECG)
signals for human emotional state classification using discrete wavelet transform (DWT)", Journal of Physical
Therapy Science, vol. 25, no. 7, pp. 753-759, 2013.
Author Index

A
Aarya Pawar 7
Abdul Rahman 231
Abhishek Kajal 155
Abinanda P 110
Aditya Tripathi 178
Ajay U Surwade 61
Ali Albkhrani 269
Amol Dhakne 116
Anjani Pujitha PSVL 295
Anusha N 282
Arjyadhara Pradhan 254
Arshad Ali 178
Arun Kumar Dash 217
Aswini J 217
Ayushi Agarwal 14, 23

B
Babita Panda 254
Bachu Munideepika 50
Bharti W Gawali 244, 269
Biplab Bag 130
Brijesh Prasad 290
Brunda U 140

C
Chandrasekhar D 276
Chitralekha Jena 254

D
Danish Tamboli 72
Deepak Nandal 77
Deepesh Bhati 303
Deepti Jagyasi 148
Dhyaneshwaran J 29

F
Farrel Deva Asir J 29

G
Gayathri G 110
Gopi Sivanadh D V S 295

H
Halkarnikar P P 116
Harish Kumar 93
Harshavardhan J 295
Hemant Singh Pokhariya 290
Himanshu Pal 290

J
Jayadeep K 223
Jerritta S 309

K
Kalaiselvi K 140
Kanchan Shelke 174
Karthick Myilvahanan J 33
Karthik A 295
Kavya K 276
Khandagale H P 116
Kirti Thakur 93
Krishnaveni A 33

L
Lakhan Jadhav 185
Lipika Nanda 254

M
Madhukumar Patnala 50
Malarvizhi N 217
Mani Barathi S P S 237
Manikanta V 217
Manne Sowmya 223
Mansi Patil 102
Mayur Gaikwad 185
Minakshi M Sonawane 269
Mohana Sundaram N 33
Mohit Patil 67
Mrityunjaya Kappali 260

N
Nachiket Joshi 55
Nallapothula Sreenivasulu 50
Neelam Chandolikar 102

Y
Yashwanth Krishna A 282
Yashwanth M 163

Z
Zeba Khan 231