
Disease Prediction Using Machine Learning

Predicting disease at an early stage is critical, and the hardest part is predicting the illness correctly. The prediction is based on an individual's symptoms. The presented model can act as a digital doctor for disease prediction, helping to diagnose the disease in time so that the person can take immediate measures. The model predicts potential ailments with high accuracy.

ISSN (online) 2583-455X

BOHR International Journal of Computer Science


2022, Vol. 2, No. 1, pp. 58–61
https://doi.org/10.54646/bijcs.011
www.bohrpub.com

Disease Prediction Using Machine Learning


G. Vasu Sena, K. Rajinikanth, Mohammed Khaja Faizan∗ and D. Rohit Rajan

Student, Computer Science and Engineering, Gokaraju Rangaraju Institute of Engineering and Technology
∗ Corresponding author: [email protected]

Abstract. Predicting disease at an early stage is critical, and the hardest part is predicting the illness
correctly. The prediction is based on an individual's symptoms. The presented model can act as a digital
doctor for disease prediction, helping to diagnose the disease in time so that the person can take
immediate measures. The model predicts potential ailments with high accuracy. The work was tested with
four machine learning algorithms, and Random Forest gave the best accuracy.
Keywords: Machine Learning, Random Forest, Disease Prediction, Naïve Bayes.

INTRODUCTION

The main goal of our project is to provide the name of a disease from the symptoms entered by a user or patient. Because so many services are now available on the internet, we chose to predict the disease from symptoms that the user submits online. It is an interactive system, and the user has to provide at least 2 of the symptoms they are suffering from.
The system responds through a graphical user interface (GUI) so that it feels like a live interaction. This type of disease prediction system can be built with machine learning algorithms, as well as artificial intelligence algorithms, to enquire, identify, and respond to the user.

Random Forest Algorithm

1. Random forest randomly selects k records from a dataset of m records.
2. A separate decision tree is created for each sample.
3. Each decision tree produces an output.
4. The final result is obtained by majority voting for classification and by averaging for regression.

Random forest is considered one of the most effective algorithms for classification.

Decision Tree Algorithm

Decision trees are commonly employed for classification. A decision tree is a classifier with a tree structure in which internal nodes represent features and branches represent decision rules. The decision tree has two types of nodes: decision nodes, where a test is made on the dataset's properties, and leaf nodes, which hold the outcome.

Naïve Bayes Algorithm

The Naïve Bayes algorithm is used for both binary and multiclass classification. It is simple, easy to understand, and gives good results on a wide range of problems. It is based on Bayes' theorem:

P(class1 | data1) = (P(data1 | class1) × P(class1)) / P(data1)

With this rule, we can calculate the probability that a piece of data belongs to a given class.

K-Nearest Neighbour (KNN)

The algorithm finds a pattern that links the data to the results, which helps improve recognition with each iteration. It involves the following steps:

1. Load the required data.
2. Calculate the distance between points, typically the Euclidean distance.
3. Select the k points with the smallest distances; the most common class among them is the prediction.

Python was chosen for a variety of reasons, although the best choice of language depends on one's perspective and background. Python is one of the most well-known programming languages and one of the easiest to learn. Its syntax is simple and reads much like plain English. Python is a high-level language with a built-in garbage collector that frees the memory held by objects that the code no longer uses.

Bits and Pieces together

This approach can reuse work that has already been done by taking it as a starting point. All the information from the completed work can be combined.

Equations

The equations should be inserted in an editable format from the equation editor.

\[ f(x) = a_0 + \sum_{n=1}^{\infty} \left( b_n \cos\frac{n\pi x}{s} + c_n \sin\frac{n\pi x}{s} \right) \]

METHODOLOGY

A methodology is a representation of a system's structure, behavior, and other features. A system architecture is made up of system components and subsystems that interact to form the whole. An architecture diagram is used to abstract the overall structure of a software system and to define the constraints, linkages, and boundaries between components. The methodology of the work is shown in Figure 1.

Figure 1. Work flow.

Python has many applications. Some of them are the following:

• Web Development
Many web development projects use Python because it offers many frameworks that make the work easier and simpler.
• Data Science
Data science involves many stages, such as data mining, data sorting, and data processing. Python provides built-in functions that make these stages easier and simpler to work with.
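As a rough illustration of the Naïve Bayes formula P(class1|data1) = (P(data1|class1) × P(class1)) / P(data1), the sketch below works through one calculation. The probabilities are made-up values chosen only for the example, and the names `posterior`, `p_flu`, and `p_fever_given_flu` are hypothetical; none of them come from the paper's dataset.

```python
def posterior(prior, likelihood, evidence):
    """Bayes' rule: P(class | data) = P(data | class) * P(class) / P(data)."""
    if evidence <= 0:
        raise ValueError("P(data) must be positive")
    return likelihood * prior / evidence

# Hypothetical numbers, for illustration only (not from the paper).
p_flu = 0.10                # P(class1): prior probability of the disease
p_fever_given_flu = 0.80    # P(data1 | class1): symptom given the disease
p_fever_given_other = 0.20  # P(data1 | not class1): symptom without the disease

# P(data1) by the law of total probability over the two classes.
p_fever = p_fever_given_flu * p_flu + p_fever_given_other * (1 - p_flu)

p_flu_given_fever = posterior(p_flu, p_fever_given_flu, p_fever)
print(round(p_flu_given_fever, 3))  # 0.08 / 0.26, prints 0.308
```

With several symptoms, a Naïve Bayes classifier multiplies one such likelihood per symptom (treating symptoms as independent) and picks the class with the largest result.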
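The four random forest steps can be sketched in plain Python. This is only an illustrative sketch under stated assumptions: one-level trees (decision stumps) stand in for full decision trees, records are drawn with replacement, the symptom columns and labels are made up, and the names `train_stump`, `train_forest`, and `predict` are hypothetical rather than taken from the paper.

```python
import random
from collections import Counter

def majority(labels):
    """Return the most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def train_stump(rows, labels):
    """Fit a one-level tree on binary (0/1) features.

    Picks the feature whose split misclassifies the fewest training rows.
    A stump is a simplified stand-in for a full decision tree.
    """
    default = majority(labels)
    best = None
    for f in range(len(rows[0])):
        side = {}
        for value in (0, 1):
            subset = [y for x, y in zip(rows, labels) if x[f] == value]
            # Fall back to the overall majority when one side is empty.
            side[value] = majority(subset) if subset else default
        errors = sum(side[x[f]] != y for x, y in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, f, side)
    _, feature, side = best
    return lambda x: side[x[feature]]

def train_forest(rows, labels, n_trees=25, k=None, seed=0):
    rng = random.Random(seed)
    m = len(rows)
    k = k or m
    forest = []
    for _ in range(n_trees):
        # Step 1: draw k of the m records at random (with replacement: an assumption).
        idx = [rng.randrange(m) for _ in range(k)]
        # Step 2: build a separate tree for this sample.
        forest.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return forest

def predict(forest, x):
    votes = [tree(x) for tree in forest]  # Step 3: one output per tree
    return majority(votes)                # Step 4: majority vote for classification

# Made-up data: columns are hypothetical symptoms (fever, cough, rash).
rows = [[1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 0, 0],
        [0, 1, 0], [0, 0, 1], [0, 1, 1], [0, 0, 0]]
labels = ["flu"] * 4 + ["cold"] * 4

forest = train_forest(rows, labels)
print(predict(forest, [1, 0, 1]))
```

Production random forests also pick a random subset of features at each split and grow deeper trees; both are left out here to keep the sketch short.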
PROPOSED SYSTEM

In this model, the GUI takes the symptoms from the user and predicts the disease the user is suffering from. The interface responds within a fraction of a second and with high accuracy.

• The user has to fill in details such as their name.
• The user has to enter at least 2 symptoms they are suffering from.
• The system stores data such as the user's name and the predicted disease so that treating the user the next time is easier and faster.

RESULTS

The dataset was obtained from Kaggle. The training set has 5,000 rows of data, which help in training the models efficiently (shown in Figure 2).
The testing data has nearly 45 rows, which are used to calculate accuracy (shown in Figure 3).
Figure 4 shows the interface of the disease prediction system, and Figure 5 shows the final result obtained after providing symptoms in the interface.
The work was carried out with four machine learning algorithms, i.e., decision trees, random forest, KNN, and Naïve Bayes. The best result was achieved with the random forest algorithm. The comparison of all classifiers is shown in Figure 6.

Figure 2. Training set.

Figure 3. Test set.

Figure 4. Interface for prediction of disease.

Figure 5. Final result.

Figure 6. Comparison with different machine learning models.

CONCLUSION

We conclude that the random forest predicts the disease with high accuracy, followed by the decision tree, which gives one of the best accuracies. The system can reduce the rush at hospitals and medical centres and lower the workload on the medical staff. As a result, it benefits both patients and the medical field. Systems of this kind can save patients the time and money spent on tests or scans to find out what they are suffering from. On average, our system achieved an accuracy of 93% in predicting diseases from the symptoms given by the user with the random forest algorithm. We also added a way to store the data entered by the user in a database, which can be used in the future to build a better version of the system.
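The paper does not say how user data is stored, so the following is only a minimal sketch of one way to do it, using Python's built-in `sqlite3` module. The table name `predictions`, the function `save_prediction`, and the sample user are hypothetical; the check for at least two symptoms mirrors the requirement stated for the proposed system.

```python
import sqlite3

MIN_SYMPTOMS = 2  # the proposed system asks for at least 2 symptoms

def save_prediction(conn, name, symptoms, disease):
    """Store a user's name, entered symptoms, and predicted disease."""
    if len(symptoms) < MIN_SYMPTOMS:
        raise ValueError("at least two symptoms are required")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS predictions "
        "(name TEXT, symptoms TEXT, disease TEXT)"
    )
    conn.execute(
        "INSERT INTO predictions (name, symptoms, disease) VALUES (?, ?, ?)",
        (name, ", ".join(symptoms), disease),
    )
    conn.commit()

# In-memory database and made-up record, for illustration only.
conn = sqlite3.connect(":memory:")
save_prediction(conn, "Asha", ["fever", "cough"], "flu")
rows = conn.execute("SELECT name, symptoms, disease FROM predictions").fetchall()
print(rows)  # [('Asha', 'fever, cough', 'flu')]
```

Parameterised queries (the `?` placeholders) keep user-entered text from being interpreted as SQL, which matters when the input comes from a public interface.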

You might also like