Disease Prediction Using Machine Learning
Student, Computer Science and Engineering, Gokaraju Rangaraju Institute of Engineering and Technology
∗ Corresponding author: [email protected]
Abstract. Predicting disease at an early stage is critical, and the most difficult challenge is to predict it
correctly. The prediction is based on the symptoms of an individual. The presented model can act as a digital
doctor for disease prediction, supporting timely diagnosis so that the person can take immediate measures.
The model predicts potential ailments with high accuracy. The work was tested with four machine learning
algorithms, and the best accuracy was obtained with Random Forest.
Keywords: Machine Learning, Random Forest, Disease Prediction, Naïve Bayes.
1. Random forest randomly selects k records from a dataset of m records.
2. A separate decision tree is created for each sample.
3. Output is produced from every decision tree.
4. The final result is based on majority voting for classification and averaging for regression.
Random forest is considered one of the effective algorithms used in classification.

A pattern is found that links the data to the results, which helps in improving the recognition with each iteration. It involves the following steps:
1. Load the required data.
2. Calculate the distance between points, here the Euclidean distance.
3. Select the k points with the smallest distances.
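The distance-based steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes the steps describe a k-nearest-neighbours classifier that finishes with a majority vote among the k closest training points, and the symptom columns, disease labels, and the function names `euclidean_distance` and `knn_predict` are hypothetical.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among the k nearest points."""
    # Step 2: distance from the query to every training point.
    distances = [
        (euclidean_distance(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Step 3: keep the k smallest distances.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Assumed final step: majority vote over the labels of those neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Step 1: load the data. These are made-up toy rows (1 = symptom present).
# Hypothetical columns: fever, cough, rash, itching
train_points = [
    (1, 1, 0, 0),
    (1, 1, 0, 1),
    (0, 0, 1, 1),
    (0, 1, 1, 1),
    (1, 0, 0, 0),
]
train_labels = ["flu", "flu", "allergy", "allergy", "flu"]

prediction = knn_predict(train_points, train_labels, query=(0, 0, 1, 0), k=3)
print(prediction)
```

With binary symptom vectors the Euclidean distance grows with the number of symptoms on which two records differ, so the vote is taken among the records whose symptom patterns are most similar to the query.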
Decision Tree Algorithm
Decision trees are commonly employed for classification. A decision tree is a classifier with a tree structure in

Python was chosen for a variety of reasons, which depend on one's perspective and background. It is made for programmers and is one of the most well-known programming languages. Python is one of the easiest
• Web Development
Many web development projects use Python because it offers many frameworks that make the work easier, simpler, and more attractive.
• Data Science
Data science involves many stages, such as data mining, data sorting, and data processing. Python provides built-in functions that make this work easier and simpler.

Equations
The equations should be inserted in an editable format from the equation editor.
\[
f(x) = a_0 + \sum_{n=1}^{\infty} \left( b_n \cos\frac{n\pi x}{s} + c_n \sin\frac{n\pi x}{s} \right)
\]
PROPOSED SYSTEM
In this model, a graphical user interface (GUI) takes the symptoms from the user and predicts the disease the user is suffering from. The interface responds almost immediately with an accurate prediction.
• The user fills in details such as their name.
• The user enters the symptoms they are experiencing, at least two.
• The system stores data such as the user's name and the predicted disease, so that treatment the next time is easier and faster.

RESULTS
The dataset was acquired from Kaggle. It contains 5,000 rows of data used to train the models (shown in Figure 2). The testing data has nearly 45 rows, which are used to calculate accuracy (shown in Figure 3). Figure 4 shows the interface of the disease prediction scenario, and Figure 5 shows the final result achieved after providing symptoms in the interface. The work was done with four machine learning algorithms, i.e., decision trees, random forest, KNN, and
G. Vasu Sena et al.
Naïve Bayes. The best result was achieved with the random forest algorithm. The comparison of all classifiers is shown in Figure 6.

Figure 6. Comparison with different machine learning models.

CONCLUSION
After completing the work, we can conclude that the random forest predicts the disease with high accuracy, and after the random forest, it is the decision tree that gives one of the best accuracies. We have created a system that can decrease the rush at hospitals and medical areas, and it also helps in reducing the workload on the medical staff. As a result, our system benefits both patients and

REFERENCES
[1] Mohapatra, H. (2015). HCR (English) using neural network. International journal of advance research and innovative ideas in education, 1(4), 379–385.
[2] Mohapatra, H., and Rath, A. K. (2019). Detection and avoidance of water loss through municipality taps in India by using smart taps and ICT. IET wireless sensor systems, 9(6), 447–457.
[3] Mohapatra, H., and Rath, A. K. (2019). Fault tolerance in WSN through PE-LEACH protocol. IET wireless sensor systems, 9(6), 358–365.
[4] Mohapatra, H., Debnath, S., and Rath, A. K. (2019). Energy management in wireless sensor network through EB-LEACH (No. 1192). EasyChair.
[5] Nirgude, V., Mahapatra, H., and Shivarkar, S. (2017). Face recognition system using principal component analysis & linear discriminant analysis method simultaneously with 3d morphable model and neural network BPNN method. Global journal of advanced engineering technologies and sciences, 4(1), 1–6.