Multiple Disease Prediction Using Machine Learning
Abstract:- Machine learning, a branch of computer science, has substantially changed healthcare by helping doctors predict diseases more accurately and more quickly. Machine learning algorithms such as decision tree (DT), logistic regression (LR), and support vector machine (SVM) can be used to predict several different diseases within a single system. This helps doctors find and treat illnesses early, which improves patient outcomes and reduces healthcare costs. This paper examines how programs that learn from data can be used to predict multiple diseases, and it discusses the benefits, the likely challenges, and possible future directions. We summarize the machine learning models and information sources that are often employed in illness prediction, and we discuss the significance of feature selection, model assessment, and combining several data modalities for improved prediction. The research indicates that using machine learning algorithms to predict many diseases at once could benefit public health. We use a machine learning model, trained on sample data, to determine whether or not an individual is affected by each of several diseases.

Keywords:- Disease Prediction, Disease Data, Machine Learning, Decision Tree (DT), Logistic Regression (LR), Support Vector Machine (SVM).

I. INTRODUCTION

In recent years, machine learning has advanced considerably and is now used across many industries, including healthcare. Systems that learn from data can help doctors detect diseases more accurately and improve patient outcomes by predicting several diseases within one system. This study used the Support Vector Machine (SVM) and logistic regression (LR) algorithms to predict the presence of five prevalent diseases: Parkinson's disease, diabetes, heart disease, lung cancer, and breast cancer. These diseases are important public health concerns that have a significant influence on people's lives and healthcare systems all over the world. Early detection and precise diagnosis of these illnesses are critical for lowering healthcare costs, optimizing treatment strategies, and improving patient outcomes.

Because of its ability to study large amounts of data and identify subtle patterns, machine learning offers promising pathways for multi-disease prediction. Support vector machines (SVMs) are powerful supervised learning models that are commonly used in classification problems. An SVM determines the hyperplane that separates the classes in the data with the largest possible margin. The SVM approach is suitable for a wide range of medical diagnostic applications since it can handle both linear and nonlinear relationships between the input features and the target variable. This study set out to build a system that predicts several diseases using SVMs, and it evaluated how well the system predicts Parkinson's disease, diabetes, and heart disease. The SVM model was trained on the dataset to learn the relationships between the input features and the presence of each of the three diseases.

Accurate disease prediction with machine learning models can support targeted illness management, individualized treatment plans, and early interventions. It may help medical professionals make better judgments, improve patient care, and allocate resources more effectively within healthcare systems. It also has potential for population-level disease surveillance, which would help public health officials quickly identify illness outbreaks and put preventive measures in place.

The analysis of the SVM model's performance in predicting heart disease, diabetes, and Parkinson's disease showed the utility and practicality of applying machine learning algorithms to complex medical diagnosis. The LR model's performance was studied and evaluated for predicting lung cancer and breast cancer. As a consequence, this work points out the potential of SVM and LR as effective tools in the field of multi-disease prediction. Machine learning can help us move closer to more precise, timely, and tailored healthcare interventions, which will improve patient outcomes and build more effective healthcare systems.

II. LITERATURE SURVEY

In this project, we studied existing research on using machine learning methods such as Support Vector Machine (SVM), logistic regression (LR), and random forest to predict diseases such as diabetes, heart disease, and Parkinson's disease. We reviewed comparable studies to understand their methods and findings, which informed the design of our own project.
According to the journal, diabetes is one of the world's most dangerous illnesses and can lead to a variety of complications, including blindness. In this article, the researchers used machine learning algorithms to identify diabetes because they are straightforward and versatile in predicting whether or not a patient has the disease. The purpose of the study was to develop a system that would help people properly diagnose their diabetes. They compared the accuracy of the key algorithms: LR 72%, Random Forest 74%, DT 72.91%, and SVM 73%.

The main purpose of this study is to show how important the heart is to living people. Detection and forecasting of heart disease must therefore be precise and accurate, since errors can result in cardiac-related death, and machine learning may assist in this prediction. In this paper, the authors examine the accuracy of machine learning for predicting heart disease using k-nearest neighbor (KNN), decision tree, and Naïve Bayes with training and testing datasets. The reported accuracies were DT 52%, KNN 45%, and Naïve Bayes 52.33%.

Parkinson's disease is a common condition that affects the nervous system and nerve-controlled organs in the body, and it gets worse over time. The SVM model correctly identified whether or not someone had Parkinson's disease in about 71% of cases.

III. METHODOLOGY

The methodology for the Multiple Disease Prediction project is as follows:

Data Collection: We obtained the data from Kaggle.com, a popular source of datasets. The data was collected specifically for diabetes, heart disease, lung cancer, Parkinson's disease, and breast cancer.

Data Preprocessing: We examine and correct the data to ensure that it is of good quality and suitable for training our machine learning models. This includes handling missing values, deleting duplicates, and transforming the data so that it is easier to use.

Model Selection: We choose a machine learning approach for each disease prediction task depending on its effectiveness and suitability for that specific prediction. For example, we can apply Support Vector Machine (SVM) or Logistic Regression, depending on which is better suited to the task.

Training and Testing: We divide the prepared data into two sets: training and testing. We use the training data to fit the models and then test them on the held-out data to see how well they perform. We use accuracy to assess the performance of each model.

Model Deployment: We use Streamlit Cloud deployment tools to build a web app that users can interact with. The program has a simple UI and allows users to predict five diseases: heart disease, lung cancer, diabetes, Parkinson's disease, and breast cancer. When a disease is selected, the user is prompted to enter the information required for the prediction.

IV. PROBLEM SYSTEM

Currently, many machine learning models in healthcare focus on assessing a single ailment at a time. For example, one model may concentrate on evaluating liver problems, another on cancer, and a third on lung disorders. Someone who wishes to predict more than one ailment must use several websites or tools. There is no standard method for analyzing numerous illnesses using one system. Some of these models are inaccurate, which might be detrimental to patients. When an organization wants to review its patients' health information, it must use several models, which takes time and money. Also, some systems consider only a few factors, which might lead to incorrect results.

V. EXISTING SYSTEM

The existing system aims to predict diabetes, heart disease, and Parkinson's disease using several types of machine learning algorithms. These include Naive Bayes, Decision Trees, Random Forest, Support Vector Machine (SVM), and Logistic Regression. The system gathers data from many sources, prepares it, trains models on the prepared data, and analyzes how well they perform. One of the algorithms employed by the system is SVM, which achieved a 73% accuracy rate in predicting diabetes. Similarly, the SVM algorithm diagnosed Parkinson's disease with 71% accuracy; that is, the SVM model correctly identified the presence or absence of Parkinson's disease in 71% of instances. The SVM algorithm was the best-performing algorithm for predicting Parkinson's disease at an early stage. Using the decision tree (DT) method, the system predicted heart disease with an accuracy of 52%. The other methods in the system, such as Naive Bayes and Random Forest, might work better or worse depending on the illness being predicted. Overall, the existing system uses machine learning algorithms to predict diabetes, heart disease, and Parkinson's disease, and further improvement of the models would help predict illnesses more accurately.

VI. PROPOSED SYSTEM

In the current system, we do not use Decision Trees (DT) or Naive Bayes to implement the models. In the proposed system, we plan to add two more diseases and develop models for them using Support Vector Machine (SVM) and Logistic Regression (LR). We use additional methods, including data normalization to ensure data consistency, label encoding to transform text data into numbers, and dimensionality reduction to minimize the number of features while preserving crucial information. We apply techniques that fit the dataset well and select simple models to enhance performance. The proposed system uses Python, Python notebooks, the Streamlit library, and Streamlit Cloud. It is a detailed project for predicting diseases, employing machine learning algorithms like Support Vector Machine (SVM), Logistic Regression, Random Forest, and K-Nearest Neighbors.
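
The preprocessing and training steps described in this paper (label encoding, data normalization, a training/testing split, and accuracy-based evaluation of a logistic regression model) can be illustrated with a minimal sketch. It is not the project's actual code. The two features, the synthetic records, and every function name are assumptions made for illustration, and the sketch uses only the Python standard library, whereas a real project would more likely rely on a library such as scikit-learn. Dimensionality reduction is omitted for brevity.

```python
import math
import random


def label_encode(labels):
    """Map each distinct text label to an integer code (sorted for determinism)."""
    mapping = {label: code for code, label in enumerate(sorted(set(labels)))}
    return [mapping[label] for label in labels], mapping


def fit_min_max(rows):
    """Learn the per-feature minimum and maximum from the training rows only."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(rows, lows, highs):
    """Rescale features with the learned ranges (training data maps to [0, 1])."""
    return [
        [(x - lo) / (hi - lo) if hi > lo else 0.0
         for x, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def split_train_test(X, y, test_ratio=0.2, seed=0):
    """Shuffle the row indices and hold out a fraction of them for testing."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * (1 - test_ratio))
    train, test = indices[:cut], indices[cut:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])


def sigmoid(z):
    """Logistic function, with the input clamped to avoid overflow."""
    return 1.0 / (1.0 + math.exp(-max(-60.0, min(60.0, z))))


def train_logistic_regression(X, y, learning_rate=0.5, epochs=200):
    """Fit weights and bias with plain stochastic gradient descent."""
    weights, bias = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for features, target in zip(X, y):
            z = sum(w * f for w, f in zip(weights, features)) + bias
            error = sigmoid(z) - target
            weights = [w - learning_rate * error * f
                       for w, f in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def predict(weights, bias, features):
    """Return 1 when the predicted probability is at least 0.5, else 0."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


def accuracy(weights, bias, X, y):
    """Fraction of rows whose predicted class matches the true class."""
    return sum(predict(weights, bias, f) == t for f, t in zip(X, y)) / len(y)


# Synthetic stand-in data (NOT real patient records): two made-up features.
rng = random.Random(42)
records, outcomes = [], []
for _ in range(100):
    records.append([rng.gauss(95, 10), rng.gauss(24, 3)])    # "negative" group
    outcomes.append("negative")
    records.append([rng.gauss(150, 10), rng.gauss(32, 3)])   # "positive" group
    outcomes.append("positive")

encoded, mapping = label_encode(outcomes)
X_train, y_train, X_test, y_test = split_train_test(records, encoded)
lows, highs = fit_min_max(X_train)          # ranges come from training data only
X_train = apply_min_max(X_train, lows, highs)
X_test = apply_min_max(X_test, lows, highs)

weights, bias = train_logistic_regression(X_train, y_train)
test_accuracy = accuracy(weights, bias, X_test, y_test)
print(f"label codes: {mapping}")
print(f"test accuracy: {test_accuracy:.2f}")
```

The normalization ranges are learned from the training rows and then reused for the test rows, so that no information from the held-out data leaks into training. The same structure would apply to each disease dataset, with the model swapped for an SVM where that is the better fit.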
REFERENCES