Semester Project
Name ID
1. Mehari Fitihamlak 0904668
2. Muluken Kindachew 0904857
3. Meneberu Lake 0904721
4. Mikiyas Mengist 0904800
Declaration
We are students of Bahir Dar University, Bahir Dar Institute of Technology (BIT), Faculty of
Electrical and Computer Engineering. The information in this project proposal is our original
work, and all source materials used for the project work will be fully acknowledged.
This project proposal has been submitted for examination with my approval as a university
advisor.
Contents
Declaration
List of Tables
List of Abbreviations/Acronyms
1. Introduction
2. Literature Review
3. Problem statement
4. Objective
4.1 General objectives
4.2 Specific objectives
5. Limitation
6. System Methodology
7. Scope of the project
8. Contribution/Significance of the Project
9. Cost and Materials Required
9.1 Materials required (tools)
9.2 Cost
10. Time frame / Work plan
Reference
List of Tables
Table 1: Cost of the project
Table 2: Time frame for the project
List of Abbreviations/ Acronyms
1. UI: User interface
2. NSIS (Nullsoft Scriptable Install System): a professional open-source system for creating
Windows installers
3. IDE: Integrated development environment
1. Introduction
Machine learning is the study of computer algorithms that improve automatically through experience
and the use of data. It is regarded as a part of artificial intelligence. Applications of machine
learning include image recognition, speech recognition, traffic prediction, product recommendation,
self-driving cars, and email spam and malware filtering.
Disease Prediction Using Machine Learning is a system that predicts a disease from the symptoms
the user enters. It is intended for cases where the condition is not serious and the user simply
wants to know which disease he or she may have. The health industry plays a major role in curing
patients' diseases, and this system can support it in two ways. A user who does not want to go to
a hospital or clinic can enter his or her symptoms and learn the likely disease. Health workers
can likewise ask a patient for symptoms, enter them into the system, and obtain a reasonably
accurate prediction within a few seconds. The system will be built with machine learning in the
Python programming language, with a Tkinter interface, and will use a dataset previously made
available by hospitals to predict the disease. The aim is that users can check their health from
wherever is convenient; the UI is designed to be simple enough that anyone can operate it.
2. Literature Review
Jyoti Soni, Ujma Ansari, Dipesh Sharma, and Sunita Soni surveyed current techniques of knowledge
discovery in databases using data mining that are in use in today’s medical research,
particularly in heart disease prediction. A number of experiments were conducted to compare the
performance of predictive data mining techniques on the same dataset. The outcome shows that the
decision tree performs best and that Bayesian classification sometimes reaches similar accuracy,
while other predictive methods such as KNN, neural networks, and classification based on
clustering do not perform well.
(JyotiSoni, Ansari, Sharma, & Soni, 2011)
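As a minimal sketch of how such a comparison on one shared dataset can be organized, the
following Python code scores two stand-in classifiers on the same held-out rows. The toy data,
the feature names, and both classifiers (a majority-class baseline and a 1-nearest-neighbour
rule) are illustrative assumptions, not the methods or data of the cited survey.

```python
from collections import Counter

# Hypothetical toy rows: (chest_pain, high_bp, smoker, diabetic) -> label
train = [
    ((1, 1, 1, 0), "at_risk"),
    ((1, 1, 0, 1), "at_risk"),
    ((1, 0, 1, 1), "at_risk"),
    ((0, 0, 0, 0), "healthy"),
    ((0, 0, 0, 1), "healthy"),
]
test = [
    ((1, 1, 1, 1), "at_risk"),
    ((0, 0, 1, 0), "healthy"),
    ((0, 1, 0, 0), "healthy"),
    ((1, 1, 0, 0), "at_risk"),
]


def build_majority(rows):
    """Baseline: always predict the most common training label."""
    label = Counter(y for _, y in rows).most_common(1)[0][0]
    return lambda features: label


def build_nearest_neighbour(rows):
    """1-NN: copy the label of the training row with the smallest Hamming distance."""
    def predict(features):
        def distance(row):
            return sum(a != b for a, b in zip(row[0], features))
        return min(rows, key=distance)[1]
    return predict


def accuracy(predict, rows):
    """Fraction of rows whose predicted label equals the true label."""
    return sum(predict(x) == y for x, y in rows) / len(rows)


# Every candidate is trained and scored on exactly the same split.
candidates = {
    "majority": build_majority,
    "nearest_neighbour": build_nearest_neighbour,
}
results = {name: accuracy(build(train), test) for name, build in candidates.items()}
print(results)
```

Keeping the split fixed across candidates is what makes the accuracy figures comparable.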
Shadab Adam Pattekari and Asma Parveen conducted research using the Naïve Bayes algorithm to
predict heart disease: the user provides data, which is compared with a trained set of values.
In this work, patients supplied their basic information, which was compared with the data to
predict heart disease. (Adam & Parveen, 2012)
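The idea of comparing user-supplied data with trained values can be sketched with a small
Bernoulli Naïve Bayes classifier. This is a hedged illustration only: the three binary features,
the labels, the toy rows, and the use of Laplace smoothing are assumptions made for the example
and are not taken from the cited paper.

```python
import math
from collections import Counter


def train_naive_bayes(rows):
    """Estimate a prior and per-feature probabilities for each label.

    rows: list of (tuple of 0/1 features, label).
    """
    label_counts = Counter(label for _, label in rows)
    n_features = len(rows[0][0])
    ones = {label: [0] * n_features for label in label_counts}
    for features, label in rows:
        for i, value in enumerate(features):
            ones[label][i] += value
    model = {}
    for label, count in label_counts.items():
        prior = count / len(rows)
        # Laplace smoothing keeps every probability strictly between 0 and 1.
        probs = [(ones[label][i] + 1) / (count + 2) for i in range(n_features)]
        model[label] = (prior, probs)
    return model


def predict(model, features):
    """Return the label with the highest log-posterior for the given features."""
    def log_posterior(label):
        prior, probs = model[label]
        score = math.log(prior)
        for value, p in zip(features, probs):
            score += math.log(p if value else 1 - p)
        return score
    return max(model, key=log_posterior)


# Hypothetical training rows: (chest_pain, high_blood_pressure, smoker) -> label
rows = [
    ((1, 1, 1), "heart_disease"),
    ((1, 1, 0), "heart_disease"),
    ((1, 0, 1), "heart_disease"),
    ((0, 0, 0), "no_heart_disease"),
    ((0, 1, 0), "no_heart_disease"),
    ((0, 0, 1), "no_heart_disease"),
]
model = train_naive_bayes(rows)
print(predict(model, (1, 0, 0)))
```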
M.A. Nishara Banu and B. Gomathy used medical data mining techniques such as association rule
mining, classification, and clustering to analyze different kinds of heart-related problems. A
decision tree is built to illustrate every possible outcome of a decision, and different rules
are made to get the best outcome. In this research, age, sex, smoking, overweight, alcohol
intake, blood sugar, heart rate, and blood pressure are the parameters used for making the
decisions. Risk levels for the different parameters are stored with IDs ranging from 1 to 8. An
ID of 1 or lower represents the normal level, and IDs higher than 1 represent higher risk
levels. The K-means clustering technique is used to study the patterns in the dataset. The
algorithm clusters the data into k groups: each point in the dataset is assigned to the closest
cluster, and each cluster center is then recomputed as the average of the points in that
cluster. (Nishar Banu, MA; Gomathy, B;, 2013)
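The two K-means steps described above (assign each point to the closest center, then recompute
each center as the mean of its points) can be sketched in plain Python. The sample points, read
here as hypothetical (age, heart rate) pairs, and the fixed starting centers are assumptions
chosen only to keep the example small and repeatable.

```python
def k_means(points, centers, iterations=10):
    """Cluster points into len(centers) groups, starting from the given centers."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its closest center.
        clusters = [[] for _ in centers]
        for point in points:
            distances = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centers]
            clusters[distances.index(min(distances))].append(point)
        # Update step: each center becomes the average of the points in its cluster.
        new_centers = []
        for cluster, old_center in zip(clusters, centers):
            if cluster:
                new_centers.append(tuple(sum(dim) / len(cluster) for dim in zip(*cluster)))
            else:
                new_centers.append(old_center)  # keep the center of an empty cluster
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers, clusters


# Hypothetical (age, heart rate) observations
points = [(30, 70), (32, 72), (34, 68), (60, 95), (62, 99), (64, 97)]
centers, clusters = k_means(points, centers=[(30, 70), (64, 97)])
print(centers)
```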
3. Problem statement
Nowadays the health industry faces various problems with machines or devices that give wrong or
unacceptable results. To avoid such results, we are building a program that gives accurate
predictions based on the symptoms provided by the user and on the datasets available to the
system. The health industry is information rich yet knowledge poor, and it is a vast industry
with a lot of work still to be done. Therefore, with the help of suitable algorithms, techniques,
and methodologies, we are undertaking this project to help people in need.
The problem is that many people go to hospitals or clinics to find out how their health is and
how much they have improved over a given period, but they have to travel to get their answers.
Sometimes patients may not get their results at all, for reasons such as the doctor being on
leave or bad weather keeping the doctor from reaching the hospital. To avoid these obstacles, we
are making a project that will help all patients who need to know the condition of their health.
In addition, a person may observe a few symptoms without being sure which disease he or she has,
which can lead to more serious illness in the future. To avoid that and identify the disease at
an early stage of the symptoms, this disease prediction system will help a wide range of people,
from children and teenagers to adults and senior citizens.
4. Objective
4.1 General objectives
The main objective of this project is to detect various diseases by examining patients' symptoms
using different machine learning models, and to make it easy for an end user to predict a disease
without physically visiting a doctor for diagnosis.
4.2 Specific objectives
• To predict patients' diseases based on their symptoms.
• To predict the disease using machine learning algorithms from the symptoms that the user
enters in the space provided.
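One way the entered symptoms could be handed to a prediction algorithm is as a fixed-length 0/1
vector. The sketch below assumes a hypothetical five-symptom vocabulary and a hypothetical helper
named encode_symptoms; the real symptom list would come from the project dataset.

```python
# Hypothetical vocabulary; the real list would be taken from the dataset columns.
SYMPTOMS = ["fever", "cough", "headache", "fatigue", "nausea"]


def encode_symptoms(entered, vocabulary=SYMPTOMS):
    """Turn free-text symptom names into a 0/1 vector ordered like the vocabulary.

    Returns the vector and a sorted list of entries that are not in the vocabulary.
    """
    cleaned = {name.strip().lower() for name in entered}
    unknown = sorted(cleaned - set(vocabulary))
    vector = [1 if symptom in cleaned else 0 for symptom in vocabulary]
    return vector, unknown


vector, unknown = encode_symptoms(["Fever", " cough ", "rash"])
print(vector, unknown)
```

Reporting unknown entries separately lets the interface ask the user to correct a symptom
instead of silently ignoring it.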
5. Limitation
The limitations of this project are:
a) The disease predictor does not recommend medication for the disease
b) The past history of the disease is not considered
c) The past history of the user is not considered
d) It will predict about 40 diseases using about 100 symptoms
6. System Methodology
• Data understanding:
The data understanding phase starts with an initial data collection and proceeds with
activities in order to get familiar with the data, identify data quality problems, discover first
insights into the data, or detect interesting subsets to form hypotheses for hidden
information.
• Data preparation:
The data preparation phase covers all activities to construct the final dataset (data that will
be fed into the modeling tool(s)) from the initial raw data. Data preparation tasks are likely
to be performed multiple times, and not in any prescribed order. Tasks include table,
record, and attribute selection as well as transformation and cleaning of data for modeling
tools.
• Modeling:
In this phase, the focus is on applying various modeling techniques on the prepared
variables in order to create models that possibly provide the desired outcome.
• Evaluation:
At this stage in the project, we will have built a model that appears to have high quality
from a data analysis perspective. Before proceeding to final deployment of the model, it is
important to evaluate it more thoroughly. At the end of this phase, a decision on the use
of the data mining results should be reached.
• Model deployment:
The model deployment stage covers putting a model into production use. We will deploy
the model as a desktop application.
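The data preparation, modeling, and evaluation phases above can be sketched end to end in a few
lines of Python. Everything specific here is an assumption made for illustration: the four
symptom columns, the toy rows, the every-fourth-row hold-out, and the 1-nearest-neighbour model
standing in for whichever algorithm the project finally selects.

```python
FEATURES = ["fever", "cough", "rash", "itching"]  # hypothetical symptom columns

# Hypothetical raw records; None marks a missing value.
raw_rows = [
    {"fever": 1, "cough": 1, "rash": 0, "itching": 0, "disease": "flu"},
    {"fever": 1, "cough": 1, "rash": 0, "itching": 0, "disease": "flu"},
    {"fever": 1, "cough": None, "rash": 0, "itching": 0, "disease": "flu"},
    {"fever": 0, "cough": 0, "rash": 1, "itching": 1, "disease": "allergy"},
    {"fever": 1, "cough": 1, "rash": 0, "itching": 1, "disease": "flu"},
    {"fever": 0, "cough": 0, "rash": 1, "itching": 1, "disease": "allergy"},
    {"fever": 1, "cough": 0, "rash": 0, "itching": 0, "disease": "flu"},
    {"fever": 0, "cough": 0, "rash": 0, "itching": 1, "disease": "allergy"},
    {"fever": 0, "cough": 0, "rash": 1, "itching": 0, "disease": "allergy"},
]


def prepare(rows):
    """Data preparation: select the feature columns and drop rows with missing values."""
    clean = []
    for row in rows:
        features = tuple(row[name] for name in FEATURES)
        if None in features or row["disease"] is None:
            continue
        clean.append((features, row["disease"]))
    return clean


def split(rows, every=4):
    """Hold out every `every`-th row for evaluation; the rest is used for training."""
    train = [row for i, row in enumerate(rows) if (i + 1) % every != 0]
    test = [row for i, row in enumerate(rows) if (i + 1) % every == 0]
    return train, test


def fit(train):
    """Modeling: a 1-nearest-neighbour stand-in using Hamming distance."""
    def predict(features):
        def distance(row):
            return sum(a != b for a, b in zip(row[0], features))
        return min(train, key=distance)[1]
    return predict


def evaluate(predict, test):
    """Evaluation: accuracy on rows the model has not seen during training."""
    return sum(predict(x) == y for x, y in test) / len(test)


clean = prepare(raw_rows)
train, test = split(clean)
model = fit(train)
score = evaluate(model, test)
print(len(clean), len(train), len(test), score)
```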
8. Contribution/Significance of the Project
The purpose of this project, “Disease Prediction Using Machine Learning”, is to predict a
patient's disease accurately from the symptoms. If the health industry adopts this project,
doctors' workload can be reduced and they can predict a patient's disease more easily. The
general purpose of the system is to provide predictions for various commonly occurring diseases
that, when unchecked or ignored, can turn fatal and cause the patient many problems. The system
will predict the most probable disease based on the symptoms. The health industry is information
rich yet knowledge poor, and it is a vast industry with a lot of work still to be done. With the
help of suitable algorithms, techniques, and methodologies, this project will help people in
need.
9.2 Cost
Based on our assumptions, the estimated cost of this project is as follows:
Table 1: Cost of the project
Phase               Cost ($)
Data exploration    20
Model selection     10
Model training      150
Storage             40
Total               220
Model deployment    3
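The arithmetic behind the table can be checked with a short Python snippet. It assumes that the
stated total of 220 covers the four phases listed above it and that model deployment is budgeted
separately; the table itself does not say so explicitly.

```python
# Figures copied from the cost table (US dollars).
phase_costs = {
    "Data exploration": 20,
    "Model selection": 10,
    "Model training": 150,
    "Storage": 40,
}
deployment_cost = 3  # assumption: listed separately from the stated total

subtotal = sum(phase_costs.values())
overall = subtotal + deployment_cost
print(subtotal, overall)
```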
Reference
[1] Aditya Tomar, “Disease Prediction System using data mining techniques”, in
International Journal of Advanced Research in Computer and Communication
Engineering, ISO 3297, July 2016.
[2] Dr. B. Srinivasan, K. Pavya, “A study on data mining prediction techniques in
healthcare sector”, in International Research Journal of Engineering and Technology
(IRJET), March 2016.
[3] Megha Rathi, Vikas Pareek, “An integrated hybrid data mining approach for
healthcare”, in IRACST - International Journal of Computer Science and Information
Technology Security (IJCSITS), ISSN: 2249-9555, Vol. 6, No. 6, Nov-Dec 2016.
[4] Feixiang Huang, Shengyong Wang, and Chien-Chung Chan, “Predicting Disease By
Using Data Mining Based on Healthcare Information System”, in IEEE 2012.
[5] M.A. Nishara Banu, B. Gomathy, “An approach to devise an Interactive software
solution for smart health prediction using data mining”, in International Journal of
Technical Research and Applications, eISSN, Nov-Dec 2013.
[6] K. Al-Aidaroos, A. Bakar, Z. Othman, “Medical Data Classification with
Naive Bayes Approach”, in Information Technology Journal, 2012.
[7] Darcy A. Davis, N. V.-L., “Predicting Individual Disease Risk Based On
Medical History”, 2008.