
Intellicorp An Ensemble Model To Predict Crop Using Machine Learning Algorithms

Contact us for project abstract, enquiry, explanation, code, execution, documentation. Phone/Whatsap : 9573388833 Email : [email protected] Website : https://dcs.datapro.in/contact-us-2 Tags: btech, mtech, final year project, datapro, machine learning, cyber security, cloud computing, blockchain,

Uploaded by

dataprodcs
Copyright
© © All Rights Reserved

ABSTRACT

In this paper, we present an efficient and accurate system for identifying cardiac disease
based on machine learning techniques. The classification techniques used in the system
include logistic regression, K-nearest neighbour, Naive Bayes, and decision trees, together
with standard feature selection algorithms. Feature selection is used to improve
classification accuracy while reducing the execution time of the classification system. In
addition, following best practices in model assessment and hyperparameter tuning, the
leave-one-subject-out cross-validation method was used. The classifiers' performance was
evaluated with standard performance metrics, using the features picked by the feature
selection methods. According to the experimental results, the proposed feature selection
technique (FCMIM) appears practicable when used with a support vector machine classifier
to develop a high-level intelligent system for identifying heart disease. Furthermore, the
proposed method can be easily implemented in the healthcare industry to detect cardiac issues.

TABLE OF CONTENTS

Chapter no.   Title

1             INTRODUCTION
              1.1 OUTLINE OF THE PROJECT
              1.2 MACHINE LEARNING
2             LITERATURE SURVEY
3             ML ALGORITHM
              3.1 ALGORITHM USED
                  3.1.1 Random Forest
                  3.1.2 KNN
                  3.1.3 Logistic Regression
4             DATASET
              4.1 ATTRIBUTE INFORMATION
5             SOFTWARE REQUIREMENTS
              5.1 SOFTWARE USED
                  5.1.1 Pycharm
                  5.1.2 Python 3.7 (64-bit)
              5.2 LIBRARIES USED
6             METHODOLOGY
              6.1 WORKFLOW DIAGRAM
                  6.1.1 Random Forest
                  6.1.2 Logistic Regression
                  6.1.3 KNN
7             APPLICATION WORKING
              7.1 Machine Learning
              7.2 Web Application
8             RESULTS AND DISCUSSION
9             CONCLUSION
              REFERENCES
              APPENDIX
              A. SOURCE CODE
              B. SCREENSHOTS
              C. PLAGIARISM REPORT

CHAPTER 1

INTRODUCTION

1.1 OUTLINE OF THE PROJECT

Heart disease (HD) is a critical health issue, and numerous people around the world
suffer from it. HD presents with common symptoms such as shortness of breath,
physical weakness, and swollen feet.

Researchers are trying to find an efficient technique for detecting heart disease,
as current diagnostic techniques are not very effective at early identification for
several reasons, such as accuracy and execution time. The diagnosis and treatment
of heart disease is extremely difficult when modern technology and medical experts
are not available.

Effective diagnosis and proper treatment can save many lives. According to the
European Society of Cardiology, approximately 26 million people have been
diagnosed with HD, and 3.6 million are diagnosed annually. Heart disease is also
widespread in the United States. Diagnosis of HD is traditionally done by a
physician through analysis of the patient's medical history, the physical
examination report, and the relevant symptoms. However, the results of this
diagnostic method are not always accurate in identifying patients with HD.
Moreover, it is expensive and computationally difficult to analyze.

1.2 MACHINE LEARNING

Machine learning is a subfield of artificial intelligence (AI). The goal of machine
learning is generally to understand the structure of data and fit that data into
models that can be understood and used by people. Although machine learning is a
field within computer science, it differs from traditional computational
approaches. In traditional computing, algorithms are sets of explicitly programmed
instructions used by computers to calculate or solve problems. Machine learning
algorithms instead allow computers to train on data inputs and use statistical
analysis to output values that fall within a specific range. Because of this,
machine learning helps computers build models from sample data in order to
automate decision-making processes based on data inputs.

CHAPTER 2

LITERATURE REVIEW

● According to Benjamin EJ et al. [1], there are seven major risk factors for heart
disease: smoking, lack of physical activity, poor diet, obesity, high cholesterol,
diabetes, and high blood pressure. They also report statistics on heart disease,
stroke, and other cardiovascular illness.

● In their experiments, Abhay Kishore et al. [2] found that recurrent neural
networks outperform other algorithms such as CNN, Naive Bayes, and SVM in
terms of accuracy. As a result, neural networks perform well in the prediction of
cardiac disease. They also developed a system that could predict silent heart
attacks and alert the user as soon as possible.

● M. Nikhil Kumar et al. [3] compared numerous algorithms: decision tree, random
forest, Naive Bayes, KNN, support vector machine, and logistic model tree. They
used the heart disease dataset from the UCI repository. Compared with the other
algorithms, J48 was quicker to build and gave good results.

● Amandeep Kaur et al. [4] examined various algorithms for heart disease
prediction, including artificial neural networks, K-nearest neighbour, Naive Bayes,
and support vector machines.

● Sahaya Arthy et al. [5] examine the existing data mining-based work on heart
disease prediction. In most cases, data mining methods are used to predict
cardiac diseases. They also discuss the databases used, such as the UCI
cardiology dataset, as well as the tools used, such as Weka, RapidMiner,
DataMelt, Apache Mahout, Rattle, KEEL, R, and others. They conclude that a
single algorithm can give good prediction accuracy, but that combining two or
more algorithms can further strengthen and improve the accuracy of heart
disease prediction.

● Chala Beyene et al. [6] proposed a methodology to predict the occurrence of
HD in order to overcome the problem of diagnosing HD. It improved on the existing
methodology by choosing Naïve Bayes, J48, and SVM to predict the occurrence of
HD for early automatic diagnosis in a short time, in order to support the quality
of services and reduce costs to save the lives of individuals. This methodology
uses various attributes of HD to determine whether a patient has HD or not. The
dataset was analyzed using WEKA software.

● P. Sai Chandrasekhar Reddy et al. [7] recommended ANN algorithms for an HD
prediction system in data mining. The main aim of this prediction system is to
reduce the cost of diagnosis, such as the different types of tests done to reach a
diagnosis of HD. They therefore proposed a new system to predict the condition of
the patient based on parameters such as age, blood pressure, heartbeat rate,
cholesterol, etc., and to estimate whether a patient has HD or not. The accuracy of
the proposed system was demonstrated in Java.

● Dwivedi et al. [8] focused on evaluating the performance of different ML
algorithms for HD prediction. They compared different algorithms such as Naïve
Bayes, KNN, logistic regression, and classification tree in order to identify the
highest performance for prediction.

CHAPTER 3

ML ALGORITHM

In machine learning, tasks are generally classified into broad categories. These
categories are based on how learning is received or how feedback on the learning
is given to the system developed. Two of the most widely adopted machine
learning methods are:

Supervised learning, which trains algorithms on example input and output data
that is labeled by humans.

Unsupervised learning, which provides the algorithm with no labeled data so that
it can find structure within its input data.
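
As a minimal illustration of the two settings described above, the sketch below learns a one-dimensional decision threshold from labeled values (supervised) and splits the same values, without labels, into two groups using a simple two-means procedure (unsupervised). The data and function names are made up for illustration and are not part of the project code.

```python
from statistics import mean

def fit_threshold(values, labels):
    """Supervised: place a threshold midway between the two class means."""
    mean_0 = mean(v for v, y in zip(values, labels) if y == 0)
    mean_1 = mean(v for v, y in zip(values, labels) if y == 1)
    return (mean_0 + mean_1) / 2

def two_means(values, iterations=10):
    """Unsupervised: split unlabeled values into two clusters (1-D k-means, k=2)."""
    low, high = min(values), max(values)  # initial cluster centers
    for _ in range(iterations):
        cluster_low = [v for v in values if abs(v - low) <= abs(v - high)]
        cluster_high = [v for v in values if abs(v - low) > abs(v - high)]
        low, high = mean(cluster_low), mean(cluster_high)
    return sorted(cluster_low), sorted(cluster_high)

values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = [0, 0, 0, 1, 1, 1]  # used only by the supervised step

threshold = fit_threshold(values, labels)
clusters = two_means(values)
```

The supervised step needs the labels to find the boundary, while the unsupervised step recovers the same two groups from the values alone.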

3.1 ALGORITHM USED

Because more than one class may be assigned to a single instance, multi-label
classification (MLC) is the most suitable option. MLC is embedded into the
algorithms used for categorization and model building, which are as follows:

1. Random Forest

2. KNN

3. Logistic Regression

3.1.1 Random Forest

This is a machine learning technique used to solve problems such as regression
and classification. It builds many decision trees and gives an outcome based on
their predictions: the average (mean) of the trees' outputs for regression, or the
majority vote of the trees for classification.
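
The idea of combining many trees can be sketched in a few lines. The example below is a deliberately simplified illustration, not the implementation used in this project: each "tree" is a one-level decision stump trained on a bootstrap sample with one randomly chosen feature, whereas a full random forest grows deeper trees. The toy data and all function names are hypothetical.

```python
import random
from collections import Counter

def fit_stump(rows, labels, feature):
    """Find the threshold on one feature that misclassifies the fewest samples."""
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, feature, threshold, left_label, right_label)
    return best[1:]

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample using one randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample rows with replacement
        feature = rng.randrange(len(rows[0]))           # pick a random feature
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return forest

def predict_forest(forest, row):
    """Classify a row by majority vote over all stumps."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return Counter(votes).most_common(1)[0][0]

# Made-up toy data: two features per sample, label 1 = disease present.
toy_rows = [[1, 10], [2, 11], [3, 12], [8, 20], [9, 21], [10, 22]]
toy_labels = [0, 0, 0, 1, 1, 1]
forest = fit_forest(toy_rows, toy_labels)
```

Calling `predict_forest(forest, [0, 9])` and `predict_forest(forest, [11, 23])` queries the ensemble on points that lie clearly on either side of the toy data. Individual stumps may be trained on unrepresentative resamples, which is why the votes are pooled.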

3.1.2 KNN

K-nearest neighbours (KNN) is a machine learning technique used to solve problems
such as regression and classification. It predicts the outcome for a new instance
from the k training instances closest to it, taking the majority class among those
neighbours for classification or the mean of their values for regression.

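
A minimal sketch of the k-nearest-neighbours idea follows: a query is labeled by majority vote among the k closest training rows, using Euclidean distance. The toy data, the choice k=3, and the function names are assumptions for illustration only.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_rows, train_labels, query, k=3):
    """Label a query by majority vote among its k nearest training rows."""
    neighbors = sorted(zip(train_rows, train_labels),
                       key=lambda pair: euclidean(pair[0], query))
    top_labels = [label for _, label in neighbors[:k]]
    return Counter(top_labels).most_common(1)[0][0]

# Made-up toy data: two features per sample, label 1 = disease present.
knn_rows = [[1, 10], [2, 11], [3, 12], [8, 20], [9, 21], [10, 22]]
knn_labels = [0, 0, 0, 1, 1, 1]

near_low = knn_predict(knn_rows, knn_labels, [2.5, 11])
near_high = knn_predict(knn_rows, knn_labels, [9, 20])
```

In practice the features would usually be scaled first, since attributes with large numeric ranges otherwise dominate the distance.
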
3.1.3 Logistic Regression

This is a machine learning technique used to predict and describe the relationship
between independent variables and a categorical dependent variable. It applies
the logistic (sigmoid) function to a weighted combination of the independent
variables to estimate the probability of the dependent class, such as the
presence or absence of heart disease.
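
The following is a minimal sketch of logistic regression trained by batch gradient descent on the log-loss. It is an illustration under stated assumptions rather than the project's implementation: the learning rate, epoch count, toy data, and function names are all made up.

```python
import math

def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            grad_w = [g + error * xi for g, xi in zip(grad_w, x)]
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def predict_proba(weights, bias, x):
    """Estimated probability that x belongs to class 1."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Made-up toy data: one feature, label 1 for positive values.
lr_rows = [[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]]
lr_labels = [0, 0, 0, 1, 1, 1]
lr_weights, lr_bias = fit_logistic(lr_rows, lr_labels)
```

A predicted probability above 0.5 is read as class 1, so `predict_proba(lr_weights, lr_bias, [1.5])` falls on the positive side and `predict_proba(lr_weights, lr_bias, [-1.5])` on the negative side.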

CHAPTER 4

DATASET

This database contains 76 attributes, but all published experiments refer to using
a subset of 14 of them. In particular, the Cleveland database is the only one that
has been used by ML researchers to date. The "goal" field refers to the presence
of heart disease in the patient. It is integer-valued from 0 (no presence) to 4.
The attributes are age, gender, cp, trestbps, chol, fbs, restecg, thalach, exang,
oldpeak, slope, ca, thal, and target.
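
Because the goal field ranges from 0 to 4, it is often reduced to a binary label before classification. The sketch below assumes that any value from 1 to 4 indicates presence of heart disease; that mapping, the sample rows, and the function names are illustrative assumptions, and the rows are not real patient data.

```python
import csv
import io

COLUMNS = ["age", "gender", "cp", "trestbps", "chol", "fbs", "restecg",
           "thalach", "exang", "oldpeak", "slope", "ca", "thal", "target"]

# Made-up rows in the column order above (not taken from the dataset).
SAMPLE_CSV = """63,1,3,145,233,1,0,150,0,2.3,0,0,1,0
57,0,0,120,354,0,1,163,1,0.6,2,0,2,3
"""

def load_records(text):
    """Parse CSV text into dicts of floats keyed by attribute name."""
    reader = csv.reader(io.StringIO(text))
    return [dict(zip(COLUMNS, map(float, row))) for row in reader if row]

def binarize_target(record):
    """Map the 0-4 goal field to 0 (no presence) or 1 (presence); assumed mapping."""
    return 0 if record["target"] == 0 else 1

records = load_records(SAMPLE_CSV)
binary_labels = [binarize_target(r) for r in records]
```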

Collecting data allows us to capture a record of past events so that we can use
data analysis to find recurring patterns. From those patterns, we build
predictive models using machine learning algorithms that look for trends and
predict future changes.

Predictive models are only as good as the data from which they are built, so good
data collection practices are crucial to developing high-performing models.
The data need to be error-free (garbage in, garbage out) and contain relevant
information for the task at hand. For example, a loan default model would not
benefit from tiger population sizes but could benefit from gas prices over time.

In this module, we collect the data from the Kaggle dataset archives. This dataset
contains heart disease records from previous years.
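
The requirement that data be error-free can be checked mechanically before training. The following is a minimal sketch, assuming that missing values appear as empty strings or "?" and that every field should be numeric; both conventions, the sample rows, and the function names are assumptions for illustration.

```python
def is_clean(row):
    """Return True if every field is present and parses as a number."""
    for field in row:
        text = field.strip()
        if text in ("", "?"):
            return False
        try:
            float(text)
        except ValueError:
            return False
    return True

def split_clean(rows):
    """Separate usable rows from rows that need repair or removal."""
    clean = [r for r in rows if is_clean(r)]
    rejected = [r for r in rows if not is_clean(r)]
    return clean, rejected

raw_rows = [
    ["63", "1", "145", "233"],
    ["57", "0", "?", "354"],     # missing value
    ["41", "1", "13o", "204"],   # typo: letter 'o' instead of a zero
]
clean_rows, rejected_rows = split_clean(raw_rows)
```

Rejected rows can then be inspected and either corrected or dropped, rather than silently passed to the model.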

S no   Attribute   Description

1      Age         In years
2      Sex         Male, Female
3      Cp          Angina, Abnang, Notang, Asympt
4      Trestbps    Resting blood pressure in mm Hg
5      Chol        Serum cholesterol in mg/dl
6      Fbs         1 if fasting blood sugar is greater than 120 mg/dl,
                   2 if fasting blood sugar is less than 120 mg/dl
7      Restecg     Electrocardiographic results
8      Thalach     Maximum heart rate observed
9      Exang       Exercise-induced angina has occurred
10     Oldpeak     ST depression induced through exercise

CHAPTER 5

SOFTWARE REQUIREMENTS

5.1 SOFTWARE USED

5.1.1 PYCHARM

PyCharm is one of the most popular Python IDEs. There are many reasons for this, including
the fact that it is developed by JetBrains, the developer behind the popular IntelliJ IDEA,
one of the big three Java IDEs, and WebStorm, the “smartest JavaScript IDE”. Its support for
web development with Flask is yet another credible reason.
PyCharm was created primarily for Python programming and to operate across multiple
platforms such as Windows, Linux, and macOS. The IDE comprises code analysis tools, a
debugger, testing tools, and version control options. It also assists developers in building
Python plugins with the help of the various APIs available. The IDE allows us to work with
several databases directly, without integrating other tools. Although it is designed
specifically for Python, HTML, CSS, and JavaScript files can also be created with it. It also
comes with a user interface that can be customized as needed using plugins.

5.1.2 Python 3.7 (64-bit)

Python is an interpreted, object-oriented, high-level programming language with dynamic
semantics. Its high-level built-in data structures, combined with dynamic typing and
dynamic binding, make it very attractive for Rapid Application Development, as well as for
use as a scripting or glue language to connect existing components together. Python's simple,
easy-to-learn syntax emphasizes readability and therefore reduces the cost of program
maintenance. Python supports modules and packages, which encourages program modularity
and code reuse. The Python interpreter and the extensive standard library are available in
source or binary form without charge for all major platforms and can be freely distributed.
