Web Application

The document describes the steps to build a web application for heart disease classification using machine learning. It discusses problem definition, data collection, data preparation, modeling, evaluation and deployment. Gradient boosting performed best for classification. The model was saved and deployed in a web app using Flask.


The Lifecycle to Build a Web Application for Prediction from Scratch

Beginner | Machine Learning | Programming | Python | Structured Data | Technique

The data science lifecycle is designed for big data problems and data science projects. Generally, a data science project consists of seven steps: problem definition, data collection, data preparation, data exploration, data modeling, model evaluation, and model deployment. This article walks through the data science lifecycle to build a web application for heart disease classification.

If you would like to look at a specific step in the lifecycle, you can read it without looking deeply at the
other steps.

Problem Definition

Clinical decisions are often made based on doctors' experience and intuition rather than on the knowledge hidden in the data. This leads to errors and excess costs that affect the quality of medical services. Using analytic tools and data modeling can help enhance clinical decisions. Thus, the goal here is to build a web application to help doctors diagnose heart diseases. The full code is available in my GitHub repository.

Data Collection
I collected the heart disease dataset from the UCI Machine Learning Repository. The dataset has the following 14 attributes:

age: age in years.
sex: sex (1=male; 0=female).
cp: chest pain type (0=typical angina; 1=atypical angina; 2=non-anginal pain; 3=asymptomatic).
trestbps: resting blood pressure in mm Hg on admission to the hospital.
chol: serum cholesterol in mg/dl.
fbs: fasting blood sugar > 120 mg/dl (1=true; 0=false).
restecg: resting electrocardiographic results (0=normal; 1=ST-T wave abnormality; 2=probable or definite left ventricular hypertrophy).
thalach: maximum heart rate achieved.
exang: exercise-induced angina (1=yes; 0=no).
oldpeak: ST depression induced by exercise relative to rest.
slope: the slope of the peak exercise ST segment (0=upsloping; 1=flat; 2=downsloping).
ca: number of major vessels (0–3) colored by fluoroscopy.
thal: thalassemia (3=normal; 6=fixed defect; 7=reversible defect).
target: heart disease (1=no; 2=yes).

Data Preparation and Exploration

Here is a snapshot of the data header.

The header of the heart disease dataset

At first look, the dataset contains 14 columns: 5 of them contain numerical values and 9 of them contain categorical values.

The dataset is clean and contains all the information needed for each variable. Using the info(), describe(), and isnull() functions, no errors, missing values, or inconsistent values were detected.

# Check null values
df.isnull().sum()


Null values in the dataset
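Those three checks can be run together. Here is a minimal sketch using pandas; the column names follow the dataset described above, but the three-row sample is made up purely for illustration:

```python
import pandas as pd

# Tiny made-up sample with a few of the dataset's column names
df = pd.DataFrame({
    "age": [63, 37, 41],
    "chol": [233, 250, 204],
    "target": [1, 1, 2],
})

df.info()                 # dtypes and non-null counts per column
print(df.describe())      # summary statistics for numeric columns
print(df.isnull().sum())  # null count per column (all zeros here)
```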

By checking the percentage of people with and without heart disease, it was found that 56% of the people in the dataset have heart disease. So, the dataset is relatively balanced.

People with and without heart disease in the dataset
Attributes Correlation
This heatmap shows the correlations between the dataset attributes and how the attributes interact with each other. From the heatmap, we can observe that chest pain type (cp), exercise-induced angina (exang), ST depression induced by exercise relative to rest (oldpeak), the slope of the peak exercise ST segment (slope), number of major vessels colored by fluoroscopy (ca), and thalassemia (thal) are highly correlated with heart disease (target). We also observe that heart disease is inversely correlated with maximum heart rate (thalach).

Moreover, we can see that age is correlated with the number of major vessels colored by fluoroscopy (ca) and with maximum heart rate (thalach). There is also a relation between ST depression induced by exercise relative to rest (oldpeak) and the slope of the peak exercise ST segment (slope), and between chest pain type (cp) and exercise-induced angina (exang).

Next, we will analyze these correlations further.
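A heatmap like this is typically built from the pandas correlation matrix. A minimal sketch, assuming seaborn for the plot (the DataFrame here holds random stand-in data, and the drawing calls are commented out so the snippet runs headless):

```python
import numpy as np
import pandas as pd

# Random stand-in data with a few of the dataset's column names
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 4)),
                  columns=["age", "thalach", "oldpeak", "target"])

# Pairwise correlations between all attributes
corr = df.corr()
print(corr.round(2))

# To draw the heatmap (assumes seaborn and matplotlib are installed):
# import seaborn as sns
# sns.heatmap(corr, annot=True, cmap="coolwarm")
```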

1. Age and Maximum Heart Rate

Heart disease occurs more frequently in older people, and the maximum heart rates are lower for older people with heart disease.

2. Chest Pain

There are four types of chest pain: typical angina, atypical angina, non-anginal pain, and asymptomatic. Most heart disease patients are found to have asymptomatic chest pain.

3. Chest Pain and Exercise-Induced Angina

People who have exercise-induced angina usually suffer from asymptomatic chest pain, and they are more likely to have heart disease.

4. Thalassemia

People with reversible defects are likely to have heart disease.

5. ST Depression and the Slope of the Peak Exercise ST Segment

People who have a downsloping ST segment have higher values of ST depression and a higher chance of having heart disease. The greater the ST depression, the greater the chance of disease.

6. Age and Number of Major Vessels (0–3) Colored by Fluoroscopy

Most heart disease patients are old, and they have one or more major vessels colored by fluoroscopy.

Data Modeling

Let’s create the machine learning model. We are trying to predict whether a person has heart disease. We
will use the ‘target’ column as the class, and all the other columns as features for the model.

# Initialize data and target
target = df['target']
features = df.drop(['target'], axis = 1)

– Data Splitting

We will divide the data into a training set and test set. 80% of the data will be for training and 20% for
testing.

# Split the data into training set and testing set
X_train, X_test, y_train, y_test = train_test_split(features, target, test_size = 0.2, random_state = 0)
– Machine Learning Model

Here, we will try the machine learning algorithms below and then select the best one based on its classification report.

Support Vector Machine
Random Forest
Ada Boost
Gradient Boosting
The following function trains and evaluates the classifiers.

def fit_eval_model(model, train_features, y_train, test_features, y_test):
    results = {}
    # Train the model
    model.fit(train_features, y_train)
    # Test the model
    train_predicted = model.predict(train_features)
    test_predicted = model.predict(test_features)
    # Classification report and confusion matrix
    results['classification_report'] = classification_report(y_test, test_predicted)
    results['confusion_matrix'] = confusion_matrix(y_test, test_predicted)
    return results

Initialize models, train and evaluate.

# Initialize the models
sv = SVC(random_state = 1)
rf = RandomForestClassifier(random_state = 1)
ab = AdaBoostClassifier(random_state = 1)
gb = GradientBoostingClassifier(random_state = 1)

# Fit and evaluate models
results = {}
for cls in [sv, rf, ab, gb]:
    cls_name = cls.__class__.__name__
    results[cls_name] = fit_eval_model(cls, X_train, y_train, X_test, y_test)

Now, we will print the evaluation results.

# Print classifiers results
for result in results:
    print(result)
    print()
    for i in results[result]:
        print(i, ':')
        print(results[result][i])
        print()
    print('-----')
    print()

The results are below:

Support Vector Machine Result
Random Forest Result

Ada Boost Results

Gradient Boosting Result

From the above results, the best model is Gradient Boosting. So, I will save this model to use it in the web application.

– Save the Prediction Model

Now, we will pickle the model so that it can be saved on disk.


# Save the model as a serialized pickle object
with open('model.pkl', 'wb') as file:
    pickle.dump(gb, file)
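The save/load round trip can be sketched with a stand-in object (stdlib only; DummyModel is hypothetical and only stands in for the trained GradientBoostingClassifier, and serialization is done in memory rather than to model.pkl):

```python
import pickle

class DummyModel:
    """Hypothetical stand-in for the trained classifier."""
    def predict(self, rows):
        # Pretend every row is class 1 ("no heart disease" in this dataset's encoding)
        return [1 for _ in rows]

# Serialize and deserialize in memory (the article writes to model.pkl on disk instead)
blob = pickle.dumps(DummyModel())
loaded = pickle.loads(blob)
print(loaded.predict([[0] * 13]))  # [1]
```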

Model Deployment

It is time to start deploying and building the web application using the Flask web framework. For the web app, we have to create:

1. Web app Python code (API) to load the model, get user input from the HTML template, make the prediction, and return the result.

2. An HTML template for the front end to allow the user to input the patient's heart disease symptoms and display whether the patient has heart disease or not.

The structure of the files is like the following:

/
├── model.pkl
├── heart_disease_app.py
└── templates/
    └── Heart Disease Classifier.html

Web App Python Code

You can find the full code of the web app here.

As a first step, we have to import the necessary libraries.

import numpy as np
import pickle
from flask import Flask, request, render_template

Then, we create the app object.

# Create application
app = Flask(__name__)

After that, we need to load the saved model model.pkl in the app.

# Load machine learning model
model = pickle.load(open('model.pkl', 'rb'))

After that, the home() function is called when the root endpoint '/' is hit. The function renders the home page Heart Disease Classifier.html of the website.

# Bind home function to URL
@app.route('/')
def home():
    return render_template('Heart Disease Classifier.html')

Now, create the predict() function for the '/predict' endpoint, defined with the POST method. When the user submits the form, the API receives a POST request and extracts all the data from the form using flask.request.form. Then, the API uses the model to predict the result. Finally, the function renders the Heart Disease Classifier.html template and returns the result.

# Bind predict function to URL
@app.route('/predict', methods = ['POST'])
def predict():
    # Put all form entries values in a list
    features = [float(i) for i in request.form.values()]
    # Convert features to array
    array_features = [np.array(features)]
    # Predict features
    prediction = model.predict(array_features)
    output = prediction
    # Check the output value and render the result based on it
    if output == 1:
        return render_template('Heart Disease Classifier.html',
                               result = 'The patient is not likely to have heart disease!')
    else:
        return render_template('Heart Disease Classifier.html',
                               result = 'The patient is likely to have heart disease!')
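The marshalling step inside predict() — turning the submitted form values into one feature row — can be tested in isolation. A pure-Python sketch, where the dict stands in for flask.request.form and shows only 4 of the 13 fields:

```python
# Stand-in for flask.request.form: an ordered mapping of field name -> string value
form = {"age": "63", "sex": "1", "cp": "3", "trestbps": "145"}

# Convert every form value to a float, in form order
features = [float(v) for v in form.values()]

# model.predict expects a 2-D structure: a list containing one row of features
row = [features]
print(row)  # [[63.0, 1.0, 3.0, 145.0]]
```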

Finally, start the Flask server and run the web page locally by calling app.run(), then open http://localhost:5000 in the browser.

if __name__ == '__main__':
    # Run the application
    app.run()

HTML Template

The following figure presents the HTML form. You can find the code here.

The form has 13 inputs for the 13 features and a button. The button sends a POST request to the /predict endpoint with the input data. In the form tag, the action attribute calls the predict function when the form is submitted.

<form action = "{{url_for('predict')}}" method = "POST" >


Finally, the HTML page presents the stored result in the result parameter.

<strong style="color:red">{{result}}</strong>

Summary

In this article, you learned how to create a web application for prediction from scratch. First, we started with the problem definition and data collection. Then, we worked on data preparation, data exploration, data modeling, and model evaluation. Finally, we deployed the model using Flask.

Now, it is time to practice and apply what you learned in this article. Define a problem, search for a dataset
on the Internet, and then go through the other steps of the data science lifecycle.

About the Author

Nada Alay

I am working in the data analytics field and am passionate about data science, machine learning, and scientific research.

Article Url - https://www.analyticsvidhya.com/blog/2020/09/web-application/

Guest Blog
