0% found this document useful (0 votes)

22 views17 pages

Experiment 2

Uploaded by

mohammed.ansari

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

22 views17 pages

Experiment 2

Uploaded by

mohammed.ansari

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 17

Experiment - 2

Name: Ansari Mohammed Shanouf Valijan

Class: B.E. Computer Engineering, Semester - VII
UID: 2021300004
Batch: M

Aim:
To implement decision trees for classification and regression on healthcare datasets.

Objective:
To write programs in python in order to demonstrate the working of decision trees for
classification and regression task by using appropriate medical datasets for building the trees
and using it to perform prediction on new data.

Outcomes:
▪ To be able to configure a decision tree in based on the dataset under consideration.
▪ To be able to train and test the decision tree model using various performance
parameters.
▪ To be able to use the model to predict the solution to the problem under consideration
and interpret it well.

Theory:
Decision trees are a popular and versatile machine learning algorithm used for both
classification and regression tasks. They work by recursively splitting the data into subsets
based on certain criteria, forming a tree-like structure of decisions. Each internal node of the
tree represents a test or decision on an attribute (feature), each branch corresponds to the
outcome of that test, and each leaf node represents a class label (in classification) or a
continuous value (in regression). The goal of a decision tree is to create a model that predicts
the value of a target variable by learning simple decision rules inferred from the data features.
Decision trees are favoured for their interpretability, as the decision-making process is
transparent and can be easily visualized, making them an excellent choice when transparency
in model decision-making is essential.
The construction of a decision tree involves selecting the best attribute to split the data at
each node, a process typically guided by measures like Gini impurity, entropy, or information
gain in classification tasks, and variance reduction in regression. The algorithm evaluates all
possible splits across all features and chooses the one that best separates the data according
to the chosen criterion. This process is repeated recursively for each subset of data, forming
a tree until a stopping condition is met, such as reaching a maximum depth or having too few
samples to split further. One of the key advantages of decision trees is their ability to handle
both numerical and categorical data and their robustness to irrelevant features. However,
decision trees are prone to overfitting, especially with noisy data, as they can grow very deep
and complex, capturing random fluctuations in the data rather than the underlying pattern.

Decision trees are widely used in various fields, including finance, healthcare, marketing, and
more, due to their simplicity and interpretability. They can be used in tasks such as credit
scoring, medical diagnosis, and customer segmentation. Despite their strengths, decision
trees have some limitations. They tend to be unstable, meaning small changes in the data can
lead to significantly different trees. This sensitivity to data variations can reduce the model's
generalization ability. Moreover, decision trees can be biased towards features with more
levels (categories) and are not always the best performers in terms of predictive accuracy,
particularly when compared to more complex models like ensemble methods (e.g., Random
Forests or Gradient Boosting). To mitigate these issues, techniques such as pruning, which
involves cutting back the tree to prevent overfitting, and ensemble methods, which combine
multiple trees to improve stability and accuracy, are often employed. Despite these
challenges, decision trees remain a fundamental tool in the machine learning toolbox, valued
for their clarity and ease of use.
Dataset Description:
[1] For classification task-
For the task of constructing the decision tree, the Stroke Prediction Dataset was taken into
consideration.
(https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset/data)

The task at hand was to be able to predict whether a particular person would suffer from a
stroke based on 10 clinical features that characterize the person. These features include
gender, age, whether the person suffers from hypertension, whether the person has history
of heart diseases, person’s work type (private, government, self-employed), marriage history,
residence type (rural, urban), average glucose level, bmi and the person’s smoking status
(never smoked, formerly smoked).

The dataset has about 5000 records, thereby providing initial confidence for coming up with
a good decision tree model.

[2] For regression task-

Here, in order to construct the decision tree, the Body Mass Index Detection dataset was
utilized.
(https://www.kaggle.com/datasets/sayanroy058/body-mass-index-detection)

The idea was to predict the BMI of a person given his/her age, weight, bio-impudence, gender
and height. The dataset has about 741 records.

Code:
Following is a step-by-step implementation of the task at hand-

[1] Classification task:

Link to Notebook -> DecisionTreeClassification

Importing the required libraries

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import export_text
import matplotlib.pyplot as plt
from sklearn.tree import plot_tree
import matplotlib.pyplot as plt
import seaborn as sns

Importing the dataset

dataset = pd.read_csv('/content/healthcare-dataset-stroke-data.csv')

Viewing the dataset at a glance

dataset.describe(include='all')
Dropping irrelevant column ‘id’
dataset = dataset.drop(columns=['id'])

Plotting the distribution of features in the dataset

numeric_columns = df.select_dtypes(include=['float64', 'int64']).columns

for col in numeric_columns:

plt.figure(figsize=(8, 4))
sns.histplot(df[col], kde=True, bins=30)
plt.title(f'Distribution of {col}')
plt.show()

categorical_columns = df.select_dtypes(include=['object']).columns

for col in categorical_columns:

plt.figure(figsize=(8, 4))
sns.countplot(data=df, x=col)
plt.title(f'Count of {col}')
plt.show()
The above plots helped me in understanding the dataset under consideration a bit better. I
was able to use the plots to better preprocess the data, getting rid of certain rows based on
the distribution as seen above.
Plotting the correlation matrix
corr_matrix = df.corr()

plt.figure(figsize=(10, 8))
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', fmt='.2f')
plt.title('Correlation Matrix')
plt.show()
The feature ‘age’ seems to be highly correlated to the possibility of a person suffering from a
stroke. The information about residence seems to have the least effect on knowing whether
a particular person will or will not suffer a stroke. Accordingly, that feature may be dropped.

Plotting pair-wise features in an attempt to find further patterns in the dataset

sns.pairplot(df, hue='stroke')
plt.show()

In the above plots, orange data points represent people who suffered from stroke, while blue
points represent otherwise. One can clearly make out that usually, people with old age are
likely to suffer from stokes in all the scenarios (as represented by high density of orange points
in the row labelled as age).

Encoding the categorical attributes to their numerical counterparts

columns_to_encode = ['gender', 'ever_married', 'work_type', 'Residence_type',
'smoking_status']

for column in columns_to_encode:

unique_values = df[column].unique()
value_to_int = {value: idx + 1 for idx, value in enumerate(unique_values)}

df[column] = df[column].map(value_to_int)

Dropping the rows where BMI value is missing

df['bmi'].isna().sum()
(201/5110)*100
3.9334637964774952
df_cleaned = df.dropna(subset=['bmi'])

Viewing if the dataset is unbalanced

stroke_counts = df_cleaned['stroke'].value_counts()

plt.figure(figsize=(8, 6))
plt.pie(stroke_counts, labels=stroke_counts.index, autopct='%1.1f%%',
colors=['#ff9999','#66b3ff'], startangle=140)
plt.title('Distribution of Stroke Cases')
plt.show()

Since the dataset is unbalanced, random sampling was performed to balance it.
Balancing the dataset
stroke_one_df = df_cleaned[df_cleaned['stroke'] == 1]
stroke_zero_df = df_cleaned[df_cleaned['stroke'] == 0].sample(n=211,
random_state=1)
new_df = pd.concat([stroke_one_df, stroke_zero_df])
new_df.reset_index(drop=True, inplace=True)
stroke_counts = new_df['stroke'].value_counts()

plt.figure(figsize=(8, 6))
plt.pie(stroke_counts, labels=stroke_counts.index, autopct='%1.1f%%',
colors=['#ff9999','#66b3ff'], startangle=140)
plt.title('Distribution of Stroke Cases')
plt.show()

Training the decision tree model

df = new_df
X = df.drop('stroke', axis=1)
y = df['stroke']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
random_state=42)
param_grid = {
'criterion': ['gini', 'entropy'],
'max_depth': [10, 20, 30, 40, 50],
'min_samples_split': [2, 5, 10],
'min_samples_leaf': [1, 2, 4]
}
clf = DecisionTreeClassifier()
grid_search = GridSearchCV(estimator=clf, param_grid=param_grid,
cv=5, n_jobs=-1, verbose=2)
grid_search.fit(X_train, y_train)

print("Best parameters found: ", grid_search.best_params_)

best_model = grid_search.best_estimator_
y_pred = best_model.predict(X_test)
y_pred_ontrained = best_model.predict(X_train)
Evaluating the model
y_pred = best_model.predict(X_test)
from sklearn.metrics import accuracy_score
print("Accuracy Score: ", accuracy_score(y_test, y_pred))
tree_rules = export_text(best_model, feature_names=list(X.columns))
print("\nDecision Tree Rules:\n", tree_rules)
accuracy_score(y_train, y_pred_ontrained)

[2] Regression task:

Link to Notebook -> DecisionTreeRegression

Importing the necessary libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeRegressor, plot_tree
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.preprocessing import LabelEncoder
import seaborn as sns

Importing the dataset

df = pd.read_csv('/content/Body Mass Index.csv')

Dropping irrelevant columns and encoding the categorical columns

df = df.drop(columns=['BmiClass'])
label_encoder = LabelEncoder()

df['Gender_encoded'] = label_encoder.fit_transform(df['Gender'])
df = df.drop(columns=['Gender'])

Visualizing the various features of the dataset to better understand it

numeric_columns = df.select_dtypes(include=['float64', 'int64']).columns

for col in numeric_columns:

plt.figure(figsize=(8, 4))
sns.histplot(df[col], kde=True, bins=30)
plt.title(f'Distribution of {col}')
plt.show()

categorical_columns = df.select_dtypes(include=['object']).columns
for col in categorical_columns:
plt.figure(figsize=(8, 4))
sns.countplot(data=df, x=col)
plt.title(f'Count of {col}')
plt.show()

Viewing the correlation among different features present in the dataset

corr_matrix = df.corr()

plt.figure(figsize=(10, 8))
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', fmt='.2f')
plt.title('Correlation Matrix')
plt.show()

The above plot clearly depicts a high dependence of BMI on weight, which is quite logical.
Further, height shows a correlation almost half as strong as weight, still an important factor
to take into consideration. Age seems to have the least positive correlation with the BMI.

Viewing pair-wise plots

sns.pairplot(df, hue='Bmi')
plt.show()
In the above plots, darker hues (purple in colour) depict higher BMI values and as can be
observed, almost all features with values towards higher end are pointing towards a high BMI
value. An exception to this is the Bio Impudence v/s Height plot where high BMI values seem
to be scattered.

Splitting the processed and analysed dataset into train and test sets
X = df.drop(columns='Bmi')
y = df['Bmi']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,

random_state=42)
Defining the decision tree regressor model and training it (parameters were chosen after
experimenting with different configurations and choosing the ones that avoided overfitting)
regressor = DecisionTreeRegressor(
max_depth=25,
min_samples_split=40,
min_samples_leaf=15,
max_features='sqrt',
random_state=10
)

regressor.fit(X_train, y_train)

Evaluating the model

y_pred = regressor.predict(X_test)

mae = mean_absolute_error(y_test, y_pred)

mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
r2 = r2_score(y_test, y_pred)

print(f"Mean Absolute Error (MAE): {mae}")

print(f"Mean Squared Error (MSE): {mse}")
print(f"Root Mean Squared Error (RMSE): {rmse}")
print(f"R-squared (R^2): {r2}")

Printing the decision tree as hypothesized

plt.figure(figsize=(20, 10))
plot_tree(regressor,
feature_names=X.columns,
filled=True,
rounded=True,)
plt.title('Decision Tree Visualization')
plt.show()

Output:
[1] Classification task:
Upon evaluating the model, following was the accuracy obtained-
Accuracy Score: 0.8263492063492064
Training accuracy was as follows-
Accuracy Score: 0.9149659863945578

Class 0 representing no stroke, class 1 representing a possibility of stroke.

[1] Regression task:

Following performance parameters were obtained on training dataset-
Mean Absolute Error (MAE): 1.85
Mean Squared Error (MSE): 10.16
Root Mean Squared Error (RMSE): 3.19
R-squared (R^2): 0.89

Following performance parameters were obtained on test dataset-

Mean Absolute Error (MAE): 2.1160518106723467
Mean Squared Error (MSE): 10.597756621559329
Root Mean Squared Error (RMSE): 3.255419576883958
R-squared (R^2): 0.8517373327150053

Decision tree that was hypothesized for the regression task is as follows-
Conclusion:
By performing this experiment, I was able to understand the basic concepts associated with
building a decision tree. I was able to build, train and test the tree in python and was able to
come up with the following inferences-
▪ In case of classification task, the analysis steps revealed a huge dependence on age as
a factor in determining whether a person would or would not suffer from a stroke.
▪ The trained decision tree model showed an accuracy of 82.63 percent on validation set
while the accuracy of 91.49 percent was obtained on training set.
▪ Printing the decision tree hypothesized further supported the first inference owing to
a large number of decision nodes being based on age.
▪ In case of regression task, the analysis, logically, entailed a heavy dependence on
weight and height as features for the prediction of body mass index of an individual.
▪ The model trained initially had a test r-square value of 0.98 which was identified as
overfitting. The rectified model, then, had the test r-square value of around 0.8517
percent while the r-square value on training data was approximately 0.89.

Statistics For Health Care Research Research A Practical Workbook 1st Edition Susan K Grove ISBN 9781416002260 PDF Download
No ratings yet
Statistics For Health Care Research Research A Practical Workbook 1st Edition Susan K Grove ISBN 9781416002260 PDF Download
337 pages
BRM MCQs For PG - Merged
No ratings yet
BRM MCQs For PG - Merged
57 pages
AIML Record Batch 9
No ratings yet
AIML Record Batch 9
88 pages
Lecture Notes - Decision Tree
No ratings yet
Lecture Notes - Decision Tree
13 pages
Cse437 4
No ratings yet
Cse437 4
14 pages
Mini Project 2024
No ratings yet
Mini Project 2024
48 pages
Team 03
No ratings yet
Team 03
21 pages
Lab6A-Asset Tracking
No ratings yet
Lab6A-Asset Tracking
27 pages
Machine Learning - Project
No ratings yet
Machine Learning - Project
26 pages
Experiment 1
No ratings yet
Experiment 1
21 pages
Chapter 3
No ratings yet
Chapter 3
27 pages
Dissertation
No ratings yet
Dissertation
41 pages
BCS301.Module 5
No ratings yet
BCS301.Module 5
43 pages
Openlab 1
No ratings yet
Openlab 1
17 pages
CSS 2024 25 BE CE A B Sem VII OTH Lec 4 Unit II Asymmetric RSA DH Ciphers
No ratings yet
CSS 2024 25 BE CE A B Sem VII OTH Lec 4 Unit II Asymmetric RSA DH Ciphers
29 pages
Decision Trees
No ratings yet
Decision Trees
25 pages
Heart Disease Diagnosis Using Machine Learning
No ratings yet
Heart Disease Diagnosis Using Machine Learning
26 pages
C2 W4 Lab 02 Tree Ensemble
No ratings yet
C2 W4 Lab 02 Tree Ensemble
10 pages
Sampling Techniques
No ratings yet
Sampling Techniques
32 pages
Meta Analysis in A Digitalized World: A Step by Step Primer: Esther Kaufmann Ulf Dietrich Reips
No ratings yet
Meta Analysis in A Digitalized World: A Step by Step Primer: Esther Kaufmann Ulf Dietrich Reips
21 pages
C2 W4 Lab 02 Tree Ensemble
No ratings yet
C2 W4 Lab 02 Tree Ensemble
16 pages
Experiment 1
No ratings yet
Experiment 1
16 pages
Experiment 7
No ratings yet
Experiment 7
13 pages
DSM Practical 1
No ratings yet
DSM Practical 1
14 pages
CSS 2024 25 BE CE A B Sem VII AVN Lec 1 Introduction
No ratings yet
CSS 2024 25 BE CE A B Sem VII AVN Lec 1 Introduction
14 pages
PRJ-Parkinsons Disease Prediction
No ratings yet
PRJ-Parkinsons Disease Prediction
16 pages
Machine Learning
No ratings yet
Machine Learning
16 pages
m4 Codes
No ratings yet
m4 Codes
6 pages
Experiment 4
No ratings yet
Experiment 4
12 pages
Rapport
No ratings yet
Rapport
21 pages
Heart Disease Predictor - ML - Report
No ratings yet
Heart Disease Predictor - ML - Report
15 pages
Experiment 3
No ratings yet
Experiment 3
9 pages
Heart Disease Prediction
No ratings yet
Heart Disease Prediction
16 pages
Experiment 5
No ratings yet
Experiment 5
10 pages
SUMMARY
No ratings yet
SUMMARY
16 pages
Experiment 4
No ratings yet
Experiment 4
8 pages
Lecture 15: Tree-Based Algorithms - Applied ML
No ratings yet
Lecture 15: Tree-Based Algorithms - Applied ML
17 pages
ML4 - Decision Trees & Random Forest
No ratings yet
ML4 - Decision Trees & Random Forest
44 pages
Experiment 2
No ratings yet
Experiment 2
12 pages
K-Nearest Neighbors For Diabetes Prediction: Malik Yousaf (F2020019038) Ahsan Rauf (F2020019057)
No ratings yet
K-Nearest Neighbors For Diabetes Prediction: Malik Yousaf (F2020019038) Ahsan Rauf (F2020019057)
15 pages
ML Lec-12
No ratings yet
ML Lec-12
17 pages
Experiment 5
No ratings yet
Experiment 5
14 pages
Decision Support
No ratings yet
Decision Support
21 pages
Draft Xai
No ratings yet
Draft Xai
16 pages
DSM Mini Project
No ratings yet
DSM Mini Project
11 pages
Experiment 5
No ratings yet
Experiment 5
8 pages
Experiment 4
No ratings yet
Experiment 4
8 pages
Aih Exp 2
No ratings yet
Aih Exp 2
8 pages
Audit Sampling (Chapter 9)
100% (2)
Audit Sampling (Chapter 9)
25 pages
23ucc542 ml9
No ratings yet
23ucc542 ml9
6 pages
23 Ucc 554 Aiml
No ratings yet
23 Ucc 554 Aiml
5 pages
Experiment 8
No ratings yet
Experiment 8
13 pages
Stroke Prediction
No ratings yet
Stroke Prediction
10 pages
AIH Lab2
No ratings yet
AIH Lab2
10 pages
Subject: STT 041 Lesson Title: Introduction To Statistics
No ratings yet
Subject: STT 041 Lesson Title: Introduction To Statistics
11 pages
Experiment 8 ML Vtu
No ratings yet
Experiment 8 ML Vtu
4 pages
Experiment 2
No ratings yet
Experiment 2
7 pages
Experiment 1
No ratings yet
Experiment 1
7 pages
Experiment 6
No ratings yet
Experiment 6
7 pages
1 s2.0 S1877050918315163 Main PDF
No ratings yet
1 s2.0 S1877050918315163 Main PDF
11 pages
Experiment 8
No ratings yet
Experiment 8
4 pages
Heart Disease Report
No ratings yet
Heart Disease Report
8 pages
Class Assignment On Decision Trees
No ratings yet
Class Assignment On Decision Trees
6 pages
Experiment 3
No ratings yet
Experiment 3
6 pages
7.1.1. Linear Regression - Intuition
No ratings yet
7.1.1. Linear Regression - Intuition
7 pages
Prediction of Heart Disease Using Decision Tree in Comparison With KNN To Improve Accuracy
No ratings yet
Prediction of Heart Disease Using Decision Tree in Comparison With KNN To Improve Accuracy
5 pages
Class-Work-Naive-Bayes (21-10-2024)
No ratings yet
Class-Work-Naive-Bayes (21-10-2024)
5 pages
Class-Work-1 (26-08-2024)
No ratings yet
Class-Work-1 (26-08-2024)
5 pages
Heart Disease Prediction Using Machine Learning Techniques: Abstract
No ratings yet
Heart Disease Prediction Using Machine Learning Techniques: Abstract
5 pages
Edited Version of Cardiovascular Diseases Risk Prediction Dataset Report
No ratings yet
Edited Version of Cardiovascular Diseases Risk Prediction Dataset Report
25 pages
Short Notes - Binomial Distribution - Lakshya MHTCET 2025
No ratings yet
Short Notes - Binomial Distribution - Lakshya MHTCET 2025
3 pages
Healthcare-Project-Simplilearn - Week3
No ratings yet
Healthcare-Project-Simplilearn - Week3
7 pages
Ecc 3014 Lecture 1 - Introduction To Statistics
No ratings yet
Ecc 3014 Lecture 1 - Introduction To Statistics
19 pages
Project Report
No ratings yet
Project Report
13 pages
2020-3-Ns-Spi (Q)
No ratings yet
2020-3-Ns-Spi (Q)
3 pages
Assignment On Module-3
No ratings yet
Assignment On Module-3
3 pages
Laporan Hasil Analisis Regresi Dan Uji Pendukungnya: A. Lampiran I. Output Eviews
No ratings yet
Laporan Hasil Analisis Regresi Dan Uji Pendukungnya: A. Lampiran I. Output Eviews
9 pages
Stroke Prediction Dataset
No ratings yet
Stroke Prediction Dataset
48 pages
XL Miner User Guide
No ratings yet
XL Miner User Guide
420 pages
Experiment 7
No ratings yet
Experiment 7
6 pages
Assignment-1, 2
No ratings yet
Assignment-1, 2
2 pages
Experiment 3
No ratings yet
Experiment 3
5 pages
Mat Prob Pertemuan 4
No ratings yet
Mat Prob Pertemuan 4
20 pages
Cardiovascular Disease Prediction
No ratings yet
Cardiovascular Disease Prediction
2 pages
Chapter 7
No ratings yet
Chapter 7
53 pages
IEEE Conference Team ATOM
No ratings yet
IEEE Conference Team ATOM
5 pages
Lesson 8 Random Sampling Activity 12
No ratings yet
Lesson 8 Random Sampling Activity 12
6 pages
Medical
No ratings yet
Medical
4 pages
Heart Disease Prediction
No ratings yet
Heart Disease Prediction
17 pages
E IS388 Theory MellaMargaretaVeronica 00000059669
No ratings yet
E IS388 Theory MellaMargaretaVeronica 00000059669
7 pages
Sampling Design and Analysis MTH 494: Ossam Chohan Assistant Professor CIIT Abbottabad
No ratings yet
Sampling Design and Analysis MTH 494: Ossam Chohan Assistant Professor CIIT Abbottabad
34 pages
Synopsis (Heart Disease Prediction)
No ratings yet
Synopsis (Heart Disease Prediction)
7 pages
REASEARCH
No ratings yet
REASEARCH
4 pages
Panel Data For Learing
100% (2)
Panel Data For Learing
34 pages
Name: Le Ho Thao Nguyen Student ID: 20194224
No ratings yet
Name: Le Ho Thao Nguyen Student ID: 20194224
9 pages
Chapter 2 Frappy Student Sample Commentary
No ratings yet
Chapter 2 Frappy Student Sample Commentary
1 page
Heart Disease Prediction - Jupyter Notebook
100% (1)
Heart Disease Prediction - Jupyter Notebook
9 pages
Course Content: St. Paul University Philippines
50% (4)
Course Content: St. Paul University Philippines
33 pages
Tugas 2: Bayessian Method (Atribut Kontinyu) : Outlook Temperature Humidity Windy Play-Class
No ratings yet
Tugas 2: Bayessian Method (Atribut Kontinyu) : Outlook Temperature Humidity Windy Play-Class
3 pages
Machine Learning Lab: Delhi Technological University
No ratings yet
Machine Learning Lab: Delhi Technological University
6 pages
ETE 399 Mini Project
No ratings yet
ETE 399 Mini Project
7 pages
14 Variable Sampling Plan
100% (3)
14 Variable Sampling Plan
59 pages
Data Mining Assignment No. 1
No ratings yet
Data Mining Assignment No. 1
7 pages
Final Research Paper
No ratings yet
Final Research Paper
3 pages
DSAASTAT - by Andrea Onofri
No ratings yet
DSAASTAT - by Andrea Onofri
10 pages
20MIS7043 (LAB 7) .Ipynb Colaboratory
No ratings yet
20MIS7043 (LAB 7) .Ipynb Colaboratory
4 pages
20MIS7095 (LAB 7) .Ipynb Colaboratory
No ratings yet
20MIS7095 (LAB 7) .Ipynb Colaboratory
4 pages
7708 - MBA PredAnanBigDataNov21
No ratings yet
7708 - MBA PredAnanBigDataNov21
11 pages
ALAS - M2L3 Application
No ratings yet
ALAS - M2L3 Application
6 pages
Detecting Outliers With Grubbs' Test
No ratings yet
Detecting Outliers With Grubbs' Test
8 pages
12 Duncan's Multiple Range Test (DMRT) : September 2019
No ratings yet
12 Duncan's Multiple Range Test (DMRT) : September 2019
19 pages
Second Progres Report
No ratings yet
Second Progres Report
10 pages
Business Statistics: For University of Delhi
No ratings yet
Business Statistics: For University of Delhi
11 pages
ml2 PDF
No ratings yet
ml2 PDF
5 pages
2007 AP Statistics Multiple Choice Exam
No ratings yet
2007 AP Statistics Multiple Choice Exam
17 pages
Decision Tree Pruning: Fundamentals and Applications
From Everand
Decision Tree Pruning: Fundamentals and Applications
Fouad Sabry
No ratings yet
Alternating Decision Tree: Fundamentals and Applications
From Everand
Alternating Decision Tree: Fundamentals and Applications
Fouad Sabry
No ratings yet