
Build a random forest model

May 1, 2023

1 Build a random forest model

1.1 Introduction

As we’re learning, random forests are popular statistical learning algorithms. Some of their primary
benefits include reducing variance and the chance of overfitting.
Here, we will train, tune, and evaluate a random forest model using data from a spreadsheet of survey
responses from 129,880 customers. It includes data points such as class, flight distance, and inflight
entertainment. Our random forest model will be used to predict whether a customer will be satisfied
with their flight experience.
Note: Exploratory data analysis, data cleaning, and other manipulations are required first to prepare
the data for modeling.

1.2 Step 1: Imports

We import the relevant Python libraries and modules, including the numpy and pandas libraries
for data processing, the pickle package to save the model, and the sklearn library, containing:
- The module ensemble, which has the function RandomForestClassifier
- The module model_selection, which has the functions train_test_split, PredefinedSplit, and GridSearchCV
- The module metrics, which has the functions f1_score, precision_score, recall_score, and accuracy_score
[1]: # Import `numpy`, `pandas`, `pickle`, and `sklearn`.
     # Import the relevant functions from `sklearn.ensemble`,
     # `sklearn.model_selection`, and `sklearn.metrics`.

     import numpy as np
     import pandas as pd

     import pickle as pkl

     from sklearn.ensemble import RandomForestClassifier
     from sklearn.model_selection import train_test_split, PredefinedSplit, GridSearchCV
     from sklearn.metrics import f1_score, precision_score, recall_score, accuracy_score

As shown in the next cell, the dataset has been automatically loaded in for us; we do not need to
download the .csv file.
[2]: # RUN THIS CELL TO IMPORT YOUR DATA.
air_data = pd.read_csv("Invistico_Airline.csv")

Now, we are ready to begin cleaning the data.

1.3 Step 2: Data cleaning

To get a sense of the data, display the first 10 rows.


[3]: # Display first 10 rows.
air_data.head(10)

[3]: satisfaction Customer Type Age Type of Travel Class \


0 satisfied Loyal Customer 65 Personal Travel Eco
1 satisfied Loyal Customer 47 Personal Travel Business
2 satisfied Loyal Customer 15 Personal Travel Eco
3 satisfied Loyal Customer 60 Personal Travel Eco
4 satisfied Loyal Customer 70 Personal Travel Eco
5 satisfied Loyal Customer 30 Personal Travel Eco
6 satisfied Loyal Customer 66 Personal Travel Eco
7 satisfied Loyal Customer 10 Personal Travel Eco
8 satisfied Loyal Customer 56 Personal Travel Business
9 satisfied Loyal Customer 22 Personal Travel Eco

Flight Distance Seat comfort Departure/Arrival time convenient \


0 265 0 0
1 2464 0 0
2 2138 0 0
3 623 0 0
4 354 0 0
5 1894 0 0
6 227 0 0
7 1812 0 0
8 73 0 0
9 1556 0 0

Food and drink Gate location … Online support Ease of Online booking \
0 0 2 … 2 3
1 0 3 … 2 3
2 0 3 … 2 2
3 0 3 … 3 1
4 0 3 … 4 2
5 0 3 … 2 2
6 0 3 … 5 5
7 0 3 … 2 2
8 0 3 … 5 4
9 0 3 … 2 2

On-board service Leg room service Baggage handling Checkin service \


0 3 0 3 5
1 4 4 4 2
2 3 3 4 4
3 1 0 1 4
4 2 0 2 4
5 5 4 5 5
6 5 0 5 5
7 3 3 4 5
8 4 0 1 5
9 2 4 5 3

Cleanliness Online boarding Departure Delay in Minutes \


0 3 2 0
1 3 2 310
2 4 2 0
3 1 3 0
4 2 5 0
5 4 2 0
6 5 3 17
7 4 2 0
8 4 4 0
9 4 2 30

Arrival Delay in Minutes


0 0.0
1 305.0
2 0.0
3 0.0
4 0.0
5 0.0
6 15.0
7 0.0
8 0.0
9 26.0

[10 rows x 22 columns]

Now, we will display the variable names and their data types.
[4]: # Display variable names and types.
air_data.dtypes

[4]: satisfaction object
Customer Type object
Age int64
Type of Travel object
Class object
Flight Distance int64
Seat comfort int64
Departure/Arrival time convenient int64
Food and drink int64
Gate location int64
Inflight wifi service int64
Inflight entertainment int64
Online support int64
Ease of Online booking int64
On-board service int64
Leg room service int64
Baggage handling int64
Checkin service int64
Cleanliness int64
Online boarding int64
Departure Delay in Minutes int64
Arrival Delay in Minutes float64
dtype: object

Question: What do you observe about the differences in data types among the variables included
in the data?
Next, to understand the size of the dataset, identify the number of rows and the number of columns.
[5]: # Identify the number of rows and the number of columns.
air_data.shape

[5]: (129880, 22)

Now, we check for missing values in the rows of the data. We start with .isna() to get Booleans
indicating whether each value in the data is missing. Then, we use .any(axis=1) to get Booleans
indicating whether each row contains any missing values. Finally, we use .sum() to get the number
of rows that contain missing values.

[6]: # Get Booleans to find missing values in data.
     # Get Booleans to find missing values along columns.
     # Get the number of rows that contain missing values.
     air_data.isna().any(axis=1).sum()

[6]: 393

We drop the rows with missing values. This is an important step in data cleaning, as it makes
the data more useful for analysis and modeling. We save the resulting pandas DataFrame in a
variable named air_data_subset.

[8]: # Drop missing values.
# Save the DataFrame in variable `air_data_subset`.
air_data_subset = air_data.dropna(axis=0)

Next, display the first 10 rows to examine the data subset.


[9]: # Display the first 10 rows.
air_data_subset.head(10)

[9]: satisfaction Customer Type Age Type of Travel Class \


0 satisfied Loyal Customer 65 Personal Travel Eco
1 satisfied Loyal Customer 47 Personal Travel Business
2 satisfied Loyal Customer 15 Personal Travel Eco
3 satisfied Loyal Customer 60 Personal Travel Eco
4 satisfied Loyal Customer 70 Personal Travel Eco
5 satisfied Loyal Customer 30 Personal Travel Eco
6 satisfied Loyal Customer 66 Personal Travel Eco
7 satisfied Loyal Customer 10 Personal Travel Eco
8 satisfied Loyal Customer 56 Personal Travel Business
9 satisfied Loyal Customer 22 Personal Travel Eco

Flight Distance Seat comfort Departure/Arrival time convenient \


0 265 0 0
1 2464 0 0
2 2138 0 0
3 623 0 0
4 354 0 0
5 1894 0 0
6 227 0 0
7 1812 0 0
8 73 0 0
9 1556 0 0

Food and drink Gate location … Online support Ease of Online booking \
0 0 2 … 2 3
1 0 3 … 2 3
2 0 3 … 2 2
3 0 3 … 3 1
4 0 3 … 4 2
5 0 3 … 2 2
6 0 3 … 5 5
7 0 3 … 2 2
8 0 3 … 5 4
9 0 3 … 2 2

On-board service Leg room service Baggage handling Checkin service \


0 3 0 3 5
1 4 4 4 2
2 3 3 4 4
3 1 0 1 4
4 2 0 2 4
5 5 4 5 5
6 5 0 5 5
7 3 3 4 5
8 4 0 1 5
9 2 4 5 3

Cleanliness Online boarding Departure Delay in Minutes \


0 3 2 0
1 3 2 310
2 4 2 0
3 1 3 0
4 2 5 0
5 4 2 0
6 5 3 17
7 4 2 0
8 4 4 0
9 4 2 30

Arrival Delay in Minutes


0 0.0
1 305.0
2 0.0
3 0.0
4 0.0
5 0.0
6 15.0
7 0.0
8 0.0
9 26.0

[10 rows x 22 columns]

Confirm that it does not contain any missing values.


[ ]: # Count of missing values.
air_data_subset.isna().sum()

Next, convert the categorical features to indicator (one-hot encoded) features.


Note: The drop_first argument can be kept as default (False) during one-hot encoding for
random forest models, so it does not need to be specified. Also, the target variable, satisfaction,
does not need to be encoded and will be extracted in a later step.

6
[10]: # Convert categorical features to one-hot encoded features.
      air_data_subset_dummies = pd.get_dummies(air_data_subset,
                                               columns=['Customer Type', 'Type of Travel', 'Class'])

Question: Why is it necessary to convert categorical data into dummy variables?


Next, display the first 10 rows to review air_data_subset_dummies.
[11]: # Display the first 10 rows.
air_data_subset_dummies.head(10)

[11]: satisfaction Age Flight Distance Seat comfort \


0 satisfied 65 265 0
1 satisfied 47 2464 0
2 satisfied 15 2138 0
3 satisfied 60 623 0
4 satisfied 70 354 0
5 satisfied 30 1894 0
6 satisfied 66 227 0
7 satisfied 10 1812 0
8 satisfied 56 73 0
9 satisfied 22 1556 0

Departure/Arrival time convenient Food and drink Gate location \


0 0 0 2
1 0 0 3
2 0 0 3
3 0 0 3
4 0 0 3
5 0 0 3
6 0 0 3
7 0 0 3
8 0 0 3
9 0 0 3

Inflight wifi service Inflight entertainment Online support … \


0 2 4 2 …
1 0 2 2 …
2 2 0 2 …
3 3 4 3 …
4 4 3 4 …
5 2 0 2 …
6 2 5 5 …
7 2 0 2 …
8 5 3 5 …
9 2 0 2 …

Online boarding Departure Delay in Minutes Arrival Delay in Minutes \
0 2 0 0.0
1 2 310 305.0
2 2 0 0.0
3 3 0 0.0
4 5 0 0.0
5 2 0 0.0
6 3 17 15.0
7 2 0 0.0
8 4 0 0.0
9 2 30 26.0

Customer Type_Loyal Customer Customer Type_disloyal Customer \


0 1 0
1 1 0
2 1 0
3 1 0
4 1 0
5 1 0
6 1 0
7 1 0
8 1 0
9 1 0

Type of Travel_Business travel Type of Travel_Personal Travel \


0 0 1
1 0 1
2 0 1
3 0 1
4 0 1
5 0 1
6 0 1
7 0 1
8 0 1
9 0 1

Class_Business Class_Eco Class_Eco Plus


0 0 1 0
1 1 0 0
2 0 1 0
3 0 1 0
4 0 1 0
5 0 1 0
6 0 1 0
7 0 1 0
8 1 0 0
9 0 1 0

[10 rows x 26 columns]

Then, check the variables of air_data_subset_dummies.


[12]: # Display variables.
air_data_subset_dummies.dtypes

[12]: satisfaction object


Age int64
Flight Distance int64
Seat comfort int64
Departure/Arrival time convenient int64
Food and drink int64
Gate location int64
Inflight wifi service int64
Inflight entertainment int64
Online support int64
Ease of Online booking int64
On-board service int64
Leg room service int64
Baggage handling int64
Checkin service int64
Cleanliness int64
Online boarding int64
Departure Delay in Minutes int64
Arrival Delay in Minutes float64
Customer Type_Loyal Customer uint8
Customer Type_disloyal Customer uint8
Type of Travel_Business travel uint8
Type of Travel_Personal Travel uint8
Class_Business uint8
Class_Eco uint8
Class_Eco Plus uint8
dtype: object

Question: What changes do you observe after converting the string data to dummy variables?

1.4 Step 3: Model building

The first step to building the model is separating the labels (y) from the features (X).

[13]: # Separate the dataset into labels (y) and features (X).
y = air_data_subset_dummies["satisfaction"]
X = air_data_subset_dummies.drop("satisfaction", axis=1)

Once separated, split the data into train, validate, and test sets. Holding out 25% for testing and
then 25% of the remainder for validation yields roughly 56% train, 19% validation, and 25% test.

[14]: # Separate into train, validate, test sets.
      X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
      X_tr, X_val, y_tr, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=0)

1.4.1 Tune the model

Now, we fit and tune a random forest model using a separate validation set. We begin by determining
a set of hyperparameters for tuning the model using GridSearchCV.
[15]: # Determine set of hyperparameters.
cv_params = {'n_estimators' : [50,100],
'max_depth' : [10,50],
'min_samples_leaf' : [0.5,1],
'min_samples_split' : [0.001, 0.01],
'max_features' : ["sqrt"],
'max_samples' : [.5,.9]}

This grid defines 2 × 2 × 2 × 2 × 1 × 2 = 32 candidate combinations, which matches the “32 candidates” reported during fitting below. Next, create a list of split indices.


[16]: # Create list of split indices.
split_index = [0 if x in X_val.index else -1 for x in X_train.index]
custom_split = PredefinedSplit(split_index)
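
In split_index, each row of X_train that also appears in X_val is flagged 0 (it forms the single
validation fold), and every other row is flagged -1 (it always stays in training). To see how
PredefinedSplit interprets these flags, here is a minimal sketch with toy indices (illustrative
only, not part of the lab data):

[ ]: # Toy flags (illustrative): -1 keeps a row in training for every split;
     # 0 assigns the row to validation fold 0.
     toy_flags = [-1, -1, 0, -1, 0]
     toy_split = PredefinedSplit(toy_flags)
     for train_idx, val_idx in toy_split.split():
         print(train_idx, val_idx)  # [0 1 3] [2 4] -> one train/validation split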

Now, instantiate your model.


[17]: # Instantiate model.
rf = RandomForestClassifier(random_state=0)

Next, use GridSearchCV to search over the specified parameters.


[18]: # Search over specified parameters.
      rf_val = GridSearchCV(rf, cv_params, cv=custom_split, refit='f1', n_jobs=-1, verbose=1)

Now, fit your model.


[19]: %%time

# Fit the model.


rf_val.fit(X_train, y_train)

Fitting 1 folds for each of 32 candidates, totalling 32 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 2 concurrent workers.
[Parallel(n_jobs=-1)]: Done 32 out of 32 | elapsed: 42.9s finished

CPU times: user 5.64 s, sys: 127 ms, total: 5.76 s
Wall time: 48.2 s

[19]: GridSearchCV(cv=PredefinedSplit(test_fold=array([-1, -1, …, -1, -1])),


error_score=nan,
estimator=RandomForestClassifier(bootstrap=True, ccp_alpha=0.0,
class_weight=None,
criterion='gini', max_depth=None,
max_features='auto',
max_leaf_nodes=None,
max_samples=None,
min_impurity_decrease=0.0,
min_impurity_split=None,
min_samples_leaf=1,
min_samples_split=2,
min_weig…
n_estimators=100, n_jobs=None,
oob_score=False, random_state=0,
verbose=0, warm_start=False),
iid='deprecated', n_jobs=-1,
param_grid={'max_depth': [10, 50], 'max_features': ['sqrt'],
'max_samples': [0.5, 0.9],
'min_samples_leaf': [0.5, 1],
'min_samples_split': [0.001, 0.01],
'n_estimators': [50, 100]},
pre_dispatch='2*n_jobs', refit='f1', return_train_score=False,
scoring=None, verbose=1)

Finally, obtain the optimal parameters.


[20]: # Obtain optimal parameters.
rf_val.best_params_

[20]: {'max_depth': 50,


'max_features': 'sqrt',
'max_samples': 0.9,
'min_samples_leaf': 1,
'min_samples_split': 0.001,
'n_estimators': 50}
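
We could also inspect the score this combination achieved on the validation fold. Note that because
no scoring argument was passed to GridSearchCV, best_score_ reflects the classifier's default
scorer (accuracy) rather than F1:

[ ]: # Validation-fold score of the best combination (default scorer is
     # accuracy here, since `scoring` was left unset).
     rf_val.best_score_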

1.5 Step 4: Results and evaluation

Use the selected model to predict on our test data. We use the optimal parameters found via
GridSearchCV.
[21]: # Use optimal parameters found via GridSearchCV.
      rf_opt = RandomForestClassifier(n_estimators=50, max_depth=50,
                                      min_samples_leaf=1, min_samples_split=0.001,
                                      max_features="sqrt", max_samples=0.9,
                                      random_state=0)

Next, fit the optimal model.


[22]: # Fit the optimal model.
rf_opt.fit(X_train, y_train)

[22]: RandomForestClassifier(bootstrap=True, ccp_alpha=0.0, class_weight=None,


criterion='gini', max_depth=50, max_features='sqrt',
max_leaf_nodes=None, max_samples=0.9,
min_impurity_decrease=0.0, min_impurity_split=None,
min_samples_leaf=1, min_samples_split=0.001,
min_weight_fraction_leaf=0.0, n_estimators=50,
n_jobs=None, oob_score=False, random_state=0, verbose=0,
warm_start=False)
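
As an aside, the pickle package was imported in Step 1 to save the model; a minimal sketch of
persisting and reloading the fitted model (the filename is illustrative):

[ ]: # Save the fitted model to disk (filename is illustrative).
     with open("rf_opt_model.pickle", "wb") as to_write:
         pkl.dump(rf_opt, to_write)

     # Reload it later without refitting.
     with open("rf_opt_model.pickle", "rb") as to_read:
         rf_opt_loaded = pkl.load(to_read)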

And predict on the test set using the optimal model.


[23]: # Predict on test set.
y_pred = rf_opt.predict(X_test)

1.5.1 Obtain performance scores

First, get our precision score.


[24]: # Get precision score.
pc_test = precision_score(y_test, y_pred, pos_label = "satisfied")
print("The precision score is {pc:.3f}".format(pc = pc_test))

The precision score is 0.950


Then, we collect the recall score.
[25]: # Get recall score.
rc_test = recall_score(y_test, y_pred, pos_label = "satisfied")
print("The recall score is {rc:.3f}".format(rc = rc_test))

The recall score is 0.945


Next, obtain our accuracy score.
[26]: # Get accuracy score.
ac_test = accuracy_score(y_test, y_pred)
print("The accuracy score is {ac:.3f}".format(ac = ac_test))

The accuracy score is 0.942

Finally, collect our F1-score.
[27]: # Get F1 score.
f1_test = f1_score(y_test, y_pred, pos_label = "satisfied")
print("The F1 score is {f1:.3f}".format(f1 = f1_test))

The F1 score is 0.947


Question: How is the F1-score calculated?
F1 scores are calculated using the following formula:
F1 = 2 * (precision * recall) / (precision + recall)
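As a quick sanity check, recomputing F1 from the rounded precision and recall printed above
reproduces the reported F1 score:

[ ]: # Recompute F1 from the rounded precision and recall shown above.
     precision, recall = 0.950, 0.945
     print(round(2 * (precision * recall) / (precision + recall), 3))  # 0.947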
Question: What are the pros and cons of performing the model selection using test data instead
of a separate validation dataset?
Pros:
• The coding workload is reduced.
• The scripts for data splitting are shorter.
• It’s only necessary to evaluate test dataset performance once, instead of twice (validate and test).
Cons:
• If a model is evaluated using samples that were also used to build or fine-tune that model, it will likely provide a biased evaluation.
• Selecting a model based on its test-set scores can leave the model overfit to the test data.

1.5.2 Evaluate the model

Now that we have results, evaluate the model.


Question: What are the four basic parameters for evaluating the performance of a classification
model?
1. True positives (TP): These are correctly predicted positive values, meaning the actual and
predicted classes are both positive.
2. True negatives (TN): These are correctly predicted negative values, meaning the actual and
predicted classes are both negative.
3. False positives (FP): This occurs when the actual class is negative but the predicted class
is positive.
4. False negatives (FN): This occurs when the actual class is positive but the predicted class
is negative.
Reminder: When fitting and tuning classification models, data professionals aim to minimize false
positives and false negatives.
Question: What do the four scores demonstrate about the model, and how do we calculate them?
• Accuracy ((TP+TN)/(TP+TN+FP+FN)): The ratio of correctly predicted observations to total
observations.
• Precision (TP/(TP+FP)): The ratio of correctly predicted positive observations to total
predicted positive observations.
• Recall (Sensitivity, TP/(TP+FN)): The ratio of correctly predicted positive observations to all
observations in the actual positive class.
• F1 score: The harmonic mean of precision and recall, which takes into account both false
positives and false negatives (see the sketch after this list).
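These four counts can be recovered directly from the test predictions and used to recompute the
scores by hand. A minimal sketch (this cell is an addition; it assumes the negative label in this
dataset is "dissatisfied"):

[ ]: from sklearn.metrics import confusion_matrix

     # Recover the four basic counts; "satisfied" is the positive class, and we
     # assume "dissatisfied" is the negative label in this dataset.
     tn, fp, fn, tp = confusion_matrix(y_test, y_pred,
                                       labels=["dissatisfied", "satisfied"]).ravel()

     accuracy = (tp + tn) / (tp + tn + fp + fn)
     precision = tp / (tp + fp)
     recall = tp / (tp + fn)
     f1 = 2 * precision * recall / (precision + recall)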
Calculate the scores: precision score, recall score, accuracy score, F1 score.
[28]: # Precision score on test data set.
      print("\nThe precision score is: {pc:.3f}".format(pc = pc_test),
            "for the test set,", "\nwhich means of all positive predictions,",
            "{pc_pct:.1f}% are true positives.".format(pc_pct = pc_test * 100))

The precision score is: 0.950 for the test set,
which means of all positive predictions, 95.0% are true positives.

[29]: # Recall score on test data set.
      print("\nThe recall score is: {rc:.3f}".format(rc = rc_test),
            "for the test set,", "\nwhich means of all real positive cases in the test set,",
            "{rc_pct:.1f}% are predicted positive.".format(rc_pct = rc_test * 100))

The recall score is: 0.945 for the test set,
which means of all real positive cases in the test set, 94.5% are predicted positive.

[30]: # Accuracy score on test data set.
      print("\nThe accuracy score is: {ac:.3f}".format(ac = ac_test),
            "for the test set,", "\nwhich means of all cases in the test set,",
            "{ac_pct:.1f}% are predicted true positive or true negative.".format(ac_pct = ac_test * 100))

The accuracy score is: 0.942 for the test set,
which means of all cases in the test set, 94.2% are predicted true positive or true negative.

[31]: # F1 score on test data set.
      print("\nThe F1 score is: {f1:.3f}".format(f1 = f1_test), "for the test set,",
            "\nwhich means the harmonic mean of precision and recall is {f1_pct:.1f}%.".format(f1_pct = f1_test * 100))

The F1 score is: 0.947 for the test set,
which means the harmonic mean of precision and recall is 94.7%.
Question: How does this model perform based on the four scores?

The model performs well according to all four performance metrics. The model’s precision score is
slightly higher than the other three metrics.

1.5.3 Compare models

Finally, create a table of results that we can use to evaluate the performance of the model.
[32]: # Create table of results.
      table = pd.DataFrame()
      table = table.append({'Model': "Tuned Decision Tree",
                            'F1': 0.945422,
                            'Recall': 0.935863,
                            'Precision': 0.955197,
                            'Accuracy': 0.940864},
                           ignore_index=True)
      table = table.append({'Model': "Tuned Random Forest",
                            'F1': f1_test,
                            'Recall': rc_test,
                            'Precision': pc_test,
                            'Accuracy': ac_test},
                           ignore_index=True)
      table

[32]: Model F1 Recall Precision Accuracy


0 Tuned Decision Tree 0.945422 0.935863 0.955197 0.940864
1 Tuned Random Forest 0.947306 0.944501 0.950128 0.942450
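
A side note: DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on
recent versions the same table could be built with the DataFrame constructor instead (a sketch):

[ ]: # Equivalent construction without DataFrame.append (works on pandas 2.x).
     table = pd.DataFrame({'Model': ["Tuned Decision Tree", "Tuned Random Forest"],
                           'F1': [0.945422, f1_test],
                           'Recall': [0.935863, rc_test],
                           'Precision': [0.955197, pc_test],
                           'Accuracy': [0.940864, ac_test]})
     table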

Question: How does the random forest model compare to the decision tree model we built in the
previous lab?
The tuned random forest has higher scores overall, so it is the better model. Particularly, it shows
a better F1 score than the decision tree model, which indicates that the random forest model may
do better at classification when taking into account false positives and false negatives.

1.6 Considerations

What are the key takeaways from this lab?
• Data exploring, cleaning, and encoding are necessary for model building.
• A separate validation set is typically used for tuning a model, rather than the test set. This also helps keep the evaluation unbiased.
• F1 scores are usually more useful than accuracy scores. If the costs of false positives and false negatives are very different, it’s better to use the F1 score, which combines the information from precision and recall.
• The random forest model yields a more effective performance than a decision tree model.

What summary would we provide to stakeholders?
• The random forest model predicted satisfaction with more than 94.2% accuracy. The precision is over 95% and the recall is approximately 94.5%.
• The random forest model outperformed the tuned decision tree with the best hyperparameters in most of the four scores. This indicates that the random forest model may perform better.
• Because stakeholders were interested in learning about the factors that are most important to customer satisfaction, these would be shared based on the tuned random forest, as sketched below.
• In addition, we would provide details about the precision, recall, accuracy, and F1 scores to support our findings.
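
One way to surface those factors is the fitted forest's impurity-based feature importances; a
minimal sketch (this cell is an addition, not part of the original lab):

[ ]: # Rank features by the tuned forest's impurity-based importances.
     importances = pd.Series(rf_opt.feature_importances_, index=X.columns)
     print(importances.sort_values(ascending=False).head(10))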

