CV0003

Uploaded by

sooyoungchoi093

CV0003: INTRODUCTION TO

DATA SCIENCE AND AI

MINI PROJECT
PRESENTATION

Presented by : CV4 Group 6

TABLE OF CONTENTS
INTRODUCTION
DATA EXTRACTION & CLEANING
EXPLORATORY DATA ANALYSIS (EDA)
MACHINE LEARNING
KEY INSIGHTS
INTRODUCTION
(Intro to Dataset & Problem
Statement)
DATASET
Dataset 2 : The COVID Tracking Project APIs

Source : https://covidtracking.com/
Documentation : https://covidtracking.com/data/api
Define your own problem on any dataset you extract
using the APIs.
Primarily offers historic and current values of US
COVID-19 stats.
INTRODUCTION
COVID-19, caused by the SARS-CoV-2 virus,
emerged in late 2019 and quickly spread globally,
leading to one of the largest pandemics in recent
history.
The virus impacted all aspects of daily life, including
public health systems, economies, and social
behaviors, due to its high transmission rate and
severity.
Understanding COVID-19 trends has become crucial
for managing public health responses, predicting
outbreaks, and planning future pandemic
preparedness.
PROBLEM & OBJECTIVES
WHAT ARE THE KEY TRENDS IN COVID-19 EVENTS?
Identify significant turning points in COVID-19 case
numbers, hospitalizations, and mortality rates.

WHAT WAS THE IMPACT OF INTERVENTIONS?
Examine how specific events (e.g., lockdowns, vaccines)
affected COVID-19 trends, to help improve public health policy.

HOW TO DEVELOP FORECASTING MODELS?
Use machine learning to forecast future trends of
hospitalization and positive cases (potential outbreaks).

Daily.csv
The dataset contains historic U.S. COVID-19 data from January 2020
to March 2021. It includes 420 daily records with 25 columns,
covering variables such as positive and negative cases,
hospitalizations, ICU usage, ventilator counts, and deaths.

https://covidtracking.com/data/api
DATA
EXTRACTION &
CLEANING
DATA EXTRACTION & CLEANING
DATA EXTRACTION

DROP COLUMNS WITH TOO MANY MISSING VALUES

FILL MISSING VALUES USING THE MEDIAN

CONVERT DATE INTO DATETIME FORMAT

CONVERT NUMERIC COLUMNS TO NUMERIC TYPES

DROP IRRELEVANT COLUMNS

RENAME COLUMNS FOR CLARITY
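
The cleaning steps above could be sketched in pandas roughly as follows. This is a minimal illustration, not the group's actual code: the tiny DataFrame, the 50% missing-value threshold, the `clean_daily` helper, and the new column names are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for Daily.csv. Column names follow the COVID Tracking Project
# API, but the values are made up for illustration.
raw = pd.DataFrame({
    "date": [20200301, 20200302, 20200303, 20200304],
    "positive": ["10", "25", None, "60"],
    "hospitalizedCurrently": [1.0, np.nan, 3.0, 5.0],
    "onVentilatorCumulative": [np.nan, np.nan, np.nan, 2.0],  # mostly missing
    "hash": ["a1", "b2", "c3", "d4"],                          # irrelevant
})

def clean_daily(df, max_missing=0.5, irrelevant=("hash",)):
    """Apply the listed cleaning steps; the 50% threshold is an assumption."""
    df = df.copy()
    # Drop columns whose share of missing values exceeds the threshold.
    df = df.loc[:, df.isna().mean() <= max_missing]
    # Drop columns that are irrelevant to the analysis.
    df = df.drop(columns=[c for c in irrelevant if c in df.columns])
    # Convert the integer date (YYYYMMDD) into a datetime.
    df["date"] = pd.to_datetime(df["date"].astype(str), format="%Y%m%d")
    # Convert the remaining columns to numeric types.
    value_cols = [c for c in df.columns if c != "date"]
    df[value_cols] = df[value_cols].apply(pd.to_numeric, errors="coerce")
    # Fill the remaining gaps with each column's median.
    df[value_cols] = df[value_cols].fillna(df[value_cols].median())
    # Rename columns for clarity (hypothetical new names).
    return df.rename(columns={"positive": "positive_cases",
                              "hospitalizedCurrently": "hospitalized_now"})

clean = clean_daily(raw)
print(clean)
```

Dropping sparse columns before filling with the median avoids inventing values for columns that are mostly empty.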

EXPLORATORY
DATA ANALYSIS
(EDA)
EXPLORATORY DATA ANALYSIS (EDA)

CORRELATION
HEATMAP
Shows the relationships between variables, such as cases,
hospitalizations, and deaths. It helps identify related
trends, guides feature selection, and highlights
significant interactions for deeper analysis.
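
The numbers behind such a heatmap come from a correlation matrix. The sketch below computes one with pandas on synthetic data (the values and the strength of each relationship are made up); `plot_heatmap` is a hypothetical helper that imports matplotlib only when called.

```python
import numpy as np
import pandas as pd

# Synthetic daily series (values are made up): hospitalizations and deaths
# are constructed to track new positive cases.
rng = np.random.default_rng(0)
cases = np.linspace(100, 1000, 60) + rng.normal(0, 20, 60)
df = pd.DataFrame({
    "positiveIncrease": cases,
    "hospitalizedCurrently": 0.1 * cases + rng.normal(0, 5, 60),
    "deathIncrease": 0.02 * cases + rng.normal(0, 2, 60),
})

# Pearson correlation matrix: these are the values a heatmap colours.
corr = df.corr()
print(corr.round(2))

def plot_heatmap(corr):
    """Optional plot; matplotlib is imported lazily so the matrix above
    can be computed without it."""
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots()
    im = ax.imshow(corr.values, vmin=-1, vmax=1, cmap="coolwarm")
    ax.set_xticks(range(len(corr)))
    ax.set_xticklabels(corr.columns, rotation=45, ha="right")
    ax.set_yticks(range(len(corr)))
    ax.set_yticklabels(corr.index)
    fig.colorbar(im, ax=ax)
    return fig
```

Pairs with a coefficient near +1 or -1 are candidates for feature selection; pairs near 0 carry little linear information about each other.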
EXPLORATORY DATA ANALYSIS (EDA)

STATISTICAL
DESCRIPTION
Table showing the mean, standard deviation, quartiles,
and median, produced by the .describe() function

NORMALIZED
BOXPLOT
Visualizes the statistical summary of the
normalized data using boxplots
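
As a small illustration of the two items above, the sketch below calls `.describe()` and then min-max normalizes the columns so their boxplots could share one axis. The values are made up, and min-max scaling is an assumption about which normalization was used.

```python
import pandas as pd

# Made-up values standing in for two columns of the cleaned dataset.
df = pd.DataFrame({
    "positiveIncrease": [100, 200, 300, 400, 500],
    "hospitalizedCurrently": [10, 20, 30, 40, 50],
})

# .describe() reports count, mean, std, min, quartiles (25%, 50%, 75%), max.
summary = df.describe()
print(summary)

# Min-max normalization puts every column on a 0-1 scale so that the
# boxplots can be drawn side by side.
normalized = (df - df.min()) / (df.max() - df.min())
print(normalized.describe().loc[["25%", "50%", "75%"]])

# With matplotlib installed, normalized.plot(kind="box") draws the boxplots.
```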
EXPLORATORY DATA ANALYSIS (EDA)

Correct the date format to align with time-series analysis requirements (%Y%m%d).

Remove unnecessary, repeated, and empty variables from the dataset.

Filter the data to start from March 2020.

Normalize the data to ensure that variables with different scales or ranges are brought
to a comparable scale, making them suitable for plotting on the same graph or for
algorithms that are sensitive to feature magnitude.

Mark past interventions on the COVID-19 case plot with vertical lines.
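
A rough sketch of these preparation steps follows. The rows, the intervention label and date, and the `plot_with_interventions` helper are hypothetical; only the `%Y%m%d` format and the March 2020 cut-off come from the slide.

```python
import pandas as pd

# Made-up rows; dates use the integer YYYYMMDD layout.
df = pd.DataFrame({
    "date": [20200215, 20200301, 20200315, 20200401],
    "positiveIncrease": [0, 50, 400, 900],
    "hospitalizedCurrently": [0, 5, 60, 200],
})

# Parse the date with the %Y%m%d format and use it as a sorted index.
df["date"] = pd.to_datetime(df["date"].astype(str), format="%Y%m%d")
df = df.set_index("date").sort_index()

# Keep records from March 2020 onwards.
df = df.loc["2020-03-01":]

# Min-max normalize so both series fit on one axis.
normalized = (df - df.min()) / (df.max() - df.min())

# Hypothetical intervention, used only to show how a marker is stored.
interventions = {"Example intervention": pd.Timestamp("2020-03-15")}

def plot_with_interventions(data, events):
    """Line plot with one vertical dashed line per intervention."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting
    ax = data.plot()
    for label, day in events.items():
        ax.axvline(day, linestyle="--", color="grey")
        ax.annotate(label, (day, 1.0), rotation=90, va="top")
    return ax
```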

EXPLORATORY DATA ANALYSIS (EDA)
MACHINE
LEARNING
Machine Learning

Supervised Learning
Regression: Linear Regression, Random Forest, Exponential Smoothing, LSTM
Classification: Random Forest

Unsupervised Learning
Rupture
Turning Point
MACHINE LEARNING TECHNIQUES :
LINEAR REGRESSION

Measures the goodness of fit of the model

Determines how the variables depend on each other

Import the Linear Regression model:

Split the data uniformly into train and test sets
Fit the Linear Regression model on the training dataset
Check the coefficients of the Linear Regression model
Plot the graph

Repeat for x values of hospitalised cases and ICU cases
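
The steps above might look like this with scikit-learn. The data are synthetic, and the choice of deaths as the response, the 80/20 split, and the random seeds are assumptions; the slide does not state them.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic data (made up): deaths roughly proportional to positive cases.
rng = np.random.default_rng(42)
positive = rng.uniform(0, 1000, size=200).reshape(-1, 1)      # predictor X
deaths = 0.02 * positive.ravel() + 3 + rng.normal(0, 1, 200)  # response y

# Random train/test split; the 80/20 ratio is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    positive, deaths, test_size=0.2, random_state=0)

# Fit on the training set only.
model = LinearRegression().fit(X_train, y_train)
print("intercept:", model.intercept_, "coefficient:", model.coef_[0])

# Goodness of fit on unseen data: explained variance (R^2) and MSE.
y_pred = model.predict(X_test)
r2 = r2_score(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
print("R^2:", r2, "MSE:", mse)
```

To repeat the fit for another predictor, only the `positive` array needs to be swapped for the hospitalised or ICU column.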

LIMITATIONS OF LINEAR REGRESSION

Low explained variance and high MSE:
the model is not accurate!
EXPONENTIAL SMOOTHING
OBJECTIVE
Forecast COVID-19 cases by weighting past observations,
giving more importance to recent data

SMOOTHING PARAMETER
Higher (~1): more sensitive to recent data, largely ignoring older data
Lower (~0): smoother forecast, but slow to respond to recent data
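
The effect of the smoothing parameter can be seen in a few lines of plain Python. This is a minimal sketch of simple exponential smoothing (no trend or seasonality), not necessarily the variant used in the project; `simple_exp_smoothing` and the sample counts are made up for the example.

```python
def simple_exp_smoothing(series, alpha):
    """Return the smoothed levels s_t = alpha * x_t + (1 - alpha) * s_{t-1}.

    The last level serves as the one-step-ahead forecast.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]            # initialise with the first observation
    levels = [level]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
        levels.append(level)
    return levels

# Made-up daily counts with a sudden jump on the last two days.
cases = [100, 100, 100, 100, 200, 200]

high = simple_exp_smoothing(cases, alpha=0.9)   # reacts quickly to the jump
low = simple_exp_smoothing(cases, alpha=0.1)    # stays close to the old level
print("alpha=0.9 forecast:", round(high[-1], 1))
print("alpha=0.1 forecast:", round(low[-1], 1))
```

With alpha = 0.9 the forecast after the jump is 199.0, almost at the new level; with alpha = 0.1 it is only 119.0.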

WHY EXPONENTIAL SMOOTHING?

Simplicity: Easy to implement and computationally
efficient.
Adaptability: Flexible in adjusting to recent changes,
with variants available to account for trends and
seasonality.
Short-Term Focus: Best suited for short- to medium-
term forecasts, where recent trends are expected to
continue.

LOW SMOOTHING LEVEL

HIGH SMOOTHING LEVEL
STRENGTH
More Responsive to Recent Changes: more weight on
recent data, better suited to datasets where recent trends
or changes are significant.
Captures Volatility: useful for volatile datasets where
fluctuations are important, such as daily or weekly data with
frequent shifts.

WEAKNESS
It may lead to overfitting by closely following random noise,
which can reduce the generalization of the forecast.
LOW SMOOTHING LEVEL
STRENGTH
Smoother Forecasts and More Stable: creates a smoother,
more stable forecast that is less reactive to recent changes.
It focuses more on the long-term trend rather than short-
term fluctuations.
Reduces Overfitting Risk: can reduce the risk of overfitting,
as the forecast line smooths out random variations and
focuses on broader patterns.

WEAKNESS
Less reactive to short-term or sudden changes
MACHINE LEARNING :
CLASSIFICATION
CLASSIFICATION INTO LOW,
MEDIUM, HIGH
Calculate the 33rd and 66th percentiles
of the "hospitalizedCurrently" data.
Label each day as low (below the 33rd percentile),
medium (between the 33rd and 66th
percentiles), or high (above the 66th
percentile).
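
A minimal sketch of this labelling, assuming made-up values for the column. How ties exactly on a percentile are assigned is also an assumption (here they fall into the lower class).

```python
import numpy as np
import pandas as pd

# Made-up values standing in for the "hospitalizedCurrently" column.
hospitalized = pd.Series([5, 10, 20, 40, 80, 160, 320, 640, 1280],
                         name="hospitalizedCurrently")

# The 33rd and 66th percentiles are the two cut points.
p33, p66 = np.percentile(hospitalized, [33, 66])

# Values up to p33 are "low", above p66 are "high", the rest "medium".
level = pd.cut(hospitalized,
               bins=[-np.inf, p33, p66, np.inf],
               labels=["low", "medium", "high"])
print(pd.DataFrame({"hospitalized": hospitalized, "level": level}))
```

Using percentiles rather than fixed thresholds keeps the three classes roughly balanced, which helps a classifier trained on them.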

Line plot drawn with plt (matplotlib.pyplot)

RANDOM FOREST

ENSEMBLE LEARNING
Ensemble learning technique that combines multiple
decision trees to make accurate and robust
predictions.

BOOTSTRAP SAMPLING
Each tree in the forest is trained on a random subset
of the training data (with replacement).

RANDOMNESS
When splitting nodes, a random subset of
features is considered to create diverse trees.

AVERAGING IN REGRESSION
For regression tasks, the average of all tree
outputs is taken as the final prediction.

VOTING MECHANISM
For classification tasks, each tree makes a
prediction, and the final output is determined by
majority voting.
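
The four mechanisms above can each be illustrated in a line or two of NumPy. This sketch shows only the sampling and aggregation steps, not tree training; the helper names are hypothetical.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

def bootstrap_indices(n_rows, rng):
    """Row indices for one tree: n_rows draws with replacement."""
    return rng.integers(0, n_rows, size=n_rows)

def random_feature_subset(n_features, k, rng):
    """Features considered at one split: k distinct columns."""
    return rng.choice(n_features, size=k, replace=False)

def majority_vote(tree_predictions):
    """Classification: the most common label across trees wins."""
    return Counter(tree_predictions).most_common(1)[0][0]

def average(tree_predictions):
    """Regression: the mean of the tree outputs is the prediction."""
    return float(np.mean(tree_predictions))

sample = bootstrap_indices(10, rng)          # may contain repeated rows
features = random_feature_subset(4, 2, rng)  # 2 of 4 columns for one split
print(sample, features)
print(majority_vote(["high", "low", "high", "medium", "high"]))
print(average([10.0, 12.0, 14.0]))
```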
RANDOM FOREST
OBJECTIVE
Predict whether daily COVID-19 hospitalization cases fall
into the categories of "low," "medium," or "high"

SELECTED PREDICTORS
['positiveIncrease', 'negativeIncrease', 'deathIncrease',
'totalTestResultsIncrease']

WHY RANDOM FOREST?

Able to capture complex relationships and interactions
between variables, and robust against overfitting.
~ 82 % ACCURACY
RANDOM FOREST

The “Positive Increase” variable

contributes the most to the
hospitalization forecast
RANDOM FOREST : CODE SNIPPET
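
The original snippet is not reproduced in this text. As a stand-in, the sketch below trains scikit-learn's `RandomForestClassifier` on the four predictors named earlier. The data are synthetic and built so that new positives drive hospitalizations, so the accuracy and importances printed here say nothing about the ~82% reported on the real dataset. The split ratio, seeds, and number of trees are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

predictors = ["positiveIncrease", "negativeIncrease",
              "deathIncrease", "totalTestResultsIncrease"]

# Synthetic data (made up): hospitalizations driven mainly by new positives.
rng = np.random.default_rng(1)
n = 400
df = pd.DataFrame({
    "positiveIncrease": rng.uniform(0, 1000, n),
    "negativeIncrease": rng.uniform(0, 5000, n),
    "deathIncrease": rng.uniform(0, 50, n),
})
df["totalTestResultsIncrease"] = df["positiveIncrease"] + df["negativeIncrease"]
df["hospitalizedCurrently"] = 0.5 * df["positiveIncrease"] + rng.normal(0, 20, n)

# Label each day low / medium / high using the 33rd and 66th percentiles.
p33, p66 = np.percentile(df["hospitalizedCurrently"], [33, 66])
df["level"] = pd.cut(df["hospitalizedCurrently"],
                     bins=[-np.inf, p33, p66, np.inf],
                     labels=["low", "medium", "high"]).astype(str)

X_train, X_test, y_train, y_test = train_test_split(
    df[predictors], df["level"], test_size=0.2, random_state=0)

# n_estimators=100 is scikit-learn's default, not a value from the slides.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

accuracy = accuracy_score(y_test, forest.predict(X_test))
importance = pd.Series(forest.feature_importances_, index=predictors)
print("accuracy:", round(accuracy, 2))
print(importance.sort_values(ascending=False))
```

`feature_importances_` is how a ranking such as "positiveIncrease contributes the most" can be read off a fitted forest.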
UNSUPERVISED LEARNING : RUPTURE

PURPOSE OF RUPTURE FUNCTION

Detects significant changes in the trend of
time-series data.

HOW DOES THE RUPTURE FUNCTION WORK?

Highlights the effect of interventions such as
lockdowns, mask mandates, and vaccinations.
Red dashed lines show where the function
identified statistically significant shifts in the
hospitalization trend.
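
The sketch below does not call the rupture function used in the project. It swaps in a much simpler stand-in to illustrate the idea of change detection: find the single index that best splits a series into two constant-mean segments by least squares. The function name and the data are made up.

```python
import numpy as np

def best_single_breakpoint(signal, min_size=2):
    """Index that best splits `signal` into two constant-mean segments.

    The cost of a segment is its sum of squared deviations from its own mean;
    the breakpoint minimises the total cost of the two segments.
    """
    signal = np.asarray(signal, dtype=float)
    best_index, best_cost = None, np.inf
    for i in range(min_size, len(signal) - min_size + 1):
        left, right = signal[:i], signal[i:]
        cost = (((left - left.mean()) ** 2).sum()
                + ((right - right.mean()) ** 2).sum())
        if cost < best_cost:
            best_index, best_cost = i, cost
    return best_index

# Made-up hospitalization series: the level shifts from about 100 to
# about 300 at index 30.
rng = np.random.default_rng(0)
series = np.concatenate([rng.normal(100, 5, 30), rng.normal(300, 5, 20)])
breakpoint_index = best_single_breakpoint(series)
print("change detected at index", breakpoint_index)
```

Dedicated change-point libraries generalise this to several breakpoints and to other cost functions; the red dashed lines on the slide correspond to the returned indices.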
RUPTURE
INTERPRETATION IN COVID-19 DATA
After vaccine approvals (Pfizer, Moderna),
there’s a clear shift in the trend, indicating a
decline in hospitalizations. This suggests the
effectiveness of vaccination in lowering hospitalizations.

The vaccine takes 25 days to work.

Cases spiked in July because of the easing of
lockdowns and reopenings, summer travel
and gatherings (including U.S. Independence Day
celebrations on July 4), and improvements in testing
capabilities.
TURNING POINT

Smoothed line using the moving average
method.

Identifies significant turning points in
COVID-19 hospitalization data, with
key events (e.g., lockdowns, vaccine
approvals) represented by vertical
lines.

Red represents a declining trend, and
green represents a rising trend.
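
One way to locate such turning points is sketched below: smooth with a centred moving average, then flag the days where the slope of the smoothed line changes sign. The 7-day window, the `turning_points` helper, and the triangular test series are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def turning_points(series, window=7):
    """Smooth with a centred moving average, then return the dates where
    the day-to-day slope of the smoothed line changes sign."""
    smooth = series.rolling(window, center=True).mean().dropna()
    slope_sign = np.sign(smooth.diff().dropna())
    previous = slope_sign.shift()
    # A turning point is a day whose slope sign differs from the day before.
    changed = slope_sign.ne(previous) & previous.notna()
    labels = slope_sign[changed].map({1.0: "rising", -1.0: "declining"})
    return smooth, labels

# Made-up series: rises for 30 days, then falls for 30 days.
days = pd.date_range("2020-12-01", periods=61, freq="D")
values = pd.Series(np.concatenate([np.arange(0, 31), np.arange(29, -1, -1)]),
                   index=days, dtype=float)

smooth, labels = turning_points(values, window=7)
print(labels)
```

Each label says which trend starts at that date, so "declining" entries would be drawn in red and "rising" entries in green.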
LSTM MODEL
LSTM (LONG SHORT-TERM
MEMORY) MODEL

WHAT IS LSTM?
LSTM is a type of Recurrent Neural Network (RNN)
designed to handle sequences and time-series
data.
It overcomes the limitations of traditional RNNs by
solving the "short-term memory" issue.

WHY USE LSTM?

Captures Sequential Patterns: LSTM is capable of
learning long-term dependencies in data, making
it ideal for time-series forecasting.
Effective Memory Units: LSTM cells have memory
gates that allow them to remember relevant
information and forget unnecessary data as
needed.
LSTM (LONG SHORT-TERM
MEMORY) MODEL

HOW LSTM WORKS?

Input Gates: Control what new information enters
the memory cell.
Forget Gates: Decide what information to discard
from the cell.
Output Gates: Control what information to output
based on the cell state.

APPLICATION
In this project, the LSTM model is used to predict
future COVID-19 cases by learning from past case
trends.
Data is scaled and divided into training, validation,
and testing sets to enhance accuracy.
LSTM
DATA SCALING
Why Scale? Scaling ensures that all data is
within a similar range, which helps the LSTM
model learn effectively.
MinMaxScaler: This tool scales data to a range
between 0 and 1, which is ideal for LSTM input.

CREATING SEQUENCE
Window Size: A window size of 5 is used,
meaning each data point is based on the
previous 5 values.
Features (X) and Labels (y): The LSTM function
creates input-output pairs based on the
specified window size.
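
The scaling and windowing steps can be sketched in NumPy as follows. `min_max_scale` mirrors what scikit-learn's `MinMaxScaler` does with its default 0-1 range, and `make_sequences` is a hypothetical stand-in for the project's sequence function; the sample counts are made up.

```python
import numpy as np

def min_max_scale(values):
    """Scale to [0, 1]; also return (min, max) so the scaling can be undone."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    return (values - lo) / (hi - lo), (lo, hi)

def make_sequences(series, window_size=5):
    """Build (X, y) pairs: each y is predicted from the previous
    window_size values."""
    X, y = [], []
    for i in range(len(series) - window_size):
        X.append(series[i:i + window_size])
        y.append(series[i + window_size])
    # LSTM layers expect input shaped (samples, time steps, features).
    return np.array(X).reshape(-1, window_size, 1), np.array(y)

cases = np.arange(10, 110, 10)            # made-up counts: 10, 20, ..., 100
scaled, (lo, hi) = min_max_scale(cases)
X, y = make_sequences(scaled, window_size=5)
print(X.shape, y.shape)
```

Ten observations with a window of 5 give five training pairs: the first uses days 1 to 5 to predict day 6, the last uses days 5 to 9 to predict day 10.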
LSTM (MINMAXSCALER)
LSTM
MODEL DESIGNING
Model Architecture
Sequential Model: A linear stack of layers, ideal
for time series forecasting with LSTM.

Input Layer
Window Size: The input shape is (window_size,
1), where the window size is 5.
LSTM(64): 64 units in the LSTM layer to capture
time dependencies in data. This helps the model
retain memory over long sequences, which is
crucial for time series forecasting.
Dense(8, activation='relu'): A dense layer with 8
nodes and ReLU activation to process learned
features.
Dense(1, activation='linear'): A single-node
output layer with linear activation for continuous
output, ideal for forecasting.
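
To make the layer stack concrete, the sketch below counts its trainable parameters by hand and then defines the same stack in Keras. The counting helpers are hypothetical names, and `build_model` imports TensorFlow lazily, so the arithmetic runs even where TensorFlow is not installed.

```python
def lstm_param_count(units, input_dim):
    """Trainable weights in one LSTM layer: four weight blocks (input,
    forget, and output gates plus the candidate cell state), each with
    input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)

def dense_param_count(inputs, outputs):
    """Weights plus biases in a fully connected layer."""
    return inputs * outputs + outputs

total = (lstm_param_count(64, 1)      # LSTM(64) on 1 feature per time step
         + dense_param_count(64, 8)   # Dense(8, activation='relu')
         + dense_param_count(8, 1))   # Dense(1, activation='linear')
print(total)

def build_model(window_size=5):
    """Keras version of the same stack (TensorFlow imported lazily)."""
    from tensorflow import keras
    return keras.Sequential([
        keras.Input(shape=(window_size, 1)),
        keras.layers.LSTM(64),
        keras.layers.Dense(8, activation="relu"),
        keras.layers.Dense(1, activation="linear"),
    ])
```

The total is 16,896 + 520 + 9 = 17,425 parameters. Note that the LSTM count does not depend on the window size, only on the number of units and features.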
LSTM MODEL SUMMARY
LSTM
TRAINING
Checkpoint:
Saves the model only when it achieves a new best score on
validation data. This helps prevent overfitting and ensures that the best
model configuration is retained.
Compiling the Model:
MeanSquaredError() minimizes the difference between predicted
and actual values.
Adam optimizer with a learning rate of 0.0001, enhancing stability
and convergence.
Tracks RootMeanSquaredError (RMSE) for model performance on
training and validation.
Training Execution:
Runs training for 100 epochs with both training and validation data.
Monitors performance using the ModelCheckpoint callback.
Loading the Best Model:
Reloads the best version of the trained model (saved in
"LSTM_model.keras") for accurate evaluation and predictions.
LSTM
PLOTTING
Inverse Transformation :
Converts scaled predictions and actual values back to original
COVID-19 case counts.
Separate predictions for training, validation, and testing sets.
Combining Predictions and Actual Values:
Concatenate predictions and actual values for a unified view.
Create a DataFrame with Prediction and Actual columns indexed
by date.
Plotting the Results:
Visualizes both Prediction and Actual on the same graph.
Labels and titles clarify the plot's purpose.
Gridlines and Legend aid readability and data interpretation.
Performance Metrics:
MAPE measures average error percentage between predictions
and actuals.
RMSE quantifies prediction error magnitude.
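
Both metrics are short enough to write out directly. The sketch below uses made-up actual and predicted values; the function names are chosen for the example.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)

actual = [100, 200, 400]       # made-up case counts
predicted = [110, 180, 400]

print("RMSE:", round(rmse(actual, predicted), 2))
print("MAPE:", round(mape(actual, predicted), 2), "%")
```

RMSE penalises large misses more heavily, while MAPE is scale-free, which makes it easier to compare across periods with very different case counts.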
LSTM MODEL VISUALISATION
LSTM
FORECASTING
Setting Up the Forecast:
Choose future_days (e.g., 30 days) to set the forecast length.
Start with the final sequence of test data for prediction.
Iterative Forecasting:
Input the last sequence into the model to predict the next day’s
case count and append the predicted case to the forecast list.
Add the prediction to the sequence and remove the oldest value,
maintaining the window_size length for further predictions.
Inverse Scaling & Date Generation:
Transform predictions back to the original COVID case scale
Extend the timeline by future_days.
Plotting the Forecast:
Actual and predicted cases are plotted along with the 30-day
forecast. Titles and labels clarify each line’s significance.
Forecast Table:
Print the DataFrame of forecasted values, showing daily cases
over the specified forecast period.
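
The iterative loop described above can be sketched independently of the model. Here `naive_predictor` is a stand-in for the trained LSTM's prediction call (it is not an LSTM), and the window values and the 0 to 1000 scale are made up.

```python
import numpy as np

def iterative_forecast(last_window, predict_next, future_days=30):
    """Roll a fixed-length window forward, feeding each prediction back in."""
    window = list(last_window)
    forecast = []
    for _ in range(future_days):
        next_value = predict_next(np.array(window))
        forecast.append(next_value)
        window = window[1:] + [next_value]   # drop the oldest, append the newest
    return np.array(forecast)

def inverse_min_max(scaled, lo, hi):
    """Undo min-max scaling given the original minimum and maximum."""
    return np.asarray(scaled) * (hi - lo) + lo

# Stand-in for the model: extend the window by its average step.
def naive_predictor(window):
    return float(window[-1] + np.mean(np.diff(window)))

last_window = [0.50, 0.52, 0.54, 0.56, 0.58]     # made-up scaled values
scaled_forecast = iterative_forecast(last_window, naive_predictor, future_days=3)
forecast = inverse_min_max(scaled_forecast, lo=0, hi=1000)
print(forecast.round(1))
```

Because every prediction becomes an input for the next step, errors compound over the horizon, so the far end of a 30-day forecast is less reliable than the first few days.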
LSTM MODEL VISUALISATION
(FORECASTING)
WHAT WE APPLIED FROM THE LECTURE

Inspecting Data
“.info()”, “.head()”, and “.describe()”

Visualizing and Forecasting Data
heatmap, scatterplot, and boxplots
Linear Regression

WHAT WE EXPLORED

ML : Supervised Learning
Exponential Smoothing, Random Forest, LSTM

ML : Unsupervised Learning
Rupture and Turning Point
THANK
YOU
