MIS410 Lecture8toLecture10

This document outlines a course on business intelligence that covers descriptive, inferential, predictive, and prescriptive analytics using tools like R and Python. The course instructor is Dr. Atikur R. Khan from the Department of Management at North South University. Key topics include regression modeling, model validation techniques like cross-validation, bootstrapping, and multiple and logistic regression. Cross-validation involves splitting data into training and test sets to evaluate model performance on independent data.

Uploaded by

Ahanaf Rasheed

Business Intelligence

(Course Code: MIS 410, Prerequisite: BUS 173, MIS 210/MIS 310)

Dr. Atikur R. Khan


Associate Professor
Department of Management
North South University
Outline of the Course

• Introduction to BI and BI Tools
• Part-I: Descriptive Analytics
• Part-II: Inferential Analytics
• Part-III: Predictive Analytics
• Part-IV: Prescriptive Analytics
• Part-V: Decision Analytics
• Lab Works: R/Python & Tableau/Power BI
Regression model: Model validation

• Cross-Validation
• Bootstrapping
• Multiple regression
• Logistic regression
Cross-Validation

Cross-validation is a method that reserves a portion of a data set. The reserved portion is not used for model building; it is used to test the model built on the remaining portion of that data set. The steps are as follows:

• Split data into training and test samples

• Build model with training sample

• Test the model on the test sample by computing a measure (for example, MSE)

• This helps us to evaluate the effectiveness of the model. If the model performs well on the validation (test) sample, we can go ahead with the model.
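
The steps above can be sketched in Python as follows. This is a minimal illustration, not the lecture's own lab code: the synthetic data (y = 2 + 3x + noise), the helper names `fit_line` and `mse`, the 80/20 split, and the seed are all assumptions chosen for the example.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def mse(a, b, xs, ys):
    """Mean squared prediction error of the line a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(1)  # fixed seed so the run is repeatable

# Synthetic data (assumption): y = 2 + 3x + noise
x = [rng.uniform(0, 10) for _ in range(100)]
y = [2 + 3 * xi + rng.gauss(0, 1) for xi in x]

# Step 1: split the data into training (80%) and test (20%) samples
idx = list(range(len(x)))
rng.shuffle(idx)
n_test = len(x) // 5
test_idx, train_idx = idx[:n_test], idx[n_test:]

# Step 2: build the model with the training sample only
a, b = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])

# Step 3: test the model on the test sample by computing the MSE
test_mse = mse(a, b, [x[i] for i in test_idx], [y[i] for i in test_idx])
print(round(test_mse, 3))
```

The test observations play no part in estimating `a` and `b`, so `test_mse` measures performance on data the model has not seen.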
Cross-Validation (CV)

This is used to assess how the model will generalize to an independent data set.

• Estimates the expected prediction error

• Helps in finding the best model

• Helps to avoid overfitting
Cross-Validation Methods

• Hold out method of cross-validation

• K-fold cross-validation

• Leave-one-out cross-validation (LOOCV)

• Bootstrapping
Hold Out Cross-Validation

Split the data set into training and test (validation) data sets. Example: 80% training and 20% test data; at most 30% of the data is held out as test (validation) data.
Hold Out Cross-Validation

• Which of the previous two models performs better with 25% hold-out cross-validation? The model with the lowest average MSE computed from 500 replications.
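
A replicated hold-out comparison of this kind might look like the sketch below. The two candidate models here (a mean-only model and a straight-line model), the synthetic data, and the function name `holdout_mse` are assumptions made for illustration; they are not the two models compared in the lecture.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def holdout_mse(x, y, use_slope, test_frac, rng):
    """One random hold-out split; returns the MSE on the held-out part."""
    idx = list(range(len(x)))
    rng.shuffle(idx)
    n_test = int(len(x) * test_frac)
    test, train = idx[:n_test], idx[n_test:]
    xtr = [x[i] for i in train]
    ytr = [y[i] for i in train]
    if use_slope:
        a, b = fit_line(xtr, ytr)           # straight-line model
    else:
        a, b = sum(ytr) / len(ytr), 0.0     # mean-only model
    return sum((y[i] - (a + b * x[i])) ** 2 for i in test) / n_test

rng = random.Random(42)

# Synthetic data (assumption): y = 1 + 0.5x + noise
x = [rng.uniform(0, 10) for _ in range(80)]
y = [1 + 0.5 * xi + rng.gauss(0, 1) for xi in x]

reps = 500  # number of replications, as in the slide
avg_mse_mean_only = sum(holdout_mse(x, y, False, 0.25, rng) for _ in range(reps)) / reps
avg_mse_line = sum(holdout_mse(x, y, True, 0.25, rng) for _ in range(reps)) / reps

# Prefer the model with the lowest average MSE
best = "line" if avg_mse_line < avg_mse_mean_only else "mean only"
print(best)
```

Averaging over many random splits reduces the dependence of the comparison on any single split.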
K-fold Cross-Validation
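Usually, k = 10 is used, and we call it 10-fold cross-validation. Let us explain this with k = 5 as follows. The data are split into five parts of roughly equal size; each part serves once as the test sample while the other four parts form the training sample, and the five test MSEs are averaged.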
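
A minimal Python sketch of 5-fold cross-validation for a simple linear regression is given below. The synthetic data, the helper names `fit_line` and `k_fold_mse`, and the way folds are formed (shuffling the indices, then taking every k-th one) are assumptions for illustration.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def k_fold_mse(x, y, k, rng):
    """Returns (average test MSE over the k folds, list of folds)."""
    idx = list(range(len(x)))
    rng.shuffle(idx)
    folds = [idx[j::k] for j in range(k)]  # k folds of roughly equal size
    fold_mse = []
    for j in range(k):
        test = folds[j]  # fold j is the test sample
        train = [i for f in range(k) if f != j for i in folds[f]]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        err = sum((y[i] - (a + b * x[i])) ** 2 for i in test) / len(test)
        fold_mse.append(err)
    return sum(fold_mse) / k, folds

rng = random.Random(7)

# Synthetic data (assumption): y = 2 + 3x + noise
x = [rng.uniform(0, 10) for _ in range(50)]
y = [2 + 3 * xi + rng.gauss(0, 1) for xi in x]

cv_mse, folds = k_fold_mse(x, y, 5, rng)
print(round(cv_mse, 3))
```

Every observation is used for testing exactly once and for training k - 1 times.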
K-fold Cross-Validation
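Replicate the whole computation 100 times and calculate the average MSE:
MSE.rep = numeric(100)   # one MSE per replication
for (i in 1:100)
{
  # run the cross-validation for replication i here; it must produce MSE
  MSE.rep[i] = MSE
}
mean(MSE.rep)   # average MSE over the 100 replications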
Bootstrapping
• (1) Fit regression model

• (2) Calculate fitted values and residuals, and save these values

• (3) Resample residuals with replacement = sample(residual, sample size, replace = TRUE)

• (4) Dependent variable = fitted values + resampled residuals

• (5) Fit regression model with dependent variable in (4) and save coefficients

• (6) Repeat steps (3) to (5) 100 times, and calculate the average of the saved coefficients. These averages are known as the bootstrap estimates of the coefficients.
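
The six steps above can be sketched in Python as follows. The synthetic data, the helper name `fit_line`, and the seed are assumptions for illustration. Residuals are drawn with replacement, which is what resampling means in the bootstrap.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

rng = random.Random(3)

# Synthetic data (assumption): y = 2 + 3x + noise
x = [rng.uniform(0, 10) for _ in range(60)]
y = [2 + 3 * xi + rng.gauss(0, 1) for xi in x]
n = len(x)

# (1) Fit the regression model
a_hat, b_hat = fit_line(x, y)

# (2) Calculate and save the fitted values and residuals
fitted = [a_hat + b_hat * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

boot = []
for _ in range(100):
    # (3) Resample the residuals with replacement
    r_star = rng.choices(resid, k=n)
    # (4) New dependent variable = fitted values + resampled residuals
    y_star = [f + r for f, r in zip(fitted, r_star)]
    # (5) Refit the model and save the coefficients
    boot.append(fit_line(x, y_star))

# (6) Average the saved coefficients to get the bootstrap estimates
a_boot = sum(a for a, _ in boot) / len(boot)
b_boot = sum(b for _, b in boot) / len(boot)
print(round(a_boot, 3), round(b_boot, 3))
```

The spread of the values saved in `boot` can also be used to gauge the variability of each coefficient.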