2.2.1 Transcript

This module discusses linear regression as a supervised learning algorithm used to build regression models with dependent and independent variables. It highlights the importance of hypotheses in generating business rules and explains the distinction between mathematical and statistical relationships in regression. The module emphasizes that while regression can predict the value of a dependent variable based on independent variables, it does not establish causation.

Uploaded by

Dev Chan

Predictive Analytics

Prof. Dinesh Kumar U

Module 2

Introduction to Regression

In this module, I am going to discuss linear regression, one of the supervised learning
algorithms; that is, to build a regression model we need values of both the dependent variable
and the independent variables. I would like to start with a famous quote by Ronald Coase. He
said, "If you torture the data long enough, it will confess." And regression is one of the
techniques that is frequently used to make data confess. Organizations use different business
rules, and business rules are actually generated from hypotheses that the organization may
believe. Let us look at a few interesting hypotheses which various people have claimed. The
first one is that good-looking couples are more likely to have a girl child. Personally, I like
this hypothesis: because I have a daughter, at least I'm statistically good-looking.

The next hypothesis says that vegetarians miss fewer flights. Women use camera phones more
than men. Left-handed men earn more money, smokers are better salespeople, and those
who whistle at the workplace are efficient. Organizations use these hypotheses to add value.
For example, let us consider the hypothesis that women use camera phones more than men. If
there is a company which makes cell phones, it can target women through advertisements and
claim that its phones are great for taking photos. Similarly, consider the hypothesis that
smokers are better salespeople. If a company is hiring salespeople, then in the interview it
can ask the candidates whether they smoke. So, hypotheses basically lead to business rules
that the company can use. Was regression used to prove all these hypotheses? Interesting
question. We don't know how they came up with these hypotheses. Regression is one of the
techniques, and one of the more powerful ones, but they may have used simple hypothesis
testing techniques such as the Z-test, t-test or F-test, and there are many other tests.
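As a rough illustration of what one of these simple tests computes, here is a minimal sketch of a two-sample t statistic in Python. It uses Welch's variant (which does not assume equal variances), chosen here only for illustration; the transcript does not say which test was used for any of these hypotheses. The function name `welch_t_statistic` and the two lists of numbers are made up for the example and are not real data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_statistic(sample_a, sample_b):
    """Return Welch's t statistic for the difference in means of two samples."""
    # Sample variances (n - 1 in the denominator).
    var_a = variance(sample_a)
    var_b = variance(sample_b)
    # Standard error of the difference between the two sample means.
    standard_error = sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Hypothetical, made-up figures (e.g. units sold by two groups of salespeople).
group_a = [12, 15, 14, 10, 13, 16]
group_b = [11, 9, 12, 10, 8, 10]

t_stat = welch_t_statistic(group_a, group_b)
print(f"t statistic: {t_stat:.3f}")
```

A large absolute value of the statistic suggests that the two group means differ by more than sampling noise alone would explain; turning it into a formal decision additionally requires the degrees of freedom and a significance level, which this sketch leaves out.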

So, we don't really know how they actually arrived at these hypotheses. Let us try to understand
the technique of regression. Regression is a tool for finding the existence of an association
relationship between a dependent variable, which we call Y, and one or more independent
variables in the study. The relationship can be either linear or nonlinear. Regression is a
statistical relationship. Linear regression means that the relationship is linear with respect
to the regression parameters. We have to understand the difference between a mathematical
relationship and a statistical relationship. Let us look at a mathematical relationship. So let
us say we have Y = β₀ + β₁X.

© All Rights Reserved. This document has been authored by Prof. Dinesh Kumar U and is permitted for use only within the course "Predictive
Analytics" delivered in the online course format by IIM Bangalore. No part of this document, including any logo, data, illustrations, pictures,
scripts, may be reproduced, or stored in a retrieval system or transmitted in any form or by any means – electronic, mechanical, photocopying,
recording or otherwise – without the prior permission of the author.

In a mathematical relationship, if we know the value of X we can predict the value of Y
exactly, whereas in a statistical relationship we will have the relationship as
Y = β₀ + β₁X + ε, where ε is the error term. So here, with the knowledge of X, we will not be
able to predict the value of Y exactly; there will be some error in the prediction. Let us try
to understand the nomenclature used in regression. The dependent variable, or response
variable, measures the outcome of a study, so it is also called the outcome variable. In the
Die Another Day case, the total cost of treatment is the response variable or outcome variable.
An independent variable, or explanatory variable, explains the changes in the response variable.
Independent variables are also called features in machine learning lingo. If we want
to understand how the total cost of treatment changes, we may have to look at variables like
the patient's height, weight, past medical history and so on.
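The difference between a mathematical and a statistical relationship described above can be sketched in a few lines of Python. The parameter values, the function names and the choice of normally distributed errors are all assumptions made for this illustration; the transcript does not specify any of them.

```python
import random

BETA_0 = 2.0  # hypothetical intercept
BETA_1 = 0.5  # hypothetical slope


def mathematical_y(x):
    """Mathematical relationship: Y = beta_0 + beta_1 * X, no error term."""
    return BETA_0 + BETA_1 * x


def statistical_y(x, rng, noise_sd=1.0):
    """Statistical relationship: Y = beta_0 + beta_1 * X + error."""
    # Assumption: the error is drawn from a normal distribution with mean 0.
    error = rng.gauss(0.0, noise_sd)
    return BETA_0 + BETA_1 * x + error


rng = random.Random(42)  # fixed seed so the run is repeatable
x = 10.0

exact = [mathematical_y(x) for _ in range(3)]
noisy = [statistical_y(x, rng) for _ in range(3)]

print("mathematical:", exact)  # the same value every time
print("statistical: ", noisy)  # a different value on every draw
```

For the same X, the mathematical relationship returns exactly the same Y every time, while the statistical relationship scatters around that value because of the error term.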

So, with that information, we believe that we may be able to tell the value of the outcome
variable, which in this case is the total treatment cost. In regression, we often set the values
of the explanatory variables to see how they affect the response variable. It is important to
understand that a regression model establishes the existence of an association between two
variables, but not causation. This is very, very important for students to understand. How can
I find a causal relationship? It is an interesting question.

Now, there are techniques such as counterfactual models, the Rubin causal model, and graphical
models that can be used for establishing a causal relationship, but we are not going to discuss
these techniques in this course. In this table, I have given the different names used to
represent the dependent variable and the independent variable. It is also important to
understand that the term "dependent variable" does not mean that it depends on the values of
the independent variables; these are just names that we use in a regression model. And also, as
I said before, regression is not designed to capture causality. The purpose of regression is to
predict the value of the dependent variable given the values of the independent variables.
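To make the idea of predicting the dependent variable from an independent variable concrete, here is a minimal sketch in Python. It estimates β₀ and β₁ by ordinary least squares, which is an assumption on my part since the transcript has not yet named an estimation method, and it uses synthetic data generated inside the script rather than any real dataset. The function names `fit_simple_ols` and `predict` are hypothetical.

```python
import random


def fit_simple_ols(xs, ys):
    """Estimate (beta_0, beta_1) for Y = beta_0 + beta_1 * X by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of cross-deviations / sum of squared deviations of X.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    beta_1 = s_xy / s_xx
    beta_0 = mean_y - beta_1 * mean_x
    return beta_0, beta_1


def predict(beta_0, beta_1, x):
    """Predict the dependent variable for a given value of X."""
    return beta_0 + beta_1 * x


# Synthetic data: a known line plus random error (values chosen for illustration).
rng = random.Random(0)
true_beta_0, true_beta_1 = 5.0, 2.0
xs = [float(i) for i in range(1, 51)]
ys = [true_beta_0 + true_beta_1 * x + rng.gauss(0.0, 1.0) for x in xs]

beta_0_hat, beta_1_hat = fit_simple_ols(xs, ys)
print(f"estimated intercept: {beta_0_hat:.3f}")
print(f"estimated slope:     {beta_1_hat:.3f}")
print(f"prediction at X=60:  {predict(beta_0_hat, beta_1_hat, 60.0):.3f}")
```

The fitted line describes how Y is associated with X in the data and can be used for prediction, but a good fit by itself says nothing about whether X causes Y. Because "linear" refers to the parameters, the same function could also be applied to a transformed input such as X squared and the model would still be a linear regression.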
