Linear Regression Subjective Questions

LIJIN L
From your analysis of the categorical variables from the dataset, what could
you infer about their effect on the dependent variable?

I plotted each categorical variable against the target variable on boxplots and inferred the following effects on the target.
Season: fall (season 3) has the highest demand for rental bikes.
Year: demand grew in the following year.
Month: demand grows steadily each month until June, peaks in September, and declines after September.
Holiday: demand is lower on holidays.
Weekday: weekday does not give a clear picture of demand.
Weather: clear weather (weathersit) has the highest demand.
Why is it important to use drop_first=True during dummy variable creation?

Using drop_first=True during dummy variable creation drops one of the dummy columns, so a categorical variable with k levels is encoded with k-1 columns instead of k. This reduces the correlation among the dummy variables.
If we do not drop one of the dummy variables created from a categorical variable, the extra column is redundant: the k dummy columns always sum to 1, which duplicates the constant (intercept) term and creates a multicollinearity issue. The dropped level is still represented, as the case where all the remaining dummies are 0.
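The redundancy can be seen in a small sketch. The season labels below are hypothetical, and the helper rank_with_intercept is introduced only for this illustration; pandas.get_dummies with drop_first is the call the answer refers to.

```python
import numpy as np
import pandas as pd

# Hypothetical season labels, for illustration only.
df = pd.DataFrame({"season": ["spring", "summer", "fall", "winter", "fall"]})

full = pd.get_dummies(df["season"])                      # k = 4 dummy columns
reduced = pd.get_dummies(df["season"], drop_first=True)  # k - 1 = 3 columns

# Every row of the full encoding sums to 1, which duplicates the intercept.
row_sums = full.sum(axis=1)


def rank_with_intercept(dummies: pd.DataFrame) -> int:
    """Rank of the design matrix [intercept | dummies] (illustrative helper)."""
    X = np.column_stack([np.ones(len(dummies)), dummies.to_numpy(dtype=float)])
    return int(np.linalg.matrix_rank(X))


full_rank = rank_with_intercept(full)        # 5 columns, but the rank is only 4
reduced_rank = rank_with_intercept(reduced)  # 4 columns, rank 4: no redundancy
```

With all four dummies plus an intercept the design matrix has five columns but rank four, which is the multicollinearity described above. Dropping the first level ("fall", the first in alphabetical order) removes it.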
Looking at the pair-plot among the numerical variables, which one has the
highest correlation with the target variable?

The feature “temp” has the highest correlation with the target. It is clearly linearly related to the target “cnt”.
How did you validate the assumptions of Linear Regression after building the
model on the training set?

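Normality: the error terms are normally distributed with mean 0.
Independence: the error terms do not follow any pattern.
Multicollinearity: checked using VIF values.
Linearity: checked the linear relationship between the features and the target.
Overfitting: checked by comparing the R-squared and adjusted R-squared values.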
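Some of these checks can be sketched numerically. The data below is synthetic (it is not the bike-sharing data), the model is fitted with numpy least squares rather than whichever library the original analysis used, and the skewness figure is only a rough stand-in for a residual histogram.

```python
import numpy as np

# Synthetic training data, for illustration only.
rng = np.random.default_rng(0)
n, p = 200, 2  # number of rows, number of predictors
X = rng.normal(size=(n, p))
y = 1.5 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

# Ordinary least squares with an intercept column.
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
residuals = y - A @ coef

# Error terms should be centred on 0 and roughly symmetric.
resid_mean = residuals.mean()
z = (residuals - resid_mean) / residuals.std()
resid_skew = np.mean(z ** 3)

# R-squared and adjusted R-squared; a large gap hints at unnecessary predictors.
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
```

A residual-versus-fitted scatter plot and a histogram of the residuals remain the usual way to check for patterns and normality; the numbers here only summarize them.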
Based on the final model, which are the top 3 features contributing significantly
towards explaining the demand of the shared bikes?

The features “holiday”, “temp” and “hum” are the most strongly related to the target column, so these are the top three contributing features in the final model.
Explain the linear regression algorithm in detail.

Linear regression is a supervised machine learning algorithm and a form of regression analysis. Regression analysis is a predictive modelling technique that finds the relationship between the input variables and the target variable.
Linear regression is one of the most basic forms of machine learning: we train a model to predict the behaviour of the data based on some variables. As the name suggests, the two variables on the x-axis and y-axis should be linearly correlated.
For example, suppose you are running a sales promotion and expect the number of customers to increase. You can look at previous promotions, plot them on a chart, and see whether the number of customers increased whenever you ran a promotion. With this historical data you can estimate the customer count for the current promotion. This helps you plan better, for instance how many stalls or how many additional employees you need to serve the customers. The idea is to estimate a future value by learning the behaviour or patterns in the historical data.
In some cases the relationship is linearly upward: whenever X increases, Y also increases (a positive correlation). In other cases the relationship is linearly downward: whenever X increases, Y decreases. One example of a downward relationship is a police department running a campaign to reduce the number of robberies; in this case, the graph will be linearly downward.
Linear regression is used to predict a quantitative response Y from the predictor variable X.
Mathematically, we can write a simple linear regression equation as follows: y ~ b0 + b1*x
where y is the predicted (dependent) variable, x is the independent variable, b1 is the slope of the line, and b0 is the intercept (constant). A cost function, such as the sum of squared errors between the observed and predicted values, is minimized to find the best possible values of b0 and b1, which in turn give the best-fit line for the data points.
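As a minimal sketch of how b0 and b1 can be found, the closed-form least-squares solution is shown below, assuming the cost function is the sum of squared errors. The function names and the toy data are illustrative and not taken from the original analysis.

```python
import numpy as np


def fit_simple_linear_regression(x, y):
    """Return (b0, b1) minimizing the sum of squared errors for y ~ b0 + b1*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: co-variation of x and y divided by the variation of x.
    b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means.
    b0 = y_mean - b1 * x_mean
    return float(b0), float(b1)


def cost(b0, b1, x, y):
    """Sum of squared errors between observed y and the line b0 + b1*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum((y - (b0 + b1 * x)) ** 2))


# Toy data lying exactly on y = 2 + 3x.
x = [1, 2, 3, 4, 5]
y = [5, 8, 11, 14, 17]
b0, b1 = fit_simple_linear_regression(x, y)
```

Any other choice of slope or intercept gives a larger cost on the same data, which is what makes the fitted line the best fit under this cost function.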
Explain the Anscombe’s quartet in detail.

Anscombe's quartet is a group of four data sets that are nearly identical in simple descriptive statistics (mean, variance, correlation and fitted regression line), but whose peculiarities would fool a regression model built on the summary statistics alone. They have very different distributions and look very different when plotted on scatter plots. Each data set consists of eleven (x, y) points. The quartet shows why data should be plotted before a model is fitted.
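The near-identical statistics can be reproduced with a short sketch. The values below are the published quartet data; the summarize helper is introduced only for this illustration.

```python
import numpy as np

# Anscombe's quartet: data sets I to III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Descriptive statistics and the least-squares line for one data set."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    return {
        "mean_x": x.mean(),
        "mean_y": y.mean(),
        "var_x": x.var(ddof=1),  # sample variance
        "var_y": y.var(ddof=1),
        "r": np.corrcoef(x, y)[0, 1],
        "slope": slope,
        "intercept": intercept,
    }


summaries = {name: summarize(x, y) for name, (x, y) in quartet.items()}
for name, stats in summaries.items():
    print(name, {key: round(float(value), 2) for key, value in stats.items()})
```

All four summaries come out almost the same, so only a scatter plot of each data set reveals the curve in II, the outlier in III and the single high-leverage point in IV.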
What is Pearson’s R?

Pearson's r is a numerical summary of the strength and direction of the linear association between two variables. Pearson's r always lies between -1 and 1. If the data lie on a perfect straight line with negative slope, then r = -1; if they lie on a perfect straight line with positive slope, then r = 1.
A positive correlation indicates that both variables increase and decrease together. A negative correlation indicates that as one variable increases, the other decreases, and vice versa.
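A minimal sketch of the calculation, using the standard definition of r as the covariance of the two variables divided by the product of their standard deviations. The function name and sample values are illustrative.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))


x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

r_negative_line = pearson_r(x, 10 - 2 * x)   # perfect line, negative slope
r_positive_line = pearson_r(x, 3 + 0.5 * x)  # perfect line, positive slope
r_scattered = pearson_r(x, [2.0, 1.0, 4.0, 3.0, 5.0])  # upward trend with scatter
```

The two perfect lines give -1 and 1, while the scattered points give a value strictly between 0 and 1. The result agrees with numpy's built-in np.corrcoef.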
What is scaling? Why is scaling performed? What is the difference between
normalized scaling and standardized scaling?

Scaling is a method to normalize the range of the independent variables. It is performed to bring all the independent variables onto the same scale in regression. If scaling is not done, the algorithm compares the raw numbers only, so a variable with larger numeric values is treated as larger regardless of its units.
Example: one device weighs 500 grams and another weighs 5 kg. A machine learning algorithm will treat 500 as the greater value, which is not the case, and may make wrong predictions. Machine learning algorithms work on numbers, not units, so scaling is a necessary step before running regression on a dataset.
It is important to note that scaling affects only the coefficients and none of the other parameters such as the t-statistic, F-statistic, p-values, R-squared, etc.
Scaling can be performed in two ways:
Normalization (min-max scaling): rescales a variable to the range 0 to 1, using (x - min) / (max - min).
Standardization: transforms the data to have a mean of 0 and a standard deviation of 1, using (x - mean) / standard deviation.
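The two methods can be sketched in a few lines. The first two weights are the 500 g and 5 kg devices from the example, converted to grams; the other two values and the function names are made up for illustration.

```python
import numpy as np


def min_max_normalize(x):
    """Rescale values to the range [0, 1]."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())


def standardize(x):
    """Rescale values to mean 0 and standard deviation 1."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()


# 500 g and 5 kg (5000 g) from the example, plus two hypothetical devices.
weights_in_grams = np.array([500.0, 5000.0, 1500.0, 3000.0])

normalized = min_max_normalize(weights_in_grams)
standardized = standardize(weights_in_grams)

# The same weights in kilograms give identical scaled values:
# after scaling, the unit of measurement no longer matters.
normalized_from_kg = min_max_normalize(weights_in_grams / 1000.0)
```

Min-max scaling is sensitive to extreme values because the minimum and maximum define the range; standardization does not bound the values to a fixed range.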
You might have observed that sometimes the value of VIF is infinite. Why does
this happen?

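The VIF of a variable is computed as 1 / (1 - R-squared), where R-squared comes from regressing that variable on all the other independent variables. When there is a perfect relationship, R-squared = 1 and VIF is infinite, whereas if all the independent variables are orthogonal to each other, VIF = 1.0. In other words, VIF is infinite when a variable can be expressed exactly as a linear combination of the other variables.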
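Both extremes can be shown with a small numpy sketch. The vif function below is a hand-written illustration, not a library routine, and the tolerance used to declare a perfect fit is an assumption; the two feature matrices are made up.

```python
import numpy as np


def vif(X, j, tol=1e-12):
    """VIF of column j: 1 / (1 - R^2) from regressing it on the other columns.

    Assumes column j is not constant. Returns inf when the fit is perfect
    up to the (assumed) tolerance tol.
    """
    X = np.asarray(X, dtype=float)
    target = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(target)), others])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    residuals = target - A @ coef
    # unexplained = 1 - R^2
    unexplained = np.sum(residuals ** 2) / np.sum((target - target.mean()) ** 2)
    if unexplained < tol:
        return float("inf")
    return float(1.0 / unexplained)


# Two centred, mutually orthogonal columns: VIF is 1 for both.
orthogonal = np.array([[1, 1], [-1, 1], [1, -1], [-1, -1]], dtype=float)

# Third column is exactly the sum of the first two: every VIF is infinite.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
collinear = np.column_stack([x1, x2, x1 + x2])

vif_orthogonal = [vif(orthogonal, j) for j in range(2)]
vif_collinear = [vif(collinear, j) for j in range(3)]
```

Keeping every dummy column of a categorical variable together with an intercept produces the same exact linear dependence, and so an infinite VIF.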
What is a Q-Q plot? Explain the use and importance of a Q-Q plot in linear
regression.

A quantile-quantile (Q-Q) plot is a graphical tool to help us assess whether a set of data plausibly came from some theoretical distribution such as a normal, exponential or uniform distribution. It also helps to determine whether two data sets come from populations with a common distribution.
A Q-Q plot is a plot of the quantiles of the first data set against the quantiles of the second data set. It can be used to check whether the distribution is Gaussian, uniform, exponential or even Pareto. In linear regression, a Q-Q plot of the residuals against the normal distribution is used to check the assumption that the error terms are normally distributed.
A few advantages:
a) It can be used even when the two samples differ in size.
b) Many distributional aspects, such as shifts in location, shifts in scale, changes in symmetry, and the presence of outliers, can all be detected from this plot.
It is used to check whether two data sets:
i. come from populations with a common distribution
ii. have common location and scale
iii. have similar distributional shapes
iv. have similar tail behaviour
Different distributions: the two data sets come from different distributions if the quantile points lie away from the straight line at an angle of 45 degrees from the x-axis.
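The points of a normal Q-Q plot can be computed without any plotting library. This is a sketch under a few assumptions: the plotting positions (i - 0.5) / n are one common convention among several, the function name is illustrative, and the residuals are synthetic.

```python
from statistics import NormalDist

import numpy as np


def normal_qq_points(sample):
    """Return (theoretical normal quantiles, ordered sample values)."""
    ordered = np.sort(np.asarray(sample, dtype=float))
    n = len(ordered)
    # Plotting positions (i - 0.5) / n, one common convention.
    probs = (np.arange(1, n + 1) - 0.5) / n
    standard_normal = NormalDist()
    theoretical = np.array([standard_normal.inv_cdf(float(p)) for p in probs])
    return theoretical, ordered


# Synthetic residuals standing in for the residuals of a fitted model.
rng = np.random.default_rng(42)
residuals = rng.normal(loc=0.0, scale=2.0, size=300)

theoretical, ordered = normal_qq_points(residuals)

# Points close to a straight line suggest normally distributed residuals;
# the correlation of the Q-Q points is a rough numeric summary of that.
qq_corr = np.corrcoef(theoretical, ordered)[0, 1]
```

Plotting theoretical on the x-axis against ordered on the y-axis gives the Q-Q plot. For standardized residuals the reference line is the 45-degree line; for unscaled residuals the points follow a straight line whose slope reflects their standard deviation.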
