Classical Machine Learning: Linear Regression: Ramesh S
Linear Regression
Ramesh S
Regression
● “Draw a line through these dots. Yep, that's linear regression.”
● Examples are
○ predicting a car's price from its mileage
○ medical diagnosis
● Under the hood it is simple: the machine tries to draw a line that reflects the average correlation in the data.
● It allows you to estimate a value, such as housing prices or human lifespan, based on input data.
● Here, target variable means the unknown variable we care about predicting, and continuous means there
aren’t gaps (discontinuities) in the value that can take on.
● Discrete variables, on the other hand, can only take on a finite number of values — for example, the number
of kids somebody has is a discrete variable.
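The "line through these dots" idea can be sketched in a few lines of Python. This is a minimal illustration using NumPy's `polyfit`; the mileage and price figures are invented purely for the example.

```python
import numpy as np

# Hypothetical data: mileage (thousands of km) vs. price ($1000s)
mileage = np.array([10.0, 30.0, 50.0, 80.0, 120.0])
price = np.array([22.0, 18.0, 15.0, 11.0, 7.0])

# Fit a straight line: price ≈ slope * mileage + intercept
slope, intercept = np.polyfit(mileage, price, deg=1)

# Estimate the price of a car with 60,000 km on the clock
predicted = slope * 60 + intercept
print(f"slope={slope:.3f}, intercept={intercept:.3f}, prediction={predicted:.2f}")
```

Since price falls as mileage rises in this data, the fitted slope comes out negative, and the prediction for an unseen mileage sits on the fitted line.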
Regression Models
● This technique is used for forecasting, time-series modelling, and estimating causal relationships between variables.
● For example, the relationship between rash driving and the number of road accidents caused by a driver is best studied through regression.
● Here, we fit a curve or line to the data points in such a manner that the sum of squared distances of the data points from the curve or line is minimized.
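For a single feature, the line that minimizes the sum of squared distances has a well-known closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the means. A small sketch with made-up numbers:

```python
# Closed-form ordinary least squares for one feature:
#   slope = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x) ** 2)
#   intercept = mean_y - slope * mean_x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 4.0, 6.2, 7.9]  # roughly y = 2x

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x
```

For this data the fit comes out close to the underlying y = 2x relationship (slope ≈ 1.96, intercept ≈ 0.15).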
Why Use Regression Analysis?
● Regression Analysis estimates the relationship between two or more variables. Let’s understand this with
an easy example:
● Say, you want to estimate growth in sales of a company based on current economic conditions.
● You have recent company data which indicates that the growth in sales is around two and a half times the growth in the economy.
● Using this insight, we can predict future sales of the company based on current and past information.
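The arithmetic behind this example is a one-line prediction. The 2% economic growth figure below is an assumed input, not from the text:

```python
# Hypothetical figures: sales growth observed to be ~2.5x economic growth.
ratio = 2.5              # estimated from historical company data
economic_growth = 0.02   # assumed current economic growth: 2%

predicted_sales_growth = ratio * economic_growth
print(f"Predicted sales growth: {predicted_sales_growth:.1%}")
```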
● Regression Analysis, classified by the number of independent variables:
○ Simple Linear Regression (one independent variable)
○ Multi-variate Linear Regression (more than one independent variable)
Linear Regression
● Single Variable Linear Regression is a technique used to model the relationship between a single input
independent variable (feature variable) and an output dependent variable using a linear model i.e., a line.
● Multiple linear regression has more than one independent variable, whereas simple linear regression has only one independent variable.
● The more general case is Multi Variable Linear Regression where a model is created for the relationship
between multiple independent input variables (feature variables) and an output dependent variable.
● The model remains linear in that the output is a linear combination of the input variables.
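A multi-variable fit can be sketched with NumPy's least-squares solver. The features and prices below are invented; the point is that the prediction is a linear combination of the inputs, with a column of ones supplying the intercept:

```python
import numpy as np

# Hypothetical data: two features (e.g., house size and age) and a target (price).
X = np.array([[1.0, 50.0, 10.0],
              [1.0, 80.0,  5.0],
              [1.0, 60.0, 20.0],
              [1.0, 90.0,  2.0]])   # first column of ones gives the intercept
y = np.array([150.0, 240.0, 160.0, 275.0])

# Least-squares solution: coefficients of the linear combination.
coef, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
pred = X @ coef   # each prediction is a linear combination of the inputs
```

Because the model includes an intercept term, its squared error can never exceed that of simply predicting the mean of `y`.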
Estimate the Cost Function
● III. Obtain the best-fit line. (How?) Use the Least Squares Method.
● We choose the line whose parameters give the minimum value of the Cost Function.
R-Squared
● It is also known as the coefficient of determination, or the coefficient of multiple determination for multiple regression.
● It represents the proportion of the variance in the dependent variable that is explained by the independent variable(s).
● Correlation explains the strength of the relationship between an independent and a dependent variable, while R-squared explains to what extent the variance of one variable explains the variance of the second variable.
● The sum of squared errors must be at least 0, which means that R-squared can be at most 1.
● The higher the value, the better our model fits the data.
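R-squared can be computed directly from its definition, 1 minus the ratio of the residual sum of squares to the total sum of squares. A small sketch with invented predictions:

```python
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.8, 5.1, 7.2, 8.9]  # hypothetical predictions from some fitted line

mean_y = sum(y_true) / len(y_true)
ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # sum of squared errors
ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total variance (unscaled)

r_squared = 1 - ss_res / ss_tot
```

Since `ss_res` is at least 0, `r_squared` is at most 1; here the predictions are close to the targets, so it comes out near 1 (0.995).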
Other Performance Metrics
● Median Absolute Deviation (MAD): used when errors are skewed; should be as low as possible.
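One common reading of this metric for regression (as in scikit-learn's `median_absolute_error`) is the median of the absolute prediction errors, which, unlike the mean squared error, is robust to a few extreme errors. A sketch with invented errors:

```python
import statistics

errors = [0.2, -0.1, 0.4, -8.0, 0.3]  # one large outlier skews the errors

# Median of the absolute errors: the single -8.0 outlier barely moves it,
# whereas it would dominate a mean-of-squares metric.
mad = statistics.median(abs(e) for e in errors)
```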