Module-11.-Lesson-Proper
The Simple Linear Regression Model and the Least Squares Point Estimates
• The dependent (or response) variable is the variable we wish to understand or predict.
• The independent (or predictor) variable is the variable we will use to understand or predict the
dependent variable.
• Regression analysis is a statistical technique that uses observed data to relate the dependent
variable to one or more independent variables.
• The objective is to build a regression model that can describe, predict, and control the dependent
variable based on the independent variable.
REGRESSION NOTATION
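y = b0 + b1x + e
where:
• y is the predicted value of the dependent variable (y) for any given value of the independent variable (x).
• b0 is the intercept, the predicted value of y when x is 0.
• b1 is the regression coefficient (slope), the expected change in y for a one-unit increase in x.
• e is the error of the estimate, or how much variation there is in our estimate of the regression
coefficient.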
LET’S DO IT!
Complete the table of the given data:
1) Get the sum of the independent and dependent variables, x and y.
2) Find the mean of each variable by dividing the sum of all its values by the number of data points.
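Steps 1 and 2 can be sketched in Python. The data values and variable names below are hypothetical placeholders, not the values from the lesson's table; substitute the actual x and y columns.

```python
# Hypothetical data, used only for illustration (not the lesson's table).
x = [2, 4, 6, 8, 10]
y = [3, 7, 8, 12, 15]

n = len(x)

# Step 1: sum each variable.
sum_x = sum(x)
sum_y = sum(y)

# Step 2: mean = sum of all values / number of data points.
mean_x = sum_x / n
mean_y = sum_y / n

print(sum_x, sum_y, mean_x, mean_y)  # 30 45 6.0 9.0
```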
(y - Ȳ)^2 = (17)^2 = 289
(x - X̄)^2 = (8)^2 = 64
(y - Ȳ)(x - X̄) = (17)(8) = 136
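The same three calculations are repeated for every row of the table and each column is then totaled. A minimal sketch, using hypothetical data and variable names rather than the lesson's table:

```python
# Hypothetical data, used only for illustration (not the lesson's table).
x = [2, 4, 6, 8, 10]
y = [3, 7, 8, 12, 15]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Deviation of each value from its mean.
dev_x = [xi - mean_x for xi in x]
dev_y = [yi - mean_y for yi in y]

# Table columns: squared deviations and cross-products.
sq_dev_x = [d ** 2 for d in dev_x]
sq_dev_y = [d ** 2 for d in dev_y]
cross = [dx * dy for dx, dy in zip(dev_x, dev_y)]

# Column totals.
sum_sq_x = sum(sq_dev_x)
sum_sq_y = sum(sq_dev_y)
sum_cross = sum(cross)

print(sum_sq_x, sum_sq_y, sum_cross)  # 40.0 86.0 58.0
```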
b1 = 470/730
b1 = 0.644
b0 = Ȳ - b1 * X̄
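The slope and intercept calculations can be checked with a short sketch. It assumes that 470 is the sum of the cross-products (x - X̄)(y - Ȳ) and 730 is the sum of the squared x deviations, which is the usual least squares slope formula. The means passed to the hypothetical `intercept` helper are placeholders, since the lesson's table values are not reproduced here.

```python
# Column totals read from the worked example (interpretation assumed):
sum_cross = 470  # sum of (x - mean_x) * (y - mean_y)
sum_sq_x = 730   # sum of (x - mean_x) ** 2

# Slope: b1 = sum of cross-products / sum of squared x deviations.
b1 = sum_cross / sum_sq_x
print(round(b1, 3))  # 0.644


def intercept(mean_y, mean_x, slope):
    """Intercept: b0 = mean of y - slope * mean of x."""
    return mean_y - slope * mean_x


# Hypothetical means and slope, chosen only to show the call.
print(intercept(mean_y=10.0, mean_x=4.0, slope=2.0))  # 2.0
```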
σx = √[Σ(xi - X̄)^2 / N]
σy = √[Σ(yi - Ȳ)^2 / N]
σy = √(630/5) = √(126) = 11.225
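A minimal sketch of the standard deviation step. The helper name `population_sd` is hypothetical. The σx line assumes that 730, the denominator of b1, is the sum of squared x deviations.

```python
import math


def population_sd(sum_sq_dev, n):
    """Population standard deviation from a sum of squared deviations."""
    return math.sqrt(sum_sq_dev / n)


# 630 and N = 5 come from the worked example.
sigma_y = population_sd(630, 5)
print(round(sigma_y, 3))  # 11.225

# Assumption: 730 (the denominator of b1) is the sum of squared x deviations.
sigma_x = population_sd(730, 5)
print(round(sigma_x, 3))  # 12.083
```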
And finally, we compute the coefficient of
determination (R^2):
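One common form for simple linear regression is R^2 = [ (1/N) * Σ(xi - X̄)(yi - Ȳ) / (σx * σy) ]^2, which is the square of the correlation between x and y. A minimal sketch, assuming that form and reading 470, 730, and 630 as the column totals from the worked example:

```python
import math

# Totals from the worked example (interpretation assumed).
n = 5
sum_cross = 470  # sum of (x - mean_x) * (y - mean_y)
sum_sq_x = 730   # sum of (x - mean_x) ** 2
sum_sq_y = 630   # sum of (y - mean_y) ** 2

sigma_x = math.sqrt(sum_sq_x / n)
sigma_y = math.sqrt(sum_sq_y / n)

# Correlation coefficient, then its square.
r = (sum_cross / n) / (sigma_x * sigma_y)
r_squared = r ** 2

print(round(r, 3), round(r_squared, 2))  # 0.693 0.48
```

Under these assumptions, about 48% of the variation in y is explained by its linear relationship with x.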
Linear regression finds the line of best fit through your data by searching for the regression
coefficient (b1) that minimizes the total error (e) of the model.
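In the least squares approach, the total error is measured as the sum of squared errors. The sketch below uses hypothetical data and a hypothetical `sse` helper to show that the closed-form estimates give a smaller sum of squared errors than nearby lines.

```python
# Hypothetical data, used only for illustration.
x = [2, 4, 6, 8, 10]
y = [3, 7, 8, 12, 15]


def sse(b0, b1):
    """Sum of squared errors of the line y = b0 + b1 * x on the data."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Closed-form least squares estimates.
mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
b0 = mean_y - b1 * mean_x

# Shifting the slope or the intercept slightly increases the error.
best = sse(b0, b1)
for delta in (-0.1, 0.1):
    assert sse(b0, b1 + delta) > best
    assert sse(b0 + delta, b1) > best

print(round(b1, 2), round(b0, 2), round(best, 2))  # 1.45 0.3 1.9
```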
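Simple linear regression describes the relationship between two continuous (quantitative) variables.
One variable, denoted x, is regarded as the predictor, explanatory, or independent variable; the
other, denoted y, is regarded as the response or dependent variable.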
A regression line, also known as a trend line or line of best fit, is a straight line that describes
the relationship between two variables. Simple linear regression predicts the value of one variable (y)
that depends on a second variable (x), using the regression equation fitted to a given set of data.
Simple linear regression is similar to correlation in that the purpose is to measure to what extent
there is a linear relationship between two variables. In particular, the purpose of linear regression is
to "predict" the value of the dependent variable based upon the values of one or more independent
variables.
Regression analysis is the method of using observations (data records) to quantify the relationship
between a target variable (a field in the record set), also referred to as a dependent variable, and a
set of independent variables, also referred to as covariates.
There are three assumptions associated with a linear regression model:
• Linearity: The relationship between X and the mean of Y is linear.
• Homoscedasticity: The variance of the residuals is the same for any value of X.
• Independence: Observations are independent of each other.
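The first two assumptions are often examined informally by looking at the residuals (observed y minus predicted y) against x. The sketch below uses hypothetical data and is only a rough illustration, not a formal test. Independence usually depends on how the data were collected and is not checked here.

```python
# Hypothetical data, used only for illustration.
x = [2, 4, 6, 8, 10]
y = [3, 7, 8, 12, 15]

# Least squares fit.
mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
b0 = mean_y - b1 * mean_x

# Residual = observed y minus predicted y.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Listing residuals against x: a curved pattern would suggest non-linearity,
# and a spread that grows or shrinks with x would suggest unequal variance.
for xi, ei in zip(x, residuals):
    print(xi, round(ei, 2))
```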
Regression models describe the relationship between variables by fitting a line to the observed
data. Linear regression models use a straight line, while logistic and nonlinear regression models use
a curved line. Regression allows you to estimate how a dependent variable changes as the
independent variable(s) change.
The relationship between the independent and dependent variables is linear: the line of best fit through
the data points is a straight line (rather than a curve or some sort of grouping factor).