1.1.1 Simple Linear Regression
After the machine deciphers the parameter values from the data, it creates what is known as a model:
an algorithmic equation for producing an outcome with new data based on the rules derived from the
training data.
Once the model is prepared, it can be applied to new data and tested for accuracy. After the model
has passed both the training and test data stages, it is ready to be applied and used in the real world.
Another simple example is to create a model for predicting house values where 𝑦 is the actual house
price and 𝑋 are the variables that impact 𝑦, such as land size, location, and the number of rooms.
Through supervised learning, we can create a rule to predict 𝑦 (house value) based on the given
values of various variables (𝑋).
There are many types of regression, such as simple linear regression, multiple linear regression,
and polynomial regression.
Simple Linear Regression
Simple linear regression is a type of regression analysis in which there is exactly one independent
variable and the relationship between the independent (𝑋) and dependent (𝑌) variables is linear.
The independent variable, denoted by 𝑋, is also known as the predictor or explanatory variable.
The dependent variable, denoted by 𝑌, is also known as the response, outcome, or target variable.
Simple linear regression is called "simple" because it involves only one predictor variable.
In contrast, multiple linear regression, which we study later in this course, is called
"multiple" because it involves two or more predictor variables.
𝒚 = 𝒃𝟎 + 𝒃𝟏 𝒙
where 𝑏0 (the intercept) and 𝑏1 (the slope) are the parameters our algorithm will try to “learn” to
produce the most accurate predictions.
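The equation above can be sketched in Python as a small function. This is only an illustration: the function name `predict` and the parameter values 2.0 and 0.5 are hypothetical and are not learned from any data.

```python
def predict(x, b0, b1):
    """Return the predicted y for input x under the line y = b0 + b1 * x."""
    return b0 + b1 * x


# Hypothetical parameter values, chosen only to illustrate the formula.
b0_example, b1_example = 2.0, 0.5

# Intercept 2.0 plus slope 0.5 times x = 10.0
print(predict(10.0, b0_example, b1_example))  # 7.0
```

Learning then amounts to choosing the values of 𝑏0 and 𝑏1 for which such predictions are closest to the observed values.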
Example:
Consider a scenario where we want to determine the linear relationship between how much a
company spends on radio advertising each year and its annual sales in units sold. We want
to develop an equation that lets us predict units sold from the amount a company
spends on radio advertising. A sample of the dataset is given below:
radio   sales
37.8    22.1
39.3    10.4
45.9    9.3
41.3    18.5
10.8    12.9
48.9    7.2
32.8    11.8
A simple linear regression model can be formulated between sales (the dependent variable) and radio
(the independent variable):
𝒔𝒂𝒍𝒆𝒔 = 𝒃𝟎 + 𝒃𝟏 ∗ 𝒓𝒂𝒅𝒊𝒐
Our algorithm will try to learn the correct values for 𝒃𝟎 and 𝒃𝟏 using:
To evaluate model performance, we can use any of the following error metrics:
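As one possible illustration of both steps, the sketch below fits the model to the seven rows of the radio/sales table and then scores it. It assumes ordinary least squares for learning 𝑏0 and 𝑏1 and mean squared error (MSE) as the error metric. Both choices, and the function names, are assumptions made for this example, not methods stated in the text above.

```python
# Radio advertising spend and units sold, copied from the table above.
radio = [37.8, 39.3, 45.9, 41.3, 10.8, 48.9, 32.8]
sales = [22.1, 10.4, 9.3, 18.5, 12.9, 7.2, 11.8]


def fit_simple_linear_regression(x, y):
    """Estimate (b0, b1) by ordinary least squares (assumed method)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = s_xy / s_xx            # slope
    b0 = mean_y - b1 * mean_x   # intercept
    return b0, b1


def mean_squared_error(x, y, b0, b1):
    """Average squared difference between observed and predicted y (assumed metric)."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / len(x)


b0, b1 = fit_simple_linear_regression(radio, sales)
mse = mean_squared_error(radio, sales, b0, b1)
print(f"sales = {b0:.3f} + {b1:.3f} * radio (MSE = {mse:.3f})")
```

With only seven rows, the fitted values are illustrative and should not be read as a reliable estimate of the relationship between radio spending and sales.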