
EXPERIMENT NO: 1

AIM: To implement Linear Regression

THEORY:

Linear Regression is a supervised machine learning algorithm where the predicted output is continuous
and has a constant slope. It is used to predict values within a continuous range (e.g., sales, price) rather
than to classify them into categories (e.g., cat, dog). There are two main types:

Simple regression

Simple linear regression uses the traditional slope-intercept form, where m and b are the variables our
algorithm will try to “learn” to produce the most accurate predictions. x represents our input data and y
represents our prediction.

y=mx+b

Multivariable regression

A more complex, multi-variable linear equation might look like this, where w represents the coefficients,
or weights, our model will try to learn.

f(x,y,z) = w1·x + w2·y + w3·z

The variables x,y,z represent the attributes, or distinct pieces of information, we have about each
observation. For sales predictions, these attributes might include a company’s advertising spend on
radio, TV, and newspapers.

Sales = w1·Radio + w2·TV + w3·News
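As an illustrative sketch of this multivariable form (the function name, weights, and spend values below are made-up examples, not taken from any dataset in this experiment):

def predict_sales_multi(radio, tv, news, w1, w2, w3):
    # Sales = w1·Radio + w2·TV + w3·News
    return w1 * radio + w2 * tv + w3 * news

# Hypothetical spends and weights, purely for illustration
print(predict_sales_multi(37.8, 69.2, 55.1, 0.05, 0.2, 0.01))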

Simple regression

Let’s say we are given a dataset with the following columns (features): how much a company spends on
Radio advertising each year and its annual Sales in terms of units sold. We are trying to develop an
equation that will let us predict units sold based on how much a company spends on radio advertising.
The rows (observations) represent companies.

Our prediction function outputs an estimate of sales given a company’s radio advertising spend and our
current values for Weight and Bias.

Sales = Weight·Radio + Bias

Weight: The coefficient for the Radio independent variable. In machine learning we call coefficients
weights.

Radio: The independent variable. In machine learning we call these variables features.

Bias: The intercept, where our line crosses the y-axis. In machine learning we call intercepts bias.
Bias offsets all predictions that we make.

Our algorithm will try to learn the correct values for Weight and Bias. By the end of our training, our
equation will approximate the line of best fit.

Code

def predict_sales(radio, weight, bias):
    # Estimated units sold for a given radio advertising spend
    return weight * radio + bias
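For example, with a hypothetical weight of 0.5 and bias of 10, predict_sales(37.8, 0.5, 10) returns 0.5·37.8 + 10 = 28.9 estimated units sold; these numbers are purely illustrative and not from the experiment's dataset.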

Cost Function:

The prediction function on its own is not enough; to start optimizing our weights we also need a cost
function. Let's use MSE (L2) as our cost function. MSE measures the average squared difference between
an observation's actual and predicted values. The output is a single number representing the cost, or
score, associated with our current set of weights. Our goal is to minimize MSE to improve the accuracy
of our model.

Math: Given our simple linear equation

y=mx+b

we can calculate MSE as:

MSE = (1/N) Σ_{i=1}^{N} (y_i − (m·x_i + b))^2

● N is the total number of observations (data points)

● (1/N) Σ_{i=1}^{N} means we take the mean over all observations

● y_i is the actual value of an observation and m·x_i + b is our prediction

Code
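A minimal sketch of the MSE cost function, assuming the radio spends and sales figures are held in Python lists (or arrays) named radio and sales:

def cost_function(radio, sales, weight, bias):
    # Mean Squared Error over all companies in the dataset
    n = len(radio)
    total_error = 0.0
    for i in range(n):
        total_error += (sales[i] - (weight * radio[i] + bias)) ** 2
    return total_error / n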

Gradient descent

To minimize MSE we use Gradient Descent to calculate the gradient of our cost function. Gradient
descent consists of looking at the error our current weight gives us, using the derivative of the cost
function to find the gradient (the slope of the cost function at our current weight), and then changing
the weight to move in the direction opposite the gradient. We move opposite the gradient because the
gradient points up the slope rather than down it, and we want to decrease our error.

Math

There are two parameters (coefficients) in our cost function we can control: weight m and bias b. Since
we need to consider the impact each one has on the final prediction, we use partial derivatives. To find
the partial derivatives, we use the Chain rule. We need the chain rule because

(y−(mx+b))^2

is really two nested functions: the inner function y − (mx + b) and the outer function x^2. Returning to our
cost function:

f(m,b) = (1/N) Σ_{i=1}^{N} (y_i − (m·x_i + b))^2

Using the following:

(y_i − (m·x_i + b))^2 = A(B(m,b))

We can split the derivative into

A(x) = x^2

df/dx = A′(x) = 2x

and

B(m,b) = y_i − (m·x_i + b) = y_i − m·x_i − b

dx/dm = B′(m) = 0 − x_i − 0 = −x_i

dx/db = B′(b) = 0 − 0 − 1 = −1

and then using the Chain rule which states:

df/dm = df/dx · dx/dm

df/db = df/dx · dx/db

We then plug in each of the parts to get the following derivatives:

df/dm = A′(B(m,b)) · B′(m) = 2(y_i − (m·x_i + b)) · (−x_i)

df/db = A′(B(m,b)) · B′(b) = 2(y_i − (m·x_i + b)) · (−1)
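As an optional sanity check not present in the original derivation, SymPy (assuming it is installed) can confirm these partial derivatives symbolically:

import sympy as sp

# Symbols for a single observation and the model parameters
x_i, y_i, m, b = sp.symbols('x_i y_i m b')
cost = (y_i - (m * x_i + b)) ** 2

# Both differences simplify to 0, confirming the hand-derived partials above
print(sp.simplify(sp.diff(cost, m) - (-2 * x_i * (y_i - (m * x_i + b)))))  # 0
print(sp.simplify(sp.diff(cost, b) - (-2 * (y_i - (m * x_i + b)))))        # 0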

We can calculate the gradient of this cost function as:

f′(m,b) = [ df/dm , df/db ]

        = [ (1/N) Σ −2·x_i·(y_i − (m·x_i + b)) , (1/N) Σ −2·(y_i − (m·x_i + b)) ]
Code

To solve for the gradient, we iterate through our data points using our new weight and bias values and
take the average of the partial derivatives. The resulting gradient tells us the slope of our cost function
at our current position (i.e. weight and bias) and the direction we should update to reduce our cost
function (we move in the direction opposite the gradient). The size of our update is controlled by the
learning rate.
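A minimal sketch of one such update step, following the averaged partial derivatives above; the function and argument names are illustrative, and learning_rate is a hyperparameter we choose.

def update_weights(radio, sales, weight, bias, learning_rate):
    weight_deriv = 0.0
    bias_deriv = 0.0
    n = len(radio)

    for i in range(n):
        # Partial derivatives of MSE: -2x(y - (mx + b)) and -2(y - (mx + b))
        weight_deriv += -2 * radio[i] * (sales[i] - (weight * radio[i] + bias))
        bias_deriv += -2 * (sales[i] - (weight * radio[i] + bias))

    # Average the derivatives and step opposite the gradient
    weight -= (weight_deriv / n) * learning_rate
    bias -= (bias_deriv / n) * learning_rate

    return weight, bias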

Training

Training a model is the process of iteratively improving your prediction equation by looping through the
dataset multiple times, each time updating the weight and bias values in the direction indicated by the
slope of the cost function (gradient). Training is complete when we reach an acceptable error threshold,
or when subsequent training iterations fail to reduce our cost. Before training we need to initialize our
weights (set default values), set our hyperparameters (learning rate and number of iterations), and
prepare to log our progress over each iteration.
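A minimal sketch of such a training loop, assuming the cost_function and update_weights sketches shown earlier; the iteration count and logging interval are arbitrary choices.

def train(radio, sales, weight, bias, learning_rate, iters):
    cost_history = []

    for i in range(iters):
        weight, bias = update_weights(radio, sales, weight, bias, learning_rate)

        # Log progress so we can check that the cost is decreasing
        cost = cost_function(radio, sales, weight, bias)
        cost_history.append(cost)

        if i % 10 == 0:
            print("iter={}  weight={:.4f}  bias={:.4f}  cost={:.4f}".format(i, weight, bias, cost))

    return weight, bias, cost_history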
Visualizing:
Conclusion

In this experiment, we successfully implemented Linear Regression to predict outcomes based on a
dataset. The model's performance was evaluated using metrics like Mean Squared Error (MSE) and
R-squared (R²), demonstrating its ability to capture the relationship between the features and the target
variable. Linear Regression proved effective for predictive modeling with continuous data.
