They are the distances between the green circles and red squares. When you implement linear regression, you are actually trying to minimize these distances and make the red squares as close to the predefined green circles as possible.
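To make this concrete, here is a minimal sketch (not part of the original example) of what those distances, the residuals, look like in code; the data points and the candidate coefficients b_0 and b_1 below are invented purely for illustration:

import numpy as np

# hypothetical observations (the green circles)
x = np.array([5, 15, 25, 35, 45, 55])
y = np.array([5, 20, 14, 32, 22, 38])

# hypothetical candidate coefficients for the line f(x) = b_0 + b_1*x
b_0, b_1 = 5.6, 0.54

y_pred = b_0 + b_1 * x            # predicted responses (the red squares)
residuals = y - y_pred            # distances between circles and squares
print(np.sum(residuals ** 2))     # regression picks b_0, b_1 that minimize this sum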
Implementing Linear Regression in Python
It's time to start implementing linear regression in Python. Basically, all you should do is apply the proper packages and their functions and classes.
Python Packages for Linear Regression
The package NumPy is a fundamental Python scientific package that allows many high-performance operations on single- and multi-dimensional arrays. It also offers many mathematical routines. Of course, it's open source.
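As a quick, illustrative sketch (the arrays below are invented), this is what working with NumPy looks like:

import numpy as np

x = np.array([5, 15, 25, 35, 45, 55])   # one-dimensional array
X = x.reshape((-1, 1))                  # reshaped into a single-column, two-dimensional array

print(x.mean(), x.std())                # built-in mathematical routines
print(X.shape)                          # (6, 1)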
The package scikit-learn is a widely used Python library for machine learning, built on top of NumPy and some other packages. It provides the means for preprocessing data, reducing dimensionality, implementing regression, classification, clustering, and more. Like NumPy, scikit-learn is also open source.
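For orientation, here is a minimal sketch of the kind of preprocessing-plus-regression workflow described above, using scikit-learn's StandardScaler and LinearRegression classes (the data are invented for illustration):

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([5, 20, 14, 32, 22, 38])

x_scaled = StandardScaler().fit_transform(x)   # preprocessing step
model = LinearRegression().fit(x_scaled, y)    # regression step
print(model.intercept_, model.coef_)           # estimated coefficients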
If you want to implement linear regression and need functionality beyond the scope of scikit-learn, you should consider statsmodels. It's a powerful Python package for the estimation of statistical models, performing tests, and more. It's open source as well.
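Here is a rough sketch of the same kind of fit in statsmodels, using its ordinary least squares (OLS) class; the data are again invented:

import numpy as np
import statsmodels.api as sm

x = np.array([5, 15, 25, 35, 45, 55])
y = np.array([5, 20, 14, 32, 22, 38])

X = sm.add_constant(x)        # statsmodels does not add the intercept column automatically
results = sm.OLS(y, X).fit()  # fit the ordinary least squares model
print(results.params)         # estimated coefficients b_0, b_1
print(results.summary())      # detailed estimates and statistical tests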
Simple Linear Regression With scikit-learn
Let's start with the simplest case, which is simple linear regression. There are five basic steps when you're implementing linear regression (a short scikit-learn sketch of these steps is given after the list):
1. Import the packages and classes you need.
2. Provide data to work with, and eventually do appropriate transformations.
3. Create a regression model and fit it with existing data.
4. Check the results of model fitting to know whether the model is satisfactory.
5. Apply the model for predictions.
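A compact sketch of these five steps with scikit-learn might look like the following (the data are invented for illustration; the full worked example follows below):

# Step 1: import the packages and classes you need
import numpy as np
from sklearn.linear_model import LinearRegression

# Step 2: provide data and transform it into the expected two-dimensional shape
x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([5, 20, 14, 32, 22, 38])

# Step 3: create a regression model and fit it with existing data
model = LinearRegression().fit(x, y)

# Step 4: check the results of model fitting (coefficient of determination, intercept, slope)
print("R^2:", model.score(x, y))
print("intercept:", model.intercept_)
print("slope:", model.coef_)

# Step 5: apply the model for predictions
print("predictions:", model.predict(x))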
SOURCE CODE (a from-scratch implementation of simple linear regression with NumPy and Matplotlib):
import numpy as np
import matplotlib.pyplot as plt

def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)

    # mean of x and y vector
    m_x, m_y = np.mean(x), np.mean(y)

    # calculating cross-deviation and deviation about x
    SS_xy = np.sum(y*x) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x

    # calculating regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1*m_x

    return (b_0, b_1)

def plot_regression_line(x, y, b):
    # plotting the actual points as a scatter plot
    plt.scatter(x, y, color="m", marker="o", s=30)

    # predicted response vector
    y_pred = b[0] + b[1]*x

    # plotting the regression line
    plt.plot(x, y_pred, color="g")

    # putting labels
    plt.xlabel('x')
    plt.ylabel('y')

    # function to show plot
    plt.show()

def main():
    # observations
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])

    # estimating coefficients
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {}\nb_1 = {}".format(b[0], b[1]))

    # plotting regression line
    plot_regression_line(x, y, b)

if __name__ == "__main__":
    main()
OUTPUT:
Estimated coefficients:
b_0 = -0.05862068965