
Linear Regression 1

This document analyzes salary data using linear regression. It loads and explores a CSV file containing years of experience and salary for various employees. It then performs linear regression to find the slope and intercept of the line of best fit and plots the data and regression line.

Uploaded by Linya C

11/12/23, 2:34 PM Untitled0.ipynb - Colaboratory

from google.colab import files


uploaded=files.upload()



Salary_Data.csv(text/csv) - 352 bytes, last modified: 11/11/2023 - 100% done
Saving Salary_Data.csv to Salary_Data (2).csv

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
data=pd.read_csv('Salary_Data.csv')
print(data.shape)

(30, 2)

# X = independent variable (years of experience), Y = dependent variable (salary)


X=data['YearsExperience'].values
Y=data['Salary'].values
print('X=',X)
print('Y=',Y)

X= [ 1.1 1.3 1.5 2. 2.2 2.9 3. 3.2 3.2 3.7 3.9 4. 4. 4.1
4.5 4.9 5.1 5.3 5.9 6. 6.8 7.1 7.9 8.2 8.7 9. 9.5 9.6
10.3 10.5]
Y= [ 39343 46205 37731 43525 39891 56642 60150 54445 64445 57189
63218 55794 56957 57081 61111 67938 66029 83088 81363 93940
91738 98273 101302 113812 109431 105582 116969 112635 122391 121872]

x_mean=np.mean(X)
y_mean=np.mean(Y)
print('x_mean=',x_mean)
print('y_mean=',y_mean)

x_mean= 5.3133333333333335
y_mean= 76003.0

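# Least-squares slope: m = sum((x - x_mean)*(y - y_mean)) / sum((x - x_mean)**2)
N=len(X)
n=0  # numerator of the slope
d=0  # denominator of the slope
for i in range(N):
    n+=(X[i]-x_mean)*(Y[i]-y_mean)
    d+=(X[i]-x_mean)**2
m=n/d
print('slope,m=',m)
# Intercept: the fitted line passes through (x_mean, y_mean)
c=(y_mean-m*(x_mean))
print('intercept,c=',c)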

slope,m= 9449.962321455077
intercept,c= 25792.20019866869
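The same slope and intercept can be obtained without an explicit loop. The following is a minimal sketch, assuming the data are exactly the values printed for X and Y above; the helper name `fit_line` is hypothetical and not part of the original notebook. It also cross-checks the result against `np.polyfit`, which for degree 1 returns the slope first and the intercept second.

```python
import numpy as np

# Data copied from the printed X and Y arrays of the notebook.
X = np.array([1.1, 1.3, 1.5, 2.0, 2.2, 2.9, 3.0, 3.2, 3.2, 3.7,
              3.9, 4.0, 4.0, 4.1, 4.5, 4.9, 5.1, 5.3, 5.9, 6.0,
              6.8, 7.1, 7.9, 8.2, 8.7, 9.0, 9.5, 9.6, 10.3, 10.5])
Y = np.array([39343, 46205, 37731, 43525, 39891, 56642, 60150, 54445,
              64445, 57189, 63218, 55794, 56957, 57081, 61111, 67938,
              66029, 83088, 81363, 93940, 91738, 98273, 101302, 113812,
              109431, 105582, 116969, 112635, 122391, 121872], dtype=float)


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line (hypothetical helper)."""
    dx = x - x.mean()
    dy = y - y.mean()
    slope = np.sum(dx * dy) / np.sum(dx ** 2)
    intercept = y.mean() - slope * x.mean()
    return slope, intercept


m_vec, c_vec = fit_line(X, Y)

# Cross-check: polyfit returns coefficients from the highest power down.
m_fit, c_fit = np.polyfit(X, Y, 1)

print('slope:', m_vec, 'polyfit:', m_fit)
print('intercept:', c_vec, 'polyfit:', c_fit)
```

Both approaches should agree with the loop-based values printed above up to floating-point rounding.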

min_X=np.min(X)
max_X=np.max(X)
x=np.linspace(min_X,max_X,100)
y=m*x+c
plt.plot(x,y,color='red',label='Regression line')
plt.scatter(X,Y,color='blue',label='Data')
plt.xlabel('YEARS EXPERIENCE')
plt.ylabel('SALARY')
# Both artists carry a label, so legend() has entries to show
plt.legend()
plt.show()

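# R^2 = 1 - SS_res/SS_tot. The sums are initialised once, before the loop,
# and each observed Y[i] is compared with the prediction at X[i]
# (not with y, which holds the 100 points of the plotted line).
# The original cell reset a and b on every iteration and indexed y,
# which is why it printed r: -0.6919468281042018.
a=0
b=0
for i in range(N):
    y_pred=m*X[i]+c
    a+=(Y[i]-y_pred)**2
    b+=(Y[i]-y_mean)**2
r=1-(a/b)
print('r:',r)

r: approximately 0.957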
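For a least-squares line with an intercept, the coefficient of determination lies between 0 and 1 and equals the square of the Pearson correlation between X and Y, which gives a quick sanity check on any R² value. The sketch below assumes the data are exactly the values printed for X and Y earlier; the helper name `r_squared` is hypothetical and not part of the original notebook.

```python
import numpy as np

# Data copied from the printed X and Y arrays of the notebook.
X = np.array([1.1, 1.3, 1.5, 2.0, 2.2, 2.9, 3.0, 3.2, 3.2, 3.7,
              3.9, 4.0, 4.0, 4.1, 4.5, 4.9, 5.1, 5.3, 5.9, 6.0,
              6.8, 7.1, 7.9, 8.2, 8.7, 9.0, 9.5, 9.6, 10.3, 10.5])
Y = np.array([39343, 46205, 37731, 43525, 39891, 56642, 60150, 54445,
              64445, 57189, 63218, 55794, 56957, 57081, 61111, 67938,
              66029, 83088, 81363, 93940, 91738, 98273, 101302, 113812,
              109431, 105582, 116969, 112635, 122391, 121872], dtype=float)


def r_squared(x, y):
    """R^2 of the least-squares line through (x, y) (hypothetical helper)."""
    dx = x - x.mean()
    slope = np.sum(dx * (y - y.mean())) / np.sum(dx ** 2)
    intercept = y.mean() - slope * x.mean()
    y_pred = slope * x + intercept          # predictions at the observed x
    ss_res = np.sum((y - y_pred) ** 2)      # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)    # total sum of squares
    return 1 - ss_res / ss_tot


r2 = r_squared(X, Y)
pearson = np.corrcoef(X, Y)[0, 1]           # off-diagonal entry is corr(X, Y)

print('R^2:', r2)
print('Pearson r squared:', pearson ** 2)
```

The two printed numbers should match up to rounding. A negative value would indicate that the predictions are not those of the fitted line at the observed X values.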
