Exp4 DM 1

This document outlines a lab exercise for building a linear regression model using Python on a dataset containing faculty IDs and their corresponding incomes. The provided source code reads the dataset, calculates regression coefficients, and plots the data along with the regression line. The estimated coefficients obtained from the model are b0 = -32885.91 and b1 = 2580.54.

2/4 B. Tech AI&DS Data Mining Using Python Lab


EXP4.
Build a model using the linear regression algorithm on any dataset.

DATASET4EXP.csv:
FACID Income
25 25000
23 22000
24 28000
28 45000
30 37000
31 32000
30 50000
29 45000
28 60000
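
The source code in this exercise reads the table with `pd.read_csv` and its default comma separator, so the file presumably stores the two columns comma-separated under a `FACID,Income` header row. That layout is an assumption, since the listing above shows only the values. A minimal sketch that loads the same nine rows from an in-memory string, so it does not depend on the lab machine's file path:

```python
import io

import pandas as pd

# Assumed on-disk layout of the dataset: comma-separated, one header row.
CSV_TEXT = """FACID,Income
25,25000
23,22000
24,28000
28,45000
30,37000
31,32000
30,50000
29,45000
28,60000
"""

# io.StringIO lets read_csv parse the text exactly as it would parse a file.
data = pd.read_csv(io.StringIO(CSV_TEXT))

print(data.shape)
print(data.head())
```

Saving the same text to a `.csv` file and passing its path to `pd.read_csv` should give an identical DataFrame.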

Source Code

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# To read the dataset (FACID, Income) from the CSV file
dataFrame = pd.read_csv('C:\\Users\\AI&DS LAB\\Desktop\\DATAASETEXP4.csv')

dataFrame.head(20)

   FACID  Income
0     25   25000
1     23   22000
2     24   28000
3     28   45000
4     30   37000
5     31   32000
6     30   50000
7     29   45000
8     28   60000

# To place the data into FACID and income vectors
FACID = dataFrame['FACID']
income = dataFrame['Income']

# number of points
num = np.size(FACID)
# To find the mean of the FACID and income vectors
mean_FACID = np.mean(FACID)
mean_income = np.mean(income)

# calculating the cross-deviation of FACID and income, and the deviation of FACID about its mean
CD_FACIDincome = np.sum(income*FACID) - num*mean_income*mean_FACID
CD_FACIDFACID = np.sum(FACID*FACID) - num*mean_FACID*mean_FACID

# calculating regression coefficients (b1 = slope, b0 = intercept)
b1 = CD_FACIDincome / CD_FACIDFACID
b0 = mean_income - b1*mean_FACID

# to display coefficients
print("Estimated Coefficients :")
print("b0 = ",b0,"\nb1 = ",b1)

# To plot the actual points as a scatter plot
plt.scatter(FACID, income, color = "b",marker = "o")

# To predict the response vector
response_Vec = b0 + b1*FACID

# To plot the regression line
plt.plot(FACID, response_Vec, color = "r")

# Placing labels
plt.xlabel('FACID')
plt.ylabel('Income')

# To display plot
plt.show()
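
In formula form, the script appears to implement the usual least-squares estimates for a line y = b0 + b1 x, with x = FACID and y = Income. The symbols S_xy and S_xx are shorthand introduced here for `CD_FACIDincome` and `CD_FACIDFACID`; the sums are taken over the nine rows of the dataset.

```latex
\begin{aligned}
n &= 9, \qquad \sum x_i = 248, \qquad \sum y_i = 344000, \\
\sum x_i y_i &= 9650000, \qquad \sum x_i^2 = 6900, \\
S_{xy} &= \sum x_i y_i - n\,\bar{x}\,\bar{y}
        = 9650000 - \frac{248 \cdot 344000}{9} \approx 170888.89, \\
S_{xx} &= \sum x_i^2 - n\,\bar{x}^2
        = 6900 - \frac{248^2}{9} \approx 66.22, \\
b_1 &= \frac{S_{xy}}{S_{xx}} \approx 2580.54, \\
b_0 &= \bar{y} - b_1\,\bar{x}
     \approx 38222.22 - 2580.54 \cdot 27.556 \approx -32885.91.
\end{aligned}
```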

OUTPUT:

Estimated Coefficients :
b0 = -32885.90604026866
b1 = 2580.536912751685
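
One way to cross-check these coefficients is to fit the same line with NumPy's `np.polyfit`, which for degree 1 returns the slope followed by the intercept. This is only a sanity-check sketch, not part of the original exercise; the arrays `facid` and `income_values` are hypothetical names that hard-code the nine dataset rows so the snippet runs without the CSV file.

```python
import numpy as np

# The nine rows of the dataset, hard-coded so no CSV file is needed.
facid = np.array([25, 23, 24, 28, 30, 31, 30, 29, 28], dtype=float)
income_values = np.array(
    [25000, 22000, 28000, 45000, 37000, 32000, 50000, 45000, 60000],
    dtype=float,
)

# Degree-1 least-squares fit: coefficients come back highest power first,
# i.e. [slope, intercept].
slope, intercept = np.polyfit(facid, income_values, 1)

print("b0 (intercept) =", round(intercept, 2))
print("b1 (slope)     =", round(slope, 2))
```

If the manual calculation is correct, the rounded values should agree with the b0 and b1 printed by the lab script.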