Simple Linear regression-LAB4.ipynb - Colaboratory

This notebook analyzes the relationship between students' Grade 10 percentage and their salary after obtaining an MBA degree. It imports the necessary libraries, loads and inspects a dataset of grades and salaries for 50 students, splits the data into training and test sets, fits a linear regression model that predicts salary from grades on the training set, and visualizes the results. The fitted model has a slope of about 1504.41 and an intercept of about 152845.01.

Uploaded by

PATTABHI RAMANJANEYULU


# Import necessary libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Importing the dataset
df = pd.read_csv('MBA Salary.csv')

df.head()

   S. No.  Percentage in Grade 10  Salary
0       1                   62.00  270000
1       2                   76.33  200000
2       3                   72.00  240000
3       4                   60.00  250000
4       5                   61.00  180000

df.info()

RangeIndex: 50 entries, 0 to 49
Data columns (total 3 columns):
 #   Column                  Non-Null Count  Dtype
---  ------                  --------------  -----
 0   S. No.                  50 non-null     int64
 1   Percentage in Grade 10  50 non-null     float64
 2   Salary                  50 non-null     int64
dtypes: float64(1), int64(2)
memory usage: 1.3 KB

print(df.shape)

(50, 3)

# View descriptive statistics

print(df.describe())

         S. No.  Percentage in Grade 10         Salary
count  50.00000               50.000000      50.000000
mean   25.50000               63.922400  258192.000000
std    14.57738                9.859937   76715.790993
min     1.00000               37.330000  120000.000000
25%    13.25000               57.685000  204500.000000
50%    25.50000               64.700000  250000.000000
75%    37.75000               70.000000  300000.000000
max    50.00000               83.000000  450000.000000
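
The `std` row reported by `describe()` is the sample standard deviation (divisor n - 1), whereas NumPy's `np.std` defaults to the population form (divisor n). A minimal sketch of the difference, using the 50 `Percentage in Grade 10` values that this notebook prints further down; the name `grades` is illustrative and does not appear in the original:

```python
import numpy as np

# The 50 grade percentages, copied from the notebook's printed array.
grades = np.array([
    62., 76.33, 72., 60., 61., 55., 70., 68., 82.8,
    59., 58., 60., 66., 83., 68., 37.33, 79., 68.4,
    70., 59., 63., 50., 69., 52., 49., 64.6, 50.,
    74., 58., 67., 75., 60., 55., 78., 50.08, 56.,
    68., 52., 54., 52., 76., 64.8, 74.4, 74.5, 73.5,
    57.58, 68., 69., 66., 60.8,
])

mean = grades.mean()
sample_std = grades.std(ddof=1)      # divisor n - 1, the form describe() reports
population_std = grades.std(ddof=0)  # divisor n, NumPy's default

print(round(mean, 4), round(sample_std, 6), round(population_std, 6))
```

With only 50 rows the two values differ by about one percent, which matters mainly when comparing numbers produced by different libraries.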

# Declare feature variable and target variable

X = df['Percentage in Grade 10']

y = df['Salary']

# Plot scatter plot between X and y

plt.scatter(X, y, color = 'blue', label='Scatter Plot')

plt.title('Relationship between Grades and Salary of a person')

plt.xlabel('Percentage in Grade 10')

plt.ylabel('Salary')

plt.legend(loc=4)

plt.show()

# Print the dimensions of X and y
print(X.shape)

print(y.shape)

(50,)

(50,)

0 62.00

1 76.33

2 72.00

3 60.00

4 61.00

5 55.00

6 70.00

7 68.00

8 82.80

9 59.00

10 58.00

11 60.00

12 66.00

13 83.00

14 68.00

15 37.33

16 79.00

17 68.40

18 70.00

19 59.00

20 63.00

21 50.00

22 69.00

23 52.00

24 49.00

25 64.60

26 50.00

27 74.00

28 58.00

29 67.00

30 75.00

31 60.00

32 55.00

33 78.00

34 50.08

35 56.00

36 68.00

37 52.00

38 54.00

39 52.00

40 76.00

41 64.80

42 74.40

43 74.50

44 73.50

45 57.58

46 68.00

47 69.00

48 66.00

49 60.80

Name: Percentage in Grade 10, dtype: float64

X=np.array(X)

y=np.array(y)

array([62.  , 76.33, 72.  , 60.  , 61.  , 55.  , 70.  , 68.  , 82.8 ,
       59.  , 58.  , 60.  , 66.  , 83.  , 68.  , 37.33, 79.  , 68.4 ,
       70.  , 59.  , 63.  , 50.  , 69.  , 52.  , 49.  , 64.6 , 50.  ,
       74.  , 58.  , 67.  , 75.  , 60.  , 55.  , 78.  , 50.08, 56.  ,
       68.  , 52.  , 54.  , 52.  , 76.  , 64.8 , 74.4 , 74.5 , 73.5 ,
       57.58, 68.  , 69.  , 66.  , 60.8 ])

# Reshape X and y

X = X.reshape(-1,1)

y = y.reshape(-1,1)

# Print the dimensions of X and y after reshaping

print(X.shape)

print(y.shape)

(50, 1)

(50, 1)
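
The reshape is needed because scikit-learn estimators expect the feature matrix as a 2-D array of shape `(n_samples, n_features)`. A small sketch on the first five grades from `df.head()`; the names `x` and `x_col` are illustrative:

```python
import numpy as np

x = np.array([62.0, 76.33, 72.0, 60.0, 61.0])  # first five grades from df.head()
print(x.shape)            # (5,)

# -1 asks NumPy to infer the number of rows; 1 means a single feature column.
x_col = x.reshape(-1, 1)
print(x_col.shape)        # (5, 1)
```

Only the feature array has to be 2-D. `LinearRegression` also accepts a 1-D target, in which case `coef_` and `intercept_` come back with one dimension fewer than the values printed in this notebook.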

# Split X and y into training and test data sets

# random_state fixes the random seed, so the same split is produced on every run

from sklearn.model_selection import train_test_split

X_train,X_test,y_train,y_test = train_test_split(X, y, test_size=0.30, random_state=42)

# Print the dimensions of X_train,X_test,y_train,y_test

print(X_train.shape)

print(y_train.shape)

print(X_test.shape)

print(y_test.shape)

(33, 1)

(17, 1)
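
The sizes of the two sets follow from `test_size` and the number of rows. As a rough sketch, assuming scikit-learn rounds the test count up and assigns the remaining rows to the training set (the helper `split_sizes` is illustrative and not part of scikit-learn):

```python
import math

def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size.

    Assumption: the test count is rounded up and the rest goes to training.
    """
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test

print(split_sizes(50, 0.30))  # (35, 15)
print(split_sizes(50, 0.33))  # (33, 17)
```

Under that assumption, 33 training rows and 17 test rows correspond to a test fraction of about one third, while `test_size=0.30` on 50 rows would give 35 and 15. Rerunning the split cell and the shape cell together keeps the printed shapes in step with the code.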

# Fit the linear model

# Instantiate the linear regression object lm

from sklearn.linear_model import LinearRegression

lm = LinearRegression()

# Train the model using training data sets

lm.fit(X_train,y_train)
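
For a single feature, the ordinary least squares line that `LinearRegression` fits can also be written in closed form: slope = cov(x, y) / var(x) and intercept = mean(y) - slope * mean(x). A minimal sketch on the five rows shown by `df.head()`, so the numbers will differ from the model fitted on the training set; all variable names here are local to the example:

```python
import numpy as np

# Five rows from df.head(): grade percentage and salary.
x = np.array([62.0, 76.33, 72.0, 60.0, 61.0])
y = np.array([270000.0, 200000.0, 240000.0, 250000.0, 180000.0])

# Closed-form simple linear regression.
x_centered = x - x.mean()
slope = np.sum(x_centered * (y - y.mean())) / np.sum(x_centered ** 2)
intercept = y.mean() - slope * x.mean()

# Cross-check against NumPy's degree-1 polynomial fit.
check_slope, check_intercept = np.polyfit(x, y, deg=1)
print(slope, intercept)
```

The cross-check with `np.polyfit` is only there to confirm that the closed form and a library fit agree on the same data.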

# Predict on the test data

y_pred=lm.predict(X_test)
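
The notebook computes `y_pred` but does not score it. One possible way to do so, sketched here with NumPy only, is to compute the root mean squared error and R². The helpers `rmse` and `r2` are illustrative names, and the five rows below come from `df.head()`, not from the actual test set, so the printed values say nothing about the real test performance:

```python
import numpy as np

def rmse(y_true, y_hat):
    """Root mean squared error between observed and predicted values."""
    return float(np.sqrt(np.mean((y_true - y_hat) ** 2)))

def r2(y_true, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_hat) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Five rows from df.head(), used only to exercise the helpers.
x = np.array([62.0, 76.33, 72.0, 60.0, 61.0])
y_true = np.array([270000.0, 200000.0, 240000.0, 250000.0, 180000.0])

# Predictions from the slope and intercept reported by this notebook.
y_hat = 152845.01374103 + 1504.41195413 * x

print(rmse(y_true, y_hat), r2(y_true, y_hat))
```

In the notebook itself the same helpers could be called as `rmse(y_test, y_pred)` and `r2(y_test, y_pred)`.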

# Visualising the Training set results

plt.scatter(X_train, y_train, color = 'red')

plt.plot(X_train, lm.predict(X_train), color = 'blue')

plt.title('Training set results')

plt.xlabel('Grades')

plt.ylabel('Salary')

plt.show()

# Visualising the Test set results

plt.scatter(X_test, y_test, color = 'red')

plt.plot(X_test, lm.predict(X_test), color = 'blue')

plt.title('Test set results')

plt.xlabel('Grades')

plt.ylabel('Salary')

plt.show()

# Compute model slope and intercept

slope = lm.coef_

intercept = lm.intercept_

print("Estimated model slope:", slope)

print("Estimated model intercept:", intercept)

Estimated model slope: [[1504.41195413]]

Estimated model intercept: [152845.01374103]

X_new = [[80]]

lm.predict(X_new)

array([[273197.97007155]])
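
The prediction can be checked by hand, since the model is just salary = intercept + slope * grade. A short sketch using the coefficients reported above (the variable names are illustrative):

```python
slope = 1504.41195413        # reported lm.coef_
intercept = 152845.01374103  # reported lm.intercept_
grade = 80                   # Percentage in Grade 10 to predict for

predicted_salary = intercept + slope * grade
print(round(predicted_salary, 2))  # 273197.97
```

The slope reads as roughly 1504 more in predicted salary for each additional percentage point in Grade 10.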