Multiple_Linear_Regression - Colaboratory

The document outlines the process of performing multiple linear regression using a dataset of 50 startups. It includes steps for data preprocessing, such as importing libraries, loading the dataset, checking for null values, and encoding categorical data. Additionally, it describes splitting the dataset into features and target variables for model training and testing.

Uploaded by

jothiga835

1/24/23, 7:52 PM Multiple_Linear_Regression - Colaboratory

Multiple Linear Regression

0. Data Preprocessing

0.1 Importing the libraries

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

from google.colab import drive

# Mount Google Drive so the CSV file is reachable from the Colab runtime
drive.mount('/content/drive')

Mounted at /content/drive

# Path to the CSV file (no trailing slash, since it is a file and not a folder)
path = '/content/drive/My Drive/50_Startups.csv'

0.2 Importing the dataset

dataset = pd.read_csv(path)
dataset

    R&D Spend  Administration  Marketing Spend       State     Profit
10  101913.08       110594.11        229160.95     Florida  146121.95
11  100671.96        91790.61        249744.55  California  144259.40
12   93863.75       127320.38        249839.44     Florida  141585.52
13   91992.39       135495.07        252664.93  California  134307.35
14  119943.24       156547.42        256512.92     Florida  132602.65
15  114523.61       122616.84        261776.23    New York  129917.04
16   78013.11       121597.55        264346.06  California  126992.93
17   94657.16       145077.58        282574.31    New York  125370.37
18   91749.16       114175.79        294919.57     Florida  124266.90
19   86419.70       153514.11             0.00    New York  122776.86
20   76253.86       113867.30        298664.47  California  118474.03
21   78389.47       153773.43        299737.29    New York  111313.02
22   73994.56       122782.75        303319.26     Florida  110352.25
23   67532.53       105751.03        304768.73     Florida  108733.99
24   77044.01        99281.34        140574.81    New York  108552.04
25   64664.71       139553.16        137962.62  California  107404.34
26   75328.87       144135.98        134050.07     Florida  105733.54
27   72107.60       127864.55        353183.81    New York  105008.31
28   66051.52       182645.56        118148.20     Florida  103282.38
29   65605.48       153032.06        107138.38    New York  101004.64
30   61994.48       115641.28         91131.24     Florida   99937.59
31   61136.38       152701.92         88218.23    New York   97483.56
32   63408.86       129219.61         46085.25  California   97427.84
33   55493.95       103057.49        214634.81     Florida   96778.92
34   46426.07       157693.92        210797.67  California   96712.80
35   46014.02        85047.44        205517.64    New York   96479.51
36   28663.76       127056.21        201126.82     Florida   90708.19
37   44069.95        51283.14        197029.42  California   89949.14
38   20229.59        65947.93        185265.10    New York   81229.06
39   38558.51        82982.09        174999.30  California   81005.76
40   28754.33       118546.05        172795.67  California   78239.91
41   27892.92        84710.77        164470.71     Florida   77798.83
42   23640.93        96189.63        148001.11  California   71498.49
43   15505.73       127382.30         35534.17    New York   69758.98
44   22177.74       154806.14         28334.72  California   65200.33
45    1000.23       124153.04          1903.93    New York   64926.08
46    1315.46       115816.21        297114.46     Florida   49490.75
47       0.00       135426.92             0.00  California   42559.73
48     542.05        51743.15             0.00    New York   35673.41
49       0.00       116983.80         45173.06  California   14681.40

0.3 Check if any null value

dataset.isna().sum()

R&D Spend          0
Administration     0
Marketing Spend    0
State              0
Profit             0
dtype: int64

dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 50 entries, 0 to 49
Data columns (total 5 columns):
 #   Column           Non-Null Count  Dtype
---  ------           --------------  -----
 0   R&D Spend        50 non-null     float64
 1   Administration   50 non-null     float64
 2   Marketing Spend  50 non-null     float64
 3   State            50 non-null     object
 4   Profit           50 non-null     float64
dtypes: float64(4), object(1)
memory usage: 2.1+ KB
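The null check above reports no missing values, yet the table does contain 0.00 entries (for example the Marketing Spend of rows 19, 47 and 48). A minimal sketch of the difference, using a small hypothetical frame (`demo` and its values are made up for illustration; only the column names follow the dataset): `isna()` counts NaN entries, while literal zeros have to be counted separately if they matter for the analysis.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row frame; only the column names follow the dataset.
demo = pd.DataFrame({
    "R&D Spend": [1000.23, 0.00, np.nan],
    "Marketing Spend": [1903.93, 0.00, 45173.06],
})

null_counts = demo.isna().sum()   # NaN / None entries per column
zero_counts = (demo == 0).sum()   # literal zeros per column (NaN is not equal to 0)

print(null_counts)
print(zero_counts)
```

Whether a zero spend is a genuine value or a placeholder for missing data is a judgment the notebook does not make; this sketch only shows how the two can be told apart.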

0.4 Split into X & y

X = dataset.drop('Profit', axis=1)
X

    R&D Spend  Administration  Marketing Spend       State
10  101913.08       110594.11        229160.95     Florida
11  100671.96        91790.61        249744.55  California
12   93863.75       127320.38        249839.44     Florida
13   91992.39       135495.07        252664.93  California
14  119943.24       156547.42        256512.92     Florida
15  114523.61       122616.84        261776.23    New York
16   78013.11       121597.55        264346.06  California
17   94657.16       145077.58        282574.31    New York
18   91749.16       114175.79        294919.57     Florida
19   86419.70       153514.11             0.00    New York
20   76253.86       113867.30        298664.47  California
21   78389.47       153773.43        299737.29    New York
22   73994.56       122782.75        303319.26     Florida
23   67532.53       105751.03        304768.73     Florida
24   77044.01        99281.34        140574.81    New York
25   64664.71       139553.16        137962.62  California
26   75328.87       144135.98        134050.07     Florida
27   72107.60       127864.55        353183.81    New York
28   66051.52       182645.56        118148.20     Florida
29   65605.48       153032.06        107138.38    New York
30   61994.48       115641.28         91131.24     Florida
31   61136.38       152701.92         88218.23    New York
32   63408.86       129219.61         46085.25  California
33   55493.95       103057.49        214634.81     Florida
34   46426.07       157693.92        210797.67  California
35   46014.02        85047.44        205517.64    New York
36   28663.76       127056.21        201126.82     Florida
37   44069.95        51283.14        197029.42  California
38   20229.59        65947.93        185265.10    New York
39   38558.51        82982.09        174999.30  California
40   28754.33       118546.05        172795.67  California
41   27892.92        84710.77        164470.71     Florida
42   23640.93        96189.63        148001.11  California
43   15505.73       127382.30         35534.17    New York
44   22177.74       154806.14         28334.72  California
45    1000.23       124153.04          1903.93    New York
46    1315.46       115816.21        297114.46     Florida
47       0.00       135426.92             0.00  California
48     542.05        51743.15             0.00    New York
49       0.00       116983.80         45173.06  California


y = dataset['Profit']
y

0 192261.83
1 191792.06
2 191050.39
3 182901.99
4 166187.94
5 156991.12
6 156122.51
7 155752.60
8 152211.77
9 149759.96
10 146121.95
11 144259.40
12 141585.52
13 134307.35
14 132602.65
15 129917.04
16 126992.93
17 125370.37
18 124266.90
19 122776.86
20 118474.03
21 111313.02
22 110352.25
23 108733.99
24 108552.04
25 107404.34
26 105733.54
27 105008.31
28 103282.38
29 101004.64
30 99937.59
31 97483.56
32 97427.84
33 96778.92
34 96712.80
35 96479.51
36 90708.19
37 89949.14
38 81229.06
39 81005.76
40 78239.91
41 77798.83
42 71498.49
43 69758.98
44 65200.33
45 64926.08
46 49490.75
47 42559.73
48 35673.41
49 14681.40
Name: Profit, dtype: float64
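The split above can be sanity-checked on a small scale. The sketch below uses a hypothetical two-row frame (`demo`, `X_demo` and `y_demo` are illustrative names, not part of the notebook) to show that `drop` returns a new frame and leaves the original untouched, and that the features and the target keep the same row index, which is what keeps them aligned later on.

```python
import pandas as pd

# Hypothetical two-row frame using the same column names as the dataset.
demo = pd.DataFrame({
    "R&D Spend": [1000.23, 1315.46],
    "State": ["New York", "Florida"],
    "Profit": [64926.08, 49490.75],
})

X_demo = demo.drop("Profit", axis=1)   # features: every column except the target
y_demo = demo["Profit"]                # target: a Series sharing demo's index

print(list(X_demo.columns))
print("Profit" in demo.columns)              # the original frame is unchanged
print(X_demo.index.equals(y_demo.index))     # features and target stay aligned
```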

0.5 Encoding categorical data

from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer

# One-hot encode the "State" column and pass the numeric columns through unchanged
categorical_feature = ["State"]
one_hot = OneHotEncoder()
transformer = ColumnTransformer([("one_hot",
                                  one_hot,
                                  categorical_feature)],
                                remainder="passthrough")

transformed_X = transformer.fit_transform(X)
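To make the column layout of the transformed array easier to read, here is a minimal sketch on a hypothetical three-row frame (`demo_X`, `demo_transformer` and the spend values are made up for illustration). It assumes the same encoder setup as above, plus `sparse_threshold=0`, which is added here only so that the result is always a dense NumPy array. The one-hot columns come first, in sorted category order, followed by the passed-through numeric column.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical three-row frame; values are illustrative only.
demo_X = pd.DataFrame({
    "R&D Spend": [100.0, 200.0, 300.0],
    "State": ["New York", "California", "Florida"],
})

demo_transformer = ColumnTransformer(
    [("one_hot", OneHotEncoder(), ["State"])],
    remainder="passthrough",   # keep the numeric column unchanged
    sparse_threshold=0,        # assumption for this sketch: force dense output
)
demo_encoded = demo_transformer.fit_transform(demo_X)

# Categories are stored in sorted order, which fixes the order of the one-hot columns.
categories = list(demo_transformer.named_transformers_["one_hot"].categories_[0])
print(categories)
print(demo_encoded)
```

Read this way, a row starting with `0.0 0.0 1.0` is a New York row, `1.0 0.0 0.0` is California and `0.0 1.0 0.0` is Florida.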

pd.DataFrame(transformed_X).head()


     0    1    2          3          4          5
0  0.0  0.0  1.0  165349.20  136897.80  471784.10
1  1.0  0.0  0.0  162597.70  151377.59  443898.53
2  0.0  1.0  0.0  153441.51  101145.55  407934.54
3  0.0  0.0  1.0  144372.41  118671.85  383199.62
4  0.0  1.0  0.0  142107.34   91391.77  366168.42

0.6 Splitting the dataset into the Training set and Test set

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(transformed_X, y, test_size=0.25, random_state=2509)
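With 50 rows and `test_size=0.25`, the test set size is rounded up to 13 rows, leaving 37 for training. The sketch below checks this on stand-in data of the same length (`demo_X` and `demo_y` are placeholder arrays, not the startup data) and also shows that repeating the call with the same `random_state` reproduces the same split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data with the same number of rows (50) as the dataset.
demo_X = np.arange(100).reshape(50, 2)
demo_y = np.arange(50)

X_tr, X_te, y_tr, y_te = train_test_split(
    demo_X, demo_y, test_size=0.25, random_state=2509
)
print(X_tr.shape, X_te.shape)   # 37 training rows, 13 test rows

# Same seed, same split.
_, X_te_again, _, _ = train_test_split(
    demo_X, demo_y, test_size=0.25, random_state=2509
)
print(np.array_equal(X_te, X_te_again))
```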

1. Training the Multiple Linear Regression model on the Training set

from sklearn.linear_model import LinearRegression

regressor = LinearRegression()
regressor.fit(X_train, y_train)

LinearRegression()
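Fitting estimates one coefficient per input column plus an intercept, available afterwards as `coef_` and `intercept_`. As a rough illustration, the sketch below fits the same estimator on synthetic, noise-free data (the coefficients 5, 2 and -3 and the names `demo_X`, `demo_y`, `demo_model` are chosen for this example only) and recovers the values that generated it.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free data: y = 5 + 2*x1 - 3*x2 (coefficients chosen for illustration).
rng = np.random.default_rng(0)
demo_X = rng.uniform(0, 10, size=(20, 2))
demo_y = 5 + 2 * demo_X[:, 0] - 3 * demo_X[:, 1]

demo_model = LinearRegression().fit(demo_X, demo_y)

print(demo_model.intercept_)   # close to 5
print(demo_model.coef_)        # close to [2, -3]
```

On the startup data the same two attributes of `regressor` would hold the fitted intercept and the six coefficients (three state indicators followed by the three spend columns); their values are not shown in the notebook.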

1.1 Score

regressor.score(X_test, y_test)

0.9840064291741644

2. Predicting the Test set results

y_pred = regressor.predict(X_test)

d = {'y_pred': y_pred, 'y_test': y_test}

2.1 Compare Predicted results

pd.DataFrame(d)

           y_pred     y_test
32   98884.371543   97427.84
33  100047.235184   96778.92
47   47766.247901   42559.73
9   154976.558305  149759.96
37   91129.087779   89949.14
8   151755.926389  152211.77
23  112436.195860  108733.99
24  113375.898676  108552.04
17  130706.106786  125370.37
1   189141.730655  191792.06
39   85217.422839   81005.76
22  116952.737156  110352.25
46   60343.602070   49490.75
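For a regressor, `score` returns the coefficient of determination, R² = 1 - SS_res / SS_tot. The value reported in section 1.1 can be reproduced by hand from the thirteen rows of the comparison table. The sketch below copies those values into arrays; the root-mean-squared error is an extra metric added here for illustration and is not computed in the notebook.

```python
import numpy as np

# Values copied from the comparison table in section 2.1.
y_test = np.array([97427.84, 96778.92, 42559.73, 149759.96, 89949.14,
                   152211.77, 108733.99, 108552.04, 125370.37, 191792.06,
                   81005.76, 110352.25, 49490.75])
y_pred = np.array([98884.371543, 100047.235184, 47766.247901, 154976.558305,
                   91129.087779, 151755.926389, 112436.195860, 113375.898676,
                   130706.106786, 189141.730655, 85217.422839, 116952.737156,
                   60343.602070])

ss_res = np.sum((y_test - y_pred) ** 2)          # residual sum of squares
ss_tot = np.sum((y_test - y_test.mean()) ** 2)   # total sum of squares
r2 = 1 - ss_res / ss_tot                         # should land close to 0.984

rmse = np.sqrt(ss_res / len(y_test))             # typical error, in the units of Profit

print(round(r2, 4))
print(round(rmse, 2))
```

Because the predictions in the table are printed to six decimals, the hand-computed R² agrees with the notebook's score only up to that rounding.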
