Experiment No.8

The document outlines an experiment to implement a simple Linear Regression algorithm using a salary dataset from Kaggle. It includes steps for loading data, training a model, making predictions, evaluating the model's performance, and visualizing results. Key metrics such as Mean Squared Error and R-squared are calculated, and the model parameters are printed, along with a prediction for a specific input.

Uploaded by

aryanbajpai916

Experiment No. 8
Aim: Implement and demonstrate the simple Linear Regression algorithm on a given set
of training data samples. Read the training data from a .CSV file. Use the salary dataset
from Kaggle.
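
For reference, simple linear regression fits a line y = b0 + b1 * x by least squares, and with a single feature the slope and intercept have a closed form. The sketch below is only an illustration and is not part of the original experiment: the function name `fit_simple_linear_regression` and the toy data are assumptions made for this example.

```python
import numpy as np

def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: sum of cross-deviations divided by the sum of squared x-deviations.
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means.
    intercept = y_mean - slope * x_mean
    return intercept, slope

# Hypothetical toy data lying exactly on y = 2x + 1.
toy_x = [1.0, 2.0, 3.0, 4.0]
toy_y = [3.0, 5.0, 7.0, 9.0]
b0, b1 = fit_simple_linear_regression(toy_x, toy_y)
print(f"intercept={b0:.2f}, slope={b1:.2f}")
```

On real data, scikit-learn's `LinearRegression` should recover the same two numbers through `intercept_` and `coef_[0]`, up to floating-point error.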

Python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Load the dataset

df = pd.read_csv('Salary_Data.csv')

# Display the first few rows of the dataframe to inspect the data
print("First 5 rows of the dataset:")
print(df.head())

# Get information about the dataset (columns, data types, non-null values)
print("\nDataset information:")
# Print a summary of the DataFrame df to the console: the data type of each
# column, the number of non-null values, and the memory usage.
df.info()

# Check for missing values

print("\nMissing values:")
# Print the number of missing values in each column of the DataFrame df.
# This is important for data cleaning, as linear regression models require
# complete data.
print(df.isnull().sum())

# Extract the independent variable (Years of Experience) and the dependent
# variable (Salary).
# Selecting by column name is an alternative, provided the names match the
# CSV header exactly, e.g. df[['Years of Experience']] and df['Salary'].
# X: all rows and all columns except the last one, as a 2D numpy array.
# The .values attribute returns the numpy array representation of the data.
X = df.iloc[:, :-1].values
# y: all rows and only the last column, as a 1D numpy array.
y = df.iloc[:, -1].values

# Split the dataset into training and testing sets

# We'll use 80% of the data for training and 20% for testing
# Split the data into training and testing sets.
# X_train: Independent variables for the training set.
# X_test: Independent variables for the testing set.
# y_train: Dependent variable for the training set.
# y_test: Dependent variable for the testing set.
# test_size=0.2: Specifies that 20% of the data should be used for the test set.
# random_state=42: Sets the random seed to 42, so the data is split in the same
# way each time the code is run, which is important for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a Linear Regression model

# Create an instance of the LinearRegression class and store it in the
# variable model.
model = LinearRegression()

# Train the model using the training data

# The fit() method learns the relationship between the independent variables
# (X_train) and the dependent variable (y_train).
model.fit(X_train, y_train)

# Make predictions on the test set
# The predicted values for X_test are stored in the variable y_pred.
y_pred = model.predict(X_test)

# Evaluate the model

# Mean Squared Error (MSE): the average squared difference between the actual
# test values (y_test) and the predicted values (y_pred). Lower is better.
mse = mean_squared_error(y_test, y_pred)
# R-squared (coefficient of determination): the proportion of the variance in
# the dependent variable that is explained by the independent variable(s).
r2 = r2_score(y_test, y_pred)

print("\nModel Evaluation:")
# Print both metrics to the console, formatted to two decimal places.
print(f"Mean Squared Error: {mse:.2f}")
print(f"R-squared: {r2:.2f}")

# Visualize the training data and the regression line

plt.figure(figsize=(10, 6)) # Create a new figure with a size of 10x6 inches.
# Create a scatter plot of the training data.
# X_train: The independent variable (Years of Experience) for the training data.
# y_train: The dependent variable (Salary) for the training data.
# color='blue': Sets the color of the data points to blue.
# label='Training Data': Sets the label for the data points to 'Training Data'.
# This label will appear in the legend.
plt.scatter(X_train, y_train, color='blue', label='Training Data')
# Plot the regression line.
# X_train: The independent variable (Years of Experience) for the training data.
# model.predict(X_train): The Salary values the model predicts from Years of
# Experience for the training data.
# color='red': Sets the color of the line to red.
# label='Regression Line': Sets the label for the line to 'Regression Line'.
# This label will appear in the legend.
plt.plot(X_train, model.predict(X_train), color='red', label='Regression Line')
plt.title('Salary vs. Years of Experience (Training Set)') # Set the title of the plot.
plt.xlabel('Years of Experience') # Set the label for the x-axis.
plt.ylabel('Salary') # Set the label for the y-axis.
# Display the legend, which shows the labels for the data points and the
# regression line.
plt.legend()
plt.show() # Display the plot.

# Visualize the test data and the predictions

plt.figure(figsize=(10, 6)) # Create a new figure with a size of 10x6 inches.
# Create a scatter plot of the test data.
plt.scatter(X_test, y_test, color='green', label='Test Data')
# X_test: The independent variable (Years of Experience) for the test data.
# y_test: The dependent variable (Salary) for the test data.
# color='green': Sets the color of the data points to green.
# label='Test Data': Sets the label for the data points to 'Test Data'.
plt.plot(X_test, y_pred, color='red', label='Predictions') # Plot the predicted values.
# X_test: The independent variable (Years of Experience) for the test data.
# y_pred: The predicted values of Salary based on Years of Experience for the test data.
# color='red': Sets the color of the line to red.
# label='Predictions': Sets the label for the line to 'Predictions'

plt.title('Salary vs. Years of Experience (Test Set)') # Set the title of the plot.
plt.xlabel('Years of Experience') # Set the label for the x-axis.
plt.ylabel('Salary') # Set the label for the y-axis.
plt.legend() # Display the legend.
plt.show() # Display the plot.

# Print the model parameters (intercept and coefficient)

print("\nModel Parameters:")
# Print the intercept of the regression line, formatted to two decimal places.
# The intercept is the predicted value of Salary when Years of Experience is 0.
print(f"Intercept: {model.intercept_:.2f}")
# Print the coefficient of the regression line, formatted to two decimal places.
# The coefficient represents the change in Salary for each one-unit increase
# in Years of Experience.
print(f"Coefficient: {model.coef_[0]:.2f}")

# Example prediction: Predict salary for 10 years of experience

# Create a numpy array representing 10 years of experience. The input to
# model.predict() must be a 2D array, hence the double brackets.
years_of_experience = np.array([[10]])
# Use the trained model to predict the salary for 10 years of experience.
predicted_salary = model.predict(years_of_experience)
# Print the predicted salary, formatted to two decimal places.
print(f"\nPredicted salary for 10 years of experience: ${predicted_salary[0]:.2f}")
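
The two evaluation metrics used above can also be computed directly from their definitions, which may help when interpreting the printed values. This is a minimal sketch with hypothetical numbers. The helper names `mse_manual` and `r2_manual` are assumptions made for this example; the experiment itself relies on scikit-learn's `mean_squared_error` and `r2_score`.

```python
import numpy as np

def mse_manual(y_true, y_hat):
    """Mean of the squared differences between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return float(np.mean((y_true - y_hat) ** 2))

def r2_manual(y_true, y_hat):
    """1 - (residual sum of squares / total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y_true - y_hat) ** 2)          # variation left unexplained
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean
    return float(1.0 - ss_res / ss_tot)

# Hypothetical actual and predicted values, chosen only for illustration.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 9.0]
mse_value = mse_manual(actual, predicted)
r2_value = r2_manual(actual, predicted)
print(f"MSE={mse_value:.3f}, R^2={r2_value:.3f}")
```

Because MSE is expressed in squared units of the target (squared salary here), its square root is often easier to read alongside the salaries themselves.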
