Experiment No.
AIM:
Assignment on Regression technique
a. Apply Linear Regression using a suitable library function and predict the month-wise temperature.
Download the temperature data from the link below:
https://www.kaggle.com/venky73/temperatures-of-india?select=temperatures.csv
This dataset consists of month-wise temperatures for India, averaged over all places.
Temperature values are recorded in degrees Celsius.
b. Assess the performance of regression models using MSE, MAE and R-Square metrics.
c. Visualize the simple regression model.
Theory:
Regression:
Regression is a supervised learning technique which helps in finding the correlation between
variables and enables us to predict a continuous output variable based on one or more
predictor variables. It is mainly used for prediction, forecasting, time-series modelling and
determining cause-and-effect relationships between variables. In regression, we fit a line or
curve that best matches the given data points; using this fitted relationship, the machine
learning model can make predictions about the data.
Terminologies Related to the Regression Analysis:
Dependent Variable: The main factor in regression analysis which we want to predict or understand is
called the dependent variable. It is also called the target variable.
Independent Variable: The factors which affect the dependent variable, or which are used to predict its
values, are called independent variables, also known as predictors.
Outliers: An outlier is an observation with either a very low or a very high value in comparison
to the other observed values. An outlier may distort the result, so it should be handled carefully.
Multicollinearity: If the independent variables are highly correlated with each other, the condition is
called multicollinearity. It should not be present in the dataset, because it creates problems when
ranking the most influential variables (a quick check for outliers and multicollinearity is sketched
after these definitions).
Underfitting and Overfitting: If our algorithm works well on the training dataset but not on the
test dataset, the problem is called overfitting. If our algorithm does not perform well even on the
training dataset, the problem is called underfitting.
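As a quick illustration of how outliers and multicollinearity can be checked in practice, the sketch below uses pandas on the same temperatures.csv file used later in this experiment: describe() exposes extreme minimum/maximum values (possible outliers), and corr() shows how strongly the monthly predictor columns are correlated with each other (values close to 1 suggest multicollinearity). This is only a sketch; the column names assume the dataset from the AIM.
import pandas as pd
#Load the same dataset used later in the experiment
data = pd.read_csv("temperatures.csv")
#describe() reports count, mean, std, min and max; extreme min/max values hint at outliers
print(data[["JAN", "JUN", "ANNUAL"]].describe())
#Correlation matrix between a few monthly columns; values near 1 indicate multicollinearity
print(data[["JAN", "FEB", "MAR", "APR"]].corr())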
Cost Functions:
1. Mean Absolute Error (MAE): MAE is a very simple metric which calculates the average absolute
difference between the actual and predicted values.
2. Mean Squared Error (MSE): MSE is the average of the squared differences between the actual and
predicted values. Squaring prevents negative and positive errors from cancelling each other out,
which is the benefit of MSE.
3. Root Mean Squared Error (RMSE): As the name suggests, RMSE is simply the square root of the
mean squared error.
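The three metrics above can be computed directly; the following is a minimal sketch using NumPy, where y_true and y_pred are illustrative arrays standing in for actual and predicted values:
import numpy as np
y_true = np.array([22.5, 24.1, 27.8, 30.2])  #illustrative actual values
y_pred = np.array([23.0, 23.5, 28.5, 29.8])  #illustrative predicted values
mae = np.mean(np.abs(y_true - y_pred))   #Mean Absolute Error
mse = np.mean((y_true - y_pred) ** 2)    #Mean Squared Error
rmse = np.sqrt(mse)                      #Root Mean Squared Error
print("MAE :", mae)
print("MSE :", mse)
print("RMSE:", rmse)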
Linear Regression: Linear regression is a statistical regression method which is used for predictive
analysis. It is one of the simplest regression algorithms and models the linear relationship between
continuous variables: the independent variable (X-axis) and the dependent variable (Y-axis), hence the
name linear regression.
Below is the mathematical equation for linear regression: Y = aX + b
Here, Y = Dependent Variable (Target Variable), X = Independent Variable (Predictor Variable),
a = slope of the line and b = intercept.
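As a small worked example of Y = aX + b (with made-up values of X and Y), the sketch below estimates the slope a and intercept b by least squares using numpy.polyfit and predicts a new point:
import numpy as np
X = np.array([1, 2, 3, 4, 5])             #illustrative predictor values
Y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])  #illustrative target values
a, b = np.polyfit(X, Y, deg=1)            #least-squares fit of Y = aX + b
print("slope a =", a, ", intercept b =", b)
print("prediction for X = 6:", a * 6 + b)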
Steps in Linear Regression:
1. Loading the Data
2. Exploring the Data
3. Slicing The Data
4. Train and Split Data
5. Generate the Model
6. Evaluate the Accuracy
Code:
#Importing required libraries
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import numpy as np
#Reading the input dataset
trainData = pd.read_csv("temperatures.csv")
#Print first 10 records
print(trainData.head(n=10))
#Printing datatypes and columns in the dataset
#datatypes column-wise
print("Below are the datatypes of columns:")
print(trainData.dtypes)
print()
#column names
print("Below are the columns in the
dataset:") print(trainData.columns)
print()
#describe count, mean, std dev, min and max temperature values
print("Descriptive information about the dataset:")
print(trainData.describe())
#To check if dataset has null values or not
print(trainData.isnull().sum())
#To find the top 10 temperatures
#As per the 'ANNUAL' column, find the top 10 temperature records
top_10_data = trainData.nlargest(10, "ANNUAL")
#Mentioned figure size
plt.figure(figsize=(14,12))
plt.title("Top 10 temperature records")
#In barplot x & y axis year & temp resp
sns.barplot(x=top_10_data.YEAR, y=top_10_data.ANNUAL)
#It is found that the highest temperature record is in 2016, roughly about 32 degree Celsius
#Analyse 2016 data
data_2016 = trainData[trainData["YEAR"]==2016]
#x axis temp data in array format
xticks = np.array(data_2016[["JAN", "FEB", "MAR", "APR", "MAY", "JUN", "JUL", "AUG", "SEP",
"OCT", "NOV", "DEC"]].values)
#y axis months labels
yticks = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN", "JUL", "AUG", "SEP", "OCT", "NOV",
"DEC"]
#To plot the graph
#Mentioned figsize
plt.figure(figsize=(10,8))
#barh draws horizontal bars: month labels on the y-axis, temperature values on the x-axis
plt.barh(yticks,xticks[0])
plt.title("Month wise temperature data of 2016")
plt.xlabel("Temperature in degree celsius")
plt.ylabel("Month")
plt.show()
#From the above graph it is clear that May recorded the highest temperature, around 35 degree celsius
#Generate Regression Model from Training & Testing Data
from sklearn import linear_model, metrics
#train data according to columns
print(trainData.columns)
#x axis = year
X=trainData[["YEAR"]]
# y axis= month wise temp
Y=trainData[["JAN"]]
#import training & testing features
from sklearn.model_selection import train_test_split
#split data in training & testing part
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=1)
print(len(X_train)) #length of X_train data
print(len(X_test)) #length of X_test data
print(trainData.shape) #Show total row & column (117,18)
#Create the Linear Regression model
reg = linear_model.LinearRegression()
print(X_train)
#fit the regression line on the training data
model = reg.fit(X_train, Y_train)
#Predict test data
Y_pred = model.predict(X_test)
#Year wise prediction data
print('predicted response:', Y_pred, sep='\n')
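#(Part b) Assess the regression model with MSE, MAE and R-Square metrics.
#A minimal sketch using sklearn.metrics on the test predictions above;
#the exact values depend on the dataset and the train/test split.
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
print("MSE :", mean_squared_error(Y_test, Y_pred))
print("MAE :", mean_absolute_error(Y_test, Y_pred))
print("R2  :", r2_score(Y_test, Y_pred))
print("RMSE:", np.sqrt(mean_squared_error(Y_test, Y_pred)))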
#training regression model: scatter plot with black points
plt.scatter(X_train, Y_train, color='black')
#Blue line indicates predictions on the training data
plt.plot(X_train, reg.predict(X_train), color='blue', linewidth=3)
plt.title("Temperature vs Year")
plt.xlabel("Year")
plt.ylabel("Temperature")
plt.show()
#testing regression model: scatter plot with red points
plt.scatter(X_test, Y_test, color='red')
#Black line shows the temperature predicted for each test-set year
plt.plot(X_test, reg.predict(X_test), color='black', linewidth=3)
plt.title("Temperature vs Year")
plt.xlabel("Year")
plt.ylabel("Temperature")
plt.show()
Output:
Conclusion: Thus, we have studied regression techniques and applied linear regression to predict month-wise temperatures.