Decision Tree
# Detect outliers using boxplots
def detect_outliers(features=('Feature1', 'Feature2')):
    """Draw one boxplot per feature so outliers can be inspected visually.

    Parameters
    ----------
    features : sequence of str, optional
        Column names of the module-level ``data`` frame to plot.
        Defaults to ('Feature1', 'Feature2'), matching the original code.
    """
    # 2x3 grid leaves room for up to six feature plots
    fig, axs = plt.subplots(2, 3, figsize=(10, 5))
    # Pair each requested feature with the next free axis (row-major order),
    # instead of hard-coding one sns.boxplot call per feature
    for ax, feature in zip(axs.ravel(), features):
        sns.boxplot(x=data[feature], ax=ax)
    plt.show()
# Multiple linear regression with statsmodels OLS.
# Independent variables, with a constant column prepended for the intercept.
X = sm.add_constant(data[['Feature1', 'Feature2']])
# Dependent variable
y = data['HousePrice']

# Fit ordinary least squares on the design matrix
model = sm.OLS(y, X).fit()

# Show the full regression summary (coefficients, R-squared, diagnostics)
model_summary = model.summary()
print(model_summary)
Output:
   Avg. Area Income  Avg. Area House Age  Avg. Area Number of Rooms  \
0       79545.45857             5.682861                   7.009188
1       79248.64245             6.002900                   6.730821
2       61287.06718             5.865890                   8.512727
3       63345.24005             7.188236                   5.586729
4       59982.19723             5.040555                   7.839388
                                             Address
0  208 Michael Ferry Apt. 674\nLaurabury, NE 3701...
1  188 Johnson Views Suite 079\nLake Kathleen, CA...
2  9127 Elizabeth Stravenue\nDanieltown, WI 06482...
3                          USS Barnett\nFPO AP 44820
4                         USNS Raymond\nFPO AE 09386
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5000 entries, 0 to 4999
Data columns (total 7 columns):
 #   Column                        Non-Null Count  Dtype
---  ------                        --------------  -----
 0   Avg. Area Income              5000 non-null   float64
 1   Avg. Area House Age           5000 non-null   float64
 2   Avg. Area Number of Rooms     5000 non-null   float64
 3   Avg. Area Number of Bedrooms  5000 non-null   float64
 4   Area Population               5000 non-null   float64
 5   Price                         5000 non-null   float64
 6   Address                       5000 non-null   object
dtypes: float64(6), object(1)
memory usage: 273.6+ KB
None
       Avg. Area Income  Avg. Area House Age  Avg. Area Number of Rooms  \
count       5000.000000          5000.000000                5000.000000
mean       68583.108984             5.977222                   6.987792
std        10657.991214             0.991456                   1.005833
min        17796.631190             2.644304                   3.236194
25%        61480.562390             5.322283                   6.299250
50%        68804.286405             5.970429                   7.002902
75%        75783.338665             6.650808                   7.665871
max       107701.748400             9.519088                  10.759588
# Hyperparameter search space for the decision tree
param_grid = {
    'max_depth': range(1, 10, 1),
    'min_samples_leaf': range(1, 20, 2),
    'min_samples_split': range(2, 20, 2),
    'criterion': ["entropy", "gini"],
}

# Base classifier; fixed seed keeps the search reproducible
tree = DecisionTreeClassifier(random_state=1)

# Exhaustive search over the grid with 5-fold cross-validation
grid_search = GridSearchCV(
    estimator=tree,
    param_grid=param_grid,
    cv=5,
    verbose=True,
)
grid_search.fit(X_train, y_train)

# Report the best cross-validated accuracy and the winning estimator
print("best accuracy", grid_search.best_score_)
print(grid_search.best_estimator_)
from sklearn.tree import plot_tree
import matplotlib.pyplot as plt

# Visualize the tuned decision tree selected by the grid search
tree_clf = grid_search.best_estimator_

plt.figure(figsize=(18, 15))
plot_tree(
    tree_clf,
    filled=True,
    feature_names=iris.feature_names,
    class_names=iris.target_names,
)
plt.show()
Output:
Accuracy: 0.9555555555555556
Fitting 5 folds for each of 1620 candidates, totalling 8100 fits
best accuracy 0.9714285714285715
DecisionTreeClassifier(criterion='entropy', max_depth=4, min_samples_leaf=3,
                       random_state=1)