Decision Tree On Laptop Dataset

This document shows Python code for analyzing a laptop dataset with a decision tree classifier. It loads and cleans a laptop CSV dataset, one-hot encodes the categorical variables, splits the data into training and test sets, fits a decision tree classifier on the training set, makes predictions on the test set, and evaluates the model with accuracy, a classification report, and a confusion matrix. The decision tree reaches an accuracy of about 80% (0.803) on the 432-row test set when classifying laptop status as 'New' or 'Refurbished'.

Uploaded by

Gaye Door Jani

In [1]: import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import plot_tree
import matplotlib.pyplot as plt
In [2]: df = pd.read_csv('laptops.csv')
df.head()
Out[2]:
   Laptop                                             Status  Brand   Model       CPU            RAM  Storage  Storage type  GPU       Screen  Touch  Final Price
0  ASUS ExpertBook B1 B1502CBA-EJ0436X Intel Core...  New     Asus    ExpertBook  Intel Core i5    8      512  SSD           NaN         15.6  No         1009.00
1  Alurin Go Start Intel Celeron N4020/8GB/256GB ...  New     Alurin  Go          Intel Celeron    8      256  SSD           NaN         15.6  No          299.00
2  ASUS ExpertBook B1 B1502CBA-EJ0424X Intel Core...  New     Asus    ExpertBook  Intel Core i3    8      256  SSD           NaN         15.6  No          789.00
3  MSI Katana GF66 12UC-082XES Intel Core i7-1270...  New     MSI     Katana      Intel Core i7   16     1000  SSD           RTX 3050    15.6  No         1199.00
4  HP 15S-FQ5085NS Intel Core i5-1235U/16GB/512GB...  New     HP      15S         Intel Core i5   16      512  SSD           NaN         15.6  No          669.01


In [14]: # Drop irrelevant columns
df.drop(columns=['Laptop'], inplace=True)

# Handle missing values
df.fillna(value={'GPU': 'Unknown'}, inplace=True)

# Encode categorical variables
df = pd.get_dummies(df, columns=['Brand', 'Model', 'CPU', 'Storage type', 'GPU', 'Touch'])

# Split the data into features (X) and target variable (y)
X = df.drop(columns=['Status'])
y = df['Status']
df.head()
Out[14]:
   Status  RAM  Storage  Screen  Final Price  Brand_Acer  Brand_Alurin  Brand_Apple  Brand_Asus  Brand_Deep Gaming  ...  GPU_Radeon RX 6600M  GPU_T 1000  GPU_T 1200  GPU 20
0  New       8      512    15.6      1009.00  False       False         False        True        False              ...  False                False       False       Fa
1  New       8      256    15.6       299.00  False       True          False        False       False              ...  False                False       False       Fa
2  New       8      256    15.6       789.00  False       False         False        True        False              ...  False                False       False       Fa
3  New      16     1000    15.6      1199.00  False       False         False        False       False              ...  False                False       False       Fa
4  New      16      512    15.6       669.01  False       False         False        False       False              ...  False                False       False       Fa

5 rows × 230 columns
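
The jump to 230 columns comes from one-hot encoding: each distinct value in an encoded column becomes its own boolean column. A minimal sketch on a made-up four-row frame (the `toy` data and the `to_encode` and `expected_cols` names are illustrative assumptions, not part of the notebook) shows how the column count follows from the number of unique values:

```python
import pandas as pd

# Hypothetical toy frame that mimics a few of the notebook's columns
toy = pd.DataFrame({
    "Status": ["New", "Refurbished", "New", "New"],
    "Brand": ["Asus", "Alurin", "Asus", "MSI"],
    "GPU": [None, None, "RTX 3050", None],
    "RAM": [8, 8, 16, 16],
})

# Same two steps as the notebook: fill missing GPUs, then one-hot encode
toy = toy.fillna(value={"GPU": "Unknown"})
to_encode = ["Brand", "GPU"]
encoded = pd.get_dummies(toy, columns=to_encode)

# Untouched columns plus one new column per distinct value in each encoded column
expected_cols = (toy.shape[1] - len(to_encode)) + sum(toy[c].nunique() for c in to_encode)

print(encoded.columns.tolist())
print(encoded.shape[1], expected_cols)
```

High-cardinality columns such as `Model` or `CPU` are the likely source of most of the 230 columns; calling `nunique()` on each column of the original frame would confirm it.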

In [6]: X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
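
The classification report further down shows 303 'New' and 129 'Refurbished' laptops in the test set, so the classes are not balanced. With an imbalance like that, passing `stratify=y` to `train_test_split` is a common way to keep the class proportions the same in both splits. The notebook does not do this; the following is only a sketch on made-up labels (the 70/30 toy data is an assumption for illustration):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the notebook's X and y: 70 'New', 30 'Refurbished'
y = pd.Series(["New"] * 70 + ["Refurbished"] * 30, name="Status")
X = pd.DataFrame({"RAM": range(100)})

# stratify=y asks for the same class mix in the training and test parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(y_train.value_counts())
print(y_test.value_counts())
```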


In [7]: clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)
Out[7]: DecisionTreeClassifier(random_state=42)
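
`plot_tree` is imported in the first cell but never called. One possible use, sketched here on synthetic data from `make_classification` rather than the laptop dataset (the `X_demo`, `y_demo`, `clf_demo`, and `feature_names` names are placeholders), is to draw only the top levels of a fitted tree:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, plot_tree

# Synthetic stand-in data (an assumption); the notebook would use X_train and y_train
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=42)
feature_names = [f"feature_{i}" for i in range(X_demo.shape[1])]

clf_demo = DecisionTreeClassifier(random_state=42).fit(X_demo, y_demo)

# Draw only the first levels below the root; deeper nodes are collapsed
fig, ax = plt.subplots(figsize=(12, 6))
annotations = plot_tree(
    clf_demo,
    max_depth=2,
    feature_names=feature_names,
    class_names=["class 0", "class 1"],
    filled=True,
    ax=ax,
)
fig.tight_layout()
# In a notebook the figure renders inline; elsewhere, fig.savefig(...) writes it to disk
```

For the notebook's own model, the equivalent call would presumably take `clf`, `feature_names=list(X.columns)`, and `class_names=list(clf.classes_)`. Limiting `max_depth` keeps the plot readable with 229 feature columns.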

In [8]: y_pred = clf.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
print("Classification Report:")
print(classification_report(y_test, y_pred))
print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred))
Accuracy: 0.8032407407407407
Classification Report:
              precision    recall  f1-score   support

         New       0.86      0.86      0.86       303
 Refurbished       0.67      0.67      0.67       129

    accuracy                           0.80       432
   macro avg       0.77      0.77      0.77       432
weighted avg       0.80      0.80      0.80       432

Confusion Matrix:
[[260  43]
 [ 42  87]]
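
Every number in the report can be recomputed from the confusion matrix, which is a useful sanity check. The sketch below uses plain Python and assumes scikit-learn's convention that rows are true labels and columns are predicted labels, in sorted label order (`New`, `Refurbished`). It also computes a majority-class baseline, which is not in the original notebook:

```python
# Confusion matrix copied from the output above.
# Rows: true labels, columns: predicted labels, order [New, Refurbished].
cm = [[260, 43],
      [42, 87]]
labels = ["New", "Refurbished"]

total = sum(sum(row) for row in cm)

# Accuracy: correctly classified rows (the diagonal) over all rows
accuracy = sum(cm[i][i] for i in range(len(cm))) / total

per_class = {}
for i, label in enumerate(labels):
    tp = cm[i][i]
    predicted = sum(cm[r][i] for r in range(len(cm)))  # column sum
    actual = sum(cm[i])                                # row sum (support)
    per_class[label] = {"precision": tp / predicted, "recall": tp / actual}

# Accuracy of always predicting the most frequent class in the test set
baseline = max(sum(row) for row in cm) / total

print(f"accuracy = {accuracy:.4f}")
for label, scores in per_class.items():
    print(f"{label}: precision = {scores['precision']:.2f}, recall = {scores['recall']:.2f}")
print(f"majority-class baseline = {baseline:.4f}")
```

Always predicting 'New' would already score about 70% on this test set, so the tree's 80% is roughly a ten-point gain over that baseline, and the 'Refurbished' class is where most of the remaining errors sit.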