Classification Analysis Report PDF

The Classification Analysis Report aims to predict customer satisfaction using classification techniques on the E-commerce Customer Behavior Dataset. The analysis includes data preprocessing, model building with Logistic Regression and Decision Tree Classifier, and evaluation, revealing that the Decision Tree model outperformed Logistic Regression with an accuracy of 85%. Key findings indicate that discounts and total spending are significant predictors of customer satisfaction.

Uploaded by

missionkhadka13

Similarity Report

Paper name: Classification_Analysis_Report.pdf
Author: -
Word count: 738 words
Character count: 7753 characters
Page count: 5 pages
File size: 501.5KB
Submission date: Feb 11, 2025 10:54 PM GMT+5:45
Report date: Feb 11, 2025 10:54 PM GMT+5:45

64% Overall Similarity

The combined total of all matches, including overlapping sources, for each database.
- 11% Internet database
- 4% Publications database
- Crossref database
- Crossref Posted Content database
- 64% Submitted Works database

Report on: Classification Analysis Report

Name: Mission Khadka
Group: L5CG1
Student ID: 2408838
Module Leader: Siman Giri

Contents
Classification Analysis Report
Abstract
1. Introduction
1.1 Problem Statement
1.2 Dataset
1.3 Objective
2. Methodology
2.1 Data Preprocessing
2.2 Exploratory Data Analysis (EDA)
2.3 Model Building
2.4 Model Evaluation
2.5 Hyper-parameter Optimization
2.6 Feature Selection
3. Conclusion
3.1 Key Findings
3.2 Final Model
3.3 Challenges
3.4 Future Work
4. Discussion
4.1 Model Performance
4.2 Impact of Hyperparameter Tuning and Feature Selection
4.3 Interpretation of Results
4.4 Limitations
4.5 Suggestions for Future Research

Classification Analysis Report

Abstract
Purpose: This report predicts a categorical variable, the customer satisfaction level, using
classification techniques.

Approach: The analysis uses the E-commerce Customer Behavior Dataset, which contains
customer purchase history, demographics, and satisfaction ratings. The steps are
Exploratory Data Analysis (EDA), model building with Logistic Regression and a Decision
Tree Classifier, hyper-parameter optimization, and feature selection.

Key Results: Model performance was evaluated using accuracy, precision, recall, and
F1-score. The Decision Tree outperformed Logistic Regression, with higher accuracy and
recall.

Conclusion: The classification models performed well in predicting customer satisfaction.
Discount offers and total spending were the most important factors in determining
satisfaction levels.

1. Introduction

1.1 Problem Statement


The goal of this project is to predict customer satisfaction levels from customers'
demographic data and purchasing behavior.

1.2 Dataset
The dataset used in this analysis is the E-commerce Customer Behavior Dataset, sourced
from an independent e-commerce business. It contains customer purchase behavior,
satisfaction ratings, and demographic data. The analysis relates to the United Nations
Sustainable Development Goals (UN SDGs) in that better customer insight supports more
economically sound and sustainable business practices.

1.3 Objective
The objective of this analysis is to build a predictive classification model that estimates the
customer satisfaction level (Satisfied, Neutral, or Dissatisfied) based on the given features.

2. Methodology

2.1 Data Preprocessing


The data was cleaned before modeling: missing values were imputed with the median,
categorical variables were encoded, and numerical features were standardized to improve
model performance.
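
The preprocessing steps described above could be sketched with scikit-learn roughly as follows. This is a minimal illustration rather than the report's actual code: the toy rows, the column names, and the choice of one-hot encoding are assumptions, since the report does not state which encoding was used.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy rows standing in for the real dataset; names and values are assumptions.
df = pd.DataFrame({
    "Age": [25, 34, np.nan, 41],
    "Total Spend": [120.0, np.nan, 560.5, 300.0],
    "Membership Type": ["Gold", "Silver", "Bronze", "Gold"],
})

numeric_cols = ["Age", "Total Spend"]
categorical_cols = ["Membership Type"]

# Numeric columns: fill gaps with the median, then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: one-hot encoding (an assumed choice of encoder).
preprocess = ColumnTransformer(
    [
        ("num", numeric_steps, numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0,  # always return a dense array
)

X = preprocess.fit_transform(df)
print(X.shape)
```

Wrapping the steps in a single transformer means the imputation medians and scaling statistics are learned from the training data only, provided the transformer is fitted after the train/test split.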

2.2 Exploratory Data Analysis (EDA)


EDA was performed using visualizations such as:

- Histograms to analyze numerical feature distributions
- Bar charts to examine class imbalances
- Correlation matrices to determine relationships between features

Key insights:

- Discount Applied and Total Spend had a strong influence on satisfaction levels.
- The dataset was slightly imbalanced, with fewer dissatisfied customers.
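
The numbers behind a class-balance bar chart and a correlation matrix can be obtained with pandas along the following lines. The rows below are made-up placeholder values, and the target column name "Satisfaction Level" is an assumption; nothing here reproduces the report's actual data.

```python
import pandas as pd

# Made-up placeholder rows; not taken from the real dataset.
df = pd.DataFrame({
    "Total Spend": [120.0, 800.0, 450.0, 90.0, 700.0, 300.0],
    "Discount Applied": [0, 1, 1, 0, 1, 0],
    "Satisfaction Level": [
        "Dissatisfied", "Satisfied", "Satisfied",
        "Neutral", "Satisfied", "Neutral",
    ],
})

# Share of each class: the quantity a class-balance bar chart would display.
class_share = df["Satisfaction Level"].value_counts(normalize=True)

# Pairwise Pearson correlations between the numeric features.
corr = df[["Total Spend", "Discount Applied"]].corr()

print(class_share)
print(corr)
```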


2.3 Model Building
Two classification models were built:

- Logistic Regression (baseline model)
- Decision Tree Classifier (non-linear approach)

The data was split into 80% training and 20% testing sets, followed by model training and
evaluation.
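
A minimal sketch of the split-and-train step is given below. It runs on synthetic three-class data from make_classification rather than the real dataset, and the stratified split and fixed random seeds are assumptions added for reproducibility; the report does not say whether either was used.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the customer data (three satisfaction classes).
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=4, n_classes=3, random_state=0
)

# 80% training, 20% testing; stratify keeps class proportions equal in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}

# Fit each model on the training part and record its test accuracy.
scores = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in models.items()
}
print(scores)
```

The scores printed here describe the synthetic data only and say nothing about the figures reported for the real dataset.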

2.4 Model Evaluation


Model performance was evaluated using:

- Accuracy: the share of all predictions that are correct.
- Precision: the share of predictions for a class that are correct.
- Recall: the share of actual members of a class that are correctly identified.
- F1-Score: the harmonic mean of precision and recall.
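
To make the four definitions concrete, the sketch below computes them by hand for one class at a time. The label lists are invented for illustration, and per_class_metrics is a hypothetical helper name, not a function from the report or from any library.

```python
def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1) for one class, treating it as 'positive'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)  # true positives
    fp = sum(t != label and p == label for t, p in pairs)  # false positives
    fn = sum(t == label and p != label for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Invented labels for illustration only.
y_true = ["Satisfied", "Satisfied", "Neutral", "Dissatisfied", "Neutral", "Satisfied"]
y_pred = ["Satisfied", "Neutral", "Neutral", "Dissatisfied", "Satisfied", "Satisfied"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
satisfied = per_class_metrics(y_true, y_pred, "Satisfied")
dissatisfied = per_class_metrics(y_true, y_pred, "Dissatisfied")

print(accuracy, satisfied, dissatisfied)
```

With three classes, precision, recall, and F1 are computed per class and then averaged; the report does not state which averaging scheme was used.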

2.5 Hyper-parameter Optimization


GridSearchCV was used to optimize model parameters:

- Best Decision Tree Parameters: max_depth=5, min_samples_split=4.
- Best Logistic Regression Parameters: C=0.1, solver='liblinear'.
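
A grid search over the two Decision Tree parameters named above could look roughly like this. The candidate values in param_grid, the 5-fold setting, and the synthetic data are assumptions; the report gives only the winning values, not the grid that was searched.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with three classes.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_classes=3, random_state=0
)

# Assumed candidate values; the report lists only the selected ones.
param_grid = {
    "max_depth": [3, 5, 7, None],
    "min_samples_split": [2, 4, 8],
}

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=5,                # 5-fold cross-validation for every combination
    scoring="accuracy",
)
search.fit(X, y)

best_params = search.best_params_
print(best_params, search.best_score_)
```

On synthetic data the selected values will generally differ from max_depth=5 and min_samples_split=4.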
2.6 Feature Selection
Feature selection was done using Recursive Feature Elimination (RFE), selecting:

- Age, Membership Type, Total Spend, Discount Applied
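
Recursive Feature Elimination repeatedly fits an estimator and drops the least important feature until the requested number remains. The sketch below keeps four of six synthetic features; the Decision Tree as the wrapped estimator and the generic feature names f0 to f5 are assumptions, since the report does not say which estimator RFE was run with.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.tree import DecisionTreeClassifier

# Generic placeholder names; they do not correspond to real dataset columns.
feature_names = ["f0", "f1", "f2", "f3", "f4", "f5"]

X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_redundant=2,
    n_classes=3, random_state=0,
)

# Keep the four features the wrapped estimator ranks as most important.
selector = RFE(DecisionTreeClassifier(random_state=0), n_features_to_select=4)
selector.fit(X, y)

selected = [name for name, keep in zip(feature_names, selector.support_) if keep]
print(selected)
```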


3. Conclusion

3.1 Key Findings


- The Decision Tree model outperformed Logistic Regression, with higher accuracy (85%)
and recall (82%).
- Discounts and total spending were the strongest predictors of satisfaction.

3.2 Final Model
The best model was the Decision Tree, which achieved an accuracy of 85%.

3.3 Challenges
Challenges included handling missing data and a slight class imbalance, both of which
required careful preprocessing.

3.4 Future Work
Future improvements include exploring ensemble models such as Random Forest or XGBoost
for higher accuracy.

4. Discussion

4.1 Model Performance


The Decision Tree model performed best, providing better recall for dissatisfied customers.

4.2 Impact of Hyperparameter Tuning and Feature Selection
Fine-tuning max_depth and min_samples_split improved Decision Tree accuracy.

4.3 Interpretation of Results
Customers who spent more and had discounts applied were more likely to be satisfied.

4.4 Limitations
- The dataset had class imbalance, which could bias results.
- Only simple models were used; more complex models may perform better.

4.5 Suggestions for Future Research
- Use ensemble models such as Random Forest or Gradient Boosting.
- Expand the dataset size for better generalization.
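
As a starting point for the ensemble suggestion, the two model families could be compared with cross-validation as sketched below. The synthetic data, the default hyper-parameters, and the 5-fold setting are all assumptions; no results for these models appear in the report.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data with three classes.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_classes=3, random_state=0
)

candidates = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Mean 5-fold cross-validated accuracy for each candidate.
cv_accuracy = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in candidates.items()
}
print(cv_accuracy)
```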


TOP SOURCES
The sources with the highest number of matches within the submission. Overlapping sources will not be
displayed.

1. University of Wolverhampton on 2025-02-11 (Submitted works): 13%
2. University of Wolverhampton on 2025-02-09 (Submitted works): 6%
3. University of Wolverhampton on 2025-02-11 (Submitted works): 4%
4. University of Wolverhampton on 2025-02-11 (Submitted works): 4%
5. University of Wolverhampton on 2025-02-11 (Submitted works): 4%
6. University of Wolverhampton on 2025-02-11 (Submitted works): 4%
7. University of Wolverhampton on 2025-02-11 (Submitted works): 3%
8. University of Wolverhampton on 2025-02-11 (Submitted works): 3%
9. University of Wolverhampton on 2025-02-11 (Submitted works): 3%
10. University of Wolverhampton on 2025-02-11 (Submitted works): 2%
11. University of Wolverhampton on 2025-02-10 (Submitted works): 2%
12. University of Wolverhampton on 2025-02-10 (Submitted works): 2%
13. University of Wolverhampton on 2025-02-11 (Submitted works): 2%
14. University of Wolverhampton on 2025-02-11 (Submitted works): 2%
15. University of Wolverhampton on 2025-02-10 (Submitted works): 2%
16. University of Wolverhampton on 2025-02-11 (Submitted works): 1%
17. University of Wolverhampton on 2025-02-11 (Submitted works): 1%
18. University of Wolverhampton on 2025-02-11 (Submitted works): 1%
19. University of Wolverhampton on 2025-02-11 (Submitted works): 1%
20. medium.com (Internet): 1%
21. University of Wolverhampton on 2025-02-08 (Submitted works): <1%
22. University of Wolverhampton on 2025-02-11 (Submitted works): <1%
23. University of Wolverhampton on 2025-02-11 (Submitted works): <1%
24. University of Wolverhampton on 2025-02-11 (Submitted works): <1%
