Data Visualization: Types of Data Visualization: Charts and Graphs Line Charts

Uploaded by

211cs011


DATA VISUALIZATION

Data visualization is the process of creating graphical representations of data to better understand and communicate information. It involves using visual elements like charts, graphs, plots, and other visualizations to help people understand and analyze data more effectively.

TYPES OF DATA VISUALIZATION:

Charts and Graphs

• Line Charts: Used to show trends over time or relationships between continuous data.

• Bar Charts: Used to compare categorical data across different groups.

• Scatter Plots: Used to show relationships between two continuous variables.

• Pie Charts: Used to show how different categories contribute to a whole.

• Histograms: Used to show the distribution of continuous data.
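
The chart types listed above can be sketched with matplotlib as follows. This is a minimal illustration only: all of the data is made up, and the variable names (`months`, `sales`, `categories`, `counts`) are hypothetical and do not come from the dataset analyzed later.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up data, used only to illustrate each chart type.
rng = np.random.default_rng(0)
months = np.arange(1, 13)
sales = 100 + 5 * months + rng.normal(0, 3, size=12)  # trend over time
categories = ["A", "B", "C"]
counts = [30, 45, 25]
x = rng.normal(size=200)
y = 2 * x + rng.normal(size=200)  # two related continuous variables

fig, ax = plt.subplots(1, 5, figsize=(20, 4))

ax[0].plot(months, sales)            # line chart: trend over time
ax[0].set_title("Line chart")

ax[1].bar(categories, counts)        # bar chart: compare categories
ax[1].set_title("Bar chart")

ax[2].scatter(x, y, s=10)            # scatter plot: relationship between x and y
ax[2].set_title("Scatter plot")

ax[3].pie(counts, labels=categories, autopct="%1.1f%%")  # pie chart: parts of a whole
ax[3].set_title("Pie chart")

ax[4].hist(x, bins=20)               # histogram: distribution of x
ax[4].set_title("Histogram")

fig.tight_layout()
```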

Geospatial Visualizations

• Maps: Used to show geographic data, such as population density, climate patterns, or election results.

• Heat Maps: Used to show density or intensity of data points on a map.

• Geospatial Scatter Plots: Used to show relationships between geographic data points.

Interactive Visualizations

• Dashboards: Used to provide an overview of multiple data sets and allow users to explore data in real-time.

• Interactive Scatter Plots: Used to allow users to explore relationships between data points in real-time.

• Filterable Visualizations: Used to allow users to filter data based on specific criteria.

Infographics

• Static Infographics: Used to communicate a message or tell a story using a combination of data, images, and text.

• Interactive Infographics: Used to allow users to explore data and interact with the visualization in real-time.

Other Types of Data Visualization

• 3D Visualizations: Used to show complex relationships between multiple variables.

• Network Visualizations: Used to show relationships between nodes and edges, such as social networks or supply chains.

• Radar Charts: Used to compare multiple categories across multiple dimensions.

To create effective data visualizations, several key elements are required.

Here are some of the most important ones:

Data

• Quality data: Accurate, complete, and relevant data is essential for creating meaningful visualizations.

• Clean data: Data should be free from errors, inconsistencies, and missing values.

• Structured data: Data should be organized in a way that makes it easy to analyze and visualize.

Data Visualization Process

• Define the problem: Identify the business problem or question to be answered.

• Collect and clean data: Gather and prepare the data for analysis.

• Analyze data: Apply statistical and analytical techniques to extract insights.

• Design visualization: Create a visualization that effectively communicates the insights.

• Refine and iterate: Revise the visualization based on feedback, repeating the earlier steps as needed.

SELECTION OF DATA: Optimizing Delivery Times in E-commerce

from google.colab import files

uploaded = files.upload()

Import Data

import numpy as np

import pandas as pd

import matplotlib.pyplot as plt

import seaborn as sns

import warnings
warnings.filterwarnings("ignore", category=FutureWarning)

Loading the dataset

df = pd.read_csv('customer_analytics.csv')

df.head()

Data Preprocessing

df.shape

o/p: (10999, 12)

df.dtypes

Dropping column ID because it is an index column

df.drop(['ID'], axis=1, inplace=True)

#Checking for null/missing values

df.isnull().sum()

df.duplicated().sum()
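
If these checks report missing values or duplicate rows, they can be handled before plotting. The sketch below is one possible approach on a made-up toy frame, not on `customer_analytics.csv`; the values are invented, and dropping rows is only one option (imputing missing values is another).

```python
import numpy as np
import pandas as pd

# Toy frame with one missing value and one duplicated row (made-up values).
toy = pd.DataFrame({
    "Weight_in_gms": [1200, 1200, np.nan, 4500],
    "Mode_of_Shipment": ["Ship", "Ship", "Road", "Flight"],
})

n_missing = toy.isnull().sum()    # missing values per column
n_dupes = toy.duplicated().sum()  # rows identical to an earlier row

# Drop exact duplicates first, then rows that still contain missing values.
clean = toy.drop_duplicates().dropna().reset_index(drop=True)

print(n_missing.to_dict(), n_dupes, len(clean))
```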

Descriptive Statistics

df.describe()

df.head()

Exploratory Data Analysis

In the exploratory data analysis, I will look at the relationship between the target variable and the other variables. I will also look at the distribution of the variables across the dataset to understand the data better.

Customer Gender Distribution

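gender_counts = df['Gender'].value_counts()

# Take the labels from the index so each wedge is labeled with its own category
plt.pie(gender_counts, labels=gender_counts.index, autopct='%1.1f%%', startangle=90)

plt.title('Gender Distribution')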

The dataset has nearly equal numbers of male and female customers, at 49.6% and 50.4% respectively.
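
The percentages can also be read directly from the data rather than from the chart. The following is a minimal sketch on a made-up `Gender` series (the values are illustrative, not the real dataset); it also shows taking the labels from the index of `value_counts()`, which keeps each label matched to its count.

```python
import pandas as pd

# Toy stand-in for df['Gender']; the values are made up for illustration.
gender = pd.Series(["F", "M", "F", "M", "F"], name="Gender")

counts = gender.value_counts()
shares = gender.value_counts(normalize=True).mul(100).round(1)

# value_counts() sorts by frequency, so the index gives the matching labels.
labels = counts.index.tolist()

print(labels)
print(shares.to_dict())
```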

Product Properties

df.replace([np.inf, -np.inf], np.nan, inplace=True)

fig, ax = plt.subplots(1,3,figsize=(15,5))
sns.histplot(df['Weight_in_gms'], ax=ax[0], kde=True).set_title('Weight Distribution')

sns.countplot(x = 'Product_importance', data = df, ax=ax[1]).set_title('Product Importance')

sns.histplot(df['Cost_of_the_Product'], ax=ax[2], kde=True).set_title('Cost of the Product')

These three graphs show the distribution of product properties (weight, importance, and cost) in the dataset. In the weight distribution, products weighing 1000-2000 grams and 4000-6000 grams are the most numerous, which means the company sells more products in these weight ranges. The second graph shows product importance: the majority of products have low or medium importance. The third graph shows the cost distribution, with concentrations at 150-200 and 225-275 dollars. From this, I conclude that the majority of the products are lighter than 6000 grams, have low or medium importance, and cost between 150 and 275 dollars.

Logistics

fig, ax = plt.subplots(1,3,figsize=(15,5))

sns.countplot(x = 'Warehouse_block', data = df, ax=ax[0]).set_title('Warehouse Block')

sns.countplot(x = 'Mode_of_Shipment', data = df, ax=ax[1]).set_title('Mode of Shipment')

sns.countplot(x = 'Reached.on.Time_Y.N', data = df, ax=ax[2]).set_title('Reached on Time')

The above graphs visualize the logistics and delivery of the products. In the first graph, warehouse F holds the most products, about 3500, whereas the remaining warehouses have nearly equal numbers. The second graph shows the mode of shipment: the majority of products are shipped by ship, whereas nearly 2000 products are shipped by flight and by road. The third graph shows timely delivery: the number of products delivered on time is greater than the number not delivered on time.

From the above graphs, I assume that warehouse F is close to a seaport, because warehouse F has the largest number of products and most products are shipped by ship.
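
The seaport assumption could be checked by comparing the shipment mix per warehouse. The sketch below uses a made-up toy frame with the same column names as the dataset; the rows are invented, so the printed shares say nothing about the real data.

```python
import pandas as pd

# Toy rows (made-up) using the dataset's column names.
toy = pd.DataFrame({
    "Warehouse_block": ["A", "F", "F", "B", "F", "C"],
    "Mode_of_Shipment": ["Ship", "Ship", "Flight", "Road", "Ship", "Ship"],
})

# Share of each shipment mode within each warehouse (each row sums to 1).
share = pd.crosstab(
    toy["Warehouse_block"], toy["Mode_of_Shipment"], normalize="index"
)

print(share.round(2))
```

If warehouse F's share of sea shipments in the real data were similar to that of the other warehouses, the counts alone would not support the assumption.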

Customer Experience

fig, ax = plt.subplots(2,2,figsize=(15,10))

sns.countplot(x = 'Customer_care_calls', data = df, ax=ax[0,0]).set_title('Customer Care Calls')

sns.countplot(x = 'Customer_rating', data = df, ax=ax[0,1]).set_title('Customer Rating')

sns.countplot(x = 'Prior_purchases', data = df, ax=ax[1,0]).set_title('Prior Purchases')

sns.histplot(x = 'Discount_offered', data = df, ax=ax[1,1], kde = True).set_title('Discount Offered')

o/p: Text(0.5, 1.0, 'Discount Offered')

The above graphs visualize the customer experience in terms of customer care calls, ratings, prior purchases, and discounts offered. The first graph shows the number of customer care calls made by customers: the majority made 3-4 calls, which could indicate that customers are facing problems with product delivery. In the second graph, the counts are nearly the same across all ratings, with slightly more customers giving a rating of 1, which suggests that some customers are not satisfied with the service.

The third graph shows prior purchases: the majority of customers have made 2-3 prior purchases, which suggests that customers with prior purchases are satisfied with the service and are buying more products. The fourth graph shows the discount offered: the majority of products have a 0-10% discount, which means the company is not offering much discount on its products.

Customer Gender and Product Delivery

sns.countplot(x = 'Gender', data = df, hue = 'Reached.on.Time_Y.N').set_title('Gender vs Reached on Time')

The number of products delivered on time is the same for both genders, which means there is no relationship between customer gender and product delivery.
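
Because group sizes can differ, comparing proportions is more reliable than comparing raw counts. The following sketch computes the share of each target value per gender on made-up toy rows. Which value of `Reached.on.Time_Y.N` means "on time" is not stated in this text and should be confirmed against the dataset's documentation; the code makes no assumption about it.

```python
import pandas as pd

# Toy rows (made-up) using the dataset's column names.
toy = pd.DataFrame({
    "Gender": ["F", "F", "M", "M", "F", "M"],
    "Reached.on.Time_Y.N": [1, 0, 1, 0, 1, 1],
})

# Row-normalized table: share of each target value within each gender.
rates = pd.crosstab(
    toy["Gender"], toy["Reached.on.Time_Y.N"], normalize="index"
)

print(rates.round(3))
```

Near-identical rows in this table for the real data would support the claim that gender and delivery outcome are unrelated.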

Customer Experience and Product Delivery

fig, ax = plt.subplots(2,2,figsize=(15,10))

sns.countplot(x = 'Customer_care_calls', data = df, ax=ax[0,0], hue = 'Reached.on.Time_Y.N').set_title('Customer Care Calls')

sns.countplot(x = 'Customer_rating', data = df, ax=ax[0,1], hue = 'Reached.on.Time_Y.N').set_title('Customer Rating')

sns.countplot(x = 'Prior_purchases', data = df, ax=ax[1,0], hue = 'Reached.on.Time_Y.N').set_title('Prior Purchases')

sns.violinplot(x = 'Reached.on.Time_Y.N', y = 'Discount_offered', data = df, ax=ax[1,1]).set_title('Discount Offered')

It is important to understand the customer experience and customers' response to the services provided by the e-commerce company. The above graphs show the relationship between customer experience and product delivery. The first graph shows customer care calls against product delivery: the difference between on-time and late deliveries decreases as the number of customer calls increases, which suggests that when delivery is delayed, customers become anxious about the product and call customer care. The second graph shows customer rating against product delivery: customers who give a higher rating have a higher count of products delivered on time.

The third graph shows customers' prior purchases: customers who have made more prior purchases also have a higher count of products delivered on time, which may be the reason they purchase again from the company. The fourth graph shows the discount offered against product delivery: products with a 0-10% discount have a higher count of late deliveries, whereas products with a discount of more than 10% have a higher count of on-time deliveries.
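
The discount comparison can be expressed numerically by binning `Discount_offered` and averaging the target within each band. This is a sketch on made-up toy rows; the band edges (0-10% and above 10%) follow the text, while the upper edge of 100 and the name `discount_band` are assumptions for illustration. The mean is the share of rows with target value 1, and what that value means should be checked against the dataset's documentation.

```python
import pandas as pd

# Toy rows (made-up) using the dataset's column names.
toy = pd.DataFrame({
    "Discount_offered": [2, 5, 10, 12, 30, 45],
    "Reached.on.Time_Y.N": [0, 1, 0, 1, 1, 1],
})

# Two bands: [0, 10] and (10, 100].
toy["discount_band"] = pd.cut(
    toy["Discount_offered"],
    bins=[0, 10, 100],
    labels=["0-10%", ">10%"],
    include_lowest=True,
)

# Share of rows with target value 1 in each band.
band_rate = toy.groupby("discount_band", observed=True)["Reached.on.Time_Y.N"].mean()

print(band_rate.round(3))
```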

Correlation Matrix Heatmap

plt.figure(figsize=(10,10))

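# numeric_only=True skips text columns, which df.corr() rejects in pandas 2.0 and later
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap='coolwarm')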

In the correlation matrix heatmap, there is a positive correlation between the cost of the product and the number of customer care calls.

sns.violinplot(x = 'Customer_care_calls', y = 'Cost_of_the_Product', data = df)

This suggests that customers are more concerned about delivery when the cost of the product is high, which may be why they call customer care to check on the status of the product. It is therefore important to make sure that high-cost products are delivered on time.
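
The strength of this relationship can be read as a number instead of a heatmap color. The sketch below runs on made-up toy rows in which cost rises exactly linearly with calls, so the coefficient is 1 by construction; the real dataset will give a different value. `numeric_only=True` requires pandas 1.5 or later.

```python
import pandas as pd

# Toy rows (made-up): cost increases linearly with the number of calls.
toy = pd.DataFrame({
    "Customer_care_calls": [2, 3, 4, 5, 6],
    "Cost_of_the_Product": [150, 180, 210, 240, 270],
    "Mode_of_Shipment": ["Ship", "Ship", "Flight", "Road", "Ship"],
})

# Pearson correlation over numeric columns only; the text column is skipped.
corr = toy.corr(numeric_only=True)
r = corr.loc["Customer_care_calls", "Cost_of_the_Product"]

# Median cost per call count, the quantity the violin plot summarizes.
median_cost = toy.groupby("Customer_care_calls")["Cost_of_the_Product"].median()

print(round(r, 3))
print(median_cost.to_dict())
```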

Conclusion

From the exploratory data analysis (EDA), product weight and cost were found to affect delivery time. Specifically, products weighing between 2500 and 3500 grams and costing less than 250 dollars had a higher likelihood of being delivered on time. Additionally, Warehouse F handled the largest number of products and most products were shipped via ship, suggesting that this warehouse might be located near a seaport, which could contribute to more efficient deliveries.

Customer behavior also plays a role in predicting delivery timeliness. The analysis revealed that the more frequently customers call, the higher the chance of a delayed delivery. Customers with more prior purchases tended to experience more timely deliveries, possibly indicating a higher level of trust in the company, which encourages repeat purchases. Another observation is that products with a 0-10% discount had a higher rate of late deliveries, while those with discounts of more than 10% were more often delivered on time.

Submitted by:

HARSHAPRADHA K(24CESG010)
