R4M - Superstore Dataset

Uploaded by

Radhe Shyam

R for Managers

Superstore Dataset
Prof. Rajendra Baraiya
Context:
In today's era of big data, we understand the importance of data as the key driver for business
decisions. The owner of a prominent superstore chain in the United States has access to a wealth
of data from sales spanning 2014 to 2017. Recognizing the potential value in this historical
data, the owner seeks to extract meaningful insights to enhance business performance. As
consultants, you have been invited to analyze this dataset and provide actionable insights that
can help the owner make informed decisions.
About Dataset:
Row ID: Unique ID for each row.
Order ID: Unique Order ID for each Customer.
Order Date: Order Date of the product.
Ship Date: Shipping Date of the Product.
Ship Mode: Shipping Mode specified by the Customer.
Customer ID: Unique ID to identify each Customer.
Customer Name: Name of the Customer.
Segment: The segment where the Customer belongs.
Country: Country of residence of the Customer.
City: City of residence of the Customer.
State: State of residence of the Customer.
Postal Code: Postal Code of every Customer.
Region: Region where the Customer belongs.
Product ID: Unique ID of the Product.
Category: Category of the product ordered.
Sub-Category: Sub-Category of the product ordered.
Product Name: Name of the Product.
Sales: Sales of the Product.
Quantity: Quantity of the Product.
Discount: Discount provided.
Profit: Profit/Loss incurred.
1. Load ‘Superstore Dataset.csv’ into a data frame named ‘sd’ in R.
2. Find the number of variables in the dataset.
3. Find the number of entries in the dataset.
4. Check column headings of the data frame.
5. Check the structure of the dataset.
6. Check the summary of the dataset. What are you observing? Interpret it.
7. Convert all the character datatypes into factor datatypes.
8. Check for missing values in the dataset. If there are any, identify them and remove the
observations with missing values.
9. Check for duplicated values in the dataset. If there are any, identify them and remove the
observations with duplicated entries.
10. Identify the unique customers in the dataset.
11. Extract the unique shipping modes available in the dataset. Identify the number of sales
instances for each shipping mode.
12. Extract the unique regions available in the dataset. Identify the number of sales instances
for each region.
13. Identify the unique product categories in the dataset. Identify the number of sales
instances for each category.
14. Identify the unique product sub-categories in the dataset. Identify the number of sales
instances for each sub-category.
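Steps 1 to 14 can be sketched in base R roughly as follows. This is only one possible approach, not a model answer. Because the CSV file is not reproduced here, the sketch builds a small toy data frame with made-up values in place of the real one; the real load call is shown as a comment. The dotted column names (Ship.Mode, Sub.Category, Customer.ID) assume the default behaviour of read.csv(), which replaces spaces and hyphens in headers with dots.

```r
# 1. With the real file: sd <- read.csv("Superstore Dataset.csv")
# Toy stand-in (hypothetical values) so the sketch runs on its own.
sd <- data.frame(
  Customer.ID  = c("C1", "C2", "C1", "C3", "C3"),
  Ship.Mode    = c("Standard Class", "First Class", "Standard Class",
                   "Same Day", "Same Day"),
  Region       = c("West", "East", "West", "South", "South"),
  Category     = c("Furniture", "Technology", "Furniture",
                   "Office Supplies", "Office Supplies"),
  Sub.Category = c("Chairs", "Phones", "Chairs", "Paper", "Paper"),
  Sales        = c(100, 250, 100, NA, 40),
  stringsAsFactors = FALSE
)

ncol(sd)      # 2. number of variables
nrow(sd)      # 3. number of entries
names(sd)     # 4. column headings
str(sd)       # 5. structure
summary(sd)   # 6. summary

# 7. Convert every character column to a factor.
is_chr <- sapply(sd, is.character)
sd[is_chr] <- lapply(sd[is_chr], factor)

# 8. Count missing values per column, then drop incomplete rows.
colSums(is.na(sd))
sd <- na.omit(sd)

# 9. Count fully duplicated rows, then drop them.
sum(duplicated(sd))
sd <- sd[!duplicated(sd), ]

# 10. Unique customers.
unique(sd$Customer.ID)

# 11-14. Unique values and number of sales instances per group.
table(sd$Ship.Mode)
table(sd$Region)
table(sd$Category)
table(sd$Sub.Category)
```

table() reports both the distinct values and their counts in one call; levels() or unique() can be used when only the distinct values are needed.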

DATA VISUALIZATION
• What is data visualization?
• What are data visualization tools?
Package – ggplot2
Different Charts –
o Bar
o Histogram
o Scatter
o Line
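As a rough illustration of how the four chart types listed here map onto ggplot2, the sketch below pairs each one with the geom that is commonly used for it. The data frame toy and all of its values are hypothetical and exist only to make the example runnable; they are not taken from the Superstore data.

```r
library(ggplot2)

# Hypothetical toy values, only to show which geom gives which chart type.
toy <- data.frame(
  Month    = 1:6,
  Sales    = c(120, 80, 150, 200, 170, 90),
  Profit   = c(20, -5, 35, 60, 40, 10),
  Category = c("Furniture", "Technology", "Furniture",
               "Technology", "Furniture", "Technology")
)

# Bar: counts of rows per category.
bar_plot <- ggplot(toy, aes(x = Category)) + geom_bar()

# Histogram: distribution of one numeric variable.
hist_plot <- ggplot(toy, aes(x = Sales)) + geom_histogram(bins = 5)

# Scatter: relationship between two numeric variables.
scatter_plot <- ggplot(toy, aes(x = Sales, y = Profit)) + geom_point()

# Line: a numeric variable followed along an ordered axis.
line_plot <- ggplot(toy, aes(x = Month, y = Sales)) + geom_line()

print(bar_plot)
```

Each plot starts from the same ggplot() call and differs only in the aesthetic mapping and the geom layer added with +.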

BAR CHART

15. Create a bar chart of the number of sales instances for each product category.
a. Create a canvas.
b. Add a layer of bar.
c. Change the bar color as per the different product categories.
d. Provide title to the bar chart (i.e., No. of Sales per Category).
e. Provide x and y axis names to the chart.
f. Provide sub-title to the bar chart (i.e., Superstore Dataset).
g. Change the theme of chart. See the effect of using different themes.
h. Flip the coordinates (Change x-axis to y-axis, and vice versa).
i. Change the legend positions (top, bottom, left, right).
j. Add manual colors to the different bars with respect to the different product categories.
k. Stacked bar chart: Modify the bar chart and visualize the customer orders as per the
i. different sales regions
ii. different customer segments
iii. different sub-category of products
16. Create a bar chart of the number of sales instances
a. in each region,
b. for each customer segment,
c. for each shipping mode.
17. Create a bar chart of the total Profit for each customer segment.
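18. For your practice: Create a bar chart of the (statistic) of (variable) with each (group),
choosing one option from each of the following:
Statistic: min, max, sum, mean, count
Variable: Sales, Quantity, Discount, Profit
Group: Customer Segment, Ship Mode, Region, Product Category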
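The bar chart exercises can be approached along the following lines. This is a minimal sketch under stated assumptions, not a reference solution: the data frame is a toy stand-in with made-up values (the real sd would come from the CSV file), the axis labels and the three colors are arbitrary choices, and theme_minimal() is just one of several built-in themes to try.

```r
library(ggplot2)

# Toy stand-in for sd (hypothetical values) so the sketch runs on its own.
sd <- data.frame(
  Category = c("Furniture", "Technology", "Office Supplies",
               "Furniture", "Technology", "Office Supplies"),
  Region   = c("West", "East", "West", "South", "East", "Central"),
  Segment  = c("Consumer", "Corporate", "Consumer",
               "Home Office", "Corporate", "Consumer"),
  Profit   = c(40, 120, -15, 10, 60, 25)
)

# 15a-j. Canvas, bar layer, fill by category, labels, theme,
# legend position, and manual colors.
p <- ggplot(sd, aes(x = Category, fill = Category)) +
  geom_bar() +
  labs(title = "No. of Sales per Category",
       subtitle = "Superstore Dataset",
       x = "Product Category", y = "No. of Sales") +
  theme_minimal() +
  theme(legend.position = "bottom") +
  scale_fill_manual(values = c("Furniture" = "steelblue",
                               "Office Supplies" = "darkorange",
                               "Technology" = "forestgreen"))
print(p)

# 15h. Flip the coordinates.
print(p + coord_flip())

# 15k. Stacked bars: map fill to a second variable, here Region.
stacked <- ggplot(sd, aes(x = Category, fill = Region)) + geom_bar()
print(stacked)

# 17. Total Profit per customer segment: aggregate first, then geom_col().
profit_by_segment <- aggregate(Profit ~ Segment, data = sd, FUN = sum)
profit_plot <- ggplot(profit_by_segment, aes(x = Segment, y = Profit)) +
  geom_col()
print(profit_plot)
```

For exercise 16, replacing Category with Region, Segment, or the shipping mode column in aes() gives the corresponding count charts. For exercise 18, swapping FUN = sum for min, max, or mean, and Profit or Segment for another column, covers most of the combinations; the count case is what geom_bar() already draws.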
