05 Detailed Project Report

The document provides a detailed project report on budget sales analysis, including objectives to perform ETL and data analysis on sales orders, customers, products and geographical data to deduce metrics and patterns. It describes the data attributes and insights gained, such as most sales occurring on Wednesdays and Saturdays, the top 5 selling products, and average shipping delays of 7 days. The report also outlines the architecture used in data collection, exploration, modeling, and deployment of Power BI visualizations and reports.

2022

Detailed Project Report


BUDGET SALES ANALYSIS
ABHISHEK DOKE

1. Problem Statement:

Our "Domain Sale" process is structured to help potential
buyers purchase the domain they want immediately without the
hassle of contacting the seller directly.

A seller lists a domain for sale at a specific price in our
Marketplace. An interested buyer sees this domain for sale and
decides to buy it.

2. Objectives:

• The dataset includes records for sales orders, customer
information, product information, and geographical data.
• To deduce important metrics and patterns in the dataset,
this project uses the provided data to perform ETL and
data analysis.
• Additionally, several visualisations and reports are created
to represent significant relationships.


3. Benefits

• Helps in making wiser business decisions.
• Aids in customer satisfaction and trend monitoring, which
can serve current consumers and attract new ones.
• Provides a better understanding of the client base.
• Facilitates smooth resource management.


4. Data attributes

Customer
CustomerKey         FullName                Birthdate
MaritalStatus       Gender                  YearlyIncome
TotalChildren       NumberChildrenAtHome    Education
Occupation          HouseOwnerFlag          NumberCarsOwned
DateFirstPurchase   CommuteDistance

Product
ProductKey          ProductName             Subcategory
Category            ListPrice               DaysToManufacture
ProductLine         ModelName               ProductDescription
StartDate

Territory
SalesTerritoryKey   Region                  Country
Group

Sales
ProductKey          OrderDate               ShipDate
CustomerKey         PromotionKey            SalesTerritoryKey
SalesOrderNumber    SalesOrderLineNumber    OrderQuantity
UnitPrice           TotalProductCost        SalesAmount
TaxAmt


4.1 Dataset information

CustomerKey: Primary key for the customer dataset
Birthdate: Birthdate of the customer
MaritalStatus: M - Married / S - Single
Gender: M - Male / F - Female
TotalChildren: Total number of children
NumberChildrenAtHome: Number of children living with their parents
Education: Educational qualification
Occupation: Present occupation
HouseOwnerFlag: 1 - Owns a house / 0 - Doesn’t have a permanent address
NumberCarsOwned: Number of cars owned by the customer
DateFirstPurchase: Date of the customer's first order
ProductKey: Primary key for the product dataset
ProductName: Product name, including the colour of the product
Subcategory: Subcategory name of the product
Category: Category name of the product
ListPrice: Sale price of the product
DaysToManufacture: Days to manufacture the product after receiving the order
ProductLine: Product line name
ModelName: Model name of the product
ProductDescription: More details about the product


SalesTerritoryKey: Primary key of the territory dataset
Region: Region name of the order
Country: Country name of the order
OrderDate: Date the order was received
ShipDate: Date when the order left the factory for export
SalesOrderNumber: Invoice number of the order
OrderQuantity: Number of units ordered for a product
UnitPrice: Per-unit sale price of the product
TotalProductCost: Cost of the product
SalesAmount: Total sales price of the product
TaxAmt: Tax collected for the product sold
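
The key columns listed above suggest how the four tables fit together: each Sales row refers to one customer, one product, and one territory. The following is a minimal sketch of that join in plain Python. All row values are invented for illustration and are not taken from the dataset, and the helper name `enrich` is hypothetical.

```python
# Illustrative lookup tables keyed by their primary keys (values are invented).
customers = {11000: {"FullName": "Customer A", "Gender": "M"}}
products = {310: {"ProductName": "Road Bike, Red", "Category": "Bikes"}}
territories = {9: {"Region": "Australia", "Country": "Australia"}}

# One invented Sales row carrying the three foreign keys.
sales = [
    {"SalesOrderNumber": "SO1", "SalesOrderLineNumber": 1,
     "CustomerKey": 11000, "ProductKey": 310, "SalesTerritoryKey": 9,
     "OrderQuantity": 1, "SalesAmount": 3578.27},
]


def enrich(sale):
    """Attach customer, product, and territory attributes to one sales row."""
    row = dict(sale)
    row.update(customers[sale["CustomerKey"]])
    row.update(products[sale["ProductKey"]])
    row.update(territories[sale["SalesTerritoryKey"]])
    return row


merged = [enrich(s) for s in sales]
print(merged[0]["Category"], merged[0]["Country"])
```

With pandas, the same join is usually written with `DataFrame.merge` on each key column.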


5. Architecture

[Figure: architecture flow diagram. Stages shown: Collect Raw Data, Importing Libraries, Load Dataset, Merging Data, Assessing Data, Handling Missing Data, Adding Columns, Exploring Data, Modeling, Creating Measures, Power BI Report, Insights, Deployment, Documentation]

1. Collect Raw Data - This step involves extracting the data
from different sources relevant to the problem statement,
or obtaining data from the client.

2. Importing Libraries - Import the Python libraries needed
for analysis, for example pandas, NumPy, Plotly, and datetime.

3. Data Wrangling - Consists of the following steps: gathering
data, assessing data, handling missing data, and adding columns.


4. Exploring Data - Once the data is loaded and pre-processed,
we perform data analysis using Python libraries and
Business Intelligence tools like Power BI.

5. Data Modelling - Data modelling is a feature used to
connect multiple data sources in a BI tool through
relationships. A relationship defines how data sources are
connected with each other, so that visualizations can draw
on multiple data sources.

6. Deployment - The prepared visualizations are deployed on
the powerbi.microsoft.com site, where they are publicly
available.
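
The wrangling steps above (handling missing data and adding columns) can be sketched as follows. This is an illustration in plain Python on invented rows, not the report's actual code. It assumes that incomplete rows are dropped (the Q & A section mentions removing columns with missing values instead) and that profit is defined as SalesAmount minus TotalProductCost. The names `REQUIRED`, `wrangle`, `Profit`, and `OrderYear` are hypothetical.

```python
from datetime import date

# Invented sample rows; None marks a missing value.
raw_sales = [
    {"OrderDate": date(2016, 3, 5), "SalesAmount": 100.0, "TotalProductCost": 60.0},
    {"OrderDate": date(2015, 7, 1), "SalesAmount": None, "TotalProductCost": 20.0},
    {"OrderDate": date(2016, 11, 20), "SalesAmount": 50.0, "TotalProductCost": 35.0},
]

# Columns that must be present for a row to be usable (an assumption).
REQUIRED = ("OrderDate", "SalesAmount", "TotalProductCost")


def wrangle(rows):
    """Drop rows with missing required fields, then add derived columns."""
    cleaned = []
    for row in rows:
        if any(row.get(col) is None for col in REQUIRED):
            continue  # handling missing data: skip incomplete rows
        out = dict(row)
        # Assumed definition: profit = sales amount minus product cost.
        out["Profit"] = out["SalesAmount"] - out["TotalProductCost"]
        out["OrderYear"] = out["OrderDate"].year
        cleaned.append(out)
    return cleaned


clean_sales = wrangle(raw_sales)
print(len(clean_sales), clean_sales[0]["Profit"])
```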


6. Insights

1. Product price per unit distribution

▪ The distribution plot shows that most product unit prices
are below $1000

2. Sales order line number distribution

▪ Most orders contain two to three products


3. Sales order quantity distribution

▪ Most products are ordered in quantities below 5

4. Age distribution

▪ A sizable portion of the clientele is made up of people
between the ages of 40 and 59

5. Year-wise sales

▪ The year 2016 saw a sharp surge in sales


6. Top 5 selling products

7. Quantity ordered based on category and subcategory from
2014 to 2016


8. Country-wise quantity ordered

▪ A high quantity of products is ordered from Australia and
the United States

9. Overall profit based on order year, category and subcategory

▪ The Bike category contributes the major share of profit


10. How efficient are the logistics?

▪ The average order has a gap of 7 days between the day the
order is ready for export from the factory and the date it
was shipped
▪ Management must work to reduce this gap toward 3 days
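
The 7-day figure is an average of per-order gaps. Here is a minimal sketch of that calculation, assuming the gap is measured as ShipDate minus OrderDate (the report does not state which columns were used) and using invented dates.

```python
from datetime import date
from statistics import mean

# Invented orders; only the two date columns matter here.
orders = [
    {"OrderDate": date(2016, 1, 1), "ShipDate": date(2016, 1, 8)},
    {"OrderDate": date(2016, 1, 10), "ShipDate": date(2016, 1, 16)},
    {"OrderDate": date(2016, 2, 1), "ShipDate": date(2016, 2, 9)},
]

# Gap in whole days for each order (assumed definition of the delay).
delays = [(o["ShipDate"] - o["OrderDate"]).days for o in orders]
average_delay = mean(delays)
print(delays, average_delay)
```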

11. What was the best month for sales? How much was earned
that month?

▪ Maximum profit was earned in the months of June, November,
and December
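
A month-level ranking like this comes from summing profit per calendar month. The sketch below shows the idea on invented rows. The `Profit` field is an assumed derived column, not one of the dataset attributes.

```python
from collections import defaultdict
from datetime import date

# Invented rows with an assumed, precomputed Profit value.
rows = [
    {"OrderDate": date(2016, 6, 3), "Profit": 120.0},
    {"OrderDate": date(2016, 6, 21), "Profit": 80.0},
    {"OrderDate": date(2016, 11, 5), "Profit": 150.0},
    {"OrderDate": date(2016, 2, 14), "Profit": 40.0},
]

# Sum profit per calendar month (1 = January, ..., 12 = December).
profit_by_month = defaultdict(float)
for r in rows:
    profit_by_month[r["OrderDate"].month] += r["Profit"]

best_month = max(profit_by_month, key=profit_by_month.get)
print(best_month, profit_by_month[best_month])
```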


12. What time should we display advertisements to maximize the
likelihood of customers buying products?

▪ High sales orders are seen on Wednesday and Saturday;
therefore, we can promote our products on these days of
the week
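
Counting orders per weekday is one way to reach a finding like this. The following sketch uses invented order dates. The `WEEKDAYS` list is spelled out so the day names do not depend on the system locale.

```python
from collections import Counter
from datetime import date

# date.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Invented order dates.
order_dates = [date(2016, 6, 1), date(2016, 6, 4),
               date(2016, 6, 8), date(2016, 6, 9)]

orders_per_weekday = Counter(WEEKDAYS[d.weekday()] for d in order_dates)
busiest_day, busiest_count = orders_per_weekday.most_common(1)[0]
print(busiest_day, busiest_count)
```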

13. Which products are most often sold together?

▪ The products above can be sold as a bundle or a combined
package at a discount
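
One common way to find products sold together is to group order lines by SalesOrderNumber and count how often each pair of products shares an order. The sketch below illustrates this. It is not necessarily the method used in the report, and the product names are invented.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Invented (SalesOrderNumber, ProductName) order lines.
order_lines = [
    ("SO1", "Water Bottle"), ("SO1", "Bottle Cage"),
    ("SO2", "Water Bottle"), ("SO2", "Bottle Cage"), ("SO2", "Helmet"),
    ("SO3", "Helmet"),
]

# Collect the distinct products in each order.
baskets = defaultdict(set)
for order_no, product in order_lines:
    baskets[order_no].add(product)

# Count every unordered product pair that appears in the same order.
pair_counts = Counter()
for products in baskets.values():
    # sorted() makes (A, B) and (B, A) count as the same pair
    pair_counts.update(combinations(sorted(products), 2))

top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count)
```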


14. Which product sold the most? Why do you think it sold the
most?

▪ There is a high negative correlation between price and
quantity ordered
▪ We can conclude that low-priced products are in high demand
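
A negative correlation means that quantity ordered tends to fall as price rises. As an illustration, the sketch below computes the Pearson correlation coefficient by hand on invented price and quantity values. The report does not say which correlation measure was used, so Pearson is an assumption.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented values: cheaper products are ordered in larger quantities.
unit_prices = [5.0, 25.0, 50.0, 500.0, 2000.0]
quantities = [900, 600, 400, 120, 30]

r = pearson(unit_prices, quantities)
print(round(r, 2))
```

A value of r close to -1 indicates a strong negative linear relationship; a value near 0 indicates little linear relationship.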

15. Compare the most ordered products by gender


16. Do gender and home ownership matter in purchasing?

▪ It's interesting to note that the average amount spent by men
without permanent addresses is low, whilst the average amount
spent by women without permanent addresses is higher

17. Number of children and purchase correlation

▪ Purchases are high among customers with 2 or 5 children


18. Occupation and purchase correlation

▪ Purchases by Professional and Management customers are
comparatively high

19. Which age group has produced the most revenue?

▪ The 40-49 and 50-59 age ranges show higher demand than
other age groups
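
Grouping revenue by age band requires an age for each customer, which can be derived from Birthdate. The following sketch assumes ten-year bands and an arbitrary reference date. The rows and the helper names `age_on` and `age_group` are illustrative.

```python
from collections import defaultdict
from datetime import date


def age_on(birthdate, ref):
    """Age in completed years on the reference date."""
    had_birthday = (ref.month, ref.day) >= (birthdate.month, birthdate.day)
    return ref.year - birthdate.year - (0 if had_birthday else 1)


def age_group(age):
    """Map an age to a ten-year band such as '40-49'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


REFERENCE_DATE = date(2016, 12, 31)  # assumed reference date

# Invented (Birthdate, SalesAmount) pairs.
purchases = [
    (date(1971, 3, 2), 1200.0),
    (date(1964, 8, 9), 900.0),
    (date(1983, 1, 20), 300.0),
    (date(1969, 11, 30), 700.0),
]

revenue_by_group = defaultdict(float)
for birthdate, amount in purchases:
    revenue_by_group[age_group(age_on(birthdate, REFERENCE_DATE))] += amount

top_group = max(revenue_by_group, key=revenue_by_group.get)
print(top_group, revenue_by_group[top_group])
```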

20. Yearly income range and purchase correlation

▪ Higher yearly income ranges are associated with higher
revenue


21. Partial high school vs bachelor's: mean income and most
ordered product

▪ Customers with a high school diploma and modest annual
income buy more products than people with bachelor's degrees

22. Customer segmentation

▪ According to the customer segmentation described above,
approximately 15% of our clients are high value clients,
whereas the majority of our clientele are low value and
lost clients
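
The report does not describe how customers were assigned to the high value, low value, and lost segments. One plausible approach is a rule based on recency and total spend. The sketch below is only an assumption along those lines; both thresholds and all customer values are hypothetical.

```python
from datetime import date


def segment(last_purchase, total_spend, today,
            lost_after_days=365, high_value_spend=5000.0):
    """Assign a segment label; both thresholds are hypothetical."""
    if (today - last_purchase).days > lost_after_days:
        return "lost"
    return "high value" if total_spend >= high_value_spend else "low value"


today = date(2016, 12, 31)  # assumed analysis date

# Invented customers: (date of last purchase, total amount spent).
customers = {
    "C1": (date(2016, 12, 1), 8000.0),
    "C2": (date(2016, 10, 15), 300.0),
    "C3": (date(2014, 5, 2), 6000.0),
}

segments = {cid: segment(last, spend, today)
            for cid, (last, spend) in customers.items()}
print(segments)
```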


23. Cohort Analysis

▪ We can infer from the heatmap above that client retention in 2014
was subpar
▪ Since August of 2015, we have noticed some customers returning,
though not in large numbers
▪ 2016 brought about a slight improvement in retention
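
A cohort retention table of this kind is typically built by grouping customers by the month of their first order and then measuring what share of each cohort orders again in later months. The sketch below follows that common definition on invented data; the report's exact method is not stated.

```python
from collections import defaultdict
from datetime import date

# Invented (customer, order date) pairs.
orders = [
    ("C1", date(2015, 1, 10)), ("C1", date(2015, 2, 3)), ("C1", date(2015, 4, 20)),
    ("C2", date(2015, 1, 25)),
    ("C3", date(2015, 2, 14)), ("C3", date(2015, 3, 1)),
]


def month_index(d):
    """Running month number, so differences give months elapsed."""
    return d.year * 12 + d.month


# The cohort of a customer is the month of their first order.
first_order = {}
for cust, d in orders:
    if cust not in first_order or d < first_order[cust]:
        first_order[cust] = d

# cohort -> months since first order -> set of active customers
active = defaultdict(lambda: defaultdict(set))
for cust, d in orders:
    cohort = (first_order[cust].year, first_order[cust].month)
    offset = month_index(d) - month_index(first_order[cust])
    active[cohort][offset].add(cust)

# Share of each cohort that is active at each month offset.
retention = {
    cohort: {off: len(custs) / len(by_offset[0])
             for off, custs in by_offset.items()}
    for cohort, by_offset in active.items()
}
print(retention)
```

Each row of the resulting table is a cohort and each column a month offset, which is the layout a retention heatmap displays.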


7. Key Performance Indicator


▪ Sales trend line
▪ Cost trend line
▪ Average unit cost and price
▪ Revenue generated by Subcategory
▪ Sales by Product Line
▪ Revenue contribution by region
▪ Profit contribution by region
▪ Profit % by region
▪ Current year profit margin vs difference in last year’s profit margin
▪ Total orders
▪ Total revenue
▪ Variance to target comparison by category
▪ Variance by month line chart
▪ Actual sales and target sales matrix
▪ Cohort analysis table
▪ Customer retention line chart
▪ Monthly spending trend
▪ Average monthly spend distribution


8. Conclusion

▪ A sizable portion of the clientele is made up of people between
the ages of 40 and 59
▪ The year 2016 saw a sharp surge in sales
▪ A high quantity of products is ordered from Australia and the
United States
▪ The Bike category contributes the major share of profit
▪ The average order has a gap of 7 days between the day the
order is ready for export from the factory and the date it was
shipped
▪ Maximum profit was earned in the months of June, November, and
December
▪ High sales orders are seen on Wednesday and Saturday compared
to other days of the week
▪ There is a high negative correlation between price and
quantity ordered


▪ The average amount spent by men without permanent
addresses is low, whilst the average amount spent by women
without permanent addresses is higher
▪ The 40-49 and 50-59 age ranges show higher demand than
other age groups
▪ Higher yearly income ranges are associated with higher revenue
▪ Customers with a high school diploma and modest annual
income buy more products than people with bachelor's degrees
▪ According to the customer segmentation described above,
approximately 15% of our clients are high value clients,
whereas the majority of our clientele are low value and lost
clients
▪ Client retention in 2014 was subpar
▪ 2016 brought about a slight improvement in retention


9. Q & A

Q1) What’s the source of data?

➢ The dataset was taken from iNeuron’s provided Project
Description Document
➢ Data Link

Q2) What was the type of data?

➢ The data was a combination of numerical and categorical
values

Q3) What’s the complete flow you followed in this project?

➢ Refer to Section 5 (Architecture) for a better understanding

Q4) What techniques were you using for data?

➢ Removing unwanted attributes
➢ Visualizing the relationships of independent variables with
each other
➢ Cleaning data by removing columns with missing values
➢ Converting numerical data into categorical values

Q6) What were the libraries that you used in Python?

➢ I used the Pandas, NumPy, Matplotlib, Seaborn and Plotly
libraries
