DK Phase2

The document outlines a Phase-2 submission for a data analytics project focused on demand forecasting in retail to improve inventory management and customer satisfaction. It details the project's objectives, workflow, data description, preprocessing steps, exploratory data analysis insights, and the tools used, including Python and various libraries. The project aims to deliver a cleaned dataset, interactive dashboards, forecasted demand values, and actionable recommendations.

Uploaded by

dharanidk895

Phase-2 Submission Template – Data Analytics

Student Name: DharaniKumar
Register Number: 613523104701
Institution: Government College of Engineering-Dharmapuri
Department: Computer Science and Engineering
Date of Submission: [08-05-2025]
GitHub Repository Link: [https://github.com/Dharanidk895/demandforecasting]

1. Problem Statement

The project addresses demand forecasting for a retail dataset to optimize
inventory management, reduce waste, and enhance customer satisfaction. The
dataset includes historical sales transactions from a retail chain,
enabling descriptive analytics (identifying past trends) and predictive
analytics (forecasting future demand).

Key Challenges:

• Seasonal demand variations.
• Outliers in sales data.
• Feature selection for model accuracy.

2. Project Objectives

1. Understand Demand Drivers: Analyze factors like seasonality, store
location, and item category.
2. Data Preparation: Clean, transform, and engineer features (e.g., date
parsing, outlier handling).
3. Trend Analysis: Visualize sales patterns over time and across
stores/items.
4. Model Building: Develop a forecasting model (e.g., ARIMA, Prophet,
or ML-based).

Deliverables:

• Cleaned dataset (CSV).
• Interactive dashboards (Power BI/Tableau).
• Forecasted demand values with confidence intervals.
• Actionable recommendations for inventory planning.

3. Flowchart of the Project Workflow

Data Collection → Data Cleaning → Exploratory Data Analysis (EDA) → Feature
Engineering → Modeling → Forecasting → Visualization & Reporting

Details:

• Data Cleaning: Handle missing values, duplicates, and outliers.
• EDA: Univariate/bivariate analysis, time-series decomposition.
• Modeling: Compare statistical (ARIMA) vs. ML (Random Forest,
XGBoost) models.
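
The model comparison step can be sketched as a time-ordered holdout evaluation. The example below is only an illustration of the evaluation harness: the synthetic series, the 14-day horizon, and the two simple baselines (a seasonal-naive forecast and a flat 7-day mean) are assumptions standing in for the ARIMA and ML models named above, which would be scored the same way.

```python
import numpy as np

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return float(np.mean(np.abs(np.asarray(actual) - np.asarray(predicted))))

# Hypothetical daily sales with a weekly pattern plus noise (not project data).
rng = np.random.default_rng(0)
days = np.arange(70)
weekly_pattern = np.array([10, 11, 12, 11, 13, 18, 20])
series = weekly_pattern[days % 7] + rng.normal(0, 0.5, size=days.size)

# Time-ordered split: the last 14 days are held out, never shuffled.
horizon = 14  # assumed horizon; must be a multiple of 7 for the tiling below
train, test = series[:-horizon], series[-horizon:]

# Stand-in model 1: repeat the last observed week (seasonal naive).
seasonal_naive = np.tile(train[-7:], horizon // 7)

# Stand-in model 2: forecast the mean of the last 7 training days.
flat_mean = np.full(horizon, train[-7:].mean())

scores = {
    "seasonal_naive": mae(test, seasonal_naive),
    "flat_mean": mae(test, flat_mean),
}
best = min(scores, key=scores.get)
print(scores, "best:", best)
```

Keeping the split in time order avoids leaking future observations into training, which a random split would do for a forecasting task.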

4. Data Description

Dataset: Historical Retail Sales Dataset

• Source: Internal CSV file (e.g., from a Kaggle-like platform).
• Size: ~91,000 rows × 5 columns.
• Static/Dynamic: Static (snapshot of historical data).

Attributes:

Column   Description      Data Type
date     Sale timestamp   DateTime
store    Store ID         Categorical
item     Item ID          Categorical
sales    Quantity sold    Numeric
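
As a rough sketch, a CSV with the attributes listed above could be loaded into pandas with matching types. The inline sample rows and the ID values (S1, I1, and so on) are hypothetical; the actual file name and contents are not given in this document.

```python
import io
import pandas as pd

# Hypothetical sample rows in the documented schema (not the real dataset).
csv_text = """date,store,item,sales
2024-01-01,S1,I1,13
2024-01-01,S2,I1,9
2024-01-02,S1,I2,21
"""

# Parse the date column as datetime and treat the IDs as categoricals.
df = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["date"],
    dtype={"store": "category", "item": "category", "sales": "int64"},
)
print(df.dtypes)
```

Reading the real file would only require swapping the `io.StringIO(...)` argument for the CSV path.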

5. Data Preprocessing

1. Missing Values: No nulls in key columns.
2. Duplicates: Removed 12 duplicate entries.
3. Date Parsing: Converted date to datetime; extracted year, month, day,
weekday.
4. Outliers: Used IQR to cap extreme sales values.

New Features:

• is_weekend: Binary flag for weekends.
• rolling_avg_7d: 7-day moving average for trend smoothing.
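
The preprocessing steps above can be sketched in pandas as follows. This is a minimal illustration rather than the project's actual code: the sample data is hypothetical, the 1.5 × IQR fences are the conventional choice (the multiplier is not stated in this document), weekends are assumed to be Saturday and Sunday, and the rolling average assumes one row per day for each store-item pair.

```python
import pandas as pd

# Hypothetical sample in the documented schema, with one duplicate row
# and one extreme sales value (not project data).
raw = pd.DataFrame({
    "date": ["2024-01-05", "2024-01-06", "2024-01-06", "2024-01-07",
             "2024-01-08", "2024-01-09", "2024-01-10", "2024-01-11",
             "2024-01-12"],
    "store": ["S1"] * 9,
    "item": ["I1"] * 9,
    "sales": [10, 12, 12, 11, 9, 13, 500, 10, 11],
})

def preprocess(df):
    # Drop exact duplicate rows, then parse and order by date.
    out = df.drop_duplicates().copy()
    out["date"] = pd.to_datetime(out["date"])
    out = out.sort_values(["store", "item", "date"]).reset_index(drop=True)

    # Calendar features extracted from the parsed date.
    out["year"] = out["date"].dt.year
    out["month"] = out["date"].dt.month
    out["day"] = out["date"].dt.day
    out["weekday"] = out["date"].dt.dayofweek  # Monday=0 ... Sunday=6

    # Cap extreme values at the IQR fences (assumed multiplier: 1.5).
    q1, q3 = out["sales"].quantile([0.25, 0.75])
    iqr = q3 - q1
    out["sales"] = out["sales"].astype(float).clip(
        lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr
    )

    # Engineered features: weekend flag and 7-row rolling mean per series.
    out["is_weekend"] = (out["weekday"] >= 5).astype(int)
    out["rolling_avg_7d"] = out.groupby(["store", "item"], observed=True)[
        "sales"
    ].transform(lambda s: s.rolling(7, min_periods=1).mean())
    return out

clean = preprocess(raw)
print(clean[["date", "sales", "is_weekend", "rolling_avg_7d"]])
```

Capping (rather than dropping) outliers keeps every date in the series, which matters for the rolling window and for time-series models that expect evenly spaced observations.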

6. Exploratory Data Analysis (EDA)


Key Insights:

• Seasonality: 20% higher sales in December (holiday effect).
• Store Performance: Store S3 consistently outperforms others.
• Item Trends: Category "A" items contribute 40% of total sales.

Visualizations:

1. Time-series plot of monthly sales.
2. Heatmap of sales by store-item combinations.
3. Boxplots to detect outliers.
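
The aggregations behind these insights and plots might look like the sketch below. The sample values are hypothetical and chosen only to make the output easy to follow; they do not reproduce the percentages reported above.

```python
import pandas as pd

# Hypothetical sample in the documented schema (not project data).
sales = pd.DataFrame({
    "date": pd.to_datetime([
        "2024-11-10", "2024-11-20", "2024-12-05",
        "2024-12-15", "2024-12-20", "2024-11-12",
    ]),
    "store": ["S1", "S2", "S1", "S2", "S3", "S3"],
    "item": ["I1", "I1", "I2", "I1", "I2", "I2"],
    "sales": [10, 8, 14, 9, 20, 15],
})

# Monthly totals: the series behind a time-series plot of monthly sales.
monthly = sales.groupby(sales["date"].dt.to_period("M"))["sales"].sum()

# Store x item totals: the matrix a heatmap would display.
pivot = sales.pivot_table(
    index="store", columns="item", values="sales",
    aggfunc="sum", fill_value=0,
)

# Store ranking by total sales, highest first.
store_totals = sales.groupby("store")["sales"].sum().sort_values(ascending=False)

print(monthly)
print(pivot)
print(store_totals)
```

From here, `monthly.plot()` and `seaborn.heatmap(pivot)` would be one way to render the first two visualizations.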

7. Tools and Technologies Used


Programming Language: Python
IDE: Jupyter Notebook
Libraries: pandas, numpy, matplotlib, seaborn, statsmodels, scikit-learn

Category          Tools
Language          Python
IDE               Jupyter Notebook
Libraries         pandas, numpy, matplotlib, seaborn, statsmodels, scikit-learn
Version Control   GitHub

• Hyperparameter tuning for models.
• Deploy forecasts via an API for real-time use.
• Expand dataset with external factors (e.g., weather, promotions).

8. Team Members and Contributions

Name           Contribution
Kaliappan      Data collection, EDA
Nithishkumar   Route optimization and evaluation
DharaniKumar   Demand forecasting
Ramnath        Visualization and documentation
