A Mini Project Report On: "Big Mart Sales Prediction" by
Department of Information Technology
2020
CERTIFICATE
This is to certify that the project entitled 'Big Mart Sales
Prediction' is being submitted by Kaushal Chande (17IT1012),
Makarand Madhavi (17IT1020), Omkar Chalke (17IT2030),
Dnyanisha Gondhalekar (17IT1037), Siddhi Kale (17IT2020) and
Tanvi Chavan (17IT2025) to the University of Mumbai in partial
fulfilment of the requirements for the award of the degree of 'T.E.
I.T' in "BUSINESS INTELLIGENCE LAB".
Date:
Place:
ACKNOWLEDGEMENT
Declaration
Acknowledgement
Preface
3. DATASET
6. RESULT
8. CONCLUSION
1. PROBLEM STATEMENT
The data scientists at Big Mart have collected sales data for 1559 products
across 10 stores in different cities. Certain attributes of each product and
store have also been defined. The aim is to build a predictive model that
estimates the sales of each product at a particular store, which would help the
sales team plan financial growth and adopt a suitable production policy.
2. PROPOSED SYSTEM
The goal of Big Mart sales prediction is to build a regression model that
predicts the sales of each new product in each of the 10 Big Mart
outlets. The model helps Big Mart understand which properties of products and
stores play an important role in increasing overall sales.
3. DATASET
We have used the Big Mart Sales dataset, which contains twelve attributes. Six
of them describe the product and the remaining six describe the outlet.
The attributes and their descriptions are as follows:
Data Exploration:
In this step we explored the dataset in depth by plotting graphs of
various attributes. This gave us an idea of which attributes are most
useful for predicting the result.
Data Cleaning:
In this step the data is cleaned by removing NULL values.
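As a minimal sketch of this cleaning step, assuming the data is held in a pandas DataFrame (the report does not name its tooling), rows with NULL values can be dropped with `dropna`. The column names and values below are hypothetical placeholders, not the actual dataset.

```python
import pandas as pd

# Hypothetical toy data standing in for the sales dataset; None marks a NULL value.
raw = pd.DataFrame(
    {
        "item_weight": [9.3, None, 17.5, 8.9],
        "outlet_size": ["Medium", "Small", None, "High"],
        "sales": [3735.1, 443.4, 2097.3, 994.7],
    }
)

# Count NULL values per column before cleaning.
null_counts = raw.isnull().sum()

# Remove every row that contains at least one NULL value.
cleaned = raw.dropna().reset_index(drop=True)

print(null_counts)
print(cleaned)
```

Dropping rows discards the other values in those rows as well, so checking `null_counts` first shows how much data the step would remove.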
Data Transformation:
In this step the data is transformed into a consistent form so that it can
be used easily when computing the results.
Train Model:
We used the training dataset to train the model. The technique
used for training is "Linear Regression".
Test Model:
The test dataset is used for testing and for obtaining the result.
After testing, we obtained an accuracy of about 79%.
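The report does not state how the accuracy figure was computed. For a regression model, one common measure is the coefficient of determination (R^2), which the sketch below computes in plain Python. Treating R^2 as the metric used here is an assumption, and the numbers are made up for illustration.

```python
def r2_score(actual, predicted):
    """Return the coefficient of determination R^2 = 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Sum of squared errors of the predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Sum of squared deviations of the actual values from their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


# Illustrative values only, not results from this project.
actual_sales = [100.0, 200.0, 300.0, 400.0]
predicted_sales = [110.0, 190.0, 310.0, 390.0]
score = r2_score(actual_sales, predicted_sales)
print(f"R^2 = {score:.3f}")
```

A score of 1.0 means the predictions match the actual values exactly; a score near 0 means the model does no better than always predicting the mean.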
Fig 5.2 – Heatmap showing correlation between every pair of numeric attributes
Fig 5.3 – Count of outlet size
Fig 5.4 – Impact of location type on sales
Mathematical Operation
Combining Data
Dummy Columns
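The three operations named above are not described further in this report. As one possible reading, the sketch below combines two tables with `pd.concat`, derives a column arithmetically, and creates dummy (one-hot) columns with `pd.get_dummies`. It assumes pandas; the column names, the reference year, and all values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-ins for the train and test tables.
train = pd.DataFrame(
    {
        "outlet_type": ["Grocery", "Supermarket"],
        "year_opened": [1998, 2009],
        "sales": [732.4, 2097.3],
    }
)
test = pd.DataFrame(
    {
        "outlet_type": ["Supermarket", "Grocery"],
        "year_opened": [1987, 2004],
    }
)

# Combining data: stack both tables, tagging each row with its origin
# so they can be separated again later.
train["source"] = "train"
test["source"] = "test"
combined = pd.concat([train, test], ignore_index=True)

# Mathematical operation: derive a numeric feature from an existing column.
REFERENCE_YEAR = 2020  # assumption, chosen only for this example
combined["outlet_age"] = REFERENCE_YEAR - combined["year_opened"]

# Dummy columns: replace the categorical column with one indicator
# column per category.
encoded = pd.get_dummies(combined, columns=["outlet_type"])

print(encoded)
```

Encoding the combined table rather than each table separately keeps the set of dummy columns identical for training and testing rows.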
5.4 Model Making
The training dataset is split into two parts: the independent variables (X_train),
consisting of the item and outlet variables, and the dependent variable (Y_train),
i.e. sales. The linear regression model is trained on these two parts.
The independent variables from the testing dataset are then passed into the model
to predict the sales.
The predicted sales can be analyzed further to make suitable business decisions.
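The model-making steps can be sketched as follows. This is a minimal illustration that fits a linear regression by ordinary least squares with NumPy's `np.linalg.lstsq`; the report does not say which library was used, and the small arrays are made-up stand-ins for the encoded item and outlet variables. The helper `add_intercept` is hypothetical.

```python
import numpy as np

# Hypothetical training data: two numeric features per row.
X_train = np.array(
    [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]]
)
# Synthetic target built as 3*x1 + 2*x2 + 5 so the fit can be checked by eye.
Y_train = 3.0 * X_train[:, 0] + 2.0 * X_train[:, 1] + 5.0


def add_intercept(X):
    """Prepend a column of ones so the model can learn an intercept term."""
    return np.column_stack([np.ones(len(X)), X])


# Fit: solve the least-squares problem for [intercept, w1, w2].
coef, *_ = np.linalg.lstsq(add_intercept(X_train), Y_train, rcond=None)

# Predict: pass the independent variables of the test rows through the model.
X_test = np.array([[6.0, 1.0], [0.0, 0.0]])
predicted_sales = add_intercept(X_test) @ coef

print("coefficients:", coef)
print("predicted sales:", predicted_sales)
```

Because the synthetic target is exactly linear in the features, the fitted coefficients recover the intercept and weights used to build it; real sales data would leave a residual error.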
Big Mart Sales Prediction provides visibility into which products are selling
the most in the market. The production team can identify areas of improvement
for low-selling products. The sales team can identify the geographical areas
that matter most for selling the products, and can adopt a suitable production
policy so that overproduction or shortage is avoided. An increase in the sales
of a product indicates that demand for that product has increased, which helps
the sales team plan supply to meet the increased demand.
CONCLUSION