This document summarizes a project to predict sales at BIG Mart stores using machine learning algorithms. It presents the project agenda, which includes explaining the data science process, datasets used, technologies like NumPy, Pandas and scikit-learn, and the project steps. The steps involve loading and exploring the data, dealing with null values, label encoding categorical variables, splitting data into train and test sets, applying models like linear regression, ridge regression, lasso regression, decision trees and random forests. The results are compared in a dataframe, showing that random forests performed best with high accuracy and low error margins. In conclusion, random forests were identified as the ideal choice for building the sales prediction model.


BIG Mart Sales Prediction
USING MACHINE LEARNING

Presented by Suraj Sharma
BCA(DS)/(27), Satyug Darshan Institute of Engineering & Technology
31 July, 2023
AGENDA

What data science is and how it works
Its types and scope
The motive and goal of our project
The datasets used in this project
The data science technologies used in this project
The project steps
Results
Final conclusion of the project
Process of Data Science

Problem Understanding: the first step of data science is to understand the real problem that we need to solve in the project.

Data Collection: the second step is to collect the data used in the project from various sources.

Data Exploring & Analysing: the third step is to explore the whole dataset, deal with null values and outliers, and make the data more understandable and presentable.

Model Building: the fourth step is to build an appropriate model to make predictions for business decisions.

Model Deployment: the last step is to deploy the model in our business.
ABOUT THIS PROJECT

In this project we are going to predict the sales of the Big Mart using machine learning algorithms.

Here we use different algorithms to compare their accuracy.

Then we compare the different results in a table or a histogram.
TECHNOLOGIES USED IN THIS PROJECT

Here we use various Python libraries to build this project: NumPy, Pandas, scikit-learn and Matplotlib.

NumPy and Pandas are used for the basic data exploration operations.

scikit-learn is used for the machine learning operations, and Matplotlib is used to plot graphs.
THERE ARE SEVERAL STEPS TAKEN IN THIS PROJECT

First, import the required libraries like Pandas and NumPy.

Load the dataset, store it in a variable, and display the data.

Then we check the shape and info of the data.

Then we check for null values in the dataset; this returns Boolean values.
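The steps above can be sketched in Pandas. A small inline sample stands in for the Big Mart CSV here so the snippet is self-contained; the column names follow the real dataset, but the values are made up:

```python
from io import StringIO

import pandas as pd

# Tiny synthetic sample standing in for the Big Mart training CSV
# (column names follow the real dataset; the values are illustrative)
csv = StringIO(
    "Item_Weight,Item_Fat_Content,Outlet_Size,Item_Outlet_Sales\n"
    "9.3,Low Fat,Medium,3735.14\n"
    ",Regular,,443.42\n"
    "17.5,Low Fat,Medium,2097.27\n"
)
df = pd.read_csv(csv)

print(df.shape)            # rows and columns
print(df.dtypes)           # column data types
print(df.isnull().sum())   # isnull() is Boolean; summing counts nulls per column
```

On the real project you would pass the dataset's file path to `pd.read_csv` instead of the `StringIO` buffer.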

There are two variables in this table that have null values: Item_Weight and Outlet_Size.

Replace the null values with the mean value.
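A minimal sketch of the imputation step. The slides propose the mean; since Outlet_Size is categorical and has no mean, the mode is used for it here as an assumption (a common choice):

```python
import pandas as pd

# Illustrative frame with nulls in both columns named on the slide
df = pd.DataFrame({
    "Item_Weight": [9.3, None, 17.5, None],
    "Outlet_Size": ["Medium", None, "Medium", "Small"],
})

# Numeric column: fill nulls with the column mean
df["Item_Weight"] = df["Item_Weight"].fillna(df["Item_Weight"].mean())

# Categorical column: the mean is undefined, so fall back to the mode
df["Outlet_Size"] = df["Outlet_Size"].fillna(df["Outlet_Size"].mode()[0])

print(df.isnull().sum().sum())  # no nulls remain
```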
Then check the types of the variables using the dtypes attribute.

There is some categorical data that needs to be encoded using LabelEncoder.

Here we apply label encoding to the given columns and fit it to the given data.
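A short sketch of the encoding step with scikit-learn's `LabelEncoder`, applied per categorical (object-dtype) column of an illustrative frame:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Illustrative categorical columns from the dataset
df = pd.DataFrame({
    "Item_Fat_Content": ["Low Fat", "Regular", "Low Fat"],
    "Outlet_Size": ["Medium", "Small", "High"],
})

print(df.dtypes)  # object dtype marks the categorical columns

# Fit a LabelEncoder on each categorical column and replace its values
le = LabelEncoder()
for col in df.select_dtypes(include="object").columns:
    df[col] = le.fit_transform(df[col])

print(df)  # categories are now integer codes (assigned alphabetically)
```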
In this dataset the first column cannot be used as a feature, so we need to drop it because it does not help our model.

Then we need to split the data into independent (X) and target (y) variables.

Then we check the shape of the X and y data.

Then we start feature scaling: first we import StandardScaler, then fit the X data to it. It simply scales the data in the X variable.

After applying feature scaling, we apply train_test_split, which splits the data into training and test sets.
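The X/y split, scaling, and train/test split can be sketched as follows. Random numeric data stands in for the encoded dataset, and the 80/20 split ratio is an assumption (the slides do not state one):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic numeric data standing in for the encoded dataset
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 4)),
                  columns=["f1", "f2", "f3", "Item_Outlet_Sales"])

# Independent variables X and target variable y
X = df.drop(columns=["Item_Outlet_Sales"])
y = df["Item_Outlet_Sales"]
print(X.shape, y.shape)

# Feature scaling: standardise X to zero mean and unit variance
X_scaled = StandardScaler().fit_transform(X)

# Split into training and test sets (80/20 here, an assumed ratio)
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=42)

print(X_train.shape, X_test.shape)
```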
After splitting the data into training and test sets, we apply the model to the data.

Import LinearRegression from scikit-learn.

Apply linear regression using model.fit.

Using model.score, we check the score of the model.
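The fit-and-score pattern looks like this. Synthetic data with a known linear relationship replaces the project dataset so the snippet runs on its own:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic linear data (y = 3*x0 - 2*x1 + noise) for illustration
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Fit the model, then check its R^2 score on train and test data
model = LinearRegression()
model.fit(X_train, y_train)
print(model.score(X_train, y_train))
print(model.score(X_test, y_test))
```

`model.score` for regressors returns the coefficient of determination (R^2), which is what the slides refer to as the model's score.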
After applying linear regression, we apply ridge and lasso regression.

This is the ridge regression model.

This is the lasso regression model.
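Ridge and lasso follow the same fit/score interface; they are regularised variants of linear regression (L2 and L1 penalties respectively). The `alpha` values below are illustrative defaults, not the project's settings:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Same kind of synthetic linear data as before
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 2))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Ridge adds an L2 penalty, Lasso an L1 penalty, on the coefficients
ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

print(ridge.score(X, y))
print(lasso.score(X, y))
```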
When applying decision tree regression, the following results occur, and we can see an improvement in the results.

When applying random forest, there is further improvement on both the training data and the test data.
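A sketch of the tree-based models on nonlinear synthetic data. An unconstrained decision tree typically fits the training data perfectly but generalises worse than the random forest ensemble, which averages many trees:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Nonlinear synthetic data for illustration
rng = np.random.default_rng(3)
X = rng.normal(size=(300, 3))
y = X[:, 0] ** 2 + X[:, 1] + rng.normal(scale=0.1, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

tree = DecisionTreeRegressor(random_state=42).fit(X_train, y_train)
forest = RandomForestRegressor(
    n_estimators=100, random_state=42).fit(X_train, y_train)

# Compare scores on both the training data and the test data
print(tree.score(X_train, y_train), tree.score(X_test, y_test))
print(forest.score(X_train, y_train), forest.score(X_test, y_test))
```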
In this project we apply different algorithms to improve our results.

To see the accuracy of all the models, store all the scores in variables, put them in a dictionary, and convert it into a DataFrame.

The margins of error of the different algorithms are compared the same way: create a dictionary and convert it into a DataFrame.
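The comparison table can be built as below. The score numbers here are placeholders to show the shape of the table, not the project's actual results:

```python
import pandas as pd

# Scores collected from each model (placeholder values, not real results)
scores = {
    "Model": ["Linear", "Ridge", "Lasso", "Decision Tree", "Random Forest"],
    "Train Score": [0.50, 0.50, 0.49, 1.00, 0.94],
    "Test Score": [0.51, 0.51, 0.50, 0.18, 0.55],
}

# Convert the dictionary into a DataFrame for side-by-side comparison
results = pd.DataFrame(scores)
print(results.sort_values("Test Score", ascending=False))
```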
Conclusion

We have seen that in this project we applied many algorithms: linear, ridge and lasso regression, as well as decision trees and random forests.

We can see that linear, ridge and lasso regression give fairly low accuracy.

Decision trees are also a good algorithm for building a model, but random forest is the ideal choice for building the model in this project, because it gives the maximum accuracy with a low margin of error.

However, hyperparameter tuning had no noticeable effect on this dataset.


THANK YOU
For Your Attention
