
Palak Mittal, et. al.

International Journal of Engineering Research and Applications

www.ijera.com
ISSN: 2248-9622, Vol. 10, Issue 5, (Series-III) May 2020, pp. 50-54

RESEARCH ARTICLE OPEN ACCESS

Sales Analysis and Prediction Using Python

Palak Mittal*, Sujay**, Simran***, Krishan Kumar****, Pronika Chawla*****
*(Department of CSE, MRIIRS, Faridabad)
**(Department of CSE, MRIIRS, Faridabad)
***(Department of CSE, MRIIRS, Faridabad)
****(Department of CSE, MRIIRS, Faridabad)
*****(Department of CSE, MRIIRS, Faridabad)

ABSTRACT
These days shopping centers and Big Marts keep records of the sales of every item in order to forecast
customer demand and manage inventory. The resulting data warehouses hold a vast amount of customer data
and individual item attributes. Anomalies and recurring patterns are identified by mining the data in the
data warehouse. The results can then be used to forecast future sales for retailers like Big Mart using
machine learning techniques. In this paper, we build predictive models using machine learning algorithms
to predict the sales of a company and compare the models to find out which one performs better.
Keywords: Data Analytics, Machine Learning, Linear Regression, Random Forest, Python
----------------------------------------------------------------------------------------------------------------------------- ----------
Date of Submission: 13-05-2020 Date of Acceptance: 26-05-2020
----------------------------------------------------------------------------------------------------------------------------- ----------

I. INTRODUCTION
As the internet grows rapidly, we have moved from standard data such as text and documents to more
diverse types of data, including large volumes of high-quality audio, images, photographs, interactive
charts, location data and much more. Every second the volume of data grows. Big data is of no use unless
it is used for making decisions.[1]
Today, data analytics is used across various fields for making predictions. One application is in the
government sector, where the analysis of big data has proved very important. Big data analysis proved
instrumental in Barack Obama’s successful 2012 re-election campaign. For the BJP and its allies, big data
processing was primarily responsible for securing a win in a highly competitive election. The Government
of India uses various methods to assess how the population is reacting to political interventions as well
as to policy proposals.
Another area where data analytics is used is social media analytics. The rise of social networking has
caused a large data explosion. Organizations such as IBM have developed numerous tools to evaluate social
network behavior. One such tool is Cognos Consumer Insights, an application that runs on the BigInsights
Big Data Platform. These tools can be used to better understand the mood of people about an activity that
is going on in their region and in the world.[2]
Data can be structured, semi-structured or unstructured. Structured data is organized in rows and
columns, essentially as tables in a database. It requires minimal processing, is the easiest to analyze,
and can be fed directly to a model for finding patterns, learning from the data, and then analyzing and
showing trends. Semi-structured data does not fit a fixed table but carries metadata, that is, data about
data such as tags or keys, that describes its own structure.
Unstructured data has no specific format and is difficult to analyze. It requires a lot of preprocessing
to bring it into a form that can be used for analysis. It is a very complex form of data and comes from
all kinds of nontraditional sources. It can take the form of audio, video, graphs, plots, PowerPoint
presentations, instant messages, and content from collaboration software.
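To make the distinction between the first two categories concrete, the sketch below stores the same
record in a structured form (a CSV row with fixed columns) and in a semi-structured form (a JSON document
whose keys describe the data and may nest). The record, its field names, and its values are hypothetical
and chosen only for illustration.

```python
import csv
import io
import json

# Structured: every row has the same fixed columns.
structured = "product_id,price\nP1,12.5\n"
table = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: keys describe the data, and records may nest or omit fields.
semi_structured = '{"product_id": "P1", "price": 12.5, "tags": ["snack", "low fat"]}'
document = json.loads(semi_structured)

print(table[0]["price"])  # CSV values arrive as text
print(document["tags"])   # JSON keeps the nested list
```

Unstructured data such as audio or free text has no comparable keys or columns, which is why it needs
far more preprocessing before analysis.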

www.ijera.com DOI: 10.9790/9622-1005035054 50 | P a g e


Fig 1: structured and unstructured data

In the data era, large quantities of data have become available to decision makers. Big data refers to
datasets that are not only big but also high in variety and velocity, which makes them challenging to
handle using conventional tools and techniques. Because such data is growing so rapidly, solutions need
to be studied and provided in order to handle these datasets and extract value and knowledge from them.
Furthermore, decision makers need to be able to obtain valuable insights from such varied and rapidly
changing data. Such value can be provided by big data analytics, which is the application of advanced
analytics techniques to big data. There are a number of tools that can be used for storing and analyzing
data.
Some of the popular tools for storing data are as follows:
• Apache Hadoop: A Java-based framework that can store enormous amounts of data in a cluster. It runs in
parallel on the cluster and allows users to process data across all nodes. It replicates data, which
results in high availability.
• Hive: A distributed data management system for Hadoop. It can be used for data mining because it
supports SQL-like queries (HiveQL) for accessing big data.
• Apache Cassandra: A scalable, high-performance distributed NoSQL database designed to handle large
amounts of data. A NoSQL database can store and retrieve data that is not modeled as tabular relations.
The qualities of this database are that it is schema free, has a simple API, is eventually consistent,
supports easy replication, and can handle large amounts of data. [1]

Some of the popular tools for analyzing data are as follows:
• RapidMiner: RapidMiner can integrate many data source types, including Microsoft SQL, Sybase, IBM SPSS,
Excel, Oracle, MySQL, Access, Teradata, IBM DB2, Ingres, and dBase. The tool is very effective and can
generate analytics based on real-life data transformation settings.
• Tableau Public: An intuitive and simple tool that offers insights through data visualization. One can
investigate a hypothesis, explore the data, and cross-check insights.
• Jupyter Notebook: An accessible tool for performing end-to-end data science workflows: data cleaning,
statistical modeling, building and training machine learning models, and visualizing data. [3]

Among the many fields where data analytics can be used to make predictions, and thereby gain insights for
decision making, one is sales. We have used big data analytics to analyze and predict the sales of a
product using models such as linear regression and decision trees. We compare the two models to
understand which of them gives the better results. The language used for implementation is Python and
the platform is Jupyter Notebook.

II. DATA SET
A collection of data is termed a dataset. For tabular data, a dataset corresponds to one or more database
tables. Each row of the table is a record of the data set, whereas each column represents a particular
variable. The data set lists the values of every variable for all of its members. Each value is termed a
datum. A dataset may also consist of a large number of files and documents.
A dataset is characterized by the attributes and variables it contains, their number and types, and the
statistical measures that apply to it. A number of popular built-in datasets ship with the Python
libraries used for analysis. A few examples of such built-in datasets are:
• Iris flower dataset: A multivariate dataset introduced by Ronald Fisher in 1936.
• MNIST database: It consists of images of handwritten digits and is used for image classification,
clustering, and image processing. [4]
The dataset that we have used is the sales dataset acquired from Kaggle. It contains two CSV files,
train and test. The aim is to predict the sales of a product using the test data set.
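As a minimal sketch of how such a CSV file could be read in Python, the example below parses a few rows
with the standard-library csv module. The column names are a subset of the field names this section
gives for the dataset; the three sample rows, the helper name load_rows, and the choice to treat an empty
Item_Weight cell as missing are illustrative assumptions, not values or code from the actual Kaggle
files.

```python
import csv
import io

# Hypothetical sample standing in for the train CSV file; the values are
# placeholders invented for illustration, not rows from the Kaggle dataset.
SAMPLE_CSV = """Item_Identifier,Item_Weight,Item_MRP,Outlet_Identifier,Item_Outlet_Sales
ITEM_A,9.5,250.0,OUT_1,3700.0
ITEM_B,,48.0,OUT_2,440.0
ITEM_C,17.5,141.5,OUT_1,2100.0
"""

NUMERIC_FIELDS = ("Item_Weight", "Item_MRP", "Item_Outlet_Sales")


def load_rows(handle):
    """Read CSV rows as dicts, converting numeric fields to float or None."""
    rows = []
    for row in csv.DictReader(handle):
        for field in NUMERIC_FIELDS:
            text = row[field].strip()
            row[field] = float(text) if text else None  # empty cell -> missing
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
missing_weight = [r["Item_Identifier"] for r in rows if r["Item_Weight"] is None]
print(len(rows), "rows; missing Item_Weight:", missing_weight)
```

With a real file, the same function could be given an open file handle instead of the in-memory sample.
In a notebook, pandas.read_csv would typically do the same job in a single call.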


The dataset consists of 11 input fields, namely Item_Identifier, Item_Weight, Item_Fat_Content,
Item_Visibility, Item_Type, Item_MRP, Outlet_Identifier, Outlet_Establishment_Year, Outlet_Size,
Outlet_Location_Type, and Outlet_Type, plus the outcome field Item_Outlet_Sales. The fields are described
as follows:
• Item_Identifier: The unique product ID of the item. It is an ID variable.
• Item_Weight: The weight of the product. It is not considered in the hypothesis.
• Item_Fat_Content: Whether the product is low fat or not. Low-fat items are preferred over other items.
This field is linked to the ‘Utility’ hypothesis.
• Item_Visibility: The percentage of the total display area of all products that is allocated to this
particular product. It is used for the ‘display area’ hypothesis.
• Item_Type: The category of the product. This field can be used to derive more knowledge about utility.
• Item_MRP: The MRP of the product. This field is not important for the analysis and hence is not
considered for the hypothesis.
• Outlet_Identifier: The unique store ID. It is an ID variable.
• Outlet_Establishment_Year: The year in which the store was established. It is not considered in the
hypothesis.
• Outlet_Size: The ground area that the store covers. This field is linked to the ‘store capacity’
hypothesis.
• Outlet_Location_Type: The type of city in which the store is located. This field is linked to the
‘city type’ hypothesis.
• Outlet_Type: Whether the store is a supermarket or a small store. This field is also connected to the
‘store capacity’ hypothesis.
• Item_Outlet_Sales: The outcome variable that is being predicted. It gives the sales of the product in
a particular store.[5]

III. PYTHON FOR DATA ANALYTICS
Python is a high-level, interpreted programming language with simple syntax and semantics. It takes less
effort to create applications in Python, which makes the job easier to perform. Many other programming
languages are harder to learn. Python has emerged as one of programmers' favorite languages and is widely
used for developing applications as well as for performing data analytics.

3.1 Features of Python
Python achieves good productivity with less code. However, it is not as fast as some other programming
languages. The features of this language are:
• High-level: Its syntax resembles the natural language that people use for communication, so it is easy
to understand what task the code is performing.
• Interpreted: The code is executed line by line, which makes debugging easy and efficient. It also makes
Python slower than compiled languages.
• Easy syntax: Indentation is used instead of braces to determine which code block belongs to a class or
function. This makes the code easy to read.
• Dynamic semantics: Variables do not need to be declared with a type before use. Types are determined
automatically at run time.
• Portable: Code runs on different systems without changes. This makes it easy to work on a task.
• Open Source: It is free and can be used and modified by anyone as per their preference.
• Object-Oriented Language: It helps model real-world scenarios and supports encapsulation, which helps in
building a well-made application.
• Simplicity: Once indentation is understood, an application can be written in fewer lines of code.
• Embedding Properties: It is powerful and versatile and can be combined with code written in other
languages like C.
• Library Support: It has a wide range of libraries that make obtaining solutions easy and fast.

3.2 Usage of Python
• Developing web applications with frameworks like Django and Flask.
• Creating workflows for software.
• Modifying files and data in databases.
• Complex scientific and analytic calculations.

3.3 History of Python
Python was conceived in the late 1980s by Guido van Rossum and was first released in 1991. The main
emphasis of the language is code readability, supported by its use of significant whitespace. It is a
multi-paradigm language that supports functional, imperative, object-oriented, structured, and reflective
programming.
There are about 8 different implementations of Python, namely CPython, PyPy, Stackless Python,
MicroPython, CircuitPython, IronPython, Jython, and RustPython. Python was influenced by a number of
other languages, namely ABC, Ada, ALGOL 68, APL, C, C++, CLU, Dylan, Haskell, Icon, Java, Lisp, Modula-3,
Perl, and Standard ML. Languages whose development was influenced by Python include Apache Groovy, Boo,
Cobra, CoffeeScript, D, F#, Genie, Go, JavaScript, Julia, Nim, Ring, Ruby, and Swift.

3.4 Scope of Python
There are a number of applications for Python, which are as follows:
• Web and Internet development: Python has a vast collection of libraries and packages for internet
protocols that make developing web applications easier. A few of the libraries cover IMAP, FTP, and image
processing. A few of the packages are Feedparser, Beautiful Soup, and Requests. Frameworks such as Django
and Flask are also available.
• Desktop GUI: One can build a user interface using Tk, a GUI library that ships with the binary
distributions of Python.
• Scientific and Numeric Applications: Python is a powerful programming language, and scientific and
numeric computing is one of its most popular applications. A number of libraries support these tasks,
such as NumPy, pandas, and SciPy.
• Software Development Application: Software developers can use Python as a support language for testing,
build control and management. A few examples are SCons, Buildbot, and Apache Gump.

IV. PROPOSED SYSTEM
The method to solve the problem at hand is given below. The unprocessed data at the Big Mart is
collected. This raw data has to be pre-processed to handle missing data, outliers and anomalies. We then
train two machine learning algorithms, linear regression and random forest, on the collected data to
obtain a model. This model helps us to predict the final outcome.
ETL stands for Extract, Transform and Load, and an ETL tool combines all three functions. It is fed data
from a particular database and transforms the input data into a suitable format. The raw data is
transformed into an understandable format using the data mining technique known as data preprocessing.
Data preprocessing is a very important step because data collected from real sources may be incomplete
or inconsistent.

Fig 2: block diagram of the system

4.1 Linear Regression
It finds the relationship between the dependent variable (Y) and one or more independent variables (X)
using a straight line, the best-fit line, which is also termed the regression line. The equation
representing this line is:
Y = a + b*X + e
In the above equation:
a is the intercept,
b is the slope of the line,
e is the error term.
The accuracy of the predictions can be measured from the fitted line. Although this model is very popular
for analysis, its disadvantage is that it can give less accurate results than more flexible models.[6]

Fig 3: linear regression
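The regression line Y = a + b*X + e can be fitted by ordinary least squares. The sketch below is a
dependency-free illustration for a single independent variable. The function names fit_line and rmse and
the toy data are assumptions made for this example; the paper does not state which library or fitting
routine was used.

```python
from math import sqrt


def fit_line(xs, ys):
    """Return (a, b) minimizing the squared error of Y = a + b*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of X and Y divided by the variance of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x  # the fitted line passes through the two means
    return a, b


def rmse(xs, ys, a, b):
    """Root mean squared error of the fitted line over the given points."""
    squared = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(squared / len(xs))


# Toy data invented for illustration (not taken from the sales dataset).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # lies exactly on Y = 1 + 2*X
a, b = fit_line(xs, ys)
print(a, b, rmse(xs, ys, a, b))
```

A lower root mean squared error on held-out data indicates a better fit, which is one way the accuracy
of two models could be compared.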


4.2 Random Forest
Random forests are also known as random decision forests. A random forest is an ensemble machine learning
algorithm that can be used for tasks such as classification and regression. It builds multiple decision
trees during training and outputs the mode of the classes predicted by the individual trees
(classification) or the mean of their predictions (regression). It is used to overcome a disadvantage of
individual decision trees, namely overfitting.[6]

Fig 4: Random Forest

V. CONCLUSION
We propose a software tool for predicting future sales based on historical data. With this tool, one can
determine how precise the predictions of the linear regression and random forest machine learning
algorithms are.

ACKNOWLEDGEMENT
The successful realization of the project is the outcome of a consolidated effort by people from
different fronts. We are thankful to Dr. Krishan Kumar for his valuable advice and support, without which
we would not have been able to complete the project successfully.
We are thankful to Ms. Pronika Chawla for her guidance and support.
Words cannot express our gratitude to all those who helped us directly or indirectly in our endeavour.
We take this opportunity to express our sincere thanks to everyone for their valuable suggestions and
also to our family and friends for their support.

REFERENCES
[1]. Palak Mittal, Mansi Sharma, Dr. Prateek Jain, A Detailed Study of Security and Privacy Concerns in
Big Data, International Journal of Applied Engineering Research, ISSN 0973-4562, Volume 13, Number 10
(2018), pp. 7406-7411
[2]. https://www.digitalvidya.com/blog/big-data-applications/
[3]. https://www.analyticsvidhya.com/blog/2018/05/starters-guide-jupyter-notebook/
[4]. https://en.wikipedia.org/wiki/Data_set
[5]. https://medium.com/@nr3702/bigmart-sales-data-regression-using-python-57a5155767d7
[6]. Heramb Kadam, Rahul Shevade, Prof. Deven Ketkar, Mr. Sufiyan Rajguru, A Forecast for Big Mart Sales
Based on Random Forests and Multiple Linear Regression, BE IT, FAMT, Ratnagiri, Assistant Professor, IT
department, FAMT, IJEDR 2018, Volume 6, Issue 4, ISSN: 2321-9939
