Second Hand Car Price Prediction
Submitted by:
HINAL SETH
ACKNOWLEDGMENT
I would like to thank Flip Robo Technologies, for giving me this golden
opportunity to work on this valuable project. I got to learn a lot from this project
about Data Scraping, Data Wrangling, and practical implementations of using
machine learning modules.
I take this opportunity to express my profound gratitude and deep regards to
my mentor Ms. Swati Mahaseth for her exemplary guidance, monitoring and
constant encouragement throughout the course of this assignment. The
blessing, help and guidance given by her from time to time shall carry me a long way
in the journey of life on which I am about to embark.
Lastly, I thank the Almighty, my parents, brother, sister and friends for their constant
encouragement, without which this assignment would not have been possible.
References:
1. https://www.olx.in/
2. https://www.cardekho.com/
3. https://www.cars24.com/
4. https://autoportal.com/
INTRODUCTION
There is no universal tool for establishing the retail price of used
cars, because different websites use different methods to estimate it.
By using statistical models to predict the price, it is possible to
obtain a basic price estimate without entering all of the details into
the desired website. The main purpose of this study is to compare the
accuracy of two different prediction models for estimating a used car's
retail price. Accordingly, we propose a machine-learning-based
methodology for predicting the prices of second-hand cars from their
attributes.
With the impact of COVID-19 on the market, we have seen many changes
in the car market. Some cars are now in high demand, which makes them
expensive, while others are not in demand and are therefore cheaper.
With this change in the market due to COVID-19, individuals and sellers
are facing problems with their earlier car price valuation machine
learning models. They are therefore looking for new machine learning
models built from new data. Here we build a new car price valuation
model.
The main aim of this project is to create a dataset with the help of
web scraping and to predict the price of a used car based on various
features.
Being able to predict the market value of used cars can help both
buyers and sellers. Used car dealers are one of the largest target
groups that may be interested in the results of this study. If used car
dealers better understand what makes a car desirable and which features
matter most for a used car, they can take this knowledge into account
and offer a better service.
• Review of Literature
With the recent arrival of internet portals, buyers and sellers can get
a better picture of the factors that determine the market price of a
used automobile. Lasso Regression, Multiple Regression, and Regression
Trees are examples of machine learning algorithms that can be applied to
this problem. We will try to develop a statistical model that can
forecast the value of a pre-owned automobile based on prior customer
details and different parameters of the vehicle. This project aims to
compare the efficiency of different models' predictions to find the
appropriate one. Several previous studies have been conducted on the
subject of used automobile price prediction.
We carried out a background survey of the basic ideas behind our project
and used it to gather information on the technology stack, the
algorithms, and the shortcomings of our project, which helped us build a
better project.
OLX
The OLX marketplace is a platform where users can buy and sell goods and
services, including electronics, clothing, furniture, household goods,
vehicles, and motorcycles. According to reports, the network had 11
billion page views in 2014, 200 million monthly active users, 25 million
listings, and 8.5 million monthly transactions.
Car Dekho
CarDekho is one of India's most popular automobile websites for new and
second-hand car research. It offers accurate on-road prices for cars,
as well as genuine user and expert reviews. Users can also use the
car comparison tool to compare different cars. The service also
allows users to connect with local vehicle dealers to find the best deals.
AutoPortal
AutoPortal is a digital platform that allows users to research new cars
in India by looking at prices, specifications, images, mileage, reviews,
and comparisons. Users can also sell used cars to verified customers.
A seller can list a used automobile with images, model, year of
purchase, and kilometres driven so that it can be seen by thousands of
potential customers in their city. User reviews and professional
automobile reviews with photographs are available to assist in making a
new car purchase decision.
Cars24
Cars24 is a website where used car sellers can list their vehicles for sale.
It is an Indian start-up with a simple user interface that asks sellers for
information such as automobile model, mileage, registration year, and
fuel type (petrol, diesel). These inputs enable the online model to estimate
the price by running particular algorithms on the provided parameters.
1. NumPy:
NumPy is a Python module for array processing. It provides a
high-performance multidimensional array object as well as utilities
for manipulating such arrays. It is the fundamental Python module
for scientific computing. Besides its obvious scientific uses, NumPy
can also be used as a multi-dimensional container of generic data.
NumPy allows arbitrary data types to be defined, which lets it
integrate cleanly and quickly with a broad range of databases.
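As a small illustration of the array object and user-defined data types described above, the sketch below stores a few car listings in a NumPy structured array. The field names and the listing values are made up for this example and are not taken from the scraped dataset.

```python
import numpy as np

# Hypothetical record layout: model year, kilometres driven, price in INR.
listing_dtype = np.dtype([("year", "i4"), ("km", "f8"), ("price", "f8")])

# Made-up listings, for illustration only.
listings = np.array(
    [(2015, 52000.0, 410000.0),
     (2018, 30000.0, 650000.0),
     (2012, 90000.0, 250000.0)],
    dtype=listing_dtype,
)

# Each field behaves like an ordinary NumPy array.
mean_price = listings["price"].mean()
newest = listings[listings["year"].argmax()]

print(mean_price)
print(newest["year"], newest["price"])
```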
2. SciPy:
SciPy is a free and open-source Python library for scientific and
technical computing. SciPy modules cover optimization, linear algebra,
integration, interpolation, special functions, FFT, signal and image
processing, ODE solvers, and other tasks common in science and
engineering.
SciPy builds on the NumPy array object and is part of the NumPy
stack, which also includes Matplotlib, pandas, and SymPy, as well as a
growing number of scientific computing libraries. The NumPy stack has
users similar to those of other applications such as MATLAB, GNU
Octave, and Scilab.
The NumPy stack is sometimes also referred to as the SciPy stack.
The SciPy library is distributed under the BSD licence, and its
development is sponsored and supported by an open community of
developers.
3. Scikit-Learn
Scikit-learn offers a consistent Python interface to a variety of
supervised and unsupervised learning techniques. It is licensed under
a permissive simplified BSD licence, which encourages academic and
commercial use, and it is distributed with several Linux
distributions. The library is built on NumPy and SciPy.
4. Jupyter Notebook
Jupyter Notebook is an open-source web application that lets you
create and share documents with live code, equations, visualisations,
and narrative text. Its uses include data cleaning and transformation,
numerical simulation, statistical modelling, data visualisation,
machine learning, and more.
Model/s Development and Evaluation
2. Linear Regression
Regression is a method for predicting a dependent variable
with the help of independent variables.
The method is commonly used for prediction and for estimating
the relationship between independent and dependent variables. The
regression model establishes a linear or exponential relationship
between independent and dependent variables.
Linear regression is a type of regression analysis in which the
independent (x) and dependent (y) variables are assumed to have
a linear relationship. The red line in the graph above is known
as the best-fit straight line. Given the data points we have, we
want to draw the line that best predicts them. The line can be
represented using the linear equation below, where a0 is the
intercept and a1 is the slope.
y = a0 + a1 * x (linear equation)
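To make the best-fit line concrete, here is a minimal sketch that estimates a0 and a1 with the ordinary least-squares formulas for a single feature. The function name `fit_line` and the toy age/price values are assumptions made for this example; the project itself may have used a library implementation instead.

```python
def fit_line(xs, ys):
    """Return (a0, a1) of the least-squares line y = a0 + a1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a0 = mean_y - a1 * mean_x
    return a0, a1


# Toy data (hypothetical): car age in years vs. price in lakh INR.
ages = [1, 2, 3, 4, 5]
prices = [9.0, 8.0, 7.0, 6.0, 5.0]

a0, a1 = fit_line(ages, prices)
predicted_price = a0 + a1 * 6  # prediction for a six-year-old car
print(a0, a1, predicted_price)
```

Because the toy data lie exactly on a straight line, the fitted line reproduces them exactly; real listings would scatter around the line.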
3. SGD Regressor
SGD stands for Stochastic Gradient Descent. The gradient of the
loss is estimated one sample at a time, and the model is
updated along the way with a decreasing strength schedule (also
known as the learning rate).
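The per-sample update and the decreasing learning rate can be sketched as follows for a one-feature linear model with squared loss. This is an illustrative sketch, not the estimator used in the project: the function name `sgd_fit`, the decay rule `eta0 / t ** power_t`, the hyperparameter values, and the toy data are all assumptions chosen for the example.

```python
import random


def sgd_fit(xs, ys, eta0=0.5, power_t=0.25, epochs=500, seed=0):
    """Fit y = a0 + a1 * x by stochastic gradient descent on squared loss."""
    rng = random.Random(seed)
    a0, a1 = 0.0, 0.0
    t = 1  # counts individual updates, drives the learning-rate decay
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a random order each epoch
        for i in order:
            eta = eta0 / (t ** power_t)  # decreasing step size (assumed schedule)
            error = (a0 + a1 * xs[i]) - ys[i]
            # Gradient of 0.5 * error**2 with respect to a0 and a1,
            # computed from this single sample only.
            a0 -= eta * error
            a1 -= eta * error * xs[i]
            t += 1
    return a0, a1


# Toy, noise-free data generated from y = 1 + 2x (hypothetical).
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [1.0 + 2.0 * x for x in xs]

a0, a1 = sgd_fit(xs, ys)
print(a0, a1)
```

The features here are already on a small scale; with raw values such as kilometres driven, the inputs would normally need scaling before a step size like this is stable.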
4. KNeighbors Regressor
A simple implementation of KNN regression calculates the average
of the numerical target of the K nearest neighbours. Another
approach uses an inverse-distance-weighted average of the K
nearest neighbours. KNN regression uses the same distance
functions as KNN classification.
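Both variants described above, the plain average and the inverse-distance-weighted average, can be sketched in a few lines. The helper name `knn_predict`, the choice of Euclidean distance, and the toy data are assumptions for this illustration.

```python
import math


def knn_predict(train_x, train_y, query, k=3, weighted=False):
    """Predict a numeric target from the k nearest training points."""
    # Pair each training target with its Euclidean distance to the query.
    dists = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    nearest = dists[:k]
    if not weighted:
        # Plain average of the neighbours' targets.
        return sum(y for _, y in nearest) / len(nearest)
    # An exact match would give an infinite weight, so return its target.
    for d, y in nearest:
        if d == 0:
            return y
    # Inverse-distance weighting: closer neighbours count for more.
    weights = [1.0 / d for d, _ in nearest]
    total = sum(w * y for w, (_, y) in zip(weights, nearest))
    return total / sum(weights)


# Toy data (hypothetical): one feature per point, numeric target.
train_x = [(1.0,), (2.0,), (3.0,), (10.0,)]
train_y = [10.0, 20.0, 30.0, 100.0]

uniform_pred = knn_predict(train_x, train_y, (2.25,), k=2)
weighted_pred = knn_predict(train_x, train_y, (2.25,), k=2, weighted=True)
print(uniform_pred, weighted_pred)
```

For the query 2.25, the two nearest points are 2.0 and 3.0; the weighted prediction is pulled towards the target of the closer point.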
5. Decision Tree Regressor
Decision tree learning uses a decision tree as a predictive
model to go from observations about an item (represented in the
branches) to conclusions about the item's target value
(represented in the leaves). It is one of the predictive
modelling approaches used in statistics, data mining, and
machine learning. Tree models in which the target variable can
take a discrete set of values are called classification trees; in
these tree structures, leaves represent class labels and branches
represent combinations of features that lead to those class
labels. Decision trees in which the target variable can take
continuous values (usually real numbers) are called regression
trees. The objective is to build a model that predicts the value
of a target variable from a set of input variables.
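The core step of a regression tree can be illustrated with a single split (a "stump"): choose the threshold that minimises the squared error of the two resulting groups, and let each leaf predict the mean target of its group. The sketch below is a simplified illustration under that assumption, with hypothetical function names and toy data; a full regression tree applies the same step recursively to each branch.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def fit_stump(xs, ys):
    """Return (threshold, left_mean, right_mean) for the best single split."""
    best = None
    points = sorted(set(xs))
    # Candidate thresholds: midpoints between consecutive distinct x values.
    for lo, hi in zip(points, points[1:]):
        thr = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, thr, sum(left) / len(left), sum(right) / len(right))
    _, thr, left_mean, right_mean = best
    return thr, left_mean, right_mean


def predict(stump, x):
    """Follow the branch for x and return the leaf's mean target."""
    thr, left_mean, right_mean = stump
    return left_mean if x <= thr else right_mean


# Toy data (hypothetical): car age in years vs. price in lakh INR.
ages = [1, 2, 3, 8, 9, 10]
prices = [9.0, 8.5, 9.5, 3.0, 2.5, 3.5]

stump = fit_stump(ages, prices)
print(stump, predict(stump, 2), predict(stump, 9))
```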