Final JournalPaperForCarPricePrediction Python

This document discusses predicting car resale value using machine learning algorithms. It proposes using SVM and KNN algorithms to predict future daily sales quantities of vehicles based on sales data. The goal is to help car sales managers understand which vehicle attributes most affect sales and how sales can be increased. Accurately predicting vehicle sales can help with inventory management and ensuring profitability.

Uploaded by

K Siva Deepak
Copyright
© All Rights Reserved

CAR RESALE VALUE PREDICTION USING MACHINE LEARNING ALGORITHM

L. Dharshini1, S. Divyadharshini2, M. Karthika3, S. Keerthana4

Dr. M. P. Thiruvenkatasuresh5, Associate Professor

B.Tech-IT, Erode Sengunthar Engineering College, Erode, Tamilnadu

Abstract. Demand forecasting is a key aspect of successfully managing showrooms. In particular, accurately predicting future vehicle sales allows for precise ordering of cars. This keeps unsold inventory low, which is critical to the profitability of the showroom. Hence, this paper is concerned with predicting future values of the daily sold quantities of given cars. This project proposes a forecasting approach that is based solely on data retrieved from sales and allows for a straightforward human interpretation. It therefore proposes two generalized models for predicting future sales. In an extensive evaluation, data sets consisting of car price data are used. The main motivation of this project is to present a sales prediction model for car prices. Further, this research work aims to identify the best classification algorithm for sales analysis. In this work, the data mining classification algorithm called Naïve Bayes is used to develop a prediction system to analyze and predict the sales volume. In addition, various grouping and chart preparation steps are included in the proposed system for better classification results. The project is designed using Python 3.7.

Key words: K-NN, Naïve Bayes, SVM, Barplot

I. INTRODUCTION

Machine learning is a category of algorithms that allows software applications to become accurate in predicting outcomes without being explicitly programmed. These models are applied in numerous areas and trained to suit the expectations of management so that accurate steps are taken to accomplish the organization's target.

In today's modern world, huge showrooms and outlets record data related to sales of vehicles, along with their various dependent or independent factors, as an important step toward predicting future demand and managing inventory. The dataset built with different dependent as well as independent variables is a composite of item features, data gathered from users, and data related to inventory management in the data warehouse. The data is thereafter refined to obtain accurate predictions and to gather new and interesting results that shed new light on the knowledge with respect to the task's data. Demand forecasting is one of the key aspects of successfully managing showrooms.

In particular, properly predicting future sales of vehicles allows for precise ordering of cars. This will ensure the profitability of the showroom. Hence, this project is interested in predicting values of the daily sold quantities of given vehicles. SVM and K-Nearest Neighbors (KNN) learning algorithms are used, which are based on instances and the knowledge gained through them.

"To find out what role certain properties of an item play and how they affect their sales by understanding car sales." In order to help car sales achieve this goal, a predictive model can be built to find out, for every showroom, the key factors that can increase its sales and what changes could be made to the product's or showroom's characteristics. If these models are applied in different areas and trained to match the expectations of management, accurate steps can be taken to achieve the organization's target.

Hence, in the case of car sales, an approach should be discussed for predicting the sales of different types of vehicles and for understanding the effects of different factors on vehicle sales. By taking various aspects of a dataset collected for car sales and following the methodology for building a predictive model, results with high levels of accuracy are generated, and these observations can be employed to make decisions that improve sales.

II. RELATED WORK

With the fast development of the Internet and related technologies, users increasingly shop for products online. Online shopping plays a major role in daily life due to its high convenience, low cost, ease of use, and other such advantages. Consequently, many retail websites, like OLX and eBay, are available in the online market. In particular, recent decades have seen rapid growth in second-hand utilization/consumption across many global markets as a result of the booming collection of unwanted and used products. Pricing is not only a science but also an art, which requires statistical/experimental formulas for creating a profile for a brand and product in the market.

Fathalla et al. [1] identified pricing as one of the main challenges faced by sellers. The automotive industry is now considered one of the backbones of today's economy, and it is called the "industry of industries" in developed as well as developing countries. Lower inventory and longer vehicle retention, in ownership and production respectively, led to lower prices in the used-car sector in the second half of 2021, according to the Vehicle Remarketing Association (VRA). The used-car sector in the Republic of Croatia was at its peak of sales in 2020. However, the general economic picture is still deteriorating and is degrading quickly during the second half of 2022. This will have an impact in all areas, but most of all on consumer confidence, with the cost-of-living crisis becoming acute [2]. This will have a direct impact on the market, since many private car owners would keep their existing vehicles for longer as their personal finances are affected, which is likely to cause demand to soften. This decrease in demand will also be accompanied by a further decline in vehicle supply, driven by a new-vehicle market that is not really improving in terms of the number of units available: although the semiconductor situation is starting to ease, new factors, like the situation in Ukraine, have emerged.

Manufacturers that have been able to deliver new cars in large quantities over the past two to three years have often differentiated themselves from the traditionally dominant players in the market. The mix of models and brands on some websites is noticeably different than it was before the COVID-19 pandemic. A recent study in the United Kingdom showed that from April to May 2022, used-car prices decreased by 1.41% and are now 0.11 percentage points lower than at the beginning of January 2022. The average used-car value fell from a high of GBP 12000 in September 2021 to GBP 8552 in April 2022. Values leveled off over the first four months of 2022, with lead prices dropping from 99% in January to 94.8% in April [2].

According to the data of the Croatian Vehicle Center, it is evident that sales of used vehicles exceeded the number of new vehicles sold after 2015. Likewise, after 2020, a total decline in new car sales was recorded [3]. The majority of sales contributing to private car ownership fall on the used-car sector, due to affordability/economy. To accurately predict the prices of used cars in the future, experts are needed when making decisions, because the price of a vehicle depends on numerous factors and features in the market.

In [4] the authors used a linear regression model to predict prices of new and second-hand vehicles, for which the data set is in tabular form. Yang et al. [5] proposed a model to predict vehicle prices based on product images only, using a custom CNN (convolutional neural network) architecture.

In [6, 7] the authors used sentiment analysis and machine learning to predict stock prices. Kalaiselvi et al. [8] developed pricing analytics for smartphones using a multilayer feed-forward neural network. Ahmed et al. [9] used a data set with tabular data and images to solve house price prediction using support vector regression (SVR) and neural network (NN) models. In addition, the authors in [10] presented an approach for identifying the segment of recreational trips made within a bike-sharing system based on popular clustering algorithms. The authors developed subroutines to clean the raw data obtained from GPS trackers. Using the cleaned data on numeric parameters of the trips in the bicycle-sharing system, the clustering model identifies a cluster that represents recreational trips. The use of the proposed approach is demonstrated on data obtained from the bike-sharing system in the city of Krakow, Poland.

More recently, in [11] the authors exploited a unique feature of bike-sharing systems: stopovers, which are short, non-traffic-related stops made by cyclists during trips. The price prediction of second-hand items has not been widely addressed. Only a few studies have addressed price prediction of used products in a specific domain, specifically, price prediction of second-hand cars [12]. Furthermore, Chen et al. [13] conducted an empirical investigation and compared two techniques, namely i) linear regression and ii) random forest.

The comparison showed that the latter is the better algorithm for dealing with complex models with a large number of variables and data, but it lacks a clear benefit when dealing with simple models with fewer variables. The mean error on sample data fluctuates around 0.3. It is seen that existing used-car price prediction methods are not ideal, and an efficient, reasonable, scientific and accurate method is needed.

Fuzzy logic systems (FLS), artificial neural networks (ANN) and evolutionary algorithms (EA) are the most rapidly emerging fields in intelligent computing, and they can be used to address a variety of prediction and optimization challenges [14–16]. A back-propagation neural network (BPNN) is an ANN that does not depend on any empirical formula and automatically derives rules from existing data to capture intricate patterns in the data, which makes it suitable for building multi-factor non-linear forecasting models, like those for used vehicles. Wu et al. [17] compared the BPNN for used-car price prediction with a proposed ANFIS (adaptive neuro-fuzzy inference system). The results showed that when 3 feature variables are input, the prediction accuracy of the BPNN is lower than that of the ANFIS.

Zhou [18] introduced a BPNN for establishing an evaluation model, reducing subjectivity and randomness in the valuation process. It showed that the price evaluation predicted by the BPNN is close to the actual value, with a maximum error of 3.05%, which indicates the reliability and applicability of that model. In order to regularize the evaluation standards of used-car prices and enhance the accuracy of used-car price prediction, the linear correlation between vehicle conditions, parameters and transaction factors and the used-car price was investigated comprehensively, and grey relational analysis was applied by [19] to filter the feature variables of factors affecting used-car prices; furthermore, the traditional BP neural network was also optimized by combining it with the particle swarm optimization algorithm. To the best of the authors' knowledge, state-of-the-art methods include limited work on predicting prices of second-hand products based on machine learning methods.

In addition, a method for predicting second-hand product prices using statistical approaches and time series models has not yet been established. Machine learning based methods address only a certain item, while no effort is being made to develop a generic model that predicts prices for a set of different vehicle types. Furthermore, most of the available second-hand price prediction methods use textual attributes of products and do not focus on visual features and the condition of the item. However, price prediction models of second-hand products rely on item images in addition to the textual data [1].

III. PROPOSED METHODOLOGY

In the current working system, a car dataset containing attributes (engine type, cylinder number, price, etc.) is taken and two algorithms are carried out for classification/prediction purposes. The algorithm called Naïve Bayes is used. 75% of the whole data set is taken as training data and the model is built. The remaining 25% of the data is then taken as test data and checked against the trained model.

The existing system has the following drawbacks.

• The Naïve Bayes classification yields conditional probability values only for the existing given dataset. New test data is added for classification.
• Naïve Bayes classification is not preferred when there is more outlier data.
• Chart preparation is not carried out.
• SVM is not applied, so the outlier data cannot be predicted well.

All the existing system approaches are carried out in the proposed system. In addition, along with Naïve Bayes based classification, various grouping operations are used to build the model, as this helps in various ways. It is found to be suitable especially if the data set has a large number of records or contains outlier data. A wide variety of sales records can be taken for all engine types and cylinder counts for classification purposes, and a new model can be predicted while at the same time increasing efficiency. SVM and KNN classification is also carried out.

The proposed system has the following advantages.

• Chart preparation is carried out.
• Grouping of records for various columns is prepared and displayed.
• Engine-type-wise sales are found.
• Cylinder-count-wise sales are found.
• SVM/KNN generally handles a larger number of instances, which lets its randomization concept work well and generalize to novel data.
• SVM/KNN could be preferred when the data set grows larger.
• SVM/KNN could be preferred when there is more outlier data.

Using a bar plot, the car price records are grouped and plotted with their count values. Using scatter.smooth(), the data set's column values are plotted with 'range' as the X column and 'count' as the Y column.

IV. FINDINGS

• The proposed scheme will be helpful in the prediction of car prices.
• The proposed method was successfully applied to the car specifications with very high precision.
• Extraction of price features of the car is implemented.
• The proposed system detects and shows the car resale value.

FIG 4.1 SAMPLE RECORDS

FIG 4.2 DATASET COLUMNS
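
The 75%/25% train/test split and the KNN classification described in this section can be sketched as follows. This is only an illustrative, dependency-free sketch and not the project's actual code: the toy records, the two features (engine size and cylinder count), the "low"/"high" price bands, and the helper names train_test_split, knn_predict and accuracy are all assumptions introduced here. In practice a library implementation would normally be used instead.

```python
import math
import random
from collections import Counter

# Hypothetical toy records: (engine_size_litres, cylinder_count) -> price band.
# These values are invented for illustration; they are not the paper's data set.
RECORDS = [
    ((1.0, 3), "low"), ((1.2, 4), "low"), ((1.4, 4), "low"), ((1.5, 4), "low"),
    ((1.6, 4), "low"), ((1.3, 4), "low"), ((1.1, 3), "low"), ((1.8, 4), "low"),
    ((3.0, 6), "high"), ((3.5, 6), "high"), ((4.0, 8), "high"), ((4.4, 8), "high"),
    ((5.0, 8), "high"), ((3.2, 6), "high"), ((3.8, 6), "high"), ((4.6, 8), "high"),
]


def train_test_split(records, train_fraction=0.75, seed=0):
    """Shuffle a copy of the records and split it into training and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, features, k=3):
    """Return the majority label among the k training records nearest to `features`."""
    nearest = sorted(train, key=lambda rec: euclidean(rec[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test records whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


train, test = train_test_split(RECORDS)  # 75% training, 25% test
print("training records:", len(train))
print("test records:", len(test))
print("KNN accuracy on the test part:", accuracy(train, test))
```

The features are left unscaled to keep the sketch short. With real attributes measured on very different scales, normalizing them before computing distances is usually advisable, because otherwise the attribute with the largest range dominates the neighbour search.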

FIG 4.3 CAR PRICE GROUPED

FIG 4.4 SVM WITH ACCURACY

FIG 4.5 KNN WITH ACCURACY

V. CONCLUSION

The project implements a sales prediction model to predict car prices. In addition, this research work also aims to identify the best classification algorithm for sales analysis. In this work, algorithms like Naïve Bayes and SVM classification are used to develop a prediction method to analyze/predict the sales volume. In addition, KNN classification is also used in the proposed system to yield better classification results. The project is designed using Python 3.7.

REFERENCES

[1] Fathalla, A.; Salah, A.; Li, K.; Li, K.; Francesco, P. Deep end-to-end learning for price prediction of second-hand items. Knowl. Inf. Syst. 2020, 62, 4541–4568.
[2] de Prez, M. Used car market to soften in second-half of 2022. General News, 31 May 2022.
[3] Statistics. Vehicle Center Croatia. Centar za vozila Hrvatske - Statistika, 2022. Available online: https://cvh.hr/gradani/tehnicki-pregled/statistika/ (accessed on 30 May 2022).
[4] Noor, K.; Jan, S. Vehicle Price Prediction System using Machine Learning Techniques. Int. J. Comput. Appl. 2017, 167, 27–31.

[5] Yang, R.R.; Chen, S.; Chou, E. AI Blue Book: Vehicle Price Prediction Using Visual Features. arXiv 2018, arXiv:1803.11227.
[6] Khedr, A.E.; Salama, S.E.; Yaseen, N. Predicting Stock Market Behavior using Data Mining Technique and News Sentiment Analysis. Int. J. Intell. Syst. Appl. 2017, 9, 22–30.
[7] Shastri, M.; Roy, S.; Mittal, M. Stock Price Prediction using Artificial Neural Model: An Application of Big Data. ICST Trans. Scalable Inf. Syst. 2018, 19, 156085.
[8] Kalaiselvi, N.; Aravind, K.; Balaguru, S.; Vijayaragul, V. Retail price analytics using backpropogation neural network and sentimental analysis. In Proceedings of the 2017 Fourth International Conference on Signal Processing, Communication and Networking (ICSCN), Chennai, India, 16–18 March 2017; pp. 1–6.
[9] Ahmed, E.; Moustafa, M. House price estimation from visual and textual features. arXiv 2016, arXiv:1609.08399.
[10] Naumov, V.; Banet, K. Using Clustering Algorithms to Identify Recreational Trips within a Bike-Sharing System. In Reliability and Statistics in Transportation and Communication; Springer: Cham, Switzerland, 2020.
[11] Banet, K.; Naumov, V.; Kucharski, R. Using city-bike stopovers to reveal spatial patterns of urban attractiveness. Curr. Issues Tour. 2022, 25, 2887–2904.
[12] Pal, N.; Arora, P.; Sundararaman, D.; Kohli, P.; Palakurthy, S.S. How much is my car worth? A methodology for predicting used cars prices using Random Forest. arXiv 2017, arXiv:1711.06970.
[13] Chen, C.; Hao, L.; Xu, C. Comparative analysis of used car price evaluation models. AIP Conf. Proc. 2017, 1839, 020165.
[14] Moayedi, H.; Mehrabi, M.; Mosallanezhad, M.; Rashid, A.S.A.; Pradhan, B. Modification of landslide susceptibility mapping using optimized PSO-ANN technique. Eng. Comput. 2019, 35, 967–984.
