Project Report
ACKNOWLEDGEMENT
We take this opportunity to express our hearty thanks to all those who helped us with
this project. We express our deep sense of gratitude to our Project Guide, Prof. R. S. Jagale,
for his guidance and continuous motivation. We gratefully acknowledge the help he provided,
with great interest, in improving this project. We express our heartiest thanks to
those who encouraged us and gave their innovative ideas and suggestions for this project.
Lastly, we would like to thank all our teachers and friends who helped us with this
project.
ABSTRACT
Predicting the price of used cars is a significant and interesting area of analysis.
With increased demand in the second-hand car market, business for both buyers and
sellers has grown. The price of a new car is fixed by the manufacturer, with some
additional costs incurred in the form of Government taxes, so customers buying a new
car can be assured that the money they invest is worthwhile. However, due to the increased
prices of new cars and the financial inability of many customers to buy them, used car
sales are on a global increase. Reliable and accurate price prediction requires expert
knowledge of the field, because the price of a car depends on many important factors.
This paper proposes a supervised machine learning model that uses the KNN (K Nearest
Neighbor) regression algorithm to analyse the price of used cars. We trained our model
on used car data collected from the Kaggle website. In this experiment, the data was
examined with different train/test split ratios. As a result, the accuracy of the
proposed model is around 85%, and it is adopted as the optimized model.
01 Introduction
1.1 Motivation
1.2 Problem Definition
02 Literature Survey
03 Software Requirements Specification
3.1 Introduction
3.1.1 Project Scope
3.1.2 Assumptions and Dependencies
3.2 Functional Requirements
3.2.1 Random Forest model system
3.2.2 Prediction System
3.2.3 Registration system
3.3 External Interface Requirements
3.3.1 User Interfaces
3.3.2 Hardware Interfaces
3.3.3 Software Interfaces
3.3.4 Communication Interfaces
3.4 Nonfunctional Requirements
3.4.1 Performance Requirements
3.4.2 Safety Requirements
3.4.3 Security Requirements
3.4.4 Software Quality Attributes
3.5 System Requirements
3.5.1 Database Requirements
3.5.2 Software Requirements (Platform Choice)
3.5.3 Hardware Requirements
3.6 System Implementation Plan
04 System Design
4.1 System Architecture
4.2 Data Flow Diagrams
4.3 Entity Relationship Diagrams
4.4 UML Diagrams
05 Other Specification
5.1 Advantages
5.2 Limitations
5.3 Applications
06 Conclusions & Future Work
References
INTRODUCTION
With the global growth of information technology and the rise of the mobile Internet,
the traditional offline second-hand car trading model can no longer meet consumers'
demands. The emergence of online portals such as CarDekho, Quikr, Cars24 and many
others has helped both customers and sellers to be better informed about market trends
and to determine the value of a used car in the market. In financial year 2021, the total
production volume of vehicles in India was around 22.7 million units, and the number of
registered vehicles across India was around 295 million in fiscal year 2019. It is also
common to lease a car rather than buy it outright. A lease is a binding contract between
a buyer and a seller (or a third party, usually a bank, insurance firm or other financial
institution) in which the buyer must pay fixed instalments for a pre-defined number of
months or years to the seller or financer. After the lease period is over, the buyer has
the option to buy the car at its residual value, i.e. its expected resale value. In this
project we have used machine learning techniques.
“GESCOE, Department of Computer Engg. 2020-21”
There are two types of machine learning techniques: supervised and unsupervised.
Here we use a supervised technique. Supervised learning, also known as supervised
machine learning, is a subcategory of machine learning and artificial intelligence.
It is defined by its use of labelled datasets to train algorithms to classify data or
predict outcomes accurately. As input data is fed into the model, it adjusts its
weights until the model has been fitted appropriately, which occurs as part of the
cross-validation process. Supervised learning helps organizations solve a variety of
real-world problems at scale, such as classifying spam into a separate folder from your
inbox.
In this paper, we propose a model to estimate the cost of used cars using the K nearest
neighbour algorithm, which is simple and suitable for small data sets. We collected a
used cars dataset and analysed it. The model was trained on this data, and we examined
its accuracy across different ratios of training and test sets. The model was also
cross-validated using the K-fold method, which is easy to understand and implement,
to assess its performance.
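The evaluation across different train/test ratios can be sketched as follows. This is a minimal, dependency-free illustration and not the project's actual code: the data is synthetic, the helper names are hypothetical, and R² is assumed as the accuracy measure because the report does not name one.

```python
import math
import random

def knn_predict(train, query, k=3):
    # train is a list of (features, price); average the prices of the k closest rows
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return sum(price for _, price in nearest) / len(nearest)

def r2_score(actual, predicted):
    # Coefficient of determination: 1 - residual sum of squares / total sum of squares
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Synthetic data (an assumption): price falls with age and mileage, plus noise
rng = random.Random(0)
data = []
for _ in range(200):
    age = rng.uniform(0, 10)   # years
    km = rng.uniform(0, 15)    # mileage in units of 10,000 km
    price = 10 - 0.6 * age - 0.2 * km + rng.gauss(0, 0.3)
    data.append(((age, km), price))

scores = {}
for train_pct in (60, 70, 80, 90):
    cut = len(data) * train_pct // 100
    train, test = data[:cut], data[cut:]   # rows are already in random order
    predicted = [knn_predict(train, features) for features, _ in test]
    scores[train_pct] = r2_score([price for _, price in test], predicted)
    print(f"train {train_pct}% / test {100 - train_pct}%: R^2 = {scores[train_pct]:.3f}")
```

Comparing the printed scores shows how sensitive the model is to the amount of training data; on a real dataset the rows should be shuffled before splitting.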
1.1 Motivation:-
The automotive industry is composed of a few top global multinational players and
several retailers. The multinational players are mainly manufacturers by trade, whereas
the retail market features players who deal in both new and used vehicles. The used car
market has demonstrated significant growth in value and contributes a large share of
the overall market. The used car market in India accounts for nearly 3.4 million vehicles
per year. When you see a listing for a used car online, it can be difficult to decide
whether the car is worth the asking price. Several factors, including mileage, brand,
model, year and fuel type, can influence the actual worth of a car. From the seller's
perspective, it is also tricky to price a used car appropriately. Based on existing data,
the aim is to use machine learning algorithms to develop models for predicting used car
prices.
1.2 Problem Statement:-
In India, the prices of new cars are growing rapidly. These prices are fixed by the
manufacturer, with some additional costs incurred in the form of Government taxes.
Because new cars are so expensive and average buyers lack the funds for them, many
customers turn to second-hand cars that they can trust and that are worth the investment.
There is therefore a need for a used car price prediction system that effectively
determines the worth of a car using a variety of features. Some existing websites already
predict the price of a used car; their prediction methods may not be the best, but they
are good enough to catch the attention of customers. Besides that, different models and
systems may contribute to predicting a used car's actual market value. It is important
to know the actual market price for both buying and selling.
Many individuals become interested in the used car market at some point in their lives
because they want to sell their car or buy a used one. In this process, there is a real
risk of paying too much or selling for less than the market value. So, there is a need
for a used car price prediction system that effectively determines the value of used cars.
CHAPTER NO: - 02
LITERATURE SURVEY
One paper we studied observes that the emergence of online portals such as CarDekho, Quikr,
Carwale, Cars24 and many others has facilitated the need for both the customer and the seller
to be better informed about the trends and patterns that determine the value of a used car in
the market. Machine learning algorithms can be used to predict the retail value of a car based
on a certain set of features. Different websites use different algorithms to generate the retail
price of used cars, and hence there is no unified algorithm for determining the price. By
training statistical models to predict prices, one can easily get a rough estimate of the price
without actually entering the details into the desired website. The main objective of that paper
is to use three different prediction models to predict the retail price of a used car and to
compare their levels of accuracy.
Enis Gegic, Becir Isakovic, Dino Keco, Zerina Masetic and Jasmin Kevric (Feb 2019)
found that data cleaning is one of the processes that increases prediction performance,
yet it is insufficient for complex data sets such as the one in their research. Applying
a single machine learning algorithm to the data set gave an accuracy of less than 50%.
Therefore, an ensemble of multiple machine learning algorithms was proposed, and this
combination of ML methods achieved an accuracy of 92.38%. This is a significant
improvement over the single-algorithm approach. However, the drawback of the proposed
system is that it consumes far more computational resources than a single machine
learning algorithm.
Sameerchand Pudaruth proposed predicting the price of used cars using machine learning
techniques. In that paper, historical data on used cars in Mauritius was collected from
newspapers, and different machine learning techniques such as decision trees, K-nearest
neighbors, multiple linear regression and Naïve Bayes were applied to predict the price.
The model has a mean error of about Rs. 27,000 for Nissan cars and about Rs. 45,000 for
Toyota cars using KNN, and around Rs. 51,000 using linear regression. The accuracy of the
decision tree and Naïve Bayes algorithms ranged between 60% and 70% with different
parameters, and the overall training accuracy of the model is 61%.
In another paper, the authors proposed a model to estimate the cost of used cars using
the K nearest neighbor algorithm, which is simple and suitable for small data sets. They
collected a used cars dataset and analyzed it. The model was trained on this data, and
its accuracy was examined across different ratios of training and test sets. KNN makes
it easy to implement a machine learning model. It is a non-parametric method used for
both regression and classification, and it estimates a numerical target based on a
similarity measure. A simple implementation of KNN regression takes the average of the
numerical target over the K nearest neighbors. With the K nearest neighbor algorithm the
authors obtained an accuracy of 85%, whereas the accuracy of linear regression was 71%.
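The averaging rule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the choice of Euclidean distance as the similarity measure and the toy data are assumptions, not taken from the surveyed paper.

```python
import math

def knn_predict(train_X, train_y, query, k=3):
    """Predict a numeric target as the mean target of the k nearest training rows."""
    # Pair each training row's Euclidean distance to the query with its target
    dists = sorted(
        (math.dist(row, query), target) for row, target in zip(train_X, train_y)
    )
    nearest = dists[:k]
    return sum(target for _, target in nearest) / len(nearest)

# Hypothetical toy data: (age in years, mileage in 10,000 km) -> price in lakh rupees
cars = [(1, 1), (2, 3), (3, 4), (6, 8), (8, 10)]
prices = [9.0, 8.0, 7.0, 4.0, 3.0]

estimate = knn_predict(cars, prices, query=(2, 2), k=3)
print(estimate)  # 8.0, the mean price of the three closest cars
```

A small K follows the local data closely, while a larger K smooths the estimate; the value of K is normally chosen by validation.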
CHAPTER NO: - 03
SOFTWARE REQUIREMENTS SPECIFICATION
3.1 Introduction:-
The second-hand car market has continued to expand even as the market for new cars
has shrunk. According to a recent report on India's pre-owned car market by Indian Blue
Book, nearly 4 million used cars were purchased and sold in 2018-19. The second-hand
car market has created business for both buyers and sellers. Most people prefer to buy
used cars because of the affordable price, and because they can resell the car after
some years of usage, which may bring some profit. The price of a used car depends on
many factors such as fuel type, colour, model, mileage, transmission, engine and number
of seats, and used car prices in the market keep changing. Thus, an evaluation model to
predict the price of used cars is required.
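Factors like those listed above mix categories (fuel type, transmission) with numbers on very different scales (mileage, number of seats). A distance-based model such as KNN generally needs them converted into comparable numeric features first. The following is a minimal sketch of one common approach, one-hot encoding plus min-max scaling; the listings, category lists and function names are hypothetical and are not taken from the project's dataset.

```python
def one_hot(value, categories):
    """Encode a categorical value as a 0/1 list with one slot per category."""
    return [1.0 if value == c else 0.0 for c in categories]

def min_max_scale(column):
    """Rescale a numeric column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in column]

# Hypothetical listings: (fuel type, transmission, mileage in km, seats)
listings = [
    ("Petrol", "Manual", 30000, 5),
    ("Diesel", "Manual", 90000, 7),
    ("Petrol", "Automatic", 60000, 5),
]
FUELS = ["Petrol", "Diesel", "CNG"]
GEARS = ["Manual", "Automatic"]

mileage = min_max_scale([row[2] for row in listings])
seats = min_max_scale([row[3] for row in listings])

# One numeric feature vector per listing: fuel slots, gear slots, mileage, seats
features = [
    one_hot(fuel, FUELS) + one_hot(gear, GEARS) + [m, s]
    for (fuel, gear, _, _), m, s in zip(listings, mileage, seats)
]
for vec in features:
    print(vec)
```

Without scaling, a difference of a few thousand kilometres would dominate the distance and features such as seat count would have almost no influence on which neighbours are chosen.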
In this project, we propose a model to estimate the cost of used cars using the K
nearest neighbour algorithm, which is simple and suitable for small data sets. We
collected a used cars dataset and analysed it. The model was trained on this data, and
we examined its accuracy across different ratios of training and test sets. The model
was also cross-validated using the K-fold method, which is easy to understand and
implement, to assess its performance.
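K-fold cross-validation, as mentioned above, holds out each of K slices of the data in turn and trains on the rest. The sketch below is a hedged illustration in plain Python: the toy data, the helper names and the use of mean absolute error as the score are assumptions rather than the project's actual setup.

```python
import math
import random

def knn_predict(train, query, k=3):
    # train is a list of (features, price); average the prices of the k closest rows
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return sum(price for _, price in nearest) / len(nearest)

def k_fold_mae(data, folds=5, k=3):
    """Return the mean absolute error of KNN on each of `folds` held-out slices."""
    errors = []
    for i in range(folds):
        test = data[i::folds]  # every folds-th row, starting at row i
        train = [row for j, row in enumerate(data) if j % folds != i]
        abs_err = [abs(knn_predict(train, x, k) - y) for x, y in test]
        errors.append(sum(abs_err) / len(abs_err))
    return errors

# Hypothetical toy data: price falls with age (years) and mileage (10,000 km)
data = [((age, km), 10 - 0.5 * age - 0.2 * km)
        for age in range(10) for km in range(0, 15, 3)]
random.Random(0).shuffle(data)  # shuffle so each fold is a mixed sample

fold_errors = k_fold_mae(data, folds=5)
print("per-fold MAE:", [round(e, 3) for e in fold_errors])
print("mean MAE:", round(sum(fold_errors) / len(fold_errors), 3))
```

Because every row is used for testing exactly once, the averaged score is less dependent on one lucky or unlucky split than a single train/test partition.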
3.1.1 Project Scope:-
1. Useful for predicting an accurate price for Indian car types.
2. The system can be deployed on web servers as a feature.
3. The system can be integrated into chatbots.
4. A standalone system can be implemented by merging data from different sources.
3.1.2 Assumptions and Dependencies:-
Assumptions:
Dependencies:
The database of the system is updated daily with new car prices.
User Registration and Login.
Calculate a fair present-day price for a specific car model.
3.2.3 Registration system:
The registration system has to be part of the machine that accesses the central
database. This machine will be hosted on a local server so that other users can use
it. We will provide MY SQLite for the available services and registration for
consuming the services.
3.4 Non-functional Requirements:-
The number of users may vary, as this software finds applications in almost all
car-based websites. The maximum number of users the system can allow at a time is 2.
3.4.2 Availability
3.4.4 Maintainability
Backups of the database are available. A backup of the dataset can be kept manually
or can be found on the internet for free.
3.4.5 Portability
3.5 System Requirements:-
The data for this project can be obtained by web scraping a used car advertisement site.
2. Data cleaning:-
The task is to fit a model of a dependent variable (the price) to independent variables,
so the dataset can be treated as supervised (labelled) data. According to the literature,
the following machine learning models would be suitable for this type of dataset:
Linear Regression
KNN
Other Models
The authors plan to try all or some of these models to get different perspectives on
the matter, and then to select the best model for predicting the price of used cars in
India's major cities. The selected model will be implemented.
Design a front end using HTML, CSS and JavaScript, and connect it through the Flask
framework to the machine learning model implemented in Python.
CHAPTER NO: - 04
SYSTEM DESIGN
4.2 Data Flow Diagram:-
4.4 UML Diagrams:
4.4.2 Sequence Diagram:-
4.7 Context Diagram:-
4.8 Use Case Diagram:-
CHAPTER NO: - 05
OTHER SPECIFICATIONS
5.1 Advantages:-
The simplest advantage is that a used car is cheaper to purchase. More than this, a
well-maintained used car offers better value for money than a new car. Likewise, if you
are on a tight budget, you will have more options among used cars than among new ones.
If a person from a middle-class family wants to buy a big car and is ready to spend
the money, he can go for a new one, because a bigger car offers more comfort, more
space, more features, increased safety and a higher social status.
In reality, most buyers are not aware that every car has a depreciation value attached
to it. The price of a new car falls day by day. Most cars see a depreciation of 25-30%
in the first year and another 10-15% in the second year.
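As a worked example of these depreciation figures, the sketch below computes what a car might be worth after two years. The purchase price is hypothetical, and it is an assumption that the second-year percentage applies to the value remaining after the first year.

```python
def value_after_depreciation(price, yearly_pct_drops):
    """Apply each year's percentage drop to the value left after the previous year."""
    value = price
    for pct in yearly_pct_drops:
        value = value * (100 - pct) // 100  # whole rupees, rounded down
    return value

new_price = 1_000_000  # hypothetical purchase price in rupees

mildest = value_after_depreciation(new_price, [25, 10])
steepest = value_after_depreciation(new_price, [30, 15])
print(mildest, steepest)  # 675000 595000
```

Under these assumptions, a car bought for Rs. 10,00,000 would be worth roughly Rs. 5,95,000 to Rs. 6,75,000 after two years.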
The used car market is now becoming organized: most manufacturers have their own
pre-owned division, and an increasing number of financial institutions now offer loans
at attractive rates. Hence, if you end up buying a used car, you do not have to worry
about high EMIs anymore.
Some dealers set the price tag of a second-hand car on their own, without any
cross-check, and sell the car at a high rate. Because of this, customers' trust in used
cars is getting weaker. Used car price prediction systems of this type are built to give
a more accurate value for such cars.
5.2 Limitations:-
A first limitation of this model is that the used car prices we have used are
the prices at which the cars were listed on the site. Usually, negotiations take
place between the buyer and the seller, and the car may be sold at a price different
from the one listed in the advertisement.
Apart from this, the model would have to be updated with new training and
testing data every year. This is because new car prices are constantly rising in
India, and used car prices rise proportionately with them. Hence, the accuracy of
this model will fall if it is not retrained on updated price data every year.
5.3 Applications:-
1. Systems of this type are also used in car showrooms and by car dealers.
2. The system is very helpful for a car buyer who wants to purchase a used car.
CHAPTER NO: - 06
CONCLUSIONS & FUTURE WORK
Car price prediction can be a challenging task due to the high number of attributes
that should be considered for an accurate prediction. The major step in the prediction
process is the collection and pre-processing of the data. Because of the increased prices
of new cars and the financial inability of many customers to buy them, used car sales
are on a global increase. Therefore, there is an urgent need for a used car price
prediction system that effectively determines the worth of a car using a variety of
features. The proposed system will help to determine an accurate price for a used car.
The machine learning algorithms used in this project will help to predict that price.
In future, this machine learning model may be connected to various websites that can
provide real-time data for price prediction. We may also add a large amount of historical
car price data, which can help to improve the accuracy of the model. We can build an
Android app as the user interface. For better performance, we plan to design deep
learning structures sensibly, use adaptive learning rates and train on subsets of the
data rather than the whole dataset.
Additional car attributes that could be included in the model:
1) Battery power.
2) Suspension.
3) Cylinder.
4) Torque.
Car technology is advancing day by day, so our next upgrade will include hybrid cars,
electric cars and driverless cars.
References:-
01. Used Car Price Prediction using K-Nearest Neighbor Based Model. K. Samruddhi,
Dr. R. Ashok Kumar, Department of ISE, B.M.S College of Engineering, Affiliated to VTU,
Bangalore, Karnataka. International Journal of Innovative Research in Applied Sciences
and Engineering (IJIRASE), Volume 4, Issue 3, DOI: 10.29027/IJIRASE.v4.i3.2020.686-689,
September 2020.
02. Used Car Price Prediction. Praful Rane, Deep Pandya, Dhawal Kotak, Information
Technology Engineering, Padmabhushan Vasantdada Patil Pratisthan College of Engineering,
Maharashtra, India.
03. Predicting the Price of Used Cars using Machine Learning Techniques. Sameerchand
Pudaruth. International Journal of Information & Computation Technology, ISSN 0974-2239,
Volume 4, Number 7 (2014), pp. 753-764.
04. Used Cars Price Prediction using Supervised Learning Techniques. Pattabiraman
Venkatasubbu, Mukkesh Ganesh. International Journal of Engineering and Advanced
Technology (IJEAT), ISSN: 2249-8958, Volume-9, Issue-1S3, December 2019.
05. Price Evaluation Model in Second-hand Car System based on BP Neural Network
Theory. Ning Sun, Hongxi Bai, Yuxia Geng, Huizhu Shi, Dept. of IoT Engineering, Hohai
University, Changzhou, China.
06. Car Price Prediction Using Machine Learning. Ashish Chandak, Prajwal Ganorkar,
Shyam Sharma, Ayushi Bagmar, Soumya Tiwari, Information Technology, Shri Ramdeobaba
College of Engineering, Rashtrasant Tukadoji Maharaj Nagpur University, Nagpur, India.