Car Price Prediction


POKHARA UNIVERSITY

UNIVERSAL ENGINEERING AND SCIENCE COLLEGE

PROJECT REPORT
ON
“CAR PRICE PREDICTION”

SUBMITTED BY:
Basanta Dharala (21070630)
Aalok Yadav (21070626)

SUBMITTED TO:
DEPARTMENT OF COMPUTER ENGINEERING
UNIVERSAL ENGINEERING AND SCIENCE COLLEGE
LALITPUR, NEPAL

January 2024
ACKNOWLEDGEMENT

First of all, we would like to express our sincere gratitude to the School of
Engineering, Pokhara University, for including project work in the Bachelor in
Computer Engineering (BCE) course and thereby giving us the opportunity to learn
and to apply our knowledge. We would also like to acknowledge the authors of the
papers and the developers of the programming libraries that we referenced while
building our project. We respect the valuable time these researchers devoted to
their research and to writing such articles.

Sincerely,
Basanta Dharala [21070630]
Aalok Yadav [21070626]

ABSTRACT

Car production has risen steadily over the past decade, with almost 92 million
cars produced in 2019. This has given a large boost to the used car market, which
has emerged as a fast-growing industry in its own right. With the recent arrival
of online portals and websites, customers, clients, dealers, and sellers need to
stay up to date with current trends in order to know the actual value of any used
car in the current market. Machine learning has numerous real-life applications,
and one of the most prominent is solving prediction problems, which themselves
span an endless number of topics. This project focuses on one such application.
Using a machine learning algorithm, Linear Regression, we will try to predict the
price of a used car by building a statistical model from provided data with a
given set of attributes.

Keywords: Cars, Price, Model, Predict, Features, Python, Module, Dataset, Plot.

TABLE OF CONTENTS

ACKNOWLEDGEMENT
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS
1 INTRODUCTION
1.1 Background
1.2 Problem Definition
1.3 Motivation
1.4 Objective
1.5 Scope and Application
2 LITERATURE REVIEW
3 Methodology
3.1 Block Diagram
3.1.1 Data Collection
3.1.2 Data Preprocessing
3.1.3 Train-Test Split
3.1.4 Linear Regression Model
3.1.5 Build Model
3.1.6 Deployment
3.2 Use Case Diagram
3.3 Instrumentation Tools
3.3.1 Software Requirements
3.3.2 Hardware Requirements
4 EXPECTED OUTCOME
4.0.1 Accurate Price Predictions:
4.0.2 User-Friendly Interface:
5 PROJECT SCHEDULE
6 FEASIBILITY ANALYSIS
6.1 Technical Feasibility
6.2 Financial Feasibility
REFERENCES

LIST OF FIGURES

3.1 Block Diagram
3.2 Use case Diagram

LIST OF TABLES

5.1 Project Timeline

LIST OF ABBREVIATIONS

LR Linear Regression
KNN K-Nearest Neighbors
UI User Interface

CHAPTER 1

INTRODUCTION

In the dynamic landscape of the automotive industry, the ability to accurately
predict car prices plays a pivotal role in decision-making for both consumers and
industry stakeholders. The intricate interplay of various factors, such as brand
reputation, technical specifications, market demand, and economic conditions,
makes forecasting car prices a challenging yet essential task. To address this
complexity, predictive modeling techniques, such as linear regression, have gained
prominence.

Linear regression provides a robust framework for understanding the relationships
between different variables and predicting numerical outcomes. In the context
of car price prediction, linear regression enables us to analyze the influence of
diverse features on the final price tag. By establishing a mathematical equation
that represents the linear relationship between these features and the car’s price,
we can develop a predictive model that enhances our ability to estimate prices
with a high degree of accuracy.

1.1 Background

The Car Price Prediction project responds to the evolving challenges in the
automotive industry’s pricing landscape. In a dynamic market where traditional
valuation methods fall short, the project leverages data analytics and machine
learning, particularly Linear Regression, to develop a predictive model. This model
aims to unravel the intricate relationships between various car features and their
impact on market value. Motivated by the need for transparency, the project seeks
to empower buyers, sellers, and dealerships with accurate, data-driven insights,
fostering informed decision-making and contributing to a more equitable used car
market.

1.2 Problem Definition

The problem at hand lies in the inadequacies of current methods for determining car
prices. Manual assessments and rule-of-thumb approaches lack precision, leading
to disparities in valuations. The absence of a standardized and data-driven pricing
mechanism hinders both buyers and sellers. To overcome these challenges, there
is a compelling need for a car price prediction model, utilizing machine learning,
particularly Linear Regression. This model aims to provide a more accurate and
transparent approach to assessing the market value of vehicles.

1.3 Motivation

In today’s fast-paced environment, where factors influencing car prices are multi-
faceted and constantly evolving, the need for accurate and data-driven insights is
pronounced. The aim is to empower buyers, sellers, and dealerships with a tool
that goes beyond conventional pricing methods, providing a transparent and objec-
tive assessment of a vehicle’s market value. By harnessing the power of machine
learning, specifically through Linear Regression, the project seeks to fill a critical
gap in the industry, facilitating more informed decision-making and enhancing
efficiency in the pricing dynamics of automobiles.

1.4 Objective

• To achieve good prediction accuracy.

• To develop a user-friendly User Interface (UI) that takes input from the user
and predicts the price.

• To develop an efficient and effective model that predicts the price of a car
according to the user’s inputs.

1.5 Scope and Application

The scope of the project is:

• Predictive Model Development: Develop a robust machine learning model,
specifically using Linear Regression, to predict car prices based on a range
of features such as company name, model, year, kms-driven and additional
attributes.

• Market Analysts and Researchers: Analysts and researchers can utilize car
price prediction models to identify and understand market trends. This
information is invaluable for market reports, industry studies, and forecasting
future developments within the automotive sector.

• User-Friendly Web Interface: Design and implement an intuitive web interface
to facilitate user interactions. The system should allow users (buyers, sellers,
and dealerships) to input car details seamlessly.

CHAPTER 2

LITERATURE REVIEW

Genesova (1993) empirically examined adverse selection in the second hand car
market. New car dealers (who sell both new and second-hand cars) were found to
differ from used car dealers (who sell only second-hand cars) in their tendency
to trade second-hand cars on the wholesale market. Adverse-selection models
suggest that the dealer type that sells a higher share of its trade-ins on the
wholesale market will, on average, sell higher-quality cars there and receive a
higher price in return. To test this prediction, a survey of dealers' wholesale
behavior and prices collected at wholesale auctions were used. Only weak evidence
of adverse selection was found[1].

Murray and Sarantis (1999) used a series of panel data on car features to estimate
the hedonic price model of cars in the UK. The price differences between the various
car models were examined in terms of the differences in the car characteristics.
In the study, the prediction model was used to create a hedonic price index for
automobiles[2].

Pazarlioglu and Gunes (2000) created a hedonic price model suitable for cars
in Turkey. They first discussed hedonic price theory, then presented the empirical
results and identified the most appropriate hedonic model. In the last part of
the study, fuzzy hedonic model predictions were compared with normal model
estimates to determine which approach best informs customers[3].

Galarraga et al. (2014) used the European labeling system, which classifies cars
according to their relative fuel consumption, as a new alternative indicator of
energy efficiency for light cars. They applied the hedonic price method to estimate
price functions for cars and thus obtain the marginal price of cars that are highly
rated for energy efficiency. According to the results, cars labeled A and B are
sold at prices 3 to 5.9 percent higher than cars with similar characteristics but
lower energy-efficiency labels[4].

Prieto et al. (2015) investigated the implications of expectation theory in second
hand goods markets. In particular, they developed a hedonic price model to explain
the price structure of the used automobile market in light of expectation theory.
They determined that consumers avoided risk both when a second hand car’s
reliability was below the expected reference value and when it was above it. The
model also shows how automobile quality affects residual values and how buyers
evaluate second-hand cars[5].

Dastan (2016) aimed to determine the factors affecting second hand car prices.
For this purpose, cross-sectional data obtained from second hand car
advertisements on websites were used. The study found that many features affect
the price of a car, including the brand, model, age, traction, mileage, gear, fuel
type, torque, width, fuel tank volume, front view camera, ABS, panoramic glass
roof, rear window defroster, power steering, start/stop, sunroof, and cooled glove
box[6].

CHAPTER 3

Methodology

The methodology involves a progression from defining objectives and collecting
data to preprocessing, model training, and deployment. Regular monitoring and
maintenance ensure the continued accuracy and relevance of the car price prediction
system. Each step is crucial to building a robust and effective predictive model.

3.1 Block Diagram


Figure 3.1: Block Diagram

3.1.1 Data Collection

At the outset, the Data Collection block gathers historical car data covering
key features such as car model, brand, year, kms-driven, fuel type, and price. This
data forms the foundation for training and testing the subsequent predictive model.

3.1.2 Data Preprocessing

The collected data undergoes thorough cleansing and transformation in the Data
Preprocessing block. Its purpose is to rectify any discrepancies, handle missing
values, and convert categorical variables into numerical representations, ensuring
the dataset’s readiness for machine learning model training.
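As a rough illustration of these preprocessing steps, the following Python sketch cleans a handful of made-up listings, fills a missing value, and one-hot encodes a categorical column. The field names (`company`, `year`, `kms_driven`, `fuel_type`, `price`), the sample rows, and the choice of median filling are our assumptions for illustration only. In practice a library such as Pandas would typically do the same work on the full dataset.

```python
from statistics import median

# Made-up raw listings; field names follow the features named in this report.
RAW_ROWS = [
    {"company": "Maruti", "year": "2015", "kms_driven": "45,000 kms",
     "fuel_type": "Petrol", "price": "350000"},
    {"company": "Hyundai", "year": "2018", "kms_driven": "22,500 kms",
     "fuel_type": "Diesel", "price": "6,20,000"},
    {"company": "Tata", "year": "2016", "kms_driven": "60,000 kms",
     "fuel_type": "Diesel", "price": "Ask For Price"},
    {"company": "Honda", "year": "2012", "kms_driven": "",
     "fuel_type": "Petrol", "price": "300000"},
]


def to_int(text):
    """Return the digits found in `text` as an int, or None if there are none."""
    digits = "".join(ch for ch in str(text) if ch.isdigit())
    return int(digits) if digits else None


def clean_rows(rows):
    """Drop rows without a usable year or price and parse the numeric fields."""
    cleaned = []
    for row in rows:
        year, price = to_int(row["year"]), to_int(row["price"])
        if year is None or price is None:
            continue  # the target or a key feature is unusable: skip the row
        cleaned.append({
            "company": row["company"].strip(),
            "fuel_type": row["fuel_type"].strip(),
            "year": year,
            "kms_driven": to_int(row["kms_driven"]),  # may still be None
            "price": price,
        })
    # Handle missing values: fill unknown kms_driven with the median of known ones.
    known = [r["kms_driven"] for r in cleaned if r["kms_driven"] is not None]
    fill = int(median(known)) if known else 0
    for r in cleaned:
        if r["kms_driven"] is None:
            r["kms_driven"] = fill
    return cleaned


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if r[column] == category else 0
        encoded.append(new_row)
    return encoded


rows = clean_rows(RAW_ROWS)
encoded_rows = one_hot(rows, "fuel_type")
```

Here the listing without a numeric price is dropped, the listing without a mileage receives the median mileage of the remaining rows, and `fuel_type` becomes the numeric columns `fuel_type_Diesel` and `fuel_type_Petrol`.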

3.1.3 Train-Test Split

After data preprocessing, we split the dataset into a training set and a testing set.

• Training Set: This portion is used to train the machine learning model. It
contains a majority of the data and serves as the basis for the model to learn
patterns.

• Testing Set: This subset is reserved for evaluating the model’s performance.
It is not used during the training phase, ensuring that the model is tested on
entirely new data.
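A minimal sketch of such a split is shown below. The 80/20 ratio, the fixed seed, and the helper itself are assumptions for illustration, since the report only states that the training set holds the majority of the data. scikit-learn provides a function with the same name that could be used instead.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of `rows` and return (train, test) lists."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)                      # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)      # seeded, so the split is repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


samples = list(range(10))                      # stand-in for ten cleaned listings
train, test = train_test_split(samples, test_fraction=0.2)
```

Shuffling before slicing avoids a split that merely reflects the order in which the listings were collected, and the seed keeps the evaluation reproducible between runs.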

3.1.4 Linear Regression Model

The Linear Regression (LR) Model block is the core engine of the system. It
implements a linear regression algorithm that learns the relationship between the
selected features and the target variable, which, in this case, is the car price. Here
we fit the Linear Regression model to the training data.
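For reference, the relationship the model learns can be written as follows. The notation is ours and does not come from the project code: x_1, ..., x_p are the numeric features of a car after preprocessing, ŷ is the predicted price, the β values are the learned coefficients, and J is the mean squared error over n training cars.

```latex
\hat{y} = \beta_0 + \sum_{j=1}^{p} \beta_j x_j,
\qquad
J(\beta) = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```

Training amounts to choosing the coefficients β that make J as small as possible on the training set.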

3.1.5 Build Model

The Build Model block is where the model hones its predictive capabilities. By
leveraging the training dataset, the linear regression model adjusts its parameters
iteratively to minimize the disparity between predicted and actual car prices.
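One way to picture this iterative adjustment is batch gradient descent on the mean squared error, sketched below for a single feature. The learning rate, the epoch count, and the toy data are assumptions for illustration, and the project may equally fit the model with a library routine that solves the least-squares problem directly.

```python
def fit_linear_regression(xs, ys, learning_rate=0.1, epochs=2000):
    """Fit y = intercept + slope * x by batch gradient descent on the MSE."""
    n = len(xs)
    # Standardise x so that one learning rate works whatever the feature's scale.
    mean_x = sum(xs) / n
    std_x = (sum((x - mean_x) ** 2 for x in xs) / n) ** 0.5 or 1.0
    zs = [(x - mean_x) / std_x for x in xs]

    w, b = 0.0, 0.0
    for _ in range(epochs):
        errors = [b + w * z - y for z, y in zip(zs, ys)]   # predicted - actual
        grad_w = 2 / n * sum(e * z for e, z in zip(errors, zs))
        grad_b = 2 / n * sum(errors)
        w -= learning_rate * grad_w                        # step against the gradient
        b -= learning_rate * grad_b

    # Convert the parameters back to the original scale of x.
    slope = w / std_x
    intercept = b - slope * mean_x
    return intercept, slope


# Toy data: car age in years against price (arbitrary units), exactly linear.
ages = [1, 2, 3, 4, 5]
prices = [9.0, 8.0, 7.0, 6.0, 5.0]
intercept, slope = fit_linear_regression(ages, prices)
```

On this toy data the fitted line approaches an intercept of 10 and a slope of -1, meaning each additional year of age lowers the predicted price by one unit.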

3.1.6 Deployment

Having successfully trained and evaluated the model, the Deployment block
integrates the predictive model into the broader car price prediction system. This
integration facilitates real-time predictions based on user inputs.

3.2 Use Case Diagram


Figure 3.2: Use case Diagram

The user initiates the interaction by visiting the website. This may involve accessing
the system through a web browser or a dedicated application.

Upon landing on the website, the user inputs relevant details about the car they
are interested in. This includes information such as car name, company, year,
kms-driven, fuel type, and any additional features that may influence the predicted
price.

After entering the car details, the system processes the information using its
predictive model. The user is then presented with the predicted price for the
specified car based on the model’s analysis.
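The flow described above (receive the form fields, validate them, apply the model, return a price) can be sketched as follows. Everything in this sketch is hypothetical: the coefficients in `MODEL` are invented placeholders rather than values learned from our data, and the field names and validation limits are assumptions.

```python
# Invented placeholder coefficients; a real deployment would load trained values.
MODEL = {
    "intercept": 500000.0,
    "per_year_of_age": -30000.0,
    "per_km": -1.0,
    "fuel_type": {"Petrol": 0.0, "Diesel": 40000.0},
}
REFERENCE_YEAR = 2024  # assumed "current" year used to compute the car's age


def predict_price(form):
    """Validate the submitted form fields (a dict) and return a predicted price."""
    try:
        year = int(form["year"])
        kms = int(form["kms_driven"])
    except (KeyError, ValueError) as exc:
        raise ValueError("year and kms_driven must be whole numbers") from exc
    fuel = form.get("fuel_type")
    if fuel not in MODEL["fuel_type"]:
        raise ValueError(f"unsupported fuel type: {fuel!r}")
    if not 1980 <= year <= REFERENCE_YEAR or kms < 0:
        raise ValueError("year or kms_driven out of range")

    age = REFERENCE_YEAR - year
    price = (MODEL["intercept"]
             + MODEL["per_year_of_age"] * age
             + MODEL["per_km"] * kms
             + MODEL["fuel_type"][fuel])
    return max(price, 0.0)  # never show a negative price to the user


form = {"year": "2019", "kms_driven": "40000", "fuel_type": "Diesel"}
predicted = predict_price(form)
```

Rejecting malformed input before it reaches the model keeps the web interface from displaying a meaningless price for values the model was never trained on.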

3.3 Instrumentation Tools

3.3.1 Software Requirements

• Operating System: Windows 10/11 or macOS

• Platform: Jupyter Notebook, PyCharm IDE

• Programming Language: Python

3.3.2 Hardware Requirements

• Processor: Intel Core i3 or above

• Input Device: Keyboard and Mouse

• RAM: 4 GB or above

CHAPTER 4

EXPECTED OUTCOME

4.0.1 Accurate Price Predictions:

The primary goal is to develop a model that can accurately predict car prices
based on various features. The expected outcome is a well-performing model with
minimal prediction errors.
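The report does not fix a particular error measure, so the sketch below assumes three that are common for regression: mean absolute error (MAE), root mean squared error (RMSE), and the coefficient of determination (R²). The sample prices are made up.

```python
from math import sqrt


def regression_metrics(actual, predicted):
    """Return MAE, RMSE and R^2 for paired lists of actual and predicted prices."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    rmse = sqrt(ss_res / n)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)   # total sum of squares
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")
    return {"mae": mae, "rmse": rmse, "r2": r2}


actual_prices = [100, 200, 300, 400]       # made-up test-set prices
predicted_prices = [110, 190, 310, 390]    # made-up model outputs
metrics = regression_metrics(actual_prices, predicted_prices)
```

MAE and RMSE are expressed in the same currency units as the price, which makes them easy to explain to users, while R² indicates how much of the variation in price the model accounts for.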

4.0.2 User-Friendly Interface:

For the web interface, the expected outcome is a user-friendly platform that
allows users to easily input car details and receive reliable price predictions.
The interface should be intuitive and accessible.

CHAPTER 5

PROJECT SCHEDULE

Table 5.1: Project Timeline

CHAPTER 6

FEASIBILITY ANALYSIS

6.1 Technical Feasibility

Machine learning techniques such as KNN, Linear Regression, and Random Forest are
the backbone of projects of this kind. Various libraries such as Pandas, NumPy,
and Matplotlib are used in this project. Many similar software products exist in
other countries, but we are building ours using data from our own country.

6.2 Financial Feasibility

The car price prediction system demonstrates strong financial feasibility due to
its no-cost implementation. With no direct monetary investments required, the
financial analysis primarily focuses on potential cost savings and benefits.

REFERENCES

[1] X. Genesova, “Reverse selection in the second hand car market,” Journal of
Automobile Economics, vol. 12, pp. 45–62, 1993.

[2] Y. Murray and T. Sarantis, “Hedonic price model of cars in the UK,” British
Journal of Automobile Research, vol. 25, pp. 120–135, 1999.

[3] F. Pazarlioglu and M. Gunes, “Hedonic price model for cars in Turkey,” Turkish
Journal of Automotive Economics, vol. 8, pp. 55–70, 2000.

[4] M. Galarraga et al., “Energy efficiency labeling and hedonic price for light cars,”
European Journal of Energy Economics, vol. 18, pp. 102–118, 2014.

[5] A. Prieto et al., “Expectation theory in second hand goods markets,” Journal
of Applied Economics, vol. 22, pp. 78–95, 2015.

[6] Z. Dastan, “Factors affecting second hand car prices,” International Journal of
Automotive Studies, vol. 15, pp. 45–62, 2016.
