
Flight Price Prediction Using Machine Learning Algorithms

The document discusses using machine learning algorithms to predict flight ticket prices. It describes collecting airline data and preprocessing it, then using algorithms like KNN, random forest, gradient boosting regression, SVR and linear regression to build models and analyze their performance at predicting ticket prices based on input features like departure, arrival times and locations.


Proceedings of the National Conference on Emerging Computer Applications (NCECA)-2022

Vol.4, Issue.1 546

Flight Ticket Price Prediction using Machine Learning

Midhun Krishna, Department of Computer Applications, Amal Jyothi College of Engineering Kanjirappally, India
Ms Jetty Benjamin, Department of Computer Applications, Amal Jyothi College of Engineering Kanjirappally, India

Abstract - Airline ticket prices change very quickly these days, and the differences can be large. The fare for the same flight can vary even within a few hours. For business reasons, many airlines change fares according to the season or the period of time. Airlines use a variety of pricing methods to increase their profits, for example, separating demand according to expectations and value. Each carrier uses its own set of criteria and algorithms to determine the price. Machine learning, artificial intelligence, and deep learning are important tools for this problem. With them, it is possible to estimate the cost of air travel at a particular point in time. In this paper, we use the machine learning algorithms KNN, Random Forest, Gradient Boosting Regression, SVR, and Linear Regression. Basic information such as airline, source, destination, route, and total stops is provided as input to forecast flight fares.

Keywords - Price, Flight, Regressor, Prediction, Accuracy, Random Forest, Machine Learning

I. INTRODUCTION

Nowadays, the price of an airline ticket can vary significantly, even for the same flight and for neighbouring seats within one cabin. Customers try to get the lowest possible price, while carriers try to keep their revenue as high as possible and growing. Airlines may lower the price when they need to build demand, and raise it when tickets become harder to obtain. The price can therefore depend on various factors. People who travel a lot by plane are aware of these price fluctuations. Airlines operate different pricing systems based on complex revenue management rules. This paper presents a flight fare prediction system based on machine learning that uses the KNN, Random Forest, Gradient Boosting Regression, SVR, and Linear Regression algorithms to analyze the dataset and to predict the price of an airline ticket from the features in its columns.

II. LITERATURE REVIEW

K. Tziridis, Th. Kalampokas, et al. [1] created a technique for predicting airline ticket prices. The authors begin with some basic machine learning material before moving on to the approach, which comprises four distinct steps: selection of the features that influence flight costs, data collection from the Greek airline Aegean Airlines, model selection, and evaluation. Many of these characteristics were included in the airline dataset. Eight state-of-the-art regression models were employed for forecasting: MLP, GRNN, ELM, Random Forest Regression Tree, Regression Tree, Bagging Tree, Regression SVM, and Linear Regression. The outcomes of these models were compared and analyzed. The Bagging Regression Tree model outperformed the other regression tree methods.

Tianyi Wang, Samira Pouyanfar, et al. [2] address the airfare prediction problem at the market segment level using a machine learning technique. DB1B and T-100, two available datasets with basic features, were acquired for training and evaluating the proposed model. The methodology comprises data cleaning, data transformation, data pre-processing, feature selection, and ML model deployment. The Random Forest model is used for development because it outperforms other models, such as LR, SVM, and Neural Networks. With an R-squared score of 0.869, this prediction framework has a good level of accuracy.

Groves and Gini [3] used Partial Least Squares Regression (PLSR) to build a model for predicting the best purchase time for airline tickets. Data was collected from travel booking websites from February 22nd to June 23rd, 2011.

Wohlfarth proposed a ticket-buying speed-up model [4] that relied on a novel pre-processing step called marked point processes, together with data mining techniques and a statistical analysis method. The purpose of this pre-processing is to convert heterogeneous price series into price series trajectories for use with unsupervised clustering algorithms.

Supriya Rajankar, Neha Sakharkar, et al. [5] proposed methods for forecasting the price of an airline ticket at a certain point in time using machine learning regression. The study begins with data gathering, which was done via makemytrip.com. The dataset has seven components: date of departure, time of departure, place of departure, time of arrival, place of destination, airline, and total fare. The data is then cleaned and pre-processed before being analysed with a variety of models. The authors analyse the performance of several machine learning models on the data, including LR, Decision Tree, SVM, KNN, and Random Forest, and find that KNN produces R-squared values close to 1, indicating high accuracy.
DOI: 10.5281/zenodo.6906451
ISBN: 978-93-5607-317-3@2022 MCA, Amal Jyothi College of Engineering Kanjirappally, Kottayam

III. MOTIVATION

Everyone knows that the holidays are a time when people look for a much-needed vacation, and that arranging a trip can be difficult. As a result of the global emergence of the Internet and e-commerce, the commercial aviation industry has experienced remarkable growth and has become a controlled market. Customers often try to buy their ticket well before the day of departure to avoid the fare rising as the day approaches. But in reality this is not the case: the customer may end up paying more than they should for the same seat. The proposed model uses regression analysis, visualization, and various other techniques. It assists the user in accurately predicting the price of an airline ticket, and will help ordinary travellers to easily predict the future fare of plane tickets.

IV. METHODOLOGY

The goal of this work is to use the provided dataset to create a machine learning model that can accurately predict the price of a plane ticket. The dataset consists of two parts, a training set and a testing set. To increase learning accuracy, the model should be trained with more data. The model's output can be used to forecast airline ticket prices. The ticket prices are forecast using the KNN algorithm, Random Forest, Gradient Boosting Regression, SVR, and Linear Regression. The structure is:

Collect Dataset -> Import libraries -> Read Dataset -> Data pre-processing -> Building Model -> Random Prediction -> Model Deployment

Figure 1

The steps that need to be followed are:
1. Data Collection
2. Data Pre-processing
3. Model Building
4. Analyzing
5. Result

A. Data Collection: The training and testing datasets were taken from the Kaggle data pool. They include both categorical and nominal information for Indian airlines as of 2019. The dataset contains crucial information about some of the elements that determine flight pricing, such as departure and arrival locations, time of departure and arrival, flight path, number of stops along the way, and the ticket price based on those variables, all of which are used to predict flight pricing. The dataset has 10683 rows and 11 columns, each column representing one attribute.

B. Data Pre-processing: This is the first stage in any machine learning workflow. Data cleaning, data transformation, and data reduction are all part of this process. All of this is done to improve the quality of the data, so that it can be analyzed to improve the accuracy of our model.

a. Cleaning Data - In the training dataset, the null values were deleted. A few columns were eliminated because they were unnecessary for feature selection. Once new columns with numerical values derived from the pre-processed data had been stored for the prediction, the original columns holding categorical data were removed from the dataset. As a result, an appropriate training dataset with the following attribute columns was obtained.

b. Formatting the Data - While pre-processing the data, we add a new weekday column, where 1 means a weekday and 0 means a weekend. We format the arrival and departure times and add two extra columns indicating whether the flight takes place at night or early in the morning. Some flights are less expensive early in the morning and more expensive late at night, indicating a clear correlation. The duration is also converted into separate hour and minute columns, and label encoding is applied to convert categorical data to unique integer values.

c. Splitting of Data - After formatting, the data is split into training and testing datasets. The data chosen for training is then used to train our model.

C. Machine Learning: Machine learning is used to help the user predict the price of an airline ticket as precisely as possible. The machine learning algorithms predict fares using the given dataset. Different learning algorithms are used to predict the airfares. The performance of a machine learning algorithm depends on how it is trained. Which algorithm works best depends on the type of problem being solved, the computing resources available, and the type of data.
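The formatting and splitting steps described under Data Pre-processing can be sketched roughly as follows. This is only an illustration using the Python standard library: the field names, the sample records, the hour cut-offs for "night" and "early morning", and the 25% test fraction are all assumptions, not details taken from the paper or its dataset.

```python
import random
from datetime import datetime

# Hypothetical raw records; field names and values are illustrative assumptions.
raw_rows = [
    {"airline": "IndiGo", "date": "24/03/2019", "dep_time": "22:20",
     "duration": "2h 50m", "price": 3897},
    {"airline": "Air India", "date": "01/05/2019", "dep_time": "05:50",
     "duration": "7h 25m", "price": 7662},
    {"airline": "Jet Airways", "date": "12/06/2019", "dep_time": "09:25",
     "duration": "19h", "price": 13882},
    {"airline": "IndiGo", "date": "09/03/2019", "dep_time": "18:05",
     "duration": "5h 25m", "price": 6218},
]


def parse_duration(text):
    """Split a duration such as '2h 50m' into separate (hours, minutes)."""
    hours, minutes = 0, 0
    for part in text.split():
        if part.endswith("h"):
            hours = int(part[:-1])
        elif part.endswith("m"):
            minutes = int(part[:-1])
    return hours, minutes


def build_label_map(values):
    """Label encoding: give each distinct category a unique integer."""
    return {label: code for code, label in enumerate(sorted(set(values)))}


def format_row(row, airline_codes):
    """Derive the numeric columns described in the formatting step."""
    journey = datetime.strptime(row["date"], "%d/%m/%Y")
    dep_hour = int(row["dep_time"].split(":")[0])
    hours, minutes = parse_duration(row["duration"])
    return {
        "airline": airline_codes[row["airline"]],
        "weekday": 1 if journey.weekday() < 5 else 0,  # 1 = weekday, 0 = weekend
        "night": 1 if dep_hour >= 20 else 0,           # assumed cut-off: 20:00
        "early_morning": 1 if dep_hour < 6 else 0,     # assumed cut-off: 06:00
        "duration_hours": hours,
        "duration_minutes": minutes,
        "price": row["price"],
    }


def split_rows(rows, test_fraction=0.25, seed=0):
    """Shuffle reproducibly, then split into (training, testing) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    return shuffled[n_test:], shuffled[:n_test]


airline_codes = build_label_map(r["airline"] for r in raw_rows)
formatted = [format_row(r, airline_codes) for r in raw_rows]
train_rows, test_rows = split_rows(formatted)
```

In practice the same steps would normally be done with a data-frame library; the plain-Python version is used here only to keep each transformation explicit.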
1. Linear Regression - Linear regression is a supervised machine learning algorithm that carries out a regression task. Regression models a target prediction value based on independent variables. It is generally used for forecasting and for determining how variables are related. Regression models differ in the type of relationship they assume between the dependent and independent variables. The cost function and gradient descent are the two most important concepts for understanding linear regression. The equation for linear regression is:

y(pred) = b0 + b1 * x

where b0 is the intercept and b1 is the coefficient of the input x.

2. Support Vector Regression (SVR) - SVR is a regression approach based on the same principles as the Support Vector Machine (SVM), a machine learning model used to solve classification problems, that is, to sort data into categories. The R2 score is used to evaluate the performance of the regression model.

3. K-Neighbors Regressor - Regression is performed using the k-nearest neighbours approach. The response may be scalar, multivariate, or functional. The target is predicted by local interpolation of the targets associated with the nearest neighbours in the training set. The coefficient of determination, also known as the R2 score, is used to assess the regression model's performance. It measures the proportion of the variation in the output that can be predicted from the input variables.

4. Random Forest Regressor - Random Forest is a useful machine learning technique for a variety of tasks, including regression and classification. A random forest model is made up of many small decision trees called estimators, each of which produces its own prediction.

5. Gradient Boosting Regression - Gradient Boosting Regression (GBR) calculates the difference between the current prediction and the known correct target value. This difference is called the residual. A weak model is then trained to map the features to that residual. Gradient boosting is a technique for predicting a continuous value.

V. BUILD MODEL

Model building is the main step in flight price prediction. The algorithms are applied while building the model, in the following steps:

1. Import the necessary packages.
2. Load the data into a DataFrame, then get the shape of the data.
3. Split the dataset into training and testing datasets.
4. Cross-validate to check the model's performance by training it on subsets of the input data.
5. Use the different algorithms to compute the R-squared, MSE, and MAE values, which help to assess accuracy.

Machine Learning (ML) model    R squared*    MAE         MSE
KNeighbors Regressor           2468.73705    1438.629    6094662.62985
Random Forest Regressor        1573.08845    635.6535    2474607.2856
Gradient Boosting Regressor    2466.54433    1563.642    6083830.9535

* The values in this column are approximately the square roots of the MSE values, so they appear to be root-mean-square errors (RMSE) rather than coefficients of determination; an R-squared score cannot exceed 1.

6. Predict on the test set and calculate the accuracy of the model.

VI. RESULT

The table presents the results of the ticket price study and of the predictions. The algorithms analyzed are the KNeighbors Regressor, Random Forest Regressor, Gradient Boosting Regression, SVR, and Linear Regression. Their accuracy is evaluated using the R-squared, MSE, and MAE values.
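As a rough illustration of the evaluation measures named in the build steps, the sketch below fits the single-variable model y(pred) = b0 + b1 * x by least squares and then computes R-squared, MAE, MSE, and RMSE on held-out points. The function names and the toy numbers (flight duration in hours against fare) are hypothetical and are not the paper's data or code. Note that R-squared is at most 1, while MAE and RMSE are expressed in the same units as the price.

```python
import math


def fit_simple_linear_regression(xs, ys):
    """Least-squares estimates of b0 and b1 in y_pred = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def regression_metrics(y_true, y_pred):
    """Return R-squared, MAE, MSE and RMSE for paired true/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"r2": r2, "mae": mae, "mse": mse, "rmse": math.sqrt(mse)}


# Hypothetical toy data: flight duration in hours versus fare.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3100.0, 4900.0, 7100.0, 8900.0]
test_x = [2.5, 5.0]
test_y = [6000.0, 11200.0]

b0, b1 = fit_simple_linear_regression(train_x, train_y)
predictions = [b0 + b1 * x for x in test_x]
scores = regression_metrics(test_y, predictions)

for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

The same metric function could be applied to the predictions of any of the five regressors, which makes their scores directly comparable on one test set.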


[Figures: Random Forest Regressor; SVR; K Neighbors Regressor; Gradient Boosting Regressor; Linear Regression]


VII. CONCLUSION

This paper explains how to forecast flight ticket prices. A dataset is collected, pre-processed, modelled, and analyzed in order to test the algorithms. Machine learning methods are used to predict airline fares accurately and to give accurate ticket price values at both the low and high ends of the price range. The data, which originates from websites that sell airline tickets, was obtained from Kaggle. As indicated in the above analysis, the KNeighbors Regressor and Gradient Boosting Regressor yield good results, while the Random Forest Regressor gives the highest accuracy. The R-squared value also indicates the model's accuracy.
VIII. REFERENCE

[1] Konstantinos Tziridis, Theofanis Kalampokas, "Airfare Prices Prediction Using Machine Learning Techniques," DOI: 10.23919/EUSIPCO.2017.8081365

[2] T. Wang et al., "A Framework for Airfare Price Prediction: A Machine Learning Approach," doi: 10.1109/IRI.2019.00041.

[3] Groves, W. and Gini, M., 2021. "A Regression Model For Predicting Optimal Purchase Timing For Airline Tickets." 262172314

[4] Juhar Ahmed Abdella, NM Zaki, Khaled Shuaib, Fahad Khan, "Airline ticket price and demand prediction: A survey," https://doi.org/10.1016/j.jksuci.2019.02.001

[5] Supriya Rajankar, Neha Sakharkar, Omprakash Rajankar, "Predicting The Price Of A Flight Ticket With The Use Of Machine Learning Algorithms," ISSN 2277-861
