Flight Price Prediction Using Machine Learning Algorithms
Abstract: Airline ticket prices change very quickly these days, and the differences can be large. The price can vary even within a few hours for the same flight. For business purposes, many airlines change fares according to the season or the period of time. Airlines use a variety of calculation methods to increase their profits, for example, separating demand between expectations and value. Each carrier uses its own set of criteria and algorithms to determine the price. Machine learning, artificial intelligence, and deep learning are all important tools for this problem, and they make it possible to estimate air travel expenses for a particular period of time. In this paper, we use machine learning algorithms, namely KNN, Random Forest, Gradient Boosting Regression, SVR, and Linear Regression. Basic information such as airline, source, destination, route, and total stops is provided as input to forecast flight prices.

Keywords: Price, Flight, Regressor, Prediction, Accuracy, Random Forest, Machine Learning

I. INTRODUCTION

Nowadays, the price of an airline ticket can vary significantly for the same flight, even for adjacent seats within one cabin. Customers try to get the lowest possible price, while carriers try to keep their revenue as high as possible and growing. Airlines can reduce the price when they need to build a market and raise it when tickets become harder to obtain. The price can therefore depend on various factors. People who travel frequently by plane are aware of these price fluctuations. Airlines operate different pricing systems based on complex revenue management rules. This paper presents a flight fare prediction system based on machine learning that uses the KNN, Random Forest, Gradient Boosting Regression, SVR, and Linear Regression algorithms to estimate airline ticket prices. The data set is analyzed with machine learning techniques in order to predict the price of an airline ticket from the features in its columns.

II. LITERATURE REVIEW

K. Tziridis, Th. Kalampokas, et al. [1] created a technique for predicting airline ticket prices. The authors begin with some basic machine learning material before moving on to the approach, which comprises four distinct steps: selection of the features that influence flight costs, data collection from the Greek carrier Aegean Airlines, model selection, and evaluation. Many of these characteristics were included in the airline dataset. Eight state-of-the-art regression machine learning models were employed for forecasting: MLP, GRNN, ELM, Random Forest Regression Tree, Regression Tree, Bagging Regression Tree, Regression SVM, and Linear Regression. The outcomes of these models were compared and analyzed, and the Bagging Regression Tree model outperformed the other regression tree methods.

Tianyi Wang, Samira Pouyanfar, et al. [2] formulate the problem at the market segment level using a machine learning technique. DB1B and T-100, two available datasets with basic features, were acquired for training and evaluation of the proposed model. Data cleaning, data transformation, data preprocessing, feature selection, and ML model deployment are all part of the methodology. The Random Forest model is used for development since it outperforms other models, such as LR, SVM, and Neural Networks. With an R-squared score of 0.869, this prediction framework has a good level of accuracy.

Gini and Groves [3] used Partial Least Squares Regression (PLSR) to build a model for predicting the best purchase time for airline tickets. Data was collected from trip booking websites from February 22nd to June 23rd, 2011.

Wohlfarth [4] proposed a ticket-buying model that relied on a novel pre-processing step based on marked point processes and data mining techniques, as well as a measurable research technique. The system's purpose is to convert heterogeneous value arrangement input into added value arrangement direction using unsupervised clustering computations.

Supriya Rajankar and Neha Sakharkar [5] proposed methods for forecasting the price of an airline ticket at a certain point in time using machine learning regression. The study begins with data gathering, which was done via makemytrip.com. The date of departure, time of departure, place of departure, time of arrival, place of destination, airline, and total fare are the seven components of this dataset. The data is then cleaned and pre-processed before being analysed with a variety of models. The authors analyse the performance of several machine learning models on the data, including LR, Decision Tree, SVM, KNN, and Random Forest, and find that KNN produces R-squared values close to 1, indicating high accuracy.
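Several of the studies above report an R-squared score. As a reminder of what that number measures, the sketch below computes the usual coefficient of determination, 1 minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the fare values are hypothetical, chosen only for illustration; they do not come from any of the cited papers.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical fares (illustrative numbers only, not from any dataset).
actual = [4000.0, 5500.0, 7200.0, 6100.0]
predicted = [4200.0, 5300.0, 7000.0, 6400.0]
score = r_squared(actual, predicted)
print(f"R^2 = {score:.3f}")
```

A score of 1 means the predictions match the observed fares exactly, and a score of 0 means the model does no better than always predicting the mean fare.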
DOI: 10.5281/zenodo.6906451
ISBN: 978-93-5607-317-3@2022 MCA, Amal Jyothi College of Engineering Kanjirappally, Kottayam
Proceedings of the National Conference on Emerging Computer Applications (NCECA)-2022
Vol.4, Issue.1 547
III. MOTIVATION

Everyone knows that the holidays are a time when people are looking for a much-needed vacation, and that finalizing a trip can be a difficult effort. As a result of the global emergence of the Internet and e-commerce, the commercial aviation industry has experienced remarkable growth and has become a controlled market. Customers often try to buy the ticket well before the day of departure to avoid the rise in airfare as the day approaches. In reality, however, this does not hold: the customer may end up paying more than they should for the same seat. For the proposed model, regression analysis, visualization, and various other techniques are used. This model assists the user in accurately predicting the price of an airline ticket, and will help ordinary travellers to easily predict the future fare of plane tickets.

IV. METHODOLOGY

The goal of this work is to use the provided dataset to create a machine learning model that can accurately predict the price of a plane ticket. The dataset consists of two parts, a training set and a testing set. To increase learning accuracy, the model should be trained with more data. The model's output can be used to forecast airline ticket prices. The ticket prices are forecast using KNN, Random Forest, Gradient Boosting Regression, SVR, and Linear Regression. The structure is:

Figure 1 (flowchart; boxes include "Collect Dataset" and "Model Deployment")

The steps to be followed are:
1. Data Collection
2. Data Pre-processing
3. Model Building
4. Analyzing
5. Result

A. Data Collection: The training and testing datasets were obtained from Kaggle. They include both categorical and nominal information for Indian airlines as of 2019. The dataset contains key information about some of the factors that determine flight pricing, such as the departure and arrival locations, the times of departure and arrival, the flight route, the number of stops along the way, and the ticket price, all of which are used to predict flight prices. The dataset has 10683 rows and 11 columns (each column representing one attribute).

B. Data Pre-processing: This is the first stage in any machine learning workflow. Data cleaning, data transformation, and data reduction are all part of this process. All of this is done to improve the quality of the data. The data can then be analyzed to improve the accuracy of our model, so that its predictions are correct.

a. Cleaning Data: In the training dataset, the null values were removed. A few columns were eliminated because they were unnecessary for the feature selection technique. Once the new columns with numerical values derived from the preprocessed data had been stored for prediction, the columns with categorical data were removed from the dataset. As a result, an appropriate training dataset with the following attribute columns was obtained.

C. Machine Learning: This is used to help the user predict the price of an airline ticket with the greatest degree of precision. The machine learning algorithms use the given dataset to predict fares. Different learning algorithms are used to predict the airfares. The performance of a machine learning algorithm depends on how it is trained. Which algorithm
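The cleaning step described under B above (remove nulls, derive numeric columns from categorical ones, then drop the original categorical columns) can be sketched with pandas as follows. This is a minimal illustration rather than the authors' code. The column names (`Airline`, `Source`, `Destination`, `Total_Stops`, `Dep_Time`, `Price`), the sample rows, the `stops_map` encoding, and the helper `preprocess` are all assumptions made for the example, and dropping whole rows is only one way to remove null values.

```python
import pandas as pd

# Tiny hypothetical sample; column names and values are assumptions
# chosen to resemble the attributes the paper lists.
raw = pd.DataFrame({
    "Airline": ["IndiGo", "Air India", "IndiGo", None],
    "Source": ["Delhi", "Kolkata", "Delhi", "Mumbai"],
    "Destination": ["Cochin", "Bangalore", "Cochin", "Hyderabad"],
    "Total_Stops": ["non-stop", "2 stops", "1 stop", "non-stop"],
    "Dep_Time": ["22:20", "05:50", "09:25", "18:05"],
    "Price": [3897, 7662, 13882, 6218],
})


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    # 1. Remove rows that contain null values.
    df = df.dropna().copy()

    # 2. Derive numeric columns from the text columns.
    stops_map = {"non-stop": 0, "1 stop": 1, "2 stops": 2}  # assumed encoding
    df["Stops"] = df["Total_Stops"].map(stops_map)
    dep = df["Dep_Time"].str.split(":", expand=True).astype(int)
    df["Dep_Hour"] = dep[0]
    df["Dep_Min"] = dep[1]

    # 3. One-hot encode the nominal columns; get_dummies replaces the
    #    originals with 0/1 indicator columns.
    df = pd.get_dummies(df, columns=["Airline", "Source", "Destination"], dtype=int)

    # 4. Drop the text columns that now have numeric replacements.
    return df.drop(columns=["Total_Stops", "Dep_Time"])


clean = preprocess(raw)
print(clean.shape)
print(clean.dtypes)
```

After this step every remaining column is numeric, which is what the regressors named in the methodology expect as input.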
2. Add the data into a Data Frame, then get the shape of
SVR
K Neighbors Regressor
Linear Regression
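The model names listed above, together with Random Forest and Gradient Boosting Regression from the methodology, correspond to estimators available in scikit-learn. The sketch below shows one way such a comparison could be set up, scoring each model by R-squared on a held-out split. It is not the authors' experiment: the data is synthetic (three random numeric features and a made-up price formula), and the hyperparameters are library defaults or arbitrary choices, so the printed scores say nothing about the real fare dataset.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR

# Synthetic stand-in for the preprocessed fare table (assumption: three
# numeric features and a noisy, mostly linear price target).
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(300, 3))
y = 3000 + 4000 * X[:, 0] + 1500 * X[:, 1] ** 2 + rng.normal(0, 100, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "KNN": KNeighborsRegressor(n_neighbors=5),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
    "SVR": SVR(kernel="rbf"),
    "Linear Regression": LinearRegression(),
}

# Fit each model on the training split and score it on the test split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: R^2 = {score:.3f}")
```

SVR in particular is sensitive to the scale of the features and the target, so in practice it is usually combined with a scaling step; that is left out here to keep the sketch short.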
VII. CONCLUSION
[4] Juhar Ahmed Abdella, NM Zaki, Khaled Shuaib, Fahad Khan, "Airline ticket price and demand prediction: A survey", https://doi.org/10.1016/j.jksuci.2019.02.001