Cse 28


Developing and Deploying a Flight Fare Predictive Web Application

Abstract: Air travel has become an integral part of modern life as more and more people choose faster travel options. Airfares rise and fall every day depending on factors such as flight timing, destination, flight duration, and occasions such as holidays or the holiday season. A basic understanding of how fares behave can therefore save travelers time and money when they make travel arrangements. In the proposed system, a predictive model is created by applying machine learning algorithms to collected historical data. The system gives people an idea of the trends that prices follow and provides a predicted price that they can check before booking flights in order to save money. Flight booking companies could offer this kind of system or service to help customers book tickets.

Keywords: Machine learning; Prediction model; Feature selection; Airfare price; Pricing models; Random Forest

1. Introduction

The usual advice is to buy a ticket many days before the flight departs in order to avoid the highest fares. Airline pricing does not always follow this pattern. Airlines may lower fares when they need to build a market and raise them when tickets are less available, so the price depends on various factors. To predict fares, this project uses AI to model how ticket prices change over time. Airlines are free to adjust the price of their tickets at any moment. A traveler can save money by booking the lowest-cost flight. People who travel frequently by plane are aware of these price fluctuations.

Airlines use comprehensive revenue management principles to implement distinctive pricing systems. As a result, the pricing system changes the fare depending on the time, season, and holidays. The ultimate goal of airlines is to make a profit, while the customer seeks the minimum fare. Customers usually try to buy a ticket well in advance of the departure date to avoid the increase in ticket prices as the date approaches. In reality, this does not always work: a customer may end up paying more than they should for the same destination.

This project aims to predict flight prices for different flights using a machine learning model. The user receives the predicted values and can use them as a guide when choosing whether to purchase tickets.

In the current scenario, airlines adjust ticket prices to maximize their profits. Many people take flights frequently, so they are aware of the best times to get affordable tickets. But there are also many people who have no experience booking tickets, and they end up falling into the trap of discounts from the companies, where they actually spend more than they should have. By providing customers with the information they need to book tickets at the right moment, the proposed system can help them save millions of rupees.

The proposed problem statement is "Flight Price Prediction System".

2. Literature Survey

Study [1], on airfare price prediction using machine learning techniques, collected a data set of 1814 Aegean Airlines flights and used it to train a machine learning model. Different numbers of features were used to train different models in order to demonstrate how feature selection can change model accuracy.

In a case study [2] by William Groves, an agent is introduced that is able to optimize the timing of the purchase on behalf of customers. Partial least squares regression is employed to create the model.

In study [3], the model is implemented for the San Francisco-New York route using a linear quantile mixed regression methodology, with ticket prices collected every day from an online website. Two features, the number of days until departure and whether the departure falls on a weekend or a weekday, are considered in model development.

Study [4] examines different flight trends and the best time to buy a ticket. It also debunks some of the typical myths and misconceptions about the airline industry and backs its findings with data and analysis.

In a case study [5] by Tianyi Wang, several features were extracted from the datasets and combined with the data to model the segments of the air transport market. With the help of feature selection techniques, the proposed model is able to predict the quarterly average ticket price.

In research [6] by Vinod Kimbhaune, proper implementation of the project saved inexperienced people money by providing them with information on the trends that flight prices follow, together with a predicted price value that they can use to decide whether to book a ticket now or later.

In survey [7] by Neel Bhosale, machine learning algorithms are applied to the data set to predict the dynamic price of flights. This gives the predicted flight fare so that the ticket can be bought at the minimum price.

Jaywrat Singh Champawat [8] proposed a framework to find a machine learning model that provides higher accuracy in predicting the price of Indian flights. Working with different models, it was found that the Random Forest algorithm showed the highest accuracy in predicting the output.

Research [9] observes that fare prediction for civil aviation remains relatively imprecise and unreliable. A prediction method based on MADA is proposed to solve this problem. Judging from the experimental results, the MADA-based method can provide more accurate predictions of civil aviation ticket prices than traditional methods.

Study [10] applies machine learning algorithms to predict the price of air tickets and provides the value of the air ticket price at its limited and highest values. Model accuracy is also assessed by the R-squared value.

A case study [11] by Qiqi Ren focuses on aspects that are visible on the consumer side and only predicts a binary class of whether the price will increase or not, which is essentially whether to buy now or wait.

In research [12] by Juhar Ahmed Abdella, two main areas of research are discussed: prediction models that are designed to save money for the customer and those that are designed to increase airline revenue. The strengths and weaknesses of the existing work are also discussed.

3. Proposed Work

Forecasting the price of an airline ticket is a very challenging task because the price depends on many factors. Many researchers have used various machine learning algorithms to obtain a model with higher ticket price prediction accuracy. Researchers use various regression models such as support vector machines (SVMs), linear regression (LR), decision trees, and random forests to predict the price of a flight.

After further reading, it was found that models are divided into two types: those that predict the minimum ticket price and those that help generate maximum returns, which can be termed customer-side models and airline-side models, respectively. In addition, other research has examined the various factors that lead to changes in ticket prices and how demand changes the price. This research found that customers who travel for leisure are more sensitive to ticket prices than customers traveling for business purposes. Many researchers are also looking at the date of reservation and the date of travel as influences on price increases. Studies are also being conducted on the effects of delays on fares.

This article uses Python 3 to implement machine learning algorithms and create a model that makes predictions with high precision. Various Python libraries are imported to perform these tasks.

There are various steps involved in building an ML model, starting with importing a dataset and cleaning the data. All null values and duplicate values are removed from the dataset. The data is then encoded, which converts categorical data to numeric data.

After the dataset is processed, feature selection is performed. Features or variables that are not important are removed from the dataset. Exploratory data analysis is performed to provide insight and identify important features using the Extra Trees Regressor. Feature engineering is performed to reduce computational cost and sometimes to improve accuracy. This is done using a correlation matrix. The data is then split into train and test data: the train data is used to train the models, and the test data is used to check their accuracy. The optimal features and parameters are then chosen by hyperparameter tuning. After all models are trained, their accuracy is checked using their R-squared value.

4. Implementation

We implemented the machine learning lifecycle for this project to create a basic web application that predicts flight prices using a machine learning algorithm trained on historical flight data, with Python libraries such as Pandas, NumPy, Matplotlib, seaborn, and sklearn. The figure shows the lifecycle steps that we followed.

The first step is the selection of data, where the historical flight data is collected for a price prediction model. Our dataset contains more than 10,000 flight-related data
records and their prices. Source, destination, departure date and time, number of stops, arrival time, and cost are just a few of the dataset's features.

We cleansed the dataset during the exploratory data analysis process by eliminating duplicate and null values. The accuracy of the model would suffer if these values weren't eliminated.

The next step is data preprocessing, where we noticed that most of the data was stored in string format. Data is extracted from each feature, such as the day and month from the date of the trip and the hours and minutes from the departure time, and stored in integer format. The source and destination features had to be transformed into numeric values because they are categorical. For this, categorical values are transformed into model-identifiable values using one-hot encoding and label encoding.

The feature selection step selects the important features that correlate more strongly with price. Some features, such as additional information and route, are unnecessary and can affect model accuracy, so they need to be removed before our model is ready for prediction.

The next phase, after the features that are more closely related to price have been chosen, involves applying a machine learning algorithm and generating the model. Since our dataset consists of labeled data, we use supervised machine learning algorithms. Among supervised algorithms, we use regression algorithms because our dataset contains continuous values. Regression models are used to explain the relationship between dependent and independent variables. We use the following machine learning algorithms in our project:

Linear regression

We use multiple linear regression, which estimates the relationship between two or more independent variables and one dependent variable. In simple linear regression, there is only one independent variable and one dependent variable. However, our dataset contains many independent features on which the price may depend.

The following depicts the multiple linear regression model:

$$Y = \beta_0 + \beta_1 x_1 + \dots + \beta_n x_n + \varepsilon$$

$Y$ = the predicted value of the dependent variable
$x_1, \dots, x_n$ = the independent variables
$\beta_0$ = the y-intercept, the value of $Y$ when all other parameters are zero
$\beta_1, \dots, \beta_n$ = the coefficients of the independent variables
$\varepsilon$ = the error term

Decision tree

Regression trees and classification trees are the two main forms of decision trees: regression trees are used for continuous data and classification trees for categorical values. The decision tree chooses independent variables from the dataset as decision nodes.

The entire dataset is divided into many subsections, and when test data is fed into the model, the result is determined by identifying which subsection the data point belongs to. The decision tree's output is the average value of all the data points in that subsection.

Random forest

Random Forest is an ensemble learning technique, in which the training model employs a number of learning models and the separate outputs are then combined to produce the final predicted outcome. Random forest belongs to the bagging category of ensemble learning, where a random subset of features and records is chosen and given to each model in the group. In essence, random forest uses decision trees as its group of models. The output of the random forest model is the average of the values predicted by the individual decision trees.

Performance metrics

The accuracy of the machine learning models trained by the various algorithms is compared using performance metrics, which are statistical measures. Regression metrics that measure each model's error are implemented using the sklearn.metrics module. The following metrics are examined to determine each model's error rate:

MAE (Mean Absolute Error)

The mean absolute error is the mean of the absolute differences between the predicted and actual values.

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$

Here $y_i$ are the actual output values, $\hat{y}_i$ are the predicted output values, and $n$ is the total number of data points. The lower the MAE, the better the model performs.

MSE (Mean Squared Error)

The mean squared error squares the difference between the actual and predicted output values before summing them, instead of using the absolute value.

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2$$

$y_i$ = actual output values
$\hat{y}_i$ = predicted output values
$n$ = total number of data points
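The preprocessing described in this section (splitting the journey date and departure time strings into integer parts, and one-hot encoding the categorical source and destination features) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the field names and the three sample records are hypothetical. With pandas, `pandas.to_datetime` and `pandas.get_dummies` would play the same roles.

```python
from datetime import datetime

# Hypothetical raw records; field names and values are illustrative assumptions.
raw_flights = [
    {"source": "Delhi", "destination": "Cochin",
     "date_of_journey": "24/03/2019", "dep_time": "22:20", "price": 3900},
    {"source": "Kolkata", "destination": "Bangalore",
     "date_of_journey": "01/05/2019", "dep_time": "05:50", "price": 7600},
    {"source": "Delhi", "destination": "Cochin",
     "date_of_journey": "09/06/2019", "dep_time": "09:25", "price": 13800},
]


def extract_time_features(record):
    """Turn the date and time strings into integer day, month, hour, minute."""
    journey = datetime.strptime(record["date_of_journey"], "%d/%m/%Y")
    departure = datetime.strptime(record["dep_time"], "%H:%M")
    return {
        "journey_day": journey.day,
        "journey_month": journey.month,
        "dep_hour": departure.hour,
        "dep_minute": departure.minute,
    }


def one_hot(value, categories, prefix):
    """One-hot encode a categorical value against a fixed category list."""
    return {f"{prefix}_{c}": int(value == c) for c in categories}


def preprocess(records):
    """Build numeric feature rows from the raw string records."""
    sources = sorted({r["source"] for r in records})
    destinations = sorted({r["destination"] for r in records})
    rows = []
    for r in records:
        row = extract_time_features(r)
        row.update(one_hot(r["source"], sources, "source"))
        row.update(one_hot(r["destination"], destinations, "destination"))
        row["price"] = r["price"]
        rows.append(row)
    return rows


processed = preprocess(raw_flights)
```

Every value in `processed` is an integer, so the rows can be passed to a regression model once the `price` column is separated out as the target.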
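To make the decision tree and random forest descriptions in this section concrete, the sketch below fits a one-split regression tree (a "stump") on a single feature and then averages many stumps trained on bootstrap samples. It is a deliberately simplified stand-in written with the standard library only, not the scikit-learn models the project uses: a real random forest grows deeper trees and also samples features. The `days_left` and `price` values are made up for illustration.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree on one feature.

    Returns (threshold, left_mean, right_mean): each side of the split
    predicts the mean target of the training points that fall into it.
    """
    best = None
    for threshold in sorted(set(xs))[1:]:  # split points between distinct values
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # only one distinct x value: a single leaf
        overall = mean(ys)
        return (float("inf"), overall, overall)
    return best[1:]


def predict_stump(stump, x):
    """Return the mean of the subsection that x falls into."""
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean


def fit_forest(xs, ys, n_trees=25, seed=0):
    """Bagging: train each stump on a bootstrap sample of the records."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """The ensemble output is the average of the individual predictions."""
    return mean(predict_stump(stump, x) for stump in forest)


# Illustrative data: fares are higher when few days remain before departure.
days_left = [1, 2, 3, 10, 20, 30]
price = [9000, 8800, 9200, 5000, 5200, 4800]

stump = fit_stump(days_left, price)
forest = fit_forest(days_left, price)
```

On this data the single stump splits at `days_left = 10` and predicts the mean fare on each side, while the forest prediction for any input stays within the range of fares seen in training because it is an average of leaf means.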
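A small worked example may help compare MAE and MSE. The sketch below computes both by hand on made-up fares, with two sets of predictions that have the same MAE but very different MSE. The function names mirror `mean_absolute_error` and `mean_squared_error` from `sklearn.metrics`, but this is a plain Python illustration under assumed data rather than the project's code.

```python
def mean_absolute_error(actual, predicted):
    """Mean of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Mean of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical fares, chosen only to keep the arithmetic easy to follow.
actual = [100, 200, 300, 400]
evenly_off = [110, 190, 310, 390]   # every prediction is off by 10
one_big_miss = [100, 200, 300, 440]  # three exact predictions, one off by 40

mae_even = mean_absolute_error(actual, evenly_off)
mse_even = mean_squared_error(actual, evenly_off)
mae_miss = mean_absolute_error(actual, one_big_miss)
mse_miss = mean_squared_error(actual, one_big_miss)
```

Both prediction sets give an MAE of 10, but the MSE is 100 for the evenly spread errors and 400 for the single large miss, because squaring weights the 40-unit error far more heavily.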
MSE penalizes large errors because the errors are squared. The lower the MSE value, the better the model performance.

RMSE (Root Mean Squared Error)

RMSE is measured by taking the square root of the mean squared difference between the predicted and actual values.

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}$$

Here $y_i$ are the actual output values, $\hat{y}_i$ are the predicted output values, and $n$ is the total number of data points.

RMSE is never smaller than MAE, and when comparing different models, the smaller the RMSE value, the higher the performance of the model.

R2 (Coefficient of Determination)

It helps you understand how well the independent variables account for the variance in your model.

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}$$

Here $\bar{y}$ is the mean of the actual output values. The R-squared value typically lies between 0 and 1. The closer its value is to one, the better your model is compared with other models.

Figure: System Architecture Diagram

We include the last three steps of the life cycle model for deploying a trained machine learning model. Therefore, after obtaining the model with the best accuracy, we save this model to a file using the pickle module. The application is built using the Flask framework, where API endpoints handling GET and POST requests are created to perform operations related to loading and displaying data on the front-end application.

The front-end of the application is created using the Bootstrap framework, where the user can enter their flight details. This information is transmitted to the backend service, where the model predicts the result based on the input. The front-end receives the predicted value and displays it.

5. Experimental Results and Conclusion

Figure: Graphical Results for Random Forest

As the graph forms a Gaussian distribution, this means that our results are good.

This paper set out to find a machine learning model that provides higher accuracy in flight fare prediction. Working with different models, it was found that the Random Forest algorithm showed the highest prediction accuracy. The paper provides better results than previously observed models and aims to improve in the future.

Proper implementation of this project can save inexperienced people money by providing them with information on the trends that airfares follow and by giving them a predicted price that they can use to decide whether to book a flight now or later. In conclusion, this service can be implemented with good forecast precision. Because the predicted value is not completely accurate, there is a lot of room for improvement in this kind of service.

6. Future Scope

More routes can be added and the same analysis can be extended to major airports and travel routes in India. More data points and historical data should be taken into
account for analysis. This will train the model better and provide better accuracy and more savings.

Additional rules may be added to rule-based learning based on our understanding of the industry, including offer periods provided by airlines. A more user-friendly interface for different routes could also be developed to give users more flexibility.

Currently, prediction services are used in many fields, such as stock price prediction tools used by stockbrokers and services like Zestimate that provide estimated house prices. There is therefore demand for similar services in the aviation industry that can assist customers in booking tickets. Many studies have examined this problem using various techniques, and further research is needed to improve prediction accuracy using different algorithms. More accurate results can be obtained by using data that is more accurate and has better features.

References

[1] Joshi, Achyut & Sikaria, Himanshu & Devireddy, Tarun. (2017). Predicting Flight Prices in India.
[2] Wang, Tianyi & Pouyanfar, Samira & Tian, Haiman & Tao, Yudong & Alonso, Miguel & Luis, Steven & Chen, Shu-Ching. (2019). A Framework for Airfare Price Prediction: A Machine Learning Approach. 200-207. 10.1109/IRI.2019.00041.
[3] Supriya Rajankar, Neha Sakharkar, Omprakash Rajankar, "Predicting The Price Of A Flight Ticket With The Use Of Machine Learning Algorithms", International Journal of Scientific & Technology Research, Volume 8, Issue 12, ISSN 2277-8616, December 2019.
[4] Neel Bhosale, Pranav Gole, Hrutuja Handore, Priti Lakde, Gajanan Arsalwad, "Flight Fare Prediction Using Machine Learning", International Journal for Research in Applied Science & Engineering Technology (IJRASET), ISSN: 2321-9653, Volume 10, Issue V, May 2022.
[5] Jaywrat Singh Champawat, Udhhav Arora, Dr. K. Vijaya, "Indian Flight Fare Prediction: A Proposal", International Journal of Advanced Technology in Engineering and Science, Vol. No. 09, Issue No. 03, ISSN 2348-7550, March 2021.
[6] Zhichao Zhao, Jinguo You, Guoyu Gan, Xiaowu Li, Jiaman Ding, "Civil airline fare prediction with a multi-attribute dual-stage attention mechanism", Appl Intell 52, 5047-5062 (2022), 03 August 2021.
[7] Pavithra Maria K, Anitha K L, "Flight Price Prediction for Users by Machine Learning Techniques", International Advanced Research Journal in Science, Engineering and Technology, Vol. 8, Issue 3, DOI: 10.17148/IARJSET.2021.8321, March 2021.
[8] Juhar Ahmed Abdella, NM Zaki, Khaled Shuaib, Fahad Khan, "Airline ticket price and demand prediction: A survey", Journal of King Saud University - Computer and Information Sciences, Volume 33, Issue 4, May 2021.
[9] Rutuja Konde, Rutuja Somvanshi, Pratiksha Khaire, Prachi Zende, Kamlesh Patil, "Airfare Price Prediction System", International Research Journal of Engineering and Technology (IRJET), Volume 09, Issue 05, May 2022.
[10] Vivekanand P. Thakare, Ankita Sanjay Murraya, Roshani Bandu Gawade, Mrunali Mukundrao Sawarkar, Trupti Khemraj Shende, Ujjwala Kamlesh Badole, "Flight Ticket Price Predictor Using Python", Advancement and Research in Instrumentation Engineering, Volume 5, Issue 1, e-ISSN: 2582-4341.
[11] Kunal Khandelwal, Atharva Sawarkar, Dr. Swati Hira, "A Novel Approach for Fare Prediction Using Machine Learning Techniques", International Journal of Next-Generation Computing, Volume 12, Special Issue 5, Online ISSN: 0976-5034, November 2021.
[12] Prof. Ms. Archana Dirgule, Shubham Agarwal, Ram Agrawal, Neha Singh, Kiran Adsul, "Flight Fare Prediction using Random Forest Algorithm", International Journal of Advanced Research in Science, Communication and Technology (IJARSCT), Volume 2, Issue 3, ISSN (Online) 2581-9429, May 2022.
[13] Tadvi Shabana, Khan Mariya, Shaikh Afifa, Sayyed Naziya Begum, "A Novel Data Science and ML Approach to Predict Airfare", JETIR, Volume 6, Issue 4, ISSN-2349-5162, April 2019.
[14] Abhijit Boruah, Kamal Baruah, Biman Das, Manash Jyoti Das, Niranjan Borpatra Gohain, "A Bayesian Approach for Flight Fare Prediction Based on Kalman Filter", Progress in Advanced Computing and Intelligent Engineering, Volume 2, ICACIE 2017.
[15] R. R. Subramanian, M. S. Murali, B. Deepak, P. Deepak, H. N. Reddy and R. R. Sudharsan, "Airline Fare Prediction Using Machine Learning Algorithms", 2022 4th International Conference on Smart Systems and Inventive Technology (ICSSIT), 2022, pp. 877-884, doi: 10.1109/ICSSIT53264.2022.9716563.
[16] S. N. Prasath, S. Kumar M and S. Eliyas, "A Prediction of Flight Fare Using K-Nearest Neighbors", 2022 2nd International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE), 2022, pp. 1347-1351, doi: 10.1109/ICACITE53722.2022.9823876.
[17] Y. S. Can, K. Büyükoğuz, E. B. Giritli, M. Şişik and F. Alagöz, "Predicting Airfare Price Using Machine Learning Techniques: A Case Study for Turkish Touristic Cities", 30th Signal Processing and Communications Applications Conference (SIU), doi: 10.1109/SIU55565.2022.9864692, 2022.
[18] G. Ratnakanth, "Prediction of Flight Fare using Deep Learning Techniques", 2022 International Conference on Computing, Communication and Power Technology (IC3P), 2022, pp. 308-313, doi: 10.1109/IC3P52835.2022.00071.
[19] Mrs. Sowjanya, Saud Ikram, Rayyan Khan, Shaik Haseeb Jawad, Saad ul arifeen, "Prediction of Airfare Prices Using Machine Learning", International Journal of Mechanical Engineering, Vol. 7, No. 6, ISSN: 0974-5823, June 2022.
[20] B. S. Panda, B. Phanendra Varma, B. Chandini, R. Bhoomika, "Flight Price Prediction Using Machine Learning Techniques", International Journal of Computer Sciences and Engineering, Vol. 10, Issue 9, E-ISSN: 2347-2693, September 2022.
[21] Jaya Shukla, Aditi Srivastava, and Anjali Chauhan, "Airline Price Prediction using Machine Learning", International Journal of Research in Engineering, IT and Social Sciences, ISSN 2250-0588, Volume 10, Issue 05, May 2020.
[22] Vinal Raja, Janhavi Vakil, Yash Shah, Sonia Relan, "Prediction of Airfare Using Machine Learning", IJSDR, Volume 3, Issue 4, ISSN: 2455-2631, April 2018.
[23] Bhavuk Chawla, Ms. Chandandeep Kaur, "Airfare Analysis And Prediction Using Data Mining And Machine Learning", International Journal of Engineering Science Invention, Volume 6, Issue 11, ISSN (Online): 2319-6734, November 2017.
[24] Benny Mantin and Eran Rubin, "Fare Prediction Websites and Transaction Prices: Empirical Evidence from the Airline Industry", INFORMS, Vol. 35, No. 4, pp. 640-655, July-August 2016.
[25] Mohit Vyavhare, Komal Wable, Ayush chothe and Shreya Kute, "Airfare Price Prediction Using Machine Learning Algorithms", IJARIIE, Vol-8, Issue-2, ISSN (O)-2395-4396, 2022.
