International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 10, Issue V, May 2022. Available at www.ijraset.com
https://doi.org/10.22214/ijraset.2022.43230

Implementation of Flight Fare Prediction System Using Machine Learning

Neel Bhosale 1, Pranav Gole 2, Hrutuja Handore 3, Priti Lakade 4, Gajanan Arsalwad 5
1, 2, 3, 4, 5 Department of Information Technology, K. J's Trinity College of Engineering and Research, Pune, India

Abstract: Flight ticket prices rise and fall frequently depending on factors such as the timing of the flight, the destination, and the flight duration. In the proposed system, a predictive model is created by applying machine learning algorithms to historical flight data. Choosing the optimal time to buy an airline ticket is difficult from the consumer's perspective, principally because buyers have insufficient information for reasoning about future price movements. In this project we aim to uncover the underlying trends of flight prices in India using historical data and to suggest the best time to buy a flight ticket. The project examines validations of, or contradictions to, common myths about the airline industry, and presents a comparative study of various models for predicting the optimal time to buy a flight ticket and the amount that can be saved by doing so. Remarkably, the price trends are highly sensitive to the route, the month of departure, the day of departure, the time of departure, whether the day of departure is a holiday, and the airline carrier. Highly competitive routes, such as most business routes (tier 1 to tier 1 cities, e.g. Mumbai-Delhi), showed a non-decreasing trend in which prices rose as the days to departure decreased, whereas other routes (tier 1 to tier 2 cities, e.g. Delhi-Guwahati) had a specific time frame in which prices were at their minimum. Moreover, the data also revealed two basic categories of airline carriers operating in India, an economical group and a luxurious group, and in most cases the minimum-priced flight belonged to the economical group. The data also confirmed that there are certain periods of the day when prices are expected to be at their maximum. The scope of the project can be extended across further routes to enable significant savings on the purchase of flight tickets across the Indian domestic airline market.
Keywords: Flight ticket, optimal timing, historical data, competitive routes, Indian domestic airline market.

I. INTRODUCTION
The usual flight-ticket buying strategy is to purchase a ticket many days before take-off so as to avoid the effect of the maximum fare. However, airlines do not consistently follow this pattern. Airlines may reduce prices when they want to grow the market, and may raise them when tickets become less available, so the fare can depend on many factors. To forecast prices, this project uses machine learning to model how flight fares behave over time. Every airline has the right and freedom to change its ticket prices at any moment, and a traveller can save money by booking at the lowest fare. People who travel by air frequently are well aware of these price fluctuations. Airlines use complex revenue-management policies to execute distinct pricing strategies; as a result, the pricing system changes the fare depending on the time, the season, and festive days. The ultimate aim of the airlines is to earn a profit, whereas the customer searches for the minimum price. Customers usually try to buy a ticket well in advance of the departure date so as to avoid the fare hike as the date approaches, but in practice this is not always the case, and the customer may end up paying more than they should for the same seat.

II. MOTIVATION
Our motivation is to help people who tend to pay more for flight tickets, as well as those who are new to the ticket-booking process. The project also gives us more exposure to machine learning techniques, helping us improve and extend our existing skills.

III. AIM AND OBJECTIVE


The objectives of the project are given below:
1) To obtain an effective price for customers.
2) To make the UI user friendly.
3) To use various ML methods to understand the dataset better and obtain accurate results.


The aims of the project are:

a) To gain thorough knowledge of data science and machine learning.
b) To study and understand different machine learning algorithms.
c) To obtain an effective, accurate prediction of the flight fare.
d) To study how flight prices rise and fall across routes and on different days.
e) To create an effective, user-friendly UI design.
f) To find solutions for mitigating defects.

IV. LITERATURE SURVEY


1) K. Tziridis, T. Kalampokas, G. Papakostas and K. Diamantaras, "Airfare price prediction using machine learning techniques," European Signal Processing Conference (EUSIPCO), 2017, DOI: 10.23919/EUSIPCO.2017.8081365.
In this study [1], a dataset of 1,814 Aegean Airlines flights was collected and used to train machine learning models. Different numbers of features were used to train the models in order to show how feature selection can change model accuracy. The authors used several algorithms, namely Multilayer Perceptron (MLP), Generalized Regression Neural Network, Extreme Learning Machine (ELM), Random Forest Regression Tree, Regression Tree, Bagging Regression Tree, Regression SVM (polynomial and linear) and Linear Regression (LR), obtaining different results for each algorithm. They trained several variants of each model, adding and removing features from the dataset, and followed a typical data science life cycle. The best results came from the Bagging Regression Tree.

2) William Groves and Maria Gini, "An agent for optimizing airline ticket purchasing," Proceedings of the 2013 International Conference on Autonomous Agents and Multi-Agent Systems.
In this case study [2], Groves and Gini introduce an agent that can optimize purchase timing on behalf of customers. A Partial Least Squares (PLS) regression model is built after applying several steps: feature extraction, lagged feature computation, regression model construction and optimal model selection. Their experiments were designed to estimate the real-world cost of using the prediction models. The lag-scheme approach works well for many choices of machine learning algorithm, but PLS regression was found to work best in this domain; its improved performance can be attributed to a natural resistance to collinear and irrelevant variables.

3) J. Santos Dominguez-Menchero, Javier Rivera and Emilio Torres Manzanera, "Optimal purchase timing in the airline market."
In this paper, the researchers study the general pattern of airline pricing behaviour and propose a methodology for analysing different routes and/or carriers. Their purpose is to provide customers with the information they need to decide on the best time to purchase a ticket, striking a balance between the desire to save money and any time constraints the buyer may have. Their study shows that non-parametric isotonic regression techniques, as opposed to standard parametric techniques, are particularly useful here. Most importantly, the method can determine the margin of time by which consumers may delay their purchase without a significant price increase, quantify the economic loss for each day the purchase is delayed, and detect when it is better to wait until the last day to make the purchase.

4) Supriya Rajankar, Neha Sakhrakar and Omprakash Rajankar, "Flight fare prediction using machine learning algorithms," International Journal of Engineering Research and Technology (IJERT), June 2019.
This survey on flight fare prediction uses a small dataset consisting of flights between Delhi and Bombay. Several machine learning algorithms were implemented to predict ticket prices and study their behaviour: Support Vector Machine (SVM), linear regression, K-Nearest Neighbours (KNN), decision tree, Multilayer Perceptron, gradient boosting and Random Forest. The models were implemented using the Python library scikit-learn, and metrics such as R-squared, MAE and MSE were used to verify their performance. The best results were obtained with the decision tree algorithm.


5) Tianyi Wang, Samira Pouyanfar, Haiman Tian and Yudong Tao, "A framework for airline price prediction: A machine learning approach."
In this paper [5], the authors propose a framework in which two databases are combined with macroeconomic data, and machine learning algorithms such as Support Vector Machine and XGBoost are used to model the average ticket price for each source-destination pair. The framework achieves a high prediction accuracy of 0.869 on the adjusted R-squared metric, with the lowest error rate of 0.92 obtained by the XGBoost algorithm.

6) T. Janssen, "A linear quantile mixed regression model for prediction of airline ticket prices."
In this paper, the author predicts the best time to purchase tickets. Several machine learning algorithms are considered, such as linear regression, decision tree, Random Forest, K-Nearest Neighbours, Multilayer Perceptron (MLP), gradient boosting and Support Vector Machine (SVM), along with Naïve Bayes and a stacked prediction model as predictors. The final model is implemented using a Linear Quantile Mixed Regression methodology for the San Francisco-New York route, for which daily airfares are published on an online website. Two features, the number of days until departure and whether the departure falls on a weekend or a weekday, are used to develop the model.

7) T. Wohlfarth, S. Clemencon and F. Roueff, "A data mining approach to travel price forecasting," 10th International Conference on Machine Learning and Applications, Honolulu, 2011.
This research paper [7] addresses fare prediction in the context of the yield-management practices of the air travel industry and applies several data mining techniques. The goal of the paper is to design decision-making tools for the customer in the face of varying travel prices; the main technique used is clustering.

8) Vinod Kimbhaune, Harshil Donga, Ashutosh Trivedi, Sonam Mahajan and Viraj Mahajan, research paper on a flight fare prediction system.
In this research paper [8], several machine learning approaches, namely Random Forest, decision tree and linear regression, are applied to a dataset to determine the ideal purchase time for a flight ticket. The project aims to develop an application that predicts the prices of various flights using a machine learning model. The performance metrics used are MAE, MSE and RMSE. The outcome of the project was not fully accurate, but the authors note that adding more real-time data would give more accurate results.

9) W. Groves and M. Gini, "An agent for optimizing airline ticket purchasing," 12th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2013), St. Paul, MN, May 6-10, 2013, pp. 1341-1342.
This is the extended version of the study in [2]; it likewise exploits Partial Least Squares Regression (PLSR) to build the model. The data were gathered from major online travel booking sites between 22 February 2011 and 23 June 2011, and additional data were collected to check the correlations and the performance of the final model.

V. DIFFERENT APPROACHES
There are various approaches to implementing the project; below are some of the approaches used by the authors in the literature survey:

A. Linear Regression
Regression is a method of modelling a target value based on independent predictors. Regression techniques differ mostly in the number of independent variables and in the type of relationship assumed between the independent and dependent variables.
Linear regression is the case in which there is a single independent variable and the relationship between the dependent and independent variables is linear. The key concepts for understanding linear regression are the cost function and gradient descent.

y_pred = b0 + b1 * x
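
For illustration only (this code is not from the original study), a minimal scikit-learn sketch of fitting such a model with made-up values might look like this:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Toy example: predict fare from days-to-departure (illustrative values only).
    X = np.array([[30], [20], [10], [5], [1]])    # single independent variable
    y = np.array([4000, 4200, 5000, 6500, 9000])  # ticket price

    model = LinearRegression().fit(X, y)
    print(model.intercept_, model.coef_)          # b0 and b1 in y_pred = b0 + b1 * x
    print(model.predict([[15]]))                  # predicted fare 15 days before departure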


B. Gradient Boosting
Gradient boosting builds an additive regression model by sequentially fitting a simple function to the current "pseudo" residuals by least squares at each iteration. The scikit-learn implementation uses a decision tree as the base estimator. The number of boosting stages is varied from 10 up to a maximum of 1000 in steps of 10. The loss function is an important parameter in gradient boosting; it can be chosen from least-squares regression, least absolute deviation, and quantile regression.
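
A hedged sketch of this setup in scikit-learn (the sweep and parameter values shown are illustrative, and the loss-option names differ slightly between scikit-learn versions):

    from sklearn.ensemble import GradientBoostingRegressor

    # Decision trees are the base estimators; the number of boosting stages is
    # swept from 10 to 1000 in steps of 10 as described above. The loss can be
    # least squares, least absolute deviation or quantile regression.
    for n_stages in range(10, 1001, 10):
        gbr = GradientBoostingRegressor(n_estimators=n_stages, loss="squared_error")
        # gbr.fit(X_train, y_train) and keep the best-scoring model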

C. K- Nearest Neighbours
In the regression setting, the output for a query point is the average of the values of its k nearest neighbours. Like SVM, it is a non-parametric method. Several values of k are evaluated and the best-performing value is retained.
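
A possible sketch of this regressor in scikit-learn (the value of k is illustrative):

    from sklearn.neighbors import KNeighborsRegressor

    # The prediction for a query point is the average fare of its k nearest
    # neighbours; several values of k would be tried and the best one kept.
    knn = KNeighborsRegressor(n_neighbors=5)
    # knn.fit(X_train, y_train); knn.score(X_test, y_test)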

D. Multi-Layer Perceptron
The multi-layer perceptron is a class of feedforward artificial neural network. It consists of an input layer, an output layer and a number of hidden layers; the hidden layers give the network its depth. The setup here uses one hidden layer whose number of neurons is varied from 100 to 2000 at different intervals, depending on the required condition. Each neuron requires an activation function to fire; the logistic sigmoid function is used as the activation function.
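
A possible sketch of the described configuration using scikit-learn's MLPRegressor (the layer width and iteration count are illustrative):

    from sklearn.neural_network import MLPRegressor

    # One hidden layer whose width would be varied between 100 and 2000 neurons,
    # with the logistic sigmoid as the activation function, as described above.
    mlp = MLPRegressor(hidden_layer_sizes=(100,), activation="logistic", max_iter=500)
    # mlp.fit(X_train, y_train)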

E. Support Vector Machine


Support Vector Machine regression relies on a kernel function and is considered a non-parametric technique. The following kernels are used: linear, polynomial, and radial basis function (RBF). According to previous studies, Random Forest and gradient boosting give the maximum accuracy.
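
A possible sketch of the three kernels with scikit-learn's SVR (the polynomial degree and other hyperparameters are illustrative defaults):

    from sklearn.svm import SVR

    # One support vector regressor per kernel mentioned above.
    svr_models = {
        "linear": SVR(kernel="linear"),
        "polynomial": SVR(kernel="poly", degree=3),
        "rbf": SVR(kernel="rbf"),
    }
    # for name, svr in svr_models.items(): svr.fit(X_train, y_train)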

VI. METHODOLOGY AND TERMS USED


Mentioned below are the parameters used in our dataset:
1) Size of Test Set: 10683 rows & 11 columns
2) Airline: The name of the airline.
3) Date of Journey: The date of the journey.
4) Source: The source from which the service begins.
5) Route: Route of the flight, start to end.
6) Destinations: The destination where the service ends.
7) Departure Time: The time when the journey starts from the source.
8) Arrival Time: Time of arrival at the destination.
9) Duration: Total duration of the flight.
10) Total Stops: Total stops between the source and destination.
11) Additional Info: Additional information about the flight.
12) Price: The price of the ticket.

Fig. Dataset Contents

Machine learning algorithms used for implementing the project:


A. Random Forest
Random Forest is a supervised learning algorithm. A benefit of random forest is that it can be used for both classification and regression problems, which form the majority of current machine learning tasks. A random forest builds numerous decision trees and combines them to obtain a more accurate and stable prediction. It has nearly the same parameters as a decision tree or a bagging classifier, and it is very simple to find the importance of each feature for the prediction compared with other algorithms. The common element in these methods is that, for the k-th tree, a random vector theta_k is generated, independent of the past random vectors theta_1, ..., theta_(k-1) but with the same distribution, and a tree is grown using the training set and this random vector, resulting in a classifier h(x, theta_k), where x is an input vector. In bagging, for example, the random vector is generated as the counts in N boxes, where N is the number of examples in the training set. In random split selection, theta consists of a number of independent random integers between 1 and K. The dimensionality and nature of theta depend on its use in the construction of the tree. After a large number of trees has been generated, they vote for the most popular class. These procedures are called random forests.
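
A minimal sketch, assuming scikit-learn is used, of a random forest regressor of the kind described here (parameter values are illustrative):

    from sklearn.ensemble import RandomForestRegressor

    # An ensemble of decision trees, each grown on a bootstrap sample of the
    # training data with a random subset of features considered at each split;
    # the individual tree predictions are averaged for regression.
    rf = RandomForestRegressor(n_estimators=100, random_state=42)
    # rf.fit(X_train, y_train); y_pred = rf.predict(X_test)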

B. XGBoost
XGBoost is an implementation of gradient-boosted decision trees. In this algorithm, decision trees are created sequentially, and weights play an important role. Weights are assigned to all the independent variables, which are then fed into a decision tree that predicts results. The weights of the observations the tree predicts wrongly are increased, and these variables are then fed into the second decision tree. The individual classifiers/predictors are then ensembled to give a stronger and more precise model. XGBoost can work on regression, classification, ranking, and user-defined prediction problems.
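
A minimal sketch using the xgboost library (parameter values are illustrative, not taken from the paper):

    from xgboost import XGBRegressor

    # Gradient-boosted decision trees: each new tree is fitted to correct the
    # errors of the ensemble built so far.
    xgb_model = XGBRegressor(n_estimators=300, learning_rate=0.1, random_state=42)
    # xgb_model.fit(X_train, y_train); y_pred = xgb_model.predict(X_test)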

C. Performance Metrics
Performance metrics are statistical measures used to compare the accuracy of machine learning models trained by different algorithms. The sklearn.metrics module will be used to implement the functions that measure the error of each model using the regression metrics. The following metrics will be used to check the error of each model.

D. MAE (Mean Absolute Error)


Mean Absolute Error is the average of the absolute differences between the predicted and actual values.

MAE = (1/n) ∑ |y - ŷ|

where y = actual output values, ŷ = predicted output values, and n = total number of data points. The lower the MAE, the better the performance of the model.

E. MSE (Mean Square Error)


Mean Square Error squares the differences between the actual and predicted output values before summing them, instead of using the absolute value.

MSE = (1/n) ∑ (y - ŷ)²

where y = actual output values, ŷ = predicted output values, and n = total number of data points. MSE punishes large errors because the errors are squared. The lower the MSE, the better the performance of the model.

F. RMSE (Root Mean Square Error)


RMSE is the square root of the average of the squared differences between the predictions and the actual values.

RMSE = √((1/n) ∑ (y - ŷ)²)

where y = actual output values, ŷ = predicted output values, and n = total number of data points. RMSE is always at least as large as MAE, and the lower the RMSE across different models, the better the performance of that model.

G. R2 (Coefficient of Determination)
It describes how much of the variance in the dependent variable is explained by the independent variables in the model.

R² = 1 - ∑(y - ŷ)² / ∑(y - ȳ)²

where y = actual output values, ŷ = predicted output values, and ȳ = mean of the actual output values. The value of R-squared lies between 0 and 1; the closer it is to one, the better the model is when compared with other models.
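
A short sketch showing how these four metrics could be computed with sklearn.metrics (the arrays below are placeholder values, not results from the paper):

    import numpy as np
    from sklearn import metrics

    # y_test and y_pred stand in for the actual and predicted fares.
    y_test = np.array([5000.0, 6200.0, 4300.0, 7100.0])
    y_pred = np.array([5150.0, 5900.0, 4480.0, 6900.0])

    mae = metrics.mean_absolute_error(y_test, y_pred)
    mse = metrics.mean_squared_error(y_test, y_pred)
    rmse = np.sqrt(mse)
    r2 = metrics.r2_score(y_test, y_pred)
    print(mae, mse, rmse, r2)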


VII. PROPOSED SYSTEM


Following is the basic proposed system:

Fig. Proposed System Diagram

VIII. USE CASE DIAGRAM


Use Case Diagram of the project:

Fig. Use Case Diagram


IX. IMPLEMENTATION
We followed the steps below to reach our ultimate goal of predicting flight fares:
1) Importing Necessary Libraries
Importing the Python libraries such as pandas, matplotlib, seaborn and NumPy for reading and visualizing the dataset.
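
A minimal sketch of these imports:

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns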

2) Reading our Dataset


We will read our dataset using pandas. As the dataset is in Excel format, we will use "pd.read_excel()".

3) Dropping NAN Values


We will check whether there are any null values in our dataset; if there are, we will drop them using "dropna(inplace=True)".
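
A brief sketch of steps 2 and 3 (the Excel file name is an assumption for illustration, not taken from the paper):

    import pandas as pd

    # File name is assumed for illustration.
    train_df = pd.read_excel("Data_Train.xlsx")
    print(train_df.isnull().sum())   # count the missing values per column
    train_df.dropna(inplace=True)    # drop the rows that contain NaN values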

4) Exploratory Data Analysis


We will pre-process our dataset. We will extract the day and month from the "Date of Journey" column, since the model only understands numerical values; for this we will use "pd.to_datetime", and "dt.day" and "dt.month" will extract the day and month respectively from that column.
The same process is applied to the "dep_time", "Duration" and "arrival_time" columns to extract hours and minutes. After extracting the day, month, hours and minutes, we drop the original "Date of Journey", "Duration", "dep_time" and "arrival_time" columns from the dataset.
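
A possible sketch of this pre-processing, continuing from the train_df loaded above (the exact column names and date format are assumptions):

    # Extract numeric day and month from the journey date column.
    train_df["Journey_day"] = pd.to_datetime(train_df["Date_of_Journey"], format="%d/%m/%Y").dt.day
    train_df["Journey_month"] = pd.to_datetime(train_df["Date_of_Journey"], format="%d/%m/%Y").dt.month

    # Extract hour and minute from the departure and arrival times.
    train_df["Dep_hour"] = pd.to_datetime(train_df["Dep_Time"]).dt.hour
    train_df["Dep_min"] = pd.to_datetime(train_df["Dep_Time"]).dt.minute
    train_df["Arrival_hour"] = pd.to_datetime(train_df["Arrival_Time"]).dt.hour
    train_df["Arrival_min"] = pd.to_datetime(train_df["Arrival_Time"]).dt.minute

    # The "Duration" column would be split into hours and minutes in a similar
    # way before the original text columns are dropped.
    train_df.drop(["Date_of_Journey", "Dep_Time", "Arrival_Time"], axis=1, inplace=True)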

5) Handling Categorical Data


As the model understands only numerical values, we convert all the categorical data into numerical data. For the nominal columns "Airline", "Source" and "Destination" we perform one-hot encoding by creating dummy variables with pandas.
We drop the "AdditionalInfo" and "Route" columns, since "Route" contains essentially the same information as "Total_Stops" and "AdditionalInfo" carries no additional information. The "Total_Stops" column is ordinal, so we apply label encoding and map each number of stops to 0, 1, 2, 3 or 4; as the number of stops increases, the encoded value increases.
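
A possible sketch of this encoding step (the column names and category labels are assumptions based on the description above):

    # One-hot encode the nominal columns.
    airline = pd.get_dummies(train_df["Airline"], drop_first=True)
    source = pd.get_dummies(train_df["Source"], drop_first=True)
    destination = pd.get_dummies(train_df["Destination"], drop_first=True)

    # Total_Stops is ordinal, so it is mapped to increasing integers instead.
    stop_map = {"non-stop": 0, "1 stop": 1, "2 stops": 2, "3 stops": 3, "4 stops": 4}
    train_df["Total_Stops"] = train_df["Total_Stops"].map(stop_map)

    train_df = pd.concat([train_df, airline, source, destination], axis=1)
    train_df.drop(["Airline", "Source", "Destination", "Route", "Additional_Info"], axis=1, inplace=True)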

6) Test Data: Performing EDA and Feature Engineering


For the test data, we perform the same steps as in steps (2), (3), (4) and (5).

7) Feature Selection
In this step, we find the features that contribute most to the target variable.
X = independent features
Y = dependent feature, i.e., the "Price" column.
We place all the independent features except the price in the X variable and the price in the Y variable, using the loc and iloc methods.
We then use "ExtraTreesRegressor" to find the most important features in the data: we fit the selector on the X and Y features, print "feature_importances_", and inspect the important features.
We find that "Total_Stops" is the most important feature.
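
A possible sketch of this feature-selection step (column names are assumptions):

    from sklearn.ensemble import ExtraTreesRegressor

    X = train_df.drop("Price", axis=1)   # independent features
    y = train_df["Price"]                # dependent feature (target)

    selector = ExtraTreesRegressor()
    selector.fit(X, y)

    # Rank the features by importance; the paper reports Total_Stops on top.
    importances = pd.Series(selector.feature_importances_, index=X.columns)
    print(importances.sort_values(ascending=False).head(10))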


8) Applying Machine Learning Algorithms


We implemented this project using the Random Forest and XGBoost regressor algorithms. The test results are given below.
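
A hedged sketch of training and scoring both models (the split ratio and random seeds are assumptions):

    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestRegressor
    from xgboost import XGBRegressor

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    rf = RandomForestRegressor(random_state=42).fit(X_train, y_train)
    xgb_model = XGBRegressor(random_state=42).fit(X_train, y_train)

    print("Random Forest R2:", rf.score(X_test, y_test))
    print("XGBoost R2:", xgb_model.score(X_test, y_test))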

9) Pickling the File


We pickle the best model (Random Forest) so that it can be reused.
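
A short sketch of pickling the chosen model (the file name is an assumption):

    import pickle

    # Save the trained Random Forest so the application can reload it later
    # without retraining.
    with open("flight_fare_rf.pkl", "wb") as f:
        pickle.dump(rf, f)

    # Later: with open("flight_fare_rf.pkl", "rb") as f: model = pickle.load(f)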
X. RESULTS
Following are the test results for the train data and test data. As we can see, Random Forest performs better than the XGBoost algorithm, so we have chosen the Random Forest model for our project.

ML Algorithm      Train Data   Test Data
XGBoost           0.7774       0.7752
Random Forest     0.9529       0.7970

Table 4. Model Accuracy Results

Below is a comparison of the MAE, MSE and RMSE scores:

ML Algorithm                           MAE       MSE          RMSE
Random Forest                          1180.05   4358748.36   2087.76
Hyperparameter-Tuned Random Forest     1278.10   4277590.50   2068.23
XGBoost                                1180.05   4358748.36   2087.76

Table 5. Comparing MAE, MSE & RMSE Scores

As we can see, the scores of Random Forest before hyperparameter tuning and of XGBoost are the same, while the scores of Random Forest change slightly after hyperparameter tuning.
Reference paper [1] used various machine learning techniques and obtained its best results with the Bagging Regression Tree method, at an accuracy rate of 87.42. Compared against our Random Forest model:

              Bagging Regression Tree [1]   Our Random Forest
Accuracy      87.42                         79.7

Table 6. Comparison of Models, Table 1

Reference paper [15] used various machine learning techniques and obtained its best results with the Trend Based Model method, at an accuracy rate of 81.8. Compared against our Random Forest model:

              Reference [15] model   Our Random Forest
Accuracy      77.8                   79.7

Table 7. Comparison of Models, Table 2

XI. CONCLUSIONS
Machine learning algorithms are applied to the dataset to predict the dynamic fare of flights, giving predicted fare values that help obtain a flight ticket at minimum cost. The R-squared values obtained for each algorithm give the accuracy of the model. In the future, if more data can be accessed, such as the current availability of seats, the predicted results will be more accurate. Finally, we conclude that this methodology alone is not the preferred way of performing this task; adding more methods and more data would yield more accurate results.


REFERENCES
[1] K. Tziridis, T. Kalampokas, G. Papakostas and K. Diamantaras, "Airfare price prediction using machine learning techniques," European Signal Processing Conference (EUSIPCO), 2017, DOI: 10.23919/EUSIPCO.2017.8081365.
[2] William Groves and Maria Gini, "An agent for optimizing airline ticket purchasing," Proceedings of the 2013 International Conference on Autonomous Agents and Multi-Agent Systems.
[3] J. Santos Dominguez-Menchero, Javier Rivera and Emilio Torres Manzanera, "Optimal purchase timing in the airline market."
[4] Supriya Rajankar, Neha Sakhrakar and Omprakash Rajankar, "Flight fare prediction using machine learning algorithms," International Journal of Engineering Research and Technology (IJERT), June 2019.
[5] Tianyi Wang, Samira Pouyanfar, Haiman Tian and Yudong Tao, "A framework for airline price prediction: A machine learning approach."
[6] T. Janssen, "A linear quantile mixed regression model for prediction of airline ticket prices."
[7] T. Wohlfarth, S. Clemencon and F. Roueff, "A data mining approach to travel price forecasting," 10th International Conference on Machine Learning and Applications, Honolulu, 2011.
[8] Vinod Kimbhaune, Harshil Donga, Ashutosh Trivedi, Sonam Mahajan and Viraj Mahajan, research paper on a flight fare prediction system.
[9] W. Groves and M. Gini, "An agent for optimizing airline ticket purchasing," 12th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2013), St. Paul, MN, May 6-10, 2013, pp. 1341-1342.
[10] Viet Hoang Vu, Quang Tran Minh and Phu H. Phung, "An airfare prediction model for developing markets," IEEE, 2018.
[11] J. Santos Dominguez-Menchero and J. Rivera, "Optimal purchase timing in airline markets," 2014.
[12] medium.com/analytics-vidhya/mae-mse-rmse-coefficient-of-determination-adjusted-r-squared-which-metric-is-better-cd0326a5697e, article on performance metrics.
[13] www.keboola.com/blog/random-forest-regression, article on random forest regression.
[14] towardsdatascience.com/machine-learning-basics-decisiontreeregression-1d73ea003fda, article on decision tree regression.
[15] Achyut Joshi, Himanshu Sikaria, Tarun Devireddy and Vivek Vijay, "Predicting Flight Prices in India."
[16] O. Etzioni, R. Tuchinda, C. A. Knoblock and A. Yates, "To buy or not to buy: mining airfare data to minimize ticket purchase price."
[17] Manolis Papadakis, "Predicting Airfare Prices."
[18] W. Groves and M. Gini, "A regression model for predicting optimal purchase timing for airline tickets," 2011.
[19] Krishna Rama-Murthy, "Modeling of United States Airline Fares – Using the Official Airline Guide (OAG) and Airline Origin and Destination Survey (DB1B)," 2006.
[20] B. S. Everitt, The Cambridge Dictionary of Statistics, Cambridge University Press, Cambridge, 3rd edition, 2006. ISBN 0-521-69027-7.
[21] C. M. Bishop, Pattern Recognition and Machine Learning, Springer. ISBN 0-387-31073-8.
[22] E. Bachis and C. A. Piga, "Low-cost airlines and online price dispersion," International Journal of Industrial Organization, in press, corrected proof, 2011.
[23] P. P. Belobaba, "Airline yield management: an overview of seat inventory control," Transportation Science, 21(2):63, 1987.
[24] Y. Levin, J. McGill and M. Nediak, "Dynamic pricing in the presence of strategic consumers and oligopolistic competition," Management Science, 55(1):32–46, 2009.
[25] B. Smith, J. Leimkuhler, R. Darrow and Samuels, "Yield management at American Airlines," Interfaces, vol. 22, pp. 8–31, 1992.
[26] T. Janssen, "A linear quantile mixed regression model for prediction of airline ticket prices," Bachelor Thesis, Radboud University, 2014.
[27] S. B. Kotsiantis, "Decision trees: a recent overview," Artificial Intelligence Review, vol. 39, no. 4, pp. 261-283, 2013.
[28] L. Breiman, "Random forests," Machine Learning, vol. 45, pp. 5-32, 2001.
[29] S. Haykin, Neural Networks – A Comprehensive Foundation, Prentice Hall, 2nd edition, 1999.
[30] H. Drucker, C. J. C. Burges, L. Kaufman, A. Smola and V. Vapnik, "Support vector regression machines," Advances in Neural Information Processing Systems, vol. 9, pp. 155-161, 1997.
