Thesis Defense

Flight ticket price prediction using machine learning

Uploaded by

shubham7gladiator

Flight Ticket Price Prediction

Presentation by Shubham Karmakar

Supervised by Sri Tulsidas Mukherjee

Incentives
• To develop a predictive model that accurately forecasts flight ticket
prices.
• Identify and analyze key factors such as airline selection,
routes, and stops that influence ticket pricing.
• Optimize model performance through hyperparameter tuning and
feature importance analysis.
• Gain insights into the complex relationships between features and
ticket prices to make informed decisions when booking flights.
• Enhance consumer welfare by enabling cost-effective travel
planning.
• Optimize airline revenue management through informed pricing
strategies.
Introduction
Predicting flight ticket prices serves two goals: improving
consumer welfare and optimizing airline revenue. For
travelers, predictive models indicate the best time to buy,
which supports more economical trip planning. These models
also increase market transparency and help consumers
understand how and why prices fluctuate. For airlines,
predictive analytics supports dynamic pricing strategies
that maximize yield while reducing exposure to demand
volatility. Wider use of such tools can in turn drive
further improvements in pricing mechanisms and operational
efficiency across the aviation industry.
Insights
• Total Operating Revenue:
⚬ India's aviation industry generated approximately $20 billion in revenue.
• Revenue Passenger Kilometers (RPK):
⚬ In 2019, Indian airlines achieved around 265 billion RPK, reflecting
significant passenger demand and airline activity.
• Passenger Traffic:
⚬ Domestic airlines in India carried approximately 144 million passengers
in 2019, showcasing the growing preference for air travel.
Stakeholders

1
Consumers:
• Cost Savings: Helps consumers identify the best times to purchase tickets at the lowest prices.
• Budget Planning: Enables travelers to plan their expenses more effectively.

2
Airline Benefits:
• Revenue Management: Assists airlines in setting optimal prices to maximize revenue.
• Demand Forecasting: Improves airlines' ability to predict passenger demand and adjust pricing strategies accordingly.

3
Market Dynamics:
⚬ Enhances market competitiveness by providing transparent pricing.
⚬ Encourages more informed decision-making for both consumers and airlines.
Data Processing
• Importing Datasets:
• Collected datasets from various sources, including airline websites and travel agencies.
• Ensured datasets included relevant features such as date of journey, departure time, arrival time, and
ticket prices.
• Handling Missing Values:
• Identified and addressed missing data points using imputation methods to maintain dataset integrity.
• Applied techniques such as mean imputation for numerical data and mode imputation for categorical
data.
• Date and Time Conversion:
• Converted date and time columns into datetime format for consistent processing.
• Extracted valuable time-based features such as day of the week, month, and hour of travel.
• Data Normalization:
• Standardized numerical features to ensure uniform scaling across the dataset.
• Used normalization techniques to improve the performance of machine learning models.
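The preprocessing steps above can be sketched as follows. This is a minimal illustration with pandas, not the project's actual code: the sample rows and the column names (`Date_of_Journey`, `Airline`, `Duration_min`) are assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample rows; the column names are assumptions, not from the slides.
df = pd.DataFrame({
    "Date_of_Journey": ["24/03/2019", "01/05/2019", "09/06/2019", "12/05/2019"],
    "Airline": ["IndiGo", None, "Air India", "IndiGo"],
    "Duration_min": [170.0, None, 445.0, 325.0],
})

# Mean imputation for a numerical column, mode imputation for a categorical one.
df["Duration_min"] = df["Duration_min"].fillna(df["Duration_min"].mean())
df["Airline"] = df["Airline"].fillna(df["Airline"].mode()[0])

# Convert the date string to datetime and derive time-based features.
df["Date_of_Journey"] = pd.to_datetime(df["Date_of_Journey"], format="%d/%m/%Y")
df["Journey_month"] = df["Date_of_Journey"].dt.month
df["Journey_weekday"] = df["Date_of_Journey"].dt.dayofweek  # Monday = 0

# Standardize to zero mean and unit variance (population standard deviation).
mean = df["Duration_min"].mean()
std = df["Duration_min"].std(ddof=0)
df["Duration_scaled"] = (df["Duration_min"] - mean) / std

print(df)
```

In a real pipeline the imputation values and scaling statistics would normally be computed on the training split only and then reused on the test split, to avoid leaking information.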
Handling Categorical Data

We use two main encoding techniques to convert categorical data
into a numerical format:

Nominal data -- categories with no inherent order --> one-hot
encoding

Ordinal data -- categories with a natural order --> LabelEncoder
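A small sketch of the two encodings, assuming hypothetical `Airline` (nominal) and `Total_Stops` (ordinal) columns. For the ordinal column it uses an explicit mapping instead of scikit-learn's `LabelEncoder`; that swap is a choice made for this example, not the method shown in the slides.

```python
import pandas as pd

# Hypothetical sample; column names and values are assumptions for illustration.
df = pd.DataFrame({
    "Airline": ["IndiGo", "Jet Airways", "Air India", "IndiGo"],
    "Total_Stops": ["non-stop", "2 stops", "1 stop", "1 stop"],
})

# Nominal feature: one indicator column per airline (one-hot encoding).
airline_dummies = pd.get_dummies(df["Airline"], prefix="Airline", dtype=int)

# Ordinal feature: an explicit mapping keeps the natural order of the categories.
stop_order = {"non-stop": 0, "1 stop": 1, "2 stops": 2}
df["Total_Stops_encoded"] = df["Total_Stops"].map(stop_order)

encoded = pd.concat([df[["Total_Stops_encoded"]], airline_dummies], axis=1)
print(encoded)
```

`LabelEncoder` assigns integer codes in sorted label order, so with these string labels "non-stop" would receive the largest code. An explicit mapping (or `OrdinalEncoder` with a `categories` list) avoids that when the order matters.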

From the graph we see that Jet Airways has the highest price and,
apart from the first airline, all airlines have almost the same median price.
Handling Outliers

As there are some outliers in the price feature, we replace them with the
median.
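The slides do not say how outliers were detected. The sketch below assumes the common 1.5 × IQR rule for the upper bound and then replaces flagged prices with the median; the price values are made up for illustration.

```python
import pandas as pd

# Hypothetical prices; the last value is an obvious outlier.
prices = pd.Series([3900, 4200, 5100, 6200, 7600, 8100, 79500], name="Price")

# Assumed detection rule: anything above Q3 + 1.5 * IQR counts as an outlier.
q1, q3 = prices.quantile(0.25), prices.quantile(0.75)
upper = q3 + 1.5 * (q3 - q1)

# Keep values within the bound and replace the rest with the median price.
median = prices.median()
cleaned = prices.where(prices <= upper, median)

print(cleaned.tolist())
```

Replacing with the median keeps the row in the dataset while limiting the pull of extreme fares on the model.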
Feature Engineering
Feature engineering is a preprocessing step in
supervised machine learning and statistical modelling which
transforms raw data into a more effective set of inputs. Each input
comprises several attributes, known as features. By providing
models with relevant information, feature engineering can significantly
improve their predictive accuracy and decision-making capability.

1
Date and Time Extraction:
⚬ Extracted day and month from the date of journey to capture seasonal and monthly trends.
⚬ Extracted hour and minute from departure and arrival times to analyze the impact of travel time on ticket prices.

2
Duration Calculation:
⚬ Calculated the duration of each flight by subtracting departure time from arrival time.
⚬ Converted the duration into a numerical feature representing total travel time in minutes or hours.
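A minimal sketch of the date, time, and duration features, assuming hypothetical `Departure` and `Arrival` columns that hold full timestamps (the real dataset may store the date and the times in separate columns).

```python
import pandas as pd

# Hypothetical timestamps; column names and format are assumptions.
df = pd.DataFrame({
    "Departure": ["2019-03-24 22:20", "2019-05-01 05:50"],
    "Arrival": ["2019-03-25 01:10", "2019-05-01 13:15"],
})

dep = pd.to_datetime(df["Departure"], format="%Y-%m-%d %H:%M")
arr = pd.to_datetime(df["Arrival"], format="%Y-%m-%d %H:%M")

# Day and month of the journey, for seasonal and monthly trends.
df["Journey_day"] = dep.dt.day
df["Journey_month"] = dep.dt.month

# Hour and minute of departure and arrival.
df["Dep_hour"], df["Dep_min"] = dep.dt.hour, dep.dt.minute
df["Arr_hour"], df["Arr_min"] = arr.dt.hour, arr.dt.minute

# Duration as arrival minus departure, expressed in minutes.
df["Duration_min"] = (arr - dep).dt.total_seconds() / 60

print(df)
```

Working with full timestamps means overnight flights (first row) produce a positive duration without special handling.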
Feature Engineering

1
• Airline Encoding:
⚬ Encoded the airline names to capture airline-specific pricing strategies.
⚬ Used label encoding or one-hot encoding for transforming airline data into numerical features.

2
• Route Encoding:
⚬ Encoded the route (combination of source and destination) to identify route-specific pricing patterns.
⚬ Applied one-hot encoding to convert categorical route data into numerical format.

3
Additional Features:
• Considered adding features such as layovers, number of stops, and seat class.
• Evaluated the importance of each feature using feature importance metrics from machine learning models.
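One way the route feature could be built is to join source and destination into a single key and one-hot encode it. The column names and city pairs below are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical source/destination pairs.
df = pd.DataFrame({
    "Source": ["Delhi", "Kolkata", "Delhi"],
    "Destination": ["Cochin", "Bangalore", "Cochin"],
})

# Combine source and destination into one route key.
df["Route"] = df["Source"] + "-" + df["Destination"]

# One indicator column per distinct route.
route_dummies = pd.get_dummies(df["Route"], prefix="Route", dtype=int)

print(route_dummies)
```

With many distinct routes this produces many sparse columns, so rare routes are sometimes grouped into a single "other" category before encoding.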
Model Selection

1
• RandomForest Regressor:
⚬ Chosen for its ability to handle complex interactions and provide robust predictions.
⚬ Performed well in initial tests with high accuracy and low error rates.

2
• Logistic Regression:
⚬ Evaluated for its simplicity and efficiency in binary classification.
⚬ Found less suitable for continuous price prediction tasks.
Model Selection

3
• K-Nearest Neighbors (KNN):
• Considered for its straightforward approach to prediction based on nearest neighbors.
• Showed limitations in handling large datasets and complex feature spaces.

4
• Decision Tree Regressor:
• Explored for its interpretability and ease of visualization.
• Prone to overfitting on training data, requiring pruning techniques.
Model Selection

5
• Support Vector Regression (SVR):
⚬ Tested for its effectiveness in high-dimensional spaces.
⚬ Computationally intensive and less scalable for larger datasets.

6
• Gradient Boosting Regressor:
⚬ Assessed for its ability to improve prediction accuracy through iterative boosting.
⚬ Demonstrated competitive performance but required careful tuning of hyperparameters.
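A comparison like the one described across these slides could be run with cross-validation. The sketch below uses scikit-learn on synthetic data from `make_regression`, so the scores say nothing about the flight dataset; the model settings are assumptions. Logistic regression is left out because it is a classifier and cannot predict a continuous price.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the flight data (assumption for illustration).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)

models = {
    "RandomForest": RandomForestRegressor(n_estimators=50, random_state=0),
    "KNN": KNeighborsRegressor(n_neighbors=5),
    "DecisionTree": DecisionTreeRegressor(random_state=0),
    "SVR": SVR(),
    "GradientBoosting": GradientBoostingRegressor(random_state=0),
}

# Mean R^2 over 5 folds for each candidate model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in models.items()
}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean R^2 = {score:.3f}")
```

Scoring every model with the same folds and the same metric keeps the comparison fair; which model ranks first depends on the data.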
So Which is the best Model?
• Model Selection:
• After evaluating multiple models, the RandomForest Regressor emerged as the best-performing model.
....why so?
• Performance Metrics:
⚬ Accuracy: Achieved an R² score of about 0.854 (reported as 85.36%), indicating that the model
explains most of the variance in ticket prices.
⚬ Mean Absolute Error (MAE): Low MAE values, reflecting small average deviations between
predicted and actual ticket prices.
⚬ Root Mean Squared Error (RMSE): Low RMSE values, indicating few large prediction errors, since
RMSE penalizes large errors more heavily than MAE.
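For reference, the three metrics can be computed by hand as in this short sketch. The function name `regression_metrics` and the sample prices are hypothetical; scikit-learn provides equivalent functions in `sklearn.metrics`.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and R^2 for two equal-length lists of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - ss_res / ss_tot

    return {"mae": mae, "rmse": rmse, "r2": r2}


# Made-up actual and predicted prices.
actual = [4000, 6000, 8000, 10000]
predicted = [4500, 5500, 8000, 11000]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

Here the single 1000-unit miss raises RMSE (about 612) well above MAE (500), which shows how RMSE weights large errors more heavily.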
....why so?
• Feature Importance:
• Identified key features contributing to accurate predictions, such as departure time,
duration, and airline.
• Provided insights into which factors most significantly influence ticket prices.

• Model Robustness:
• Validated the model's robustness through cross-validation and testing on unseen data.
• Ensured consistent performance across different datasets and conditions.
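Feature importance and cross-validation can both be obtained from scikit-learn, as sketched below. The data is synthetic, and the price is constructed to depend mostly on duration, so the ranking is an artifact of the example rather than a finding about real fares; the feature names are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 400

# Synthetic features (assumed names); "Noise" is unrelated to the price.
X = pd.DataFrame({
    "Duration_min": rng.uniform(60, 600, n),
    "Dep_hour": rng.integers(0, 24, n),
    "Noise": rng.normal(size=n),
})

# Synthetic price driven mainly by duration, slightly by departure hour.
y = 3000 + 10 * X["Duration_min"] + 20 * X["Dep_hour"] + rng.normal(0, 100, n)

model = RandomForestRegressor(n_estimators=100, random_state=0)

# 5-fold cross-validated R^2 as a robustness check.
cv_r2 = cross_val_score(model, X, y, cv=5, scoring="r2")

# Impurity-based feature importances from the fitted forest (they sum to 1).
model.fit(X, y)
importances = pd.Series(model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)

print(cv_r2.round(3))
print(importances)
```

A small spread across the fold scores is one sign of stable performance. Impurity-based importances can favor high-cardinality features, so permutation importance is a common cross-check.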
Hypertuning The Model
Objective:
Optimize the performance of the RandomForest Regressor by fine-tuning its hyperparameters.
Parameters Tuned:
• Number of Trees (n_estimators): Adjusted the number of decision trees in the forest; more trees generally give
more stable predictions at the cost of longer training time.
• Maximum Depth (max_depth): Set limits on the depth of the trees to prevent overfitting.
• Minimum Samples Split (min_samples_split): Determined the minimum number of samples required to split an
internal node.
• Minimum Samples Leaf (min_samples_leaf): Established the minimum number of samples required to be at a leaf
node.
Optimization Techniques:
• RandomizedSearchCV: Utilized to efficiently search through a wide range of hyperparameters by randomly
sampling from the specified distributions.
• GridSearchCV: Employed to perform an exhaustive search over a predefined grid of hyperparameters to find the best
combination.
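A minimal sketch of a randomized search over the four hyperparameters listed above. The candidate values, `n_iter`, and the synthetic data are assumptions chosen to keep the example fast; they are not the settings used in the project.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in for the flight data (assumption for illustration).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Candidate values for each tuned hyperparameter (illustrative choices).
param_distributions = {
    "n_estimators": [50, 100, 200],
    "max_depth": [5, 10, 20, None],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 5],
}

search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,          # number of random combinations to try
    cv=3,              # 3-fold cross-validation per combination
    scoring="r2",
    random_state=0,
)
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print("Test R^2:", search.score(X_test, y_test))
```

`GridSearchCV` takes a `param_grid` argument in place of `param_distributions` and tries every combination, which is usually reserved for a narrower grid around the values the randomized search found.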
Hypertuning The Model
• Results:
⚬ Achieved improved accuracy and reduced error rates with optimized
hyperparameters.
⚬ Fine-tuned model demonstrated enhanced generalization on unseen data.

Before hypertuning: r2 score was 0.8383033821751005
After hypertuning: r2 score is 0.8536822685106241

After hypertuning, the accuracy (r2 score) increases.

Data analysis (sample data)

• Sample Data Visualization:

⚬ Display charts and graphs showing
the distribution of ticket prices over
time.
⚬ Include visuals depicting the
relationship between ticket prices
and key features such as departure
time, duration, and airline.
Data analysis (sample data)

• Insights from Visualizations:

⚬ Price Trends: Identify patterns and trends in ticket prices based on the day of the
week, month, and season.
⚬ Departure Time Impact: Show how departure times influence ticket prices,
highlighting peak and off-peak hours.
⚬ Airline Comparison: Compare ticket prices across different airlines to reveal
pricing strategies and differences.
• Descriptive Statistics:
⚬ Provide summary statistics such as mean, median, and standard deviation of ticket
prices.
⚬ Highlight any notable outliers or anomalies in the data.
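The summary statistics and the airline comparison could be produced with pandas as in this sketch. The prices are invented for illustration and the column names are assumptions.

```python
import pandas as pd

# Hypothetical ticket prices per airline.
df = pd.DataFrame({
    "Airline": ["IndiGo", "IndiGo", "Jet Airways", "Jet Airways", "Air India"],
    "Price": [3900, 4500, 11000, 13000, 7600],
})

# Overall mean, median and standard deviation of ticket prices.
overall = df["Price"].agg(["mean", "median", "std"])

# Median price per airline, highest first, for the airline comparison.
by_airline = df.groupby("Airline")["Price"].median().sort_values(ascending=False)

print(overall)
print(by_airline)
```

Comparing medians instead of means keeps a few very expensive tickets from dominating the per-airline picture.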
Glossary
RandomForest Regressor:
An ensemble learning method that constructs multiple decision trees and merges them to improve
predictive accuracy and control overfitting.
Hyperparameter Tuning:
The process of adjusting a model's hyperparameters (settings chosen before training, such as the number
of trees) to optimize performance and achieve the best possible predictive accuracy.
Feature Importance:
A metric that indicates the contribution of each feature to the predictive power of the model, helping to
identify the most influential variables.
One-Hot Encoding:
A technique used to convert categorical variables into a binary format, where each category is represented
by a unique binary vector.
LabelEncoder:
A preprocessing tool that converts categorical labels into a numerical format, allowing them to be used in
machine learning algorithms.
Conclusion
Based on the analysis conducted, it can be concluded that factors such as the airline, total stops, and
specific routes have a significant impact on ticket prices. It is advisable for passengers to consider these
factors while booking flights to potentially find more cost-effective options.
Additionally, understanding the importance of these variables can help passengers make informed
decisions and optimize their travel expenses.
The flight price prediction project utilized advanced machine learning techniques such as the RandomForest
Regressor and hyperparameter tuning to accurately forecast ticket prices. By analyzing feature
importance, we identified key factors like airline selection, routes, and stops that significantly influence
pricing. Techniques like One-Hot Encoding and LabelEncoder were employed to preprocess categorical
data for model training.
Through this analysis, we gained insights into the complex relationships between various features and
ticket prices, enabling us to make informed decisions when booking flights. The project showcased the
importance of data preprocessing, model optimization, and feature analysis in enhancing prediction
accuracy and understanding the dynamics of flight pricing.
Thank you
