Delay Prediction
Flight delays are a widespread and ongoing issue in the aviation industry, influencing not
only the experience of millions of air travelers but also causing notable operational
inefficiencies and financial burdens for airlines and airport authorities. These delays lead to
significant disruptions in airline schedules, increased fuel consumption, resource
mismanagement, and considerable passenger dissatisfaction. They also complicate the
coordination of airport infrastructure, baggage handling, and air traffic control operations.
Given these challenges, there is a pressing need for a robust and intelligent system that can
accurately forecast potential delays before they occur, enabling stakeholders to take proactive
measures. A reliable delay prediction system can contribute substantially to improving
operational efficiency, minimizing disruptions, lowering operating costs, and enhancing the
overall passenger experience. Machine learning and deep learning techniques offer
valuable tools for analyzing large volumes of historical flight and environmental data to
uncover patterns associated with delays. The predictive system integrates a wide range of
traditional machine learning models such as Logistic Regression, Random Forest, Support
Vector Machine (SVM), and Gradient Boosting, specifically the XGBoost algorithm. These
models are well-established in classification and regression tasks due to their robustness,
interpretability, and relatively fast computation. In addition, modern deep learning methods,
particularly Long Short-Term Memory (LSTM) networks, are employed to capture temporal
dependencies and sequential patterns in time-series data. LSTM models are particularly
suitable for delay prediction due to their ability to retain long-term memory and handle
time-dependent variables more effectively than standard feedforward networks.
The dataset used
for training these models is comprehensive, containing numerous attributes relevant to flight
operations and weather conditions. Key features include scheduled and actual departure and
arrival times, total flight duration, airline identifiers, aircraft codes, origin and destination
airports, and route-specific information. Moreover, the dataset integrates meteorological
parameters such as temperature, wind speed, visibility, precipitation, and atmospheric
pressure, all of which have known influences on flight schedules. Historical delay data is also
included to help the models learn from past trends and seasonality, improving their predictive
power.
KEYWORDS:
Regression
TABLE OF CONTENTS
1.1 Overview
1.2 Purpose
1.3 Problem with Existing System
1.4 Objective
1.5 Overall Description
2. LITERATURE SURVEY
2.1 Predicting Flight Delays Using NN
2.2 Multi-task Local Global Graph Network for FDP
2.3 Modeling Delay Propagation Effects Using Bayesian
Network
2.4 Flight Delay Classification Prediction Based on Stacking
Algorithm
2.5 Deep Learning Approach for FDP Through Time
Graphs
2.6 A Data Mining Approach to Flight Arrival Delay
Prediction for American Airlines
2.7 Limitations
3. SYSTEM MODELING
3.1 Data Flow Diagram
3.1.1 Data Flow Diagram level 0
3.1.2 Data Flow Diagram level 1
3.2 System Architecture
3.3 Process Description
3.3.1 Data Preprocessing
3.3.2 Data Splitting
3.3.3 Model Training
3.3.4 Model Evaluation
3.4 System Requirements
3.4.1 Software Requirements
3.4.2 Hardware Requirements
3.5 Contribution of Individual Participants
4.
CHAPTER 1
INTRODUCTION
Flight delay prediction represents a practical and impactful use of machine learning in the
transportation and aviation industries. Its primary goal is to estimate whether a planned flight
will experience a delay and, if so, determine the likely extent of the delay. This task is driven
by the growing need to mitigate the adverse effects that delays have on airlines, airports, and
travelers. Unforeseen disruptions in flight schedules can result in financial losses for carriers,
operational congestion at terminals, and dissatisfaction among passengers. Consequently, the
ability to anticipate delays in advance is of significant importance for ensuring efficient air
travel operations.
Machine learning models provide a robust solution to this issue, as they are
capable of analyzing large volumes of historical and real-time data to discover intricate
relationships and patterns that contribute to delays. These models surpass traditional
statistical methods by offering improved prediction accuracy through advanced learning
capabilities. Unlike conventional approaches, which may rely on limited parameters and
assumptions, machine learning algorithms can handle complex, multidimensional data
environments with greater flexibility.
The process of developing a predictive model for flight
delays starts with the acquisition and preparation of relevant data. This typically involves
assembling a comprehensive dataset containing a wide array of features. Historical flight data
forms the core of such datasets, encompassing scheduled and actual departure and arrival
times, flight identifiers, airline codes, origin and destination airports, aircraft models, and
additional operational information. These attributes help the model understand past flight
performance and operational behavior under different conditions.
In addition to flight-specific
records, incorporating supplementary data such as weather conditions, seasonal trends,
airport traffic congestion, and airspace limitations further enhances the model's ability to
make accurate forecasts. This enriched dataset enables the system to account for various
factors that influence delays and improves the model's overall generalization capability.
The choice of machine learning algorithms depends on the nature of the prediction task. When the
objective is to determine whether a flight will be delayed, typically framed as a classification
problem, supervised learning models are employed.
1.1 OVERVIEW
1.2 PURPOSE
The primary purpose of using machine learning in flight delay prediction is to improve the
accuracy and reliability of delay forecasts. This helps airlines and airport authorities make
better operational decisions, reduce passenger inconvenience, and minimize financial losses.
Passengers can also benefit from timely notifications and rescheduling options, leading to an
overall enhanced travel experience.
Predicting delays helps air traffic controllers manage congestion at airports and in airspace
more efficiently. This ensures smoother operations, reduces bottlenecks, and enhances overall
flight safety.
Airlines, airports, and passengers collectively suffer significant financial losses due to flight
delays. Predicting delays allows airlines to take preventive measures, reducing compensation
claims, refund costs, and additional expenses.
1.3 PROBLEM WITH EXISTING SYSTEM
Existing systems for predicting flight delays using machine learning face several significant
challenges that limit their effectiveness. One major issue is the quality and completeness of
available data. Flight delays are influenced by a wide range of factors including weather, air
traffic, maintenance issues, and airport congestion, but not all of this data is consistently
available or accurately recorded. Many models also struggle with the dynamic and complex
nature of the aviation environment, where delays can result from a chain reaction of events.
Furthermore, some systems rely on simplistic models that fail to capture non-linear
relationships between features, leading to low prediction accuracy. Lack of standardization
in data sources, data quality issues, and insufficient integration with airline operations also
contribute to the inefficiency of current delay prediction models. As a result, there is a
pressing need for more robust and intelligent systems that can provide accurate and
actionable delay predictions.
1.4 Objectives
The objective of this machine learning project is to develop an accurate model
that can predict potential flight delays before departure.
By analysing historical flight data, weather conditions, airline information, and other
relevant factors, the project aims to enhance operational efficiency, improve
passenger satisfaction, and support proactive decision-making for airlines and airport
authorities.
Flight delay prediction models aim to enhance safety, efficiency, and customer
satisfaction across the aviation ecosystem while supporting smarter, data-informed
decision making.
1.5 Overall Description
Flight delay prediction using machine learning is an advanced approach to forecasting flight
delays by analysing historical and real-time data. Flight delays are a major challenge in the
aviation industry, causing inconvenience to passengers, financial losses for airlines, and
disruptions in airport operations. This project utilizes machine learning techniques to identify
patterns and relationships between various factors influencing delays, such as weather
conditions, air traffic congestion, airline schedules, departure and arrival times, and
operational inefficiencies. By implementing models like Random Forest, Logistic Regression,
and deep learning techniques such as Long Short-Term Memory (LSTM), the project aims to
determine the most effective predictive approach. The process involves data collection,
preprocessing, feature selection, model training, evaluation, and visualization to gain
meaningful insights. By providing accurate forecasts, this system helps airlines optimize
scheduling, minimize disruptions, and enhance passenger experience, ultimately improving
efficiency in the aviation industry. The primary goal is to provide timely and reliable
predictions that support better decision-making for airlines, airport authorities, and
passengers. By doing so, these models help improve operational efficiency, reduce costs,
enhance the passenger experience, and contribute to more sustainable and intelligent air
travel systems. As the aviation industry becomes more data driven, machine learning-based
delay prediction stands out as a crucial tool for addressing one of the most persistent
challenges in air transportation. Flight delay prediction focuses on estimating whether a flight
will be delayed and by how much, based on various influencing factors. These factors may
include weather conditions, airport traffic, airline operations, and scheduled flight times.
Accurately predicting delays helps improve scheduling, reduce passenger inconvenience, and
optimize airline and airport operations.
To achieve this, historical flight data and external
variables (like weather and airport conditions) are collected and processed. Machine learning
models such as decision trees, support vector machines, or deep learning networks are then
trained to recognize patterns and make delay predictions for future flights.
This approach
supports proactive decision-making, allowing airlines and airports to respond to potential
issues before they escalate, ultimately enhancing operational efficiency and customer
satisfaction.
CHAPTER 2
LITERATURE SURVEY
Recent research has shown that integrating machine learning with aviation data significantly
enhances flight delay prediction accuracy. Etani (2019) developed a predictive model that
leverages both flight operation data and meteorological information to estimate on-time
arrivals. The study revealed strong correlations between weather conditions such as wind,
visibility, and precipitation and delay occurrences. By incorporating these features, the model
achieved better performance compared to traditional methods. This work highlights the
importance of combining multiple data sources to improve predictive reliability.
Machine
learning models, including Random Forests, Support Vector Machines (SVM), and XGBoost,
have become widely adopted due to their ability to model non-linear relationships in large
datasets. Etani’s findings influenced many subsequent studies to adopt hybrid data
approaches, integrating weather and operational data. Preprocessing techniques like data
cleaning, encoding, and normalization have been recognized as essential for improving model
performance.
Khan et al. proposed a machine learning-based approach using Gradient Boosting methods to
predict flight delays. The study was presented at the 16th International Conference on Open
Source Systems and Technologies (ICOSST) held in Lahore, Pakistan, in December 2022.
They utilized real-world flight datasets and emphasized the importance of preprocessing steps
such as feature selection and data balancing. The authors compared different ensemble
techniques, including AdaBoost, Random Forest, and Gradient Boosting Machines (GBM).
Among them, GBM yielded the best predictive performance in terms of accuracy and
robustness. The models were evaluated using performance metrics like accuracy, precision,
recall, and F1-score. The study demonstrated that ensemble methods significantly enhance
prediction capabilities over traditional approaches. The research also highlighted the
importance of integrating external factors such as weather and traffic data. This paper mainly
focuses on data-driven machine learning techniques without relying on simulation.
2.3 "Li Q, Jing. Generation and prediction of flight delays in air transport.
IET Intell Transp Syst. 2021;"
The paper primarily analyzed how flight delays are generated and their temporal
characteristics. The author used historical air transport data to identify delay patterns and
predict future delays. Unlike purely ML-based studies, this research incorporated simulation
techniques to understand the system-level behavior of delay propagation. Seasonal variations,
time-of-day patterns, and weather conditions were key variables considered in the model. The
study also compared machine learning models with traditional statistical baselines, showing
that ML models offered better predictive performance. Li emphasized the importance of
integrating both macro-level (system-wide) and micro-level (individual flight) data for
accurate predictions. The research offers a more holistic approach by combining data science
and domain-specific knowledge. It serves as a valuable contribution to transport system
planning and operational management.
In their 2022 study, Bisandu et al. proposed a deep learning model called the Social Ski
Driver Conditional Autoregressive (CAR) classifier for flight delay prediction. The model
incorporates spatiotemporal correlations among flights to enhance predictive accuracy. It
outperformed conventional machine learning models like random forests and SVMs on
benchmark datasets. Their approach captures the influence of surrounding flight behaviors,
offering a more dynamic prediction system. The study demonstrated strong generalization
in complex real-world scenarios. Evaluation metrics showed significant improvements in
accuracy and reliability. This work advances deep learning applications in transportation
analytics.
2.5 " W. Shao, A. Prabowo, S. Zhao, S. Tan, P. Koniusz, J. Chan, X. Hei,
B. Feest and F. D. Salim, "Flight delay prediction using airport
situational awareness map," in Proc. of 27th ACM SIGSPATIAL Int.
Conf. on Advances in Geographic Information Systems, November 5-8,
2019 "
W. Shao et al. (2019) proposed a novel approach for flight delay prediction using an
Airport Situational Awareness Map. This method integrates spatiotemporal data, such as
runway occupancy, taxiway congestion, and gate availability, to enhance prediction
accuracy. The study was presented at the 27th ACM SIGSPATIAL conference,
highlighting the use of geographic information systems (GIS) in aviation analytics. Their
model outperformed traditional delay prediction methods by incorporating real-time airport
conditions. It emphasized the importance of spatial context in operational forecasting. The
approach also supports proactive decision-making for air traffic controllers. This work
bridges GIS and AI for smarter air transport systems.
2.7 "Wang, L., Tien, A., & Chou, J. (2021). Multi-Airport Delay Prediction
with Transformers"
Wang et al. (2021) introduced a Temporal Fusion Transformer (TFT) model for predicting
delays across multiple airports. The approach captured complex temporal dynamics of
inputs like traffic, demand, and weather. A self-supervised learning model encoded
high-dimensional weather data into lower-dimensional representations. This facilitated efficient
training of the TFT model. The model achieved satisfactory performance with smaller
prediction errors. Interpretability analysis identified key input factors influencing delays.
The study aimed to assist air traffic managers in proactive decision-making.
handling nonlinear features. Feature importance indicated weather and airline schedules as
dominant factors. The study included balancing techniques to address class imbalance. It
emphasized timely predictions to reduce cascading delays. Their results support airport-
level strategic planning.
2.14 "Shao, Y., et al. (2019). Departure Delay Prediction for Flights Using
Airport Situational Awareness Maps and Machine Learning"
The study used Airport Situational Awareness Maps (ASAMs) to improve departure delay
predictions. Gradient Boosting was employed with engineered features from weather,
ground operations, and ASAMs. The system significantly improved delay detection
accuracy. It supports proactive decision-making by airlines. Real-time ASAM data
provided high-resolution contextual inputs. The work demonstrates the value of localized
airport data in ML pipelines. Their framework is scalable across airports of varying sizes.
2.15 "Jha, R., et al. (2024). Flight Delay Prediction Using Deep Learning:
A Hybrid Approach"
Jha and colleagues developed a hybrid architecture combining XGBoost and LSTM for
flight delay classification. Tabular data were processed through XGBoost while sequential
patterns were learned via LSTM. The hybrid model improved precision, especially for long-
haul delays. Feature engineering included time, weather, and operational data. The authors
tackled noise and imbalance using SMOTE. Comparative metrics proved the hybrid model's
superiority. Their design promotes both performance and interpretability.
2.18 "Pophale, A., et al. (2022). Airline Delay Analysis Using Machine
Learning Algorithms. Mathematical Statistician and Engineering
Applications"
This paper applies Linear and Polynomial Regression models to forecast departure delays.
The dataset, sourced from Kaggle, includes flight origin, airline, and weather data.
Polynomial Regression outperformed Linear Regression in capturing non-linear delay
trends. Data visualization was used to understand delay patterns. The study achieved
moderate accuracy (~72%) but provided insight into model tuning. It's especially valuable
for educational and baseline modeling. It lays the groundwork for integrating more complex
methods.
LIMITATIONS
Flight delay prediction involves complex features like time-based patterns, airport
traffic, and flight routes. Constructing meaningful input features from raw data requires
domain expertise and careful preprocessing to avoid overfitting or underperformance.
A model trained on data from specific regions, airlines, or time periods may not
generalize well to others. Variations in geography, infrastructure, and airline operations
demand frequent model retraining and validation across different datasets.
Real-time prediction systems must process live feeds (e.g., weather, air traffic). This
requires robust data pipelines and low latency processing, which can be technically
challenging and prone to failure, affecting timely and accurate predictions.
Advanced models like LSTM and Transformers often lack transparency. Their complex
internal workings make it hard to explain predictions to stakeholders, reducing trust and
making it difficult to use them in operational decision-making contexts.
Using flight and passenger data must comply with data protection laws like GDPR.
Additionally, models can unintentionally learn biases from historical data, leading to unfair or
skewed predictions if not properly monitored and corrected.
CHAPTER 3
SYSTEM MODELING
3.1 Data Flow Diagram
The Data Flow Diagram (DFD) for a flight delay prediction system outlines how data
is collected, processed, and used to generate predictions. The system receives inputs from
multiple external sources, such as airline databases and weather services. These inputs
include historical flight data, airport information, aircraft details, and weather conditions.
The Level 0 Data Flow Diagram provides a high-level view of a machine learning-based
flight delay prediction system. The system receives flight data, including date, time, airport,
and weather, as input. This data is processed by the ML prediction system, which analyzes
the input using a trained machine learning model. The system then outputs the delay status,
indicating whether a flight is likely to be on time or delayed. This diagram captures the core
function of the system without detailing the internal processes.
The Level 1 Data Flow Diagram (DFD) for the flight delay prediction system provides a
detailed view of how the system operates and interacts with external data sources and users.
The system receives input from two main external entities, Airline and Weather Services.
Airline data includes flight schedules, historical delays, aircraft details, and operational
metrics, while weather services supply real-time and forecasted weather conditions that can
affect flight performance. These data sets are processed by the Flight Delay Prediction
System, which utilizes machine learning algorithms to analyze patterns and generate
predictive insights. In return, users receive delay predictions that
help them make informed decisions.
Figure 3.2 Data Flow Diagram
Process Description:
The system architecture diagram represents a comprehensive end-to-end machine learning
pipeline designed for predicting flight delays. It is structured into three main stages: Data
Pipeline, Model Training, and Deployment & Prediction. Each stage contains multiple steps that
transform raw data into a final prediction output. At the end of the training stage, the
optimized model is evaluated for accuracy and registered in a model registry for deployment.
1. Data Pipeline
The process begins with the data pipeline, where the system gathers and processes raw flight
data.
Flight Data Sources: The system collects flight-related data from various sources such as
airline databases, airport systems, weather reports, and air traffic logs. This data is essential
for understanding the different factors that can affect flight delays.
Ingestion: Once the data is collected, it is ingested into the system. This step involves
loading data into a centralized location where it can be processed and analyzed.
Data Lake / Database: The ingested data is stored either in a data lake (a repository that can
store structured, semi-structured, or unstructured data) or in a traditional database. This
storage layer serves as the foundation for subsequent data processing.
ETL & Data Cleaning: In this critical step, ETL (Extract, Transform, Load) processes are
applied to clean and transform the data. This includes removing duplicates, handling missing
values, standardizing formats, and ensuring data consistency.
Feature Engineering: After cleaning the data, the system creates new features that can
improve model performance. For example, it might generate features like "time of day," "day
of the week," "weather conditions," or "historical delay patterns" that are useful for predicting
future delays. Once the data is fully prepared, it is split into training and testing datasets for
model development.
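The cleaning and feature engineering steps described above can be illustrated with a minimal sketch. The record fields (sched_dep, actual_dep, wind_kmh), the 15-minute delay threshold, and the use of mean imputation are assumptions made for illustration only; they are not taken from the project's dataset or code.

```python
from datetime import datetime

# Hypothetical raw records; field names are illustrative only.
raw_flights = [
    {"flight": "AA10", "sched_dep": "2024-01-05 07:30", "actual_dep": "2024-01-05 07:55", "wind_kmh": "12"},
    {"flight": "AA10", "sched_dep": "2024-01-05 07:30", "actual_dep": "2024-01-05 07:55", "wind_kmh": "12"},
    {"flight": "DL22", "sched_dep": "2024-01-06 18:10", "actual_dep": "2024-01-06 18:12", "wind_kmh": ""},
    {"flight": "UA35", "sched_dep": "2024-01-07 23:45", "actual_dep": "2024-01-08 00:40", "wind_kmh": "30"},
]

TIME_FORMAT = "%Y-%m-%d %H:%M"
DELAY_THRESHOLD_MIN = 15  # assumed cutoff for labelling a flight as delayed


def clean(records):
    """Drop exact duplicates and fill missing wind speed with the mean of known values."""
    seen, unique = set(), []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(rec))
    known = [float(r["wind_kmh"]) for r in unique if r["wind_kmh"] != ""]
    mean_wind = sum(known) / len(known)
    for rec in unique:
        rec["wind_kmh"] = float(rec["wind_kmh"]) if rec["wind_kmh"] != "" else mean_wind
    return unique


def add_features(rec):
    """Derive time of day, day of week, delay in minutes, and a binary delay label."""
    sched = datetime.strptime(rec["sched_dep"], TIME_FORMAT)
    actual = datetime.strptime(rec["actual_dep"], TIME_FORMAT)
    delay_min = (actual - sched).total_seconds() / 60
    return {
        "flight": rec["flight"],
        "hour_of_day": sched.hour,
        "day_of_week": sched.weekday(),  # Monday = 0
        "wind_kmh": rec["wind_kmh"],
        "delay_min": delay_min,
        "delayed": int(delay_min >= DELAY_THRESHOLD_MIN),
    }


features = [add_features(r) for r in clean(raw_flights)]
for row in features:
    print(row)
```

In a real pipeline the imputation value would be computed on the training portion only, so that no information from the test set leaks into the features.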
2. Model Training
This stage involves building, tuning, and validating a machine learning model.
Train/Test Split: The cleaned and engineered dataset is split into training and testing sets.
The training data is used to build the model, while the testing data is used to evaluate its
performance.
Hyperparameter Tuning: Once the initial model is trained, its hyperparameters are tuned
using techniques such as Grid Search or Random Search. Tuning helps optimize the model’s
performance.
Optimized Model: The best-performing version of the model, after tuning, is selected as the
optimized model.
Model Evaluation: The optimized model is then evaluated on the test dataset using various
performance metrics like accuracy, precision, recall, F1-score, or RMSE (Root Mean Squared
Error), depending on the nature of the problem.
Model Registry: Once the model passes evaluation standards, it is stored in a model registry.
This registry acts as a version-controlled system where models can be tracked, compared, and
retrieved for deployment.
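The training stage can be sketched end to end on synthetic data. In the sketch below, a simple k-nearest-neighbour classifier stands in for the project's models, the grid of k values plays the role of the hyperparameter grid, and the dictionary model_registry is a hypothetical stand-in for a real model registry. The data, the names, and the choice of classifier are all assumptions for illustration.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and cut it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - round(len(shuffled) * test_fraction)
    return shuffled[:cut], shuffled[cut:]


def knn_predict(train_rows, x, k):
    """Majority label among the k training points closest to x (one feature)."""
    nearest = sorted(train_rows, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return int(2 * votes > k)


def accuracy(train_rows, eval_rows, k):
    hits = sum(knn_predict(train_rows, x, k) == y for x, y in eval_rows)
    return hits / len(eval_rows)


# Synthetic data: (wind speed in km/h, delayed label). Higher wind tends to mean delay.
rng = random.Random(0)
data = []
for _ in range(200):
    wind = rng.uniform(0, 60)
    data.append((wind, int(wind + rng.gauss(0, 8) > 35)))

# Hold out a test set, then carve a validation set out of the training data.
train, test = train_test_split(data)
fit, valid = train_test_split(train, test_fraction=0.25, seed=1)

# Grid search: try every candidate k and keep the one with the best validation accuracy.
grid = [1, 3, 5, 9, 15]
scores = {k: accuracy(fit, valid, k) for k in grid}
best_k = max(scores, key=scores.get)

# Final evaluation on the untouched test set, then registration.
test_acc = accuracy(train, test, best_k)
model_registry = {"knn-v1": {"k": best_k, "test_accuracy": test_acc}}
print(scores)
print(model_registry)
```

The test set is used only once, after tuning, so the reported accuracy is not biased by the hyperparameter search.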
3. Deployment & Prediction
In the final stage, the trained model is deployed and used to make predictions on new data.
New Flight Data: Fresh flight data is collected in real-time or batch mode for which
predictions are needed.
Preprocess: This new data undergoes the same preprocessing and cleaning steps used during
training to ensure consistency.
Feature Extraction: Features are extracted from the new data using the same methods
applied earlier, ensuring the input format matches what the model expects.
Model API / Web Service: The trained and registered model is deployed as an API
(Application Programming Interface) or a web service. This allows external systems to send
flight data to the model and receive delay predictions in return.
Predict Delay: The deployed model processes the new flight data and returns predictions—
typically indicating whether a flight will be delayed and possibly by how much time.
Predicted Flight Delay: The final prediction is output, which can be shown to users,
integrated into decision-making systems, or used to notify stakeholders.
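The requirement that new data pass through the same preprocessing as the training data can be sketched as follows. The Preprocessor class, the feature order, the weights, and the predict_delay function are hypothetical names and values chosen for illustration; a deployed service would load a fitted preprocessor and a registered model instead.

```python
import math


class Preprocessor:
    """Stores scaling statistics learned at training time so inference can reuse them."""

    def fit(self, rows):
        columns = list(zip(*rows))
        self.means = [sum(col) / len(col) for col in columns]
        self.stds = []
        for col, mean in zip(columns, self.means):
            std = (sum((v - mean) ** 2 for v in col) / len(col)) ** 0.5
            self.stds.append(std if std > 0 else 1.0)  # avoid division by zero
        return self

    def transform(self, row):
        return [(v - m) / s for v, m, s in zip(row, self.means, self.stds)]


FEATURE_ORDER = ["hour_of_day", "wind_kmh"]  # must match the order used in training
training_rows = [[7, 12.0], [18, 21.0], [23, 30.0], [9, 5.0]]
preprocessor = Preprocessor().fit(training_rows)

# Hypothetical coefficients standing in for a trained, registered model.
WEIGHTS, BIAS = [0.4, 1.2], -0.1


def predict_delay(payload):
    """Handle one request: extract features, scale them, score, and label."""
    row = [float(payload[name]) for name in FEATURE_ORDER]
    scaled = preprocessor.transform(row)
    z = sum(w * x for w, x in zip(WEIGHTS, scaled)) + BIAS
    probability = 1.0 / (1.0 + math.exp(-z))
    status = "Delayed" if probability > 0.5 else "Not Delayed"
    return {"delay_probability": round(probability, 3), "status": status}


print(predict_delay({"hour_of_day": 22, "wind_kmh": 40}))
print(predict_delay({"hour_of_day": 6, "wind_kmh": 2}))
```

Fitting the scaling statistics once and reusing them at prediction time is what keeps the input format consistent with what the model saw during training.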
This pipeline illustrates a complete machine learning workflow for a real-world problem:
predicting flight delays. It starts from raw data ingestion and ends with a live prediction
system, demonstrating how machine learning models are developed, evaluated, and used in
production environments. Each stage is crucial and interconnected, ensuring the final model
is both accurate and reliable.
Fig 3.3 System Architecture
Hardware Requirements
GPU: Not necessary for traditional ML, but useful for deep learning like LSTM
Software Requirements
Python Libraries
CHAPTER 4
METHODOLOGY
4.1 Machine Learning
Machine Learning (ML) is a subfield of Artificial Intelligence (AI) that enables systems
to learn patterns from data and make decisions or predictions without being explicitly
programmed. Instead of writing rules manually, ML algorithms learn from examples
provided through data. The key idea is to develop algorithms that can generalize from
historical data to make predictions on new, unseen data.
Machine Learning is broadly classified into three main categories based on the kind of
learning signal or feedback available to the system:
In Supervised Learning, the model is trained using a labeled dataset, meaning that each
training example is paired with the correct output (label). The goal is to learn a mapping from
inputs (features) to outputs (target labels).
Examples:
Subtypes:
In Unsupervised Learning, the model is given unlabeled data and must discover patterns,
groupings, or structures on its own. There are no output labels provided.
Examples:
Semi-Supervised Learning is a mix of supervised and unsupervised learning. It uses a small
amount of labeled data and a large amount of unlabeled data. This is useful when labeling
data is expensive or time-consuming.
Example:
Using a small set of labeled flight delay records along with a large set of unlabeled flight data
to improve the delay prediction model.
4.2 Random Forest
In flight delay prediction, the algorithm starts by taking the historical flight data, including
features like departure time, wind speed, weather conditions, and airline information, and
creating multiple random subsets of this data. For each subset, a separate decision tree is
trained to learn patterns that could indicate whether a flight is likely to be delayed.
Unlike a single decision tree that might overfit to the data, Random Forest introduces
randomness both in the data it selects and the features it uses, making the overall model more
generalizable. Once all the trees have made their predictions, the algorithm takes a majority
vote: if most trees predict a delay, the final output is "Delayed"; otherwise, it is "Not
Delayed".
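The bootstrap sampling, random feature choice, and majority vote described above can be illustrated with a heavily simplified sketch. Each "tree" here is a one-split decision stump rather than a full decision tree, and the data are synthetic; both are assumptions made to keep the example short, so this is not the Random Forest implementation used in the project.

```python
import random


def train_stump(rows, feature):
    """Pick the threshold and direction on one feature that classify the most rows correctly."""
    best = None
    for threshold in sorted({x[feature] for x, _ in rows}):
        for sign in (1, -1):
            correct = sum(int(sign * (x[feature] - threshold) > 0) == y for x, y in rows)
            if best is None or correct > best[0]:
                best = (correct, threshold, sign)
    _, threshold, sign = best
    return lambda x: int(sign * (x[feature] - threshold) > 0)


def train_forest(rows, n_trees=25, seed=0):
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # bootstrap: sample with replacement
        feature = rng.randrange(n_features)        # random feature for this tree
        forest.append(train_stump(sample, feature))
    return forest


def predict(forest, x):
    """Majority vote over all trees."""
    votes = sum(tree(x) for tree in forest)
    return "Delayed" if 2 * votes > len(forest) else "Not Delayed"


# Synthetic flights: features are (wind speed in km/h, departure hour).
rng = random.Random(1)
rows = []
for _ in range(200):
    wind, hour = rng.uniform(0, 60), rng.randrange(5, 24)
    rows.append(((wind, hour), int(wind > 35 or hour >= 21)))

forest = train_forest(rows)
print(predict(forest, (50.0, 23)))
print(predict(forest, (5.0, 8)))
```

Because every tree sees a different resample and a different feature, individual mistakes tend to cancel out in the vote, which is the source of the improved generalization.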
This ensemble approach reduces errors, handles missing data well, and provides a strong
performance even with complex or noisy flight datasets. It is especially effective when the
goal is to make accurate classifications using structured, tabular information.
4.3 Logistic Regression
Logistic Regression is a linear model for binary classification. It estimates the probability that
a data instance belongs to a particular class using the logistic function. This model was
applied to the dataset after feature scaling. Logistic Regression provides a strong baseline for
classification tasks and showed moderately good performance on the flight delay dataset. It is
computationally efficient and interpretable, making it a reliable choice for initial
benchmarking.
Logistic Regression is a straightforward and widely used algorithm for binary classification
tasks like predicting whether a flight will be delayed or not. The process begins by collecting
structured data such as departure time, weather, distance, airline, and delay history. The
features are often scaled to improve learning efficiency. Logistic Regression models the
relationship between the input features and the delay status using a logistic (sigmoid)
function, which transforms the result into a probability value between 0 and 1.
If the predicted probability is greater than 0.5, the model classifies the flight as "Delayed";
otherwise, it predicts "Not Delayed". During training, the algorithm adjusts its internal
coefficients to minimize the difference between predicted and actual values using an
optimization method like gradient descent.
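The sigmoid function, the 0.5 decision threshold, and the gradient descent update can be written out as a minimal sketch. The two scaled features, the learning rate, and the number of epochs are illustrative assumptions and do not reflect the settings used in the project.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the log loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            error = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += error * xj
            grad_b += error
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def classify(w, b, x, threshold=0.5):
    probability = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
    return "Delayed" if probability > threshold else "Not Delayed"


# Hypothetical scaled features (e.g. wind speed, departure hour) and delay labels.
X = [[-1.2, -0.5], [-0.8, 0.3], [-0.3, -1.0], [0.2, 0.8], [0.9, -0.2], [1.4, 1.1]]
y = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(X, y)
print(w, b)
print(classify(w, b, [1.5, 1.2]))
print(classify(w, b, [-1.5, -1.2]))
```

The learned coefficients can be read directly: a positive weight means that larger values of the feature push the predicted probability toward "Delayed", which is the interpretability advantage noted above.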
4.4 XGBoost
XGBoost is an advanced implementation of gradient boosting that is optimized for speed and
performance. It uses a more regularized model formalization to control overfitting, making it
suitable for structured/tabular data. The model was trained using default hyperparameters and
demonstrated superior predictive performance on the test set compared to Random Forest and
Logistic Regression. XGBoost's boosting mechanism allows it to correct the errors of
previous models iteratively, leading to improved overall accuracy.
4.4.1 Working with XGBoost (Extreme Gradient Boosting)
XGBoost, short for Extreme Gradient Boosting, is a powerful machine learning algorithm
known for its high speed and accuracy in structured data problems like flight delay
prediction. It operates by building decision trees sequentially, where each new tree focuses on
correcting the prediction errors of the previous trees.
The workflow begins by feeding historical flight data into an initial simple model. The
algorithm calculates the errors (residuals) and trains the next tree to predict these errors.
This process repeats over several boosting rounds, with each tree making the model more
accurate. A weighted sum of all the trees’ outputs is used to make the final prediction.
XGBoost also includes regularization, which helps prevent overfitting, and supports missing
value handling natively.
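The residual-fitting loop described above can be sketched with plain gradient boosting on squared error. This is not the XGBoost library: it omits regularization, second-order gradients, and missing-value handling, and it uses one-split regression stumps as the trees. The data, the learning rate, and the number of rounds are assumptions for illustration.

```python
def fit_stump(xs, residuals):
    """Regression stump: one split on x, predicting the mean residual on each side."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # drop the largest value so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left) + sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, rounds=50, learning_rate=0.3):
    base = sum(ys) / len(ys)          # initial simple model: the mean delay
    predictions = [base] * len(ys)
    trees = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]  # current errors
        tree = fit_stump(xs, residuals)                       # next tree predicts the errors
        trees.append(tree)
        predictions = [p + learning_rate * tree(x) for p, x in zip(predictions, xs)]
    # Final prediction: base value plus the weighted sum of all tree outputs.
    return lambda x: base + learning_rate * sum(tree(x) for tree in trees)


def mse(model, xs, ys):
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Hypothetical data: wind speed (km/h) and delay in minutes.
wind = [5, 10, 15, 20, 30, 40, 50, 60]
delay = [2, 3, 5, 4, 15, 30, 45, 70]

mean_delay = sum(delay) / len(delay)
model = boost(wind, delay)
print(mse(lambda x: mean_delay, wind, delay), mse(model, wind, delay))
```

Each round shrinks the remaining training error, which is why the boosted model fits better than the initial mean-only model.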
4.5 Long Short-Term Memory (LSTM)
LSTM is a type of Recurrent Neural Network (RNN) that is especially designed to handle
sequence and time-series data, making it highly suitable for flight delay prediction based on
temporal patterns. The LSTM algorithm works by analyzing sequences of data points, for
example a time series of wind speeds, airport congestion, or previous flight delays.
The input data is reshaped into a 3-dimensional structure representing samples, time steps,
and features. LSTM networks contain special units called memory cells that can retain or
forget information over long periods using internal gates (input, forget, and output gates).
This memory allows the model to learn how past patterns (e.g., consecutive delays or
worsening weather) influence future delays.
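The 3-dimensional input structure (samples, time steps, features) can be illustrated with a sliding window over a table of observations. This is a minimal sketch: the helper make_sequences, the window length of 3, and the readings (wind speed and previous delay) are assumptions for illustration, not values taken from the project.

```python
import numpy as np

def make_sequences(observations, time_steps):
    """Turn a (n_observations, n_features) table into (samples, time_steps, features)."""
    observations = np.asarray(observations, dtype=float)
    n_samples = observations.shape[0] - time_steps + 1
    if n_samples <= 0:
        raise ValueError("need at least `time_steps` observations")
    # Each sample is a window of `time_steps` consecutive rows.
    return np.stack([observations[start:start + time_steps]
                     for start in range(n_samples)])

# Made-up hourly readings: (wind speed in km/h, previous flight delay in minutes).
readings = [[12, 0], [15, 5], [20, 10], [18, 20], [25, 35], [30, 40]]

X = make_sequences(readings, time_steps=3)
# X has 4 samples, each with 3 time steps of 2 features.
```

An array shaped this way is what a recurrent layer expects as input, with one window per sample.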
4.6.1 Accuracy
Accuracy is the simplest and most intuitive metric. It measures the proportion of total correct
predictions made by the model out of all predictions.
Example: If the model makes 90 correct predictions out of 100 total cases, the accuracy is
90%.
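Limitation: Accuracy alone is not enough when classes are imbalanced (e.g., 90% on-time, 10%
delayed), since a model that always predicts “on-time” would still reach 90% accuracy.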
4.6.2 Precision
Precision tells us how many of the flights predicted as “Delayed” were actually delayed. It
answers the question: “When the model says a flight is delayed, how often is it correct?”
Example: If the model predicted 50 flights as delayed and 40 were truly delayed, precision =
40/50 = 80%.
Good for: When the cost of false alarms is high (e.g., unnecessary rescheduling)
4.6.3 Recall
Recall measures how many of the actual delayed flights were correctly predicted by the
model. It answers the question: “Out of all actual delays, how many did the model detect?”
Example: If there are 100 delayed flights and the model correctly identifies 80, recall =
80/100 = 80%.
Good for: When missing a delay is critical (e.g., airport planning or safety)
4.6.4 F1 Score
The F1 Score is the harmonic mean of Precision and Recall. It gives a single score that
balances both concerns. It is especially useful when the class distribution is uneven (e.g., far
more “Not Delayed” than “Delayed” flights).
Example: If a model has a precision of 75% and a recall of 60%, the F1 Score = 2 × (0.75 ×
0.60) / (0.75 + 0.60) = 0.667 or 66.7%.
Good for: When both detecting delays and minimizing false alerts are important, such as in
real-time flight monitoring systems or automated scheduling adjustments.
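The four metrics can be computed directly from the counts of true and false positives and negatives. The sketch below is plain Python with made-up labels (1 = delayed, 0 = on-time); the function name classification_metrics is a hypothetical helper. It also shows why accuracy is misleading on a 90/10 class split.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up imbalanced data: 90 on-time flights followed by 10 delayed flights.
y_true = [0] * 90 + [1] * 10

# A trivial model that always predicts "on-time".
baseline = classification_metrics(y_true, [0] * 100)

# A hypothetical model: 2 false alarms, 8 delays caught, 2 delays missed.
y_pred = [0] * 88 + [1] * 2 + [1] * 8 + [0] * 2
scores = classification_metrics(y_true, y_pred)
```

The always-on-time baseline reaches high accuracy while detecting no delays at all, which is why precision, recall, and F1 are reported alongside accuracy.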
4.7 Dataset creation methodology
This project aggregates flight data from sources such as Kaggle and airport
databases. By using available APIs and public datasets, historical flight records were
collected over a span of four years. To build the dataset, flight identifiers, schedules, and
actual performance metrics were extracted, resulting in a consolidated dataset comprising
nearly 3000 records. These entries capture detailed information about flight timings, delays,
and operational attributes during the specified period. The main fields are:
Origin Airport and Destination Airport
These codes indicate where the flight starts and ends, as different airports may have varying
congestion levels or operational efficiency.
Scheduled Departure Time
This field is crucial, as delays often vary depending on the time of day, and it can also be
used to derive additional features like "Hour of Day".
Actual Departure Time
This is recorded to calculate the Departure Delay, while Arrival Delay shows how late the
flight landed compared to the schedule.
Scheduled Arrival Time
It refers to the time at which the flight is officially planned to land at its destination airport.
Actual Arrival Time
It is the real timestamp when the aircraft touches down at the destination. The difference
between Scheduled and Actual Arrival Time determines the arrival delay, which is a key
factor in flight delay prediction models.
Carrier
It refers to the airline company that operates a given flight, such as IndiGo, Air India, or
SpiceJet.
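The derived quantities mentioned above (Departure Delay, Arrival Delay, and "Hour of Day") follow from simple timestamp arithmetic. The sketch below uses pandas with made-up rows; the column names are assumptions chosen for illustration and may differ from the actual dataset.

```python
import pandas as pd

# Made-up example rows; column names are illustrative assumptions.
flights = pd.DataFrame({
    "carrier": ["IndiGo", "Air India"],
    "scheduled_departure": ["2023-01-05 09:30", "2023-01-05 18:00"],
    "actual_departure": ["2023-01-05 09:50", "2023-01-05 18:00"],
    "scheduled_arrival": ["2023-01-05 11:40", "2023-01-05 20:15"],
    "actual_arrival": ["2023-01-05 12:05", "2023-01-05 20:10"],
})

time_columns = ["scheduled_departure", "actual_departure",
                "scheduled_arrival", "actual_arrival"]
for column in time_columns:
    flights[column] = pd.to_datetime(flights[column])

# Delay = actual time minus scheduled time, expressed in minutes.
flights["departure_delay_min"] = (
    flights["actual_departure"] - flights["scheduled_departure"]
).dt.total_seconds() / 60
flights["arrival_delay_min"] = (
    flights["actual_arrival"] - flights["scheduled_arrival"]
).dt.total_seconds() / 60

# Hour of day taken from the scheduled departure time.
flights["hour_of_day"] = flights["scheduled_departure"].dt.hour
```

A negative arrival delay, as in the second row, simply means the flight landed ahead of schedule.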
CHAPTER 5
SYSTEM IMPLEMENTATION
MODULES:
5.1.1 Implementation
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score
)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam
It starts by importing pandas and numpy, which are foundational libraries for handling data
structures and numerical computations.
train_test_split from Scikit-learn is used to divide the dataset into training and testing
subsets, which is essential for evaluating a model’s performance on unseen data.
LabelEncoder is used to convert categorical text labels (like airport codes or status labels)
into numerical form, making them suitable for machine learning algorithms.
StandardScaler standardizes numerical features by removing the mean and scaling to unit
variance, which helps improve model training and convergence for algorithms sensitive to
feature scaling.
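A minimal sketch of how these three utilities fit together is shown below. The tiny DataFrame, its column names, and the 25% test split are assumptions made for the example. The scaler is fitted on the training subset only, so that no information from the test subset leaks into training.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Made-up rows; column names are illustrative assumptions.
flights = pd.DataFrame({
    "origin": ["DEL", "BOM", "DEL", "MAA", "BOM", "DEL", "MAA", "BOM"],
    "distance_km": [1150, 840, 1750, 1030, 1390, 2170, 290, 1150],
    "delayed": [0, 1, 0, 1, 0, 1, 0, 1],
})

# Convert airport codes to integers (classes are sorted alphabetically).
encoder = LabelEncoder()
flights["origin_code"] = encoder.fit_transform(flights["origin"])

X = flights[["origin_code", "distance_km"]]
y = flights["delayed"]

# Hold out 25% of the rows for testing, keeping the class ratio the same.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Fit the scaler on training data only, then apply it to both subsets.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Tree-based models such as Random Forest and XGBoost are largely insensitive to feature scaling, so the scaling step matters mainly for Logistic Regression and the neural network.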
Scikit-learn
XGBoost (XGBClassifier):
It is well-known for its performance and efficiency in structured data problems and often
outperforms traditional machine learning models in classification tasks like flight delay
prediction.
The functions accuracy_score, precision_score, recall_score, and f1_score are used to evaluate
the performance of classification models. These metrics provide insights into how well the
model is making predictions. For instance, precision and recall help understand the trade-off
between false positives and false negatives, which is important in real-world delay prediction
systems.
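As a short usage sketch, the four Scikit-learn metric functions take the true labels first and the predicted labels second. The label vectors below are made up for illustration (1 = delayed, 0 = on-time).

```python
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score
)

# Made-up labels: 3 delayed flights and 5 on-time flights.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical predictions: one missed delay and one false alarm.
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

accuracy = accuracy_score(y_true, y_pred)     # 6 of 8 predictions correct
precision = precision_score(y_true, y_pred)   # 2 of 3 flagged flights truly delayed
recall = recall_score(y_true, y_pred)         # 2 of 3 actual delays detected
f1 = f1_score(y_true, y_pred)                 # harmonic mean of the two
```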
TensorFlow Keras
Sequential is a Keras model type that allows stacking layers in a linear manner. The LSTM
(Long Short-Term Memory) layer is a type of recurrent neural network (RNN) ideal for time-
series or sequential data, such as predicting delays based on a sequence of historical records.
Dense layers are fully connected neural network layers that process information between
neurons. Dropout is a regularization technique that randomly disables neurons during training
to reduce overfitting.
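One way these layers might be stacked for binary delay classification is sketched below. The layer sizes, dropout rate, learning rate, and input shape (5 time steps of 4 features) are illustrative assumptions; the report does not state the architecture that was actually used.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam

TIME_STEPS = 5    # assumed number of past observations per sample
N_FEATURES = 4    # assumed number of features per time step

model = Sequential([
    Input(shape=(TIME_STEPS, N_FEATURES)),
    LSTM(32),                        # summarizes the sequence into one vector
    Dropout(0.2),                    # randomly disables 20% of units in training
    Dense(16, activation="relu"),    # fully connected hidden layer
    Dense(1, activation="sigmoid"),  # probability that the flight is delayed
])

model.compile(
    optimizer=Adam(learning_rate=0.001),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
```

The single sigmoid output pairs with the binary cross-entropy loss, which suits a two-class target such as on-time versus delayed.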
import pandas as pd

# Path to the airline delay CSV on the mounted Google Drive
file_path = "/content/drive/MyDrive/flight delay prediction/Airline Delay (1).csv"
df = pd.read_csv(file_path)
df.head()  # preview the first five rows
This snippet imports pandas, loads a CSV file containing airline delay data from Google
Drive, stores the data in a DataFrame called df, and displays the first five rows to give a
quick look at the dataset's structure and contents.
Figure No. 5.1 Data Preprocessing
Data preprocessing is a vital step in machine learning that involves preparing and
transforming raw data into a clean and organized format suitable for modeling. Since real-
world data often contains missing values, inconsistencies, and noise, preprocessing helps
improve the quality and reliability of the data. This process typically includes cleaning the
data by handling missing or incorrect entries, normalizing or scaling numerical features to
ensure uniformity, and converting categorical variables into a numerical format through
encoding techniques. Additionally, feature selection or extraction may be applied to identify
the most relevant information for the model, enhancing both accuracy and efficiency.
The cleaning step keeps a value if it is not null, non-negative, and a whole number, and
replaces anything else with NaN. Rows containing NaN are then dropped and the remaining
values are converted to integers.
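The cleaning rule described above could look roughly like the following. This is a sketch under assumptions: the column name arr_delay, the helper keep_valid_count, and the sample values are hypothetical, since the original code is not reproduced in the text.

```python
import numpy as np
import pandas as pd

def keep_valid_count(value):
    """Keep non-null, non-negative whole numbers; map everything else to NaN."""
    if pd.notna(value) and value >= 0 and float(value).is_integer():
        return value
    return np.nan

# Made-up column with a negative value, a fraction, and a missing entry.
df = pd.DataFrame({"arr_delay": [5, -3, 12.5, None, 0, 30.0]})

df["arr_delay"] = df["arr_delay"].apply(keep_valid_count)

# Drop rows that failed validation, then store the survivors as integers.
df = df.dropna(subset=["arr_delay"]).astype({"arr_delay": int})
```

Only the values 5, 0, and 30 survive in this example. The negative, fractional, and missing entries are removed.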
5.1.4 Feature Extraction
The goal is to prepare new flight-related features, such as departure and arrival times, origin
and destination airports, and flight status, and add them as new columns to an existing
DataFrame (df). To do this, the code first calculates the number of rows in the DataFrame and
initializes several empty lists to store the new values for each row.
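The list-based pattern described above can be sketched as follows. How each value is actually produced is not shown in the text, so the raw columns (route, dep_delay_min), the way they are split, and the 15-minute delay threshold are all assumptions made purely for illustration.

```python
import pandas as pd

# Made-up input; the real source columns are not specified in the text.
df = pd.DataFrame({
    "route": ["DEL-BOM", "MAA-DEL", "BOM-MAA"],
    "dep_delay_min": [0, 25, 5],
})

n_rows = len(df)  # number of rows to generate values for

# One empty list per new feature.
origins, destinations, statuses = [], [], []

for i in range(n_rows):
    origin, destination = df.loc[i, "route"].split("-")
    origins.append(origin)
    destinations.append(destination)
    # Assumed rule: more than 15 minutes late counts as delayed.
    statuses.append("delayed" if df.loc[i, "dep_delay_min"] > 15 else "on-time")

# Attach the collected values as new columns.
df["origin_airport"] = origins
df["destination_airport"] = destinations
df["flight_status"] = statuses
```

Each list must end up with exactly n_rows entries, otherwise the column assignment raises an error.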
Figure No. 5.3 Model Selection and Training
The load_and_train method loads the dataset from a specified CSV file, checks for the
presence of critical columns such as actual_departure_time, and drops rows with missing
values in any essential fields. Then, it performs feature engineering by converting scheduled
and actual departure times to minutes and encoding categorical variables using the label
encoders.
A binary label is created from the flight_status column, mapping 'on-time' to 0 and
'delayed' to 1. The processed features, such as the encoded airports, carrier, departure times,
and the year, are used to train an XGBClassifier, a gradient boosting model well suited for
classification tasks. After training, the model and encoders are serialized with the pickle
module and saved as a .pkl file on Google Drive, enabling future reuse without retraining.
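The feature-engineering and serialization steps of load_and_train can be sketched as below. This is not the project's code: the helper names (to_minutes, prepare), the column names other than actual_departure_time and flight_status, the "HH:MM" time format, and the in-memory buffer standing in for the Google Drive file are all assumptions. The XGBClassifier training call is left as a comment so the sketch runs without the xgboost package.

```python
import io
import pickle
import pandas as pd
from sklearn.preprocessing import LabelEncoder

REQUIRED = ["origin", "destination", "carrier", "scheduled_departure_time",
            "actual_departure_time", "flight_status", "year"]

def to_minutes(hhmm):
    """Convert an assumed 'HH:MM' string to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

def prepare(df):
    missing = [c for c in REQUIRED if c not in df.columns]
    if missing:
        raise KeyError(f"missing columns: {missing}")
    df = df.dropna(subset=REQUIRED).copy()   # drop rows lacking essential fields
    encoders = {}
    for col in ["origin", "destination", "carrier"]:
        encoders[col] = LabelEncoder()
        df[col + "_enc"] = encoders[col].fit_transform(df[col])
    df["sched_dep_min"] = df["scheduled_departure_time"].map(to_minutes)
    df["actual_dep_min"] = df["actual_departure_time"].map(to_minutes)
    y = df["flight_status"].map({"on-time": 0, "delayed": 1})
    X = df[["origin_enc", "destination_enc", "carrier_enc",
            "sched_dep_min", "actual_dep_min", "year"]]
    return X, y, encoders

# Made-up rows; the third row lacks an actual departure time and is dropped.
raw = pd.DataFrame({
    "origin": ["DEL", "BOM", "MAA", "DEL"],
    "destination": ["BOM", "DEL", "DEL", "MAA"],
    "carrier": ["IndiGo", "Air India", "SpiceJet", "IndiGo"],
    "scheduled_departure_time": ["09:30", "18:00", "06:15", "21:45"],
    "actual_departure_time": ["09:30", "18:40", None, "21:50"],
    "flight_status": ["on-time", "delayed", "delayed", "on-time"],
    "year": [2022, 2022, 2023, 2023],
})

X, y, encoders = prepare(raw)
# model = XGBClassifier().fit(X, y)   # training step from the text, omitted here

# Serialize the encoders with pickle (a file on disk would work the same way).
buffer = io.BytesIO()
pickle.dump({"encoders": encoders}, buffer)
buffer.seek(0)
restored = pickle.load(buffer)["encoders"]
```

Saving the encoders together with the model matters because new flights must be encoded with exactly the same category-to-integer mapping that was used during training.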
XGBOOST