Flight Delay Prediction: Project Synopsis On

This document provides a synopsis for a project using machine learning models to predict flight delays. The goals are to apply linear regression to predict arrival delays based on departure delay and route distance. The dataset used contains features like origin, destination, carrier, day of week, and departure/arrival delays. Python is identified as the programming language due to its powerful machine learning and scientific computing packages. The proposed work will use multiple linear regression to model the relationship between arrival delay as the dependent variable and other features as explanatory variables. The system will involve data collection, preprocessing, training a machine learning model, and evaluating model performance.

Uploaded by

Ramesh Kumar

PROJECT SYNOPSIS ON

Flight Delay Prediction

DEPARTMENT OF INFORMATION TECHNOLOGY
NETAJI SUBHASH ENGINEERING COLLEGE

RAMESH KUMAR (10900216031)

Under the guidance of:

(ANUPAM BERA)

Project group: SAURABH KUMAR, SAMEER AKHTER, PAYAL KUMARI, RAMESH KUMAR
Under my guidance and supervision the synopsis of the project
_____________________________________________________________of 4th year
Information Technology is submitted.

(signature of Project guide)

--------------------------------------
ANUPAM BERA
Information Technology
Netaji Subhash Engineering College,
Garia, Kolkata - 700152
ACKNOWLEDGEMENT
I owe a deep sense of gratitude to my respected mentor, Prof. ANUPAM BERA,
Department of Information Technology, Netaji Subhash Engineering College,
Kolkata, for his meticulous and expert guidance, constructive criticism, patient
hearing and benevolent behaviour throughout the present research. I
shall remain grateful to him for his cordial, cooperative attitude and his wise and
knowledgeable counsel, which acted as an impetus to the successful completion of
my project titled MACHINE LEARNING MODEL FOR FLIGHT DELAY
PREDICTION.
I would particularly like to thank the Head of the Department for giving me guidance
and inspiration during my study in the department. I will never forget the kind help
extended by the HOD, nor the help provided by all the faculty members.
Last but not least, my friends in the department deserve some words of
thanks.
CONTENT
Abstract
Introduction
Project Goals and Scope
Data and Tools
4.1 Data Used
4.1.1 Choosing the Dataset
4.2 Tools
Python and associated packages
Proposed Work
5.1 Multiple Linear Regression
System Design
I. Data Collection
II. Pre Processing
III. Training the Machine
IV. Data Scoring
Conclusion
Future work
Abstract
As the population grows, demand for air travel rises and the airline industry
expands, which in turn leads to air-traffic congestion and flight delays. Flight
delays have not only an economic impact but also harmful environmental effects.
Air-traffic management is becoming increasingly challenging. In this project I
apply a machine learning algorithm, linear regression, to predict whether a given
flight's arrival will be delayed and, if so, by how much.
Introduction

Delay is one of the most annoying things people face in day-to-day life, and
flights are no exception. People are in a hurry, and they hate delays. A delay
may be defined as the difference between the scheduled and actual times of
departure or arrival of a plane. National regulatory authorities maintain a
multitude of indicators related to tolerance thresholds for flight delays. Indeed, flight
delay is an essential subject in the context of air transportation systems. For
passengers, a flight delay means inconvenience, frustration, and a double loss of
time and money; for the airport, delays seriously disrupt normal operations;
for the airline, frequent delays not only bring huge economic losses but also
damage its reputation. Flight delay has become a constraint on the development
of the aviation industry.
Project Goals and Scope
In this project we establish a multiple linear regression model that uses departure
delay and route distance to predict arrival delay, and present the design and
implementation of a flight delay prediction system. A chief goal of this project is to
add to the academic understanding of flight delay prediction. The hope is that, with
a greater understanding of how flights come to be delayed, customers will be better
equipped to anticipate and avoid delays.
It is important here to define the scope of the project. This project focuses
exclusively on predicting the arrival delay of individual flights. It makes
no attempt to decide what action should be taken in response to each prediction.
Beyond producing predictions, the project will analyse the accuracy of these predictions.

Data and Tools

4.1 Data Used

4.1.1 Choosing the Dataset
We have selected a dataset available on kaggle.com. The features contained in the
dataset are as follows:
1. Origin
2. Dest
3. Unique_Carrier
4. Day_of_Week
5. Dep_Hour
6. Arr_Delay
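
As a rough illustration of how a table with these columns might be handled in Pandas, the sketch below builds a tiny in-memory sample. The rows, airport codes and carrier codes are invented for illustration and are not taken from the Kaggle dataset; only the column names come from the list above.

```python
import pandas as pd

# Hypothetical sample rows using the column names listed above;
# the values are invented for illustration, not taken from the Kaggle data.
flights = pd.DataFrame(
    {
        "Origin": ["CCU", "DEL", "BOM", "DEL"],
        "Dest": ["DEL", "BOM", "CCU", "CCU"],
        "Unique_Carrier": ["AI", "6E", "AI", "6E"],
        "Day_of_Week": [1, 3, 5, 7],
        "Dep_Hour": [6, 14, 21, 9],
        "Arr_Delay": [12.0, -3.0, 45.0, 0.0],
    }
)

# Separate the explanatory columns from the target column.
target = "Arr_Delay"
features = [c for c in flights.columns if c != target]

# Average arrival delay per carrier, a quick first look at the data.
mean_delay_by_carrier = flights.groupby("Unique_Carrier")[target].mean()

print(features)
print(mean_delay_by_carrier)
```

Treating Arr_Delay as the target and the remaining columns as explanatory variables matches the role the synopsis gives to arrival delay as the dependent variable.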
4.2 Tools
Python and associated packages
Python was the language of choice for this project. This was an easy decision for
several reasons:
1. Python as a language has an enormous community behind it. Any problems
that might be encountered can be easily solved with a trip to Stack Overflow.
Python is among the most popular languages on the site which makes it very
likely there will be a direct answer to any query.
2. Python has an abundance of powerful tools ready for scientific computing.
Packages such as Numpy, Pandas, and SciPy are freely available,
performant, and well documented. Packages such as these can dramatically
reduce and simplify the code needed to write a given program. This makes
iteration quick.
3. Python as a language is forgiving and allows for programs that look like
pseudo code. This is useful when pseudo code given in academic papers
needs to be implemented and tested. Using Python, this step is usually
reasonably trivial.

Proposed Work

The project is based on regression.

5.1 Multiple Linear Regression

In statistics, linear regression is an approach for modeling the relationship between
a scalar dependent variable y and one or more explanatory variables (or
independent variables) denoted X. The case of one explanatory variable is called
simple linear regression. For more than one explanatory variable, the process is
called multiple linear regression. In linear regression, the relationships are modeled
using linear predictor functions whose unknown model parameters are estimated
from the data. Such models are called linear models.
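
Written out, the model described above takes the following form. This is a sketch: the variable names ArrDelay, DepDelay and Distance are illustrative labels for the quantities named in the project goals, and ordinary least squares is assumed here as the estimation method, which the text does not state explicitly.

```latex
% General multiple linear regression model with p explanatory variables
y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} + \varepsilon_i,
\qquad i = 1, \dots, n

% Specialised to this project (variable names are illustrative)
\mathrm{ArrDelay}_i = \beta_0 + \beta_1\,\mathrm{DepDelay}_i
                    + \beta_2\,\mathrm{Distance}_i + \varepsilon_i

% Ordinary least squares estimate of the unknown parameters (assumed method)
\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n}
  \bigl( y_i - \beta_0 - \beta_1 x_{i1} - \dots - \beta_p x_{ip} \bigr)^2
```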
Linear regression has many practical uses. Most applications fall into one of the
following two broad categories: (1) if the goal is prediction, or forecasting, or error
reduction, linear regression can be used to fit a predictive model to an observed
data set of Y and X values. After developing such a model, if an additional value of
X is then given without its accompanying value of Y, the fitted model can be used
to make a prediction of the value of Y. (2) given a variable Y and a number of
variables X1, ..., Xp that may be related to Y, linear regression analysis can be
applied to quantify the strength of the relationship between Y and the Xj, to assess
which Xj may have no relationship with Y at all, and to identify which subsets of the
Xj contain redundant information about Y.
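
The prediction use case in (1) can be sketched with NumPy alone. The data below is synthetic and the "true" coefficients are assumptions chosen only to generate a toy target; the helper name predict_arrival_delay is hypothetical. The fit itself uses ordinary least squares via numpy.linalg.lstsq.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration, not the project's dataset):
# departure delay in minutes and route distance in kilometres.
n = 200
dep_delay = rng.uniform(0, 120, size=n)
distance = rng.uniform(300, 2500, size=n)

# Assumed "true" relationship, used only to generate the toy target.
arr_delay = 5.0 + 0.9 * dep_delay - 0.002 * distance + rng.normal(0, 3, size=n)

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones(n), dep_delay, distance])

# Ordinary least squares fit: minimises the sum of squared residuals.
beta, *_ = np.linalg.lstsq(X, arr_delay, rcond=None)


def predict_arrival_delay(dep_delay_min, distance_km):
    """Predict arrival delay (minutes) from the fitted coefficients."""
    return beta[0] + beta[1] * dep_delay_min + beta[2] * distance_km


print("intercept, dep_delay coef, distance coef:", beta)
print("predicted delay:", predict_arrival_delay(30.0, 1500.0))
```

Once beta has been estimated, a new flight's departure delay and distance are enough to produce a predicted arrival delay, which is exactly the "additional value of X without its accompanying Y" situation described above.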

System Design
The first step is the conversion of the raw data into processed data. This is done
using feature extraction, since the raw data contains many attributes
but only a few of them are useful for prediction. In this step the key
attributes are extracted from the whole list of attributes available in the raw
dataset. Feature extraction starts from an initial
set of measured data and builds derived values, or features. These features are
intended to be informative and non-redundant, facilitating the subsequent learning
and generalization steps. Feature extraction is a dimensionality reduction process,
in which the initial set of raw variables is reduced to a more manageable set of
features that still accurately and completely describes the original
data set.
Feature extraction is followed by a splitting step, in which the data obtained
after feature extraction is split into two distinct segments, a training set and
a test set. (Classification, by contrast, is the problem of
recognizing to which set of categories a new observation belongs.) The training data
set is used to train the model, whereas the test data is used to estimate the accuracy
of the model. The split is made so that the training data has a higher
proportion than the test data.
The random forest algorithm uses a collection of
random decision trees to analyze the data. In layman's terms, out of the total number
of decision trees in the forest, each group of trees looks for specific
attributes in the data. This is known as data splitting. In this case, the end
goal of our proposed system is to predict the arrival delay of a flight by analyzing
historical flight data.
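
The attribute selection and train/test split described above can be sketched as follows. The raw table is randomly generated, and Tail_Num is an invented extra attribute standing in for a column that is not needed for prediction; the 0.8 training fraction is one possible choice that keeps the training share larger.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n_rows = 10

# Hypothetical raw table: the first six columns follow the dataset section,
# "Tail_Num" is an invented attribute that is dropped below.
raw = pd.DataFrame(
    {
        "Origin": rng.choice(["CCU", "DEL", "BOM"], size=n_rows),
        "Dest": rng.choice(["CCU", "DEL", "BOM"], size=n_rows),
        "Unique_Carrier": rng.choice(["AI", "6E"], size=n_rows),
        "Day_of_Week": rng.integers(1, 8, size=n_rows),
        "Dep_Hour": rng.integers(0, 24, size=n_rows),
        "Arr_Delay": rng.normal(10, 20, size=n_rows).round(1),
        "Tail_Num": [f"VT-{i:03d}" for i in range(n_rows)],
    }
)

# Keep only the attributes considered useful for prediction.
key_columns = [
    "Origin", "Dest", "Unique_Carrier", "Day_of_Week", "Dep_Hour", "Arr_Delay",
]
processed = raw[key_columns]

# Shuffle the row positions, then give the larger share to training.
train_fraction = 0.8
order = rng.permutation(len(processed))
n_train = int(train_fraction * len(processed))
train = processed.iloc[order[:n_train]]
test = processed.iloc[order[n_train:]]

print(len(train), "training rows,", len(test), "test rows")
```

Shuffling before splitting avoids putting, for example, only the latest flights in the test set; whether that is desirable depends on how the real data is ordered.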
The various modules of the project are divided into the segments described below.
I. Data Collection
Data collection is a very basic module and the initial step of the project. It
deals with collecting the right dataset. The dataset to be
used for delay prediction has to be filtered on various
aspects. Data collection can also enhance the dataset by adding more
data from external sources. Our data mainly consists of the previous year's flight
timetable. Initially, we will analyze the Kaggle dataset and, depending on the
accuracy achieved, use the model with this data to make predictions.
II. Pre Processing
Data pre-processing is a part of data mining that involves transforming raw data
into a more coherent format. Raw data is usually inconsistent or incomplete and
often contains many errors. Data pre-processing involves checking for
missing values, handling categorical values, splitting the dataset into training and
test sets, and finally applying feature scaling to limit the range of the variables
so that they can be compared on a common scale.
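
A minimal sketch of these pre-processing steps in Pandas is shown below. The sample rows are invented, and the specific choices (median fill for numeric gaps, an "UNKNOWN" label for missing carriers, one-hot encoding, min-max scaling) are assumptions made for illustration rather than the project's confirmed method.

```python
import pandas as pd

# Invented sample with deliberate gaps in a categorical and a numeric column.
flights = pd.DataFrame(
    {
        "Unique_Carrier": ["AI", "6E", None, "AI"],
        "Dep_Hour": [6.0, None, 21.0, 9.0],
        "Arr_Delay": [12.0, -3.0, 45.0, 0.0],
    }
)

# 1. Missing values: count them, then fill numeric gaps with the median
#    and categorical gaps with an explicit "UNKNOWN" label.
missing_counts = flights.isna().sum()
flights["Dep_Hour"] = flights["Dep_Hour"].fillna(flights["Dep_Hour"].median())
flights["Unique_Carrier"] = flights["Unique_Carrier"].fillna("UNKNOWN")

# 2. Categorical values: one-hot encode the carrier code.
encoded = pd.get_dummies(flights, columns=["Unique_Carrier"])

# 3. Feature scaling: min-max scale Dep_Hour into the range [0, 1].
hour = encoded["Dep_Hour"]
encoded["Dep_Hour"] = (hour - hour.min()) / (hour.max() - hour.min())

print(missing_counts)
print(encoded)
```

In a full pipeline the fill values and scaling bounds would normally be computed on the training set only and then reused on the test set, so that no information from the test data leaks into training.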
III. Training the Machine
Training the machine means feeding the training data to the algorithm so that it
can learn a model. The training sets are used to tune and fit the models. The test sets are
left untouched during training, because a model should be judged on data it has not
seen. The training of
the model includes cross-validation, which gives a well-grounded estimate of the
performance of the model using only the training data. Tuning is meant to
adjust hyperparameters such as the number of trees in a random forest.
We perform the entire cross-validation loop on each set of hyperparameter values.
We then calculate a cross-validated score for each set of
hyperparameters and select the best one. The idea behind
training the model is that we start with some initial values and then
optimize the model's parameters on the dataset. This is repeated
until we reach the optimal values. Finally, we take the predictions of the trained
model on the inputs from the test dataset. The data is divided in the ratio 80:20,
with 80% for the training set and the remaining 20% for the test set.
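
The 80:20 split and the cross-validated hyperparameter search can be sketched with NumPy as follows. To keep the example self-contained, it tunes the penalty strength of a ridge regression instead of the number of trees in a random forest; that substitution, the synthetic data, the candidate values and all function names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in data (assumption): two features, one numeric target.
n = 100
X = rng.normal(size=(n, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(0, 0.5, size=n)

# 80:20 split; the test rows are set aside and not used during tuning.
order = rng.permutation(n)
n_train = int(0.8 * n)
train_idx, test_idx = order[:n_train], order[n_train:]
X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]


def fit_ridge(X, y, alpha):
    """Fit ridge regression; the intercept column is not penalised."""
    A = np.column_stack([np.ones(len(X)), X])
    penalty = alpha * np.eye(A.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(A.T @ A + penalty, A.T @ y)


def mean_squared_error(X, y, coef):
    """Mean squared error of the linear model `coef` on (X, y)."""
    A = np.column_stack([np.ones(len(X)), X])
    return float(np.mean((A @ coef - y) ** 2))


def cross_validated_mse(X, y, alpha, k=5):
    """Average validation MSE over k folds of the training data."""
    folds = np.array_split(np.arange(len(X)), k)
    scores = []
    for fold in folds:
        mask = np.ones(len(X), dtype=bool)
        mask[fold] = False  # hold this fold out for validation
        coef = fit_ridge(X[mask], y[mask], alpha)
        scores.append(mean_squared_error(X[fold], y[fold], coef))
    return float(np.mean(scores))


# Run the full cross-validation loop for each candidate hyperparameter value.
candidates = [0.0, 0.1, 1.0, 10.0, 100.0]
cv_scores = {a: cross_validated_mse(X_train, y_train, a) for a in candidates}
best_alpha = min(cv_scores, key=cv_scores.get)

# Refit on all training rows, then evaluate once on the held-out test rows.
final_coef = fit_ridge(X_train, y_train, best_alpha)
test_mse = mean_squared_error(X_test, y_test, final_coef)

print("best alpha:", best_alpha, "test MSE:", test_mse)
```

The test rows are touched exactly once, after the hyperparameter has been chosen, so the reported test error is an estimate on data the model has not seen.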
IV. Data Scoring
The process of applying a predictive model to a set of data is referred to as scoring
the data. The technique used to process the dataset is the Random Forest
algorithm. Random forest is an ensemble method that is commonly used for
both classification and regression. Based on the learning models, we achieve
interesting results. The last module thus describes how the result of the model can
help to predict the probability of a flight being delayed based on certain
parameters. It also shows the vulnerabilities of a particular entity. A user
authentication control is implemented to make sure that only authorized
entities access the results.
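
Scoring can be illustrated by applying an already-trained model to held-out flights and summarising the errors. In the sketch below the coefficients, the flights and their actual delays are all invented, a linear model stands in for whichever trained model is used, and the function name score is hypothetical.

```python
import numpy as np

# Assumed coefficients of an already-trained linear model (illustrative values):
# arrival delay = intercept + b_dep_delay * departure delay + b_distance * distance.
intercept, b_dep_delay, b_distance = 2.0, 0.9, -0.001


def score(dep_delay_min, distance_km):
    """Apply the trained model to new inputs (this is the scoring step)."""
    return intercept + b_dep_delay * dep_delay_min + b_distance * distance_km


# Hypothetical held-out flights with known actual arrival delays (minutes).
dep_delay = np.array([0.0, 10.0, 30.0, 60.0])
distance = np.array([1000.0, 2000.0, 1000.0, 2000.0])
actual = np.array([2.0, 8.0, 30.0, 52.0])

predicted = score(dep_delay, distance)
errors = predicted - actual

# Common regression metrics for judging the scored predictions.
mae = float(np.mean(np.abs(errors)))
rmse = float(np.sqrt(np.mean(errors ** 2)))
r2 = float(1.0 - np.sum(errors ** 2) / np.sum((actual - actual.mean()) ** 2))

print("MAE:", mae, "RMSE:", rmse, "R^2:", r2)
```

Mean absolute error is expressed in minutes, which makes it easy to interpret as the typical size of a misprediction; R^2 reports the share of the variation in actual delays that the model explains.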

Conclusion
In this project, I was able to successfully apply machine learning algorithms to
predict flight arrival delay and to show that a simple model like linear regression can
predict fairly accurately whether a flight's arrival will be delayed, and by how
much.

Future work
In future work I would like to improve my model further, perhaps with more
training data, a deeper neural network, or both. Taxi-delay prediction is a natural
progression of this work, considering the amount of fuel wasted while taxiing. Accurate
taxi-delay prediction requires taking airport runway and taxiway configurations into
consideration, an area where very little work exists.

