
RegionEx

The document discusses flight delay data from two airlines, RegionEx and MDA, to illustrate data analysis challenges. Key points include:
• Simple statistics like average delay can provide misleading comparisons between airlines depending on metrics and destinations.
• Disaggregating national data can reverse conclusions about which airline has better punctuality due to factors like airport congestion.
• Correlations between variables like delays and day of week or passenger loads may not indicate direct causation.
• Determining whether one airline has unambiguously worse on-time performance requires weighing multiple analyses and considerations.


Flight Delays at RegionEx

Learning Journal
Objectives
• This case uses a data set modeled after U.S.
commercial airline flight delay data to explore
the use of descriptive statistics and graphical
analysis in solving a business problem.
• The case illustrates potential data analysis pitfalls
and paradoxes, and demonstrates that even
when using simple methodology, an analyst
must act like a detective to discover what the
data do or do not say about the problem at hand.
After completing this case, you should be able to:

• Compute and compare descriptive statistics of a response variable
• Construct a histogram and interpret differences in distributional shape
• Construct a scatter plot of two variables and compute their correlation coefficient
• Perform a hypothesis test on the difference between two means and two proportions
After completing this case, you should be able to:

Recognize statistical paradoxes, including:

• the starkly different outcomes yielded by subtle differences in metrics;
• Simpson’s Paradox: comparing two data sets from an aggregated viewpoint might show a different relationship than when the data are disaggregated, due to differences in the sample sizes;
• the non‐causal relationship that can exist between two correlated variables.
Plan
• Interpreting the descriptive statistics and
histograms of the data
• Discussion of airline schedule “padding”, use
of scatter plots to depict correlations between
flight delays and day of week, cancellation
policies between the two airlines, passenger
delays versus flight delays, and other
phenomena in the data set.
Case Background
• Flight delay statistics are often reported in the media to
inform airline passengers of potential differences between
airlines.

• There is a standard definition, provided by the Federal Aviation Administration (FAA), for what it means for a flight to be delayed: a delayed flight is any flight which arrives fifteen or more minutes later than its scheduled arrival time.

• However, there is no standard reporting technique for summarizing the flight delay performance of a given airline, or for comparing that performance across airlines.
Case Background

• Common metrics include percentage of flights that were delayed and average flight delay in minutes.

• Sometimes these are presented as raw numbers; other times airlines are ranked based on these metrics.

• As a result, a consumer might draw different conclusions from reading two different newspaper articles about the same data.
Case Background
• Airline flight delays constitute a national
problem with significant economic impact.
• A report by the Joint Economic Committee of Congress estimates that flight delays in 2007 cost the economy $41 billion.
Case Background
• The airline industry’s on‐time performance in 2007 and 2008 was the worst since 1995, when the Bureau of Transportation Statistics started collecting flight delay data.
• Flight delay statistics are regularly reported in the
news media, often coupled with “rankings” of
airline performance. For example, the Wall Street
Journal article titled “Fliers Saw Longer Delays in
2008” includes the following ranking table.
Case Background
• The performance of regional airlines in
particular has received special attention in the
news media. The Wall Street Journal article
“Flying Gets Rough on Regional Airlines”
• “In the third quarter last year, four of the six airlines
with the worst on‐time records were regionals. In
October, the most recent month reported by the
Department of Transportation, the six worst carriers in
baggage were all regionals, and six of the seven airlines
that canceled flights most frequently were regionals.
Comair, owned by Delta Air Lines Inc., dropped to 64.9
percent on‐time from 84.9 percent a year earlier.
American Eagle, owned by AMR Corp.'s American
Airlines, canceled flights three times as frequently as its
parent airline in October.”
In another article, the discrepancy between major airlines and
their regional partners is noted in the following table
The same article notes, however, that the
discrepancy might not be entirely the regional
airlines’ fault:
• “Regional airlines say the operational infirmities often
aren’t their fault – they blame the mainline carriers
who handle everything from selling tickets, scheduling
flights and issuing frequent‐flier miles to deciding
which flights to cancel or delay.
• Regionals do have control over issues like crews and
maintenance that can affect delays and cancellations.
• But when bad weather hits big hubs, it’s the major
airlines that order delays and cancellations at their
regional partners in order to get the biggest planes
moving first.”
The issue of schedule padding has also been highlighted. In an article titled
“Why Flights are Getting Longer”, Scott McCartney writes:

• “Many delays are now simply being incorporated into schedules, at high cost to consumers and airlines.
• Congestion at airports and in the sky have forced airlines to pad
their schedules more than ever so flights have a better chance of
arriving ’on‐time,’ which the Department of Transportation
defines as within 15 minutes of the airline's scheduled arrival
time.
• Flights now arrive technically ’on‐time,’ but with 30 minutes or
more of delay written into the flight plan”.
• The above citations are but a few examples of the public attention
that flight delays have received and their business implications.
They highlight the importance of properly interpreting flight
delay statistics that are often deceiving in their simplicity.
To illustrate these issues and some common data analysis paradoxes in a concise
data set, this case uses fictitious data. However, these data reflect phenomena
observed in real airline data, publicly available from the U.S. Bureau of
Transportation Statistics (BTS) website: http://www.transtats.bts.gov
• For instance, let’s examine the BTS Flight Delay statistics from
September 2008 of Delta Airlines (a major US airline) and Atlantic
Southeast Airlines (a regional airline affiliated with Delta Connection).
• The average delays and the percentages of delayed flights are shown
in the table below for flights originating at Atlanta airport and
arriving at Buffalo, NY and New Orleans, LA airports.
• We can see for the case of Buffalo that while Atlantic Southeast has a
lower average flight delay than Delta (in fact, its average flight delay is
negative, indicating early arrivals on average), 18.2% of its flights
arrive more than 15 minutes late, compared with 11.5% for Delta.

• Conversely, on flights arriving in New Orleans, Delta experiences a lower average flight delay, but a higher percentage of delayed flights.
• Thus, an airline having a lower percentage of flights
delayed could actually experience a higher average flight
delay.
• This illustrates that our choice of metric for quantifying flight
delays can influence which airline appears more punctual.
We can also find examples of Simpson’s Paradox
in this real delay data
• Examining the same two airlines’ flights from
Atlanta to Newark, NJ and Dayton, OH, we find
the following average flight delays and
percentages of flights delayed (aggregated
over both destinations):
• Delta Airlines appears to fare worse than
Atlantic Southeast with respect to both
average flight delay and percentage of flights
delayed.
• If we disaggregate the data and look at each
destination individually, we draw a different
conclusion:
• On flights to both destinations, Delta’s average flight
delay and percentage of flights delayed are lower
than those of Atlantic Southeast!
• Delta flies nearly eight times as many flights into
Newark, a highly congested airport, as Atlantic
Southeast, but only half as many flights into Dayton,
a less congested airport.
• Thus, Delta is “penalized” for offering more service
to the more‐congested airport than to the less‐
congested airport, even though its service to both
airports is more punctual than that of Atlantic
Southeast.
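The arithmetic behind this reversal can be sketched in a few lines. The flight counts and delay rates below are hypothetical, chosen only to mirror the pattern the text describes (Delta flying roughly eight times as many flights into congested Newark as Atlantic Southeast, and half as many into less-congested Dayton); the real BTS figures are not reproduced here.

```python
# Hypothetical counts mirroring the pattern described in the text:
# route key -> (number of flights, fraction of flights delayed)
flights = {
    ("Delta", "EWR"): (800, 0.30),
    ("Delta", "DAY"): (100, 0.10),
    ("ASA",   "EWR"): (100, 0.35),   # ASA = Atlantic Southeast (hypothetical figures)
    ("ASA",   "DAY"): (200, 0.12),
}

def aggregate_pct(airline):
    """Flight-weighted percentage of delayed flights across both destinations."""
    n = sum(f for (a, _), (f, _p) in flights.items() if a == airline)
    delayed = sum(f * p for (a, _), (f, p) in flights.items() if a == airline)
    return delayed / n

# Per route, Delta is better (lower fraction delayed) on BOTH destinations...
assert flights[("Delta", "EWR")][1] < flights[("ASA", "EWR")][1]
assert flights[("Delta", "DAY")][1] < flights[("ASA", "DAY")][1]
# ...yet in aggregate Delta looks worse, because its flights concentrate
# at the congested, high-delay airport -- Simpson's Paradox.
print(f"Delta aggregate: {aggregate_pct('Delta'):.1%}")
print(f"ASA aggregate:   {aggregate_pct('ASA'):.1%}")
```

The aggregate figure is simply a weighted average of the per-route rates, so whichever airline puts more weight on the high-delay route is "penalized" in the aggregate comparison.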
• This case includes a data set which succinctly
illustrates these and several other data
analysis paradoxes.
Discussion Questions and Assignments

• The most important question posed by this case is whether the data support the claim that the on‐time performance of RegionEx is worse than that of MDA.
• A framework for answering this is provided in the questions that follow.
• These questions demonstrate that there is
evidence favoring both airlines, and no single
analysis yields the right answer.
Discussion Questions and Assignments

• One must weigh the conflicting evidence, evaluate the business objectives motivating the airlines to keep flight delays low, and come to a conclusion using all of the available information.
• No sophisticated methodology is required, but
this thought‐process, based on common sense,
is an important and often overlooked part of
data analysis.
Correlation of number of passengers and day
of the week
• Compute the mean, median, 90th percentile,
and standard deviation of the number of
arrival delay minutes of RegionEx’s flights. Do
the same for MDA’s flights. How do the two
airlines compare?
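The four statistics asked for can be computed with Python's standard library alone. The delay values below are placeholders, not the case data set (which has 240 RegionEx flights and 120 MDA flights); they are chosen to show the mean-above-median skew discussed later.

```python
import statistics

# Hypothetical arrival delays in minutes (negative = early arrival).
regionex_delays = [-5, 0, 2, 3, 5, 7, 9, 12, 45, 82]

def describe(delays):
    """Return the four summary statistics asked for in the question."""
    s = sorted(delays)
    # 90th percentile via the inclusive method (same as Excel's PERCENTILE.INC).
    q = statistics.quantiles(s, n=10, method="inclusive")
    return {
        "mean": statistics.mean(s),
        "median": statistics.median(s),
        "p90": q[-1],                  # last of the 9 inclusive cut points
        "stdev": statistics.stdev(s),  # sample standard deviation
    }

stats = describe(regionex_delays)
print(stats)  # mean (16) well above median (6): a right-skewed distribution
```

Note how a couple of very late flights pull the mean far above the median, exactly the pattern the case attributes to RegionEx.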
Rankings alone do not give any indication of the magnitude of difference

• In the problem statement, we are told only that RegionEx “ranked worse” than MDA in flight delays.

• We see here that while RegionEx does indeed have a higher average
flight delay than MDA, it also has a large standard deviation.
• We should perform a two‐sample t‐test with unequal variances and
significance level α= 0.05 to test the hypothesis that the means are
different.
• The null hypothesis is that the means μ1 and μ2 of delayed flights at
RegionEx and MDA, respectively, are equal.
• The alternative hypothesis is that they are unequal.
• Our test statistic is the standard unequal‐variance (Welch) statistic, t = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂), with degrees of freedom from the Welch–Satterthwaite approximation.
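A minimal sketch of this test, computed from summary statistics. The means, standard deviations, and sample sizes below are placeholders standing in for the case figures (the case has 240 RegionEx and 120 MDA flights):

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with unequal variances (Welch's test)."""
    a, b = sd1**2 / n1, sd2**2 / n2
    t = (mean1 - mean2) / math.sqrt(a + b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
    return t, df

# Placeholder summary statistics: RegionEx with a higher mean delay but a
# much larger standard deviation, as the case describes.
t, df = welch_t(mean1=15.7, sd1=30.0, n1=240, mean2=14.0, sd2=9.0, n2=120)
print(f"t = {t:.2f}, df = {df:.0f}")
# With |t| below the ~1.96 critical value, we would fail to reject
# equal means at alpha = 0.05 despite the difference in averages.
```

This is the point of the slide: a large standard deviation can make even a visibly higher average delay statistically insignificant.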
Rankings alone do not give any indication of the
magnitude of difference.
Distributions matter

• That RegionEx has a higher average delay than MDA but a lower median
delay is indicative of RegionEx having a skewed distribution of flight delays.
• Indeed, this skewness is also indicated by the difference in the two airlines’
90th percentile delay values and RegionEx’s extremely large standard
deviation.
• This suggests that while half of RegionEx flights arrive within 9 minutes of
schedule (compared to MDA’s 13.0 minutes), some RegionEx flights arrived
very late, pulling the average delay upward.

• It isn’t immediately clear which is the better statistic to use: passengers might
be more concerned about the likelihood of delays, which suggests the use
of percentiles, whereas costs incurred by the airlines in terms of fuel and
crew are a function of the average delay.
How do we compare apples‐to‐apples?
• MDA cancelled three of its flights, which excludes them
from the computation of the above descriptive statistics.
• Some airlines cancel extremely delayed flights and prefer to
rebook those passengers on a different flight.
• If MDA uses a policy of cancelling extremely delayed flights
while RegionEx does not, then this could explain why
RegionEx had a few extremely large delay values that pulled
its mean delay upward.
• It is unclear which policy is better from a passenger’s
standpoint, but this is at least one plausible explanation for
the difference between the two airlines’ delay statistics.
Inspect the distribution of RegionEx’s arrival delays by constructing a histogram of the number of
arrival delay minutes of RegionEx’s flights. Do the same for MDA’s flights. How do these two
distributions compare? Interpret the meaning of the descriptive statistics from Question 1 in relation
to these histograms. What, if any, additional information do the histograms provide?
Histograms show us the distribution of the flight delays

• We see that RegionEx had a greater percentage of flights experiencing very small delays than MDA – 50% of RegionEx flights were less than 10 minutes late, compared to only 30% of MDA flights.
• However, RegionEx had a few flights with very high delays. Moreover, the histogram reveals a spike in MDA flights experiencing 10‐14 minutes of delay.
• This could be suspicious because airlines self‐report their delay data, and under the FAA definition a flight counts as on‐time only if it arrives no more than 14 minutes late.
• Although we have no evidence that MDA rounded down flight delays
near the cutoff, spikes in histograms can suggest erroneous or
doctored data.
• Alternatively, it could suggest that as the 15‐minute cutoff
approached, the crew worked faster to beat the cutoff.
Identifying descriptive statistics on the
histogram
• We can easily identify the median and 90th percentile on the
histogram by finding the point where the cumulative
frequency crosses 50% and 90%, respectively.
• For RegionEx, the median corresponds to the 5‐9 minute bin,
while for MDA it is somewhere between the 5‐9 minute and
10‐14 minute bins.
• The 90th percentile is in the 20‐24 minute bin for RegionEx
and between the 10‐14 minute and 15‐19 minute bins for
MDA.
• We observe the difference in standard deviations between
the two airlines simply by noting that the RegionEx histogram
exhibits a wider spread of values than the MDA histogram.
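Reading percentiles off a histogram amounts to scanning the cumulative frequency for the first bin where it crosses the target fraction. A sketch with hypothetical bin counts, chosen so the answers land in the bins named above for RegionEx:

```python
# Hypothetical counts for 5-minute delay bins (bins lists each bin's lower edge).
bins = [0, 5, 10, 15, 20, 25]
counts = [40, 80, 60, 30, 20, 10]   # placeholder flight counts per bin

def percentile_bin(pct):
    """Lower edge of the first bin where cumulative frequency reaches pct."""
    total = sum(counts)
    cumulative = 0
    for edge, c in zip(bins, counts):
        cumulative += c
        if cumulative / total >= pct:
            return edge
    return bins[-1]

print("median falls in the bin starting at", percentile_bin(0.50), "minutes")
print("90th percentile falls in the bin starting at", percentile_bin(0.90), "minutes")
```

With these counts the median lands in the 5–9 minute bin and the 90th percentile in the 20–24 minute bin, matching the RegionEx description above.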
• Using the FAA definition of a “late” flight, what
percentage of RegionEx’s September flights were “late”?
• What percentage of MDA’s September flights were “late”?
• What percentage were “on‐time” for each airline,
according to the FAA definition?
Rankings alone do not give any indication of the magnitude of difference

• We were told only that RegionEx “ranked worse” than MDA in flight delays; we see
here that the actual difference in the percentage of delayed flights is small.
• To test whether this difference is significant, we perform a hypothesis test for two
proportions.
• The null hypothesis is that the proportions p₁ and p₂ of delayed flights of RegionEx and MDA, respectively, are equal. The alternative hypothesis is that they are unequal. The test statistic is the standard pooled two‐proportion z statistic, z = (p̂₁ − p̂₂) / √(p̂(1 − p̂)(1/n₁ + 1/n₂)).
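A sketch of the pooled two-proportion test. The delayed-flight counts below are placeholders standing in for the case data, sized so the two percentages differ only slightly:

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Placeholder counts: RegionEx with 63 of 240 flights delayed,
# MDA with 28 of 117 (its 3 cancelled flights excluded from the denominator).
z = two_proportion_z(63, 240, 28, 117)
print(f"z = {z:.2f}")
# |z| well below 1.96: the small difference in delayed-flight
# percentages is not significant at alpha = 0.05.
```

As with the means, a ranking built on a tiny, statistically insignificant difference in proportions can be misleading.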
Definitions matter

• RegionEx actually has a higher (better) on‐time performance of 73.8% versus 71.7%, even though its percentage of delayed flights is worse than MDA’s.
• This is because the definition of delayed flights does not include
canceled or diverted flights that never arrive.
• MDA had three canceled flights in this data set, which lowered its on‐
time performance.
• So although a greater percentage of RegionEx’s flights arrived late, it
is also true that a greater percentage arrived on‐time because all of
RegionEx’s flights ultimately arrived.
• Note, however, that a two‐sample proportion test again fails to show
a significant difference between these on‐time percentages.
Compare the performance of the two airlines on each flight
leg by calculating the descriptive statistics from Questions 1
and 3 for each of the four routes.
Do any of the comparisons change? If so, why?
Simpson’s Paradox

• We see that although RegionEx has a higher percentage of delayed flights in the
aggregate, when we look at each route individually, RegionEx does no worse than
MDA on any route.
• Moreover, on routes between DFW and MSY, it experiences a lower fraction of
delayed flights than MDA.
• Why is this reasonable? MDA flies the same number of flights on each of the four
legs, so its total percentage delay is just a straight average of the delays on the
four legs.
• RegionEx, however, flies three times as many flights on the DFW routes as on the
PNS routes, so these receive three times the weight in the aggregated average.
• Moreover, Dallas‐Fort Worth legs experience higher delays (for both airlines) than
the Pensacola routes.
• Thus, the total percentage delay is pulled higher for RegionEx than for MDA. The same metric applied to different levels of aggregation can therefore yield a different ranking.
Adjusted average delay

• A different illustration is as follows: suppose that all 360 flights were operated by the same airline, and compute the average delay in each case.
• If RegionEx were operating all flights, then the average delay would be:
• ((90+28)*0.256 + (90+29)*0.289 + (30+30)*0.200 + (30+30)*0.267)/357 = 0.259
• If MDA were operating all flights, then the average delay would be:
• ((90+28)*0.267 + (90+29)*0.300 + (30+30)*0.200 + (30+30)*0.267)/357 = 0.267
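The two weighted averages above can be reproduced directly; the route sizes and per-route rates are taken from the expressions in the slide (the three cancelled flights leave 357 arrivals in the denominator):

```python
# Flight counts per leg, as in the slide: 90+28 and 90+29 on the two DFW legs,
# 30+30 on each PNS leg; the denominator sums to 357.
counts = [90 + 28, 90 + 29, 30 + 30, 30 + 30]

regionex_rates = [0.256, 0.289, 0.200, 0.267]
mda_rates      = [0.267, 0.300, 0.200, 0.267]

def adjusted_rate(rates):
    """Average if a single airline flew every flight at its own per-route rates."""
    return sum(n * r for n, r in zip(counts, rates)) / sum(counts)

print(f"RegionEx: {adjusted_rate(regionex_rates):.3f}")  # 0.259
print(f"MDA:      {adjusted_rate(mda_rates):.3f}")       # 0.267
```

Holding the route mix fixed removes the aggregation effect: RegionEx's adjusted figure comes out lower than MDA's, reversing the raw aggregate comparison.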
Consider the RegionEx flights only. Prepare a scatter plot of arrival delay minutes versus number of passengers.
Your scatter plot should consist of 240 data points, one for each flight in the data set where the vertical
coordinate is arrival delay minutes of that flight and the horizontal coordinate is the number of passengers.
What is the correlation coefficient between arrival delay minutes and number of passengers for RegionEx’s
flights? Interpret your results.

• There is a slight linear trend between passengers and minutes of delay.
• This suggests some relationship between the number of
passengers on the flight and the average delay.
• Notice also that the variability in arrival delay appears to increase with passenger count, as exhibited by the “funnel” shape of the scatter plot. The correlation coefficient is 0.49, which is moderate.
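The correlation coefficient can be computed in plain Python. The passenger/delay pairs below are fabricated to show a funnel-like positive trend; the case's own coefficient of 0.49 comes from the full 240-flight data set, not from these numbers:

```python
import math

# Hypothetical (passengers, delay-minutes) pairs: delay variability
# grows with passenger count, as the case's scatter plot shows.
passengers = [20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
delays     = [ 2,  4,  1,  8,  3, 15,  6, 25,  10,  35]

def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

r = pearson_r(passengers, delays)
print(f"r = {r:.2f}")  # positive, but says nothing about causation
```

A positive r here summarizes association only; as the next slide stresses, it cannot by itself establish that passenger loads cause delays.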
Correlation does not imply causation

• Although tempting, it is not reasonable to conclude


that large passenger counts cause delays.
• There could be another cause for the delays that also
happens to be correlated with passenger count.
• For example, if we plot average flight delay versus day
of the week (coded with values 1 through 7), we see
that Mondays and Fridays exhibit large flight delays.
• This could be due to a higher volume of flights
scheduled for those days to accommodate business
travelers.
Correlation does not imply causation

• For a similar reason, flights on those days might carry more passengers; this is shown in the scatter plot of passenger count versus day of the week.
• Thus, while we might be tempted to conclude that full flights cause
delays, a different plausible explanation is that more people travel
on Mondays and Fridays, which results in more flights (contributing
to congestion) and higher passenger loads per flight (contributing to
the apparent correlation between congestion and passenger load).
• Day of week could be confounded with number of passengers, and
their respective influences, if any, on flight delays cannot be
determined from this data.
• Indeed, if we control for day of the week we see a somewhat
weaker correlation (see the table below)
• Compare the scheduled flight durations for
the two airlines on each of their four routes.
• Compare the actual flight durations. What
do you notice?
• If the two airlines had the same scheduled
duration, what impact would this have on
their delay records?
Make sure the question answered matches
the question asked!

• Although RegionEx shows a greater average flight delay, its flights on the
DFW routes are actually shorter than those of MDA.
• MDA has built 10 extra minutes of padding into the scheduled flight time of
DFW routes and 5 extra minutes on PNS routes as compared to RegionEx.
• If we adjust RegionEx’s delays for this padding, the average delay drops to
6.9 minutes per flight, and their on‐time performance increases to 92%!
• A flight schedule can be padded to make delay statistics appear more
favorable to an airline even if passengers actually have to spend longer at
the airport.
• However, because passengers make plans based on the scheduled arrival
time of the flight, it is useful for scheduled flight times to match reality.
RegionEx might benefit from increasing its scheduled flight times.
• What other factors should Marion Volero take
into consideration regarding the data
analysis?
• Sampling: Was the sample large enough or representative of all flights on those legs?
• Is September a representative month, or did
something peculiar happen during September
that might skew the data in one direction or
another for one or both airlines?
Flight Delays versus Passenger Delays

• Because flight delay statistics are relatively easy for the airlines to collect, these are the focus of the public debate on delays.
• However, arguably more important is the issue of
passenger delays: what is the total delay in passenger‐
minutes?
• For instance, if a plane arrives 30 minutes late and
results in a passenger missing a connecting flight, then
the delay experienced by the passenger is far greater
than the 30 minutes reported by the airline.
Flight Delays versus Passenger Delays

• Moreover, a 30 minute delay experienced by a full flight of 100 passengers has a greater societal cost (and cost to the airline’s reputation) than a 30 minute delay affecting a flight of only 20 passengers.
• Delay costs: Also omitted from this analysis are the costs to the airlines of delays due to wasted fuel, crew expenses, rescheduling passengers who missed connecting flights, overnight stays in the event of extreme delays or cancellation, etc.
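The flight-delay versus passenger-delay distinction can be made concrete with a few hypothetical flights. This sketch ignores missed connections, so it still understates true passenger delay:

```python
# Hypothetical flights: (arrival delay in minutes, passengers on board).
flights = [(30, 100), (30, 20), (5, 80), (0, 60)]

# Flight delay ignores load; passenger delay weights each delayed minute
# by the number of travelers who experienced it.
total_flight_minutes = sum(delay for delay, _ in flights)
total_passenger_minutes = sum(delay * pax for delay, pax in flights)

print("flight-delay minutes:   ", total_flight_minutes)      # 65
print("passenger-delay minutes:", total_passenger_minutes)   # 4000
```

The two 30-minute delays look identical in the flight-delay metric, but the full flight contributes five times the passenger-minutes of the 20-seat flight.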
