The Art of Data Science: A Practitioner’s Guide
Douglas A. Gray
Designed cover image: Getty Images_1367899893
MATLAB® and Simulink® are trademarks of The MathWorks, Inc. and are used with permission. The
MathWorks does not warrant the accuracy of the text or exercises in this book. This book’s use or
discussion of MATLAB® or Simulink® software or related products does not constitute endorsement
or sponsorship by The MathWorks of a particular pedagogical approach or particular use of the
MATLAB® and Simulink® software.
First edition published 2025
by CRC Press
2385 NW Executive Center Drive, Suite 320, Boca Raton FL 33431
and by CRC Press
4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN
CRC Press is an imprint of Taylor & Francis Group, LLC
© 2025 Douglas A. Gray
Reasonable efforts have been made to publish reliable data and information, but the author and
publisher cannot assume responsibility for the validity of all materials or the consequences of
their use. The authors and publishers have attempted to trace the copyright holders of all material
reproduced in this publication and apologize to copyright holders if permission to publish in this
form has not been obtained. If any copyright material has not been acknowledged please write and
let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known
or hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access www.copyright.com
or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923,
978‑750‑8400. For works that are not available on CCC please contact [email protected]
Trademark notice: Product or corporate names may be trademarks or registered trademarks and are
used only for identification and explanation without intent to infringe.
Library of Congress Cataloging-in-Publication Data
Names: Gray, Doug (Douglas A.), author.
Title: The art of data science : a practitioner’s guide / Douglas A. Gray, MSOR, MBA.
Description: First edition. | Boca Raton, FL : CRC Press, 2025. |
Includes bibliographical references and index. |
Identifiers: LCCN 2024041228 | ISBN 9781032818177 (hardback) |
ISBN 9781032816968 (paperback) | ISBN 9781003588344 (ebook)
Subjects: LCSH: Business analysts. | Management–Statistical methods. | Quantitative research.
Classification: LCC HD69.B87 G73 2025 | DDC 658.4/012–dc23/eng/20241107
LC record available at https://lccn.loc.gov/2024041228
ISBN: 9781032818177 (hbk)
ISBN: 9781032816968 (pbk)
ISBN: 9781003588344 (ebk)
DOI: 10.1201/9781003588344
Typeset in Palatino
by codeMantra
To my parents, my wife, and my sons, without whose love
with whom I have been on the journey of delivering business value and
Introduction
Motivation
Creating a Course and a Legacy
Organization
11. Top 10 Analytics Leadership Skills (with Tom Davenport)
Introduction
Recruitment, Retention, People Development
Generating Demand (Securing Projects by Domain Area)
Relationship Building
Understand the Business Domain (and Problem in Question)
Change Management
Project Management
Communication Skills
Planning, Budgeting, Administration, and P&L Management
Practitioner Experience and Expertise
Information Technology (IT) Experience and Expertise
Conclusion
References and Bibliography
Index
Foreword
With analytics, data science, and artificial intelligence (ADSAI, as the author
puts it) fast becoming key components of corporate strategy and most tech‑
nology projects, this book is a must read for anyone who is engaged either
directly or indirectly with work or research in these disciplines or for anyone
who is just curious about what all the hype is about around ADSAI. Through
the lens of his own career and through the different papers and articles he
has published through the years, Doug Gray does a masterful job of creating a
mosaic of how applied mathematics, statistics, operations research, and man‑
agement science have evolved into the disciplines of analytics, data science,
and artificial intelligence that we know today. This mosaic depicts a 40‑year
journey beginning when Doug changed his undergraduate major from computer
science to mathematical sciences to today where we find Doug as a data
science director at Walmart Global Tech, as the co‑author of the book Why
Data Science Projects Fail: The Harsh Realities of Implementing Analytics without
the Hype, and teaching practitioners and leaders at the Southern Methodist
University Cox School of Business how to apply analytical science within a
business environment. For the duration of his career, Doug has worked at
the intersection of mathematics, statistics, computer science, large amounts
of data, and real‑world problems for both the private sector and the pub‑
lic sector. His journey has had many twists and turns along the way, but
the best practices and critical lessons learned that Doug has gleaned from
his experiences are invaluable for anyone even tangentially involved with
ADSAI today.
Doug purposely makes the audience for this book very broad. For the high
school or college student contemplating pursuing a degree in one of the
fields of ADSAI, Doug shares in detail his own decision process for pursuing
an undergraduate degree in mathematical sciences at Loyola University
Maryland and a graduate degree at Georgia Tech. The self‑assessment he
put himself through to decide that the field of operations research was best
for him and the way he proactively reached out to his professors for advice
and counsel are best practices for any student pursuing a degree or career
in ADSAI. In fact, I would argue that these are best practices for any student
choosing a major or career path no matter what the area.
For the ADSAI practitioner, there are key lessons learned and best prac‑
tices sprinkled throughout this book. Doug devotes a whole chapter in the
book to the non‑technical skills (soft skills) that make an ADSAI practitioner
successful. Anyone contemplating a career in this area should take a hard
look in the mirror and ask themselves if they are interested and motivated
to work hard and to develop their skills in these requisite non‑technical
areas. As Doug walks the reader through his own career progression, he
unit and the technology teams. As a business unit sponsor, you should take
note of the top 10 reasons projects fail in Chapter 12 and do everything you
can within your organization to minimize these risks.
Doug also spends a significant amount of time in the book talking about
digital transformation. If you “google” digital transformation, you get a
myriad of results from different consulting firms and technology companies.
Clearly, digital transformation is a hot area these days. As I read Doug’s
career journey in this book, what struck me is that he has been part of a
digital transformation for different organizations for the past 40 years. While
different components of a digital transformation can be broken into projects
and managed like projects, the transformation itself is a journey much like
Doug’s career with many twists and turns along the way. As a leader of such
a journey, the best piece of advice for you can be found at the beginning of
Chapter 12 where Doug highlights Stephen Covey’s “The 7 Habits of Highly
Effective People” and focuses on “Begin with the end in mind.” There is no
better advice for a leader embarking on a digital transformation than these
simple six words!
For the researcher in one of the fields of ADSAI, Doug’s journey highlights
how massive improvements in compute power and data storage have pro‑
vided us the ability to solve problems today that we only dreamed about
30–40 years ago. As a prime example, Doug highlights the airline real‑time
irregular operations problem that he and a small team addressed at two
different points in his career. In the early 1990s, Doug and a team tried
and quickly failed to solve this problem for American Airlines. Twenty to
twenty‑five years later, Doug and his team at Southwest Airlines solved
this problem driving significant, measurable business value that Southwest
Airlines still realizes today. Through the many examples Doug references,
he makes the case that in most instances, a robust technical solution is a
necessary condition for a successful ADSAI project. He also makes the case,
however, that while these technical solutions can be quite sophisticated,
they are not a sufficient condition for a successful project. Consequently, it
is incumbent upon the researcher that pushes the state‑of‑the‑art across different
technical boundaries to be cognizant of what ultimately makes a successful
ADSAI project and understand how the research fits or does not fit
within that framework.
Finally, for a grizzled veteran like myself, who has spent the last 45 years of
my life in many different roles applying operations research techniques and
developing decision support systems, The Art of Data Science: A Practitioner’s
Guide provides me an opportunity to look back on my own career, to take
pride in what we accomplished, to think about the many mistakes we made
and what we learned from those mistakes, and most importantly to reflect
on where the disciplines of analytics, data science, and artificial intelligence
go from here. My favorite chapter in the book is Chapter 14 entitled “O. R. in
2048.” The crux of the chapter is an article published in 1998 in OR/MS Today
where Peter Horner, the editor of the magazine, asked Doug and a few other
Stephen M. Clampett
To my publisher at Taylor & Francis, Ms. Randi Slack, whose support, encouragement,
guidance, and direction on both of my books have been unwavering
from the very first conversation and so very instrumental in completing
these writing projects. Randi took a chance on a first‑time author and looked
past my inexperience to focus on the content of the message of someone who
believed they had something important to say. Her judgment, insight, pub‑
lishing business acumen, and magical words, “We have to publish this book!”
(twice) have made all of the difference. (And she was instrumental in the
titling of both books!)
To my editor, Ms. Kara Tucker, for inviting me to publish, and then editing,
my INFORMS Analytics magazine 12‑part article series on The Top 10 Reasons
Why Data Science Projects Fail, which then gave rise to the book Why Data
Science Projects Fail (with my co‑author Evan Shellshear) and The Art of Data
Science, both of which she also edited. Thankfully, because I am the furthest
thing possible from a literary grammarian or book production specialist.
To my co‑author on the first book, Evan Shellshear, thank you for helping
to make me a better writer and a more rigorous editor.
To my first software engineering partner, Bobby Johns, on my very first
big, successful project at AA, highlighted in Chapter 5, that materially helped
to launch both of our careers, a debt of gratitude for making my O.R. model
more easily consumable and intelligible to end users with super cool GUIs,
and for making my code run faster and more efficiently. (Bobby was also my
chief engineer on Travelocity, which he helped to make into a huge success,
and multiple other projects with me over the years, as well as a great friend
who always made work more fun!)
To Dr. Phil Beck, the manager of my Optimization Solutions group at
Southwest Airlines where together we garnered multiple industry awards
and nine figures of validated business value annually delivering game‑changing
O.R., analytics, and data science solutions across the airline operations
spectrum. Phil is a true colleague and friend whose partnership I value
greatly.
To Nader Kabbani, my co‑author on Right Tool, Right Place, Right Time in
Chapter 8, thank you for your partnership.
To Dr. Peter C. Bell, Professor Emeritus, Management Science, at Western
University’s Ivey School of Business, for permission to quote and reprint his
OR/MS Today article “Defining Analytics Through the Eyes of Students”
in Chapter 13. Dr. Bell taught a course called Competing with Analytics to
EMBA students (which is very similar to the course I teach at SMU!). He is
a past recipient of the INFORMS Prize for the Teaching of OR/MS Practice.
Introduction
Motivation
This book is an important part of what I call my professional “Legacy Project,”
which is founded on how I want to spend the final phase of my formal work
life while transitioning into retirement, and what I want to leave behind for
others to learn from my decades‑long career.
My professional Legacy Project consists of the following components:
I was inspired to write books about the practical application of analytics, AI,
and data science by Tom Davenport, Ph.D., whose prodigious and prolific
authoring of books on the subject, including the seminal works Competing on
Analytics and All‑In on AI, helped to immeasurably shape my own thinking
DOI: 10.1201/9781003588344‑1
its origin, and evolution and its myriad problems addressed, modeling methods, and
applications, see: https://en.wikipedia.org/wiki/Operations_research.
I am proud to say that my career is “bookended” by working in O.R. and
data science, respectively, for two INFORMS Prize‑ and INFORMS Franz
Edelman Award‑winning organizations: American Airlines (Decision
Technologies) and Walmart (Global Tech). (To be clear, I was not a mem‑
ber of either Edelman Award‑winning team, but the awards speak to the
caliber of the cohorts of which I was a member.) At those companies, and
others in between, including Sabre, Blue Cross and Blue Shield of Kansas
City (on contract as Acting Director of Analytics), and Southwest Airlines,
I personally did a lot of good work as a practitioner and led teams that did
a lot of good work with analytics that led to well over $3 billion (at the time
of publication) in cumulative, Finance department‑validated business value
and economic impact. This book will address in detail portions of that value
that can be made public (as most of it cannot). That value and impact is an
important component of my legacy, because it is a testament to the cadre
of people that I recruited, hired, trained, taught, and learned from myself
while delivering projects.
Organization
The Art of Data Science: A Practitioner’s Guide is intended for current and
future students, practitioners, and leaders studying and working in the ana‑
lytical sciences. The goal is to provide insight into best practices for applying
analytical sciences as well as career progression and development, and how
to maximize the business value and economic impact that data science, ana‑
lytics, O.R., statistics, and AI can offer enterprises. These best practices stem
from my own experiences executing projects and leading teams, and those
of my colleagues and students, over a 30‑year career. The “hook” in the title
is the word “art.” The juxtaposition of “art” versus “science” is intended to
focus the reader on so many of the nontechnical “soft skills” that make all of
the difference in successful applications of the analytical sciences. This is not
a technical textbook, rather a book addressing some of my methodologies,
approaches, and rubrics that are based on experience and “know‑how”
versus sophisticated mathematical theory.
The book is organized as follows:
The arc of Chapters 5–9 represents some of the most impactful modeling and
solution approaches and projects on which I worked throughout my career,
and how those experiences led to foundational principles and lessons learned
that I applied with success time and time again:
The arc of Chapters 10–12 offers a collection of skills and key learnings that
I have developed throughout my career and found invaluable as a practitioner,
leader, and educator:
DOI: 10.1201/9781003588344‑2
Lastly, working for Joe on DoD projects for two years taught me an
invaluable lesson: once I graduated, I no longer wanted to work on federal
government contracts, despite the solid job security, but rather in corporate
America, where capitalism was the order of the day and there was not quite
as much of the “red tape” and bureaucracy of government
contracting.
As I progressed through my junior year coursework and focused on oper‑
ations research as my career direction, my thoughts turned to postgradu‑
ate employment opportunities – and then, I got a wake‑up call from Drs.
Hennessey and Auer. They informed me that to get a job and work in opera‑
tions research in corporate America, I would most certainly need a master’s
degree. They said that a bachelor’s degree would get me a job as a “program‑
mer/analyst,” but a master’s degree would be mandatory to get a “seat at the
table” where the business problems would be analyzed and models formu‑
lated. (The “fun” part! Not just the coding!)
Armed with that insightful advice, I set out to research graduate school
programs in operations research. I quite literally wrote to 50+ universities to
obtain literature on their O.R. master’s degree programs. (I later donated all
of the materials to the Loyola Math Department library for future students
to use!) After my exhaustive research, I narrowed my choices to a handful of
the top schools in the country (ranked by my preference):
My decision became easier and more readily apparent when only Georgia
Tech and Purdue offered me “full‑ride” research assistantship scholarships
that paid both tuition and a modest stipend to cover my living expenses.
Based on several criteria, I chose Georgia Tech, which was another pivotal
event for me in my career path trajectory:
As it turned out, I made the right decision for multiple reasons. Georgia Tech’s
Graduate School of Industrial and Systems Engineering has been consistently
ranked No. 1 by U.S. News & World Report’s survey of Best Graduate Schools
in Engineering since 1990. (I wouldn’t have gone wrong going to Purdue, as
they are ranked No. 2.) My first employer out of Tech, American Airlines, had
already started interviewing and hiring M.S. in operations research gradu‑
ates (from both Georgia Tech and Purdue, as it turned out) for their growing
O.R. Department; I was the second new hire of what turned out to be a cohort
of more than 50 Georgia Tech ISyE graduates in O.R., IE, and statistics!
When I got to Georgia Tech ISyE in 1986, I realized how fortunate I was to
be there. The O.R. faculty was literally and figuratively “world‑class” award‑winning
academic researchers. George Nemhauser, Ellis Johnson, John
Jarvis, Don Ratliff, and Dave Goldsman, just to name a few of many, were
all top‑notch academic researchers in their respective fields, but were also
practitioners and entrepreneurs who started companies to apply their expertise
and business acumen. [Jarvis and Ratliff started, grew, and eventually sold
CAPS Logistics (Computer‑Aided Planning and Scheduling) to Baan, an
enterprise resource planning (ERP) software company. Nemhauser started
Sports Scheduling, LLC, which was contracted for several years with Major
League Baseball to generate the MLB season schedules. Jerry Banks and John
Carson started their own simulation analysis company and completed proj‑
ects for Coca‑Cola, among others.]
The Georgia Tech student body represented the very best intellects from
around the world, including the United States, Latin America, China, Eastern
Europe, and India. My roommate Chris Hane earned a Ph.D. and is now a VP
at Optum where he leads healthcare‑related AI research. Ananth Iyer, Ph.D.,
served as a Department Chair and Professor of Operations Management at
Purdue Krannert School of Management and is now Dean of the University
at Buffalo School of Management. The list of accomplished alumni goes on
and on.
The Georgia Tech M.S. in operations research program was simply a perfect
fit for me and my interests. I had a strong undergraduate math degree
focused on linear algebra, statistics, and operations research, which provided
the ideal foundation for my 16 courses in theory and applications:
was perfect to join AA O.R., and I consider myself very lucky to have been
hired there. The group was pioneering airline operations research with innovations
in pricing and yield (revenue) management, flight scheduling, crew
scheduling, spare parts inventory management, and airport air and ground
operations simulation analysis. The group was so successful that AA senior
management decided to form American Airlines Decision Technologies
(AADT), a wholly owned subsidiary of American Airlines, to offer O.R.‑
based consulting services to other airlines and travel industry‑related enti‑
ties. Later, in 1993, AADT merged with Sabre Development Services to form
Sabre Decision Technologies, which later became Sabre Airline Solutions, the
world’s largest (by overall market share) exclusive airline O.R.‑based soft‑
ware products and consulting services company. (It was later purchased by
CAE in March 2022.)
In January 1988, AADT secured a consulting project with the Swedish Civil
Aviation Authority (CAA) to evaluate multiple new airspace configurations
at Arlanda Airport in Stockholm, Sweden, to determine (using discrete‑event
simulation analysis/SIMMOD) which one(s) would produce minimal delays.
I was assigned as the analyst and learned several lessons after completing
the project:
• Show up, diligently get your work done, and underpromise and
overdeliver.
• Don’t complain… ever, about anything.
• It is very difficult to communicate complex business subjects with
folks for whom English is a second language (and shame on me, I
didn’t speak Swedish).
• How dark it is (no sun for two weeks) and how wet and cold it is in
January in Sweden.
• Make sure you get as much sleep as possible on the flight from the
United States to Europe, because if you don’t, you will never recover
from jet lag (I had to take a nap every day at lunch!).
Despite the severe jet lag, extreme darkness, dankness, and cold (and a lan‑
guage barrier), my colleague and I successfully completed the project and
were able to help the Swedish CAA select the minimal delay airspace configuration.
In hindsight, we probably could have picked the right airspace
configuration just by visual inspection of the designs (one had a severe
built‑in bottleneck), but the simulation quantitatively ensured the right
decision. [I was even able to use my Georgia Tech graduate research on
selecting the population (airspace design) with the minimum mean (delay)
and published and presented a paper with Dave Goldsman at an INFORMS
conference!]
After Stockholm, I was off to Spain in the spring of 1988. AADT secured two
consulting projects with the Spanish Aviation Safety and Security Agency
(CAA): (1) Analyze operations at Madrid‑Barajas Airport, with emphasis on
factoring in the effect of military air traffic at adjacent Spanish Air Force Base
Torrejón (a lot of F‑16 traffic) and (2) analyze the impact on ground (gating
and taxiway) operations of a new, longer runway being built at Palma de
Mallorca Airport off the coast of Spain. I went to Spain to help out on the
Madrid project, but was assigned primarily to Palma. Although it was fun
to visit two great Spanish cities and enjoy some phenomenal cuisine, neither
of these projects presented any real complex technical challenges, but I did
learn a lot about project management, people management, and client han‑
dling from my project manager, Jim Crites.
Jim was a retired U.S. Marine Corps Major (and later a USMC Reserves
Lt. Col.) who went through ROTC at the University of Illinois Urbana‑
Champaign while earning a B.S. in business administration and earned an
present themselves, especially early in your career when you are trying to make your
mark, say “yes” far more often than you say “no” because the “big games” don’t come
around all that often.)
Mike asked whether I was sure and said I could do it, but not to “f@&k
it up.” Mike always knew how to instill confidence in me, much like Gen.
George S. Patton Jr., who used profanity for effect, and it worked!
As the old proverb says, “Be careful what you wish for, you may get it.”
• Say “yes” more often than you say “no” when asked to participate.
(I could have turned down the opportunity to manage the Sydney
project and let whatever trepidation scare me into oblivion, but I said
“yes,” and it made all the difference career‑wise.)
• Make yourself indispensable – you want to be the “go‑to” person when
leadership needs something important done well.
• Endear yourself to people – be someone other people want to be around
and have around, call upon, and follow (primarily your leadership,
clients, and co‑workers who will need your help to get things done).
We did such a great job with the SIMMOD project in Sydney that Qantas’
O.R. leadership (and the CASA) asked us to stay in Australia for an additional
two weeks to do another similar type of project at the airport in Melbourne.
That is the best customer feedback you can ever receive – when the customer asks
you to perform more work before you have even completed the first project! I don’t
even recall the scope or ask in that project, but we completed it successfully as
well and enjoyed the heck out of the food, sights, and culture in Melbourne!
Movin’ On Up
Typically, practitioners begin their career going through a progression. First,
they run models built by others (that was my time on SIMMOD), modeling
different scenarios through data. Then, they tweak, modify, extend, enhance,
and maintain models built by others. Then, they design, build, and deploy
their own models from scratch. The progression represents an increase in
responsibility, accountability, and risk management. Then, they start man‑
aging others on individual projects and start managing one or more larger
programs, each made up of one or more projects. Along the way, the size of
the teams, organizations, budgets, and value targets they are managing con‑
tinues to grow. My career followed this exact progression.
After my time on the airport analysis team (SIMMOD), during which I
was promoted to senior consultant from consultant, I was assigned to work
In the early fall of 1990, Mike Parks assigned me to work on a project with
AA maintenance and engineering long‑range planning (LRP) at AA’s primary
heavy maintenance base in Tulsa, Oklahoma. As AA’s fleet had grown from
200 to 600 aircraft, the process of planning for and scheduling heavy main‑
tenance “check” (or “overhaul”) activity and hangar capacity was becoming
more complex than the LRP team could handle with large sheets of paper
tacked up on the wall marked with colored pencils (the historical incum‑
bent solution) or even Microsoft Excel (the newfangled spreadsheet macro for
scheduling the 250 narrow‑body aircraft fleet of MD‑80s was taking 12 hours
to run and oftentimes crashing before reaching a solution!). Senior leader‑
ship was growing more and more concerned about ensuring completion
of all heavy maintenance checks on time (avoiding fines and grounding of
aircraft), minimizing costs within FAA rules, and not running out of much
needed hangar capacity. No one in LRP had solid solutions to any of these
problems.
Heavy maintenance checks are required to be completed on each aircraft
periodically, the time window of which is governed by a designated flight
hour limit (e.g., every 15,000 flight hours) and cost, at the time, $1 million per
aircraft check (or ~$2.4 million in 2023). The objective is to bring an aircraft
in for maintenance as close to the flight hour limit as possible without going over
to maximize the “yield” of the check. If the flight hour limit is exceeded, then
FAA fines and penalties are incurred and the aircraft is grounded. If the
aircraft is brought in too early, well before reaching the limit, then more checks
than are legally required are completed over the life of the aircraft, thereby
unnecessarily increasing maintenance costs.
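The yield trade‑off can be illustrated with a little arithmetic. The following is a minimal sketch only, not the model used at AA: the function names, the 90,000‑hour service life, and the 12,000‑hour early check interval are hypothetical values chosen for illustration, while the 15,000‑hour limit and the $1 million cost per check are the example figures quoted above.

```python
def check_yield(hours_at_check: int, limit_hours: int) -> float:
    """Fraction of the allowed flight-hour window used before the check."""
    if hours_at_check > limit_hours:
        # Going over the limit means fines and a grounded aircraft.
        raise ValueError("flight hour limit exceeded")
    return hours_at_check / limit_hours


def checks_over_life(life_hours: int, hours_between_checks: int) -> int:
    """Number of heavy checks completed over the aircraft's service life."""
    return life_hours // hours_between_checks


LIMIT_HOURS = 15_000        # example limit quoted in the text
COST_PER_CHECK = 1_000_000  # cost per check at the time, from the text
LIFE_HOURS = 90_000         # hypothetical service life (assumption)
EARLY_HOURS = 12_000        # hypothetical: every check flown in 3,000 hours early

# Checks performed beyond the minimum that the limit would require.
extra_checks = (checks_over_life(LIFE_HOURS, EARLY_HOURS)
                - checks_over_life(LIFE_HOURS, LIMIT_HOURS))
extra_cost = extra_checks * COST_PER_CHECK

print(check_yield(EARLY_HOURS, LIMIT_HOURS), extra_checks, extra_cost)
```

Under these assumed numbers, an 80% yield means seven checks instead of six over the life of the aircraft, that is, one extra check at $1 million.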
In roughly six months of research, analysis, and iterative design, develop‑
ment, and testing, I was able to successfully model and solve the problem
using a job scheduling on parallel machines with fixed job deadlines modeling
approach solved with a “greedy heuristic” algorithm. My colleague Bobby
Johns (one of the best software engineers I have ever worked with) devel‑
oped a colorized Gantt chart user interface that digitally emulated the large
sheets of paper and colored pencil approach on an Apple Macintosh PC
(IIx running a Motorola 68000 chip). The industrial engineer on the project
(who came up with the original concept) estimated that the model would
increase check yields to nearly 100% and, on the 227 widebody aircraft fleet
alone, would eliminate 1–2 heavy checks per aircraft over the life of the fleet or a cost
avoidance of $227 million to $454 million (or ~$545 million to $1.09 billion
in 2023)!
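For readers who want a feel for what a greedy heuristic for this kind of problem can look like, here is a minimal sketch. It is not the algorithm we built at AA; it is one simple, assumed variant that treats hangar bays as parallel machines, handles checks from the latest deadline to the earliest, and places each one as late as a bay allows so that it starts as close to its deadline as possible. The `Check` class, the function name, and the sample data are all hypothetical. The last lines reproduce the cost avoidance arithmetic quoted above.

```python
from dataclasses import dataclass


@dataclass
class Check:
    tail: str      # aircraft identifier (hypothetical)
    deadline: int  # latest allowed start day for the check
    duration: int  # days the aircraft occupies a hangar bay


def greedy_schedule(checks, n_bays):
    """Place each check as late as possible, latest deadline first."""
    # For each bay, the start day of the earliest check already placed there.
    next_start = [float("inf")] * n_bays
    plan = {}
    for c in sorted(checks, key=lambda c: c.deadline, reverse=True):
        # Latest start per bay that meets the deadline and avoids overlap.
        starts = [min(c.deadline, ns - c.duration) for ns in next_start]
        bay = max(range(n_bays), key=lambda b: starts[b])
        start = starts[bay]
        if start < 0:
            raise ValueError(f"no feasible slot for {c.tail}")
        plan[c.tail] = (bay, start)
        next_start[bay] = start
    return plan


# Illustrative data only: three checks competing for two bays.
checks = [Check("A", 100, 30), Check("B", 110, 30), Check("C", 115, 30)]
plan = greedy_schedule(checks, n_bays=2)
print(plan)

# Cost avoidance from the text: 227 aircraft, 1-2 checks each, $1 million per check.
avoided = [n * 227 * 1_000_000 for n in (1, 2)]
print(avoided)
```

In this toy case, checks B and C start exactly on their deadlines, while A is pulled in 15 days early because both bays are occupied at its deadline. That kind of capacity‑driven loss of yield is what a planner would then examine and try to improve on.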
This project was a tremendous success and literally changed the trajec‑
tory of my career, given the business value and economic impact that our
automated, intelligent solution created (i.e., hundreds of millions of dollars
in maintenance costs avoided over the life of the fleet and an automated tool
that analysts could use to run all manner of planning scenarios in a matter of
minutes on their desktop Apple Macintosh PC that used to take hours, days,
or weeks, depending on the scope).
The project and all of the lessons learned are detailed in two articles later
in the book:
The learning experience on the project was foundational for me and estab‑
lished a set of principles that guided my career in analytical sciences. Three
quotes from O.R. luminary Dr. R. E. D. “Gene” Woolsey (Colorado School of
Mines professor and industry consultant) summarize my key learnings:
1. “If you try to tell someone how to do something better or differently without
understanding how they do the job today, then you are a fraud.”
2. “Finding the ‘optimal’ solution is often not nearly as important as putting
the solution values into a form that the client is accustomed to seeing.”
3. “A manager will prefer and opt to live with a problem that they cannot solve
rather than implement a solution they cannot understand.”
I spent several weeks in Tulsa sitting next to the LRP planners learning every‑
thing there was to know about heavy maintenance check activity and facil‑
ity capacity planning and scheduling before I started modeling the problem.
The model and system we built were simply a more formalized, optimized,
and automated version of the business process flow that the planners were
already executing, albeit their way was much less efficient and more
time‑consuming. Therefore, it was relatively easy for the planners to make the
(not so big) leap to use the new computer model/system, because they
intuitively understood how it worked and saw it as a “big calculator.” Mission
accomplished!
Bobby and I, along with a couple of additional engineers, spent another
year or so enhancing and extending the model to accommodate different
planning scenarios, such as third‑party aircraft maintenance (for profit), FAA
directive‑related checks, and landing gears (we achieved 99% yield based
on cycles, i.e., one cycle equals one takeoff and one landing, together), and
rewriting the C code into C++ to leverage the benefits of object‑oriented
programming.
As a result of the above success, in 1992, I was promoted to principal (i.e.,
manager), which was the first level of management at AADT. I worked on
a series of smaller projects, including forecasting aircraft engine removals
and engine repair shop arrival flow (using the binomial and Poisson
distributions). (The engine shop needed a business case for expansion to handle
increased demand due to fleet growth, but they needed a more scientific
basis for estimating how many more engines would be arriving at the shop
Career Summary 23
for unscheduled repairs and scheduled maintenance.) I also did some con‑
sulting for other airlines around the world using our maintenance planning
and scheduling tool. My team and I also prototyped a model for schedul‑
ing tasks (e.g., inspections and repairs) and resources (e.g., people, tools, and
equipment) on overhaul checks inside the hangar, but the interest simply was
not there from the business folks at any client company to implement such a
solution.
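The engine shop forecast mentioned above relied on the binomial and Poisson distributions. The sketch below shows one way such a calculation might look; it is not the original model. The fleet size, the monthly removal probability, and the 95% service level are all hypothetical numbers chosen for illustration.

```python
import math


def binomial_pmf(k, n, p):
    """P(exactly k removals) when each of n engines is removed with probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(exactly k arrivals) when arrivals average lam per period."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def capacity_for_service_level(lam, service_level):
    """Smallest capacity c such that P(arrivals <= c) >= service_level."""
    cumulative = 0.0
    k = 0
    while True:
        cumulative += poisson_pmf(k, lam)
        if cumulative >= service_level:
            return k
        k += 1


# Hypothetical inputs (assumptions, not figures from the text).
FLEET_ENGINES = 600
MONTHLY_REMOVAL_PROB = 0.02

# With many engines and a small removal probability, the binomial count of
# removals is well approximated by a Poisson with mean n * p.
expected_arrivals = FLEET_ENGINES * MONTHLY_REMOVAL_PROB
shop_slots = capacity_for_service_level(expected_arrivals, 0.95)

print(f"Expected arrivals per month: {expected_arrivals:.1f}")
print(f"Shop slots needed to cover 95% of months: {shop_slots}")
```

Rerunning the calculation with a larger fleet shows how required shop capacity grows, which is the kind of evidence a business case for expansion needs.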
In 1993, Mike Parks asked me to take responsibility for all of the models
and systems that scheduled pilot training activities (i.e., new hires, recurrent,
and check rides) and facilities (i.e., instructors, classrooms, and simulators)
for the AA Flight Academy. These were optimization‑based scheduling mod‑
els designed to ensure that there were no gaps in crew training requirements
and mission availability. In addition, I inherited the crew manpower plan‑
ning system that utilized a multiyear time horizon integer programming
model to determine the optimal number of pilots and flight attendants to
hire each month, factoring in fleet growth, changes in aircraft type fleet
composition, and both aircraft and crew retirements, to ensure adequate line and
reserve crew resources. We employed this model on behalf of AA and other
U.S. airlines. Although these systems were all mission‑critical, there was not
a lot of new development going on, so this work was largely in “operations
and maintenance” mode.
In 1994, AADT merged with Sabre Development Services (the software arm
of Sabre) to form Sabre Decision Technologies. I was promoted to director
that year in August, and I added responsibility for another set of systems that
operated the Flight Academy. Merging two different cultures together, ours
entrepreneurial and innovative, with another that was quite stodgy, bureau‑
cratic, and resistant to change, was the biggest eye‑opener for me. That said,
I was also able to learn and grow by taking on some large, multi‑million‑
dollar travel distribution system integration projects, e.g., merging Sabre Air
Booking with a large tour company’s booking systems – no analytics, but
a good learning experience for managing large‑scale system development/
integration projects.
In July 1995, all of my crew‑related programs were merged into a large
crew system suite that was managed by another leader. At that point, my
career took a much different direction, away from a focus on analytical sci‑
ences and into internet e‑commerce and software product engineering com‑
panies, including startups.
(Chapter 3 on “Digital Transformation” covers a lot of the work that I did
from 1995 to 2009.)
In summary, in 1995, the internet and World Wide Web were in their
infancy with companies like Netscape (the first commercial web browser)
going public. E‑commerce was the next big thing – monetizing the internet
by marketing and selling products and services online.
I saw firsthand in my previous job the business opportunity presented by
Sabre and travel distribution. Mike Parks asked whether I wanted to lead
the program to put Sabre’s travel reservation capability on the internet and
make it available directly to consumers to book their own travel online.
Frankly, no one else wanted the job because they thought the internet was
a “fad” that would be “gone soon” and that consumer‑direct travel distribu‑
tion was undermining Sabre’s travel agency “bread and butter” legacy busi‑
ness model. People at Sabre actually said things like “Why would a consumer
ever want to book their own travel on a website?” In hindsight, pretty ridiculous,
right? Mike, who hired me, had been my boss and mentor for eight years and
pretty much taught me everything I knew about business to that point; I was
fiercely loyal to him. He needed my help, so I said, “Yes!”
Delivering Value
Travelocity, Sabre’s consumer‑direct travel distribution model (now part
of Expedia), was the world’s first completely automated real‑time, internet
e‑commerce travel booking tool. For two years (1995–1997), I led the team
that designed, engineered, developed, launched, operated, supported, and
maintained the software technology platform underlying Travelocity, as well
as functioned as the CTO in charge of hardware and network communica‑
tion engineering and operations. A core team of four, which eventually grew
to 75, took Travelocity from a “whiteboard drawing” in July 1995 to a success‑
ful launch at the CyberCafe in Soho, New York City, on March 12, 1996. I was
there when our CEO, Terry Jones, made the world’s first airline booking in real
time via the internet. From there, we successfully licensed, customized, and
hosted that software platform to become the internet travel booking engine
for American Airlines, Sabre Web Reservations, Cheap Tickets International,
Rosenbluth Travel, Canadian Airlines, and many other e‑commerce travel
distribution properties.
I realize this is a book about timeless lessons learned during a career in
the analytical sciences, and Travelocity had little (OK, nothing) to do with
analytics (other than some rudimentary AI laboratory experiments to auto‑
mate booking steps from natural language queries – see Chapter 3 “Digital
Transformation”).
What my Travelocity experience did teach me was invaluable later in my
analytics career (i.e., building and deploying large, highly scalable, transactional
enterprise systems):
Analytical sciences is not only about building models but also about put‑
ting those models into production and embedding them within mission‑critical
enterprise business processes and systems that deliver business value and economic
impact. Tom Davenport, author of Competing on Analytics, said, “Models make
the enterprise smarter, but models embedded in processes and systems make the
enterprise more economically efficient.” That, my friends, is the end game, your
raison d’être, and that is why you need to learn about building and deploying
models as microservices embedded in larger enterprise system architectures
and ecosystems if you are going to deliver value at scale with analytical sci‑
ences in corporate America, government, or the military.
After an incredible 10‑year learning experience that launched my career,
I parted ways with Sabre when a major reorganization left me with an uncer‑
tain path forward (I was most likely within six months of a promotion to VP,
but now that was in jeopardy). Mike Parks left the company soon after I did
and moved to the West Coast to work in Silicon Valley. I chose to go east to
work as VP and CTO for a startup e‑commerce solutions company in Tampa,
Florida (a 33% increase in base pay plus bonus and equity participation was
a big motivator). Sometimes, I regret the decision to not follow Mike, but the
past is the past. Although I cashed out of the startup with a six‑figure stock
windfall, I often wonder how things might have gone had I followed Mike
to the West Coast. This story would not be complete without acknowledging
that Mike was the best manager, leader, “boss,” and mentor that I ever had.
He was also one of the best all‑around business people with whom I have
ever been associated. Mike recruited and hired me, trained and educated me
about business, mentored me, and gave me all of the project opportunities
that defined my career at AA/AADT/SDT/Sabre/Travelocity. Most
importantly, he understood me and valued me for who I was, and he made me better,
professionally and personally, for which I will be eternally grateful.
A key lesson learned here: Never ever underestimate the inordinate value and
political power of leaders near the top of the “food chain” with whom you have a close,
trusted, mutually beneficial working relationship. No one “climbs the corporate
ladder” alone – while you are “climbing,” your “sponsor” is “pulling you up
the ladder by your shirt tail.” Your performance is only one part of the equa‑
tion. Who knows you and wants you to be successful is far more important in your
promotion potential and to get the “plum” assignments. In the military, gen‑
erals promote the majors and lieutenant colonels who they believe have the
greatest potential to become a general one day. The same principle exists in
corporate America. If you regularly change companies or organizations, without
continuous sponsorship, it becomes more and more difficult to climb up the ladder
because you are no one’s candidate due to lack of tenure, loyalty, and trust that
comes from years and years of working for the same leader/sponsor. Frankly,
I took my success at AA/Sabre for granted as a result of my own performance,
and I failed to really understand this principle until much, much later in my
career, and by then, it was too late to advance to senior management.
With IPOs turning technology geeks into millionaires with stock options, I
figured I would ride the “internet wave” for a while. I was able to parlay my
Travelocity CTO experience into a series of CTO positions with e‑commerce
startups selling everything from computers to mortgages to Las Vegas hotel
rooms online (none of which went public in an IPO, but several were bought
out, and I did fairly well financially with bonuses and equity stake cash outs).
In 2000, I launched a VC‑backed startup of my own in the system integra‑
tion management console SaaS software product market that ultimately got
bought out by Cisco. I did not do much work in analytical sciences during
this time other than some prototypes (see Personal Mortgage Optimizer in
Chapter 3 “Digital Transformation”).
In 2003–2004, I was recruited to help lead the turnaround of a division of
McAfee (then Network Associates, based in Plano, Texas – north of Dallas) as
VP of engineering and product development (i.e., IT Service/Help Desk prod‑
uct Magic Solutions, acquired by BMC Software), by leading a team of awe‑
some software architects and engineers in re‑architecting and reengineering
the product and helping to drum up sales of new licenses and renewals with
a revitalized product roadmap.
My experience at Network Associates was invaluable and taught me about
the importance of “selling.” My division GM, Jeff Honeycomb (another
terrific boss), was a professional salesman, through and through, who began his
career selling copiers door‑to‑door, floor‑to‑floor in the skyscrapers of Manhattan
and worked his way up to selling computers and the “big money” of selling
enterprise software product licenses. He had an engraved brass sign on his
desk that said, “Nothing happens until somebody sells something.” Truer words
were never spoken. Not surprisingly, Jeff had me spend about 50% of my
time in the field on customer sales and support calls as the VP of
engineering and product development. New license deals and license renewals were
critical. The fact was that the product was falling apart and customers hadn’t
had a new major release (x.0) or even a “dot” release (x.y.z) in years – they
weren’t happy at all. Jeff wanted me to hear the problems firsthand and see the
challenges the customers and sales team faced, so I could fix it. Trust me, you
don’t have to get screamed at too many times in Dutch, German, French, and
British English by bank presidents, military commanders, etc. (all
customers), before you get super motivated to go and fix the @#$%&* software!
When you are evaluating potential analytics projects, you are looking for
a set of attributes. The healthcare insurance company project checked all of
the boxes:
The RFP process ran very smoothly, and we were able to solicit bids from
the top players in the market and then play them off of each other to get the
best deal for the client. We ended up selecting name brands for the ADW and
dashboard platforms, and the client wanted to continue to use a legacy statis‑
tical package as their analytics tool because they already paid for the licenses.
The 11 healthcare risk predictor models to predict the likelihood that
patients would be susceptible to and develop maladies such as heart disease,
diabetes, and various cancers were built using patient data and replaced
expensive licensed third‑party software (this action alone saved HealthCo
$2 million annually in licensing fees).
The three POC projects were carried out in timely response to real, critical
issues that HealthCo was facing:
All three of these POC projects were successful and provided invaluable
insights into predictable patterns of demographic/psychographic/behav‑
ioral patient data that helped practitioners better allocate scarce resources
and deliver better quality of patient care and better patient outcomes, more
economically. All of the credit goes to the four professionals I recruited who
did all of the heavy lifting: two healthcare economists who taught at a local
university; a data science colleague who I previously worked with; and a
longtime O.R. colleague from AA who had established her own consulting
company doing this kind of marketing science work. Customer satisfaction
was through the roof on these three projects, and the end product provided
even greater justification and testament to HealthCo’s investment in the
power of analytics applied to voluminous healthcare patient data.
Client confidentiality limits what I can share, but suffice it to say, this project
was successfully completed with a robust, performant ADW and dashboard,
predictor models, and fresh insights into solutions for challenging business
problems at hand. The solution was so successful that HealthCo offered the
use of their new platform on a Data Analytics‑as‑a‑Service (DAaaS) basis to
other healthcare insurers. The DAaaS program was also so successful that
the DAaaS platform was spun off as a separate company from HealthCo
and was purchased by another healthcare informatics company. HealthCo
became a customer of the DAaaS platform they had built!
By design, this was a short‑term engagement for me of slightly less than
one calendar year. Despite the brevity, it was one of the most successful, and
impactful, programs that I have run in my entire analytics career. HealthCo
benefited from the investment in world‑class products and solutions, and in
turn, their patients and policyholders benefited from better quality of care
and outcomes. A win‑win‑win!
After HealthCo, I continued working on “time for money” CTO consulting
gigs through my LLC for a few more years, while looking for a career
opportunity that I was passionate about and had some financial and professional
upside. I reconnected with a former colleague and fellow classmate from my
SMU MBA program on an email thread ironically intended to help another
fellow SMU alum who was searching for a job. He was working for Southwest
Airlines at the time, and he asked, “Aren’t you an airline O.R. guy?” I replied in
the affirmative. It turned out that Southwest was looking for a senior
manager to run their 11‑person O.R. group (called Optimization Solutions).
Given my background and connections, I got the job and moved with my
family back to Dallas (the third time was a charm). I was pleased to find that
a few super‑talented former AADT colleagues were also on my new team at
Southwest. The majority of the group was focused mainly on crew‑related
optimization solutions, i.e., flight and cabin crew schedule planning
development, as well as all of the models that optimize crew training schedules. (A
bit of déjà vu from my AADT days.) There was also a four‑person team work‑
ing on an R&D project with network operations control (NOC) to develop a
real‑time airline irregular operations optimizer to put flight schedules back
together after major disruptions due to weather, air traffic control, etc., and
minimize adverse effects of delays and cancellations, while ensuring pas‑
sengers got where they were going (more déjà vu that made me shudder after
my own failure with the model for irregular operations (MIO) at AA SOC).
Fortunately, this team had quite a bit more intellectual horsepower, expertise,
and experience (i.e., three Ph.D.s, including an INFORMS Franz Edelman
Award winner who had designed, developed, and deployed SkySolver, a
crew irregular operations optimizer at Continental Airlines).
Over a period of roughly five years, I am proud to say that we doubled the
size of the O.R. group while building on the success of the crew optimiza‑
tion team, achieving success with the NOC irregular operations optimizer
(known at Southwest as The Baker, posthumously named after an NOC
supervisor of dispatch, Mike Baker, who conceived of the idea for the solu‑
tion), and expanding into some new areas, like jet fuel and liquor inventory
optimization. The crew optimization team alone delivered $100 million
annually in crew cost avoidance by optimizing pilot and flight attendant
schedules. The NOC program team delivered The Baker, which helped to
improve airline on‑time performance overall (by 2.11 percentage points,
annually) and significantly (by a factor of 2X) during irregular operations caused
by major winter storms. The Baker became, and remains, an invaluable deci‑
sion support tool for NOC supervisors of dispatch to better manage airline
network disruptions and help ensure a better customer experience. After
years of rigorous testing and validation, The Baker solves for and
re‑optimizes flight schedules in the wake of isolated network disruptions in less
than 5 minutes and major network‑wide disruptions in ~30 minutes, a process
that required several hours when done manually. The Baker won two presti‑
gious industry awards, AGIFORS Operations Best Innovation Award and
FICO’s Decision Management Award, and multiple team members won the
coveted President’s Award, the most prestigious prize awarded to associates
at Southwest Airlines.
[For more about The Baker Project, see Hagel, J., Brown, J.S., Wooll, M., &
De Maar, A. (2018). “Southwest Airlines: Baker workgroup: Reducing disrup‑
tion and delay to accelerate performance.” Deloitte Insights (A Case Study in
the Business Practice Redesign Series From the Deloitte Center for the Edge),
1–13.]
Jet fuel (JP5) represents an airline’s second largest operating expense after
crew. Together with one of the O.R. analysts on my team, I partnered with
the jet fuel supply chain management department to build a new solution to
more accurately forecast fuel demand, annually purchase fuel from suppli‑
ers under contract at minimum cost, and optimize fuel inventory levels at all
100 U.S. airports. (Fortunately, that analyst was brilliant and had a Ph.D. in
supply chain O.R. from Clemson University; he now works for Amazon in
Seattle.) Having an accurate jet fuel demand forecast and then managing jet
fuel inventory levels at each airport are critical for an airline; if you buy more
fuel than you actually need, then you have to pay to store it (holding costs),
and if you don’t buy enough in advance, you will have to pay a premium on
the “spot market” or worse, cancel some flights (shortage costs). That
solution, which was implemented in less than one calendar year by that single O.R.
analyst (as a set of microservices integrated with enterprise jet fuel purchas‑
ing and management systems), delivered a total cost avoidance of $38 million
over a three‑year period and delivered optimized jet fuel decisions in a mat‑
ter of minutes, replacing a more rudimentary spreadsheet‑based solution. The
solution was so compelling it won two awards: Alteryx’s Best Business ROI
Award and the Drexel LeBow Analytics 50.
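The trade-off between holding costs and shortage costs described above is the classic newsvendor problem. The sketch below illustrates that textbook idea only; it is not the production Southwest model. The demand samples and the two unit costs are hypothetical.

```python
import math


def critical_ratio(shortage_cost, holding_cost):
    """Fraction of demand scenarios the purchase quantity should cover."""
    return shortage_cost / (shortage_cost + holding_cost)


def newsvendor_quantity(demand_samples, shortage_cost, holding_cost):
    """Smallest observed demand whose empirical CDF reaches the critical ratio."""
    ratio = critical_ratio(shortage_cost, holding_cost)
    ordered = sorted(demand_samples)
    index = math.ceil(ratio * len(ordered)) - 1
    return ordered[max(index, 0)]


def expected_cost(quantity, demand_samples, shortage_cost, holding_cost):
    """Average holding plus shortage cost of buying `quantity` across the samples."""
    total = 0.0
    for demand in demand_samples:
        excess = max(quantity - demand, 0)  # fuel bought but not needed
        short = max(demand - quantity, 0)   # fuel needed but not bought
        total += holding_cost * excess + shortage_cost * short
    return total / len(demand_samples)


# Hypothetical weekly demand at one airport, in thousands of gallons.
DEMAND = [90, 95, 100, 105, 110, 115, 120, 125, 130, 140]
HOLDING_COST = 1.0   # assumed cost per unit of unused fuel held in storage
SHORTAGE_COST = 3.0  # assumed premium per unit bought late on the spot market

best_quantity = newsvendor_quantity(DEMAND, SHORTAGE_COST, HOLDING_COST)
print("Purchase quantity:", best_quantity)
print("Expected cost:", expected_cost(best_quantity, DEMAND, SHORTAGE_COST, HOLDING_COST))
```

Because running short is assumed to cost three times as much as holding excess, the rule buys enough to cover 75% of the demand scenarios rather than the average.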
Anytime you can reuse a performant, working solution to solve another sim‑
ilar problem, it is a win for the enterprise. Liquor is a commodity that must be
managed similarly to jet fuel (Jack Daniel’s, among other brands, instead of
JP5 jet fuel). Inventory must be bought and stored until needed. You need to
forecast demand by product, decide how much of each product to purchase
in each airport catering station location, and then manage inventory levels
and purchasing as demand fluctuates over time. Sound familiar? One O.R.
analyst took the jet fuel optimization solution “as is,” with no model changes,
applied it to liquor, and realized a one‑time $12 million to $18 million capital
expenditure benefit through optimized liquor inventory decision‑making in
less than one elapsed year.
My experience at Southwest Airlines’ Optimization Solutions proved with‑
out a doubt that a relatively small group of super‑talented and committed ana‑
lytics professionals can make a huge difference when allowed the freedom
to innovate and execute and aim that robust capability at the problems with
big financial and economic multipliers, like crew, jet fuel, liquor, and irregular
operations. It was truly an honor and privilege to lead that team and wit‑
ness the business value that they created and the economic impact that was
manifested by their analytical expertise and passion to make things better!
During my tenure at Southwest, I also served as director for enterprise
data, managing all of the data warehousing, enterprise reporting, and data
science tools and dashboard platforms, projects, and people. We delivered
some huge projects; two in particular come to mind. One is a customer data
warehouse for marketing that was credited with enabling $100 million in
incremental revenue annually by targeting the right marketing campaigns and
offers at the right customers, at the right time, and at the right price. (The cus‑
tomer data warehouse won the Teradata EPIC Award.) The second is all of
the data pipelines/extract, transform, and load (ETL) code and data warehouse
tables to enable the data flow from a new passenger reservation system to the
myriad downstream enterprise applications that needed that data to
function and execute business processes, such as credit frequent flyer miles,
process refunds, and enable proper revenue accounting, among many others.
That project required more than 3,900 test cases that had to pass for acceptance
and over 150 staff working for two years to complete – on time. That was, by far, the
largest and most impactful project team that I managed and led in my entire
career. An amazing group of leaders, architects, engineers, and analysts who
never wavered in their commitment to complete their part of the largest system
project in the history of the airline industry.
After my time at Southwest Airlines came to a close, I was looking for
an even greater challenge with an even larger company to ply my trade in
analytics and data science. For me, all of my very best career opportunities
have come from a referral from someone I already knew, like a colleague or
recruiter. The next job was no different.
In 2019, my former FICO Xpress account executive at Southwest told me
that Walmart Global Tech was aggressively hiring to dramatically expand
its data science capabilities, particularly in the supply chain. He offered to
submit my resume to the then chief data and analytics officer, with whom
Introduction
I wrote the following article in 1991, about four years into my career at
American Airlines (AA) Decision Technologies. At this time, I realized the
primary goal of an analytics practitioner should be researching (i.e.,
theoretical) and developing (i.e., applied) real‑world data‑driven, model‑based
solutions that solve problems more efficiently and effectively, help make better
decisions, answer key business questions, and deliver business value and
economic impact, and are actually utilized by their intended customer stake‑
holder group.
The subject matter of the article was “O.R.” (operations research), a pre‑
cursor to analytics – as statistics was a precursor to data science, which
now includes fields like machine learning. But the key point remains true
today, that the analytical sciences, including AI, are a means to an end of
creating significant business value and economic impact. They are not an
end in themselves.
At that time, O.R. was struggling to move away from having evolved into
a largely academic, theoretical discipline since its founding in the 1940s. The
trend accelerated in the 1970s through 1990s with the excellent applied work
done in the energy industry (oil & gas, electric), transportation (airlines &
trucking), building on the work done in the 1960s in manufacturing (e.g., the
seminal work in industrial production, inventory, and labor planning at PPG
attributed to Holt, Modigliani, Muth & Simon).
AI went through similar dark periods, known as “AI Winters,” in the
mid‑1970s and the late‑1990s and early‑2000s, when the discipline
struggled with: (1) funding and interest levels outside of academia, (2)
finding suitable problems that business could solve and wanted to invest in
solving, and (3) lack of data and insufficient computing power to drive AI
solutions.
36 DOI: 10.1201/9781003588344‑3
The Dual Challenge of the Analytical Sciences Practitioner 37
Many of the principles that show up in my later writings are found in this
early article, including:
Winning the hearts and minds of stakeholders and constituents and making
sure that you bring them along with you on the journey is critical to a suc‑
cessful outcome.
How are you generating business value and economic impact with analyti‑
cal sciences?
Upon realizing that this was the way OR operated in the “real world”
of business, where profit and loss are of paramount importance, I was
reminded of a sign that hung on the wall of the Simulation and Analysis
Division of Ketron Inc, where I worked as a programmer/intern while
in college.
The sign read: “Operations Research is the art and science of giving bad
answers to questions that would otherwise be given worse answers.” Although
the initial reaction of most OR practitioners, myself included, is to reject
such a statement as an extremely cynical and malicious threat to our
profession and livelihood, upon further consideration and reflection it
makes sense. While this is not always the case, the statement can
be interpreted as saying that it is better to have an approximate answer
than to have none at all, especially when the optimal answer may not
exist or be far too costly to determine.
The bottom line is that we live in an imperfect, complex and compli‑
cated world. As OR practitioners, we are charged to use our education,
skills and abilities to bring order and structure to our often
unorganized and chaotic world for the benefit of corporations, industry,
government, and society. To research and develop relevant, viable solutions
to real world problems using any variety of mathematical and com‑
puter methodologies in a timely and cost‑effective manner truly is the
dual challenge of the OR practitioner.
In writing this, I am accepting my own challenge to practitioners to
become more active in the society and rekindle the applied nature and
spirit of the discipline by writing more articles and presenting results
of successful, as well as unsuccessful, OR applications, so that we may
recognize our triumphs and learn from our failures.
Douglas Gray is a senior consultant with American Airlines Decision
Technologies in Fort Worth, Texas.
Introduction
I wrote this article in 2019 while I was interviewing with Walmart because
I was intrigued by the company’s aggressive digital transformation journey
from a brick‑and‑mortar retailer into an omnichannel retailer, expertly com‑
bining the physical presence of retail stores and wholesale clubs with the
virtual world of e‑commerce online ordering, and consumer pickup and
delivery, on a global scale. I reflected on similar experiences that I had at
American Airlines, Sabre, Travelocity, and several others, and recognized a
few key patterns and trends.
Most companies (more than 80%) still have a long way to go to realize the
full potential of digital transformation, and not surprisingly, because it is so
very, very challenging on so many levels.
Analytics and, now more than ever before, AI play a huge part in digital
transformation, alongside mobile and web.
The “why” of digital transformation is improving the customer experience
and economic performance through greater efficiency. The “how” is based on
the principles of speed, scale, change, drive, and fearlessness that propel the very
best companies on the digital transformation journey.
I updated the article to reflect on my past 4+ years of intensive and impactful
experience working in data science supporting supply chain and end‑to‑end
fulfillment at Walmart Global Tech.
Where is your company on its journey of digital transformation?
DOI: 10.1201/9781003588344‑4 41
where the value is realized. These attributes are embodied from senior lead‑
ership all the way through to the engineers who do the heavy lifting to man‑
age extraordinary volumes of data, build models and software, and deploy
and operate end‑user facing solutions.
How do they do it? Behaviorally speaking, what traits do those companies have in
common?
Digitally transformative companies share traits that are inherent in their
people and/or culturally mandated by aggressive, visionary leaders, e.g.,
Elon Musk at Tesla and SpaceX.
Some will say that digital transformation is easier for “service businesses”
such as banks and airlines, than it is for manufacturing companies. Toyota
pioneered, and is perfecting, robotics in automobile manufacturing, enabling
Toyota, and luxury car division Lexus, to regularly outperform all of their
competitors in sales volume, quality, customer satisfaction, and profitability,
i.e., economic efficiency – producing a greater number of units of output of
higher quality in exchange for fewer units of input, faster. I had a
student in an EMBA Business Analytics course that I teach at SMU who was
a steel mill deputy superintendent. The mill was struggling with profitability.
Using customer product demand data, mill production data, and mill product
profitability data, the student applied mathematical programming to optimize
the mill product mix and production schedules. The mill increased
profitability by 23% and the deputy superintendent was promoted and rolled out his
solution to the company’s other steel mills! If it can be done, it will be done!
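A product-mix problem of the kind the student solved can be sketched in miniature. Everything here is hypothetical: the three products, their profit and mill-hour figures, the demand caps, and the 10-ton lot size are invented for illustration. The exhaustive search stands in for a real linear or integer programming solver, which a production model would use instead.

```python
from itertools import product

# Hypothetical data: profit per ton, mill hours per ton, and demand cap in tons.
PRODUCTS = {
    "plate": {"profit": 90, "hours": 3, "max_demand": 40},
    "coil": {"profit": 70, "hours": 2, "max_demand": 60},
    "rebar": {"profit": 40, "hours": 1, "max_demand": 80},
}
MILL_HOURS = 200  # assumed mill hours available in the planning period
LOT_SIZE = 10     # assumed production lot size in tons


def best_mix(products, capacity, step):
    """Enumerate lot-sized production plans and keep the most profitable feasible one."""
    names = list(products)
    ranges = [range(0, products[n]["max_demand"] + 1, step) for n in names]
    best_plan, best_profit = None, -1
    for quantities in product(*ranges):
        hours = sum(q * products[n]["hours"] for n, q in zip(names, quantities))
        if hours > capacity:
            continue  # plan needs more mill hours than are available
        profit = sum(q * products[n]["profit"] for n, q in zip(names, quantities))
        if profit > best_profit:
            best_plan, best_profit = dict(zip(names, quantities)), profit
    return best_plan, best_profit


plan, profit = best_mix(PRODUCTS, MILL_HOURS, LOT_SIZE)
print("Production plan (tons):", plan)
print("Total profit:", profit)
```

With these numbers the search favors the products that earn the most profit per mill hour, which is the intuition a mathematical program makes precise at realistic scale.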
rate of change that is faster than ever. As important as the technologies are in
digital transformation, the point is not as much about the technologies as it is
about the focus and behaviors in employing the technologies. The companies
that digitally transformed themselves and their industries over the past five
decades had the same timeless and fundamental traits as Toyota: a penchant
for speed, scale, change, drive, and fearlessness in transforming from analog
to digital to achieve greater economic efficiency and bond themselves in a
more engaging relationship with their customers and partners.
Anyone engaged in digital transformation knows how incredibly difficult
it is to do at all, let alone do well. The amount of change and disruption is
cataclysmic, the speed is breakneck, and the risk is high, if not properly man‑
aged. The complexity involved and the sophistication required for success are
equally high and can seem daunting. The Navy SEALs, America’s most elite
fighting force, have mantras for dealing with this kind of pain:
Consider three examples: American Airlines not only survived the dereg‑
ulation cataclysm of the 1980s by digitally transforming themselves with
automation, data, and analytics, when so many other airlines did not (TWA,
Eastern, Braniff, Pan Am, and PEOPLExpress to name a few), but went on
to thrive and become one of the largest, most dominant air carriers in the
industry. (Surveys show Delta Airlines has caught up and surpassed them
since – they implemented their own versions of revenue management and
analytics in parallel to AA.) Walmart could not even feasibly operate their
global omnichannel retail business at the scale they do today, let alone do
so profitably, had they not maximally leveraged web, software, data, analytics,
and cloud data center technologies to digitally enable their customer
shopping, buying, and fulfillment experience, and optimize their end‑to‑end
supply chain processes. Lastly, Toyota has become the dominant global auto‑
mobile manufacturer, by most metrics, including total sales, quality, and cus‑
tomer satisfaction. Robotics, AI, and heavy automation, along with Kanban
just‑in‑time manufacturing and kaizen continuous improvement methods,
played an enormous role in that accomplishment over several decades.
There is one hope for us all. Digital transformation does not have to hap‑
pen all at once. No one wakes up one day “digitally transformed.” Like most
things in life, e.g., hitting a baseball, learning to ice skate, learning to code,
digital transformation is a journey, process, mindset, and a way of thinking and
behaving. It takes a significant amount of time and effort to get started, get
moving, really get rolling delivering value and hitting milestones, and ulti‑
mately be successful, then maintain at that level, and then get to the next level.
Will, tenacity, and resiliency are more important than intellect, although seri‑
ous technology skills are required.
Start now. Or not. The choice is yours. Just remember this…
“Only 53 companies have been on the Fortune 500 since 1955, thanks to
the creative destruction that fuels economic prosperity.” – Mark Perry,
American Enterprise Institute
4
Advanced Analytics is
Economically Transformational
Introduction
Continuing on the theme of transformation, I wrote this article as a motivator
for my students and colleagues to expound on the “why” behind analytics,
which is that it is economically transformational. Companies and entire indus‑
tries have been transformed by advanced analytics. The best example I wit‑
nessed was yield (revenue) management in the airline industry pioneered
at American Airlines (and Delta contemporaneously) in the 1980s, which
transformed dynamic, competitive pricing and commodity seat inventory
management to maximize revenue on every flight. That methodology is used
today in hotels, cruise lines, rental car companies, tour companies, passenger
and freight railroads, movie theaters, and even self‑storage facilities!
What economic performance metrics can advanced analytics measurably
transform in your business and industry?
What if I told you that your company could…
• Increase sales closure rates by 30% and double market share? (Targeting the
right prospects)
• Reduce transportation costs by $6 million? (Optimizing truck delivery
assignments and routes)
• Predict hospital post‑surgical readmissions with 90%+ accuracy, reducing
readmissions‑related costs by $500,000 annually, while improving patient
outcome and quality of care?
• Increase revenue 4%–6% annually on $14 billion in revenue? (More
accurately forecasting demand, dynamically adjusting pricing, and optimizing
inventory allocation)
• Reduce operating costs $100 million annually on $20 billion in revenue?
(Optimizing high‑skilled labor utilization & supply chain operations)
DOI: 10.1201/9781003588344‑5 51
If you are like most people, you would probably smile politely in disbelief, and
say, “No way!” An expected and reasonable response. These types of results
seem practically unbelievable, right? (BTW, they’re all real.) If you’re inquisi‑
tive, and inclined to want to achieve similar results, you might ask, “How?”
The answer is… analytics. Specifically, advanced analytics, i.e., predictive
analytics – answering the question, “What outcome is going to happen in the
future with what odds or probability?” – and prescriptive analytics – answering
the question, “How do we optimize the outcome of what is going to happen?”
Advanced analytics, i.e., mathematics, statistics, computer science mod‑
els, and algorithms, coupled with software and computer technology, and
LOTS of data about your customers and your business operations, is one of
the most economically impactful elements of digital transformation available.
Economic efficiency, i.e., producing a greater number of units of output of
higher quality in exchange for fewer units of input, faster, is one
of the goals of digital transformation. Advanced analytics delivers on this goal.
IDC reported that the median ROI of BI‑only projects is 89%; incorporating analyt‑
ics raises the median ROI to 145%.
How? The $64 million question, literally! First, predictive analytics.
Baseball Hall of Fame catcher Yogi Berra, famous for his “Yogi‑isms,” said,
“It’s tough to make predictions, especially about the future.” (Nobel Prize‑winning
Physicist Niels Bohr said something similar, but Yogi was more entertaining!)
Business, like life, is full of uncertainty. That uncertainty stems from complexity.
No human alone can consistently and accurately predict what is going to happen
tomorrow, next week, let alone next year. Think weather, oil prices, demand for
a (new) product, etc. (If you could, you’d clean up on Wall Street and the lottery!)
There are too many factors, and too many variables to consider, and often those
variables change over time, and interact with and depend on each other.
However, there is an approach to collect and analyze as much data about
as many factors and variables from the past as possible to build models that
can identify trends, patterns, and predict the odds or probability of certain
outcomes. The goal of predictive analytics is to get a lot better than “flipping
a coin” (50‑50) to predict an outcome, say 80%–90% accurately, knowing full
well that your predictions will never be 100% accurate. Pragmatically, models
will need to adapt, evolve, and adjust to changing conditions over time.
George E. P. Box, statistician and Father of Time Series Analysis (Forecasting),
famously said, “All models are wrong, but some are useful.” (Every model has
some predictive “error,” but some are far better than flipping a coin.)
• Out of 100 patients, if you can predict with 90% accuracy the 30%
that have the highest probability of readmitting after surgery, you
can focus your limited patient care staff on those patients, and pay
less attention to the other patients that are less likely to have issues.
• Imagine being able to predict with 90% accuracy the on‑time per‑
formance of your commercial airline tomorrow at every airport as
a function of weather and passenger traffic load; decision‑making
concerning the recovery of trouble areas would be far more effective.
In both instances, you will get it wrong 10% of the time, but that is a lot better
than 50% by guessing!
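
The patient example amounts to ranking patients by predicted risk and directing limited staff to the top of the list. A minimal Python sketch of that step, assuming a predictive model has already produced a readmission probability for each patient; the function name, patient IDs, and scores are hypothetical and are not taken from the projects described here:

```python
from typing import Dict, List


def top_risk_patients(risk_by_patient: Dict[str, float],
                      fraction: float = 0.30) -> List[str]:
    """Return the IDs of the highest-risk patients, highest risk first."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Rank patient IDs by predicted probability, largest first.
    ranked = sorted(risk_by_patient, key=risk_by_patient.get, reverse=True)
    # Keep the requested share of the list, rounded to whole patients.
    n_selected = round(fraction * len(ranked))
    return ranked[:n_selected]


# Hypothetical model output: predicted readmission probability per patient.
scores = {
    "P01": 0.82, "P02": 0.10, "P03": 0.45, "P04": 0.91, "P05": 0.05,
    "P06": 0.33, "P07": 0.67, "P08": 0.20, "P09": 0.15, "P10": 0.08,
}
focus_list = top_risk_patients(scores, fraction=0.30)
print(focus_list)
```

The ranking is only as good as the model behind the scores; the sketch shows how a prediction becomes a staffing decision, not how the prediction itself is made.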
Second, prescriptive analytics.
Businesses have objectives, e.g., maximize revenue, minimize costs, and
constraints, e.g., limits on resources, like raw materials, staff, and productive
capacity (steel mill, airplanes) or time windows when events must occur, e.g.,
deliveries. Variables, such as how much of each product to make and sell at
what price, are difficult to determine visually or in a spreadsheet to reach
the optimal outcome due to the vast number of possible combinations and
permutations of variables and their values that constitute the range of solution
outcomes. The field of mathematical optimization, known as mathematical
programming, was developed to solve such problems.
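
To make the vocabulary of objective, constraints, and decision variables concrete, here is a toy product‑mix problem in Python. It is a sketch under stated assumptions: the two products, their profits, and the resource capacities are invented, and the search is plain exhaustive enumeration, which only works because the problem is tiny. Real problems of this kind would be handed to a linear or mixed‑integer programming solver.

```python
from itertools import product

# Hypothetical data: profit per unit and resource usage per unit.
PROFIT = {"A": 30, "B": 20}
USAGE = {
    "machine_hours": {"A": 2, "B": 1},
    "labor_hours": {"A": 1, "B": 1},
}
CAPACITY = {"machine_hours": 100, "labor_hours": 80}


def is_feasible(plan):
    """Constraints: total usage of each resource must fit its capacity."""
    return all(
        sum(USAGE[res][prod] * qty for prod, qty in plan.items()) <= CAPACITY[res]
        for res in CAPACITY
    )


def best_product_mix(max_units=100):
    """Enumerate every whole-unit plan and keep the most profitable feasible one."""
    best_plan, best_profit = None, float("-inf")
    for qty_a, qty_b in product(range(max_units + 1), repeat=2):
        plan = {"A": qty_a, "B": qty_b}  # decision variables
        if not is_feasible(plan):
            continue
        profit = sum(PROFIT[p] * q for p, q in plan.items())  # objective
        if profit > best_profit:
            best_plan, best_profit = plan, profit
    return best_plan, best_profit


plan, profit = best_product_mix()
print(plan, profit)
```

Even with two products the enumeration checks more than 10,000 candidate plans, which hints at why spreadsheets and eyeballing break down as products, plants, and constraints multiply.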
One incredibly insightful real‑world application of mathematical programming,
and its impact on business value and economic efficiency, is the one
below from United Parcel Service (UPS). Every day, UPS must optimize the
routes of 55,000 delivery truck drivers – no small feat solving that problem!
For UPS, eliminating one mile, per driver, per day over one year can save
up to $50 million. By the end of 2016, 55,000 routes optimized daily by the
On‑Road Integrated Optimization and Navigation (ORION) system will have
saved 10 million gallons of fuel annually, reduced 100,000 metric tons in CO2
emissions, and avoided an estimated $300 million to $400 million in costs.
Often, when analyzing a complex system with many moving parts that
interact – and in which events and activities are random or probabilistic in
their occurrence and duration – it is not possible to formulate a mathemati‑
cal optimization problem with equations representing objectives, variables,
and constraints. In those instances, we utilize other prescriptive analytical
techniques, such as discrete‑event (aka, Monte Carlo) computer simulation to
optimize the policies that govern the operation of the complex system, e.g.,
cranes unloading ships in a harbor cargo dock, customers waiting in line at
a bank or amusement park, or operating an airline schedule or an airport. By
simulating the system’s operation under different conditions with thousands
of computer simulation replications, statistically, we can measure and deter‑
mine which policies work best to achieve the desired business outcomes.
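
The bank‑line case can be sketched in a few lines of Python. This is an illustrative simulation under assumptions that are not from the text: customers arrive at random with exponentially distributed gaps, service times are also exponential, the line is first come, first served, and the two "policies" compared are simply one teller versus two. All parameter values are made up.

```python
import heapq
import random
from statistics import mean


def simulate_average_wait(num_tellers, num_customers,
                          mean_interarrival, mean_service, rng):
    """Simulate one day at the bank; return the mean time spent waiting in line."""
    # Min-heap of the times at which each teller next becomes free.
    teller_free_at = [0.0] * num_tellers
    heapq.heapify(teller_free_at)
    clock = 0.0
    waits = []
    for _ in range(num_customers):
        clock += rng.expovariate(1.0 / mean_interarrival)  # next arrival time
        earliest_free = heapq.heappop(teller_free_at)
        start = max(clock, earliest_free)  # wait only if every teller is busy
        waits.append(start - clock)
        service_time = rng.expovariate(1.0 / mean_service)
        heapq.heappush(teller_free_at, start + service_time)
    return mean(waits)


def compare_policies(replications=300, seed=42):
    """Average the daily mean wait over many replications for each policy."""
    rng = random.Random(seed)
    results = {}
    for tellers in (1, 2):
        results[tellers] = mean(
            simulate_average_wait(tellers, 200, 1.0, 0.8, rng)
            for _ in range(replications)
        )
    return results


results = compare_policies()
print(results)
```

Averaging over many replications is what turns a single noisy run into a statistically meaningful comparison between policies.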
Advanced analytics is economically transformational. Advanced analytics, cou‑
pled with large amounts of data and readily available inexpensive computing
power, possesses a uniquely powerful capability to cut through the uncer‑
tainty and complexity that grip business operations, clarify the likelihood
of outcomes, prescribe the best actions to take, and deliver significant, tangible,
measurable business value and economic impact. This thesis has been
proven consistently over decades in industries ranging from transportation
Introduction
There are defining moments in every person’s career (some favorable, others
not). This article summarizes one of my first favorable defining moments:
a project that (1) got me recognized as someone who could work closely and
effectively with customers, solve a business problem that was causing considerable
pain, deliver a holistic, stand‑alone pain‑relieving solution, and generate
significant business value, (2) led to my first big promotion to Manager/
Principal at AADT, and (3) put me on a trajectory to senior leadership.
I published this article in 1992 to commit to paper my first successful
model‑based solution design, development, and deployment from “scratch.”
This was the very first time I coded an entire system starting with a blank
screen C compiler/development environment. As daunting as that was, the
experience was invaluable and foundational to my career as a practitioner and
leader, influencing how I go about getting things done in a way that leads to a
favorable outcome for all involved.
The story of the project, the approach, and outcome is detailed in the arti‑
cle, and I alluded to it in Chapter 1, so I won’t give away the ending here.
What I will say, to characterize the story from a career development lessons
learned perspective, is that while many of my peers and colleagues were
fortunate to work in the far more glamorous, some might say “sexier,” areas
of airline O.R., like yield management, flight scheduling, even crew scheduling,
I ended up on the operations side of the business working in, some
might say, the “grungier backwaters” of airline O.R., like airport operations,
airline operations, crew training, and aircraft maintenance. First of all, it is
important to note that if these “operations” domains and disciplines are not
running smoothly and empowered by analytical, automated solutions, all
of the “sexier” O.R. quickly becomes impaired, i.e., metaphorically speaking,
goes right into the dumpster where it catches on fire. Airports and airlines have
to operate in bad weather, pilots need to be trained, and planes need to be
maintained, or no one is going anywhere. Secondly, the moral of this story
DOI: 10.1201/9781003588344‑6 55
is that you, as a practitioner with help from a capable software engineer, can
create significant value with analytics working in the “grungier, less glamorous
parts” of your company, to the tune of over $1 billion (2023 dollars), in this
case, over the life of an airline fleet, with a very fast, heuristic algorithm solving
a classic problem model formulation, and a very visually appealing and
functionally efficient GUI‑based Apple Macintosh PC software application.
Look for opportunities with large value multipliers wherever you happen to
land in your company, preferably where people are currently solving com‑
plex problems with big sheets of paper and colored pencils or spreadsheets, and
then … make the most of it.
Read on and you will see exactly what I am talking about. And, by the way,
the O.R. and systems approach employed was quite clever, if I do say so myself – credit
Georgia Tech ISyE Drs. Jarvis and Ratliff for their inspiration in the field of interactive
optimization (circa 1986).
With these three elements, we set out to design and develop a flexible,
user‑friendly decision support system which would allow the user to
generate a maintenance plan and then perform various what‑if analyses
to evaluate the impact on the plan of changes in any of the key maintenance
variables. The need for flexibility was of paramount importance
due to the need for planners to react quickly and deftly to the rapidly
changing maintenance planning environment. The system would be
required to have three levels of functionality: (1) manage and
develop maintenance planning data, (2) generate maintenance plan scenarios
quickly and efficiently, and (3) generate a variety of tabular and
graphical reports to describe and evaluate the maintenance plan.
Heuristic Approach
The need for a heuristic approach versus a strict formulation solution
or optimal algorithm approach is justified by many circumstances
specific to this scheduling problem. The exact amount of dock
line capacity to have open at each point during the planning horizon
is rarely, if ever, known prior to running the scheduling model. The
delicate balance of optimizing yields so that no aircraft exceeds its
allowable limit depends heavily on having the right amount of capac‑
ity available throughout the planning horizon. If an overhaul could not
be scheduled due to insufficient capacity, an optimizer (e.g., MPSX or
MPS‑III) would return an infeasible solution. In the strictest sense this
is true; however, the user can modify the amount of capacity available
and rerun the model to easily correct the infeasibility.
On the other hand, excess dock capacity leaves gaps in the overhaul
schedule, and would also result in an infeasible solution due to the
inability to satisfy the constant dock utilization constraint. Similarly, if
an overhaul could not be scheduled such that its yield falls between the
lower and upper yield control limits, an optimizer would, again, return
an infeasible solution. Such a solution, however, may be acceptable to
the user in some instances.
Due to the unique structure of the overhaul scheduling problem in
which checks may be sequenced on allowable date deadline, the greedy
heuristic performs quite well. The algorithm uses the deadline to pro‑
vide a natural ordering of checks, selects the check with nearest dead‑
line, and positions that check in a dock line such that its yield is as large
as possible. The quality of the solution is highly dependent on the user’s
skill at specifying available dock line capacity and using the yield con‑
trol limits to drive the solution in the right direction. Upon implement‑
ing the algorithm, it was discovered that given a good starting point,
the algorithm does quite well at achieving yields between 90 and 100
percent of the allowable limit on most checks.
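
The greedy rule described above can be sketched in Python as follows. This is an illustration, not the original C implementation: it assumes that "yield" means the share of the allowable interval already used when a check starts, that each dock line simply becomes free on a known day, and that the aircraft data are invented. When no dock line opens before a deadline, the sketch schedules the check anyway and reports a yield above 1.0, so a planner could add capacity and rerun.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Check:
    aircraft: str
    last_check_day: int  # day the previous check was completed
    interval_days: int   # maximum allowed days between checks
    duration_days: int   # days the aircraft occupies a dock line

    @property
    def deadline(self) -> int:
        """Latest allowable start day for this check."""
        return self.last_check_day + self.interval_days


def greedy_dock_plan(checks: List[Check],
                     dock_free_day: List[int]) -> List[Tuple[str, int, int, float]]:
    """Return (aircraft, dock line index, start day, yield) for each check."""
    free = list(dock_free_day)  # copy so the caller's list is not modified
    plan = []
    # Natural ordering: nearest deadline first.
    for check in sorted(checks, key=lambda c: c.deadline):
        on_time = [i for i, day in enumerate(free) if day <= check.deadline]
        if on_time:
            # The latest on-time start gives the largest yield.
            dock = max(on_time, key=lambda i: free[i])
        else:
            # No dock opens in time: take the earliest one (yield exceeds 1.0).
            dock = min(range(len(free)), key=lambda i: free[i])
        start = free[dock]
        check_yield = (start - check.last_check_day) / check.interval_days
        plan.append((check.aircraft, dock, start, round(check_yield, 2)))
        free[dock] = start + check.duration_days
    return plan


# Hypothetical fleet of three aircraft and two dock lines.
checks = [
    Check("N101", last_check_day=0, interval_days=100, duration_days=20),
    Check("N102", last_check_day=10, interval_days=100, duration_days=20),
    Check("N103", last_check_day=20, interval_days=100, duration_days=20),
]
plan = greedy_dock_plan(checks, dock_free_day=[80, 95])
for row in plan:
    print(row)
```

In this toy run one check lands at a 0.70 yield because the only on‑time dock line opens early, which illustrates why the quality of the result depends on how well the available dock capacity is specified.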
• A DC‑10 fleet (40 aircraft) 5‑year dock plan for main base visits
and component removals can be generated in about one minute.
• A 727 fleet (164 aircraft) 5‑year dock plan for heavy and light C
checks and component removals can be generated in about five
minutes.
• A Super 80 fleet (250 aircraft) 5‑year dock plan for heavy and
light C checks and component removals can be generated in
about eight minutes.
Introduction
I wrote this article upon reflection on the project described in Chapter 5.
Fortunately, a lot of things went right on the AA heavy maintenance check
and hangar planning and scheduling project, and the end result was a suc‑
cess, as evidenced by many favorable economic impacts and business value
generation outcomes. I wanted to understand and capture what went right
in an effort to define a repeatable consulting process for O.R. projects. These lessons
learned and the process still hold true today, more generally for analytics,
data science, and AI projects.
The key attributes in the consulting process include:
66 DOI: 10.1201/9781003588344‑7
Consulting Concepts Learned from Airworthy 67
REAL‑WORLD PROJECT
In order to solidify and give credence to the methodology, specific examples
are included from an actual OR model‑based system development
project recently undertaken by the author. This project, which involved
the development of an aircraft overhaul scheduling system for the
American Airlines Maintenance and Engineering Division, is a particu‑
larly good example on which to focus for several reasons. It represents a
complete systems development process from system conceptual design
through development, implementation and support, and demonstrates
the enormous benefits that can be achieved when OR is applied with
the client’s best interests in mind. In fact, it was the success of this proj‑
ect that motivated the author to write this paper, and thereby formally
recognize the consulting concepts learned during this process.
The clients we worked with to develop the aircraft overhaul scheduling sys‑
tem are a part of the Long Range Planning Group of the Maintenance and
Engineering (M&E) Division based at the American Airlines Maintenance and
Engineering Center in Tulsa, Okla. An interdisciplinary approach to the sys‑
tems development project was taken from the start involving a maintenance
scheduler (the ultimate user of the system), an industrial engineer supporting
the maintenance schedulers (who acted as a liaison between the consultants and
the client user group), an OR consultant (i.e., the author, responsible for problem
analysis and scheduling model development), and finally a systems consultant
(responsible for developing the input/output user‑interfaces and report writers).
Gene Woolsey has said for years that “if a consultant doesn’t under‑
stand the way their client does business today as well, if not better, than
they do, and then attempts to tell them to do business another way, then
the consultant is a fraud.”
The client’s primary objective for the new system was to produce overhaul
schedules which better utilized available hangar capacity and reduced the total
number of overhaul checks performed, while remaining within government
regulatory aircraft maintenance guidelines. However, the client’s needs for the
system’s functionality included the ability to:
The understanding of these goals and needs allowed the system specification
and development process to proceed with few unforeseen circumstances. In
fact, using available technology, we, the consultants, were able to far exceed the
minimum requirements put forth by the user and provide them with imagina‑
tive as well as functional solutions and capabilities.
ATTITUDE
Often, clients harbor feelings of intimidation or ill‑will with regard to
OR consultants. Several factors cause this type of behavior:
CHAMPIONS NEEDED
Tom Peters, the well‑known management consultant, has said, “Anytime
anything gets done anywhere in business, it is because of a champion.”
A champion is defined as the person who leads the effort to initiate a
project, fights the battles, defends the project, refuses to let the project
die, and pilots the project through to a successful completion.
The acceptance and implementation of a “champion‑of‑the‑cause”
concept in the world of OR consulting is long overdue. Too many times
projects are begun with good intentions on both sides, but because of
lack of support, or lack of focus and direction, or political consequences,
the project is intentionally scrapped or inadvertently left foundering.
Champions are needed on both the client and consultant sides of the
project. A project’s potential for success is considerably higher if there
is a champion on the client side: someone who “knows the players, politics
and power centers” and who can successfully support and maneuver
the project through the bureaucratic and political minefield that
has doomed so many OR projects. Even stronger support can be gained
if that client champion has some understanding of the consultant’s
Bi‑weekly status reports and almost daily phone calls from consultant to
client helped keep the project on track. It became evident that sufficient time
was not provided in the initial proposal to allow for unforeseen obstacles.
However, the delays were seen as legitimate by the client organization and not
as excuses made by the consultants. In the final analysis, the project, originally
planned as 12 man‑months, required 14 man‑months to complete, including
1.5 man‑months worth of client requested add‑on developments.
CLIENT PARTICIPATION
Client participation in and awareness of all aspects of the project is one
of the most significant elements in the successful completion of projects
at American Airlines. Too often OR consultants discount the opinions,
insights and knowledge that a client can provide on how to develop a
system or perform an analysis. We mistakenly believe that our educa‑
tion and problem‑solving skills put us in a position to overrule what are
actually the “realities” of applying OR in business. A better approach is
to couple our skills with the client’s knowledge and understanding to
develop a more complete and informed solution.
Granted, there is a very fine line between having enough objectivity
to solve the problem at hand without being jaded by the client’s opin‑
ions, and yet not being ignorant or indifferent to what the client has to
say. Therefore, OR consultants must be attentive listeners, ever‑sifting
through the content of conversations with their clients, to separate facts
from opinions.
Client involvement plays another significant role in the success of a
project. Clients who have participated in an analysis or system develop‑
ment are much more likely to accept the results because they take on
the pride of ownership. This acceptance and ownership augments the
relationship between clients and consultants and goes a long way to
ensure the success of the project. A client who feels like part of a team
is considerably less willing to quit or abandon the team’s end product.
The client organization was very much involved in the project from the very
beginning. They recognized the need for the system, documented the reasons
why the system was needed, performed the necessary cost justification, collected
and developed all of the necessary data to drive the system, and wrote their own
system functionality specification. All of this enthusiasm and active participation
made the consultants’ job significantly easier. In fact, it is the dream of every
consultant to have a client who is motivated to help improve their own situation
instead of sitting back and waiting to be presented with “the right answer.”
Although the client users were skeptical at first of the advanced technology,
upon explanation and assurances from the consultants and upon undergoing
their own very thorough model verification and validation processes they
quickly became convinced that the new solution was, indeed, a better one. It
was and continues to be a joint team effort of clients and consultants working
together that has made the new aircraft overhaul scheduling system successful.
COMPLETE CONSULTING
Experience has shown that successful OR applications take a more holis‑
tic approach to identifying and solving client problems. Understanding
a client’s business and objectives is a place to start. Understanding
their decision‑support needs and delivering models and systems that
address those needs and provide benefits is where OR can make its
greatest contribution.
Instead of focusing on one particular instance of one particular prob‑
lem in one department, OR must broaden its problem‑solving perspec‑
tive to increase its impact on the organization. In order to “go beyond
models” and be effective within organizations, we must develop our
interpersonal skills to better communicate and market what we have
to offer.
The challenge of the OR discipline in the coming decades will be to
employ a complete OR consulting process that uses technology effec‑
tively in conjunction with our people skills to positively impact busi‑
ness and industry.
Recognizing M&E’s Long Range Planning’s function as a business – and
treating their maintenance schedule as the product that that organization
produces – led us to our purpose of developing a system that would allow plan‑
ners to create the best maintenance schedule product that was possible in an
efficient and timely fashion. By being sensitive to the client’s needs and implementing
appropriate levels of OR and computer technology, a system solution
was developed and successfully implemented that will continue to serve the
needs of maintenance planners for several years to come.
The degree of improvement in the aircraft overhaul schedule development
process was recognized when a planner identified six months’ worth of excess
overhaul hangar capacity using the new system. This extra capacity, if utilized
to perform aircraft conversion work that would have otherwise been contracted
to an outside vendor, will save American Airlines over $3 million in labor costs.
Improving the methodology and process by which decisions are made and providing
users with the right tools to identify such cost saving opportunities is
most definitely where OR can make its greatest contribution. We would like to
believe that we help people to, in the words of American Airlines President and
CEO Bob Crandall, “work smarter, not harder.”
Douglas A. Gray is a principal with American Airlines Decision Technologies
in Fort Worth, Texas
Introduction
This article has two important connections to other work previously men‑
tioned. First, I posted this article as a follow‑up and concrete example of how
advanced analytics can be economically transformational, in practice, in the real
world, as described in Chapter 4. Second, it is a more recent implementation
of the same type of technical approach and consulting process outlined in
Chapters 5 and 6.
There are a handful of key takeaways:
78 DOI: 10.1201/9781003588344‑8
A Modern Day Project Applying the Same Principles 79
(12 × 50 × 16) models all solve in ~five minutes, and generated gloop forecasts
that were 12%–70% more accurate than the prior spreadsheet approach,
which took three days to complete by hand. The data were extracted from
several enterprise systems using SQL queries, cleansed, and integrated using
Alteryx Designer, and all of the forecasting models were developed in R.
Third, Acme must make annual contractual purchasing commitments to
their gloop suppliers at each of their 50 production plant locations. Acme
leverages the gloop demand forecasts generated in the previous step to
inform suppliers of the quantity of gloop required annually at each plant.
The contracting process is carried out as an RFP bidding process where sup‑
pliers submit bids for quantities to be supplied at set prices at a given plant
location. Acme’s goal, simply stated, is to satisfy demand for gloop at each
plant at minimum total cost. Easier said than done since gloop prices vary
by supplier at each plant, and additionally, there is a complex array of taxes,
tariffs, fees, and other costs, e.g., EPA charges, that vary at the federal, state,
and local levels. The variety of different costs for gloop across 50 plants and
2–3 suppliers at each plant quickly becomes unwieldy, especially because it
was previously processed in a spreadsheet. The annual RFP gloop purchasing
contract problem is formulated as a fixed‑charge mixed‑integer linear
programming model that simultaneously selects the gloop supplier(s) (in
case one supplier cannot supply the amount of gloop required) and satisfies
demand such that the total cost across all 50 plants is minimized. Alteryx
Designer was again used to extract, cleanse, and integrate the data, and FICO
Xpress Optimization was used to solve the mathematical programming
problem in a few minutes. Over a three‑year period, Acme was able to satisfy
demand and avoid $38 million in gloop purchasing‑related costs versus
the prior manual spreadsheet‑based method. The automated optimization
tools also enabled Acme to experiment with different purchasing contract
arrangements, e.g., bundling supplier contracts to fulfill demand at plants in
close proximity to one another to achieve volume discounts, which resulted
in millions more dollars in savings on an annual basis. Such experimenta‑
tion was not possible at all with the prior methods.
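
One common way to write a fixed‑charge supplier‑selection model of this kind is sketched below. The symbols are illustrative assumptions rather than Acme's actual formulation: P is the set of plants, S_p the suppliers bidding at plant p, x_ps the quantity bought from supplier s at plant p, y_ps a binary indicator that supplier s is awarded a contract at plant p, c_ps the all‑in unit cost (price plus taxes, tariffs, and fees), f_ps the fixed charge for using that supplier, u_ps the most the supplier has offered to deliver, and d_p the forecast demand at the plant.

```latex
\begin{aligned}
\min \quad & \sum_{p \in P} \sum_{s \in S_p} \left( c_{ps}\, x_{ps} + f_{ps}\, y_{ps} \right) \\
\text{s.t.} \quad & \sum_{s \in S_p} x_{ps} = d_p && \forall p \in P \\
& 0 \le x_{ps} \le u_{ps}\, y_{ps} && \forall p \in P,\ s \in S_p \\
& y_{ps} \in \{0, 1\} && \forall p \in P,\ s \in S_p
\end{aligned}
```

The linking constraint x_ps ≤ u_ps y_ps is what makes the charge "fixed": any purchase from a supplier forces y_ps to 1 and incurs f_ps in full, and it also lets demand be split across suppliers when one alone cannot cover it.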
Finally, because demand for gloop is stochastic and varies on an intra‑month
basis, and Acme has fixed gloop storage tank capacity, it is necessary to regularly
monitor gloop consumption and on‑hand inventory levels to plan for
and schedule gloop replenishment deliveries from suppliers during each
month at each plant location. Historically, this process was spreadsheet
based, mostly manual, and wholly unscientific, i.e., rules of thumb. The solution
approach to the intra‑monthly demand and supply gloop inventory
problem frames up as a traditional EOQ‑Economic Order Quantity model
with Re‑Order Point, Safety Stock, and Supplier Lead Time considerations.
The EOQ model has a nonlinear objective function (quadratic equation),
which attempts to minimize a combination of gloop purchasing, shortage,
and holding costs, subject to a set of (linear and nonlinear) constraints to
account for demand, tank capacity limits, safety stock, and supplier delivery
lead times. Demand for gloop during the supplier delivery lead time is mod‑
eled as a normal distribution. The EOQ inventory formulation is a hybrid pre‑
dictive and prescriptive analytics model that is solved using a combination
of R (normal distribution) and FICO Xpress Optimization (nonlinear
mathematical programming model) in a matter of minutes. Utilizing this scientific
approach, Acme is far less likely to unnecessarily over‑stock gloop
without increasing the risk of a shortage. Having less capital tied up in gloop
inventory increases cash flow.
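The building blocks named above can be sketched in a few lines of Python. This is the simplified textbook version (classic EOQ with ordering and holding costs only, plus a normal‑distribution safety stock), not the nonlinear program Acme solved with R and FICO Xpress. All parameter values are hypothetical, and capping the order quantity by tank capacity minus safety stock is an assumption made here for illustration.

```python
from math import sqrt
from statistics import NormalDist


def eoq_policy(annual_demand, order_cost, holding_cost,
               mean_lead_demand, sd_lead_demand,
               service_level, tank_capacity):
    """Textbook EOQ with reorder point and safety stock.

    annual_demand:    units consumed per year
    order_cost:       cost per replenishment order
    holding_cost:     cost to hold one unit for one year
    mean_lead_demand: mean demand during supplier lead time
    sd_lead_demand:   standard deviation of lead-time demand
    service_level:    target probability of no shortage per cycle (0-1)
    tank_capacity:    maximum units that can be stored
    """
    # Classic EOQ: balances ordering cost against holding cost.
    eoq = sqrt(2 * annual_demand * order_cost / holding_cost)
    # Safety stock from the standard normal quantile of the service level.
    z = NormalDist().inv_cdf(service_level)
    safety_stock = z * sd_lead_demand
    # Reorder when on-hand inventory falls to this level.
    reorder_point = mean_lead_demand + safety_stock
    # Assumption: a delivery must fit in the tank on top of safety stock.
    order_qty = min(eoq, tank_capacity - safety_stock)
    return {
        "eoq": eoq,
        "safety_stock": safety_stock,
        "reorder_point": reorder_point,
        "order_qty": order_qty,
    }


# Hypothetical plant: values chosen only to illustrate the calculation.
policy = eoq_policy(annual_demand=10_000, order_cost=200, holding_cost=4,
                    mean_lead_demand=500, sd_lead_demand=100,
                    service_level=0.95, tank_capacity=900)
print(policy)
```

In this made‑up case the unconstrained EOQ is 1,000 units, but the tank limit binds, so the order quantity is reduced. Raising the service level increases safety stock and the reorder point, which is the financial versus operational trade‑off described in the text.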
The multifaceted problem described is solved using a combination of financial
hedging to offset commodity price fluctuation risk, and operational hedging
to balance the costs associated with two kinds of bias: financial bias (minimize
cost) and operational bias (never run out of gloop!). The inventory problem is
stochastic and requires predictive analytics to forecast demand for gloop, and
account for demand during supplier lead times. The problem of minimizing
total purchasing and inventory costs is solved with prescriptive analytics, in
this case, optimization.
Leveraging COTS software, well‑known analytical models, and a large
volume of data from multiple enterprise systems, one part‑time analytics
professional and one part‑time project manager led by a director, working
closely with domain experts, were able to improve Acme’s cost structure,
cash flow, and profitability. The benefits achieved between the before and after
scenarios are significant, substantive, and make a strong case for the
economically transformational nature of advanced analytics as a part of digital
transformation. Best of all, the 2nd, 3rd, and 4th components of the solu‑
tion described herein were achieved with a modest investment of $400,000
in internal labor costs (hardware and software capital expenditures were
unallocated and considered part of overhead utilized by many other projects
at Acme, Inc.).
8
Right Tool, Right Place, Right
Time (with Nader Kabbani)
Introduction
I co‑authored this article with Nader Kabbani in 1994 when we were
colleagues at American Airlines, based on Nader’s experience in flight
planning and scheduling solutions, and my experience with airline operations
control solutions. Our goal was to characterize planning, scheduling, and
operations problems that occur in all sorts of industries by their respective
inherent characteristics that mandate a solution approach that is well‑suited
to handle all facets of each problem’s underlying nature. Given our
experience, we used airline industry flight planning, scheduling, and operations as
the context to explain our approach.
Although the principles outlined in the article still largely hold true today,
the advances in computational power of servers have enabled far more robust
approaches in solving such problems; for example, “clean sheet”
scheduling in airline flight scheduling. Airlines used to start with an existing flight
schedule and “tweak it” making small changes to cities served, fleet capacity,
etc.; however, clean sheet, as the name implies, enables the airline schedulers
to start from “scratch” each time a new schedule is developed, permitting
much greater flexibility in cities served, route structures, flight frequencies,
aircraft assignments, etc., enabling maximization of airline schedule revenue
and profit potential.
One observation worthy of note is the reference to work done in the
real‑time airline operations control domain at United Airlines that was
published in an article entitled “A Decision Support Framework for Airline
Flight Cancellation and Delays” authored by Drs. Ahmad Jarrah and Gang
Yu, et al. (Transportation Science, Vol. 27, No. 3, August 1993, pp. 266–280). The
article outlined an approach (i.e., minimum‑cost network flow model
framework) that was very similar to that applied by my team, led by Dr. Mark Song
and Dr. Phil Beck at Southwest Airlines, in the development of The Baker,
referenced earlier, from 2008 to 2015, and continues there today. (Small world
in airline O.R., as Dr. Gang Yu was Mark’s colleague at UT‑Austin and at Gang’s
82 DOI: 10.1201/9781003588344‑9
company CALEB Technologies in Austin that won the Edelman Award for their
work in Crew Operations Recovery at Continental Airlines; coincidentally, Nader
also worked at CALEB briefly after he left AA before going on to a brilliant career at
Amazon.)
Alberto Vasquez, John Kirk, Pitu Mirchandani, and I published
a related paper that originated from their work at AA on the Model for
Irregular Operations (MIO). [Vasquez, A., Gray, D., Kirk, J., & Mirchandani,
P. (1990, May). “A Framework for Implementing Real‑time Re‑scheduling
Systems.” Proceedings, Rensselaer’s Second International Conference on Computer
Integrated Manufacturing.]
Airline operating complexity naturally provides an excellent context for
understanding how to approach and solve such complicated planning,
scheduling, and operations problems.
product or service mix which will maximize profit (or revenue) given
available capacity and its associated costs, where capacity is a function
not only of the number of resources available but also the time frame
over which demand is requested for the product mix.
The tools employed in solving planning problems are usually large‑scale,
optimization‑based models and corresponding algorithms. As a result,
the problem’s inherent characteristics include static, rough‑cut data and
a lack of extreme time frame limitations (e.g., weeks or months). What‑if
scenario analysis, sometimes spreadsheet‑based, has put a new spin on
the concepts of sensitivity analysis in order to provide management plan‑
ners with the capability to quickly assess the impact of changing param‑
eters on their plan. This stage has the luxury to be forward‑looking and
proactive in its approach to organizational objectives and is typically not
characterized by frenzied, stressful engagements.
Scheduling, more tactical in nature, attempts to create a sequence
or order in which activities will be completed, as well as some
assignment of activities to available and qualified resources. Albeit a planned
assignment, the process considers the characteristics of the activity at
hand and the criteria which resources must meet to be considered for
assignment to ensure feasibility; i.e., resource qualification and
availability. The objective in this stage is to generate a feasible assignment
of activities to resources that attempts to realize the profit (or revenue)
objectives of the planning stage. Scheduling is characterized by some
of the features of planning in that schedules are still relatively static
prior to implementation. This stage also embodies some characteris‑
tics of operations control that necessitate re‑scheduling of activities and
re‑assignment of resources, vis‑a‑vis short‑term changes in operating
conditions such as resource availability or activity duration.
Scheduling is well‑known to be a discipline of problems that are com‑
putationally intractable in all but a few simple cases and at least hard
in many practical cases. Given those realities, the objective is usually
to find problem‑specific rules‑of‑thumb or heuristics that quickly
provide good, feasible solutions in most practical cases. A combination of
optimization‑based and heuristic methods is typically employed in
scheduling. However, simulation‑based evaluative methods, such as
what‑if analysis, have recently shown promise in scheduling as well
[Gray, 1992].
Operations control, as suggested by its name, is a more operational
problem and represents where the “plan hits the fan.” Operations con‑
trol attempts to do just that, “control operations” vis‑a‑vis unforeseen
circumstances, e.g., machine or vehicle mechanical failures, economic
cataclysms or weather conditions. The objective of this stage is twofold:
to minimize the impact of exogenous and sometimes wholly unknown
and uncontrollable factors on the planned schedule, and to ensure that
the plan is implemented in the most efficient, cost‑effective way possible
to protect the revenue and profit objectives built into the plan.
Whereas scheduling and planning are aggregate, proactive and
long‑term, operations control is specific, reactive and real‑time. This
environment mandates less formal and less rigorous solution tools
and relies heavily on access to timely and accurate information in a
medium conducive to manipulation and ad hoc analysis. Time‑honored
rules‑of‑thumb and heuristics grounded in battlefront experience usually
help to carry the day. Tools, typically computer‑automated, that assist
operations controllers in monitoring the evolution of the plan,
alert them to adverse conditions, and provide information and basic
decision support functions to assist in conflict resolution are commonplace.
The graph in Figure 8.1 illustrates the relationship between the level
of sophistication of problem‑solving tools and the need for timely and
FIGURE 8.1
Operations control, visualized.
86 The Art of Data Science
OVERVIEW
Although OPSC problems are, as defined above, different in nature,
their respective solution approaches share similarities. These similari‑
ties include a need for:
FLIGHT SCHEDULING
Constraint‑based decisions are made during intermediate‑to‑short‑term
flight scheduling. Such decisions are designed to make
the schedule operationally feasible while accounting for the impact of
schedule changes on overall schedule profitability. Detailed operational
constraints are applied at this stage of the process. Such constraints
vary significantly and include complex factors such as airport
curfews, slots and tower hours; aircraft range, over‑water capabilities
and seating capacities; maintenance and gating rules and requirements.
critical. Alerts are also used to flag other noteworthy conditions such as
insufficient ground time for baggage and crew connections and violation
of airport closure or curfew requirements.
Using a computer model‑based representation of the flight schedule
network, what‑if scenario analysis is employed to simulate flight
operations in an attempt to determine the down‑line impacts of pending
flight decisions. Controllers can specify a set of conditions such as
flight delays, cancellations or aircraft reassignments to identify and
rectify potential conflicts. Smart simulations supported by databases
provide a mechanism to assess the feasibility of flight operations and
answer questions such as, “Can this aircraft be assigned to this flight
routing which goes over water?” or “Is this aircraft’s noise profile
suitable to the noise abatement and curfew profiles at the destination
airport?”
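A what‑if check of down‑line impacts can be sketched very simply. The Python fragment below is an illustration only, not the system described here: it follows a single aircraft through its routing and pushes a departure delay forward. The Flight record, the flight numbers, the 40‑minute minimum ground time, and the assumption that block times do not change are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Flight:
    number: str  # hypothetical flight identifier
    dep: int     # scheduled departure, minutes after midnight
    arr: int     # scheduled arrival, minutes after midnight


def propagate_delay(routing, first_delay, min_ground=40):
    """Return {flight number: departure delay in minutes} for one aircraft.

    routing:     flights in the order the aircraft operates them
    first_delay: delay imposed on the first flight's departure
    min_ground:  minimum turn time required between flights (assumed)
    """
    delays = {}
    prev_actual_arr = None
    for i, flight in enumerate(routing):
        if i == 0:
            delay = first_delay
        else:
            # The aircraft cannot leave before it has arrived and turned.
            earliest_dep = prev_actual_arr + min_ground
            delay = max(0, earliest_dep - flight.dep)
        delays[flight.number] = delay
        # Assumes block (flying) time is unchanged by the delay.
        prev_actual_arr = flight.arr + delay
    return delays


# Hypothetical one-aircraft routing for a single day.
routing = [
    Flight("F1", dep=480, arr=600),
    Flight("F2", dep=660, arr=780),
    Flight("F3", dep=900, arr=1000),
]
print(propagate_delay(routing, first_delay=90))
```

In this made‑up routing, slack in the schedule absorbs part of the initial delay at each turn, so the third flight departs on time. A real tool would also have to follow crews, passengers, gates, and curfews, which is where the feasibility questions quoted above come in.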
Experimentation with more sophisticated OR model‑based systems
that support flight re‑scheduling and aircraft re‑assignment
decision‑making has begun to bear fruit. In a recent paper in Transportation
Science, Jarrah et al. reported on development and implementation of
a minimum‑cost network flow model framework for supporting flight
cancellation and delay decision‑making at United Airlines. The model
was applied to support real‑time operations for United’s “hub airports”
in Chicago, San Francisco and Denver. The model generated effective,
implementable solutions in reasonable time which were in many cases
superior to solutions generated by experienced flight controllers in
terms of the number and magnitude of flight delays and cancellations
required.
Obviously, the level of model sophistication for supporting real‑time
operations is constrained by the limited amount of time available for
solution generation and implementation. However, the rapid advance‑
ments and cost‑effectiveness in desktop computing power (i.e., engi‑
neering workstations), combined with innovative optimization‑based
heuristic and network modeling approaches, will provide opportuni‑
ties to apply more rigorous tools for solving such complex, operational
problems. The key to the effectiveness of such decision support tools,
however, will always be driven by their relevance, applicability and
usability in the eyes of the human flight controller, since they have the
ultimate responsibility for the decisions that get made and the method
employed.
CONCLUSIONS
The decision support tools employed to solve operations planning,
scheduling, and control problems are adapted to fit each problem’s
REFERENCES
1. Douglas A. Gray, December 1992, “Airworthiness: Decision Support for
Aircraft Maintenance Scheduling,” OR/MS Today, pp. 24–29.
2. Nader Kabbani and Bruce Patty, “Aircraft Routings at American Airlines,”
Proceedings of the 32nd AGIFORS Annual Symposium, Budapest, Hungary,
October 1992.
3. Ahmad I. Z. Jarrah, Gang Yu, Nirup Krishnamurthy and Ananda
Rakshit, August 1993, “A Decision Support Framework for Airline Flight
Cancellation and Delays,” Transportation Science, Vol. 27, No. 3, pp. 266–280.
Introduction
I wrote this article in 1994 in an attempt to relate the concepts of airline yield
management to other industries, specifically manufacturing, in the form of
what I refer to as revenue‑based capacity management. Having seen the revolu‑
tionary impact of yield management on airline pricing and seat inventory
management – post‑U.S. airline industry deregulation in 1979 – I thought
there might be an opportunity to apply these concepts, models, and technology
to other industries outside of transportation. Beyond airlines, yield management
has been widely and successfully applied to hotels, cruise lines, rental
cars, tours, railroads (passenger and freight), and even self‑storage compa‑
nies. All of these industries share the basic characteristics suitable for yield
management, i.e., perishable commodity product inventory supply, vari‑
able (low brand loyalty) consumer demand, and highly competitive market
pricing.
The semiconductor (“chip”) industry uses a form of yield management
(YM) to decide how many of each type of chip product to make subject to
market demand forecasts, product profit margins, and volatile product yield
from the manufacturing process. Today, there is plenty of literature, and even
commercially available software products, that address YM in semiconduc‑
tor manufacturing.
At the time, I had the opportunity to do a small advisory project for a U.S.
paper products manufacturer that wanted to emulate and apply the types
of systems that airlines use for their paper products manufacturing plant
production planning. For the article, I utilized a hypothetical paper prod‑
ucts company as a contextual template for how to apply revenue‑based capacity
management. It was a nice theory, at best.
I personally never worked in manufacturing, so I never had the oppor‑
tunity to apply these concepts myself. However, 25 years later, when I was
teaching Business Analytics at SMU in the EMBA program, a student, who
was a deputy superintendent of a steel mill (making ingots, pipe, rebar, and
DOI: 10.1201/9781003588344‑10 93
LOGISTICAL NIGHTMARE
Operating an airline is a classic example of a logistical nightmare. The
already arduous task of scheduling flights and crews is further
complicated by unforeseen factors, such as weather and mechanical equipment
failures. In an industry where resource and labor costs are high
and profit margins are low (1%–2%), airline companies quickly realize
the absolute necessity of well‑planned and well‑executed operations to
ensure a profitable enterprise. Based on the performance of commercial
airlines recently, well‑planned and well‑executed operations do not
guarantee profitability. However, without logistical cohesion, the
commercial airline operation is doomed to certain failure.
Deregulation created a competitive environment that forced AA to
manage operations efficiently and cost‑effectively. Even more important
was the emphasis on allocating assets in a way that ensured optimal
revenue generation potential. Deregulation motivated AA to harness
the power of technology in nearly all facets of airline planning and
operations management. Leveraging the vast data resources of AA’s
SABRE™ Computer Reservations System (CRS) and related systems, the
TECHNOLOGY TRANSFER
Yield management is to airlines what revenue‑based capacity manage‑
ment is to manufacturers, i.e., how to best allocate available capacity
to satisfy customer demand and maximize revenue generation poten‑
tial. Airlines assign seats to fare class buckets, whereas manufactur‑
ers assign production resource capacity, namely plant, materials, and
human capital, to product lines, vis‑a‑vis customer demand for those
products and services.
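One way to see what assigning seats to fare class buckets means is Littlewood's two‑fare rule, a standard textbook result. It is used here only as an illustration; the text does not say which rule AA's system applies. The sketch below assumes high‑fare demand is normally distributed, and the fares, demand parameters, and function name are hypothetical. A manufacturer could read seats as units of production capacity and fares as product margins.

```python
from statistics import NormalDist


def littlewood_protection(high_fare, low_fare, mean_high, sd_high, capacity):
    """Two-fare seat allocation using Littlewood's rule.

    Protect seats for the high fare up to the level y where
    P(high-fare demand > y) = low_fare / high_fare.
    Assumes 0 < low_fare < high_fare and normally distributed demand.
    Returns (protection_level, low_fare_booking_limit).
    """
    ratio = low_fare / high_fare
    # Quantile of high-fare demand at probability 1 - ratio.
    protect = NormalDist(mean_high, sd_high).inv_cdf(1 - ratio)
    # Keep the protection level within the physical capacity.
    protect = min(max(protect, 0.0), capacity)
    return protect, capacity - protect


# Hypothetical flight: 100 seats, full fare 400, discount fare 200.
protect, low_limit = littlewood_protection(
    high_fare=400, low_fare=200, mean_high=60, sd_high=15, capacity=100)
print(protect, low_limit)
```

When the discount fare is half the full fare, the rule protects exactly the mean high‑fare demand; a deeper discount protects more seats for late‑booking, higher‑value customers. This is the same trade‑off between the opportunity cost of an unused resource and the cost of selling it at a lesser margin.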
A yield management system is an airline’s single‑most important,
strategic competitive weapon in the industry’s competitive war. It is a
system focused on identifying and classifying market demand, creat‑
ing and pricing products accordingly, and explicitly considering avail‑
able capacity limitations to maximize revenue generation potential.
Similarly, a capacity management system should be the single‑most
important tool for manufacturers competing globally. Although many
companies implicitly consider all of these factors, few have the infra‑
structure to explicitly link demand and capacity in efforts to optimize
profitability.
Although the resources involved are very different, the underlying
concepts are the same. Airlines make many of the same decisions that
manufacturing companies make every day, e.g., manpower schedul‑
ing, job‑to‑machine assignments. The primary difference between
the two industries is the sheer velocity with which fierce competition
was brought to bear on airlines, whereas in manufacturing the
effects of competition have crept up on and blindsided once prof‑
itable, stable companies. This competitive revolution in the airline
business forced the airlines to address these challenges and build
an extensive technology infrastructure to support activities associ‑
ated with day‑to‑day capacity management planning and operational
decision‑making.
Under Fire 97
CAPACITY MANAGEMENT
All companies, regardless of industry, must effectively manage capacity
to profitably manufacture products and provide services. Capacity
is a function of the number of resources available in a given time frame
and employed to manufacture a mix of products or provide a mix of
services. Capacity is a perishable commodity with which there is asso‑
ciated an opportunity cost for not utilizing a resource to service a cus‑
tomer. Alternatively, there is a cost with overbooking resource capacity,
as well as utilizing a resource at a lesser profit margin. Explicitly
considering all of these tradeoffs in managing capacity is a formidable task.
Capacity management begins with an in‑depth understanding of cus‑
tomer product demand patterns and resource capacity limitations. A
fundamental requirement to ensure effective capacity management is
a technology infrastructure that supports management of timely and
accurate information and decision support on resource capacity implica‑
tions. Conceptual models that represent customer demand patterns and
production resource capacity are valuable in establishing a company’s
objectives, and the impediments to achieving those goals. The volume
of information regarding demand and capacity implications mandates a
computer‑based infrastructure to manage information flow and support
decision‑making. Implementation of so‑called conceptual models may be
in the form of PC‑based spreadsheets or engineering workstation‑based
integrated model‑database platforms, or whatever platform is commensurate
with the size of the capacity management problem at hand.
American Airlines Decision Technologies (AADT) pioneered the
concepts of yield (revenue) management at AA and is the world leader
in successfully applying these concepts to other industries, including:
hotels, cruise lines, rental car and truck companies, freight and passen‑
ger railroads, television and radio stations.
AADT blends state‑of‑the‑art methods and technologies from opera‑
tions research and computer science to deliver customized capacity
management system solutions. Revenue‑based capacity management
systems could provide a strategic competitive advantage for manufac‑
turers by ensuring that available capacity is allocated with a customer
service focus.
FIGURE 9.1
Airline revenue‑based capacity management system – development.
FIGURE 9.2
Airline revenue‑based capacity management system – production.
PAPER MANUFACTURING
A hypothetical paper manufacturing company, ABC, Inc., produces
paper products for home use, e.g., paper towels, tissues, etc. The dia‑
gram in Figure 9.3 illustrates their business process. Marketing
FIGURE 9.3
Paper manufacturing company business process – assess market/customer demand.
FIGURE 9.4
Manufacturing‑based capacity management system vision – assess market/customer
demand.
CONCLUSION
Revenue‑based capacity management systems have played a significant
role in establishing and maintaining American Airlines’ position as a
world‑class competitor. No single event, however, has motivated the
use of customer‑ and revenue‑oriented capacity management systems
in manufacturing industries. Competitive forces, if recognized, will
encourage companies to look for ways to leverage available resources
to provide even better customer service, while maintaining profitability.
Capacity management technology should play a major role in this
process.
Revenue‑based capacity management is based on understanding
customer needs and wants, using customer product demand trends
to drive company objectives. Companies must plan production while
explicitly considering capacity limitations, using methodologies such
as finite capacity planning. Manufacturers must ensure that their
production operations are responsive to changes in customer demand
patterns and flexible enough to handle disruptions and still meet customer
delivery dates. Such responsiveness under reactive circumstances is
achievable only if control systems are prepared to handle the entropy
common in complex systems, where Murphy’s Law is the rule, not the
exception.
Introduction
This is a series of three short articles that I posted on LinkedIn between
2020 and 2023, which received a great deal of positive feedback because they
address the realities of how the real world of applying the analytical sciences
is quite different from what is taught in undergraduate and graduate school
programs. The second article specifically addresses three common reasons
why data science (DS) projects are not implemented, contributing to the 80%
of projects that never result in deployment or value capture. Lastly, I address
how an MBA has made me a better DS practitioner and leader for those inter‑
ested in pursuing that type of education.
executing DS and analytics in the real world – more than your MS degree
coursework prepared you to handle. These are challenges that require a dif‑
ferent set of “soft skills,” which are no less important than your “quant skills.”
First of all, the majority of companies are grappling with numerous complex
data challenges including the following: a myriad of (many times
conflicting) source systems, a need for better data governance to enable
accessibility, building a data lake, data warehouse, or now a data lakehouse or data
mesh to integrate data in one location, and moving to the cloud to handle
ever‑increasing data complexity and volume. All of these factors complicate
and sometimes stymie getting DS projects off the ground.
Even beyond data, there is a whole set of skills and domains that DS and
business analytics practitioners need to develop and address to be effective.
• Industry Knowledge
• Business Strategy (Corporate and Departmental)
• Business Processes & Mechanics
• Financial Statements
• Culture and Politics
• Change Management
• Project Management
• Information Technology
and operating profit margins. In airlines, crew and fuel costs can consume
over 75% of annual revenue. Financial statements are a crucially important
dialect in the language of business.
A company’s culture and political landscape will heavily factor in determin‑
ing the ultimate success of analytics, starting with executive, senior, and
mid‑level leadership. Is the company prone to make fact‑based decisions backed
up by data, models, and analysis, or do the HiPPOs rule (Highest Paid Person’s
Opinion)? If the latter, then watch out because when the “data speaks,” it usu‑
ally invalidates outdated assumptions and reveals inconvenient truths about
the performance, efficiency, and effectiveness of departments and business
processes, which can ultimately threaten the status quo of budgets, resources,
and power structures. This can be a political minefield to navigate, despite
your best intentions to improve the firm’s economic performance.
Analytics done well begets “creative disruption” by uncovering
inefficiency and proposing new ways of doing things better, i.e., more efficiently
and/or more effectively, delivering more value and output with fewer or the
same number of resources. If and when analytics models are implemented,
this type of disruption leads to everyone’s favorite topic: change. Managing
change is critical to analytics success, particularly working closely with busi‑
ness and IT partners from the start. No one likes being blindsided by major
disruptive impacts on their part of the firm, no matter how much value is
generated with DS. Communication, engagement, and edification are key to
managing change, as well as “what’s in it” for constituents.
DS and analytics are no different from any other major business endeavor,
e.g., developing and implementing a new process or system. The activity, or
project, must be organized and managed – scope, timeline, resources, budget,
and quality – engaging constituents on their turf and terms, not in a vacuum
or “lab.” The principles of project management apply, and are relevant and
critical to success, i.e., embedding a model in a process/system for ongoing
use. Time‑boxing activities with feedback loops is highly recommended to
get to a Minimum Viable Product (or Model) (in Agile‑speak) that is accurate,
useful, and generates measurable and substantial incremental business value
as soon as possible.
The interplay between DS and IT, and business partners, in big companies
is tricky. In many companies, the relationship between analytics and IT is
still being ironed out in terms of reporting relationships, responsibility for
selecting and implementing platforms, data management, and ownership,
and engaging with business partners on projects to implement solutions.
There is simultaneously both a distinctive difference and overlap between
data, DS, and IT scope and purview that necessitates a great deal of commu‑
nication, cooperation, collaboration, regular interaction, and everyone focus‑
ing on what they do best without getting in each other’s way.
In the two Business Analytics courses that I teach at SMU (to EMBAs in the
Cox School of Business and graduate DS students in the MS program), the
above topics are the focus. We leverage Tom Davenport’s books (Competing
Analytics Nontechnical Skills 109
While there is no way to entirely avoid any of the above factors, ensuring that
DS teams are in constant, close, and clear communication with the business
leaders/decision‑makers regarding budgets, priorities, and leadership direc‑
tion can help reduce the number of DS projects that get started but do not get
finished or implemented. There are many valid (and many less than valid)
business reasons, based on practical realities, why even effective DS models
do not get deployed. It is a fact of DS practice. All we can do is endeavor
to deliver the maximum business value possible on each DS project while
adhering to scope, timing, budget/resource, and quality constraints.
Notes
1 DELTTAA – Data, Enterprise view, Leadership support, Targets (KPIs),
Technologies, Analysts, Analytical methodologies
2 FORCE – Fact‑based, data‑ and model‑driven decision‑making, Organization
of analysts, Reinforcing a culture of analytical decision‑making and “fail
fast, test and learn,” Continual renewal of business assumptions and models,
Embedding analytics in processes
3 FACE – Frame, Analytically model, Communicate and act on model results,
Embed models in processes and systems
11
Top 10 Analytics Leadership
Skills (with Tom Davenport)
Introduction
This chapter presents the research I did based on my own experience
and observations in the industry, validated with renowned researcher
and author, Tom Davenport, PhD, and posted on LinkedIn. The chapter
addresses the top 10 attributes that all analytics leaders should develop
and embrace to be effective. As you can see, the majority of the attributes
are not at all technical. Managing and leading an analytics group is a lot
like running your own business. The endeavor is multifaceted and one in
which the technical aspects of the work actually consume a small fraction
of the leader’s time.
With all of the energy expended and hype generated around analytics,
data science, AI, and machine learning, an important question being asked
is: What skills do analytics leaders need to have, or develop, to be successful?
We describe ten of those leadership skills and traits in this chapter and
address what type of professional most likely has these skills. The list may
be useful for anyone seeking to hire a leader of analytics or data science
functions.
and keeping them engaged, happy, satisfied, and productive. Hiring for “soft
skills,” such as communications, work ethic, attitude, and cultural fit, is as
important as heavy‑duty technical skills.
Many leaders, understandably but completely unrealistically, are looking
for “unicorns,” i.e., someone who can do it all (Figure 11.1). You’ll see what I
mean if you take a look at pretty much any job description on any job post‑
ing for a data scientist. No individual is going to walk in the door and be an
expert in all four required skill areas:
Fortunately, however, these skills can be taught, but the bar is set depending
upon the skill area.
FIGURE 11.1
The four main required skills for data scientists.
Relationship Building
Trust arrives on foot and leaves on horseback. People don’t care how much you
know until they know how much you care. Trite cliché? Perhaps, but many ana‑
lytical leaders believe that trust is critical to success, and I can state with
certainty from my experience that is indeed the case. Analytics and data
science often expose huge opportunities for performance improvement,
which inevitably can make it look like someone isn’t doing their job or is
less than completely competent. Depending on the organization, it can take
years of painstaking effort and a lot of coffees and lunches to build trusting
relationships. A business leader has to have a big problem that they really
need your help with before they risk their budget, career, reputation, and
political capital on any project, let alone one involving a lot of math they
don’t fully understand.
Famed O.R. academic and practitioner, R.E.D. “Gene” Woolsey once said
that “A manager would rather live with a problem that they cannot solve, than imple‑
ment a solution that they cannot understand.” Getting them to understand the
solution requires a trusting relationship so they are comfortable that they
will benefit from the project, and not get burned in the end.
Change Management
Analytics and data science will often drive enormous changes in business
policies, processes, and procedures; organizations; and jobs. Although analytics
leaders may not need to be the “change management guru” in their
companies, they need to be very sensitive to the shock waves that analytical
results can have on an organization – the human impact as well as the
financial, operational, and economic implications. They should also proactively
engage and align with their change management specialists, if they
are lucky enough to have them, to assist with the implementation of the new
system and help ease the burden of the transitions. There will be inevitable
changes in data requirements, systems, policies, processes, organizational
structures, and decision‑making (inserting the “model” in the human
decision‑making loop).
To be successful, the business must believe that the new and improved
analytics‑based process and system is their idea and in their best interest. And
they should get to take all the credit for the value created.
My favorite quote on change comes from Niccolo Machiavelli in The Prince
(1532). Even though he was talking about change in social and political sys‑
tems, the principle applies to the change brought about by new systems
based on analytical science.
“It must be remembered that there is nothing more difficult to plan, more doubtful
of success, nor more dangerous to manage than the creation of a new system. For the
initiator has the enmity of all those who would profit by the preservation of the old
institution and merely lukewarm defenders in those who would gain by the new one.”
When we look at the success rates of analytics projects, and IT projects for
that matter, I believe it is clear that Machiavelli was onto something. Ignore
the impacts of change caused by data science at your own peril.
Project Management
Analytical and data science models and systems should always be executed
and delivered using a project management methodology, and
in today’s world, that is most commonly Agile (preferably Kanban for a
modeling project, or Scrum for a project that is part of a larger IT system
development effort) or Scaled Agile (SAFe, for extremely large, complex
enterprise‑level IT system projects). Scope, time, resources, and quality
are the four primary dimensions of projects (Figure 11.2), and often scope
(creep) is the most difficult to manage. Your goal should be to deliver more
scope than you promise – easier said than done. These four project dimensions
make up the “box” because no side of the square can lengthen or
shorten without some of the other sides, or the enclosed project footprint
area (L × W), changing as well; changing one side in isolation is not feasible.
As an analytics leader, you will
need to know how to wrestle with these factors and how they interrelate,
and there is at least as much art and nuance to it as there is science.
Most importantly, the project is not finished until you measure and capture
the business benefit and communicate and act on the results.
FIGURE 11.2
The four dimensions of project management.
Communication Skills
Even the greatest analytics or data science project using the most sophisticated
techniques and delivering substantial business value and return on
investment (ROI) may be rendered completely useless if you cannot convey
its value and impact. Communications must be aimed at the appropriate
audiences, including the business (at all levels, from the Board of Directors
and C‑suite down to rank‑and‑file individual contributors, and everyone in
between) and technical (from PhDs to BS quant and non‑quant grads) staff
alike. Analytics leaders must develop the skills of how best to communicate
complex concepts to each type of audience.
Some important communication pointers to utilize and master include:
There are many more exceptions to the profiles mentioned above. I have
seen folks with undergraduate degrees in everything from Philosophy to
History to Business, who are self‑taught and become superb practitioners
and later leaders in analytical sciences. Tom Cook, who led AADT, had a BS
in Mathematics, an MBA (from SMU), and a PhD in O.R. (from the University
of Texas at Austin), so there you go!
Introduction
The failure rates of data science projects are well‑documented. About 80%
of data science projects never get implemented, and about 80% fail to deliver
business value. As a practitioner, I was bothered by these numbers because this was
not my experience at all. I was fortunate to work for world‑class analytics
companies like American Airlines and Walmart, so I admit I was a bit biased.
This is a series of articles that began in 2020 as a PowerPoint presenta‑
tion that I made to address The Top 10 Reasons Why Data Science Projects Fail,
in response to the abysmal statistics about project outcomes. I presented
it to the Data Science Community of Practice inside Walmart Global Tech
and received a favorable response. Not long after, a colleague invited me to
present on the topic in a session entitled Analytics Leadership at the 2022
INFORMS Business Analytics Conference. The meeting room at the Marriott
in Houston was set up for about 75 people. More than 100 showed up, so it was
standing room only, with folks standing in the back, sitting in the aisle, and
lining the side walls. The response was overwhelmingly favorable, and many
people came up to me afterward and said that I should write a blog to capture
all of the great stories I told in my presentation. I then wrote about 50
pages to cover the topic, which was when I started thinking about writing a
book but realized I didn’t have enough content.
As luck would have it, I had another one of those career‑defining moments.
Kara Tucker, editor of Analytics magazine, published by INFORMS, reached
out to me and said that she had heard about my presentation at the INFORMS
Business Analytics Conference in Houston and asked if I would be interested
in publishing an article on The Top 10 Reasons Why Data Science Projects Fail. I
had found a home for the 50 pages, which were too long for one article, so she
agreed to publish them as a series of 12 articles. As the series commenced, I posted
the links to the articles on LinkedIn. Herein lies the power of the global reach of
social media. A fine gentleman whom I had never met or heard of reached out
to me via LinkedIn and asked if I would like to join forces, and content, to
publish a book on this topic, as neither of us alone had enough material for a
book. And the rest, as they say, is history.
Once again, the credit goes to Kara Tucker for introducing us to Randi
Slack, a publisher at Taylor & Francis | CRC Press, where, as luck would have
it, Kara used to work! Small world indeed!
Working together, Dr. Evan Shellshear and I researched, wrote, and published
Why Data Science Projects Fail: The Harsh Realities of Implementing AI
and Analytics, without the Hype, working closely alongside Randi Slack, our
publisher, and Kara Tucker, our editor. For a far, far more thorough treatment
of this subject, I highly recommend purchasing and reading our book. This
chapter comprises the article series that started me down the road to really
understanding in depth why data science projects fail.
Why do most data science projects fail to get deployed and deliver the
desired business value, outcome or economic impact? A series of monthly
articles will fundamentally focus on this question. We will explore and
explain the organizational and individual behaviors and factors that con‑
tribute to most data science project failures, which must be addressed and
consciously practiced to increase the likelihood of success. Spoiler alert: The
problem is not with the mathematics and technology but rather with the
actions of the people (practitioners and leaders) engaged in and the processes
employed to execute and manage data science projects.
As a long‑time proponent of Stephen Covey’s “The 7 Habits of Highly
Effective People” [2], I find two of his habits particularly useful in the
endeavor to create more successful data science projects:
1. Begin with the end in mind. Analytics expert, researcher and author
Tom Davenport said, “Models make the enterprise smarter; models
embedded in systems and business processes make the enterprise
more economically efficient.” This should be your end goal when
starting work on a data science project. You don’t want to just build a
model; rather, you want to embed that model into a mission‑critical
system that supports a key business process such that greater economic
efficiency (i.e., lower cost, greater revenue, improved customer
experience) can be achieved on an ongoing basis in an automated
manner with little or no human intervention, creating a flywheel
effect generating business value.
2. Sharpen the saw. Abraham Lincoln once said, “If I had six hours to
cut down a tree, I’d spend the first four sharpening the saw.” Most
undergraduate and postgraduate education program coursework in
data science (and related fields) is spent focused on mathematics and
computer science methods, skills, and technologies. Although this is
understandable because (1) considerable training in these domains is
necessary to become a data science practitioner and (2) this is what
university staff know how to teach, students enter the workforce
unaware of the more nuanced, subtle and harder‑to‑grasp aspects
and dimensions of executing, managing and leading data science
projects in the real world and corporate America. My intent here is
to help students, practitioners, leaders and executives “sharpen the
saw” and fill in the knowledge gap in their training and education
that heretofore was learned only through real‑world work experience.
professional education programs. Given how rarely data science
projects are deployed and deliver value, I felt compelled to share what I and others
have learned with the goal of helping practitioners and leaders be more
successful more frequently and avoid many of the common pitfalls associated
with these endeavors.
Originally, I presented this material to the Data Science Community of
Practice inside Walmart Global Tech in 2021, and then again in April 2022
in Houston, Texas, at an INFORMS Business Analytics Conference in the
Leadership Track under the title “The Top 10 Reasons Data Science Projects
Fail.” About 100 people attended my talk, in a room with a capacity of about
70, and the feedback was so overwhelmingly positive that several people said
to me, “You should really write all this information down!” That particular
invited speaker address was the genesis of this series!
The objective of the material in the series is to help make you a more
well‑rounded, self‑aware, and informed data science practitioner and leader by
learning from the experiences gained by others in the field who came before
in the spirit of “fail fast and learn.”
Although I utilize “data science” as a contextual delimiter, the series’ principles
apply equally to related adjacent fields that utilize data and mathematics
to model and solve business problems and phenomena, such as operations
research/management science, statistics, analytics, machine learning, artificial
intelligence (AI), business intelligence (BI) and more.
Samuel Smiles famously said that we learn more from failure than we do
from success. My approach therefore was to examine some of the primary
reasons I have observed data science projects fail – a top 10 list, if you will –
to highlight to data science practitioners the aspects and dimensions of their
projects that are more subtle, less tangible, and more difficult to grasp, but no
less critical, and of which they need to be more conscious to generate successful
outcomes with greater consistency and regularity.
“We learn wisdom from failure much more than from success. We often
discover what will do, by finding out what will not do; and probably he who
never made a mistake never made a discovery.” – Samuel Smiles
The series will comprise 10 articles, each tackling a different reason why
data science projects fail and how to address it.
Now, join me on the journey to find out why data science projects fail and
learn how to avoid making the same types of mistakes.
References
1. K. Troyanos, 2020, “Use Data to Answer Your Key Business Questions,”
Harvard Business Review, February 24,
https://hbr.org/2020/02/use-data-to-answer-your-key-business-questions.
2. Stephen Covey, 1989, “The 7 Habits of Highly Effective People,” New York: Free Press.
Data science applications vary greatly across industries and their respective
segments from energy (oil and gas, electric, wind, solar, generation, transmission)
to transportation (airlines, railroads, trucking, rental cars), healthcare
(providers, insurers, device manufacturers, pharmaceuticals), financial
services (banking, credit cards, credit reporting, mutual funds, hedge funds,
private equity, venture capital), manufacturing (automobiles, steel, consumer
packaged goods, semiconductors, food) and retail (big box, hardware, clothing,
housewares). Each of these industries has its own unique economics,
operating models, and competitive landscapes. It behooves
the data scientist to research and understand as much as possible about the
industry in which they are working.
Top 10 Reasons Analytical Sciences Projects Fail 125
Each corporation within a given industry or segment has its own competitive
and economic DNA (e.g., low‑cost provider versus premium
high‑margin provider), culture, and mode of operation. A data scientist must learn
and understand the following:
The company’s annual report and financial statements are a great source of in‑depth,
detailed information to learn about the above topics. (If you don’t have a BBA/
MBA/CPA, then find a friend in accounting or finance to help you get started!)
Inside your company, many, many different departments may be using
data science (or none, depending on the company’s data and analytical maturity).
The approach to data science and the problems to be solved are as varied
as the departments:
• Marketing
• Sales
• Manufacturing
• Operations
• Finance
• Accounting
• HR
You will need to understand the goals, objectives, business processes, metrics,
operating plans, and road maps of the organization with which you
are working to apply data science. You need to know how the work gets done,
including budgets, data, data systems, and software. You literally need to
learn to speak their language – and, yes, each department will have its own
vocabulary, terminology, and acronyms (corporate America loves acronyms).
Being a data scientist requires deep immersion in your industry, your company,
and your department to understand the domain problem space and be
able to contribute materially – your goal is not to appear to be the “math geek
with the fancy laptop” but rather to be a “team member who digs deep and
helps solve problems using some really powerful, specialized skills.”
• What is the problem that we are trying to solve, clearly and succinctly
stated?
• What is the key business question we are trying to answer?
• What is the desired business outcome?
• What is the end state of the model/system we build? How will it be
utilized?
• What is the “target” for improvement (e.g., cost reduction or conversion
rate increase)?
• What KPIs (key performance indicators) are relevant to measuring
economic impact?
• What experiments can we run to measure the before‑and‑after effect
of the model?
Reference
1. K. Troyanos, 2020, “Use Data to Answer Your Key Business Questions,”
Harvard Business Review, February 24,
https://hbr.org/2020/02/use-data-to-answer-your-key-business-questions.
Data issues are typically the No. 2 reason data science projects fail. Data
issues manifest themselves in myriad ways, as varied as the companies
attempting to manage and analyze their data. (See Tom Davenport’s book,
“Big Data at Work,” for a great resource in this domain:
https://www.amazon.com/Big-Data-Work-Dispelling-Opportunities/dp/1422168166.)
The first complaint I usually hear from clients asking me how to do analytics
in their enterprise, and from students when confronted with the reality of
having to do a real‑world data science project in one of my courses, is, “We
don’t have the data!”
That may be true because many companies, especially small to
medium‑sized businesses, are bereft of (automated) data altogether or lacking clean,
accurate, consistent, high‑quality, and high‑integrity data.
More often, what people really mean is that the data is not all in one place –
the most common data affliction of most enterprises. Data is literally scattered
among dozens, or even hundreds (no exaggeration), of enterprise
applications, legacy systems, databases, CSV files, data warehouses, data
marts, cloud accounts, third‑party systems and, yes, the ubiquitous Excel
spreadsheets.
A data scientist, in most enterprise instances, is not going to be able
to solve this problem alone. They need to partner up with IT, a database
administrator, a data architect, a cloud data engineer, or the chief data officer’s
data engineering team (if you are lucky enough to have one).
You should not try to “boil the ocean” (because you will figuratively
“drown”) by solving all of the enterprise’s data issues. Instead, stay
laser‑focused on organizing the data you need for your project. That will be
challenging enough.
Two main issues to focus on to enable your data science project are (1) data
integration and (2) data governance.
Historically, the database, data warehouse and data mart were the common
enterprise data stores. Recently, these have been superseded by the
data lake (usually unstructured, raw data landing zones) and now, yes, the
data lakehouse, which combines attributes of the warehouse and lake into
one entity. Regardless of the exact data platform, getting all of your data
cleaned, organized, and integrated into a single physical or virtual (accessible)
workspace or view is critical to enabling your data science project out
of the gate.
Some companies, such as large retailers, airlines, telcos, and financial services
firms, are blessed, and cursed, with enormous amounts of data, including duplicates
and sheer volume. This is a good problem to have from
a data richness perspective but can create logistical problems of storage
and management.
Data governance, including metadata, data lineage, and data stewards, is
a hot topic and an absolute necessity to ensure one version of the truth, consistent
data definitions, and usage patterns. Once again, the data scientist
will not solve this problem alone either but will need to partner with the
data governance team (if you are lucky enough to have one) or at least data
owners and stewards that control access to and governance of some or all
enterprise data.
Personally, I have been greatly blessed when it comes to data. I worked
for companies that were rich in data resources, relatively mature in the way
in which data were managed, and legitimately data‑driven and analytically
inclined, including American Airlines, Sabre, Southwest Airlines, Blue Cross
Blue Shield of Kansas City, and, most recently, Walmart.
At Southwest Airlines, I was in charge of the enterprise data warehouse
Teradata and ETL & Reporting (as well as advanced analytics), and
we delivered some very substantial, very challenging projects, such as the
Reservation Data Pipeline & Warehouse (3,900+ test cases) and Customer
Data Warehouse. We created whole new data structures to support a brand
new jet fuel demand forecasting, purchasing, and inventory management system,
replacing 150‑tab spreadsheets and pulling data from a half‑dozen legacy
and new transaction and information systems. That effort avoided millions
of dollars annually in superfluous jet fuel costs using advanced analytics.
But without integrated and governed data and automation, data science is
just math!
A/B testing is a standard and appropriate use of Pearson’s chi‑square test for
independence to determine whether there is a statistically significant difference
between the two ratios, such that if a material difference did exist, it would
indicate that one page is more effective than the other at attracting customers
to view the offer.
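As a minimal sketch of that test, the Python below computes Pearson’s chi‑square statistic for a 2 × 2 table of click‑throughs versus non‑clicks on two pages, using only the standard library. The counts and the function name are made up for illustration and are not from the airline example. For a 2 × 2 table the test has one degree of freedom, so the p‑value can be taken from the complementary error function; no continuity correction is applied.

```python
import math


def chi_square_2x2(clicks_a, views_a, clicks_b, views_b):
    """Pearson chi-square test of independence for two hit rates (1 degree of freedom)."""
    # Observed counts: rows are pages A and B, columns are click / no click.
    observed = [
        [clicks_a, views_a - clicks_a],
        [clicks_b, views_b - clicks_b],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [observed[0][j] + observed[1][j] for j in range(2)]
    total = sum(row_totals)

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if page and click-through were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed[i][j] - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: 120 of 1,000 views clicked on page A, 150 of 1,000 on page B.
statistic, p_value = chi_square_2x2(120, 1000, 150, 1000)
print(f"chi-square = {statistic:.3f}, p-value = {p_value:.4f}")
```

A small p‑value here says only that the two hit rates differ by more than chance would explain. It says nothing about revenue.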
The misapplication of A/B testing manifested when the manager wanted to
use the exact same test to determine which page would generate more revenue.
Whereas hit rate is a simple ratio of page click‑throughs to total page views,
revenue is a far more complex, multivariate, and multidimensional quantity.
The myriad variables that determine the total revenue on an airline website
transaction include, but are not limited to, the number of ticketed passengers
on the itinerary, origin‑destination market pair (e.g., DAL‑PIT and
DAL‑LGA), fare class, fare class bucket, purchase date relative to flight date and so
on. You get the picture.
To determine with statistical accuracy which landing page generates more
revenue, you would need to rigorously design a series of highly controlled
experiments to account for most, if not all, of these variables that materially
affect revenue to ensure that you are comparing “apples to apples” and not
“apples to oranges.”
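To make the “apples to oranges” risk concrete, here is a small Python sketch with entirely hypothetical transactions. It is not a designed experiment, only an after‑the‑fact comparison within one variable (market pair), and the numbers and function name are assumptions chosen for illustration. It shows how a naive comparison of average revenue per page can point the opposite way from a like‑for‑like comparison when the two pages happen to serve a different mix of markets.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical transactions: (landing page, market pair, revenue in dollars).
transactions = [
    ("A", "DAL-PIT", 200), ("A", "DAL-PIT", 220), ("A", "DAL-LGA", 400),
    ("B", "DAL-PIT", 180), ("B", "DAL-LGA", 380), ("B", "DAL-LGA", 390),
]


def mean_revenue_by(key_fn, rows):
    """Group rows by key_fn(row) and return the mean revenue of each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[key_fn(row)].append(row[2])
    return {key: mean(values) for key, values in groups.items()}


# Naive comparison: ignores which markets each page happened to serve.
naive = mean_revenue_by(lambda row: row[0], transactions)

# Like-for-like comparison: compare pages within the same market pair.
stratified = mean_revenue_by(lambda row: (row[0], row[1]), transactions)

print("naive:", naive)
print("by market:", stratified)
```

In this made‑up data, page B has the higher overall average only because more of its transactions fall in the higher‑fare market; within each market, page A earns more. A real study would have to control for the other revenue drivers as well, ideally by design rather than after the fact.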
In my experience, as a practicing statistician applying my education and
training, I find that a fundamental lack of understanding by professional
and citizen data scientists of the principles of experimental design, such as in the
airline web page example, is one of the biggest “gaps” in solving these types
of real‑world problems (i.e., comparing two alternatives against a metric). We
work in a world that is complex and multivariate with confounding effects,
and we must account for all of that in our data science projects. Entire books
have been written on the topic of experimental design, and the practical applications
of these techniques in industries as different as farming and pharmaceuticals
all share the goal of legitimate, logically and statistically accurate
results, conclusions and decision‑making.
FIGURE 12.1
The concept of experimental design, visualized using the example of testing two fertilizers.
– Fortune 5 EVP|CTO
Every company has limited capital and human resources in IT and data science.
There are never sufficient budget dollars and people to go around to
fund all projects. In cases I’ve observed in Fortune 50 companies, new project
demand exceeds the available budget by a factor of two to four.
Projects must compete for resources during each budget cycle based on their
relative potential to generate incremental business value.
Data science projects are no exception and are ultimately judged on their
ability to “move the needle” on economic performance. That said, in many
companies, a lot of political wrangling and “pet project” machinations go into
the decisions as to which projects get worked on, i.e., the HiPPO projects –
Highest Paid Person’s Opinion.
Fortunately, there are multiple rational, fact‑based, data‑driven frame‑
works that help estimate, gauge and compare value and inform data science
project decision‑making.
In a Harvard Business Review (HBR) article by Kevin Troyanos, “Use Data To
Answer Your Key Business Questions,” a heuristic rubric is offered to help
prioritize business questions using a two‑dimensional grid. (See Figure 12.2
for an illustration of The Key Business Question Grid.)
FIGURE 12.2
The Key Business Question (KBQ) grid.
FIGURE 12.3
A quantitative approach to prioritizing data science projects.
The primary takeaway from this exercise is that the multiplicative scoring
approach ensures that it is not just high business value projects that bubble to the top
of the list; rather, the magnitude of business value is tempered by the combined
effect of complexity and cost. Complexity, in effect, is an important surrogate
measure for risk, i.e., the more complex a project is, the more likely you are to
run into difficulties that end up manifesting themselves in timeline delays
and budget overruns that jeopardize the whole project. In the chart, we see
the highest scoring projects are those that have low‑to‑medium relative business
value moderated by (very) low complexity and low‑to‑moderate costs.
The highest‑value projects, in this example, happen to have the highest risks
and costs, which result in a lower score. This is actually a fairly commonly
encountered set of circumstances, i.e., high risk/cost, high reward.
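The behavior described above can be sketched in a few lines of Python. The exact formula behind Figure 12.3 is not reproduced in this text, so the scoring rule below is an assumption: each project is rated 1 to 5 on business value, complexity, and cost, and the score multiplies value by the inverted complexity and cost ratings so that lower complexity and lower cost raise the score. The project names and ratings are hypothetical.

```python
# Hypothetical projects rated 1 (low) to 5 (high) on each dimension.
projects = {
    "Project A": {"value": 5, "complexity": 5, "cost": 4},
    "Project B": {"value": 3, "complexity": 1, "cost": 2},
    "Project C": {"value": 2, "complexity": 2, "cost": 3},
}


def priority_score(ratings, scale_max=5):
    """Assumed multiplicative score: value x (inverted complexity) x (inverted cost)."""
    ease = scale_max + 1 - ratings["complexity"]      # 5 = very simple, 1 = very complex
    affordability = scale_max + 1 - ratings["cost"]   # 5 = very cheap, 1 = very expensive
    return ratings["value"] * ease * affordability


# Rank projects from highest to lowest score.
ranked = sorted(projects, key=lambda name: priority_score(projects[name]), reverse=True)
for name in ranked:
    print(name, priority_score(projects[name]))
```

Under these assumed ratings, the highest‑value project (Project A) ranks last because its complexity and cost multiply its score down, while the moderate‑value, low‑complexity project ranks first.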
Now is when the “fun” begins and people start debating and haggling
(i.e., arguing) over the individual and aggregate score for their respective
project(s). (This process should be accompanied by a more rigorous financial
analysis using NPV, ROI, and internal rate of return metrics.)
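For readers who want to see those three metrics side by side, here is a minimal Python sketch using their standard textbook definitions. The cash flows, the 10% discount rate, and the function names are hypothetical, and the internal rate of return is found by simple bisection, which assumes the net present value is positive at the low rate and negative at the high rate.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[0] occurs today, cash_flows[t] after t periods."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def roi(total_gain, total_cost):
    """Simple (undiscounted) return on investment."""
    return (total_gain - total_cost) / total_cost


def irr(cash_flows, low=0.0, high=1.0, tol=1e-7):
    """Internal rate of return by bisection: the rate at which NPV is zero.

    Assumes npv(low) > 0 > npv(high), which holds for the example below.
    """
    while high - low > tol:
        mid = (low + high) / 2.0
        if npv(mid, cash_flows) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


# Hypothetical project: 1,000 invested today, 500 returned in each of three years.
flows = [-1000.0, 500.0, 500.0, 500.0]
project_npv = npv(0.10, flows)
project_roi = roi(total_gain=1500.0, total_cost=1000.0)
project_irr = irr(flows)
print(f"NPV at 10%: {project_npv:.2f}, ROI: {project_roi:.0%}, IRR: {project_irr:.2%}")
```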
This is not just a theoretical exercise. When I was a VP of Engineering &
Product Management for a division of a $1.3 billion software product company,
we had a list of 2,000 feature modification requests (FMRs). We estimated
that we had capacity to do about 500 FMRs in a new major product
release. We used the above process to rationally, objectively, and as economically
and efficiently as possible narrow down the list of projects our engineering
team could realistically do in one release cycle. It really does work
in practice!
After the acceptance of the model and solution comes the substantive com‑
munication that must go into implementing the model as part of the business
process (the next installment will cover this and change management).
Stories as Communication
The most effective data scientists are storytellers. They tell a story of what life
was like before the model was developed and implemented and how life will
change (hopefully for the better) afterward. They start presentations by grab‑
bing the attention of the audience – in particular, executives who are prone to
reading the news, email, or their calendar on their mobile devices. The most
effective data scientists ask provocative rhetorical questions such as, “What
if I told you that we could increase sales (or decrease inventory costs) and make (or
save) the company an extra $X gazillion using data and data science?” Now you
have everyone’s attention! The key is to communicate in the language of your
audience – i.e., managers, executives, and domain experts – not data science!
Lastly, for any data science project to move forward, you will inevitably have
to address and adequately answer the age‑old question: “What’s in it for me, my
team, my department, the company?” As the late NBC Sports television executive
Don Ohlmeyer once said, “The answer to all of your questions is money.”
The answer may be operating or capital expenditure cost savings or avoidance,
increased revenue, increased customer satisfaction, or increased resource utilization,
all of which may lead to some economic improvement for the people
involved (like a bonus, raise, or promotion!) or the company at large (higher
stock price, increased dividend, increased profit sharing, etc.). Everyone wants
to understand how they, and their stakeholders and constituents, are going to
benefit by undergoing this cataclysmic change in their business process.
– Heraclitus
It is not the strongest or most intelligent who will survive, but those who
can best manage change.
– Charles Darwin
– N. Machiavelli [1]
As you might imagine, there was quite a bit of skepticism from the analysts
about the ability of a computer to generate a higher‑yield, more efficient
five‑year maintenance check and hangar schedule plan than
they could. Their skepticism turned first to incredulity and then quickly to
unmitigated fear and dread when my partner and I delivered an early version
of the software within a few months that, with the right input parameters
(running on a Mac IIcx desktop computer), could generate a five‑year,
600‑aircraft fleet maintenance and hangar plan schedule with optimized
~100% check yields in about 18 minutes (a process that used to take two or
three people weeks to generate one feasible plan with 80% check yields)!
At that point, the analysts took me aside and said something to the effect
of, “You are going to put the three of us out of work with that computer
program of yours!” I not only assured them that was not the case but also
predicted (and bet them a steak dinner) that they would all get promoted as
a result of their ability to use the new system to create more efficient,
cost‑effective maintenance plans in a far timelier manner than before.
Interactive Optimization
As it turned out, the system we created was very much a case of (AI) aug‑
mentation rather than replacement. The system employed a design frame‑
work (conceived at Georgia Tech in the late 1980s, where I earned my M.S.
degree) known as interactive optimization. The approach combines prescrip‑
tive optimization‑based techniques, including heuristics when appropriate,
and an evaluative simulation‑based approach to quickly generate optimized
schedules interactively with a human‑in‑the‑loop iteratively providing the
necessary inputs and feedback to guide and push the algorithm in the right
direction toward an optimized solution. Therefore, the human and system
work together, leveraging their respective strengths to generate better solu‑
tions quickly that neither would be able to deliver on their own. Humans
can more easily inspect a graphical Gantt chart representation of the schedule
and see where hangar capacity needs to be added or excess capacity taken
away to optimize check yields. A computer can add and subtract, albeit
really quickly, and store information, and an algorithm can be programmed
to automatically generate maintenance plans and schedules the same way a
human would, but far faster.
Suffice it to say the project was a success. My software engineer and I,
both working full time, delivered the first production version of the system
in about six elapsed months (12 labor‑months) and demonstrated how we
would achieve the originally targeted benefits over time, i.e., $454 million
in maintenance cost avoidance through increased wide‑body aircraft check
yields, along with multiple additional unforeseen benefits. By optimizing
yields and, in effect, pushing aircraft maintenance events out later in time,
but still within the Federal Aviation Administration’s (FAA) legal limits, the
analysts used the model to open up additional hangar space, which allowed:
The analysts not only received promotions but also became a valued, trusted
resource to the executives, including the senior vice president, as a result
of their ability to “see into the future” with confidence and accuracy and to
evaluate all manner of various planning scenarios with the new model/system
that they never could have dreamed of doing before.
Keys to Success
What were the key change management factors that made the project a success?
There were clear goals, objectives and a well‑defined project scope,
including a tangible business value target. To start with, I personally spent
the first six weeks of the project literally sitting and working side by side
with the analysts at the maintenance base in Tulsa, Oklahoma, learning
about and understanding the art and science of scheduling aircraft maintenance
and hangar facilities, the data and decision‑making until I could
do the job myself. I listened two‑thirds of the time and asked questions the
other third.
As a team, we had regular status and update meetings every time my
partner and I hit a noteworthy milestone and deliverable (what today we
would call Agile‑Minimum Viable Product) at each stage of development of
the model, algorithm, and schedule GUI. There was ample two‑way com‑
munication – i.e., we demonstrated what we had done in detail, and the ana‑
lysts provided constructive feedback and guidance to validate the model’s
performance and results. I continually reassured the analysts that the system
was designed not to operate “completely autonomously” but rather for them
to operate it and “drive it” iteratively and interactively, much like a driver
directs an automobile with inputs from the gear shift, accelerator and brake
pedals, and steering wheel to reach their destination.
The changeover from a cumbersome, manual, spreadsheet‑based process
to a streamlined, automated and interactively optimized process was orchestrated
to reduce fear of, and instill confidence in, the new solution. We endeavored
to make the transition to the new system as seamless and stress‑free as
possible by reusing all of the same data, terminology, scheduling logic, KPIs,
report formats and familiar visualization tools in the software GUI, such as the
Gantt chart from the historical wall‑hung paper maintenance and hangar
schedules. That way, the learning curve on the new system was not very
steep at all.
The interactive optimization approach, based on the analysts’ own step‑
by‑step processes, also made the analysts feel much more comfortable with
the solution, rather than being a “black box” that they didn’t understand.
One of the analysts even referred to the new system as “a big calculator” that
could take the input data and output an optimized five‑year maintenance
schedule and hangar plan. A great metaphor indeed [3]!
The best way to “grease the skids” of change management is to deliver significant,
tangible, measurable business value that can be categorically attributed
to the new model/solution as demonstrated by before‑and‑after experiments.
We did that in this case, and the “after scenario” delivered far more and better
solutions faster than the analysts could have ever imagined. This made
for a super easy justification for change. It’s rarely that easy, but sometimes it
can be. (Promotions, raises and escalation of one’s status in the organization
go a long way toward acceptance!)
I went on to use this exact same approach multiple times during my career,
including once at another airline to build a new jet fuel supply chain pur‑
chasing and inventory management optimization system. As with the main‑
tenance scheduling scenario, the science and technology were sophisticated
and substantive, leveraged augmentation versus replacement, and got the
job done. That said, it was the “soft skills” that really made the difference.
The jet fuel supply chain business folks were quite attached to their 150‑tab
Excel spreadsheet that they had been using for 30 years, and they did not
necessarily want to trade it in for a “new and improved” data‑driven, ana‑
lytically based forecasting, purchasing and inventory optimization model
suite. In fact, they initially put up quite a fight. However, when the results
of a head‑to‑head “bake‑off” between the spreadsheet and the new models
were validated, the supporting case for the new models/system was made:
an eight‑figure annual cost avoidance opportunity generated by the models
in a matter of minutes, versus days and weeks by the status quo process!
The outcome was quite similar, with significant business value and
economic impact, and satisfied business stakeholders who benefited from
continuous close engagement with my team from the start of the project.
A smooth, seamless transition and a change management process focused
on large doses of communication, mutual understanding and empathy, and
iterative testing and validation made all the difference.
One of the critically important lessons that I personally learned the hard way
earlier in my career involved setting (un)realistic expectations.
There are two primary domains for expectation setting that plague IT and
data science professionals alike:
Numerous books have been written on Agile Scrum and Kanban estimation,
and much of estimation is an art and a science learned over time from lots of
practical experience. My only recommendation here is to balance conserva‑
tism and stretch goals. It is always better to be a bit early than late, relative to
the promised deadline.
In the second domain of business value and economic impact, balancing
conservatism and stretch goals is also advisable. My favorite graphic to illus‑
trate this point is shown below.
We establish an expectation‑setting continuum. See Figure 12.4 for a graphical
illustration of the Expectation Setting Continuum explained below.
FIGURE 12.4
The art of expectation setting.
At the far right of the spectrum is where we set the bar for benefits too low,
sailing over the bar too easily and blowing away our target. This approach,
known as “Sandbagging,” tends to lose a customer’s confidence because
they perceive the data scientist as not being aggressive enough in targeting
potential business value.
At the left end of the spectrum, we set the bar for benefits too high and
fail to deliver the promised business value. This approach, known as
“Overcommitting” – or an “epic fail,” as millennials might say – can get you
into really big trouble with customers (and their bosses/executives) because
they were expecting you to deliver a monumental economic impact and you came up
way, way short (e.g., they expected fireworks but got sparklers).
An example of sandbagging versus overcommitting would be achieving a
$100 million verifiable cost avoidance after promising $10 million or $1 billion,
respectively. With sandbagging, you beat your target by a factor of 10,
and with overcommitting, you miss the mark by a factor of 10. Both are bad, but
overcommitting can be politically irredeemable and career jeopardizing.
In business school, MBAs (of which I am also one) are taught to always
analyze (at least) three outcome cases in any analysis or projection modeling
scenario:
The process of estimating benefits is similar, and the last case is in the middle
of the expectation‑setting spectrum, in which we try to balance our (and the
customer’s) optimism and pessimism and use an expected or average case to set
a target we can hit (or even exceed) without being too far off in either direction.
I call this approach the “target zone.” To continue the example above,
we would estimate, say, an $80 million benefit and deliver $100 million.
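The arithmetic behind the three positions on the continuum can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the 0.9 and 1.5 cutoffs that bound the target zone are assumptions chosen for the example, not thresholds taken from the text.

```python
def delivery_ratio(promised: float, delivered: float) -> float:
    """Return delivered value as a multiple of the promised value."""
    if promised <= 0:
        raise ValueError("promised value must be positive")
    return delivered / promised


def classify(promised: float, delivered: float,
             low: float = 0.9, high: float = 1.5) -> str:
    """Label an estimate. The low/high cutoffs are illustrative assumptions."""
    ratio = delivery_ratio(promised, delivered)
    if ratio < low:
        return "overcommitted"
    if ratio > high:
        return "sandbagged"
    return "target zone"


delivered = 100e6  # the $100 million actually delivered in the example
for promised in (10e6, 80e6, 1e9):
    ratio = delivery_ratio(promised, delivered)
    label = classify(promised, delivered)
    print(f"promised ${promised / 1e6:,.0f}M: {ratio:.2f}x -> {label}")
```

With the figures from the example, promising $10 million yields a ratio of 10, promising $1 billion yields 0.1, and promising $80 million yields 1.25, which falls inside the assumed target zone.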
The upper end of the target zone, and just beyond, is sometimes referred
to as BHAGs, or Big Hairy Audacious Goals (see “Built to Last: Successful
The Standish Group has published their “CHAOS Report” for nearly
40 years, chronicling the failure of most IT projects to achieve scope, timing,
budget and quality goals (all four). As of their 2020 “CHAOS Report: Beyond
Infinity,” only 19% of all IT projects achieve these four lofty goals, a figure that has
not gotten much better in the past 40 years and, frankly, may even be worse.
Many projects will, of course, achieve a combination of some subset of these
goals, e.g., timing and quality but not scope and budget.
Why is this statistic even relevant in the context of data science project
failures? Beyond the data analysis and modeling phases, a.k.a. the “fun part,”
successful data science modeling projects ultimately evolve to become soft‑
ware and systems engineering projects. We recall Tom Davenport’s “Begin
with the end in mind” goal: “Models make the enterprise smarter; models
embedded in systems and business processes make the enterprise more economically
efficient.”
(I will delve more into the complexities and complications of transform‑
ing a model into a turnkey business system for planning or real‑time decision‑
making in a later installment.)
Building a turnkey production system‑based version of a model that supports
an enterprise business process of more than modest importance and
criticality will entail building interfaces (APIs) to multiple data sources, architecting
a microservice application around the model, built‑in model drift
detection, refitting the model, model assumption revalidation, error handling,
fault tolerance, high‑availability and high‑reliability robustness, and
failover capabilities to get to 99+% uptime (mission‑critical systems require
the elusive “5–9s” or 99.999% uptime). Having been through this process
many, many times myself, I have found that strong project management (PM) practices,
processes, skills, discipline and judgment are critical to successfully achieving
scope, timing, budget and quality goals, or at least not the null set.
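To make the uptime targets concrete, the short Python sketch below converts an uptime percentage into the downtime it permits per year. It is a back‑of‑the‑envelope illustration: the function name is hypothetical, and a 365‑day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def allowed_downtime_minutes(uptime_percent: float) -> float:
    """Minutes of downtime per year permitted by an uptime target."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


for target in (99.0, 99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% uptime allows about {minutes:,.1f} minutes of downtime per year")
```

Under these assumptions, 99% uptime leaves roughly 5,256 minutes (about 3.65 days) of downtime per year, while 99.999% leaves only a little over 5 minutes, which is why the last few nines are so costly to engineer.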
FIGURE 12.5
The four dimensions of project management.
• Underestimating scope.
• Overcommitting on scope.
• Underestimating complexity (e.g., technological, architectural,
change management, and system integration).
• Overestimating team capacity and capabilities, especially new staff
or newly formed teams (there is always a ramp‑up, learning curve
period).
• Overprioritizing too many features (i.e., every feature cannot be
Priority Level 1 for Release 1).
• Unrealistic timeline to address scope and no slack in the timeline for
contingencies and unforeseen circumstances.
• Insufficient quantity of (adequately skilled) resources to address
scope.
• Project manager’s ability to succeed through adversity and make
decisions to course‑correct when things go awry (and they ulti‑
mately will).
I, for one, have always been enamored with the power and beauty of math‑
ematics. The notation is a language unto itself. Known as the “Queen of the
Sciences,” mathematics provides the tools to enable other sciences, such as
physics (which provides the foundation for all of the engineering fields) and
economics.
In capitalism, businesses are in business to make money and return that
money to shareholders while benefiting society along the way.
At the intersection of mathematics and business, fields like operations
research, management science, statistics, and now, analytics and data science
are intended to contribute to the betterment of the corporation’s economic
and financial performance. The mathematics and models are a means to an end,
not an end in themselves.
It is not uncommon, especially among recent graduates, to become exces‑
sively focused and a bit too enamored with the model and mathematics, the
algorithms and technology.
The Pareto principle (80/20 rule) can be of interest and application here, i.e.,
getting 80% of the benefit for 20% of the effort (or cost). Perfection is the enemy
of done! In business, most of the time, there is no need or willingness on the
part of management to expend that 80% of the effort to gain the last 20% of
the business value. The business needs an answer … and value delivered …
now. It doesn’t need to be perfect. It just needs to work and deliver against the
economic impact objectives.
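One way to apply the 80/20 idea when scoping a model or feature backlog is to rank candidate work items by value per unit of effort and stop once most of the value is covered. The Python sketch below is illustrative only: the `Feature` type, the function name, and the value and effort figures are all invented for the example, and the greedy ranking is a simple heuristic rather than an optimal selection method.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    value: float   # estimated business value, in any consistent unit
    effort: float  # estimated effort, in any consistent unit


def features_for_value_share(features, value_share=0.8):
    """Greedily pick features by value per unit of effort until the
    requested share of total value is reached.

    Returns (chosen names, share of total value, share of total effort).
    """
    total_value = sum(f.value for f in features)
    total_effort = sum(f.effort for f in features)
    ranked = sorted(features, key=lambda f: f.value / f.effort, reverse=True)
    chosen, value, effort = [], 0.0, 0.0
    for f in ranked:
        # Small tolerance guards against floating-point rounding.
        if value >= value_share * total_value - 1e-9:
            break
        chosen.append(f.name)
        value += f.value
        effort += f.effort
    return chosen, value / total_value, effort / total_effort


# Hypothetical backlog, invented purely to illustrate the calculation.
FEATURES = [
    Feature("A", value=50, effort=5),
    Feature("B", value=30, effort=15),
    Feature("C", value=10, effort=30),
    Feature("D", value=6, effort=25),
    Feature("E", value=4, effort=25),
]

names, value_pct, effort_pct = features_for_value_share(FEATURES)
print(names, f"{value_pct:.0%} of value for {effort_pct:.0%} of effort")
```

Real backlogs rarely split this cleanly, but the same calculation shows where the incremental value per unit of effort begins to fall off.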
The Agile principle of Minimum Viable Product (or Model) is directionally
correct and applicable as well. Get to a version that builds, works and generates
value ASAP. See Figure 12.6 for a representative illustration of a Data Science
Project Lifecycle including typically encountered estimates of Project Phase
(I‑V) activities and durations, and the total elapsed time to begin generating
business value on enterprise‑scale ADSAI projects.
In most businesses with which I have worked, the goal was the minimum
possible elapsed time to value realization. In fact, in status reports that go to
senior management, project update entries must have a business value attached,
or they are omitted forthwith. No technical jargon or detail is even allowed. It
is implicit and assumed that the correct model form was utilized, tested and
validated, as was the business value.
Excessive tweaking, refinements and feature additions or modifications,
for little or no measurably incremental gain, are a waste of the company’s
time and resources.
I had a team of EMBA students whose final project in a Business Analytics
course was focused on improving the accuracy of (binary classification) models
to predict mortgage loan defaults operating on a large volume of historical
loan performance outcome data (a technique that would have come in handy
circa 2008–2009). The students filled 20 PowerPoint slides with mind‑bending
mathematical models, arcane terminology and symbols, and spent 19 of their
20 allotted presentation minutes talking about all of the different mathematical
and statistical models that they had built – Fast Fourier Transforms, Bayesian
inference, neural networks, etc. In minute 20, I finally raised my hand and
asked, “What was the business outcome result you achieved?” They responded,
“Oh, wow, we increased the accuracy of mortgage loan default prediction to over 93%!
On their typical loan portfolio, the new and improved model was going to avoid tens
of millions of dollars in bad loan default write‑offs annually!” To which I responded,
“In the future, when presenting, especially to executives, please start with that information.”
In journalism, this practice of holding back the most important pieces of
information until the end is called “burying the lede.”
The moral of that story should also be applied when presenting to execu‑
tives inside your company. No one, except perhaps other mathematicians at a
conference, cares about all of the technical details. Save that for the appendix.
Instead, tell a story of what life was like before and after the model was imple‑
mented. Focus on the improved business solution and the incremental busi‑
ness value and economic impact that was achieved in terms of cost, revenue,
asset utilization and customer satisfaction – things they understand in terms
they use every day. Explain how much time and effort will be saved with a
streamlined, automated process. Refer to the controlled experiments that were
run to prove out the model’s value, and the testing and validation, with the
business domain and finance folks. They’ll want to know that you can “show
your work,” but they don’t need to see or hear about all of the math and code.
FIGURE 12.6
The lifecycle of a data science project.
[Figure content: Data Science Project Lifecycle. Phases: business process analysis, problem framing and determining the solution approach; model fitting, validation, testing, scoring and evaluation; iterative acceptance, UAT, end user training and documentation; production deployment; ongoing operations, production support and bug fixes. Delay factors: data availability, access, gathering, cleaning and integration; problem complexity, scope and scale; customer delays and unforeseen factors; dependence on third parties for system integration; customer and third‑party resource availability.]
How long does it take to transform a model from the sandbox to become a
production system? The answer, as usual in business (and consulting), is, “It
depends (on many factors).” It can take months or sometimes years for large‑
scale, sophisticated systems to solve the most complex types of problems that
must be available, be reliable, and access large amounts of data across the
enterprise.
Some of the most significant factors to consider and gauge when determining
complexity include:
The first project I worked on from the “ground floor” (a bare C++ compiler
screen; see Part 6) was to build American Airlines’ aircraft maintenance
Both of these systems have a high level of dynamism, a high level of integra‑
tion, a high degree of mission criticality and a high degree of problem com‑
plexity and solution sophistication.
UPS’ ORION is solving 55,000 traveling salesman problems (an NP‑hard
class of problem), one for each delivery truck, based on package delivery information
scheduled for the next day. The system took more than 10 years to
build and deploy and cost $250 million, but saves $300 million to $400 million
in costs annually.
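For readers unfamiliar with the traveling salesman problem, the Python sketch below shows a textbook nearest‑neighbor heuristic for a single truck. It is not ORION’s algorithm, whose details are not described here; the stop coordinates and function names are hypothetical, and the heuristic generally produces a reasonable tour rather than the shortest one.

```python
import math


def route_length(points, order):
    """Total length of the closed tour visiting points in the given order."""
    return sum(
        math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )


def nearest_neighbor_tour(points, start=0):
    """Build a tour by always driving to the closest unvisited stop."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        here = points[tour[-1]]
        # Ties are broken by the lower stop index so the result is repeatable.
        nxt = min(unvisited, key=lambda i: (math.dist(here, points[i]), i))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


# Hypothetical stops: a depot at index 0 plus four deliveries on a grid.
stops = [(0, 0), (0, 2), (3, 2), (3, 0), (1, 1)]
tour = nearest_neighbor_tour(stops)
print(tour, round(route_length(stops, tour), 2))
```

The number of possible tours grows factorially with the number of stops, which is why production systems lean on heuristics and specialized optimization methods instead of enumerating every route.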
AA’s DINAMO was performing 22,000 passenger flight demand forecasts
nightly and dynamically optimizing seat inventory pricing using a mixed‑integer
linear programming model for thousands of flights and 600 aircraft
per day (circa 1991). The system took several years and cost hundreds
of millions of dollars to build and deploy (billions if you count Sabre, the
Conclusion
The goal of data science, and related fields like analytics, is to help solve complex
strategic, tactical, and operational problems, support and better enable
data‑based, model‑driven decision‑making, and answer key business questions
in such a manner that business value is created and economic impact is maximized.
Tom Davenport set the bar necessarily high when he said, “Models make
the enterprise smarter; models embedded in systems and business processes
make the enterprise more economically efficient.” For data scientists, this is our
desired end state, and the end in mind with which we begin our endeavors.
Like any scientific endeavor, experimentation and data collection and analysis
are a part of the process, combined with the use of advanced mathematics
and sophisticated software and computer technology. Notwithstanding
all of this science and technology, the practice of data science takes place in
the private sector (e.g., business, industry, research) and public sector (e.g.,
government, military, law enforcement), all of which are inhabited and oper‑
ated by human beings.
Human involvement in data science is substantial and materially significant
in every step of the process, requiring the development and application of
many “soft skills” that facilitate successful execution and completion
of data science projects.
Of all the soft skills required, I believe that communication is by far the
most critical and foundational to successfully execute data science projects.
Communication in all forms, specifically:
• Listening, to understand
• Being heard, to be understood
• Speaking and writing concisely, impactfully and with clarity for the
target audience
• Gaining a deep level of mutual understanding in all aspects of a
project
Communication is critical for all parties involved, especially the data sci‑
entist and business manager (“customer”), as well as others including data
Project Challenges
Understanding the business problem that you are trying to solve is often the
point at which data science projects go awry. Sometimes the business folks
themselves are not completely clear on what the real problem is. Therefore,
we should not be surprised that the data scientist may need to do quite a
bit of investigating, along with the business folks, to determine the problem
to be solved. Sometimes, the business folks do understand the problem, but
there is a breakdown in communication, such as lack of a clear explanation
from the business or failure to adequately listen and ask clarifying questions
on the part of the data scientist, that inhibits mutual understanding of the
problem. Getting to a clearly stated and mutually understood problem definition,
as well as associated business process flows, data flows and decision‑making
processes and criteria, is foundational to initiating and successfully
completing a data science project.
The challenges associated with data are many and will continue to hin‑
der data science projects. Historically, challenges include not having enough
data or not having it in one place for analysis, and going forward, having too
much in too many forms and in too many locations. Great strides are being
made in the fields of data engineering and data governance and the development
of technology platforms that support these endeavors. The volume and
dynamism of data generated by myriad enterprise systems, e‑commerce and
social media platforms, IoT devices, etc., will continue to generate more data
than most enterprises can realistically, let alone easily, manage. The key for
successful data science projects is to focus on the data that you must have for
your project to get to MVP/M (Minimum Viable Product or Model). You can
always add in relevant data when it becomes available down the road.
Misapplying a model often occurs when faulty or improper assumptions
are made about the applicability of a particular model form or its usage to
solve the problem at hand. Experimental design is a critically important
skill that is often lost on citizen data scientists, and some professional data
Managing Change
Data science projects induce inordinately large amounts of change. Data sci‑
ence fundamentally and even radically changes the way that problems are
solved, questions are answered, and decisions are made. In general, the tran‑
sition to becoming a fact‑based, data‑driven enterprise is transformational
and fraught with many dimensions of change, including moving away from
gut instinct and Excel‑based heuristics and rules of thumb to more rigor‑
ously rational model‑based approaches to complex problem‑solving and
decision‑making. Data scientists may lead the way, but everyone must go on
the journey together. Data scientists may inform and teach others how these
advanced techniques and technologies function, but everyone, from analysts
to managers to executives, must “buy in” to be successful both on individual
projects and the overall transformation driven by data science methods and
stakeholders.
Communication plays a critical role in managing change and “winning
hearts and minds.” Storytelling using before and after comparisons includ‑
ing lots of data visualization to highlight the business impact generated by
and your professors, care about the model, techniques or technology. People
in business care more about the business value and the top‑ and bottom‑line
economic impact of your data science than about the math or code, and
exponentially more so the higher up the leadership chain they are. They trust
that you “did the math,” but they don’t want to hear about it. Sorry.
My advice is to not let yourself get “wrapped around the axle” with a lot
of nuanced, overly sophisticated mathematics for its own sake on corporate
data science projects. Please do yourself, your constituents and your stakeholders
a BIG favor, and save the math and code for the Appendix of your presentation,
for industry conferences and symposia, refereed academic journal
publications, and your Data Science Center of Excellence and community of
practice meetings. Always remember the Pareto principle (deliver 80% of the
value for 20% of the effort), Minimum Viable Product/Model (MVP/M), and
that perfection is the enemy of done. The model and code need to be verified and
validated, but not perfect.
The final and highest hurdle to achieving data science project success
is getting your model from the sandbox (of your desktop or cloud‑based
work area) to a full‑fledged production system (e.g., microservice or stand‑alone)
embedded in a high‑value business process. Availability, reliability,
and repeatability are necessary for your model/system to achieve the “flywheel”
of continuously ongoing business value creation without regular
human intervention. This process/journey requires a team to realize the
endgame – business people (i.e., executives for funding and political “air
cover,” line managers to drive change, and individual contributors to help
design, develop, test, validate and implement the solution), technology people
(e.g., software, cloud, security), data people and test/QA people. It may
take months, years or even a decade, and may cost hundreds of thousands,
millions or hundreds of millions of dollars to deploy and implement completely,
depending on the scope and complexity of the problem and the level
of sophistication and operational criticality of the model/system solution. (It
is advisable to make sure that the benefits delivered by your solution are
proportional to the costs to build and implement the same, by whatever measures
and metrics the finance department/board of directors utilizes, e.g.,
NPV, IRR, and ROI.)
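For readers who have not computed these measures by hand, the Python sketch below works through NPV, ROI and IRR for a single project. The cash flows, the 10% discount rate and the function names are hypothetical, chosen only to illustrate the calculations; a finance department will have its own conventions for timing, discount rates and what counts as a benefit.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[0] occurs today, then one per year."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def roi(cash_flows):
    """Simple undiscounted ROI: net gain divided by total outlay."""
    outlay = -sum(cf for cf in cash_flows if cf < 0)
    gain = sum(cf for cf in cash_flows if cf > 0)
    return (gain - outlay) / outlay


def irr(cash_flows, low=-0.99, high=10.0, tol=1e-9):
    """Internal rate of return, found by bisection on npv(rate) = 0.

    Assumes NPV changes sign exactly once between low and high.
    """
    if npv(low, cash_flows) * npv(high, cash_flows) > 0:
        raise ValueError("no sign change in NPV on the search interval")
    while high - low > tol:
        mid = (low + high) / 2
        if npv(low, cash_flows) * npv(mid, cash_flows) <= 0:
            high = mid
        else:
            low = mid
    return (low + high) / 2


# Hypothetical project: $5M to build, then $2M of benefits a year for 4 years.
flows = [-5.0, 2.0, 2.0, 2.0, 2.0]  # millions of dollars

print(f"NPV at 10%: {npv(0.10, flows):.2f}M")
print(f"ROI: {roi(flows):.0%}")
print(f"IRR: {irr(flows):.1%}")
```

A positive NPV at the company’s hurdle rate, or an IRR above that rate, is the usual signal that the benefits justify the cost of building and operating the system.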
As with any human endeavor involving teams of people, whether it is a
co‑ed softball team, delivering a data science project, or an expedition climbing
Mt. Everest, empathy is the most important quality to embody regardless
of how incredibly difficult things get along the journey. And trust me, as
worthwhile as data science projects are in every respect, things will get difficult
at many, many points along the way, and you cannot afford to alienate
any constituents, partners, teammates, or stakeholders. People never forget
heroes, and they never forget jerks. You may (barely) get through one project,
but you will never get through another one by treating anyone who matters
with anything less than The Golden Rule.
A few tenets of advice that go along with empathy when times get tough
include:
Introduction
In the late 1990s, the combined field of operations research/management
science (OR/MS) was having a “crisis of confidence” in its collective abil‑
ity to deliver meaningful, real‑world business value results and economic
impacts. (Since then, OR/MS has undergone a branding and marketing
transformation to be closely associated with the field of analytics and is appli‑
cation‑ and mission‑wise adjacent to and closely aligned with data science
and AI – including machine learning, of course – which evolved from sta‑
tistics and computer science, respectively.) At the time, the OR/MS disci‑
pline was still evolving from being a largely theoretical discipline dominated
by academics with heavy emphasis on mathematics of more than modest
rigor toward a more applied discipline embedded in companies in complex
industries, such as transportation (e.g., airlines, railroads, package delivery),
energy, and financial services.
Peter Horner, then‑editor of OR/MS Today, the flagship magazine of
INFORMS (The Institute For Operations Research and Management Science),
asked me to write a commentary from the viewpoint of an OR/MS practi‑
tioner who had worked in and outside the field to provide a perspective on
where to find opportunities for success. Being the eternal pragmatic opti‑
mist that I am, I obliged the request because I had witnessed first‑hand the
extraordinarily widespread success of OR/MS at American Airlines and
Sabre during the initial 10 years of my professional career. I provided exam‑
ples of successes I experienced working in the OR/MS field at American
Airlines, as well as other successful, albeit smaller‑scale, analytical projects
my team and I implemented after I had transitioned to leading Sabre’s inter‑
net e‑commerce travel reservations software technology platform team (for
Travelocity, American Airlines, and multiple other customers).
Given the recent explosion and ubiquitous pervasiveness of applications
of analytics, data science, and, most recently, AI (think ChatGPT LLMs and
robots), such an article seems hardly necessary today. Thanks to an abun‑
dance of rich data from e‑commerce, IoT, and enterprise systems, and practi‑
cally unlimited computing power (think cloud and GPUs), these fields and
When projects are well‑executed and deliver value, you will literally find
yourself surrounded by success.
Successful OR/MS applications are all around us, if you know where
to look. Plenty of good OR/MS practice is hidden by job titles. Lots
of people do OR/MS, but don’t carry the title OR/MS analyst. I know
professional people, e.g., market researchers, investment analysts and
project managers, who effectively apply OR/MS methods in their daily
work to achieve significant, positive results despite the fact that their
primary educational training wasn’t in OR/MS. Product managers,
who must balance the cost to create a new product with market share
objectives to determine a competitive price, apply fundamental OR/
MS principles. I have one friend who is a manager of information tech‑
nology for a local police department who writes computer programs
that do everything from generate and analyze crime statistics to optimize
patrol routes and officer allocations. OR/MS is where you choose
to find it.
Alan Greenspan, chairman of the Federal Reserve Board, recently
spoke to Congress at length on C‑SPAN about the econometric models
he uses to reveal the mysteries of the economy, specifically the
Consumer Price Index (CPI), which attempts to measure inflation by
analyzing changes in the prices of basic consumer goods. However, he
also talked about his “personal rules‑of‑thumb” that he uses to validate
the models’ predictions, and gauge the “true health” of the economy,
e.g., calling his friend at the Bureau of Labor Statistics to see how many
unemployment claims were filed this week in major U.S. cities. He was
candid about the limitations of the sophisticated models to predict eco‑
nomic trends, but at the same time defended them, saying, “It is better
to be approximately right than certainly wrong.”
For more than 10 years, I have had the opportunity to apply many
different OR/MS methodologies to a lot of different real‑world trans‑
portation industry problems, ranging from discrete‑event simulation
analysis of airports to determine capacity of airspace and airfield structures
to mixed integer programming heuristics and scheduling algorithms
for scheduling aircraft maintenance activities and resources.
Surrounded by Success 167
OR/MS MINDSET
Recently, I transitioned into a new assignment in the travel distribu‑
tion technology practice area. I am currently responsible for a group of
60 software engineers focused on building Internet/WWW‑based con‑
sumer travel reservations systems, and travel agency Internet‑enabling
software applications. Needless to say, I thought my days as an OR/MS
professional were over. However, as I sat in meetings with my team and
my new clients wearing my “new IT director hat,” something interest‑
ing happened. Some very real and perplexing business problems began
to arise for which no one in the meetings seemed to possess a solu‑
tion. However, given my OR/MS training and experience, for me each
of the problems fit into a “class of problems” for which model forms
were readily available. After I had convinced myself that I was out of
the “OR/MS business,” I was still able to apply my OR/MS “mindset”
and accompanying “bag of tricks,” and add value by recommending
viable solution approaches to some of the pragmatic business problems
at hand. Here are just a few examples:
The full article is reprinted in this chapter, but the following insights from
Dr. Bell bear repeating:
Often we appear to be focusing on the development and application of
advanced theory or algorithms to try to get a few points closer to the true
optimum solution when we know that the data we are using is rough
and probably out‑of‑date, so the result is a really good solution to an
approximate problem. My data suggests that sometimes we might do
better by focusing on cleaning up the data, improving our understanding
of the real issue, and implementing much faster heuristics to find a
decent answer to a problem that is closer to reality.
We in analytics tend to think that the ‘answer’ we derive to an issue
is the end‑point, but for management it is often a starting point. The
‘answer’ is usually delivered to a highly intelligent manager (or team)
who has a thorough understanding of the business and the issue, and
who then merges our analytics work with personal experience and a
variety of opinion into planning a path forward. After choosing a course
of action, managers implement change, monitor the situation carefully
and make corrections. It is common in business strategy to say that the
success of a chosen strategy is ‘all in the implementation.’ The same can
be said of analytics; successful analytics can be simple models imple‑
mented very effectively.
A quote I use every term to introduce students to analytics comes from
Daniel Elwing, former president and CEO of ABB Electric, who said [2]:
‘[Analytics] is not a project or a set of techniques; it is a process, a way of
thinking and managing.’
Along with ‘a way of thinking’ and a ‘process’ that starts with data
collection, analytics adds value to the data through modeling, which in
turn adds value to the decision‑maker. Often, very simple models produce
substantial benefits.
without doing any math or statistics, and they have claimed a substan‑
tial revenue lift or cost reduction. These student projects provide inter‑
esting, and I think valuable, lessons about “analytics.”
REFERENCES
1. G. Woolsey and H. S. Swanson, 1969, 1975, “Operations Research for
Immediate Application: A Quick and Dirty Manual,” Harper and Row.
2. D.H. Gensch, N.A. Aversa and S.P. Moore, 1990, “A Choice‑Modeling
Market Information System that Enabled ABB Electric to Expand its
Market Share,” Interfaces, Vol. 20, No. 1 (January/February).
Peter C. Bell
Peter C. Bell is a professor at the Richard Ivey School of Business at Western
University, in London, Ontario, Canada.
https://pubsonline.informs.org/do/10.1287/orms.2016.04.12/full/
14
O.R. in 2048
Introduction
In 1998, Peter Horner, then editor of OR/MS Today magazine, asked me and a
few other industry professionals to provide a perspective on what
O.R. (operations research) would be 50 years into the future, in 2048. Quite
frankly, I hate making these kinds of predictions because I am not a “futurist,”
and I have a difficult time imagining what things might be like that
far out.
At the time of writing this book and collecting the articles, it was late
2023: 25 years later, or halfway through the prediction window. Although
there are still 25 more years to go until 2048, my predictions have turned out to be
directionally correct, so far at least, with some already coming to fruition.
Many of them were actually quite prescient.
Before you read the 1998 article to see how I and some others did on pre‑
dicting the future, let’s consider some of what has changed since I made my
prognostications.
Computing
Computing power, up through 2020 or so, was governed by Moore’s law,
popularly paraphrased as CPU speed doubling every 18 months while becoming
50% less expensive over the same period. Moore’s law has recently given
way to the era of graphics processing units (GPUs) in servers from companies
like NVIDIA, a pioneer in the domain, which enable powerful
tools such as OpenAI’s large language model (LLM) ChatGPT, reported to have
on the order of 1 trillion parameters. A new AI‑based LLM from Amazon called
Olympus is reportedly being trained with 2 trillion parameters and promises to be
even more powerful.
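To give a sense of scale for the 18‑month doubling rule of thumb quoted above, here is a minimal sketch of the compounding arithmetic. The function names and the constant are illustrative assumptions, and the calculation treats the doubling as perfectly steady, which real hardware never was.

```python
# Illustrative arithmetic only: the 18-month doubling period is the rule of
# thumb quoted in the text, not a measured hardware benchmark.

DOUBLING_PERIOD_YEARS = 1.5  # assumption: speed doubles every 18 months


def growth_factor(years: float, doubling_period: float = DOUBLING_PERIOD_YEARS) -> float:
    """Return the compounded speed multiple after `years` of steady doubling."""
    return 2.0 ** (years / doubling_period)


def relative_cost(years: float, halving_period: float = DOUBLING_PERIOD_YEARS) -> float:
    """Return the remaining cost fraction if cost halves every `halving_period` years."""
    return 0.5 ** (years / halving_period)


if __name__ == "__main__":
    # 25 years covers 1998 to 2023; 50 years covers 1998 to 2048.
    for horizon in (25, 50):
        print(f"{horizon} years: x{growth_factor(horizon):,.0f} speed, "
              f"{relative_cost(horizon):.2e} of the original cost")
```

Under that assumption, the 25 years from 1998 to 2023 compound to an increase on the order of 100,000‑fold, which is why even a rough version of the rule made computing power a safe thing to bet on.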
Software Tools
Most impactful to me is the rise in AutoML software tools for data scientists
to expedite model fitting and selection. Citizen data scientist desktops, such
as DataRobot, Dataiku, H2O.ai, and Alteryx, with drag‑and‑drop GUIs, can
remove much of the coding required to handle data, map business process
and data flows, and build, test, and validate models.
Python has risen rapidly over the past decade or so as a language for data
science modeling, challenging industry incumbents such as SAS, SPSS, and
even R, especially on cost, and providing ample features through rich code
libraries and “plug‑ins.”
People
People have evolved quite a bit in the past 25 years, and are more comfortable
with using computers at work for problem solving and analysis (albeit mostly
in Excel and using SQL queries and dashboards).
Many more people are now comfortable with analytical sciences (albeit
mostly in Excel, though shifting to tools like Python or data science desktops like
Dataiku), even those who are not formally trained and educated, but have
math, science, or engineering backgrounds and degrees.
The availability of rich, online training capabilities has expanded dramatically,
such as those found on Coursera, Udacity, LinkedIn Learning, Udemy,
DataCamp, and Kaggle, enabling self‑paced learning for those seeking to
expand their technical knowledge and strengthen their capabilities.
Unfortunately, most people in business are still not comfortable with STEM
in general, or with math and AI augmenting their problem solving, decision‑making,
or question answering, let alone automating their jobs so they can move
on to higher‑order work.
Most people, due to human nature, are still not any more comfortable with
change than they were in 1998.
My primary thesis about O.R. in 2048 was that the discipline goes “main‑
stream” and blends in as a “tool set” and “framework” for professionals
across the spectrum of business to analyze and solve complex problems. One
watershed moment for the commencement of this trend was the publication
of Competing on Analytics in 2007 (2nd edition, 2017), which heralded analytics
as a powerful strategic competitive weapon of business and industry. I
predicted that the obscurity of O.R. would persist in 2048; long before 2048,
INFORMS shifted its namesake focus to analytics. Smart move!
The evolution of O.R. would be fueled by the advancement of data and
computing power. Check!
The growth of the field would be ensured by no shortage of complex problems
to solve, e.g., omnichannel direct‑to‑consumer supply chains using
optimization software‑driven robots and driverless vehicles and drones!
O.R., along with the other analytical sciences, would continue to be enabled,
or hindered, by our ability, or inability, respectively, to market our solutions
and convince executive decision‑makers to invest in our projects to realize
business value and economic impact, and achieve strategic competitive advantage.
I strongly believe that OR will still be relevant in 2048, but may not
be distinguishable as a separate profession. OR may even be more
relevant as computing power increases, enabling the timely solution
of larger and more complex problems, and the general population’s
comfort level with using technology in their daily work increases as
well. I believe OR will “blend in” in the real world as a “tool set” and
a “framework” for a variety of professions to leverage in the solution
of complex, real‑world problems across many industries. I believe that
this trend will continue as OR is already being applied in many indus‑
tries and absorbed into many technology products that I will address
later in the article.
While it is extremely doubtful that OR professionals will rule the
world, the people who do rule the world (e.g., CEOs, presidents) will
have professionals with OR skills advising them in areas such as economics,
the military and logistics. Professionals with strong OR skills
have risen to high leadership positions in business and government
(e.g., Tom Cook at American Airlines, Alan Greenspan at The Federal
Reserve Bank). Therefore, I believe OR folks will most definitely continue
to influence the world in the future, if not actually rule it.
I strongly believe that technology advances will continue to have a
major impact on the OR field in 2048, especially in the areas of computer,
information and telecommunications technology. The pervasive
I will finish the story where I started because the sentiment sums up both my
career’s intent and legacy.
I have always had a profound appreciation for the power and beauty of
mathematics and a penchant and skill for applying a variety of modeling
techniques and technologies to solving complex real‑world problems in
business and industry to deliver tremendous business value and economic
impact. My instincts for framing and modeling complex business problems
have always been innately strong and continued to develop over time, which
I have shared with my teams, colleagues, and students. Developing skills
in leadership, communication, people management, project management,
and change management took much longer as did designing, building, and
deploying large‑scale software systems that automate complex business
processes.
Intense intellectual curiosity, supported by great intentionality and work
ethic, is a cornerstone of a successful career in the analytical sciences. That
means being driven by a strong desire to understand in great detail how a
process or system of any kind works today, and then being tenacious and
persevering in asking how to make that process or system work better, i.e.,
more efficiently: garnering the same or more output (revenue
or throughput), with the same or fewer resources (people, vehicles, or other
assets), at a lower cost (operating or capital). The most powerful word in your
lexicon is “Why?” Why does it work that way? Why did that outcome occur?
Analytical sciences and supporting technology provide you with a wide
array of powerful tools to use as economic levers, but your curiosity is the
fuel that powers those mathematical engines.
I am most proud of the business value and economic impact that my teams
have created, which, summing up across the years, is most certainly more
than $3 billion in cost avoidance and incremental revenue. The awards won
by the companies for whom I have worked and the teams I have led are
testaments to this recognizable, significant, measurable, and tangible success
in creating greater economic efficiency. I believe one should strive to leave
things in a better state than they were found, and that has been the case with
my work in industry.
Like most of us, I started out with nothing but a passion for learning, willingness
to work hard, and desire to earn everything based on merit, trust, and
strong relationships. I am living proof that anyone can build a respectable and
impactful career based on some fundamental inherent and learned skills
and capabilities, concerted and focused effort, and continued growth and
development. It is impossible for me to attach too high a value to my education.
My bachelor’s and master’s degrees in mathematics/statistics from
• If you are a leader now, or aspire to be one, the most important chapter
is Chapter 11 on Analytics Leadership Skills, not so much because
the content is earth‑shatteringly original or mind‑bendingly sophisticated,
but rather because many of the skills required are more akin to running
a business than being a data scientist, which may seem counterintuitive
and not so obvious to new leaders. Leading and managing is
about getting other people to do the right things and do things the right
way. Skills as a practitioner are useful to being an effective leader,
but they are secondary to all of the other skills listed.
Like so many of life’s journeys, your career journey itself is the reward – the
quality and impact of the work you do, the results you achieve and value
you deliver, and, most importantly, the people you meet and work along‑
side. Your career journey in analytical sciences may lead you to one of many
potential destinations, depending on what you want to do and what you do
best. Trust me, the universe will guide you while you endeavor to make your
way. You may want to be an expert individual contributor, e.g., Technical
Fellow or Chief Data Scientist; you may want to lead an analytics or technology
organization, e.g., Chief Analytics Officer/VP/Director; or you may
choose to get an MBA and become a GM, leading a division, large enterprise,
or your own company.
Regardless of where you end up in your career, what matters is not the destination,
office, or title, not how much money you make (although I hope you
make a lot!) or how many awards you win (or do not win; I am still chasing
that elusive Franz Edelman Award!). What matters is the journey, which is indeed
the reward, and the people with whom you travel that road. My advice is
to “become the champion of your own race,” that is to say, always strive to be better
tomorrow than you are today and endeavor to become indispensable and endear
yourself to those around you. Do not get caught up in endlessly comparing
yourself to others; we are all on different paths and move at different paces
over time. For a long time, I was greatly bothered by not having as much success
as others who were extraordinarily and uniquely successful professionally
and financially. If you spend too much time doing this, you will drive yourself a
bit crazy, and most likely end up feeling bitter, miserable, and less than. Success
results from your capability, capacity, will, and, yes, luck, e.g., right place, right
time, and right solution. Additionally, if one day you find yourself with more
success than Jeff Bezos, Elon Musk, Bill Gates, Larry Ellison, and others, then
that will be quite wondrous without a doubt. But if you do not, you can still be
a champion … the champion of your own race … if you added value, helped your
constituents and colleagues, became better every day, and had your share of
commensurate professional and financial success along the way.
Hopefully, the lessons presented herein will serve as valuable guideposts
on your career journey.
Additionally, while by no means unique to the analytical sciences, I have
found the following quotes to be valuable guideposts throughout my career.
You may find them useful as well.
If it can be done, it will be done.
– Latin proverb
(It is important to note that the bold are not assured fortune, as most of us
found out working for ultra‑high‑risk VC‑backed startups; rather, fortune
more often falls on those who act boldly.)
You have to take calculated risks in your career, job, project selection, and
solution approaches to make a significant impact and break away from the
pack. Be bold in digital transformation and economic transformation using
analytical sciences.
Was mich nicht umbringt, macht mich stärker (German) “What does not kill
me makes me stronger.”
Good things happen to those who plan; fail to plan, and you plan to fail. Then,
as a result of planning and preparedness, the ball tends to bounce your way
more often.
Sic transit gloria mundi (Latin) (“Thus passes worldly glory”) or All glory
is fleeting.
– The latter phrase, often attributed to Napoleon, was meant to imply that
fame or glory is transient, but when someone is forgotten it is forever.
People will remember your greatest accomplishments for a short time, so you
need a string of successes to propel your career forward and increase your
trajectory. (The old saws “What have you done for me lately?” and “Don’t rest
on your laurels” come to mind!)
Famously, “All glory is fleeting” is the last line of the multiple Academy
Award‑winning 1970 movie Patton (one of my favorite movies of all time,
certainly in the military history genre), about arguably the greatest military
field commander of the 20th century, whose controversial statements and
actions led to his diminishment in WWII history despite his extraordinary
battlefield leadership, exploits, and successes leading corps and armies in
North Africa and Europe to defeat Nazi Germany.
The “entropy of the universe and the effect of gravity” causes one’s for‑
tunes to rise and wane – you will not win every time, but that should not ever
stop you from leaning in to new challenges.
References and Bibliography
Original Articles
1. From OR/MS Today, October 1991 Issue, Graphic Evidence. “The Dual Challenge
of the OR Practitioner.”
2. From OR/MS Today, December 1992 Issue, Transportation. “Airworthy.”
3. From OR/MS Today, December 1993 Issue, Simulation: A Survey of Flexible Tools
for Modeling. “Broaden Perspective: Consulting concepts come to life for author
while working on American Airlines maintenance project.”
4. From OR/MS Today, April 1994 Issue, Strategic Supply Chain Management. “Right
Tool, Place, Time.”
5. From OR/MS Today, October 1994 Issue, The Future of Manufacturing. “Under
Fire: Lessons from the Front.”
6. From OR/MS Today, June 1997 Issue, The Profession’s Paradox. “Surrounded by
Success.”
7. From OR/MS Today, April 1998 Issue, OR in 2048: A Flight of Fancy into the Future.
“OR Almost Goes Mainstream.”
8. From Analytics magazine (INFORMS), a 12‑part series (2023–2024), entitled
“Why Data Science Projects Fail: Practical Principles for Data Science Success.”