The Future of Analytics
The New Landscape of Artificial
Intelligence & Machine Learning
Applications
REPORT
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. The Future of
Analytics, the cover image, and related trade dress are trademarks of O’Reilly Media,
Inc.
The views expressed in this work are those of the authors, and do not represent the
publisher’s views. While the publisher and the authors have used good faith efforts
to ensure that the information and instructions contained in this work are accurate,
the publisher and the authors disclaim all responsibility for errors or omissions,
including without limitation responsibility for damages resulting from the use of or
reliance on this work. Use of the information and instructions contained in this
work is at your own risk. If any code samples or other technology this work contains
or describes is subject to open source licenses or the intellectual property rights of
others, it is your responsibility to ensure that your use thereof complies with such
licenses and/or rights.
This work is part of a collaboration between O’Reilly and H2O.ai. See our statement
of editorial independence.
978-1-492-09173-8
[LSI]
Table of Contents
Introduction
2. Modern AI Applications
  The Anatomy of a Modern AI Application
  Detailed Application Examples for Key Industries and Functions
  CPG Sales Forecasting with COVID-19 Data
  Hospital Staffing Optimization (Healthcare Industry AI Apps)
  Marketing Lead Optimization (Line of Business AI Applications)
  Data Augmentation (AI Apps for Data Teams)
4. Adoption Challenges for Next-Generation Analytics
  Ineffective Data and AI Principles
  Lax Security Practices
  Inadequate Human Review
  Downplaying Traditional Domain Expertise
Conclusion
Introduction
1 See the following examples: “This Looks Like That: Deep Learning for Interpretable
Image Recognition”, “Intelligible Models for HealthCare”, “A Unified Approach to
Interpreting Model Predictions”, “Introducing AI Fairness 360”, “Introducing
TensorFlow Privacy”, and the What-If Tool.
formed machine learning from a field of black-box algorithms to a
field that is now capable of fierce debate around the concepts of
algorithmic transparency, accountability, and fairness. This techno‐
logical progress not only enables the nuts and bolts of regulatory
oversight, but it also gives companies the power to govern their AI
systems like the enterprise software assets they are. If you can block
out the hype, you’ll see that AI is really just software. And like all
other enterprise IT resources, AI systems should be documented,
managed, monitored, and governed.
Looking forward, I see successful AI deployments as those that recognize
the risks of AI, take on the associated governance burdens, and enable
humans to work together with computers to solve big problems. If your
business is climbing to the next plateau in digital transformation, don’t
settle for just any AI system. You’ll need AI systems that are documented,
transparent, managed, monitored, and minimally discriminatory. Moreover,
these AI systems must support AI apps that are flexible, explainable, and,
when appropriate, automatic. That’s why Dan, Rafael, and I have written
this new report, The Future of Analytics. It’s a necessary update to the
original Evolution of Analytics report, and we hope you find it to be a
timely and useful guide through the new world of AI-powered analytics apps.
— Patrick Hall
CHAPTER 1
The Converging World of Analytics
While historical trends are useful and can be predictive, they can
also provide a false sense of confidence when the future does not
mirror the past. In the end, historical information and descriptive
analytics alone leave business leaders to judge trends and make
decisions based on their own experience and limited view of the data.
The outcome then depends heavily on the individual decision-maker’s
expertise, so results vary widely from one decision-maker to the next.
Using machines to find patterns in data and make predictions is
another area of great promise for decision support. Machine learn‐
ing models, trained on historical data, can look at new data and pre‐
dict what is likely to happen. For example, credit card companies
use machine learning models to determine who has access to credit
and is likely to carry a balance on their credit card bill each month.
Such models can even prescribe actions for users to take and recom‐
mend products or content of interest. This ability to predict future
outcomes and prescribe actions has made machine learning a hot
technology—and data science a hot profession.
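As a minimal sketch of the train-then-predict pattern described above, the
following Python example fits a nearest-centroid classifier to a handful of
made-up historical records and then scores new ones. The feature names
(credit utilization, late payments), the data, the labels, and the choice of
nearest-centroid are all assumptions made for illustration; they do not
describe any card issuer’s real model.

```python
from math import dist
from statistics import mean


def train(rows, labels):
    """Learn one centroid (mean feature vector) per label from historical data."""
    centroids = {}
    for label in set(labels):
        members = [row for row, lab in zip(rows, labels) if lab == label]
        centroids[label] = tuple(mean(column) for column in zip(*members))
    return centroids


def predict(centroids, row):
    """Predict the label whose centroid is closest to the new record."""
    return min(centroids, key=lambda label: dist(centroids[label], row))


# Hypothetical history: (credit utilization, late payments last year)
history = [(0.9, 3), (0.8, 2), (0.2, 0), (0.1, 0)]
outcomes = ["carries_balance", "carries_balance", "pays_in_full", "pays_in_full"]

model = train(history, outcomes)

# Score records the model has never seen.
high_usage = predict(model, (0.7, 2))
low_usage = predict(model, (0.05, 0))
```

Real systems use far richer features and algorithms, but the shape is the
same: learn from labeled history once, then apply the learned model to each
new record.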
For all its promise, machine learning has not reached widespread
usage in production or within business applications where it can
provide value and support business decisions. The challenges vary
by organization and use case. However, the common themes in AI
and machine learning adoption revolve around a few key areas,
including a lack of resources, lack of business trust in models and
their outputs, difficulty putting models into production and keeping
them running, lack of consistent business involvement, and bottle‐
necks in putting predictive results into business applications. Let’s
discuss some of these dilemmas below:
Talent gap
“Data scientist” has become the hottest title and the hardest-to-fill
position for many organizations. While many data analysts
changed their titles to capture the new wave of interest, many of
them lack the skills in languages like Python and R and the
understanding of modeling algorithms and techniques needed for
predictive model development. The lack of qualified data
scientists in the market has put many companies on the slow track in
AI development. Even with qualified data scientists on staff,
companies find their output is limited due to the handcrafted
nature of their work and the high maintenance costs of such
models in production.
1 “2020 Planning Guide for Business Analytics and Artificial Intelligence”, Gartner
Research, October 7, 2019.
Generic AI Apps
One approach to AI apps is to build generic apps that allow users to
input values and see results from predictive models. These apps have
names like What-if, Optimizer, and Predictor, and they are little more
than a front end to the model API. While such applications can help
business users interact with predictions, they are not tailored to a
specific business need or user. As a result, they mostly serve as a way
for business users to test models and become comfortable with their
outputs. In addition, generic applications are unlikely to be designed
for specific needs like handling peak loads in a call center or in
online shopping. These applications also have static workflows, so they
offer minimal additional exploration, simulation, and evaluation.
Noninteractive Experience
Many analytics applications are used to pull data from a data ware‐
house to generate reports or dashboards. Interactivity with such data
is limited to filtering and slicing the available data. In this frame‐
work, additional requests will pull a new set of data. As the user
looks at dashboards, they expect to see the latest data and to have the
data visualization dynamically update as new data arrives, but this is
not the case. Even for IoT applications that process data in real-time
streams, the user’s dashboard will not update unless the user specifi‐
cally requests it. For AI app development, systems should have both
Figure 2-1. Example of a modern interactive AI application
Interactive
Modern AI applications are also push-driven rather than pull-driven.
Many legacy BI dashboards pull data from a database on request or
periodically, so the information displayed is correct only as of the
last refresh. Modern AI apps are push-driven: the interface is
updated continuously as data arrives, along with the latest
predictions, creating a more dynamic experience for the user and
ensuring that they have the latest information.
Automated
Modern AI applications include automation to maintain their
performance over time. A critical function in modern AI
applications is the ability to build and rebuild AI and ML models to
support new data or changes in request parameters. An open
and extensible AutoML engine is used by the developer or data
scientist involved in the AI app project to build the models used in
the app. Using AutoML rather than a hand-coded approach is
desirable for the business and the users because it allows the
application to be updated without needing resources from the data
science team, which is typically busy with other projects. The use of
AutoML also creates a well-documented and repeatable process
so that applications don’t break down as they are updated.
Security
Among the many functions of the application environment, the
security of the data and application is highly important. In this
system, AI applications should be accessed only by designated
users and updated only by administrators. As data travels between
the application server and the user’s computer, the information
should be encrypted to ensure that it cannot be read or manipulated
in transit. This is especially true when personal or medical
information is present.
Provisioning and management
AI applications are deployed from the server to each individual
user. The model is similar to the “app stores” that mobile phone
users are accustomed to. This is different from shared software, where
multiple users share the same application. This separation
allows users to customize the apps to meet their needs and
keep those settings for future use. The application management
system then has to provision the application for the user to
consume and track who is using the app and which version they have
downloaded. Application updates are then made available so
those users can update their app and gain access to new
functionality. Administrators may also choose to update applications
directly in the event of a security concern or other serious issue.
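The provisioning model described above can be sketched as a small data
structure. This is a hypothetical illustration, not the API of any real
application management system: the names AppManager, Installation, and
force_update_all, along with the app name and version strings, are
assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Installation:
    """One user's private copy of an app: its version plus personal settings."""
    version: str
    settings: Dict[str, str] = field(default_factory=dict)


class AppManager:
    """Hypothetical manager that provisions an app per user and tracks versions."""

    def __init__(self, app_name: str, version: str) -> None:
        self.app_name = app_name
        self.latest_version = version
        self.installations: Dict[str, Installation] = {}

    def provision(self, user: str) -> Installation:
        """Give the user their own installation at the current version."""
        installation = Installation(version=self.latest_version)
        self.installations[user] = installation
        return installation

    def release(self, version: str) -> None:
        """Make a new version available without touching existing installs."""
        self.latest_version = version

    def outdated_users(self) -> List[str]:
        """List users whose installed version lags the latest release."""
        return sorted(
            user for user, inst in self.installations.items()
            if inst.version != self.latest_version
        )

    def update(self, user: str) -> None:
        """Move one user to the latest version, keeping their settings."""
        self.installations[user].version = self.latest_version

    def force_update_all(self) -> None:
        """Administrator path, e.g. for a security fix: update every user."""
        for user in self.outdated_users():
            self.update(user)


manager = AppManager("staffing-optimizer", "1.0")
manager.provision("alice").settings["region"] = "west"
manager.provision("bob")
manager.release("1.1")
manager.update("alice")             # alice opts in to the new version
pending = manager.outdated_users()  # bob has not updated yet
```

Keeping settings on the per-user installation rather than on the app itself
is what lets one user customize their copy without affecting anyone else.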
The framework and key components of modern AI applications support a
variety of use cases, making it possible to create specific, interactive
applications across industries.
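To make the push-driven versus pull-driven distinction from the Interactive
component concrete, here is a minimal sketch using a simple
publish/subscribe pattern. The DataFeed, PullDashboard, and PushDashboard
classes are hypothetical stand-ins invented for this example; a production
app would more likely rely on a streaming or websocket layer.

```python
from typing import Callable, List, Optional


class DataFeed:
    """Hypothetical in-memory feed standing in for a stream of new records."""

    def __init__(self) -> None:
        self.records: List[float] = []
        self._subscribers: List[Callable[[float], None]] = []

    def subscribe(self, callback: Callable[[float], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, value: float) -> None:
        """Store the new record and push it to every subscriber immediately."""
        self.records.append(value)
        for callback in self._subscribers:
            callback(value)


class PullDashboard:
    """Legacy style: shows whatever was current at the last explicit refresh."""

    def __init__(self, feed: DataFeed) -> None:
        self.feed = feed
        self.latest: Optional[float] = None

    def refresh(self) -> None:
        if self.feed.records:
            self.latest = self.feed.records[-1]


class PushDashboard:
    """Modern style: the feed updates the view as soon as data arrives."""

    def __init__(self, feed: DataFeed) -> None:
        self.latest: Optional[float] = None
        feed.subscribe(self._on_data)

    def _on_data(self, value: float) -> None:
        self.latest = value


feed = DataFeed()
pull_view = PullDashboard(feed)
push_view = PushDashboard(feed)

feed.publish(42.0)
stale_value = pull_view.latest  # the pull view has not refreshed yet
pull_view.refresh()             # only now does the pull view catch up
```

The push view reflects the new value as soon as it is published, while the
pull view stays stale until something asks it to refresh.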
ArmadaHealth
ArmadaHealth is a health data science and services company
founded to help people access the right physician or expert for them.
Their solution, QualityCare Connect℠, combines big data
and expert clinical insights to address the root causes of
healthcare access problems. ArmadaHealth applies sentiment
analysis to customer reviews and advanced analysis of experts’
wisdom to understand consumers, objectively find providers that
meet their needs and preferences, prepare them, and deliver
timely access to a choice of the most appropriate physicians for
their condition (see Figure 3-1).
Figure 3-1. The QualityCare Connect app on the ArmadaHealth
website
Challenges
Finding the right specialist is the first step to receiving the right care.
However, consumers are often not equipped to navigate the complex
and confusing healthcare system. It can be challenging for patients
to discover which specialist they should approach for different
health situations and, even with a referral from a primary physician,
it can still be a long process to find the right specialist who can accu‐
rately treat them while also providing a satisfactory patient experi‐
ence. Finding the right match between patient and doctor quickly
can solve major problems and save lives.
Solution
AutoML is an essential part of reaching ArmadaHealth’s goal of
delivering accurate patient-expert matches through their online
application. Using the H2O.ai platform, including automatic
machine learning, the company is able to build and train a natural
language processing (NLP) model to identify the sentiment (posi‐
tive, negative, neutral) in each customer review. The company looks
at three main aspects in each review: treatment outcome, communi‐
cation, and attitude. These three aspects are critical to finding the
expert that best matches customer preferences.
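One way such aspect-level sentiment could feed a match is sketched below.
This is an illustrative assumption, not ArmadaHealth’s actual pipeline: the
numeric scoring, the preference weights, and the function names are
invented here, and the per-aspect labels are taken as already produced by
an NLP model.

```python
from collections import defaultdict

ASPECTS = ("treatment_outcome", "communication", "attitude")
SCORE = {"positive": 1, "neutral": 0, "negative": -1}


def aspect_scores(reviews):
    """Average the labeled sentiment per aspect across a provider's reviews."""
    totals = defaultdict(list)
    for review in reviews:
        for aspect in ASPECTS:
            if aspect in review:
                totals[aspect].append(SCORE[review[aspect]])
    return {aspect: sum(vals) / len(vals) for aspect, vals in totals.items()}


def match_score(scores, preferences):
    """Weight each aspect's average sentiment by how much the patient values it."""
    return sum(scores.get(aspect, 0.0) * weight
               for aspect, weight in preferences.items())


# Hypothetical labeled reviews for one provider.
reviews = [
    {"treatment_outcome": "positive", "communication": "negative",
     "attitude": "neutral"},
    {"treatment_outcome": "positive", "communication": "neutral",
     "attitude": "positive"},
]
scores = aspect_scores(reviews)

# A patient who cares most about treatment outcome.
outcome_first = match_score(
    scores, {"treatment_outcome": 0.7, "communication": 0.2, "attitude": 0.1})
# A patient who cares most about communication.
communication_first = match_score(
    scores, {"treatment_outcome": 0.2, "communication": 0.7, "attitude": 0.1})
```

The same provider scores differently for the two patients, which is the
point of tracking sentiment per aspect rather than as a single overall
rating.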
Hortifrut
Hortifrut, based in Chile, is the largest producer of blueberries in the
world, and operates farms in Peru, Chile, Mexico, Argentina, the
United States, Spain, Morocco, and China, with distribution across
37 countries. Hortifrut holds 25% of the world’s blueberry market
and uses AI to make distribution decisions across their expansive
operations. They are able to predict the quality of the blueberries
from origin to final destination, improving the consumer experience
with higher quality products, and increasing revenue throughout
the supply chain.
Challenges
Transporting fruit from the farm can take weeks, so Hortifrut has to
predict the quality of produce upon arrival. Failing to do this
accurately can hurt the customer experience and lead to lost revenue.
But making such predictions accurately can be difficult, given the
complexity of the distribution channel, weather data, the variety of
datasets, shipping times, and more. With traditional machine learning
methods and toolkits, it took months to build accurate predictions and
production-ready models. Scaling the use of AI under these conditions
would require hiring additional data science talent and
increasing the budget.
Solution
Hortifrut leveraged the H2O.ai platform to have better predictive
insights into the quality of their blueberries. They used capabilities
such as feature engineering, NLP, explainability, time-series analysis,
visualization, and scoring pipelines. Hortifrut is now able to scale
their data science efforts in order to deliver use cases such as pre‐
dicting the quality of blueberries based on features like variety, farm
origin, shipping time, vessel, and packaging, without hiring
additional data science talent in the team. These results are delivered
directly to business users making decisions, so they can take the cor‐
rect actions for each shipment.
Results
Hortifrut achieved the following key benefits using AI apps:
Jewelers Mutual
Jewelers Mutual is one of the United States’ and Canada’s most
established and trusted providers of affordable and comprehensive
insurance for jewelers and consumers. As a leader in driving
customer-focused innovation and providing the latest technology to
a long-standing industry, Jewelers Mutual uses AI to deliver excep‐
tional customer experiences, prevent losses, and provide better pro‐
tection and policies for both jewelers and consumers.
Challenges
The leadership at Jewelers Mutual recognized the need to invest in
analytics, AI, and machine learning to improve overall customer
experiences. Their business relies both on effectively protecting
their customers’ businesses and on providing personal insurance
directly to consumers, in each case with innovative customer
experiences. Jewelers Mutual has been at the bleeding edge in adopting AI.
They collected data that was already available from losses, customers,
and multiple other sources but had not been tapped before.
Results
The success that Jewelers Mutual has seen in adopting AI in their
business is a testament to the fact that regulated industries can ach‐
ieve real competitive advantage using AI. For example:
CHAPTER 4
Adoption Challenges for
Next-Generation Analytics
ineffectiveness. Getting a technological perspective, along with ethical,
legal, oversight, and leadership perspectives, into organizational
AI and data principles is perhaps the best way to avoid such issues.
3 See, for example, “To Really ‘Disrupt,’ Tech Needs to Listen to Actual Researchers”,
Wired, June 26, 2019; Rumman Chowdhury’s post on Twitter; “AI Researchers Say
Scientific Publishers Help Perpetuate Racist Algorithms”, MIT Technology Review,
June 23, 2020.
The future of analytics is sure to take many forms. With the conver‐
gence of traditional business intelligence with artificial intelligence
and machine learning, the possibilities are endless and exciting.
With AI integrated into business and customer experiences, the
hope is that every business user will be more productive and
empowered and that every customer experience will be exciting,
resulting in new levels of customer satisfaction and engagement and
new growth opportunities for companies.
AI applications present a compelling way to implement AI in the
enterprise. AI apps are different from traditional dashboards and
business applications in that AI apps are designed together with
domain experts to meet their specific needs for descriptive, predic‐
tive, and prescriptive insights. AI app development is accelerated
using AutoML and rapid prototyping frameworks so that organiza‐
tions can scale AI access across the business. The results of AI appli‐
cations are tangible, such as helping patients find the right
physician, reducing waste in fruit shipments, or optimizing insur‐
ance underwriting.
The future, however, is far from certain. Barriers remain, both
technological and organizational. Reaching this transformational future
will require a new wave of innovators and leaders with the vision to
find the right technology partners and create the AI applications
that will drive their businesses and even their industries for years to
come.
About the Authors
Dan Darnell is a seasoned product marketer with over 20 years of
experience in leading technology companies. For the past nine
years, he has been working on AI platforms and applications,
including senior roles at H2O.ai, DataRobot, ParallelM, Talend, and
Baynote. Before that, Dan was focused on analytics and optimiza‐
tion technologies at Adchemy, Interwoven, Oracle, and Siebel Sys‐
tems. He holds an MBA from Carnegie Mellon University and a BS
in engineering from The University of Colorado at Boulder.
Rafael Coss is Director of Technical Marketing and Developer Rela‐
tions at H2O.ai in Mountain View, CA. Before joining H2O.ai, he
was Director of Technical Marketing, Community, and Data Evan‐
gelist at Hortonworks. He also has served as the DataWorks Summit
program cochair for three years. Before Hortonworks, he was a
senior solution architect and manager of IBM’s World Wide Big
Data Enablement team and coauthored Hadoop for Dummies. At
IBM, he also had roles in technical product enablement, quality
engineering, and software development across multiple products
and initiatives: XML database tools, federated database, and object-
relational database. He holds an MS in computer science and a BS in
civil engineering, both from California Polytechnic State University.
Patrick Hall is principal scientist at bnh.ai in Washington, DC, a
boutique law firm focused on data analytics and AI. Patrick also
serves as a visiting professor in the Department of Decision Sciences
at George Washington University and as an advisor to H2O.ai.