Module-1 Chapter 2
Data to decisions
Evolving interrelationships
THINK DATA
32 Artificial Intelligence for Business Optimization
[Figure: Think Data. Recoverable labels: Value; Handset (IoT, smartphones, sensors); Dataset (sources: own, buy, lease; types: 4V + alternate); Mindset; Toolset; Cybersecurity.]
Data characteristics
Thinking strategically about data for business optimization means making
decisions that combine analytics (explicit) and human (tacit) thinking. The
entire decision-making process is iterative and incremental, resulting in
business agility.5
Embedding data analytics in business processes makes them more efficient
and effective. Customer experience is also personalized as a result.
Data is not limited to structured sets. Low-volume data and unstructured
data with variety have the potential to provide valuable insights. The NO
Data is extracted for the repository of raw data for training and testing
purposes. Features are defined in the training data. Accurate models of data
and processes are used for further training and testing of models.
Data evolves through five layers to become decisions. As shown in Figure 2.2,
the evolution of data to decisions is a systematic, incremental process that
impacts and is impacted by artificial intelligence (AI). This evolution requires
a detailed understanding of handset, dataset, toolset, and mindset. Figure 2.2
starts with observations which, when recorded, become data. Observations
are the start of data collection. Data is processed into information and
analytics in order to provide knowledge and insights. Analytics
achieve this by correlating wide-ranging and dispersed suites of data. This
is made possible due to the shareability of Big Data on the Cloud. Analytics
support predictions and prescriptive advice as the data evolution continues.
[Figure 2.2: the data-to-decisions pyramid. Recoverable labels: Reality, Observation, Suggests, Decisions, Action, Consequences.]
Below is a brief description of the five layers of the data-to-decisions
pyramid of Figure 2.2.
subject to varied analysis. The quality of data is enhanced through filters. Data
is quantitative and qualitative in nature. Big Data is characterized by high vol-
ume, velocity, and variety. Data contains noise that has to be reduced.
Sharing of data across organizations eliminates repetitive and redundant
data. For example, customer demographic data, such as name and
address, is usually stored by the telephone company. Therefore, this data
need not be stored by the bank. Instead, this data is collaboratively
available to the bank from the telephone company under “contracts.” Sharing
data through well-connected, reliable, and trustworthy partners is the basic
form of collaboration among organizations. Data sharing is usually over
the Cloud in order to facilitate collaborations.
Layer 5: Decisions
Decisions are made up of explicit inferences based on insights and exten-
sive correlations among widely dispersed knowledge and tacit mindset of
decision-makers. While data, information, analytics, and (to a large extent)
knowledge are considered objective, decisions are a subjective human trait.
Decision-making draws on tacit human factors such as personal experience,
value system, time and location of decision-making, sociocultural environ-
ment, and ability to make estimates and take risks. The implicit form of
knowledge of the user is often cognitive “intuition” about how the input,
the process, and the output are related.
AI suggests actionable knowledge in order to support decisions. Strategic
use of Big Data and AI leads to insightful decisions. Analytical outputs are
combined with the individual decision-maker’s ability to consistently dis-
tinguish the importance, relevance, context, and organizational principles
in decisions. The decision-maker comprehends and balances AI with NI
for decision-making. Decision-making is an agile, iterative process with the
consequences of the decisions providing fresh data as input for the next
discussion.
Intelligence is not something that can be organized and placed in databases
and transferred to others through training. Decisions lead to actions,
which are the fully mature implementation of the data-to-decisions pyramid
and aim at the value goal. Enhancing the customer experience in the most
effective and efficient way is the purpose of data to decisions. Customer groups can
also collaborate and support decisions to achieve higher value for them-
selves. Collaborations across organizations require an exchange of data and
information across highly porous electronic boundaries.
Figure 2.3 Detailed characteristics of Big Data’s 3 + 1 + 1 Vs and the types and categories
of data. [Recoverable labels: Volume, Velocity, Variety, Veracity, Value, Technologies, Analytics.]
• Source and store data – (entry of data into the organization through
direct, mobile, audio, text, and other channels; volume; cloud;
scalability; security). Data analytic strategies are concerned with the
sourcing of current and future data, its cost, and how current, relevant,
and reliable the source and storage facilities are.
• Mine data – (analytics; patterns; collaborations; costs associated with
data and mining; quality and cleansing). Data analytic strategies
consider the relevance and the overall costs of mining data. Tools and
technologies for data mining are considered here.
• Utilize data – (decision-making; granularity; feedback mechanism).
Data analytic strategies include business process re-engineering in
order to embed analytics in the decision-making process. The level of
granularity is considered.
[Figure: data analytic usage pattern. Recoverable labels: Observe, Generate, Record, Store, Secure, Clean, Retrieve, Share, Use (Analyze), Retire.]
SOURCING OF DATA
Think Data and the data analytic usage pattern both include the sourcing of
data. Substantial data is generated by machine sensors and IoT15 devices.
The volume and velocity of such data are very high because it is
generated without human intervention. The number of active IoT devices was
estimated to be 21 billion by 202016 generating 40% of all data. These data,
flowing through channels of interconnected sensors and devices, are now
available in massive quantities and in both structured and unstructured
formats. Data is added continuously and at high velocity. Third parties
and governmental bodies provide yet another source of data. Users generate
and share content, typically on the Cloud. Data analytics on the Cloud
provides real-time insights that enable the optimization of processes. The end
result is a vast collection of data from multiple sources, growing at a high rate
and comprising varieties that have potential value hidden in them.
Enterprise applications (CRM, ERP, SCM, and others) store this data.
The volume, velocity, and variety of this data have a bearing on its analytics.
Timely and accurate analytics are required in order to help optimize busi-
ness. Analytics impact business decisions.
Figure 2.5 shows the additional categorization of Big Data into fundamental,
macro, and alternative data.
Alternative data
Alternative data is not organized, transactional data. Alternative
data is derived from third parties: for example, social media feeds, blogs,
forums, and crowd-sourcing platforms. IoT devices are major sources of
alternative data. Alternative data is a subset of Big Data, often unstructured
and unidentifiable. Alternative data is sourced from outside the boundar-
ies and controls of the organization and yet it references the organization.
For example, a trending news item about the organization and the ensu-
ing discussions by customers and other observers on the news item can be
analyzed to find trends and patterns. The confidence levels of predictions
improve with alternative data. Alternative data can provide validation for
the main data and improve the granularity of predictions.
Analyzing and using this alternative data require AI technologies and
corresponding business capabilities. Identifying the source of alternative
data requires an understanding of the nature of the organization or community
providing that data, the security and privacy issues associated with the
source, and the relevance of that data to the problem at hand. For example,
analyzing an insurance quote can make use of alternative data about the
property or vehicle being insured by sourcing it from the user community
associated with that item – but the analyst has to be mindful of which Twitter
or Facebook community the data is sourced from, its reliability and security,
and how well it has served the purpose of improving the accuracy of the
insurance quote. Furthermore, the mindset of people in the organization who will
utilize the data also requires due consideration because this niche data and its
analytics can change the way in which an agent provides an insurance quote.
Sourcing, analyzing, and integrating this alternative data in decision-making
require more than AI tools. Expertise in the business domain has to be coupled
with technical expertise in the mindset of the data science professionals.
While structured, transactional data is analyzed quantitatively, alterna-
tive data needs more creative approaches for sourcing and analytics. These
approaches include analyzing the free-formatted blogs, opinions, likes and
dislikes, and the speed with which a particular community is expressing its
opinions. For example, if a car owner community posts a certain number
of “likes” and favorable opinions for a new brand of car model within, say,
1 week, then it has a higher density of positive opinion than the same posi-
tive expressions over 4 weeks.
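The density comparison above can be illustrated with a minimal sketch. The function name opinion_density and the figure of 400 likes are assumptions made for illustration only; the text does not prescribe a formula beyond comparing the same volume of positive opinion over different periods.

```python
def opinion_density(positive_count: int, weeks: float) -> float:
    """Positive expressions per week (an illustrative, hypothetical metric)."""
    if weeks <= 0:
        raise ValueError("weeks must be positive")
    return positive_count / weeks


# The same number of favorable opinions, expressed over different periods.
fast = opinion_density(400, 1)  # 400 likes within 1 week
slow = opinion_density(400, 4)  # the same 400 likes spread over 4 weeks

print(f"1 week: {fast:.0f} per week; 4 weeks: {slow:.0f} per week")
```

In this sketch the one-week community shows four times the density of the four-week community, even though the total count of positive expressions is identical.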
Alternative data potentially offers an advantage over only using traditional
data. The opportunity to derive value from previously ignored and/or
emerging datasets, often termed the “exhaust” of other businesses or
communities, is the new edge in data analytics.
For example, consider the prophecy project.17 COVID-19 studies attempt to
identify clusters based on travel histories, individual movements, and
tweets. Other examples of alternative data include satellite images of
industries and companies and data on environmental impact.
Alternative data is an attempt to learn about the business from what
“others” think of it. For example, if a public health agency wants to know
the path of a pandemic, it will source exotic data such as parking lot and
cell phone usage.
Alternative data provides collective intelligence resulting in rewards and
risks. Risk arises from the source of data, which remains unverified.
Risks in alternative data usage are worth taking in time-sensitive business processes.
For example, in financial trading, even a narrow timing advantage provides
a trading edge. Alternative data provides that edge. Alternative data provides
potential information advantages in investment management decisions.
Alternative data is usually incomplete and not verifiable. It is unstructured
and difficult to integrate with existing organizational, transactional
data. Privacy and security of this data are also very difficult to establish.
Regulation also plays a role in the sourcing and use of alternative data. For
example, under the GDPR, it may be necessary for an organization to
explain and make visible the alternative data it has collected from
community forums and is using to identify trends in a product or
service. In other regions, where GDPR does not apply, disclosure of
the alternative data source and its usage is not required.
Analytics themselves are not new. From the abacus to sophisticated Excel
workbooks with macros, analytics have provided valuable insights for users.
Analytics are a pointer to (a) what action is to be taken and (b) the results of
that action.
Data analytics
Analytics provide a data-centric approach to business decision-making.
Analytics serve multiple purposes. Analytics explore the
past by describing a happening based on static historical data. Analytics
also look forward by presenting what is likely to happen in the future – a
prediction. Predictive analytics need to incorporate dynamic data inputs
and use them to create multiple “what-if scenarios.”
Data analytics is at the heart of business optimization. The starting point
of a strategic approach to using analytics in BO is the business problem or
the outcome desired from those analytics. Examining the factors contributing
to value creation in business is more important than the details of
analytics and technologies. Once the business outcomes are understood, data
analytics start examining the diverse data sources and types available to the
organization for analytical purposes.
Data analytics generate insights that are translated into actionable
knowledge. Data analytics includes clustering, segregation, segmentation,
and analysis of the customer data in order to understand customer behavior,
potential sentiments, and referrals. This analysis results in actions.
Enabling those actions to happen requires changes to business processes
and people training. For data analytics to succeed, a clear roadmap for its
application is essential.18
Granularity of analytics in BO
Another important concept that helps business agility is the levels of granu-
larity in undertaking analytics. Granularity of data, granularity of ana-
lytics, its context, and the processes embedding the data are crucial in
establishing data strategies for Agile business.
Big Data analytics takes analytics to a very fine degree of granularity.
This finer granularity is enabled by algorithms turned into code. The execu-
tion of this code on the high-volume and velocity data is enabled through
the technologies of Hadoop and NoSQL. Finer granularity of analytics is
the capability that differentiates Big Data analytics from the traditional
analysis of data.24 The availability and accumulation of data combined
with the availability of computing power enables drilling down through
that data with pinpoint accuracy. Ascertaining the optimum level of this
granularity of data analytics is a strategic business decision discussed later
in this chapter.
Velocity coupled with volume requires strategies for handling data that
also start with the desired business outcomes. For example, the finer the
granularity of the analytics, the greater their confidence level, although more
resources are required from the organization. This, in turn, can increase
the cost of analytics. Coarser granularity means less precision in
the results. Balancing granularity against resources is thus a continuous,
“agile” act. Desired outcomes enable the establishment of the right levels
of granularity.
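The trade-off between granularity and resources can be sketched as follows. This is a minimal illustration and not a method prescribed by the text: the sales records, the region and store names, and the helper mean_by are all hypothetical. The number of groups is used only as a rough proxy for the resources an analysis consumes.

```python
from collections import defaultdict

# Hypothetical sales records: (region, store, amount).
SALES = [
    ("north", "store-1", 120.0),
    ("north", "store-1", 80.0),
    ("north", "store-2", 20.0),
    ("south", "store-3", 60.0),
    ("south", "store-3", 40.0),
]


def mean_by(records, key_fn):
    """Average the amount within each group produced by key_fn."""
    groups = defaultdict(list)
    for record in records:
        groups[key_fn(record)].append(record[2])
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


# Coarse granularity: one figure per region.
coarse = mean_by(SALES, lambda r: r[0])
# Finer granularity: one figure per store within each region.
fine = mean_by(SALES, lambda r: (r[0], r[1]))

print(len(coarse), "coarse groups:", coarse)
print(len(fine), "fine groups:", fine)
```

The regional average for the north hides the difference between its two stores, which the store-level view exposes, at the price of more groups to compute, store, and interpret.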
Analytics provide for growth and innovation in services and enable pro-
cess optimization. A strategic approach is a must for data analytics to be
part of the organization culture. Precision and speed in analytics and inter-
active data exploration need to be supported by upskilling the users and
providers of the analytics.
Self-serve analytics (SSA) is a business strategy focused on letting users
(e.g., staff and customers) decide what they want from the DatAnalytics
(data-cum-analytics), how they want it, and then help them achieve those
insights. This not only provides the user with what she wants, when she
wants it, and how it is delivered; but it also reduces the onus of analytics on
the business.
Given the velocity of data, the rapidly changing context of the user (e.g.,
the urgency of the results, the format in which they are desired, and the
device on which they will be presented), and the ever-increasing availability
of data warehouses from “third parties,” it is in the best interest of an
organization to provide patterns or prefabricated analytical tools which enable
users to serve themselves. This is called “Customer Intelligence.”25
Self-serve analytical capabilities require back-end data integration, dynamic
business process modeling, and customizable visuals. Tools are used to bring
data together from many different locations – internal and external – to
instantaneously answer self-service queries. These tools enable
collaborations among multiple systems, their interfaces, and open data sources
from external organizations (e.g., third-party sites that provide data, or
data sourced from data providers).
Gartner26 describes ease of use as the topmost priority for self-service
analytics. This is because the user is unlikely to have technical or analytical
skills. The user could be a banking or financial analyst and not necessarily
a statistician. Visual configuration of services needs to be presented to the
user. These users should be able to configure analytics to solve their imme-
diate needs.
Clustering groups similar data together into clusters. This clustering is done
with the aim of segmenting them. Segmentation places data into groups
based on similar characteristics. These terms denote different approaches
to analyzing data. ML algorithms help identify clusters by identifying
relationships of different types of data. Clustering deals with finding the
relationships between data that are then placed in segments. Clustering the
data helps discover new segments.
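As a hedged illustration of how an ML algorithm can discover clusters, the sketch below implements a small one-dimensional k-means in plain Python. The text does not name a specific algorithm, so k-means is an assumption chosen for its simplicity; the function kmeans_1d and the monthly spend figures are hypothetical.

```python
def kmeans_1d(values, k, iterations=20):
    """Very small 1-D k-means: returns (centroids, clusters)."""
    if k < 2 or k > len(values):
        raise ValueError("k must be between 2 and len(values)")
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical monthly spend per customer.
spend = [12, 15, 14, 90, 95, 88, 13, 92]
centroids, clusters = kmeans_1d(spend, k=2)
print(centroids)
print(clusters)
```

No segment labels are supplied in advance; the two groups of low and high spenders emerge from the data, and they can then be treated as candidate segments.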
Segmentation
Data segmentation forms the basis for process optimization. Segmentation is
the separation of data into groups of similar elements. This is an important
first step in analytics. Age vs gender or customers vs prospects are examples
of segmentation. Segmentation allows businesses to analyze customers based
on those similarities. Segmenting enables the creation of a target for services
or products. For example, a data analyst helps new businesses improve their
decision-making capabilities by segmenting other similar businesses. Data
from these businesses is then subdivided for further analysis.
The process of segmentation continues in an iterative manner with
increasingly finer segments. Segmentation finds relationships within vari-
ables in order to predict customer behavior.
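Iterative segmentation into increasingly finer groups can be sketched as below. This is an illustration under stated assumptions: the customer records, the age threshold of 35, and the helpers age_band and segment are hypothetical and are not taken from the text.

```python
from collections import defaultdict

# Hypothetical customer records.
CUSTOMERS = [
    {"name": "A", "status": "customer", "age": 24},
    {"name": "B", "status": "prospect", "age": 37},
    {"name": "C", "status": "customer", "age": 52},
    {"name": "D", "status": "customer", "age": 29},
]


def age_band(age):
    """Illustrative two-band split; the threshold of 35 is an assumption."""
    return "under_35" if age < 35 else "35_plus"


def segment(records, *key_fns):
    """Group record names by the tuple of characteristics in key_fns."""
    segments = defaultdict(list)
    for record in records:
        key = tuple(fn(record) for fn in key_fns)
        segments[key].append(record["name"])
    return dict(segments)


# First pass: customers vs prospects.
by_status = segment(CUSTOMERS, lambda r: r["status"])
# Second pass: the same segments, subdivided by age band.
finer = segment(
    CUSTOMERS,
    lambda r: r["status"],
    lambda r: age_band(r["age"]),
)

print(by_status)
print(finer)
```

Unlike clustering, the characteristics used here are chosen in advance; each additional characteristic passed to segment yields a finer set of groups.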
• Sales – Clustering can tell what kind of people buy a specific product.
• Insurance – Clustering techniques are used to identify fraudulent
insurance claims.
• Education – Identifying groups of universities based on their tuition,
geographic location, quality of education, and type of degree
programs.
• Credit – Grouping customers based on their credit history.
DATA-DRIVEN DECISIONS
Automation
Automation happens in the ML space wherein machines take over routine
processes in order to reduce time and effort for humans. Any process that
is primarily simple, straightforward, and deterministic in nature requiring
[Figure 2.6: matrix of what is known and unknown to the artificial/system side (AI) and to the natural/human side (NI). Quadrants: Automation (hard, mono-dimensional data); Experience (soft, inter-disciplinary); Prediction (fuzzy, multi-dimensional data); Intuition.]
Prediction
Prediction occurs when the disruptions can be anticipated based on trends
in data.29 AI-based predictions can be categorized as “known” to the system
but “unknown” to the human. This is so because the tools and technologies
of AI are well equipped to make sense of the vast amount of fast-moving
data. Leadership, in this category, needs to rely on the AI-based system to
sound the alarm when there is a possibility of disruption. Big Data
analytics have been shown to predict network failures and security breaches
based on the analysis of large amounts of historical data.30 Since predictions
improve with data, the larger the dataset, the better the predictions.
Economic disruptions, for example, are predicted with reasonable confidence
based on data trends. ML code embedded in AI models is, however, agnostic to
the specific situation, or context, of the user.
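Trend-based prediction with an alarm can be sketched in a few lines. This is a deliberately simple illustration that assumes a straight-line trend fitted by least squares; the weekly failure counts, the alarm threshold, and the function names fit_trend and forecast are hypothetical, and production systems would use far richer models.

```python
def fit_trend(ys):
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...; returns (slope, intercept)."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast(ys, steps_ahead=1):
    """Extrapolate the fitted trend steps_ahead periods past the last observation."""
    slope, intercept = fit_trend(ys)
    return intercept + slope * (len(ys) - 1 + steps_ahead)


# Hypothetical weekly counts of network failures.
failures = [2, 4, 6, 8]
ALARM_THRESHOLD = 9  # assumed tolerance, set by the business

next_week = forecast(failures)
alarm = next_week > ALARM_THRESHOLD
print(f"forecast for next week: {next_week:.1f}; alarm raised: {alarm}")
```

The sketch also shows the limitation noted above: the fitted line knows nothing about the context of the user, so the threshold and the response to the alarm remain decisions for the leadership.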
Experience
This represents the human-based NI that comes into play in decision-making.
AI engines work only on available data which, in some situations, may not be
readily available. In other situations, the tools may not be able to identify cor-
responding trends within a reasonable time. When the situation is unknown
to the system, leaders provide the needed input based on their experience.
Intuition
This concept encompasses the space where disruptions neither have a past –
historical data to create a trend – nor do they appear in the experience
horizon of the leaders. In other words, these disruptions are unknown to
AI and unknown to humans – as shown in the right-bottom quadrant in
Figure 2.6. The current COVID-19 pandemic falls under this category of
unknown-unknown. The only possibility of anticipating such disruptions
is to explore the instinctive feelings within the leaders, together with imagi-
nary scenarios created in the AI engine.
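The four categories can be summarized as a small lookup, sketched below. The mapping for prediction, experience, and intuition follows the descriptions above; placing automation in the known-known quadrant is inferred from the layout of Figure 2.6. The function name decision_mode is hypothetical.

```python
def decision_mode(known_to_ai: bool, known_to_human: bool) -> str:
    """Map the known/unknown matrix of Figure 2.6 to a decision category."""
    if known_to_ai and known_to_human:
        return "automation"   # routine, deterministic processes
    if known_to_ai:
        return "prediction"   # trends visible to the system, not to the human
    if known_to_human:
        return "experience"   # no usable data; leaders supply the input
    return "intuition"        # unknown to both AI and humans


for ai, human in [(True, True), (True, False), (False, True), (False, False)]:
    print(f"known to AI={ai}, known to human={human}: {decision_mode(ai, human)}")
```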
[Figure: analytics for agility. Recoverable labels: customer analytics; product analytics; maintenance analytics; marketing analytics; security and privacy analytics; sentiment analytics; customer segmentation and attrition; product acceptance; user community; optimizing on processes; pre-empting events; cross selling; targeted marketing campaigns on multi channels.]
CONSOLIDATION WORKSHOP