Module-1 Chapter 2
Artificial Intelligence for Business Optimization

Chapter 2 discusses the evolution of data into decisions, emphasizing the importance of data architecture and analytics for business optimization. It outlines the four key aspects of “Think Data”: handset, dataset, toolset, and mindset, which must be balanced for effective decision-making. The chapter also introduces a pyramid model illustrating the systematic process through which data transforms into actionable insights and decisions, highlighting the role of artificial intelligence in this evolution.

Chapter 2

Data to decisions
Evolving interrelationships

THINK DATA

Data is a vast topic. It encompasses observations, acquisition, recording,
storage, cleansing, analytics, security, and disposal. In the digital world, a
simple query, a product search, a purchase, or a post-sale service request
generates data. This data is typically stored on vast cloud servers. Data is
then analyzed to produce insights that form the basis for business decisions.
Understanding how data evolves into decisions is important for business
optimization (BO).
The global explosion in data is set to reach close to 150 ZB by 2024.1
The growth in data volume is fueled by growth in velocity (e.g., 5G), variety
(e.g., unstructured, audio, graphics), and the need for veracity (e.g., quality,
darkness). Business data is generated by internal business processes, staff,
external customers, business partners, regulatory agencies, and user com-
munities. User communities also generate data whose harvesting can lead
to many interesting business ideas. Business data is typically stored on the
Cloud. The challenge for business optimization is to sort the relevant data
and analytics from the noise across multiple Cloud sources and ensure a
well-understood evolution from data to decisions.
Data architecture based on EA 2 and OBDA3 is required to integrate tech-
nology, storage, computing, and device requirements. Multiple data feeds
using API and scalable processing power on the Cloud are essential. Data
analysts analyze datasets and systematically look for the insights within
them.

Think data: Handset, dataset, toolset, mindset


Data science establishes correlations between suites of data. Data science,
however, is not able to (nor is it meant to) explain the underlying cause for
patterns and trends. Correlation, not causality, is the theme of data science.
Data analysis “plays” with the data, creates “what-if” scenarios, and sup-
ports business decisions. Such data analysis is subject to influences of changes
in the input, frequency, and context of the data. Thinking about data requires

an understanding of the myriad ways in which data can be sourced, analyzed,
and retired. The tools and technologies and the creativity required to make
use of data are part of data science. Furthermore, variations of data types,
such as alternative data, may not even be directly related to the purpose of
the analysis. A case in point is the correlation between the satellite images
of various car parks in the Wuhan region and the possibility that COVID-19
started much earlier than declared by the WHO.4 The challenge in the use
of such data is its validity and currency – the factors discussed later in this
chapter in using alternative data.
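The correlation-without-causation idea can be sketched in a few lines. The following is a minimal illustration; the car-park and walk-in figures are invented, not taken from the WHO study:

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient, no external libraries."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented daily counts: vehicles in a car park (satellite imagery)
# versus walk-in customers at a nearby store.
vehicles = [120, 135, 150, 160, 180, 210, 240]
walk_ins = [95, 100, 115, 118, 140, 155, 180]

r = pearson(vehicles, walk_ins)
print(f"correlation r = {r:.2f}")  # strongly positive, yet it proves no cause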
Figure 2.1 summarizes the four key strategic aspects of data applicable to
BO: the handset, dataset, toolset, and mindset. Keeping all four aspects of
Think Data in balance is important for BO.

• Handset – this aspect of Think Data includes smartphones, IoT
devices, and sensors. Think Data starts with this important and
mainly automated source of data. Users continuously interact with
the business through their handsets. The volume and velocity of data
depend on the use of the handset. Sensors, although not held in hand,
automatically generate large-volume, high-velocity data.
• Dataset – this aspect of “thinking data” deals with the storage, security,
and utilization of data. The data characteristics of volume, velocity,
variety, and veracity are applied in managing datasets. For example,
structured sensor data needs SQL storage, while unstructured blogs or
smartphone data need NoSQL datasets. No backups or archives are
possible with Big Data, especially alternative data. Data-driven digital
business is not limited to using its own data. While costs, prices, and
transaction volumes are examples of data generated by the business,

there are meta-data and alternative data generated by interactions
among third parties and customer sentiments.

Figure 2.1 Think data – handset, dataset, toolset, mindset. [Figure: four
quadrants – Handset (IoT, smartphones, sensors), Dataset (sources: own,
buy, lease; 4V types plus alternative), Toolset (systems, processes,
communications), and Mindset (people, users, decision-making) – with
Value above and Cybersecurity underpinning all four.]
• Toolset – includes the tools for communication, storage, and process-
ing. For example, analytics in the Cloud, analytics-as-a-service, visu-
alization, and the various aspects of embedding analytics in business
processes are thought through here. ERP and CRM solutions are also
part of toolsets. Tools used in providing cybersecurity analytics are
also included.
• Mindset – this is the user-centric view of data usage. The mindset also
includes strategies for dealing with the inevitable change when data
analytics is embedded in decision-making. Handling mindset requires
interdisciplinary teams comprising data scientists, domain experts,
business process experts, finance and economics professionals, and
human resource (HR) managers. Training of staff and educating users
is a part of “Mindset” in optimizing business processes. Mindset is an
indication of user sentiments that impacts predictions.
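The Dataset point about matching storage to structure can be sketched as follows; this is a minimal illustration using SQLite for the structured side and a JSON document standing in for a NoSQL store on the unstructured side (the device IDs and post content are invented):

```python
import json
import sqlite3

# Structured sensor readings fit a fixed SQL schema...
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sensor (device_id TEXT, ts INTEGER, temp_c REAL)")
conn.execute("INSERT INTO sensor VALUES (?, ?, ?)", ("s-01", 1700000000, 21.5))

# ...while an unstructured smartphone post is better kept as a document
# (a list of JSON strings stands in for a NoSQL document store here).
post = {"user": "u-42", "text": "Great service today!", "tags": ["service"]}
doc_store = [json.dumps(post)]

row = conn.execute(
    "SELECT temp_c FROM sensor WHERE device_id = ?", ("s-01",)
).fetchone()
print(row[0], json.loads(doc_store[0])["user"])
```

The design point is simply that the schema lives with the table in the SQL case, while the document carries its own structure in the NoSQL case.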

As also summarized in Figure 2.1, underpinning Think Data is “Think
Cybersecurity.” Each aspect of Think Data needs to consider cyberse-
curity. For example, handset needs physical security, dataset needs secured
cloud storage, toolset includes encryption, and mindset requires good user
habits and awareness.
The overall umbrella of Think Data is “Value.” Therefore, all four aspects of
Think Data need to be in balance.

Various aspects of Think Data


Understanding data implies understanding its various dimensions, from
sourcing and storage through to security and retirement. The entire gamut
of application of the keyword “data” is summarized in Table 2.1.

Table 2.1 Think data for business optimization

• Data science – Think overall principles, concepts, and strategies for the
application of data to business. Datasets, handsets, toolsets, and mindsets
are included in data science.
• Data analytics – Think algorithms to process the data and provide results.
Descriptive, predictive, and prescriptive analytics together with supervised,
unsupervised, and reinforced learning techniques are examples of analytics.
• Data architecture – Think of technical alignment.
• Databases – Think SQL and NoSQL databases storing data locally, on the
Cloud, and on user devices in both structured and unstructured formats.
• Data communications – Think networks and transmission infrastructure,
including speed, security, and context. Wired, Wi-Fi, and cellular for data
and analytics (especially as they are offered as services).
• Data context – Think of the business context in which data will be used.
The urgency, importance, and relevance of the business process provide an
additional context.
• Data costs – Think costs associated with each unit (e.g., MB or GB) of
data (e.g., storage, backup, and mirroring costs). Costs of sourcing data
from vendors are included. These “cost”-related thoughts balance the
expenses and risks of BO with returns (ROI).
• Data density – Think of what reality the data represents. The more
compact a data point in terms of representing the reality, the denser the
data. And the denser the data, the more effort and tools are required.
• Data manipulation – Think location of data and data movements.
Manipulation is associated with data retrieval, preparation, and processing.
• Data mining – Think layers and depths of datasets; think the nature of
data – static vs. dynamic. The algorithms for mining depend on the
understanding of data layers and architecture.
• Data presentation – Think visualizations, users, IoT devices, screen real
estate, and the relevance of presentation to the user in making decisions;
video, audio, and sensors (buzzers) are examples of presentation types.
• Data privacy – Think legal aspects of data and repercussions of breaches
on businesses. These privacy aspects change depending on the source,
ownership, usage, and retirement of data.
• Data processes – Think business processes that use data analysis. Process
optimization is based on an understanding of data usage within the
business process.
• Data quality – Think verification and validation of data, processing, and
visuals. Think of the veracity of data, which is an ongoing effort based on
continuous testing. Think also of the amount and type of data to be used
for this testing versus the data to be used for training the AI algorithms. Is
the business generating dark data? Is there dark data in the environment
of the collaborating partners?
• Data security – Think cybersecurity in terms of all aspects of data,
including and especially during communications via networks and
processing. User behavior forms the “soft” aspect of security.
• Data sources – Think traditional, meta, and alternative data sources.
Owned and leased data, and also data scraped from sites. Synthetic data is
generated internally to test and also “pad” datasets for analytics.
• Data statistics – Think clustering and classification of data and the use of
relevant statistical techniques to make sense of the data. Training and
testing data are essential for statistical analysis.
• Data tools and technologies – Think current and future tools and
technologies related to sourcing, storage, analytics, and application of data.
• Data usage – Think the lifecycle of data, from observation to retirement
and everything in between. Vast storage of data with no usage is a burden
to the business.
• Data value – Think end goals of customer satisfaction for a business. The
entire BO effort is directed toward customer value, and data should help a
business reach that.
• Data warehouse – Think storage of transactional and large historical data
and associated systems for retrieval and analytics. Think strategies for
mining historical data with descriptive analytics.

Data characteristics

Strategic thinking about data for business optimization includes
decision-making that combines analytics (explicit) and human (tacit)
thinking. The entire decision-making process is iterative and incremental,
resulting in business agility.5
Embedding data analytics in business processes makes them more efficient
and effective. Customer experience is also personalized as a result. Data is
not limited to structured sets. Low-volume data and unstructured data with
variety have the potential to provide valuable insights. NoSQL databases6
provide opportunities for analytics and influence business
decision-making even without high volume. ML handles not only struc-
tured data, but also semi- or unstructured data irrespective of its volume.
This data also has rich meta-data (e.g., parameters around a data point)
associated with it. This meta-data provides aggregated information for
business – such as potential demand for a product or service from a group
of customers. Continuously growing data holds the promise for continually
improving insights.
The data, its context, and meta-data form a “story.” These data stories
help understand customer behavior, spot market trends, and satisfy compli-
ance needs. Analytics enable a business to proactively respond to changing
external and internal situations.
Data – especially large volume, high velocity, and unstructured data – can
appear to be chaotic. Data strategies find a time-rhythm and a structure-pattern
in that data. These rhythms and patterns are used to predict the interest rates
tomorrow, the weather next week, and the best airfare a month in advance.
Data science identifies trends and patterns through exploration, matching,
referencing, correlating, and extending data and meta-data.
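A minimal sketch of finding a “time-rhythm” in data is a naive moving-average forecast; the airfare figures are invented, and the chapter does not prescribe any particular forecasting method:

```python
def moving_average_forecast(series, window=3):
    """Naive next-value forecast: the average of the last `window` points."""
    recent = series[-window:]
    return sum(recent) / len(recent)

# Invented daily airfare quotes; the recent trend is the "rhythm" used here.
fares = [310, 305, 320, 315, 325, 330]
tomorrow = moving_average_forecast(fares)
print(f"forecast for tomorrow: {tomorrow:.2f}")
```

Real predictive analytics would model seasonality and context as well; the point here is only that a structure-pattern in past values is what makes any forecast possible.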
Raw data evolves and transforms into information, analytics, and knowl-
edge. Knowledge drives decisions related to products and services.
Classifications and categorization are the starting point for the applica-
tion of statistical techniques to data. ML creates and uses models based on
these techniques to generate insights.

Data is extracted from the repository of raw data for training and testing
purposes. Features are defined in the training data. Accurate models of data
and processes are used for further training and testing of models.
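The extraction of training and testing subsets from a raw-data repository can be sketched as follows; the 80/20 fraction, the seed, and the record format are illustrative assumptions, not the book's prescription:

```python
import random

def train_test_split(records, test_fraction=0.2, seed=7):
    """Split a raw-data repository into training and testing subsets."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is reproducible
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

raw = list(range(100))        # stand-in for extracted raw records
train, test = train_test_split(raw)
print(len(train), len(test))  # 80 20
```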

Data as enabler of optimization


Data, through the range of Think Data, is an enabler of process optimi-
zation. Detection, predictions, and prescriptions are embedded in the busi-
ness functions. Identifying the right data type and having quick access are
important for integration within the decision-making process. Testing the
veracity of the dataset is required on a continuous basis. An integrated
analytics platform promotes idea-sharing and generates greater efficiency.
Combining this with traditional data can lead to differentiated insights.
Following are some examples of how Think Data enables optimization:

• Location and density data of mobile devices predict economic outlook,
as they indicate increased human activity. This prediction enables the
organization to develop strategies for newer products and services.
• Analyzing prices of millions of online products enables an under-
standing of shifts in global demand and, therefore, in pricing and
marketing.
• Identifying changes to purchasing habits of customers to cross-sell
and up-sell new products and services.
• Correlating credit card transactions, geolocation, and app downloads
provides insights into how the business is viewed by the customers and
which other businesses it should collaborate with.
• Correlating search engine data with social media data enables predict-
ing earnings.
• Twitter and other social media feeds, together with sentiment analytics,
enable the capability to direct resources.

DATA TO DECISIONS PYRAMID

Data evolves through five layers to become decisions. As shown in Figure 2.2,
the evolution of data to decisions is a systematic, incremental process that
impacts and is impacted by artificial intelligence (AI). This evolution requires
a detailed understanding of handset, dataset, toolset, and mindset. Figure 2.2
starts with observations which, when recorded, become data. Observations
are the start of data collection. Data goes through a process to become infor-
mation and analytics in order to provide knowledge and insights. Analytics
achieve this by correlating wide-ranging and dispersed suites of data. This
is made possible due to the shareability of Big Data on the Cloud. Analytics
support predictions and prescriptive advice as the data evolution continues.

Figure 2.2 The pyramid of evolution of data to decisions and the AI impacts.
[Figure: a pyramid rising from Reality and Observation at the base; filters
turn observations into Data (creation and manipulation of records, with
noise); processing interprets data into Information (systems and
processing); Analytics & Services (collaborations) associate information;
Knowledge & Insights are presented; and the apex suggests Decisions,
whose consequences lead to action.]

Below is a brief description of the five layers of the data-to-decisions
pyramid of Figure 2.2.

Layer 1: Data is a record of observations


Data is a suite of observations consolidated and organized in an objective
manner. This layer represents the creation and manipulation of records.
Storage can vary vastly, from the neatly organized rows and columns of a
massive columnar database through to complex multimedia data warehouses
containing audio, video, photos, and charts. Data is objective, storable,
shareable, and
subject to varied analysis. The quality of data is enhanced through filters. Data
is quantitative and qualitative in nature. Big Data is characterized by high vol-
ume, velocity, and variety. Data contains noise that has to be reduced.
Sharing of data across organizations eliminates repetitive and redun-
dant data. For example, customer demographic data, such as name and
address, is usually stored by the telephone company. Therefore, this data
need not be stored by the bank. Instead, this data is collaboratively avail-
able to the bank from the telephone company under “contracts.” Sharing
data through well-connected, reliable, and trustworthy partners is the basic
form of collaboration among organizations. Data sharing is usually over
the Cloud in order to facilitate collaborations.

Layer 2: Information makes data understandable


Layer 2 represents the processing of data to create information. Information
is the systematic use of data in business processes. Data, on its own, is
not meaningful, whereas information based on the data provides meaning.
Big Data is both a challenge and an opportunity in the quest for BO. For
example, bank data is processed to generate information on demographic
behavior patterns, such as spending styles, income groups, and geographi-
cal nuances of the customer. Creating information and understanding the
decision-making process are important Layer 2 activities.
Processing vast and complete datasets reveals hidden semantics. A process is
the set of activities and steps undertaken in businesses, like opening an account
in a bank or withdrawing cash from an ATM. While minor variations in
each of these processes are accepted, the fundamental process remains the
same. By creating a basic process model for opening an account, it is possi-
ble to embed analytics in it. Rules and regulations can be commonly applied
to these information system processes. Furthermore, agencies specializing
in fraud detection can inspect the information for suspicious transactions.
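The bank example above, where raw transaction data is aggregated into spending-style information, can be sketched like this (the customer ids, amounts, and the spending threshold are all invented for illustration):

```python
from collections import defaultdict

# Invented raw bank records: (customer_id, amount). On their own these rows
# are just data; aggregating them yields information, i.e., a spending style.
transactions = [
    ("c1", 20.0), ("c1", 15.0), ("c2", 400.0), ("c2", 380.0), ("c1", 25.0),
]

totals = defaultdict(float)
for customer, amount in transactions:
    totals[customer] += amount

HIGH_SPEND = 300.0  # illustrative threshold, not a figure from the chapter
styles = {c: ("high spender" if t > HIGH_SPEND else "everyday spender")
          for c, t in totals.items()}
print(styles)
```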

Layer 3: Analytics and services (collaborations)


Analytics establish correlations between data. Analytics identify patterns
and trends within those data. The more the data, the better the output.
Also, data analytics is not limited to analytics on a singular type of data.
Data is sourced with the help of services from multiple and widely varied
databases (typically on the Cloud) and a relationship established between
them in order to perform data analytics. Analytics itself is offered as a
Service8 on the Cloud. Electronically established collaborations among other
partnering businesses enable offering of a wide spectrum of services to the
clients of the organization. Therefore, analytics and services together ensure
data and information have evolved to a highly sophisticated level of usage.
Personalization on a highly scalable basis results from collaborative services.
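Sourcing data from multiple services and establishing a relationship between them before analytics run can be sketched as a simple key-based join; the CRM and partner records below are invented stand-ins for cloud-sourced datasets:

```python
# Invented records from two separately sourced "services": the organization's
# own CRM and a leased partner dataset, both keyed by customer id.
crm = {"c1": {"segment": "retail"}, "c2": {"segment": "corporate"}}
partner = {"c1": {"avg_monthly_spend": 60.0}, "c2": {"avg_monthly_spend": 780.0}}

# Establish the relationship between the sources before analytics runs:
# a key-based join produces one combined view per customer.
joined = {cid: {**crm[cid], **partner.get(cid, {})} for cid in crm}
print(joined["c2"])
```

In practice the shared key itself (a customer id that both parties recognize) is what the collaboration contract has to provide.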

Layer 4: Knowledge and insights


Knowledge can be understood as rationalization and correlation of informa-
tion through reflection, learning, and logical reasoning. The knowledge dis-
covery processes are categorized under six distinct groups (Helmy, Arntzen
Bechina, and Siqveland 2018)9,10: Formulating the Domain Application,
Data Acquisition, Data Preparation, Machine Learning (ML), Evaluation
and Knowledge Discovery, and Deployment. The focus of this 4th layer in
the pyramid of Figure 2.2 is to correlate information and use analytics to
create new knowledge.
Information can lie in silos, mainly dictated by the original classification
of data upon which it is based. Converting this information to knowledge
is a multilayered process. Analytics combined with information produces
knowledge. Big Data provides a greater opportunity to correlate between
otherwise separate islands of information within and outside an organiza-
tion as compared to normal data. This is so because almost all Big Data is
stored in the cloud.
This layer also shares knowledge about an individual or a group of cus-
tomers or users across multiple organizations. For example, correlation
between the information about a customer (person) available to one orga-
nization and other bits of information (such as geographical location or
spending habits) available to another organization can be established.
These wide-ranging correlations produce new and unique knowledge about
that customer that is not possible with analytics in a single organization.
Knowledge can extend the predictability of behavior beyond one customer
to an entire customer group.

Layer 5: Decisions
Decisions are made up of explicit inferences based on insights and exten-
sive correlations among widely dispersed knowledge and tacit mindset of
decision-makers. While data, information, analytics, and (to a large extent)
knowledge are considered objective, decisions are a human subjective trait.
Decision-making uses tacit human factors such as personal experience,
value system, time and location of decision-making, sociocultural environ-
ment, and ability to make estimates and take risks. The implicit form of
knowledge of the user is often cognitive “intuition” about how the input,
the process, and the output are related.
AI suggests actionable knowledge in order to support decisions. Strategic
use of Big Data and AI leads to insightful decisions. Analytical outputs are
combined with the individual decision-maker’s ability to consistently dis-
tinguish the importance, relevance, context, and organizational principles
in decisions. The decision-maker comprehends and balances AI with NI
for decision-making. Decision-making is an agile, iterative process with the
consequences of the decisions providing fresh data as input for the next
decision.
Intelligence is not something that can be organized and placed in data-
bases and transferred to others through training. Decisions lead to actions
that are the fully mature implementation of the data-to-decisions pyramid aiming
at the value goal. Enhancing the customer experience in the most effective
and efficient way is the purpose of data to decisions. Customer groups can
also collaborate and support decisions to achieve higher value for them-
selves. Collaborations across organizations require an exchange of data and
information across highly porous electronic boundaries.

BIG DATA TYPES AND THEIR CHARACTERISTICS FOR ANALYTICS

The 3+1+1 (5) Vs of Big Data


BO uses Big Data in its analytics. Understanding the characteristics and
nature of Big Data improves the opportunity of its usage in decisions. These
characteristics are summarized in Figure 2.3. Big Data is initially character-
ized by high volumes. Velocity and variety emerged as important additional
characteristics of Big Data. Processing capabilities and storage are required
to be enhanced as Big Data moves beyond the confines of a traditional data
warehouse. Veracity relates to the quality of data. Business optimization
approaches the fifth characteristic of “V”alue in a strategic manner.11 BO
comprehends the new possibilities for business with these characteristics.
Value in this mix of characteristics is the most important yet hidden char-
acteristic of data. The context of data and the concept of Agility in business
are important contributors in extracting the value that is hidden within the
data. As new solutions continue to emerge, users need to quickly under-
stand the implications of the new forms of decision-making.
Figure 2.3 Detailed characteristics of Big Data’s 3 + 1 + 1 Vs and the types and categories
of data. [Figure: Volume, Velocity, and Variety (text, audio, video, sensor,
graphics) feed Technologies and Analytics; Veracity (quality, accuracy,
context) and Value complete the five Vs.]

Large, historical static datasets are captured, communicated, aggregated,
stored, and analyzed. But data isn’t static.12 Instead, it is a combination of
both static (large-volume) data and dynamic (high-velocity) data, such
as that streamed from IoT devices.
Big Data is defined as high-volume, high-velocity, and/or high-variety
information assets that require new forms of processing to enable enhanced
decision-making, insight discovery, and process optimization.13 Large bod-
ies of data also have encrypted patterns that represent knowledge14 pre-
served from the past and providing invaluable hints, tips, and even concrete
solutions to challenges being experienced in the present.
While volume, velocity, and variety are inherent to Big Data, veracity and
value require a business-oriented and strategic approach to handling data.
Data, often associated with text and numbers, can take the shape of an
audio or video file in the Big Data space. Big Data deals with large volumes
(e.g., ticker of a share market) and a variety of sources (e.g., from a website, a
blog, or an IoT device). Text itself can be both structured and unstructured.
For example, a form filled out online will have a structure associated with
the fields. When such forms run into millions and the data entered within
those forms needs to be analyzed, the structured data starts moving into the
realms of Big Data. On the other hand, lesser volume but highly unstruc-
tured data (e.g., descriptive customer feedback or a blog) is also in the realm
of Big Data. Data velocity is further related to clickstream, video streaming,
and machine sensors. Audio and video are unstructured data requiring
translation into a structure before they can be analyzed. Such conversion of data to
a structure for analysis is an iterative process.
Understanding data usage provides the basis for developing data-driven
strategies for BO. The basic data usage pattern starts with observations. Once
observations become data, that data goes through generate, record, store,
secure, clean, retrieve, share, and use processes. Finally, good data usage
is also concerned with the appropriate retirement of data. Four distinct yet
interrelated aspects of the data usage pattern are summarized in Figure 2.4:

• Source and store data – This is the entry of data into the organiza-
tion (direct, mobile, audio, text, etc.; volume, cloud, scalable, secured).
Data analytic strategies are concerned with the sourcing of current and
future data, its cost, and how current, relevant, and reliable the source
and storage facilities are.
• Mine data – (analytics, patterns, collaborations, costs associated with
data and mining; quality – cleansing). Data analytic strategies con-
sider the relevance and the overall costs of mining data. Tools and
technologies for data mining are considered here.
• Utilize data – (decision-making; granularity; feedback mechanism).
Data analytic strategies include business process re-engineering in
order to embed analytics in the decision-making process. The level of
granularity is considered.

• Retire data – (secured deletions, future use). Data analytic strategies
deal with data that has already served its purpose. The usage of
data requires an equally careful approach to its retirement and even-
tual deletion.

Figure 2.4 Data analytics usage pattern. [Figure: the usage pipeline
Observe → Generate → Record → Store → Secure → Clean → Retrieve →
Share → Use (Analyze) → Retire, surrounded by its four aspects: Source &
Store Data (entry: direct, mobile, audio, text; volume, cloud, scalable,
secured), Mine Data (analytics, patterns, collaborations, costs, quality),
Utilize Data (decision-making, granularity, feedback), and Retire Data
(secured deletions, future use).]

SOURCING OF DATA

Think Data and the data analytic usage pattern mention the sourcing of
data. Substantial data is generated by machine sensors and IoT15 devices as
a source. The volume and velocity of such data are very high because it is
generated without human intervention. The number of active IoT devices was
estimated to be 21 billion by 2020,16 generating 40% of all data. These data
through channels of interconnected sensors and devices are now available
in massive quantities and in both structured and unstructured formats. The
addition to this data is on a continuous basis with high velocity. Third parties
and governmental bodies provide yet another source of data. Users generate
and share content typically on the Cloud. Data analytics on the Cloud pro-
vides real-time insights that enable the optimization of processes. The end
result is a vast collection of data from multiple sources, growing at a high rate
and comprising varieties that have a potential value hidden in them.
Enterprise applications (CRM, ERP, SCM, and others) store this data.
The volume, velocity, and variety of this data have a bearing on its analytics.

Figure 2.5 Big Data categories based on sources. [Figure: a
structured-to-unstructured spectrum across Fundamental Data (basic,
mostly linear; e.g., security master information such as issue name, issue
symbol, issue type, issue class, listing market tier, total shares outstanding
(TSO), public float), Macro Data (big picture, non-personalized,
aggregated; e.g., interest rates, political views and policies, exchange
rates), and Alternative Data (third source, typically unstructured; e.g.,
news feeds, social media, online communities, blogs, IoT sensors,
satellite imagery).]

Timely and accurate analytics are required in order to help optimize busi-
ness. Analytics impact business decisions.
Figure 2.5 shows the additional categorization of Big Data in fundamen-
tal, macro, and alternative data.

• Fundamental – Basic data generated by the organization, mostly linear
(e.g., master data on people, company data, share issue and type,
listing exchanges)
• Macro – Big picture, non-personalized, aggregated (e.g., interest rates,
political views and policies, exchange rates). Macro data is based
on aggregated, fundamental data but it does not identify individual
records.
• Alternative – Third source, typically unstructured (e.g., news feeds,
social media, online communities, blogs, IoT sensors, satellite imagery)

Alternative data
Alternative data is not organized, transactional data. Alternative
data is derived from third parties: for example, social media feeds, blogs,
forums, and crowd-sourcing platforms. IoT devices are major sources of
alternative data. Alternative data is a subset of Big Data, often unstructured
and unidentifiable. Alternative data is sourced from outside the boundar-
ies and controls of the organization and yet it references the organization.
For example, a trending news item about the organization and the ensu-
ing discussions by customers and other observers on the news item can be
analyzed to find trends and patterns. The confidence levels of predictions
improve with alternative data. Alternative data can provide validation for
the main data and improve the granularity of predictions.
44 Artificial Intelligence for Business Optimization

Analyzing and using this alternative data require AI technologies and corresponding business capabilities. Identifying the source of alternative data
requires an understanding of the nature of the organization or community
providing that data, understanding security and privacy issues associated with
the source, and the relevance of that data to the problem at hand. For example,
analyzing an insurance quote can make use of alternative data about the prop-
erty or vehicle being insured by sourcing the user community associated with
that item – but the analyst has to be mindful of which Twitter or Facebook
community they are sourcing the data from, its reliability and security, and
how well it has served the purpose of improving the accuracy of the insur-
ance quote. Furthermore, the mindset of people in the organization who will
utilize the data also requires due consideration because this niche data and its
analytics can change the way in which an agent provides an insurance quote.
Sourcing, analyzing, and integrating this alternative data in decision-making
require more than AI tools. Business domain expertise has to couple with the technical expertise and mindset of data science professionals.
While structured, transactional data is analyzed quantitatively, alterna-
tive data needs more creative approaches for sourcing and analytics. These
approaches include analyzing the free-formatted blogs, opinions, likes and
dislikes, and the speed with which a particular community is expressing its
opinions. For example, if a car owner community posts a certain number
of “likes” and favorable opinions for a new brand of car model within, say,
1 week, then it has a higher density of positive opinion than the same posi-
tive expressions over 4 weeks.
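The density comparison described above can be sketched in a few lines of Python. The post counts and window lengths here are purely hypothetical, used only to illustrate the calculation.

```python
def opinion_density(positive_posts: int, window_days: int) -> float:
    """Positive opinions per day over an observation window."""
    return positive_posts / window_days

# The same number of favorable posts carries a higher density of
# positive opinion over 1 week than over 4 weeks.
one_week = opinion_density(280, 7)     # 40 positive posts per day
four_weeks = opinion_density(280, 28)  # 10 positive posts per day
```

A business might treat the higher-density signal as a stronger, more time-sensitive indicator of community sentiment.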
Alternative data potentially offers an advantage over only using tradi-
tional data. Opportunity to derive value from previously ignored and/or
emerging datasets, often termed the “exhaust” of other businesses or communities, is the new edge in data analytics.
For example, COVID-19 studies attempt to identify clusters based on travel histories, individual movements, and tweets.17 Other studies source alternative data such as satellite images of industries and companies to assess environmental impact.
Alternative data means an attempt to know anything around the business
based on what “others” think of the business. For example, if a public health authority wants to know the path of a pandemic, it will source exotic data such as parking lot occupancy and cell phone usage.
Alternative data provides collective intelligence resulting in rewards and
risks. Risk arises from the source of the data, which often remains unverified. These risks are nevertheless worth taking in time-sensitive business processes.
For example, in financial trading, even a narrow timing advantage provides
a trading edge. Alternative data provides that edge. Alternative data provides
potential information advantages in investment management decisions.
Alternative data is usually incomplete and not verifiable. It is unstructured and difficult to integrate with existing organizational, transactional data. Privacy and security of this data are also very difficult to establish.

Regulation also plays a role in sourcing and use of alternative data. For
example, with GDPR regulation, it may be necessary for an organization to
explain and make visible the alternative data they have collected from com-
munity forums and which they are using to identify trends in a product or
service. In other regions, where GDPR does not apply, the need to disclose
the alternative data source and usage is not required.

Data security and storage


Security and privacy issues for large volumes of data increase with their increas-
ing use in analytics. This challenge of security increases further as the velocity
of the data being generated is also very high. Cloud-based storage reduces the pressure to plan for data storage infrastructure, but it raises challenges of data ownership and security. Additionally, in many cases due to
the velocity of data, real-time backups may not be possible. Data redundancy
and backup of data are other important security and storage factors.

DATA ANALYTICS IN BUSINESS


PROCESS OPTIMIZATION

Analytics themselves are not new. From the abacus to sophisticated Excel macros, analytics have long provided valuable insights for users.
Analytics are a pointer to (a) what action is to be taken and (b) the results of
that action.

Data analytics
Analytics provide a data-centric approach to business decision-making.
Analytics have multiple purposes. Analytics explore the past by describing what happened based on static historical data. Analytics
also look forward by presenting what is likely to happen in the future – a
prediction. Predictive analytics need to incorporate dynamic data inputs
and use them to create multiple “what-if scenarios.”
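A predictive “what-if” scenario can be as simple as running one model under different assumed inputs. The sketch below uses an invented demand forecast with made-up growth rates, purely to illustrate the idea.

```python
def forecast(base_demand: float, monthly_growth: float, months: int) -> float:
    """Project demand forward under an assumed monthly growth rate."""
    demand = base_demand
    for _ in range(months):
        demand *= (1 + monthly_growth)
    return round(demand, 2)

# Three what-if scenarios generated from the same dynamic input:
scenarios = {
    "pessimistic": forecast(1000, -0.02, 6),
    "baseline": forecast(1000, 0.01, 6),
    "optimistic": forecast(1000, 0.05, 6),
}
```

Comparing the scenarios side by side lets decision-makers see the spread of possible futures rather than a single point estimate.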
Data analytics is at the heart of business optimization. The starting point
of a strategic approach to using analytics in BO is the business problem or
desired outcome by using those analytics. Examining the factors contribut-
ing to value creation in business is more important than the details of ana-
lytics and technologies. Once the business outcomes are understood, data
analytics start examining the diverse data sources and types available to the
organization for analytical purposes.
Data analytics generate insights that are translated into actionable
knowledge. Data analytics includes clustering, segregation, segmentation,
and analysis of the customer data in order to understand customer behav-
ior, potential sentiments, and referrals. This analysis results in actions.
Enabling those actions to happen requires changes to business processes

and people training. For data analytics to succeed, a clear roadmap for its
application is essential.18

Business process optimization


Optimizing business processes means making efficient and effective decisions.
Organizational functions generate data and use data analytics. Project
management, enterprise architecture, process modeling, solutions develop-
ment, and quality assurance are a number of organization-level activities
required for successful optimization. These activities complement data sci-
ence. Categorizing data, finding the correlation, undertaking analytics, and
presenting the insights (in an easy-to-use way for the end-user) result in
improved decisions. This improvement in decisions is measured in terms of
speed, accuracy, and points where decisions are made.
The following activities are undertaken for business optimization:

• Specify, at a high or strategic level, the desired business outcomes from


the optimization effort. This is a leadership activity that sets the business
vision for AI in close collaboration with the senior leadership team.
• Identify current capabilities and technology maturity of the organization. This requires a survey of the data inventory.
• Identify current business processes and how these processes will
change (reengineered) with AI.
• Start establishing the context in which data will be used.
• Also, identify gaps in the current business processes and the ones that
can be data-driven. This is achieved through process modeling.
• Understand the risks of AI on business processes, including the change
to organizational decision-making and structure.
• Decentralize decision-making in the business processes requiring
mindset change.
• Document operational and performance parameters used in measur-
ing user/customer satisfaction.
• Create process optimization models with the end-user in mind.
• Categorize current structured data both within and outside an orga-
nization and how it is impacting current decision-making. This infor-
mation is helpful in bringing together the structured and unstructured
data to provide a holistic, 360-degree view of the customer/user.
• Develop an approach to integrate semi-structured and unstructured data with existing structured data, enhancing analytical insights while reducing the risks associated with such integrations.
• Develop a strategic approach to the use of alternative data. Include alternative data and its analytics in decisions.
• Explore the correlations of business functions in an organization
(e.g., between revenues and marketing, products and customer

satisfaction, skills and project success). Data analytics uncovers such correlations by relating information from otherwise disparate data silos.
• Work with the limitations of existing data-related skill levels across
the organization and help formulate an upskilling approach. This is
aimed at enhancing the capabilities of the current staff by either pro-
viding training and experience, or supporting them with external con-
sulting resources in the initial stages of adoption.

Establishing the data context


Developing the context and using context awareness are vital in Big Data
analytics. For example, each IoT device (e.g., a wristwatch, smoke detector, shoe, or home appliance) is a data point with many additional data points embedded within it. These devices can send, receive, and process data
in collaboration with other devices and the back-end Cloud in real time.
Embedding a sensor, sending signals, and receiving data points over the
Internet is a start. This start is followed by context awareness of data from
multiple sensors coming from varied sources using advanced algorithms,
in real time, to develop a 360-degree, holistic view of the data point for
enhanced and Agile decision-making.
Typically, although not always, the context can be ascertained by who,
why, what, when, where, and how. A simple IoT device, with limited func-
tionality, may only need to answer one or two W’s, while a more complex
device generating complex sensor data may need to answer all questions.
Table 2.2 shows an example of contextual reference for a “patient” data
point when used in data analytics.

Tools and techniques for BO


Big Data technologies (built on Hadoop19) ingest and store data in different
formats including unstructured data such as customer feedback (via emails,
blogs, and forums).
Big Data based on Hadoop (HDFS) is primarily static, batch-oriented,
and analyzable. The architecture of Hadoop had not incorporated the
velocity of data and its real-time processing. Therefore, additional tech-
nologies are required to handle the high-velocity data. This is so because
irrespective of the incoming format and velocity, eventually all data points
need a semblance of structure in order to be analyzable.
Additional data manipulation tools and techniques are required to inter-
face incoming transactional as well as unstructured data with the large, static
enterprise data warehouses. Ease of configurability and ease of use of these
tools and techniques play an important part in value generation from analytics.

Table 2.2 Parameters of a data point (patient) in ascertaining business outcome

• Who (Stakeholder) – Patient in a hospital around whom the entire context is based.
• Why (Goal) – The patient stakeholder has the goal of getting well, and that provides input into the context of storing and analyzing this data point.
• What (Tools and technology) – Hospital bed, ventilator, monitors, and associated tools provide the technologies for the patient data point. These tools and technologies provide input into the speed with which the “Why,” or the goal, is achieved and the costs.
• When (Timing) – 1 AM provides input into the context because it changes the way in which the patient is transported and the speed with which the treatment is made available. A macro-level context is whether the patient is coming when the hospitals are full due to a pandemic.
• Where (Location) – Specific location of the hospital, the bed, and whether the patient is taken straight into the ICU because of an emergency.
• How (Process) – Admitting the patient and providing the treatment to enable the patient to achieve the goals.
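The six W's of Table 2.2 can be captured as a simple context record. The sketch below uses invented field values; it also shows how a simple IoT device may answer only a subset of the W's, as the text notes.

```python
def make_context(**ws):
    """Bundle whichever of the six W's are known for a data point."""
    allowed = {"who", "why", "what", "when", "where", "how"}
    return {w: v for w, v in ws.items() if w in allowed}

patient = make_context(
    who="patient", why="getting well",
    what=["bed", "ventilator", "monitors"],
    when="1 AM, during a pandemic surge",
    where="ICU", how="emergency admission")

# A simple IoT device may only need to answer one or two W's:
smoke_detector = make_context(what="smoke level", where="ward 3")
unanswered = sorted(set(patient) - set(smoke_detector))
```

Recording which W's remain unanswered makes explicit how much context is still missing before the data point can drive a decision.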

Analytical tools and techniques help in addressing these challenges – particularly when handling unstructured data. For example, analytical techniques
map and index a suite of unstructured data to identify a pattern in the data.
Statistical analytical techniques (e.g., linear regression) are complemented by
software and programming tools (e.g., “R”20 or Python21) and solutions pack-
ages (e.g., SAS22). NoSQL is the domain of aggregate-oriented databases23 that
provides increasingly sophisticated mechanisms to store and analyze unstruc-
tured data. Tools for data security analytics are discussed in Chapter 9.
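Mapping and indexing unstructured text to find a pattern can be illustrated with a minimal inverted index in pure Python. The feedback snippets below are invented for the example.

```python
def build_index(documents: dict) -> dict:
    """Map each word to the set of documents containing it (inverted index)."""
    index = {}
    for doc_id, text in documents.items():
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(doc_id)
    return index

feedback = {
    "email-1": "delivery was late and support was slow",
    "blog-7": "great product but late delivery",
}
index = build_index(feedback)
# A pattern emerges: both pieces of feedback mention late delivery.
late_delivery = index["late"] & index["delivery"]
```

Production systems would use a search engine or NoSQL store for this, but the principle – indexing free text so patterns become queryable – is the same.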

Data analytics design for BO


Designing data analytics is a balancing act – provisioning enough to let
the users configure the analytical solutions they want but at the same time
ensuring enough integration and control in the background to ensure secu-
rity, accuracy, and ease of use. The Agile concept of iteration and increment
is invaluable in developing such balanced, configurable self-serve analytical
solutions. While this personalizes the experience for the user, it also frees up
valuable organizational resources and enables lean business processes.
BO is reliant on the richness of analytics, the numerous “what-if” scenar-
ios, and their timeliness. The volume, variety, and velocity with structured,

unstructured, and semistructured data types are studied together in data


analytics. The complexity of such data is acknowledged as a challenge and
also as a promise for the insights it can produce. For example, data analytics
goes beyond sampling and discrete categorizing into the realm of analyzing full datasets. This is a challenge as well as an opportunity to produce
very precise results.

Granularity of analytics in BO
Another important concept that helps business agility is the levels of granu-
larity in undertaking analytics. Granularity of data, granularity of ana-
lytics, its context, and the processes embedding the data are crucial in
establishing data strategies for Agile business.
Big Data analytics takes analytics to a very fine degree of granularity.
This finer granularity is enabled by algorithms turned into code. The execu-
tion of this code on the high-volume and velocity data is enabled through
the technologies of Hadoop and NoSQL. Finer granularity of analytics is
the capability that differentiates Big Data analytics from the traditional
analysis of data.24 The availability and accumulation of data combined
with the availability of computing power enables drilling down through
that data with pinpoint accuracy. Ascertaining the optimum level of this
granularity of data analytics is a strategic business decision discussed later
in this chapter.
Velocity coupled with volume requires strategies for handling data that
also start with the desired business outcomes. For example, the finer the granularity of the analytics, the greater their confidence level – although more resources are required from the organization. This, in turn, can increase
the cost of analytics. Coarse (or lesser) granularity means less precision in
the results. Granularity and resources are thus a continuously balancing,
“agile” act. Desired outcomes enable the establishment of the right levels
of granularity.
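The granularity trade-off can be made concrete with a simple roll-up sketch. The sales records below are invented; the point is that the same data answers coarser or finer questions at different resource costs.

```python
from collections import defaultdict

# Hypothetical sales records: (region, month, amount).
SALES = [("north", "2024-01", 120), ("north", "2024-02", 90),
         ("south", "2024-01", 200), ("south", "2024-02", 160)]

def aggregate(records, key_fn):
    """Roll records up to the level of granularity chosen by key_fn."""
    totals = defaultdict(int)
    for region, month, amount in records:
        totals[key_fn(region, month)] += amount
    return dict(totals)

coarse = aggregate(SALES, lambda r, m: r)     # per region: cheaper, less precise
fine = aggregate(SALES, lambda r, m: (r, m))  # per region per month: more precise
```

Choosing between `coarse` and `fine` here mirrors the strategic decision the text describes: finer keys multiply the number of results (and the processing required) in exchange for precision.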

User experience analysis and BO


Another important aspect of data analytics is understanding customer
(user) sentiments through “user experience analysis.” Data generated by
social-mobile interactions before the user is in direct contact with the busi-
ness is explored as part of user experience analysis. This exploration enables
a business to come up with business strategies that aim to understand the
customer before and after the contact period of that customer with the busi-
ness. This data can be analyzed in order to understand the customer expecta-
tions and the user behavior and set the business response accordingly. User
experience analysis depends heavily on social media and mobile technologies.

Analytics provide for growth and innovation in services and enable pro-
cess optimization. A strategic approach is a must for data analytics to be
part of the organization culture. Precision and speed in analytics and inter-
active data exploration need to be supported by upskilling the users and
providers of the analytics.

Self-serve analytics
Self-serve analytics (SSA) is a business strategy focused on letting users
(e.g., staff and customers) decide what they want from the DatAnalytics
(data-cum-analytics), how they want it, and then help them achieve those
insights. This not only provides the user with what she wants, when she
wants it, and how it is delivered; but it also reduces the onus of analytics on
the business.
Given the velocity of data, the rapidly changing context of the user (e.g.,
the urgency of the results, the format in which they are desired, and the
device on which they will be presented), and the ever-increasing availability
of data warehouses from “third parties,” it is in the best interest of an orga-
nization to provide patterns or prefabricated analytical tools which enable
users to self-serve themselves. This is called “Customer Intelligence.”25
Self-serve analytical capabilities require back-end data integration, dynamic
business process modeling, and customizable visuals. Tools are used to bring
data together from many different locations – internal and external – to
instantaneously answer self-service queries. These tools enable collabora-
tions among multiple systems, their interfaces, and open data sources from
external organizations (e.g. third-party sites providing data or that being
sourced from data providers).
Gartner26 describes the ease of use as the topmost priority for self-service
analytics. This is because the user is unlikely to have technical or analytical
skills. The user could be a banking or financial analyst and not necessarily
a statistician. Visual configuration of services needs to be presented to the
user. These users should be able to configure analytics to solve their imme-
diate needs.

DATA CLUSTERS AND SEGMENTATION

Clustering groups similar data together, often with the aim of segmenting it. Segmentation places data into groups based on similar characteristics. The two terms denote different approaches to analyzing data: ML algorithms identify clusters by finding relationships between different types of data, and the related data is then placed in segments. Clustering the data thus helps discover new segments.

Horizontal and vertical clustering


Clustering identifies groups of similar entities in a multivariate dataset. These
are business data such as customers, products, sales, marketing, and regions.
Clustering is undertaken through partitioning of data in a horizontal and/or
vertical manner. Good clusters exhibit internal cohesion but external loose coupling: minimal differences within each cluster but maximum differences across clusters are considered good analytics design.
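Internal cohesion and external loose coupling can be quantified with simple distance measures. The one-dimensional values below are invented purely to illustrate the design criterion.

```python
def mean_pairwise_distance(points):
    """Average absolute distance between all pairs: a cohesion measure."""
    pairs = [(a, b) for i, a in enumerate(points) for b in points[i + 1:]]
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

def centroid(points):
    return sum(points) / len(points)

cluster_a = [1.0, 1.2, 0.9]   # internally cohesive: small within-cluster spread
cluster_b = [8.0, 8.3, 7.9]
within = mean_pairwise_distance(cluster_a)
between = abs(centroid(cluster_a) - centroid(cluster_b))
# Good design: minimal differences within, maximal differences across.
```

Real clustering libraries formalize the same intuition with metrics such as the silhouette coefficient.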

Segmentation
Data segmentation forms the basis for process optimization. Segmentation is
the separation of data into groups of similar elements. This is an important
first step in analytics. Age vs gender or customers vs prospects are examples
of segmentation. Segmentation allows businesses to analyze customers based
on those similarities. Segmenting enables the creation of a target for services
or products. For example, a data analyst helps new businesses improve their
decision-making capabilities by segmenting other similar businesses. Data
from these businesses is further divided for further analysis.
The process of segmentation continues in an iterative manner with
increasingly finer segments. Segmentation finds relationships within vari-
ables in order to predict customer behavior.
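Separating data into groups of similar elements, and then iteratively refining those groups, can be sketched as a small grouping function. The customer records here are invented.

```python
from collections import defaultdict

CUSTOMERS = [
    {"name": "Ana", "age": 23, "status": "customer"},
    {"name": "Ben", "age": 41, "status": "prospect"},
    {"name": "Cam", "age": 35, "status": "customer"},
    {"name": "Dee", "age": 19, "status": "prospect"},
]

def segment(records, key_fn):
    """Separate records into groups of similar elements."""
    groups = defaultdict(list)
    for record in records:
        groups[key_fn(record)].append(record["name"])
    return dict(groups)

by_status = segment(CUSTOMERS, lambda r: r["status"])  # customers vs prospects
# Iterative refinement into increasingly finer segments (status, then age band):
finer = segment(CUSTOMERS,
                lambda r: (r["status"], "under 30" if r["age"] < 30 else "30+"))
```

Each pass of `segment` with a finer key divides the data further, matching the iterative refinement the text describes.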

Clusters and segments in practice


Segments are based on relationships between datasets. However, there is a
limit to establishing these relationships. Clustering algorithms identify sim-
ilarities in data. The closer the data elements are to each other, the better
the segment. Distances between data items should be small within a segment but large between segments. Clustering is an example of unsupervised learning, wherein the model has no target variable: it groups the data based on multiple variables without the data being explicitly labeled.
Beyond merely grouping customers, clustering can identify what customers are likely to do.
ML utilizes clustering and classifications based on data features in prac-
tice. Data analytics drives and optimizes functions such as marketing, sales,
support, supply chains, operations, and HR. Users of products and services benefit from the personalization of services that analytics enables. Segmenting, clustering, and aggregating are ways of analyzing data; segmentation and clustering help prepare data for cluster analysis for business optimization. Here are some examples in practice:

• Market – Clustering potential customers who share an interest in a product.

• Sales – Clustering can tell what kind of people buy a specific product.
• Insurance – Clustering techniques are used to identify fraudulent
insurance claims.
• Education – Identifying groups of universities based on their tuition,
geographic location, quality of education, and type of degree
programs.
• Credit – Grouping customers based on their credit history.
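A minimal one-dimensional k-means sketch shows how similar values gravitate into clusters, as in the insurance and credit examples above. The premium figures are invented; production work would typically use a library such as scikit-learn rather than this hand-rolled version.

```python
def kmeans_1d(values, k, iterations=10):
    """Group similar 1-D values into k clusters (minimal k-means sketch)."""
    ordered = sorted(values)
    # Spread the initial centers across the sorted range.
    centers = [ordered[i * len(values) // k] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return clusters

# Annual premiums fall into two natural customer segments.
premiums = [300, 320, 310, 900, 950, 880]
segments = kmeans_1d(premiums, k=2)
```

The two resulting groups could then be treated as distinct segments for pricing or fraud screening.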

DATA-DRIVEN DECISIONS

Nature and types of decisions



Automation
Automation happens in the ML space wherein machines take over routine
processes in order to reduce time and effort for humans. Any process that
is primarily simple, straightforward, and deterministic in nature requiring

Figure 2.6 Known-unknown disruptions matrix for AI versus NI. The quadrants map what is known or unknown to the artificial system (AI) against what is known or unknown to the human (NI): known to both – Automation (hard, mono-dimensional data); known to the system but not the human – Prediction (fuzzy, multi-dimensional data); known to the human but not the system – Experience (soft, interdisciplinary); unknown to both – Intuition.



minimal adjustments during execution is ideal for automation. Automation


works best in handling a known or definable disruption that has occurred
in the past, and the decisions and actions taken at the time of the occurrence
are recorded within the AI system. For example, known network failures in
the technology space benefit by automation. Monitoring networks and net-
work communications is a likely beneficiary of automation because the pro-
cesses around network management are usually well defined.27 As argued
by Gonsalves and Unhelkar (2020), the primary purpose of automation
is to improve the performance of a system.28 This performance is usually
measured in terms of time and costs – both of which are improved through
automation. While this aforementioned improvement in performance may
be considered as “value,” it is not the same value as perceived by human
customers (See Chapter 10 on NI). Disruptions to businesses result not only
in the loss of time and money, but they also result in the loss of value as
perceived by the customer/user. Situations, such as COVID-19, which are
not amenable to prediction with simple and deterministic equations, are
unlikely to benefit from automated systems. Disruptions are complex; they
comprise technology, processes, and user disruptions that require complex
stakeholder management and recovery strategies. Disruptions cannot be
handled entirely by automation, which is limited to the measurable aspects
of a business system – primarily time and costs.

Prediction
Prediction occurs when the disruptions can be anticipated based on trends
in data. 29 AI-based predictions can be categorized as “known” to the system
but “unknown” to the human. This is so because the tools and technologies
of AI are well equipped to make sense of the vast amount of fast-moving
data. Leadership, in this category, needs to rely on the AI-based system to
sound the alarm when there is a possibility of disruption. Big Data analyt-
ics have been shown to predict network failures and security breaches based on
the analysis of large amounts of historical data.30 Since predictions improve
with data, the larger the dataset, the better the predictions. Economic dis-
ruptions, for example, are predicted with reasonable confidence based on
data trends. ML code embedded in AI models is, however, agnostic to the
specific situation, or context, of the user.
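Trend-based prediction can be illustrated with a least-squares line over historical counts. The daily error counts below are invented, standing in for the network-failure signals mentioned above.

```python
def linear_trend(series):
    """Least-squares slope and intercept for an evenly spaced series."""
    n = len(series)
    x_mean = (n - 1) / 2
    y_mean = sum(series) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(series))
    den = sum((x - x_mean) ** 2 for x in range(n))
    slope = num / den
    return slope, y_mean - slope * x_mean

def predict_next(series):
    """Extrapolate the fitted line one step beyond the observed data."""
    slope, intercept = linear_trend(series)
    return slope * len(series) + intercept

daily_errors = [2, 4, 6, 8]            # fault counts trending upward
expected = predict_next(daily_errors)  # lets the system sound the alarm early
```

As the text notes, predictions like this improve with more data, but the fitted model remains agnostic to the specific context of the user.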

Experience
This represents the human-based NI that comes into play in decision-making.
AI engines work only on available data which, in some situations, may not be
readily available. In other situations, the tools may not be able to identify cor-
responding trends within a reasonable time. When the situation is unknown
to the system, leaders provide the needed input based on their experience.

NI can undertake anticipation, decisions, and actions based on leadership


experience. For example, where systems fail to predict network failures or
security breaches, technology leaders anticipate these events based on their
experience. Resilience and recovery strategies are intertwined in the very
thought processes of experienced leaders. Thus, while due consideration is
given to the predictions provided by the automated ML modules, these pre-
dictions are superimposed with NI.31 Despite the cutting-edge deep learning
algorithms, human experience is not overridden, and decision-making is not
entirely handed over to AI. Furthermore, experience-based anticipation and
actions include consequences of the decisions in relation to quality and ethi-
cal ramifications. These aforementioned consequences and the human exper-
tise (NI) are important considerations in the “known to humans, unknown
to system” box in Figure 2.6. The learning algorithm combines its historical
analytics with the freshly incoming data in order to improve its predictability
of disruptions. The learning algorithm undergoes continuous improvements
through ongoing iterations and increments in an agile manner.

Intuition
This concept encompasses the space where disruptions neither have a past –
historical data to create a trend – nor do they appear in the experience
horizon of the leaders. In other words, these disruptions are unknown to
AI and unknown to humans – as shown in the right-bottom quadrant in
Figure 2.6. The current COVID-19 pandemic falls under this category of
unknown-unknown. The only possibility of anticipating such disruptions
is to explore the instinctive feelings within the leaders, together with imagi-
nary scenarios created in the AI engine.
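The four quadrants of Figure 2.6 can be expressed as a small decision helper; the mapping below simply encodes the discussion in this section.

```python
def decision_mode(known_to_system: bool, known_to_human: bool) -> str:
    """Map a disruption onto Figure 2.6's known-unknown matrix."""
    if known_to_system and known_to_human:
        return "automation"   # routine, deterministic processes
    if known_to_system:
        return "prediction"   # AI spots trends the human cannot
    if known_to_human:
        return "experience"   # leaders anticipate what the system cannot
    return "intuition"        # unknown-unknown, e.g., a novel pandemic
```

Such a helper could, for instance, route an incoming disruption report to an automated workflow, an ML prediction pipeline, or a human leader.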

DATA ANALYTICS FOR BUSINESS AGILITY

An optimized business is an agile business. Optimization is achieved by


continuously enhancing business processes with the help of data analytics.
Figure 2.7 summarizes how an optimizing business leverages data analytics
for business agility.

• Customer analytics enables the business to understand the changing


preferences of the customer. Personalized needs of the customer are
also better understood through this analysis.
• Product analytics provides insights into the costs, risks, and relevance
of the product to the users. Processes related to the development of the
product are optimized using this analysis.
• Marketing analytics enables the business to understand the market
gap, its size and location, and how to approach the promotion of its
products or services.

Figure 2.7 Leveraging analytics for business agility. Customer, product, marketing, sentiment, security and privacy, cross-selling, and maintenance analytics feed a central “analytics for agility” capability, which in turn drives customer segmentation and attrition, product acceptance on the user community, targeted marketing campaigns on multiple channels, social media discussions and blog posts, recording of search and transactional data, strategizing for cross/up-selling, pre-empting events, and optimizing processes.

• Sentiment analytics is an understanding of the customer sentiments


after the product or service has been consumed. It is a useful tool to
figure out if the customer is likely to support and promote the business
or work against it.
• Security and privacy analytics is the understanding of the confidence
level of customers and other users in the products and processes of the
business.
• Cross-selling analytics establishes the correlations between the vari-
ous products and services of the business as well as its partners and
competitors.
• Maintenance analytics is invaluable in a manufacturing setting in
particular as it provides descriptive as well as predictive insights into
the state of the equipment.

The analytics alluded to above impact the following business functions.

• Customer segmentation and attrition is optimized due to a bet-


ter understanding of the customers, their needs and priorities, and
changes to the customer groups as their priorities shift.
• Product acceptance on user community is an interesting process that deals
with the generation and use of alternative data. This is the acceptance (or

lack thereof) of the company’s product or service on the user community


platform.
• Targeted marketing campaigns on multichannels can be brought
about with the help of the analytics mentioned above.
• Social media discussions and blog posts can be influenced by active
participation of the organization in those discussions and alleviating
the issues faced by the customers.
• Recording search and transactional data in order to further update
the databases and fine tune the analytics.
• Strategizing for cross/up-selling the products and services is enhanced
and focused as a result of the use of analytical insights.
• Pre-empting events such as the launch of products or, in the worst case,
redressing large-scale grievances can be organized based on analytics.
• Optimizing the business processes of the organization by having a
better understanding of the slack within them and how they are pro-
viding value to the customer.

CONSOLIDATION WORKSHOP
