
ITISA1 Ch06 PowerPoint


Chapter 6

Business Intelligence: Big Data and Analytics

Stair/Reynolds, Principles of Information Systems, 14th Edition. © 2021 Cengage. All Rights Reserved. May not be scanned,
copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Objectives (1 of 3)

• Identify five key characteristics associated with big data
• Identify five key challenges associated with
big data
• Distinguish between the terms data
warehouse, data mart, and data lake
• Explain the purpose of each step in the
extract, transform, and load process
• State four ways a NoSQL database differs
from an SQL database
Objectives (2 of 3)

• Identify the two primary components of the Hadoop computing environment
• Identify the primary advantage of an in-memory database in processing big data
• State the primary difference between
business intelligence and analytics
• Define the role of a data scientist

Objectives (3 of 3)

• Identify three key organizational components that must be in place for an
organization to get real value from its BI/analytics efforts
• Identify five broad categories of business
intelligence/analytics techniques including
the specific techniques used in each
• Identify four potential issues that arise with
the use of self-service analytics

Why Learn about Big Data and
Analytics?
• New data coming from all directions
• Nearly a zettabyte per year
• 1 zettabyte = 1 trillion gigabytes, or a 1 followed by 21 zeros in bytes:
1 000 000 000 000 000 000 000
• Must analyze large amounts of data
• Measure past and current performance
• Predict the future
• Forecasts drive anticipatory actions
• Improve business strategies
• Strengthen business operations
• Enrich decision making
• Organization will become more competitive
Big Data (1 of 2)

• Big data
• Enormous (terabytes or more)
• Complex
• Traditional processes incapable of dealing with
them
• Key characteristics
• Volume
• Velocity
• Value
• Variety
• Veracity
Sources of Big Data

FIGURE 6.2 Sources of an organization’s useful data


An organization has many sources of useful data.

Big Data Uses

• Organizations use big data to improve:
• Day-to-day operations
• Planning
• Decision making

Technologies Used to Manage and
Process Big Data
• Technologies used to manage and process
big data
• Data warehouses
• Extract Transform Load process
• Data marts
• Data lakes
• NoSQL databases
• Hadoop
• In-Memory databases

Data Warehouses, Data Marts, and
Data Lakes (1 of 5)

• Online transaction processing (OLTP) systems
• Traditionally used to capture data
• Do not support data analysis required today
• Data warehouses and data marts
• Allow organizations to access OLTP data
• Support decision making more effectively

Data Warehouses, Data Marts, and
Data Lakes (2 of 5)

Characteristic | Description
Large | Holds billions of records and petabytes of data
Multiple sources | Data comes from many sources, both internal and external; thus an extract, transform, load process is required to ensure quality data
Historical | Typically 5 years of data or more
Cross-organizational access and analysis | Data accessed, used, and analyzed by users across the organization to support multiple business processes and decision making
Supports various types of analyses and reporting | Drill-down analysis, development of metrics, identification of trends

TABLE 6.3 Characteristics of a data warehouse


Data Warehouses, Data Marts, and
Data Lakes (3 of 5)
• Data warehouse
• Large database
• Holds business information from many sources
in the enterprise
• Covers all aspects of the company’s
processes, products, and customers
• Extract Transform Load (ETL) process
• Extracts data from a variety of sources
• Edits and transforms data into a data warehouse
format
• Loads data into the warehouse
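
The three ETL steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not a production pipeline: the sample CSV data, the `orders` table, and the function names `extract`, `transform`, and `load` are all assumptions made up for this example, and an in-memory SQLite database stands in for a real data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system (illustrative data only).
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,not_a_number
3,carol,7.25
"""

def extract(text):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: clean values and reject rows that fail validation."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # reject rows whose amount is not a number
        customer = row["customer"].strip().lower()
        clean.append((int(row["order_id"]), customer, amount))
    return clean

def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

# SQLite in memory stands in for the warehouse (an assumption of this sketch).
warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(RAW_CSV)))
loaded = warehouse.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
```

Here the row with an invalid amount is rejected during the transform step, so only validated, consistently formatted rows reach the warehouse table.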
Data Warehouses, Data Marts, and
Data Lakes (5 of 5)
• Data mart
• Subset of a data warehouse
• Used by small and medium-sized businesses
and departments within large companies
• Supports decision making
• Data lake
• Takes a “store everything” approach to big data
• Saves all data in its raw and unaltered form

NoSQL Databases (1 of 3)

• NoSQL database
• Differs from a relational database
• Data modeled without two-dimensional tabular
relations
• Uses horizontal scaling
• Does not require a predefined schema
• Does not conform to true ACID properties when
processing transactions
• Structures used by NoSQL databases
• More flexible than relational database tables
• Provide improved access speed and redundancy

NoSQL Databases (2 of 3)

• Four categories
• Key-value NoSQL databases
• Two columns (“key” and “value”)
• Document NoSQL databases
• Store, retrieve, and manage document-oriented
information
• Graph NoSQL databases
• Well-suited for analyzing interconnections
• Column NoSQL databases
• Store data in columns
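
To make the four categories more concrete, the sketch below models each one with plain Python data structures. It is a rough analogy only: real NoSQL products add distribution, persistence, and query languages, and every name and value here (`key_value`, `documents`, `follows`, `columns`, `followers_of`) is a hypothetical example rather than any product's API.

```python
# Key-value: each record is a key mapped to an opaque value.
key_value = {"session:42": "user=alice;cart=3"}

# Document: each record is a self-describing document; fields may differ
# from one document to the next because no predefined schema is required.
documents = [
    {"_id": 1, "name": "Alice", "emails": ["alice@example.com"]},
    {"_id": 2, "name": "Bob", "phone": "555-0100"},
]

# Graph: nodes plus edges, which suits questions about interconnections.
follows = {"alice": {"bob", "carol"}, "bob": {"carol"}, "carol": set()}

# Column: values are stored per column rather than per row.
columns = {"name": ["Alice", "Bob"], "age": [34, 29]}

def followers_of(graph, person):
    """Return everyone with an edge pointing at `person`, in sorted order."""
    return sorted(src for src, targets in graph.items() if person in targets)

# A column layout makes whole-column aggregates cheap to compute.
average_age = sum(columns["age"]) / len(columns["age"])
carol_followers = followers_of(follows, "carol")
```

The graph lookup and the column aggregate hint at why each structure is chosen: each is shaped around the kind of question it must answer quickly.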

Hadoop (1 of 3)

• Hadoop
• Open-source software framework
• Includes several software modules
• Stores and processes extremely large data
sets
• Hadoop Distributed File System (HDFS)
• Distributed file system
• Used for data storage
• Divides the data into subsets
• Distributes the subsets onto different servers for
processing
Hadoop (3 of 3)

• MapReduce program
• Consists of two components
• Map procedure performs filtering and sorting
• Reduce method performs a summary operation
• Hadoop limitation
• Can only perform batch processing
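
The map and reduce roles described above can be illustrated with a single-machine word count in Python. This is a teaching sketch that does not use Hadoop itself: the function names `map_phase`, `shuffle`, and `reduce_phase` and the two input splits are assumptions for this example, and in a real cluster the framework performs the grouping and sorting between the two phases across many servers.

```python
from collections import defaultdict

def map_phase(document):
    """Map: filter out non-alphabetic tokens and emit (word, 1) pairs."""
    return [(word.lower(), 1) for word in document.split() if word.isalpha()]

def shuffle(pairs):
    """Group emitted values by key and sort by key (done by the framework)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))

def reduce_phase(key, values):
    """Reduce: summarize all values for one key, here by summing them."""
    return key, sum(values)

# Two hypothetical input splits, as if stored on different servers.
splits = ["big data big value", "data lake"]
mapped = [pair for split in splits for pair in map_phase(split)]
word_counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

Because each split is mapped independently and each key is reduced independently, both phases can run in parallel, which is the property that lets the approach scale to very large data sets.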

In-Memory Databases (1 of 2)

• In-memory database (IMDB)
• Stores the entire database in random access memory (RAM)
• Faster access to data
• Access rates much faster than for data held on secondary storage
• Enables the analysis of big data and other
challenging data-processing applications
• Feasibility due to two factors
• Increase in RAM capacities
• Corresponding decrease in RAM costs
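
As a small illustration of the idea, SQLite (bundled with Python) can create a database that lives entirely in RAM when given the special name `:memory:`. This is a sketch under stated assumptions: SQLite is not one of the commercial IMDB products discussed in this chapter, and the `sales` table and its rows are invented for the example.

```python
import sqlite3

# ":memory:" asks SQLite to keep the whole database in RAM, not on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 30.0)],
)

# Queries run against RAM-resident data; nothing is read from secondary storage.
totals = dict(conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region"))
```

The trade-off is that the data disappears when the connection closes, which is why production in-memory databases pair RAM storage with some persistence mechanism.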
In-Memory Databases (2 of 2)

Database Software Manufacturer | Product Name | Major Customers
Altibase | HDB | E*Trade, China Telecom
Oracle | TimesTen | Lockheed Martin, Verizon Wireless
SAP | High-Performance Analytic Appliance (HANA) | eBay, Colgate
Software AG | Terracotta BigMemory | AdJuggler

TABLE 6.5 IMDB providers

Analytics and Business Intelligence

• Business intelligence (BI)
• Wide range of applications, practices, and
technologies
• Extracts, transforms, integrates, visualizes,
analyzes, interprets, and presents data
• Supports improved decision making
• Analytics
• Extensive use of data and quantitative analysis
• Supports fact-based decision making within
organizations
Benefits Achieved from BI and
Analytics
• Detect fraud
• Improve forecasting
• Increase sales
• Optimize operations
• Reduce costs

The Role of a Data Scientist

• Data scientist
• Combines several skills
• Strong business acumen
• Deep understanding of analytics
• Healthy appreciation of the limitations of data, tools, and techniques
• Delivers real improvements in decision making
• Highly inquisitive person
• Educational requirements: quite rigorous
• Job outlook: extremely bright

Components Required for Effective BI
and Analytics
• Three key components
• Existence of a solid data management program
• Includes governance
• Creative data scientists
• Strong commitment to data-driven decision
making

Business Intelligence and Analytics
Tools

Descriptive Analysis | Predictive Analytics | Optimization | Simulation | Text and Video Analysis
Visual analytics | Time series analysis | Genetic algorithm | Scenario analysis | Text analysis
Regression analysis | Data mining | Linear programming | Monte Carlo simulation | Video analysis

TABLE 6.6 General categories of BI/analytic techniques

Descriptive Analysis (1 of 6)

• Descriptive analysis
• Preliminary data processing stage
• Identifies data patterns
• Answers questions
• Who, what, where, when, and to what extent
• Two types
• Visual analytics
• Regression analysis

Descriptive Analysis (2 of 6)

• Visual analytics
• Presentation of data pictorially or graphically
• Word cloud
• Visual depiction of a set of words
• Words grouped together
▶ Based on frequency of their occurrence

• Conversion funnel
• Graphical representation
• Example: Summary of steps a consumer takes in
making the decision to buy a product and become a
customer

Descriptive Analysis (3 of 6)

FIGURE 6.5 Word cloud


This word cloud shows the topics covered in this chapter.
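
The data behind a word cloud is a word-frequency count. The sketch below shows one way that count might be computed in Python; the sample text and the linear font-size rule (`12 * count`) are assumptions for illustration, since actual word cloud tools use their own scaling and layout.

```python
from collections import Counter

# Hypothetical input text; a real word cloud would use a full document.
text = "big data analytics big data big"

# Count how often each word occurs.
frequencies = Counter(text.lower().split())

# The most frequent words are the ones drawn largest in the cloud.
top_words = frequencies.most_common(2)

# Assumed scaling rule: font size grows linearly with frequency.
font_sizes = {word: 12 * count for word, count in frequencies.items()}
```

A rendering step would then place each word on the canvas at its computed size, but the frequency table is the part that carries the analytic content.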

Descriptive Analysis (4 of 6)

FIGURE 6.6 The conversion funnel


The conversion funnel shows the key steps in converting a consumer to a buyer.
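
The numbers behind a conversion funnel are stage counts and the fraction of people retained between consecutive stages. The sketch below computes them in Python; the stage names and counts are invented for illustration and are not taken from Figure 6.6.

```python
# Hypothetical funnel stages with the number of consumers at each stage.
funnel = [
    ("visited site", 1000),
    ("viewed product", 400),
    ("added to cart", 100),
    ("purchased", 25),
]

def step_conversion(stages):
    """Return the fraction of consumers retained from each stage to the next."""
    return [
        (f"{a_name} -> {b_name}", b_count / a_count)
        for (a_name, a_count), (b_name, b_count) in zip(stages, stages[1:])
    ]

rates = step_conversion(funnel)

# Overall conversion: buyers divided by everyone who entered the funnel.
overall = funnel[-1][1] / funnel[0][1]
```

Comparing the step rates shows where the largest drop-off occurs, which is usually the stage worth investigating first.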

Descriptive Analysis (5 of 6)

FIGURE 6.7 Data visualization


This scatter diagram shows the relationship between age and weight.

Descriptive Analysis (6 of 6)

• Regression analysis
• Determines the relationship between a
dependent variable and one or more
independent variables
• Produces a regression equation
• Coefficients represent a relationship
▶ Between each independent variable and the dependent variable
• Used to make predictions
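
For a single independent variable, the regression equation is a line, y = slope * x + intercept, and ordinary least squares gives the coefficients directly. The sketch below fits that line in plain Python; the function name `fit_line` and the age and weight values are assumptions chosen so the relationship is exactly linear.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: independent variable (age) and dependent variable (weight).
ages = [1, 2, 3, 4, 5]
weights = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(ages, weights)

# Use the regression equation to predict the weight at age 6.
predicted = slope * 6 + intercept
```

The slope is the coefficient that expresses how much the dependent variable changes for each one-unit change in the independent variable.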

Predictive Analytics (1 of 3)

• Predictive analytics
• Techniques to analyze current data
• Identifies future probabilities and trends
• Makes predictions about the future
• Time series analysis
• Uses statistical methods
• Analyzes time series data
• Extracts meaningful statistics and
characteristics
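
One simple statistic that can be extracted from time series data is a moving average, which smooths short-term noise so the trend is easier to see. The sketch below is a minimal illustration in Python; the sales figures and the choice of a 3-period window are assumptions, and using the last smoothed value as a forecast is only a naive baseline, not the method prescribed by the chapter.

```python
def moving_average(series, window):
    """Return the mean of each run of `window` consecutive observations."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]

# Hypothetical monthly sales figures, oldest first.
monthly_sales = [10, 12, 14, 13, 15, 17]

smoothed = moving_average(monthly_sales, 3)

# Naive forecast for the next period: the most recent smoothed value.
naive_forecast = smoothed[-1]
```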

Predictive Analytics (2 of 3)

• Data mining
• BI analytics tool
• Explores large amounts of data for hidden
patterns
• Predicts future trends and behaviors
• Used in decision making
• Three common data mining techniques
• Association analysis
• Neural computing
• Case-based reasoning
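
Association analysis, the first technique listed above, looks for items that tend to occur together. Two commonly used measures are support (how often an itemset appears) and confidence (how often the rule holds when its antecedent appears); these measures are standard in the field but are not spelled out on the slide. The shopping baskets below are invented for illustration.

```python
# Hypothetical transactions: each basket is the set of items bought together.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= basket for basket in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the fraction that also contain the consequent."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)

bread_and_milk = support({"bread", "milk"}, baskets)
bread_implies_milk = confidence({"bread"}, {"milk"}, baskets)
```

A rule such as "bread implies milk" is usually reported only when both its support and its confidence exceed thresholds chosen by the analyst.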

Predictive Analytics (3 of 3)

FIGURE 6.8 The Cross-Industry Standard Process for Data Mining (CRISP-DM)

CRISP-DM provides a structured approach for planning and executing a
data mining project.

Optimization (1 of 2)

• Allocate scarce resources
• To minimize costs or maximize profits
• Genetic algorithm
• Employs a natural selection-like process
• Finds approximate solutions to optimization and
search problems
• Linear programming
• Finds the optimum value of a linear expression
• Calculated based on the value of a set of decision
variables
• Variables subject to a set of constraints
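
A small linear program with two decision variables can be solved by checking the corner points of the feasible region, since the optimum of a linear expression over a bounded region always occurs at a corner. The sketch below does this in plain Python. The objective (maximize 3x + 5y) and the constraints are a made-up textbook-style example, and corner enumeration is used here only because it is easy to follow; practical solvers rely on methods such as simplex.

```python
from itertools import combinations

# Each constraint is (a, b, c), meaning a*x + b*y <= c.
constraints = [
    (1, 0, 4),    # x <= 4
    (0, 2, 12),   # 2y <= 12
    (3, 2, 18),   # 3x + 2y <= 18
    (-1, 0, 0),   # x >= 0
    (0, -1, 0),   # y >= 0
]
objective = (3, 5)  # maximize 3x + 5y

def corner_points(cons, eps=1e-9):
    """Intersect every pair of constraint lines and keep the feasible points."""
    points = []
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < eps:
            continue  # parallel lines never intersect
        # Cramer's rule for the 2x2 system of the two boundary lines.
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c + eps for a, b, c in cons):
            points.append((x, y))
    return points

def value(point):
    """Evaluate the linear objective at a point."""
    return objective[0] * point[0] + objective[1] * point[1]

best_point = max(corner_points(constraints), key=value)
best_value = value(best_point)
```

For this example the best corner is x = 2, y = 6, where the second and third constraints are both tight.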
Optimization (2 of 2)

FIGURE 6.9 Multi-step process of genetic algorithm
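
A genetic algorithm can be sketched as a loop of selection, crossover, and mutation. The Python example below applies it to a toy problem (maximize the number of 1 bits in a string). The steps follow the conventional outline and may not match the figure exactly; the population size, mutation rate, and the function names `fitness` and `evolve` are all assumptions for this sketch.

```python
import random

def fitness(bits):
    """Toy fitness function: the number of 1 bits (higher is better)."""
    return sum(bits)

def evolve(pop_size=20, length=16, generations=40, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]   # selection: keep the fitter half
        children = [parents[0][:]]              # elitism: carry the best forward
        while len(children) < pop_size:
            mom, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, length)
            child = mom[:cut] + dad[cut:]       # single-point crossover
            # Mutation: flip each bit with a small probability.
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    return best, history

best, history = evolve()
```

Because the best individual is always copied into the next generation, the best fitness never decreases, although the algorithm is only guaranteed to find an approximate solution.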

Simulation

• Emulates the dynamic responses of a real-world system to various inputs
• Scenario analysis
• Predicts future values based on certain potential
events
• Monte Carlo simulation
• Provides a spectrum of thousands of possible
outcomes
• Considers the many variables involved
• Considers the range of potential values for each
variable
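
A Monte Carlo simulation draws each uncertain variable from its range many times and summarizes the resulting outcomes. The sketch below simulates the profit of a hypothetical product in Python. The profit model, the uniform ranges for units, price, and unit cost, and the fixed cost are all invented for illustration; a real study would choose distributions that fit the data.

```python
import random

def simulate_profit(rng):
    """One trial: draw each uncertain input from its assumed range."""
    units = rng.uniform(800, 1200)     # units sold
    price = rng.uniform(9.0, 11.0)     # selling price per unit
    unit_cost = rng.uniform(6.0, 8.0)  # variable cost per unit
    fixed_cost = 2000.0
    return units * (price - unit_cost) - fixed_cost

def monte_carlo(trials=10_000, seed=42):
    """Run many trials and summarize the spread of possible outcomes."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    outcomes = sorted(simulate_profit(rng) for _ in range(trials))
    return {
        "mean": sum(outcomes) / trials,
        "p05": outcomes[int(0.05 * trials)],   # 5th percentile outcome
        "p95": outcomes[int(0.95 * trials)],   # 95th percentile outcome
        "prob_loss": sum(o < 0 for o in outcomes) / trials,
    }

summary = monte_carlo()
```

Reporting the percentiles and the probability of a loss, rather than a single expected value, is what gives decision makers the spectrum of outcomes described above.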
Text and Video Analysis

• Glean insights and data relevant to decision making
• Text analysis
• Process for extracting value from large
quantities of unstructured text data
• Video analysis
• Process of obtaining information or insights
from video footage

Self-Service Analytics

• Self-service analytics
• Training, techniques, and processes
• Empower end users to work independently
• Access data from approved sources
• Perform their own analyses
• Use an endorsed set of tools
• Advantages
• Gets valuable data into the hands of end users
• Encourages fact-based decision making
• Accelerates decision making
• Provides a solution to the shortage of data scientists
Summary

• Tremendous growth of available data
• Organizations struggling to manage and use it
• Tools and technologies help
• Take advantage of big data opportunities
• Data warehouse, data mart, data lake
• NoSQL database
• Business intelligence (BI) and analytics
• Support improved decision making
• Data scientist plays an important role
