Ch-03-1 Unlocked 2

The document discusses database technologies and data management strategies for performance, growth, and sustainability. It covers topics such as database systems, data warehouses, data marts, SQL, data quality, centralized and distributed architectures, data analytics, and data discovery. The author is Dr. Ebadati, who has a Ph.D. in Computer Science from Delhi.


IT for Management: On-Demand Strategies for Performance, Growth,
and Sustainability

Chapter 3
Data Management, Business Intelligence, and Data Analytics

Dr. Ebadati
Ph.D. (Computer Science), Delhi
Learning Objectives (1 of 5)
Database Technologies: Databases
• Collections of data sets or records stored in a systematic way
• Stores data generated by business apps, sensors, operations,
and transaction-processing systems (TPS)
• The data in databases are extremely volatile
• Medium and large enterprises typically have many databases of
various types

Volatile data changes frequently.


Database Technologies: Data Warehouses
• Integrate data from multiple databases and data silos, and
organize them for complex analysis, knowledge discovery,
and to support decision making
• May require formatting, processing, and/or standardization
• Loaded at specific times, making them non-volatile and
ready for analysis
Database Technologies: Data Marts

• Small-scale data warehouses that support a single
function or one department
• Enterprises that cannot afford to invest in data
warehousing may start with one or more data marts
Database Technologies: BI

• Business Intelligence (BI)
• Tools and techniques that process data and conduct
statistical analysis for insight and discovery
• Used to discover meaningful relationships in the
data, stay informed in real time, detect trends, and
identify opportunities and risks
Database Management Systems (DBMS)
• Integrate with data collection systems such as TPS and business applications
• Organized way to store, access, and manage data
• Stores data in tables consisting of columns and rows, similar to the format of a
spreadsheet
• Standard database model adopted by most enterprises
• Functions include:
• Data filtering and profiling
• Data integrity and maintenance
• Data synchronization
• Data security
• Data access
Database Technologies: SQL
• Relational Database Management Systems (RDBMS)
• Provide access to data using a declarative language
• Declarative language
• Simplifies data access by requiring that users specify only what data they
want to access, without defining how the data will be retrieved
• Structured Query Language (SQL) is an example of a declarative language:

SELECT column_name(s)
FROM table_name
WHERE condition
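
The SELECT / FROM / WHERE pattern above can be tried out with a minimal sketch using Python's built-in sqlite3 module. The orders table, its columns, and its rows are hypothetical and exist only for illustration. The query states what data is wanted (orders above a threshold) and leaves the retrieval strategy to the database engine.

```python
import sqlite3

# Hypothetical in-memory table, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("Acme", 120.0), ("Bolt", 45.5), ("Acme", 300.0)],
)

# Same shape as: SELECT column_name(s) FROM table_name WHERE condition
rows = conn.execute(
    "SELECT customer, amount FROM orders WHERE amount > ? ORDER BY amount",
    (100,),
).fetchall()
conn.close()

print(rows)
```

Nothing in the query says which index to use or in what order to scan the rows; that is the sense in which SQL is declarative.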
OLTP and OLAP Systems

Online Transaction Processing and Online Analytical Processing

• Online Transaction Processing (OLTP)
• Designed to manage transaction data, which are volatile
• Breaks down complex information into simpler data tables to strike a
balance between transaction-processing efficiency and query efficiency
• Cannot be optimized for data mining
• Online Analytical Processing (OLAP)
• A means of organizing large business databases
• Divided into one or more cubes that fit the way business is conducted
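
As a rough illustration of the cube idea, the sketch below rolls up a small set of sales facts along chosen dimensions. It is plain Python rather than a real OLAP engine, and the facts, the dimension layout, and the roll_up helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sales facts: (region, product, quarter, revenue).
facts = [
    ("East", "Widget", "Q1", 100),
    ("East", "Gadget", "Q1", 150),
    ("West", "Widget", "Q1", 200),
    ("East", "Widget", "Q2", 120),
]

def roll_up(facts, dims):
    """Sum revenue over the chosen dimensions.

    dims holds tuple positions: 0 = region, 1 = product, 2 = quarter.
    """
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[d] for d in dims)
        totals[key] += row[3]
    return dict(totals)

by_region = roll_up(facts, dims=(0,))
by_region_quarter = roll_up(facts, dims=(0, 2))
print(by_region)
print(by_region_quarter)
```

Each choice of dimensions corresponds to one view of the cube. An OLAP system typically precomputes or indexes such views so that analysts can switch between them quickly.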
Database Technologies: NoSQL
• Trend toward NoSQL Systems (Not Only SQL)
o Higher performance
o Easy distribution of data on different nodes
• Enables scalability and fault tolerance
o Greater flexibility
o Simpler administration
Popular DBMS
1. DBMSs (mid-2016)
a. Oracle’s 12c Database
b. Microsoft’s SQL Server
c. IBM’s DB2
d. SAP Sybase ASE
e. PostgreSQL
Data Management and Database Technologies
1. Describe a database and database management system
(DBMS).
2. Explain what an online transaction-processing (OLTP)
system does.
3. Why are data in databases volatile?
4. Describe the functions of a DBMS.
5. Describe the purpose and benefits of data management.
6. What is a relational database management system?
Learning Objectives (2 of 5)
Centralized and Distributed Database Architecture
• Centralized Database Architecture
• Better control of data quality
• Better IT security
• Distributed Database Architecture
• Allows both local and remote access
• Uses client/server architecture to process requests


Dirty Data

• Garbage In, Garbage Out
• Dirty Data
• Lacks integrity/validation and reduces user trust
• Incomplete, out of context, outdated, inaccurate, inaccessible, or
overwhelming
Characteristics of Poor Quality or Dirty Data

Characteristic             Description
Incomplete                 Missing data
Outdated or invalid        Too old to be valid or useful
Incorrect                  Too many errors
Duplicated or in conflict  Too many copies or versions of the same data, and the versions are inconsistent or in conflict with each other
Non-standardized           Data are stored in incompatible formats and cannot be compared or summarized
Unusable                   Data are not in context to be understood or interpreted correctly at the time of access
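
Some of the characteristics above can be checked automatically. The following is a minimal sketch, not a complete data-quality tool: the customer records, the one-year age threshold, and the profile function are assumptions made for illustration. It flags incomplete, outdated, and conflicting records.

```python
from datetime import date

# Hypothetical customer records, invented for illustration.
records = [
    {"id": 1, "email": "a@example.com", "updated": date(2024, 5, 1)},
    {"id": 2, "email": None, "updated": date(2024, 4, 20)},
    {"id": 1, "email": "a@example.org", "updated": date(2024, 5, 2)},
    {"id": 3, "email": "c@example.com", "updated": date(2019, 1, 15)},
]

def profile(records, today, max_age_days=365):
    """Return (record id, problem) pairs for three dirty-data checks."""
    issues = []
    first_seen = {}
    for r in records:
        # Incomplete: any missing field value.
        if any(v is None for v in r.values()):
            issues.append((r["id"], "incomplete"))
        # Outdated: last update older than the assumed threshold.
        if (today - r["updated"]).days > max_age_days:
            issues.append((r["id"], "outdated"))
        # Duplicated or in conflict: same id seen before with different content.
        prev = first_seen.get(r["id"])
        if prev is not None and prev != r:
            issues.append((r["id"], "duplicated or in conflict"))
        first_seen.setdefault(r["id"], r)
    return issues

issues = profile(records, today=date(2024, 6, 1))
print(issues)
```

Checks for non-standardized or unusable data generally need knowledge of the expected formats and business context, so they are left out of this sketch.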
Data Life Cycle and Data Principles (1 of 2)
• Principle of Diminishing Data Value
• The value of data diminishes as they age
• Blind spots (lack of data availability) of 30 days or longer
inhibit peak performance
• Global financial services institutions rely on near-real-time
data for peak performance
• Principle of 90/90 Data Use
• As much as 90 percent of stored data is seldom accessed after
90 days (except for auditing purposes)
• Roughly 90 percent of data lose most of their value after
3 months
Data Life Cycle and Data Principles (2 of 2)
• Principle of data in context
• The capability to capture, process, format, and distribute
data in near real time or faster requires a huge investment
in data architecture
• The investment can be justified on the principle that data
must be integrated, processed, analyzed, and formatted
into “actionable information”
Figure 3.11 Data life cycle
Figure 3.12 An enterprise has transactional, master, and analytical data.
Centralized and Distributed Database Architectures
1. Describe the data life cycle.
2. What is the function of master data management (MDM)?
3. What are the consequences of not cleaning “dirty data”?
4. Describe the differences between centralized and
distributed databases.
5. Discuss how data ownership and organizational politics
affect the quality of an organization’s data.
Learning Objectives (3 of 5)
Data Warehouses: Enterprise data warehouses (EDW)

• Data warehouses that pull together data from disparate
sources and databases across an entire enterprise
• Warehouses are the primary source of cleansed data for
analysis, reporting, and Business Intelligence (BI)
• Their high costs can be subsidized by using data marts
Data Preparation: Procedures to Prepare EDW Data for
Analytics
• Extract from designated databases
• Transform by standardizing formats, cleaning the data, and
integrating them
• Load into a data warehouse
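
The extract, transform, and load steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the two source systems, their field formats, the COUNTRY_CODES mapping, and the sales_fact table are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Extract: hypothetical rows from two source systems with inconsistent formats.
crm_rows = [{"name": " Alice ", "country": "us", "sales": "1,200"}]
erp_rows = [{"name": "BOB", "country": "USA", "sales": "950"}]

# Assumed mapping used to standardize country values.
COUNTRY_CODES = {"us": "US", "usa": "US"}

def transform(row):
    """Standardize name casing, country code, and numeric format."""
    return (
        row["name"].strip().title(),
        COUNTRY_CODES[row["country"].lower()],
        float(row["sales"].replace(",", "")),
    )

def load(conn, rows):
    """Load cleaned rows into a hypothetical warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact (name TEXT, country TEXT, sales REAL)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")
load(warehouse, [transform(r) for r in crm_rows + erp_rows])
loaded = warehouse.execute(
    "SELECT name, country, sales FROM sales_fact ORDER BY name"
).fetchall()
print(loaded)
```

After the transform step, rows from both sources share one format, so they can be compared and summarized in the warehouse.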
Figure 3.15 Database, data warehouses and marts, and BI architecture.
Data Warehouses: ADW
• Active Data Warehouse (ADW)
• Real-time data warehousing and analytics
• Transform by standardizing formats, cleaning the data, and integrating them
• They provide the ability to
• Interact with customers to provide superior customer service
• Respond to business events in near real time
• Share up-to-date status data among merchants, vendors, and
associates
Data Warehouse Processing: Hadoop and MapReduce
• Hadoop is an Apache processing platform that places no
conditions on the structure of the data it processes
• MapReduce provides a reliable, fault-tolerant software
framework for easily writing applications that process vast
amounts of data (multi-terabyte datasets) in parallel on large
clusters (thousands of nodes) of commodity hardware
• Map stage: breaks up huge data into subsets
• Reduce stage: recombines partial results
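
The two stages can be illustrated with a word count, a common teaching example. The sketch below runs in a single Python process and does not use Hadoop; the documents and function names are hypothetical. It shows only the shape of the computation, in which each subset is processed independently and the partial results are then combined.

```python
from collections import Counter
from functools import reduce

# Hypothetical input, already split into subsets (one string per subset).
documents = ["big data big insight", "data warehouse data mart"]

def map_stage(doc):
    """Map: count words within one subset, independently of the others."""
    return Counter(doc.split())

def reduce_stage(left, right):
    """Reduce: merge two partial results by adding their counts."""
    return left + right

partials = [map_stage(d) for d in documents]
totals = reduce(reduce_stage, partials, Counter())
print(totals)
```

Because each map_stage call depends only on its own subset, a cluster can run the calls on different nodes at the same time, which is where the parallelism comes from.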


Data Warehouses
1. What are the differences between databases and data
warehouses?
2. What are the differences between data warehouses and data
marts?
3. Explain ETL.
4. Explain CDC.
5. What is an advantage of an active data warehouse (ADW)?
6. Why might a company invest in a data mart?
7. How can manufacturers and health care benefit from data
analytics?
8. Explain how Hadoop implements MapReduce in two stages.
Learning Objectives (4 of 5)
Data Analytics and Data Discovery Defined
• Data Analytics is a technique of qualitatively or
quantitatively analyzing a data set to reveal patterns,
trends, and associations that often relate to human
behavior and interaction, to enhance productivity and
business gain.
• Big data is an extremely large data set that is too
large or complex to be analyzed using traditional data
processing techniques.
Four V’s of Data Analytics
1. Variety: The analytic environment has expanded from pulling
data from enterprise systems to include big data and
unstructured sources.
2. Volume: Large volumes of structured and unstructured data
are analyzed.
3. Velocity: Speed of access to reports that are drawn from data
defines the difference between effective and ineffective analytics.
4. Veracity: Validating data and extracting insights that managers
and workers can trust are key factors in successful analytics. Trust
in analytics has grown more difficult with the explosion of data
sources.
Data Analytics: Human Expertise is Needed

• To interpret the output of analytics, Big Data Specialists and
Business Intelligence Analysts perform many tasks
• Data preparation for analysis through data cleansing
techniques, to eliminate duplicates or incomplete data
• Dirty data degrade the value of analytics
• Data must be put into meaningful context


Data Discovery: Data and Text Mining
• Creating Business Value

• Data Mining: software that enables users to analyze data
from various dimensions or angles, categorize them, and find
correlative patterns among fields in the data warehouse
• Text Mining: broad category involving interpreting words and
concepts in context
• Sentiment Analysis: trying to understand consumer intent

Data Analytics and Data Discovery
1. Why are human expertise and judgment important to data
analytics? Give an example.
2. What is the relationship between data quality and the value of
analytics?
3. Why do data need to be put into a meaningful context?
4. How can manufacturers and health care benefit from data
analytics?
5. How does data mining provide value? Give an example.
6. What is text mining?
7. What are the basic steps involved in text analytics?
Learning Objectives (5 of 5)
Business Intelligence: Key to competitive advantage
• Used across industries in enterprises of all sizes
• Used in operational management, business process, and
decision making
• Provides moment of value to decision makers
• Unites data, technology, analytics, & human knowledge to
optimize decisions
• BI “unites data, technology, analytics, and human knowledge
to optimize business decisions and ultimately drive an
enterprise’s success” (The Data Warehousing Institute)
Business Intelligence Challenges
• Challenges
• Data selection and quality
• Alignment of business strategy and BI strategy
• Alignment
• Clearly articulates business strategy
• Deconstructs business strategy into targets
• Identifies KPIs
• Prioritizes KPIs
• Creates a plan based on priorities
• Transforms based on strategic results and changes
Figure 3.17: Business Intelligence Factors: Four factors contributing to increased use of BI
Business Intelligence Architecture

• Advances in response to big data and end-user performance
demands
• Hosted on public or private clouds
• Limits IT staff and controls costs
• May slow response time, add security and backup risks
Electronic Records Management
• Business Records

• Documentation of a business event, action, decision, or
transaction
• Electronic Records Management (ERM)
• Workflow software, authoring tools, scanners, and databases that
manage and archive electronic documents and image paper
documents
• Index and store documents according to company policy or legal
compliance
• Success depends on partnership of key players


ERM Practices and Standards
• Best Practices

• Effective systems capture all business data

• Input from online forms, bar codes, sensors, websites, social sites, copiers,
emails, and more
• Industry Standards

• Association for Information and Image Management (AIIM; www.aiim.org)
• National Archives and Records Administration (NARA; www.archives.gov)
• ARMA International (formerly the Association of Records Managers and
Administrators; www.arma.org)
ERM Benefits: an ERM can help a business

• Access and use the content contained in documents
• Cut labor costs by automating business processes
• Reduce time and effort to locate required information
for decision making
• Improve content security, thereby reducing intellectual
property theft risks
• Minimize content printing, storing, and searching costs
ERM: Disaster Recovery, Business Continuity, and
Compliance
1. Does the software meet the organization’s needs? For example, can the
DMS be installed on the existing network? Can it be purchased as a
service?
2. Is the software easy to use and accessible from Web browsers, office
applications, and email applications? If not, people will not use it.
3. Does the software have lightweight, modern Web and graphical user
interfaces that effectively support remote users?
4. Before selecting a vendor, it is important to examine workflows and how
data, documents, and communications flow throughout the company.
Business Intelligence and Electronic Records
Management
1. What are the business benefits of BI?
2. What are two data-related challenges that must be resolved for BI to produce
meaningful insight?
3. What are the steps in a BI governance program?
4. What does it mean to drill down into data, and why is it important?
5. What four factors are contributing to increased use of BI?
6. Why is ERM a strategic issue rather than simply an IT issue?
7. Why might a company have a legal duty to retain records? Give an example.
8. Why is creating backups an insufficient way to manage an organization’s documents?
Best Wishes
Do you have any questions?

[email protected]

ebadati.com

Omid Ebadati
