Week 5 Database

The document outlines the foundations of business intelligence, focusing on data management in traditional file environments and the capabilities of database management systems (DBMSs). It discusses the data hierarchy, problems with traditional file processing, types of databases, and the operations of relational DBMS. Additionally, it covers tools for improving business performance and decision-making, including big data, data warehouses, and analytical tools.

Week 5

Foundations of Business Intelligence: Databases and Information Management
Managing Data in a Traditional File Environment

• File organization concepts
– Database: Group of related files
– File: Group of records of the same type
– Record: Group of related fields
– Field: Group of characters forming a word, a group of words, or a number
• A record describes an entity (a person, place, or thing on which we store information)
• Attribute: Each characteristic or quality describing an entity
– Example: The attributes DATE and GRADE belong to the entity COURSE
THE DATA HIERARCHY

A computer system organizes data in a hierarchy that starts with the bit, which represents either a 0 or a 1. Bits can be grouped to form a byte to represent one character, number, or symbol. Bytes can be grouped to form a field, and related fields can be grouped to form a record. Related records can be collected to form a file, and related files can be organized into a database.

FIGURE 6-1
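
The hierarchy in the caption above can be sketched in a few lines of Python. This is only an illustration: the student and course values and the `describe` helper are hypothetical, and plain Python containers stand in for real files and databases.

```python
# Hypothetical example values; only the hierarchy itself comes from the slide.
char = "A"
byte = char.encode("ascii")        # one byte represents one character
bits = format(byte[0], "08b")      # the eight bits that make up that byte

field = "Alice"                    # a field: a group of characters
record = {"student_id": 1, "name": field, "course": "IS101"}   # related fields
file_ = [record, {"student_id": 2, "name": "Budi", "course": "IS101"}]  # records of one type
database = {                       # related files
    "students": file_,
    "courses": [{"course": "IS101", "grade": "A"}],
}


def describe(db):
    """Count the units found at each level of the hierarchy."""
    return {
        "files": len(db),
        "records": sum(len(f) for f in db.values()),
        "fields": sum(len(r) for f in db.values() for r in f),
    }


print(bits)
print(describe(database))
```

Each level is simply a grouping of the level below it, which is why the counts grow as you move down from files to fields.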
TRADITIONAL FILE PROCESSING

The use of a traditional approach to file processing encourages each functional area in a corporation to develop specialized applications. Each application requires a unique data file that is likely to be a subset of the master file. These subsets of the master file lead to data redundancy and inconsistency, processing inflexibility, and wasted storage resources.

FIGURE 6-2
Managing Data in a Traditional File Environment

• Problems with the traditional file environment (files maintained separately by different departments)
– Data redundancy:
• Presence of duplicate data in multiple files
– Data inconsistency:
• Same attribute has different values
– Program-data dependence:
• Changes in a program require changes to the data accessed by that program
– Lack of flexibility
– Poor security
– Lack of data sharing and availability
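
As a rough illustration of the first two problems, the sketch below compares two departmental files kept separately. The file contents, customer IDs, and function names are hypothetical; dictionaries stand in for the departments' files.

```python
# Hypothetical departmental files: each department keeps its own copy of customer data.
sales_file = {
    "C001": {"name": "Ana Putri", "city": "Jakarta"},
    "C002": {"name": "Budi Santoso", "city": "Bandung"},
}
billing_file = {
    "C001": {"name": "Ana Putri", "city": "Surabaya"},
    "C003": {"name": "Citra Dewi", "city": "Medan"},
}


def find_redundant(a, b):
    """Return the keys stored in both files (duplicate data)."""
    return sorted(a.keys() & b.keys())


def find_inconsistent(a, b):
    """Return (key, attribute) pairs whose values differ between the files."""
    issues = []
    for key in find_redundant(a, b):
        for attr in sorted(a[key].keys() & b[key].keys()):
            if a[key][attr] != b[key][attr]:
                issues.append((key, attr))
    return issues


print(find_redundant(sales_file, billing_file))
print(find_inconsistent(sales_file, billing_file))
```

Customer C001 is stored twice (redundancy), and the two copies disagree on the city (inconsistency). Nothing in the separate files says which copy is correct.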
Capabilities of Database Management Systems (DBMSs)

• Database
– Serves many applications by centralizing data and controlling redundant data
• Database management system (DBMS)
– Interfaces between applications and physical data files
– Separates logical and physical views of data
– Solves problems of traditional file environment
• Controls redundancy
• Eliminates inconsistency
• Uncouples programs and data
• Enables organization to centrally manage data and data security
Types of Database

• Single user database
• Desktop database
• Multiuser database
• Workgroup database
• Enterprise database
• Centralized database
• Distributed database
• Cloud database
• General purpose database
DBMS Software
• Oracle RDBMS
• IBM DB2
• Microsoft SQL Server
• SAP Sybase
• Teradata
• ADABAS
• MySQL
• FileMaker
• Microsoft Access
• Informix OODBMS
Top 10 Database Software Systems
http://www.itcareersuccess.com/tech/database.htm
Relational Database Tables

A relational database organizes data in the form of two-dimensional tables. Illustrated here are tables for the entities SUPPLIER and PART showing how they represent each entity and its attributes. Supplier Number is a primary key for the SUPPLIER table and a foreign key for the PART table.

FIGURE 6-4
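
A minimal sketch of the two tables, using Python's built-in `sqlite3` module with an in-memory database. The column names, sample rows, and the `parts_for_supplier` helper are assumptions for illustration and are not taken from the figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (
    supplier_number INTEGER PRIMARY KEY,   -- primary key of SUPPLIER
    supplier_name   TEXT NOT NULL
);
CREATE TABLE part (
    part_number     INTEGER PRIMARY KEY,   -- primary key of PART
    part_name       TEXT NOT NULL,
    supplier_number INTEGER REFERENCES supplier(supplier_number)  -- foreign key
);
-- Hypothetical sample rows
INSERT INTO supplier VALUES (1, 'Supplier A'), (2, 'Supplier B');
INSERT INTO part VALUES (101, 'Part X', 1), (102, 'Part Y', 2), (103, 'Part Z', 1);
""")


def parts_for_supplier(supplier_number):
    """List the names of the parts linked to one supplier through the foreign key."""
    rows = conn.execute(
        "SELECT part_name FROM part WHERE supplier_number = ? ORDER BY part_number",
        (supplier_number,),
    )
    return [name for (name,) in rows]


print(parts_for_supplier(1))
```

The foreign key column in PART holds a value that also appears as a primary key in SUPPLIER, which is how the two tables are related.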
Capabilities of Database Management Systems (DBMSs)

• Operations of a Relational DBMS
– Three basic operations used to develop useful sets of data
• SELECT: Creates subset of data of all records that meet stated criteria
• JOIN: Combines relational tables to provide user with more information than available in individual tables
• PROJECT: Creates subset of columns in table, creating tables with only the information specified
THE THREE BASIC OPERATIONS OF A RELATIONAL DBMS

FIGURE 6-5 The select, join, and project operations enable data from two different tables to be combined and only selected
attributes to be displayed.
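
The three operations can be sketched over plain lists of dictionaries, where each dictionary is one row. This is a simplified illustration rather than how a DBMS implements them, and the part and supplier rows are hypothetical.

```python
def select(rows, predicate):
    """SELECT: keep only the rows that meet the stated criteria."""
    return [r for r in rows if predicate(r)]


def join(left, right, key):
    """JOIN: combine rows from two tables that share the same key value."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def project(rows, columns):
    """PROJECT: keep only the named columns."""
    return [{c: r[c] for c in columns} for r in rows]


# Hypothetical sample data
part = [
    {"part_number": 137, "part_name": "Door latch", "supplier_number": 8259},
    {"part_number": 150, "part_name": "Door seal", "supplier_number": 8263},
    {"part_number": 152, "part_name": "Compressor", "supplier_number": 8444},
]
supplier = [
    {"supplier_number": 8259, "supplier_name": "Supplier A"},
    {"supplier_number": 8263, "supplier_name": "Supplier B"},
]

wanted = select(part, lambda r: r["part_number"] in (137, 150))
combined = join(wanted, supplier, "supplier_number")
result = project(combined, ["part_number", "part_name", "supplier_name"])

for row in result:
    print(row)
```

In SQL, the same result would usually come from a single query that names the wanted columns, joins the two tables on the supplier number, and filters the rows with a WHERE clause.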
Capabilities of Database Management Systems (DBMSs)

• Designing Databases
– Conceptual (logical) design: abstract model from business perspective
– Physical design: How database is arranged on direct-access storage devices
• Design process identifies:
– Relationships among data elements, redundant database elements
– Most efficient way to group data elements to meet business requirements, needs of application programs
• Normalization
– Streamlining complex groupings of data to minimize redundant data elements and awkward many-to-many relationships
AN UNNORMALIZED RELATION FOR ORDER

FIGURE 6-9 An unnormalized relation contains repeating groups. For example, there can be many parts and suppliers for each
order. There is only a one-to-one correspondence between Order_Number and Order_Date.
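
A minimal sketch of what normalizing such a relation involves, assuming a nested Python structure as the unnormalized input. The field names, dates, and quantities are hypothetical; the point is that the repeating group is split into separate tables.

```python
# Hypothetical unnormalized orders: each order carries a repeating group of items.
unnormalized_orders = [
    {
        "order_number": 1,
        "order_date": "2024-01-15",
        "items": [
            {"part_number": 137, "part_name": "Door latch", "supplier_number": 8259, "quantity": 2},
            {"part_number": 150, "part_name": "Door seal", "supplier_number": 8263, "quantity": 1},
        ],
    },
    {
        "order_number": 2,
        "order_date": "2024-01-16",
        "items": [
            {"part_number": 137, "part_name": "Door latch", "supplier_number": 8259, "quantity": 5},
        ],
    },
]


def normalize(orders):
    """Split the repeating group into ORDER, LINE_ITEM, and PART tables."""
    order_table, line_item_table, part_table = [], [], {}
    for order in orders:
        order_table.append(
            {"order_number": order["order_number"], "order_date": order["order_date"]}
        )
        for item in order["items"]:
            line_item_table.append(
                {
                    "order_number": order["order_number"],
                    "part_number": item["part_number"],
                    "quantity": item["quantity"],
                }
            )
            # Each part is described once, however many orders mention it.
            part_table[item["part_number"]] = {
                "part_name": item["part_name"],
                "supplier_number": item["supplier_number"],
            }
    return order_table, line_item_table, part_table


orders, line_items, parts = normalize(unnormalized_orders)
print(len(orders), len(line_items), len(parts))
```

After the split, the part name "Door latch" is stored once instead of once per order, which is the redundancy that normalization aims to remove.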
Capabilities of Database Management Systems (DBMSs)

• Referential integrity rules


• Used by RDBMS to ensure relationships between tables remain consistent
• Entity-relationship diagram
• Used by database designers to document the data model
• Illustrates relationships between entities
– Caution:
IF A BUSINESS DOESN’T GET DATA MODEL RIGHT, SYSTEM WON’T BE
ABLE TO SERVE BUSINESS WELL
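
To illustrate a referential integrity rule, the sketch below uses Python's built-in `sqlite3` module. SQLite enforces foreign keys only when `PRAGMA foreign_keys = ON` is set on the connection. The table layout, sample rows, and the `try_insert_part` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE supplier (
    supplier_number INTEGER PRIMARY KEY,
    supplier_name   TEXT NOT NULL
);
CREATE TABLE part (
    part_number     INTEGER PRIMARY KEY,
    part_name       TEXT NOT NULL,
    supplier_number INTEGER NOT NULL REFERENCES supplier(supplier_number)
);
INSERT INTO supplier VALUES (1, 'Supplier A');  -- hypothetical sample row
""")


def try_insert_part(part_number, supplier_number):
    """Return True if the row is accepted, False if it breaks referential integrity."""
    try:
        conn.execute(
            "INSERT INTO part VALUES (?, ?, ?)",
            (part_number, f"Part {part_number}", supplier_number),
        )
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert_part(101, 1))    # supplier 1 exists
print(try_insert_part(102, 99))   # supplier 99 does not exist
```

The second insert is rejected because it would leave a part pointing at a supplier that is not in the SUPPLIER table.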
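ERD Examples (conceptual design)

[Figure: entity-relationship diagram. The entities Member and Book are linked by the relationship Borrow, marked 1 on the Member side and M on the Book side. Attribute labels shown in the diagram: memberID, Name, Birthdate, Phone, Address, BookNo, Title, Author, Year, Publisher, State, Date.]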
Tools for Improving Business Performance and Decision Making

• Big data
• Massive sets of unstructured/semi-structured data from Web traffic, social media, sensors, and so on
• Petabytes, exabytes of data
• Volumes too great for typical DBMS
• Can reveal more patterns and anomalies
Tools for Improving Business Performance and Decision Making

• Business intelligence infrastructure
• Today includes an array of tools for separate systems, and big data
• Contemporary tools:
• Data warehouses
• Data marts
• Hadoop
• In-memory computing
• Analytical platforms
Tools for Improving Business Performance and Decision Making

• Data warehouse:
– Stores current and historical data from many core operational transaction systems
– Consolidates and standardizes information for use across enterprise, but data cannot be altered
– Provides analysis and reporting tools
• Data marts:
– Subset of data warehouse
– Summarized or focused portion of data for use by specific population of users
– Typically focuses on single subject or line of business
CONTEMPORARY BUSINESS INTELLIGENCE INFRASTRUCTURE

A contemporary business intelligence infrastructure features capabilities and tools to manage and analyze large quantities and different types of data from multiple sources. Easy-to-use query and reporting tools for casual business users and more sophisticated analytical toolsets for power users are included.

FIGURE 6-12
Tools for Improving Business Performance and Decision Making

• In-memory computing
• Used in big data analysis
• Uses computer's main memory (RAM) for data storage to avoid delays in retrieving data from disk storage
• Can reduce hours/days of processing to seconds
• Requires optimized hardware
• Analytic platforms
• High-speed platforms using both relational and non-relational tools optimized for large datasets
Tools for Improving Business Performance and Decision Making

• Analytical tools: Relationships, patterns, trends
– Tools for consolidating, analyzing, and providing access to vast amounts of data to help users make better business decisions
• Multidimensional data analysis (OLAP)
• Data mining
• Text mining
• Web mining
FREE ANALYTICS TOOLS

• GOOGLE ANALYTICS
• CLICKY
• MINT
• CHURCH ANALYTICS
• KISSMETRICS
• OPEN WEB ANALYTICS
• CLICKTALE
• CRAZYEGG
• PIWIK
• CLOUDFLARE
• MIXPANEL
• WOOPRA
• WORDPRESS.COM/JETPACK
• BITLY
• SIMILARWEB
• SEMRUSH
• MOZ KEYWORD EXPLORER
• CYFE
• GOOGLE SEARCH CONSOLE
Tools for Improving Business Performance and Decision Making

• Online analytical processing (OLAP)
– Supports multidimensional data analysis
• Viewing data using multiple dimensions
• Each aspect of information (product, pricing, cost, region, time period) is different dimension
• Example: How many washers sold in the East in June compared with other regions?
– OLAP enables rapid, online answers to ad hoc queries
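
The washer question above can be sketched as a small aggregation over fact rows, where each row records one combination of dimension values plus a measure. The sales figures and the `rollup` function are hypothetical; real OLAP tools precompute and index such aggregates rather than scanning a list.

```python
from collections import defaultdict

# Hypothetical fact rows: dimensions (product, region, month) and one measure (units).
sales = [
    {"product": "washer", "region": "East", "month": "June", "units": 120},
    {"product": "washer", "region": "West", "month": "June", "units": 95},
    {"product": "washer", "region": "East", "month": "May", "units": 80},
    {"product": "dryer", "region": "East", "month": "June", "units": 60},
]


def rollup(facts, dimensions, measure="units", **filters):
    """Total the measure for each combination of the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        if all(fact[name] == value for name, value in filters.items()):
            key = tuple(fact[d] for d in dimensions)
            totals[key] += fact[measure]
    return dict(totals)


# How many washers were sold in June, region by region?
washers_in_june = rollup(sales, ["region"], product="washer", month="June")
print(washers_in_june)
```

Changing the `dimensions` list or the filters gives a different view of the same data, which is the idea behind viewing data along multiple dimensions.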
MULTIDIMENSIONAL DATA MODEL

The view that is showing is product versus region. If you rotate the cube 90 degrees, the face that will show is product versus actual and projected sales. If you rotate the cube 90 degrees again, you will see region versus actual and projected sales. Other views are possible.

FIGURE 6-13
Tools for Improving Business Performance and Decision Making

• Data mining:
• Finds hidden patterns, relationships in datasets
• Example: customer buying patterns
• Infers rules to predict future behavior
• Types of information obtainable from data mining:
• Associations
• Sequences
• Classification
• Clustering
• Forecasting
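
As one small example of the "associations" type, the sketch below counts how often pairs of items appear in the same purchase and how often one item accompanies another. The baskets are hypothetical, and this is a naive count, not a full association-rule mining algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets
baskets = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips", "salsa"},
    {"cola"},
]


def pair_support(transactions):
    """Fraction of transactions that contain each pair of items."""
    counts = Counter()
    for basket in transactions:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(transactions) for pair, n in counts.items()}


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing the antecedent, the share that also contain the consequent."""
    with_antecedent = [b for b in transactions if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)


print(pair_support(baskets))
print(confidence(baskets, "chips", "cola"))
```

A rule such as "customers who buy chips often buy cola" is the kind of pattern that can then be used to predict future buying behavior.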
Tools for Improving Business Performance and Decision Making

• Text mining
• Extracts key elements from large unstructured data sets
• Stored e-mails
• Call center transcripts
• Legal cases
• Patent descriptions
• Service reports, and so on
• Sentiment analysis software
• Mines e-mails, blogs, social media to detect opinions
Tools for Improving Business Performance and Decision Making

• Web mining
– Discovery and analysis of useful patterns and information from Web
– Understand customer behavior
– Evaluate effectiveness of Web site, and so on
– Web content mining
• Mines content of Web pages
– Web structure mining
• Analyzes links to and from Web page
– Web usage mining
• Mines user interaction data recorded by Web server
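
Web usage mining can be sketched as counting page views from server log lines. The log format below is a simplified, hypothetical one (visitor address, quoted request, status code); real Web server logs carry more fields and need a more careful parser.

```python
from collections import Counter

# Hypothetical, simplified log lines: visitor address, quoted request, status code.
log_lines = [
    '10.0.0.1 "GET /products HTTP/1.1" 200',
    '10.0.0.2 "GET /products HTTP/1.1" 200',
    '10.0.0.1 "GET /checkout HTTP/1.1" 200',
    '10.0.0.3 "GET /missing HTTP/1.1" 404',
]


def page_views(lines):
    """Count successful requests per page path."""
    views = Counter()
    for line in lines:
        visitor, rest = line.split(" ", 1)
        request, status = rest.rsplit(" ", 1)
        method, path, protocol = request.strip('"').split(" ")
        if status == "200":
            views[path] += 1
    return views


print(page_views(log_lines).most_common())
```

Counts like these are a starting point for evaluating which pages customers actually reach.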
Tools for Improving Business Performance and Decision Making

• Databases and the Web
– Many companies use Web to make some internal databases available to customers or partners
– Typical configuration includes:
• Web server
• Application server/middleware/CGI scripts
• Database server (hosting DBMS)
– Advantages of using Web for database access:
• Ease of use of browser software
• Web interface requires few or no changes to database
• Inexpensive to add Web interface to system
Managing Data Resources

• Ensuring data quality
– More than 25 percent of critical data in Fortune 1000 company databases are inaccurate or incomplete
– Redundant data
– Inconsistent data
– Faulty input
– Before new database in place, need to:
• Identify and correct faulty data
• Establish better routines for editing data once database in operation
Managing Data Resources

• Data quality audit:
– Structured survey of the accuracy and level of completeness of the data in an information system
• Survey samples from data files, or
• Survey end users for perceptions of quality
• Data cleansing
– Software to detect and correct data that are incorrect, incomplete, improperly formatted, or redundant
– Enforces consistency among different sets of data from separate information systems
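
A minimal sketch of what a data cleansing step might do: fix formatting, drop redundant copies, and set aside incomplete records for correction. The contact records, field names, and cleaning rules are hypothetical; commercial cleansing software applies far richer rules.

```python
def clean_record(record):
    """Standardize formatting: collapse spaces, title-case names, keep only digits in phones."""
    name = " ".join(record["name"].split()).title()
    phone = "".join(ch for ch in record["phone"] if ch.isdigit())
    return {"name": name, "phone": phone}


def cleanse(records):
    """Return (cleaned, rejected): unique well-formed records, and incomplete originals."""
    seen, cleaned, rejected = set(), [], []
    for raw in records:
        rec = clean_record(raw)
        if not rec["name"] or not rec["phone"]:
            rejected.append(raw)            # incomplete: needs manual correction
        elif (rec["name"], rec["phone"]) not in seen:
            seen.add((rec["name"], rec["phone"]))
            cleaned.append(rec)             # first copy is kept, later copies are redundant
    return cleaned, rejected


# Hypothetical contact records from two separate systems
raw_records = [
    {"name": "  ana   putri ", "phone": "0812-555-0101"},
    {"name": "Ana Putri", "phone": "0812 555 0101"},
    {"name": "Budi Santoso", "phone": ""},
]

cleaned, rejected = cleanse(raw_records)
print(cleaned)
print(rejected)
```

The two differently formatted copies of the same contact collapse into one record once both are put into a consistent format.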
Homework
• Make a simple database (Phonebook) in Microsoft Excel that consists of your friends' contacts from elementary school until now
• Define all the FIELDS that you need in the database (as completely as you can) in order to:
• Classify (group) them based on their social media (FB, IG, Twitter, WA, etc.)
• Sort their names in alphabetical order
• Count friends by school level
• Calculate their age
• Select friends who live in the same town as you
• Give some examples of RECORDS
• Make an ERD from
