
Chapter 6 MIS270

Uploaded by

cd70e08884
Management Information Systems MIS-270
King Abdulaziz University
Yara Abualfaraj

Chapter 6
Foundations of Business Intelligence: Databases and Information Management

Learning Objectives
6.1
What are the problems of managing data resources in a traditional file
environment?
6.2
What are the major capabilities of database management systems (DBMS),
and why is a relational DBMS so powerful?
6.3
What are the principal tools and technologies for accessing information
from databases to improve business performance and decision making?
6.4
Why are information policy, data administration, and data quality
assurance essential for managing the firm’s data resources?
Learning Objective 1:
What are the problems of managing data resources in a traditional file environment?
File Organization Terms and Concepts
A computer system organizes data in a hierarchy that starts with bits and bytes and progresses to fields, records, files, and databases.
1- bit: the smallest unit of data a computer can handle.
2- byte: a group of bits that represents a single character, which can be a letter, a number, or another symbol.
3- field: a grouping of characters into a word, a group of words, or a complete number (such as a person’s name or age).
4- record: a group of related fields, such as the student’s name, the course taken, the date, and the grade.
5- file: a group of records of the same type.
6- entity: a person, place, thing, or event on which we store and maintain information. A record describes an entity.
7- attribute: each characteristic or quality describing a particular entity.
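The hierarchy above can be sketched in a few lines of Python. This is only an illustration: the `StudentRecord` class, its field names, and the sample values are hypothetical and are not taken from the course material.

```python
from dataclasses import dataclass

# Hypothetical record type: each attribute below is one field,
# and each field describes one attribute of the entity "student".
@dataclass
class StudentRecord:
    name: str
    course: str
    date: str
    grade: str

# A file is a group of records of the same type.
course_file = [
    StudentRecord("John Stewart", "IS 101", "F19", "B+"),
    StudentRecord("Sara Ali", "IS 101", "F19", "A"),
]

# Each character of a field is stored as one byte, and each byte is 8 bits.
name_bytes = course_file[0].name.encode("ascii")
print(len(name_bytes), "bytes =", len(name_bytes) * 8, "bits")
```

Reading from the bottom up, bits form bytes, bytes form a field such as `name`, fields form a record, and records form a file.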
Problems with the Traditional File Environment
In a traditional file environment, files are created, maintained, and operated by separate divisions or departments. The resulting problems are:
1. Data Redundancy
2. Inconsistency
3. Program-Data Dependence
4. Lack of Flexibility
5. Poor Security
6. Lack of Data Sharing and Availability
Data redundancy:
The presence of duplicate data in multiple data files, so that the same data are stored in more than one place or location.
It occurs when different groups collect the same piece of data and store it independently of each other.
Data inconsistency:
The same attribute may have different values in different files.
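A small sketch can make redundancy and inconsistency concrete. It assumes two hypothetical departmental files (`sales_file` and `billing_file`) that each keep their own copy of the same customer. The names and values are invented for illustration.

```python
# Two hypothetical departmental files, each holding its own copy
# of the same customer data (data redundancy).
sales_file = {"C001": {"name": "A. Khan", "city": "Jeddah"}}
billing_file = {"C001": {"name": "A. Khan", "city": "Riyadh"}}

def find_inconsistencies(file_a, file_b):
    """Return (key, attribute, value_a, value_b) wherever the two copies disagree."""
    conflicts = []
    for key in file_a.keys() & file_b.keys():
        for attr in file_a[key].keys() & file_b[key].keys():
            if file_a[key][attr] != file_b[key][attr]:
                conflicts.append((key, attr, file_a[key][attr], file_b[key][attr]))
    return conflicts

conflicts = find_inconsistencies(sales_file, billing_file)
print(conflicts)
```

Because each department updates its own copy, the `city` attribute has drifted apart, which is the inconsistency described above.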
Program-data dependence:
The coupling of data stored in files and the specific programs required to update and maintain those files, such that changes in programs require changes to the data.
Lack of flexibility:
A traditional file system cannot deliver ad hoc reports or respond to unanticipated information requirements in a timely fashion.
The information required by ad hoc requests is somewhere in the system but may be too expensive to retrieve.
Poor security:
Because there is little control or management of data, access to information may be out of control.
Lack of data sharing and availability:
Because pieces of information in different files and different parts of the organization cannot be related to one another, it is virtually impossible for information to be shared or accessed in a timely manner.
Information cannot flow freely across different functional areas of the organization.
Learning Objective 2:
What are the major capabilities of database management systems (DBMS), and why is a relational DBMS so powerful?
Database management system
Database:
A collection of data organized to serve many applications efficiently by centralizing the data and controlling redundant data.
Database technology cuts through many of the problems of traditional file organization.
Rather than storing data in separate files for each application, data appear to users as being stored in only one location.
A single database services multiple applications.
Database management system (DBMS):
Software that enables an organization to centralize data, manage them efficiently, and provide access to the stored data by application programs.
The DBMS acts as an interface between application programs and the physical data files.
Data sharing throughout the organization is easier because the data are presented to users as being in a single location rather than fragmented in many different systems and files.
How a DBMS Solves the Problems of the Traditional File Environment:
1. Reduces data redundancy and inconsistency.
2. May not enable the organization to eliminate data redundancy entirely, but it can help control redundancy.
3. Eliminates data inconsistency, because the DBMS can help ensure that every occurrence of redundant data has the same values.
4. Uncouples programs and data, enabling data to stand on their own.
5. Enables the organization to centrally manage data, their use, and security.
Relational DBMS:
1. The most popular type of DBMS for larger computers and mainframes is the relational DBMS.
2. Relational databases represent data as two-dimensional tables (called relations).
3. Tables may be referred to as files. Each table contains data on an entity and its attributes.
1. Tuples (rows): records for different entities.
2. Fields (columns): each field represents an attribute for that entity.
3. Key field: the field in the table that uniquely identifies each record.
4. Primary key: each table in a relational database has one field that is designated as its primary key. This key field is the unique identifier for all the information in any row of the table, and this primary key cannot be duplicated.
5. Foreign key: a primary key from one table that is used in a second table as a lookup field to identify records from the original table.
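The key concepts above can be tried out with Python's built-in `sqlite3` module. The sketch below assumes two hypothetical tables, `supplier` and `part`. The table names, column names, and sample rows are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("""
    CREATE TABLE supplier (
        supplier_number INTEGER PRIMARY KEY,
        supplier_name   TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE part (
        part_number     INTEGER PRIMARY KEY,
        part_name       TEXT NOT NULL,
        supplier_number INTEGER REFERENCES supplier(supplier_number)
    )""")

conn.execute("INSERT INTO supplier VALUES (8259, 'CBM Inc.')")
conn.execute("INSERT INTO part VALUES (137, 'Door latch', 8259)")

# A primary key cannot be duplicated: this second insert is rejected.
try:
    conn.execute("INSERT INTO supplier VALUES (8259, 'Another Co.')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The foreign key in part is used as a lookup field into supplier.
rows = conn.execute("""
    SELECT part.part_name, supplier.supplier_name
    FROM part JOIN supplier
      ON part.supplier_number = supplier.supplier_number
""").fetchall()
print(duplicate_rejected, rows)
```

Here `supplier_number` is the primary key of `supplier` and appears in `part` as a foreign key, so a part can be matched to its supplier without storing the supplier's name twice.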
Capabilities of Database Management Systems
1) Data definition capability:
Used to create database tables and to define the characteristics of the fields in each table.
This information about the database would be documented in a data dictionary.
2) Data dictionary:
An automated or manual file that stores definitions of data elements and their characteristics.
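As a rough illustration of these two capabilities, the sketch below defines a hypothetical `student` table in SQLite and then reads the field definitions back with SQLite's `table_info` pragma. The table and its fields are assumptions made for the example. A real data dictionary would normally hold much more, such as ownership, usage, and business definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: create a table and declare each field's characteristics.
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        age        INTEGER
    )""")

# SQLite's table_info pragma returns one row per field:
# (cid, name, type, notnull, default_value, pk)
dictionary = [
    {"field": name, "type": col_type, "required": bool(notnull), "key": bool(pk)}
    for _cid, name, col_type, notnull, _default, pk
    in conn.execute("PRAGMA table_info(student)")
]
for entry in dictionary:
    print(entry)
```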
3) Querying and reporting:
Most DBMS have a specialized language called a data manipulation language that is used to add, change, delete, and retrieve the data in the database.
The most prominent data manipulation language today is Structured Query Language, or SQL.
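The four data manipulation operations map onto the SQL statements INSERT, UPDATE, DELETE, and SELECT. A minimal sketch, using SQLite through Python and a hypothetical `part` table with invented rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE part (part_number INTEGER PRIMARY KEY, part_name TEXT, unit_price REAL)"
)

# Add
conn.executemany(
    "INSERT INTO part VALUES (?, ?, ?)",
    [(137, "Door latch", 22.00), (145, "Side mirror", 12.00), (150, "Door molding", 6.00)],
)
# Change
conn.execute("UPDATE part SET unit_price = 10.00 WHERE part_number = 145")
# Delete
conn.execute("DELETE FROM part WHERE part_number = 150")
# Retrieve
rows = conn.execute(
    "SELECT part_number, part_name, unit_price FROM part ORDER BY part_number"
).fetchall()
print(rows)
```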
Microsoft Access also uses SQL, but it provides its own set of user-friendly tools for querying databases and for organizing data from databases into more polished reports.
Microsoft Access and other DBMS include capabilities for report generation so that the data of interest can be displayed in a more structured and polished format than would be possible just by querying.
Non-relational Databases, Cloud Databases, and Blockchain
Companies are turning to “NoSQL” non-relational database technologies to handle data and workloads that do not fit the traditional relational model of tables, columns, and rows.
1- Non-relational database management systems:
use a more flexible data model
are designed for managing large data sets across many distributed machines
scale up or down easily
are useful for accelerating simple queries against large volumes of structured and unstructured data
2- Cloud databases and distributed databases:
Cloud providers offer relational database engines as a service. For example, Amazon Relational Database Service (Amazon RDS) offers Microsoft SQL Server, Oracle Database, PostgreSQL, or Amazon Aurora as database engines.
Distributed database: a database that is stored in multiple physical locations (private cloud).
Blockchain:
A distributed database technology that enables firms and organizations to create and verify transactions on a network nearly instantaneously without a central authority.
The blockchain maintains a continuously growing list of records called blocks.
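The idea of a growing list of linked blocks can be sketched as follows. This is a deliberately simplified, single-machine illustration. It leaves out the distributed network, the agreement process among participants, and the encryption of identities, and every function name and sample transaction is hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block

def block_hash(index, data, previous_hash):
    """Hash the block's contents together with the previous block's hash."""
    payload = json.dumps([index, data, previous_hash]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def add_block(chain, data):
    """Append a new block that points back to the current last block."""
    previous_hash = chain[-1]["hash"] if chain else GENESIS
    index = len(chain)
    chain.append({
        "index": index,
        "data": data,
        "previous_hash": previous_hash,
        "hash": block_hash(index, data, previous_hash),
    })

def is_valid(chain):
    """Recompute every hash and check each link to the preceding block."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i > 0 else GENESIS
        if block["previous_hash"] != expected_prev:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["previous_hash"]):
            return False
    return True

chain = []
add_block(chain, "Firm A pays Firm B 100")
add_block(chain, "Firm B ships 20 units to Firm A")
print(is_valid(chain))

chain[0]["data"] = "Firm A pays Firm B 1"  # tamper with an earlier record
print(is_valid(chain))
```

Because each block stores the hash of the block before it, changing an earlier record makes the stored hashes stop matching, which is how tampering becomes detectable.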
There are many benefits to firms using blockchain databases:
Blockchain networks radically reduce the cost of verifying users and validating transactions, and reduce the risks of storing and processing transaction information across thousands of firms.
Encryption is used to identify participants and transactions.
Standardization of recording transactions is aided through the use of smart contracts.
Smart contracts: computer programs that implement the rules governing transactions between firms, e.g., what is the price of products, how will they be shipped, and when will the transaction be completed.
The simplicity and security that blockchain offers have made it attractive for storing and securing financial transactions, supply chain transactions, medical records, and other types of data.
Blockchain is a foundation technology for Bitcoin, Ethereum, and other cryptocurrencies.
Learning Objective 3:
What are the principal tools and technologies for accessing information from databases to improve business performance and decision making?
The Challenge of Big Data
Big data:
Data sets with volumes so huge that they are beyond the ability of typical DBMS to capture, store, and analyze.
Massive sets of unstructured and semi-structured data from web traffic, social media, sensors, and so on.
Volumes too great for typical DBMS.
Petabytes and exabytes of data.
Businesses are interested in big data because they can reveal more patterns and interesting relationships than smaller data sets.
To derive business value from these data, organizations need new technologies and tools capable of managing and analyzing nontraditional data along with their traditional enterprise data.
Business Intelligence Infrastructure
A contemporary infrastructure for business intelligence has an array of tools for obtaining useful information from all the different types of data used by businesses today, including semi-structured and unstructured big data in vast quantities.
These capabilities include:
1. Data Warehouses
2. Data Marts
3. Hadoop
4. In-Memory Computing
5. Analytic Platforms
Data Warehouse:
A database that stores current and historical data of potential interest to decision makers throughout the company. The data originate in many core operational transaction systems.
The data warehouse makes the data available for anyone to access as needed, but the data cannot be altered.
It provides a range of ad hoc and standardized query tools, analytical tools, and graphical reporting facilities.
Data Mart:
A subset of a data warehouse in which a summarized or highly focused portion of the organization’s data is placed in a separate database for a specific population of users.
A data mart typically focuses on a single subject or line of business.
Hadoop:
An open source software framework that enables distributed parallel processing of huge amounts of data across inexpensive computers.
It breaks a big data problem down into subproblems, distributes them among many inexpensive processing nodes, and then combines the results into a smaller data set that is easier to analyze.
Hadoop runs on a cluster of inexpensive servers, and processors can be added or removed as needed.
Key services:
1. Hadoop Distributed File System (HDFS) for data storage.
2. MapReduce for high-performance parallel data processing. It breaks down the processing of huge data sets and assigns work to the various nodes in a cluster.
3. HBase: Hadoop’s non-relational (NoSQL) database.
Yahoo uses Hadoop to track users’ behavior so it can modify its home page to fit their interests. Life sciences research firm NextBio uses Hadoop and HBase to process data for pharmaceutical companies conducting genomic research.
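The MapReduce idea listed among the key services can be sketched in plain Python. This toy version runs in a single process and does not use Hadoop. The function names and sample text are hypothetical, and a real cluster would run the map step on many servers at once.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def reduce_phase(pairs):
    """Reduce: sum the counts emitted for each word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

# Hypothetical data, already split into chunks as if spread across servers.
chunks = ["big data big insight", "data lake data mart"]

mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(mapped)
print(word_counts)
```

Each chunk can be mapped independently of the others, which is what allows the work to be divided among inexpensive machines before the results are combined.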
In-Memory Computing:
Another way of facilitating big data analysis.
Uses the computer’s main memory (RAM) for data storage to avoid delays in retrieving data from disk storage.
Complex business calculations that used to take hours or days can be completed within seconds, and this can even be accomplished using handheld devices.
Requires optimized hardware.
Analytic Platforms:
High-speed platforms using both relational and non-relational technology that are optimized for analyzing large data sets.
Analytic platforms feature preconfigured hardware-software systems that are specifically designed for query processing and analytics.
Analytic platforms also include in-memory systems and NoSQL non-relational database management systems and are now available as cloud services.
Some companies are starting to pour all of these types of data into a data lake.
Data lake: a repository for raw unstructured data or structured data that for the most part has not yet been analyzed. The data can be accessed in many ways.
Analytical Tools: Relationships, Patterns, Trends
Tools for consolidating, analyzing, and providing access to vast amounts of data to help users make better business decisions:
1. Online analytical processing (OLAP)
2. Data Mining
3. Text Mining
4. Web Mining
Online analytical processing (OLAP):
OLAP supports multidimensional data analysis, enabling users to view the same data in different ways using multiple dimensions.
Each aspect of information (product, pricing, cost, region, or time period) represents a different dimension.
Example: a question such as how many washers were sold during the past quarter can be answered by querying a sales database, but comparing washer sales in each sales region with projected sales requires OLAP.
OLAP enables users to obtain online answers to ad hoc questions such as these fairly rapidly, even when the data are stored in very large databases, such as sales figures for multiple years.
A company would use either a specialized multidimensional database or a tool that creates multidimensional views of data in relational databases.
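Viewing the same data along different dimensions can be illustrated with a short sketch. The sales figures, dimension order, and `total_by` helper below are hypothetical. Real OLAP tools precompute and index such views rather than scanning the facts each time.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, quarter, units sold)
sales = [
    ("washers", "East", "Q1", 50),
    ("washers", "West", "Q1", 70),
    ("bolts",   "East", "Q1", 120),
    ("washers", "East", "Q2", 60),
]

def total_by(facts, *dims):
    """Sum units over the chosen dimensions (0=product, 1=region, 2=quarter)."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dims)
        totals[key] += fact[3]
    return dict(totals)

by_product = total_by(sales, 0)            # one dimension
by_product_region = total_by(sales, 0, 1)  # two dimensions
print(by_product)
print(by_product_region)
```

The same four facts answer both "units per product" and "units per product and region" simply by changing which dimensions are grouped.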
Data Mining:
Provides insights into corporate data that cannot be obtained with OLAP by finding hidden patterns and relationships in large databases and inferring rules from them to predict future behavior.
The types of information obtainable from data mining:
1. Associations
2. Sequences
3. Classification
4. Clustering
5. Forecasting
Associations: occurrences linked to a single event.
Sequences: events are linked over time.
Classification: recognizes patterns that describe the group to which an item belongs by examining existing items that have been classified and by inferring a set of rules.
Clustering: works in a manner similar to classification when no groups have yet been defined. A data mining tool can discover different groupings within data.
Forecasting: uses a series of existing values to forecast what other values will be.
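As one concrete case, an association can be measured as the share of purchases containing one item that also contain another. The baskets and the `confidence` helper below are hypothetical and only illustrate the association idea. Commercial data mining tools search for such rules automatically across very large data sets.

```python
# Hypothetical market baskets, one set of items per purchase.
baskets = [
    {"corn chips", "cola"},
    {"corn chips", "cola", "salsa"},
    {"corn chips"},
    {"cola", "bread"},
]

def confidence(baskets, if_item, then_item):
    """Share of baskets containing if_item that also contain then_item."""
    with_if = [b for b in baskets if if_item in b]
    if not with_if:
        return 0.0
    return sum(then_item in b for b in with_if) / len(with_if)

chips_to_cola = confidence(baskets, "corn chips", "cola")
print(f"{chips_to_cola:.0%} of corn chip purchases also include cola")
```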
Text Mining:
Unstructured data, mostly in the form of text files, account for more than 80 percent of useful organizational information and are one of the major sources of big data that firms want to analyze.
Examples: email, memos, call center transcripts, survey responses, legal cases, patent descriptions, and service reports.
Text mining tools are now available to help businesses analyze these data. They extract key elements from unstructured natural language text, discover patterns and relationships, and summarize the information.
Businesses might turn to text mining to analyze transcripts of calls to customer service centers to identify major service and repair issues or to measure customer sentiment about their company.
Sentiment analysis: software is able to mine text comments in an email message, blog, social media conversation, or survey forms to detect favorable and unfavorable opinions about specific subjects.
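A very rough sketch of sentiment analysis is to count favorable and unfavorable words in each comment. The word lists and comments below are hypothetical, and real sentiment analysis software relies on far larger lexicons or trained language models.

```python
import re

# Hypothetical word lists, kept tiny for illustration.
POSITIVE = {"great", "fast", "helpful", "love"}
NEGATIVE = {"slow", "broken", "rude", "refund"}

def sentiment(text):
    """Label text by comparing counts of positive and negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "favorable"
    if score < 0:
        return "unfavorable"
    return "neutral"

comments = [
    "The technician was helpful and fast",
    "My washer arrived broken and support was rude",
]
labels = [sentiment(c) for c in comments]
print(labels)
```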
Web Mining:
The discovery and analysis of useful patterns and information from the World Wide Web (WWW).
1) Web content mining:
The process of extracting knowledge from the content of web pages, which may include text, image, audio, and video data.
2) Web structure mining:
Examines data related to the structure of a particular website.
3) Web usage mining:
Examines user interaction data recorded by a web server whenever requests for a website’s resources are received. The usage data record the user’s behavior when the user browses or makes transactions on the website, and the data are collected in a server log.
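Web usage mining starts from the server log. The sketch below counts page requests in a few log entries. The entries are invented and are assumed to follow the Common Log Format that many web servers write.

```python
import re
from collections import Counter

# Hypothetical server log entries in Common Log Format.
log_lines = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /products/washers HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2023:13:56:02 +0000] "GET /products/washers HTTP/1.1" 200 2326',
    '203.0.113.5 - - [10/Oct/2023:13:57:10 +0000] "POST /checkout HTTP/1.1" 200 512',
]

# Capture the client host, request method, requested path, and status code.
REQUEST = re.compile(r'^(\S+) .*?"(\w+) (\S+) [^"]*" (\d{3})')

page_hits = Counter()
for line in log_lines:
    match = REQUEST.match(line)
    if match:
        _host, _method, path, _status = match.groups()
        page_hits[path] += 1

print(page_hits.most_common())
```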
Databases and the Web
Many companies now use the web to make some of the information in their internal databases available to customers and business partners.
A typical configuration includes:
1. Web server
2. Application server/middleware/CGI scripts
3. Database server (hosting the DBMS)
Web server:
The web server passes requests for data to software that translates HTML commands into SQL so the commands can be processed by the DBMS.
Application server/middleware/CGI scripts:
The application server software handles all application operations, including transaction processing and data access, between browser-based computers and a company’s back-end business applications or databases.
A CGI script is a compact program using the Common Gateway Interface (CGI) specification for processing data on a web server.
Database server (hosting the DBMS):
In a client/server environment, the DBMS resides on a dedicated computer called a database server.
Advantages of using the web to access an organization’s internal databases:
1. Web browser software is much easier to use than proprietary query tools.
2. The web interface requires few or no changes to the internal database.
3. It costs much less to add a web interface in front of a legacy system than to redesign and rebuild the system to improve user access.
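The role of the middle layer can be sketched as a function that turns a browser request into SQL and returns the result as HTML. Everything here is a simplified assumption: the `handle_request` function, the `part` table, and the `q` parameter are hypothetical, and a production system would use a real web server and application server.

```python
import sqlite3
from html import escape

# Stand-in for the internal database on the database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (part_number INTEGER PRIMARY KEY, part_name TEXT)")
conn.execute("INSERT INTO part VALUES (137, 'Door latch')")

def handle_request(params):
    """Hypothetical middleware: turn a browser request into SQL, then into HTML."""
    rows = conn.execute(
        "SELECT part_number, part_name FROM part WHERE part_name LIKE ?",
        (f"%{params.get('q', '')}%",),  # parameter binding keeps user input out of the SQL text
    ).fetchall()
    items = "".join(f"<li>{number}: {escape(name)}</li>" for number, name in rows)
    return f"<ul>{items}</ul>"

page = handle_request({"q": "latch"})
print(page)
```

The database itself is unchanged. Only the thin layer in front of it knows about HTML, which reflects the second advantage listed above.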
Learning Objective 4:
Why are information policy, data administration, and data quality assurance essential for managing the firm’s data resources?
Establishing an Information Policy
Every business, large and small, needs an information policy.
Information policy: specifies the organization’s rules for sharing, disseminating, acquiring, standardizing, classifying, and inventorying information.
In a small business, the information policy would be established and implemented by the owners or managers.
In a large organization, managing and planning for information as a corporate resource often require a formal data administration function.
1) Data administration: responsible for the specific policies and procedures through which data can be managed as an organizational resource.
2) Data governance: deals with the policies and processes for managing the availability, usability, integrity, and security of the data employed in an enterprise, with special emphasis on promoting privacy, security, data quality, and compliance with government regulations.
3) Database administration: responsible for defining and organizing the structure and content of the database and maintaining the database.
Ensuring Data Quality
Data that are inaccurate, untimely, or inconsistent with other sources of information lead to incorrect decisions, product recalls, and financial losses.
Gartner, Inc. reported that more than 25 percent of the critical data in large Fortune 1000 companies’ databases is inaccurate or incomplete.
Before a new database is in place, organizations need to:
1. Identify and correct their faulty data.
2. Establish better routines for editing data once their database is in operation.
Data quality audit:
A structured survey of the accuracy and level of completeness of the data in an information system.
Data cleansing (also known as data scrubbing):
Activities for detecting and correcting data in a database that are incorrect, incomplete, improperly formatted, or redundant.
Data cleansing not only corrects errors but also enforces consistency among different sets of data that originated in separate information systems.
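A small sketch shows what cleansing and a simple completeness check might look like in practice. The records, field names, and rules below are hypothetical. Commercial data cleansing tools apply far richer matching and correction logic.

```python
# Hypothetical customer records merged from two separate systems.
records = [
    {"id": "C001", "name": " ali hassan ", "phone": "0501234567"},  # badly formatted
    {"id": "C001", "name": "Ali Hassan",   "phone": "0501234567"},  # redundant copy
    {"id": "C002", "name": "Sara Omar",    "phone": ""},            # incomplete
]

def cleanse(rows):
    """Standardize name formatting and drop redundant copies (first one wins)."""
    seen, cleaned = set(), []
    for row in rows:
        fixed = {**row, "name": " ".join(row["name"].split()).title()}
        if fixed["id"] not in seen:
            seen.add(fixed["id"])
            cleaned.append(fixed)
    return cleaned

def completeness(rows, field):
    """Audit-style measure: share of records with a non-empty value for field."""
    return sum(bool(r[field]) for r in rows) / len(rows)

cleaned = cleanse(records)
print(cleaned)
print(completeness(cleaned, "phone"))
```

The cleansing step repairs formatting and removes the redundant record, while the completeness figure is the kind of measure a data quality audit would report.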
