Chapter 5 ITM100

Uploaded by

adinasara212

A Database

● An organized collection of data stored centrally to serve various information system applications

Basic Concepts
● Entity: Person, place, thing, event about which information is maintained
● Attribute: Description of a particular entity
● Key field: Identifier field used to retrieve, update, sort a record

Enrollment System
● Maintains information about students, courses, faculty, schedules, and enrollment
● Students: name, current major, gender, Student ID, advisor
● Courses: course identifier, department, course name, units
● Faculty: name, employee ID, department, courses taught
● Schedule: courses taught by all professors in all departments

File Organization Terms and Concepts


● Database: Group of related files
● File: Group of records of the same type
● Record: Group of related fields
● Field: Group of characters as word(s) or number(s)
● Byte: Group of bits that represents a single character
● Bit: Smallest unit of data; binary digit (0,1)
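
The hierarchy above can be traced in a few lines of Python. This is only an illustrative sketch: the student record, its field names, and its values are made up for the example and do not come from the chapter.

```python
# Hypothetical student record; field names and values are illustrative only.
record = {"student_id": "1001", "name": "Ana", "major": "ITM"}

# A field is a group of characters. Each character is stored as one byte here
# (ASCII), and each byte is a group of 8 bits.
field = record["name"]
field_bytes = field.encode("ascii")
bits = [format(b, "08b") for b in field_bytes]

# A file is a group of records of the same type; a database groups related files.
student_file = [record, {"student_id": "1002", "name": "Ben", "major": "FIN"}]
database = {"students": student_file, "courses": []}

print(bits)  # one 8-bit string per character of the field
```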

Problems with the Traditional File Environment


● Files maintained separately by different departments
● Data redundancy: Presence of duplicate data in multiple files
● Data inconsistency: Same attribute has different values
● Program-data dependence: Changes in program require changes to data
accessed by the program
● Lack of flexibility: Cannot deliver ad hoc reports or respond to unanticipated information requirements
● Poor security
● Lack of data sharing and availability
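
Redundancy and inconsistency are easy to see when two departments keep their own copies of the same data. The sketch below assumes two hypothetical files (a registrar file and an advising file, both invented for this example) and reports which students are duplicated and which attributes disagree.

```python
# Hypothetical files kept separately by two departments (illustrative data).
registrar_file = {
    "1001": {"name": "Ana", "major": "ITM"},
    "1002": {"name": "Ben", "major": "FIN"},
}
advising_file = {
    "1001": {"name": "Ana", "major": "ACC"},
    "1003": {"name": "Cy", "major": "ITM"},
}

# Redundancy: the same student is stored in both files.
redundant_ids = sorted(registrar_file.keys() & advising_file.keys())

# Inconsistency: the same attribute holds different values in the two copies.
inconsistent = [
    (sid, attr, registrar_file[sid][attr], advising_file[sid][attr])
    for sid in redundant_ids
    for attr in registrar_file[sid]
    if registrar_file[sid][attr] != advising_file[sid].get(attr)
]

print(redundant_ids)
print(inconsistent)
```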

Database Management Systems


● Database: Serves many applications by centralizing data and controlling redundant data
● Database management system (DBMS):
  ● Interfaces between applications and physical data files
  ● Separates logical and physical views of data
  ● Solves problems of traditional file environment
    ● Controls redundancy
    ● Eliminates inconsistency
    ● Uncouples programs and data
    ● Enables organization to centrally manage data and data security

Relational DBMS
● Represents data as two-dimensional tables
● Table: Grid of columns and rows
● Rows (tuples): Records for different entities
● Fields (columns): Represent an attribute of the entity
● Key field: Field used to uniquely identify each record
● Primary key: Key field that serves as the unique identifier for every row in a table
● Foreign key: Primary key of one table that appears in a second table as a look-up field to identify records from the original table
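
Primary and foreign keys can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, not part of the chapter: the advisor and student tables and their columns are assumptions loosely based on the enrollment example, and SQLite only enforces foreign keys once the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical tables: advisor_id is the primary key of advisor and
# reappears in student as a foreign key (look-up field).
conn.execute("""CREATE TABLE advisor (
    advisor_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL)""")
conn.execute("""CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    advisor_id INTEGER REFERENCES advisor(advisor_id))""")

conn.execute("INSERT INTO advisor VALUES (1, 'Dr. Lee')")
conn.execute("INSERT INTO student VALUES (1001, 'Ana', 1)")

# A student row that points at a nonexistent advisor violates the foreign key.
try:
    conn.execute("INSERT INTO student VALUES (1002, 'Ben', 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(fk_rejected, student_count)
```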

Capabilities of Database Management Systems


● Data definition capability: Specifies structure of the database
● Data dictionary: Stores definition of data elements and their characteristics
● Querying and reporting:
  ● Data manipulation language
    ● Structured Query Language (SQL)
  ● Many DBMS have report generation capabilities for creating polished reports (e.g., Microsoft Access)
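
As a rough illustration of data definition and a data dictionary, the sketch below defines a hypothetical course table in SQLite (the column names are assumptions taken from the enrollment example) and then reads the column definitions back with SQLite's table_info pragma. Full DBMS products offer richer data dictionaries; this only shows the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: specify the structure of a hypothetical course table.
conn.execute("""CREATE TABLE course (
    course_id TEXT PRIMARY KEY,
    department TEXT NOT NULL,
    course_name TEXT NOT NULL,
    units INTEGER)""")

# Data dictionary: table_info returns one row per column, shaped as
# (cid, name, type, notnull, dflt_value, pk).
dictionary = [
    (name, col_type, bool(notnull), bool(pk))
    for _, name, col_type, notnull, _, pk in conn.execute("PRAGMA table_info(course)")
]

for entry in dictionary:
    print(entry)  # (column name, declared type, NOT NULL?, part of primary key?)
```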

Operations of a Relational DBMS


1. SELECT: Creates a subset consisting of all records (rows) that meet stated criteria
2. JOIN: Combines relational tables to provide user with more information than available in
individual tables
3. PROJECT: Creates subset of columns in table, creating tables with only the information
specified
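
The three operations map onto ordinary SQL queries. The sketch below uses Python's sqlite3 module with two hypothetical tables (student and enrollment, with invented rows). Note that the relational SELECT operation corresponds to a WHERE clause, while PROJECT corresponds to naming specific columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, major TEXT);
CREATE TABLE enrollment (student_id INTEGER, course_id TEXT);
INSERT INTO student VALUES (1001, 'Ana', 'ITM'), (1002, 'Ben', 'FIN'), (1003, 'Cy', 'ITM');
INSERT INTO enrollment VALUES (1001, 'ITM100'), (1002, 'FIN300');
""")

# SELECT: the subset of rows that meet a stated criterion.
selected = conn.execute(
    "SELECT * FROM student WHERE major = 'ITM' ORDER BY student_id"
).fetchall()

# JOIN: combine two tables through their shared student_id field.
joined = conn.execute(
    """SELECT s.name, e.course_id
       FROM student AS s JOIN enrollment AS e ON s.student_id = e.student_id
       ORDER BY s.student_id"""
).fetchall()

# PROJECT: the subset of columns that was asked for.
projected = conn.execute(
    "SELECT name, major FROM student ORDER BY student_id"
).fetchall()

print(selected)
print(joined)
print(projected)
```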

Designing Databases
● Conceptual design: Abstract model of database from a business perspective
● Entity-relationship diagram: Methodology for documenting databases illustrating
relationships between database entities
● Normalization: Process of creating small stable data structures from complex groups of
data
● Physical design: Detailed description of how the data will actually be arranged and
stored on physical devices
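
Normalization can be pictured as splitting one wide, repetitive structure into several small ones. The sketch below starts from hypothetical flat rows in which student and course details repeat for every enrollment, then separates them into three structures. The data and names are invented, and the example shows the general idea rather than a formal normal-form procedure.

```python
# Hypothetical unnormalized rows: student and course details repeat per enrollment.
flat_rows = [
    ("1001", "Ana", "ITM100", "Intro to IT", 3),
    ("1002", "Ben", "ITM100", "Intro to IT", 3),
    ("1001", "Ana", "FIN300", "Finance", 3),
]

students, courses, enrollments = {}, {}, []
for student_id, name, course_id, course_name, units in flat_rows:
    students[student_id] = {"name": name}                               # one entry per student
    courses[course_id] = {"course_name": course_name, "units": units}   # one entry per course
    enrollments.append((student_id, course_id))                         # only the two keys

print(students)
print(courses)
print(enrollments)
```

After the split, each course name is stored once, so correcting it means changing a single entry.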

Non-relational Databases and Databases in the Cloud


● Non-relational databases (NoSQL):
  ● More flexible data model
  ● Data sets stored across distributed machines
  ● Easier to scale
  ● Handle large volumes of unstructured and structured data
● Databases in the cloud:
  ● Appeal to start-ups, smaller businesses
  ● Examples: Amazon Relational Database Service, Microsoft SQL Azure
  ● Private clouds

Blockchain
● Distributed ledgers in a peer-to-peer distributed database
● Maintains a growing list of records and transactions shared by all
● Encryption used to identify participants and transactions
● Used for financial transactions, supply chain, and medical records
● Foundation of Bitcoin, and other cryptocurrencies
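
The "growing list of records" can be sketched as a chain of blocks in which each block stores the hash of the previous one. This is a toy, single-machine illustration with invented transactions and helper names. It leaves out the peer-to-peer network, consensus, and digital signatures that real blockchains depend on.

```python
import hashlib
import json

def block_hash(block):
    """SHA-256 hash of a block's contents (keys sorted for a stable encoding)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def add_block(chain, transactions):
    """Append a block that records the hash of the previous block."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "transactions": transactions, "prev_hash": prev_hash})

def is_valid(chain):
    """Every block must point at the actual hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1]) for i in range(1, len(chain))
    )

ledger = []
add_block(ledger, ["A pays B 5"])   # hypothetical transactions
add_block(ledger, ["B pays C 2"])
valid_before = is_valid(ledger)

ledger[0]["transactions"] = ["A pays B 500"]  # tamper with an earlier record
valid_after = is_valid(ledger)

print(valid_before, valid_after)
```

Changing an earlier block changes its hash, so the link stored in the next block no longer matches and the tampering is detectable.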

Business Intelligence Infrastructure


● Data warehouse:
  ● Stores current and historical data from many core operational transaction systems
  ● Consolidates and standardizes information for use across enterprise, but data cannot be altered
  ● Provides analysis and reporting tools
● Data marts:
  ● Subset of a data warehouse
  ● Summarized or highly focused portion of firm’s data for use by specific population of users
  ● Typically focuses on single subject or line of business
● Hadoop:
  ● Enables distributed parallel processing of big data across inexpensive computers
  ● Key services:
    ● Hadoop Distributed File System (HDFS): Data storage
    ● MapReduce: Splits a processing job into tasks that run in parallel across the cluster and combines the results
    ● HBase: NoSQL database
● In-memory computing:
  ● Used in big data analysis
  ● Uses computer's main memory (RAM) for data storage to avoid delays in retrieving data from disk storage
  ● Can reduce hours/days of processing to seconds
  ● Requires optimized hardware
● Analytic platforms:
  ● High-speed platforms using both relational and non-relational tools optimized for large datasets
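
The MapReduce idea mentioned under Hadoop can be imitated on one machine with a word count. This sketch uses plain Python rather than Hadoop itself. The documents and function names are made up, and in a real cluster the map and reduce steps would run in parallel on different nodes.

```python
from collections import defaultdict

# Hypothetical input, standing in for data blocks spread across machines.
documents = ["big data big insight", "data warehouse data mart"]

def map_phase(doc):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word, 1) for word in doc.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: combine the values for each key into a single result."""
    return {key: sum(values) for key, values in grouped.items()}

mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = reduce_phase(shuffle(mapped))
print(counts)
```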

Analytical Tools: Relationships, Patterns, Trends


● Tools for consolidating, analyzing, and providing access to vast amounts of data to help
users make better business decisions
● Tools include:
  ● Multidimensional data analysis (OLAP)
  ● Data mining
  ● Text mining
  ● Web mining

Online Analytical Processing (OLAP)


● Supports multidimensional data analysis
  ● Viewing data using multiple dimensions
  ● Each aspect of information (product, pricing, cost, region, time period) is a different dimension
● OLAP enables rapid, online answers to ad hoc queries
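
Viewing the same facts along different dimensions can be sketched with a small aggregation helper. The sales figures, dimension names, and the roll_up function below are assumptions made for illustration. Real OLAP tools precompute and index such summaries to answer ad hoc queries quickly.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, quarter, units sold).
facts = [
    ("nuts", "East", "Q1", 50),
    ("nuts", "West", "Q1", 30),
    ("bolts", "East", "Q1", 20),
    ("nuts", "East", "Q2", 70),
]
DIMENSIONS = ("product", "region", "quarter")

def roll_up(rows, *dims):
    """Total the units sold for each combination of the chosen dimensions."""
    positions = [DIMENSIONS.index(d) for d in dims]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[p] for p in positions)] += row[3]
    return dict(totals)

by_product = roll_up(facts, "product")
by_region_quarter = roll_up(facts, "region", "quarter")
print(by_product)
print(by_region_quarter)
```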

Data Mining
● Finds hidden patterns, relationships in datasets
  ● Example: Customer buying patterns
● Infers rules to predict future behavior
● Types of information obtainable from data mining:
  ● Associations
  ● Sequences
  ● Classification
  ● Clustering
  ● Forecasting
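
Associations, the first type listed, can be illustrated by counting which items appear together in purchases. The baskets below are invented, and the confidence helper is a simplified stand-in for what association-rule tools compute: of the baskets containing item A, the share that also contain item B.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
baskets = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips", "salsa"},
    {"cola"},
]

# How often each item, and each pair of items, occurs across baskets.
item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)

def confidence(a, b):
    """Share of baskets containing a that also contain b."""
    pair = tuple(sorted((a, b)))
    return pair_counts[pair] / item_counts[a]

print(confidence("chips", "cola"))
print(confidence("salsa", "chips"))
```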

Text Mining and Web Mining
● Text mining: Extracts key elements from large unstructured data sets
● Web mining: Discovery and analysis of useful patterns and information from the web
  ● Web content/structure/usage mining
● Sentiment analysis: Mines text comments in email, blog, social media conversation, or
survey to detect favorable and unfavorable opinions about specific subjects
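
A very naive form of sentiment analysis simply counts favorable and unfavorable words. The word lists and comments below are invented for illustration. Production tools use far larger lexicons or trained models and handle negation, sarcasm, and context, which this sketch ignores.

```python
# Hypothetical word lists; real sentiment lexicons are much larger.
POSITIVE = {"great", "love", "fast"}
NEGATIVE = {"slow", "broken", "hate"}

def sentiment(comment):
    """Label a comment by comparing counts of positive and negative words."""
    words = [w.strip(".,!?").lower() for w in comment.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "favorable"
    if score < 0:
        return "unfavorable"
    return "neutral"

for text in ["Love the fast shipping!", "The app is slow and broken.", "It arrived Tuesday."]:
    print(text, "->", sentiment(text))
```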

Databases and the Web


● Many companies use the web to make some internal databases available to customers
or partners
● Advantages of using the web for database access:
  ● Ease of use of browser software
  ● Web interface requires few or no changes to database
  ● Inexpensive to add web interface to system

Establishing an Information Policy


● Firm’s rules, procedures, roles for sharing, managing, standardizing data
● Data administration: Establishes policies and procedures to manage data
● Data governance: Deals with policies and processes for managing availability, usability,
integrity, and security of data, especially regarding government regulations
● Database administration: Creating and maintaining database

Ensuring Data Quality


● More than 25 percent of critical data in Fortune 1000 company databases are inaccurate
or incomplete
● Data quality audit: Structured survey of the accuracy and completeness of data in an
information system
● Data cleansing: Consists of activities for detecting and correcting data in an information system that are incorrect, incomplete, improperly formatted, or redundant
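
A data quality audit can start as a simple scan for incomplete, badly formatted, and duplicate records. The records, field names, and rules below are assumptions chosen for illustration. A real audit would follow the organization's own data standards and often sample from much larger files.

```python
# Hypothetical records with typical quality problems.
records = [
    {"student_id": "1001", "name": "Ana", "units": "3"},
    {"student_id": "1002", "name": "", "units": "3"},       # incomplete
    {"student_id": "1001", "name": "Ana", "units": "3"},    # redundant
    {"student_id": "1003", "name": "Cy", "units": "three"}, # improperly formatted
]

def audit(rows):
    """Return (row index, problem) pairs for every issue found."""
    issues, seen_ids = [], set()
    for i, row in enumerate(rows):
        if not row["name"].strip():
            issues.append((i, "missing name"))
        if not row["units"].isdigit():
            issues.append((i, "non-numeric units"))
        if row["student_id"] in seen_ids:
            issues.append((i, "duplicate student_id"))
        seen_ids.add(row["student_id"])
    return issues

issues = audit(records)
flagged_rows = {i for i, _ in issues}
clean_share = 1 - len(flagged_rows) / len(records)
print(issues)
print(clean_share)
```

Cleansing would then correct or remove the flagged rows, for example by dropping the duplicate and converting "three" to 3 after confirming the intended value.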
