Coronel PPT Ch02 Modified

The document discusses data models and their evolution. It covers hierarchical and network models, which represented complex manufacturing and data relationships using tree structures and allowed both one-to-many and many-to-many relationships. However, these models required knowledge of physical storage and did not scale well. The relational model improved on these issues using a table-based structure and provided flexibility, simplicity and ease of use.


11e Database Systems

Design, Implementation, and Management

Chapter 2
Data Models
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Learning Objectives
§ In this chapter, you will learn:
§ About data modeling and why data models are important
§ About the basic data-modeling building blocks
§ What business rules are and how they influence database design
§ How the major data models evolved
§ About emerging alternative data models and the need they fulfill
§ How data models can be classified by their level of abstraction

Database Model
A database model is a type of data model that determines the logical structure of a database and fundamentally determines in which manner data can be stored, organized and manipulated. The most popular example of a database model is the relational model, which uses a table-based format.

https://en.wikipedia.org/wiki/Database_model

Data Modeling and Data Models
§ Data modeling: Iterative and progressive process of creating a specific data model for a determined problem domain
§ Data models: Simple representations of complex real-world data structures
§ Useful for supporting a specific problem domain
§ Model: Abstraction of a real-world object or event

Importance of Data Models
Are a communication tool

Give an overall view of the database

Organize data for various users

Are an abstraction for the creation of a good database

Data Model Basic Building Blocks
§ Entity: Unique and distinct object used to collect and store data
§ Attribute: Characteristic of an entity
§ Customer name is a distinguishable occurrence
§ Relationship: Describes an association among entities
§ One-to-many (1:M), e.g., PAINTER paints PAINTING
§ Many-to-many (M:N or M:M), e.g., EMPLOYEE learns SKILL
§ One-to-one (1:1), e.g., EMPLOYEE manages STORE
§ Constraint: Set of rules to ensure data integrity
§ E.g., a salary must have a value between 6,000 and 350,000
§ E.g., 0 <= GPA <= 4.0
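The building blocks above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the table and column names (`painter`, `painting`, `employee`, `student`) and the sample rows are hypothetical, and SQLite is just one convenient way to show an entity, an attribute, a 1:M relationship, and the two constraints from the slide.

```python
import sqlite3

# Hypothetical tables that mirror the slide's examples; names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE painter (                       -- entity
    painter_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL                 -- attribute
);
CREATE TABLE painting (                      -- "many" side of the 1:M relationship
    painting_id INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    painter_id  INTEGER NOT NULL REFERENCES painter(painter_id)
);
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    salary REAL CHECK (salary BETWEEN 6000 AND 350000)   -- constraint
);
CREATE TABLE student (
    stu_id INTEGER PRIMARY KEY,
    gpa    REAL CHECK (gpa >= 0 AND gpa <= 4.0)          -- constraint
);
""")


def try_insert(sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


painter_ok = try_insert("INSERT INTO painter VALUES (?, ?)", (1, "Painter A"))
first_painting_ok = try_insert("INSERT INTO painting VALUES (?, ?, ?)", (1, "Work 1", 1))
second_painting_ok = try_insert("INSERT INTO painting VALUES (?, ?, ?)", (2, "Work 2", 1))
orphan_painting_ok = try_insert("INSERT INTO painting VALUES (?, ?, ?)", (3, "Work 3", 99))
low_salary_ok = try_insert("INSERT INTO employee VALUES (?, ?)", (1, 500))
valid_gpa_ok = try_insert("INSERT INTO student VALUES (?, ?)", (1, 3.5))
invalid_gpa_ok = try_insert("INSERT INTO student VALUES (?, ?)", (2, 4.7))
```

One painter may own many paintings, while a painting that points at a nonexistent painter, a salary below 6,000, or a GPA above 4.0 is rejected by the database rather than by application code.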
Business Rules
Brief, precise, and unambiguous description of a
policy, procedure, or principle

Enable defining the basic building blocks

Describe main and distinguishing characteristics of the data

Sources of Business Rules

Company managers

Policy makers

Department managers

Written documentation

Direct interviews with end users

Reasons for Identifying and Documenting Business Rules
§ Help standardize the company’s view of data
§ Serve as a communication tool between users and designers
§ Allow the designer to:
§ Understand the nature, role, and scope of the data, as well as business processes
§ Develop appropriate relationship participation rules and constraints
§ Create an accurate data model

Translating Business Rules into Data Model Components
§ Nouns translate into entities
§ Verbs translate into relationships among entities
§ Relationships are bidirectional
§ Questions to identify the relationship type
§ How many instances of B are related to one instance of A?
§ How many instances of A are related to one instance of B?
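The two questions above are enough to classify a relationship. A minimal sketch, assuming each question is answered with either "one" or "many" (the function name `classify_relationship` and the answer strings are hypothetical, chosen only for this illustration):

```python
def classify_relationship(b_per_a, a_per_b):
    """Classify the relationship between entities A and B.

    b_per_a: answer to "How many instances of B are related to one instance of A?"
    a_per_b: answer to "How many instances of A are related to one instance of B?"
    Each answer must be the string "one" or "many".
    """
    answers = (b_per_a, a_per_b)
    if any(answer not in ("one", "many") for answer in answers):
        raise ValueError("answers must be 'one' or 'many'")
    if answers == ("one", "one"):
        return "1:1"
    if answers == ("many", "many"):
        return "M:N"
    return "1:M"  # exactly one direction allows many


# A = PAINTER, B = PAINTING: one painter paints many paintings,
# and each painting is painted by one painter.
painter_painting = classify_relationship("many", "one")
# A = EMPLOYEE, B = SKILL: many skills per employee, many employees per skill.
employee_skill = classify_relationship("many", "many")
# A = EMPLOYEE, B = STORE: one store per manager, one manager per store.
employee_store = classify_relationship("one", "one")
```

Asking the question in both directions matters: looking only from PAINTER to PAINTING would not distinguish 1:M from M:N.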

Naming Conventions
§ Entity names: Required to:
§ Be descriptive of the objects in the business environment
§ Use terminology that is familiar to the users
§ Attribute names: Required to be descriptive of the data represented by the attribute
§ Proper naming:
§ Facilitates communication between parties
§ Promotes self-documentation


Evolution of Major Data Models

Hierarchical and Network Models
Hierarchical Models
§ Manage large amounts of data for complex manufacturing projects
§ Represented by an upside-down tree which contains segments
§ Segments: Equivalent of a file system’s record type
§ Depicts a set of one-to-many (1:M) relationships
Network Models
§ Represent complex data relationships
§ Improve database performance and impose a database standard
§ Depicts both one-to-many (1:M) and many-to-many (M:N) relationships
Hierarchical Model
Advantages
§ Promotes data sharing
§ Parent/child relationship promotes conceptual simplicity and data integrity
§ Database security is provided and enforced by DBMS
§ Efficient with 1:M relationships (Each parent can have many children, but each child has only one parent)
Disadvantages
§ Requires knowledge of physical data storage characteristics
§ Navigational system requires knowledge of hierarchical path
§ Changes in structure require changes in all application programs
§ Implementation limitations
§ No data definition
§ Lack of standards
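The single-parent tree and the need to know the hierarchical path can be illustrated with a small in-memory sketch. It is not how any particular hierarchical DBMS stores data; the `Segment` class, the `navigate` function, and the CAR/ENGINE/PISTON segment names are hypothetical.

```python
class Segment:
    """A node in an upside-down tree; each segment has at most one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)  # a parent can have many children (1:M)

    def path(self):
        """Return the hierarchical path from the root down to this segment."""
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


def navigate(root, path):
    """Reach a segment by following an explicit path that starts at the root."""
    if not path or path[0] != root.name:
        raise KeyError("path must start at the root segment")
    node = root
    for name in path[1:]:
        matches = [child for child in node.children if child.name == name]
        if not matches:
            raise KeyError(name)
        node = matches[0]
    return node


car = Segment("CAR")
engine = Segment("ENGINE", parent=car)
body = Segment("BODY", parent=car)
piston = Segment("PISTON", parent=engine)

found = navigate(car, ["CAR", "ENGINE", "PISTON"])
```

A program that asks for `["CAR", "PISTON"]` fails because it skipped a level, which is the kind of dependence on structure that forces application changes whenever the tree is reorganized.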
Network Model
Advantages
§ Conceptual simplicity
§ Handles more relationship types
§ Data access is flexible
§ Data owner/member relationship promotes data integrity
§ Conformance to standards
§ Includes data definition language (DDL) and data manipulation language (DML)
Disadvantages
§ System complexity limits efficiency
§ Navigational system yields complex implementation, application development, and management
§ Structural changes require changes in all application programs

Standard Database Concepts

Schema
• Conceptual organization of the entire database as viewed by the database administrator

Subschema
• Portion of the database seen by the application programs that produce the desired information from the data within the database

Standard Database Concepts

Data manipulation language (DML)
• Environment in which data can be managed and is used to work with the data in the database

Schema data definition language (DDL)
• Enables the database administrator to define the schema components
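The DDL/DML split survives in today's SQL systems, so a modern analogy may help. The sketch below uses Python's `sqlite3` module rather than a network DBMS, and the `skill` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: defines a schema component (here, one hypothetical table).
conn.execute(
    "CREATE TABLE skill (skill_id INTEGER PRIMARY KEY, skill_name TEXT NOT NULL)"
)

# DML: works with the data stored inside that structure.
conn.execute("INSERT INTO skill VALUES (?, ?)", (1, "Welding"))
conn.execute("INSERT INTO skill VALUES (?, ?)", (2, "Data entry"))
conn.execute("UPDATE skill SET skill_name = ? WHERE skill_id = ?", ("Data modeling", 2))
conn.execute("DELETE FROM skill WHERE skill_id = ?", (1,))

remaining = conn.execute("SELECT skill_id, skill_name FROM skill").fetchall()
```

The `CREATE TABLE` statement changes what the database can hold; the `INSERT`, `UPDATE`, `DELETE`, and `SELECT` statements only change or read what it currently holds.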
The Relational Model
§ Produced an "automatic transmission" database that replaced "standard transmission" databases
§ Based on a relation
§ Relation or table: Matrix composed of intersecting tuples and attributes
§ Tuple: Row
§ Attribute: Column
§ Describes a precise set of data manipulation constructs

Relational Model
Advantages
§ Structural independence is promoted using independent tables
§ Tabular view improves conceptual simplicity
§ Ad hoc query capability is based on SQL
§ Isolates the end user from physical-level details
§ Improves implementation and management simplicity
Disadvantages
§ Requires substantial hardware and system software overhead
§ Conceptual simplicity gives untrained people the tools to use a good system poorly
§ May promote information problems

Relational Database Management System (RDBMS)
§ Performs the basic functions provided by the hierarchical and network DBMSs
§ Makes the relational data model easier to understand and implement
§ Hides the complexities of the relational model from the user

Linking relational tables

Figure 2.2 - A Relational Diagram

SQL-Based Relational Database Application

§ End-user interface
§ Allows end user to interact with the data
§ Collection of tables stored in the database
§ Each table is independent from another
§ Rows in different tables are related based on common values in common attributes
§ SQL engine
§ Executes all queries
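How independent tables are related through common values in a common attribute can be sketched with a join. The `agent` and `customer` tables, their columns, and the sample rows below are hypothetical; SQLite (through Python's `sqlite3` module) stands in for the SQL engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE agent (
    agent_code  INTEGER PRIMARY KEY,
    agent_lname TEXT NOT NULL
);
CREATE TABLE customer (
    cus_code   INTEGER PRIMARY KEY,
    cus_lname  TEXT NOT NULL,
    agent_code INTEGER            -- common attribute shared with agent
);
INSERT INTO agent VALUES (501, 'Alby'), (502, 'Hahn');
INSERT INTO customer VALUES (10010, 'Ramas', 502),
                            (10011, 'Dunne', 501),
                            (10012, 'Smith', 502);
""")

# The SQL engine matches rows whose agent_code values are equal.
rows = conn.execute("""
    SELECT c.cus_lname, a.agent_lname
    FROM customer AS c
    JOIN agent AS a ON c.agent_code = a.agent_code
    ORDER BY c.cus_code
""").fetchall()
```

Neither table stores a pointer to the other; the link exists only because both carry the same `agent_code` values, which is what keeps the tables structurally independent.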

The Entity Relationship Model
§ Graphical representation of entities and their relationships in a database structure
§ Entity relationship diagram (ERD)
§ Uses graphic representations to model database components
§ An entity is represented by a rectangle, with its name (in singular form) written in the center of the rectangle
§ An entity is mapped to a relational table
§ Entity instance or entity occurrence
§ A row in the relational table
§ Connectivity: Term used to label the relationship types
Entity Relationship Model
Advantages
§ Visual modeling yields conceptual simplicity
§ Visual representation makes it an effective communication tool
§ Is integrated with the dominant relational model
Disadvantages
§ Limited constraint representation
§ Limited relationship representation
§ No data manipulation language
§ Loss of information content occurs when attributes are removed from entities to avoid crowded displays
Figure 2.3 - The ER Model Notations

The Object-Oriented Data Model (OODM) or Semantic Data Model
§ Object-oriented database management system (OODBMS)
§ Based on OODM
§ Object: Contains data and their relationships, together with the operations that are performed on it
§ Basic building block for autonomous structures
§ Abstraction of a real-world entity
§ Attributes: Describe the properties of an object

The Object-Oriented Data Model (OODM)
§ Class: Collection of similar objects with shared structure and behavior, organized in a class hierarchy
§ Class hierarchy: Resembles an upside-down tree in which each class has only one parent
§ Inheritance: Object inherits methods and attributes of its parent class
§ Unified Modeling Language (UML)
§ Describes sets of diagrams and symbols to graphically model a system
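Classes, a single-parent class hierarchy, and inheritance map directly onto an object-oriented programming language. A minimal sketch in Python, assuming hypothetical `Person`, `Customer`, and `Employee` classes that do not come from the slides:

```python
class Person:
    """Parent class: structure (attributes) and behavior (methods) shared by subclasses."""

    def __init__(self, name, birth_year):
        self.name = name
        self.birth_year = birth_year

    def age_in(self, year):
        """Return the person's age at the end of the given year."""
        return year - self.birth_year


class Customer(Person):
    """Child class with exactly one parent; adds a balance attribute."""

    def __init__(self, name, birth_year, balance=0.0):
        super().__init__(name, birth_year)
        self.balance = balance


class Employee(Person):
    """Another child of Person; adds a salary attribute."""

    def __init__(self, name, birth_year, salary):
        super().__init__(name, birth_year)
        self.salary = salary


customer = Customer("Customer A", 1990, balance=120.5)
employee = Employee("Employee B", 1985, salary=42000)

# Both objects inherit age_in() and the name/birth_year attributes from Person.
customer_age = customer.age_in(2015)
employee_age = employee.age_in(2015)
```

Because `age_in` is defined once in the parent, every subclass computes age the same way, which is the sense in which inheritance supports data integrity.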

Object-Oriented Model
Advantages
§ Semantic content is added
§ Visual representation includes semantic content
§ Inheritance promotes data integrity
Disadvantages
§ Slow development of standards caused vendors to supply their own enhancements
§ Compromised widely accepted standard
§ Complex navigational system
§ Learning curve is steep
§ High system overhead slows transactions

Figure 2.4 - A Comparison of OO, UML, and ER Models

Invoices are generated by customers, each invoice references one or more lines, and each line represents an item purchased by a customer.
Object/Relational and XML
§ Extended relational data model (ERDM)
§ Supports OO features and complex data representation
§ Object/Relational Database Management System (O/R DBMS)
§ Based on ERDM, focuses on better data management
§ Extensible Markup Language (XML)
§ Manages unstructured data for efficient and effective exchange of all data types

Big Data
§ Aims to:
§ Find new and better ways to manage large amounts of web and sensor-generated data
§ Provide high performance and scalability at a reasonable cost
§ Characteristics
§ Volume
§ Velocity
§ Variety

Big Data Challenges
Volume does not allow the usage of conventional structures

Expensive

OLAP tools proved inconsistent when dealing with unstructured data

Big Data New Technologies

Hadoop

Hadoop Distributed File System (HDFS)

MapReduce

NoSQL

Hadoop
§ Hadoop is a Java-based, open-source, high-speed, fault-tolerant distributed storage and computational framework.
§ Hadoop uses low-cost hardware to create clusters of thousands of computer nodes to store and process data.
§ Hadoop originated from Google’s work on distributed file systems and parallel processing and is currently supported by the Apache Software Foundation.
§ Hadoop has several modules, but the two main components are Hadoop Distributed File System (HDFS) and MapReduce.

Hadoop Distributed File System (HDFS)
§ Hadoop Distributed File System (HDFS) is a highly distributed, fault-tolerant file storage system designed to manage large amounts of data at high speeds.
§ To achieve high throughput, HDFS uses the write-once, read-many model. This means that once the data is written, it cannot be modified.
§ HDFS uses three types of nodes: a name node that stores all the metadata about the file system, a data node that stores fixed-size data blocks (that could be replicated to other data nodes), and a client node that acts as the interface between the user application and the HDFS.
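The idea of fixed-size blocks whose locations are tracked as metadata can be sketched in a few lines. This is a toy model, not the Hadoop API: the function names, the node names `dn1` to `dn3`, and the round-robin replica placement are all assumptions made for illustration (real HDFS placement is more involved).

```python
def split_into_blocks(data, block_size):
    """Cut a byte string into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(n_blocks, data_nodes, replication):
    """Return name-node-style metadata: block index -> data nodes holding a copy.

    Assumed policy: block i goes to `replication` consecutive nodes, round-robin.
    """
    return {
        i: [data_nodes[(i + r) % len(data_nodes)] for r in range(replication)]
        for i in range(n_blocks)
    }


file_bytes = b"abcdefghij"
blocks = split_into_blocks(file_bytes, block_size=4)
metadata = place_blocks(len(blocks), ["dn1", "dn2", "dn3"], replication=2)

# A client reads the file by asking for the blocks in order and joining them.
reassembled = b"".join(blocks)
```

With two copies of every block, losing any single data node in this toy layout still leaves one copy of each block available, which is the intuition behind the fault tolerance mentioned above.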

MapReduce
§ MapReduce is an open-source application programming interface (API) that provides fast data analytics services.
§ MapReduce distributes the processing of the data among thousands of nodes in parallel.
§ The MapReduce framework provides two main functions, Map and Reduce. In general terms, the Map function takes a job and divides it into smaller units of work; the Reduce function collects all the output results generated from the nodes and integrates them into a single result set.
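The Map and Reduce roles described above can be imitated in plain Python with a word count. This sketch follows the slide's simplified description and does not use the Hadoop MapReduce API; the function names and sample lines are hypothetical, and the "nodes" are simulated by processing each unit in turn.

```python
from collections import Counter


def map_phase(lines, n_units):
    """Divide the job (a list of text lines) into n_units smaller units of work."""
    return [lines[i::n_units] for i in range(n_units)]


def node_work(unit):
    """Work done independently by one simulated node: count the words in its unit."""
    counts = Counter()
    for line in unit:
        counts.update(line.lower().split())
    return counts


def reduce_phase(partial_results):
    """Collect the per-node outputs and integrate them into a single result set."""
    total = Counter()
    for partial in partial_results:
        total.update(partial)
    return dict(total)


lines = ["big data big clusters", "data nodes", "big nodes"]
units = map_phase(lines, n_units=3)
partials = [node_work(unit) for unit in units]  # would run in parallel on a cluster
result = reduce_phase(partials)
```

Each unit can be counted without knowing anything about the others, which is what lets the work spread across many nodes; only the final merge needs to see every partial result.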
NoSQL Databases
§ Not based on the relational model
§ Support distributed database architectures
§ Provide high scalability, high availability, and fault tolerance
§ Support large amounts of sparse data
§ Geared toward performance rather than transaction consistency
§ Store data in key-value stores

NoSQL
Advantages
§ High scalability, availability, and fault tolerance are provided
§ Uses low-cost commodity hardware
§ Supports Big Data
§ Key-value model improves storage efficiency
Disadvantages
§ Complex programming is required
§ There is no relationship support
§ There is no transaction integrity support
§ In terms of data consistency, it provides an eventually consistent model

Figure 2.5 - A Simple Key-value Representation

Schema-less
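Since the figure itself is not reproduced here, the following is only a rough sketch of what a schema-less key-value store looks like, using a Python dictionary. The key format (`entity:id:attribute`), the `put`/`get` helpers, and the sample values are hypothetical.

```python
store = {}  # the whole "database": keys map to values, with no declared schema


def put(key, value):
    """Store a value under a key; the store does not validate its structure."""
    store[key] = value


def get(key, default=None):
    """Fetch a value by key, or a default when the key was never stored."""
    return store.get(key, default)


# Two customers with different sets of attributes; nothing forces them to match.
put("customer:10010:lname", "Ramas")
put("customer:10010:phone", "844-2573")
put("customer:10011:lname", "Dunne")

missing_phone = get("customer:10011:phone")
stored_keys = sorted(store)
```

An attribute that is absent simply has no key, so sparse data takes no space; the trade-off is that relationships and integrity rules have to be handled by the application.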

Figure 2.6 - The Evolution of Data Models

Table 2.3 - Data Model Basic Terminology Comparison

Figure 2.7 - Data Abstraction Levels

The External Model
§ End users’ view of the data environment
§ ER diagrams are used to represent the external views
§ External schema: Specific representation of an external view

Figure 2.8 - External Models for Tiny College

Relationship
§ A PROFESSOR may teach many CLASSes, and each CLASS is taught by only one PROFESSOR; there is a 1:M relationship between PROFESSOR and CLASS.
§ A CLASS may ENROLL many STUDENTs, and each STUDENT may ENROLL in many CLASSes, thus creating an M:N relationship between STUDENT and CLASS.
§ Each COURSE may generate many CLASSes, but each CLASS references a single COURSE. For example, there may be several classes (sections) of a database course that have a course code of CIS-420, each of which is offered at a different time.
§ A CLASS requires one ROOM, but a ROOM may be scheduled for many CLASSes. That is, each classroom may be used for several classes: one at 9:00 a.m., one at 11:00 a.m., and one at 1:00 p.m., for example. In other words, there is a 1:M relationship between ROOM and CLASS.
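These four relationships can be written down as a relational schema. The sketch below uses Python's `sqlite3` module; the column names, the bridge table layout for ENROLL, and the sample rows are assumptions, since the slides describe the relationships but not the table structures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE professor (prof_id INTEGER PRIMARY KEY, prof_name TEXT NOT NULL);
CREATE TABLE course    (course_code TEXT PRIMARY KEY, course_title TEXT NOT NULL);
CREATE TABLE room      (room_code TEXT PRIMARY KEY);
CREATE TABLE student   (stu_id INTEGER PRIMARY KEY, stu_name TEXT NOT NULL);

-- CLASS sits on the "many" side of three 1:M relationships.
CREATE TABLE class (
    class_id    INTEGER PRIMARY KEY,
    course_code TEXT    NOT NULL REFERENCES course(course_code),
    prof_id     INTEGER NOT NULL REFERENCES professor(prof_id),
    room_code   TEXT    NOT NULL REFERENCES room(room_code),
    class_time  TEXT    NOT NULL
);

-- ENROLL resolves the M:N relationship between STUDENT and CLASS.
CREATE TABLE enroll (
    stu_id   INTEGER NOT NULL REFERENCES student(stu_id),
    class_id INTEGER NOT NULL REFERENCES class(class_id),
    PRIMARY KEY (stu_id, class_id)
);

INSERT INTO professor VALUES (1, 'Professor A');
INSERT INTO course    VALUES ('CIS-420', 'Database course');
INSERT INTO room      VALUES ('R101');
INSERT INTO student   VALUES (100, 'Student X'), (101, 'Student Y');
INSERT INTO class     VALUES (1, 'CIS-420', 1, 'R101', '9:00 a.m.'),
                             (2, 'CIS-420', 1, 'R101', '11:00 a.m.');
INSERT INTO enroll    VALUES (100, 1), (100, 2), (101, 1);
""")


def count(sql, params=()):
    """Run a COUNT(*) query and return the single number it produces."""
    return conn.execute(sql, params).fetchone()[0]


classes_for_course = count("SELECT COUNT(*) FROM class WHERE course_code = ?", ("CIS-420",))
classes_in_room = count("SELECT COUNT(*) FROM class WHERE room_code = ?", ("R101",))
classes_for_student = count("SELECT COUNT(*) FROM enroll WHERE stu_id = ?", (100,))
students_in_class = count("SELECT COUNT(*) FROM enroll WHERE class_id = ?", (1,))
```

Each 1:M relationship becomes a foreign key on the "many" side, while the M:N relationship needs its own table whose rows pair one student with one class.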

The Conceptual Model
§ Represents a global view of the entire database by the entire organization
§ Conceptual schema: Basis for the identification and high-level description of the main data objects
§ Has a macro-level view of the data environment
§ Is software and hardware independent
§ Logical design: Task of creating a conceptual data model

Figure 2.9 - Conceptual Model for Tiny College

The Internal Model
§ Represents the database as seen by the DBMS, mapping the conceptual model to the DBMS
§ Internal schema: Specific representation of an internal model
§ Uses the database constructs supported by the chosen database
§ Is software dependent and hardware independent
§ Logical independence: The internal model can be changed without affecting the conceptual model

Figure 2.10 - Internal Model for Tiny College

The Physical Model
§ Operates at the lowest level of abstraction
§ Describes the way data are saved on storage media such as disks or tapes
§ Requires the definition of physical storage and data access methods
§ Relational model is aimed at the logical level
§ Does not require physical-level details
§ Physical independence: Changes in the physical model do not affect the internal model

Table 2.4 - Levels of Data Abstraction
