Cho 2

Uploaded by

zeinn20032003
Chapter 2 Introduction to Data Modeling

Objectives: to understand
• Data modeling and why data models are important
• The basic data-modeling building blocks
• What business rules are and how they influence database design
• How the major data models evolved historically
• How data models can be classified by level of abstraction

• Data modeling reduces complexities of database design
• Designers, programmers, and end users see data in different ways
• Different views of same data lead to designs that do not reflect organization’s operation
• Various degrees of data abstraction help reconcile varying views of same data

CS275 Fall 2010 1 CS275 Fall 2010 2

Data Modeling and Data Models
• Model: an abstraction of a real-world object or event
  – Useful in understanding complexities of the real-world environment
• Data models
  – Relatively simple representations of complex real-world data structures
    • Often graphical
• Creating a Data model is iterative and progressive

The Importance of Data Models
• Facilitate interaction among the designer, the applications programmer, and the end user
• End users have different views and needs for data
• Data model organizes data for various users
• Data model is a conceptual model - an abstraction
• It’s a graphical collection of logical constructs representing the data structure and relationships within the database.
  – Cannot draw required data out of the data model
  – An implementation model would represent how the data are represented in the database.

Data Model Basic Building Blocks Terminology
• Entity: anything about which data are to be collected and stored
• Attribute: a characteristic of an entity
• Relationship: describes an association among entities
  – One-to-many (1:M) relationship
  – Many-to-many (M:N or M:M) relationship
  – One-to-one (1:1) relationship
• Constraint: a restriction placed on the data

Business Rules
• Descriptions of policies or principles within an organization
• Description of operations or procedures, to create/enforce actions within an organization’s environment
  – Must be in writing and kept up to date
  – Must be easy to understand and widely disseminated
  – Sometimes externally defined, e.g. government regulations.
• These describe characteristics of data as viewed by the company
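The four building blocks (entity, attribute, relationship, constraint) can be illustrated with a small Python sketch. The Customer and Invoice entities and the "amount must not be negative" rule are hypothetical examples chosen for illustration, not taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Invoice:
    # Entity: something about which data are collected and stored.
    number: int        # attribute
    amount: float      # attribute

    def __post_init__(self) -> None:
        # Constraint: a restriction placed on the data (hypothetical rule).
        if self.amount < 0:
            raise ValueError("invoice amount must not be negative")


@dataclass
class Customer:
    name: str          # attribute
    # Relationship (1:M): one customer generates many invoices.
    invoices: list[Invoice] = field(default_factory=list)


customer = Customer("Acme Ltd")
customer.invoices.append(Invoice(number=1, amount=250.0))
customer.invoices.append(Invoice(number=2, amount=99.5))
print(len(customer.invoices))
```

A business rule such as "a customer may generate many invoices" shows up here as the list attribute, while a rule such as "amounts are never negative" shows up as the check in `__post_init__`.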

Discovering Business Rules
• Sources of business rules:
  – Company managers
  – Policy makers
  – Department managers
  – Written documentation
    • Procedures
    • Standards
    • Operations manuals
  – Direct interviews with end users
• Always verify sources of information

Importance of Business Rules
• Standardize company’s view of data
• Useful as a communications tool between users and designers
• Allows the designer to
  – understand the nature, role, and scope of data
  – understand business processes
  – develop appropriate relationship participation rules and constraints
• Promotes the creation of an accurate data model

Translating Business Rules into Data Model Components
• Generally, nouns translate into entities
• Verbs translate into relationships among entities
• Relationships are bidirectional
• Two questions to identify the relationship type:
  – How many instances of B are related to one instance of A?
  – How many instances of A are related to one instance of B?

Naming Conventions
• Naming occurs during translation of business rules to data model components
• Names should make the object unique and distinguishable from other objects
• Names should also be descriptive of objects in the environment and be familiar to users
• Proper naming:
  – Facilitates communication between parties
  – Promotes self-documentation
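The two "how many instances" questions map directly onto the three relationship types. A minimal sketch follows, assuming each question is reduced to a yes/no answer ("more than one?"); the function name `relationship_type` is hypothetical.

```python
def relationship_type(many_b_per_a: bool, many_a_per_b: bool) -> str:
    """Classify a relationship from the two 'how many' questions.

    many_b_per_a: can one instance of A be related to more than one B?
    many_a_per_b: can one instance of B be related to more than one A?
    """
    if many_b_per_a and many_a_per_b:
        return "M:N"
    if many_b_per_a or many_a_per_b:
        return "1:M"
    return "1:1"


# Hypothetical rules: a customer generates many invoices,
# and each invoice is generated by exactly one customer.
print(relationship_type(many_b_per_a=True, many_a_per_b=False))  # 1:M
```

Asking the question in both directions matters because relationships are bidirectional: answering only one of the two cannot distinguish 1:M from M:N.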

Evolution of Data Implementation Models
• Hierarchical
  – Logically represented by an upside down tree
    • Each parent can have many children
    • Each child has only one parent
• Network
• Relational
• Object oriented
• Hybrid, XML

The Hierarchical Model
• The hierarchical model was developed in the 1960s to manage large amounts of data for manufacturing projects
• Basic logical structure is represented by an upside-down “tree”
• Hierarchical structure contains levels or segments
  – Segment analogous to a record type
  – Set of one-to-many relationships between segments
• Example – manufacturing a car from components (a, b, or c), each made of subassemblies (1, 2, or 3), each having parts (x, y, & z) ... (tree structure)

Hierarchical Structure
• Each parent can have many children
• Each child has only one parent
• Tree is defined by path that traces parent segments to child segments, beginning from the left
• Hierarchical path
  – Ordered sequencing of segments tracing hierarchical structure
• Preorder traversal or hierarchic sequence
  – “Left-list” path
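The hierarchical path (preorder traversal) can be sketched in a few lines of Python. The `Segment` class and the car example data below are hypothetical illustrations; a real hierarchical DBMS stores and navigates segments quite differently.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    name: str
    # Each parent can have many children; each child sits under one parent.
    children: list["Segment"] = field(default_factory=list)


def hierarchical_path(root: Segment) -> list[str]:
    """Return segment names in preorder: parent first, then children left to right."""
    path = [root.name]
    for child in root.children:
        path.extend(hierarchical_path(child))
    return path


car = Segment("Car", [
    Segment("Component A", [Segment("Assembly 1"), Segment("Assembly 2")]),
    Segment("Component B", [Segment("Assembly 3")]),
])
print(hierarchical_path(car))
# ['Car', 'Component A', 'Assembly 1', 'Assembly 2', 'Component B', 'Assembly 3']
```

Reaching "Assembly 3" means walking past everything to its left first, which hints at why application programming against this model was considered complex.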

The Hierarchical Model
• GUAM (Generalized Update Access Method)
  – Based on the recognition that the many smaller parts would come together as components of still larger components
• Information Management System (IMS)
  – World’s leading mainframe hierarchical database system in the 1970s and early 1980s
• TCDMS/ADABAS – jointly developed by IBM and Lane County

The Hierarchical Model
• Advantages
  – Conceptual simplicity
  – Database security
  – Data independence
  – Database integrity
  – Efficiency
• Disadvantages
  – Complex implementation
  – Difficult to manage
  – Lacks structural independence
  – Complex applications programming and use
  – Implementation limitations
  – Lack of standards

The Network Model
• The network model was created to represent complex data relationships more effectively than the hierarchical model
  – Improves database performance
  – Imposes a database standard
  – Represents complex data relationships more effectively, such as a child with multiple parents
• Conference on Data Systems Languages (CODASYL)
• American National Standards Institute (ANSI)
• Database Task Group (DBTG)

The Network Model
• Collection of records in 1:M relationships
• A Set is a relationship and is composed of two record types:
  – Owner: Equivalent to the hierarchical model’s parent
  – Member: Equivalent to the hierarchical model’s child
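A rough sketch of the owner/member idea: each set is a 1:M relationship, and one member record may belong to several sets, giving it more than one "parent". The `NetworkSet` class, the set names, and the records are hypothetical and are not CODASYL syntax.

```python
from dataclasses import dataclass, field


@dataclass
class NetworkSet:
    """A named 1:M relationship between one owner record and its member records."""
    name: str
    owner: str
    members: list[str] = field(default_factory=list)


# Hypothetical records: the same invoice is a member of two different sets,
# so it has two owners, which a strict hierarchy cannot express.
sets = [
    NetworkSet("SALES", owner="Customer: Acme", members=["Invoice 101", "Invoice 102"]),
    NetworkSet("COMMISSION", owner="Sales rep: Lee", members=["Invoice 101"]),
]


def owners_of(record: str, all_sets: list[NetworkSet]) -> list[str]:
    """List every owner whose set contains the given member record."""
    return [s.owner for s in all_sets if record in s.members]


print(owners_of("Invoice 101", sets))  # ['Customer: Acme', 'Sales rep: Lee']
```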

The Network Model Components
• Concepts still used today:
  – Schema: Conceptual organization of entire database as viewed by the database administrator
  – Subschema: Database portion “seen” by the application programs
  – Data management language (DML): Defines the environment in which data can be managed
  – Data definition language (DDL): Enables the administrator to define the schema components

The Network Model
• Advantages:
  – Conformance to standards
  – Handled more relationship types
  – Data access flexibility
• Disadvantages of the network model:
  – System complexity
  – Lack of ad hoc query capability placed burden on programmers to generate code for reports
  – Structural change in the database could produce havoc in all application programs

The Relational Model
• Developed by E.F. Codd (IBM) in 1970
• Relational models were considered impractical in the 1970s.
• Model was conceptually simple at expense of computer overhead
• Relational table is purely logical structure
  – How data are physically stored in the database is of no concern to the user or the designer
  – This concept is the source of a real database revolution

Relational Table
• A Relational table is a purely logical structure
  – How data are physically stored in the database is of no concern to the user or the designer.
• Stores a collection of related entities
  – Resembles a file
• Table (relations)
  – Matrix consisting of a series of row/column intersections
  – Each row in a relation is called a tuple
  – Related to each other by sharing a common entity characteristic

The Relational Model Components


• Relational database management system (RDBMS)
– Performs same functions provided by hierarchical
model, but hides complexity from the user
• Relational schema/diagram
– Visual representation of relational database’s
entities, attributes within those entities, and
relationships between those entities
• Relational diagram
– Representation of entities, attributes, and
relationships
• Relational table stores collection of related entities.
The Relational DBMS Application
• SQL-based relational database application involves three parts:
  – User interface
    • Allows end user to interact with the data
  – Set of tables stored in the database
    • Each table is independent from another
    • Rows in different tables are related based on common values in common attributes
  – SQL “engine”
    • Executes all queries

The Relational Implementation Model
• Advantages
  – Structural independence
  – Improved conceptual simplicity
  – Easier database design, implementation, management, and use
  – Ad hoc query capability (SQL)
  – Powerful database management system
• Disadvantages
  – Substantial hardware and system software overhead
  – Can facilitate poor design and implementation
  – May promote “islands of information” problems
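To make "rows in different tables are related based on common values in common attributes" concrete, here is a small sketch using Python's built-in sqlite3 module as the SQL engine. The table names, column names, and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE customer (cus_id INTEGER PRIMARY KEY, cus_name TEXT);
    CREATE TABLE invoice  (inv_id INTEGER PRIMARY KEY, cus_id INTEGER, amount REAL);
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO invoice  VALUES (10, 1, 250.0), (11, 1, 99.5), (12, 2, 40.0);
""")

# Ad hoc query: rows in the two tables are matched on the shared cus_id value.
rows = conn.execute("""
    SELECT c.cus_name, SUM(i.amount)
    FROM customer AS c
    JOIN invoice AS i ON i.cus_id = c.cus_id
    GROUP BY c.cus_name
    ORDER BY c.cus_name
""").fetchall()
print(rows)  # [('Acme', 349.5), ('Globex', 40.0)]
```

The two tables are stored independently; the only link between them is the `cus_id` value they share, and the query states what is wanted rather than how to navigate to it.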

Logical/Conceptual Model
The Entity Relationship Model
• Widely accepted standard for data modeling
• Introduced by Chen in 1976
• Graphical representation of entities and their relationships in a database structure
• Entity relationship diagram (ERD)
  – Uses graphic representations to model database components
  – Entity is mapped to a relational table

The Entity Relationship Model
• Entity instance (or occurrence) is row in table
• Entity set is collection of like entities
• Connectivity labels types of relationships
• Relationships are expressed using Chen notation
  – Relationships are represented by a diamond
  – Relationship name is written inside the diamond
• Crow’s Foot notation used as design standard in this book

Logical/Conceptual Model
The Object-Oriented (OO) Model
• Models both data and relationships contained in a
single structure known as an object
• OODM (object-oriented data model) is the basis
for OO-DBMS (Semantic data model)
• An object is described by its factual content:
– Is self-contained: a basic building block for
autonomous structures
– Is an abstraction of a real-world entity
– Contains information about relationships between
facts within the object and with other objects.

The Object-Oriented (OO) Model


• An Object is the logical abstraction or basic
building block for autonomous structures
– Attributes describe the properties of an object
– Objects that share similar characteristics are
grouped in classes
– Classes are organized in a class hierarchy
– Inheritance: an object inherits methods and
attributes of parent class
– UML - Unified Modeling Language is used to
graphically model a system
• based on OO concepts that describe diagrams and symbols
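The ideas of attributes, classes, a class hierarchy, and inheritance can be shown with a short Python sketch. The `Person` and `Student` classes are hypothetical examples; an OO DBMS would additionally make such objects persistent.

```python
class Person:
    """Parent class: attributes and methods shared by every person."""

    def __init__(self, name: str) -> None:
        self.name = name                 # attribute

    def describe(self) -> str:           # method
        return f"{self.name} ({type(self).__name__})"


class Student(Person):
    """Child class: inherits name and describe(), adds its own attribute."""

    def __init__(self, name: str, major: str) -> None:
        super().__init__(name)
        self.major = major


student = Student("Ana", major="CS")
print(student.describe())  # Ana (Student)
```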

Logical Models: Object Oriented Model
• Advantages
  – Adds semantic content
  – Visual presentation includes semantic content
  – Database integrity
  – Both structural and data independence
• Disadvantages
  – Slow pace of OODM standards development
  – Complex navigational data access
  – Steep learning curve
  – High system overhead slows transactions
  – Lack of market penetration

Newer Data Models: Object/Relational
• Extended relational data model (ERDM)
  – Semantic data model developed in response to increasing complexity of applications
  – Includes many of OO model’s best features
  – Often described as an object/relational database management system (O/RDBMS)
  – Primarily geared to business applications

Newer Data Models: XML
• The Internet revolution created the potential to exchange critical business information
• Dominance of Web has resulted in growing need to manage unstructured information
• In this environment, Extensible Markup Language (XML) emerged as the de facto standard
• Current databases support XML
  – XML: the standard protocol for data exchange among systems and Internet services

The Future of Data Models
• Hybrid DBMSs
  – Retain advantages of relational model
  – Provide object-oriented view of the underlying data
• SQL data services – ‘Cloud Computing’
  – Store data remotely without incurring expensive hardware, software, and personnel costs
  – Companies operate on a “pay-as-you-go” system
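As an illustration of XML used for data exchange among systems, the sketch below serializes a record to XML text and parses it back with Python's standard xml.etree.ElementTree module. The `customer` record and its element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical record on the sending system.
customer = {"id": "1", "name": "Acme"}

# Sender: serialize the record as XML text.
element = ET.Element("customer", id=customer["id"])
ET.SubElement(element, "name").text = customer["name"]
payload = ET.tostring(element, encoding="unicode")

# Receiver: parse the text back into a record; only the XML text is shared.
parsed = ET.fromstring(payload)
received = {"id": parsed.get("id"), "name": parsed.findtext("name")}
print(payload)
print(received == customer)  # True
```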

(Figure: The Development of Data Models)

Data Models: A Summary
• Each new data model capitalized on the
shortcomings of previous models
• Common characteristics:
– Conceptual simplicity with semantic completeness
– Represent the real world as closely as possible
– Real-world transformations (behavior) must
comply with consistency and integrity
characteristics
• Some models better suited for some tasks


SPARC Framework: Degrees of Data Abstraction
• Database designer starts with abstracted view, then adds details
• ANSI Standards Planning and Requirements Committee (SPARC)
  – Defined a framework for data modeling based on degrees of data abstraction (1970s):
    1. External
    2. Conceptual
    3. Internal

The SPARC External Model
• Represents the End users’ view of the data environment
• ER diagrams represent external views
• External schema: specific representation of an external view
  – Entities
  – Relationships
  – Processes
  – Constraints

The External Model
• End users’ view of the data environment
• Requires that the modeler subdivide set of requirements and constraints into functional modules that can be examined within the framework of their external models
• Advantages:
  – Easy to identify specific requirements to support each business unit’s operations
  – Facilitates designer’s job by providing feedback about the model’s adequacy
  – Ensures security constraints in database design
  – Simplifies application program development

(Figure: External Models showing two different Users, with a “Conceptual Model” label)

The SPARC Conceptual Model
• Represents global view of the entire database
  – All external views integrated into single global view: conceptual schema
• Representation of data as viewed by high-level managers
• ER Diagram graphically represents the conceptual schema
  – ER model most widely used conceptual model
• Basis for identification and description of main data objects, avoiding details

The Conceptual Model
Advantages
• Provides a relatively easily understood macro level view of data environment
• Independent of both software and hardware
  – Does not depend on the DBMS software used to implement the model
  – Does not depend on the hardware used in the implementation of the model
  – Changes in hardware or software do not affect database design at the conceptual level

The SPARC Internal Model
• Representation of the database as “seen” by the
DBMS
– Maps the Conceptual model to the DBMS
• Internal schema depicts a specific representation
of an internal model
• Depends on specific database software
– Change in DBMS software requires internal model
be changed
• Logical independence: change internal model
without affecting conceptual model
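One common way to picture the external and conceptual levels is with SQL views defined over a single set of base tables. The sketch below uses Python's built-in sqlite3 module and assumes views stand in for external schemas; the `employee` table and both views are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual level: one global table definition (hypothetical schema).
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL);
    INSERT INTO employee VALUES (1, 'Ana', 'Sales', 5000), (2, 'Raj', 'IT', 6000);

    -- External level: each user group gets its own view of the same data.
    CREATE VIEW directory_view AS SELECT name, dept FROM employee;
    CREATE VIEW payroll_view   AS SELECT emp_id, salary FROM employee;
""")

directory = conn.execute("SELECT * FROM directory_view ORDER BY name").fetchall()
payroll = conn.execute("SELECT * FROM payroll_view ORDER BY emp_id").fetchall()
print(directory)
print(payroll)
```

How SQLite lays the rows out on disk never appears in this code, which is the point of keeping the lower levels hidden from the levels above them.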

The Physical Model
• Operates at lowest level of abstraction
  – Describes the way data are saved on storage media such as disks or tapes
  – Software and hardware dependent
• Requires the definition of physical storage and data access methods
• Relational model aimed at logical level
  – Does not require physical-level details
• Physical independence: changes in physical model do not affect internal model

Summary
• A data model is an abstraction of a complex real-world data environment
• Basic data modeling components:
  – Entities
  – Attributes
  – Relationships
  – Constraints
• Business rules identify and define basic modeling components

Summary
• Hierarchical model
  – Set of one-to-many (1:M) relationships between a parent and its children segments
• Network data model
  – Uses sets to represent 1:M relationships between record types
• Relational model
  – Current database implementation standard
  – ER model is a tool for data modeling
    • Complements relational model

Summary
• Object-oriented data model: object is basic modeling structure
• Relational model adopted object-oriented extensions: extended relational data model (ERDM)
• OO data models depicted using UML
• Data-modeling requirements are a function of different data views and abstraction levels
  – Three SPARC abstraction levels: external, conceptual, internal
