Chapter Two Data Moder
Learn about:
• Data modeling and why data models are important
• Basic data-modeling building blocks
• What business rules are and how they influence database
design
• How the major data models evolved
• Emerging alternative data models and the need they fulfill
• How data models can be classified by their level of
abstraction
Data Models
Data Model: an abstraction (simple representation) of a
complex real-world object, event, or data structure
Useful in understanding complexities of the real-world environment
Often graphical
Data modeling is iterative and progressive
Definition:
A set of concepts to describe the structure of a database, the
operations for manipulating these structures, and certain
constraints that the database should obey.
Data Model Structure and Constraints:
Constructs are used to define the database structure
Constructs typically include elements (and their data types) as
well as groups of elements (e.g. entity, record, table), and
relationships among such groups
Constraints specify some restrictions on valid data; these
constraints must be enforced at all times
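As a rough illustration of constructs and constraints, the sketch below uses Python's built-in sqlite3 module. The student table, its columns, and the 0.0 to 4.0 GPA range are assumptions made for this example; they do not come from the definition above.

```python
import sqlite3


def build_schema(conn):
    # Constructs: a group of elements (a table) whose elements have data types.
    # Constraints: NOT NULL and CHECK restrict which data counts as valid.
    conn.execute(
        """
        CREATE TABLE student (
            student_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            gpa        REAL CHECK (gpa BETWEEN 0.0 AND 4.0)
        )
        """
    )


def try_insert(conn, student_id, name, gpa):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO student VALUES (?, ?, ?)", (student_id, name, gpa)
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
build_schema(conn)
accepted = try_insert(conn, 1, "Ada", 3.7)   # satisfies every constraint
rejected = try_insert(conn, 2, "Bob", 5.2)   # violates the CHECK on gpa
```

The database, not the calling program, refuses the second row, which is one way of enforcing a constraint at all times.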
Data Model Operations:
These operations are used for specifying database
retrievals and updates by referring to the constructs of the
data model.
Operations on the data model may include basic model
operations (e.g. generic insert, delete, update) and user-
defined operations (e.g. compute_student_gpa,
update_inventory)
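The difference between basic and user-defined operations can be sketched in a few lines of Python. Only the name compute_student_gpa comes from the text above; the list-of-dictionaries "table", the grade-point scale, and the credit-weighted formula are assumptions made for this example.

```python
# Assumed grade-point scale; real institutions differ.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}

enrollments = []  # each record: student_id, course, credits, grade


# Basic model operations: generic, they work on any table of records.
def insert(table, record):
    table.append(dict(record))


def update(table, predicate, changes):
    """Apply changes to every matching record; return how many were changed."""
    count = 0
    for row in table:
        if predicate(row):
            row.update(changes)
            count += 1
    return count


def delete(table, predicate):
    """Remove every matching record; return how many were removed."""
    kept = [row for row in table if not predicate(row)]
    removed = len(table) - len(kept)
    table[:] = kept
    return removed


# User-defined operation: meaningful only for this particular database.
def compute_student_gpa(table, student_id):
    rows = [r for r in table if r["student_id"] == student_id]
    total_credits = sum(r["credits"] for r in rows)
    if total_credits == 0:
        return None  # no enrollments, so no GPA
    points = sum(GRADE_POINTS[r["grade"]] * r["credits"] for r in rows)
    return points / total_credits


insert(enrollments, {"student_id": 1, "course": "MATH101", "credits": 3, "grade": "A"})
insert(enrollments, {"student_id": 1, "course": "DB101", "credits": 3, "grade": "B"})
insert(enrollments, {"student_id": 2, "course": "DB101", "credits": 4, "grade": "C"})
gpa = compute_student_gpa(enrollments, 1)
```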
Importance of Data Models
Facilitate interaction among the designer, the applications
programmer, and the end user
End users have different views and needs for data
Data model organizes data for various users
A data model is an abstraction: the required data cannot be drawn
out of the data model itself
Data Model Basic Building Blocks
Entity: anything about which data are to be collected and stored
Attribute: a characteristic of an entity
Relationship: describes an association among entities
One-to-many (1:M) relationship
Many-to-many (M:N or M:M) relationship
One-to-one (1:1) relationship
Constraint: a restriction placed on the data
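One possible way to see the four building blocks together is a small schema, sketched here with Python's sqlite3 module. All table and column names (painter, painting, exhibition, policy) are hypothetical and chosen only to show each relationship type.

```python
import sqlite3

SCHEMA = """
-- Entities become tables; attributes become columns.
CREATE TABLE painter (
    painter_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
-- 1:M: a painter paints many paintings; each painting has one painter.
CREATE TABLE painting (
    painting_id INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    painter_id  INTEGER NOT NULL REFERENCES painter(painter_id)
);
-- M:N: a painting can hang in many exhibitions and an exhibition shows
-- many paintings, modeled with a linking table.
CREATE TABLE exhibition (
    exhibition_id INTEGER PRIMARY KEY,
    city          TEXT NOT NULL
);
CREATE TABLE exhibited (
    painting_id   INTEGER NOT NULL REFERENCES painting(painting_id),
    exhibition_id INTEGER NOT NULL REFERENCES exhibition(exhibition_id),
    PRIMARY KEY (painting_id, exhibition_id)
);
-- 1:1: a policy covers one painting, and UNIQUE keeps a painting from
-- being covered by more than one policy.
CREATE TABLE policy (
    policy_id   INTEGER PRIMARY KEY,
    painting_id INTEGER NOT NULL UNIQUE REFERENCES painting(painting_id)
);
"""


def violates(conn, sql):
    """Return True if the statement is rejected by a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript(SCHEMA)
conn.execute("INSERT INTO painter VALUES (1, 'Hypothetical Painter')")
conn.execute("INSERT INTO painting VALUES (10, 'First', 1)")
conn.execute("INSERT INTO painting VALUES (11, 'Second', 1)")  # same painter: 1:M
conn.execute("INSERT INTO policy VALUES (100, 10)")

orphan_rejected = violates(conn, "INSERT INTO painting VALUES (12, 'Orphan', 99)")
second_policy_rejected = violates(conn, "INSERT INTO policy VALUES (101, 10)")
```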
Business Rules
Descriptions of policies, procedures, or principles within a specific
organization
Apply to any organization that stores and uses data to generate information
Must be in writing and kept up to date
Must be easy to understand and widely disseminated
Describe characteristics of data as viewed by the company
Discovering Business Rules
Sources of business rules:
Written documentation
Procedures
Standards
Operations manuals
Direct interviews with end users
Naming Conventions
Naming occurs during translation of business rules to data model
components
Names should make the object unique and distinguishable from other
objects
Names should also be descriptive of objects in the environment and
be familiar to users
Proper naming:
Facilitates communication between parties
Promotes self-documentation
THE EVOLUTION OF DATA MODELS
The quest for better data management has led to several models
that attempt to resolve the file system’s critical shortcomings.
These models represent schools of thought as to what a database
is, what it should do, the types of structures that it should employ,
and the technology that would be used to implement these
structures. Perhaps confusingly, these models are called data
models just as are the graphical data models that we have been
discussing. This section gives an overview of the major data
models in roughly chronological order. You will discover that
many of the “new” database concepts and structures bear a
remarkable resemblance to some of the “old” data model
concepts and structures. Table 2.1 traces the evolution of the
major data models.
Evolution of Data Models:
Hierarchical Model:
• Developed in the 1960s to manage large amounts of data for the Apollo
rocket, which landed on the Moon in 1969.
• Its basic logical structure is represented by an upside-down “tree”.
• It contains levels, or segments. A segment is equivalent to a file
system’s record type.
• A higher layer is perceived as the parent of the segment directly
beneath it.
• It depicts a set of one-to-many (1:M) relationships between a parent and
its children segments.
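The parent/child structure described above can be sketched as a tree in which every segment has at most one parent. The Segment class and the segment names are hypothetical, chosen only for illustration.

```python
class Segment:
    """One node of the upside-down tree: one parent, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # None only for the root segment
        self.children = []        # the "many" side of the 1:M relationship

    def add_child(self, name):
        child = Segment(name, parent=self)
        self.children.append(child)
        return child

    def path(self):
        """Names from the root down to this segment."""
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return " > ".join(reversed(names))


root = Segment("Final assembly")
component_a = root.add_child("Component A")
component_b = root.add_child("Component B")
part = component_a.add_child("Part A1")
```

Here part.path() walks upward through exactly one parent per level, which is what restricts the hierarchical model to 1:M relationships.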
Network Model:
• Created to represent complex data relationships more effectively than
the hierarchical model, to improve database performance, and to
impose a database standard.
• The Database Task Group (DBTG) was created to define standard
database specifications.
• The final DBTG report contains three crucial database components:
  • Schema: defines the conceptual organization of the entire
  database as viewed by the database administrator.
  • Subschema: defines the portion of the database seen by the
  application programs that actually produce the desired information
  from the data contained within the database.
  • Data management language: defines the environment in which
  data can be managed. The DBTG specified three components:
    • Schema data definition language (DDL): used by the database
    administrator to define the schema components.
    • Subschema DDL: defines the database components that will be used by
    application programs.
    • Data manipulation language (DML): used to work with the data in the
    database.
In the following network model, CUSTOMER, SALESREP,
PRODUCT, INVOICE, PAYMENT, and INV_LINE represent
record types.
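Unlike the hierarchical model, the network model lets a record have more than one parent. The sketch below uses the record types named above, but the owner/member pairings are an assumption made for illustration, since the diagram itself is not reproduced in this text.

```python
# Each entry is a 1:M "set": one owner record type and its member record types.
# The pairings below are assumed, not taken from the original figure.
SETS = {
    "SALESREP": ["INVOICE"],
    "CUSTOMER": ["INVOICE"],
    "INVOICE": ["PAYMENT", "INV_LINE"],
    "PRODUCT": ["INV_LINE"],
}


def owners_of(record_type):
    """Return every record type that owns the given record type, sorted by name."""
    return sorted(
        owner for owner, members in SETS.items() if record_type in members
    )


invoice_owners = owners_of("INVOICE")
inv_line_owners = owners_of("INV_LINE")
```

Under these assumed pairings, INVOICE and INV_LINE each have two owners, something a strict tree cannot express.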
Relational Model:
• Introduced in 1970 by E. F. Codd (of IBM) in his paper “A Relational
Model of Data for Large Shared Data Banks”.
• Its foundation is a mathematical concept known as a relation.
• The RDBMS manages all the physical details, while the user sees the
database as a collection of tables in which data is stored.
• Relational diagram: a representation of entities, attributes, and
relationships.
• A relational table stores a collection of related entities.
SQL-Based Relational Model:
An SQL-based relational database application involves three parts:
End-user interface: allows the end user to interact with the data
Set of tables stored in the database
Each table is independent of the others
Rows in different tables are related based on common values in
common attributes
SQL “engine”: executes all queries
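The idea that rows in independent tables are related only through common values in common attributes can be sketched with Python's sqlite3 module, which here plays the role of the SQL engine. The table names, column names, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Two independent tables; nothing links them except the values
    -- stored in the common attribute cus_code.
    CREATE TABLE customer (cus_code INTEGER PRIMARY KEY, cus_name TEXT);
    CREATE TABLE invoice (
        inv_number INTEGER PRIMARY KEY,
        cus_code   INTEGER,
        inv_total  REAL
    );
    INSERT INTO customer VALUES (10, 'Ramas'), (11, 'Dunne');
    INSERT INTO invoice VALUES (1001, 10, 25.0), (1002, 10, 40.0), (1003, 11, 12.5);
    """
)

# The engine matches rows wherever the cus_code values are equal.
rows = conn.execute(
    """
    SELECT c.cus_name, COUNT(*), SUM(i.inv_total)
    FROM customer AS c
    JOIN invoice AS i ON i.cus_code = c.cus_code
    GROUP BY c.cus_code
    ORDER BY c.cus_code
    """
).fetchall()
```

The Python script stands in for the end-user interface, the two tables are the stored data, and the query is executed entirely by the engine.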
The Object-Oriented (OO) Model
Data and relationships are contained in a single structure known
as an object
OODM (object-oriented data model) is the basis for the OODBMS
The OODM is said to be a semantic data model
An object:
Contains operations
Is self-contained: a basic building block for autonomous structures
Is an abstraction of a real-world entity
Attributes describe the properties of an object
Objects that share similar characteristics are grouped in classes
Classes are organized in a class hierarchy
Inheritance: an object inherits the methods and attributes of its parent classes
UML is based on OO concepts that describe diagrams and symbols
used to graphically model a system
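These ideas map closely onto classes in an object-oriented programming language. The sketch below is only an analogy in Python, not an OODBMS; the Person and Employee classes and their attributes are hypothetical.

```python
class Person:
    """A class groups objects that share the same attributes and operations."""

    def __init__(self, name, birth_year):
        self.name = name              # attributes describe the object
        self.birth_year = birth_year

    def age_in(self, year):
        # An operation (method) stored together with the data it uses.
        return year - self.birth_year


class Employee(Person):
    """A subclass in the class hierarchy: inherits from Person, adds its own parts."""

    def __init__(self, name, birth_year, salary):
        super().__init__(name, birth_year)
        self.salary = salary

    def give_raise(self, percent):
        self.salary = self.salary * (1 + percent / 100)


employee = Employee("Ada", 1990, 50000.0)
age = employee.age_in(2020)   # inherited from Person
employee.give_raise(10)       # defined on Employee
```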
Object/Relational and XML
Extended relational data model (ERDM)
Semantic data model developed in response to increasing
complexity of applications
Includes many of OO model’s best features
Often described as an object/relational database
management system (O/RDBMS)
Primarily geared to business applications
The Internet revolution created the potential to exchange
critical business information
In this environment, Extensible Markup Language (XML)
emerged as the de facto standard
Current databases support XML
XML: the standard protocol for data exchange among
systems and Internet services
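As a small sketch of XML as an exchange format, the example below serializes a record to XML text and parses it back, using Python's standard xml.etree.ElementTree module. The invoice structure and the element and attribute names are hypothetical.

```python
import xml.etree.ElementTree as ET


def invoice_to_xml(invoice):
    """Serialize a dictionary describing an invoice into an XML string."""
    root = ET.Element("invoice", number=str(invoice["number"]))
    ET.SubElement(root, "customer").text = invoice["customer"]
    for line in invoice["lines"]:
        ET.SubElement(
            root, "line", product=line["product"], units=str(line["units"])
        )
    return ET.tostring(root, encoding="unicode")


def xml_to_invoice(text):
    """Parse the XML string back into the same dictionary structure."""
    root = ET.fromstring(text)
    return {
        "number": int(root.get("number")),
        "customer": root.findtext("customer"),
        "lines": [
            {"product": line.get("product"), "units": int(line.get("units"))}
            for line in root.findall("line")
        ],
    }


original = {
    "number": 1001,
    "customer": "Ramas",
    "lines": [{"product": "P-100", "units": 2}, {"product": "P-200", "units": 1}],
}
message = invoice_to_xml(original)       # text that could be sent to another system
received = xml_to_invoice(message)       # what the receiving system reconstructs
```

Because the message is self-describing text, the sender and receiver need to agree only on the element names, not on each other's internal storage.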
Emerging Data Models: Big Data and NoSQL
Big Data
Refers to a movement to find new and better ways to manage large
amounts of Web-generated data and derive business insight from it,
while simultaneously providing high performance and scalability at
a reasonable cost
Relational approach does not always match the needs of
organizations with Big Data challenges
NoSQL databases
Not based on the relational model, hence the name NoSQL
Supports distributed database architectures
Provides high scalability, high availability, and fault tolerance
Supports very large amounts of sparse data
Geared toward performance rather than transaction
consistency
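To make "sparse data" concrete, the toy store below keeps only the attributes each row actually has, in contrast to a fixed-column table that reserves a cell for every attribute of every row. It is a single-process sketch with hypothetical names (SparseStore and its methods) and does not model distribution, availability, or fault tolerance.

```python
class SparseStore:
    """Toy key-value store: each row holds only the attributes it really has."""

    def __init__(self):
        self._rows = {}

    def put(self, key, **attributes):
        self._rows.setdefault(key, {}).update(attributes)

    def get(self, key, attribute, default=None):
        return self._rows.get(key, {}).get(attribute, default)

    def stored_cells(self):
        """Number of attribute values actually stored."""
        return sum(len(row) for row in self._rows.values())

    def distinct_attributes(self):
        return {name for row in self._rows.values() for name in row}


store = SparseStore()
store.put("user:1", name="Ada", email="ada@example.com")
store.put("user:2", name="Bob", phone="555-0100", city="Oslo")
store.put("user:3", last_login="2024-01-01")

# A fixed-column table would need one cell per row per attribute.
dense_cells = 3 * len(store.distinct_attributes())
sparse_cells = store.stored_cells()
```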
Degrees of Data Abstraction
Database designer starts with an abstracted view, then adds details
ANSI Standards Planning and Requirements Committee (SPARC)
Defined a framework for data modeling based on degrees of
data abstraction (1970s):
External
Conceptual
Internal
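The three levels can be loosely illustrated with Python's sqlite3 module: a table definition stands in for the conceptual level, a view for one external level, and an index for an internal-level detail. This mapping is a simplification, and the employee table, the staff_directory view, and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Conceptual level: the whole database as one logical structure.
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        dept     TEXT NOT NULL,
        salary   REAL NOT NULL
    );
    -- External level: one group of end users sees only what it needs.
    CREATE VIEW staff_directory AS
        SELECT emp_name, dept FROM employee;
    -- Internal level: an access structure that queries never mention.
    CREATE INDEX idx_employee_dept ON employee (dept);
    INSERT INTO employee VALUES (1, 'Ada', 'R&D', 90000.0);
    """
)

cursor = conn.execute("SELECT * FROM staff_directory")
directory_columns = [column[0] for column in cursor.description]
directory_rows = cursor.fetchall()
```

Users of staff_directory never see the salary column, and dropping or adding the index would change neither the view nor the table definition.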