LEC02 DataModels


DATA MODELS

Lecture 2

LEARNING OUTCOMES
In this chapter, you will learn:
 About data modeling and why data models are
important
 About the basic data-modeling building blocks
 What business rules are and how they influence
database design
 How the major data models evolved
 How data models can be classified by level of
abstraction

INTRODUCTION

 Different views of same data lead to designs
that do not reflect organization’s operation
 Data modeling reduces complexities of
database design
 Various degrees of data abstraction help
reconcile varying views of same data

DATA MODELING AND DATA
MODELS
 Data models
 Relatively simple representations of complex real-world
data structures, usually graphical
 Describe the most important “things” in a
business environment from a data-centric point of
view
 Model: an abstraction of a real-world object or
event that helps in understanding the complexities
of the real-world environment
 Data modeling is an iterative and progressive
process of creating a specific data model for a
determined problem domain

IMPORTANCE OF DATA
MODELING

 A communication tool
 Provides an overall view of the database
 Organises data for various users
 An abstraction that supports the creation of a good database

DATA MODEL BASIC BUILDING
BLOCKS
 Entity: anything about which data are to be
collected and stored
 Attribute: a characteristic of an entity
 Relationship: describes an association among
entities
 One-to-many (1:M) relationship
 Many-to-many (M:N or M:M) relationship
 One-to-one (1:1) relationship
 Constraint: a restriction placed on the data
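
The four building blocks above can be sketched in a few lines of Python. This is only an illustration under assumed names: the `Customer` and `Invoice` entities, their attributes, and the non-negative credit limit rule are hypothetical and do not come from the lecture.

```python
from dataclasses import dataclass


@dataclass
class Customer:                  # entity: something we store data about
    customer_id: int             # attribute
    name: str                    # attribute
    credit_limit: float = 0.0    # attribute

    def __post_init__(self):
        # Constraint: a restriction placed on the data (hypothetical rule).
        if self.credit_limit < 0:
            raise ValueError("credit_limit must not be negative")


@dataclass
class Invoice:                   # entity
    invoice_id: int
    customer_id: int             # links each invoice to exactly one customer
    total: float


def invoices_for(customer, invoices):
    """Follow the 1:M relationship: one customer, many invoices."""
    return [inv for inv in invoices if inv.customer_id == customer.customer_id]


alice = Customer(1, "Alice", credit_limit=500.0)
invoices = [Invoice(10, 1, 99.5), Invoice(11, 1, 20.0), Invoice(12, 2, 5.0)]
alice_invoices = invoices_for(alice, invoices)
```

Here `alice_invoices` holds the two invoices whose `customer_id` matches Alice, while creating a customer with a negative credit limit is rejected by the constraint.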

BUSINESS RULES

 Descriptions of policies, procedures, or principles within
a specific organization
 Description of operations to create/enforce actions
within an organization’s environment
 Must be in writing and kept up to date
 Must be easy to understand and widely disseminated
 Describe characteristics of data as viewed by the
company

DISCOVERING BUSINESS
RULES
 Sources of business rules:
 Company managers
 Policy makers
 Department managers
 Written documentation
 Procedures
 Standards
 Operations manuals
 Direct interviews with end users

DISCOVERING BUSINESS
RULES (CONT’D.)
 Why do we need business rules?
 Standardize company’s view of data
 Communications tool between users and
designers
 Allow designer to understand the nature, role,
and scope of data
 Allow designer to understand business
processes
 Allow designer to develop appropriate
relationship participation rules and constraints

PRINCIPLES OF TRANSLATING
BUSINESS RULES INTO DATA
MODEL COMPONENTS
 Generally, nouns translate into entities
 Verbs translate into relationships among
entities
 Relationships are bidirectional
 Two questions to identify the relationship type:
 How many instances of B are related to one
instance of A?
 How many instances of A are related to one
instance of B?
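
The two questions above can be turned into a small decision rule. The sketch below assumes each answer is reduced to "one" or "many" (a boolean); the function name and the example business rules are hypothetical.

```python
def relationship_type(many_b_per_a: bool, many_a_per_b: bool) -> str:
    """Classify a relationship from the answers to the two questions.

    many_b_per_a: can one instance of A be related to many instances of B?
    many_a_per_b: can one instance of B be related to many instances of A?
    """
    if many_b_per_a and many_a_per_b:
        return "M:N"
    if many_b_per_a or many_a_per_b:
        return "1:M"
    return "1:1"


# "A customer may generate many invoices; each invoice belongs to one customer."
customer_invoice = relationship_type(many_b_per_a=True, many_a_per_b=False)

# "A student may take many classes; a class may have many students."
student_class = relationship_type(many_b_per_a=True, many_a_per_b=True)
```

Because relationships are bidirectional, both questions must be answered before the type can be decided; answering only one cannot distinguish 1:M from M:N.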

NAMING CONVENTIONS

 Naming occurs during translation of business
rules to data model components
 Names should make the object unique and
distinguishable from other objects
 Names should also be descriptive of objects in
the environment and be familiar to users
 Proper naming:
 Facilitates communication between parties
 Promotes self-documentation

THE EVOLUTION OF DATA
MODELS

THE HIERARCHICAL MODEL

 The hierarchical model was developed in the 1960s to
manage large amounts of data for manufacturing projects
 Basic logical structure is represented by an upside-down
“tree”
 Hierarchical structure contains levels or segments
 Segment: equivalent of a file system’s record type
 Set of one-to-many relationships between segments

Source: http://creately.com/blog/diagrams/database-modeling-basics/
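
As a rough illustration of the upside-down tree, the sketch below models segments in Python, where each segment has exactly one parent and any number of children. The `Segment` class and the segment names are hypothetical; a real hierarchical DBMS stores and navigates segments very differently.

```python
class Segment:
    """One node of the hierarchy: a single parent, many children (1:M)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []

    def add_child(self, name):
        child = Segment(name, parent=self)
        self.children.append(child)
        return child

    def path(self):
        """Walk up to the root; every access starts from the root segment."""
        node, parts = self, []
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


root = Segment("Final assembly")
component = root.add_child("Component A")
part = component.add_child("Part A1")
part_path = part.path()
```

The single `parent` reference is what makes the structure a tree, and it is also the limitation that the network model later relaxes.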
THE HIERARCHICAL MODEL
Advantages
 Promotes data sharing
 Parent/child relationship promotes conceptual simplicity and data integrity
 Database security is provided and enforced by the DBMS
 Efficient with 1:M relationships
Disadvantages
 Requires knowledge of physical storage characteristics
 Changes in structure require changes in all application programs
 No data definition
 Lack of standards

THE NETWORK MODEL

 The network model was created to represent complex data
relationships more effectively than the hierarchical model
 Improves database performance
 Imposes a database standard
 Resembles the hierarchical model
 However, a record may have more than one parent
 Collection of records in 1:M relationships
 Set composed of two record types:
 Owner
 Equivalent to the hierarchical model’s parent
 Member
 Equivalent to the hierarchical model’s child
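
The difference from the hierarchical model can be sketched as follows: a member record may belong to sets owned by more than one owner. The `Record` class, the `connect` helper, and the record names are hypothetical and only illustrate the owner/member idea.

```python
class Record:
    def __init__(self, name):
        self.name = name
        self.owners = []     # unlike a tree node, a member may have several owners
        self.members = []    # records this record owns (1:M)


def connect(owner, member):
    """Place `member` in the set owned by `owner`."""
    owner.members.append(member)
    member.owners.append(owner)


customer = Record("CUSTOMER 1")
salesrep = Record("SALESREP 7")
invoice = Record("INVOICE 1001")

# The same invoice is a member of two sets, so it has two "parents".
connect(customer, invoice)
connect(salesrep, invoice)
owner_names = [owner.name for owner in invoice.owners]
```

Each individual set is still a 1:M relationship between one owner and its members.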

Source: http://whatisdbms.com/data-models-in-dbms-11-types-of-data-models-with-diagram/

THE NETWORK MODEL
Advantages
 Conceptual simplicity
 Data access is flexible
 Conformance to standards
 Handles more relationship types
 Data owner/member relationship promotes data integrity
Disadvantages
 System complexity limits efficiency
 Lack of ad hoc query capability placed burden on programmers to generate code for reports
 Structural changes require changes in all application programs
 Difficult for the database designer to visualize the logical database structure

THE RELATIONAL MODEL

 Developed by E.F. Codd (IBM) in 1970
 Table (relation)
 Matrix consisting of row/column intersections
 Each row in a relation is called a tuple
 Relational models were considered impractical in 1970
 Model was conceptually simple at expense of computer
overhead
 Relational database management system (RDBMS)
 Performs the same basic functions as hierarchical and network DBMSs
 Hides complexity from the user
 Relational diagram
 Representation of entities, attributes, and relationships
 Data is organized in tables
 Relational table stores a collection of related entities

THE RELATIONAL MODEL
Advantages
 Promotes structural independence through independent tables
 Tabular view improves conceptual simplicity
 Ad hoc query capability is based on SQL
 Isolates end user from physical level details
Disadvantages
 Requires substantial hardware and system software overhead
 May lead to isolated databases where information cannot be shared from one system to another
 Has limits on field lengths

RELATIONAL DIAGRAM

THE RELATIONAL MODEL

 SQL-based relational database application involves three
parts:
 User interface
 Allows end user to interact with the data
 Set of tables stored in the database
 Each table is independent from another
 Rows in different tables are related based on common values
in common attributes
 SQL “engine”
 Executes all queries
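
The three parts can be seen in miniature with Python's built-in `sqlite3` module: the Python script plays the role of the user interface, two independent tables hold the data, and SQLite acts as the SQL engine. The `AGENT` and `CUSTOMER` tables and their contents are assumptions made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
conn.executescript("""
CREATE TABLE AGENT (
    AGENT_CODE INTEGER PRIMARY KEY,
    AGENT_NAME TEXT
);
CREATE TABLE CUSTOMER (
    CUS_CODE   INTEGER PRIMARY KEY,
    CUS_NAME   TEXT,
    AGENT_CODE INTEGER              -- common attribute shared with AGENT
);
INSERT INTO AGENT VALUES (501, 'Alby'), (502, 'Hahn');
INSERT INTO CUSTOMER VALUES
    (10010, 'Ramas', 502), (10011, 'Dunne', 501), (10012, 'Smith', 502);
""")

# Rows in the two tables are related only through matching AGENT_CODE values.
rows = conn.execute("""
    SELECT CUS_NAME, AGENT_NAME
    FROM CUSTOMER
    JOIN AGENT ON CUSTOMER.AGENT_CODE = AGENT.AGENT_CODE
    ORDER BY CUS_CODE
""").fetchall()
```

No pointers or physical links connect the tables; the engine resolves the relationship at query time from the common values.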

THE ENTITY RELATIONSHIP
MODEL
 Widely accepted standard for data modeling
 Introduced by Chen in 1976
 Graphical representation of entities and their
relationships in a database structure
 Entity relationship diagram (ERD) is popular
because it:
 Uses graphic representations to model database
components
 Shows the major entities in the diagram
 Shows the interrelationship among entities

THE ENTITY RELATIONSHIP
MODEL

 Entity instance (or occurrence) is a row in a table
 Entity set is a collection of like entities
 Connectivity labels the types of relationships
 Relationships are expressed using Chen
notation
 Relationships are represented by a diamond
 Relationship name is written inside the diamond

THE ENTITY RELATIONSHIP MODEL
Advantages
 Visual modeling yields conceptual simplicity
 Visual representation makes it an effective communication tool
 Integrated with the dominant relational model
Disadvantages
 Limited constraint representation
 Limited relationship representation
 Loss of information content occurs when attributes are removed from entities to avoid crowded displays

THE OBJECT-ORIENTED (OO)
MODEL
 Data and relationships are contained in a
single structure known as an object
 OODM (object-oriented data model) is the basis
for OODBMS
 Semantic data model
 An object:
 Contains operations
 Is self-contained: a basic building block for
autonomous structures
 Is an abstraction of a real-world entity

THE OBJECT-ORIENTED (OO)
MODEL (CONT’D.)
 Attributes describe the properties of an object
 Objects that share similar characteristics are
grouped in classes
 Classes are organized in a class hierarchy
 Inheritance: an object inherits the methods and
attributes of its parent class
 UML is based on OO concepts that describe
diagrams and symbols
 Used to graphically model a system
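
A short Python sketch of these ideas: an object bundles attributes with operations, similar objects are grouped in a class, and a subclass inherits from its parent class. The `Person` and `Student` classes and the honours threshold are hypothetical.

```python
class Person:
    """Parent class: data (attributes) and operations live together."""

    def __init__(self, name):
        self.name = name                      # attribute

    def describe(self):                       # operation (method)
        return f"{self.name} ({type(self).__name__})"


class Student(Person):
    """Child class in the class hierarchy; inherits name and describe()."""

    def __init__(self, name, gpa):
        super().__init__(name)
        self.gpa = gpa                        # attribute added by the subclass

    def is_honours(self):
        return self.gpa >= 3.5                # assumed threshold, for illustration


dana = Student("Dana", 3.8)
description = dana.describe()                 # inherited from Person
```

`Student` never defines `describe`, yet the call works because the method is inherited from `Person`.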

THE OBJECT-ORIENTED (OO) MODEL
Advantages
 Semantic content is added
 Visual representation includes semantic content
 Inheritance promotes data integrity
Disadvantages
 Slow development of standards caused vendors to supply their own enhancements
 Learning curve is steep
 High system overhead slows transactions

THE BIG DATA

 Volume: huge amounts of data
 Velocity: flow of data is massive and continuous
 Variety: from various sources such as machines,
networks, and human interaction systems like social media
 Could be structured or unstructured


BIG DATA TECHNOLOGIES
 Hadoop
 Java-based, open-source software framework
 Used to develop big data processing applications that run in a distributed computing
environment
 Uses low-cost hardware to create clusters of thousands of computer nodes to store and process
data
 Commonly used together with MapReduce
 Includes the Hadoop Distributed File System (HDFS), which enables fast data transfer among nodes
 In 2009, sorted a petabyte of data in 17 hours. How huge is a petabyte?
 1 petabyte = 1000 terabytes
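
To get a feel for the scale, the arithmetic can be worked through in Python. The sketch assumes decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes) and uses the 17-hour figure quoted above.

```python
TERABYTE = 10**12            # bytes, decimal convention (assumption)
PETABYTE = 10**15            # bytes

terabytes_per_petabyte = PETABYTE // TERABYTE

# Average throughput needed to sort one petabyte in 17 hours.
seconds = 17 * 3600
rate_gb_per_s = PETABYTE / seconds / 10**9
```

Under these assumptions the cluster has to move roughly 16 GB of data every second for the whole 17 hours, which is why the work is spread over many nodes.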
HADOOP
RELATIONAL DBMS VS HADOOP
 Computing model
 RDBMS: notion of transactions; a transaction is the unit of work; concurrency control
 Hadoop: notion of jobs; a job is the unit of work; no concurrency control
 Data model
 RDBMS: structured data with known schema; read/write mode
 Hadoop: any data will fit in any format, (un)(semi)structured; read-only data
 Cost model
 RDBMS: expensive servers
 Hadoop: cheap commodity machines
 Fault tolerance
 RDBMS: failures are rare; recovery mechanism
 Hadoop: failures are common over thousands of machines; simple yet efficient

BIG DATA TECHNOLOGIES
 MapReduce
 Runs on a Hadoop cluster
 Programming model and an associated implementation for processing and
generating big data sets with a parallel, distributed algorithm on a cluster
 Map: filters and sorts the data, e.g. removes redundant attributes and outliers, or
sorts data according to name or year
 Reduce: summarizes the data, e.g. provides only the mean and variance of the data;
could be sales by year, number of students by year and so on
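
The map and reduce steps can be imitated on a single machine in plain Python. This is not the Hadoop API: the function names, the intermediate grouping step, and the sales records are assumptions chosen to mirror the "sales by year" example.

```python
from collections import defaultdict

# Hypothetical input records; one has a missing amount and must be filtered out.
records = [
    {"year": 2021, "amount": 120.0},
    {"year": 2021, "amount": 80.0},
    {"year": 2022, "amount": 200.0},
    {"year": 2022, "amount": None},
]


def map_phase(records):
    """Map: filter unusable records and emit (key, value) pairs."""
    for record in records:
        if record["amount"] is not None:
            yield record["year"], record["amount"]


def group_by_key(pairs):
    """Group values by key and sort the keys (done between map and reduce)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_phase(groups):
    """Reduce: summarize each group, here as total sales per year."""
    return {year: sum(amounts) for year, amounts in groups.items()}


sales_by_year = reduce_phase(group_by_key(map_phase(records)))
```

On a real cluster the map calls run in parallel on different nodes and the grouped pairs are shipped to reducers; the logic per record and per key stays the same.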
MAP REDUCE
BIG DATA TECHNOLOGY: NOSQL DATABASES

 NoSQL – Not only SQL
 Initial purpose: to process large-scale database clustering in
cloud and web applications
 Google and Amazon use NoSQL to focus on narrow operational
goals, e.g. capturing users' search habits on Google and
Amazon
 Not based on the relational model
 Supports distributed database architectures
 Supports large amounts of sparse data
 Provides high scalability, high availability and fault tolerance
 Geared toward performance rather than transaction
consistency

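
What "sparse data" and "not based on the relational model" mean in practice can be hinted at with a toy key-value store built on a Python dictionary. The `put` and `get` helpers and the user records are hypothetical; real NoSQL products add distribution, replication, and persistence on top of this idea.

```python
store = {}   # key -> document; no fixed schema shared by all records


def put(key, **attributes):
    """Store a document under a key; each document keeps only its own fields."""
    store[key] = dict(attributes)


def get(key, attribute, default=None):
    """Read one attribute; a missing field is simply absent, not a NULL column."""
    return store.get(key, {}).get(attribute, default)


put("user:1", name="Ana", last_search="laptops")
put("user:2", name="Ben", city="Leeds", wishlist=["kettle"])

ana_city = get("user:1", "city")
ben_city = get("user:2", "city")
```

The two documents have different attributes, so nothing is stored for the fields a user lacks; a relational table would need a column for every possible attribute.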
THE EVOLUTION OF DATA MODELS

DEGREES OF DATA
ABSTRACTION
 Database designer starts with abstracted view,
then adds details
 In other words, focus on showing important details
and hide away the implementation details.
 ANSI Standards Planning and Requirements
Committee (SPARC)
 Defined a framework for data modeling based on
degrees of data abstraction (1970s):
 External: end user’s view of the data environment
 Conceptual: global view of the entire database by the
entire organization. ER model is widely used.
 Internal: representation of the database as ‘seen’ by
the DBMS

SCHEMA AND SUB-SCHEMA

 Schema: conceptual organisation of the
database as viewed by the database
administrator
 Sub-schema: Portion of the database seen by
the application programs that produce the
desired information from the data within the
database

THE EXTERNAL MODEL
 End users’ view of the data environment
 External schema: specific representation of an
external view
 Entities
 Relationships
 Processes
 Constraints
 Each external model is then represented by its
own external schema.
 CREATE VIEW CLASS_VIEW AS
SELECT CLASS_ID, CLASS_NAME, PROF_NAME,
CLASS_TIME, CLASS.ROOM_ID
FROM CLASS, PROFESSOR, ROOM
WHERE CLASS.PROF_ID = PROFESSOR.PROF_ID AND
CLASS.ROOM_ID = ROOM.ROOM_ID;
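
To see an external schema in action, the view can be run against a small in-memory SQLite database from Python. The lecture does not give the table definitions, so the columns of `CLASS`, `PROFESSOR`, and `ROOM` below (including `PROF_SALARY` and `ROOM_CAPACITY`) and the sample row are assumptions. The select list is written without parentheses and with `ROOM_ID` qualified, since that column exists in both `CLASS` and `ROOM`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical base tables (conceptual level)
CREATE TABLE PROFESSOR (PROF_ID INTEGER PRIMARY KEY, PROF_NAME TEXT, PROF_SALARY REAL);
CREATE TABLE ROOM (ROOM_ID TEXT PRIMARY KEY, ROOM_CAPACITY INTEGER);
CREATE TABLE CLASS (
    CLASS_ID INTEGER PRIMARY KEY, CLASS_NAME TEXT, CLASS_TIME TEXT,
    PROF_ID INTEGER, ROOM_ID TEXT
);
INSERT INTO PROFESSOR VALUES (7, 'Dr. Lee', 90000.0);
INSERT INTO ROOM VALUES ('B101', 40);
INSERT INTO CLASS VALUES (1, 'Database Design', 'MWF 09:00', 7, 'B101');

-- External view: only the columns this group of end users needs
CREATE VIEW CLASS_VIEW AS
SELECT CLASS_ID, CLASS_NAME, PROF_NAME, CLASS_TIME, CLASS.ROOM_ID AS ROOM_ID
FROM CLASS, PROFESSOR, ROOM
WHERE CLASS.PROF_ID = PROFESSOR.PROF_ID AND
      CLASS.ROOM_ID = ROOM.ROOM_ID;
""")

cursor = conn.execute("SELECT * FROM CLASS_VIEW")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
```

Users of `CLASS_VIEW` see one joined row per class and have no access to `PROF_SALARY`, which is one way an external schema supports security constraints.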

THE EXTERNAL MODEL

 Easy to identify specific data required to
support each business unit’s operations
 Facilitates designer’s job by providing
feedback about the model’s adequacy
 Ensures security constraints in database
design
 Simplifies application program development

THE CONCEPTUAL MODEL

 Represents global view of the entire database
by the entire organization
 All external views integrated into single global
view: conceptual schema
 Entity-Relationship (ER) model most widely
used
 ER Diagram graphically represents the
conceptual schema

TINY COLLEGE ENTITIES (AN
EXAMPLE)

A CONCEPTUAL SCHEMA FOR
TINY COLLEGE

THE CONCEPTUAL MODEL

 Provides a relatively easily understood macro
level view of data environment
 Independent of both software and hardware
 Does not depend on the DBMS software used to
implement the model
 Does not depend on the hardware used in the
implementation of the model
 Changes in hardware or software do not affect
database design at the conceptual level

THE INTERNAL MODEL

 Representation of the database as “seen” by
the DBMS
 Maps the conceptual model to the DBMS
 Internal schema depicts a specific
representation of an internal model
 Depends on specific database software
 Change in DBMS software requires internal
model be changed
 Logical independence: change internal model
without affecting conceptual model

THE PHYSICAL MODEL

 Operates at lowest level of abstraction
 Describes the way data are saved on storage
media such as disks or tapes
 Requires the definition of physical storage and
data access methods
 Relational model aimed at logical level
 Does not require physical-level details
 Physical independence: changes in physical
model do not affect internal model
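
One way to observe this separation is to change a physical-level detail and check that a logical-level query is unaffected. The sketch below uses SQLite from Python and treats adding an index as the physical change; the `STUDENT` table and its rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT (STU_ID INTEGER PRIMARY KEY, STU_NAME TEXT, STU_YEAR INTEGER)"
)
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?)",
    [(1, "Ana", 2), (2, "Ben", 1), (3, "Cy", 2)],
)

query = "SELECT STU_NAME FROM STUDENT WHERE STU_YEAR = 2 ORDER BY STU_ID"
before = conn.execute(query).fetchall()

# Physical-level change: a new access method for the STU_YEAR column.
conn.execute("CREATE INDEX IDX_STUDENT_YEAR ON STUDENT (STU_YEAR)")
after = conn.execute(query).fetchall()
```

The query text and its result stay the same before and after the index is created; only how the DBMS locates the rows may change.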
