LEC02 DataModels
Lecture 2
LEARNING OUTCOMES
In this chapter, you will learn:
About data modeling and why data models are important
About the basic data-modeling building blocks
What business rules are and how they influence database design
How the major data models evolved
How data models can be classified by level of abstraction
INTRODUCTION
DATA MODELING AND DATA MODELS
Data models
  Relatively simple representations, usually graphical, of complex real-world data structures
  Describe the most important “things” in a business environment from a data-centric point of view
Model: an abstraction of a real-world object or event, used to understand the complexities of the real-world environment
Data modeling is an iterative and progressive process of creating a specific data model for a determined problem domain
IMPORTANCE OF DATA MODELING
A communication tool
Provide an overall view of the database
Organise data for various users
Provide an abstraction for the creation of a good database
DATA MODEL BASIC BUILDING BLOCKS
Entity: anything about which data are to be collected and stored
Attribute: a characteristic of an entity
Relationship: describes an association among entities
  One-to-many (1:M) relationship
  Many-to-many (M:N or M:M) relationship
  One-to-one (1:1) relationship
Constraint: a restriction placed on the data
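The four building blocks can be made concrete with a small sketch. It uses Python's built-in sqlite3 module; the CUSTOMER and INVOICE tables and all column names are hypothetical and chosen only for illustration. Each table stands for an entity, each column for an attribute, the foreign key for a 1:M relationship, and NOT NULL / CHECK for constraints.

```python
import sqlite3

# Hypothetical schema: CUSTOMER and INVOICE are illustrative names only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
CREATE TABLE CUSTOMER (                 -- entity
    CUS_ID   INTEGER PRIMARY KEY,       -- attribute
    CUS_NAME TEXT NOT NULL              -- attribute with a constraint
);
CREATE TABLE INVOICE (                  -- entity
    INV_ID    INTEGER PRIMARY KEY,
    INV_TOTAL REAL CHECK (INV_TOTAL >= 0),                  -- constraint
    CUS_ID    INTEGER NOT NULL REFERENCES CUSTOMER(CUS_ID)  -- 1:M relationship
);
""")
conn.execute("INSERT INTO CUSTOMER VALUES (1, 'Alice')")
conn.execute("INSERT INTO INVOICE VALUES (10, 99.5, 1)")
conn.execute("INSERT INTO INVOICE VALUES (11, 20.0, 1)")


def try_insert(sql):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A negative total violates the CHECK constraint.
negative_total_ok = try_insert("INSERT INTO INVOICE VALUES (12, -5.0, 1)")
# An invoice for a customer that does not exist violates the relationship.
unknown_customer_ok = try_insert("INSERT INTO INVOICE VALUES (13, 5.0, 999)")
# One customer is related to many invoices (the "M" side of 1:M).
invoices_for_alice = conn.execute(
    "SELECT COUNT(*) FROM INVOICE WHERE CUS_ID = 1").fetchone()[0]
print(negative_total_ok, unknown_customer_ok, invoices_for_alice)
```

Both rejected inserts show constraints at work: the database, not the application, refuses data that breaks the rules of the model.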
BUSINESS RULES
DISCOVERING BUSINESS RULES
Sources of business rules:
  Company managers
  Policy makers
  Department managers
  Written documentation
    Procedures
    Standards
    Operations manuals
  Direct interviews with end users
DISCOVERING BUSINESS RULES (CONT’D.)
Why do we need business rules?
  Standardize the company’s view of data
  Serve as a communications tool between users and designers
  Allow the designer to understand the nature, role, and scope of data
  Allow the designer to understand business processes
  Allow the designer to develop appropriate relationship participation rules and constraints
PRINCIPLES OF TRANSLATING BUSINESS RULES INTO DATA MODEL COMPONENTS
Generally, nouns translate into entities
Verbs translate into relationships among entities
Relationships are bidirectional
Two questions to identify the relationship type:
  How many instances of B are related to one instance of A?
  How many instances of A are related to one instance of B?
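The two questions can be turned into a small decision procedure. The sketch below is only an illustration; the function name relationship_type and the "one" / "many" answer strings are assumptions made for this example, not notation from the lecture.

```python
def relationship_type(b_per_a: str, a_per_b: str) -> str:
    """Classify a relationship from the answers to the two questions.

    b_per_a: how many instances of B relate to one instance of A
    a_per_b: how many instances of A relate to one instance of B
    Each answer is the string "one" or "many".
    """
    answers = (b_per_a, a_per_b)
    if any(answer not in ("one", "many") for answer in answers):
        raise ValueError("answers must be 'one' or 'many'")
    if answers == ("one", "one"):
        return "1:1"
    if answers == ("many", "many"):
        return "M:N"
    return "1:M"  # exactly one side answered "many"


# Business rule: "A customer may generate many invoices;
# each invoice is generated by one customer."
print(relationship_type("many", "one"))   # CUSTOMER (A) to INVOICE (B)
```

Because relationships are bidirectional, both questions must be answered before the type is known; answering only one cannot distinguish 1:M from M:N.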
NAMING CONVENTIONS
THE EVOLUTION OF DATA MODELS
THE HIERARCHICAL MODEL
Source: http://creately.com/blog/diagrams/database-modeling-basics/
THE HIERARCHICAL MODEL
Advantages Disadvantages
THE NETWORK MODEL
Source: http://whatisdbms.com/data-models-in-dbms-11-types-of-data-models-with-diagram/
THE NETWORK MODEL
Advantages Disadvantages
Data owner/member relationship promotes data integrity
THE RELATIONAL MODEL
THE RELATIONAL MODEL
Advantages Disadvantages
RELATIONAL DIAGRAM
THE RELATIONAL MODEL
THE ENTITY RELATIONSHIP MODEL
Widely accepted standard for data modeling
Introduced by Chen in 1976
Graphical representation of entities and their relationships in a database structure
The entity relationship diagram (ERD) is popular because it:
  Uses graphic representations to model database components
  Shows the major entities in the diagram
  Shows the interrelationships among entities
THE ENTITY RELATIONSHIP MODEL
THE ENTITY RELATIONSHIP MODEL
Advantages Disadvantages
THE OBJECT-ORIENTED (OO) MODEL (CONT’D.)
Attributes describe the properties of an object
Objects that share similar characteristics are grouped in classes
Classes are organized in a class hierarchy
Inheritance: an object inherits the methods and attributes of its parent class
UML (Unified Modeling Language) is based on OO concepts that describe diagrams and symbols
  Used to graphically model a system
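The OO terms above map directly onto class definitions in an object-oriented language. The following is a minimal sketch; the Person and Student classes and their attributes are hypothetical examples, not part of the lecture material.

```python
class Person:
    """Parent class: groups objects that share a name attribute."""

    def __init__(self, name):
        self.name = name                     # attribute

    def describe(self):                      # method
        return f"{self.name} ({type(self).__name__})"


class Student(Person):
    """Child class in the hierarchy: inherits attributes and methods."""

    def __init__(self, name, gpa):
        super().__init__(name)               # reuse the parent's attribute setup
        self.gpa = gpa                       # attribute specific to Student


s = Student("Ana", 3.7)
print(s.describe())   # describe() is inherited from Person
```

Student defines no describe() method of its own, yet the call works: the object inherits it from the parent class, which is the behaviour the Inheritance bullet describes.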
THE OBJECT-ORIENTED (OO) MODEL
Advantages Disadvantages
Volume: data is huge in volume
Velocity: the flow of data is massive and continuous
Variety: data comes from various sources such as machines, networks, and human-interaction systems like social media
1 PETABYTE = 1000 TERABYTES
HADOOP
RELATIONAL DBMS VS HADOOP
                 RDBMS                                 Hadoop
Computing Model  • Notion of transactions              • Notion of jobs
                 • Transaction is the unit of work     • Job is the unit of work
                 • Concurrency control                 • No concurrency control
Data Model       • Structured data with known schema   • Any data will fit in any format
                 • Read/Write mode                     • Unstructured or semi-structured
                                                       • Read only data
Cost Model       • Expensive servers                   • Cheap commodity machines
Fault Tolerance  • Failures are rare                   • Failures are common over thousands of machines
                 • Recovery mechanism                  • Simple yet efficient
BIG DATA TECHNOLOGIES
MapReduce
Runs on a Hadoop cluster
A programming model and an associated implementation for processing and generating big data sets with a parallel, distributed algorithm on a cluster
Map: filters and sorts the data, e.g. removes redundant attributes and outliers, or sorts the data by name or year
Reduce: summarises the data, e.g. reports only the mean and variance of the data, sales by year, number of students by year, and so on
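The Map and Reduce roles can be illustrated with a single-process sketch in plain Python. This is not the Hadoop API and runs on one machine only; the sales records, the "clerk" attribute, and the rule that negative amounts count as outliers are all assumptions made for the example. On a real cluster the framework would run many map and reduce tasks in parallel and perform the grouping step between them.

```python
from collections import defaultdict

# Hypothetical sales records; "clerk" stands in for a redundant attribute.
records = [
    {"year": 2021, "amount": 100.0, "clerk": "a"},
    {"year": 2022, "amount": 70.0, "clerk": "b"},
    {"year": 2021, "amount": 50.0, "clerk": "c"},
    {"year": 2022, "amount": -5.0, "clerk": "d"},  # treated as an outlier
]


def map_phase(record):
    """Filter out outliers, drop unused attributes, emit (key, value) pairs."""
    if record["amount"] < 0:
        return []
    return [(record["year"], record["amount"])]


def shuffle(pairs):
    """Group values by key and sort by key, as happens between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(key, values):
    """Summarise all values for one key (here: total sales for one year)."""
    return key, sum(values)


mapped = [pair for record in records for pair in map_phase(record)]
sales_by_year = dict(reduce_phase(k, v) for k, v in shuffle(mapped))
print(sales_by_year)
```

Each call to map_phase and reduce_phase depends only on its own input, which is what lets a cluster spread the work over many machines.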
MAP REDUCE
BIG DATA TECHNOLOGY: NOSQL DATABASES
SCHEMA AND SUB-SCHEMA
THE EXTERNAL MODEL
End users’ view of the data environment
External schema: specific representation of an external view
  Entities
  Relationships
  Processes
  Constraints
Each external model is then represented by its own external schema.
CREATE VIEW CLASS_VIEW AS
-- ROOM_ID is qualified because it exists in both CLASS and ROOM
SELECT CLASS_ID, CLASS_NAME, PROF_NAME,
       CLASS_TIME, CLASS.ROOM_ID
FROM CLASS, PROFESSOR, ROOM
WHERE CLASS.PROF_ID = PROFESSOR.PROF_ID AND
      CLASS.ROOM_ID = ROOM.ROOM_ID;
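To see the external view behave like a table for the end user, the statement can be tried against a throwaway SQLite database. In this sketch the table definitions, column types, and sample rows are assumptions (the slides only name the tables and columns used by the view), and the column list is written without parentheses with ROOM_ID qualified, which is the form SQLite accepts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed base tables; only the names used by the view come from the slide.
CREATE TABLE PROFESSOR (PROF_ID INTEGER PRIMARY KEY, PROF_NAME TEXT);
CREATE TABLE ROOM      (ROOM_ID INTEGER PRIMARY KEY);
CREATE TABLE CLASS (
    CLASS_ID   INTEGER PRIMARY KEY,
    CLASS_NAME TEXT,
    CLASS_TIME TEXT,
    PROF_ID    INTEGER REFERENCES PROFESSOR(PROF_ID),
    ROOM_ID    INTEGER REFERENCES ROOM(ROOM_ID)
);
INSERT INTO PROFESSOR VALUES (1, 'Dr. Lee');
INSERT INTO ROOM VALUES (101);
INSERT INTO CLASS VALUES (7, 'Databases', 'Mon 09:00', 1, 101);

-- The external view: end users see one joined "table".
CREATE VIEW CLASS_VIEW AS
SELECT CLASS_ID, CLASS_NAME, PROF_NAME, CLASS_TIME, CLASS.ROOM_ID AS ROOM_ID
FROM CLASS, PROFESSOR, ROOM
WHERE CLASS.PROF_ID = PROFESSOR.PROF_ID AND
      CLASS.ROOM_ID = ROOM.ROOM_ID;
""")

rows = conn.execute("SELECT * FROM CLASS_VIEW").fetchall()
print(rows)
```

The user of CLASS_VIEW never sees PROF_ID or the join logic, which is the point of an external schema: it exposes only the part of the data environment relevant to that user.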
THE EXTERNAL MODEL
THE CONCEPTUAL MODEL
TINY COLLEGE ENTITIES (AN EXAMPLE)
A CONCEPTUAL SCHEMA FOR TINY COLLEGE
THE CONCEPTUAL MODEL
THE INTERNAL MODEL