Database Concepts EGS 2207
Database design
Lecture outline
1. Requirements analysis
2. Conceptual database (db) design
3. Choice of the DBMS
4. Data model mapping (logical design)
5. Physical design
6. Implementation
Feedback loops exist, i.e. earlier stages may need to be
revisited during a later stage.
Conceptual design is DBMS-independent, and logical design
starts from a DBMS-independent mapping before being tailored
to the chosen DBMS, while physical design is DBMS-specific.
Requirements collection and analysis
Document the data requirements of the users
Functional requirements are the operations that can be
applied to the database including queries and updates.
Specification used as the basis of the design.
Typical activities:
Identification of application areas and user groups
Analysis of existing documentation of application areas,
e.g. policy documents, forms, reports, organisational
charts
Analysis of current operating environments and the
planned use of the information, e.g. information flow,
types of transactions
Responses to user questionnaires are analysed.
Cont
Start from a description of the requirements which is:
poorly structured
heterogeneous
informal
And use a technique to transform that into a specification
of the database requirements which is:
formal
homogeneous
consistent
complete
Conceptual design
To produce a conceptual schema of the db
Two sub tasks:
Schema design
Transaction design
Schema design:
Expressed using concepts of the high level data model
No implementation details (should be non-technical so
that users can understand it)
But should be detailed in terms of the objects of the
domain the db will represent.
Independent of the DBMS
cont
Transaction design:
examines the database applications whose
requirements were analysed in phase 1 and produces
high level specifications for these transactions
The goal is to achieve understanding of database
structure, semantics, interrelationships and constraints.
cont
The design needs to be expressed in a “language” which offers:
expressiveness: able to distinguish between different
types of data, relationships and constraints
simplicity: easy for non-specialist users to understand
and use concepts
minimality: small number of basic concepts that are
distinct and do not overlap
diagrammatic representation: for ease of presentation;
it should therefore be easy to interpret
formality: must represent a formal, unambiguous
specification of the data
Some of these requirements sometimes conflict
Most popular data model used is the Entity-Relationship
(ER) model
Conceptual schema design
Purpose: to produce a conceptual schema of the database
Expressed using concepts of the high level data model
Not including implementation details (has to be
understood by non-technical users)
But detailed in terms of the ‘objects’ of the domain the
database will represent.
Independent of the DBMS to be used
Cannot be used directly to implement the database
Design is made in terms of a semantic or conceptual
model
Transaction Design
Purpose: to produce a design of the transactions that will
run on the database. Types of transaction:
1. retrieval: retrieve data for display or as part of a report
2. update: enter new data or amend existing data
3. mixed: more complex applications may do both retrieval
and update
Why?
need to be sure to include in the conceptual schema all
information required by transactions
relative importance and frequency of use of
transactions will influence physical database design
Choosing a DBMS
Purpose: establish which is the best framework for
implementing the produced schema.
Choice made on the basis of technical factors:
the DBMS has to support the required tasks
Of economic factors:
software acquisition/maintenance, hardware
acquisition, database creation/conversion, training of staff
And of organisational factors:
platforms supported, availability of vendor services
Logical design
To transform the generic, DBMS-independent conceptual
schema into the data model of the chosen DBMS
Two stages:
1. System-independent mapping: no consideration of any
specific characteristics that may apply to the specific
DBMS package
2. Tailoring to DBMS: different DBMSs may implement the
same data model in slightly different ways
Result is a set of DDL statements in the language of the
chosen DBMS.
Some CASE tools generate DDL statements from a
conceptual design
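As a rough illustration of how a tool might turn a conceptual description into DDL, here is a minimal Python sketch. The dictionary layout, the function name to_ddl, and the choice of SQL types are assumptions made for this example only; real CASE tools are far more elaborate.

```python
# Hypothetical conceptual description of one entity (attribute name -> SQL type).
# The layout of this dictionary and the type choices are illustrative assumptions.
student = {
    "name": "STUDENT",
    "attributes": {"Student_ID": "INTEGER", "Student_Name": "TEXT"},
    "primary_key": ["Student_ID"],
}

def to_ddl(entity):
    """Map an entity description to a generic CREATE TABLE statement."""
    columns = [f"{attr} {sql_type}" for attr, sql_type in entity["attributes"].items()]
    columns.append("PRIMARY KEY (" + ", ".join(entity["primary_key"]) + ")")
    return f"CREATE TABLE {entity['name']} (" + ", ".join(columns) + ")"

ddl = to_ddl(student)
print(ddl)
```

The generated statement corresponds to the system-independent mapping; tailoring it to a particular DBMS (types, storage options) would be a further step.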
Physical design
To choose the specific storage structures and access
paths for the db files.
Attention is placed on performance:
Response time: minimise data access time for data
items referenced by frequently used transactions.
Space utilisation: less frequently used data and queries
may be archived.
Transaction throughput: average number of
transactions that can be processed per minute.
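One common access-path decision is whether to index a column that frequent transactions search on. The sketch below uses Python's built-in sqlite3 module purely as a convenient stand-in DBMS; the table contents and the index name idx_employee_name are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Employee_ID INTEGER PRIMARY KEY, Employee_Name TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE (Employee_ID, Employee_Name) VALUES (?, ?)",
    [(i, f"name{i}") for i in range(1000)],  # made-up sample rows
)

# Access path: an index on the column used by a frequently run lookup.
conn.execute("CREATE INDEX idx_employee_name ON EMPLOYEE (Employee_Name)")

# Ask SQLite how it would run the lookup; the last column of each row is a
# human-readable description of the chosen access path.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT Employee_ID FROM EMPLOYEE WHERE Employee_Name = ?",
    ("name42",),
).fetchall()
plan_text = " ".join(row[3] for row in plan)
print(plan_text)
```

The printed plan should mention the index rather than a full table scan, which is the kind of response-time improvement physical design aims for. The exact wording of the plan varies between SQLite versions.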
Implementation
To create the db
Compile and execute DDL statements
Populate the db
Manually or automatically (may need to convert data
from a previous format)
Implement application programs (transactions).
Programs are written with embedded DML statements.
Operational phase may begin
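The implementation steps above can be sketched end to end with Python's built-in sqlite3 module. The COURSE table, its sample rows, and the function rename_course are assumptions invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# 1. Create the db: execute the DDL statements.
conn.execute("CREATE TABLE COURSE (Course_ID INTEGER PRIMARY KEY, Title TEXT NOT NULL)")

# 2. Populate the db (here from a small list standing in for converted legacy data).
legacy_rows = [(1, "Computing"), (2, "Engineering")]
conn.executemany("INSERT INTO COURSE (Course_ID, Title) VALUES (?, ?)", legacy_rows)
conn.commit()

# 3. Application program (transaction) with embedded DML statements.
def rename_course(connection, course_id, new_title):
    """Update transaction: amend the title of an existing course."""
    connection.execute(
        "UPDATE COURSE SET Title = ? WHERE Course_ID = ?", (new_title, course_id)
    )
    connection.commit()

rename_course(conn, 2, "Civil Engineering")

# Retrieval transaction: list all course titles.
titles = [row[0] for row in conn.execute("SELECT Title FROM COURSE ORDER BY Course_ID")]
print(titles)  # ['Computing', 'Civil Engineering']
```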
PART II
ENTITY RELATIONSHIP MODEL
• It's a tool commonly used to:
– Translate different views of data among end users
and fit them into a common framework
– Define data processing and constraint
requirements to help meet different views
– Help implement the database
Database model using ER modelling
• System requirements analysis
– The first step in developing a database model using ER
modelling is to consider the requirements of the system.
– The requirements are typically gathered from a scope
document
ERD Development Process
Identify the entities
Determine the attributes for each entity
Select the primary key for each entity
Establish the relationships between the entities
Draw an entity model
Test the relationships and the keys
A Simple Example
STUDENTs attend COURSEs that consist of many
SUBJECTs.
A single SUBJECT (e.g. English) can be studied in many
different COURSEs.
Each STUDENT may only attend one COURSE.
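One possible relational reading of this example is sketched below with Python's sqlite3 module. The column names and the linking table COURSE_SUBJECT are assumptions, not part of the original example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE COURSE  (Course_ID  INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE SUBJECT (Subject_ID INTEGER PRIMARY KEY, Title TEXT);
-- Each STUDENT attends only one COURSE: a single foreign key column.
CREATE TABLE STUDENT (
    Student_ID INTEGER PRIMARY KEY,
    Name       TEXT,
    Course_ID  INTEGER REFERENCES COURSE (Course_ID)
);
-- A COURSE has many SUBJECTs and a SUBJECT appears in many COURSEs:
-- a linking table holds one row per (course, subject) pair.
CREATE TABLE COURSE_SUBJECT (
    Course_ID  INTEGER REFERENCES COURSE (Course_ID),
    Subject_ID INTEGER REFERENCES SUBJECT (Subject_ID),
    PRIMARY KEY (Course_ID, Subject_ID)
);
""")
conn.executemany("INSERT INTO COURSE VALUES (?, ?)", [(1, "Arts"), (2, "Business")])
conn.execute("INSERT INTO SUBJECT VALUES (1, 'English')")
conn.executemany("INSERT INTO COURSE_SUBJECT VALUES (?, ?)", [(1, 1), (2, 1)])
conn.execute("INSERT INTO STUDENT VALUES (1, 'T. Ncube', 1)")

# English is studied in both courses.
english_courses = [row[0] for row in conn.execute(
    "SELECT c.Title FROM COURSE c "
    "JOIN COURSE_SUBJECT cs ON cs.Course_ID = c.Course_ID "
    "WHERE cs.Subject_ID = 1 ORDER BY c.Course_ID")]
print(english_courses)  # ['Arts', 'Business']
```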
Cont....
• Identifying entities in ER modeling
– Entities are objects or things that can be described by
their characteristics.
– An entity represents a real-world object or concept, e.g.
student, course
– It's a set and not a single occurrence
• Entity types include:
– Weak
– Recursive
– Strong
WEAK ENTITY
• It's existence-dependent
– Can't exist without the entity to which it is related
• Has a primary key that is partially or totally derived
from the parent entity in the relationship.
• Represented by a double rectangle
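A weak entity can be sketched in relational terms as follows, using Python's sqlite3 module. The DEPENDENT entity and all column names are hypothetical; the point is only that its primary key borrows the parent's key and that it cannot exist without the parent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE EMPLOYEE (Employee_ID INTEGER PRIMARY KEY, Employee_Name TEXT)")
conn.execute("""
CREATE TABLE DEPENDENT (
    -- Part of the primary key is derived from the parent entity.
    Employee_ID    INTEGER NOT NULL
                   REFERENCES EMPLOYEE (Employee_ID) ON DELETE CASCADE,
    Dependent_Name TEXT NOT NULL,
    PRIMARY KEY (Employee_ID, Dependent_Name)
)""")
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'A. Moyo')")
conn.execute("INSERT INTO DEPENDENT VALUES (1, 'Tari')")

# A dependent with no parent employee is rejected.
try:
    conn.execute("INSERT INTO DEPENDENT VALUES (99, 'Orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Removing the parent removes its dependents (existence dependence).
conn.execute("DELETE FROM EMPLOYEE WHERE Employee_ID = 1")
remaining = conn.execute("SELECT COUNT(*) FROM DEPENDENT").fetchone()[0]
print(orphan_rejected, remaining)  # True 0
```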
RECURSIVE ENTITY
• It's where a relationship exists between
occurrences of the same entity type.
• The condition is found in a unary relationship
E.g. a course is a prerequisite of many courses
An employee manages many employees
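The "employee manages many employees" case can be sketched as a table that references itself. This is a minimal illustration using Python's sqlite3 module; the column name Manager_ID and the sample names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
CREATE TABLE EMPLOYEE (
    Employee_ID   INTEGER PRIMARY KEY,
    Employee_Name TEXT,
    -- Unary relationship: the foreign key points back at the same table.
    Manager_ID    INTEGER REFERENCES EMPLOYEE (Employee_ID)
)""")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [(1, "Ann", None), (2, "Ben", 1), (3, "Cy", 1)],
)

# Join the table to itself to list everyone Ann manages.
reports = [row[0] for row in conn.execute(
    "SELECT e.Employee_Name FROM EMPLOYEE e "
    "JOIN EMPLOYEE m ON e.Manager_ID = m.Employee_ID "
    "WHERE m.Employee_Name = 'Ann' ORDER BY e.Employee_ID")]
print(reports)  # ['Ben', 'Cy']
```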
STRONG ENTITY
• Not existence dependent
• As we identify entities, we list the attributes that describe
the entity
ATTRIBUTES
• Identifies or describes an entity
• Has a domain that is the set of possible values
• May share a domain
• Types of attributes include:
– Key attributes
– Simple attributes
– Composite attributes
– Single valued attributes
– Multi valued attributes
– Derived attributes
Attributes
An attribute is a property or characteristic of an entity type, for
example the entity EMPLOYEE may have attributes
Employee_Name and Employee_Address.
In ER diagrams, place each attribute's name in an ellipse with a line
connecting it to its associated entity
Attributes may also be associated with relationships
An attribute is associated with exactly one entity or
relationship
Simple versus composite attributes
Some attributes can be broken down into meaningful
component parts, such as Address, which can be broken down
into Street_Address, City, etc.
The component attributes may appear above or below the
composite attribute on an ER diagram
Composite attributes provide flexibility to users, who can refer to
the attribute as a single unit or to its individual components
A simple (atomic) attribute is one that cannot be broken down
into smaller components
A composite attribute
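The idea of referring to a composite attribute either as a whole or by its parts can be sketched with Python dataclasses. The class names, the method as_single_unit, and the sample values are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Address:
    """Composite attribute made of simple (atomic) components."""
    street_address: str
    city: str

    def as_single_unit(self) -> str:
        # Refer to the composite attribute as one value.
        return f"{self.street_address}, {self.city}"

@dataclass
class Employee:
    employee_name: str          # simple attribute
    employee_address: Address   # composite attribute

emp = Employee("A. Moyo", Address("12 Main Street", "Harare"))
print(emp.employee_address.city)              # Harare
print(emp.employee_address.as_single_unit())  # 12 Main Street, Harare
```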
Single-Valued versus Multivalued Attributes
It frequently happens that there is an attribute that may have
more than one value for a given instance, e.g. EMPLOYEE
may have more than one Skill.
A multivalued attribute is one that may take on more than one
value – it is represented by an ellipse with double lines
Entity with a multivalued attribute (Skill)
and derived attribute (Years_Employed)
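One common way to hold a multivalued attribute such as Skill in a relational DBMS is a separate table with one row per value. The sketch below uses Python's sqlite3 module; the table name EMPLOYEE_SKILL and the sample data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE EMPLOYEE (Employee_ID INTEGER PRIMARY KEY, Employee_Name TEXT)")
conn.execute("""
CREATE TABLE EMPLOYEE_SKILL (
    Employee_ID INTEGER NOT NULL REFERENCES EMPLOYEE (Employee_ID),
    Skill       TEXT NOT NULL,
    -- One row per (employee, skill) pair, so an employee can have many skills.
    PRIMARY KEY (Employee_ID, Skill)
)""")
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'A. Moyo')")
conn.executemany(
    "INSERT INTO EMPLOYEE_SKILL VALUES (?, ?)", [(1, "SQL"), (1, "Welding")]
)

skills = [row[0] for row in conn.execute(
    "SELECT Skill FROM EMPLOYEE_SKILL WHERE Employee_ID = 1 ORDER BY Skill")]
print(skills)  # ['SQL', 'Welding']
```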
Stored versus Derived Attributes
Some attribute values can be calculated or derived from
others
e.g., if Years_Employed needs to be calculated for
EMPLOYEE, it can be calculated using Date_Employed and
Today's_Date
A derived attribute is one whose value can be calculated from
related attribute values (plus possibly other data not in the
database)
A derived attribute is signified by an ellipse with a dashed line
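The Years_Employed example can be sketched in Python as a value computed on demand rather than stored. The class layout and the sample dates are assumptions; only Date_Employed is kept as stored data.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Employee:
    employee_name: str
    date_employed: date  # stored attribute

    def years_employed(self, today: date) -> int:
        """Derived attribute: whole years between date_employed and today."""
        years = today.year - self.date_employed.year
        # Subtract one if this year's anniversary has not been reached yet.
        if (today.month, today.day) < (self.date_employed.month, self.date_employed.day):
            years -= 1
        return years

emp = Employee("A. Moyo", date(2015, 9, 1))
print(emp.years_employed(date(2024, 3, 15)))  # 8
```

Passing today's date in as a parameter mirrors the point that a derived value may depend on data that is not in the database.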
Key Attributes
Certain attributes identify particular facts within an entity;
these are known as KEY attributes.
The different types of KEY attribute are:
Primary Key
Composite Primary Key
Foreign Key
Key Definitions
Primary Key:
One attribute whose value can uniquely identify a
complete record (one row of data) within an entity.
Composite Primary Key
A primary key that consists of two or more attributes
within an entity.
Foreign Key
A copy of a primary key that exists in another entity
for the purpose of forming a relationship between
the entities involved.
Identifier attribute
Identifier attribute or Key is an attribute (or combination of
attributes) that uniquely identifies individual instances of an
entity type, such as Student_ID
For an attribute to be a candidate identifier, each entity instance
must have a single value for the attribute, and the attribute must be
associated with the entity
The identifier attribute is underlined, such as Student_ID
Simple and composite key attributes
(a) Simple key attribute
Composite Identifier
A Composite Identifier is used when there is no single (or atomic)
attribute that can serve as an identifier
Flight_ID is a composite identifier that has component
attributes Flight_Number and Date – this combination is
required to uniquely identify individual occurrences of Flight
Flight_ID is underlined, whilst its components are not
(b) Composite key attribute
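The Flight example can be sketched as a composite primary key using Python's sqlite3 module. The column is named Flight_Date here instead of Date, and the flight number and dates are made-up sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE FLIGHT (
    Flight_Number TEXT NOT NULL,
    Flight_Date   TEXT NOT NULL,
    -- Neither column is unique alone; the combination identifies a flight.
    PRIMARY KEY (Flight_Number, Flight_Date)
)""")
conn.execute("INSERT INTO FLIGHT VALUES ('ZW101', '2024-03-15')")
conn.execute("INSERT INTO FLIGHT VALUES ('ZW101', '2024-03-16')")  # same number, new date: accepted

# Repeating an existing (number, date) combination violates the key.
try:
    conn.execute("INSERT INTO FLIGHT VALUES ('ZW101', '2024-03-15')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

flight_count = conn.execute("SELECT COUNT(*) FROM FLIGHT").fetchone()[0]
print(duplicate_rejected, flight_count)  # True 2
```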
Criteria for selecting identifiers
Unary relationships
Between the instances of a single entity type (also called
recursive relationships)
‘Is_Married_To’ is a one-to-one relationship between instances
of the PERSON entity type
‘Manages’ is a one-to-many relationship between instances of
the EMPLOYEE entity type
Binary relationships
Between the instances of two entity types, and is the most
common type of relationship encountered in data modelling.
e.g. (one-to-one) an EMPLOYEE is assigned one
PARKING_PLACE, and each PARKING_PLACE is assigned
to one EMPLOYEE
e.g. (one to many) a PRODUCT_LINE may contain many
PRODUCTS, and each PRODUCT belongs to only one
PRODUCT_LINE
e.g. (many-to-many) a STUDENT may register for more than
one COURSE, and each COURSE may have many
STUDENTS
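The one-to-one and one-to-many examples above can be sketched relationally with Python's sqlite3 module. Column names such as Place_ID and Line_ID are assumptions; a many-to-many relationship would usually need a separate linking table and is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE EMPLOYEE (Employee_ID INTEGER PRIMARY KEY);
-- One-to-one: UNIQUE on the foreign key stops an employee holding two places.
CREATE TABLE PARKING_PLACE (
    Place_ID    INTEGER PRIMARY KEY,
    Employee_ID INTEGER UNIQUE REFERENCES EMPLOYEE (Employee_ID)
);
CREATE TABLE PRODUCT_LINE (Line_ID INTEGER PRIMARY KEY);
-- One-to-many: many PRODUCT rows may share one Line_ID.
CREATE TABLE PRODUCT (
    Product_ID INTEGER PRIMARY KEY,
    Line_ID    INTEGER NOT NULL REFERENCES PRODUCT_LINE (Line_ID)
);
""")
conn.execute("INSERT INTO EMPLOYEE VALUES (1)")
conn.execute("INSERT INTO PARKING_PLACE VALUES (10, 1)")
try:
    conn.execute("INSERT INTO PARKING_PLACE VALUES (11, 1)")  # second place, same employee
    second_place_allowed = True
except sqlite3.IntegrityError:
    second_place_allowed = False

conn.execute("INSERT INTO PRODUCT_LINE VALUES (1)")
conn.executemany("INSERT INTO PRODUCT VALUES (?, ?)", [(100, 1), (101, 1)])
products_in_line = conn.execute(
    "SELECT COUNT(*) FROM PRODUCT WHERE Line_ID = 1").fetchone()[0]
print(second_place_allowed, products_in_line)  # False 2
```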
Ternary relationships
A ternary relationship is a simultaneous relationship among
the instances of 3 entity types
It is less common than the binary relationship, which is the
most common type encountered in data modelling
Here, vendors can supply various parts to warehouses
The relationship ‘Supplies’ is used to record the specific
PARTs supplied by a given VENDOR to a particular
WAREHOUSE
There are two attributes on the relationship ‘Supplies’,
Shipping_Mode and Unit_Cost
e.g. one instance of ‘Supplies’ might record that VENDOR X
can ship PART C to WAREHOUSE Y, that the Shipping_Mode
is ‘next_day_air’ and the Unit_Cost is £5.00 per unit
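The ‘Supplies’ instance described above can be sketched as one row linking all three entities at once, using Python's sqlite3 module. The identifier column names and the use of single letters as key values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE VENDOR    (Vendor_ID    TEXT PRIMARY KEY);
CREATE TABLE PART      (Part_ID      TEXT PRIMARY KEY);
CREATE TABLE WAREHOUSE (Warehouse_ID TEXT PRIMARY KEY);
-- Ternary relationship: each row ties together one vendor, one part and
-- one warehouse, and carries the relationship's own attributes.
CREATE TABLE SUPPLIES (
    Vendor_ID     TEXT NOT NULL REFERENCES VENDOR (Vendor_ID),
    Part_ID       TEXT NOT NULL REFERENCES PART (Part_ID),
    Warehouse_ID  TEXT NOT NULL REFERENCES WAREHOUSE (Warehouse_ID),
    Shipping_Mode TEXT,
    Unit_Cost     REAL,
    PRIMARY KEY (Vendor_ID, Part_ID, Warehouse_ID)
);
""")
conn.execute("INSERT INTO VENDOR VALUES ('X')")
conn.execute("INSERT INTO PART VALUES ('C')")
conn.execute("INSERT INTO WAREHOUSE VALUES ('Y')")
conn.execute("INSERT INTO SUPPLIES VALUES ('X', 'C', 'Y', 'next_day_air', 5.00)")

row = conn.execute(
    "SELECT Shipping_Mode, Unit_Cost FROM SUPPLIES "
    "WHERE Vendor_ID = 'X' AND Part_ID = 'C' AND Warehouse_ID = 'Y'").fetchone()
print(row)  # ('next_day_air', 5.0)
```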
One-to-one