TI404 - Introduction To Databases: Maha Naceur/Laurent Cetinsoy


TI404 - Introduction to databases
Part 2

Maha Naceur/Laurent Cetinsoy

2021 - 2022
02

Database design and modelling
How do we structure data in a database?
Are there rules to follow?
What should be avoided?
Model E/A

Relational model

Normalizing
Database design process

Understand the problem!

Analysis: study of the existing system, the needs, the choices, the constraints, etc.

Conceptual data model: represents the important aspects of the problem in a non-formal (graphical) way

Logical data model: description of the solution in a formal way, but independent of the choice of implementation

Physical data model: implementation of the DB in a DBMS from the LDM (and optimization)

You can design databases without thinking about the specifics of the database management system!
Do not think about the software, think about the concepts.
Life cycle

Needs analysis (specifications): analyse, then specify the DB

1. Conceptual data model (CDM): conceptual (abstract) diagram (in Merise: "modèle conceptuel de données")

Transformation into a logical model

2. Logical Data Model (LDM): logical model

Physical design: internal schema, specific to a DBMS

Maintenance
Analyze well, model well!

Analyse well!

The analysis stage is fundamental in the design process.
The perception of the existing situation and the needs relies on the engineer's expertise.

Model well

The conceptual modeling step must be properly performed.
Moving from a natural language specification subject to interpretation to an unambiguous specification requires the use of modeling formalisms such as E-A or UML.
The logical model is deduced in a systematic way from the conceptual model, and the software implementation is realized by direct translation of the logical model.
Needs of description
Conceptual modeling: Describe the application data (trains, trips and reservations) without referring to a specific IT solution.

Logical modeling (Data Description Language (DDL)): Develop an equivalent description for the data storage in the chosen DBMS (the collections: ticket, train and stop).

Language for data insertion: Create the initial database with the data representing the SNCF network (insert the different information: ticket, train and stop, setting up the relations between them).

Data manipulation language (DML): Create the reservation data as you go. Be able to modify (and possibly delete) any data already entered (if a train is late, update the departure time of train X).

Query language: Respond to any request for information on the data contained in the database (display all trains leaving at 10am from Nantes station).

Language for expressing integrity constraints: It must be possible to express all the rules that constrain the values that can be recorded, so as to avoid any error that can be detected (ticket X and stop Y depend on a train number Z).
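The needs listed above can be illustrated with a small sketch using Python's standard `sqlite3` module. The table and column names (`train`, `ticket`, `departure_station`, ...) and the sample rows are assumptions made for the illustration; they are not taken from the course.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Data description: declare the structures and their integrity constraints.
conn.executescript("""
CREATE TABLE train (
    train_id          INTEGER PRIMARY KEY,
    departure_station TEXT NOT NULL,
    departure_time    TEXT NOT NULL
);
CREATE TABLE ticket (
    ticket_id INTEGER PRIMARY KEY,
    train_id  INTEGER NOT NULL REFERENCES train(train_id)
);
""")

# Data insertion: create the initial content.
conn.executemany(
    "INSERT INTO train VALUES (?, ?, ?)",
    [(1, "Nantes", "10:00"), (2, "Paris", "10:00"), (3, "Nantes", "12:30")],
)
conn.execute("INSERT INTO ticket VALUES (100, 1)")

# Data manipulation: train 3 is late, update its departure time.
conn.execute("UPDATE train SET departure_time = ? WHERE train_id = ?", ("12:45", 3))

# Query: all trains leaving at 10am from Nantes.
rows = conn.execute(
    "SELECT train_id FROM train WHERE departure_station = ? AND departure_time = ?",
    ("Nantes", "10:00"),
).fetchall()
nantes_10am = [row[0] for row in rows]

# Integrity constraint: a ticket must depend on an existing train.
try:
    conn.execute("INSERT INTO ticket VALUES (101, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

new_time = conn.execute(
    "SELECT departure_time FROM train WHERE train_id = 3"
).fetchone()[0]
```

Here the DBMS itself refuses the ticket for train 99, so the rule does not have to be re-checked by every application.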
Description requirements

Reliability guarantee: The information (e.g. reservations) must not be lost due to any kind of malfunction: programming error, system failure, computer failure, power failure, ...

Concurrency control guarantee: An action done for one user (e.g. booking a reservation) must not be lost because of another action done simultaneously for another user (booking the same seat).

Confidentiality guarantee: All information must be protected from read or write access by unauthorized users. Example: prohibit customers from changing train numbers or schedules or their reservations.

Optimization mechanisms: The response time of the system must match the needs: in interactive use, no more than 3 seconds; in programmed (non-interactive) use, fast enough to handle the expected workload (number of transactions per day).
Entity/Association Model
You want to describe the problem in terms of entities and associations between entities
Model E/A

Distinguish the entities that make up the database, and the associations (relationships) between these entities.
• These concepts give structure to the base, which is essential.

Database design (relational ones mainly)
• Set up a correct schema allowing the development of a viable application.
• The designer must meet the requirements.

Simple and powerful enough to represent relational structures.
• It is based on a graphic representation that makes it much easier to understand.

Suffers from many shortcomings:
• There is no operation to manipulate the data, and no (or few) ways to express constraints.
• It often leads to some ambiguities for complex schemes.
The jargon !

➢An entity is similar to the notion of object / concept, it describes an "entity" of the real
world.
➢Example: a book, a student, an account, an invoice, ...
➢An association is a link between several entities
➢Example: an invoice contains several products.
E/A: informal representation

“Consider a database describing films, with their directors and actors, as well as the theaters where
these films are shown. This database is accessible on the Web and Internet users can rate
the films they have seen.”

[Figure: informal sketch of the film database; annotations: same director (MES), same year]
E/A : formal representation
[Figure: E/A diagram annotated with its elements: entities, associations, attributes (the additional information carried by entities), cardinalities and unique identifiers]


Entity, attribute and type of entity

Entity: any identifiable object relevant to the application
 • e.g. Film, Artist, Internet user...
 • To distinguish entities, we give them a unique id

Attribute: a property that characterizes an entity
 • e.g. the title of a movie, the name of an artist...
 • An attribute can be atomic: it takes one value and one value only
 • An attribute can be multivalued: a set of values taken from the same domain (phone numbers)
 • An attribute can be composite: built by aggregating other attributes (address: street number, street name, postal code, country)

The type of an entity is composed of the following elements:
 • Its name;
 • The list of its attributes with, optionally, the domain where each attribute takes its values: integers, strings;
 • The indication of the attribute(s) used to identify the entity: they constitute the key.
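As a minimal sketch, an entity type (name, attributes with their domains, key) could be written down as follows. The class names `Attribute` and `EntityType`, the `conforms` helper and the `Film` attributes are hypothetical and chosen only for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    domain: type  # domain where the attribute takes its values, e.g. int or str


@dataclass(frozen=True)
class EntityType:
    name: str
    attributes: tuple[Attribute, ...]
    key: tuple[str, ...]  # names of the attribute(s) that identify an entity

    def __post_init__(self):
        # The key must be made of attributes of the entity type.
        names = {a.name for a in self.attributes}
        missing = [k for k in self.key if k not in names]
        if missing:
            raise ValueError(f"key attributes not declared: {missing}")

    def conforms(self, entity: dict) -> bool:
        """True if the entity has exactly the declared attributes, each in its domain."""
        if set(entity) != {a.name for a in self.attributes}:
            return False
        return all(isinstance(entity[a.name], a.domain) for a in self.attributes)


# Hypothetical entity type for the film example.
film = EntityType(
    name="Film",
    attributes=(Attribute("id_film", int), Attribute("title", str), Attribute("year", int)),
    key=("id_film",),
)
```

For instance, `film.conforms({"id_film": 1, "title": "Film A", "year": 2001})` accepts the entity, while a string in `id_film` is rejected because it falls outside the declared domain.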
Unique identifier: key

Key: Let E be an entity type and A the set of its attributes. A key of E is a minimal subset of A that uniquely
identifies an entity in any extension of E

• The characteristics of a good key are:

 • its value is known for any entity;

 • it should never need to be modified;

 • its storage size should be as small as possible (storage performance)

Examples: id_Film to identify a film, the pair (email, password) to identify an internet user, etc.

[Figure: E/A diagram with its keys, attributes and entities labelled]
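The definition of a key (a minimal subset of attributes that identifies every entity) can be tested on a sample extension, as in the sketch below. The function names and the sample rows are hypothetical. Note that a sample can only refute a candidate key: the definition requires uniqueness in any extension, which is a modelling decision rather than something one data set can prove.

```python
from itertools import combinations


def identifies(rows, attrs):
    """True if no two rows share the same values on attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_key(rows, attrs):
    """True if attrs identifies the rows and no proper subset of attrs already does."""
    attrs = tuple(attrs)
    if not identifies(rows, attrs):
        return False
    return not any(
        identifies(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


# Made-up extension of a Film entity type.
films = [
    {"id_film": 1, "title": "Alpha", "year": 2001},
    {"id_film": 2, "title": "Alpha", "year": 2010},
    {"id_film": 3, "title": "Beta", "year": 2010},
]
```

On this sample, `("id_film",)` and `("title", "year")` both pass, `("title",)` fails on uniqueness, and `("id_film", "title")` fails on minimality because `id_film` alone is enough.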
Choice of the identifier

❑ Avoid identifiers composed of several attributes (such as, for example, an identifier formed by the attributes
name and first name):
✓ they degrade the performance of the DBMS,
✓ but above all, the uniqueness assumed by such an approach usually ends up, sooner or later, being
proven false

❑ Avoid identifiers that may change over time (such as a vehicle license plate)

❑ Avoid identifiers of the character string type


Association

➢ An association (sometimes called a relation) represents the semantic links that can exist between several entities
➢ An association class (or type) contains all similar associations (that link entities belonging to the same entity classes)

Association

➢ An association class can link more than two entity classes. Here are the names of the association classes:
– a recursive (or reflexive) association class links the same entity class
– a binary association class links two entity classes
– a ternary association class links three entity classes
– an n-ary association class links n entity classes
Cardinality
Cardinality: Let A be an association (E1, E2) between two types of entities. The cardinality of the association for
Ei (E1 or E2) is a pair [Min, Max] such that:

➢ Max: maximum cardinality, which designates the maximum number of times an instance ei of Ei can
take part in the association (1 or n)

➢ Min: minimum cardinality, which designates the minimum number of times an instance ei of Ei can
take part in the association (0 or 1)

Example: An artist (MES, i.e. director) makes 0 or more films and a film is made by one and only one artist (MES).

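[Figure: the association "makes" between Artist and Film, with the min and max cardinalities marked. It reads "an artist makes a movie" in one direction, with cardinality [0, n] for Artist, and "a movie is made by an artist" in the other, with cardinality [1, 1] for Film]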
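To make the [Min, Max] pair concrete, the sketch below counts how many times each instance takes part in an association stored as a set of pairs. The function name and the identifiers (`a1`, `f1`, ...) are hypothetical. The values it returns are only those observed in one sample; the cardinalities in the model are constraints on every possible extension.

```python
def observed_cardinality(instances, pairs, position):
    """Return (min, max) participation counts for the entities at `position` in the pairs."""
    counts = {entity: 0 for entity in instances}
    for pair in pairs:
        counts[pair[position]] += 1
    return min(counts.values()), max(counts.values())


# Made-up extension of the association "makes" between Artist and Film.
artists = ["a1", "a2", "a3"]
films = ["f1", "f2", "f3"]
makes = [("a1", "f1"), ("a1", "f2"), ("a2", "f3")]

artist_side = observed_cardinality(artists, makes, 0)  # a3 has made no film
film_side = observed_cardinality(films, makes, 1)      # every film has exactly one artist
```

The artist side gives (0, 2), compatible with [0, n], and the film side gives (1, 1), matching "one and only one artist".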


n-ary Associations

A relationship between n types of entities E1, E2, ..., En is a set of n-tuples (e1, e2, ..., en) where each ei belongs to Ei

Example:

• A film is shown at 3pm in room number 5 and at 6pm in room number 7

• At 3pm, there will be a screening of the movie "Avengers" in room 9 and of the movie "Iron man" in room 5.

• In room number 5, the films "Avengers" and "Iron man" will be shown at 3 pm and 5 pm respectively.
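A ternary association can be pictured directly as a set of 3-tuples, as in the sketch below. The `Screening` type, the `select` helper and the sample tuples are assumptions for the illustration, loosely based on the film/room/time example.

```python
from typing import NamedTuple


class Screening(NamedTuple):
    """One tuple (film, room, time) of the ternary association."""
    film: str
    room: int
    time: str


# Made-up extension of the association.
screenings = {
    Screening("Avengers", 9, "15:00"),
    Screening("Iron man", 5, "15:00"),
    Screening("Avengers", 5, "17:00"),
}


def select(tuples, **fixed):
    """Keep the tuples whose named components equal the given values."""
    return {t for t in tuples if all(getattr(t, k) == v for k, v in fixed.items())}


films_at_3pm = {s.film for s in select(screenings, time="15:00")}
rooms_for_avengers = {s.room for s in select(screenings, film="Avengers")}
```

Fixing one component (a time, a film or a room) and reading the others corresponds to the three ways of reading the example.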
A few tips
o Name standardization: The name given to an entity, an association or an attribute must be
unique
➢ If attributes share the same name, there is a risk of inconsistency, but also of redundancy and wasted
memory space
➢ Try not to use abbreviations: clear and precise names are better
A few tips
o Standardization of attributes
o Each multiple attribute (structure) must be transformed into an entity and an additional
association is added
o Same for an enumeration (example for Film, the attribute "Genre" ={action, comedy,
fiction,...})
o Avoid redundant attributes obtained by calculation or transitivity, i.e. an attribute
derived from other attributes

Avoid attributes containing several pieces of data

"Address" contains a lot of data: street number, street name,
city, zip code, country (bad). It is to have a ...
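The rule on multiple attributes (turn the attribute into an entity and add an association) can be sketched as follows. The function name, the field names and the sample films are hypothetical, and the `genres` list stands in for the "Genre" attribute of Film.

```python
def extract_genres(films):
    """Split a multivalued 'genres' attribute into a Genre entity and an association."""
    genre_ids = {}     # genre name -> identifier of the new Genre entity
    has_genre = set()  # association between Film and Genre: (id_film, id_genre)
    film_rows = []
    for film in films:
        # The film keeps its other attributes and loses the multivalued one.
        film_rows.append({k: v for k, v in film.items() if k != "genres"})
        for name in film["genres"]:
            id_genre = genre_ids.setdefault(name, len(genre_ids) + 1)
            has_genre.add((film["id_film"], id_genre))
    genre_rows = [{"id_genre": i, "name": n} for n, i in genre_ids.items()]
    return film_rows, genre_rows, has_genre


# Made-up films with a multivalued attribute.
films = [
    {"id_film": 1, "title": "Film A", "genres": ["action", "comedy"]},
    {"id_film": 2, "title": "Film B", "genres": ["comedy"]},
]
film_rows, genre_rows, has_genre = extract_genres(films)
```

Each genre name is now stored once, in `genre_rows`, and films refer to it through the association instead of repeating the text.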
A few tips
o Merging when possible
o Entities and associations should be factorized when possible

Merger of entities
Association merger
A few tips
o Make deletions when possible
o We must ask ourselves the question of the interest of an association when the
maximum cardinalities are all equal to 1

Possible deletion
Deletion not possible
A few tips

o If there are two paths to get from one entity to another, then these two paths must have two
distinct meanings.
o Otherwise, the shortest path must be deleted since it can be deduced from the other paths
From an n-ary association to an entity type
Let A be an association between the entity types {E1, E2,.., En}. The transformation of A into an
entity type is done in three steps:

1. Assignment of an autonomous identifier to A

2. Creation of an association Ai, of type (1..n), between A and each of the Ei

3. The minimum cardinality on the A side is always 1.
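The three steps can be sketched on an association stored as a set of tuples. The function `reify`, the role names and the sample data are assumptions for the illustration; the sketch gives each tuple its own identifier and links it exactly once to each participating entity, so the minimum on the A side is 1.

```python
def reify(name, roles, tuples):
    """Turn an n-ary association into an entity plus one binary association per role."""
    entities = []                            # step 1: occurrences with an autonomous identifier
    links = {role: set() for role in roles}  # step 2: one association Ai per entity type Ei
    for new_id, values in enumerate(sorted(tuples), start=1):
        entities.append({f"id_{name}": new_id})
        for role, value in zip(roles, values):
            # step 3: every new occurrence takes part in each Ai (minimum 1 on the A side)
            links[role].add((new_id, value))
    return entities, links


# Made-up ternary association between film, room and time.
screenings = {("Avengers", 9, "15:00"), ("Iron man", 5, "15:00")}
entities, links = reify("screening", ("film", "room", "time"), screenings)
```

After the transformation, `links["film"]`, `links["room"]` and `links["time"]` are three binary associations, and the original tuples can be rebuilt by joining them on the new identifier.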


In a nutshell

Identify the entities present
•In general, an entity is created if it has at least 2 occurrences.
•Each element of an entity is called an occurrence of the entity

List the attributes of the entities
•We generally limit ourselves to the properties necessary for the development
•Each property must have only one possible value for each occurrence, otherwise it is an entity.
•It must also be elementary and non-decomposable (the address property (street, postal code, city) is composed of 3 elementary properties)

Uniquely identify each occurrence
•Let's say we have two subscribers named "John": it is necessary to distinguish between them or they will be confused.
•We then add a property that will uniquely identify each occurrence: Primary Key
•This key is underlined to highlight its role as an identifier

Establish the relationships between the different entities
•Generally verbs

Identify cardinalities
•Generally indicated explicitly in the text
Application : Newsletter

"A subscriber is subscribed to one or more columns. Each column sends a NewsLetter
every week to the subscribers registered to the corresponding column. A
subscriber can choose one out of several reasons for subscribing."

➢ These sentences, if they are accurate and validated by the client, are sufficient to
build our first model. They contain all the necessary information.
Application : Newsletter

A subscriber is registered to one or more columns. Each
column sends a NewsLetter every week to the subscribers
of the corresponding column. A subscriber can choose
one out of several possible reasons to subscribe.


Application : Hospital management

«The patients of the hospital are distributed among its departments (each
characterized by an identifying name, its location and its specialty).
Each doctor belongs to a department. He is identified by his name and first
name.
A patient makes visits.
Each visit is made to a doctor on a specific date.
A patient cannot have more than one visit per day.
During a visit, one or more prescriptions can be written.
Each prescription mentions the name of a drug and the dosage to be
respected by the patient (dosage = simple line of text).
It is obvious that the doctor does not prescribe the same medication twice
during the same visit.
A patient has a registration number (identifier), a name, a first name and an
address. »
Application Hospital management
Application : Training company

"A training company wants to computerize the management of registrations for the
sessions it organizes, as well as the invoicing. There are a number of training
seminars, each devoted to a different theme and billed at a specific rate. A seminar
can be held more than once, which corresponds to as many sessions. The sessions
of a seminar are held on different dates. Some companies register their employees
for certain sessions. There is a maximum number of participants for each seminar
session (regardless of the date of the session). Each month, the company invoices
each company for the amount of money its employees have spent in the past
month's sessions."
Application : Training company
Application: Newsletter (Final standardized version)
Application: Hospital management
Application: Training company
Approach to building an E/A model

1. Pragmatic approach:
 • Based on the designer's intuition. Entities and associations are built from the entity definition
 ✓ Advantage: speed
 ✓ Disadvantage: risk of only modeling the existing
2. Formal approach:
 • Establish the data dictionary
 • Eliminate redundancies
 • Locate the identifiers and identify the material entities
 • Attach to entities the properties in DFM (functional dependencies) with their identifiers
 • Add the associations and assign them properties in DF with the identifiers of the participating entities
 • Simplify the model using FICs (Functional Integrity Constraints)
 • Check the conceptual data model
E/A in a nutshell!

Simple and practical:


•There are only 3 concepts: entities, associations and
attributes
•Suitable for an intuitive graphical representation,
even if there are many conventions.
•Allows quick modeling of structures that are not too complex.

Non-determinism:
•No hard and fast rule for determining what is an
entity, attribute or relationship.
