Chapter I - Intro DB
Contents
I. INTRODUCTION
II. KEY TERMINOLOGIES
  2.1. Database
  2.2. Field, record, table and data type
  2.3. Data Dictionary
    2.3.1. Active data dictionary
    2.3.2. Passive data dictionary
III. DATABASE MODELS
  3.1. Hierarchical Database Model
  3.2. Network Database Model
  3.3. Object-oriented Database Model
  3.4. Relational Database
IV. DATA CONCEPTS
  4.1. Logical data concepts
    4.1.1. Entity
    4.1.2. Attributes
    4.1.3. Relationship
    4.1.4. Types of Relationship
  4.2. Keys
    4.2.1. Primary Key
    4.2.2. Candidate Key
    4.2.3. Foreign Key
    4.2.4. Simple Key
    4.2.5. Compound Key
    4.2.6. Composite Key
  4.3. Data integrity
    4.3.1. Entity integrity (or table integrity)
    4.3.2. Referential integrity
Chapter I : Generalities on DB
I. INTRODUCTION
Often abbreviated DB, a database is basically a collection of information organized in such a
way that a computer program can quickly select desired pieces of data. You can think of a
database as an electronic filing system. Databases may be stored on a computer and examined
using a program. These programs are often called database management systems (DBMS).
Traditional databases are organized by fields, records, and files. A field is a single piece of
information; a record is one complete set of fields; and a file is a collection of records. For
example, a telephone book (or telephone directory) is analogous to a file. It contains a list of
records, each of which consists of three fields: name, address, and telephone number. This
chapter covers some database terminology and concepts.
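The field, record and file vocabulary can be sketched in a few lines of Python. This is only an illustration of the telephone book analogy: the names `PhoneRecord`, `phone_book` and `find_by_name`, and the sample entries, are invented for the example.

```python
from dataclasses import dataclass, fields


@dataclass
class PhoneRecord:
    """One record: a complete set of the three fields."""
    name: str
    address: str
    telephone: str


# The "file" is simply a collection of records (sample data is made up).
phone_book = [
    PhoneRecord("Alice Ndi", "12 Hill Road", "555-0101"),
    PhoneRecord("Bob Tamo", "7 Market Street", "555-0102"),
]


def find_by_name(book, name):
    """Select the records whose name field matches."""
    return [record for record in book if record.name == name]


field_names = [f.name for f in fields(PhoneRecord)]
matches = find_by_name(phone_book, "Bob Tamo")
print(field_names)
print(matches[0].telephone)
```

A DBMS does the same kind of selection, but on stored data and usually with indexes so that the whole file does not have to be scanned.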
II. KEY TERMINOLOGIES
2.1. Database
A database is a collection of non-redundant data which can be shared by different application
systems [non-redundant here means that each piece of data appears only once]. Although databases are
generally computerized, examples of non-computerized databases from everyday life are
abundant. A dictionary, a phone book, a collection of recipes and a TV
guide are all common examples of non-computerized databases. Examples of
computerized databases include customer files, employee rosters, book catalogues, equipment
inventories and sales transactions.
2.2. Field, record, table and data type
Field: A field represents one related part of a table and is the smallest logical structure of
storage in a database. It holds one piece of information about an item or a subject. For
example, in a database maintaining information about students, the fields can be Name,
Surname, Birthday (see figure 1 above).
Record: A record is a collection of multiple related fields that can be treated as a unit. For
example, the fields Name, Surname and Birthday for a particular student form a record. Figure 1
contains 6 records, and each record has 3 fields.
Table: A table is a named collection of logically related records. For example, the
collection of all the student records of a school forms the student table. Depending on the DBMS,
a table can also be referred to as a file. The collection of multiple related files (tables) forms
the database.
Data Type: A data type determines the type of data that can be stored in a column. Although
many data types are available, the four most commonly used are Character,
Numeric, Boolean and DateTime. The names of these data types and the values they accept vary
widely depending on the DBMS being used.
Data type    Character   Numeric   Boolean      DateTime
Field name   Name        Salary    Is_Married   Joining_Date
Data         Peter       450000    False (No)   02/10/98
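As a hedged illustration of the four data types, the sketch below declares the example fields in SQLite through Python's built-in `sqlite3` module. The table name `employee` is an assumption, and so is reading 02/10/98 as 2 October 1998. The type names are DBMS-specific: SQLite has no dedicated Boolean or DateTime storage type, so False is stored as the integer 0 and the date as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        Name         TEXT,     -- Character
        Salary       NUMERIC,  -- Numeric
        Is_Married   INTEGER,  -- Boolean, stored as 0 or 1 in SQLite
        Joining_Date TEXT      -- DateTime, stored here as ISO 8601 text
    )
    """
)
conn.execute(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    ("Peter", 450000, False, "1998-10-02"),
)

row = conn.execute(
    "SELECT Name, Salary, Is_Married, Joining_Date FROM employee"
).fetchone()
stored_types = conn.execute(
    "SELECT typeof(Name), typeof(Salary), typeof(Is_Married), "
    "typeof(Joining_Date) FROM employee"
).fetchone()
print(row)
print(stored_types)
```

Other DBMSs offer native BOOLEAN, DATE or TIMESTAMP types, which is one concrete way in which data types differ from one system to another.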
2.3. Data Dictionary
Metadata summarizes basic information about data, which can make finding and working with
particular instances of data easier. For example, author, date created, date modified and
file size are examples of very basic document metadata. Being able to filter on
that metadata makes it much easier for someone to locate a specific document.
This metadata is stored in an area called the data dictionary. Hence, a data dictionary
defines the basic organization of a database. It is a centralized repository of metadata [it can
also be defined as an inventory of data elements in a database or data model with detailed
description of its format, relationships, meaning, source and usage].
• Programmers may use it to ensure that the names and coding of the data
items or segments are correct in their programs.
• Managers may use it as a guide to decide what data could be made available to them.
The data dictionary contains important information, such as what files [tables] are in the
database and descriptions (called attributes) of the data contained in the files [descriptions of
fields]. The data dictionary is very important because it records what is in the
database, who is allowed to access it, where the database is physically stored, etc. The users
of the database normally do not interact with the data dictionary; it is handled only by the
database administrators.
Information stored in the data dictionary could normally be expected to include:
- The names of fields contained in all of the organization’s databases
- What table(s) each field exists in
- What database(s) each field exists in
- The data types, e.g., integer, real, character, and image of all fields in the
organization’s databases
- The sizes, e.g., LONG INT, DOUBLE, and CHAR(64), of all fields in the
organization’s databases
- An explanation of what each database field means
- The source of the data for each database field
- A list of applications that reference each database field
- The relationship between fields in all of the organization’s databases
- Default values that exist for all fields in all of the organization’s databases
- Who has access to each field
- Details about all the tables in the database, such as their owners, their security
constraints, when they were created etc.
- Physical information about the tables such as where they are stored and how.
- Table constraints such as primary key attributes, foreign key information etc.
- Information about the database views that are visible.
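Many relational DBMSs expose part of this information through a system catalog. As a minimal sketch, assuming SQLite and a hypothetical `employee` table, the catalog table `sqlite_master` and the `PRAGMA table_info` statement return table names, field names, declared types, default values and primary key flags. This covers only a small part of the list above; items such as the source of the data or access rights are not stored there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        salary NUMERIC DEFAULT 0
    )
    """
)


def describe_database(connection):
    """Return {table: [field descriptions]} read from SQLite's catalog."""
    dictionary = {}
    tables = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = connection.execute(f'PRAGMA table_info("{table}")').fetchall()
        dictionary[table] = [
            {
                "field": name,
                "type": col_type,
                "required": bool(notnull),
                "default": default,
                "primary_key": bool(pk),
            }
            for _cid, name, col_type, notnull, default, pk in columns
        ]
    return dictionary


data_dictionary = describe_database(conn)
for entry in data_dictionary["employee"]:
    print(entry)
```

Because the DBMS itself keeps this catalog up to date whenever a table is created or altered, it cannot drift away from the actual structure of the database.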
Example 2: This is a data dictionary describing a table that contains employee details.
Field Name   Data Type   Field Size for display   Description   Example
updated to match the database. This needs careful handling; otherwise the database and the data
dictionary fall out of sync.
IV. DATA CONCEPTS
4.1. Logical data concepts
4.1.1. Entity
An entity is any object in the system that we want to model and store information about.
Entities are usually recognizable concepts, either concrete or abstract, such as persons, places,
things, or events which are relevant to the database. Some specific examples of entities are
Employee, Student, and Lecturer. An entity is analogous to a table in the relational model.
An entity occurrence is an instance of an entity. For example, in the Student entity, the
information about one individual student is an entity occurrence. An entity
occurrence can also be referred to as a record. By convention, entities are represented by
rectangles.
4.1.2. Attributes
An attribute is an item of information which is stored about an entity. For example, the entity
'lecturer' could have attributes such as staff id, surname, telephone number, etc. By
convention, an attribute is represented by an oval linked to the corresponding entity (figure
5).
Figure 5: Attributes
4.1.3. Relationship
A relationship is an association, dependency or link between two or more entities and is
represented by a diamond symbol. It describes how those entities are
related to each other. For example, the relationship Buys (shown in figure 5) associates the
CUSTOMER entity with the ITEMS entity.
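The three logical concepts can be mirrored in code. The sketch below is only an analogy, not a database: each entity becomes a class, each attribute a class field, and the Buys relationship a class that links one occurrence of each entity. The attribute names and sample values are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:          # entity
    customer_id: int     # attribute
    name: str            # attribute


@dataclass(frozen=True)
class Item:              # entity
    item_code: str       # attribute
    description: str     # attribute


@dataclass(frozen=True)
class Buys:              # relationship linking the two entities
    customer: Customer
    item: Item


# Entity occurrences (sample values are invented).
alice = Customer(1, "Alice")
pen = Item("I001", "Pen")
ink = Item("I002", "Ink")

# Relationship occurrences: Alice buys two items.
purchases = [Buys(alice, pen), Buys(alice, ink)]
items_bought = [b.item.description for b in purchases if b.customer == alice]
print(items_bought)
```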
Relationship symbols:
Example 1: If a man marries only one woman and a woman marries only one man, it is a
one-to-one (1:1) relationship.
Example 2: A car has only one engine. An engine is installed only on one car.
Example 1: A manager manages many employees, but each employee has only one manager.
Figure 9: One-to-Many
The crow's foot represents the Many occurrences.
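In a relational database, both kinds of relationship are usually implemented with a foreign key; the difference is whether that key must be unique. The sketch below assumes SQLite and invented table and column names. It puts a UNIQUE foreign key on `engine` (one-to-one with `car`) and a plain foreign key on `employee` (many employees per manager).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript(
    """
    CREATE TABLE car (car_id INTEGER PRIMARY KEY);
    -- One-to-one: UNIQUE means a car can appear in at most one engine row.
    CREATE TABLE engine (
        engine_id INTEGER PRIMARY KEY,
        car_id    INTEGER NOT NULL UNIQUE REFERENCES car(car_id)
    );
    CREATE TABLE manager (manager_id INTEGER PRIMARY KEY);
    -- One-to-many: no UNIQUE, so many employees may share a manager.
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        manager_id  INTEGER NOT NULL REFERENCES manager(manager_id)
    );
    INSERT INTO car VALUES (1);
    INSERT INTO engine VALUES (10, 1);
    INSERT INTO manager VALUES (1);
    INSERT INTO employee VALUES (100, 1);
    INSERT INTO employee VALUES (101, 1);
    """
)


def try_insert(sql):
    """Return True if the insert is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


second_engine_ok = try_insert("INSERT INTO engine VALUES (11, 1)")
third_employee_ok = try_insert("INSERT INTO employee VALUES (102, 1)")
print(second_engine_ok, third_employee_ok)
```

A second engine for the same car is rejected, while a third employee for the same manager is accepted.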
d. Complete examples
e. Short summary:
Relationship                            Example                       Left        Right
one-to-one                              person ←→ birth certificate   1           1
one-to-one (optional on one side)       person ←→ driving license     1           0..1 or ?
many-to-one                             person ←→ birth place         1..* or +   1
many-to-many (optional on both sides)   person ←→ book                0..* or *   0..* or *
one-to-many                             order ←→ line item            1           1..* or +
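The notations in the Left and Right columns can be read as a (minimum, maximum) pair. The helper below is a small sketch of that reading; the function names `parse_multiplicity` and `is_optional` are invented for the example, and `None` is used as an arbitrary marker for "many".

```python
def parse_multiplicity(notation):
    """Translate '1', '0..1', '1..*', '0..*', '?', '+' or '*'
    into a (minimum, maximum) pair; a maximum of None means 'many'."""
    shorthand = {"?": "0..1", "+": "1..*", "*": "0..*"}
    notation = shorthand.get(notation.strip(), notation.strip())
    if ".." in notation:
        low, high = notation.split("..")
    else:
        low = high = notation  # a single number such as '1' means exactly that many
    maximum = None if high == "*" else int(high)
    return int(low), maximum


def is_optional(notation):
    """A side is optional when its minimum is zero."""
    return parse_multiplicity(notation)[0] == 0


for symbol in ["1", "0..1", "?", "1..*", "+", "0..*", "*"]:
    print(symbol, parse_multiplicity(symbol))
```

Read this way, "person ←→ driving license" with 1 on the left and 0..1 on the right says that a license belongs to exactly one person, while a person may have no license or one license.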
4.1.5. Questions:
1. Add the cardinalities to the following ERM (Consider the context of ISPA)
4.2. Keys
A key is a data item that allows us to uniquely identify individual occurrences of an entity.
There are many types of keys:
4.2.1. Primary Key
A field or a set of fields that uniquely identifies each record in a table is known as a primary
key. This implies that no two records in the relation can have the same value for the primary key.
For example, a student number is a primary key because it uniquely identifies a student (within a
college’s student record system). An employee number uniquely identifies a member of staff
within a company. A public IP address identifies a host on the Internet.
A primary key is mandatory. That is, each entity occurrence must have a value for its primary
key. By convention, the field that represents the primary key is underlined (figure 2).
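The two properties of a primary key, uniqueness and mandatory value, can be observed directly in a DBMS. The sketch below assumes SQLite and a hypothetical `student` table with made-up rows. NOT NULL is written out explicitly because SQLite, unlike most DBMSs, otherwise tolerates NULL in a non-integer primary key column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_no TEXT PRIMARY KEY NOT NULL, name TEXT)"
)
conn.execute("INSERT INTO student VALUES ('S001', 'Peter')")


def rejected(sql):
    """Return True if the DBMS refuses the statement with a constraint error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Same student number twice: violates uniqueness.
duplicate_rejected = rejected("INSERT INTO student VALUES ('S001', 'Paul')")
# No student number at all: violates the mandatory-value rule.
null_rejected = rejected("INSERT INTO student VALUES (NULL, 'Jackson')")

student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(duplicate_rejected, null_rejected, student_count)
```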
4.2.3. Foreign Key
Example 1: The figure below illustrates how a foreign key constraint is related to a primary
key constraint. Here, the field Item_Code in the PURCHASE table references the field
Item_Code in the ITEM table. Thus, the field Item_Code in the PURCHASE table
is the foreign key.
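The ITEM and PURCHASE example can be reproduced as a minimal sketch, assuming SQLite. Only Item_Code comes from the example; the columns `Description` and `Purchase_Id` and the sample rows are assumptions added so that the tables are usable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.executescript(
    """
    CREATE TABLE ITEM (
        Item_Code   TEXT PRIMARY KEY NOT NULL,
        Description TEXT                      -- assumed column
    );
    CREATE TABLE PURCHASE (
        Purchase_Id INTEGER PRIMARY KEY,      -- assumed column
        Item_Code   TEXT NOT NULL REFERENCES ITEM(Item_Code)
    );
    INSERT INTO ITEM VALUES ('I001', 'Pen');
    """
)
conn.execute("INSERT INTO PURCHASE VALUES (1, 'I001')")  # referenced item exists

try:
    conn.execute("INSERT INTO PURCHASE VALUES (2, 'I999')")  # no such item
    orphan_accepted = True
except sqlite3.IntegrityError:
    orphan_accepted = False

purchase_count = conn.execute("SELECT COUNT(*) FROM PURCHASE").fetchone()[0]
print(orphan_accepted, purchase_count)
```

The second purchase is refused because its Item_Code has no matching primary key value in ITEM.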
Example 2:
CD name in the track entity is a simple key linking to the CD entity; but track number is not
a simple key.
4.3. Data integrity
Any unintended change to data as the result of a storage, retrieval or processing operation,
whether caused by malicious intent, unexpected hardware failure or human error, is a failure of
data integrity. If the change is the result of unauthorized access, it may also be a failure of data
security. Depending on the data involved, the consequences range from something as benign as a
single pixel in an image appearing in a different color than was originally recorded, to the loss
of vacation pictures or a business-critical database, or even to the loss of human life in a
life-critical system.
Data integrity refers to the overall completeness, accuracy and consistency of data. Integrity
constraints are a set of data validation rules that are specified in order to restrict the data values
that can be stored in the DB. Integrity constraints help preserve the validity and
consistency of data (in the DB).
Integrity constraints ensure that data insertion, updating, and other processes are
performed in such a way that data integrity is not affected. Thus, integrity constraints
are used to guard against accidental damage to the database.
There are three types of integrity constraints: entity integrity, referential integrity and domain
integrity.
• A table can contain null values in any field other than the primary key field.
Figure 15: Table integrity example - Jackson's ID is not allowed since the primary key can't
contain a null value
Figure 16: Domain integrity - Age "A" is not allowed since "A" is not an integer
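A domain rule such as "Age must be an integer" can be sketched as a CHECK constraint. The example assumes SQLite and a hypothetical `person` table. The explicit `typeof` test is needed because an ordinary SQLite table would otherwise store the text "A" in an INTEGER column without complaint; many other DBMSs reject it on the declared type alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE person (
        name TEXT NOT NULL,
        age  INTEGER CHECK (typeof(age) = 'integer')
    )
    """
)
conn.execute("INSERT INTO person VALUES ('Peter', 25)")

try:
    conn.execute("INSERT INTO person VALUES ('Paul', 'A')")  # "A" is not an integer
    bad_age_accepted = True
except sqlite3.IntegrityError:
    bad_age_accepted = False

ages = [row[0] for row in conn.execute("SELECT age FROM person")]
print(bad_age_accepted, ages)
```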
[ Other concepts
Integrity enforcement deals with the prevention of semantic errors made by users due
to their carelessness or lack of knowledge.
• Integrity checking is the process of verifying that a given update satisfies the
constraints. If any constraint is violated, then the update is rejected. Otherwise
the update is accepted.
• Integrity maintenance is a process that also starts with a given update and the
constraints but now, if some constraint is violated, an attempt is made to find a
repair, that is, an additional set of insertions and/or deletions of facts to be
added to the update, such that the resulting update satisfies all integrity
constraints.
]
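The difference between integrity checking and integrity maintenance can be sketched without a DBMS. The example below assumes a single referential constraint (every purchased item code must exist in the set of items) and represents the data as plain Python collections. All function names are invented, and inserting the missing item is only one possible repair, chosen here for illustration.

```python
def violations(items, purchases):
    """Referential constraint: every purchased item code must exist in items."""
    return {code for code in purchases if code not in items}


def integrity_checking(items, purchases, new_purchase):
    """Accept the update only if the constraint still holds; otherwise reject it."""
    candidate = purchases + [new_purchase]
    if violations(items, candidate):
        return items, purchases, False      # update rejected, state unchanged
    return items, candidate, True


def integrity_maintenance(items, purchases, new_purchase):
    """Accept the update and add a repair (extra insertions) if one is needed."""
    candidate = purchases + [new_purchase]
    repair = violations(items, candidate)   # missing item codes to insert
    return items | repair, candidate, repair


items = {"I001"}
purchases = ["I001"]

_, checked, accepted = integrity_checking(items, purchases, "I999")
repaired_items, maintained, repair = integrity_maintenance(items, purchases, "I999")
print(accepted, checked)
print(sorted(repaired_items), maintained, repair)
```

With checking, the purchase of the unknown item "I999" is simply refused. With maintenance, the purchase is kept and "I999" is inserted into the items so that the constraint holds again.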
Advantages of a DBMS
• Reduction in Data Redundancy: Data redundancy refers to duplication of data.
• Reduction in Data Inconsistency: Data inconsistency is when different
versions of the same data appear in different places in a database (e.g. one record
showing that an employee's salary is 100000 and another record within the same database
showing that the same employee's salary is 500000). This makes the information unreliable,
because it is difficult to determine which version is correct.
• Sharing of Data: Sharing of data allows the existing applications to use the data in the
database simultaneously.
• Improvement in Data Security: A DBMS ensures that data are accessed only by
authorized users or applications. It provides security tools such as user codes and
passwords.
• Maintenance of Data Integrity: Data integrity means the consistency and accuracy of
the data in the database.
• Better Interaction with Users: A DBMS allows users who do not know programming to
interact with the data more easily (e.g. MS Access).
Disadvantages of a DBMS
• Cost of Staff Training: Most database management systems are complex
systems, so users must be trained to use the DBMS.
• Cost of Hardware and Software: A fast processor and a large memory
are required to run the DBMS software. DBMS
software can itself also be costly.
• Higher impact of a failure: The centralization of resources increases the vulnerability
of the system. Since all users and applications rely on the availability of the DBMS,
the failure of any component can bring operations to a halt.