
Chapitre I - Intro DB

Chapter I provides an overview of databases, including key terminologies, database models, and data concepts. It explains the structure of databases, such as fields, records, and tables, and introduces various database models like hierarchical, network, object-oriented, and relational. Additionally, it discusses the importance of data dictionaries and data integrity in managing databases.

Uploaded by

temmybryan74

Chapter I : Generalities on DB

Chapter I: Generalities on Database

Content
I. INTRODUCTION
II. KEY TERMINOLOGIES
2.1. Database
2.2. Field, record, table and data type
2.3. Data Dictionary
2.3.1. Active data dictionary
2.3.2. Passive data dictionary
III. DATABASE MODELS
3.1. Hierarchical Database Model
3.2. Network Database Model
3.3. Object-oriented Database Model
3.4. Relational Database
IV. DATA CONCEPTS
4.1. Logical data concepts
4.1.1. Entity
4.1.2. Attributes
4.1.3. Relationship
4.1.4. Types of Relationship
4.2. Keys
4.2.1. Primary Key
4.2.2. Candidate Key
4.2.3. Foreign Key
4.2.4. Simple Key
4.2.5. Compound Key
4.2.6. Composite Key
4.3. Data integrity
4.3.1. Entity integrity (or table integrity)
4.3.2. Referential integrity
4.3.3. Domain integrity (or column integrity)
4.3.4. User-defined integrity
4.3.5. Static and dynamic constraint
V. DATABASE MANAGEMENT SYSTEM (DBMS)
5.1. What is DBMS
5.2. Examples of DBMS
VI. Indexing in Databases


I. INTRODUCTION
Often abbreviated DB, a database is basically a collection of information organized in such a
way that a computer program can quickly select desired pieces of data. You can think of a
database as an electronic filing system. Databases may be stored on a computer and examined
using a program. These programs are often called database management systems (DBMS).
Traditional databases are organized by fields, records, and files. A field is a single piece of
information; a record is one complete set of fields; and a file is a collection of records. For
example, a telephone book (or telephone directory) is analogous to a file. It contains a list of
records, each of which consists of three fields: name, address, and telephone number. This
topic covers some database terminologies and concepts.

II. KEY TERMINOLOGIES

2.1. Database
A database is a collection of non-redundant data which can be shared by different application
systems [non-redundant here means that the data appears only once]. Although databases are
generally computerized, instances of non-computerized databases from everyday life can be
cited in abundance. A dictionary, a phone book, a collection of recipes and a TV
guide are all common examples of non-computerized databases. Examples of
computerized databases include customer files, employee rosters, book catalogues, equipment
inventories and sales transactions.

2.2. Field, record, table and data type


Within the database, data are organized into storage containers, called tables. Tables are made
up of columns and rows. In a table, columns represent individual fields and rows represent
records of data.


Figure 1: Fields and Records in a Table

Field: A field represents one related part of a table and is the smallest logical structure of
storage in a database. It holds one piece of information about an item or a subject. For
example, in a database maintaining information about students, the fields can be Name,
Surname, Birthday (see figure 1 above).

Record: A record is a collection of multiple related fields that can be treated as a unit. For
example, the fields Name, Surname and Birthday for a particular student form a record. Figure 1
contains 6 records, and each record has 3 fields.

Table: A table is a named collection of logically related records. For example, the
collection of all the student records of a school forms the student table. Depending on the DBMS,
a table can also be referred to as a file. The collection of multiple related files (tables) forms
the database.

Data Type: A data type determines the type of data that can be stored in a column. Although
many data types are available, the four most commonly used data types are Character,
Numeric, Boolean and DateTime. The exact data types available, and their names, vary widely
depending on the DBMS being used.

Data type     Character    Numeric    Boolean       DateTime
Field name    Name         Salary     Is_Married    Joining_Date
Data          Peter        450000     False (No)    02/10/98
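
As an illustration, the four data types in the table above can be declared when a table is created. The following is a minimal sketch using Python's built-in sqlite3 module; the table name employee is hypothetical, the date 02/10/98 is assumed to mean 2 October 1998, and other DBMSs name and store these types differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

conn.execute(
    """
    CREATE TABLE employee (
        name         TEXT,     -- Character
        salary       INTEGER,  -- Numeric
        is_married   BOOLEAN,  -- Boolean (SQLite stores it as 0 or 1)
        joining_date TEXT      -- DateTime, kept here as ISO-8601 text
    )
    """
)

# One record holding the sample values from the table above.
conn.execute(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    ("Peter", 450000, False, "1998-10-02"),
)

row = conn.execute(
    "SELECT name, salary, is_married, joining_date FROM employee"
).fetchone()
print(row)
```

SQLite is lenient about column types, so this only shows how the types are declared; stricter systems reject values that do not match the declared type.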


2.3. Data Dictionary


Apart from the data, the database also stores metadata, which describes the tables, columns,
indexes, constraints and other items that make up the database [constraints are rules set in
the database to guarantee the accuracy of the data in the DB. For example, a column constraint
can be set on the field Name to state that its size ranges from 10 to 50]. In simple words,
metadata is data about data.

Metadata summarizes basic information about data, which can make finding and working with
particular instances of data easier. For example, author, date created, date modified and
file size are examples of very basic document metadata. Having the ability to filter through
that metadata makes it much easier for someone to locate a specific document.

Figure 2: Metadata example

This metadata is stored in an area called the data dictionary. Hence, a data dictionary
defines the basic organization of a database. It is a centralized repository of metadata [it can
also be defined as an inventory of the data elements in a database or data model, with a detailed
description of their format, relationships, meaning, source and usage].
• Programmers may use it to ensure that they have the names and coding of the data
items or segments correct in their programs.
• Managers may use it as a guide to decide what data could be made available to them.

The data dictionary contains important information, such as what files [tables] are in the
database and descriptions (called attributes) of the data contained in the files [descriptions of
fields]. It also records who is allowed to access the database and where the database is
physically stored. Users of the database normally do not interact with the data dictionary; it is
handled only by the database administrators.
Information stored in the data dictionary could normally be expected to include:
- The names of fields contained in all of the organization’s databases
- What table(s) each field exists in
- What database(s) each field exists in
- The data types, e.g., integer, real, character, and image of all fields in the
organization’s databases
- The sizes, e.g., LONG INT, DOUBLE, and CHAR(64), of all fields in the
organization’s databases
- An explanation of what each database field means
- The source of the data for each database field
- A list of applications that reference each database field
- The relationship between fields in all of the organization’s databases
- Default values that exist for all fields in all of the organization’s databases
- Who has access to each field
- Details about all the tables in the database, such as their owners, their security
constraints, when they were created etc.
- Physical information about the tables such as where they are stored and how.
- Table constraints such as primary key attributes, foreign key information etc.
- Information about the database views that are visible.
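
Several of the items listed above can be read back from a real DBMS. The sketch below assumes SQLite (through Python's sqlite3 module), whose built-in catalog is the sqlite_master table and the PRAGMA table_info command; the learner table and its columns are hypothetical, and other DBMSs expose their dictionary through different system tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE learner ("
    " learner_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " birthday TEXT DEFAULT 'unknown')"
)

# Which tables exist? SQLite keeps this in its built-in sqlite_master catalog.
tables = [
    r[0]
    for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# Column-level metadata: each row is (cid, name, type, notnull, dflt_value, pk).
columns = conn.execute("PRAGMA table_info(learner)").fetchall()

print("tables:", tables)
for cid, name, col_type, notnull, default, pk in columns:
    print(name, col_type, "NOT NULL" if notnull else "", default, "PK" if pk else "")
```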

Example 1: Data dictionary of the table that contains learner details


Figure 3: Example of data dictionary


Note: In Figure 3, the data dictionary describes only one table. It is just a simplified example. In
a real situation there are more tables in the dictionary.

Example 2: This is a data dictionary describing a table that contains employee details.
Field Name        Data Type    Field Size for display    Description                   Example
Employee Number   Integer      10                        Unique ID of each employee    1645000001
Name              Text         20                        Name of the employee          David Heston
Date of Birth     Date/Time    10                        DOB of employee               08/03/1995
Phone Number      Integer      10                        Phone number of employee      6583648648


Example 3: Data dictionary of the “student” table as shown on a phpMyAdmin page.

Data dictionaries can be classified into two main categories:

→ Active data dictionary - is part of and managed by the DBMS.
→ Passive data dictionary - is not part of the DBMS.

2.3.1. Active data dictionary


Every change in the database structure (made using DDL - Data Definition Language) is automatically
reflected in an active data dictionary. In other words, the data dictionary is updated by the
database management system itself whenever any changes are made in the database.
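
The behaviour described above can be observed directly. This is a small sketch assuming SQLite via Python's sqlite3 module, with a hypothetical student table: a DDL statement adds a column, and the DBMS's own catalog shows it without any manual step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def column_names(table):
    # Ask the DBMS's own catalog which columns the table currently has.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


conn.execute("CREATE TABLE student (name TEXT, surname TEXT)")
before = column_names("student")

# A DDL statement changes the structure...
conn.execute("ALTER TABLE student ADD COLUMN birthday TEXT")

# ...and the catalog reflects it with no manual update.
after = column_names("student")

print(before)
print(after)
```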

2.3.2. Passive data dictionary


A passive data dictionary is maintained separately from the database it describes. If the
database structure is modified, the dictionary is not updated automatically, as it would be with
an active data dictionary; every change has to be applied to it manually or with dedicated
software. This makes it less convenient than an active data dictionary, and it needs careful
handling, or else the database and the data dictionary fall out of sync.

A passive data dictionary can take different forms:

1. A document or spreadsheet
2. Tools
   o Data catalogs
   o Data integration/ETL (Extract, Transform and Load) metadata repositories
   o Data modeling tools
3. Custom implementations

[APPLICATION: Universities and schools: registration. Banks: all transactions, customers’
accounts. Airlines: reservations. Sales: customers, products, purchases. Manufacturing:
inventory, orders, production, supply chain. Human resources: employees’ records, salaries,
tax deductions]

III. DATABASE MODELS


A database model, or simply a data model, is an abstract model that describes how the data
are organized and represented. Every database and DBMS is based on a particular database
model. There are four basic types of database models: hierarchical, network, relational and
object-oriented.

3.1. Hierarchical Database Model


The hierarchical data model is the oldest type of data model, developed by IBM in 1968. This
data model organizes the data in a tree-like structure in which each child node can have only
one parent node. In other words, a hierarchical database is a collection of records connected to
one another through links. The top of the tree structure consists of a single node that does not
have any parent and is called the root node.
Advantage: retrieval and updates can be highly optimized by a DBMS.
Drawback: insertion is awkward (a child record cannot be stored without a parent), and data that
belongs under several parents has to be duplicated, which creates redundancy.
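
The tree structure described above can be sketched in a few lines of Python. This is only an in-memory illustration, not how a hierarchical DBMS stores data; the Node class and the school example are hypothetical.

```python
class Node:
    """A record in a hierarchical database: one parent, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # exactly one parent (None for the root)
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Follow parent links from this record up to the root."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return "/".join(reversed(names))


school = Node("School")  # root node: it has no parent
science = Node("Science", school)
arts = Node("Arts", school)
physics = Node("Physics", science)

print(physics.path())
```

Because every record has a single parent, there is exactly one path from the root to any record, which is what makes retrieval along that path easy to optimize.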


3.2. Network Database Model


In a network model, the data are represented by a collection of records and the relationships
among data are represented by links. A link is an association between two records.
The main limitation of the network data model is that it can be quite complicated to maintain
all the links and a single broken link can lead to problems in the database. In addition, since
there are no restrictions on the number of links, the database design can become complex.
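
To make the idea of records and links concrete, here is a small Python sketch with hypothetical student and course records. It also shows how a single link that points to a missing record can be detected, which is the maintenance problem mentioned above.

```python
# Records stored by identifier; all values are hypothetical.
records = {
    "S1": {"type": "student", "name": "Alice"},
    "C1": {"type": "course", "title": "Databases"},
    "C2": {"type": "course", "title": "Networks"},
}

# A link is an association between two records.
links = [("S1", "C1"), ("S1", "C2"), ("S1", "C9")]  # "C9" does not exist


def linked_to(record_id):
    """Return the identifiers reachable from record_id through one link."""
    return [b for a, b in links if a == record_id]


def broken_links():
    """Return the links whose endpoint is missing from the record store."""
    return [(a, b) for a, b in links if a not in records or b not in records]


print(linked_to("S1"))
print(broken_links())
```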

3.3. Object-oriented Database Model


The object-oriented model is a relatively new data model and indicates the direction of
future database models. An object-oriented database stores and maintains objects. An object
is an item that can contain both the data and the procedures that manipulate the data. For
example, a student object might contain not only data about a student's name, roll number and
address, but also procedures on some tasks such as printing the student record or calculating
the student's tuition fees.
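
The student object described above could look as follows in Python. This sketch only shows data and procedures living together in one object; it does not store anything in a database, and the credits field and the fee rate are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Student:
    # Data held by the object.
    name: str
    roll_number: int
    address: str
    credits: int

    FEE_PER_CREDIT = 5000  # hypothetical rate, shared by all students

    # Procedures that manipulate the data.
    def record(self):
        return f"{self.roll_number}: {self.name}, {self.address}"

    def tuition_fees(self):
        return self.credits * self.FEE_PER_CREDIT


peter = Student(name="Peter", roll_number=17, address="12 Main Street", credits=30)
print(peter.record())
print(peter.tuition_fees())
```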

3.4. Relational Database


The relational data model represents the database as a collection of simple two-dimensional
tables, also called relations. The rows of a relation are referred to as tuples and the
columns are referred to as attributes. A relationship between two relations is
implemented through a common attribute in the relations [a foreign key] and not by physical
links or pointers.

Figure 4: A relational database
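
A minimal sketch of two relations linked through a common attribute, assuming SQLite via Python's sqlite3 module. The department and employee tables and their contents are hypothetical; the point is that the join matches rows on the shared dept_id value rather than following pointers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT,
        dept_id  INTEGER  -- common attribute shared with department
    );
    INSERT INTO department VALUES (1, 'Sales'), (2, 'Research');
    INSERT INTO employee VALUES (10, 'Peter', 1), (11, 'David', 2), (12, 'Mary', 1);
    """
)

# Match tuples of the two relations on the common attribute.
rows = conn.execute(
    """
    SELECT e.emp_name, d.dept_name
    FROM employee AS e
    JOIN department AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
    """
).fetchall()
print(rows)
```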


IV. DATA CONCEPTS


4.1. Logical data concepts:
Once the requirements of the user have been specified, the next step is to construct an abstract
or conceptual model of a DB based on the requirements of the user. The conceptual model
represents various pieces of data and their relationships at a very high level of abstraction.
The conceptual model can be represented using Entity-Relationship model (E-R model). The
E-R model views the real world as a set of basic objects (known as entities), their
characteristics (known as attributes) and associations among these objects (known as
relationships).

4.1.1. Entity
An entity is any object in the system that we want to model and store information about.
Entities are usually recognizable concepts, either concrete or abstract, such as persons, places,
things, or events which are relevant to the database. Some specific examples of entities are
Employee, Student, and Lecturer. An entity is analogous to a table in the relational model.
An entity occurrence is an instance of an entity. For example, in the Student entity, the
details of each individual student form an entity occurrence. An entity
occurrence can also be referred to as a record. By convention, entities are represented by
rectangles.

4.1.2. Attributes
An attribute is an item of information which is stored about an entity. For example, the entity
'lecturer' could have attributes such as staff id, surname, telephone number, etc. By
convention, an attribute is represented by an oval (ellipse) linked to the corresponding entity
(Figure 5).


Figure 5: Attributes

4.1.3. Relationship
A relationship is an association, dependency or link between two or more entities and is
represented by a diamond symbol. A relationship describes how two or more entities are
related to each other. For example, the relationship Buys (shown in Figure 6) associates the
CUSTOMER entity with the ITEMS entity.

Figure 6: Entities, Attributes and Relationship

4.1.4. Types of Relationship


The most commonly encountered relationships are binary, i.e. they involve exactly two entities.
Binary relationships are of three types, distinguished by their cardinality: one-to-one,
one-to-many and many-to-many. (This topic will be studied in depth with MERISE.)

Relationships symbols:


Figure 7: General representation of a relationship

a. One-to-one Relationship (1:1)


There is a one-to-one (1:1) relationship if one occurrence of an entity relates to only one
occurrence in another entity; i.e. one entity from entity set X can be associated with at most
one entity of entity set Y and vice versa.

Example 1: If a man marries only one woman and a woman marries only one man, it is a
one-to-one (1:1) relationship.


Figure 8: One-To-One Relationship

Example 2: A car has only one engine. An engine is installed only on one car.

Example 3: In a company, a computer can be assigned to one employee. Conversely, an
employee is assigned to one computer.

b. One-to-many Relationship (1:M)


A one-to-many relationship occurs when one occurrence in an entity relates to many
occurrences in another entity. One entity from entity set X can be associated with multiple
entities of entity set Y, but an entity from entity set Y can be associated with at most one entity
of entity set X.

Example 1: A manager manages many employees, but each employee has only one manager.

Figure 9: One-to-Many
The crow's foot represents the "many" occurrences.

Example 2: Provide the association names and explain the model.


Example 3: One class consists of multiple students.

c. Many-to-many Relationship (M:M)


There is a many-to-many relationship when a record in a table can be related to one or more
records in a second table, and one or more records in the second table can be related to one or
more records in the first table. One entity from X can be associated with more than one entity
from Y and vice versa.
Example 1: a teacher teaches many students and a student is taught by many teachers.

Figure 10: Many-to-Many relationship


Example 2:
Students as a group are associated with multiple faculty members, and faculty members can
be associated with multiple students.


Figure 11: Many-To-Many
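
In a relational database, a many-to-many relationship is commonly stored in a third table that holds one row per related pair. The sketch below assumes SQLite via Python's sqlite3 module and uses the teacher and student example; the table name teaches and the sample names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE teacher (teacher_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);

    -- Linking table: one row per (teacher, student) pair.
    CREATE TABLE teaches (
        teacher_id INTEGER REFERENCES teacher(teacher_id),
        student_id INTEGER REFERENCES student(student_id),
        PRIMARY KEY (teacher_id, student_id)
    );

    INSERT INTO teacher VALUES (1, 'Mr. A'), (2, 'Ms. B');
    INSERT INTO student VALUES (1, 'Peter'), (2, 'Mary');
    INSERT INTO teaches VALUES (1, 1), (1, 2), (2, 1);
    """
)


def students_of(teacher_id):
    """Names of all students taught by the given teacher."""
    rows = conn.execute(
        "SELECT s.name FROM teaches AS t"
        " JOIN student AS s ON s.student_id = t.student_id"
        " WHERE t.teacher_id = ? ORDER BY s.student_id",
        (teacher_id,),
    )
    return [r[0] for r in rows]


def teachers_of(student_id):
    """Names of all teachers who teach the given student."""
    rows = conn.execute(
        "SELECT te.name FROM teaches AS t"
        " JOIN teacher AS te ON te.teacher_id = t.teacher_id"
        " WHERE t.student_id = ? ORDER BY te.teacher_id",
        (student_id,),
    )
    return [r[0] for r in rows]


print(students_of(1))  # one teacher, many students
print(teachers_of(1))  # one student, many teachers
```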

d. Complete examples

e. Short summary:
Relationship                             Example                        left          right
one-to-one                               person ←→ birth certificate    1             1
one-to-one (optional on one side)        person ←→ driving license      1             0..1 or ?
many-to-one                              person ←→ birth place          1..* or +     1
many-to-many (optional on both sides)    person ←→ book                 0..* or *     0..* or *
one-to-many                              order ←→ line item             1             1..* or +
many-to-many                             course ←→ student              1..* or +     1..* or +

4.1.5. Questions:
1. Add the cardinalities to the following ERM (Consider the context of ISPA)

2. Correct the following diagrams (Consider the context of ISPA)


(This last diagram is a more modern representation. IDs should be underlined)

4.2. Keys
A key is a data item that allows us to uniquely identify individual occurrences of an entity.
There are many types of keys:
4.2.1. Primary Key
A field or a set of fields that uniquely identifies each record in a table is known as a primary
key. This implies that no two records in the relation can have the same value for the primary key.
For example, student number is a primary key as it uniquely identifies a student (within a
college’s student record system). An employee number uniquely identifies a member of staff
within a company. An IP address uniquely addresses a PC on the internet.
A primary key is mandatory. That is, each entity occurrence must have a value for its primary
key. By convention the field that represents the primary key is underlined (figure 2).
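
The uniqueness rule can be checked with a short experiment. This sketch assumes SQLite via Python's sqlite3 module and a hypothetical student table: the second insert reuses an existing student number, so the DBMS rejects it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1001, 'Peter')")

try:
    # Same primary key value as an existing record.
    conn.execute("INSERT INTO student VALUES (1001, 'Mary')")
    duplicate_accepted = True
except sqlite3.IntegrityError as exc:
    duplicate_accepted = False
    print("rejected:", exc)

count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(count)
```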

4.2.2. Candidate Key


In a table, there may be more than one field that can uniquely identify each record. All such
fields are known as candidate keys. One of these candidate keys is chosen as a primary key;
the other keys that are not chosen as primary key are known as alternate keys or secondary
keys.

4.2.3. Foreign Key


A field of a table that references the primary key of another table is a foreign key.

Example 1: The figure below illustrates how a foreign key constraint is related to a primary
key constraint. Here, the field Item_Code in the PURCHASE table references the field
Item_Code in the ITEM relation. Thus, the attribute Item_Code in the PURCHASE relation
is the foreign key.


Figure 12: Foreign Key

Example 2:

4.2.4. Simple Key


Any of the keys described before (i.e.: primary, secondary or foreign) may have one or more
attributes. A simple key consists of a single attribute to uniquely identify an entity occurrence,
for example, a student number, which uniquely identifies a particular student. No two students
would have the same student number.

4.2.5. Compound Key


A compound key consists of more than one attribute to uniquely identify an entity occurrence.
Each attribute, which makes up the key, is also a simple key in its own right.


Figure 13: A compound key

4.2.6. Composite Key


A composite key consists of more than one attribute to uniquely identify an entity occurrence.
This differs from a compound key in that one or more of the attributes which make up the
key are not simple keys in their own right.

Figure 14: Composite key

CD name in the track entity is a simple key linking to the CD entity; but track number is not
a simple key.
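
The CD and track example can be sketched as follows, assuming SQLite via Python's sqlite3 module. The column names and sample rows are hypothetical; the key of the track table is the pair (cd_name, track_number), so a track number may repeat across CDs but not within one CD.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE cd (cd_name TEXT PRIMARY KEY);
    CREATE TABLE track (
        cd_name      TEXT REFERENCES cd(cd_name),  -- simple key of the CD entity
        track_number INTEGER,                      -- not a key on its own
        title        TEXT,
        PRIMARY KEY (cd_name, track_number)
    );
    INSERT INTO cd VALUES ('Album A'), ('Album B');
    INSERT INTO track VALUES ('Album A', 1, 'Intro');
    INSERT INTO track VALUES ('Album B', 1, 'Opening');
    """
)

try:
    # Track 1 of 'Album A' already exists, so the pair is not unique.
    conn.execute("INSERT INTO track VALUES ('Album A', 1, 'Duplicate')")
    repeated_pair_accepted = True
except sqlite3.IntegrityError:
    repeated_pair_accepted = False

track_count = conn.execute("SELECT COUNT(*) FROM track").fetchone()[0]
print(repeated_pair_accepted, track_count)
```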

4.3. Data integrity


Data integrity is the maintenance and the assurance of the accuracy and consistency of data
over its entire life-cycle. It is a critical aspect to the design, implementation and usage of any
system which stores, processes, or retrieves data. The term is broad in scope and may have
widely different meanings depending on the specific context – even under the same general
umbrella of computing. It is at times used as a proxy term for data quality, while data
validation is a pre-requisite for data integrity. Data integrity is the opposite of data corruption.
The overall intent of any data integrity technique is the same: ensure data is recorded exactly
as intended (such as a database correctly rejecting mutually exclusive possibilities) and, upon
later retrieval, ensure the data is the same as it was when it was originally recorded. In short,
data integrity aims to prevent unintentional changes to information. Data integrity is not to be
confused with data security, the discipline of protecting data from unauthorized parties.


Any unintended change to data as the result of a storage, retrieval or processing operation,
including malicious intent, unexpected hardware failure, and human error, is a failure of data
integrity. If the changes are the result of unauthorized access, it may also be a failure of data
security. Depending on the data involved, this could manifest itself as something as benign as a
single pixel in an image appearing a different color than was originally recorded, as the loss of
vacation pictures or a business-critical database, or even as catastrophic loss of human life in a
life-critical system.

Data integrity refers to the overall completeness, accuracy and consistency of data. Integrity
constraints are a set of data validation rules that are specified in order to restrict the data
values that can be stored in the DB. Integrity constraints help in preserving the validity and
consistency of data (in the DB).

Integrity constraints ensure that data insertion, updating and other operations are
performed in such a way that data integrity is not affected. Thus, integrity constraints
are used to guard against accidental damage to the database.
There are three main types of integrity constraints: entity integrity, referential integrity and
domain integrity.

4.3.1. Entity integrity (or table integrity)


It states that each table must have a primary key and that the primary key should be unique
and not null.
• The entity integrity constraint states that a primary key value can't be null.
• This is because the primary key value is used to identify individual rows in a relation;
if the primary key has a null value, then we can't identify those rows.
• A table can contain null values in fields other than the primary key field.

Figure 15: Table integrity example - Jackson's ID is not allowed since the primary key can't
contain a null value

4.3.2. Referential integrity


It is concerned with the concept of a foreign key. It states that a foreign key value can be in
one of two states: it can refer to a primary key value of some table, or it can be null. A
referential integrity constraint is specified between two tables.
Under a referential integrity constraint, if a foreign key in Table 1 refers to the primary
key of Table 2, then every value of the foreign key in Table 1 must either be null or be
present in Table 2.
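
The two allowed states of a foreign key value can be tried out with a short sketch, assuming SQLite via Python's sqlite3 module. The item and purchase tables echo the ITEM and PURCHASE relations used earlier, but their columns and the sample codes are hypothetical. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE item (item_code TEXT PRIMARY KEY, description TEXT)")
conn.execute(
    "CREATE TABLE purchase ("
    " purchase_id INTEGER PRIMARY KEY,"
    " item_code TEXT REFERENCES item(item_code))"
)
conn.execute("INSERT INTO item VALUES ('I01', 'Pen')")


def try_purchase(purchase_id, item_code):
    """Return True if the DBMS accepts the row, False if it rejects it."""
    try:
        conn.execute("INSERT INTO purchase VALUES (?, ?)", (purchase_id, item_code))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_purchase(1, "I01"),  # refers to an existing item
    try_purchase(2, None),   # null foreign key
    try_purchase(3, "I99"),  # refers to an item that does not exist
]
print(results)
```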


4.3.3. Domain integrity (or column integrity)


It specifies that every value in a column must belong to the domain defined for that column.
• A domain constraint can be defined as the definition of a valid set of values for an
attribute.
• The data type of a domain can be string, character, integer, time, date, currency, etc.
The value of the attribute must be available in the corresponding domain.

Figure 16: Domain integrity - Age "A" is not allowed since "A" is not an integer
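
A domain can be expressed as a CHECK constraint on the column. The sketch below assumes SQLite via Python's sqlite3 module; the person table is hypothetical, and the range 0 to 150 is an assumed domain for Age, not one given in these notes. Because SQLite is lenient about column types, the check tests the stored type explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    " name TEXT,"
    " age INTEGER CHECK (typeof(age) = 'integer' AND age BETWEEN 0 AND 150))"
)


def try_insert(name, age):
    """Return True if the value lies in the column's domain, False otherwise."""
    try:
        conn.execute("INSERT INTO person VALUES (?, ?)", (name, age))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("Peter", 20),   # an integer inside the range
    try_insert("Mary", "A"),   # not an integer
    try_insert("Paul", -5),    # an integer outside the range
]
print(results)
```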

4.3.4. User-defined integrity


User-defined integrity refers to a set of rules specified by a user, which do not belong to the
entity, domain and referential integrity categories.

4.3.5. Static and dynamic constraint


Constraints can also be distinguished as static (or state) and dynamic integrity constraints.
A static integrity constraint expresses properties that must hold in every state of the
database. It depends only on the current state, independently of any
previous states of the database. Example: “an employee’s salary must be less than
his manager’s”.

A dynamic constraint expresses conditions over a (usually time-ordered)
sequence of two or more database states, e.g. the condition "an employee's salary
must never decrease". A particular case of dynamic constraints is the
transition constraint, which imposes restrictions on pairs of states:
the states before and after a transaction.
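
One way to enforce a transition constraint such as "an employee's salary must never decrease" is a trigger that compares the old and new values of a row. This is a sketch assuming SQLite via Python's sqlite3 module; the employee table, the trigger name and the amounts are hypothetical, and other DBMSs use a different trigger syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, salary INTEGER);
    INSERT INTO employee VALUES (1, 100000);

    -- Compare the state before (OLD) and after (NEW) the update.
    CREATE TRIGGER salary_never_decreases
    BEFORE UPDATE OF salary ON employee
    WHEN NEW.salary < OLD.salary
    BEGIN
        SELECT RAISE(ABORT, 'salary must never decrease');
    END;
    """
)


def set_salary(new_salary):
    """Return True if the update is accepted, False if the trigger rejects it."""
    try:
        conn.execute(
            "UPDATE employee SET salary = ? WHERE emp_id = 1", (new_salary,)
        )
        return True
    except sqlite3.DatabaseError:
        return False


raise_ok = set_salary(120000)  # an increase is allowed
cut_ok = set_salary(90000)     # a decrease is rejected
final_salary = conn.execute(
    "SELECT salary FROM employee WHERE emp_id = 1"
).fetchone()[0]
print(raise_ok, cut_ok, final_salary)
```

A static constraint, by contrast, needs only the current row values and can usually be written as a CHECK constraint.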

[Other concepts
Integrity enforcement deals with the prevention of semantic errors made by users due
to carelessness or lack of knowledge.
• Integrity checking is the process of verifying that a given update satisfies the
constraints. If any constraint is violated, the update is rejected; otherwise
the update is accepted.
• Integrity maintenance is a process that also starts with a given update and the
constraints but, if some constraint is violated, attempts to find a
repair, that is, an additional set of insertions and/or deletions of facts to be
added to the update, such that the resulting update satisfies all integrity
constraints.
]

V. DATABASE MANAGEMENT SYSTEM (DBMS)

5.1. What is DBMS


To carry out operations like insertion, deletion and retrieval, the database needs to be managed
by a software package. This software is called a database management system (DBMS).
A DBMS can be defined as a collection of interrelated data and a set of programs to access that
data. The DBMS is responsible for the following tasks: data definition, data retrieval, data
maintenance and data control.

Figure 17: A database system


Advantages of a DBMS
• Reduction in data redundancy: Data redundancy refers to duplication of data.
• Reduction in data inconsistency: Data inconsistency is when different
versions of the same data appear in different places in a database (e.g. one record
showing that an employee's salary is 100000 and another record within the same database
showing that the same employee's salary is 500000). This makes the information
unreliable, because it is difficult to determine which version is correct.
• Sharing of data: Sharing of data allows the existing applications to use the data in the
database simultaneously.
• Improvement in data security: The DBMS ensures that data are accessed only by
authorized users or applications. The DBMS provides security tools such as user codes and
passwords.
• Maintenance of data integrity: Data integrity means the consistency and accuracy of
the data in the database.
• Better interaction with users: It allows users who do not know programming to
interact with the data more easily (e.g. MS Access).

Disadvantages of a DBMS
• Cost of staff training: Most database management systems are complex,
so users need training before they can use the DBMS.
• Cost of hardware and software: A fast processor and a large memory are required
to run the DBMS software. Similarly, DBMS software itself can be very costly.
• Higher impact of a failure: The centralization of resources increases the vulnerability
of the system. Since all users and applications rely on the availability of the DBMS,
the failure of any component can bring operations to a halt.

5.2. Examples of DBMS


Some database management systems are: MS Access, MySQL, Oracle, PostgreSQL,
Microsoft SQL Server, Derby…


VI. Indexing in Databases


Indexing is a way to optimize the performance of a database by minimizing the number of
disk accesses required when a query is processed. It is a data structure technique which is used
to quickly locate and access the data in a database.
Indexes are created using a few database columns.
• The first column is the Search key that contains a copy of the primary key or candidate
key of the table. These values are stored in sorted order so that the corresponding data
can be accessed quickly. Note: The data may or may not be stored in sorted order.
• The second column is the Data Reference or Pointer which contains a set of pointers
holding the address of the disk block where that particular key value can be found.
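
The two columns of an index can be sketched in plain Python. This is only an in-memory illustration under stated assumptions: the dictionary blocks stands in for disk blocks, its contents are hypothetical, and a real DBMS would typically use a structure such as a B-tree rather than a sorted list.

```python
import bisect

# Hypothetical data file: rows sit in disk blocks in arrival order, not sorted by key.
blocks = {
    0: [(42, "Peter"), (7, "Mary")],
    1: [(19, "David"), (3, "Paul")],
}

# Index: (search key, block address) pairs kept sorted by search key.
index = sorted(
    (key, address) for address, rows in blocks.items() for key, _ in rows
)
search_keys = [key for key, _ in index]


def lookup(key):
    """Binary-search the index, then read only the block the pointer names."""
    pos = bisect.bisect_left(search_keys, key)
    if pos == len(search_keys) or search_keys[pos] != key:
        return None  # key is not in the index
    address = index[pos][1]
    return next(value for k, value in blocks[address] if k == key)


print(search_keys)
print(lookup(19))
print(lookup(8))
```

The data blocks stay unsorted; only the index is ordered, so a lookup touches one block instead of scanning all of them.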


An indexing method has various attributes:

• Access Types: This refers to the type of access, such as value-based search, range
access, etc.
• Access Time: It refers to the time needed to find a particular data element or set of
elements.
• Insertion Time: It refers to the time taken to find the appropriate space and insert
new data.
• Deletion Time: It refers to the time taken to find an item and delete it, as well as to update
the index structure.
• Space Overhead: It refers to the additional space required by the index.
