
UNIT2

Uploaded by

sujana

The Relational Model Concepts:

The relational model represents the database as a collection of relations. Informally, each
relation resembles a table of values or, to some extent, a flat file of records. It is called a flat
file because each record has a simple linear or flat structure.
When a relation is thought of as a table of values, each row in the table represents a collection
of related data values. A row represents a fact that typically corresponds to a real-world entity
or relationship. The table name and column names are used to help to interpret the meaning
of the values in each row.
Example: In the STUDENT relation, each row represents facts about a particular student
entity. The column names Name, Student_number, Class, and Major specify how to interpret
the data values in each row, based on the column each value is in. All values in a column are
of the same data type.
In the formal relational model terminology, a row is called a tuple, a column header is called
an attribute, and the table is called a relation. The data type describing the types of values that
can appear in each column is represented by a domain of possible values.
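
The terminology above can be mirrored in a small Python sketch. The `Student` type and the sample rows are hypothetical illustrations, not data from the original STUDENT figure. A relation is modelled as a set of tuples, and each attribute is a named field.

```python
from typing import NamedTuple

class Student(NamedTuple):
    """One tuple (row) of a hypothetical STUDENT relation."""
    Name: str
    Student_number: int
    Class: int
    Major: str

# A relation is a set of tuples, so adding the same row twice has no effect.
STUDENT = {
    Student("Smith", 17, 1, "CS"),
    Student("Brown", 8, 2, "CS"),
    Student("Smith", 17, 1, "CS"),  # duplicate, collapsed by the set
}

attributes = Student._fields   # the column headers (attributes)
degree = len(attributes)       # number of attributes
cardinality = len(STUDENT)     # number of tuples in this relation state
```

Using a Python `set` is a design choice that makes the "no duplicate tuples" property hold automatically.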

Codd’s rules:

Codd's Rules, also known as Codd's 12 Rules, were formulated by Edgar F. Codd, the father of the
relational model. These rules define what is required from a database management system (DBMS)
to be considered truly relational. Here's a concise overview of Codd's 12 Rules:

1. Information Rule: All information in a relational database must be represented explicitly at
the logical level in exactly one way - by values in tables.
2. Guaranteed Access Rule: Each piece of data must be logically accessible by using a
combination of table name, primary key value, and column name.
3. Systematic Treatment of Null Values: Null values (distinct from empty string or zero) must
be supported for representing missing information and inapplicable information in a
systematic way.
4. Active Online Catalog: The database description must be stored in an online catalog, known
as the data dictionary, that can be accessed by authorized users.
5. Comprehensive Data Sublanguage Rule: The system must support at least one relational
language that has a linear syntax, can be used interactively and in application programs, and
supports data definition, manipulation, integrity constraints, authorization, and transaction
management.
6. View Updating Rule: All theoretically updatable views must be updatable by the system.
7. High-Level Insert, Update, and Delete: The system must support set-level insert, update, and
delete operations.
8. Physical Data Independence: Changes to the physical level (how the data is stored) must not
require changes to an application based on the structure.
9. Logical Data Independence: Changes to the logical level (tables, columns) must not require
changes to an application based on the structure.
10. Integrity Independence: Integrity constraints must be specified separately from application
programs and stored in the catalog.
11. Distribution Independence: The distribution of portions of the database to various locations
should be invisible to users of the database.
12. Non-Subversion Rule: If the system provides a low-level (record-at-a-time) interface, it
must not allow subverting or bypassing the integrity rules and constraints.

These rules serve as a guideline for evaluating the "relational-ness" of a DBMS. It's worth noting
that while these rules are important, few (if any) commercial systems fully comply with all of them.
They remain more of an ideal to strive for rather than a strict set of requirements.
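
Two of the rules can be observed in a small system. The sketch below uses SQLite through Python's `sqlite3` module as an assumed example DBMS; the `employee` table and its rows are hypothetical. It shows the catalog being queried with ordinary SQL (in the spirit of Rule 4) and a single statement updating a whole set of rows (in the spirit of Rule 7). It does not claim that SQLite satisfies all twelve rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (eid INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employee (eid, name, salary) VALUES (?, ?, ?)",
    [(1, "Asha", 1000.0), (2, "Ravi", 2000.0), (3, "Mina", 3000.0)],
)

# Rule 4 flavour: the schema description is itself stored in a table
# (sqlite_master in SQLite) and is read with a normal SELECT.
catalog_tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# Rule 7 flavour: one statement updates every qualifying row at once.
cursor = conn.execute("UPDATE employee SET salary = salary * 2 WHERE salary < 2500")
rows_updated = cursor.rowcount
salaries = [row[0] for row in conn.execute("SELECT salary FROM employee ORDER BY eid")]
```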

Domains, Attributes, Tuples, and Relations

Domain:
A domain D is a set of atomic values. By atomic we mean that each value in the domain is
indivisible as far as the formal relational model is concerned. A common method of specifying
a domain is to specify a data type from which the data values forming the domain are drawn.
It is also useful to specify the name for the domain, to help in interpreting its values.
Some examples of domains follow:
• Usa_phone_numbers: The set of ten-digit phone numbers valid in the United States.
• Social_security_numbers: The set of valid nine-digit social security numbers.
• Names: The set of character strings that represents the names of persons.
• Employee_ages: Possible ages of employees in a company; each must be an
integer value between 15 and 80.
The preceding are called logical definitions of domains. A data type or format is also
specified for each domain. For example, the data type for the domain Usa_phone_numbers
can be declared as a character string of the form (ddd)ddd-dddd, where each d is a numeric
(decimal) digit and the first three digits form a valid telephone area code. The data type for
Employee_ages is an integer number between 15 and 80.
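
A domain can be thought of as a membership test on values. The following sketch is one way to express two of the domains above in Python. The function names are hypothetical, and the phone format with a hyphen, (ddd)ddd-dddd, is an assumption. Checking that the area code is actually valid would need a lookup table, which is left out.

```python
import re

# Assumed format: (ddd)ddd-dddd. Area-code validity is not checked here.
_PHONE_FORMAT = re.compile(r"\(\d{3}\)\d{3}-\d{4}")

def in_usa_phone_numbers(value: str) -> bool:
    """Format check for the Usa_phone_numbers domain."""
    return _PHONE_FORMAT.fullmatch(value) is not None

def in_employee_ages(value: object) -> bool:
    """Employee_ages: an integer between 15 and 80 inclusive."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and 15 <= value <= 80
```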
Attribute:
An attribute Ai is the name of a role played by some domain D in the relation schema R.
D is called the domain of Ai and is denoted by dom(Ai).
Tuple:
A tuple is a mapping from attributes to values drawn from the respective domains of those
attributes. Tuples are intended to describe some entity (or relationship between entities) in
the miniworld.
Example: a tuple for a PERSON entity might be
{ Name --> "smith", Gender --> Male, Age --> 25 }
Relation:
A named set of tuples all of the same form i.e., having the same set of attributes.
Relation schema:

Constraints (Domain, Key constraints, integrity constraints):

1.1 Relational Model Constraints and Relational Database Schemas:


Constraints are restrictions on the actual values in a database state. These constraints
are derived from the rules in the miniworld that the database represents. Constraints on
databases can generally be divided into three main categories:

1) Inherent model-based constraints or implicit constraints
• Constraints that are inherent in the data model.
• The characteristics of relations are the inherent constraints of the relational
model and belong to the first category. For example, the constraint that a
relation cannot have duplicate tuples is an inherent constraint.
2) Schema-based constraints or explicit constraints
• Constraints that can be directly expressed in schemas of the data model,
typically by specifying them in the DDL.
• The schema-based constraints include domain constraints, key constraints,
constraints on NULLs, entity integrity constraints, and referential integrity
constraints.
3) Application-based or semantic constraints or business rules
• Constraints that cannot be directly expressed in the schemas of the data
model, and hence must be expressed and enforced by the application
programs.
• Examples of such constraints are that the salary of an employee should not
exceed the salary of the employee's supervisor, and that the maximum number of
hours an employee can work on all projects per week is 56.
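
The two application-based rules quoted above could be enforced in application code along the following lines. This is a minimal sketch: the `Employee` class and both function names are hypothetical, and a real application would run such checks before writing to the database.

```python
from dataclasses import dataclass
from typing import Optional

MAX_WEEKLY_HOURS = 56  # limit quoted in the text

@dataclass
class Employee:
    name: str
    salary: float
    supervisor: Optional["Employee"] = None

def salary_rule_ok(emp: Employee) -> bool:
    """An employee's salary must not exceed the supervisor's salary."""
    return emp.supervisor is None or emp.salary <= emp.supervisor.salary

def hours_rule_ok(hours_per_project: dict) -> bool:
    """Total weekly hours over all projects must not exceed the limit."""
    return sum(hours_per_project.values()) <= MAX_WEEKLY_HOURS
```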

DATABASE MANAGEMENT SYSTEMS PVP20 UNIT-2

1.1.1 Domain Constraints

Domain constraints specify that within each tuple, the value of each attribute A must be an
atomic value from the domain dom(A). The data types associated with domains typically include
standard numeric data types for integers (such as short integer, integer, and long integer) and
real numbers (float and double-precision float). Characters, Booleans, fixed-length
strings, and variable-length strings are also available, as are date, time, timestamp, and
money, or other special data types.
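
A domain constraint is typically declared in the DDL. The sketch below uses SQLite through Python's `sqlite3` module as an assumed example system. The `employee` table and the `try_insert` helper are hypothetical. A CHECK clause restricts `age` to integers between 15 and 80, following the Employee_ages domain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        name TEXT NOT NULL,
        age  INTEGER CHECK (typeof(age) = 'integer' AND age BETWEEN 15 AND 80)
    )
""")

def try_insert(name, age):
    """Return True if the row satisfies the declared constraints."""
    try:
        conn.execute("INSERT INTO employee (name, age) VALUES (?, ?)", (name, age))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert("Asha", 30)   # inside the domain
rejected = try_insert("Ravi", 12)   # below the lower bound
```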
1.1.2 Key Constraints and Constraints on NULL Values
A key is a set of one or more attributes that can uniquely identify each row in a table. A key
not only identifies the rows of a table but also relates two or more tables.
Different Types of Keys:
1) Super Key
2) Candidate Key
3) Primary Key
4) Foreign Key
5) Secondary Key/Alternate Key
6) Unique Key
7) Composite Key
8) Surrogate Key
9) Partial Key

1) Super Key: A super key is an attribute (or a set of attributes) that uniquely identifies a
tuple, i.e. an entity in an entity set.
Every candidate key is a super key: candidate keys are the minimal super keys, so the
set of super keys is a superset of the set of candidate keys.

Example:

V.RASHMI (Assistant Professor) PVPSIT IT 4



Super Keys are:

Candidate Keys are:

2) Candidate Key: Each table has only a single primary key, but a relation may have one
or more candidate keys. Each candidate key qualifies to be the primary key, and one of
them is chosen as the primary key; hence the name candidate key.
A candidate key can be a single column or a combination of more than one column. A
minimal super key is called a candidate key.
Example:

Above, Student_ID, Student_Enroll and Student_Email are the candidate keys. They
are considered candidate keys since they can uniquely identify the student record.
3) Primary Key: It is an attribute or set of attributes that uniquely identifies an entity
(row) in the entity set (table). The main difference between the primary key and the other
candidate keys is that the primary key must not contain NULL values.
• Primary Key must be UNIQUE and NOT NULL.

Example:

The primary key of the relation can be EID.


4) Foreign Key: A foreign key is a set of attributes in a table that refers to the primary
key of another table. The foreign key links these two tables.
Example:
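
The figure for this example is not reproduced in the text, so the following is only a hypothetical sketch of how a foreign key behaves, using SQLite through Python's `sqlite3` module. The `department` and `employee` tables and their rows are assumptions. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute("""
    CREATE TABLE employee (
        eid     INTEGER PRIMARY KEY,
        name    TEXT,
        dept_id INTEGER REFERENCES department(dept_id)
    )
""")
conn.execute("INSERT INTO department VALUES (10, 'IT')")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 10)")  # department 10 exists

try:
    conn.execute("INSERT INTO employee VALUES (2, 'Ravi', 99)")  # no department 99
    dangling_reference_allowed = True
except sqlite3.IntegrityError:
    dangling_reference_allowed = False

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```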

5) Secondary Key/Alternate Key: A primary key is the field (or set of fields) chosen to
uniquely identify a record in a table. A secondary key is an additional key, or alternate
key, which can be used in addition to the primary key to locate specific data.
A secondary key is a candidate key that has not been selected to be the primary key.
Therefore, a candidate key not selected as the primary key is called a secondary key.
A candidate key is an attribute or set of attributes that you can consider as a primary
key. Note: A secondary key is not a foreign key.
Example 1:

Above, Student_ID, Student_Enroll and Student_Email are the candidate keys. They
are considered candidate keys since they can uniquely identify the student record.
Select any one of the candidate keys as the primary key. The remaining two keys are
secondary keys.
If you select Student_ID as the primary key, then Student_Enroll and
Student_Email are secondary keys (candidates for the primary key that were not chosen).
Example 2:

Above, Employee_ID, Employee_No and Employee_Email are the candidate keys.
They uniquely identify the Employee record. Select any one of the candidate keys as
the primary key. The remaining two keys are secondary keys.
6) Unique Key: A unique key is used to prevent duplicate values in a column. The primary
key also provides uniqueness for a table.
A primary key cannot accept NULL values; this makes the primary key different from a
unique key, which does allow NULL. How many NULLs a unique column may hold depends on
the DBMS: SQL Server allows only one, while PostgreSQL, MySQL, and SQLite allow several.
A table can have only a single primary key, whereas it can have more than one unique
key if you need it.
A unique key ensures that a non-NULL value is not duplicated in two rows of the table.
Primary key values are normally kept stable because other tables refer to them, whereas
unique key values are changed more freely.
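
The difference between the two kinds of key can be tried out directly. The sketch below uses SQLite through Python's `sqlite3` module as an assumed example system, with a hypothetical `student` table. Behaviour around NULL in unique columns differs between systems. In SQLite a UNIQUE column accepts more than one NULL, because NULLs are not treated as equal to each other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        student_id    INTEGER PRIMARY KEY,
        student_email TEXT UNIQUE
    )
""")
conn.execute("INSERT INTO student VALUES (1, 'a@example.com')")
conn.execute("INSERT INTO student VALUES (2, NULL)")
conn.execute("INSERT INTO student VALUES (3, NULL)")  # accepted by SQLite

try:
    conn.execute("INSERT INTO student VALUES (4, 'a@example.com')")  # repeated email
    duplicate_email_allowed = True
except sqlite3.IntegrityError:
    duplicate_email_allowed = False

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```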
7) Composite Key: A primary key having two or more attributes is called composite
key. It is a combination of two or more columns.
Example 1: Here our composite key is OrderID and ProductID −
{OrderID, ProductID}

Example 2:

Above, our composite key consists of StudentID and StudentEnrollNo. The table has two
attributes as its primary key.
Therefore, a primary key consisting of two or more attributes is called a composite
key.
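
A composite key such as {OrderID, ProductID} can be declared as follows. This is a sketch in SQLite through Python's `sqlite3` module. The table name `order_item` and the `Quantity` column are hypothetical. Each attribute may repeat on its own, but the pair must be unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE order_item (
        OrderID   INTEGER,
        ProductID INTEGER,
        Quantity  INTEGER,
        PRIMARY KEY (OrderID, ProductID)
    )
""")
conn.execute("INSERT INTO order_item VALUES (1, 100, 2)")
conn.execute("INSERT INTO order_item VALUES (1, 200, 1)")  # same order, other product
conn.execute("INSERT INTO order_item VALUES (2, 100, 5)")  # same product, other order

try:
    conn.execute("INSERT INTO order_item VALUES (1, 100, 9)")  # repeats the full pair
    repeated_pair_allowed = True
except sqlite3.IntegrityError:
    repeated_pair_allowed = False

item_count = conn.execute("SELECT COUNT(*) FROM order_item").fetchone()[0]
```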
8) Surrogate Key: A surrogate key's only purpose is to be a unique identifier in a
database, for example, an incrementing key.
A surrogate key has no actual meaning and is used to represent existence. It has an
existence only for data analysis.
Example: The surrogate key is Key in the <ProductPrice> table.

Other examples of a Surrogate Key:
• Counter
• System date/time stamp
• Random alphanumeric string.
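
Two of these options can be sketched in Python. The `product_price` table and the column name `surrogate_id` are hypothetical, and SQLite is used only as an assumed example system. The counter comes from an auto-incrementing column, and the random alphanumeric string from the standard `uuid` module.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE product_price (
        surrogate_id INTEGER PRIMARY KEY AUTOINCREMENT,
        product      TEXT,
        price        REAL
    )
""")

# Counter: the system assigns the next integer; the value carries no meaning.
first_id = conn.execute(
    "INSERT INTO product_price (product, price) VALUES ('pen', 10.0)").lastrowid
second_id = conn.execute(
    "INSERT INTO product_price (product, price) VALUES ('pen', 10.0)").lastrowid

# Random alphanumeric string: 32 hexadecimal characters from a random UUID.
random_key = uuid.uuid4().hex
```

The two rows hold identical data and are told apart only by the surrogate value.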
9) Partial Key: A partial key is a key that cannot, on its own, uniquely identify all the
records of the table.
However, a group of related tuples can be selected from the table using the partial
key. Example: Consider the following schema:
Department ( Emp_no , Dependent_name , Relation )

Here, using the partial key Emp_no, we cannot identify a tuple uniquely, but we can
select a group of tuples from the table.
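
The idea can be checked on a few hypothetical rows of the schema above. In this sketch Emp_no alone repeats, so it only groups tuples, while the pair (Emp_no, Dependent_name) is assumed to be unique.

```python
from collections import defaultdict

# Hypothetical rows: (Emp_no, Dependent_name, Relation)
dependents = [
    (101, "Anu", "Daughter"),
    (101, "Raj", "Son"),
    (102, "Anu", "Spouse"),
]

# The partial key selects a group of related tuples.
by_emp = defaultdict(list)
for emp_no, dependent_name, relation in dependents:
    by_emp[emp_no].append((dependent_name, relation))

# Emp_no alone repeats; together with Dependent_name it is unique here.
emp_no_is_unique = len({row[0] for row in dependents}) == len(dependents)
pair_is_unique = len({(row[0], row[1]) for row in dependents}) == len(dependents)
```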

All tuples in a relation must also be distinct. This means that no two tuples can have the same
combination of values for all their attributes. There are other subsets of attributes of a relation
schema R with the property that no two tuples in any relation state r of R should have the
same combination of values for these attributes.
Suppose that we denote one such subset of attributes by SK; then for any two distinct tuples
t1 and t2 in a relation state r of R, we have the constraint that t1[SK] ≠ t2[SK]. Such a set
of attributes SK is called a superkey of the relation schema R.

Superkey
A superkey SK specifies a uniqueness constraint that no two distinct tuples in any state r of R
can have the same value for SK. Every relation has at least one default superkey: the set of
all its attributes.
Key
A key K of a relation schema R is a superkey of R with the additional property that removing
any attribute A from K leaves a set of attributes K' that is not a superkey of R anymore.
Hence, a key satisfies two properties:
1. Two distinct tuples in any state of the relation cannot have identical values for
(all) the attributes in the key. This first property also applies to a superkey.
2. It is a minimal superkey, that is, a superkey from which we cannot remove any
attributes and still have the uniqueness constraint in condition 1 hold. This
property is not required by a superkey.
Example: Consider the STUDENT relation

• The attribute set {Ssn} is a key of STUDENT because no two student tuples can have
the same value for Ssn.
• Any set of attributes that includes Ssn, for example {Ssn, Name, Age}, is a superkey.
• The superkey {Ssn, Name, Age} is not a key of STUDENT because removing Name or
Age or both from the set still leaves us with a superkey.
In general, any superkey formed from a single attribute is also a key. A key with multiple
attributes must require all its attributes together to have the uniqueness property.
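
The superkey and minimality conditions can be tested against a given relation state. The sketch below uses hypothetical function names and sample rows. A key is a property of the schema, so a single state can only show that a set of attributes is not a key; passing the test on one state does not prove the constraint in general.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows of this state agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        projection = tuple(row[a] for a in attrs)
        if projection in seen:
            return False
        seen.add(projection)
    return True

def candidate_keys(rows, attributes):
    """Minimal attribute sets that are superkeys in this state, smallest first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, combo):
                keys.append(combo)
    return keys

# Hypothetical STUDENT state (distinct rows assumed).
STUDENT = [
    {"Ssn": "111", "Name": "Smith", "Age": 19},
    {"Ssn": "222", "Name": "Smith", "Age": 20},
    {"Ssn": "333", "Name": "Brown", "Age": 19},
    {"Ssn": "444", "Name": "Smith", "Age": 19},
]
keys_found = candidate_keys(STUDENT, ["Ssn", "Name", "Age"])
```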
Candidate Key
A relation schema may have more than one key. In this case, each of the keys is called a
candidate key.
Example: The CAR relation has two candidate keys: License_number and
Engine_serial_number

Primary Key
It is common to designate one of the candidate keys as the primary key of the relation. This is
the candidate key whose values are used to identify tuples in the relation. We use the
convention that the attributes that form the primary key of a relation schema are underlined.
Other candidate keys are designated as unique keys and are not underlined.
Another constraint on attributes specifies whether NULL values are or are not permitted.
For example, if every STUDENT tuple must have a valid, non-NULL value for the Name
attribute, then Name of STUDENT is constrained to be NOT NULL.
Relational Algebra & relational calculus:

Relational algebra and relational calculus are formal languages used to manipulate and
query relational databases. They form the theoretical foundation for SQL and other query
languages.

Relational Algebra:

1. Definition: A procedural language that describes the step-by-step process of how to
compute a query result.
2. Basic operations:
o SELECT (σ): Selects rows that satisfy a given predicate
o PROJECT (π): Extracts specified columns
o UNION (∪): Combines rows from two relations
o SET DIFFERENCE (-): Removes rows from one relation that appear in another
o CARTESIAN PRODUCT (×): Combines each row of one relation with every row
of another
o RENAME (ρ): Renames attributes or relations
3. Derived operations:
o INTERSECT (∩): Retains only rows that appear in both relations
o JOIN (⋈): Combines related rows from two relations
o DIVISION (÷): Finds rows in one relation that are related to all rows in another
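
Several of the operations listed above can be sketched over Python sets. This is a minimal illustration, not an optimized implementation. All function names and the sample EMPLOYEE and DEPARTMENT relations are hypothetical. Each tuple is stored as a frozenset of (attribute, value) pairs so that relations can be ordinary sets; UNION and SET DIFFERENCE are then Python's `|` and `-`.

```python
def row(**values):
    """Build one hashable tuple as a frozenset of (attribute, value) pairs."""
    return frozenset(values.items())

def select(relation, predicate):   # SELECT (σ)
    return {t for t in relation if predicate(dict(t))}

def project(relation, attrs):      # PROJECT (π); duplicates collapse in the set
    return {frozenset((a, v) for a, v in t if a in attrs) for t in relation}

def intersect(r, s):               # INTERSECT (∩), derived from set difference
    return r - (r - s)

def natural_join(r, s):            # JOIN (⋈) on all shared attribute names
    joined = set()
    for t in r:
        for u in s:
            dt, du = dict(t), dict(u)
            if all(dt[a] == du[a] for a in dt.keys() & du.keys()):
                joined.add(frozenset({**dt, **du}.items()))
    return joined

EMPLOYEE = {row(eid=1, name="Asha", dno=10),
            row(eid=2, name="Ravi", dno=20),
            row(eid=3, name="Mina", dno=10)}
DEPARTMENT = {row(dno=10, dname="IT"), row(dno=20, dname="HR")}

# π name ( σ dname = 'IT' ( EMPLOYEE ⋈ DEPARTMENT ) )
it_names = project(
    select(natural_join(EMPLOYEE, DEPARTMENT), lambda t: t["dname"] == "IT"),
    {"name"},
)
```

When the two relations share no attribute names, `natural_join` degenerates to the Cartesian product, since the matching condition is then trivially true.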

Relational Calculus:

1. Definition: A declarative language that describes what data to retrieve, not how to
retrieve it.
2. Types:
o Tuple Relational Calculus (TRC)
o Domain Relational Calculus (DRC)
3. Key components:
o Variables
o Atomic formulas
o Logical connectives (∧, ∨, ¬)
o Quantifiers (∃ for "exists", ∀ for "for all")
4. Basic structure: {T | P(T)} where T is a tuple variable and P(T) is a predicate
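
The form {T | P(T)} reads much like a set comprehension. The following Python sketch uses hypothetical EMPLOYEE and PROJECT relations. A tuple variable ranges over a relation, `any` plays the role of ∃, and `all` plays the role of ∀. It illustrates the notation only and is not a calculus evaluator.

```python
from typing import NamedTuple

class Employee(NamedTuple):
    name: str
    dno: int
    salary: float

class Project(NamedTuple):
    pno: int
    dno: int

EMPLOYEE = {Employee("Asha", 10, 3000.0),
            Employee("Ravi", 20, 2000.0),
            Employee("Mina", 30, 4000.0)}
PROJECT = {Project(1, 10), Project(2, 10), Project(3, 20)}

# { T | EMPLOYEE(T) ∧ T.salary > 2500 }
high_paid = {t for t in EMPLOYEE if t.salary > 2500}

# { T.name | EMPLOYEE(T) ∧ ∃P (PROJECT(P) ∧ P.dno = T.dno) }
with_project = {t.name for t in EMPLOYEE
                if any(p.dno == t.dno for p in PROJECT)}

# { T.name | EMPLOYEE(T) ∧ ∀U (EMPLOYEE(U) → T.salary ≥ U.salary) }
top_earners = {t.name for t in EMPLOYEE
               if all(t.salary >= u.salary for u in EMPLOYEE)}
```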

Comparison:

1. Expressiveness: Relational algebra and safe relational calculus are equivalent in power.
Any query that can be expressed in one can be expressed in the other.
2. Usage: Relational algebra is closer to how queries are actually executed, while relational
calculus is more abstract and closer to natural language.
3. Implementation: Most database systems use relational algebra as the basis for query
optimization and execution.
4. Learning curve: Relational algebra is often considered easier to learn initially, while
relational calculus can be more intuitive for complex queries once mastered.
