Infomanagement Lesson 3
Terminologies
• The relational model represents data as a collection of tables
• A named, two-dimensional table is called a relation
• Each relation consists of named columns and an arbitrary number of unnamed rows
• Each row in the table is called a tuple; it corresponds to a record that contains data
• Named columns are called attributes
(Figure: a sample relation with its relation name and attributes labeled)
Properties of a Relation
• It has a unique name
• No multivalued attributes are allowed in a relation
• Each row is unique
• Each attribute has a unique name
• The sequence of columns as well as of rows is insignificant
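The properties above can be checked mechanically. The following is a minimal sketch, assuming a relation is held as a list of attribute names plus a list of row tuples; the function name check_relation and the sample data are hypothetical.

```python
def check_relation(attributes, rows):
    """Return a list of violated relation properties (empty if none)."""
    problems = []
    # Each attribute must have a unique name.
    if len(set(attributes)) != len(attributes):
        problems.append("duplicate attribute name")
    # Each row must be unique (count-based so unhashable cells do not crash).
    if any(rows.count(row) > 1 for row in rows):
        problems.append("duplicate row")
    # No multivalued attributes: every cell holds a single value.
    if any(isinstance(cell, (list, tuple, set, dict)) for row in rows for cell in row):
        problems.append("multivalued attribute")
    return problems


# Hypothetical sample data for illustration only.
ok = check_relation(["EmpID", "Name"], [(1, "Ana"), (2, "Ben")])
bad = check_relation(["EmpID", "Phones"], [(1, ["555-0101", "555-0102"])])
print(ok)   # no problems reported
print(bad)  # the Phones cell holds a list
```

Row and column order play no part in the checks, which mirrors the last property in the list.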
Classification of Attributes
• Required vs Optional Attributes
○ Required - must have a value
▪ e.g., medical condition
○ Optional - may not have a value
▪ e.g., provincial address
• Derived Attributes
○ Derived - value can be computed from other attributes
▪ e.g., Age (can be derived from birthdate)
• Identifier
○ An attribute (or combination) that uniquely identifies each entity
○ It must be:
▪ Unique
▪ Not null
▪ Unchanging
○ e.g., Account number (assigned by the insurance company)
▪ A relation is written as its name followed by its attributes in parentheses, e.g., the
relation Employee with the attributes EmpID, Name, DeptName, and EmployStatus:
▪ EMPLOYEE (EmpID, Name, DeptName, EmployStatus)
Relational KEYS
• Primary Key - a single attribute or combination of attributes that uniquely identifies each row in a table
• Composite Key - a key that consists of more than one attribute
• Foreign Key - an attribute that links two tables by referencing the primary key of another
table
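The three kinds of keys can be sketched with Python's built-in sqlite3 module. This is an illustration under assumptions: the DEPARTMENT and ASSIGNMENT tables, the ProjectID column, and all sample values are hypothetical and only EMPLOYEE comes from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE DEPARTMENT (
    DeptName TEXT NOT NULL PRIMARY KEY            -- primary key (single attribute)
);
CREATE TABLE EMPLOYEE (
    EmpID        INTEGER NOT NULL PRIMARY KEY,    -- primary key
    Name         TEXT NOT NULL,
    DeptName     TEXT REFERENCES DEPARTMENT (DeptName),  -- foreign key
    EmployStatus TEXT
);
CREATE TABLE ASSIGNMENT (                         -- hypothetical table
    EmpID     INTEGER NOT NULL REFERENCES EMPLOYEE (EmpID),
    ProjectID TEXT NOT NULL,
    PRIMARY KEY (EmpID, ProjectID)                -- composite key
);
""")


def try_insert(sql):
    """Run an INSERT and report whether the key constraints accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


try_insert("INSERT INTO DEPARTMENT VALUES ('Sales')")
try_insert("INSERT INTO EMPLOYEE VALUES (1, 'Ana', 'Sales', 'Regular')")
first_assignment = try_insert("INSERT INTO ASSIGNMENT VALUES (1, 'P1')")
second_project = try_insert("INSERT INTO ASSIGNMENT VALUES (1, 'P2')")
duplicate_pk = try_insert("INSERT INTO EMPLOYEE VALUES (1, 'Ben', 'Sales', 'Regular')")
duplicate_composite = try_insert("INSERT INTO ASSIGNMENT VALUES (1, 'P1')")
```

In this sketch the same EmpID may appear in ASSIGNMENT with different ProjectID values, because only the combination has to be unique.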
1. Integrity Constraints - a set of rules used in a DBMS to ensure that the data in the database
is accurate, consistent, and reliable.
• Domain Constraints - specify the domain of an attribute, including its name, meaning,
type, size, and allowable values.
e.g., (Figure: a correct example and a wrong example)
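A domain constraint can be approximated in SQL with column types and CHECK clauses. The sketch below uses Python's sqlite3 module; the length limit of 50 and the list of allowed EmployStatus values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        EmpID        INTEGER NOT NULL PRIMARY KEY,
        Name         TEXT    NOT NULL CHECK (length(Name) <= 50),
        EmployStatus TEXT    NOT NULL
            CHECK (EmployStatus IN ('Regular', 'Probationary', 'Contractual'))
    )
""")


def try_insert(row):
    """Insert one row; return False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


correct = try_insert((1, "Ana", "Regular"))  # value inside the assumed domain
wrong = try_insert((2, "Ben", "Retired"))    # 'Retired' is outside the assumed domain
```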
• Entity Integrity - ensures that every relation has a valid primary key; no primary key value may be null
This example violates the Entity Integrity constraint because the third tuple contains a null
value in the primary key field.
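The entity integrity rule can be sketched with Python's sqlite3 module. The table layout and rows here are hypothetical stand-ins for the example; the third row carries a null primary key. Note that SQLite needs an explicit NOT NULL on a non-integer primary key column to reject nulls.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is stated explicitly; SQLite would otherwise accept NULL in a TEXT primary key.
conn.execute("CREATE TABLE EMPLOYEE (EmpID TEXT NOT NULL PRIMARY KEY, Name TEXT)")

rows = [("E1", "Ana"), ("E2", "Ben"), (None, "Cara")]  # third tuple has a null key
rejected = []
for row in rows:
    try:
        conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?)", row)
    except sqlite3.IntegrityError:
        rejected.append(row)

stored = conn.execute("SELECT EmpID FROM EMPLOYEE ORDER BY EmpID").fetchall()
```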
• Referential Integrity - ensures consistency between two tables by requiring that every
foreign key either matches a primary key in the related table or is null.
In this example, DNO is the foreign key in the first table and the primary key in the second
table. The DNO value of 22 is not allowed because it is not defined in the primary key of the
second table.
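The DNO case can be reproduced as a small sketch with Python's sqlite3 module. The table names, the other columns, and the department numbers 11 and 12 are assumptions; only DNO and the rejected value 22 come from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # foreign keys are not enforced by default in SQLite
conn.execute("CREATE TABLE DEPARTMENT (DNO INTEGER NOT NULL PRIMARY KEY, DName TEXT)")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        EmpID INTEGER NOT NULL PRIMARY KEY,
        Name  TEXT,
        DNO   INTEGER REFERENCES DEPARTMENT (DNO)  -- foreign key, may be null
    )
""")
conn.executemany("INSERT INTO DEPARTMENT VALUES (?, ?)", [(11, "Sales"), (12, "Finance")])


def try_insert(row):
    """Insert one employee; return False if referential integrity rejects it."""
    try:
        conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


matching = try_insert((1, "Ana", 11))    # DNO 11 exists in DEPARTMENT
null_fk = try_insert((2, "Ben", None))   # a null foreign key is permitted
dangling = try_insert((3, "Cara", 22))   # DNO 22 is not defined in DEPARTMENT
```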
Database Normalization
• Optimize table structures
• Remove duplicate data entries
• A process of efficiently organizing data in a database
• Produces a set of relations with desirable properties based on data requirements
• Use primary keys and functional dependencies to define proper table design
• Why Normalize?
○ Improved speed
○ Efficient use of space (eliminates redundant data)
○ Ensure logical dependencies
○ Increase data integrity (prevents update anomalies and potential corruption)
complex and inefficient structures.
▪ Isolate Semantically Related Multiple Relationships
□ Group semantically related data into separate tables. This ensures that each table
contains data that is logically connected, improving data clarity and reducing
redundancy.
Tables that are not normalized are susceptible to modification anomalies.
Insertion Anomaly - occurs when certain attributes cannot be inserted into the database
without the presence of other attributes.
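An insertion anomaly can be sketched with a single unnormalized table, using Python's sqlite3 module. The STUDENT_COURSE table and its rows are hypothetical: because the key includes Student_ID, a new course cannot be recorded until some student enrolls in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE STUDENT_COURSE (
        Student_ID   TEXT NOT NULL,
        Student_Name TEXT,
        Course_ID    TEXT NOT NULL,
        Course_Name  TEXT,
        PRIMARY KEY (Student_ID, Course_ID)
    )
""")
conn.execute("INSERT INTO STUDENT_COURSE VALUES ('S1', 'Ana', 'C1', 'Databases')")

# Try to record a new course that has no students yet.
try:
    conn.execute("INSERT INTO STUDENT_COURSE VALUES (NULL, NULL, 'C2', 'Networks')")
    new_course_inserted = True
except sqlite3.IntegrityError:
    new_course_inserted = False  # course data cannot exist without student data
```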
TERMINOLOGIES
Functional Dependency:
A functional dependency defines the relationship between attributes in a relation (table). If A
and B are attributes in a relation R, B is functionally dependent on A if, for each value of A,
there is exactly one corresponding value of B. This means that knowing the value of A is enough
to determine the value of B in the relation.
In simpler terms, A → B (read as "A functionally determines B") means that for every unique
value of A, there is only one value of B associated with it.
Example: Given an ISBN, one would know the title of a book, so Title is functionally dependent
on ISBN (ISBN → Title).
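Whether a functional dependency holds in a given set of rows can be tested directly from the definition. The following is a minimal sketch; the function name holds and the sample book rows (including the ISBN values) are made up for illustration.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # The first value seen for a key is remembered; any different value
        # later on means the key does not determine the dependent attributes.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data: two books share a title but have different ISBNs.
books = [
    {"ISBN": "111", "Title": "Databases", "Copy": 1},
    {"ISBN": "111", "Title": "Databases", "Copy": 2},
    {"ISBN": "222", "Title": "Databases", "Copy": 1},
]
isbn_determines_title = holds(books, ["ISBN"], ["Title"])
title_determines_isbn = holds(books, ["Title"], ["ISBN"])
```

A check like this can only refute a dependency from sample data; whether it holds in general is a matter of the data requirements.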
Analogy:
Determinant:
The attribute(s) on the left side of a functional dependency (e.g., ISBN in ISBN → Title).
Unnormalized form (UNF):
A table with repeating groups.
Repeating group:
An attribute or set of attributes with multiple values for a single occurrence of the key
attribute(s).
PROCESS OF NORMALIZATION
• First normal form (1NF)
The first normal form (1NF) is the foundational step of data normalization. A table is in 1NF if:
○ It contains only atomic values (each field holds a single value, no lists or arrays).
○ Every record is unique and identified by a primary key.
○ There are no repeating groups of data within a row.
This stage eliminates repeating groups and ensures that each row in the table has a unique identifier, improving data consistency.
• Example of 1NF violation and solutions
The 'Products Ordered' column has multiple values (not atomic), violating 1NF.
Now, each record stores a single atomic value, making it 1NF compliant.
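The conversion to 1NF can be sketched in Python. Only the 'Products Ordered' column name comes from the example; the other columns, the comma-separated format, and the sample orders are assumptions.

```python
# Hypothetical unnormalized rows: 'Products Ordered' holds several values per cell.
unnormalized = [
    {"Order_ID": 1, "Customer": "Ana", "Products Ordered": "Pen, Notebook"},
    {"Order_ID": 2, "Customer": "Ben", "Products Ordered": "Stapler"},
]


def to_1nf(rows, column):
    """Split a comma-separated column so every row holds one atomic value."""
    result = []
    for row in rows:
        for item in row[column].split(","):
            atomic = dict(row)          # copy the other attributes unchanged
            atomic[column] = item.strip()
            result.append(atomic)
    return result


orders_1nf = to_1nf(unnormalized, "Products Ordered")
for row in orders_1nf:
    print(row)
```

After the split, Order_ID alone no longer identifies a row in this sketch; the combination of Order_ID and 'Products Ordered' does.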
• Second normal form (2NF)
Student Table
Courses Table
Enrollment Table (Bridging Table)
Now, every non-key attribute is fully dependent on its respective primary key.
Partial functional dependency: This occurs when a non-key attribute is functionally dependent on part (but not all) of the primary key.
In the example above, the composite primary key consists of Student_ID and Course_ID. Student_Name depends only on Student_ID,
not on the whole key, and Course_Name depends only on Course_ID.
Full functional dependency: This occurs when a non-key attribute is functionally dependent on the entire composite key.
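Removing partial dependencies amounts to splitting the table by what each attribute actually depends on. The sketch below uses the attribute names from the example; the sample rows are hypothetical.

```python
# Hypothetical flat table with composite key (Student_ID, Course_ID).
enrollment_flat = [
    {"Student_ID": "S1", "Course_ID": "C1", "Student_Name": "Ana", "Course_Name": "Databases"},
    {"Student_ID": "S1", "Course_ID": "C2", "Student_Name": "Ana", "Course_Name": "Networks"},
    {"Student_ID": "S2", "Course_ID": "C1", "Student_Name": "Ben", "Course_Name": "Databases"},
]

# Student_Name depends only on Student_ID, so it moves to a student table.
students = {r["Student_ID"]: r["Student_Name"] for r in enrollment_flat}
# Course_Name depends only on Course_ID, so it moves to a courses table.
courses = {r["Course_ID"]: r["Course_Name"] for r in enrollment_flat}
# The bridging table keeps only the composite key.
enrollments = sorted({(r["Student_ID"], r["Course_ID"]) for r in enrollment_flat})

# Joining the three tables gives back the original rows, so nothing was lost.
rejoined = [
    {"Student_ID": s, "Course_ID": c, "Student_Name": students[s], "Course_Name": courses[c]}
    for s, c in enrollments
]
```

Each name is now stored once, so renaming a course means changing a single entry rather than every enrollment row.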
• Third normal form (3NF)
The Tax Rate is dependent on a non-key attribute (Salary) and not directly on Employee_ID. This introduces a transitive
dependency.
Employee Table
Salary Table
Tax Table
Now Tax Rate depends directly on the key of its own table (Salary) rather than transitively on Employee_ID, which removes the transitive dependency.
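The removal of a transitive dependency can be sketched in Python. This is a simplified two-table split under assumptions: the column layout, the salary figures, and the tax rates are hypothetical, and the helper tax_rate_for is introduced only for illustration.

```python
# Hypothetical flat table: Employee_ID -> Salary -> Tax_Rate (transitive dependency).
employee_flat = [
    {"Employee_ID": 1, "Name": "Ana", "Salary": 30000, "Tax_Rate": 0.10},
    {"Employee_ID": 2, "Name": "Ben", "Salary": 50000, "Tax_Rate": 0.20},
    {"Employee_ID": 3, "Name": "Cara", "Salary": 30000, "Tax_Rate": 0.10},
]

# Employee table keeps only attributes that depend directly on Employee_ID.
employees = {
    r["Employee_ID"]: {"Name": r["Name"], "Salary": r["Salary"]} for r in employee_flat
}
# Tax table is keyed by Salary, the attribute that Tax_Rate actually depends on.
tax = {r["Salary"]: r["Tax_Rate"] for r in employee_flat}


def tax_rate_for(employee_id):
    """Look up the tax rate by following Employee_ID -> Salary -> Tax_Rate."""
    return tax[employees[employee_id]["Salary"]]


print(tax_rate_for(3))
```

Each salary level's rate is stored once in this sketch, so changing a rate does not require touching any employee row.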