Module 1&2 Notes
CHAPTER 1
Introduction to Databases
A database is a collection of related data.
Data means known facts that can be recorded and that have implicit meaning.
For example, consider the names, telephone numbers, and addresses of the people you know. You may
have recorded this data in an indexed address book, or you may have stored it on a hard drive using
a personal computer and software such as Microsoft Access or Excel. This collection of related data
with an implicit meaning is a database.
A Database Management System (DBMS) is a collection of programs that enables users to create
and maintain a database.
The DBMS is a general-purpose software system that facilitates the processes of defining,
constructing, manipulating, and sharing databases among various users and applications.
Defining a database involves specifying the data types, structures, and constraints of the
data to be stored in the database. The database definition or descriptive information is
also stored by the DBMS in the form of a database catalog or dictionary; it is called
metadata.
Constructing the database is the process of storing the data on some storage medium
that is controlled by the DBMS.
Manipulating a database includes functions such as querying the database to retrieve
specific data, updating the database to reflect changes in the miniworld, and generating
reports from the data.
Sharing a database allows multiple users and programs to access the database
simultaneously.
An application program accesses the database by sending queries or requests for data
to the DBMS.
A query typically causes some data to be retrieved; a transaction may cause some data to be read
and some data to be written into the database.
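The DBMS functions described above can be sketched with Python's built-in sqlite3 module standing in for the DBMS. This is only an illustration: the `contacts` table, its columns, and the sample row are made-up assumptions, and `sqlite_master` is SQLite's own name for its catalog. Sharing among multiple users is not shown.

```python
import sqlite3

# An in-memory SQLite database stands in for the DBMS (assumption for illustration).
conn = sqlite3.connect(":memory:")

# Defining: specify the structure, data types, and constraints.
conn.execute(
    "CREATE TABLE contacts ("
    " name TEXT NOT NULL,"
    " phone TEXT,"
    " address TEXT)"
)

# Constructing: store the data on a medium controlled by the DBMS.
conn.execute(
    "INSERT INTO contacts (name, phone, address) VALUES (?, ?, ?)",
    ("Asha", "555-0101", "12 Lake Road"),
)

# Manipulating: update the data to reflect a change, then query it.
conn.execute("UPDATE contacts SET phone = ? WHERE name = ?", ("555-0199", "Asha"))
rows = conn.execute("SELECT name, phone FROM contacts").fetchall()

# Metadata: the definition itself is kept in the catalog.
catalog = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(rows)           # the retrieved data
print(catalog[0][0])  # the table name as recorded in the catalog
```

The last query reads the stored table definition back from the catalog, which is what "self-describing" refers to.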
Characteristics of the Database Approach:
The main characteristics of the database approach versus the file processing approach are the
following:
Self-describing nature of a database system
Insulation between programs and data, and data abstraction
Support of multiple views of the data
Sharing of data and multiuser transaction processing
Database Administrators
In a database environment, the primary resource is the database itself, and the
secondary resource is the DBMS and related software.
Administering these resources is the responsibility of the database administrator
(DBA).
The DBA is responsible for authorizing access to the database, coordinating and
monitoring its use, and acquiring software and hardware resources as needed.
The DBA is accountable for problems such as security breaches and poor system
response time. In large organizations, the DBA is assisted by a staff that carries out
these functions.
Database Designers
Database designers are responsible for identifying the data to be stored in the
database and for choosing appropriate structures to represent and store this data.
It is the responsibility of database designers to communicate with all prospective
database users in order to understand their requirements and to create a design that
meets these requirements.
Database designers typically interact with each potential group of users and develop
views of the database that meet the data and processing requirements of these groups.
Each view is then analyzed and integrated with the views of other user groups. The
final database design must be capable of supporting the requirements of all user
groups.
End Users
End users are the people whose jobs require access to the database for querying,
updating, and generating reports.
There are several categories of end users:
Casual end users occasionally access the database, but they may need different
information each time. They use a sophisticated database query language to
specify their requests and are typically middle- or high-level managers or other
occasional browsers.
Naive or parametric end users make up a sizable portion of database end users.
Their main job function revolves around constantly querying and updating the
database, using standard types of queries and updates—called canned
transactions—that have been carefully programmed and tested.
The tasks that such users perform are varied:
Bank tellers check account balances and post withdrawals and deposits.
Reservation agents for airlines, hotels, and car rental companies check
availability for a given request and make reservations.
Shipping clerks (e.g., at UPS) use buttons, bar code scanners, etc., to
update the status of in-transit packages.
Sophisticated end users include engineers, scientists, business analysts, and
others who thoroughly familiarize themselves with the facilities of the DBMS in
order to implement their own applications to meet their complex requirements.
Standalone users use "personal" databases, possibly employing a special-purpose
(e.g., financial) software package.
They mostly maintain personal databases using ready-to-use packaged
applications.
An example is the user of a tax program that creates its own internal database.
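A canned transaction, as used by naive or parametric end users, can be sketched as a pre-written function in which the user supplies only the parameters. The example below is a minimal sketch using Python's built-in sqlite3 module; the `accounts` table and the function names `check_balance` and `post_deposit` are hypothetical.

```python
import sqlite3


def make_branch_db():
    """Build a tiny in-memory bank database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " account_id INTEGER PRIMARY KEY,"
        " balance INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO accounts VALUES (1, 500)")
    conn.commit()
    return conn


def check_balance(conn, account_id):
    """Canned query: the teller supplies only the account number."""
    row = conn.execute(
        "SELECT balance FROM accounts WHERE account_id = ?", (account_id,)
    ).fetchone()
    return None if row is None else row[0]


def post_deposit(conn, account_id, amount):
    """Canned update: the SQL is fixed, only the parameters vary."""
    if amount <= 0:
        raise ValueError("deposit amount must be positive")
    with conn:  # commits on success, rolls back if an error is raised
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE account_id = ?",
            (amount, account_id),
        )
    return check_balance(conn, account_id)


conn = make_branch_db()
print(check_balance(conn, 1))
print(post_deposit(conn, 1, 250))
```

The teller never writes SQL; the statements were programmed and tested in advance, which is the point of a canned transaction.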
System Analysts, Application Programmers, Software Engineers:
System Analysts: Determine the needs of end users, especially naive or parametric
users, and develop specifications for canned transactions that meet these needs.
Application Programmers: Implement, test, document, and maintain programs that
satisfy the specifications mentioned above.
Interactions
The application program sends queries to the query processor.
The query processor translates these queries and instructs the database engine.
The database engine accesses the stored database and consults metadata.
Results are sent back through the chain to the application program and presented to the user.
This structure ensures efficient data management and retrieval.
Relational Model
1970: The relational model is introduced by E. F. Codd at IBM, followed by research and experimentation at IBM and several universities.
1980s: Emergence of commercial relational DBMS products.
These advancements lead to new research and development in incorporating diverse data
types, complex structures, new operations, and storage/indexing schemes, including database
updates through web pages.
1. Internal Level
Internal Schema: Describes the physical storage structure of the database.
Uses a physical data model to detail data storage and access paths.
2. Conceptual Level
Conceptual Schema: Describes the structure of the entire database for a community
of users.
Hides physical storage details, focusing on entities, data types, relationships, user
operations, and constraints.
Uses a representational data model for implementation, often based on a high-level
data model design.
DBMS Interfaces
1. Menu-Based Interfaces for Web Clients or Browsing
Presents lists of options (menus) to guide users through request formulation.
CHAPTER 2
Conceptual Data Modelling using Entities and Relationships
In a database management system (DBMS), entity types and entity sets are fundamental concepts in
the entity-relationship (ER) model, used to organize and represent data.
Entity Type:
An entity type represents a collection of similar entities, having the same properties or
attributes.
It describes a category of objects or entities within the database schema.
For example, in a university database, "Student" and "Professor" could be entity types.
Entity Set:
An entity set is a collection of all entities of a particular entity type at a given point in
time.
It is the actual instance or occurrence of an entity type.
Each entity in an entity set has a unique identifier (primary key) that distinguishes it
from other entities in the same set.
For instance, the set of all enrolled students in a semester forms the "Student" entity set
in the university database.
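The difference between an entity type and an entity set can be sketched in Python: a class plays the role of the entity type, and a collection of its instances at one moment plays the role of the entity set. The attribute names and sample students below are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    """Entity type: describes the attributes every student entity has."""
    student_id: int  # unique identifier (primary key)
    name: str
    major: str


def enroll(entity_set, student):
    """Add an entity to the entity set, keeping identifiers unique."""
    if student.student_id in entity_set:
        raise ValueError(f"duplicate student_id {student.student_id}")
    entity_set[student.student_id] = student


# Entity set: all Student entities that exist at this point in time,
# stored by their identifier.
student_set = {}
enroll(student_set, Student(1, "Asha", "Physics"))
enroll(student_set, Student(2, "Ravi", "History"))

print(len(student_set))
print(student_set[1].name)
```

The class stays the same over time, while the contents of `student_set` change as students enroll or leave.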
Structural constraints in Database Management Systems (DBMS) are rules that define certain
conditions on the structure of a database schema to ensure data integrity and consistency. These
constraints are enforced by the DBMS and are critical in designing a robust database. Here are the
key types of structural constraints in DBMS:
1. Domain Constraints:
Definition: These constraints restrict the values that can be stored in a column based
on the data type and, optionally, a set of permissible values.
Example: If a column is defined to store integers, inserting a string would violate a
domain constraint.
2. Entity Integrity Constraints:
Definition: These constraints ensure that each entity (row) in a table is uniquely
identifiable. This is typically enforced through primary keys.
Example: A primary key column cannot have null values and must have unique
values across all rows in the table.
3. Referential Integrity Constraints:
Definition: These constraints ensure that a foreign key value always points to an
existing row in another table, maintaining consistency across related tables.
Example: If a table `Orders` has a foreign key `CustomerID` referencing the
`Customers` table, an order cannot be added with a `CustomerID` that does not exist
in the `Customers` table.
4. Unique Constraints:
Definition: These constraints ensure that all values in a column or a set of columns
are unique across the table, excluding duplicates.
Example: An email address column in a users table must have unique values for
each user.
5. Not Null Constraints:
Definition: These constraints ensure that a column cannot have null values, requiring
a valid data entry in every row.
Example: A column `birthdate` in a `Persons` table cannot be null, ensuring every
person has a birthdate recorded.
6. Check Constraints:
Definition: These constraints allow the definition of a condition that each row must
satisfy. They are used to enforce specific rules on data.
Example: A column `salary` in an `Employees` table must have values greater than
zero.
7. Default Constraints:
Definition: These constraints provide default values for a column when no value is
specified during insertion.
Example: A `status` column in an `Orders` table may have a default value of
'pending'.
8. Foreign Key Constraints:
Definition: These constraints establish a link between two tables, enforcing a
relationship between columns in different tables.
Example: A `department_id` in an `Employees` table must match a valid `id` in the
`Departments` table.
9. Tuple Uniqueness Constraints:
Definition: These constraints ensure that no two rows in a table can have identical
values in certain specified columns.
Example: A combination of `first_name` and `last_name` in a table may be required
to be unique to avoid duplication of names.
10. Key Constraints:
Definition: These constraints involve the designation of primary keys and candidate
keys that uniquely identify rows within a table.
Example: A `student_id` column in a `Students` table is the primary key ensuring
each student has a unique identifier.
Each of these constraints plays a crucial role in maintaining data integrity and consistency in
a relational database. By enforcing these rules, a DBMS helps prevent data anomalies
and ensures reliable database operations.
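Several of the constraints listed above can be tried out with Python's built-in sqlite3 module. This is a minimal sketch under stated assumptions: the two tables are simplified, hypothetical versions of the `Departments` and `Employees` examples, the `status` default is placed on `employees` only to keep the schema small, and `violates` is a helper written for this example. SQLite checks foreign keys only when `PRAGMA foreign_keys = ON` is set.

```python
import sqlite3


def make_db():
    """Create a small schema that declares several kinds of constraints."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to check foreign keys
    conn.executescript(
        """
        CREATE TABLE departments (
            id   INTEGER PRIMARY KEY,               -- key / entity integrity
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE employees (
            employee_id   INTEGER PRIMARY KEY,
            email         TEXT NOT NULL UNIQUE,     -- not null + unique
            salary        INTEGER NOT NULL CHECK (salary > 0),  -- check
            status        TEXT NOT NULL DEFAULT 'pending',      -- default
            department_id INTEGER NOT NULL
                          REFERENCES departments(id)            -- foreign key
        );
        INSERT INTO departments (id, name) VALUES (1, 'Research');
        INSERT INTO employees (employee_id, email, salary, department_id)
        VALUES (1, 'a@example.com', 4500, 1);
        """
    )
    return conn


def violates(conn, sql, params=()):
    """Return True if the DBMS rejects the statement with an integrity error."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


INSERT = (
    "INSERT INTO employees (employee_id, email, salary, department_id) "
    "VALUES (?, ?, ?, ?)"
)

conn = make_db()
print(violates(conn, INSERT, (2, "a@example.com", 3000, 1)))   # duplicate email
print(violates(conn, INSERT, (3, "c@example.com", 0, 1)))      # salary not > 0
print(violates(conn, INSERT, (4, "d@example.com", 3000, 99)))  # no such department
print(violates(conn, INSERT, (5, None, 3000, 1)))              # email is null
print(violates(conn, INSERT, (6, "f@example.com", 3000, 1)))   # satisfies every rule
```

Each rejected statement leaves the table unchanged, so only rows that satisfy every declared rule are ever stored.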
Notations of ER Diagram
Relationship in DBMS
What is a Relationship In Database?
A relationship in a DBMS exists when records in one table are connected to records stored in
another table. Such relationships organize entities that depend on each other and enable
efficient data processing. They are usually expressed through keys, which are columns (fields)
whose values uniquely identify each record.
For example, in a university database, students, courses, and instructors might each become a
table. A relationship between the student and course tables records which courses a student
enrolls in. By enforcing these relationships, the DBMS supports reliable data access and
complex operations while preserving the quality and identity of the data.
Generalization
Generalization is the process of extracting common properties from a set of entities and
creating a generalized entity from them. It is a bottom-up approach in which two or more
entities can be generalized to a higher-level entity if they have some attributes in common.
For example, STUDENT and FACULTY can be generalized to a higher-level entity
called PERSON, as shown in the figure below.
In this case, common attributes like P_NAME and P_ADD become part of the higher-level
entity (PERSON), and specialized attributes like S_FEE become part of a specialized
entity (STUDENT).
Generalization is also called the "bottom-up approach".
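The bottom-up step can be sketched as a set operation: the attributes shared by all lower-level entities move up to the generalized entity, and whatever remains stays with each specialized entity. In this sketch, `F_SAL` is a hypothetical FACULTY attribute added for illustration, and `generalize` is a helper name chosen here.

```python
def generalize(entity_types):
    """Split attributes into common ones (higher-level entity)
    and the remaining specialized ones for each lower-level entity."""
    common = set.intersection(*entity_types.values())
    specialized = {
        name: attrs - common for name, attrs in entity_types.items()
    }
    return common, specialized


entity_types = {
    "STUDENT": {"P_NAME", "P_ADD", "S_FEE"},
    "FACULTY": {"P_NAME", "P_ADD", "F_SAL"},  # F_SAL is a made-up attribute
}

person, specialized = generalize(entity_types)
print(sorted(person))          # attributes of the generalized entity PERSON
print(specialized["STUDENT"])  # attributes that stay with STUDENT
```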
Specialization
The relational model is a foundational concept in database management, representing data in a structured
format using relations (tables). Here are the key concepts of the relational model:
1. Relation:
A relation is essentially a table with rows and columns.
Each relation is made up of tuples (rows) and attributes (columns).
2. Attributes:
Attributes are the columns of a relation.
Each attribute has a specific domain, which defines the type of values it can hold.
Attribute names must be unique within a relation.
3. Tuples:
Tuples are the rows in a relation.
Each tuple represents a single record and contains values for each attribute.
Tuples in a relation are unique; no two tuples can be identical.
4. Domains:
A domain is the set of permissible values for a given attribute.
For example, the domain for an attribute "Age" might be the set of all non negative integers.
5. Relation Schema:
The schema of a relation defines its structure, including the relation's name, its attributes, and the
domains of these attributes.
A relation schema is typically represented as `R(A1, A2, ..., An)`, where `R` is the relation name
and `A1, A2, ..., An` are the attributes.
6. Relation Instance:
A relation instance is a specific set of tuples that conforms to the schema at a particular moment
in time.
The instance is a snapshot of the data in the relation at a given point.
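The concepts above can be made concrete with a small Python sketch: a schema maps each attribute to a test for membership in its domain, and a relation instance is a set of tuples that conform to the schema. The `EMPLOYEE_SCHEMA` attributes and the helper `make_tuple` are assumptions chosen for illustration.

```python
# Relation schema EMPLOYEE(Name, Age): each attribute maps to a domain test.
EMPLOYEE_SCHEMA = {
    "Name": lambda v: isinstance(v, str),
    "Age": lambda v: isinstance(v, int) and v >= 0,  # non-negative integers
}


def make_tuple(schema, **values):
    """Build one tuple, checking that every value lies in its attribute's domain."""
    if set(values) != set(schema):
        raise ValueError("tuple must supply exactly the schema's attributes")
    for attribute, in_domain in schema.items():
        if not in_domain(values[attribute]):
            raise ValueError(f"value for {attribute} is outside its domain")
    # Values are stored in the attribute order given by the schema.
    return tuple(values[attribute] for attribute in schema)


# Relation instance: a set of tuples, so identical tuples cannot occur twice.
instance = set()
instance.add(make_tuple(EMPLOYEE_SCHEMA, Name="Asha", Age=30))
instance.add(make_tuple(EMPLOYEE_SCHEMA, Name="Ravi", Age=41))
instance.add(make_tuple(EMPLOYEE_SCHEMA, Name="Asha", Age=30))  # duplicate, ignored

print(len(instance))
```

Using a set mirrors the rule that no two tuples in a relation are identical.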
Characteristics of Relations
Ordering of Tuples in a Relation
Ordering of Values within a Tuple
Values and NULLs in the Tuples
Interpretation (Meaning) of a Relation
1. UPDATE Operation
The `UPDATE` statement is used to modify existing records in a table.
Syntax:
    UPDATE table_name
    SET column1 = value1, column2 = value2, ...
    WHERE condition;
Example:
    UPDATE employees
    SET salary = 5000
    WHERE employee_id = 101;
This command updates the `salary` of the employee with `employee_id` 101 to 5000.
2. INSERT Operation
The `INSERT` statement is used to add new records to a table.
Syntax:
    INSERT INTO table_name (column1, column2, ...)
    VALUES (value1, value2, ...);
Example:
    INSERT INTO employees (employee_id, first_name, last_name, salary)
    VALUES (102, 'John', 'Doe', 4500);
This command inserts a new record into the `employees` table with specified values.
3. DELETE Operation
The `DELETE` statement is used to remove existing records from a table.
Syntax:
    DELETE FROM table_name
    WHERE condition;
Example:
    DELETE FROM employees
    WHERE employee_id = 103;
This command deletes the record of the employee with `employee_id` 103 from the
`employees` table.
4. ALTER Operation
The `ALTER` statement is used to modify the structure of an existing table.
Syntax:
    ALTER TABLE table_name
    ADD column_name datatype;
Example:
    ALTER TABLE employees
    ADD date_of_birth DATE;
This command adds a new column `date_of_birth` of type `DATE` to the `employees` table.
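The four example statements above can be run end to end with Python's built-in sqlite3 module. This is a minimal sketch: the starting `employees` table and its two seed rows (employees 101 and 103) are assumptions added so that the UPDATE and DELETE have something to act on.

```python
import sqlite3


def run_examples():
    """Run the UPDATE, INSERT, DELETE, and ALTER examples on a small table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees ("
        " employee_id INTEGER PRIMARY KEY,"
        " first_name TEXT,"
        " last_name TEXT,"
        " salary INTEGER)"
    )
    # Seed rows (assumed) so that employees 101 and 103 exist.
    conn.executemany(
        "INSERT INTO employees VALUES (?, ?, ?, ?)",
        [(101, "Asha", "Rao", 4000), (103, "Ravi", "Shah", 3800)],
    )

    # 1. UPDATE: change the salary of employee 101.
    conn.execute("UPDATE employees SET salary = 5000 WHERE employee_id = 101")
    # 2. INSERT: add employee 102.
    conn.execute(
        "INSERT INTO employees (employee_id, first_name, last_name, salary) "
        "VALUES (102, 'John', 'Doe', 4500)"
    )
    # 3. DELETE: remove employee 103.
    conn.execute("DELETE FROM employees WHERE employee_id = 103")
    # 4. ALTER: add a new column; existing rows get NULL in it.
    conn.execute("ALTER TABLE employees ADD date_of_birth DATE")
    conn.commit()

    rows = conn.execute(
        "SELECT employee_id, salary, date_of_birth "
        "FROM employees ORDER BY employee_id"
    ).fetchall()
    # PRAGMA table_info lists the columns; index 1 holds the column name.
    columns = [info[1] for info in conn.execute("PRAGMA table_info(employees)")]
    conn.close()
    return rows, columns


rows, columns = run_examples()
print(rows)
print(columns)
```

UPDATE, INSERT, and DELETE change the rows of the table, while ALTER changes the table's structure.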
Transaction in DBMS
A transaction in a Database Management System (DBMS) is a sequence of one or more SQL
operations that are executed as a single unit of work. Transactions ensure that a series of operations
are performed completely and correctly, maintaining the integrity of the database. Transactions are
crucial for maintaining consistency, especially in systems where multiple operations must be
executed in sequence without interruption.
1. Atomicity: Ensures that a transaction is treated as a single unit: either all of its
operations are completed or none are. If any operation fails, the entire transaction is rolled back.
2. Consistency: Ensures that a transaction brings the database from one valid state to
another valid state, maintaining all predefined rules, such as constraints.
3. Isolation: Ensures that operations in a transaction are isolated from those in other
transactions. Intermediate transaction results are not visible to other transactions until the
transaction is committed.
4. Durability: Ensures that once a transaction is committed, its changes are permanent and
will survive system failures.
Transaction Commands
Example of a Transaction
Consider a banking system where we need to transfer money from one account to
another.
This operation requires multiple steps:
Deduct money from the sender's account.
Add money to the receiver's account.
These operations must be executed as a single transaction to ensure consistency.
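The transfer can be sketched with Python's built-in sqlite3 module, where using the connection as a context manager commits when the block succeeds and rolls back when it raises. The `accounts` table, the starting balances, and the function names are assumptions made for this example; the `CHECK (balance >= 0)` rule is added here so that an overdraft makes the transaction fail.

```python
import sqlite3


def make_bank():
    """Create two accounts with assumed starting balances."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " account_id INTEGER PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 500), (2, 100)])
    conn.commit()
    return conn


def transfer(conn, sender, receiver, amount):
    """Move money between accounts as one transaction; return True on commit."""
    try:
        with conn:  # COMMIT if the block finishes, ROLLBACK if it raises
            # Step 1: deduct money from the sender's account.
            cur = conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE account_id = ?",
                (amount, sender),
            )
            if cur.rowcount != 1:
                raise LookupError("sender account not found")
            # Step 2: add money to the receiver's account.
            cur = conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE account_id = ?",
                (amount, receiver),
            )
            if cur.rowcount != 1:
                raise LookupError("receiver account not found")
        return True
    except (sqlite3.IntegrityError, LookupError):
        return False


def balances(conn):
    """Return {account_id: balance} for all accounts."""
    return dict(conn.execute("SELECT account_id, balance FROM accounts"))


conn = make_bank()
print(transfer(conn, 1, 2, 200), balances(conn))
print(transfer(conn, 1, 99, 50), balances(conn))  # receiver missing: deduction undone
```

In the second call the deduction in step 1 has already run when step 2 fails, and the rollback undoes it, so the balances are unchanged.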
The COMPANY database keeps track of a company's employees, departments, and projects.
The company is organized into departments. Each department has a unique name, a
unique number, and a particular employee who manages the department. We keep track
of the start date when that employee began managing the department. A department may
have several locations.
A department controls a number of projects, each of which has a unique name, a unique
number, and a single location.
The database will store each employee's name, Social Security number, address, salary,
sex (gender), and birth date. An employee is assigned to one department, but may work
on several projects, which are not necessarily controlled by the same department. It is
required to keep track of the current number of hours per week that an employee works
on each project, as well as the direct supervisor of each employee (who is another
employee).
The database will keep track of the dependents of each employee for insurance purposes,
including each dependent's first name, sex, birth date, and relationship to the employee.
Outline an ER diagram for the scenario given above.
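One possible outline, written as plain Python data rather than a drawing, is sketched below. It is only one reading of the scenario: the relationship names (such as WORKS_FOR and MANAGES), the attribute names, and the cardinality ratios are assumptions inferred from the description, not a definitive answer.

```python
# Entity types and their attributes, read off the scenario text.
ENTITY_TYPES = {
    "EMPLOYEE": ["Ssn", "Name", "Address", "Salary", "Sex", "Birth_date"],
    "DEPARTMENT": ["Name", "Number", "Locations"],  # Locations is multivalued
    "PROJECT": ["Name", "Number", "Location"],
    "DEPENDENT": ["First_name", "Sex", "Birth_date", "Relationship"],
}

# (relationship, entity A, entity B, cardinality A:B, relationship attributes)
RELATIONSHIP_TYPES = [
    ("MANAGES", "EMPLOYEE", "DEPARTMENT", "1:1", ["Start_date"]),
    ("WORKS_FOR", "EMPLOYEE", "DEPARTMENT", "N:1", []),
    ("CONTROLS", "DEPARTMENT", "PROJECT", "1:N", []),
    ("WORKS_ON", "EMPLOYEE", "PROJECT", "M:N", ["Hours"]),
    ("SUPERVISION", "EMPLOYEE", "EMPLOYEE", "1:N", []),  # supervisor : supervisee
    ("DEPENDENTS_OF", "EMPLOYEE", "DEPENDENT", "1:N", []),
]


def undeclared_entities():
    """Return entity names used by a relationship but missing from ENTITY_TYPES."""
    used = {e for _, a, b, _, _ in RELATIONSHIP_TYPES for e in (a, b)}
    return sorted(used - set(ENTITY_TYPES))


def outline():
    """Return the outline as lines of text, one per entity and relationship."""
    lines = [f"{name}({', '.join(attrs)})" for name, attrs in ENTITY_TYPES.items()]
    for name, a, b, ratio, attrs in RELATIONSHIP_TYPES:
        extra = f" [{', '.join(attrs)}]" if attrs else ""
        lines.append(f"{a} --{name} ({ratio})--> {b}{extra}")
    return lines


print("\n".join(outline()))
```

DEPENDENT would usually be drawn as a weak entity type, since a dependent is identified only together with the owning employee.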