Dbms & SQL Notes

Uploaded by

Hyper Beast
Copyright
© © All Rights Reserved

DBMS & SQL NOTES

Database: A database is a collection of related data that represents some aspect of the real world. It is
designed, built, and populated with data for a specific purpose.

Database Management System: Software created to safely store and retrieve user data is
called a database management system (DBMS). It includes a group of applications in charge
of managing database operations. The DBMS responds to applications' requests for data and
interacts with the operating system to retrieve the needed information. A DBMS assists users
and outside software in the storage and retrieval of data in complex systems.

Database management systems were created to address the following challenges commonly
encountered in traditional file-processing systems that rely on conventional operating
systems:

1) Eliminating redundancy and maintaining data consistency

2) Streamlining data access

3) Unifying data, which may exist in multiple files and formats

4) Ensuring data integrity

5) Guaranteeing the atomicity of updates

6) Enabling concurrent access by multiple users

7) Enhancing security measures

ER diagram:

●The ER diagram, short for Entity Relationship diagram, serves as a conceptual model that visually
portrays the logical architecture of a database.

●This diagram illustrates the various constraints and connections that interlink different elements
within the database.

●An ER diagram primarily consists of three key components: Entity Sets, Attributes, and Relationship
Sets.
[Figure: ER Diagram]

● Roll_No. can serve as the primary key because its value is unique for each entity.

Entity Set:

● An entity set comprises entities of the same type.

Strong Entity Set:


A strong entity set possesses enough attributes to uniquely identify each of its entities.
In essence, a primary key is defined for a strong entity set.
The primary key for a strong entity set is indicated by underlining it.
Weak Entity Set:
A weak entity set lacks sufficient attributes to identify its entities uniquely on its own.
In other words, it has no primary key of its own and depends on an owner (identifying) strong entity set.
However, it includes a partial key referred to as a discriminator.
The discriminator distinguishes the weak entities that belong to the same owner entity; combined with
the owner's primary key, it identifies each weak entity uniquely.
The discriminator is denoted by underlining with a dashed line.

Relationship:

A relationship is a connection established among multiple entities.

Unary Relationship Set: In a unary relationship set, only one entity set is involved.
Binary Relationship Set: A binary relationship set comprises two entity sets engaged in the
relationship.
Ternary Relationship Set: A ternary relationship set involves three entity sets in the relationship.
N-ary Relationship Set: An N-ary relationship set encompasses 'n' entity sets participating in
the relationship.

Cardinality Constraint:

Cardinality constraints establish the maximum number of instances of a relationship in which an entity
can participate.
One-to-One Cardinality: An entity in set A can be linked to at most one entity in set B, and vice
versa.
One-to-Many Cardinality: An entity in set A can be linked to any number (zero or more) of entities in
set B, but an entity in set B can be linked to at most one entity in set A.
Many-to-One Cardinality: An entity in set A can be linked to at most one entity in set B, while an
entity in set B can be linked to any number of entities in set A.
Many-to-Many Cardinality: An entity in set A can be linked to any number (zero or more) of entities in
set B, and conversely, an entity in set B can be linked to any number (zero or more) of entities in set A.

Attributes:

Attributes are the characteristic properties that are possessed by each entity within an Entity Set.

Types of Attributes:

Simple Attributes: Simple attributes are indivisible and cannot be further broken down. For example,
"Age."

Composite Attributes: Composite attributes are made up of several other simple attributes. For
instance, "Name" may consist of "First Name" and "Last Name," and "Address" could include "Street,"
"City," and "Zip Code."

Multi-Valued Attributes: Multi-valued attributes can hold more than one value for a given entity
within an entity set. Examples include "Mobile Number" (which can have multiple phone numbers)
and "Email ID" (which can have multiple email addresses).

Derived Attributes: Derived attributes are values that can be calculated or obtained from other
attributes. For instance, "Age" can be derived from the "Date of Birth."

Key Attributes: Key attributes are those attributes that can uniquely identify an entity within an entity
set. An example is "Roll No.," which can distinguish one student from another within a student entity
set.

Constraints:

Relational constraints are rules that govern the database's content and operations, ensuring the
accuracy of the data within it.

Domain Constraint: The domain constraint defines the acceptable range or set of values for an
attribute. It dictates that the attribute's value must be a valid atomic value within its predefined
domain.
Tuple Uniqueness Constraint: The tuple uniqueness constraint mandates that all tuples within
a relation be distinct: no two tuples may have the same values for every attribute.
Key Constraint: In the key constraint, all values of the primary key must be unique, and they
cannot be null. This constraint enforces the uniqueness and non-null property of the primary key.
Entity Integrity Constraint: The entity integrity constraint specifies that no attribute belonging to
the primary key can contain a null value in any relation, ensuring the integrity of the primary key.
Referential Integrity Constraint: The referential integrity constraint requires that all values in a
foreign key must either match values found in the corresponding primary key's relation or be null,
maintaining data consistency between related tables.

Closure of an Attribute Set:

The closure of an attribute set consists of all attributes that can be deduced or functionally determined
from that specific attribute set.
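
The closure definition above can be turned into a short fixed-point loop. The following is a minimal sketch, assuming functional dependencies are given as (left side, right side) pairs of attribute sets; the function name closure and the sample relation R(A, B, C, D) are made up for illustration.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under the functional dependencies `fds`.

    `fds` is a list of (lhs, rhs) pairs, where each side is a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If every attribute on the left is already known, the right side follows.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# Hypothetical relation R(A, B, C, D) with A -> B and B -> C.
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(sorted(closure({"A"}, fds)))       # ['A', 'B', 'C']
print(sorted(closure({"A", "D"}, fds)))  # ['A', 'B', 'C', 'D']
```

Here A alone does not reach D, so {A} is not a key of R, while {A, D} determines every attribute.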

Keys:

A key is a collection of attributes capable of uniquely identifying each tuple within a given relation.

Super Key: A superkey is a set of attributes that can uniquely identify each tuple in the relation. It
can comprise any number of attributes.
Candidate Key: A candidate key is a minimal set of attributes that can uniquely identify each
tuple in the relation. It is a key candidate for being the primary key.
Primary Key: The primary key is a candidate key chosen by the database designer during the
database design process. It must be unique and cannot contain null values.
Alternate Key: Alternate keys are candidate keys within a database table that remain unused or
unimplemented once the primary key has been established.
Foreign Key: A foreign key is an attribute 'X' whose values must match values of an attribute 'Y'
that is the primary key (or another candidate key) of some relation. The table containing attribute 'Y'
is referred to as the referenced relation, while the table containing attribute 'X' is known as the
referencing relation.
Composite Key: A composite key is a key formed by combining multiple attributes rather
than using a single attribute.
Unique Key: A unique key is a constraint ensuring that all records within a table have distinct
values for this key. Unlike a primary key, it may allow NULL values, and a table can have more than
one unique key. Its values can be updated as long as they remain unique.
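
To make the difference between a super key and a candidate key concrete, here is a brute-force sketch that tests every attribute subset, smallest first. It assumes dependencies are given as (left side, right side) pairs of sets; the helper names closure, is_superkey and candidate_keys and the relation R(A, B, C, D) are illustrative only, and the search is exponential, so it only suits small examples.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, all_attrs, fds):
    # A super key determines every attribute of the relation.
    return closure(attrs, fds) == set(all_attrs)


def candidate_keys(all_attrs, fds):
    """Return all minimal super keys, trying smaller subsets first."""
    keys = []
    ordered = sorted(all_attrs)
    for size in range(1, len(ordered) + 1):
        for combo in combinations(ordered, size):
            candidate = set(combo)
            # A superset of a key found earlier is a super key but not minimal.
            if any(key <= candidate for key in keys):
                continue
            if is_superkey(candidate, all_attrs, fds):
                keys.append(candidate)
    return keys


# Hypothetical relation R(A, B, C, D) with A -> B and B -> C.
R = {"A", "B", "C", "D"}
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(candidate_keys(R, fds))                 # [{'A', 'D'}] (set order may vary)
print(is_superkey({"A", "B", "D"}, R, fds))   # True, but not minimal
```

In this example {A, B, D} is a super key, yet only {A, D} is a candidate key, and it would be the natural choice of primary key.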

Functional Dependency:

In any relation, a functional dependency α → β holds when every pair of tuples that agree on the values
of α also agree on the values of β.
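
As a small illustration of this definition, the sketch below checks whether a dependency holds in one particular table instance. The function name fd_holds and the student rows are invented for the example. Note that an instance can only disprove a dependency; whether it holds in general is a design decision about the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in `rows` (a list of dicts)."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # setdefault returns the right-hand value first stored for this left-hand value.
        if seen.setdefault(left, right) != right:
            return False
    return True


# Hypothetical sample data.
students = [
    {"roll_no": 1, "name": "Asha", "dept": "CS"},
    {"roll_no": 2, "name": "Ravi", "dept": "CS"},
    {"roll_no": 3, "name": "Asha", "dept": "EE"},
]

print(fd_holds(students, ["roll_no"], ["name"]))  # True
print(fd_holds(students, ["name"], ["dept"]))     # False: "Asha" maps to CS and EE
```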

Trivial Functional Dependencies:

A functional dependency X → Y is considered trivial if Y is entirely contained within X (Y ⊆ X).
So, if everything that Y consists of is already included in X, it's called a trivial functional
dependency. For example, {A, B} → A is trivial.

Non-Trivial Functional Dependencies:

A functional dependency X → Y is considered non-trivial if Y contains at least one attribute that is
not found in X.
In other words, if there's at least one thing in Y that is not already in X, it's called a non-trivial
functional dependency.

Decomposition of a Relation:

Breaking a single table into two or more smaller tables is what we mean by the decomposition of a
relation.

Properties of Decomposition:

Lossless Decomposition:

Lossless decomposition means that when we break a big table into smaller ones, we don't lose
any information.
When we put those smaller tables back together (join them), we get the same big table we
started with.

Dependency Preservation:

Dependency preservation means that every functional dependency of the original table can still be
enforced after splitting the table.
Each dependency must either hold within one of the smaller tables or follow from the dependencies
that do, so it can be checked without joining the tables back together.

Types of Decomposition:

Lossless Join Decomposition:

Imagine we have a big table (relation) called R, and we break it into smaller tables R1, R2, ...,
Rn.
This decomposition is considered lossless join when combining (joining) the sub-tables gives us
the exact same big table R that we decomposed.
In a lossless join decomposition, we always have this relationship: R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn =
R, where ⋈ represents a natural join operator.

Lossy Join Decomposition:

Again, suppose we have a big table (relation) called R, and we break it into smaller tables R1,
R2, ..., Rn.
This decomposition is called lossy join when combining (joining) the sub-tables does not give us
the exact same big table R that we decomposed: the join contains extra (spurious) tuples.
For lossy join decomposition, we have this relationship: R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn ⊃ R, where ⋈
is the natural join operator.

In simple terms, lossless means you get back the original table when you join the smaller tables,
while lossy means you might not get the exact original table when you join them.
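
The difference between the two kinds of decomposition can be seen on a tiny table. The sketch below is only an illustration: the relation R(A, B, C), its two rows, and the helpers project, natural_join and as_set are all made up for the example. Splitting on B (which repeats) gives a lossy join; splitting on A (which is unique) gives a lossless one.

```python
def project(rows, attrs):
    """Project rows onto attrs and drop duplicate tuples."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(seen)]


def natural_join(left, right):
    """Join two lists of dicts on all attribute names they share."""
    if not left or not right:
        return []
    common = [a for a in left[0] if a in right[0]]
    joined = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in common):
                joined.append({**l, **r})
    return joined


def as_set(rows):
    """Order-independent view of a relation, for comparisons."""
    return {tuple(sorted(row.items())) for row in rows}


# Hypothetical relation R(A, B, C): A is unique, B repeats.
R = [
    {"A": 1, "B": "x", "C": 10},
    {"A": 2, "B": "x", "C": 20},
]

# Decomposing into (A, B) and (B, C): the shared attribute B is not a key of either part.
lossy = natural_join(project(R, ["A", "B"]), project(R, ["B", "C"]))

# Decomposing into (A, B) and (A, C): the shared attribute A is a key.
lossless = natural_join(project(R, ["A", "B"]), project(R, ["A", "C"]))

print(len(lossy))                     # 4 tuples: 2 original + 2 spurious
print(as_set(lossless) == as_set(R))  # True
```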
Normalization

In a database management system (DBMS), normalization is the process of ensuring that the
database is well-organized by:

Minimizing Redundancy: Making sure that data isn't duplicated unnecessarily, which helps
save storage space and reduces the risk of inconsistencies.
Preserving Data Integrity: Ensuring that the data remains accurate and consistent by
decomposing it in a way that doesn't lead to information loss.

Normal Forms:

First Normal Form (1NF):

A table is in 1NF when every cell in the table holds a single piece of data, which means it's either
a single value or a NULL.

Second Normal Form (2NF):

For a table to be in 2NF:


It must already be in 1NF.
It should not have partial dependencies, which means no non-key attribute depends on only
part of a candidate key; it must depend on the whole key.

Third Normal Form (3NF):

For a table to be in 3NF:


It must already be in 2NF.
It should not have transitive dependencies, meaning a non-key attribute should not depend on
the key only through another non-key attribute.

Boyce-Codd Normal Form (BCNF):

For a table to be in BCNF:


It must already be in 3NF.
Every non-trivial functional dependency 'A → B' should have 'A' as a super key, meaning 'A'
uniquely identifies rows in the table.

In simpler terms, each of these normal forms is a set of rules for organizing data in a way that
minimizes redundancy and ensures that data is logically structured to avoid problems like partial
dependencies and transitive dependencies.
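
The BCNF rule ("the left side of every non-trivial dependency is a super key") is easy to check mechanically. Below is a minimal sketch, assuming dependencies are (left side, right side) pairs of sets; the function names and the student/course/instructor relation are illustrative, chosen because it is a common textbook case of a table that is in 3NF but not in BCNF.

```python
def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the given dependencies whose left side is not a super key."""
    violations = []
    for lhs, rhs in fds:
        trivial = rhs <= lhs
        if not trivial and closure(lhs, fds) != set(all_attrs):
            violations.append((lhs, rhs))
    return violations


# Hypothetical relation: each instructor teaches one course,
# and a student has one instructor per course.
R = {"student", "course", "instructor"}
fds = [
    ({"student", "course"}, {"instructor"}),
    ({"instructor"}, {"course"}),
]

print(bcnf_violations(R, fds))  # [({'instructor'}, {'course'})]
```

The dependency instructor → course violates BCNF because instructor alone does not determine student, so it is not a super key.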

Transaction:

A transaction is like a single bundle of tasks that need to be done together.

In a transaction, you have two main types of actions:

Read Operation:

When you do a "Read(A)" instruction, you're grabbing the value of 'A' from the database
and keeping it in a temporary storage area in the computer's memory.

Write Operation:

On the other hand, a "Write(A)" operation means you're taking the updated value of 'A'
from that temporary storage area and saving it back into the database, essentially making
a change.

Transaction States:

Active State:

The transaction starts in the active state.


In this state, the transaction's instructions are being carried out, and any changes it makes are
temporarily stored in the computer's memory.

Partially Committed State:

After all the instructions of the transaction have been executed, it moves to the partially
committed state.
It's called "partially committed" because the changes made by the transaction are still in the
computer's memory, not in the actual database yet.

Committed State:

Once all the changes made by the transaction have been successfully saved into the database, it
enters the committed state.
Now, the transaction is fully committed, and its changes are permanent in the database.

Failed State:
If something goes wrong during the transaction's execution in either the active or partially
committed state, and it's impossible to continue, it enters the failed state.

Aborted State:

After a transaction has failed, it needs to undo the changes it made.


To do this, it rolls back or undoes all the changes, and it enters the aborted state once this
process is complete.

Terminated State:

The final state in the transaction's life cycle is the terminated state.
It reaches this state after either successfully committing or after being aborted due to a failure.
In the terminated state, the transaction's life cycle concludes.

ACID Properties:

The term "ACID" is an acronym that stands for Atomicity, Consistency, Isolation, and Durability. It was
coined to represent a set of properties that are crucial for maintaining the reliability and integrity of
data in a database system.

Atomicity:

Atomicity means that a transaction is like an all-or-nothing deal.


It ensures that a transaction is completed in its entirety or not at all, avoiding partial transactions.

Consistency:

Consistency ensures that the database remains in a valid state before and after a transaction.
It means that the integrity rules and constraints are always maintained.

Isolation:

Isolation allows multiple transactions to happen at the same time without causing issues.
It ensures that the final result of all the transactions is the same as if they were executed one
after the other.

Durability:

Durability makes sure that once a transaction successfully makes changes, those changes are
permanent.
Even in the face of failures, like system crashes, the changes should never be lost and should be
saved on the disk.
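
Atomicity and consistency can be observed with a small money-transfer experiment. This is a sketch using Python's built-in sqlite3 module with an in-memory database; the accounts table, the names, and the transfer function are invented for the example. The CHECK constraint plays the role of a consistency rule, and the failed transfer is rolled back as a whole.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit is undone too.
        return False


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice, so nothing changes
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)  # True False {'alice': 70, 'bob': 80}
```

In the failed transfer the credit to bob had already been applied when the debit was rejected, yet bob's balance is unchanged afterwards: that is the all-or-nothing behaviour described under Atomicity.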

Schedules:

A schedule is the arrangement or sequence in which the operations of multiple transactions are
organized for execution.

Serial Schedules:

In serial schedules, transactions run one after the other, with no overlap.
When one transaction is running, no other transaction can execute.
Serial schedules are always consistent, recoverable, cascadeless, and strict.

Non-Serial Schedules:

Non-serial schedules involve multiple transactions running concurrently.


Their operations are interleaved or mixed.
Non-serial schedules may not always maintain consistency, recoverability, cascadelessness, or
strictness.

Serializability:

Serializability helps determine which non-serial schedules are correct and maintain database
consistency.
Serializable schedules are those where a non-serial schedule of 'n' transactions is equivalent to
some serial schedule of 'n' transactions.
Serializable schedules are always consistent, but they may or may not be recoverable, cascadeless,
or strict.

Types of Serializability:

Conflict Serializability:

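A schedule is conflict serializable if we can turn it into a serial schedule by swapping adjacent
operations that don't conflict with each other. Two operations conflict when they belong to different
transactions, access the same data item, and at least one of them is a write.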
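
A common way to test conflict serializability is to build a precedence graph (an edge Ti → Tj for every conflicting pair where Ti's operation comes first) and check it for cycles. The sketch below assumes a schedule is written as a list of (transaction, operation, item) triples with operations "R" and "W"; this encoding and the function name are illustrative choices.

```python
def is_conflict_serializable(schedule):
    """schedule: list of (transaction, operation, item), operation is 'R' or 'W'."""
    # Build the precedence graph from every conflicting pair of operations.
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "W" in (op1, op2):
                edges.add((t1, t2))

    # The graph is acyclic iff we can keep removing nodes with no incoming edge.
    nodes = {t for t, _, _ in schedule}
    while nodes:
        free = [n for n in nodes
                if not any(v == n and u in nodes for u, v in edges)]
        if not free:
            return False  # every remaining node is on a cycle
        nodes -= set(free)
    return True


# T2 writes A between T1's read and write of A: edges T1 -> T2 and T2 -> T1.
cyclic = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]

# Interleaved, but the transactions touch different items: no conflicts at all.
independent = [("T1", "R", "A"), ("T2", "R", "B"), ("T1", "W", "A"), ("T2", "W", "B")]

print(is_conflict_serializable(cyclic))       # False
print(is_conflict_serializable(independent))  # True
```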

View Serializability:

A schedule is view serializable if it is view equivalent to some serial schedule: in both schedules,
each transaction reads the same initial values, reads values written by the same transactions, and the
same transaction performs the final write on each data item.

Non-Serializable Schedules:

A non-serial schedule that is not equivalent to any serial schedule is called non-serializable.
Non-serializable schedules may or may not produce the same results as a serial schedule on a
consistent database.
They may or may not be consistent or recoverable.

Irrecoverable Schedules:

An irrecoverable schedule is one where a transaction reads from another transaction that hasn't
committed yet and then commits before the other transaction finishes.

Recoverable Schedules:

A recoverable schedule is when a transaction reads from another transaction that hasn't
committed yet, but the committing of the reading transaction is delayed until the other transaction
either commits or rolls back.
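
The two definitions above differ only in the order of the commits, which a short checker can make explicit. This is a simplified sketch: schedules are lists of (transaction, operation, item) triples with operations "R", "W" and "C" (commit), aborts are ignored, and the function name is_recoverable is made up for the example.

```python
def is_recoverable(schedule):
    """Return True if no transaction commits before a transaction it read from."""
    last_writer = {}     # item -> transaction that wrote it most recently
    reads_from = set()   # (reader, writer) pairs for reads of uncommitted data
    committed = set()
    for txn, op, item in schedule:
        if op == "W":
            last_writer[item] = txn
        elif op == "R":
            writer = last_writer.get(item)
            if writer is not None and writer != txn and writer not in committed:
                reads_from.add((txn, writer))
        elif op == "C":
            # Every transaction this one read uncommitted data from must commit first.
            if any(r == txn and w not in committed for r, w in reads_from):
                return False
            committed.add(txn)
    return True


# T2 reads T1's uncommitted write and commits first: irrecoverable.
irrecoverable = [("T1", "W", "A"), ("T2", "R", "A"), ("T2", "C", None), ("T1", "C", None)]

# Same reads and writes, but T2's commit is delayed until T1 has committed.
recoverable = [("T1", "W", "A"), ("T2", "R", "A"), ("T1", "C", None), ("T2", "C", None)]

print(is_recoverable(irrecoverable))  # False
print(is_recoverable(recoverable))    # True
```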

Types of Recoverable Schedules:

Cascading Schedule: If one transaction's failure causes multiple dependent transactions to
roll back or abort, it's called a Cascading Schedule or Cascading Rollback or Cascading Abort.
Cascadeless Schedule: In a Cascadeless Schedule, a transaction is not allowed to read a data
item until the last transaction that has written it is committed or aborted.
Strict Schedule: In a Strict Schedule, a transaction is neither allowed to read nor write a data
item until the last transaction that has written it is committed or aborted.

Relational Algebra:

Relational Algebra is a procedural query language that operates on relations: each operation takes one
or two relations as input and produces another relation as output.
File Structure:

Primary Index:

A primary index is an ordered list of entries with two parts, built on a data file that is sorted on
its primary key.
The first part contains primary key values from the data file.
The second part is a pointer to the data block where the matching record is stored.
The number of block accesses needed to find a record through this index can be estimated with a formula:
block accesses = ⌈log2(Bi)⌉ + 1, where Bi is the number of index blocks: a binary search over the
index blocks, plus one access to read the data block.
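
As a worked example of the formula, the sketch below computes the number of index blocks and then the block accesses. All the sizes (30,000 records of 100 bytes, 1,024-byte blocks, 15-byte index entries) are made-up numbers, and it assumes the index has one entry per data block and that the logarithm is rounded up to a whole number of accesses.

```python
import math


def index_block_accesses(index_blocks):
    """Binary search over the index blocks, plus one access for the data block."""
    return math.ceil(math.log2(index_blocks)) + 1


# Hypothetical file and index sizes.
block_size = 1024        # bytes per block
record_size = 100        # bytes per data record
num_records = 30_000
index_entry_size = 15    # bytes per index entry (key + block pointer)

records_per_block = block_size // record_size                # 10
data_blocks = math.ceil(num_records / records_per_block)     # 3000
entries_per_block = block_size // index_entry_size           # 68
index_blocks = math.ceil(data_blocks / entries_per_block)    # 45 (one entry per data block)

print(index_block_accesses(index_blocks))  # 7
```

Without the index, a binary search directly on the 3,000 data blocks would need about ⌈log2(3000)⌉ = 12 block accesses.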

Clustering Index:

A clustering index is made for a data file where the records are physically ordered on a non-key
field, that is, a field whose values may repeat across records.

Secondary Index:

A secondary index provides an additional way to search a file that already has a primary access path.
It is built on a field other than the one the file is ordered on, so records can also be found
efficiently by that field.

B-Trees:

B-Trees organize data in a tree structure, where each level has keys and data pointers.
Data pointers point to blocks or records.
Key properties include:
The root can have children between 2 and P, where P is the order of the tree (maximum
children a node can have).
Internal nodes can have children between ⌈P/2⌉ and P.
Internal nodes can have keys between ⌈P/2⌉ – 1 and P – 1.
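
The bounds listed above are simple arithmetic on the order P. As a quick sketch (the function name btree_bounds and the sample order 5 are arbitrary):

```python
import math


def btree_bounds(p):
    """Child and key limits for a B-Tree of order p (maximum children per node)."""
    min_children = math.ceil(p / 2)
    return {
        "root_children": (2, p),
        "internal_children": (min_children, p),
        "internal_keys": (min_children - 1, p - 1),
    }


print(btree_bounds(5))
# {'root_children': (2, 5), 'internal_children': (3, 5), 'internal_keys': (2, 4)}
```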

B+ Trees:
B+ Trees also use a tree structure but have different structures for leaf and non-leaf nodes.
The order of non-leaf nodes is higher compared to leaf nodes.
Searching is faster in B+ Trees because non-leaf nodes don't have record pointers, reducing the
tree's depth and improving search efficiency.

SQL

DDL (Data Definition Language):

DDL stands for Data Definition Language.


It's about defining how data should be structured in a database.
Key DDL commands include:
CREATE: Used to create a database and its objects like tables, indexes, views, stored
procedures, functions, and triggers.
ALTER: Used to modify the structure of an existing database.
DROP: Used to delete objects from the database.
TRUNCATE: Removes all records from a table and releases the space allocated for them, while keeping the table itself.
RENAME: Renames an object in the database.

DML (Data Manipulation Language):

DML stands for Data Manipulation Language.


It deals with manipulating data in the database.
Key DML commands include:
SELECT: Retrieves data from a database.
INSERT: Adds data into a table.
UPDATE: Modifies existing data within a table.
DELETE: Removes records from a table.
MERGE: Performs an UPSERT operation, which means insert or update.

DCL (Data Control Language):

DCL stands for Data Control Language.


It's about controlling rights, permissions, and other security aspects of the database.
Key DCL commands include:
GRANT: Grants access privileges to users for the database.
REVOKE: Withdraws access privileges previously granted using the GRANT command.

TCL (Transaction Control Language):

TCL stands for Transaction Control Language.


It manages transactions within a database.
Key TCL commands include:
COMMIT: Confirms and completes a transaction.
ROLLBACK: Reverts a transaction in case of an error.
SAVEPOINT: Sets points within a transaction to allow for partial rollbacks.
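
The three TCL commands can be tried out with Python's built-in sqlite3 module. This is a sketch only: the log table and the savepoint name after_step_1 are invented, and the connection is opened with isolation_level=None so that BEGIN, COMMIT and ROLLBACK are issued explicitly rather than by the driver.

```python
import sqlite3

# isolation_level=None: the driver does not open transactions on its own.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE log (msg TEXT)")

# Partial rollback: undo only the work done after the savepoint.
conn.execute("BEGIN")
conn.execute("INSERT INTO log VALUES ('step 1')")
conn.execute("SAVEPOINT after_step_1")
conn.execute("INSERT INTO log VALUES ('step 2')")
conn.execute("ROLLBACK TO SAVEPOINT after_step_1")  # discards 'step 2' only
conn.execute("COMMIT")                              # makes 'step 1' permanent

# Full rollback: the whole transaction is discarded.
conn.execute("BEGIN")
conn.execute("INSERT INTO log VALUES ('step 3')")
conn.execute("ROLLBACK")

rows = [row[0] for row in conn.execute("SELECT msg FROM log")]
print(rows)  # ['step 1']
```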

SQL (Structured Query Language):

SQL is a standard language for managing and retrieving data in databases.


SELECT: The SELECT statement retrieves data from a database.
Syntax: SELECT column1, column2, ... FROM table_name;
You can select specific fields by specifying them or use SELECT * to select all fields.
Example: SELECT CustomerName, City FROM Customers;

SELECT DISTINCT:

The SELECT DISTINCT statement is used to retrieve only distinct values; when several columns are listed, each distinct combination of values is returned once.
Syntax: SELECT DISTINCT column1, column2, ... FROM table_name;
For example, you can use it to get a list of distinct countries from the "Customers" table: SELECT
DISTINCT Country FROM Customers;

WHERE:

The WHERE clause is used to filter records based on a specified condition.


Syntax: SELECT column1, column2, ... FROM table_name WHERE condition;
For instance, you can use it to retrieve all customer records from the "Customers" table where
the country is 'Mexico': SELECT * FROM Customers WHERE Country='Mexico';

<> means Not equal

AND, OR, and NOT:

In SQL, the WHERE clause can be combined with operators like AND, OR, and NOT.
These operators help filter records based on multiple conditions:
AND: Displays a record if all the conditions separated by AND are TRUE.
OR: Displays a record if any of the conditions separated by OR is TRUE.
NOT: Displays a record if the condition is NOT TRUE.

Syntax:

-- AND

SELECT column1, column2, ...
FROM table_name
WHERE condition1 AND condition2 AND ...;

-- OR

SELECT column1, column2, ...
FROM table_name
WHERE condition1 OR condition2 OR ...;

-- NOT

SELECT column1, column2, ...
FROM table_name
WHERE NOT condition;

ORDER BY:

The ORDER BY keyword is used to sort the result-set in ascending or descending order.
By default, it sorts in ascending order, but you can use the DESC keyword to sort in descending
order.
Syntax: SELECT column1, column2, ... FROM table_name ORDER BY column1, column2
ASC|DESC;
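
To see WHERE, AND/OR/NOT and ORDER BY working together, the following sketch runs a few queries against an in-memory SQLite database through Python's sqlite3 module. The Customers rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, City TEXT, Country TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [
        ("Ana", "Mexico City", "Mexico"),
        ("Bert", "Berlin", "Germany"),
        ("Carla", "Monterrey", "Mexico"),
        ("Dmitri", "Munich", "Germany"),
    ],
)

# WHERE with a single condition, sorted ascending (the default).
mexico = conn.execute(
    "SELECT CustomerName FROM Customers WHERE Country = 'Mexico' ORDER BY CustomerName"
).fetchall()

# NOT, sorted descending.
not_mexico = conn.execute(
    "SELECT CustomerName FROM Customers WHERE NOT Country = 'Mexico' "
    "ORDER BY CustomerName DESC"
).fetchall()

# AND combined with OR; parentheses make the grouping explicit.
berlin_or_hamburg = conn.execute(
    "SELECT CustomerName FROM Customers "
    "WHERE Country = 'Germany' AND (City = 'Berlin' OR City = 'Hamburg')"
).fetchall()

print(mexico)             # [('Ana',), ('Carla',)]
print(not_mexico)         # [('Dmitri',), ('Bert',)]
print(berlin_or_hamburg)  # [('Bert',)]
```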

INSERT INTO:

INSERT INTO is used to add new records to a table.


You can specify the columns and their values or insert values in the same order as the table's
columns.
Syntax: INSERT INTO table_name (column1, column2, ...) VALUES (value1, value2, ...);

NULL Value:

NULL values represent missing or undefined data.


To test for NULL values, use the IS NULL and IS NOT NULL operators.
Syntax: SELECT column_names FROM table_name WHERE column_name IS NULL;
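
The reason for IS NULL is that comparing with = NULL never matches anything. A short sketch using Python's sqlite3 module shows this; the table contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, Address TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [("Ana", "12 Main St"), ("Bert", None)],  # Python None is stored as NULL
)

# A comparison with NULL is never true, so this returns no rows.
with_equals = conn.execute(
    "SELECT CustomerName FROM Customers WHERE Address = NULL"
).fetchall()

with_is_null = conn.execute(
    "SELECT CustomerName FROM Customers WHERE Address IS NULL"
).fetchall()

with_is_not_null = conn.execute(
    "SELECT CustomerName FROM Customers WHERE Address IS NOT NULL"
).fetchall()

print(with_equals)       # []
print(with_is_null)      # [('Bert',)]
print(with_is_not_null)  # [('Ana',)]
```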

UPDATE:

The UPDATE statement modifies existing records in a table.


Syntax: UPDATE table_name SET column1 = value1, column2 = value2 WHERE condition;

DELETE:

The DELETE statement removes existing records from a table.
You can specify a condition to delete specific records; omitting the WHERE clause (DELETE FROM
table_name) deletes all rows.
Syntax: DELETE FROM table_name WHERE condition;
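
The INSERT, UPDATE and DELETE statements described above can be tried end to end with Python's sqlite3 module. The table and values below are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, City TEXT)")

# INSERT with an explicit column list, then relying on the table's column order.
conn.execute("INSERT INTO Customers (CustomerName, City) VALUES ('Ana', 'Puebla')")
conn.execute("INSERT INTO Customers VALUES ('Bert', 'Berlin')")

# UPDATE only the rows that match the WHERE condition.
conn.execute("UPDATE Customers SET City = 'Oaxaca' WHERE CustomerName = 'Ana'")
after_update = conn.execute(
    "SELECT CustomerName, City FROM Customers ORDER BY CustomerName"
).fetchall()

# DELETE one row, then every remaining row (no WHERE clause).
conn.execute("DELETE FROM Customers WHERE CustomerName = 'Bert'")
after_delete = conn.execute("SELECT CustomerName, City FROM Customers").fetchall()
conn.execute("DELETE FROM Customers")
remaining = conn.execute("SELECT COUNT(*) FROM Customers").fetchone()[0]

print(after_update)  # [('Ana', 'Oaxaca'), ('Bert', 'Berlin')]
print(after_delete)  # [('Ana', 'Oaxaca')]
print(remaining)     # 0
```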

SELECT TOP:

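SELECT TOP retrieves a specified number of records.
You can use it to limit the number of rows returned by a query.
Syntax: SELECT TOP number [PERCENT] column_name(s) FROM table_name WHERE condition;
SELECT TOP is specific to SQL Server and MS Access; MySQL, PostgreSQL, and SQLite use a LIMIT clause
instead, and Oracle uses FETCH FIRST n ROWS ONLY.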

Aggregate Functions:

Aggregate functions perform calculations on a set of values and return a single value.
Common aggregate functions include MIN, MAX, COUNT, AVG, and SUM.

LIKE Operator:

The LIKE operator is used to search for a specified pattern in a column.


It uses wildcards like % (matches zero, one, or multiple characters) and _ (matches a single
character).
Syntax: SELECT column1, column2, ...

FROM table_name

WHERE column_name LIKE pattern;

IN Operator:

The IN operator allows you to specify multiple values or a subquery in a WHERE clause.
It's a shorthand for multiple OR conditions, making it easier to filter records based on multiple
options.

Syntax: SELECT column_name(s) FROM table_name WHERE column_name IN (value1, value2, ...);

Syntax with a subquery: SELECT column_name(s) FROM table_name WHERE column_name IN (SELECT
column_name FROM another_table WHERE condition);

BETWEEN Operator:

The BETWEEN operator is used to select values within a specified range, and it includes both the
beginning and end values. This range can apply to numbers, text, or dates.

Syntax: SELECT column_name(s)

FROM table_name

WHERE column_name BETWEEN value1 AND value2;
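
The LIKE, IN and BETWEEN operators can be compared side by side on one small table. This sketch uses Python's sqlite3 module; the Products table and its rows are invented for the example. Note that LIKE in SQLite ignores case for ASCII letters by default, which other systems may not do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, Price INTEGER)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("apple", 3), ("apricot", 7), ("banana", 2), ("cherry", 12)],
)


def names(sql):
    """Run a query and return the first column of each row as a list."""
    return [row[0] for row in conn.execute(sql)]


# % matches any run of characters; _ matches exactly one character.
starts_with_ap = names(
    "SELECT ProductName FROM Products WHERE ProductName LIKE 'ap%' ORDER BY ProductName"
)
one_char_then_anana = names(
    "SELECT ProductName FROM Products WHERE ProductName LIKE '_anana'"
)

# IN is shorthand for ProductName = 'apple' OR ProductName = 'cherry'.
in_list = names(
    "SELECT ProductName FROM Products WHERE ProductName IN ('apple', 'cherry') "
    "ORDER BY ProductName"
)

# BETWEEN includes both end values, so prices 2 and 7 qualify.
between = names(
    "SELECT ProductName FROM Products WHERE Price BETWEEN 2 AND 7 ORDER BY ProductName"
)

print(starts_with_ap)       # ['apple', 'apricot']
print(one_char_then_anana)  # ['banana']
print(in_list)              # ['apple', 'cherry']
print(between)              # ['apple', 'apricot', 'banana']
```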

JOIN:
A JOIN clause is used to combine rows from two or more tables based on a related column between
them.

INNER JOIN

INNER JOIN keyword is used to select records that have matching values in both tables.

Syntax: SELECT column_name(s)

FROM table1

INNER JOIN table2

ON table1.column_name = table2.column_name;

LEFT JOIN

The LEFT JOIN keyword is used to retrieve all records from the left table (table1) and the
matching records from the right table (table2). If there is no match in the right table, the row from
the left table is still returned, with NULL in the columns of the right table.
The syntax for a LEFT JOIN is as follows:

SELECT column_name(s) FROM table1 LEFT JOIN table2 ON table1.column_name =
table2.column_name;

RIGHT JOIN

The RIGHT JOIN keyword is used to retrieve all records from the right table (table2) and the
matching records from the left table (table1). If there is no match in the left table, the row from
the right table is still returned, with NULL in the columns of the left table.

Syntax: SELECT column_name(s)

FROM table1

RIGHT JOIN table2

ON table1.column_name = table2.column_name;
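
The difference between INNER JOIN and the outer joins shows up as soon as one table has a row without a match. The sketch below uses Python's sqlite3 module with made-up Customers and Orders rows. It demonstrates LEFT JOIN only, because older SQLite versions do not support RIGHT JOIN; a RIGHT JOIN gives the mirror-image result with the two tables swapped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT)")
conn.execute(
    "CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, Item TEXT)"
)
conn.executemany("INSERT INTO Customers VALUES (?, ?)", [(1, "Ana"), (2, "Bert"), (3, "Carla")])
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(10, 1, "book"), (11, 1, "pen"), (12, 2, "lamp")],  # Carla has no orders
)

inner = conn.execute(
    """
    SELECT Customers.CustomerName, Orders.Item
    FROM Customers
    INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID
    ORDER BY Customers.CustomerName, Orders.Item
    """
).fetchall()

left = conn.execute(
    """
    SELECT Customers.CustomerName, Orders.Item
    FROM Customers
    LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID
    ORDER BY Customers.CustomerName, Orders.Item
    """
).fetchall()

print(inner)  # [('Ana', 'book'), ('Ana', 'pen'), ('Bert', 'lamp')]
print(left)   # [('Ana', 'book'), ('Ana', 'pen'), ('Bert', 'lamp'), ('Carla', None)]
```

Carla appears only in the LEFT JOIN result, with None (SQL NULL) in the Item column.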

UNION:

The UNION operator is used to combine the result-set of two or more SELECT statements.
Each SELECT statement within UNION must have the same number of columns.
The columns must also have similar data types.
The columns in every SELECT statement must also be in the same order.
By default, the UNION operator selects only distinct values (removing duplicates).
Syntax for UNION:

SELECT column_name(s) FROM table1


UNION

SELECT column_name(s) FROM table2;

UNION ALL:

The UNION ALL operator is used to combine the result-set of two or more SELECT statements
while allowing duplicate values.
It does not remove duplicates and includes all rows from all SELECT statements.
Syntax for UNION ALL:

SELECT column_name(s) FROM table1

UNION ALL

SELECT column_name(s) FROM table2;
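
A value that appears in both inputs makes the difference between UNION and UNION ALL visible. This sketch uses Python's sqlite3 module; the Customers and Suppliers tables and their cities are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (City TEXT)")
conn.execute("CREATE TABLE Suppliers (City TEXT)")
conn.executemany("INSERT INTO Customers VALUES (?)", [("Berlin",), ("Paris",)])
conn.executemany("INSERT INTO Suppliers VALUES (?)", [("Paris",), ("Rome",)])

# UNION removes the duplicate 'Paris'.
union = [row[0] for row in conn.execute(
    "SELECT City FROM Customers UNION SELECT City FROM Suppliers ORDER BY City"
)]

# UNION ALL keeps every row from both SELECT statements.
union_all = [row[0] for row in conn.execute(
    "SELECT City FROM Customers UNION ALL SELECT City FROM Suppliers ORDER BY City"
)]

print(union)      # ['Berlin', 'Paris', 'Rome']
print(union_all)  # ['Berlin', 'Paris', 'Paris', 'Rome']
```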

GROUP BY:

The GROUP BY statement is used to group rows with identical values into summary rows, such
as finding the number of customers in each country.
It is often used in conjunction with aggregate functions like COUNT(), MAX(), MIN(), SUM(), and
AVG() to perform calculations on grouped data.
Syntax:

SELECT column_name(s)

FROM table_name

WHERE condition

GROUP BY column_name(s)

ORDER BY column_name(s);

HAVING

The HAVING clause is used in SQL because the WHERE keyword cannot be used with
aggregate functions.
It allows you to filter the results of aggregate functions applied to grouped data.
The HAVING clause is used in combination with the GROUP BY statement.
It's important to note that WHERE is applied before HAVING when both are used in the same
query: WHERE filters individual rows before they are grouped, and HAVING then filters the groups.
Syntax:

SELECT column_name(s)

FROM table_name

WHERE condition

GROUP BY column_name(s)
HAVING condition

ORDER BY column_name(s);
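
The order in which WHERE, GROUP BY and HAVING act can be checked on a small example. This sketch uses Python's sqlite3 module with made-up Customers rows; the second query shows that a row removed by WHERE no longer counts towards its group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, Country TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [
        ("Ana", "Mexico"), ("Carla", "Mexico"),
        ("Bert", "Germany"), ("Dmitri", "Germany"), ("Frank", "Germany"),
        ("Eve", "France"),
    ],
)

# Count customers per country, keeping only countries with at least 2 customers.
per_country = conn.execute(
    """
    SELECT Country, COUNT(*)
    FROM Customers
    GROUP BY Country
    HAVING COUNT(*) >= 2
    ORDER BY Country
    """
).fetchall()

# WHERE removes Frank before grouping, so Germany's count drops from 3 to 2.
without_frank = conn.execute(
    """
    SELECT Country, COUNT(*)
    FROM Customers
    WHERE CustomerName <> 'Frank'
    GROUP BY Country
    HAVING COUNT(*) >= 2
    ORDER BY Country
    """
).fetchall()

print(per_country)    # [('Germany', 3), ('Mexico', 2)]
print(without_frank)  # [('Germany', 2), ('Mexico', 2)]
```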

CREATE DATABASE:

The CREATE DATABASE statement is used in SQL to create a new database.
It allows you to specify the name of the database you want to create.
Syntax: CREATE DATABASE databasename;

DROP DATABASE:

The DROP DATABASE statement is used to drop an existing SQL database.

Syntax: DROP DATABASE databasename;

CREATE TABLE:

The CREATE TABLE statement is used to create a new table in a database.

Syntax: CREATE TABLE table_name (

column1 datatype,

column2 datatype,

column3 datatype,

...

);

DROP TABLE:

The DROP TABLE statement is used to drop an existing table in a database.

Syntax: DROP TABLE table_name;

TRUNCATE TABLE:

The TRUNCATE TABLE statement is used to delete the data inside a table, but not the table itself.

Syntax: TRUNCATE TABLE table_name;

ALTER:

The ALTER TABLE statement is used in SQL to make changes to an existing table, such as
adding, deleting, or modifying columns.
It can also be used to add or remove various constraints on an existing table.
Syntax:

-- Add a new column:
ALTER TABLE table_name ADD column_name datatype;

-- Drop a column:
ALTER TABLE table_name DROP COLUMN column_name;

-- Modify the datatype of a column (MySQL syntax; SQL Server uses ALTER COLUMN, and PostgreSQL uses ALTER COLUMN ... TYPE):
ALTER TABLE table_name
MODIFY COLUMN column_name datatype;
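
As a small end-to-end check of CREATE TABLE, ALTER TABLE and DROP TABLE, the sketch below uses Python's sqlite3 module. The Persons table and the column_names helper are made up for the example. Only adding a column is shown, since SQLite does not support MODIFY COLUMN; PRAGMA table_info is an SQLite-specific way to list a table's columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons (PersonID INTEGER, LastName TEXT)")


def column_names(conn, table):
    # PRAGMA table_info returns one row per column; index 1 holds the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


before = column_names(conn, "Persons")
conn.execute("ALTER TABLE Persons ADD COLUMN Email TEXT")
after = column_names(conn, "Persons")

# DROP TABLE removes the table definition together with its data.
conn.execute("DROP TABLE Persons")
tables = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()

print(before)  # ['PersonID', 'LastName']
print(after)   # ['PersonID', 'LastName', 'Email']
print(tables)  # []
```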

MADE BY LOVE FROM VAIBHAV SINGH
