Database Development Notes
Develop Database
School: ESSJT
Database requirements
2.2. Logical Database schema is properly designed based on system
requirements
4. Secure Database
4.1. Access control is properly enforced based on database security measures
4.2. Auditing and logging are clearly managed based on the security policies
4.4. Backup and Recovery of data are regularly configured based on DBMS
● Data: in the context of databases, data refers to all the single items that
are stored in a database, either individually or as a set. Data in a database is
primarily stored in database tables, which are organized into columns that
dictate the data types stored therein.
● Information: data that has been converted into a more useful or
intelligible form, for example a report card sheet. Information is
needed to gain knowledge about the surroundings and to keep the system
up to date.
Types of attributes
• Single valued attributes consist of a single value for each entity instance and can’t
store more than one value.
• Multi-valued attributes can store more than one value at a time for an
entity instance, taken from a set of possible values. They are represented by a double
(concentric) ellipse.
• Derived attributes are those attributes whose values can be derived from the values
of other attributes.
• Key attributes act as the primary key for an entity, and they can uniquely identify an
entity from an entity set.
• Complex attributes are formed by the combination of multi-valued and composite
attributes.
• Values of stored attributes remain constant and fixed for an entity instance.
● Record: a set of data stored in a table, for example a
customer record. A record in a database is an object that can contain one
or more values. Groups of records are then saved in a table; the table defines the
data that each record may contain.
● Table: Tables are database objects that contain all the data in a database.
In tables, data is logically organized in a row-and-column format similar
to a spreadsheet. Each row represents a unique record, and each column
represents a field in the record.
1. Logical Schema
2. Physical Schema
3. View Schema
The logical database schema specifies all the logical constraints that need to be applied to the
stored data. It defines the views, integrity constraints, and tables. Here the term integrity
constraints refers to the set of rules used by the DBMS (Database Management System) to
maintain data quality when data is inserted or updated. The logical schema represents how the data
is stored in the form of tables and how the attributes of a table are linked together.
The primary key is used to uniquely identify an entry (a record) in a table. In the diagram, the Ids
of the upper three circles are the primary keys.
A foreign key is a column in one table that refers to the primary key of another table. FK represents
the foreign key in the diagram. It relates one table to another table.
3. View Schema
The view level design of a database is known as view schema. This schema generally describes
the end-user interaction with the database systems.
Data definition language (DDL) refers to SQL commands that design the database structure.
Database engineers use DDL to create and modify database objects based on the business
requirements. For example, the database engineer uses the CREATE command to create database
objects such as tables, views, and indexes.
Data query language (DQL) consists of instructions for retrieving data stored in relational
databases. Software applications use the SELECT command to filter and return specific
results from a SQL table.
Data manipulation language (DML) statements write new information or modify existing records
in a relational database. For example, an application uses the INSERT command to store a new
record in the database.
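The three sub-languages described so far can be tried out with a minimal sketch using Python's built-in sqlite3 module. The `customer` table, its columns and its rows are hypothetical and chosen only for illustration; other DBMSs accept similar statements but differ in details.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# DDL: define the structure (hypothetical "customer" table).
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)

# DML: write new records and modify an existing one.
conn.execute("INSERT INTO customer (id, name, city) VALUES (1, 'Alice', 'Lima')")
conn.execute("INSERT INTO customer (id, name, city) VALUES (2, 'Bob', 'Oslo')")
conn.execute("UPDATE customer SET city = 'Cairo' WHERE id = 2")

# DQL: retrieve a filtered result.
rows = conn.execute("SELECT id, name FROM customer WHERE city = 'Cairo'").fetchall()
print(rows)  # [(2, 'Bob')]
```

DCL commands such as GRANT are not included because SQLite has no user accounts; they would need a server DBMS.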
Database administrators use data control language (DCL) to manage or authorize database
access for other users. For example, they can use the GRANT command to permit certain
applications to manipulate one or more tables.
The relational engine uses transaction control language (TCL) to automatically make database
changes. For example, the database uses the ROLLBACK command to undo an erroneous
transaction.
COMMIT: This command is used to save all the changes made by a transaction in the DB.
ROLLBACK: The term "rollback" refers to undoing changes. This command can only
reverse transactions that occurred since the last ROLLBACK or COMMIT command.
All the modifications must be cancelled if any statement in a group of SQL
statements produces an error.
SAVEPOINT: It marks a point within a transaction so that the transaction can be rolled back to
that point rather than in its entirety.
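How COMMIT, ROLLBACK and SAVEPOINT interact can be seen in a small sketch using Python's built-in sqlite3 module. The `account` table and the savepoint name `before_third` are hypothetical. The connection is opened with `isolation_level=None` so that the transaction statements are issued by hand rather than by the driver.

```python
import sqlite3

# isolation_level=None: the driver starts no transactions on its own.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("COMMIT")  # row 1 is now permanent

conn.execute("BEGIN")
conn.execute("INSERT INTO account VALUES (2, 200)")
conn.execute("SAVEPOINT before_third")
conn.execute("INSERT INTO account VALUES (3, 300)")
conn.execute("ROLLBACK TO SAVEPOINT before_third")  # undoes only row 3
conn.execute("COMMIT")  # row 2 is kept

conn.execute("BEGIN")
conn.execute("DELETE FROM account")
conn.execute("ROLLBACK")  # the whole delete is undone

ids = [row[0] for row in conn.execute("SELECT id FROM account ORDER BY id")]
print(ids)  # [1, 2]
```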
✔ Application of database
A database model shows the logical structure of a database, including the relationships and
constraints that determine how data can be stored and accessed.
There are more database models, but we focus on the five most used, which are the
following:
• Hierarchical model
The hierarchical model organizes data into a tree-like structure, where each record has a
single parent or root. Sibling records are sorted in a particular order. That order is used
as the physical order for storing the database. This model is good for describing many
real-world relationships.
• Relational model
• The most common model, the relational model sorts data into tables, also known as
relations, each of which consists of columns and rows. Each column lists an attribute
of the entity in question, such as price, zip code, or birth date. The set of values
an attribute is allowed to take is called its domain. A particular attribute or combination
of attributes is chosen as a primary key; when it is referred to from another table, it is
called a foreign key there.
• Each row, also called a tuple, includes data about a specific instance of the entity in
question, such as a particular employee.
• The model also accounts for the types of relationships between those tables, including
one-to-one, one-to-many, and many-to-many relationships.
• Network model: a database model designed to represent
objects and their relationships flexibly. The network model extends the hierarchical
model by allowing many-to-many relationships between linked records, which implies
multiple parent records.
• Object-oriented database model: the data model where data is stored in the form of
objects. This model is used to represent real-world entities. The data and the
relationships between data are stored together in a single entity known as an object.
An Object-Oriented Database Management System is built on top of the
object-oriented model.
• Here Transport, Bus, Ship, and Plane are objects.
• Bus has Road Transport as the attribute.
• Ship has Water Transport as the attribute.
• Plane has Air Transport as the attribute.
• The Transport object is the base object and the Bus, Ship, and Plane objects derive
from it.
1.2. Data dictionary is clearly described based on database model
Passive data dictionaries are created separately from the databases they describe to act as a
repository for data information. Passive data dictionaries require additional work to stay in sync
with the databases they describe. As such, database managers must handle passive data dictionaries
with care to ensure there are no discrepancies.
The specific components of a data dictionary can vary, but they typically take the form of various
types of metadata. Examples of these components include the following:
• data element properties, such as data types, unique identifiers, sizes and
indexes;
• system-level diagrams;
• reference data;
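As an illustration of the first kind of component (data element properties), the sketch below reads column and index metadata from a DBMS catalog using Python's built-in sqlite3 module. The `student` table and the index name `idx_student_dept` are hypothetical. `PRAGMA table_info` and `PRAGMA index_list` are SQLite-specific; other systems expose similar metadata in other ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL, dept_id INTEGER)"
)
conn.execute("CREATE INDEX idx_student_dept ON student (dept_id)")

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
dictionary = [
    {"column": name, "type": col_type, "not_null": bool(notnull), "primary_key": bool(pk)}
    for _cid, name, col_type, notnull, _default, pk in conn.execute(
        "PRAGMA table_info(student)"
    )
]

# PRAGMA index_list returns one row per index; the name is the second field.
indexes = [row[1] for row in conn.execute("PRAGMA index_list(student)")]

for entry in dictionary:
    print(entry)
print(indexes)
```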
Data dictionaries can be valuable tools for the organization and management of large data
listings. These are some of the biggest benefits of using a data dictionary:
• easily searchable;
However, data dictionaries can also prove difficult for some to manage. Here are some of
the downsides:
1.3. Database Task requirements are properly identified based on user requirements
Identification of database requirements
Database requirements for a system are the description of what the system should do, the
service or services that it provides, and the constraints on its operation.
A requirement is a condition or capability that must be met or possessed by a system or system
component to satisfy a contract, standard, specification, or other formally imposed document.
Functional requirements define a function that a system or system element must be able to
perform, and they must be documented in different forms. The functional requirements describe the
behavior of the system as it correlates to the system's functionality.
Non-functional requirements
Non-functional requirements are not related to the software's functional aspect. They
specify criteria that can be used to judge the operation of a system, rather than specific
behaviors of the system. Basic non-functional requirements are usability, reliability, security,
storage, cost, flexibility, configuration, performance, legal or regulatory requirements, etc.
Execution qualities, like security and usability, are observable at run time.
Evolution qualities, like testability, maintainability, extensibility, and scalability, are embodied
in the static structure of the software system.
Now, let's see the comparison chart between the functional and non-functional requirements.
Functional requirements | Non-functional requirements
They describe what the product does. | They describe how the product works.
These requirements are specified by the user. | These requirements are specified by the software developers, architects, and technical persons.
There is functional testing such as API testing, system, integration, etc. | There is non-functional testing such as usability, performance, stress, security, etc.
These requirements are important to system operation. | These are not always the important requirements; they may be desirable.
Completion of functional requirements allows the system to perform, irrespective of meeting the non-functional requirements. | The system will not work with non-functional requirements alone.
Data collection happens when you gather and analyze valuable information (e.g., names,
email addresses, customer feedback, and website analytics) from a variety of sources to build
compelling marketing campaigns, learn more about your customers, or create financial
budgets.
1. Interviews are a method of data collection that involves two or more people
exchanging information through a series of questions and answers.
An interview is a research method that involves asking questions to collect data from
individuals who have knowledge, experience or opinions on a particular topic or subject
matter.
Types of Interviews
• structured,
• unstructured
• semi-structured interviews.
1. A structured interview is one where the researcher asks the participants a list of
questions that have been prepared in advance.
Types of questions
2. Closed-ended questions are question types that ask respondents to
choose from a distinct set of pre-defined responses, such as "yes/no" or a set of
multiple-choice options.
There are many benefits to using a database schema. Some of the most common include:
There are several different types of schema used for databases. The three most common types
you’ll likely encounter in the field include:
• Conceptual schema. A conceptual database schema represents all the elements
contained in a database and illustrates their relationship to one another, but it doesn’t
contain any tables. As a result, it provides a big-picture view of the database without
offering real-world details.
• Logical database schema. Logical schemas flesh out conceptual schemas with more
concrete details about the objects that will be contained within them, such as names,
tables, views, and integrity constraints.
• Physical database schema. A physical schema is an actual design for a relational
database. It includes all the technical and contextual information needed for the
schema and is created with a specific physical data system in mind.
Data Abstraction is a process of hiding unwanted or irrelevant details from the end user. It
provides a different view and helps in achieving data independence which is used to enhance the
security of data.
Physical or Internal Level
It is the lowest level of abstraction for a DBMS. It defines how the data is actually stored,
including the data structures used to store data and the access methods used by the database.
Logical or Conceptual Level
The logical level is the intermediate level, or next higher level. It describes what data is stored
in the database and what relationships exist among those data. It describes the data as a whole:
which tables are to be created and what the links among those tables are.
View or External Level
It is the highest level. At the view level there are different views, and every view defines only
a part of the entire data. It also simplifies interaction with the user and provides multiple
views of the same database.
The view level can be used by all users (users at all levels). This level is the least complex and
the easiest to understand.
Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers to the characteristic of being able to modify the schema at one level of
the database system without altering the schema at the next higher level.
1. Logical Data Independence
o Logical data independence refers to the characteristic of being able to change the conceptual
schema without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual view.
o If we make any changes in the conceptual view of the data, then the user view of the data is
not affected.
o Logical data independence occurs at the user interface level.
2. Physical Data Independence
o Physical data independence can be defined as the capacity to change the internal schema
without having to change the conceptual schema.
o If we make any changes in the storage size of the database system server, then the
conceptual structure of the database is not affected.
o Physical data independence is used to separate conceptual levels from the internal levels.
o Physical data independence occurs at the logical interface level.
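Logical data independence can be illustrated with a view. In this minimal sketch (Python's built-in sqlite3 module), the `employee` table and the `employee_public` view are hypothetical. The view stands in for the external schema, and adding a column to the base table stands in for a change to the conceptual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.execute("INSERT INTO employee VALUES (1, 'Tom', 900)")

# External (view) level: users of the view see only id and name.
conn.execute("CREATE VIEW employee_public AS SELECT id, name FROM employee")
before = conn.execute("SELECT * FROM employee_public").fetchall()

# Change at the conceptual level: add a column to the base table.
conn.execute("ALTER TABLE employee ADD COLUMN dept_id INTEGER")
after = conn.execute("SELECT * FROM employee_public").fetchall()

print(before == after)  # True: the user view is unaffected
```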
https://www.scaler.com/topics/what-is-er-model-in-dbms/
The ER (Entity Relationship) Model is a high-level conceptual data model diagram.
The ER model helps to systematically analyze data requirements to produce a well-designed database.
The ER Model represents real-world entities and the relationships between them.
• Provide a preview of how all your tables should connect, what fields are going to be on
each table
• Helps to describe entities, attributes, relationships
• ER diagrams are translatable into relational tables which allows you to build
databases quickly
• ER diagrams can be used by database designers as a blueprint for implementing data in
specific software applications
• The database designer gains a better understanding of the information to be contained
in the database with the help of the ER diagram
• The ER diagram allows you to communicate the logical structure of the database
to users
Some facts about the ER diagram model:
• The ER model allows you to draw a database design
• It helps you to identify the entities which exist in a system and the relationships between
those entities
Entity Relationship Diagram symbols and notations mainly comprise three basic symbols:
the rectangle, the oval, and the diamond, which represent entities, attributes, and relationships
respectively. There are some sub-elements which are based on the main elements in an ER diagram.
An ER diagram is a visual representation of data that describes how data is related to each other
using different ERD symbols and notations.
• Lines: It links attributes to entity types and entity types with other relationship types
• Attributes
• Relationships
ER Diagram Examples
For example, in a University database, we might have entities for Students, Courses, and
Lecturers. Students entity can have attributes like Rollno, Name, and DeptID. They might have
relationships with Courses and Lecturers.
WHAT IS ENTITY?
An entity is a real-world thing, either living or non-living, that may or may not be easily
recognizable. An entity can be a place, person, object, event, or concept about which data is stored
in the database. An entity must have attributes and a unique key. Every entity is made up of
some 'attributes' which represent that entity.
Examples of entities:
Entity notation
Relationship
Relationship is nothing but an association among two or more entities. E.g., Tom works in the
Chemistry department.
Entities take part in relationships. We can often identify relationships with verbs or verb phrases.
For example:
Weak Entities
A weak entity is a type of entity which doesn't have its own key attribute. It can be identified
uniquely by considering the primary key of another entity. For that, weak entity sets need to
have total participation in the identifying relationship.
Let’s learn more about a weak entity by comparing it with a Strong Entity
Strong entity set | Weak entity set
A strong entity set always has a primary key. | It does not have enough attributes to build a primary key.
It contains a primary key represented by the underline symbol. | It contains a partial key which is represented by a dashed underline symbol.
The member of a strong entity set is called a dominant entity set. | The member of a weak entity set is called a subordinate entity set.
The primary key is one of its attributes which helps to identify its member. | In a weak entity set, it is a combination of the primary key of the strong entity set and the partial key.
In the ER diagram, the relationship between two strong entity sets is shown by using a diamond symbol. | The relationship between one strong and one weak entity set is shown by using the double diamond symbol.
The connecting line of the strong entity set with the relationship is single. | The line connecting the weak entity set to the identifying relationship is double.
Attributes
For example, a lecture might have attributes: time, date, duration, place, etc. In an ER
diagram, an attribute is represented by an ellipse.
• One-to-One Relationships
• One-to-Many Relationships
• Many-to-Many Relationships
1. One-to-one:
One entity from entity set X can be associated with at most one entity of entity set Y and vice
versa.
2. One-to-many:
One entity from entity set X can be associated with multiple entities of entity set Y, but an entity
from entity set Y can be associated with at most one entity of entity set X.
Example: One student can register for numerous courses. However, all those courses have a
single line back to that one student.
3. Many to One
More than one entity from entity set X can be associated with at most one entity of entity set Y.
However, an entity from entity set Y may or may not be associated with more than one entity
from entity set X.
4. Many to Many:
One entity from X can be associated with more than one entity from Y and vice versa.
For example, Students as a group are associated with multiple faculty members, and
faculty members can be associated with multiple students.
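In a relational database, a many-to-many relationship such as the student and faculty example is usually implemented with a third (junction) table. The sketch below uses Python's built-in sqlite3 module; all table names, column names and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE faculty (faculty_id INTEGER PRIMARY KEY, name TEXT);
-- Junction table: one row per (student, faculty member) pair.
CREATE TABLE student_faculty (
    student_id INTEGER REFERENCES student(student_id),
    faculty_id INTEGER REFERENCES faculty(faculty_id),
    PRIMARY KEY (student_id, faculty_id)
);
INSERT INTO student VALUES (1, 'Ann'), (2, 'Ben');
INSERT INTO faculty VALUES (10, 'Dr. X'), (20, 'Dr. Y');
INSERT INTO student_faculty VALUES (1, 10), (1, 20), (2, 10);
""")

# One student is linked to many faculty members ...
faculty_of_ann = [r[0] for r in conn.execute(
    "SELECT f.name FROM faculty f "
    "JOIN student_faculty sf ON sf.faculty_id = f.faculty_id "
    "WHERE sf.student_id = 1 ORDER BY f.name")]

# ... and one faculty member is linked to many students.
students_of_x = [r[0] for r in conn.execute(
    "SELECT s.name FROM student s "
    "JOIN student_faculty sf ON sf.student_id = s.student_id "
    "WHERE sf.faculty_id = 10 ORDER BY s.name")]

print(faculty_of_ann)  # ['Dr. X', 'Dr. Y']
print(students_of_x)   # ['Ann', 'Ben']
```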
Next, we will learn how to create an ER Diagram. The steps are worked through on the
following scenario:
In a university, a Student enrolls in Courses. A student must be assigned to one or more
Courses. Each course is taught by a single Professor. To maintain instruction quality, a
Professor can deliver only one course.
• Student
• Course
• Professor
Step 2) Relationship Identification
You need to study the files, forms, reports, data currently maintained by the organization to
identify attributes. You can also conduct interviews with various stakeholders to identify entities.
Initially, it’s important to identify the attributes without mapping them to a particular entity.
Once you have a list of attributes, you need to map them to the identified entities. Ensure that
each attribute is paired with exactly one entity. If you think an attribute should belong to more
than one entity, use a modifier to make it unique.
Once the mapping is done, identify the primary keys. If a unique key is not readily available,
create one.
For Course Entity, attributes could be Duration, Credits, Assignments, etc. For the sake of ease
we have considered just one attribute.
Here are some best practices for developing effective ER diagrams.
• You need to make sure that all your entities and relationships are properly labeled
• There may be various valid approaches to an ER diagram. You need to make sure that the
ER diagram supports all the data you need to store
• You should ensure that each entity appears only a single time in the ER diagram
• Name every relationship, entity, and attribute represented on your diagram
The logical schema defines the structure of the data itself and the relationships between
the various attributes, tables, and entries.
Table constraints
Database constraints are a key feature of database management systems. They ensure that rules
defined at data model creation are enforced when the data is manipulated (inserted, updated, or
deleted) in a database.
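A minimal sketch of constraint enforcement, using Python's built-in sqlite3 module, is shown here. The `course` table, its columns and the helper `try_insert` are hypothetical names introduced for illustration; the NOT NULL, UNIQUE and CHECK constraints themselves are standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE course (
    course_id INTEGER PRIMARY KEY,
    title     TEXT NOT NULL UNIQUE,
    credits   INTEGER CHECK (credits > 0)
)""")
conn.execute("INSERT INTO course VALUES (1, 'Databases', 3)")

def try_insert(values):
    """Return None on success, or the constraint error message on failure."""
    try:
        conn.execute("INSERT INTO course VALUES (?, ?, ?)", values)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

errors = [
    try_insert((2, None, 3)),         # violates NOT NULL
    try_insert((3, 'Databases', 3)),  # violates UNIQUE
    try_insert((4, 'Networks', 0)),   # violates CHECK
    try_insert((5, 'Networks', 4)),   # valid row, accepted
]
for message in errors:
    print(message)
```

Only the last row is stored; the DBMS rejects the other three before they reach the table.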
• INDEX - Used to create and retrieve data from the database very quickly
• Optimization of database
Data normalization, in the context of databases, refers to a set of techniques used to organize
and structure relational databases efficiently. The main goal of data normalization in databases is
to reduce data redundancy, improve data integrity, and ensure the consistency of the data while
minimizing anomalies like update anomalies, insertion anomalies, and deletion anomalies. It is an
essential concept in relational database design.
Then why do you need it? If there is no normalization in SQL, there will be many problems,
such as:
• Insert Anomaly: This happens when we cannot insert data into the table without
also supplying other, unrelated data.
Normalization is also important because it allows the database to take up less disk space.
Advantages:
There are many benefits to normalizing a database. Some of the main advantages are:
• Finding, sorting, and indexing can be faster because the table is small and more rows
can be accommodated on the data page.
Data normalization is typically achieved by dividing a database into multiple related tables
and establishing relationships between them. This process is guided by a set of rules known as
normal forms. The most common normal forms are:
First Normal Form (1NF): A table is in 1NF if it has no repeating groups or arrays and
contains only atomic (indivisible) values. This means that each column holds a single piece of data.
Second Normal Form (2NF): A table is in 2NF if it is in 1NF and all non-key attributes are
fully functionally dependent on the entire primary key.
Third Normal Form (3NF): A table is in 3NF if it is in 2NF and has no transitive dependencies.
This means that non-key attributes should not depend on other non-key attributes.
A functional dependency occurs when the value of one attribute determines the value of
another attribute.
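Whether a functional dependency holds in a given set of rows can be checked mechanically. The sketch below is one simple way to do it. The function name `holds` and the sample rows are hypothetical. A check like this only shows that the data at hand does not contradict the dependency, not that the dependency is a rule of the schema.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in determinant)
        value = tuple(row[c] for c in dependent)
        # The same determinant value must always map to the same dependent value.
        if seen.setdefault(key, value) != value:
            return False
    return True

rows = [
    {"student_id": 1, "name": "Ann", "course": "Math"},
    {"student_id": 1, "name": "Ann", "course": "Physics"},
    {"student_id": 2, "name": "Ben", "course": "Math"},
]

print(holds(rows, ["student_id"], ["name"]))    # True: the id determines the name
print(holds(rows, ["student_id"], ["course"]))  # False: id 1 maps to two courses
```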
Data inconsistency is the term that refers to mismatched data values in DBMS, whereas data
redundancy refers to the unwanted repetition of data in different locations.
Suppose you have a table with three attributes: A, B, and C. A is said to have a transitive
dependency on C if the following conditions hold:
1. A depends on B.
2. B depends on C.
In other words, the dependency of A on C is indirect and mediated through B. This kind of dependency
is considered a transitive dependency.
Boyce-Codd Normal Form (BCNF): A table is in BCNF if, for every non-trivial
functional dependency, the left side of the dependency is a superkey. This makes it a
more stringent form of normalization than 3NF.
Fourth Normal Form (4NF) and beyond: These normal forms address more complex
types of dependencies, such as multi-valued dependencies and join dependencies, but
they are less commonly encountered in practice.
Here is a list of Normal Forms in SQL:
The theory of data normalization is still being developed further.
For example, there are discussions even on a 6th Normal Form. However, in most
practical applications, normalization achieves its best in 3rd Normal Form. The
evolution of normalization theories is illustrated below:
A database normalization example can be easily understood with the help of a case study.
Assume a video library maintains a database of movies rented out. Without any normalization,
all information is stored in one table as shown below.
Here you see that the Movies Rented column has multiple values. Now let's move into the 1st Normal
Form:
A KEY in SQL is a value used to identify records in a table uniquely. An SQL KEY is a single
column or combination of multiple columns used to uniquely identify rows or tuples in the table.
SQL Key is used to identify duplicate information, and it also helps establish a relationship
between multiple tables in the database.
Note: Columns in a table that are NOT used to identify a record uniquely are called non-key
columns.
• The primary key must be given a value when a new record is inserted.
A composite key is a primary key composed of multiple columns used to identify a record
uniquely.
In our database, we have two people with the same name, Robert Phil, but they live in different
places.
Hence, we require both Full Name and Address to identify a record uniquely. That is a composite
key.
• Rule 1- Be in 1NF
• Rule 2- Have a single-column primary key that is not functionally dependent on any
subset of a candidate key
It is clear that we can’t move forward to make our simple database in 2nd Normalization form
unless we partition the table above.
We have divided our 1NF table into two tables, Table 1 and Table 2. Table 1 contains
member information. Table 2 contains information on movies rented.
We have introduced a new column called Membership_id, which is the primary key for Table 1.
A foreign key references the primary key of another table. It helps connect your tables.
• A foreign key can have a different name from its primary key
• Unlike the Primary key, they do not have to be unique. Most often they aren’t
• Foreign keys can be null even though primary keys can not
Why do you need a foreign key?
You will only be able to insert values into your foreign key that exist in the unique key in the
parent table. This helps in referential integrity.
The above problem can be overcome by declaring membership id from Table2 as foreign key of
membership id from Table1
Now, if somebody tries to insert a value in the membership id field that does not exist in the
parent table, an error will be shown!
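The behaviour described here can be reproduced in a minimal sketch using Python's built-in sqlite3 module. The table names `member` and `rental`, their columns and the sample values are hypothetical stand-ins for Table 1 and Table 2. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`; most server DBMSs enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.execute("CREATE TABLE member (membership_id INTEGER PRIMARY KEY, full_name TEXT)")
conn.execute("""
CREATE TABLE rental (
    membership_id INTEGER REFERENCES member(membership_id),
    movie         TEXT
)""")

conn.execute("INSERT INTO member VALUES (1, 'Member One')")
conn.execute("INSERT INTO rental VALUES (1, 'Some Movie')")      # parent exists: accepted
conn.execute("INSERT INTO rental VALUES (NULL, 'Other Movie')")  # NULL foreign key: accepted

try:
    conn.execute("INSERT INTO rental VALUES (101, 'Missing Parent')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # no member 101 exists, so the row is refused

print(rejected)  # True
```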
A transitive functional dependency exists when changing a non-key column might cause any of
the other non-key columns to change.
Consider Table 1. Changing the non-key column Full Name may change Salutation.
• Rule 1- Be in 2NF
To move our 2NF table into 3NF, we again need to divide our table.
3NF Example
There are no transitive functional dependencies, and hence our table is in 3NF.
In Table 3, Salutation ID is the primary key, and in Table 1, Salutation ID is a foreign key that
references the primary key in Table 3.
Our little example is now at a level that cannot be decomposed further to attain higher normal
forms. In fact, it is already in the higher normal forms.
Separate efforts for moving to the next levels of normalization are normally needed only in complex
databases. The next levels of normalization are discussed briefly below.
Even when a database is in 3rd Normal Form, anomalies can still result if it has
more than one candidate key.
If no database table instance contains two or more, independent and multivalued data describing
the relevant entity, then it is in 4th Normal Form.
A table is in 5th Normal Form only if it is in 4NF and it cannot be decomposed into any number
of smaller tables without loss of data.
6th Normal Form is not yet standardized; however, it has been discussed by database experts for
some time. Hopefully, we will have a clear and standardized definition for 6th Normal
Form in the near future.
https://www.guru99.com/database-normalization.html
Homework
Imagine we're building a restaurant management application. That application needs to store data
about the company's employees and it starts out by creating the following table of employees:
Index structure:
o The first column of the index is the search key, which contains a copy of the primary key
or candidate key of the table. The values of the primary key are stored in sorted order so that
the corresponding data can be accessed easily.
o The second column of the index is the data reference. It contains a set of pointers holding
the address of the disk block where the value of the particular key can be found.
Indexing Methods
Ordered indices
The indices are usually sorted to make searching faster. The indices which are sorted are known
as ordered indices.
Example: Suppose we have an employee table with thousands of records, each of which is 10
bytes long. If their IDs start with 1, 2, 3, ... and so on, and we have to search for the employee
with ID 543:
o In the case of a database with no index, we have to search the disk block from the start until
it reaches 543. The DBMS will read the record after reading 543*10=5430 bytes.
o In the case of an index, we will search using indexes, and the DBMS will read the record after
reading 542*2=1084 bytes, which is far less than in the previous case.
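The arithmetic of this example can be written out as a short calculation. The 2-byte index entry size is not stated explicitly in the example; it is an assumption inferred from the 542*2 figure.

```python
RECORD_SIZE = 10      # bytes per data record (given in the example)
INDEX_ENTRY_SIZE = 2  # bytes per index entry (assumed from the 542*2 figure)
TARGET_ID = 543

# Without an index: every record up to and including ID 543 is read.
bytes_without_index = TARGET_ID * RECORD_SIZE

# With an index: the 542 index entries before the target are read.
bytes_with_index = (TARGET_ID - 1) * INDEX_ENTRY_SIZE

print(bytes_without_index)  # 5430
print(bytes_with_index)     # 1084
```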
Primary Index
o If the index is created on the basis of the primary key of the table, then it is known as
primary indexing. These primary keys are unique to each record and contain a 1:1 relation
between the records.
o As primary keys are stored in sorted order, the performance of the
searching operation is quite efficient.
o The primary index can be classified into two types:
Dense index and Sparse index.
Dense index
o The dense index contains an index record for every search key value in the data file. It
makes searching faster.
o In this, the number of records in the index table is the same as the number
of records in the main table.
o It needs more space to store the index records themselves. The index
records have the search key and a pointer to the actual record on the disk.
Sparse index
o In the data file, an index record appears only for a few items. Each item points to a block.
o In this, instead of pointing to each record in the main table, the index points to records in
the main table at intervals.
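The difference between the two kinds of primary index can be sketched in a few lines. The block layout, the key values and the names `dense`, `sparse_keys` and `sparse_lookup` are hypothetical; Python lists stand in for disk blocks.

```python
from bisect import bisect_right

# Data file: sorted records split into blocks of 3 (hypothetical layout).
blocks = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]

# Dense index: one entry per search-key value -> (block number, offset).
dense = {key: (b, i) for b, block in enumerate(blocks) for i, key in enumerate(block)}

# Sparse index: one entry per block, holding only the block's first key.
sparse_keys = [block[0] for block in blocks]

def sparse_lookup(key):
    """Pick the block whose first key is the largest one <= key, then scan it."""
    pos = bisect_right(sparse_keys, key) - 1
    if pos < 0:
        return None
    block = blocks[pos]
    return (pos, block.index(key)) if key in block else None

print(len(dense), len(sparse_keys))  # 9 3
print(dense[50])                     # (1, 1)
print(sparse_lookup(50))             # (1, 1)
```

The dense index answers directly but has as many entries as there are records; the sparse index is smaller but needs a scan inside the chosen block.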
Clustering Index
o A clustered index can be defined as an ordered data file. Sometimes the index
is created on non-primary key columns, which may not be unique for each record.
o In this case, to identify the record faster, we group two or more columns to get a unique value
and create an index out of them. This method is called a clustering index.
o The records which have similar characteristics are grouped, and indexes are created for
these groups.
Example: suppose a company contains several employees in each department. Suppose we use a
clustering index, where all employees which belong to the same Dept_ID are considered within a
single cluster, and index pointers point to the cluster as a whole. Here Dept_Id is a non-unique
key.
The previous scheme is a little confusing because one disk block is shared by records which
belong to different clusters. Using a separate disk block for each cluster is considered
a better technique.
Secondary Index
In sparse indexing, as the size of the table grows, the size of the mapping also grows. These
mappings are usually kept in primary memory so that address fetches are fast. The actual data
is then read from secondary memory using the address obtained from the mapping. If the mapping
grows too large, fetching the address itself becomes slower, and the sparse index is no longer
efficient. To overcome this problem, secondary indexing is introduced.
In secondary indexing, another level of indexing is introduced to reduce the size of the
mapping. In this method, large ranges of the column values are selected initially so that the
mapping of the first level stays small. Each range is then divided into smaller ranges. The
mapping of the first level is stored in primary memory, so that address fetches are fast. The
mapping of the second level and the actual data are stored in secondary memory (hard disk).
For example:
o If you want to find the record with roll number 111 in the diagram, the search looks for the
highest entry that is smaller than or equal to 111 in the first-level index. It gets 100 at
this level.
o Then, in the second-level index, it again looks for the highest entry that is smaller than or
equal to 111 and gets 110. Using the address of 110, it goes to the data block and scans the
records until it finds 111.
o This is how a search is performed in this method. Inserting, updating, and deleting are
done in the same manner.
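The roll-number search above can be sketched as a two-level lookup. This is a minimal illustration, assuming made-up index ranges and data blocks; `first_level`, `second_level`, `data_blocks`, `floor_entry` and `find` are hypothetical names chosen for the example.

```python
from bisect import bisect_right

# First level (kept in primary memory): coarse ranges.
first_level = [100, 200, 300]

# Second level (on disk): finer ranges inside each first-level range.
second_level = {
    100: [100, 110, 120],
    200: [200, 210, 220],
    300: [300, 310, 320],
}

# Data blocks (on disk), addressed by a second-level entry.
data_blocks = {
    100: [100, 103, 107],
    110: [110, 111, 115],
    120: [120, 124],
    200: [200, 205],
}


def floor_entry(sorted_keys, key):
    """Return the highest entry that is <= key, or None if there is none."""
    i = bisect_right(sorted_keys, key) - 1
    return sorted_keys[i] if i >= 0 else None


def find(roll):
    """Return (first-level entry, second-level entry, roll) or None."""
    top = floor_entry(first_level, roll)
    if top is None:
        return None
    mid = floor_entry(second_level[top], roll)
    if mid is None:
        return None
    # Scan the data block record by record until the roll number is found.
    for record in data_blocks.get(mid, []):
        if record == roll:
            return (top, mid, record)
    return None
```

Calling `find(111)` visits entry 100 in the first level, entry 110 in the second level, and then scans the block that starts at 110, which matches the steps in the example.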
Description of DBMS
A Database Management System (DBMS) is a software system that is designed to manage and
organize data in a structured manner. It allows users to create, modify, and query a database,
as well as manage the security and access controls for that database.
1. Data modeling: A DBMS provides tools for creating and modifying data
models, which define the structure and relationships of the data in a database.
2. Data storage and retrieval: A DBMS is responsible for storing and retrieving data
from the database, and can provide various methods for searching and querying the
data.
3. Concurrency control: A DBMS provides mechanisms for controlling concurrent
access to the database, to ensure that multiple users can access the data without
conflicting with each other.
4. Data integrity and security: A DBMS provides tools for enforcing data integrity
and security constraints, such as constraints on the values of data and access
controls that restrict who can access the data.
5. Backup and recovery: A DBMS provides mechanisms for backing up and
recovering the data in the event of a system failure.
6. DBMS can be classified into two types: Relational Database Management System
(RDBMS) and Non-Relational Database Management System (NoSQL or Non-SQL)
7. RDBMS: Data is organized in the form of tables and each table has a set of rows
and columns. The data is related to each other through primary and foreign keys.
8. NoSQL: Data is organized in the form of key-value pairs, document, graph, or
column-based. These are designed to handle large-scale, high-performance
scenarios.
A DBMS allows users to perform the following tasks:
1. Data organization: A DBMS allows for the organization and storage of data in a
structured manner, making it easy to retrieve and query the data as needed.
2. Data integrity: A DBMS provides mechanisms for enforcing data integrity
constraints, such as constraints on the values of data and access controls that
restrict who can access the data.
3. Concurrent access: A DBMS provides mechanisms for controlling
concurrent access to the database, to ensure that multiple users can access the
data without conflicting with each other.
4. Data security: A DBMS provides tools for managing the security of the data, such
as controlling access to the data and encrypting sensitive data.
5. Backup and recovery: A DBMS provides mechanisms for backing up and
recovering the data in the event of a system failure.
6. Data sharing: A DBMS allows multiple users to access and share the same data,
which can be useful in a collaborative work environment.
1. Relational DBMS (RDBMS): An RDBMS stores data in tables with rows and
columns, and uses SQL (Structured Query Language) to manipulate the data.
2. Object-Oriented DBMS (OODBMS): An OODBMS stores data as objects, which
can be manipulated using object-oriented programming languages.
3. NoSQL DBMS: A NoSQL DBMS stores data in non-relational data structures, such
as key-value pairs, document-based models, or graph models.
Hardware, software, data, database access language, procedures, and users together form
the components of a DBMS.
Hardware
The hardware is the actual computer system used for keeping and accessing the database.
Conventional DBMS hardware consists of secondary storage devices such as hard disks.
Databases run on a range of machines, from microcomputers to mainframes.
Software
Software is the actual DBMS, which sits between the physical database and the users of the
system. All requests from users to access the database are handled by the DBMS.
Data
Data is an important component of the database management system. The main task of the DBMS
is to process data. Databases are used to store data, which is then retrieved from and updated
in the database.
Users
There are a number of users who can access or retrieve the data on demand using the
application and the interfaces provided by the DBMS.
Examples of DBMS
1. Microsoft Access
2. MySQL
3. Oracle Database
4. MongoDB
6. Amazon RDS
7. PostgreSQL
8. Apache Cassandra
9. Informix
10. MariaDB
11. SQLite
What is MySQL
MySQL is a relational database management system (RDBMS), now developed by Oracle, that
is based on structured query language (SQL).
• MySQL is open-source
A logical data model is a data model that provides a detailed, structured description of data
elements and the connections between them. It includes all entities (specific objects taken
from the real world that are relevant to the business) and the relationships among them.
Each entity has attributes that describe its characteristics.
Entities have been transformed into tables and attributes into table columns. Their names are
also translated into technical terms that reflect how they could be implemented and stored in
the database. In addition, each column's data type has been specified.
Introduction to SQL
Structured query language (SQL) is a programming language for storing and processing
information in a relational database. A relational database stores information in tabular form,
with rows and columns representing different data attributes and the various relationships
between the data values.
SQL table
A SQL table is the basic element of a relational database. The SQL database table consists of
rows and columns. Database engineers create relationships between multiple database tables to
optimize data storage space.
SQL statements
SQL statements, or SQL queries, are valid instructions that relational database management
systems understand. Software developers build SQL statements by using different SQL language
elements. SQL language elements are components such as identifiers, variables, and search
conditions that form a correct SQL statement.
Stored procedures
Stored procedures are a collection of one or more SQL statements stored in the relational
database. Software developers use stored procedures to improve efficiency and performance.
For example, they can create a stored procedure for updating sales tables instead of writing the
same SQL statement in different applications.
Structured query language (SQL) commands are specific keywords or SQL statements that
developers use to manipulate the data stored in a relational database. You can categorize SQL
commands as follows.
CREATE: This command is used to create the database or its objects (like tables, indexes,
functions, views, stored procedures, and triggers).
DROP: This command is used to delete objects from the database.
TRUNCATE: This command is used to remove all records from a table and to release the space
allocated for those records.
COMMENT: This is used to add comments to the data dictionary.
Data query language (DQL) consists of instructions for retrieving data stored in relational
databases. Software applications use the SELECT command to filter and return specific
results from a SQL table.
Data manipulation language (DML) statements write new information or modify existing records
in a relational database. For example, an application uses the INSERT command to store a new
record in the database.
Database administrators use data control language (DCL) to manage or authorize database
access for other users. For example, they can use the GRANT command to permit certain
applications to manipulate one or more tables.
REVOKE: This command withdraws the user’s access privileges given by using the GRANT
command.
The relational engine uses transaction control language (TCL) to manage the changes made by
transactions, either making them permanent or undoing them. For example, the database uses
the ROLLBACK command to undo an erroneous transaction.
SQL sub-languages
In SQL, a subquery can be simply defined as a query within another query. In other words, a
subquery is a query that is embedded in the WHERE clause of another SQL query.
Syntax: There is no general syntax for subqueries. However, subqueries are most frequently
used with the SELECT statement, as shown below:
SELECT column_name
FROM table_name
WHERE column_name expression operator
    (SELECT column_name FROM table_name WHERE ...);
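A subquery in a WHERE clause can be tried out with Python's built-in sqlite3 module. The sketch below assumes a small, made-up Products table (the rows and prices are illustrative only) and uses SQLite rather than MySQL, although this particular query is written in the same way in both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName TEXT, Price REAL);
INSERT INTO Products VALUES (1, 'Pen', 10), (2, 'Bag', 40), (3, 'Lamp', 70);
""")

# The inner query computes the average price (40 for this data);
# the outer query keeps only the products priced above that average.
rows = conn.execute("""
    SELECT ProductName
    FROM Products
    WHERE Price > (SELECT AVG(Price) FROM Products)
    ORDER BY ProductName
""").fetchall()

above_average = [name for (name,) in rows]
print(above_average)
```

The inner SELECT runs first and its single result is used as the comparison value for the outer WHERE clause.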
SQL operators
Operator   Description            Example
%          Modulo                 SELECT 17 % 5;

Operator   Description
|          Bitwise OR
^          Bitwise exclusive OR

Operator   Description            Example
>          Greater than           SELECT * FROM Products WHERE Price > 30;
<          Less than              SELECT * FROM Products WHERE Price < 30;
>=         Greater or equal to    SELECT * FROM Products WHERE Price >= 30;
<=         Less or equal to       SELECT * FROM Products WHERE Price <= 30;
<> or !=   Not equal to           SELECT * FROM Products WHERE Price <> 18;

Operator   Description
+=         Add equals
-=         Subtract equals
*=         Multiply equals
/=         Divide equals
%=         Modulo equals
Operator, description, and example:
ALL: TRUE if all of the subquery values meet the condition.
    SELECT ProductName FROM Products WHERE ProductID = ALL (SELECT ProductID FROM
ANY: TRUE if any of the subquery values meet the condition.
    SELECT * FROM Products WHERE Price > ANY (SELECT Price FROM Products WHERE Price > 50);
BETWEEN: TRUE if the operand is within the range of comparisons.
    SELECT * FROM Products WHERE Price BETWEEN 50 AND 60;
LIKE: TRUE if the operand matches a pattern.
    SELECT * FROM Customers WHERE City LIKE 's%';
NOT: Displays a record if the condition(s) is NOT TRUE.
    SELECT * FROM Customers WHERE City NOT LIKE 's%';
SOME: TRUE if any of the subquery values meet the condition.
    SELECT * FROM Products WHERE Price > SOME (SELECT Price FROM Products WHERE Price > 20);
Syntax
Create Constraints
Constraints can be specified when the table is created with the CREATE TABLE statement, or
after the table is created with the ALTER TABLE statement.
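Syntax
CREATE TABLE table_name (
    column1 datatype constraint,
    column2 datatype constraint,
    column3 datatype constraint,
    ....
);
• CREATE INDEX - Used to create indexes so that data can be retrieved from the database very
quickly.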
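The MySQL CREATE TABLE Statement
Syntax
CREATE TABLE table_name (
    column1 datatype,
    column2 datatype,
    column3 datatype,
    ....
);
Example
CREATE TABLE Persons (
    PersonID int,
    LastName varchar(255),
    FirstName varchar(255),
    Address varchar(255),
    City varchar(255)
);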
The new table gets the same column definitions. All columns or specific columns can be
selected.
If you create a new table using an existing table, the new table will be filled with the existing
values from the old table.
Syntax
FROM customers;
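Creating a table from an existing table can be tried with Python's built-in sqlite3 module. This is a sketch under assumptions: the customers rows and the new table name customer_names are made up for illustration, and SQLite is used instead of MySQL. In SQLite the copy receives the selected columns and rows, but constraints such as the primary key are not carried over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, City TEXT);
INSERT INTO customers VALUES (1, 'Asha', 'Kigali'), (2, 'Ben', 'Huye');

-- Copy two of the three columns, together with their existing values.
CREATE TABLE customer_names AS
SELECT CustomerID, CustomerName
FROM customers;
""")

copied = conn.execute(
    "SELECT CustomerID, CustomerName FROM customer_names ORDER BY CustomerID"
).fetchall()

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(customer_names)")]
print(columns, copied)
```

The new table is filled with the values that existed in the old table at the time of the copy; later changes to customers do not affect it.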
Syntax
The TRUNCATE TABLE statement is used to delete the data inside a table, but not the table
itself.
Syntax
TRUNCATE TABLE table_name;
The ALTER DATABASE Statement of MySQL allows you to modify/change the characteristics
of an existing database.
Syntax
• ENCRYPTION − This option allows you to enable (Y) or disable (N) the default database
encryption.
• READ ONLY − Using this option, you can allow modifications to the database and the
objects within it (0) or make it read-only (1).
Example
The following query changes the character set of the database created above −
You can see the list of all the available character sets using the SHOW CHARACTER SET
statement.
Similarly, the following query changes the collation of the database named mydatabase −
You can see the list of all the available collations using the SHOW COLLATION statement.
You can verify the characteristics of the created database as shown below −
| mydatabase | CREATE DATABASE `mydatabase` /*!40100 DEFAULT CHARACTER SET utf8 */ /*!80016 DE
1 row in set (0.08 sec)
The ALTER TABLE statement is used to add, delete, or modify columns in an existing table.
The ALTER TABLE statement is also used to add and drop various constraints on an existing
table.
The following SQL deletes the "Email" column from the "Customers" table:
ALTER TABLE Customers
DROP COLUMN Email;
To change the data type of a column in a table, use the following syntax:
ALTER TABLE table_name
MODIFY COLUMN column_name datatype;
Now we want to change the data type of the column named "DateOfBirth" in the "Persons" table.
Example
ALTER TABLE Persons
MODIFY COLUMN DateOfBirth year;
Notice that the "DateOfBirth" column is now of type year and is going to hold a year in a two- or
four-digit format.
Next, we want to delete the column named "DateOfBirth" in the "Persons" table.
We use the following SQL statement:
Example
ALTER TABLE Persons
DROP COLUMN DateOfBirth;
I.C.3.Application of DML
INSERT INTO Statement
2. If you are adding values for all the columns of the table, you do not need to
specify the column names in the SQL query. However, make sure the order of the
values is the same as the order of the columns in the table. Here, the INSERT INTO
syntax would be as follows:
INSERT INTO table_name
VALUES (value1, value2, value3, ...);
Example
The following SQL statement will insert a new record, but only insert data in the
"CustomerName", "City", and "Country" columns (CustomerID will be updated automatically):
Example
UPDATE Syntax
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;
Example
UPDATE Customers
It is the WHERE clause that determines how many records will be updated.
The following SQL statement will update the PostalCode to "00000" for all records where the
country is "Mexico":
Example
UPDATE Customers
SET PostalCode = '00000'
WHERE Country = 'Mexico';
Update Warning!
Be careful when updating records. If you omit the WHERE clause, ALL records will be updated!
Example
UPDATE Customers
SET PostalCode = '00000';
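The effect of the WHERE clause on UPDATE can be checked with Python's built-in sqlite3 module. The sketch assumes a small, made-up Customers table and uses SQLite rather than MySQL. The postal code is passed as the string '00000'; an unquoted 00000 would be read as the number 0 and the leading zeros would be lost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Country TEXT, PostalCode TEXT);
INSERT INTO Customers VALUES
    (1, 'Mexico', '05021'),
    (2, 'Germany', '12209'),
    (3, 'Mexico', '05023');
""")

# With WHERE: only the rows for Mexico are changed.
cur = conn.execute(
    "UPDATE Customers SET PostalCode = ? WHERE Country = ?", ("00000", "Mexico")
)
updated_with_where = cur.rowcount

# Without WHERE: every row in the table is changed.
cur = conn.execute("UPDATE Customers SET PostalCode = ?", ("00000",))
updated_without_where = cur.rowcount

codes = [code for (code,) in conn.execute(
    "SELECT PostalCode FROM Customers ORDER BY CustomerID"
)]
print(updated_with_where, updated_without_where, codes)
```

The first statement touches two rows and the second touches all three, which is exactly the risk the warning describes.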
DELETE Syntax
DELETE FROM table_name WHERE condition;
Example
It is possible to delete all rows in a table without deleting the table. This means that the table
structure, attributes, and indexes will be intact:
DELETE FROM table_name;
SELECT Syntax
SELECT column1, column2, ...
FROM table_name;
Here, column1, column2, ... are the field names of the table you want to select data from. If
you want to select all the fields available in the table, use the following syntax:
SELECT * FROM table_name;
The following SQL statement selects the "CustomerName", "City", and "Country" columns from
the "Customers" table:
Example
SELECT CustomerName, City, Country FROM Customers;
SELECT * Example
The following SQL statement selects ALL the columns from the "Customers" table:
Example
SELECT * FROM Customers;
The SELECT DISTINCT statement is used to return only distinct (different) values.
Inside a table, a column often contains many duplicate values, and sometimes you only want to
list the different (distinct) values.
SELECT DISTINCT column1, column2, ...
FROM table_name;
The following SQL statement selects all values (including the duplicates) from the "Country"
column in the "Customers" table:
Example
SELECT Country FROM Customers;
Now, let us use the SELECT DISTINCT statement and see the result.
The following SQL statement selects only the DISTINCT values from the "Country" column in
the "Customers" table:
Example
SELECT DISTINCT Country FROM Customers;
The following SQL statement counts and returns the number of different (distinct) countries in
the "Customers" table:
Example
SELECT COUNT(DISTINCT Country) FROM Customers;
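The difference between a plain SELECT, SELECT DISTINCT, and COUNT(DISTINCT ...) can be seen with Python's built-in sqlite3 module. The sketch assumes four made-up Customers rows and uses SQLite rather than MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerName TEXT, Country TEXT);
INSERT INTO Customers VALUES
    ('A', 'Mexico'), ('B', 'Germany'), ('C', 'Mexico'), ('D', 'Sweden');
""")

# Plain SELECT keeps the duplicates.
all_countries = [c for (c,) in conn.execute(
    "SELECT Country FROM Customers ORDER BY Country"
)]

# SELECT DISTINCT returns each value once.
distinct_countries = [c for (c,) in conn.execute(
    "SELECT DISTINCT Country FROM Customers ORDER BY Country"
)]

# COUNT(DISTINCT ...) counts the different values.
(distinct_count,) = conn.execute(
    "SELECT COUNT(DISTINCT Country) FROM Customers"
).fetchone()

print(all_countries, distinct_countries, distinct_count)
```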
o The COUNT function is used to count the number of rows in a database table. It can work on
both numeric and non-numeric data types.
o COUNT(*) returns the count of all the rows in a specified table, including duplicate rows
and rows that contain NULL values.
Syntax
COUNT(*)
Sample table:
PRODUCT_MAST
PRODUCT  COMPANY  QTY  RATE  COST
Item1    Com1     2    10    20
Item2    Com2     3    25    75
Item3    Com1     2    30    60
Item4    Com3     5    10    50
Item5    Com2     2    20    40
Item6    Com1     3    25    75
Item8    Com1     3    10    30
Item9    Com2     2    25    50
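Example: COUNT()
SELECT COUNT(*)
FROM PRODUCT_MAST;
Output:
10
Example: COUNT() with GROUP BY
SELECT COMPANY, COUNT(*)
FROM PRODUCT_MAST
GROUP BY COMPANY;
Output:
Com1 5
Com2 3
Com3 2
Example: COUNT() with HAVING
SELECT COMPANY, COUNT(*)
FROM PRODUCT_MAST
GROUP BY COMPANY
HAVING COUNT(*)>2;
Output:
Com1 5
Com2 3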
2. SUM() Function
The SUM function is used to calculate the sum of the values in a selected column. It works on
numeric fields only.
Syntax
SUM() or SUM( [ALL|DISTINCT] expression )
Example: SUM()
NY;
Output:
Com1 150
Com2 170
Com3 170
3. AVG function
The AVG function is used to calculate the average value of the numeric type. AVG function
returns the average of all non-Null values.
Syntax
AVG() or AVG( [ALL|DISTINCT] expression )
4. MAX Function
The MAX function is used to find the maximum value of a certain column. This function
determines the largest value of all selected values of a column.
Syntax
MAX() or MAX( [ALL|DISTINCT] expression )
Example:
SELECT MAX(RATE) FROM PRODUCT_MAST;
Output:
30
5. MIN Function
MIN function is used to find the minimum value of a certain column. This function
determines the smallest value of all selected values of a column.
Syntax
MIN() or MIN( [ALL|DISTINCT] expression )
Example:
SELECT MIN(RATE) FROM PRODUCT_MAST;
Output:
10
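The five aggregate functions can be run together with Python's built-in sqlite3 module. This is a sketch under assumptions: it loads only the eight PRODUCT_MAST rows that appear in the sample table in these notes, the column names PRODUCT, COMPANY, QTY, RATE and COST are assumed, and the company of Item6 is taken to be Com1. Because the sample table is incomplete, the counts and sums below differ from the outputs quoted in the notes.

```python
import sqlite3

# Only the rows visible in the sample table; column names are assumed.
rows = [
    ("Item1", "Com1", 2, 10, 20),
    ("Item2", "Com2", 3, 25, 75),
    ("Item3", "Com1", 2, 30, 60),
    ("Item4", "Com3", 5, 10, 50),
    ("Item5", "Com2", 2, 20, 40),
    ("Item6", "Com1", 3, 25, 75),
    ("Item8", "Com1", 3, 10, 30),
    ("Item9", "Com2", 2, 25, 50),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT_MAST "
    "(PRODUCT TEXT, COMPANY TEXT, QTY INTEGER, RATE INTEGER, COST INTEGER)"
)
conn.executemany("INSERT INTO PRODUCT_MAST VALUES (?, ?, ?, ?, ?)", rows)

# Aggregates over the whole table.
total, sum_cost, avg_cost, min_rate, max_rate = conn.execute(
    "SELECT COUNT(*), SUM(COST), AVG(COST), MIN(RATE), MAX(RATE) FROM PRODUCT_MAST"
).fetchone()

# Aggregates per company, using GROUP BY.
per_company = conn.execute(
    "SELECT COMPANY, COUNT(*), SUM(COST) FROM PRODUCT_MAST "
    "GROUP BY COMPANY ORDER BY COMPANY"
).fetchall()

print(total, sum_cost, avg_cost, min_rate, max_rate)
print(per_company)
```

Without GROUP BY each aggregate returns one value for the whole table; with GROUP BY it returns one value per company.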
https://www.javatpoint.com/dbms-sql-aggregate-function
SQL CLAUSES
o SQL clause helps us to retrieve a set or bundles of records from the table.
o SQL clause helps us to specify a condition on the columns or the records of a table.
1. WHERE CLAUSE
2. GROUP BY CLAUSE
3. HAVING CLAUSE
4. ORDER BY CLAUSE
Let's see each clause one by one with an example. We will use MySQL database for writing the
queries in examples.
1. WHERE CLAUSE
A WHERE clause in SQL is used with the SELECT query to limit the number of rows displayed in
the result set; it filters the records and returns only those records which fulfill the
specified conditions. The WHERE clause can be used in SELECT, UPDATE, and DELETE statements.
The asterisk symbol (*) is used in a SELECT query with a WHERE clause to retrieve all the
column values for every record that meets the condition.
Syntax of the WHERE clause with a SELECT query to retrieve all the column values for every
matching record from a table:
SELECT * FROM TABLENAME WHERE CONDITION;
Example 1:
Write a query to retrieve all those records of an employee where employee salary is greater than
50000.
Query:
mysql> SELECT * FROM employees WHERE Salary > 50000;
As per the expected output, only those records are displayed where an employee's salary is
greater than 50000. There are six records in the employee's table which satisfy the given
condition.
Example 2:
Write a query to update the employee's record and set the updated name as 'Harshada Sharma'
where the employee's city name is Jaipur.
Query:
mysql> UPDATE employees SET Name = "Harshada Sharma" WHERE City = "Jaipur";
The above query will update the employee's name to "Harshada Sharma," where the employee's
city is Jaipur.
To verify whether records are updated or not, we will run a select query.
There is only one record in the employee's table where the employee's city is 'Jaipur'. The id of
the record is 3, which satisfies the given condition. Hence, according to the given condition, the
employee's name with employee id 3 is now changed to 'Harshada Sharma'.
Example 3:
Write a query to delete an employee's record where the employee's joining date is "2013-12-12".
Query:
To verify the results of the above query, we will execute the select query.
There is only one record in the employee's table where the employee's joining date is
'2013-12-12'. The id of the record is 13, which satisfies the given condition. Therefore,
according to the given condition, the employee with employee id 13 is now deleted from the
employee's table.
2. GROUP BY CLAUSE
The GROUP BY clause is used to arrange similar records into groups. It is used with the
SELECT statement and is placed after the WHERE clause in the SQL statement. The GROUP BY
clause is typically used with the aggregate functions, i.e., MAX(), MIN(), AVG(), SUM(), and
COUNT(), to group the result based on one or more columns. Note that in MySQL's default
ONLY_FULL_GROUP_BY SQL mode, a selected column generally must either appear in the GROUP BY
clause or be used inside an aggregate function, so queries such as SELECT * ... GROUP BY are
rejected in that mode.
Example 1:
Write a query to display all the records of the employees table but group the results based on the
age column.
Query:
mysql> SELECT * FROM employees GROUP BY Age;
The above query will display the records of the employees table grouped by the age column.
Example 2:
Write a query to display all the records of the employees table grouped by the designation and
salary.
Query:
mysql> SELECT * FROM employees GROUP BY Salary, Designation;
The above query will display the records of the employees table grouped by the salary
and designation columns.
Example 1:
Write a query to list the number of employees working on a particular designation and group the
results by designation of the employee.
Query:
The above query will display the designation with the respective number of employees
working on that designation. All these results will be grouped by the designation column.
Example 2:
Write a query to display the sum of the employees' salaries for each city, grouped by city.
Query:
mysql> SELECT SUM(Salary) AS Salary, City FROM employees GROUP BY City;
The above query will first calculate the sum of the salaries of the employees working in each
city, and then it will display each salary sum with the respective city.
As per the expected output, the sum of employee salaries is displayed for each city. If two
employees belong to the same city, then they will be in one group.
3. HAVING CLAUSE:
When we need to place conditions on a table's columns, we use the WHERE clause in SQL.
But if we want to place a condition on the groups produced by the GROUP BY clause (for
example, on an aggregate value), we use the HAVING clause together with the GROUP BY clause.
Syntax:
SELECT COLUMNS FROM TABLENAME GROUP BY COLUMN HAVING CONDITION;
Example 1:
Write a query to display the name of employees, salary, and city where the employee's maximum
salary is greater than 40000 and group the results by designation.
Query:
mysql> SELECT Name, City, MAX(Salary) AS Salary FROM employees GROUP BY Designation
HAVING MAX(Salary) > 40000;
The above output shows the employee name, salary, and city for each designation whose
maximum salary is greater than 40000. (Employees with the same designation are placed in one
group, and those with other designations are placed separately.)
Example 2:
Write a query to display the name of employees and designation where the sum of the
employees' salaries is greater than 45000 and group the results by city.
Query:
mysql> SELECT Name, Designation, SUM(Salary) AS Salary FROM employees GROUP BY City
HAVING SUM(Salary) > 45000;
You will get the following output:
The above output shows the employee name, designation, and salary sum for each city whose
salary sum is greater than 45000. (Employees from the same city are placed in one group, and
those from different cities are placed separately.)
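The difference between WHERE and HAVING can be checked with Python's built-in sqlite3 module. The sketch assumes a small, made-up employees table (names, designations and salaries are illustrative only) and uses SQLite rather than MySQL. To stay valid in strict SQL modes, each query selects only the grouping column and aggregate values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (Name TEXT, Designation TEXT, City TEXT, Salary INTEGER);
INSERT INTO employees VALUES
    ('Asha', 'Manager', 'Pune',  60000),
    ('Ben',  'Clerk',   'Pune',  30000),
    ('Chen', 'Manager', 'Delhi', 45000),
    ('Dina', 'Clerk',   'Delhi', 35000),
    ('Eli',  'Intern',  'Pune',  20000);
""")

# WHERE filters individual rows BEFORE they are grouped.
where_result = conn.execute("""
    SELECT Designation, COUNT(*)
    FROM employees
    WHERE Salary > 30000
    GROUP BY Designation
    ORDER BY Designation
""").fetchall()

# HAVING filters whole groups AFTER the aggregate has been computed.
having_result = conn.execute("""
    SELECT Designation, MAX(Salary)
    FROM employees
    GROUP BY Designation
    HAVING MAX(Salary) > 40000
    ORDER BY Designation
""").fetchall()

print(where_result)
print(having_result)
```

In the first query the Clerk group survives with one row because the filter is applied row by row; in the second query the whole Clerk group is dropped because its maximum salary does not exceed 40000.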
4. ORDER BY CLAUSE
Whenever we want to sort rows in SQL, we use the ORDER BY clause. The ORDER BY clause sorts
the result set based on a specific column of a table: the rows are returned in the order of
the values in that column, and the values of the other columns stay with their rows.
By default, the ORDER BY clause sorts in ASCENDING order if no sorting order is mentioned.
Before moving to the examples of the ORDER BY clause, we will first look at the syntax so
that the examples are easier to follow.
Syntax to sort the records (ascending order by default):
SELECT * FROM TABLENAME ORDER BY COLUMN_NAME;
Syntax to sort the records in ascending order:
SELECT * FROM TABLENAME ORDER BY COLUMN_NAME ASC;
Syntax to sort the records in descending order:
SELECT * FROM TABLENAME ORDER BY COLUMN_NAME DESC;
Example 1:
Write a query to sort the records in the ascending order of the employee designation from the
employees table.
Query:
mysql> SELECT * FROM employees ORDER BY Designation;
Example 2:
Write a query to display employee name and salary in the ascending order of the employee's
salary from the employees table.
Query:
mysql> SELECT Name, Salary FROM employees ORDER BY Salary ASC;
Here in a SELECT query, an ORDER BY clause is applied to the 'Salary' column to sort the
records. We have used the ASC keyword to sort the employee's salary in ascending order.
Example 3:
Write a query to sort the data in descending order of the employee name stored in the
employees table.
Query:
mysql> SELECT * FROM employees ORDER BY Name DESC;
Here we have used the ORDER BY clause with the SELECT query applied on the Name column
to sort the data. We have used the DESC keyword after the ORDER BY clause to sort data in
descending order.
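The default, ASC, and DESC orderings can be compared with Python's built-in sqlite3 module. The sketch assumes three made-up employees rows and uses SQLite rather than MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (Name TEXT, Salary INTEGER);
INSERT INTO employees VALUES ('Asha', 60000), ('Ben', 30000), ('Chen', 45000);
""")


def names(query):
    """Run a query and return the Name column as a list."""
    return [name for (name,) in conn.execute(query)]


# No keyword: ascending order is used by default.
default_order = names("SELECT Name FROM employees ORDER BY Salary")

# Explicit ASC gives the same result as the default.
ascending = names("SELECT Name FROM employees ORDER BY Salary ASC")

# DESC reverses the order; here the sort column is Name.
descending = names("SELECT Name FROM employees ORDER BY Name DESC")

print(default_order, ascending, descending)
```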
DCL commands are used to manage database security and access control. The two primary DCL
commands are:
GRANT: The GRANT command is used to grant specific privileges to database users or roles.
Syntax
GRANT [privilege_name] ON [object_name] TO [user_name];
Or
GRANT privilege_name ON object_name TO user_name WITH GRANT OPTION;
Example
GRANT SELECT, INSERT ON Employees TO HR_Manager;
This grants the “HR_Manager” role the privileges to select and insert data into the “Employees”
table.
REVOKE: The REVOKE command is used to withdraw privileges previously granted to database
users or roles.
Syntax
REVOKE [privilege_name] ON [object_name] FROM [user_name];
Example
REVOKE DELETE ON Customers FROM Sales_Team;
This revokes the privilege to delete data from the “Customers” table from the “Sales_Team” role.
COMMIT
The COMMIT command is used to save changes made during a transaction to the database
permanently:
BEGIN;
-- SQL statements
COMMIT;
This example begins a transaction, performs SQL statements, and then commits the changes to
the database.
ROLLBACK
The ROLLBACK command is used to undo the changes made during a transaction:
BEGIN;
-- SQL statements
ROLLBACK;
This example begins a transaction, performs SQL statements, and then rolls back the changes,
restoring the database to its previous state.
SAVEPOINT
The SAVEPOINT command allows you to set a point within a transaction to which you can later
roll back:
BEGIN;
-- SQL statements
SAVEPOINT my_savepoint;
ROLLBACK TO my_savepoint;
This example creates a savepoint and later rolls back to that point, undoing some of the
transaction’s changes.
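COMMIT and ROLLBACK can be tried with Python's built-in sqlite3 module. This is a sketch under assumptions: the accounts table and its balances are made up, and SQLite is used rather than MySQL. The connection is opened with isolation_level=None so that the BEGIN, COMMIT and ROLLBACK statements are issued explicitly, as in the examples above.

```python
import sqlite3

# isolation_level=None: the module does not open transactions on its own,
# so BEGIN / COMMIT / ROLLBACK are written out explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 50)")


def balances():
    """Return the balances ordered by account id."""
    return [b for (b,) in conn.execute("SELECT balance FROM accounts ORDER BY id")]


# A transfer that is committed: both updates become permanent together.
conn.execute("BEGIN")
conn.execute("UPDATE accounts SET balance = balance - 30 WHERE id = 1")
conn.execute("UPDATE accounts SET balance = balance + 30 WHERE id = 2")
conn.execute("COMMIT")
after_commit = balances()

# A mistaken delete that is rolled back: the table returns to its state
# at the last commit.
conn.execute("BEGIN")
conn.execute("DELETE FROM accounts")
conn.execute("ROLLBACK")
after_rollback = balances()

print(after_commit, after_rollback)
```

The rollback restores the committed balances, while the earlier committed transfer can no longer be undone with ROLLBACK.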
COMMIT is the SQL command used to make the changes of the current transaction permanent. It
marks the successful completion of a transaction. When we have successfully executed a
transaction or a simple database statement and want to keep the changes, we run the COMMIT
command; the changes are then saved permanently and become visible to all users. Once the
COMMIT command has been executed, the database cannot be returned to the state it was in
before the first statement of the transaction was executed.
Syntax
COMMIT;
In this example, we delete all the records from the EMPLOYEES table whose age is 27 and then
run COMMIT to make the change permanent and visible to all users of the database.
After that, we use the select command to fetch all the records from the Employees table.
Output
In the above table, when the COMMIT query is executed, it permanently saves the changes to the
EMPLOYEES table, and these changes are visible to all database users.
The SQL ROLLBACK command is used to roll back the current transaction if an error occurs
during its execution. In a transaction, the error can be a system failure, power outage,
incorrect execution of the transaction, system crash, etc. A rollback returns the database to
the state it was in before the first statement of the current transaction. A rollback command
can only be executed if the user has not yet performed the COMMIT command on the current
transaction.
Syntax
ROLLBACK;
In this example, we delete all the records from the EMPLOYEES table whose age is 27 and then
perform the ROLLBACK query to restore the deleted records in the EMPLOYEES table.
TABLE - EMPLOYEES
Here, the two rows in the above table whose age is 27 are deleted from the Employees table
because they satisfy the age condition. The ROLLBACK query then undoes the operation.
After that, we use the select command to retrieve all the records from the Employees table.
Output
TABLE - EMPLOYEES
EMPID EMP NAME AGE ADDRESS SALARY
Definition / Basic
COMMIT: A COMMIT statement is used to make the changes of the current transaction permanent.
ROLLBACK: A ROLLBACK statement is used to undo all the changes made in the current transaction.
Transaction condition
COMMIT: Once the current transaction has been committed, it cannot be returned to its previous
state.
ROLLBACK: As long as the current transaction has not been committed, it can be returned to its
previous state using the ROLLBACK command.
Occurrence
COMMIT: The COMMIT statement is applied when the transaction is completed.
ROLLBACK: The ROLLBACK statement occurs when the transaction is aborted, for example because
of a power failure, a system failure, or incorrect execution.
Successfully executed statement
COMMIT: If all the statements are executed successfully without any error, the COMMIT
statement permanently saves the state.
ROLLBACK: If any operation fails during a transaction, not all the changes have been applied
successfully, and we can undo them using the ROLLBACK statement.
Visible change
COMMIT: When we perform the commit command, the changes of the current transaction become
permanent and visible to all users.
ROLLBACK: After a rollback, the changes of the current transaction are discarded, so all users
continue to see the data as it was before the transaction.
https://www.javatpoint.com/commit-vs-rollback-in-sql
Savepoint in SQL
o Savepoint is a command in SQL that is used with the rollback command.
o It is a command in Transaction Control Language that is used to mark a point within a
transaction.
o Suppose you are running a very long transaction and want to roll back only to a certain
point in it; this can be achieved using a savepoint.
o While a transaction is in progress, you can mark a point with a name, and later on, if you
want to roll back to that point, you can do it easily by using that name.
o Savepoint is helpful when we want to roll back only a small part of a transaction and not
the whole transaction. In simple words, we can say a savepoint is a bookmark in SQL.
Let us see the practical examples to understand this concept more clearly. We will use the MySQL database
for writing all the queries.
To create a table in the database, first, we need to select the database in which we want to create the table.
Then we will write a query to create a table named student in the selected database 'dbs'.
mysql> CREATE TABLE student(ID INT, Name VARCHAR(20), Percentage INT, Location VARCHAR(20),
DateOfBirth DATE);
Now, we will write a single query to insert multiple records in the student table:
mysql> INSERT INTO student(ID, Name, Percentage, Location, DateOfBirth) VALUES
(1, "Manthan Koli", 79, "Delhi", "2003-08-20"),
(2, "Dev Dixit", 75, "Pune", "1999-06-17"),
(3, "Aakash Deshmukh", 87, "Mumbai", "1997-09-12"),
(4, "Aaryan Jaiswal", 90, "Chennai", "2005-10-02"),
(5, "Rahul Khanna", 92, "Ambala", "1996-03-04"),
(6, "Pankaj Deshmukh", 67, "Kanpur", "2000-02-02"),
(7, "Gaurav Kumar", 84, "Chandigarh", "1998-07-06"),
(8, "Sanket Jain", 61, "Shimla", "1990-09-08"),
(9, "Sahil Wagh", 90, "Kolkata", "1968-04-03");
To verify that multiple records are inserted in the student table, we will execute the SELECT query.
The results show that all nine records are inserted successfully.
To use the TCL commands in SQL, we first need to initiate the transaction by using the BEGIN / START
TRANSACTION command.
Here, we have saved the initiated transaction with the name of 'ini'.
Then, we decided to insert a new record with an ID of 10 into the existing student table.
1. mysql> INSERT INTO student VALUES (10, "Saurabh Singh", 54, "Kashmir", "1989-01-06");
We will execute the SELECT query to verify that the new record with ID as ten is inserted successfully.
To save the transaction with this newly inserted record, we will create a new savepoint.
Here, the newly inserted record table is saved with the savepoint named 'ins'.
To update the record in the student table and set the updated name as 'Mahesh Kuwar' for the record whose
ID is 1, we will execute the following query:
To verify that the record's name field with ID as 1 is updated successfully, we will again execute the SELECT
query.
To save the transaction with this updated record, we will create a new savepoint.
Here, the table with the updated record is saved with the savepoint named 'upd'.
To remove the record from the student table with ID as 6, we will execute the following query:
We will again execute the SELECT query to verify that the record with ID as 6 is removed successfully.
To save the transaction with this removed record, we will create a new savepoint.
Here, the table with the deleted record is saved with the savepoint named 'del'.
Later, we decide that we need the record that was deleted from the student table in the previous step.
Since we created a savepoint after every operation, we can jump back to any of those points in the
transaction. To do so, we will execute the ROLLBACK command along with the name of the
savepoint to which we want to return.
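To return to the state right after the update, the rollback targets the savepoint 'upd'. The sketch assumes that savepoint was created earlier in the same transaction with SAVEPOINT upd.

```sql
-- Undo everything done after the savepoint 'upd' (here: the DELETE of ID 6).
-- The INSERT of ID 10 and the UPDATE of ID 1 are kept.
ROLLBACK TO SAVEPOINT upd;
```

In MySQL, rolling back to a savepoint also discards any savepoints created after it, so 'del' no longer exists afterwards. The transaction itself stays open until it is ended with COMMIT or a plain ROLLBACK.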
To verify that the table is back in the state it was in right after the update in the earlier
steps, we will again execute the SELECT query.
The results show that we have successfully rolled back to the savepoint named 'upd'.
https://www.javatpoint.com/savepoint-in-sql
Learning outcome 4: Implement Database security
Database security comprises the processes, tools, and controls that secure and protect databases against
accidental and intentional threats. In practice, it means protecting sensitive data
and information stored in a database from unauthorized access. The objective of database
security is to secure sensitive data and maintain the confidentiality, availability, and integrity of the
database.
Database Security Threats: Security threats in a Database Management System (DBMS) refer
to risks that can compromise data confidentiality, integrity, and availability.
Access control is critical as it ensures that only authorized individuals can access the
database and perform specific actions on the data.
Access control also helps prevent unauthorized data modifications or deletions.
4. Data Encryption:
Data encryption is a technique used to protect data stored in a database from unauthorized
access by encrypting it using encryption algorithms. Encryption ensures that even if an
unauthorized individual gains access to the data, they cannot read or use it.
5. Auditing and logging are critical as they provide a record of all activities performed on the
database. This can be used to detect and prevent security breaches. Auditing and logging also
help meet regulatory and compliance requirements related to data security and privacy.
Data access control
Data access control in database management systems (DBMS) refers to the mechanisms put in place
to ensure that only authorized users can access and manipulate data within a database. Through
authentication and authorization, access control policies make sure users are who they say they are
and that they have appropriate access to company data.
1. Authentication: is the process of verifying the identity of users attempting to access the
database.
2. Authorization: determines what actions authenticated users are allowed to perform within the database.
3. Encryption: Encrypting sensitive data within the database can add an extra layer of security, ensuring
that even if unauthorized users gain access to the database, they cannot read the data without the
appropriate decryption keys.
4. Auditing: Auditing mechanisms track and record database activity, including user
logins, access attempts, and data modifications.
5. Database Firewalls: Database firewalls monitor and filter incoming and outgoing database
traffic, enforcing access control policies and protecting against unauthorized access or malicious
activity.
The main types of access control are:
1. Role-based access control (RBAC)
RBAC assigns permissions to roles rather than individual users.
Users are then assigned to roles based on their job responsibilities or functions within the organization.
For instance, people in higher authority roles may have more privileges compared to the lower-level
management.
2. Rule-Based Access Control (RuBAC):
• RuBAC uses a set of rules or conditions to determine access.
• These rules can be based on various factors such as user identity, time of access, location, and the data
being accessed.
3. Mandatory access control (MAC)
The mandatory access control system provides the most restrictive protections, where the power to permit
access falls entirely on system administrators. That means users cannot change permissions that deny or
allow them entry into different areas, creating formidable security around sensitive information.
4. Discretionary access control (DAC)
In a discretionary access control system, access control policies are decided by the data owner. The business
owner decides how many people can access the data and what level of access each of them has.
The main disadvantage is that giving end users control of security levels requires oversight. And
since the system requires a more active role in managing permissions, it is easy to let actions fall through the
cracks. Where the MAC approach is rigid and low-effort, a DAC system is flexible and high-effort.
Access control policies help define the standards of data security and data governance for
organizations. They set up the level of access to sensitive information for users based on roles,
policies, or rules.
Access control policies are high-level requirements that specify how access is managed and who
may access information under what circumstances.
Identify the data classifications
The following are five common categories used for data classification:
1. Public data: refers to information that is intended for unrestricted access and does not
contain sensitive or confidential information.
2. Private data: this data should not be available for public access and is often protected
through traditional security measures such as passwords.
3. Internal data: The use of an organization’s internal data is usually limited to its employees.
Examples include:
4. Confidential data: This information should only be accessed by a limited audience that has obtained
proper authorization.
5. Restricted data: Restricted data is the classification used for an organization’s most sensitive
information. Access to this data is strictly controlled to prevent its unauthorized use.
Roles represent sets of permissions or access rights that define what actions users are
allowed to perform within the database. Examples of roles include "admin", "developer",
"analyst", "manager", etc.
Permissions, also known as privileges or access rights, define the actions that users or
roles are allowed to perform on database objects. Common permissions include select,
insert, update, delete, create, alter, drop, grant.
Authentication: ensures that only authorized individuals can access the database by
verifying their identity using usernames and passwords.
Steps involved in managing authentication within a DBMS:
Identify user accounts: Create user accounts within the database system for each
individual or entity that requires access. Each user account should have a unique identifier
(username) and associated credentials.
Create privileges: Assign appropriate privileges to each user account. Privileges determine
what actions users are allowed to perform within the database, such as querying data or inserting
records.
Configure the authentication system: Most DBMS offer built-in mechanisms for user
authentication. This involves creating user accounts within the database itself, assigning them
passwords, and potentially linking them to operating system credentials for a single sign-on
experience.
Test the authentication system: Rigorously test the authentication system to ensure it
functions as expected. This involves simulating login attempts with valid and invalid
credentials, verifying access granted based on assigned privileges, and identifying any
vulnerabilities.
Monitor and maintain: Regularly monitor the authentication system for suspicious activity,
such as failed login attempts or unauthorized access. Enforce strong password policies,
require periodic password changes, and stay updated on the latest security patches for your
DBMS to address potential weaknesses.
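The account creation and password policy steps above can be sketched in MySQL as follows. The account name report_user, the password, and the database name school are hypothetical; the FAILED_LOGIN_ATTEMPTS and PASSWORD_LOCK_TIME options assume MySQL 8.0.19 or later.

```sql
-- Hypothetical account, host, and password: replace with your own values
CREATE USER 'report_user'@'localhost'
  IDENTIFIED BY 'Str0ng!Passw0rd'
  PASSWORD EXPIRE INTERVAL 90 DAY   -- require a password change every 90 days
  FAILED_LOGIN_ATTEMPTS 3           -- lock the account after 3 consecutive failed logins
  PASSWORD_LOCK_TIME 1;             -- keep it locked for 1 day

-- Give the account only the privileges it needs (read access to one table)
GRANT SELECT ON school.student TO 'report_user'@'localhost';

-- Test step: confirm which privileges the account actually holds
SHOW GRANTS FOR 'report_user'@'localhost';
```

Logging in as report_user and attempting an INSERT into student should then fail with an access-denied error, which is a simple way to test that the restriction works.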
Authorization: controls what actions each user can perform on the database based on their
role or privileges. To manage authorization effectively:
Create roles: Define different roles that represent various job functions or levels of access
within the database. For example, you might have roles like "Admin," "Manager,"
"Employee," or "Read-only."
Syntax (SQL Server):
CREATE ROLE role_name [AUTHORIZATION owner_name];
Example:
CREATE ROLE manager;
Role created.
Arguments
role_name: Is the name of the role to be created.
AUTHORIZATION owner_name: Is the database user or role that will own the new role. If no user
is specified, the user who executes CREATE ROLE will be the owner of the role. The role’s
members can be added or removed at the discretion of the role’s owner or any member of an
owning role.
Drop a Role
DROP ROLE manager;
Assign permissions/privilege to roles: Once roles are created, assign specific permissions or
privileges to each role. Common permissions include SELECT (read data), INSERT (add
data), UPDATE (modify data), DELETE (remove data), and administrative privileges like
CREATE (create objects) or DROP (delete objects).
Assign roles to users: Assign users to the appropriate roles based on their job functions and
data access needs.
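The three steps (create roles, assign privileges to roles, assign roles to users) can be sketched in MySQL 8.0 as follows. The role names follow the examples in the text; the user alice, her password, and the database name school are hypothetical.

```sql
-- 1. Create roles that mirror job functions
CREATE ROLE 'manager', 'read_only';

-- 2. Assign privileges to the roles, not to individual users
GRANT SELECT, INSERT, UPDATE, DELETE ON school.student TO 'manager';
GRANT SELECT ON school.student TO 'read_only';

-- 3. Assign a role to a user (hypothetical account, created here if missing)
CREATE USER IF NOT EXISTS 'alice'@'localhost' IDENTIFIED BY 'Ch4nge!Me';
GRANT 'manager' TO 'alice'@'localhost';

-- Activate the role automatically whenever the user logs in
SET DEFAULT ROLE 'manager' TO 'alice'@'localhost';

-- Check the effective privileges the user gets through the role
SHOW GRANTS FOR 'alice'@'localhost' USING 'manager';
```

Because privileges are attached to the role, a later change such as REVOKE DELETE ON school.student FROM 'manager'; affects every user who holds that role.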
Permissions
Creating a role in SQL Server requires either membership in the fixed database role db_securityadmin or the
CREATE ROLE permission on the database. The following permissions are also necessary when using the
AUTHORIZATION option:
1. Assigning ownership of a role to another user requires IMPERSONATE permission on that
user.
2. Assigning ownership of a role to another role requires membership in the recipient role or the
ALTER permission on that role.
3. Assigning ownership of a role to an application role requires ALTER permission on the application role.
Now that the role is created, the DBA can use the GRANT statement to assign users to the role
as well as assign privileges to the role.
It’s easier to GRANT or REVOKE privileges to the users through a role rather than assigning a
privilege directly to every user.
If a role is identified by a password, then GRANT or REVOKE privileges have to be identified
by the password.
Grant Privileges To a Role
GRANT CREATE TABLE, CREATE VIEW TO manager;
Grant succeeded.
Test the authorization system: Test various scenarios to verify that users can perform their
expected actions and are restricted from unauthorized activities.
Procedure Steps
1) Start Microsoft SQL Server Management Studio.
2) Right-click the server name and select "Properties."
3) Select the Security page.
4) Verify that Server authentication is set to SQL Server and Windows Authentication
mode.
5) Verify that Login auditing is set to Failed logins only.
6) Click [OK] and exit the application.
Monitor and maintain: Continuously monitor the authorization system to detect any
anomalies, such as unauthorized access attempts or changes to permissions.
• Management of Auditing and logging
Managing auditing and logging is critical for maintaining the security and integrity of
systems and networks.
An audit log (an audit trail) is a chronological record of all activities and security events
within a computer system or network.
Logging: Database logging stores a record of changes to tables or fields in the
database log table.
Log files can be used to review any event within a system, including failures and state
transformations.
Steps involved in effective database logging:
Identify the logging requirements: involves understanding what information needs to
be captured in the logs to achieve your specific goals.
User Access: Track login attempts (successful/failed), user activities (data
modifications, queries executed), and access control changes.
Suspicious Activity: Identify unauthorized access attempts, unusual data access patterns,
or potential security breaches.
Configure logging settings: involves defining various parameters that control how and
what information gets logged.
Steps to Configure logging settings
• Access DBMS configuration tools or command-line interfaces.
• Set the log level (error, warning, information, debug).
• Choose a log format (plain text, CSV, JSON) based on analysis needs.
• Define the log location (local server or centralized log server) for security and manageability.
• Implement log rotation strategies (size-based or time-based) to manage disk space.
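As an illustration, several of these settings map to MySQL system variables. This is a sketch, assuming MySQL 8.0 and an account allowed to change global variables; the log file path is hypothetical.

```sql
-- Log level for the error log:
-- 1 = errors only, 2 = errors and warnings, 3 = errors, warnings, and notes
SET GLOBAL log_error_verbosity = 2;

-- Log location and format: write the query logs to files rather than tables
SET GLOBAL log_output = 'FILE';
SET GLOBAL general_log_file = '/var/log/mysql/general.log';  -- hypothetical path

-- Record every statement the server receives (grows quickly on busy systems)
SET GLOBAL general_log = 'ON';

-- Record statements that run longer than 2 seconds
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 2;

-- Time-based retention for the binary log: expire files after 7 days
SET GLOBAL binlog_expire_logs_seconds = 604800;

-- Verify the current settings
SHOW VARIABLES LIKE 'log_error%';
```

Values set with SET GLOBAL are lost when the server restarts; to keep them, they would also need to go into the server configuration file. The general and slow query log files are not rotated by MySQL itself, so file rotation has to be handled outside the server.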
Types of database logs depending on the level:
1. Error Logs (Highest Severity): Capture critical errors that prevent the database
from functioning correctly.
• Examples: Disk failures, database corruption, connection errors, deadlocks.
• Action: These require immediate attention as they indicate potential data
2. Warning Logs (Medium Severity): Record potential issues that could lead
to problems if not addressed.
• Examples: Low disk space warnings, invalid user login attempts, inefficient queries.
• Action: Although not critical, investigate warnings to prevent future problems and optimize
performance.
Execution of SQL command: Audit the execution of SQL commands to track database
activity and ensure accountability. This involves capturing information about the SQL
commands executed, including the type of command (e.g., SELECT, INSERT,
UPDATE, DELETE), the user who executed the command, etc.
Configure audit settings: Configure audit settings in the DBMS to capture details like
the SQL statement itself, user information, timestamp, database object, data changes (for
data manipulation statements), and outcome (success/failure).
Review audit: involves analyzing captured data to gain valuable insights into various
aspects of your database, including security posture, performance bottlenecks, and user
activity.
Analyse audit data: is the process of examining and interpreting information recorded
by your database management system about user activity, data modifications, and
system events.
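One simple way to capture and then review audit data is a trigger that writes to an audit table. This is only a sketch of the idea, not a replacement for a DBMS audit feature: the table student_audit, its columns, and the trigger name are hypothetical, and it covers only UPDATE statements on the student table.

```sql
-- Hypothetical audit table: who changed which student, when, and how
CREATE TABLE student_audit (
  audit_id    INT AUTO_INCREMENT PRIMARY KEY,
  student_id  INT          NOT NULL,
  action_type VARCHAR(10)  NOT NULL,
  old_name    VARCHAR(100),
  new_name    VARCHAR(100),
  changed_by  VARCHAR(288) NOT NULL,  -- 'user@host' of the session
  changed_at  DATETIME     NOT NULL
);

-- Capture every UPDATE on student.
-- USER() is used because, inside a trigger, CURRENT_USER() returns the
-- trigger's definer rather than the account that ran the statement.
CREATE TRIGGER trg_student_update_audit
AFTER UPDATE ON student
FOR EACH ROW
INSERT INTO student_audit
  (student_id, action_type, old_name, new_name, changed_by, changed_at)
VALUES
  (OLD.ID, 'UPDATE', OLD.Name, NEW.Name, USER(), NOW());

-- Review and analyse: number of changes per user over the last day
SELECT changed_by, COUNT(*) AS changes
FROM student_audit
WHERE changed_at >= NOW() - INTERVAL 1 DAY
GROUP BY changed_by
ORDER BY changes DESC;
```

An unusually high count for one account, or changes made at unexpected times, would be a starting point for the corrective action step.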
Corrective action: Corrective action (CA) consists of the activities taken to eliminate the cause
of a process nonconformity.
Implementation of Data encryption
Description of data encryption
Data Backup: Data backup involves creating copies of database files, ensuring that in case of
data loss or corruption, the information can be recovered from these copies.
Data Restore: Data restore refers to the process of recovering data from backup copies and restoring it to
the database, typically after data loss, corruption, or system failure.
Backup Method
Full backup: Creates a complete copy of your entire database,
Advantages: Simplest restore, because everything is recovered from the single full backup.
Disadvantages: Requires significant storage space and longer backup times compared to
other methods.
Differential backup: Captures only the data that has changed since the last full backup.
Advantages: Faster backup process compared to full backup, as it only includes changes since
the last full backup.
Incremental backup: An incremental backup captures changes made to the database since the last
backup, whether it was a full backup or an incremental backup.
Advantages: Requires the least storage space and shortest backup times, as it only includes
changes since the last backup.
Disadvantages: Longer restore times compared to full backup and differential backup, as it
may require multiple incremental backups to restore to a specific point in time.
✔ Backup schedule: A backup schedule defines the frequency that a backup job should be run
automatically, for example, monthly, weekly, daily, hourly etc. You can determine how long a
backup job is permitted to run.
✔ Create Backup: Backup using MySQL Workbench, Backup using SQLBackupAndFTP
What's the best way to create a DBMS backup and recovery plan in web development?
1. Why backup and recovery? Backup and recovery are critical for protecting data within any DBMS.
Backup creates secure data copies, while recovery restores data after loss.
2. What to back up and how often? The first step to create a backup and recovery plan is to decide
what data you need to back up and how often you need to do it.
3. Where to store your backups? The next step to create a backup and recovery plan is to choose where
to store your backups.
4. How to test and verify your backups? The final step to create a backup and recovery plan is to test and
verify your backups. Testing and verifying your backups can ensure that your backups are valid,
complete, and consistent.
1. To create a backup of all MySQL server databases, run the following command:
mysqldump --user root --password --all-databases > all-databases.sql
2. To recover data, use the following command:
mysql --user root --password mysql < all-databases.sql
3. Often you need to back up not the entire server, but a specific database. To dump a
specific database, use the name of the database instead of the --all-databases
parameter:
mysqldump --user root --password [db_name] > [db_name].sql
To restore that database from the dump, run:
mysql --user root --password [db_name] < [db_name].sql
✔ Perform recovery method: Database recovery is the process of restoring the database to a correct
(consistent) state in the event of a failure. The main goal of recovery techniques is to ensure data
integrity and consistency and prevent data loss.
Full database recovery: Test the process of restoring the entire database from a full backup.
Ensure that all data is recovered successfully and the database is operational.
Rollback recovery: This method focuses on undoing incomplete transactions. It's helpful when a
transaction starts but fails to finish due to an error or system crash. The database management
system (DBMS) uses the transaction log to identify uncommitted changes and rolls them back,
ensuring data consistency.
Point-in-time recovery: This recovery method allows you to recover the database to a specific
point in time. Use transaction logs or incremental backups to restore the database to the desired
timestamp. It's particularly useful for recovering from accidental data deletion or modification.
Test your backup and recovery plan
Review your plan: Thoroughly understand your documented backup and recovery
procedures. Identify the different types of backups (full, differential, incremental),
their schedules, and storage locations (local, offsite).
Choose a test environment: Ideally, use a non-production environment that mirrors
your production database setup. This allows you to test recovery procedures without
impacting live data.
Data Backup Verification: Before simulating failures, verify the integrity of your
recent backups. Use built-in DBMS tools or third-party utilities to check for
corruption or missing files.
Testing and verifying your backups also helps you identify and fix any errors or issues in your backup
process. You should test and verify your backups periodically, using different methods, such as restoring
your backups to a test environment, comparing your backups with the original data, or using tools that
check the integrity and consistency of your backups.
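Comparing a restored backup with the original data can be sketched with a few built-in MySQL statements. The database name school is hypothetical; the idea is to run the same statements on the original server and on the test server after the restore, then compare the output.

```sql
-- Check the table for structural errors or corruption
CHECK TABLE school.student;

-- Compute a checksum over the table contents; the value should match
-- between the original and the restored copy
CHECKSUM TABLE school.student;

-- Cross-check: the row counts should also match
SELECT COUNT(*) AS row_count FROM school.student;
```

Checksums are only comparable when both servers run the same MySQL version and the table has not changed since the backup was taken, so this check is best run against a quiet or read-only source.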