IntroToDbms-Test1Notes

Uploaded by

Pranav Kakade
Copyright: © All Rights Reserved

About Database - Introduction

A database is an organized collection of information that is stored electronically to be
maintained, accessed, and analyzed efficiently. It can store various types of data, including
text, numbers, images, videos, and files.

A Database Management System (DBMS) is the software used to manage and interact with
the database, enabling users to store, retrieve, and edit data. The combination of the DBMS
and the data it manages is often referred to as a “database system,” or simply a “database.”
Databases are stored on servers either on-premises at an organization’s office or off-
premises at an organization’s data center (or even within their cloud infrastructure).
Databases come in many formats in order to do different things with various types of data.

Advantages of databases
Computerized databases were first introduced to the world in the 1960s and have since
become the foundation for products, analysis, business processes and more. Many of the
services you use online every day (banking, social media, shopping, email) are all built on
top of databases.

Today, databases are used for many reasons.

Databases Hold Data Efficiently

We use databases because they are an extremely efficient way of holding vast amounts of
data and information. Databases around the world store everything from your credit card
transactions to every click you make within one of your social media accounts. Given there
are nearly eight billion people on the planet, that’s a lot of data.

Databases Allow Smooth Transactions

Databases allow access to various services which, in turn, allow you to access your accounts
and perform transactions all across the internet. For example, your bank’s login page will
ping a database to figure out if you’ve entered the right password and username. Your
favorite online shop pings your credit card’s database to pull down the funds needed for you
to buy that item you’ve been eyeing.

Databases Update Information Quickly

Databases allow for easy information updates on a regular basis. Adding a video to your
social media account, directly depositing your salary into your bank account or buying a
plane ticket for your next vacation are all updates made to a database and displayed back to
you almost instantaneously.

Databases Simplify Data Analysis

Databases make research and data analysis much easier because they are highly structured
storage areas of data and information. This means businesses and organizations can easily
analyze databases once they know how a database is structured. Common structures (e.g.
table formats, cell structures like date or currency fields) and common database querying
languages (e.g., SQL) make database analysis easy and efficient.

History of databases

The database as we know it today dates back to the 1960s when the use of computers
became popular. Below are some of the main milestones in the history of databases.

SQL in 1970s:
In 1970, IBM computer scientist Edgar Codd published his paper “A Relational Model of
Data for Large Shared Data Banks.” This paper introduced the relational model and
established a new way to store and access data.

Following Codd’s paper, Michael Stonebraker and Eugene Wong at the University of
California, Berkeley created INGRES (Interactive Graphics and Retrieval System). INGRES
was a relational database system that used the QUEL query language. In 1974, IBM began
work on System R, its own relational prototype, whose query language SEQUEL was later
renamed Structured Query Language (SQL).

RDBMS in 1980s:
Relational databases grew in popularity during the 1980s, and SQL became the standard
language for querying and managing the data. Database Management Systems (DBMSes)
became essential tools for handling data storage, retrieval, and security for multiple users.

Internet in 1990s:
The rise of the internet in the 1990s fueled the next round of growth in the database
industry. The Relational Database Management System (RDBMS) model, designed to
manage the data of a single organization, wasn’t prepared to handle the volume of data that
web applications were generating. Furthermore, with the decline in performance and
increase in maintenance costs, developers looked for a new solution, and found MySQL, an
open-source relational database.

This period also saw the need to organize data more efficiently, leading to advancements in
database architecture and the management of structured and unstructured data.

NoSQL in 2000s:
The term NoSQL was coined in 1998 for a relational database that did not use SQL as its
query language; today it is usually read as “not only SQL.” As the internet continued to
grow, there was a need for a new kind of database that could store unstructured and
semi-structured data. This led to the emergence of NoSQL databases, which became popular
due to their speed and flexibility in handling large amounts of unstructured data.

NoSQL databases support different data models, including document, key-value, graph, and
column-family. They also provide solutions for modern applications that require scalability
and fast access to data.

Today:
In recent years organizations have increasingly been adopting cloud-native and purpose-
built databases. They are moving away from on-premises and legacy databases to cloud-
native databases to improve agility, scalability, and decrease total cost of ownership.

Modern databases now support hybrid cloud computing platforms and integrated data
stores with both structured and unstructured data. These advancements help manage
distributed data across multiple users and systems. They also ensure data security and
compliance.

About SQL

SQL is a query language used to communicate with relational databases. The
American National Standards Institute (ANSI) adopted SQL as a standard in 1986, and it
remains the standard language for relational database management systems. SQL statements
are used to add, remove, modify, and query data, and they can also be used to grant
permissions to users or roles. Popular RDBMSes that use SQL include Oracle, Microsoft SQL
Server, IBM Db2, MySQL, PostgreSQL, Microsoft Access, and Ingres.

Applications of Databases
When used correctly, databases can be a helpful tool for organizations in various industries
looking to better arrange their information. Common use cases include:

• Healthcare: storing massive amounts of patient data.
• Logistics: monitoring and analyzing route information and delivery statuses.
• Insurance: storing customer data like addresses, policy details and driver history.
• Finance: handling account details, invoices, stock information and other assets.
• E-commerce: compiling and arranging data on products and customer behavior.
• Transportation: storing passengers’ names, scheduled flights and check-in status.
• Manufacturing: keeping track of machinery status and production goals.
• Marketing: collecting data on demographics, purchasing habits and website visits.
• Education: tracking student grades, course schedules and more.
• Human resources: organizing personnel info, benefits and tax information.

A database management system (DBMS) is a software package we use to create and
manage databases. In other words, a DBMS makes it possible for users to actually interact
with the database: it is the layer through which we access, add, modify and delete content
in the database. There are several types of database management systems, including
relational, non-relational and hierarchical.

Characteristics of DBMS
o It uses a digital repository established on a server to store and manage the
information.
o It provides a clear and logical view of the processes that manipulate data.
o It contains automatic backup and recovery procedures.
o It supports ACID properties, which keep data in a consistent state in case
of failure.
o It reduces the complexity of relationships between data.
o It supports manipulation and processing of data.
o It provides security of data.
o It can present the database from different viewpoints according to the
requirements of the user.

Types of Databases
There are many types of databases used today. Below are some of the more prominent
ones.

1. Hierarchical Databases

Hierarchical databases were the earliest form of databases. You can think of these
databases like a simplified family tree. There’s a single parent object (like a table) that has
child objects (or tables) under it. A parent can have one or many child objects, but a child
object has only one parent. The benefit of these databases is that they’re fast and
efficient, and there’s a clear, threaded relationship from one object to another. The
downside to hierarchical databases is that they’re very rigid and highly structured.

2. Relational Databases

Relational databases are perhaps the most popular type of database. Relational databases
are set up to connect their objects (like tables) to each other with keys. For example, there
might be one table with user information (name, username, date of birth, customer
number) and another table with purchase information (customer number, item purchased,
price paid). In this example, the key that creates a relationship between the tables is the
customer number.
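
The two-table example above can be sketched as follows. This is only an illustration, assuming Python's built-in sqlite3 module as a stand-in for a relational DBMS; the table names, column names and sample rows are hypothetical and not taken from any real system.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Hypothetical tables mirroring the example in the text.
conn.executescript("""
CREATE TABLE users (
    customer_number INTEGER PRIMARY KEY,
    name            TEXT,
    username        TEXT,
    date_of_birth   TEXT
);
CREATE TABLE purchases (
    customer_number INTEGER,
    item_purchased  TEXT,
    price_paid      REAL
);
INSERT INTO users VALUES (1, 'Asha Rao', 'asha', '1999-04-12');
INSERT INTO users VALUES (2, 'Ben Cole', 'benc', '2001-11-30');
INSERT INTO purchases VALUES (1, 'Keyboard', 45.0);
INSERT INTO purchases VALUES (1, 'Mouse', 20.0);
INSERT INTO purchases VALUES (2, 'Monitor', 150.0);
""")

# The shared customer_number column is the key that relates the two tables.
rows = conn.execute("""
    SELECT u.name, p.item_purchased, p.price_paid
    FROM users AS u
    JOIN purchases AS p ON p.customer_number = u.customer_number
    ORDER BY u.customer_number, p.item_purchased
""").fetchall()

for name, item, price in rows:
    print(name, item, price)
```

Each printed row combines a user's name with one of that user's purchases, which is possible only because both tables carry the customer number.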

3. Non-Relational or NoSQL Databases

Non-relational databases were developed more recently than relational and
hierarchical databases, in response to the growing complexity of web applications. A
non-relational database is any database that doesn’t use the relational model. You might
also see them referred to as NoSQL databases. Non-relational databases store data in
different forms, such as key-value pairs, documents or graphs. Relational
databases are based on a rigid structure, whereas non-relational databases are more
flexible.

4. Cloud Databases

Cloud databases refer to information that’s accessible in a hybrid or cloud environment. All
users need is an internet connection to reach their files and manipulate them like any other
database. A convenience of cloud databases is that they don’t require extra hardware to
create more storage space. Users can either build a cloud database themselves or pay for a
service to get started.

5. Centralized Databases

Centralized databases are contained within a single computer or another physical system.
Although users may access data through devices connected within a network, the database
itself operates from one location. This approach may work best for larger companies or
organizations that want to prioritize data security and efficiency.

6. Distributed Databases

Distributed databases run on more than one device. That can be as simple as operating
several computers on the same site, or a network that connects to many devices. An
advantage of this method is that if one computer goes down, the other computers and
devices keep functioning.

7. Object-Oriented Databases

Object-oriented databases perceive data as objects and classes. Objects are specific data —
like names and videos — while classes are groups of objects. Storing data as objects means
users don’t have to distribute data across tables. This makes it easier to determine the
relationships between variables and analyze the data.

8. Graph Databases

Graph databases highlight the relationships between various data points. While users may
have to do extra work to determine trends in other types of databases, graph databases
store relationships right next to the data itself. Users can then immediately see how various
data points are connected to each other.

A Relational Database Management System (RDBMS) is a database management
system based on the relational model introduced by E. F. Codd. In the relational model,
data is stored in relations (tables) and is represented in the form of tuples (rows).

An RDBMS is used to manage a relational database. A relational database is an
organized collection of tables related to each other, from which data can be accessed easily.
The relational database is the most commonly used type of database today.

Database Languages

• Data Definition Language
• Data Manipulation Language
• Data Control Language
• Transactional Control Language

Data Definition Language (DDL)


DDL is the short name for Data Definition Language, which deals with database schemas
and descriptions of how the data should reside in the database.
• CREATE: create a database and its objects (tables, indexes, views, stored
procedures, functions, and triggers)
• ALTER: alter the structure of an existing database object
• DROP: delete objects from the database
• TRUNCATE: remove all records from a table and release the space allocated for
them
• COMMENT: add comments to the data dictionary
• RENAME: rename an object
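
A few of the DDL commands listed above can be tried out in a short sketch. It assumes Python's built-in sqlite3 module; the employee table and the column_names helper are hypothetical names chosen for illustration. SQLite has no TRUNCATE or COMMENT statement, so those two commands are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def column_names(table):
    """Return the column names of a table (hypothetical helper for this sketch)."""
    # PRAGMA table_info yields one row per column; index 1 holds the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


# CREATE: define a new table.
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT)")
cols_after_create = column_names("employee")

# ALTER: change the structure of the existing table.
conn.execute("ALTER TABLE employee ADD COLUMN department TEXT")
cols_after_alter = column_names("employee")

# RENAME: in SQLite this is spelled ALTER TABLE ... RENAME TO.
conn.execute("ALTER TABLE employee RENAME TO staff")
cols_after_rename = column_names("staff")

# DROP: remove the table and its definition.
conn.execute("DROP TABLE staff")
cols_after_drop = column_names("staff")  # empty list: the table no longer exists

print(cols_after_create, cols_after_alter, cols_after_rename, cols_after_drop)
```

None of these statements touches a single row of data; they only change what the database is allowed to hold, which is what separates DDL from DML.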

Data Manipulation Language (DML)


DML is the short name for Data Manipulation Language, which deals with data
manipulation and includes the most common SQL statements, such as SELECT, INSERT, UPDATE
and DELETE. It is used to store, modify, retrieve, delete and update data in a
database. Data Query Language (DQL) is a subset of Data Manipulation Language. The
most common DQL command is the SELECT statement, which retrieves data from a table
without changing anything in the table.
• SELECT: retrieve data from a database
• INSERT: insert data into a table
• UPDATE: update existing data within a table
• DELETE: delete records from a table (all records if no WHERE condition is given)
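
The four DML commands can be seen working together in a small sketch, assuming Python's built-in sqlite3 module. The student table and its rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER)"
)

# INSERT: add three records; the ? placeholders keep values separate from the SQL text.
conn.executemany(
    "INSERT INTO student (roll_no, name, marks) VALUES (?, ?, ?)",
    [(1, "Asha", 72), (2, "Ben", 58), (3, "Chen", 91)],
)

# UPDATE: change one field of one record.
conn.execute("UPDATE student SET marks = ? WHERE roll_no = ?", (65, 2))

# DELETE: the WHERE clause limits the deletion to a single record.
conn.execute("DELETE FROM student WHERE roll_no = ?", (3,))

# SELECT: read the data back without modifying it.
remaining = conn.execute(
    "SELECT roll_no, name, marks FROM student ORDER BY roll_no"
).fetchall()
print(remaining)
```

Dropping the WHERE clause from the DELETE statement would remove every record in the table, which is why the condition matters.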

Data Control Language (DCL)


DCL is short for Data Control Language, which controls access to the
database by granting and revoking user permissions.
• GRANT: give a user permission to run DML commands (SELECT, INSERT,
DELETE, …) on a table
• REVOKE: withdraw a user’s permission to run DML commands (SELECT, INSERT,
DELETE, …) on the specified table

Transactional Control Language (TCL)


TCL is short for Transactional Control Language, which manages the transactions
that group changes to the database. Some of the TCL commands are:
• Roll Back: cancels (undoes) the changes made in the current transaction
• Commit: applies the changes permanently to the database
• Save Point: marks a point inside a transaction to which the transaction can later be
rolled back without discarding the work done before that point
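
The three TCL commands can be sketched as below, assuming Python's built-in sqlite3 module. The connection is opened with isolation_level=None so that every BEGIN, COMMIT and ROLLBACK is written out explicitly; the account table and the savepoint name are hypothetical.

```python
import sqlite3

# isolation_level=None disables the module's implicit transactions,
# so the sketch controls transaction boundaries itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")


def balances():
    return conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()


# COMMIT: the insert becomes permanent.
conn.execute("BEGIN")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("COMMIT")
after_commit = balances()

# SAVEPOINT: undo only the work done after the marker, keep the earlier update.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = 150 WHERE id = 1")
conn.execute("SAVEPOINT before_second_row")
conn.execute("INSERT INTO account VALUES (2, 999)")
conn.execute("ROLLBACK TO SAVEPOINT before_second_row")
conn.execute("COMMIT")
after_savepoint = balances()

# ROLLBACK: discard everything done in the transaction.
conn.execute("BEGIN")
conn.execute("DELETE FROM account")
conn.execute("ROLLBACK")
after_rollback = balances()

print(after_commit, after_savepoint, after_rollback)
```

The second transaction keeps the balance update but loses the extra row, showing that a savepoint gives a partial undo inside a transaction rather than temporary storage.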

Common functions that a DBMS performs:

• Administration tasks. A DBMS supports many typical database administration
tasks, including change management, performance monitoring and tuning,
security, and backup and recovery. Most database management systems are also
responsible for automated rollbacks and restarts as well as logging and auditing
of activity in databases and the applications that access them.

• Storage. A DBMS provides efficient data storage and retrieval by organizing
data into structures such as tables, rows and columns.

• Concurrency control. In environments where multiple users access and modify
the database simultaneously, a DBMS guarantees controlled transaction
execution to prevent data corruption or inconsistency.

• Centralized view. A DBMS provides a centralized view of data that multiple users
can access from multiple locations in a controlled manner. A DBMS can limit
what data end users see and how they view the data, providing many views of a
single database schema. End users and software programs are free from having
to understand where the data is physically located or on what type of storage
medium it resides because the DBMS handles all requests.

• Data manipulation. A DBMS lets users insert, update and delete data inside a
database while enforcing data integrity and consistency.

• Data independence. A DBMS offers both logical and physical data independence
to protect users and applications from having to know where data is stored or
from being concerned about changes to the physical structure of data. As long as
programs use the application programming interface (API) for the database that
the DBMS provides, developers won't have to modify programs just because
changes have been made to the database. In a relational database management
system (RDBMS), the most widely used type of DBMS, the API is Structured Query
Language (SQL), a standard language for defining, protecting and accessing data.

• Backup and recovery. A DBMS facilitates backup and recovery options by
creating backup copies so that data can be restored to a consistent state. This
protects against data loss due to hardware failures, software errors or other
unforeseen events.

Database Schema

A database schema logically describes a part or all of a database by displaying the data
structure in tables, fields, and relationships. You can think of it as a blueprint for
understanding an organization’s data resources.

Database 3 Level Architecture

Three Schema Architecture of DBMS

Within a database management system (DBMS), the term "schema" pertains to the logical
structure or arrangement of data, dictating how it is stored and accessed. "Architecture"
denotes the comprehensive layout and organization of the database. The three-schema
architecture in DBMS segregates the logical and physical aspects of the system, enabling
modifications to one layer without impacting the others. This segregation facilitates the
preservation of data integrity and consistency.

The three layers of a three-schema architecture are:

• External Layer
• Conceptual Layer
• Internal Layer

What is External Schema?

In a DBMS, the External layer offers a logical perspective of the database, serving as the
accessible portion that users interact with. This topmost layer is specifically designed to
provide a user-friendly interface for the database. To illustrate, consider an example of an
Employee Management system. When an employee logs into the system, the External layer
enables the display of the employee’s information.

What is Conceptual Schema?

The Conceptual schema defines the overall logical structure of the database: which
entities exist, what attributes they have, and how they relate to one another. For instance,
in an employee database, it outlines the columns or attributes of the table. It serves as a
high-level representation of the database. The Conceptual schema is commonly depicted
using the Entity-Relationship Model (ER Model), which employs symbols to visually
represent data elements and relationships specific to a given system. In an ER Model, the
database is portrayed through an ER Diagram.

[Figure: ER Diagram for an Employee Management system, illustrating the relationships
among the Employee, Department, Employee’s Role, and Login System.]

What is Internal Schema?

The internal schema in a database management system (DBMS) refers to the lowest level of
the three-schema architecture. It describes the physical storage structure and organization
of data within the database. The internal schema defines how the data is stored on the
storage media, such as disks or tapes, and how it is accessed by the system. This includes
details like data file formats, indexing techniques, storage allocation methods, and any
physical constraints or optimizations implemented in the database. The internal schema is
primarily concerned with the efficient storage and retrieval of data, and it is hidden from
the users and applications that interact with the database through the higher-level schemas.
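
One common way to illustrate the three levels is to map them onto objects that most relational systems offer: a view for the external level, a table definition for the conceptual level, and an index for the internal level. The sketch below follows that mapping, assuming Python's built-in sqlite3 module; the mapping is a simplification, and the employee table, the employee_directory view and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual level: the full logical definition of the entity.
CREATE TABLE employee (
    emp_id     INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    department TEXT NOT NULL,
    salary     INTEGER NOT NULL
);
INSERT INTO employee VALUES (1, 'Asha', 'Sales', 52000);
INSERT INTO employee VALUES (2, 'Ben', 'Support', 48000);

-- External level: a view exposing only what one group of users should see.
CREATE VIEW employee_directory AS
    SELECT emp_id, name, department FROM employee;
""")

query = "SELECT * FROM employee_directory ORDER BY emp_id"
before_index = conn.execute(query).fetchall()

# Internal level: a storage-level change (a new index) that users never see directly.
conn.execute("CREATE INDEX idx_employee_department ON employee (department)")
after_index = conn.execute(query).fetchall()

# Column names visible through the external view; salary is not among them.
view_columns = [d[0] for d in conn.execute(query).description]
print(view_columns, before_index == after_index)
```

The view returns the same rows before and after the index is created, which is the point of separating the layers: a change at the internal level does not ripple up to what users see.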

The Structure of Database System

1. Query Processor:
It translates the requests (queries) received from the end user via an application program
into instructions, and it executes the instructions produced by the DML compiler.
The Query Processor contains the following components –
• DML Compiler: It translates DML statements into low-level instructions
that the query evaluation engine can execute.
• DDL Interpreter: It processes DDL statements into a set of tables containing
metadata (data about data).
• Embedded DML Pre-compiler: It converts DML statements embedded in an
application program into procedural calls.
• Query Optimizer: It chooses an efficient execution plan for the instructions
generated by the DML Compiler.
2. Storage Manager:
Storage Manager is a program that provides an interface between the data stored in the
database and the queries received. It is also known as Database Control System. It
maintains the consistency and integrity of the database by applying the constraints and
executing the DCL statements. It is responsible for updating, storing, deleting, and
retrieving data in the database.
It contains the following components –
• Authorization Manager: It enforces role-based access control, i.e., it checks
whether a particular user is privileged to perform the requested operation.

• Integrity Manager: It checks the integrity constraints when the database is
modified.

• Transaction Manager: It controls concurrent access by scheduling the operations
of the transactions it receives. Thus, it ensures that the database remains in a
consistent state before and after the execution of a transaction.

• File Manager: It manages the file space and the data structures used to
represent information in the database.

• Buffer Manager: It is responsible for cache memory and the transfer of data
between secondary storage and main memory.

3. Disk Storage:
It contains the following components:
• Data Files: They store the data itself.
• Data Dictionary: It contains information about the structure of every
database object. It is the repository that holds the metadata.
• Indices: They provide faster retrieval of data items.
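
The data dictionary and the indices can both be inspected from a program. The sketch below assumes Python's built-in sqlite3 module, where the built-in sqlite_master table plays the role of the data dictionary; the product table and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, price REAL);
CREATE INDEX idx_product_name ON product (name);
""")

# Data dictionary: SQLite records every table and index in sqlite_master.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY type, name"
).fetchall()

# Indices: ask the query planner how it would find rows by name.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT price FROM product WHERE name = ?", ("Widget",)
).fetchall()

# The last column of each plan row is a human-readable description of one step.
uses_index = any("idx_product_name" in row[-1] for row in plan)

print(catalog)
print(uses_index)
```

The catalog query reads metadata rather than user data, and the plan shows the index being chosen for the lookup instead of a scan of the whole table.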

Properties of Transaction
The ACID properties, in totality, provide a mechanism to ensure the correctness and
consistency of a database in a way such that each transaction is a group of operations that
acts as a single unit, produces consistent results, acts in isolation from other operations,
and updates that it makes are durably stored.
ACID properties are the four key characteristics that define the reliability and consistency
of a transaction in a Database Management System (DBMS). The acronym ACID stands for
Atomicity, Consistency, Isolation, and Durability. Here is a brief description of each of
these properties:
1. Atomicity: Atomicity ensures that a transaction is treated as a single, indivisible
unit of work. Either all the operations within the transaction are completed
successfully, or none of them are. If any part of the transaction fails, the entire
transaction is rolled back and the database returns to the state it was in before
the transaction began.
2. Consistency: Consistency ensures that a transaction takes the database from
one consistent state to another consistent state. The database is in a consistent
state both before and after the transaction is executed. Constraints, such as
unique keys and foreign keys, must be maintained to ensure data consistency.
3. Isolation: Isolation ensures that multiple transactions can execute concurrently
without interfering with each other. Each transaction must be isolated from
other transactions until it is completed. At the strictest isolation level
(serializable), this prevents dirty reads, non-repeatable reads, and phantom reads;
weaker levels permit some of them in exchange for more concurrency.
4. Durability: Durability ensures that once a transaction is committed, its changes
are permanent and will survive any subsequent system failures. The
transaction’s changes are saved to the database permanently, and even if the
system crashes, the changes remain intact and can be recovered.
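
Atomicity and consistency can be seen together in a small money-transfer sketch, assuming Python's built-in sqlite3 module. The account table, the CHECK constraint and the transfer function are hypothetical and only illustrate the idea. The credit is deliberately applied before the debit so that a failure in the second step has something to undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency rule
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(src, dst, amount):
    """Move amount from src to dst; return True if the transaction committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    return dict(conn.execute("SELECT id, balance FROM account"))


ok_result = transfer(1, 2, 30)       # both updates succeed and are committed
failed_result = transfer(1, 2, 500)  # debit breaks the CHECK, so the credit is undone
final = balances()
print(ok_result, failed_result, final)
```

After the failed transfer neither account has changed: the credit that had already been applied is rolled back along with the rejected debit, so the transaction behaves as a single unit.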
Overall, ACID properties provide a framework for ensuring data consistency, integrity, and
reliability in DBMS. They ensure that transactions are executed in a reliable and consistent
manner, even in the presence of system failures, network issues, or other problems. These
properties make DBMS a reliable and efficient tool for managing data in modern
organizations.

Advantages of ACID Properties in DBMS
1. Data Consistency: ACID properties ensure that the data remains consistent and
accurate after any transaction execution.
2. Data Integrity: ACID properties maintain the integrity of the data by ensuring
that any changes to the database are permanent and cannot be lost.
3. Concurrency Control: ACID properties help to manage multiple transactions
occurring concurrently by preventing interference between them.
4. Recovery: ACID properties ensure that in case of any failure or crash, the
system can recover the data up to the point of failure or crash.

Understanding Tables, Records, and Fields in Relational Databases

Relational databases are the backbone of many applications and systems in today's digital
world. They provide a structured way to store, organize, and retrieve data. In this article, we
will delve into the fundamental components of a relational database: tables, records, and
fields.

Tables

In a relational database, a table is a collection of data elements organized in terms of rows
and columns. Each table in a database represents a specific entity, such as customers,
products, or orders. The table contains all the data pertaining to that entity.

For example, a 'Customers' table might include columns for CustomerID, FirstName,
LastName, Email, and PhoneNumber. Each column represents a different attribute of the
customer entity.

Records

Each row in a table is known as a record. A record is a set of related data items that are
grouped together. In the 'Customers' table example, each record would represent a single
customer. The record would include the customer's ID, first name, last name, email, and
phone number.

Fields

A field is a single piece of data within a record. In the 'Customers' table, the 'FirstName' field
of a record would contain the first name of a specific customer. Each field in a table is
associated with a specific data type, such as integer, text, date/time, etc., which determines
what kind of data it can store.
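
The three terms map directly onto code. The sketch below assumes Python's built-in sqlite3 module and reuses the 'Customers' columns from the example above; the sample customer is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each field be read by its column name

# Table: one entity (customers), with one column per attribute and a data type for each.
conn.execute("""
    CREATE TABLE Customers (
        CustomerID  INTEGER PRIMARY KEY,
        FirstName   TEXT,
        LastName    TEXT,
        Email       TEXT,
        PhoneNumber TEXT
    )
""")

# Record: one row describing a single customer.
conn.execute(
    "INSERT INTO Customers VALUES (?, ?, ?, ?, ?)",
    (1, "Asha", "Rao", "asha@example.com", "555-0100"),
)
record = conn.execute("SELECT * FROM Customers WHERE CustomerID = 1").fetchone()

# Field: a single value inside that record.
field_names = record.keys()
first_name = record["FirstName"]
print(field_names, first_name)
```

Here record is the whole row, while record["FirstName"] is one field of it.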

Relationships

Relational databases get their name from the fact that they allow relationships to be
established between different tables. These relationships are based on the use of keys.

A primary key is a unique identifier for a record in a table. For example, in the 'Customers'
table, 'CustomerID' could be the primary key. A primary key ensures that each record in the
table is unique.

A foreign key is a field (or collection of fields) in one table that refers to the primary
key (or another candidate key) of a record in another table. The table containing the
foreign key is called the child table, and the table containing the referenced key is called
the referenced or parent table.

For example, in an 'Orders' table, there might be a 'CustomerID' field that acts as a foreign
key linking each order to a specific customer in the 'Customers' table. This allows for a
relationship to be established between the 'Customers' and 'Orders' tables, where each
customer can have multiple associated orders.
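
How a DBMS enforces such a relationship can be sketched as follows, assuming Python's built-in sqlite3 module. SQLite checks foreign keys only after PRAGMA foreign_keys = ON is issued; the Item column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request

conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, FirstName TEXT);
CREATE TABLE Orders (
    OrderID    INTEGER PRIMARY KEY,
    CustomerID INTEGER NOT NULL REFERENCES Customers (CustomerID),
    Item       TEXT
);
INSERT INTO Customers VALUES (1, 'Asha');
INSERT INTO Orders VALUES (10, 1, 'Keyboard');
INSERT INTO Orders VALUES (11, 1, 'Mouse');
""")

# An order pointing at a customer that does not exist violates the relationship.
try:
    conn.execute("INSERT INTO Orders VALUES (12, 99, 'Monitor')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# One customer (parent row) can have many orders (child rows).
order_count = conn.execute(
    "SELECT COUNT(*) FROM Orders WHERE CustomerID = 1"
).fetchone()[0]
print(orphan_rejected, order_count)
```

The rejected insert shows that the foreign key is a rule the DBMS enforces, not just a naming convention shared by two columns.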

In conclusion, understanding the concepts of tables, records, and fields, and how they
interact, is fundamental to working with relational databases. These components provide
the structure that allows data to be efficiently stored, organized, and retrieved in a
relational database.

What is a Primary Key in SQL?


A Primary key is a column (or a combination of columns) that we set in a table to uniquely
identify each row and to locate data in queries. A table can have only one primary key.

The primary key holds a unique value for every row and doesn’t store repeating values. A
Primary key can never take NULL values.

For example, in the case of a student when identification needs to be done in the class, the
roll number of the student plays the role of Primary key.
Similarly, when we talk about employees in a company, the employee ID is functioning as
the Primary key for identification.
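
Both rules can be checked with a short sketch, assuming Python's built-in sqlite3 module. The student table, the roll-number format and the try_insert helper are hypothetical. For historical reasons SQLite accepts NULL in a non-integer primary key unless the column is also declared NOT NULL, so the sketch declares it explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no TEXT PRIMARY KEY NOT NULL, name TEXT)"
)


def try_insert(roll_no, name):
    """Attempt one insert and report whether the DBMS accepted it."""
    try:
        with conn:  # commit on success, roll back on failure
            conn.execute("INSERT INTO student VALUES (?, ?)", (roll_no, name))
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


results = [
    try_insert("R001", "Asha"),  # new roll number
    try_insert("R002", "Ben"),   # another new roll number
    try_insert("R001", "Chen"),  # duplicate roll number
    try_insert(None, "Dev"),     # missing (NULL) roll number
]
print(results)
```

The first two inserts succeed, while the duplicate and the NULL roll number are both refused by the DBMS.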

What is a Foreign key in SQL?

A Foreign key is useful when we connect two or more tables so that data from both can
be used together.

A foreign key is a field or collection of fields in a table that refers to the Primary key of the
other table. It is responsible for managing the relationship between the tables.

The table which contains the foreign key is often called the child table, and the table whose
primary key is being referred by the foreign key is called the Parent Table.

For example, consider students and the courses they have enrolled in. If we
try to store all the data in a single table, the problem of redundancy arises.
To solve this problem, we make two tables: the student detail table and the
department table. In the student table, we store the details of students and the courses
they have enrolled in.

In the department table, we store all the details of the department. Here the courseId
acts as the Primary key for the department table, whereas it acts as the Foreign key in the
student table.
