Chapter One Note
Data:=>
• Data are raw facts, figures, or values that are typically unprocessed and lack context.
• They can take various forms, such as numbers, text, images, or any other format.
Information:=>
• Information is the result of processing and interpreting data to make it meaningful and useful.
• It provides answers, insights, or context. Information is data that has been organized, analyzed,
and presented in a way that adds value.
• For example, if we calculate the average of the numbers in the list (1, 2, 3, 4, 5) and present it as
"The average is 3," it becomes information.
Database:=>
• A database is a structured collection of data that is organized and stored for efficient retrieval,
manipulation, and management.
• Databases consist of tables (or files) that contain records (rows) and fields (columns) to
represent related pieces of data.
• For example, in a library database, there may be tables for books, authors, borrowers, and
transactions.
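The idea of tables, records, and fields can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a full library system: the table names, column names, and sample values (authors, books, 'Author A', 'Sample Title') are hypothetical.

```python
import sqlite3

def build_library():
    """Create a tiny in-memory library database (all names are hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Each table holds records (rows) made of fields (columns).
        CREATE TABLE authors (
            author_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL
        );
        CREATE TABLE books (
            book_id   INTEGER PRIMARY KEY,
            title     TEXT NOT NULL,
            author_id INTEGER REFERENCES authors(author_id)
        );
        INSERT INTO authors VALUES (1, 'Author A');
        INSERT INTO books VALUES (10, 'Sample Title', 1);
    """)
    return conn

def books_with_authors(conn):
    """Combine related records from the two tables."""
    return conn.execute(
        "SELECT b.title, a.name "
        "FROM books b JOIN authors a ON a.author_id = b.author_id"
    ).fetchall()

conn = build_library()
print(books_with_authors(conn))
```

Each row of books is one record, and title and author_id are fields; the join shows how related pieces of data in separate tables are brought together.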
DBMS:=>
• A DBMS is software that facilitates the creation, maintenance, and use of databases.
• It provides tools and services for users and applications to interact with the database, including
data storage, retrieval, querying, security, and data integrity.
• The DBMS manages the underlying database structure, ensuring data consistency and reliability.
Examples of DBMSs include MySQL, Oracle, Microsoft SQL Server, and PostgreSQL.
Applications of databases:=>
i. Library catalog: It's a database that organizes information about the books, periodicals, and
other materials held in a library's collection.
ii. Online shopping: Online retailers use databases to manage their product inventory, customer
information, orders, and transactions.
iii. Social media: Social media platforms use databases to store user profiles, posts, comments,
likes, and other interactions.
iv. Customer records: Databases are used to store and manage customer information such as
contact details, purchase history, preferences, and feedback.
v. Airline reservations: Airlines use databases to manage flight schedules, seat availability,
passenger reservations, and other related information.
vi. Medical records: Healthcare providers use databases to store patient demographics, medical
history, diagnostic test results, treatment plans, and other health-related information.
vii. Student grades: Educational institutions use databases to manage student enrollment, course
schedules, grades, attendance records, and other academic data.
1.2 Characteristics:=>
A. Data integration: DBMS allows for integrating data from various sources into a single database,
enabling users to access and manipulate the data seamlessly.
B. Data retrieval: Users can easily retrieve specific data from the database using queries and
commands provided by the DBMS.
C. Data security: DBMS provides mechanisms for ensuring data security, including user
authentication, access control, encryption, and data masking techniques to protect sensitive
information.
D. Data consistency: DBMS enforces rules and constraints to maintain data consistency, ensuring
that data remains accurate and valid throughout the database.
E. Data relationships: DBMS supports defining and managing relationships between different data
entities, such as one-to-one, one-to-many, and many-to-many relationships, allowing for
efficient data organization and retrieval.
F. Data redundancy reduction: DBMS helps in reducing data redundancy by storing data in a
centralized location and minimizing duplication, which improves data integrity and saves storage
space.
G. Concurrency control: DBMS manages access to data by multiple users or processes
concurrently, ensuring that transactions are executed safely and efficiently without interfering
with each other.
H. Backup and recovery: DBMS provides features for backing up and restoring data to prevent data
loss in case of system failures, disasters, or human errors.
I. Data independence: DBMS provides a level of abstraction between the application programs
and the physical data storage, allowing changes to the database structure without affecting the
applications that use it.
J. Scalability: DBMS is designed to handle growing amounts of data and increasing numbers of
users or transactions by scaling both vertically (adding more resources to a single server) and
horizontally (distributing data across multiple servers).
K. Performance optimization: DBMS includes optimization techniques such as indexing, query
optimization, and caching to enhance the performance of data retrieval and manipulation
operations.
L. Multi-user support: DBMS allows multiple users to access and manipulate the database
simultaneously while maintaining data integrity and security.
M. Data modeling and schema management: DBMS provides tools for defining data models,
creating database schemas, and managing changes to the schema over time.
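Two of the characteristics above, data consistency (D) and safe execution of transactions (G), can be sketched with Python's sqlite3 module. This is a hedged illustration under assumed names: the accounts table, its CHECK rule, and the transfer function are hypothetical and not part of any particular DBMS.

```python
import sqlite3

def open_bank():
    """Create a small in-memory database with a consistency rule (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " id INTEGER PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"  # rule enforced by the DBMS
    )
    conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 50)")
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Move money between accounts; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit was undone too.
        return False

conn = open_bank()
print(transfer(conn, 1, 2, 30))    # allowed
print(transfer(conn, 1, 2, 500))   # rejected by the constraint
print(conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall())
```

The second transfer fails on the debit step, and the credit that had already been applied is rolled back, so the data never ends up in a half-updated state.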
1.3 DBMS Architecture:=>
The design of a DBMS depends on its architecture. The basic client/server architecture is used to
deal with a large number of PCs, web servers, database servers, and other components that are
connected by networks.
The client/server architecture consists of many PCs and a workstation that are connected via
the network.
DBMS architecture depends on how users are connected to the database to get their requests
done.
Types of DBMS Architecture:=>
Database architecture can be single-tier or multi-tier. Logically, multi-tier database architecture is of
two types: 2-tier architecture and 3-tier architecture.
1-Tier Architecture
Example: A simple standalone calculator application installed on a personal computer, where both the user
interface and the calculation logic run on the same machine.
2-Tier Architecture
o The 2-tier architecture is the same as the basic client/server architecture. In the two-tier
architecture, applications on the client end communicate directly with the database on the server
side. For this interaction, APIs such as ODBC and JDBC are used.
o The user interfaces and application programs run on the client side.
o The server side is responsible for providing functionality such as query processing and
transaction management.
o To communicate with the DBMS, the client-side application establishes a connection with the server
side.
Example: An online banking system where the client-side application (web browser) communicates directly with
the server-side application (running on a bank's server), which interacts with the database storing user
account information.
3-Tier Architecture
o The 3-tier architecture contains another layer between the client and the server. In this
architecture, the client cannot communicate directly with the database server.
o The application on the client end interacts with an application server, which in turn
communicates with the database system.
o The end user has no idea about the existence of the database beyond the application server. The
database also has no idea about any user beyond the application.
o The 3-tier architecture is used for large web applications.
Example: An e-commerce website where the client (web browser) interacts with a front-end server handling
user requests. This front-end server communicates with a middle-tier application server, which in turn
interacts with a database server storing product information, user profiles, and order history.
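The separation of tiers can be sketched in a single Python file, with each tier reduced to one small piece of code. This is only an illustration: in a real deployment the tiers run on separate machines, and the names DataTier, ApplicationTier, client_request, and the products table are hypothetical.

```python
import sqlite3

class DataTier:
    """Database tier: the only layer that issues SQL (hypothetical schema)."""
    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)"
        )
        self.conn.execute("INSERT INTO products VALUES (1, 'Pen', 1.5)")
        self.conn.commit()

    def find_product(self, product_id):
        return self.conn.execute(
            "SELECT name, price FROM products WHERE id = ?", (product_id,)
        ).fetchone()

class ApplicationTier:
    """Middle tier: business logic that hides the database from the client."""
    def __init__(self, data_tier):
        self._data = data_tier

    def product_summary(self, product_id):
        row = self._data.find_product(product_id)
        if row is None:
            return "Product not found"
        name, price = row
        return f"{name}: {price:.2f}"

def client_request(app, product_id):
    """Client tier: knows only the application tier's interface, never SQL."""
    return app.product_summary(product_id)

app = ApplicationTier(DataTier())
print(client_request(app, 1))
```

The client function never touches the connection object, which mirrors the point that the end user is unaware of the database behind the application server.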
1.4 Data Abstraction and Independence:=>
Data abstraction in a DBMS means hiding unnecessary background details, such as the complex inner
workings of the database, from the end user to make accessing data easy and secure.
The main purpose of data abstraction is to hide irrelevant data and provide an abstract view of the data.
With the help of data abstraction, developers hide irrelevant data from users and give them only the
relevant data. As a result, users can access the data without any hassle, and the system also
works efficiently.
In a DBMS, data abstraction is performed in layers, which means there are levels of data abstraction in
a DBMS. The database management system is designed based on these levels.
In a DBMS, there are three levels of data abstraction, which are as follows:
1. Physical or internal level
2. Logical or conceptual level
3. View or external level
Note: View level presents a customized view of the data to users, logical level defines the structure of
the database, while physical level deals with how data is actually stored on the storage devices.
Understanding:=> "Customized view of data to the users" means showing different people or
applications only the specific information they need from the database, presented in a way that makes
sense for them, like showing sales data to a sales team and inventory data to the warehouse team, each
in a format they understand easily.
Note: while data abstraction is about keeping things simple by hiding the nitty-gritty details, the view
level takes that a step further by customizing how people see and interact with the data, making it even
easier to work with without worrying about the underlying complexity.
1. Physical or Internal Level:
The physical or internal level is the lowest level of data abstraction in the database management
system. It defines how data is actually stored in the database and the methods used to
access it. It describes complex data structures in detail, so it is hard to
understand, which is why it is kept hidden from the end user.
Database administrators (DBAs) decide how to arrange data and where to store it. The DBA
is the person whose role is to manage the data in the database at the physical or internal level.
At this level, the raw data is stored in detail on hard drives, for example in a secure data center.
2. Logical or Conceptual Level:
The logical or conceptual level is the intermediate level of data abstraction. It describes what data
is stored in the database and what relationships exist among the data.
It describes the structure of the entire data in the form of tables. The logical or conceptual level is
less complex than the physical level. At the logical level, database administrators (DBAs)
abstract away from the raw data present at the physical level.
3. View or External Level:
The view level is the highest level of abstraction in a database
management system (DBMS), where users interact with the database through customized views or
perspectives tailored to their specific needs. These views provide a simplified and user-friendly
interface by hiding the complexity of the underlying database schema and data structures. Users
can query, insert, update, or delete data through views, which act as virtual tables presenting a
subset or combination of data from one or more underlying tables.
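A view as a virtual table can be sketched with Python's sqlite3 module. This is a minimal example under assumed names: the employees table, the staff_directory view, and the sample rows are hypothetical.

```python
import sqlite3

def make_db():
    """Build a base table and a view over it (all names are hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employees (
            id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary INTEGER
        );
        INSERT INTO employees VALUES
            (1, 'Abel', 'Sales', 5000),
            (2, 'Sara', 'Warehouse', 4500);
        -- The view stores no data; it exposes two columns and hides salary.
        CREATE VIEW staff_directory AS
            SELECT name, department FROM employees;
    """)
    return conn

def read_view(conn):
    """Return the column names and rows that a user of the view can see."""
    cursor = conn.execute("SELECT * FROM staff_directory ORDER BY name")
    columns = [col[0] for col in cursor.description]
    return columns, cursor.fetchall()

conn = make_db()
print(read_view(conn))
```

Users of staff_directory query it like a regular table but never see the salary column. Note that SQLite views are read-only; some other DBMSs also allow inserts, updates, and deletes through certain views.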
1. Customized Views: The view level offers customized perspectives of the database for different users
or applications. It tailors the presentation of data to meet specific needs, showing users only the
information that's relevant to them.
3. Virtual Tables: Views are like virtual tables that don't physically store data. Instead, they act as
filters over underlying tables, showing selected columns or rows. Users interact with views just like
regular tables, querying, updating, or deleting data as needed.
4. Data Security: Views help enforce data security by controlling access to sensitive information.
Database administrators can create views that restrict access to certain parts of the database,
ensuring users only see what they're supposed to see.
5. Simplified Access: Views simplify data access by hiding complex underlying structures. They
combine data from multiple tables or present it in a simpler format, making it easier for users to
work with the database without needing to understand its intricate details.
Advantages of data abstraction in DBMS
Data Independence
Data independence in DBMS refers to the capacity to change the schema (structure) of the database
without affecting the application programs or user views that access the data. It is a fundamental
concept that simplifies database maintenance and enhances flexibility.
The database's metadata follows a layered architecture, so that a change made at one level does not
affect the data at another level. The levels are independent of, but mapped to, each other.
[Understanding:=> a. Physical Data Independence: This means you can change how data is stored or
accessed without affecting how users interact with it. Imagine you have a bookshelf. You can rearrange
the books on the shelf without changing the way people read or find the books.
Understanding:=>b. Logical Data Independence: This means you can change the way data is structured
or organized without affecting how users interact with it. Continuing the book analogy, you can change
how books are categorized on the shelf (fiction, non-fiction, genre, author) without changing how
people browse or search for books.]
5.1. Physical Data Independence: It refers to the capacity to modify the physical storage structures or
methods of the database system without affecting the conceptual or external levels of the system's
organization or operation.
5.2. Logical Data Independence: It denotes the ability to alter the conceptual schema of a database
system without requiring changes to the external schema or application programs that use the data,
ensuring that changes in the database structure do not necessitate modifications in user
applications.
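Physical data independence can be sketched with Python's sqlite3 module: adding an index changes how the data is accessed internally, yet the application's query and its result stay the same. The books table, the index name idx_books_author, and the sample rows are hypothetical.

```python
import sqlite3

# The application's query: it never mentions indexes or storage details.
QUERY = "SELECT title FROM books WHERE author = ? ORDER BY title"

def titles_by(conn, author):
    return [row[0] for row in conn.execute(QUERY, (author,))]

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, author TEXT)")
    conn.executemany(
        "INSERT INTO books (title, author) VALUES (?, ?)",
        [("B", "X"), ("A", "X"), ("C", "Y")],
    )
    before = titles_by(conn, "X")
    # Physical-level change: add an access structure; the logical schema is untouched.
    conn.execute("CREATE INDEX idx_books_author ON books(author)")
    after = titles_by(conn, "X")
    return before, after

before, after = demo()
print(before == after)
```

The same unchanged query runs before and after the index is created, which is the sense in which the conceptual and external levels are unaffected by a physical change.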
5.3. Schemas and Instances:=>
Schema refers to the overall description of any given database. Instance basically refers to a
collection of data and information that the database stores at any particular moment. The schema
remains the same for the entire database as a whole.
Difference between schema and instances:=>
A schema defines the structure and constraints of a database, while an instance is a snapshot of the
data stored within that database at a particular moment.
• A constraint is a rule or condition that is enforced to maintain the integrity, accuracy, and
consistency of data within the database.
• A snapshot is a point-in-time view of the data stored in the database, capturing the state of the
data at that specific moment without considering ongoing changes or updates.
What is an Instance?
An instance refers to the collection of all the information and data stored at any given moment. One can
easily change an instance using CRUD operations, such as the addition and deletion of data.
Note that search (read-only) queries do not make any changes to an instance.
What is a Schema?
It refers to an overall description that we get for any given database. In simpler words, schema refers to
the basic structure of how one needs to store data in any database. There are basically two types of
Schema: Physical Schema and Logical Schema.
Parameter | Schema | Instance
Meaning | Schema refers to the overall description of any given database. | Instance refers to a collection of data and information that the database stores at any particular moment.
Alterations | The schema remains the same for the entire database as a whole. | One can change the instances of data and information in a database using update, deletion, and addition.
Uses | We use a schema for defining the basic structure of any given database. It defines how the available data needs to be stored. | We use an instance for referring to a set of information at any given moment.
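The difference between a schema and an instance can be sketched with Python's sqlite3 module. In this hedged example, the students table and its rows are hypothetical; sqlite_master is SQLite's built-in catalog table that stores each table's definition.

```python
import sqlite3

def schema_of(conn):
    """Return the stored table definitions (the schema)."""
    return [
        row[0]
        for row in conn.execute(
            "SELECT sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]

def instance_of(conn):
    """Return the rows stored right now (the current instance)."""
    return conn.execute("SELECT id, name FROM students ORDER BY id").fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
schema_before = schema_of(conn)

conn.execute("INSERT INTO students VALUES (1, 'Hana')")
instance_1 = instance_of(conn)          # snapshot after the first insert
conn.execute("INSERT INTO students VALUES (2, 'Yonas')")
instance_2 = instance_of(conn)          # a different snapshot

schema_after = schema_of(conn)
print(schema_before == schema_after)    # the schema did not change
print(instance_1 != instance_2)         # the instance did
```

Reading the data with SELECT, as instance_of does, leaves the instance unchanged; only the INSERT statements produced a new instance.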
Database Management Systems (DBMS) are software systems used to store, retrieve, and run queries
on data. A DBMS serves as an interface between an end-user and a database, allowing users to create,
read, update, and delete data in the database.
A DBMS manages the data, the database engine, and the database schema, allowing data to be
manipulated or extracted by users and other programs. This helps provide data security, data integrity,
concurrency, and uniform data administration procedures.
Data in a DBMS is typically organized using a database schema design technique called
normalization, which splits a large table into smaller tables when any of its attributes hold redundant
values. DBMSs offer many benefits over traditional file systems, including flexibility and a more
sophisticated backup system.
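Normalization can be sketched with Python's sqlite3 module by splitting one redundant table into two. This is a simplified illustration, not a full treatment of the normal forms; the tables staff_flat, departments, and staff and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Before: department name and location repeat on every employee row.
    CREATE TABLE staff_flat (emp_name TEXT, dept_name TEXT, dept_location TEXT);
    INSERT INTO staff_flat VALUES
        ('Abel', 'Sales', 'Floor 1'),
        ('Sara', 'Sales', 'Floor 1'),
        ('Lily', 'IT',    'Floor 2');

    -- After: department facts are stored once and referenced by key.
    CREATE TABLE departments (
        dept_id INTEGER PRIMARY KEY, dept_name TEXT UNIQUE, dept_location TEXT
    );
    CREATE TABLE staff (
        emp_name TEXT, dept_id INTEGER REFERENCES departments(dept_id)
    );
    INSERT INTO departments (dept_name, dept_location)
        SELECT DISTINCT dept_name, dept_location FROM staff_flat;
    INSERT INTO staff (emp_name, dept_id)
        SELECT f.emp_name, d.dept_id
        FROM staff_flat f JOIN departments d ON d.dept_name = f.dept_name;
""")

dept_count = conn.execute("SELECT COUNT(*) FROM departments").fetchone()[0]
rejoined = conn.execute(
    "SELECT s.emp_name, d.dept_name, d.dept_location "
    "FROM staff s JOIN departments d ON d.dept_id = s.dept_id "
    "ORDER BY s.emp_name"
).fetchall()
print(dept_count)
print(rejoined)
```

After the split, each department's location is recorded in one place, and a join recovers the original combined rows when they are needed.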
Database management systems can be classified based on a variety of criteria such as the data model,
the database distribution, or user numbers. The most widely used types of DBMS software are
relational, distributed, hierarchical, object-oriented, and network.
A) Centralized Database
This type of database stores data in one centralized database system. It lets users
access the stored data from different locations through several applications. These applications include
an authentication process so that users can access data securely. An example of a centralized database is
a central library that holds a central database of every library in a college or university.
Advantages:
• It decreases the risk of data management, i.e., manipulation of data will not affect the core
data.
• Data consistency is maintained because data is managed in a central repository.
• It provides better data quality, which enables organizations to establish data standards.
• It is less costly because fewer vendors are required to handle the data sets.
Disadvantages:
• The size of a centralized database is large, which increases the response time for fetching
data.
• It is not easy to update such an extensive database system.
• If the central server fails, all data may be lost, which could be a huge loss.
B) Distributed Database
Unlike a centralized database system, in distributed systems, data is distributed among different
database systems of an organization. These database systems are connected via communication
links. Such links help end users access the data easily. Examples of distributed
databases are Apache Cassandra, HBase, Ignite, etc.
Advantages:
• Modular development is possible in a distributed database, i.e., the system can be expanded by
including new computers and connecting them to the distributed system.
• One server failure will not affect the entire data set.
C) Hierarchical Database
This type of database stores data in the form of parent-child relationship nodes.
It organizes data in a tree-like structure.
D) Network Database
This database typically follows the network data model. Data is represented as
nodes connected by links. Unlike the hierarchical database, it allows each
record to have multiple child and parent nodes, forming a generalized graph structure.
E) Relational Database
This database is based on the relational data model, which stores data in rows
(tuples) and columns (attributes) that together form a table (relation). A relational database uses
SQL for storing, manipulating, and maintaining the data. E. F. Codd proposed the relational model in
1970. Each table in the database has a key that uniquely identifies each row. Examples
of relational databases are MySQL, Microsoft SQL Server, Oracle, etc.
Relational databases commonly guarantee four transaction properties, known as the ACID
properties, where:
a. A means Atomicity: A data operation either completes entirely or has no effect at all.
It follows the 'all or nothing' strategy. For example, a transaction will either
be committed or be aborted.
b. C means Consistency: A transaction must take the database from one valid state to
another. For example, when money is transferred between accounts, the total balance before and
after the transaction should be correct, i.e., it should remain conserved.
c. I means Isolation: Many users can access data in the database at the same time,
so concurrent transactions must be kept isolated from one another. For example,
when multiple transactions occur at the same time, the effects of one uncommitted transaction
should not be visible to the other transactions in the database.
d. D means Durability: Once an operation completes and its data is committed,
the changes should remain permanent.
F) Object-oriented Database
G) Cloud Database
A type of database where data is stored in a virtual environment and runs on a cloud
computing platform. It provides users with various cloud computing services (SaaS, PaaS, IaaS,
etc.) for accessing the database. There are numerous cloud platforms, but the best options
are:
b. Microsoft Azure
Structured Query Language (SQL) is the database language used to perform
operations on an existing database; it can also be used to create a
database. SQL uses commands such as CREATE, DROP, and INSERT to carry out the required tasks.
SQL commands are instructions used to interact with the database. They
perform specific tasks, functions, and queries on data. SQL can perform
various tasks such as creating a table, adding data to tables, dropping a table, modifying a table, and
setting permissions for users.
DDL, or Data Definition Language, consists of the SQL commands that are
used to define the database schema. It deals with descriptions of the database schema
and is used to create, modify, and delete the structure of database objects, but not the
data itself. These commands are normally not used by a general user, who should be accessing the
database via an application.
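The three kinds of DDL operation (create, modify, delete a structure) can be sketched with Python's sqlite3 module. The students table and its columns are hypothetical; PRAGMA table_info is SQLite's own command for listing a table's columns and is not standard SQL.

```python
import sqlite3

def column_names(conn, table):
    """List a table's columns; PRAGMA table_info puts the name at index 1."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

conn = sqlite3.connect(":memory:")

# CREATE defines a new structure.
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
cols_created = column_names(conn, "students")

# ALTER modifies the structure, not the data.
conn.execute("ALTER TABLE students ADD COLUMN email TEXT")
cols_altered = column_names(conn, "students")

# DROP removes the structure entirely.
conn.execute("DROP TABLE students")
cols_dropped = column_names(conn, "students")

print(cols_created, cols_altered, cols_dropped)
```

None of these statements adds or edits a row of data; they only change what the database is able to hold.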
DML, or Data Manipulation Language, consists of the SQL commands that manipulate the data present in
the database; this includes most SQL statements.
DCL, or Data Control Language, is the component of SQL that controls access to data and to the database.
DCL statements are often grouped with DML statements.
List of DML commands
Some DML commands and their syntax are:
Command | Description | Syntax
UPDATE | Update existing data within a table | UPDATE table_name SET column1 = value1, column2 = value2 WHERE condition;
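The UPDATE syntax above can be tried out with Python's sqlite3 module. This is a minimal sketch: the students table, its rows, and the set_grade helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, grade TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(1, "Hana", "B"), (2, "Yonas", "C")],
)

def set_grade(conn, student_id, grade):
    """Run UPDATE ... SET ... WHERE and return how many rows were changed."""
    cursor = conn.execute(
        "UPDATE students SET grade = ? WHERE id = ?", (grade, student_id)
    )
    return cursor.rowcount

changed = set_grade(conn, 2, "A")
rows = conn.execute("SELECT id, name, grade FROM students ORDER BY id").fetchall()
print(changed, rows)
```

The WHERE condition limits the change to the matching row; without it, UPDATE would modify every row in the table.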