UNIT 1 Final

This document provides an overview of database systems, including their characteristics, advantages, and applications. It explains the differences between databases and file systems, outlines various types of database users, and details the structure and schema of databases. Additionally, it covers key concepts such as data, information, and the role of Database Management Systems (DBMS) in managing data efficiently.


UNIT I:

Introduction: Database system, Characteristics (Database Vs File System),
Database Users, Advantages of Database systems, Database applications. Brief
introduction of different Data Models; Concepts of Schema, Instance and data
independence; Three tier schema architecture for data independence; Database
system structure, environment, Centralized and Client Server architecture for the
database. Entity Relationship Model: Introduction, Representation of entities,
attributes, entity set, relationship, relationship set, constraints, sub classes, super
class, inheritance, specialization, generalization using ER Diagrams.

• What is Data?
Data are raw, unprocessed facts about real-world entities or objects. Each item is a
distinct piece of information that needs to be processed before it becomes meaningful.
Data can take any form, such as text, numbers, pictures, measurements, and bytes.

Example: Ankit, Delhi, 12, 80.

• What is Information?
When data are processed, organized, structured, and interpreted in a given context, so
as to make them useful and meaningful, they are called information.
Example: Name - Ankit, City - Delhi, Class - 12, Marks - 80.
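
The step from data to information can be sketched in a few lines of Python. This is only an illustration: the values and labels come from the example above, while the variable names are assumptions made for the sketch.

```python
# Raw data: bare values with no context (taken from the example above).
raw_data = ["Ankit", "Delhi", 12, 80]

# Context supplied by a person or a program (labels follow the example).
field_names = ["Name", "City", "Class", "Marks"]

# Pairing each value with its meaning turns data into information.
information = dict(zip(field_names, raw_data))

print(information)
```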

• DBMS:
A database management system (DBMS) is a collection of interrelated data and a set of
programs that help in inserting, deleting, and retrieving those data efficiently. The
database is also used to organize the data or information in the form of tables, views,
schemas, reports, etc.
In other words, a DBMS is software used for creating and managing the data in a
database easily and effectively.
Example: MySQL, MS SQL Server, Oracle, DB2, Microsoft Access, etc. are examples of
database management systems. (SQL itself is a query language used by many of these
systems, not a DBMS.)
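
As a minimal sketch of insertion, deletion, and retrieval, the following uses SQLite through Python's built-in sqlite3 module. SQLite is chosen here only because it needs no server; the table name, the column names, and the second student row are assumptions made for illustration.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (name TEXT, city TEXT, class_no INTEGER, marks INTEGER)"
)

# Insertion (the second row is a made-up record for the demonstration)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [("Ankit", "Delhi", 12, 80), ("Riya", "Pune", 12, 91)],
)

# Deletion
conn.execute("DELETE FROM student WHERE name = ?", ("Riya",))

# Retrieval
rows = conn.execute("SELECT name, city, class_no, marks FROM student").fetchall()
print(rows)
conn.close()
```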

Characteristics (Database Vs File System):


Aspect | File System | DBMS
Structure | Manages data as individual files in directories. | Manages data as a collection of interrelated databases.
Data Abstraction | Details of data storage are visible to users. | Provides an abstract view of data, hiding storage details.
Data Redundancy | High data redundancy due to lack of central control. | Low data redundancy due to centralised management.
Data Integrity | Ensures data integrity manually. | Automatically ensures data integrity through constraints.
Concurrency | Poor concurrency control, leading to data conflicts. | Excellent concurrency control, allowing multiple users to access data simultaneously.
Security | Limited security features; difficult to protect files. | Advanced security features, including user authentication and access control.
Backup and Recovery | No built-in backup and recovery mechanisms. | Built-in backup and recovery features to protect data.
Query Processing | Limited query capabilities; slow data retrieval. | Supports complex queries and efficient data retrieval using SQL.
Scalability | Limited scalability; not suitable for large data volumes. | Highly scalable; can handle large volumes of data efficiently.
Cost | Cheaper to implement and maintain. | More expensive due to advanced features and infrastructure requirements.
Data Independence | Changes to data structure affect application programs. | Data structure changes do not affect application programs.
Transaction Management | No support for transactions. | Supports transactions with ACID properties (Atomicity, Consistency, Isolation, Durability).
Complexity | Simpler to understand and use for basic storage needs. | More complex due to advanced features but provides better data management.
Performance | Suitable for simple, small-scale data management tasks. | Optimised for complex, large-scale data management and high performance.
Applications | Ideal for simple file storage, like text documents and images. | Ideal for complex applications requiring robust data management, such as financial systems.
Example | NTFS, FAT32, ext4, HDFS (Hadoop Distributed File System), ZFS. | MySQL, PostgreSQL, Oracle DB, MongoDB (for NoSQL), Microsoft SQL Server.

PREPARED BY A.RAMESH DEPT OF CSE,RGMCET
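
The Data Integrity and Transaction Management rows of the comparison can be sketched with SQLite and Python's sqlite3 module. The account table and the amounts are assumptions made for illustration; the point is only that the DBMS, not the application, rejects the invalid update and undoes the half-finished transfer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# CHECK constraint: the DBMS itself refuses a negative balance.
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

try:
    # "with conn" treats the statements as one transaction:
    # commit on success, rollback if an exception is raised.
    with conn:
        conn.execute("UPDATE account SET balance = balance + 500 WHERE id = 2")
        # The next statement violates the CHECK constraint.
        conn.execute("UPDATE account SET balance = balance - 500 WHERE id = 1")
except sqlite3.IntegrityError:
    pass  # the whole transfer is undone, including the first UPDATE

balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
print(balances)
```

A plain file offers no equivalent: a program that crashed between the two writes would leave the credit in place without the matching debit.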

Database Users:
A database user is a person who interacts with the data regularly by reading, updating,
and modifying it. Database users access and retrieve data from the database through the
applications and interfaces provided by the Database Management System (DBMS).
Types of Database Users
Database users are categorized based on how they interact with the database. There are
seven types of database users in a DBMS:
1. Database Administrator (DBA)
• A Database Administrator (DBA) is a person or team who defines the schema and
controls the three levels of the database.
• The DBA creates a new account ID and password for any user who needs to access
the database.
• The DBA is also responsible for database security and allows only authorized users
to access or modify the database. The DBA must deal with problems such as security
breaches and poor system response time.
o The DBA also monitors backup and recovery and provides technical support.
o The DBA has a DBA account in the DBMS, which is called a system or super-user
account.
o The DBA repairs damage caused by hardware and/or software failures.
o The DBA has the privileges to perform DCL (Data Control Language) operations
such as GRANT and REVOKE, which allow or restrict a particular user's access to the
database.

2. Naive / Parametric End Users
• Parametric end users are unsophisticated users who have no DBMS knowledge but
frequently use database applications in their daily life to get the desired results.
• For example, people booking railway tickets are naive users. Bank clerks are also
naive users: they have no DBMS knowledge, but they still use the database to
perform their given tasks.
3. System Analysts
A system analyst is a user who analyzes the requirements of parametric end users and
checks whether all the requirements of the end users are satisfied.
4. Sophisticated Users
• Sophisticated users can be engineers, scientists, or business analysts who are familiar
with the database.
• They can develop their own database applications according to their requirements.
• They do not write program code; instead, they interact with the database by writing
SQL queries directly through the query processor.
5. Database Designers
• Database designers are the users who design the structure of the database, which
includes tables, indexes, views, triggers, stored procedures, and constraints. These are
usually defined before the database is created or populated with data. The designer
decides what data must be stored and how the data items are to be related.
• It is the responsibility of database designers to understand the requirements of
different user groups and then create a design that satisfies the needs of all the
user groups.
6. Application Programmers
• Application programmers, also referred to as system analysts or simply software
engineers, are the back-end programmers who write the code for the application
programs.
• They are computer professionals.
7. Casual Users / Temporary Users
• Casual users access the database only occasionally, but each time they access it
they require new information.
• For example, middle- or higher-level managers.

Characteristics of DBMS
There are various characteristics of a database management system; the following are
some of the important ones:

• A database management system (DBMS) should be able to store any kind of data in a
database.
• A DBMS should support the ACID (atomicity, consistency, isolation, durability)
properties.
• A DBMS allows more than one user to access the same database at the same time.
• Backup and recovery are the two main methods that allow users to protect their
data from damage or loss.
• It provides multiple views for different users in one organization.
• A DBMS follows the concept of normalization to minimize the redundancy of a
relation.
• It provides users with a query language, with which they can easily insert, retrieve,
update, and delete the data in a database.
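
The "multiple views" characteristic can be sketched with SQLite through Python's sqlite3 module. The employee table, its rows, and the two view names are assumptions made for illustration: one stored table is presented differently to two groups of users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Asha", "Sales", 40000), ("Vikram", "HR", 45000)],
)

# View for a staff directory: salaries are hidden.
conn.execute("CREATE VIEW directory AS SELECT name, department FROM employee")
# View for the accounts office: only the payroll columns.
conn.execute("CREATE VIEW payroll AS SELECT name, salary FROM employee")

directory_rows = conn.execute("SELECT * FROM directory ORDER BY name").fetchall()
payroll_rows = conn.execute("SELECT * FROM payroll ORDER BY name").fetchall()
print(directory_rows)
print(payroll_rows)
```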

Advantages of DBMS:
1. Data Security: The more accessible and usable a database is, the more it is prone to
security issues. As the number of users increases, the rate of data transfer and data
sharing also increases, which raises the risk to data security. Databases are widely
used in the corporate world, where companies invest large amounts of money, time, and
effort to ensure that data is secure and used properly. A Database Management System
(DBMS) provides a better platform for data privacy and security policies, thus helping
companies to improve data security.
2. Data integration: A DBMS gives access to well-managed and synchronized forms of
data. This makes data handling easy, gives an integrated view of how an organization
is working, and helps to keep track of how one segment of the company affects another.
3. Data abstraction: The major purpose of a database system is to provide users with an
abstract view of the data. Developers use many complex algorithms to increase the
efficiency of databases; these details are hidden from the users through various levels
of data abstraction so that users can interact with the system easily.
4. Reduction in data redundancy: When working with a structured database, a DBMS can
prevent the input of duplicate items. For example, if the same student is entered a
second time, a key constraint allows the DBMS to reject the duplicate row.

5. Data sharing: A DBMS provides a platform for sharing data across multiple
applications and users, which can increase productivity and collaboration.
6. Data consistency and accuracy: DBMS ensures that data is consistent and accurate
by enforcing data integrity constraints and preventing data duplication. This helps to
eliminate data discrepancies and errors that can occur when data is stored and managed
manually.
7. Data organization: A DBMS provides a systematic approach to organizing data in a
structured way, which makes it easier to retrieve and manage data efficiently.
8. Efficient data access and retrieval: DBMS allows for efficient data access and
retrieval by providing indexing and query optimization techniques that speed up data
retrieval. This reduces the time required to process large volumes of data and increases
the overall performance of the system.
9. Concurrency and atomicity: Atomicity means that the operations of a transaction
are performed as a single unit: either all of their changes are reflected in the
database or none of them are. The DBMS also allows concurrent access by multiple users
by using synchronization techniques.
10. Scalability and flexibility: DBMS is highly scalable and can easily accommodate
changes in data volumes and user requirements. DBMS can easily handle large
volumes of data, and can scale up or down depending on the needs of the organization.
It provides flexibility in data storage, retrieval, and manipulation, allowing users to
easily modify the structure and content of the database as needed.
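
The redundancy and consistency points in the list above can be sketched with a key constraint in SQLite (via Python's sqlite3 module). The roll_no key and the student row are assumptions made for illustration; other DBMS products report the violation in their own way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# roll_no is a hypothetical key chosen for this sketch.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO student VALUES (1, 'Ankit')")

try:
    conn.execute("INSERT INTO student VALUES (1, 'Ankit')")  # same student again
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True  # the primary key constraint blocks the duplicate

student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(duplicate_rejected, student_count)
```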
Applications of DBMS:
There are various fields where a database management system is used. The following are
some applications that make use of a database management system:
1. Railway Reservation System: In a railway reservation system, the database is required
to store the records of ticket bookings and the status of train arrivals and departures.
If trains run late, people also learn of it through database updates.
2. Library Management System: A library holds a large number of books, so it is hard to
keep the record of all the books in a register or notebook. A database management system
(DBMS) is therefore used to maintain all the information related to the name of the
book, its author, its issue date, and its availability.
3. Banking: A database management system is used to store the transaction information
of customers.
4. Education Sector: At present, many colleges and universities conduct examinations
online and manage all examination data through a database management system (DBMS).
In addition, students' registration details, grades, courses, fees, attendance, results,
etc. are all stored in the database.
5. Credit card transactions: A database management system is used for recording
purchases on credit cards and for generating monthly statements.
6. Social Media Sites: We all use social media websites to connect with friends and to
share our views with the world. Every day, millions of people sign up for social media
accounts such as Facebook, Twitter, and Google Plus. With the help of a database
management system, all the information of the users is stored in the database, and we
are able to connect with other people.
7. Telecommunications: A database management system is necessary for telecom
companies to store call details and monthly postpaid bills.
8. Finance: A database management system is used for storing information about sales,
holdings, and purchases of financial instruments such as stocks and bonds.
9. Online Shopping: Online shopping has become a big trend. Many people prefer to shop
from home through online shopping websites (such as Amazon, Flipkart, and Snapdeal)
rather than visit a shop. All the products are added and sold with the help of a
database management system (DBMS). Invoices, payments, and purchase information are
all handled with the help of the DBMS.
10. Human Resource Management: Big firms or companies have many workers or
employees. They store information about employees' salaries, taxes, and work with the
help of a database management system (DBMS).
11. Manufacturing: Manufacturing companies make different types of products and sell
them on a daily basis. A database management system (DBMS) is used to keep information
about their products, such as bills, purchases, quantities, and supply chain management.
12. Airline Reservation System: This system is similar to the railway reservation
system. It also uses a database management system to store the records of flight
departures, arrivals, and delay status.

Definition of schema: The design of a database is called the schema. A schema is of three types:
physical schema, logical schema, and view schema.

Example: In the following diagram, we have a schema that shows the relationship between three
tables: Course, Student, and Section. The diagram shows only the design of the database; it does
not show the data present in those tables. A schema is only a structural view (design) of a
database, as shown in the diagram below.

1. Physical schema:
In the physical schema, the database is designed at the physical level. At this level, the schema
describes how the data blocks are stored and how the storage is managed.
2. Logical schema:
In the logical schema, the database is designed at the logical level. Programmers and data
administrators work at this level, where the data is described in a structured way. The internal
implementation details are hidden in the physical layer for security purposes.
3. View schema:
In the view schema, the database is designed at the view level. This schema describes the user
interaction with the database system.
Data Definition Language (DDL) statements are used to specify the schema of a database.
The schema records the name of each table, the names of its attributes and their types, and the
constraints on the tables. Therefore, users who want to modify the schema can do so by writing
DDL statements.
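
A minimal sketch of DDL defining a schema, using SQLite through Python's sqlite3 module. The three table names follow the example above; the column names and the use of SQLite's catalog table are assumptions made for illustration, and other DBMS products expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# DDL statements: they define structure only and store no rows.
# Column names are illustrative assumptions.
conn.executescript("""
CREATE TABLE course  (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE section (
    section_id INTEGER PRIMARY KEY,
    course_id  INTEGER REFERENCES course(course_id),
    student_id INTEGER REFERENCES student(student_id)
);
""")

# SQLite keeps the schema in its catalog table sqlite_master.
table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
# PRAGMA table_info lists the attributes (columns) and types of one table.
section_columns = [(row[1], row[2])
                   for row in conn.execute("PRAGMA table_info(section)")]
print(table_names)
print(section_columns)
```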

Definition of instance: The data stored in a database at a particular moment of time is called an
instance of the database. The database schema defines the variable declarations in the tables that
belong to a particular database; the values of these variables at a moment of time are called the
instance of that database.

Example: Let's say we have a single table, student, in the database. Today the table has 100
records, so today the instance of the database has 100 records. If we add another 100 records to
this table by tomorrow, the instance of the database tomorrow will have 200 records in the
table. In short, the data stored in the database at a particular moment is called the instance, and
it changes over time as we add or delete data.
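
The difference between schema and instance can be sketched as follows, assuming SQLite through Python's sqlite3 module and a hypothetical two-column student table. The schema text stays the same while the row count, which belongs to the instance, changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

def snapshot(connection):
    """Return (schema text, number of rows) for the student table."""
    schema = connection.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'student'").fetchone()[0]
    count = connection.execute("SELECT COUNT(*) FROM student").fetchone()[0]
    return schema, count

# "Today": 100 records
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(i, f"student{i}") for i in range(1, 101)])
schema_today, rows_today = snapshot(conn)

# "Tomorrow": another 100 records are added
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(i, f"student{i}") for i in range(101, 201)])
schema_tomorrow, rows_tomorrow = snapshot(conn)

print(rows_today, rows_tomorrow, schema_today == schema_tomorrow)
```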

DATA MODEL:
• A database model defines the logical design and structure of a database and defines how data
will be stored, accessed, and updated in a database management system.
• A data model is the logical structure of a database. It describes the design of the database to
reflect entities, attributes, relationships among data, constraints, etc.
• In short, the structure of the database is called the data model: it defines the format in which
the data are represented and accessed.
Some of the types of data model:
• Hierarchical Model
• Network Model
• Entity-Relationship Model
• Relational Model
• Object-Oriented Model

Hierarchical Model

• This database model organises data into a tree-like structure with a single root, to which all the
other data is linked. The hierarchy starts from the root data and expands like a tree, adding
child nodes to the parent nodes.
• In this model, a child node has only a single parent node.
• This model efficiently describes many real-world relationships, such as the index of a book,
recipes, etc.
• In the hierarchical model, data is organised into a tree-like structure with a one-to-many
relationship between two different types of data. For example, one department can have many
courses, many professors, and of course many students.
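
The single-parent rule can be sketched in plain Python. The node names below extend the department example and are hypothetical; a real hierarchical DBMS stores such links on disk rather than in a dictionary.

```python
# Each child records exactly one parent, which is the defining rule of the model.
# Node names are hypothetical.
parent_of = {
    "Courses": "Department",
    "Professors": "Department",
    "Students": "Department",
    "DBMS": "Courses",
    "Networks": "Courses",
}

def path_from_root(node):
    """Walk up the single-parent links, then reverse to get the root-to-node path."""
    path = [node]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return list(reversed(path))

def children_of(node):
    """One parent can have many children (one-to-many)."""
    return sorted(child for child, parent in parent_of.items() if parent == node)

print(path_from_root("DBMS"))
print(children_of("Department"))
```

Reaching any record means following the path from the root, which is why access cost grows with the depth of the tree.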

Advantages of Hierarchical Data Model:

• Efficient storage for data that have a clear hierarchy.
• The parent/child relationship promotes conceptual simplicity and data integrity.
• It promotes data sharing.
• It is easy to understand and use.

Disadvantages of Hierarchical Data Model:

• To add a new level (parent/child) to an existing structure, the user has to reconstruct the
entire structure, which is time-consuming.
• To access data in this model, we need to travel from the root level down to the child level,
which is a time-consuming process.
• This model is based on a one-to-many relation, i.e. every child has only one parent, so data
may be duplicated when a child logically belongs to more than one parent.

Network Model

• This is an extension of the hierarchical model. In this model, data is organised more like a
graph, and a child node is allowed to have more than one parent node.
• In this database model, data is more closely related because more relationships are
established. As a result, accessing the data is also easier and faster. This database model was
used to map many-to-many data relationships.
• This was the most widely used database model before the relational model was introduced.
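
The difference from the hierarchical model can be sketched in plain Python: a record may now point to several parents. The student, course, and department names are hypothetical.

```python
# Each record may be linked to several parents (owners); names are hypothetical.
parents_of = {
    "Ravi": ["DBMS", "Networks"],      # one student enrolled in two courses
    "Meena": ["DBMS"],
    "DBMS": ["CSE Department"],
    "Networks": ["CSE Department"],
}

def children_of(node):
    """Follow the links in the other direction: parent -> children."""
    return sorted(child for child, parents in parents_of.items() if node in parents)

# Many-to-many: a course has many students and a student has many courses,
# yet each student is stored only once.
print(parents_of["Ravi"])
print(children_of("DBMS"))
```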

Advantages:

• It reduces duplicate data because it supports many-to-many relations (a child can have
multiple parents).
• By using the pointer mechanism, we can add a new level (parent/child) to an existing
structure without reconstruction.
• Accessing data in this model is very fast because it uses pointers.

Disadvantages:

• When an application uses a large number of pointers, complexity increases: it becomes
difficult to identify which pointer belongs to which parent or child, and performance can
degrade.
• The network DBMS (NDBMS) model was superseded by the RDBMS model, which was
introduced in the 1970s and offered more effective features.

Entity-relationship Model

• In this database model, each object of interest is modelled as an entity and its
characteristics as attributes.
• Different entities are related using relationships.
• E-R models represent these relationships in pictorial form to make them easier for
different stakeholders to understand.
• This model is good for designing a database, which can then be turned into tables in the
relational model.
• Example: If we have to design a School Database, then Student will be an entity with
attributes name, age, address, etc. As Address is generally complex, it can be another entity
with attributes street name, pincode, city, etc., and there will be a relationship between them.
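
The School Database example can be sketched with Python dataclasses. This is only one way to write the design down, not part of the ER notation itself; the field values are placeholders.

```python
from dataclasses import dataclass

@dataclass
class Address:            # entity
    street_name: str      # attributes
    pincode: str
    city: str

@dataclass
class Student:            # entity
    name: str
    age: int
    address: Address      # relationship: a Student is linked to an Address

# Placeholder values for illustration only.
home = Address(street_name="Example Street", pincode="000000", city="Delhi")
student = Student(name="Ankit", age=17, address=home)
print(student.address.city)
```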

Advantages:

• It is easy to understand and design.
• Using the ER model, we can represent data structures easily.
• Although the ER model cannot be implemented directly as a database, it is a useful step
toward designing the relational database model.

Disadvantages:

• There is no industry standard for developing an ER model.
• Information might be lost or hidden in the ER model.
• There is no data manipulation facility.
• There is limited relationship representation.

Relational Model

• In this model, data is organised in two-dimensional tables, and the relationship between
tables is maintained by storing a common field.
• This model was introduced by E. F. Codd in 1970, and since then it has been the most
widely used database model.
• The basic structure of data in the relational model is the table. All the information related to
a particular type is stored in the rows of that table.
• Hence, tables are also known as relations in the relational model.
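
How a common field maintains the relationship between two tables can be sketched with SQLite through Python's sqlite3 module. The department and student tables and their rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE student (
    roll_no INTEGER PRIMARY KEY,
    name    TEXT,
    dept_id INTEGER REFERENCES department(dept_id)  -- the common field
);
INSERT INTO department VALUES (1, 'CSE'), (2, 'ECE');
INSERT INTO student VALUES (101, 'Ankit', 1), (102, 'Riya', 2), (103, 'Kiran', 1);
""")

# The join matches rows of the two relations on the shared dept_id value.
joined = conn.execute("""
    SELECT s.name, d.dept_name
    FROM student AS s JOIN department AS d ON s.dept_id = d.dept_id
    ORDER BY s.roll_no
""").fetchall()
print(joined)
```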

Advantages:

• It is simple and easy to implement.
• Popular database software is available for this database model.
• It supports SQL, with which you can easily query the data.

Disadvantages:

• The relational model requires powerful hardware and large data storage devices.
• It may lead to slower processing time.
• Poorly designed systems lead to poor implementation of database systems.

Object oriented data model

The object oriented data model is based upon real-world situations. These situations are
represented as objects with different attributes. These objects have multiple
relationships between them.

• Objects: Real-world entities and situations are represented as objects in the object
oriented database model.
• Attributes and Methods: Every object has certain characteristics, which are represented
using attributes. The behaviour of the objects is represented using methods.
• Class: Similar attributes and methods are grouped together using a class. An object can
be called an instance of the class.
• Inheritance: A new class can be derived from the original class. The derived class
contains the attributes and methods of the original class as well as its own.
• Example

An example of the object oriented data model is:

• Shape, Circle, Rectangle, and Triangle are all objects in this model.
• Circle has the attributes Center and Radius.
• Rectangle has the attributes Length and Breadth.
• Triangle has the attributes Base and Height.
• The objects Circle, Rectangle, and Triangle inherit from the object Shape.
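
The Shape example maps naturally onto classes in an object-oriented language. The sketch below uses Python; the attribute names follow the list above, while the area method is an assumed behaviour added only to show a method being inherited and specialised.

```python
import math

class Shape:
    """Base class; area() is an assumed behaviour for illustration."""
    def area(self):
        raise NotImplementedError

class Circle(Shape):
    def __init__(self, center, radius):
        self.center = center
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2

class Rectangle(Shape):
    def __init__(self, length, breadth):
        self.length = length
        self.breadth = breadth

    def area(self):
        return self.length * self.breadth

class Triangle(Shape):
    def __init__(self, base, height):
        self.base = base
        self.height = height

    def area(self):
        return 0.5 * self.base * self.height

shapes = [Circle((0, 0), 1.0), Rectangle(4, 3), Triangle(6, 2)]
areas = [shape.area() for shape in shapes]
print(areas)
```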

Advantages:

• Reduced Maintenance
• Real-World Modeling
• Improved Reliability and Flexibility
• High Code Reusability

Disadvantages:

• It is a complex navigational system.


• Slow development of standards.
• High system overheads.
• Slow transactions.

Semi-Structured Data Model

• The semi-structured data model permits the specification of data where individual
data items of the same type may have different sets of attributes. The Extensible
Markup Language (XML) is widely used to represent semi-structured data.
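
A small sketch of semi-structured data, parsed with Python's standard xml.etree.ElementTree module. The library document and its contents are made up for illustration: both items are books, but they carry different sets of fields.

```python
import xml.etree.ElementTree as ET

# Two items of the same type (<book>) with different sets of attributes.
document = """
<library>
    <book><title>Database Systems</title><author>A. Author</author></book>
    <book><title>Networks</title><edition>3</edition><price>450</price></book>
</library>
"""

root = ET.fromstring(document)
# Collect whichever fields each book happens to carry.
field_sets = [sorted(child.tag for child in book) for book in root.findall("book")]
print(field_sets)
```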

Advantages:

• Data is not constrained by a fixed schema.
• It is flexible.
• It is portable.

Disadvantage:

• Queries are less efficient than in other types of data model.

DBMS STRUCTURE:
Query Processor:
The query processor receives the queries (requests) from the user and interprets them in the
form of instructions. It also executes the instructions received from the DML compiler. It has
the following components:
• DDL Interpreter: It interprets DDL (Data Definition Language) instructions
and stores the resulting records in a data dictionary (a table containing metadata).
• DML Compiler and Organizer: It translates DML statements in a query language into
low-level instructions that the query evaluation engine understands. The DML compiler also
transforms a user's request into an equivalent but more efficient form of low-level
instructions.
• Compiler and Linker: SQL commands can be embedded in host-language application
programs written in languages such as Java or COBOL. These commands are converted into
DML statements by this compiler.
• Query Evaluation Engine: It executes the low-level instructions generated by the DML
compiler.
• Application Program Object Code: It is the low-level form of the application programs
used by naive users, which the query evaluation engine understands and executes.
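
One way to observe a query being compiled into low-level instructions is SQLite's EXPLAIN statement, run here through Python's sqlite3 module. The instruction format shown is specific to SQLite and is only an illustration of the idea; other DBMS products use different internal forms. The student table is an assumption made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER)"
)

# EXPLAIN asks SQLite to return the compiled program instead of running the query.
program = conn.execute(
    "EXPLAIN SELECT name FROM student WHERE marks > 75"
).fetchall()

# Each row is one low-level instruction; column 1 holds the opcode name.
opcodes = [row[1] for row in program]
print(len(opcodes), opcodes[:5])
```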
Storage Manager:
The storage manager acts as the interface between the data stored in the database and the
queries received from the end user. This component in the structure of the DBMS is
responsible for the constraints applied to the data so that it remains consistent. It also
executes the DCL (Data Control Language). It encapsulates the following modules:
• Authorization and Integrity Manager: It checks the authority of the various users who
access data and the integrity constraints of the database.
• Transaction Manager: Its job is to ensure that the system remains in a consistent state
during transaction processing. It also ensures that concurrent transactions are executed
without any conflict.
• File Manager: It manages the allocation of disk space for files and the data structures that
store information in the database.
• Buffer Manager: It manages the transfer of data between secondary storage and main
memory. It also decides what data should be cached in memory.
Disk Storage
• The disk storage in the structure of the DBMS represents the space where data is stored. It
has the following components:
• Files: These are responsible for storing the data.
• Data Dictionary: It is the repository that maintains information about the database objects,
i.e. the metadata.
• Indices: These are structures built on keys that are used for faster retrieval of data.
• Statistical data: It stores statistical information about the data in the database.
This information is used by the query processor to select efficient ways to execute a query.
Thus, the structure of a DBMS represents the functional modules that are employed to
process the queries received from the user, retrieve the data, maintain the changes in the
database, and optimize data retrieval.

Data Independence:
The ability to modify the schema at one level without affecting the schema at the next higher
level is called data independence. Changes made at one level are independent of the
next higher level.
Data independence is of two types
1) Physical Data Independence
2) Logical Data Independence

1. Logical Data Independence
o Changing the logical schema (conceptual or logical level) without changing the external
schema (external or view level) is called logical data independence.
o It is used to keep the external schema separate from the logical schema.
o If we make any changes at the conceptual level of data, they do not affect the view level.
o This happens at the user interface level.
o For example, it is possible to add or delete entities and attributes in the conceptual schema
without making any changes to the external schema.
o Example: Imagine an e-commerce site adding a “Reviews” section to its “Products” table.
With logical data independence, this can be done without affecting the user interface, so the
change does not disrupt the shopping experience for customers.
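
The "Reviews" example can be sketched with SQLite through Python's sqlite3 module. The products table, the shop_view view, and the sample row are assumptions made for illustration: the view plays the role of the external schema, and it keeps returning the same result after the table gains a new attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price INTEGER)")
conn.execute("INSERT INTO products VALUES (1, 'Pen', 10)")

# External schema: the view that the shopping page reads from.
conn.execute("CREATE VIEW shop_view AS SELECT name, price FROM products")
before = conn.execute("SELECT * FROM shop_view").fetchall()

# Change to the logical schema: a new attribute is added to the table.
conn.execute("ALTER TABLE products ADD COLUMN reviews TEXT")
after = conn.execute("SELECT * FROM shop_view").fetchall()

print(before == after)
```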

2. Physical Data Independence


o Making changes to the physical schema without changing the logical schema is called physical
data independence.
o If we change the storage size of the database system server, it does not affect the conceptual
structure of the database.
o It is used to keep the conceptual level separate from the internal level.
o This happens at the logical interface level.
o Example: Changing the location of the database from the C drive to the D drive.
o Example: An e-commerce site may switch from hard drives to SSDs for faster performance.
This upgrade happens without any changes to the database logic or the user interface.
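
A drive or hardware change cannot be shown in a few lines, so the sketch below substitutes a different physical-level change: adding an index in SQLite (through Python's sqlite3 module). The products table and the index name are assumptions made for illustration. The table definition and the query text stay untouched, and the query returns the same rows before and after.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price INTEGER)")
conn.executemany("INSERT INTO products VALUES (?, ?, ?)",
                 [(1, "Pen", 10), (2, "Book", 150), (3, "Bag", 700)])

query = "SELECT name FROM products WHERE price > 100 ORDER BY name"
before = conn.execute(query).fetchall()

# Physical-level change: add an index (a storage structure). The logical
# schema of the table and the query text stay exactly as they were.
conn.execute("CREATE INDEX idx_products_price ON products (price)")
after = conn.execute(query).fetchall()

print(before, after)
```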

Fig: Data Independence

THREE TIER SCHEMA ARCHITECTURE FOR DATA INDEPENDENCE

o The design of a DBMS depends upon its architecture. The basic client/server architecture is
used to deal with a large number of PCs, web servers, database servers, and other components
that are connected by networks.
o The client/server architecture consists of many PCs and a workstation, which are connected via
the network.
o DBMS architecture depends upon how users are connected to the database to get their requests
done.

Types of DBMS Architecture

Database architecture can be seen as single-tier or multi-tier. Logically, however, multi-tier
database architecture is of two types: 2-tier architecture and 3-tier architecture.

1-TIER ARCHITECTURE

o In this architecture, the database is directly available to the user. It means the user can directly
sit on the DBMS and uses it.
o Any changes done here will directly be done on the database itself. It doesn't provide a handy
tool for end users.
o The 1-Tier architecture is used for development of the local application, where programmers
can directly communicate with the database for the quick response.

2-TIER ARCHITECTURE

o The 2-Tier architecture is the same as the basic client-server architecture. In the two-tier
architecture, applications on the client end can directly communicate with the database at the
server side. For this interaction, APIs such as ODBC and JDBC are used.
o The user interfaces and application programs run on the client side.
o The server side is responsible for providing functionality such as query processing and
transaction management.
o To communicate with the DBMS, the client-side application establishes a connection with the
server side.
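
The two-tier interaction described above can be sketched in Python. This is only an illustration under stated assumptions: the standard-library sqlite3 module stands in for an ODBC/JDBC-style API, an in-memory database stands in for the remote database server, and the table and column names are hypothetical.

```python
import sqlite3

# In a real two-tier system this connection would go over the network to a
# database server; here an in-memory SQLite database stands in for it.
conn = sqlite3.connect(":memory:")

# The client-side application sends SQL straight to the DBMS.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (?, ?)", (1, "Asha"))
conn.commit()

# The DBMS does the query processing; the client only receives the rows.
rows = conn.execute("SELECT roll_no, name FROM student").fetchall()
print(rows)
```

Note that the SQL statements live in the client program itself, which is the defining feature of the two-tier arrangement.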

Fig: 2-tier Architecture

3-TIER ARCHITECTURE

o The 3-Tier architecture contains another layer between the client and the server. In this
architecture, the client can't communicate directly with the server.

o The application on the client end interacts with an application server, which in turn
communicates with the database system.
o The end user has no idea about the existence of the database beyond the application server. The
database also has no idea about any user beyond the application.
o The 3-Tier architecture is used for large web applications.
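
A minimal sketch of the three-tier idea, assuming a hypothetical ApplicationServer class as the middle layer and an in-memory SQLite database as the database system (neither comes from these notes). The client function never sees SQL; it only calls the application server.

```python
import sqlite3


class ApplicationServer:
    """Hypothetical middle tier: the only component that touches the database."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)"
        )
        self._db.execute("INSERT INTO account VALUES (1, 500)")
        self._db.commit()

    def get_balance(self, account_id):
        # Business logic and SQL live here, not in the client.
        row = self._db.execute(
            "SELECT balance FROM account WHERE id = ?", (account_id,)
        ).fetchone()
        if row is None:
            raise KeyError(f"no such account: {account_id}")
        return row[0]


def client_view(server, account_id):
    # The client only talks to the application server.
    return f"Balance: {server.get_balance(account_id)}"


server = ApplicationServer()
print(client_view(server, 1))
```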

Fig: 3-tier Architecture

Database Environment:
 One of the primary aims of a database is to supply users with an abstract view of data,
hiding certain details of how data is stored and manipulated.
 Therefore, the starting point for the design of a database should be an abstract and
general description of the information needs of the organization that is to be represented
in the database. Hence you will require an environment to store data and make it
work as a database.
 A database environment is a collective system of components that comprise and
regulate the collection, management, and use of data. It consists of software,
hardware, people, techniques for handling the database, and the data itself.
 Here, the hardware in a database environment means the computers and computer
peripherals that are used to manage a database, and the software means everything
from the operating system (OS) to the application programs, including
database management software like MS Access or SQL Server. The people in a
database environment are those who administer and use the system. The
techniques are the rules, concepts, and instructions given to both the people and the
software, and the data is the group of facts and information held within
the database environment.
Centralized and Client Server Architecture for DBMS:
Centralized Architecture:

 A centralized architecture for DBMS is one in which all data is stored on a single
server, and all clients connect to that server in order to access and manipulate the
data. This type of architecture is also known as a monolithic architecture. One of
the main advantages of a centralized architecture is its simplicity - there is only one
server to manage, and all clients use the same data.
 However, there are also some drawbacks to this type of architecture. One of the
main downsides is that, because all data is stored on a single server, that server can
become a bottleneck as the number of clients and/or the amount of data increases.
Additionally, if the server goes down for any reason, all clients lose access to the
data.

Client-server Architecture of DBMS
 A client-server architecture for DBMS is one in which data is stored on a
server, and client programs connect to that server over a network in order to access
and manipulate the data. This type of architecture is more complex than a centralized
architecture, but it offers several advantages over the latter.
 One of the main benefits of a client-server architecture is that it is more scalable
than a centralized architecture. As the number of clients and/or the amount of data
increases, the server can be upgraded or additional servers can be added to handle
the load. This allows the system to continue functioning smoothly even as it grows
in size.
 Another advantage of a client-server architecture is that it is more fault-tolerant than
a centralized architecture. If a single server goes down, other servers can take over
its responsibilities, and clients can still access the data. This makes the system less
likely to experience downtime, which is a crucial factor in many business
environments. The two-tier and three-tier DBMS architectures were developed on
this underlying client/server framework.

Two-Tier Client Server Architecture:

 Here, the term "two-tier" refers to the architecture's two layers: the Client layer and
the Data layer. The client layer contains a number of client computers that can
contact the database server. An API on the client computer, such as JDBC, links
the computer to the database server. Such an API is needed because clients and
database servers may be at different physical locations.

Three-Tier Client-Server Architecture:

 The Business Logic Layer is an additional layer that serves as a link between the
Client layer and the Data layer. Application programs are processed in this layer,
on the application server itself. This is unlike a two-tier architecture, where the
application programs run on the client and only the queries are processed on the
database server.

Introduction of Entity Relationship Model
 The Entity Relationship Model is a model for identifying entities to be represented
in the database and representing how those entities are related. The ER data
model specifies an enterprise schema that represents the overall logical structure of a
database graphically.

Rectangles: Entity Sets
Ellipses: Attributes
Diamonds: Relationship Set
Lines: They link attributes to Entity Sets and Entity sets to Relationship Set
Double Ellipses: Multivalued Attributes
Dashed Ellipses: Derived Attributes
Double Rectangles: Weak Entity Sets
Double Lines: Total participation of an entity in a relationship set

Components of an ER Diagram

As shown in the above diagram, an ER diagram has three main components:
1. Entity
2. Attribute
3. Relationship

1. Entity

An entity is an object or component of data. An entity is represented as a rectangle in an ER
diagram.
For example: In the following ER diagram we have two entities Student and College and
these two entities have many to one relationship as many students study in a single college.
We will read more about relationships later, for now focus on entities.

Types of Entity:
Weak Entity:
A Weak Entity in DBMS does not have a Key Attribute. It depends on some other Strong
Entity to be identified uniquely. For example, if Installment is an Entity, then it can exist
only if a Loan exists as an Entity. It is represented by a double rectangle in ER Diagram.

Strong Entity:
If an Entity has a key attribute (uniquely identifiable feature), then it is called a Strong
Entity. A simple example of a Strong Entity is the Employee which has the key attribute
‘Employee ID,’ using which a record of that Entity Type is uniquely identified. In ER
Diagram, it is represented using a single rectangle.

Attribute
An attribute describes a property of an entity. An attribute is represented as an oval in an ER
diagram. There are four types of attributes:

1. Key attribute
2. Composite attribute
3. Multivalued attribute
4. Derived attribute

1. Key attribute:

A key attribute can uniquely identify an entity from an entity set. For example, student roll
number can uniquely identify a student from a set of students. A key attribute is represented by
an oval like other attributes, but the text of the key attribute is underlined.

2. Composite attribute:

An attribute that is a combination of other attributes is known as a composite attribute. For
example, in the student entity, the student address is a composite attribute as an address is composed
of other attributes such as pin code, state, country.

3. Multivalued attribute:
An attribute that can hold multiple values is known as a multivalued attribute. It is represented
with double ovals in an ER Diagram. For example – a person can have more than one phone
number, so the phone number attribute is multivalued.

4. Derived attribute:
A derived attribute is one whose value is dynamic and derived from another attribute. It is
represented by a dashed oval in an ER Diagram. For example – a person's age is a derived
attribute as it changes over time and can be derived from another attribute (date of birth).
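
The four attribute types can be illustrated with a small Python sketch. The class names, field names and sample values below are hypothetical and chosen only for illustration; the point is that the derived attribute (age) is computed from date of birth rather than stored.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Address:
    # Composite attribute: made up of simpler attributes.
    pin_code: str
    state: str
    country: str


@dataclass
class Person:
    person_id: int  # key attribute
    date_of_birth: date
    address: Address  # composite attribute
    phone_numbers: list = field(default_factory=list)  # multivalued attribute

    def age(self, today):
        # Derived attribute: computed from date_of_birth, never stored.
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)


p = Person(
    1,
    date(2000, 6, 15),
    Address("518501", "Andhra Pradesh", "India"),
    ["9000000001", "9000000002"],
)
print(p.age(date(2024, 6, 14)), p.age(date(2024, 6, 15)))
```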

E-R diagram with multivalued and derived attributes:

Entity Set is a collection or a group of ‘entities’ sharing exactly the ‘same set of attributes’.

Types of Entity Sets-
An entity set may be of the following two types-

1. Strong entity set
2. Weak entity set

1. Strong Entity Set-
 A strong entity set is an entity set that contains sufficient attributes to uniquely identify
all its entities.
 In other words, a primary key exists for a strong entity set.
 Primary key of a strong entity set is represented by underlining it.
Symbols Used-
 A single rectangle is used for representing a strong entity set.
 A diamond symbol is used for representing the relationship that exists between two
strong entity sets.
 A single line is used for representing the connection of the strong entity set with the
relationship set.
 A double line is used for representing the total participation of an entity set with the
relationship set.
 Total participation may or may not exist in the relationship.
Example-
Consider the following ER diagram-

In this ER diagram,
 Two strong entity sets “Student” and “Course” are related to each other.
 Student ID and Student name are the attributes of entity set “Student”.
 Student ID is the primary key using which any student can be identified uniquely.
 Course ID and Course name are the attributes of entity set “Course”.
 Course ID is the primary key using which any course can be identified uniquely.
 Double line between Student and relationship set signifies total participation.
 It suggests that each student must be enrolled in at least one course.
 Single line between Course and relationship set signifies partial participation.
 It suggests that there might exist some courses for which no enrollments are made.
2. Weak Entity Set-
 A weak entity set is an entity set that does not contain sufficient attributes to uniquely
identify its entities.
 In other words, a primary key does not exist for a weak entity set.
 However, it contains a partial key called a discriminator.
 The discriminator distinguishes the weak entities that depend on the same strong entity.
 The discriminator is represented by underlining with a dashed line.

In this ER diagram,
 One strong entity set “Building” and one weak entity set “Apartment” are related to
each other.
 Strong entity set “Building” has building number as its primary key.
 Door number is the discriminator of the weak entity set “Apartment”.
 This is because door number alone cannot identify an apartment uniquely, as
several buildings may have apartments with the same door number.
 Double line between Apartment and relationship set signifies total participation.
 It suggests that each apartment must be present in at least one building.
 Single line between Building and relationship set signifies partial participation.
 It suggests that there might exist some buildings which have no apartments.
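
One common way to carry the Building/Apartment example into tables is sketched below with Python's standard sqlite3 module. The table and column names are assumptions made for illustration. The weak entity set gets a composite key made of the owner's primary key plus the discriminator, and a foreign key ensures that an apartment cannot exist without its building.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE building (building_no INTEGER PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE apartment (
        building_no INTEGER NOT NULL REFERENCES building(building_no),
        door_no     INTEGER NOT NULL,          -- discriminator (partial key)
        PRIMARY KEY (building_no, door_no)     -- owner key + discriminator
    )
    """
)
conn.executemany("INSERT INTO building VALUES (?)", [(1,), (2,)])
# The same door number in two different buildings is allowed.
conn.executemany("INSERT INTO apartment VALUES (?, ?)", [(1, 101), (2, 101)])

# An apartment whose building does not exist is rejected.
try:
    conn.execute("INSERT INTO apartment VALUES (99, 101)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

apartment_count = conn.execute("SELECT COUNT(*) FROM apartment").fetchone()[0]
print(orphan_allowed, apartment_count)
```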

3. Relationship

A relationship is represented by a diamond shape in an ER diagram; it shows the relationship
among entities. There are four types of relationships:
1. One to One
2. One to Many
3. Many to One
4. Many to Many

1. One to One Relationship

When a single instance of an entity is associated with a single instance of another entity then it
is called one to one relationship. For example, a person has only one passport and a passport is
given to one person.

2. One to Many Relationship

When a single instance of an entity is associated with more than one instance of another
entity then it is called a one to many relationship. For example – a customer can place many
orders but an order cannot be placed by many customers.

3. Many to One Relationship

When more than one instance of an entity is associated with a single instance of another
entity then it is called a many to one relationship. For example – many students can study in a
single college but a student cannot study in many colleges at the same time.

4. Many to Many Relationship

When more than one instance of an entity is associated with more than one instance of
another entity then it is called a many to many relationship. For example, a student can be assigned to
many projects and a project can be assigned to many students.
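
A many to many relationship is commonly stored as a separate table holding one row per related pair. The sketch below uses Python's sqlite3 module; the table names, column names and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE project (project_id INTEGER PRIMARY KEY, title TEXT);
    -- One row per (student, project) pair in the relationship set.
    CREATE TABLE assigned_to (
        student_id INTEGER REFERENCES student(student_id),
        project_id INTEGER REFERENCES project(project_id),
        PRIMARY KEY (student_id, project_id)
    );
    INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO project VALUES (10, 'Library'), (20, 'Payroll');
    INSERT INTO assigned_to VALUES (1, 10), (1, 20), (2, 10);
    """
)

# Student 1 works on many projects ...
projects_of_student_1 = conn.execute(
    "SELECT COUNT(*) FROM assigned_to WHERE student_id = 1"
).fetchone()[0]
# ... and project 10 has many students.
students_on_project_10 = conn.execute(
    "SELECT COUNT(*) FROM assigned_to WHERE project_id = 10"
).fetchone()[0]
print(projects_of_student_1, students_on_project_10)
```

A one to many or many to one relationship usually needs no extra table: a foreign key column on the "many" side is enough.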

Total Participation of an Entity set

Total participation of an entity set means that each entity in the entity set must take part in at least
one relationship in the relationship set. For example: In the below diagram each college must
have at least one associated Student.

RELATIONSHIP SET

A relationship set is a set of relationships of the same type.


Example-
Set representation of above ER diagram is-

Degree of a Relationship Set-
The number of entity sets that participate in a relationship set is termed as the degree of
that relationship set. Thus,

Degree of a relationship set = Number of entity sets participating in a relationship set

Types of Relationship Sets-


On the basis of degree of a relationship set, a relationship set can be classified into the
following types-

1. Unary relationship set
2. Binary relationship set
3. Ternary relationship set
4. N-ary relationship set
1. Unary Relationship Set-
Unary relationship set is a relationship set where only one entity set participates in a
relationship set.
Example- One person is married to only one person

2. Binary Relationship Set-
Binary relationship set is a relationship set where two entity sets participate in a
relationship set.
Example- Student is enrolled in a Course

3. Ternary Relationship Set-


Ternary relationship set is a relationship set where three entity sets participate in a
relationship set.
Example-

4. N-ary Relationship Set-


N-ary relationship set is a relationship set where ‘n’ entity sets participate in a
relationship set.
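
The degree definition can be sketched in a few lines of Python. The relationship names and entity set names are hypothetical; each relationship set is described by the entity set that fills each of its roles, and the degree counts the distinct entity sets.

```python
# For each relationship set, the entity set that fills each role.
roles = {
    "married_to": ["Person", "Person"],           # one entity set, two roles
    "enrolled_in": ["Student", "Course"],
    "supplies": ["Supplier", "Part", "Project"],
}


def degree(relationship):
    # Degree = number of distinct entity sets participating.
    return len(set(roles[relationship]))


def kind(relationship):
    names = {1: "unary", 2: "binary", 3: "ternary"}
    return names.get(degree(relationship), "n-ary")


for rel in roles:
    print(rel, degree(rel), kind(rel))
```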

Participation Constraints

 Total participation − Each entity is involved in the relationship.
 In other words, each entity in the entity set must compulsorily participate in at
least one relationship instance in that relationship set.
 That is why it is also called mandatory participation.

 Total participation is represented by double lines.

 Partial participation − Not all entities are involved in the relationship.
 In other words, each entity in the entity set may or may not participate in a
relationship instance in that relationship set.
 That is why it is also called optional participation.
 Partial participation is represented using a single line between the entity set and
relationship set.
Example:
 Suppose an entity set Student is related to an entity set Course through the Enrolled
relationship set.
 The participation of entity set Course in the Enrolled relationship set is partial because
a course may or may not have students enrolled in it. It is possible that only some of
the course entities are related to the student entity set through the Enrolled
relationship set.
 The participation of entity set Student in the Enrolled relationship set is total because
every student is expected to be related to at least one course through the Enrolled
relationship set.
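
The Student/Course example can be checked with plain Python sets. This is a sketch with hypothetical identifiers: participation is total when every entity of the set appears in at least one pair of the relationship set.

```python
students = {"s1", "s2", "s3"}
courses = {"c1", "c2", "c3"}
# Enrolled relationship set: (student, course) pairs.
enrolled = {("s1", "c1"), ("s2", "c1"), ("s3", "c2")}


def is_total(entity_set, relationship_set, position):
    # Total participation: every entity appears in at least one pair.
    participating = {pair[position] for pair in relationship_set}
    return entity_set <= participating


student_total = is_total(students, enrolled, 0)  # every student is enrolled
course_total = is_total(courses, enrolled, 1)    # c3 has no enrollment
print(student_total, course_total)
```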

Subclasses:
 A subclass is a class derived from the superclass. It inherits the properties of the
superclass and also contains attributes of its own. An example is:

Car, Truck and Motorcycle are all subclasses of the superclass Vehicle. They all inherit
common attributes from Vehicle such as speed, colour etc., while they also have different
attributes, e.g. the number of wheels in a Car is 4 while in a Motorcycle it is 2.

Super classes:
 A superclass is the class from which many subclasses can be created. The subclasses
inherit the characteristics of a superclass. The superclass is also known as the parent
class or base class.

In the above example, Vehicle is the Superclass and its subclasses are Car, Truck and
Motorcycle.

Inheritance
 Inheritance is basically the process of basing a class on another class, i.e. building a
class on an existing class. The new class contains all the features and functionalities
of the old class in addition to its own.
 The class which is newly created is known as the subclass or child class and the
original class is the parent class or the superclass.
 Inheritance helps avoid redundancy by centralizing common attributes in a
superclass and allowing subclasses to inherit these attributes without duplicating
them.

Let's consider an example where we model an employee system:


 Superclass: Employee (contains general information about employees)
Attributes: employee_id, name, address, salary
 Subclasses: Manager, Engineer
Manager has an additional attribute: department
Engineer has an additional attribute: specialization
In this case:
 The Employee is the superclass.
 Manager and Engineer are subclasses that inherit the common attributes from
Employee but also have their own specific attributes.
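
The Employee example maps directly onto class inheritance in a programming language. The Python sketch below is illustrative only; the sample values are hypothetical.

```python
class Employee:
    """Superclass holding the attributes common to all employees."""

    def __init__(self, employee_id, name, address, salary):
        self.employee_id = employee_id
        self.name = name
        self.address = address
        self.salary = salary


class Manager(Employee):
    def __init__(self, employee_id, name, address, salary, department):
        super().__init__(employee_id, name, address, salary)
        self.department = department  # attribute specific to Manager


class Engineer(Employee):
    def __init__(self, employee_id, name, address, salary, specialization):
        super().__init__(employee_id, name, address, salary)
        self.specialization = specialization  # attribute specific to Engineer


m = Manager(1, "Asha", "Nandyal", 90000, "Sales")
e = Engineer(2, "Ravi", "Kurnool", 70000, "Databases")
print(m.name, m.department, e.name, e.specialization)
```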
Advantages:
 Reduces Redundancy: Common attributes are stored in the superclass, avoiding
duplication across multiple entities.
 Improves Flexibility: New subclasses can be added easily without changing the
structure of the superclass.
 Better Data Representation: Inheritance allows modeling of real-world
hierarchies and relationships more naturally.
 Simplifies Maintenance: Changes in shared attributes need to be made only once
in the superclass, making the model easier to maintain.

Generalization

 Generalization is a bottom-up approach in which two or more entities of a
lower level combine to form a higher level entity if they have some attributes in
common.
 In generalization, an entity of a higher level can also combine with the entities of
the lower level to form a further higher level entity.
 Generalization is much like the subclass and superclass system; the only difference
is the approach. Generalization uses the bottom-up approach.
 In generalization, entities are combined to form a more generalized entity, i.e.,
subclasses are combined to make a superclass.
 For example, Faculty and Student entities can be generalized to create a higher
level entity Person.

Specialization
o Specialization is a top-down approach, and it is the opposite of Generalization. In
specialization, one higher level entity can be broken down into two or more lower level
entities.
o Specialization is used to identify the subset of an entity set that shares some
distinguishing characteristics.
o Normally, the superclass is defined first, the subclasses and their related attributes are
defined next, and relationship sets are then added.
For example: In an Employee management system, EMPLOYEE entity can be
specialized as TESTER or DEVELOPER based on what role they play in the company.

Aggregation
In aggregation, the relationship between two entities is treated as a single entity. In
aggregation, a relationship with its corresponding entities is aggregated into a higher level
entity.

For example: the Center entity offering the Course entity acts as a single entity, which
is in a relationship with another entity, Visitor. In the real world, if a
visitor visits a coaching center, then he will never enquire about the Course only or just
about the Center; instead he will enquire about both.
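
One possible way to realise this aggregation in tables is sketched below with Python's sqlite3 module. The table and column names are assumptions. The enquiry table refers to a (center, course) pair of the offers relationship as a unit, so a visitor can only enquire about a course that a center actually offers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- The relationship "Center offers Course", treated as one unit.
    CREATE TABLE offers (
        center_id INTEGER,
        course_id INTEGER,
        PRIMARY KEY (center_id, course_id)
    );
    -- A visitor enquires about an (offers) pair, not about either part alone.
    CREATE TABLE enquiry (
        visitor_id INTEGER,
        center_id  INTEGER,
        course_id  INTEGER,
        FOREIGN KEY (center_id, course_id)
            REFERENCES offers (center_id, course_id)
    );
    INSERT INTO offers VALUES (1, 100);
    INSERT INTO enquiry VALUES (7, 1, 100);
    """
)

# Enquiring about a course that the center does not offer is rejected.
try:
    conn.execute("INSERT INTO enquiry VALUES (8, 1, 999)")
    invalid_accepted = True
except sqlite3.IntegrityError:
    invalid_accepted = False

enquiry_count = conn.execute("SELECT COUNT(*) FROM enquiry").fetchone()[0]
print(invalid_accepted, enquiry_count)
```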

Object-Relational Database:
Definition: An Object-Relational Database (ORD) is a database
management system (DBMS) that integrates object-oriented database
model features into relational databases. ORDs aim to bridge the gap
between relational databases and the object-oriented modeling techniques
that are commonly used in programming languages. This type of database
supports data types, structures, and behaviors directly in the database
schema and query language.

Exploring Object-Relational Databases:


Object-relational databases enhance the flexibility and scalability of data
management systems by allowing complex data types and relationships to
be represented more naturally while maintaining all the robust features of
traditional relational databases.

Architecture of Object-Relational Databases:


ORD architecture extends traditional relational database architecture with
several key enhancements:
Type System: Supports user-defined types and inheritance in database
schemas.
Table Inheritance: Allows table definitions to inherit from other tables.
Complex Data Types: Facilitates complex data types like arrays, structs,
and even custom-defined types.
Methods: Similar to object-oriented programming, methods (functions)
can be defined on data types and stored in the database.
How Object-Relational Databases Work
Object-relational databases work by:
Storing Data: Data is stored in tables with the flexibility of object-
oriented features such as inheritance and polymorphism.
Querying Data: SQL queries are enhanced to support complex types and
object-oriented constructs, enabling more sophisticated data retrieval.
Handling Transactions: Provides full ACID (Atomicity, Consistency,
Isolation, Durability) compliance for handling transactions, ensuring data
integrity and consistency.

Benefits of Using Object-Relational Databases
 Enhanced Modeling Capabilities: More closely aligns database
design with the application’s object model.
 Improved Data Integrity: By encapsulating behaviors with data, data
handling becomes more robust and consistent.
 Increased Scalability and Flexibility: Supports complex applications
and data types without sacrificing performance.
Considerations and Challenges
 Complexity: Increased complexity in design and query language can
lead to steeper learning curves.
 Performance Overhead: Additional features such as inheritance and
user-defined types might introduce performance overhead.
 Integration: Integrating with other systems that do not support
object-relational features can be challenging.
Applications of Object-Relational Databases
Object-relational databases are particularly useful in applications where
the data model is complex and varies frequently, such as:
 Computer-Aided Design (CAD)
 Multimedia Databases
 Scientific Databases
 E-commerce Systems
Document Oriented Database
Definition
 A document-oriented database is a class of NoSQL database that
stores and queries data in the form of documents, usually JSON but
sometimes also XML or YAML. Unlike a traditional relational
database, objects within a document-based database are stored
completely within one document, rather than across multiple tables.
As a result, the concept of “joins” is not usually present. This model
allows developers to query the database programmatically in a
similar fashion to how they already access data within the
application code.
 Conceptually, you can think of a document as roughly equivalent to
an object in your programming language. The structure is similar (or
identical), and documents are not required to conform to a specific
schema. For example, you may have a class of document called
“books.” Each “book” object is not required to contain the same
keys; one may include a “co-author” key, but the next may not. This
allows for a lot of flexibility with certain types of data.
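
The "books" example can be sketched with plain Python dictionaries and the standard json module. The titles and author names are hypothetical, and no particular document database product is assumed; the list simply stands in for a collection.

```python
import json

# Two documents of the same class with different keys: no fixed schema.
books = [
    {"title": "Database Basics", "author": "A. Author"},
    {"title": "ER Modelling", "author": "B. Author", "co-author": "C. Author"},
]

# Querying works on whatever keys a document happens to have.
with_co_author = [b["title"] for b in books if "co-author" in b]

# Each document is stored and retrieved whole, e.g. as JSON text.
stored = json.dumps(books[1])
restored = json.loads(stored)
print(with_co_author, restored["co-author"])
```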
Use Cases:
Because of the flexibility and speed that document databases allow, they
are a great choice for any application that relies on user profiles, content
management systems, product information, and even real-time big data
analytics.
Examples:
DocumentDB, CosmosDB, Firestore, MongoDB

GRAPH DATABASE:
A graph database is a type of NoSQL database that is designed to handle
data with complex relationships and interconnections. In a graph
database, data is stored as nodes and edges, where nodes represent
entities and edges represent the relationships between those entities.
1. Graph databases are particularly well-suited for applications that
require deep and complex queries, such as social networks,
recommendation engines, and fraud detection systems. They can also
be used for other types of applications, such as supply chain
management, network and infrastructure management, and
bioinformatics.
2. One of the main advantages of graph databases is their ability to
handle and represent relationships between entities. This is because
the relationships between entities are as important as the entities
themselves, and often cannot be easily represented in a traditional
relational database.

3. Another advantage of graph databases is their flexibility. Graph
databases can handle data with changing structures and can be adapted
to new use cases without requiring significant changes to the database
schema. This makes them particularly useful for applications with
rapidly changing data structures or complex data requirements.
4. However, graph databases may not be suitable for all applications. For
example, they may not be the best choice for applications that require
simple queries or that deal primarily with data that can be easily
represented in a traditional relational database. Additionally, graph
databases may require more specialized knowledge and expertise to
use effectively.
Some popular graph databases include Neo4j, OrientDB, and
ArangoDB. These databases provide a range of features, including
support for different data models, scalability, and high availability, and
can be used for a wide variety of applications.

As we all know, a graph is a pictorial representation of data in the form
of nodes and relationships, which are represented by edges. A graph
database is a type of database used to represent the data in the form of a
graph. It has three components: nodes, relationships, and properties.
These components are used to model the data. The concept of a Graph
Database is based on the theory of graphs. It was introduced in the year
2000. Graph databases are commonly classed as NoSQL databases because data is stored
using nodes, relationships and properties instead of the tables of traditional databases.
A graph database is very useful for heavily interconnected data. Here
relationships between data are given priority and therefore the
relationships can be easily visualized. They are flexible as new data can
be added without hampering the old ones. They are useful in the fields
of social networking, fraud detection, AI Knowledge graphs etc.
The description of the components is as follows:
 Nodes: represent the objects or instances. They are equivalent to a
row in a relational database. A node basically acts as a vertex in a graph. The
nodes are grouped by applying a label to each member.

 Relationships: They are basically the edges in the graph. They have
a specific direction and type, and they form patterns in the data. They basically
establish relationships between nodes.
 Properties: They are the information associated with the nodes.
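
The three components can be sketched with ordinary Python data structures. This is an illustration only and not the storage format of any real graph database; the labels, relationship types and property values are hypothetical.

```python
# Nodes: each has a label (for grouping) and properties.
nodes = {
    1: {"label": "Person", "properties": {"name": "Asha"}},
    2: {"label": "Person", "properties": {"name": "Ravi"}},
    3: {"label": "Product", "properties": {"name": "Laptop"}},
}

# Relationships: directed edges with a type.
relationships = [
    {"start": 1, "end": 2, "type": "FRIEND_OF"},
    {"start": 2, "end": 3, "type": "LIKES"},
]


def neighbours(node_id, rel_type):
    # Follow outgoing edges of the given type from one node.
    return [
        r["end"]
        for r in relationships
        if r["start"] == node_id and r["type"] == rel_type
    ]


people = [n["properties"]["name"] for n in nodes.values() if n["label"] == "Person"]
liked_by_ravi = [nodes[i]["properties"]["name"] for i in neighbours(2, "LIKES")]
print(people, liked_by_ravi)
```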
Some examples of graph database software are Neo4j, Oracle NoSQL
DB, Graph base etc., of which Neo4j is the most popular.
In traditional relational databases, the relationships between data are not stored directly;
they are worked out with join operations at query time. In a graph database, the relationships
between data are prioritized. Nowadays most data is interconnected, with one data item
connected directly or indirectly to another. Since the concept of this database is
based on graph theory, it is flexible and works very fast for associative
data. Often data are interconnected to one another, which also helps to
establish further relationships. It works fast in the querying part as well
because with the help of relationships we can quickly find the desired
nodes. Join operations are not required in this database, which reduces the
cost. The relationships and properties are stored as first-class entities in
a graph database.
Graph databases allow organizations to connect the data with external
sources as well. Since organizations require a huge amount of data, it often
becomes cumbersome to store data in the form of tables. For instance,
if the organization wants to find a particular data item that is connected with
another data item in another table, a join operation is first performed between
the tables, and then the search for the data is done row by row. A graph
database solves this problem. It stores the relationships and
properties along with the data. So if the organization needs to search for
a particular data item, then with the help of relationships and properties the
nodes can be found without joining or traversing row by row.
Thus the cost of searching depends mainly on the relationships followed rather than on the total amount of data.
Types of Graph Databases:
 Property Graphs: These graphs are used for querying and analyzing
data by modelling the relationships among the data. A property graph comprises
vertices that hold information about a particular subject and edges
that denote the relationships. The vertices and edges have additional
attributes called properties.
 RDF Graphs: RDF stands for Resource Description Framework. It
focuses more on data integration. RDF graphs are used to represent complex
data with well-defined semantics. Each statement is represented by three elements,
two vertices and an edge, that reflect the subject, object and predicate of a
sentence. Every vertex and edge is identified by a URI (Uniform
Resource Identifier).
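
The subject, predicate, object structure can be sketched as Python tuples. The example.org URIs below are hypothetical placeholders, and no RDF library is assumed.

```python
EX = "http://example.org/"  # hypothetical namespace used only for illustration

# Each statement is a (subject, predicate, object) triple.
triples = {
    (EX + "asha", EX + "knows", EX + "ravi"),
    (EX + "asha", EX + "knows", EX + "meena"),
    (EX + "ravi", EX + "worksFor", EX + "acme"),
}


def objects(subject, predicate):
    # All objects linked to a subject by a given predicate.
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


known_by_asha = objects(EX + "asha", EX + "knows")
print(known_by_asha)
```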
When to Use Graph Database?
 Graph databases should be used for heavily interconnected data.
 It should be used when the amount of data is large and relationships are
present.
 It can be used to represent the cohesive picture of the data.
How Graph and Graph Databases Work?
Graph databases provide graph models. They allow users to perform
traversal queries since data is connected. Graph algorithms are also
applied to find patterns, paths and other relationships, thus enabling more
analysis of the data. The algorithms help to explore the neighboring
nodes, cluster vertices, and analyze relationships and patterns.
Countless joins are not required in this kind of database.
Example of Graph Database:
 Recommendation engines in e-commerce use graph databases to
provide customers with accurate recommendations and updates about
new products, thus increasing sales and satisfying the customer’s
desires.
 Social media companies use graph databases to find the “friends of
friends” or products that the user’s friends like, and send suggestions
accordingly to the user.
 Graph databases play a major role in detecting fraud. Users can create
a graph from the transactions between entities and store other important
information. Once created, running a simple query will help to
identify the fraud.
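
The "friends of friends" query is a two-step traversal. The sketch below uses a plain Python adjacency dictionary with hypothetical user names; a real graph database would express the same traversal in its own query language.

```python
# Undirected friendship graph as an adjacency dictionary.
friends = {
    "asha": {"ravi", "meena"},
    "ravi": {"asha", "kiran"},
    "meena": {"asha", "kiran", "john"},
    "kiran": {"ravi", "meena"},
    "john": {"meena"},
}


def friends_of_friends(user):
    direct = friends[user]
    reachable = set()
    for friend in direct:
        reachable |= friends[friend]  # second hop
    # Suggest only people who are not the user and not already friends.
    return reachable - direct - {user}


suggestions = sorted(friends_of_friends("asha"))
print(suggestions)
```

The traversal touches only the neighbours of the starting node, not the whole data set, which is the property the notes attribute to graph databases.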
Advantages of Graph Database:

 A potential advantage of a graph database is that relationships can be established
with external sources as well.
 No joins are required since relationships are already specified.
 Query cost depends on the concrete relationships and not on the amount
of data.
 It is flexible and agile.
 It is easy to manage the data in terms of a graph.
 Efficient data modeling: Graph databases allow for efficient data
modeling by representing data as nodes and edges. This allows for
more flexible and scalable data modeling than traditional relational
databases.
 Flexible relationships: Graph databases are designed to handle
complex relationships and interconnections between data elements.
This makes them well-suited for applications that require deep and
complex queries, such as social networks, recommendation engines,
and fraud detection systems.
 High performance: Graph databases are optimized for handling large
and complex datasets, making them well-suited for applications that
require high levels of performance and scalability.
 Scalability: Many graph databases can be scaled horizontally by
adding servers to the cluster to handle increased data volume or
traffic, although partitioning a highly connected graph across
servers can be difficult.
 Easy to use: Graph databases are typically easier to use than
traditional relational databases. They often have a simpler data model
and query language, and can be easier to maintain and scale.
Disadvantages of Graph Database:
 Searches often become slower when relationships are complex.
 The query language is platform dependent.
 They are generally inappropriate for transactional data.
 They have a smaller user base.
 Limited use cases: Graph databases are not suitable for all
applications. They may not be the best choice for applications that
require simple queries or that deal primarily with data that can be
easily represented in a traditional relational database.



 Specialized knowledge: Graph databases may require specialized
knowledge and expertise to use effectively, including knowledge of
graph theory and algorithms.
 Immature technology: The technology for graph databases is
relatively new and still evolving, which means that it may not be as
stable or well-supported as traditional relational databases.
 Integration with other tools: Graph databases may not be as well-
integrated with other tools and systems as traditional relational
databases, which can make it more difficult to use them in conjunction
with other technologies.
Future of Graph Database:
A graph database is an excellent tool for storing connected data, but it
cannot completely replace the traditional database. It is designed for a
particular kind of data: sets of highly interconnected items. Although
graph database technology is still developing, it is becoming
increasingly important as businesses and organizations work with big
data and need complex analysis. These databases are therefore well
placed to serve today's needs and tomorrow's success.

Distributed Database:

 A distributed database represents multiple interconnected databases
spread out across several sites connected by a network. Since the
databases are all connected, they appear as a single database to the
users.
 Distributed databases utilize multiple nodes. They scale horizontally
to form a distributed system. More nodes in the system provide
more computing power, offer greater availability, and remove
the single point of failure.
 Different parts of the distributed database are stored in several
physical locations, and the processing requirements are distributed
among processors on multiple database nodes.



 A centralized distributed database management system (DDBMS)
manages the distributed data as if it were stored in one physical
location. The DDBMS synchronizes all data operations among databases
and ensures that updates in one database are automatically reflected
in the databases at other sites.

Distributed Database Features

Some general features of distributed databases are:

 Location independence - Data is physically stored at multiple sites
and managed by an independent DDBMS.
 Distributed query processing - Distributed databases answer
queries in a distributed environment that manages data at multiple
sites. High-level queries are transformed into a query execution plan
for simpler management.
 Distributed transaction management - Provides a consistent
distributed database through commit protocols, distributed
concurrency control techniques, and distributed recovery methods
when many transactions run concurrently or failures occur.
 Seamless integration - Databases in a collection usually represent
a single logical database, and they are interconnected.
 Network linking - All databases in a collection are linked by a
network and communicate with each other.
 Transaction processing - Distributed databases incorporate
transaction processing. A transaction is a program unit consisting
of one or more database operations. It is atomic: either all of
its operations are executed or none of them are.
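
The all-or-nothing behaviour of a transaction can be tried out with
Python's built-in sqlite3 module. This is a minimal single-site sketch;
the account table and the transfer function are assumptions made for the
example, and a real distributed database would additionally need a
commit protocol to coordinate several sites.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the credit above was undone too.
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))


print(transfer(conn, "A", "B", 30), balances(conn))   # True {'A': 70, 'B': 80}
print(transfer(conn, "A", "B", 500), balances(conn))  # False {'A': 70, 'B': 80}
```

In the second call the credit to B succeeds first, but the debit from A
violates the constraint, so the whole transaction is rolled back and
neither balance changes.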

Distributed Database Types

There are two types of distributed databases:

 Homogeneous
 Heterogeneous
Homogeneous

 A homogeneous distributed database is a network of identical
databases stored on multiple sites. The sites have the same
operating system, DDBMS, and data structure, making them easily
manageable.

 Homogeneous databases allow users to access data from each
of the databases seamlessly.
The following diagram shows an example of a homogeneous
database:
Heterogeneous

 A heterogeneous distributed database uses different schemas,
operating systems, DDBMS, and different data models.
 In a heterogeneous distributed database, a particular site can be
completely unaware of other sites, which limits cooperation in
processing user requests. Because of this limitation, translations
are required to establish communication between sites.
 The following diagram shows an example of a heterogeneous
database:

Distributed Database Storage

Distributed database storage is managed in two ways:

 Replication
 Fragmentation

Replication

In database replication, the systems store copies of data on different
sites. If an entire database is available on multiple sites, it is a fully
redundant database.

The advantage of database replication is that it increases data
availability on different sites and allows for parallel query requests to be
processed.
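
The availability benefit of replication can be illustrated with a toy
key-value store in Python. The class ReplicatedStore and the site names
are hypothetical, and the write rule here (apply the update at every
site, refuse if any site is down) is a deliberate simplification of what
a real DDBMS would do.

```python
class ReplicatedStore:
    """Toy fully replicated key-value store: every site holds a full copy."""

    def __init__(self, site_names):
        self.sites = {name: {} for name in site_names}
        self.down = set()  # names of sites that are currently unavailable

    def write(self, key, value):
        # Simplified synchronous replication: update every copy or none.
        # A real system needs a protocol for sites that are unreachable.
        if self.down:
            raise RuntimeError("cannot replicate: some sites are down")
        for copy in self.sites.values():
            copy[key] = value

    def read(self, key, preferred_site):
        # Read locally if possible, otherwise fall back to any live replica.
        others = [s for s in self.sites if s != preferred_site]
        for site in [preferred_site] + others:
            if site not in self.down:
                return site, self.sites[site][key]
        raise RuntimeError("no replica available")


store = ReplicatedStore(["site1", "site2", "site3"])
store.write("roll_42", "Ravi")
print(store.read("roll_42", "site1"))  # ('site1', 'Ravi')

store.down.add("site1")                # simulate a failure of site1
print(store.read("roll_42", "site1"))  # ('site2', 'Ravi')
```

The read still succeeds after site1 fails because another site holds a
copy. The cost is that every write has to reach every copy.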



Fragmentation

In fragmentation, the relations are split into smaller parts called
fragments. Each fragment is stored at the site where it is required.

The prerequisite for fragmentation is to make sure that the fragments can
later be reconstructed into the original relation without losing data.

The advantage of fragmentation is that there are no data copies, which
prevents data inconsistency.

There are two types of fragmentation:

 Horizontal fragmentation - The relation is split into groups of rows
(tuples), and each row is assigned to one fragment.



 Vertical fragmentation - The relation schema is fragmented into
smaller schemas, and each fragment contains a common candidate
key to guarantee a lossless join.
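
Both kinds of fragmentation, and the requirement that the original
relation can be rebuilt, can be sketched in Python. The STUDENT relation
and the helper names below are invented for illustration; rows are plain
dictionaries rather than real database tables.

```python
# Hypothetical relation: one dictionary per row, "id" is the key.
STUDENT = [
    {"id": 1, "name": "Anil", "city": "Nandyal", "marks": 81},
    {"id": 2, "name": "Bhanu", "city": "Kurnool", "marks": 67},
    {"id": 3, "name": "Charan", "city": "Nandyal", "marks": 74},
]


def horizontal_fragments(rows, column):
    """Split rows into groups by the value of `column` (one fragment per value)."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[column], []).append(row)
    return fragments


def vertical_fragment(rows, key, columns):
    """Keep only `columns`, plus the key so the fragment can be rejoined."""
    return [{c: row[c] for c in (key, *columns)} for row in rows]


def rejoin(left, right, key):
    """Rebuild full rows by joining two vertical fragments on the shared key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


# Horizontal: each site stores the rows for its own city.
by_city = horizontal_fragments(STUDENT, "city")
union = sorted((r for frag in by_city.values() for r in frag),
               key=lambda r: r["id"])
print(union == STUDENT)  # True: the union of the fragments is the relation

# Vertical: two column groups, both carrying the key "id".
personal = vertical_fragment(STUDENT, "id", ["name", "city"])
academic = vertical_fragment(STUDENT, "id", ["marks"])
print(rejoin(personal, academic, "id") == STUDENT)  # True: lossless join
```

Horizontal fragments are recombined with a union, while vertical
fragments are recombined with a join on the shared key. If the key were
dropped from a vertical fragment, the rows could no longer be matched up.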

Advantages

 Modular Development. Modular development of a distributed database
implies that a system can be expanded to new locations or units by adding
new servers and data to the existing setup and connecting them to the
distributed system, without interrupting the functioning of the existing
databases.

 Reliability. Distributed databases offer greater reliability in contrast to
centralized databases. In case of a database failure in a centralized database,
the system comes to a complete stop. In a distributed database, the system
functions even when failures occur, only delivering reduced performance until
the issue is resolved.

 Lower Communication Cost. Locally storing data reduces communication
costs for data manipulation in distributed databases. Local data storage is not
possible in centralized databases.

 Better Response. Efficient data distribution in a distributed database system
provides a faster response when user requests are met locally. In centralized
databases, all user requests pass through the central machine, which processes
every one of them. The result is an increase in response time, especially when
there are many queries.

Disadvantages

 Costly Software. Ensuring data transparency and coordination across
multiple sites often requires using expensive software in a distributed database
system.

 Large Overhead. Many operations on multiple sites require numerous
calculations and constant synchronization when database replication is used,
causing a lot of processing overhead.

 Data Integrity. A possible issue when using database replication is data
integrity, which can be compromised when data is updated at multiple sites.
 Improper Data Distribution. Responsiveness to user requests largely
depends on proper data distribution. That means responsiveness can be
reduced if data is not correctly distributed across multiple sites.

NewSQL Database:

 NewSQL databases aim to combine the speed and performance of
NoSQL with the reliability of SQL. They emphasize features that
are not available in NoSQL databases and offer strong
dependability. The term NewSQL was coined in 2011 for systems
that scale online transaction processing (OLTP) workloads while
maintaining atomicity, consistency, isolation and durability
(ACID) guarantees. It was devised to work around the limitations
of traditional SQL-based systems, and it aims to address the
shortcomings of NoSQL by reincorporating some relational database
features.
 NewSQL databases address the problems associated with
traditional online transaction processing. These databases use
SQL but differ in their internal design. They can take in new
information and perform many transactions at the same time. The
main categories of NewSQL systems are SQL engines, new
architectures, transparent sharding middleware and
Database-as-a-service.
NewSQL Database Features
The following are some key features of NewSQL databases:
 Concurrency Control: This feature allows simultaneous transactions
to be performed while maintaining data integrity. It tackles the
problems that may occur when multiple users are accessing or
modifying the data simultaneously.
 Replication: With this feature, users can create copies of the
database and store them at a remote site alongside the main site. The
replica can be updated at the same time as the main database.
 Crash Recovery: This mechanism enables the system to recover the
data and return to a consistent state whenever the system crashes.



 Secondary Indexes: With the secondary index feature, database users
can look up data using a value other than the primary key.
 Partitioning/Sharding: A NewSQL system divides the database into
different subsets known as partitions or shards. Tables are split
into fragments whose boundaries are based on column values.
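
Sharding by column value and lookup through a secondary index can be
sketched together in Python. The range boundaries, the ShardedTable
class and the email column are all assumptions chosen for the example;
real NewSQL systems pick and rebalance shard boundaries themselves.

```python
import bisect

# Hypothetical range boundaries on the sharding column (customer id):
# shard 0 holds ids below 1000, shard 1 holds 1000 to 1999, shard 2 the rest.
BOUNDARIES = [1000, 2000]


def shard_for(customer_id):
    """Pick the shard whose range contains customer_id."""
    return bisect.bisect_right(BOUNDARIES, customer_id)


class ShardedTable:
    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]  # primary key -> row
        self.by_email = {}  # secondary index: email -> primary key

    def insert(self, row):
        self.shards[shard_for(row["id"])][row["id"]] = row
        self.by_email[row["email"]] = row["id"]

    def get(self, customer_id):
        """Primary-key lookup: only one shard has to be consulted."""
        return self.shards[shard_for(customer_id)].get(customer_id)

    def get_by_email(self, email):
        """Secondary-index lookup: find the key first, then the row."""
        pk = self.by_email.get(email)
        return None if pk is None else self.get(pk)


table = ShardedTable(len(BOUNDARIES) + 1)
table.insert({"id": 5, "email": "a@example.com"})
table.insert({"id": 1500, "email": "b@example.com"})
table.insert({"id": 2500, "email": "c@example.com"})

print([len(s) for s in table.shards])        # [1, 1, 1]
print(table.get_by_email("b@example.com"))   # {'id': 1500, 'email': 'b@example.com'}
```

Without the secondary index, finding a customer by email would mean
scanning every shard. With it, the lookup goes to a single shard.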
NewSQL Database Advantages and Disadvantages
Every new system or technology offers some advantages over its
predecessors, but it also has limitations. This section lists the
advantages and disadvantages of NewSQL databases.
NewSQL Database Advantages
 Improves on traditional systems with its concurrency control feature
 It preserves the ACID properties of databases
 It brings the advantages of SQL and NoSQL together
 Provides synchronous updates of data over the WAN
 Easy to switch between the users need and the type
 High availability and strong data durability
 Faster query processing time
NewSQL Database Disadvantages
 Not standardized
 In-memory architecture may be unsuitable for handling larger volumes
of data
 Not suited to general-purpose use
 Provides limited access to traditional SQL systems
