DBMS
Class - T.Y.PLD
(Division-)
AY 2023-2024
SEM-I
Unit – IV
MIT School of Computing
Department of Computer Science & Engineering
Syllabus
● Database Architectures: Centralized and Client-Server Architectures.
● Database Connectivity using Java/Python with SQL and NoSQL databases.
● Introduction to Parallel Databases, Architecture of Parallel
Databases.
● Introduction to Distributed Databases, Distributed
Transactions.
● 2PC, 3PC protocols
● Introduction to Data Mining and clustering.
Centralized Systems
• A centralized database is stored, located, and maintained at a single location.
• It is modified and managed from that same location, which is typically a single database system or a centralized computer system.
Advantages:
• Because all data is stored at a single location, it is easier to access and coordinate.
• Data redundancy is minimal, since all data is kept in one place.
Disadvantages:
Interconnection Architectures
● Bus. System components send data on and receive data from a single communication bus. This design does not scale well with increasing parallelism, since the bus can handle communication from only one component at a time.
● Mesh. Components are arranged as nodes in a grid, and each component is connected to all adjacent components. In a two-dimensional mesh each interior node connects to four adjacent nodes, while in a three-dimensional mesh each interior node connects to six adjacent nodes (nodes on the boundary have fewer). Figure b shows a two-dimensional mesh.
● Hypercube. Components are numbered in binary; two components are connected if their binary representations differ in exactly one bit. With n components, a message from a component can reach any other component by going through at most log2(n) links.
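The mesh and hypercube connection rules can be illustrated with a short Python sketch. The function names here are hypothetical and chosen only for illustration; the sketch assumes hypercube components are labelled 0 to 2^d - 1 and mesh components are addressed by (row, column).

```python
def hypercube_neighbors(node, dims):
    """Nodes whose binary labels differ from `node` in exactly one bit."""
    # Flipping bit i with XOR gives the neighbor along dimension i.
    return [node ^ (1 << bit) for bit in range(dims)]


def hypercube_hops(src, dst):
    """Minimum number of links between two nodes.

    Each link fixes one differing bit, so the hop count is the number
    of bit positions in which the two labels differ.
    """
    return bin(src ^ dst).count("1")


def mesh_neighbors_2d(row, col, rows, cols):
    """Adjacent nodes of (row, col) in a rows x cols two-dimensional mesh."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    # Boundary nodes have fewer than four neighbors.
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


if __name__ == "__main__":
    dims = 3  # 2**3 = 8 components, so at most log2(8) = 3 links per message
    print(hypercube_neighbors(0b101, dims))
    print(hypercube_hops(0b000, 0b111))
    print(mesh_neighbors_2d(1, 1, 3, 3))
```

Because two labels of d bits can differ in at most d positions, the hop count never exceeds d = log2(n), which matches the bound stated for the hypercube.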
Parallel Database Architectures
There are several architectural models for parallel machines. Among the
most prominent ones are those in Figure
It is a blend of technologies and components which aids the strategic use of data.
• Decision trees
• Regression
• Association rules
• Support
• Confidence
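Support and confidence are the usual measures attached to an association rule. A minimal Python sketch, assuming the standard definitions (support is the fraction of transactions containing an itemset; confidence of A → B is support(A ∪ B) / support(A)), could look like this. The function names and the shopping-basket data are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        # The rule is undefined if the antecedent never occurs.
        raise ValueError("antecedent does not occur in any transaction")
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base


if __name__ == "__main__":
    # Hypothetical baskets, for illustration only.
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    print(support(baskets, {"bread", "milk"}))
    print(confidence(baskets, {"bread"}, {"milk"}))
```

In this toy data, bread and milk appear together in 2 of 4 baskets, and bread appears in 3 of 4, so the rule bread → milk has support 2/4 and confidence 2/3.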
Clustering
⚫ Clustering: intuitively, finding clusters of points in the given data such that similar points lie in the same cluster.
⚫ Can be formalized using distance metrics in several ways:
● Group points into k sets (for a given k) such that the average distance of points from the centroid of their assigned group is minimized.
● Centroid: the point defined by taking the average of the coordinates in each dimension.
● Data mining systems aim at clustering techniques that can handle
very large data sets
● E.g., the BIRCH clustering algorithm
● k-means
● fuzzy c-means
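The centroid definition and the k-means idea can be sketched in plain Python. This is a minimal illustration of the common Lloyd-style iteration, not a production implementation. It assumes Euclidean distance and random initial centroids picked from the data, and it minimizes squared distance to the centroid, which is the usual k-means objective and a close relative of the average-distance formulation above. All function names are hypothetical.

```python
import random


def centroid(points):
    """Point obtained by averaging the coordinates in each dimension."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def distance(p, q):
    """Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def k_means(points, k, iterations=100, seed=0):
    """Group `points` into k clusters; returns (centroids, clusters)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: start from k data points
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: recompute each centroid; keep the old one if a cluster is empty.
        new_centroids = [
            centroid(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


if __name__ == "__main__":
    # Hypothetical two-dimensional data with two well-separated groups.
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, groups = k_means(data, k=2)
    print(centers)
    print(groups)
```

The result depends on the initial centroids, so real systems typically run several restarts or use a smarter initialization. Fuzzy c-means differs in that each point receives a degree of membership in every cluster rather than a single assignment.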