Unit1 Ddbms Notes PDF

1. A distributed database consists of multiple databases spread across different physical locations that are connected. This allows data to be independently managed at different locations while still allowing communication between databases.
2. Key advantages of distributed databases include improved reliability if one location fails, faster response times by accessing local data, easier expansion to new locations, and reduced data movement across networks.
3. Distributed database management systems allow data to be sharded, or distributed, across different nodes in various architectures such as shared memory, shared disk, or shared nothing. This can improve scalability and performance and reduce costs.

Uploaded by

keshav prasaath
Copyright: © All Rights Reserved

Unit 1

1. Difference between Centralized Database and Distributed Database


(i). Centralized Database :
A centralized database is stored, located, and maintained at a single location, typically a central computer
system or database server. It is modified and managed from that location, and users reach it over a network
connection (LAN, WAN, etc.). Centralized databases are mainly used by institutions and organizations.

Advantages –
• Since all data is stored at a single location, it is easier to access and coordinate.
• A centralized database has minimal data redundancy, since all data is stored in one place.
• It is generally cheaper to set up and maintain than a distributed database.

Disadvantages –

• Data traffic is heavier, because every request goes to the single central site.
• If the central system fails, the entire database becomes unavailable, and data may be lost if no backup exists.

(ii). Distributed Database :

A distributed database consists of multiple interconnected databases spread across different physical locations.
The data stored at each location can be managed independently of the other locations, and the databases
communicate with one another over a computer network.
Advantages –

• The database can be expanded easily, since data is already spread across different physical locations.
• It can be accessed from different networks.
• It is generally regarded as more secure than a centralized database, because a problem at one site need not
affect the data held at the others.

Disadvantages –

• The database is costly, and its complexity makes it difficult to maintain.
• It is difficult to provide a uniform view to users, since the data is spread across different physical locations.

Difference between Centralized database and Distributed database :

S.NO. | Centralized database | Distributed database
1. | It is a database that is stored, located, and maintained at a single location only. | It is a database that consists of multiple databases which are connected with each other and spread across different physical locations.
2. | Data access time with multiple users is higher. | Data access time with multiple users is lower.
3. | Management, modification, and backup are easier, as all the data is present at the same location. | Management, modification, and backup are very difficult, as the data is spread across different physical locations.
4. | It provides a uniform and complete view to the user. | Because it is spread across different locations, it is difficult to provide a uniform view to the user.
5. | It has more data consistency than a distributed database. | It may hold replicated data, so data consistency is lower.
6. | Users cannot access the database if it fails. | If one database fails, users still have access to the other databases.
7. | It is less costly. | It is very expensive.

2. Why distributed database?

• Increased Reliability and availability

Reliability is broadly defined as the probability that a system is running at a certain point in time, whereas
availability is the probability that the system is continuously available during a time interval. When the data
and DBMS software are distributed over several sites, one site may fail while the other sites continue to operate.
Only the data that exists at the failed site becomes inaccessible, which improves both reliability and
availability.
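
As a rough illustration of why distribution helps, the following Python sketch computes the probability that at least one of several replicated sites is up. It assumes, purely for illustration, that sites fail independently and that each site is up with the same probability; the function name and the 0.95 figure are hypothetical and do not come from the notes.

```python
def availability_with_replicas(site_availability: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent sites is up."""
    if not 0.0 <= site_availability <= 1.0:
        raise ValueError("site_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # All replicas are down at once with probability (1 - a) ** n.
    return 1.0 - (1.0 - site_availability) ** replicas


if __name__ == "__main__":
    # Hypothetical per-site availability of 0.95.
    for n in (1, 2, 3):
        print(n, round(availability_with_replicas(0.95, n), 6))
```

Under these assumptions each extra copy shrinks the chance that the data is unreachable, which is the effect the paragraph above describes.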


• Better Response

If data is distributed in an efficient manner, user requests can be met from local data, which gives a faster
response. In centralized systems, by contrast, all queries have to pass through the central computer for
processing, which increases the response time.

• Modular Development

Expanding a centralized database system to new locations or new units requires substantial effort and disrupts
existing operations. In a distributed database, the work simply requires adding new computers and local data at
the new site and then connecting them to the distributed system, with no interruption to current functions.

• Less Data Movement over Network

The more replicas of a relation there are, the greater the chance that the required data is found at the site where
the transaction is executing. Hence, data replication reduces the movement of data among sites and increases the
speed of processing.

• Smaller Databases are Easier to Manage

Production databases must be fully managed for regular backups, database optimization, and other common tasks.
With a single large database, these routine tasks can be very difficult to accomplish, if only in terms of the time
window required for completion. Routine table and index optimizations can stretch from hours to days, in some cases
making regular maintenance infeasible. With the sharding approach, each individual “shard” can be maintained
independently, which is far more manageable and allows such maintenance tasks to run in parallel.
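
To make the idea of a shard concrete, here is a minimal Python sketch of hash-based shard routing. It is only one possible scheme (range-based and directory-based routing are also common), and the function names, the key format, and the shard count are assumptions made for this example.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards


def distribute(keys, num_shards):
    """Group keys by the shard that would store them."""
    shards = {i: [] for i in range(num_shards)}
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards


if __name__ == "__main__":
    accounts = [f"A_{n}" for n in range(101, 113)]  # hypothetical account keys
    for shard_id, members in distribute(accounts, 3).items():
        print(shard_id, members)
```

Because every key always maps to the same shard, each shard can be backed up or optimized on its own. Note that changing the number of shards in this simple modulo scheme moves most keys, which is why production systems often use more elaborate schemes.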

• Smaller Databases are Faster

The scalability of sharding is apparent: it is achieved by distributing processing across multiple shards and
servers in the network. What is less apparent is that each individual shard database will outperform a single
large database because of its smaller size. By hosting each shard database on its own server, the ratio between
memory and data on disk is properly balanced, which reduces disk I/O and makes better use of system resources.
This results in less contention, greater join performance, faster index searches, and fewer database locks.
Therefore, not only can a sharded system scale to new levels of capacity, but individual transaction performance
benefits as well.

• Database Sharding can Reduce Costs

Most database sharding implementations take advantage of low-cost open-source databases and commodity
databases. The technique can also take full advantage of reasonably priced “workgroup” versions of many
commercial databases. Sharding works well with commodity multi-core server hardware, which is far less
expensive than high-end multi-CPU servers and expensive storage area networks (SANs). The overall
reduction in cost from savings in license fees, software maintenance, and hardware investment is substantial, in
some cases 70% when compared to traditional solutions.

3. Distributed Database Management Systems:

A distributed database system is one in which the data belonging to a single logical database is distributed
across two or more physical databases to ensure reliability and availability. The data can be distributed across
multiple nodes using several different system architectures.

Shared Memory

CPUs have access to a common memory address space via a fast interconnect.

• Each processor has a global view of all the in-memory data structures.
• Each DBMS instance on a processor has to “know” about the other instances.

Shared Disk

All CPUs can access a single logical disk directly via an interconnect, but each has its own private memory.

• The execution layer can be scaled independently of the storage layer.
• CPUs must send messages to one another to learn about their current state.

Shared Nothing

Each DBMS instance has its own CPU, memory, and disk. Nodes communicate with each other only via a network.

• Hard to increase capacity.
• Hard to ensure consistency.
• Better performance and efficiency.

4. Review of database
Depending upon the usage requirements, the following types of databases are available in the market −

• Centralised database.
• Distributed database.
• Personal database.
• End-user database.
• Commercial database.
• NoSQL database.
• Operational database.
• Relational database.
• Cloud database.
• Object-oriented database.
• Graph database.

Let us explain all of them:


(i). Centralised Database
The data is stored at a centralized location, and users from different locations can access it. This type of
database contains application procedures that help users access the data even from a remote location. Various
authentication procedures are applied to verify and validate end users; likewise, the application procedures
provide a registration number that keeps track of data usage. The local area office handles this.

(ii). Distributed Database
In contrast to the centralized database concept, a distributed database combines contributions from the common
database with information captured by local computers. The data is not held in one place but is distributed across
various sites of an organization. These sites are connected by communication links, which let them access the
distributed data easily.

You can think of a distributed database as one in which various portions of the database are stored in multiple
physical locations, along with application procedures that are replicated and distributed among various points
in a network.
There are two kinds of distributed database: homogeneous and heterogeneous. In a homogeneous DDB, all physical
locations have the same underlying hardware and run the same operating systems and application procedures. In a
heterogeneous DDB, the operating systems, underlying hardware, and application procedures can differ from site
to site.

(iii). Personal Database
Data is collected and stored on personal computers, which are small and easily manageable. The data is generally
used by the same department of an organization and is accessed by a small group of people.
(iv). End User Database
The end user is usually not concerned with the transactions or operations performed at various levels and is only
aware of the product, which may be a piece of software or an application. This is therefore a shared database
designed specifically for end users, such as managers at different levels. It holds a summary of the whole
information.
(v). Commercial Database
These are paid versions of very large databases, designed for users who want to access the information for help.
They are subject specific, and individual users cannot afford to maintain such large amounts of information
themselves. Access to such databases is provided through commercial links.

(vi). NoSQL Database

These are used for large sets of distributed data. Some big data performance issues that are not handled
effectively by relational databases are easily managed by NoSQL databases. They are very efficient at analyzing
large volumes of unstructured data that may be stored on multiple virtual servers in the cloud.
(vii). Operational Database
Information related to the operations of an enterprise is stored in this database. Functional lines such as
marketing, employee relations, and customer service require this kind of database.

(viii). Relational Databases

These databases are organized as a set of tables in which data fits into pre-defined categories. Each table
consists of rows and columns: a column holds the data for a specific category, and each row contains one instance
of the data defined by those categories. The Structured Query Language (SQL) is the standard user and application
program interface for a relational database.
Various simple operations can be applied to the tables, which makes it easy to extend these databases, to join
two tables that share a common relation, and to adapt existing applications.
(ix). Cloud Databases
Nowadays data is increasingly stored in the cloud, also known as a virtual environment, whether in a hybrid,
public, or private cloud. A cloud database is a database that has been optimized or built for such a
virtualized environment. Its benefits include the ability to pay for storage capacity and bandwidth on a per-use
basis, scalability on demand, and high availability.

A cloud database also gives enterprises the opportunity to support business applications in a software-as-a-service
deployment.
(x). Object-Oriented Databases
An object-oriented database combines object-oriented programming with database capabilities. Items created with
object-oriented programming languages such as C++ and Java can be stored in relational databases, but
object-oriented databases are better suited to such items.

An object-oriented database is organized around objects rather than actions, and data rather than logic. For example,
a multimedia record in a relational database can be a definable data object, as opposed to an alphanumeric value.

(xi). Graph Databases
A graph is a collection of nodes and edges in which each node represents an entity and each edge describes
the relationship between entities. A graph-oriented database, or graph database, is a type of NoSQL database that
uses graph theory to store, map, and query relationships.

Graph databases are mainly used for analyzing interconnections. For example, companies might use a graph database
to mine data about customers from social media.
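
The node-and-edge model can be sketched in a few lines of Python. This is not how any particular graph database stores data; the adjacency dictionary, the names in it, and the helper function are all hypothetical and only illustrate the kind of relationship query such a system answers.

```python
from collections import deque

# Hypothetical "follows" relationships: node -> set of neighbouring nodes.
graph = {
    "alice": {"bob", "carol"},
    "bob": {"dave"},
    "carol": {"dave"},
    "dave": set(),
}


def reachable_within(graph, start, max_hops):
    """Return the nodes reachable from `start` in at most `max_hops` edges."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, hops + 1))
    seen.discard(start)
    return seen


if __name__ == "__main__":
    print(sorted(reachable_within(graph, "alice", 2)))
```

A breadth-first traversal like this corresponds to a "friends of friends" query, the sort of interconnection analysis mentioned above.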
5. Review of Networks:
A system of interconnected computers and computerized peripherals such as printers is called a computer network.
This interconnection facilitates information sharing among the computers, which may be connected to each
other by either wired or wireless media.
Classification of Computer Networks
Computer networks are classified based on various factors. They include:

• Geographical span
• Inter-connectivity
• Administration
• Architecture

(i) Geographical Span

Geographically, a network falls into one of the following categories:

• It may span your table, among Bluetooth-enabled devices, ranging no more than a few meters.
• It may span a whole building, including intermediate devices to connect all floors.
• It may span a whole city.
• It may span multiple cities or provinces.
• It may be one network covering the whole world.
(ii) Inter-Connectivity
The components of a network can be connected to each other in different ways. By connectedness we mean
logically, physically, or both.

• Every device can be connected to every other device on the network, making the network a mesh.
• All devices can be connected to a single medium while being geographically disconnected, creating a bus-like
structure.
• Each device can be connected to its left and right peers only, creating a linear structure.
• All devices can be connected to a single central device, creating a star-like structure.
• Devices can be connected arbitrarily using all of the previous ways, resulting in a hybrid
structure.
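
One practical difference between these structures is how many point-to-point links they need. The Python sketch below counts links for three of them; it assumes the central device of a star is one of the counted devices, and the function name and topology labels are chosen for this example only. A bus is left out because its devices share one medium rather than individual links.

```python
def link_count(topology: str, devices: int) -> int:
    """Number of point-to-point links needed to connect `devices` nodes."""
    if devices < 2:
        return 0
    if topology == "mesh":    # every device linked to every other device
        return devices * (devices - 1) // 2
    if topology == "star":    # every other device linked to one central device
        return devices - 1
    if topology == "linear":  # each device linked to its left and right peers
        return devices - 1
    raise ValueError(f"unknown topology: {topology}")


if __name__ == "__main__":
    for name in ("mesh", "star", "linear"):
        print(name, link_count(name, 5))
```

The quadratic growth of the mesh explains why full meshes are rare beyond a small number of devices.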
(iii) Administration
From an administrator’s point of view, a network can be a private network, which belongs to a single autonomous
system and cannot be accessed outside its physical or logical domain, or a public network, which can be accessed
by all.
(iv) Network Architecture
Computer networks can be divided into types such as client-server, peer-to-peer, or hybrid,
depending upon their architecture.

• One or more systems can act as the server. The others, being clients, send requests to the server, which
takes and processes requests on behalf of the clients.
• Two systems can be connected point-to-point, or back-to-back. They both reside at the same
level and are called peers.
• A hybrid network involves the network architecture of both of the above types.
Network Applications
Computer systems and peripherals are connected to form a network. Networks provide numerous advantages:

• Resource sharing such as printers and storage devices
• Exchange of information by means of e-mail and FTP
• Information sharing by using Web or Internet
• Interaction with other users using dynamic web pages
• IP phones
• Video conferences
• Parallel computing
• Instant messaging
Types of Computer Network Models:
(i) OSI Model:

The OSI Model has seven layers, namely:

• Application Layer:
o The application layer is in charge of providing an interface to the application user. This layer contains
protocols that communicate directly with the user.
• Presentation Layer:
o This layer deals with the appearance and format of the data on the end devices.
• Session Layer:
o This layer is responsible for maintaining connections between remote hosts.
o For example, after user/password authentication is complete, the remote host retains the session and does not
request authentication again within that time period.
• Transport Layer:
o The Transport Layer is in charge of end-to-end delivery between hosts.
• Network Layer:
o This layer is in charge of assigning addresses and uniquely addressing hosts in a network.
• Data Link Layer:
o The Data Link Layer is in charge of reading and writing data from and onto the line. At this layer, link
problems are identified.
• Physical Layer:
o This layer describes the hardware, cabling, wiring, power output, pulse rate, and so on.

(ii). TCP/IP Model:

The TCP/IP Model has four layers, namely:

• Application Layer:
o The application layer specifies the protocols that allow users to communicate with the network. FTP and HTTP
are examples of such protocols.
• Transport Layer:
o The Transport Layer describes how data should move between hosts.
o The Transmission Control Protocol (TCP) is the most important protocol at this layer.
o This layer guarantees that data transferred between hosts arrives in the correct sequence and is in charge of
end-to-end delivery.
• Internet Layer:
o The Internet Protocol (IP) operates on this layer.
o This layer makes host addressing and identification easier.
o This layer is responsible for routing.
• Network Interface Layer:
o This layer offers the means for delivering and receiving real data.
o This layer, unlike its OSI Model equivalent, is independent of the underlying network architecture and
hardware.

What is the need for Layered Architecture?

• Divide-and-conquer method:
o The divide-and-conquer technique divides unmanageable tasks into small, manageable jobs during the
design phase.
o In a nutshell, this technique minimises the complexity of the design.
• Modularity:
o Layered architecture has a higher level of modularity.
o Modularity provides layer independence, making the design easier to comprehend and apply.
• Simple to modify:
o It provides layer independence, so changes to one layer’s implementation have no effect on other
layers.
• Simple to test:
o Each layer of the layered architecture can be studied and tested separately.
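
Layer independence can be illustrated with a toy encapsulation routine in Python. Real protocol headers are binary structures with many fields; the bracketed text headers, the layer list, and the function names here are simplifications invented for this sketch, using the four TCP/IP layers named earlier.

```python
LAYERS = ["application", "transport", "internet", "network-interface"]


def encapsulate(payload: str) -> str:
    """Wrap the payload in one header per layer, starting at the top layer."""
    message = payload
    for layer in LAYERS:
        message = f"[{layer}]{message}"
    return message


def decapsulate(message: str) -> str:
    """Strip the headers in reverse order, as the receiving stack would."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]"
        if not message.startswith(header):
            raise ValueError(f"expected {layer} header")
        message = message[len(header):]
    return message


if __name__ == "__main__":
    frame = encapsulate("hello")
    print(frame)
    print(decapsulate(frame))
```

Each layer only adds or removes its own header, so one layer's format could change without touching the others, which is the "simple to modify" property listed above.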

6. Level of distribution transparency:

Distribution transparency is the property of distributed databases by virtue of which the internal details of the
distribution are hidden from the users. The DDBMS designer may choose to fragment tables, replicate the fragments,
and store them at different sites. However, since users are unaware of these details, they find the distributed
database as easy to use as any centralized database.
The three dimensions of distribution transparency are −

• Location transparency
• Fragmentation transparency
• Replication transparency
Location Transparency
Location transparency ensures that the user can query any table(s) or fragment(s) of a table as if they were stored
locally at the user’s site. The fact that the table or its fragments are stored at a remote site in the distributed
database system should be completely hidden from the end user. The addresses of the remote site(s) and the access
mechanisms are completely hidden.
To provide location transparency, the DDBMS should have access to an updated and accurate data dictionary
and DDBMS directory, which contain the details of the locations of data.
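
The role of the directory can be sketched as a simple lookup in Python. Everything in this example is hypothetical: the fragment names, the site names, and the in-memory dictionaries that stand in for remote databases. It only shows that the caller names a table and never a site.

```python
# Hypothetical directory: table or fragment name -> site that stores it.
DIRECTORY = {"ACCOUNT_PUNE": "site_pune", "ACCOUNT_DELHI": "site_delhi"}

# Hypothetical per-site storage standing in for remote databases.
SITES = {
    "site_pune": {"ACCOUNT_PUNE": [("A_101", 5000, "Pune")]},
    "site_delhi": {"ACCOUNT_DELHI": [("A_103", 25000, "Delhi")]},
}


def query(table: str):
    """Return the rows of `table`; the caller never names a site."""
    site = DIRECTORY[table]      # the directory lookup hides the location
    return SITES[site][table]    # stands in for a fetch from the remote site


if __name__ == "__main__":
    print(query("ACCOUNT_DELHI"))
```

If a fragment moves to another site, only the directory entry changes and callers of `query` are unaffected.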
Fragmentation Transparency
Fragmentation transparency enables users to query upon any table as if it were unfragmented. Thus, it hides the fact
that the table the user is querying on is actually a fragment or union of some fragments. It also conceals the fact that
the fragments are located at diverse sites.
This is somewhat similar to users of SQL views, where the user may not know that they are using a view of a table
instead of the table itself.
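
The comparison with SQL views can be tried out with Python's built-in sqlite3 module. In this sketch both fragments live in one in-memory database, which a real DDBMS would not do; the table and view names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account_pune  (acc_no TEXT PRIMARY KEY, balance INTEGER, branch_name TEXT);
    CREATE TABLE account_delhi (acc_no TEXT PRIMARY KEY, balance INTEGER, branch_name TEXT);
    INSERT INTO account_pune  VALUES ('A_101', 5000, 'Pune');
    INSERT INTO account_delhi VALUES ('A_103', 25000, 'Delhi');
    -- The view presents the two fragments as one unfragmented table.
    CREATE VIEW account AS
        SELECT * FROM account_pune
        UNION ALL
        SELECT * FROM account_delhi;
""")

# The user queries "account" without knowing that it is built from fragments.
rows = conn.execute(
    "SELECT acc_no, branch_name FROM account ORDER BY acc_no"
).fetchall()
print(rows)
```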
Replication Transparency
Replication transparency ensures that the replication of databases is hidden from the users. It enables users to
query a table as if only a single copy of the table exists.
Replication transparency is associated with concurrency transparency and failure transparency. Whenever a user
updates a data item, the update is reflected in all the copies of the table. However, this operation should not be
visible to the user. This is concurrency transparency. Also, if a site fails, the user can still proceed with their
queries using replicated copies without any knowledge of the failure. This is failure transparency.
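
A toy Python class can show both behaviours: an update reaches every copy, and a read succeeds as long as one site is up. The class and its write-all, read-any policy are assumptions for illustration. The sketch deliberately ignores the hard part of a real system, namely keeping copies consistent when a site is down while an update arrives.

```python
class ReplicatedTable:
    """Toy table kept as identical copies on several sites."""

    def __init__(self, sites):
        self.copies = {site: {} for site in sites}
        self.down = set()  # names of sites currently marked as failed

    def update(self, key, value):
        # Write-all policy: every copy receives the update.
        for copy in self.copies.values():
            copy[key] = value

    def read(self, key):
        # Read-any policy: use the first site that is up.
        for site, copy in self.copies.items():
            if site not in self.down:
                return copy[key]
        raise RuntimeError("no replica available")


if __name__ == "__main__":
    table = ReplicatedTable(["site_pune", "site_delhi"])
    table.update("A_101", 5000)
    table.down.add("site_pune")   # simulate a site failure
    print(table.read("A_101"))    # still answered by the other copy
```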
Combination of Transparencies
In any distributed database system, the designer should ensure that all the stated transparencies are maintained to a
considerable extent. The designer may choose to fragment tables, replicate them, and store them at different sites, all
without the end user being aware of it. However, complete distribution transparency is a tough task and requires
considerable design effort.

7. Reference Architecture for Distributed DBMSs:

Data in a distributed system is usually fragmented and replicated. Taking fragmentation and replication into
account, the reference architecture of a distributed DBMS consists of the following schemas:
• A set of global external schemas.
• A global conceptual schema.
• A fragmentation schema and an allocation schema.
• A set of schemas for each local DBMS.

• Global external schema - In a distributed system, user applications and user access to the distributed
database are represented by a number of global external schemas. This is the topmost level in the reference
architecture. It describes the part of the distributed database that is relevant to different
users.
• Global conceptual schema - The global conceptual schema (GCS) represents the logical description of the entire
database as if it were not distributed. This level contains definitions of all entities, the relationships among
entities, and the security and integrity information for the whole database stored at all sites in the
distributed system.
• Fragmentation schema and allocation schema - The fragmentation schema describes how the data is to
be logically partitioned in a distributed database. The GCS consists of a set of global relations, and the
mapping between the global relations and fragments is defined in the fragmentation schema.
The allocation schema describes where the data (fragments) are to be located, taking account of any
replication. The type of mapping in the allocation schema determines whether the distributed database is redundant
or non-redundant: with redundant data distribution the mapping is one-to-many, whereas with non-redundant
data distribution it is one-to-one.
• Local schemas - In a distributed database system, the physical data organization at each machine is probably
different, so each site requires its own internal schema definition, called the local internal
schema.
To handle fragmentation and replication, the logical organization of data at each site is described by a third
layer in the architecture, called the local conceptual schema.
The GCS is the union of all local conceptual schemas; the local conceptual schemas are thus mappings of the global
schema onto each site. This mapping is done by local mapping schemas.
This architecture provides a very general conceptual framework for understanding distributed databases.
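
The allocation schema described above is essentially a mapping from fragments to sites. The Python sketch below models it as a dictionary and checks whether the distribution is redundant; the fragment names, site names, and helper functions are hypothetical and chosen only for this illustration.

```python
# Hypothetical allocation schema: fragment name -> sites holding a copy.
allocation = {
    "ACCOUNT_PUNE": ["site_pune"],
    "ACCOUNT_DELHI": ["site_delhi", "site_pune"],  # replicated fragment
}


def is_redundant(allocation) -> bool:
    """True if any fragment is mapped to more than one site (one-to-many)."""
    return any(len(sites) > 1 for sites in allocation.values())


def fragments_at(allocation, site):
    """Fragments that the local schema at `site` has to describe."""
    return sorted(name for name, sites in allocation.items() if site in sites)


if __name__ == "__main__":
    print(is_redundant(allocation))
    print(fragments_at(allocation, "site_pune"))
```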

8. Fragmentation in Distributed Systems and Types of Fragmentation:

What is fragmentation?

• The process of dividing the database into multiple smaller parts is called fragmentation.
• These fragments may be stored at different locations.
• The data fragmentation process should be carried out in such a way that the original database can be
reconstructed from the fragments.

Types of data Fragmentation

• There are three types of data fragmentation:

(i) Horizontal data fragmentation

Horizontal fragmentation divides a relation (table) horizontally into groups of rows to create subsets
of the table.

Example:
Account (Acc_No, Balance, Branch_Name, Type).
Suppose the Branch_Name column holds the values Pune, Baroda, and Delhi.

The fragment for one branch can then be selected with a query such as:

SELECT * FROM Account WHERE Branch_Name = 'Baroda'

Types of horizontal data fragmentation are as follows:

a) Primary horizontal fragmentation

Primary horizontal fragmentation is the process of fragmenting a single table row-wise using a set of
conditions.

Example:

Acc_No | Balance | Branch_Name
A_101 | 5000 | Pune
A_102 | 10000 | Baroda
A_103 | 25000 | Delhi

For the above table we can define simple conditions such as Branch_Name = 'Pune', Branch_Name = 'Delhi', and
Balance < 50000.

Fragmentation1:
SELECT * FROM Account WHERE Branch_Name = 'Pune' AND Balance < 50000

Fragmentation2:
SELECT * FROM Account WHERE Branch_Name = 'Delhi' AND Balance < 50000

b) Derived horizontal fragmentation

Derived horizontal fragmentation is fragmentation that is derived from the fragmentation of the primary relation,
rather than defined by fresh conditions on the relation being fragmented.

Example: Refer to the example of primary fragmentation given above.

The following fragments are derived from the primary fragmentation.

Fragmentation1:
SELECT * FROM Account WHERE Branch_Name = 'Baroda' AND Balance < 50000

Fragmentation2:
SELECT * FROM Account WHERE Branch_Name = 'Delhi' AND Balance < 50000

c) Complete horizontal fragmentation

• Complete horizontal fragmentation generates a set of horizontal fragments that together include every tuple of
the original relation.
• Completeness is required for reconstruction of the relation, so every tuple must belong to at least one of the
fragments.

d) Disjoint horizontal fragmentation

Disjoint horizontal fragmentation generates a set of horizontal fragments in which no two fragments have
tuples in common. That means every tuple of the relation belongs to only one fragment.

e) Reconstruction of horizontal fragmentation

Reconstruction of horizontal fragmentation can be performed using the UNION operation on the fragments.
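
The properties above (complete, disjoint, reconstructable by UNION) can be checked on the sample Account rows with Python's built-in sqlite3 module. In this sketch the fragments are ordinary tables in one in-memory database rather than tables at separate sites, and the fragment table names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Account (Acc_No TEXT PRIMARY KEY, Balance INTEGER, Branch_Name TEXT);
    INSERT INTO Account VALUES ('A_101', 5000, 'Pune'),
                               ('A_102', 10000, 'Baroda'),
                               ('A_103', 25000, 'Delhi');
    -- One horizontal fragment per branch.
    CREATE TABLE Frag_Pune   AS SELECT * FROM Account WHERE Branch_Name = 'Pune';
    CREATE TABLE Frag_Baroda AS SELECT * FROM Account WHERE Branch_Name = 'Baroda';
    CREATE TABLE Frag_Delhi  AS SELECT * FROM Account WHERE Branch_Name = 'Delhi';
""")

original = conn.execute("SELECT * FROM Account ORDER BY Acc_No").fetchall()

# Reconstruction: the UNION of all fragments.
rebuilt = conn.execute("""
    SELECT * FROM Frag_Pune
    UNION SELECT * FROM Frag_Baroda
    UNION SELECT * FROM Frag_Delhi
    ORDER BY Acc_No
""").fetchall()

# Disjointness check: no account appears in two fragments.
overlap = conn.execute(
    "SELECT COUNT(*) FROM Frag_Pune WHERE Acc_No IN (SELECT Acc_No FROM Frag_Delhi)"
).fetchone()[0]

print(rebuilt == original, overlap)
```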

(ii) Vertical Fragmentation

Vertical fragmentation divides a relation (table) vertically into groups of columns to create subsets of the table.
Each fragment keeps the key (here Acc_No) so that the fragments can be joined back together.
Example:

Acc_No | Balance | Branch_Name
A_101 | 5000 | Pune
A_102 | 10000 | Baroda
A_103 | 25000 | Delhi

Fragmentation1:
SELECT Acc_No, Balance FROM Account

Fragmentation2:
SELECT Acc_No, Branch_Name FROM Account

Complete vertical fragmentation

• Complete vertical fragmentation generates a set of vertical fragments that together include all the attributes of
the original relation.
• Reconstruction of vertical fragmentation is performed by using the FULL OUTER JOIN operation on the fragments,
joining on the key.
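
A short sqlite3 sketch in Python shows vertical fragments being joined back together. It assumes that each fragment carries the key Acc_No, and the fragment table names are invented for the example. It uses a plain inner JOIN: when every fragment contains every key, this returns the same rows as a FULL OUTER JOIN, which older SQLite versions do not support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Account (Acc_No TEXT PRIMARY KEY, Balance INTEGER, Branch_Name TEXT);
    INSERT INTO Account VALUES ('A_101', 5000, 'Pune'),
                               ('A_102', 10000, 'Baroda'),
                               ('A_103', 25000, 'Delhi');
    -- Two vertical fragments; both keep the key column Acc_No.
    CREATE TABLE Frag_Balance AS SELECT Acc_No, Balance FROM Account;
    CREATE TABLE Frag_Branch  AS SELECT Acc_No, Branch_Name FROM Account;
""")

original = conn.execute("SELECT * FROM Account ORDER BY Acc_No").fetchall()

# Reconstruction: join the fragments on the shared key.
rebuilt = conn.execute("""
    SELECT b.Acc_No, b.Balance, r.Branch_Name
    FROM Frag_Balance AS b
    JOIN Frag_Branch AS r ON r.Acc_No = b.Acc_No
    ORDER BY b.Acc_No
""").fetchall()

print(rebuilt == original)
```

Without the key in each fragment there would be no way to tell which balance belongs to which branch, which is why vertical fragments normally repeat the key.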

(iii) Hybrid Fragmentation

• Hybrid fragmentation can be achieved by performing horizontal and vertical partitioning together.
• A hybrid (mixed) fragment is a group of rows and columns of the relation.

Example: Consider the following table, referred to here as Employee, which consists of employee information.

Emp_ID | Emp_Name | Emp_Address | Emp_Age | Emp_Salary
101 | Surendra | Baroda | 25 | 15000
102 | Jaya | Pune | 37 | 12000
103 | Jayesh | Pune | 47 | 10000

Fragmentation1:
SELECT Emp_Name FROM Employee WHERE Emp_Age < 40

Fragmentation2:
SELECT Emp_ID FROM Employee WHERE Emp_Address = 'Pune' AND Emp_Salary < 14000

Reconstruction of Hybrid Fragmentation

The original relation in hybrid fragmentation is reconstructed by performing UNION and FULL OUTER JOIN operations.
11. DDBMS Access primitives and integrity constraints:

Database control refers to the task of enforcing regulations so as to provide correct data to authentic users and
applications of a database. In order that correct data is available to users, all data should conform to the integrity
constraints defined in the database. Besides, data should be screened away from unauthorized users so as to maintain
security and privacy of the database. Database control is one of the primary tasks of the database administrator
(DBA).
The three dimensions of database control are −

• Authentication
• Access rights
• Integrity constraints

Authentication
In a distributed database system, authentication is the process through which only legitimate users can gain access to
the data resources.
Authentication can be enforced in two levels −
• Controlling Access to Client Computer − At this level, user access is restricted at login to the client
computer that provides the user interface to the database server. The most common method is a
username/password combination. However, more sophisticated methods such as biometric authentication may
be used for high-security data.
• Controlling Access to the Database Software − At this level, the database software/administrator assigns
some credentials to the user. The user gains access to the database using these credentials. One of the
methods is to create a login account within the database server.
Access Rights
A user’s access rights refer to the privileges that the user is given regarding DBMS operations, such as the rights
to create a table, drop a table, add/delete/update tuples in a table, or query the table.
In distributed environments, since there is a large number of tables and a still larger number of users, it is not
feasible to assign individual access rights to users. So, the DDBMS defines certain roles. A role is a construct
with certain privileges within a database system. Once the different roles are defined, the individual users are
assigned one of these roles. Often a hierarchy of roles is defined according to the organization’s hierarchy of
authority and responsibility.
For example, the following SQL statements create a role "Accountant" and then assign this role to user "ABC".

CREATE ROLE ACCOUNTANT;
GRANT SELECT, INSERT, UPDATE ON EMP_SAL TO ACCOUNTANT;
GRANT INSERT, UPDATE, DELETE ON TENDER TO ACCOUNTANT;
GRANT INSERT, SELECT ON EXPENSE TO ACCOUNTANT;
COMMIT;
GRANT ACCOUNTANT TO ABC;
COMMIT;
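
The effect of these grants can be modelled in Python as a two-step lookup: from user to roles, then from role to privileges. This is only a sketch of the idea, not how a DBMS implements authorization; the dictionaries mirror the GRANT statements above, and the function name is an assumption.

```python
# Privileges granted to each role, mirroring the GRANT statements above.
ROLE_PRIVILEGES = {
    "ACCOUNTANT": {
        "EMP_SAL": {"SELECT", "INSERT", "UPDATE"},
        "TENDER": {"INSERT", "UPDATE", "DELETE"},
        "EXPENSE": {"INSERT", "SELECT"},
    }
}

# Roles assigned to each user.
USER_ROLES = {"ABC": {"ACCOUNTANT"}}


def is_allowed(user: str, operation: str, table: str) -> bool:
    """True if any role held by `user` grants `operation` on `table`."""
    for role in USER_ROLES.get(user, ()):
        if operation in ROLE_PRIVILEGES.get(role, {}).get(table, ()):
            return True
    return False


if __name__ == "__main__":
    print(is_allowed("ABC", "SELECT", "EMP_SAL"))  # granted through the role
    print(is_allowed("ABC", "DELETE", "EMP_SAL"))  # not granted
```

Adding a second accountant then only requires one new entry in `USER_ROLES`, which is the administrative saving that roles provide.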

Semantic Integrity Control


Semantic integrity control defines and enforces the integrity constraints of the database system.
The integrity constraints are as follows −
• Data type integrity constraint
• Entity integrity constraint
• Referential integrity constraint

• Data Type Integrity Constraint

A data type constraint restricts the range of values and the type of operations that can be applied to a field with
the specified data type.
For example, consider a table "HOSTEL" with three fields: the hostel number, hostel name, and capacity.
The hostel number should start with the capital letter "H" and cannot be NULL, and the capacity should not be more
than 150. The following SQL command can be used for data definition −

CREATE TABLE HOSTEL (
   H_NO VARCHAR2(5) NOT NULL,
   H_NAME VARCHAR2(15),
   CAPACITY INTEGER,
   CHECK ( H_NO LIKE 'H%'),
   CHECK ( CAPACITY <= 150)
);
• Entity Integrity Control
Entity integrity control enforces the rules so that each tuple can be uniquely identified from other tuples. For
this, a primary key is defined. A primary key is a minimal set of fields that can uniquely identify a tuple. The
entity integrity constraint states that no two tuples in a table can have identical values for the primary key and
that no field which is part of the primary key can have a NULL value.
For example, in the above hostel table, the hostel number can be assigned as the primary key through the following
SQL statement (ignoring the checks) −

CREATE TABLE HOSTEL (
   H_NO VARCHAR2(5) PRIMARY KEY,
   H_NAME VARCHAR2(15),
   CAPACITY INTEGER
);

• Referential Integrity Constraint

The referential integrity constraint lays down the rules of foreign keys. A foreign key is a field in a data table
that is the primary key of a related table. The constraint requires that the value of the foreign key
field should either be among the values of the primary key of the referenced table or be entirely NULL.
For example, let us consider a student table where a student may opt to live in a hostel. To include this, the
primary key of the hostel table should be included as a foreign key in the student table. The following SQL
statement incorporates this −

CREATE TABLE STUDENT (
   S_ROLL INTEGER PRIMARY KEY,
   S_NAME VARCHAR2(25) NOT NULL,
   S_COURSE VARCHAR2(10),
   S_HOSTEL VARCHAR2(5) REFERENCES HOSTEL
);
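
The three kinds of constraint can be seen rejecting bad rows in a small Python sketch that uses the built-in sqlite3 module. SQLite is used here only because it ships with Python; the column types are adapted (TEXT instead of VARCHAR2), the sample names are invented, and the `rejected` helper is an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE HOSTEL (
        -- Note: SQLite's LIKE is case-insensitive for ASCII by default.
        H_NO     TEXT PRIMARY KEY NOT NULL CHECK (H_NO LIKE 'H%'),
        H_NAME   TEXT,
        CAPACITY INTEGER CHECK (CAPACITY <= 150)
    );
    CREATE TABLE STUDENT (
        S_ROLL   INTEGER PRIMARY KEY,
        S_NAME   TEXT NOT NULL,
        S_COURSE TEXT,
        S_HOSTEL TEXT REFERENCES HOSTEL (H_NO)
    );
    INSERT INTO HOSTEL VALUES ('H1', 'North', 100);
""")


def rejected(sql: str) -> bool:
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


results = {
    "bad hostel number": rejected("INSERT INTO HOSTEL VALUES ('X9', 'East', 50)"),
    "capacity too large": rejected("INSERT INTO HOSTEL VALUES ('H2', 'West', 200)"),
    "duplicate primary key": rejected("INSERT INTO HOSTEL VALUES ('H1', 'Copy', 10)"),
    "unknown hostel": rejected("INSERT INTO STUDENT VALUES (1, 'Asha', 'CS', 'H9')"),
    "no hostel (NULL)": rejected("INSERT INTO STUDENT VALUES (2, 'Ravi', 'EE', NULL)"),
}
for case, was_rejected in results.items():
    print(case, was_rejected)
```

The last case is accepted because a foreign key may be NULL, matching the rule that the value must either reference an existing primary key or be entirely NULL.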
