SQL | Constraints
SQL constraints are essential elements in relational database design that
ensure the integrity, accuracy, and reliability of the data stored in a
database. By enforcing specific rules on table columns, SQL constraints help
maintain data consistency and prevent invalid data entries; in some systems the
query optimizer can also use them to plan queries more efficiently.
What Are SQL Constraints?
SQL constraints are rules applied to columns or tables in a relational
database to limit the type of data that can be inserted, updated, or deleted.
These rules ensure the data is valid, consistent, and adheres to the business
logic or database requirements. Constraints can be defined during table
creation or added later using the ALTER TABLE statement. They play a vital role in
maintaining the quality and integrity of your database.
Types Of Constraints In DBMS
Within relational databases, there are primarily five types of
constraints, commonly referred to as relational constraints. These
include:
1. Domain Constraints (unique / not null)
2. Key Constraints (primary / foreign key)
3. Entity Integrity Constraints (no NULL value in the primary key)
4. Referential Integrity Constraints (reference to another table)
5. Tuple Uniqueness Constraints (unique)
Types of SQL Constraints
SQL provides several types of constraints to manage different aspects of data
integrity. These constraints are essential for ensuring that data meets the
requirements of accuracy, consistency, and validity. Let’s go through each of
them with detailed explanations and examples.
1. NOT NULL Constraint
The NOT NULL constraint ensures that a column cannot contain NULL values.
This is particularly important for columns where a value is essential for
identifying records or performing calculations. If a column is defined as NOT
NULL, every row must include a value for that column.
Example:
CREATE TABLE Student
(
ID int(6) NOT NULL,
NAME varchar(10) NOT NULL,
ADDRESS varchar(20)
);
Explanation: In the above example, both the ID and NAME columns are defined
with the NOT NULL constraint, meaning every student must have
an ID and NAME value.
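To see the constraint being enforced, here is a minimal sketch that runs the table definition above against an in-memory SQLite database through Python's built-in sqlite3 module. SQLite is an assumption made only so the example is runnable (this text does not name a DBMS), and the student names are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Student (
        ID int(6) NOT NULL,
        NAME varchar(10) NOT NULL,
        ADDRESS varchar(20)
    )
""")

# ADDRESS is nullable, so leaving it out is accepted.
conn.execute("INSERT INTO Student (ID, NAME) VALUES (1, 'Asha')")

# NAME is NOT NULL, so a NULL name should be rejected.
try:
    conn.execute("INSERT INTO Student (ID, NAME) VALUES (2, NULL)")
    null_name_rejected = False
except sqlite3.IntegrityError:
    null_name_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]
print(null_name_rejected, row_count)
```

Only the first row is stored; the second insert fails as a whole rather than storing a partial row.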
2. UNIQUE Constraint
The UNIQUE constraint ensures that all values in a column are distinct across all
rows in a table. Unlike the PRIMARY KEY, which requires uniqueness and does
not allow NULLs, the UNIQUE constraint allows NULL values but still enforces
uniqueness for non-NULL entries.
Example:
CREATE TABLE Student
(
ID int(6) NOT NULL UNIQUE,
NAME varchar(10),
ADDRESS varchar(20)
);
Explanation: Here, the ID column must have unique values, ensuring that no
two students can share the same ID. We can have more than
one UNIQUE constraint in a table.
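The sketch below, again assuming SQLite via Python's sqlite3 module, shows both halves of the rule: duplicates are rejected, while NULLs are tolerated. The nullable EMAIL column is a hypothetical addition (it is not part of the example above) so that the NULL behaviour can be shown; how many NULLs a UNIQUE column may hold can differ between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Student (
        ID int(6) NOT NULL UNIQUE,
        NAME varchar(10),
        EMAIL varchar(40) UNIQUE   -- hypothetical nullable column
    )
""")

# Two rows with NULL EMAIL are both accepted by SQLite.
conn.execute("INSERT INTO Student VALUES (1, 'Asha', NULL)")
conn.execute("INSERT INTO Student VALUES (2, 'Ravi', NULL)")

# Reusing ID 1 violates the UNIQUE constraint on ID.
try:
    conn.execute("INSERT INTO Student VALUES (1, 'Meera', 'm@example.com')")
    duplicate_id_rejected = False
except sqlite3.IntegrityError:
    duplicate_id_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]
```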
3. PRIMARY KEY Constraint
A PRIMARY KEY constraint is a combination of the NOT
NULL and UNIQUE constraints. It uniquely identifies each row in a table. A table
can only have one PRIMARY KEY, and it cannot accept NULL values. This is
typically used for the column that will serve as the identifier of records.
Example:
CREATE TABLE Student
(
ID int(6) NOT NULL,
NAME varchar(10),
ADDRESS varchar(20),
PRIMARY KEY(ID)
);
Explanation: In this case, the ID column is set as the primary key, ensuring
that each student’s ID is unique and cannot be NULL.
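A minimal sketch of the primary key in action, assuming SQLite via Python's sqlite3 module. The NOT NULL on ID is kept on purpose: SQLite, for historical reasons, does not reject NULL in a non-INTEGER primary key column unless NOT NULL is stated. The `is_rejected` helper and the sample names are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Student (
        ID int(6) NOT NULL,
        NAME varchar(10),
        ADDRESS varchar(20),
        PRIMARY KEY (ID)
    )
""")
conn.execute("INSERT INTO Student (ID, NAME) VALUES (1, 'Asha')")

def is_rejected(sql):
    """Run one statement and report whether it violated a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Same ID again: breaks the uniqueness half of the primary key.
duplicate_id_rejected = is_rejected("INSERT INTO Student (ID, NAME) VALUES (1, 'Ravi')")
# No ID at all: breaks the NOT NULL half.
null_id_rejected = is_rejected("INSERT INTO Student (ID, NAME) VALUES (NULL, 'Ravi')")
```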
4. FOREIGN KEY Constraint
A FOREIGN KEY constraint links a column in one table to the primary key in
another table. This relationship helps maintain referential integrity by
ensuring that the value in the foreign key column matches a valid record in the
referenced table.
Customers Table:
C_ID  NAME      ADDRESS
1     RAMESH    DELHI
2     SURESH    NOIDA
3     DHARMESH  GURGAON
Orders Table:
O_ID  ORDER_NO  C_ID
1     2253      3
2     3325      3
3     4521      2
4     8532      1
As we can see, the field C_ID in the Orders table is the primary key of the
Customers table, i.e. it uniquely identifies each row in the Customers table.
Therefore, it is a foreign key in the Orders table.
Example:
CREATE TABLE Orders
(
O_ID int NOT NULL,
ORDER_NO int NOT NULL,
C_ID int,
PRIMARY KEY (O_ID),
FOREIGN KEY (C_ID) REFERENCES Customers(C_ID)
);
Explanation: In this example, the C_ID column in the Orders table is a foreign
key that references the C_ID column in the Customers table. This ensures that
only valid customer IDs can be inserted into the Orders table.
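The following sketch loads the Customers rows shown earlier and then tries two operations that would break referential integrity. It assumes SQLite via Python's sqlite3 module, where foreign keys are enforced only after `PRAGMA foreign_keys = ON`; the Customers column types, order 5 and customer 7 are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE Customers (C_ID int PRIMARY KEY, NAME varchar(20), ADDRESS varchar(20))")
conn.execute("""
    CREATE TABLE Orders (
        O_ID int NOT NULL,
        ORDER_NO int NOT NULL,
        C_ID int,
        PRIMARY KEY (O_ID),
        FOREIGN KEY (C_ID) REFERENCES Customers(C_ID)
    )
""")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [(1, "RAMESH", "DELHI"), (2, "SURESH", "NOIDA"), (3, "DHARMESH", "GURGAON")],
)
conn.execute("INSERT INTO Orders VALUES (1, 2253, 3)")  # customer 3 exists

# An order for a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO Orders VALUES (5, 9999, 7)")
    orphan_order_rejected = False
except sqlite3.IntegrityError:
    orphan_order_rejected = True

# A customer that is still referenced by an order cannot be deleted.
try:
    conn.execute("DELETE FROM Customers WHERE C_ID = 3")
    parent_delete_rejected = False
except sqlite3.IntegrityError:
    parent_delete_rejected = True
```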
5. CHECK Constraint
The CHECK constraint allows us to specify a condition that data must satisfy
before it is inserted into or updated in the table. This can be used to enforce
rules, such as ensuring that a column’s value meets certain criteria (e.g., age
must be at least 18).
Example:
CREATE TABLE Student
(
ID int(6) NOT NULL,
NAME varchar(10) NOT NULL,
AGE int NOT NULL CHECK (AGE >= 18)
);
Explanation: In the above table, the CHECK constraint ensures that only
students aged 18 or older can be inserted into the table.
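A short sketch, assuming SQLite via Python's sqlite3 module, showing that the condition is checked on UPDATE as well as on INSERT. The `violates` helper and the sample rows are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Student (
        ID int(6) NOT NULL,
        NAME varchar(10) NOT NULL,
        AGE int NOT NULL CHECK (AGE >= 18)
    )
""")

def violates(sql):
    """Run one statement and report whether a constraint stopped it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

conn.execute("INSERT INTO Student VALUES (1, 'Asha', 18)")  # exactly 18 passes

underage_insert_rejected = violates("INSERT INTO Student VALUES (2, 'Ravi', 17)")
underage_update_rejected = violates("UPDATE Student SET AGE = 10 WHERE ID = 1")

# The failed UPDATE leaves the stored value untouched.
age_after = conn.execute("SELECT AGE FROM Student WHERE ID = 1").fetchone()[0]
```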
6. DEFAULT Constraint
The DEFAULT constraint provides a default value for a column when no value is
specified during insertion. This is useful for ensuring that certain columns
always have a meaningful value, even if the user does not provide one.
Example:
CREATE TABLE Student
(
ID int(6) NOT NULL,
NAME varchar(10) NOT NULL,
AGE int DEFAULT 18
);
Explanation: Here, if no value is provided for AGE during an insert, the default
value of 18 will be assigned automatically.
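One detail worth seeing in practice is that the default applies only when the column is omitted, not when NULL is written explicitly. A minimal sketch, assuming SQLite via Python's sqlite3 module and made-up sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Student (
        ID int(6) NOT NULL,
        NAME varchar(10) NOT NULL,
        AGE int DEFAULT 18
    )
""")

conn.execute("INSERT INTO Student (ID, NAME) VALUES (1, 'Asha')")               # AGE omitted
conn.execute("INSERT INTO Student (ID, NAME, AGE) VALUES (2, 'Ravi', 21)")      # AGE given
conn.execute("INSERT INTO Student (ID, NAME, AGE) VALUES (3, 'Meera', NULL)")   # explicit NULL

# Map each ID to the AGE that was actually stored.
ages = dict(conn.execute("SELECT ID, AGE FROM Student"))
```

If a column should never end up NULL, combine DEFAULT with NOT NULL.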
How to Specify Constraints in SQL
Constraints can be specified during the table creation process using the CREATE
TABLE statement. Additionally, constraints can be modified or added to existing
tables using the ALTER TABLE statement.
Syntax for Creating Constraints:
CREATE TABLE table_name
(
column1 data_type [constraint],
column2 data_type [constraint],
column3 data_type [constraint]
);
We can also add or remove constraints after a table is created:
Example to Add a Constraint:
ALTER TABLE Student
ADD CONSTRAINT unique_student_id UNIQUE (ID);
Codd's 12 Rules
Dr. Edgar F. Codd, after his extensive research on the relational model of
database systems, proposed twelve rules (numbered 1 to 12, plus a foundational
Rule 0) which, according to him, a database must obey in order to be regarded
as a true relational database.
Rule 0: Foundation Rule
A system that qualifies as a relational database management system must manage
its stored data using only its relational capabilities. This is the foundation
rule, which acts as a base for all the other rules.
Rule 1: Information Rule
The data stored in a database, whether user data or metadata, must be a
value of some table cell. Everything in a database must be stored in a table
format.
Rule 2: Guaranteed Access Rule
Every single data element (value) is guaranteed to be accessible logically with a
combination of table-name, primary-key (row value), and attribute-name
(column value). No other means, such as pointers, can be used to access data.
Rule 3: Systematic Treatment of NULL Values
The NULL values in a database must be given a systematic and uniform
treatment. This is a very important rule because a NULL can be interpreted as
one of the following: data is missing, data is not known, or data is not applicable.
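In SQL, this uniform treatment shows up as three-valued logic: a comparison with NULL is neither true nor false. The sketch below assumes SQLite via Python's sqlite3 module, where SQL NULL is returned as Python `None`; the small Student table is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparisons and arithmetic involving NULL yield NULL, not TRUE or FALSE.
null_equals_null, null_is_null, one_plus_null = conn.execute(
    "SELECT NULL = NULL, NULL IS NULL, 1 + NULL"
).fetchone()

conn.execute("CREATE TABLE Student (ID int NOT NULL, ADDRESS varchar(20))")
conn.execute("INSERT INTO Student VALUES (1, NULL)")

# '= NULL' never matches a row; 'IS NULL' is the uniform way to test for it.
matched_by_equals = conn.execute(
    "SELECT COUNT(*) FROM Student WHERE ADDRESS = NULL").fetchone()[0]
matched_by_is_null = conn.execute(
    "SELECT COUNT(*) FROM Student WHERE ADDRESS IS NULL").fetchone()[0]
```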
Rule 4: Active Online Catalog
The structure description of the entire database must be stored in an online
catalog, known as data dictionary, which can be accessed by authorized
users. Users can use the same query language to access the catalog which they
use to access the database itself.
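As an illustration of a catalog that is queried with the ordinary query language, the sketch below reads SQLite's `sqlite_master` table and its `pragma_table_info` table-valued function with plain SELECT statements. These names are specific to SQLite, which is assumed here only to keep the example runnable; other systems expose their catalog under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Student (
        ID int(6) NOT NULL,
        NAME varchar(10) NOT NULL,
        ADDRESS varchar(20)
    )
""")

# The catalog is itself a table, read with an ordinary SELECT.
table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# Column metadata; each row is (cid, name, type, notnull, dflt_value, pk).
columns = conn.execute("SELECT * FROM pragma_table_info('Student')").fetchall()
column_names = [col[1] for col in columns]
not_null_flags = [col[3] for col in columns]
```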
Rule 5: Comprehensive Data Sub-Language Rule
A database can only be accessed using a language having linear syntax that
supports data definition, data manipulation, and transaction management
operations. This language can be used directly or by means of some application.
If the database allows access to data without any help of this language, then it
is considered as a violation.
Rule 6: View Updating Rule
All the views of a database, which can theoretically be updated, must also be
updatable by the system.
Rule 7: High-Level Insert, Update, and Delete Rule
A database must support high-level insertion, update, and deletion. This must
not be limited to a single row, that is, it must also support union, intersection
and minus operations to yield sets of data records.
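A small sketch of what set-at-a-time operation looks like in SQL, assuming SQLite via Python's sqlite3 module. The Student rows are made-up sample data, and SQLite spells the minus operator as EXCEPT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (ID int PRIMARY KEY, NAME varchar(10), AGE int)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(1, "A", 18), (2, "B", 19), (3, "C", 22), (4, "D", 35)],
)

# One UPDATE and one DELETE each act on a whole set of rows.
updated = conn.execute("UPDATE Student SET AGE = AGE + 1 WHERE AGE < 20").rowcount
deleted = conn.execute("DELETE FROM Student WHERE AGE > 30").rowcount

# Set operators combine the results of whole queries.
under_20 = "SELECT ID FROM Student WHERE AGE < 20"
over_19 = "SELECT ID FROM Student WHERE AGE > 19"
union_ids = sorted(r[0] for r in conn.execute(f"{under_20} UNION {over_19}"))
intersect_ids = sorted(r[0] for r in conn.execute(f"{under_20} INTERSECT {over_19}"))
minus_ids = sorted(r[0] for r in conn.execute(f"SELECT ID FROM Student EXCEPT {over_19}"))
```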
Rule 8: Physical Data Independence
The data stored in a database must be independent of the applications that
access the database. Any change in the physical structure of a database must
not have any impact on how the data is being accessed by external applications.
Rule 9: Logical Data Independence
The logical data in a database must be independent of its user’s view
(application). Any change in logical data must not affect the applications using
it. For example, if two tables are merged or one is split into two different tables,
there should be no impact or change on the user application. This is one of the
most difficult rules to apply.
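One common way to approximate this rule is to put a view in front of the restructured tables. The sketch below, assuming SQLite via Python's sqlite3 module, splits a Student table in two and then recreates the old name as a view, so the application's query text does not change. The table names StudentName and StudentAddress are hypothetical, and only reads are covered: SQLite views are read-only unless INSTEAD OF triggers are added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
APP_QUERY = "SELECT NAME, ADDRESS FROM Student WHERE ID = 1"  # the application's query

conn.execute("CREATE TABLE Student (ID int PRIMARY KEY, NAME varchar(10), ADDRESS varchar(20))")
conn.execute("INSERT INTO Student VALUES (1, 'Asha', 'DELHI')")
before = conn.execute(APP_QUERY).fetchall()

# Logical change: split Student into two tables.
conn.execute("CREATE TABLE StudentName AS SELECT ID, NAME FROM Student")
conn.execute("CREATE TABLE StudentAddress AS SELECT ID, ADDRESS FROM Student")
conn.execute("DROP TABLE Student")

# A view with the old name hides the split from the application.
conn.execute("""
    CREATE VIEW Student AS
    SELECT n.ID, n.NAME, a.ADDRESS
    FROM StudentName n JOIN StudentAddress a ON a.ID = n.ID
""")
after = conn.execute(APP_QUERY).fetchall()
```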
Rule 10: Integrity Independence
A database must be independent of the application that uses it. All its integrity
constraints can be independently modified without the need of any change in
the application. This rule makes a database independent of the front-end
application and its interface.
Rule 11: Distribution Independence
The end-user must not be able to see that the data is distributed over various
locations. Users should always get the impression that the data is located at one
site only. This rule has been regarded as the foundation of distributed database
systems.
Rule 12: Non-Subversion Rule
If a system has an interface that provides access to low-level records, then the
interface must not be able to subvert the system and bypass security and
integrity constraints.
Data Models in DBMS
A Data Model in a Database Management System (DBMS) is a set of concepts and
tools used to describe the structure of a database. Data models give us a
clear picture of the data, which helps us in creating an actual database. A data
model takes us from the design of the data to its proper implementation.
Types of Data Models
Data models are basically classified into 3 types:
1. Conceptual Data Model
2. Representational Data Model
3. Physical Data Model
1. Conceptual Data Model
The conceptual data model describes the database at a very high level and is
useful for understanding the needs or requirements of the database. It is this
model that is used in the requirement-gathering process, i.e. before the database
designers start building a particular database. One such popular model is
the entity/relationship model (ER model). The ER model focuses on the entities,
relationships, and attributes that database designers work with. Because it
uses these concepts, it can be discussed even with non-technical users and
stakeholders, so that their requirements can be understood.
Entity-Relationship Model (ER Model): It is a high-level data model which is
used to define the data and the relationships between them. It is basically a
conceptual design of a database that makes the view of the data easy to design.
Components of ER Model:
1. Entity: An entity is a real-world object. It can be a person, place,
object, class, etc. Entities are represented by a rectangle in an ER
Diagram.
2. Attributes: An attribute can be defined as the description of the entity.
Attributes are represented by an ellipse in an ER Diagram. Examples are Age,
Roll Number, or Marks for a Student.
3. Relationship: Relationships are used to define relations among different
entities. Diamonds (rhombuses) are used to show relationships.
Characteristics of a conceptual data model
Offers organization-wide coverage of the business concepts.
Data models of this type are designed and developed for a business
audience.
The conceptual model is developed independently of hardware
specifications like data storage capacity and location, and of software
specifications like DBMS vendor and technology. The focus is to represent
data as a user will see it in the “real world.”
Conceptual data models, also known as domain models, create a common vocabulary
for all stakeholders by establishing basic concepts and scope.
2. Representational Data Model
This type of data model is used to represent only the logical part of the
database and does not represent the physical structure of the database. The
representational data model allows us to focus primarily on the design part of
the database. A popular representational model is the relational model, which
comes with two formal query languages, relational algebra and relational
calculus. In the relational model, we basically use tables to represent our data
and the relationships between them. It is a theoretical concept whose practical
implementation is done in the physical data model.
The advantage of using a representational data model is that it provides a
foundation for the physical model.
Characteristics of Representational Data Model
Represents the logical structure of the database.
The relational model, with relational algebra and relational calculus as
its query formalisms, is commonly used.
Uses tables to represent data and relationships.
Provides a foundation for building the physical data model.
3. Physical Data Model
The physical data model is used to practically implement the relational data
model. Ultimately, all data in a database is stored physically on a secondary
storage device such as disks and tapes. It is stored in the form of files,
records, and certain other data structures. The physical model has all the
information on the format in which the files are present, the structure of the
databases, the presence of external data structures, and their relation to each
other. Here, we basically store tables in a way that lets them be
accessed efficiently. A good physical model depends on a well-designed
relational model. Structured Query Language (SQL) is used to put the relational
model, including relational algebra operations, into practice.
This data model describes HOW the system will be implemented using a specific
DBMS. This model is typically created by DBAs and developers. The
purpose is the actual implementation of the database.
Characteristics of a physical data model:
The physical data model describes the data needs of a single project or
application, though it may be integrated with other physical data models
based on project scope.
The data model contains relationships between tables that address the
cardinality and nullability of the relationships.
Developed for a specific version of a DBMS, location, data storage or
technology to be used in the project.
Columns should have exact datatypes, lengths, and default values
assigned.
Primary and foreign keys, views, indexes, access profiles, and
authorizations, etc. are defined.
Some Other Data Models
1. Hierarchical Model
The hierarchical model is one of the oldest data models; it was
developed by IBM in the 1960s. In a hierarchical model, data are viewed as a
collection of tables, or we can say segments, that form a hierarchical relation.
The data is organized into a tree-like structure where each record has
one parent record and can have many children. Even if the segments are connected
as a chain-like structure by logical associations, the resulting structure can
be a fan structure with multiple branches. We call the logical associations
directional associations.
2. Network Model
The Network Model was formalized by the Database Task Group in the 1960s.
This model is a generalization of the hierarchical model. A segment in this
model can have multiple parent segments. The segments are grouped into levels,
and there exists a logical association between the segments belonging to any
level. Mostly, there exists a many-to-many logical association between any
two segments.
3. Object-Oriented Data Model
In the Object-Oriented Data Model, data and their relationships are contained in
a single structure, which is referred to as an object. In this model,
real-world problems are represented as objects with different attributes.
Objects can have multiple relationships between them. Basically, it is a
combination of object-oriented programming and the relational database model.
4. Semi-Structured Data Model
Semi-structured data models deal with data in a flexible way: some entities
may have extra attributes and some entities may have some attributes missing.
Advantages of Data Models
1. Data models help us in representing data accurately.
2. They help us in finding missing data and also in minimizing data
redundancy.
3. A data model helps provide better data security.
4. The data model should be detailed enough to be used for building the
physical database.
5. The information in the data model can be used for defining the
relationship between tables, primary and foreign keys, and stored
procedures.
Disadvantages of Data Models
1. In the case of a vast database, it sometimes becomes difficult to
understand the data model.
2. You must have proper knowledge of SQL to use physical models.
3. Even a small change made in the structure may require modification of the
entire application.
4. There is no set data manipulation language in DBMS.
5. To develop a data model, one should know the characteristics of the
physically stored data.
ER Diagram:
We typically follow the steps below when designing a database for an application.
Gather the requirements (functional and data) by asking questions to the
database users.
Do a logical or conceptual design of the database. This is where the ER
model plays a role. It is the most used graphical representation of the
conceptual design of a database.
Do the physical database design (like indexing) and the external design (like views).
The Entity Relationship Model is a model for identifying the entities (like student,
car or company) to be represented in the database and for representing how those
entities are related. The ER data model specifies an enterprise schema that
represents the overall logical structure of a database graphically.
Why Use ER Diagrams In DBMS?
ER diagrams represent the E-R model in a database, making them easy to
convert into relations (tables).
ER diagrams serve the purpose of real-world modeling of objects, which
makes them genuinely useful.
ER diagrams require no technical knowledge of the underlying DBMS used.
They give a standard solution for visualizing the data logically.
Symbols Used in ER Model
The ER Model is used to model the logical view of the system from a data
perspective, and it consists of these symbols:
Rectangles: Rectangles represent Entities in the ER Model.
Ellipses: Ellipses represent Attributes in the ER Model.
Diamond: Diamonds represent Relationships among Entities.
Lines: Lines link attributes to entities and entity sets to
relationship types.
Double Ellipse: Double Ellipses represent Multi-Valued Attributes.
Double Rectangle: Double Rectangle represents a Weak Entity.
[Figure: Symbols used in ER Diagram]
Components of ER Diagram
ER Model consists of Entities, Attributes, and Relationships among Entities in a
Database System.
[Figure: Components of ER Diagram]
What is Entity?
An Entity may be an object with a physical existence – a particular person, car,
house, or employee – or it may be an object with a conceptual existence – a
company, a job, or a university course.
What is Entity Set?
An Entity is an object of an Entity Type, and the set of all entities of that
type is called an entity set. For example, E1 is an entity having Entity Type
Student, and the set of all students is called an Entity Set. In an ER diagram,
an Entity Type is represented as:
[Figure: Entity Set]
We can represent an entity set in an ER Diagram but can’t represent an entity,
because an entity corresponds to a row in the relation, whereas an ER Diagram
is a graphical representation of the structure of the data.
Types of Entity
There are two types of entity:
1. Strong Entity
A Strong Entity is a type of entity that has a key attribute. A Strong Entity
does not depend on any other entity in the schema. It has a primary key that
helps in identifying it uniquely, and it is represented by a rectangle. These
are called Strong Entity Types.
2. Weak Entity
An entity type usually has a key attribute that uniquely identifies each entity
in the entity set. But some entity types exist for which key attributes can’t
be defined. These are called Weak Entity types.
For example, a company may store the information of dependents (Parents,
Children, Spouse) of an Employee. But the dependents can’t exist without the
employee. So Dependent will be a Weak Entity Type and Employee will be the
Identifying Entity Type for Dependent, which means it is a Strong Entity Type.
A weak entity type is represented by a Double Rectangle. The participation of
weak entity types is always total. The relationship between the weak entity type
and its identifying strong entity type is called the identifying relationship,
and it is represented by a double diamond.
[Figure: Strong Entity and Weak Entity]
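When such a design is turned into tables, a weak entity is commonly mapped to a table whose primary key includes the key of its identifying entity. The sketch below shows one possible mapping of the Employee and Dependent example, assuming SQLite via Python's sqlite3 module. The column names (E_ID, E_NAME, D_NAME, RELATION) and the ON DELETE CASCADE choice are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce foreign keys
conn.execute("CREATE TABLE Employee (E_ID int PRIMARY KEY, E_NAME varchar(20))")
conn.execute("""
    CREATE TABLE Dependent (
        E_ID int NOT NULL,              -- key borrowed from the identifying entity
        D_NAME varchar(20) NOT NULL,    -- partial key: unique only per employee
        RELATION varchar(10),
        PRIMARY KEY (E_ID, D_NAME),
        FOREIGN KEY (E_ID) REFERENCES Employee(E_ID) ON DELETE CASCADE
    )
""")
conn.execute("INSERT INTO Employee VALUES (1, 'Asha')")
conn.execute("INSERT INTO Dependent VALUES (1, 'Ravi', 'Spouse')")
conn.execute("INSERT INTO Dependent VALUES (1, 'Meera', 'Child')")

# A dependent cannot exist without its employee (total participation).
try:
    conn.execute("INSERT INTO Dependent VALUES (99, 'Nobody', 'Child')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Removing the employee removes the dependents with it.
conn.execute("DELETE FROM Employee WHERE E_ID = 1")
dependents_left = conn.execute("SELECT COUNT(*) FROM Dependent").fetchone()[0]
```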
Types of Attributes in ER Model
In a Database Management System (DBMS), an attribute is a property or
characteristic of an entity that is used to describe the entity. Essentially, it
is a column in a table that holds data values. An entity may contain any number
of attributes. One of the attributes is considered the primary key. In an
Entity-Relationship model, attributes are represented by an elliptical shape.
Example: Student has attributes like name, age, roll number, and many more. To
uniquely identify the student, we use the roll number as the primary key, as it
is not repeated. Attributes can also be subdivided into another set of attributes.
Attributes help define and organize the data, making it easier to retrieve and
manipulate information within the database. Below, we
discuss the different types of attributes in detail.
Types of Attributes
There are different types of attributes, as listed below:
Simple Attribute
Composite Attribute
Single-Valued Attribute
Multi-Valued Attribute
Derived Attribute
Null Attribute
Key Attribute
Attributes define the properties of entities in an ER model, and
understanding their types is essential for database design.
Simple Attribute
An attribute that cannot be further subdivided into components is a simple
attribute.
Example: The roll number of a student, the ID number of an employee, gender,
and many more.
[Figure: Simple Attribute]
Composite Attribute
An attribute that can be split into components is a composite attribute.
Example: The address can be further split into house number, street number,
city, state, country, and pin code; the name can also be split into first name,
middle name, and last name.
Single-Valued Attribute
An attribute that takes only a single value for each entity instance is a
single-valued attribute.
Example: The age of a student, Aadhaar card number.
Multi-Valued Attribute
An attribute that can take more than one value for each entity instance is
a multi-valued attribute. It is represented by a double oval.
Example: Phone number of a student: Landline and mobile.
Derived Attribute
An attribute that can be derived from other attributes is a derived attribute.
It is represented by a dotted oval.
Example: Total and average marks of a student, age of an employee that is
derived from date of birth.
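In a relational schema, a derived attribute is often not stored at all but computed on demand, for example in a view. The sketch below does this for the total and average marks example, assuming SQLite via Python's sqlite3 module. The Marks table, its subject columns and the Result view are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Marks (
        ROLL_NO int PRIMARY KEY,
        MATHS int, PHYSICS int, CHEMISTRY int   -- stored attributes
    )
""")
conn.execute("INSERT INTO Marks VALUES (1, 70, 80, 90)")

# TOTAL and AVERAGE are derived: computed from the stored marks when queried.
conn.execute("""
    CREATE VIEW Result AS
    SELECT ROLL_NO,
           MATHS + PHYSICS + CHEMISTRY AS TOTAL,
           (MATHS + PHYSICS + CHEMISTRY) / 3.0 AS AVERAGE
    FROM Marks
""")
total, average = conn.execute(
    "SELECT TOTAL, AVERAGE FROM Result WHERE ROLL_NO = 1").fetchone()
```

Because nothing derived is stored, the values cannot drift out of step with the marks they come from.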
Key attribute
Key attributes are those attributes that can uniquely identify the entity in
the entity set.
Example: Roll-No is the key attribute because it can uniquely identify the
student.
Null Attribute
This attribute can take a NULL value when the entity does not have a value for it.
Example: The ‘Net Banking Active Bin’ attribute tells whether a particular
customer has the net banking facility activated or not.
For a bank that does not offer the net banking facility, the ‘Net Banking
Active Bin’ attribute in the customer table is always NULL until the facility
is activated, since the attribute has no applicable value before then.
What are Participation Constraints?
Participation constraints in database management are rules that determine
the minimum and maximum participation of entities in a given
relationship set. While partial participation permits optional involvement,
total participation requires every entity in one set to take part in a relationship
with an entity in another set. By maintaining consistency and enforcing business
rules, these constraints help guarantee data integrity.
For example, in a College Database, partial participation would permit courses
with no enrolled students, while total participation might require all students to
be enrolled in at least one course. For database schemas to model
real-world circumstances faithfully and enable efficient data management, it is
important to understand and include participation constraints.
There are two types of participation constraints in database management
systems:
Total Participation
Partial Participation
Total Participation
Total participation, sometimes known as mandatory participation, denotes the
requirement that every entity in one entity set participate in a relationship
with some entity in another entity set. It's similar to saying that in order to
belong to one group, you must be associated with another. In a university
database, for instance, total participation between courses and students
indicates that each student is required to be registered in a minimum of one
course. It follows that no student can exist without a course. It serves as a
means of guaranteeing that every member of one group is connected to something
within another, ensuring that nothing is overlooked or left disconnected.
In the diagram below, the participation of an entity set E in a relationship set
R is said to be total if every entity in E participates in at least one
relationship in R.
The participation of entity set A in the relationship set is total because every
entity of A participates in the relationship set, and the participation of
entity set B in the relationship set is also total because every entity of B
also participates in the relationship set.
[Figure: Total Participation]
Partial Participation
In database design, partial participation, also known as optional participation,
allows participation in a relationship to be optional. It implies that the
database does not require every entity to be linked to an entity in the other
set. Consider a database at a university, for instance. Partial
participation can mean that some students are enrolled in classes but not all
students are registered in them. Because not everything in
real life is always connected to everything else, this flexibility is crucial. While
some objects are connected to one another, others may stand alone. It permits
scenarios in which certain database entities may not be connected to any other
entity.
In the diagram below, the participation of an entity set E in relationship set R
is said to be partial if only some entities in E participate in relationships in R.
The participation of entity set A in the relationship set is partial because only
some entities of A participate in the relationship set, while the participation of
entity set B in the relationship set is total because every entity of B participates
in the relationship set.
[Figure: Partial Participation]
Example:
Suppose an entity set Student is related to an entity set Course through the
Enrolled relationship set.
The participation of entity set Course in the Enrolled relationship set
is partial because a course may or may not have students enrolled in it. It is
possible that only some of the course entities are related to the student entity
set through the Enrolled relationship set.
The participation of entity set Student in the Enrolled relationship set
is total because every student is expected to be related to at least one course
through the Enrolled relationship set.
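A plain foreign key cannot express "at least one", so total participation is often verified with a query instead. The sketch below is one such check for the Student, Course and Enrolled example, assuming SQLite via Python's sqlite3 module. The key columns S_ID and C_ID and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Student (S_ID int PRIMARY KEY);
    CREATE TABLE Course (C_ID int PRIMARY KEY);
    CREATE TABLE Enrolled (
        S_ID int NOT NULL REFERENCES Student(S_ID),
        C_ID int NOT NULL REFERENCES Course(C_ID),
        PRIMARY KEY (S_ID, C_ID)
    );
    INSERT INTO Student VALUES (1), (2);
    INSERT INTO Course VALUES (10), (20);
    INSERT INTO Enrolled VALUES (1, 10);
""")

# Total participation of Student: any row returned here is a violation.
students_not_enrolled = [r[0] for r in conn.execute(
    "SELECT S_ID FROM Student WHERE S_ID NOT IN (SELECT S_ID FROM Enrolled) ORDER BY S_ID")]

# Partial participation of Course: rows returned here are allowed.
courses_without_students = [r[0] for r in conn.execute(
    "SELECT C_ID FROM Course WHERE C_ID NOT IN (SELECT C_ID FROM Enrolled) ORDER BY C_ID")]
```

Here student 2 breaks the total participation rule, while course 20 having no students is acceptable.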
Generalization
Generalization is the process of extracting common properties from a set of
entities and creating a generalized entity from it. It is a bottom-up approach in
which two or more entities can be generalized to a higher-level entity if they
have some attributes in common. For Example, STUDENT and FACULTY can be
generalized to a higher-level entity called PERSON as shown in Figure 1. In this
case, common attributes like P_NAME and P_ADD become part of a
higher entity (PERSON), and specialized attributes like S_FEE become part of a
specialized entity (STUDENT).
Generalization is also called the “bottom-up approach”.
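One common way to carry such a hierarchy into tables is a table for the higher-level entity plus one table per specialized entity that shares its key. The sketch below applies that mapping to PERSON, STUDENT and FACULTY, assuming SQLite via Python's sqlite3 module. P_ID and F_SAL are hypothetical columns (only P_NAME, P_ADD and S_FEE appear in the example above), and the rows are sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Common attributes live in the generalized entity.
    CREATE TABLE PERSON (P_ID int PRIMARY KEY, P_NAME varchar(20), P_ADD varchar(20));
    -- Each specialized entity reuses the key and adds its own attributes.
    CREATE TABLE STUDENT (P_ID int PRIMARY KEY REFERENCES PERSON(P_ID), S_FEE int);
    CREATE TABLE FACULTY (P_ID int PRIMARY KEY REFERENCES PERSON(P_ID), F_SAL int);
    INSERT INTO PERSON VALUES (1, 'Asha', 'DELHI'), (2, 'Ravi', 'NOIDA');
    INSERT INTO STUDENT VALUES (1, 5000);
    INSERT INTO FACULTY VALUES (2, 90000);
""")

# A student's full record joins the inherited and the specialized attributes.
student_rows = conn.execute("""
    SELECT p.P_NAME, p.P_ADD, s.S_FEE
    FROM STUDENT s JOIN PERSON p ON p.P_ID = s.P_ID
""").fetchall()

# A STUDENT row without a matching PERSON row is rejected.
try:
    conn.execute("INSERT INTO STUDENT VALUES (3, 4000)")
    student_without_person_rejected = False
except sqlite3.IntegrityError:
    student_without_person_rejected = True
```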
Specialization
In specialization, an entity is divided into sub-entities based on its
characteristics. It is a top-down approach where the higher-level entity is
specialized into two or more lower-level entities. For Example, an EMPLOYEE
entity in an Employee management system can be specialized into DEVELOPER,
TESTER, etc. as shown in Figure 2. In this case, common attributes like E_NAME,
E_SAL, etc. become part of a higher entity (EMPLOYEE), and specialized
attributes like TES_TYPE become part of a specialized entity (TESTER).
Specialization is also called the “top-down approach”.
Aggregation
An ER diagram is not capable of representing the relationship between an entity
and a relationship which may be required in some scenarios. In those cases, a
relationship with its corresponding entities is aggregated into a higher-level
entity. Aggregation is an abstraction through which we can represent
relationships as higher-level entity sets.
For example, an Employee working on a project may require some machinery.
So, a REQUIRE relationship is needed between the relationship WORKS_FOR and the
entity MACHINERY. Using aggregation, the WORKS_FOR relationship with its entities
EMPLOYEE and PROJECT is aggregated into a single entity, and the relationship
REQUIRE is created between the aggregated entity and MACHINERY.
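In table form, aggregation can be sketched by letting the REQUIRE table reference the key of the WORKS_FOR table rather than EMPLOYEE and PROJECT separately. The example below assumes SQLite via Python's sqlite3 module. The key columns E_ID, P_ID and M_ID and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE EMPLOYEE (E_ID int PRIMARY KEY);
    CREATE TABLE PROJECT (P_ID int PRIMARY KEY);
    CREATE TABLE MACHINERY (M_ID int PRIMARY KEY);
    CREATE TABLE WORKS_FOR (
        E_ID int NOT NULL REFERENCES EMPLOYEE(E_ID),
        P_ID int NOT NULL REFERENCES PROJECT(P_ID),
        PRIMARY KEY (E_ID, P_ID)
    );
    -- REQUIRE points at a WORKS_FOR pair, i.e. at the aggregated entity.
    CREATE TABLE REQUIRE (
        E_ID int NOT NULL,
        P_ID int NOT NULL,
        M_ID int NOT NULL REFERENCES MACHINERY(M_ID),
        PRIMARY KEY (E_ID, P_ID, M_ID),
        FOREIGN KEY (E_ID, P_ID) REFERENCES WORKS_FOR(E_ID, P_ID)
    );
    INSERT INTO EMPLOYEE VALUES (1), (2);
    INSERT INTO PROJECT VALUES (100);
    INSERT INTO MACHINERY VALUES (7);
    INSERT INTO WORKS_FOR VALUES (1, 100);
    INSERT INTO REQUIRE VALUES (1, 100, 7);
""")

# Employee 2 does not work on project 100, so that pair cannot require machinery.
try:
    conn.execute("INSERT INTO REQUIRE VALUES (2, 100, 7)")
    unrelated_pair_rejected = False
except sqlite3.IntegrityError:
    unrelated_pair_rejected = True

require_rows = conn.execute("SELECT E_ID, P_ID, M_ID FROM REQUIRE").fetchall()
```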