14. Module 1
What is a Database?
A database is a collection of related data which represents some aspect of the real world. A database
system is designed to be built and populated with data for a certain task.
What is DBMS?
A Database Management System (DBMS) is software for storing and retrieving users’ data while
considering appropriate security measures. It consists of a group of programs which manipulate the
database. The DBMS accepts the request for data from an application and instructs the operating system
to provide the specific data. In large systems, a DBMS helps users and other third-party software to
store and retrieve data.
A DBMS allows users to create their own databases according to their requirements. The term “DBMS” covers
the users of the database as well as other application programs. It provides an interface between the data and
the software application.
Conceptual Design: The requirements of the database are captured using a high-level conceptual data
model. For example, the ER model is used for the conceptual design of the database.
Logical Design: Logical design represents the data in the form of the relational model. The ER diagram
produced in the conceptual design phase is converted into the relational model.
Physical Design: In physical design, the data in the relational model is implemented using a commercial
DBMS such as Oracle or DB2.
Advantages of DBMS
A DBMS helps in the efficient organization of data in a database, which gives it the following advantages over a
typical file system:
• Minimized redundancy and data inconsistency: Data is normalized in a DBMS to
minimize redundancy, which helps keep the data consistent. For example, student
information can be kept in one place in the DBMS and accessed by different users. This
reduced redundancy is made possible by primary keys and foreign keys.
• Simplified Data Access: A user needs only the name of the relation, not its exact location, to
access data, so the process is very simple.
• Multiple data views: Different views of the same data can be created to cater to the needs of
different users. For example, faculty salary information can be hidden from the student
view of the data but shown in the admin view.
• Data Security: Only authorized users are allowed to access the data in a DBMS. Also,
data can be encrypted by the DBMS, which makes it more secure.
• Concurrent access to data: Data can be accessed concurrently by different users at
the same time in a DBMS.
• Backup and Recovery mechanism: DBMS backup and recovery mechanism helps to
avoid data loss and data inconsistency in case of catastrophic failures.
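The "multiple data views" advantage above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table name `faculty`, the two view names, and the sample row are assumptions made for the example. SQLite has no user accounts, so the sketch shows how each view exposes different columns, not how access to a view is restricted.

```python
import sqlite3

# In-memory database; all names and values below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE faculty (
        faculty_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        salary     REAL NOT NULL
    );
    INSERT INTO faculty VALUES (1, 'A. Rao', 85000.0);

    -- Student view: the salary column is left out.
    CREATE VIEW faculty_student_view AS
        SELECT faculty_id, name FROM faculty;

    -- Admin view: every column is visible.
    CREATE VIEW faculty_admin_view AS
        SELECT faculty_id, name, salary FROM faculty;
""")

# cursor.description lists the columns a query returns.
student_cols = [d[0] for d in
                conn.execute("SELECT * FROM faculty_student_view").description]
admin_cols = [d[0] for d in
              conn.execute("SELECT * FROM faculty_admin_view").description]

print("student view columns:", student_cols)
print("admin view columns:", admin_cols)
```

Both views read from the same stored row, so the data is kept in one place while each group of users sees only the columns meant for it.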
Application of DBMS
A database management system is used in many different fields. The following are a few
applications that use a DBMS –
1. Railway Reservation System –
In a railway reservation system, the database is needed to store the
records of ticket bookings and the arrival and departure status of trains.
Additionally, if a train runs late, passengers learn of it through
database updates.
3. Banking –
A database management system is used to store the transaction data of customers
in the database.
4. Education Sector –
At present, many schools and colleges conduct examinations online. They manage
all examination data through a database management system (DBMS). In
addition, student enrollment details, grades, courses, fees, attendance,
results, and so on are all stored in the database.
7. Telecommunications –
No telecommunications company can operate without a DBMS. A database
management system is essential for these companies to store call details
and monthly postpaid bills in the database.
8. Finance –
A database management system is used for storing data about
sales, holdings, and purchases of financial instruments, such as stocks and bonds,
in a database.
9. Online Shopping –
These days, online shopping has become a major trend. Many people prefer not to spend
time visiting a shop and instead shop from home through online
shopping sites (for example, Amazon, Flipkart, Snapdeal). All the products
are added and sold with the help of a database management
system (DBMS). Invoices, payments, and purchase data are all handled with the
help of a DBMS.
11. Manufacturing –
Manufacturing companies make various kinds of products and sell them regularly.
A database management system (DBMS) is used to keep data about their products,
such as bills, purchases, quantities, and supply chain
management.
The purpose of database systems is to make the database user-friendly and its operations easy. Users
can easily insert, update, and delete data. The main purpose, however, is to have more control over the data.
A database system avoids keeping multiple copies of the same data by maintaining the data in a single
repository. It also keeps the database consistent.
A database system makes data access easy to manage: data can be retrieved from the database
through different queries.
Data isolation:
Atomicity of updates:
In case of a power failure, the database might lose data. Atomic updates guard against this kind of data
loss.
Concurrent access:
Multiple users can access the database at the same time.
Security problems:
Database systems restrict access, so the data is less vulnerable.
A database system can support multiple views of the data so that each user gets the view they need. Only
database administrators have a complete view of the database. End users are not given the view that
developers have.
Views of data
The view of data in a DBMS describes how the data is visualized at each level of data abstraction. Data
abstraction allows developers to keep complex data structures away from the users. The developers
achieve this by hiding the complex data structures through levels of abstraction.
View of Data in DBMS
1. Data Abstraction
2. Data Independence
3. Instance and Schema
Data Abstraction
Data abstraction is hiding the complex data structures in order to simplify the user’s interface to
the system. It is done because many of the users interacting with the database system do not have
enough computer training to understand its complex data structures.
To achieve data abstraction, we will discuss the Three-Schema architecture, which abstracts the
database at the three levels discussed below:
Three-Schema Architecture:
The main objective of this architecture is to have an effective separation between the user
interface and the physical database. So, the user never has to be concerned with the internal
storage of the database and has a simplified interaction with the database system.
1. Physical Level/ Internal Level
The physical or internal level schema describes how the data is stored in the hardware. It also
describes how the data can be accessed. The physical level is the lowest level of data abstraction,
and it has complex data structures. Only the database administrator operates at this level.
2. Logical Level/ Conceptual Level
It is a level above the physical level. Here, the data is described in terms of entity sets, entities,
their data types, the relationships among the entity sets, the user operations performed to retrieve or
modify the data, and certain constraints on the data. Adding constraints to the view of data adds
security, as users are restricted to accessing only particular parts of the database.
It is the developer and the database administrator who operate at the logical or conceptual level.
One more feature to keep in mind is data independence: changing the
data schema at one level of the database must not require modifying the data schema at the next level.
Data abstraction, data independence, and instance and schema are each discussed in detail in this
section.
3. View Level/ User level/ External level
It is the highest level of data abstraction and exhibits only a part of the whole database. It exhibits the
data in which the user is interested. The view level can describe many views of the same data. Here,
the user retrieves the information from the database using different applications.
In the figure above you can clearly distinguish between the three levels of abstraction. To understand
it more clearly let us take an example:
We have to create a database of a college. Now, what entity sets would be involved? Student,
Lecturer, Department, Course and so on…
Now, the entity sets Student, Lecturer, Department, and Course will be stored in storage as
consecutive blocks of memory locations. This is the physical or internal level. It is hidden
from the programmers, but the database administrator is aware of it.
At the logical level, the programmers define the entity sets and the relationships among them
using a language like SQL. So, the programmers work at the logical level, and the
database administrator also operates at this level.
At the view level, the users have the set of applications which they use to retrieve the data they are
interested in.
Data Independence
Data independence defines the extent to which the data schema can be changed at one level without
modifying the data schema at the next level. Data independence can be classified as shown below:
Logical data independence describes the degree to which the logical or conceptual schema can be
changed without modifying the external schema. Why would the data schema need to change
at the logical or conceptual level?
Changes to the data schema at the logical level are made to enlarge or reduce the
database by adding or deleting entities or entity sets, or by changing the constraints on the data.
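Logical data independence can be illustrated with a small sketch using Python's built-in sqlite3 module. The table `student`, the view `student_names`, and the added `email` column are hypothetical names chosen for the example. The view plays the role of the external schema, and adding an attribute to the table plays the role of a logical-level change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL);
    INSERT INTO student VALUES (1, 'Asha');
    -- External schema: the only thing this user group sees.
    CREATE VIEW student_names AS SELECT roll_no, name FROM student;
""")

before = conn.execute("SELECT * FROM student_names").fetchall()

# Logical-level change: add a new attribute to the entity set.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")

after = conn.execute("SELECT * FROM student_names").fetchall()

# The view returns the same rows and columns as before the change.
print(before == after)
```

An application written against `student_names` does not need to be modified after the column is added, which is the property the paragraph above describes.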
Physical data independence defines the extent to which the data schema can be changed at the
physical or internal level without modifying the data schema at the logical and view levels.
The physical schema is changed if we add storage to the system or reorganize
some files to improve the retrieval speed of the records.
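As a rough illustration of physical data independence, the sketch below uses Python's built-in sqlite3 module and adds an index, which changes how rows are located on disk but not the logical schema. The table, index name, and sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    INSERT INTO student VALUES (1, 'Asha', 10), (2, 'Ravi', 20), (3, 'Meena', 10);
""")

query = "SELECT name FROM student WHERE dept_id = 10 ORDER BY roll_no"
before = conn.execute(query).fetchall()

# Physical-level change: an index to speed up lookups by dept_id.
conn.execute("CREATE INDEX idx_student_dept ON student (dept_id)")

after = conn.execute(query).fetchall()

# The same query, unchanged, still returns the same answer.
print(before == after)
```

The query text and the table definition are untouched; only the access path available to the DBMS has changed.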
What is an instance?
We can define an instance as the information stored in the database at a particular point of time. Let
us discuss it with the help of an example.
As discussed above, the database comprises several entity sets and the relationships between
them. The data in the database keeps changing with time as we insert data into and delete
data from the database.
What is schema?
Whenever we talk about the database, the developers have to deal with both the definition of the database and
the data in the database.
The definition of a database comprises a description of what data it will contain and what the
relationships between the data will be. This definition is the database schema.
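The difference between schema and instance can be sketched with Python's built-in sqlite3 module. The helper functions `schema` and `instance` and the `student` table are hypothetical, introduced only for this illustration. The stored CREATE statement stands in for the schema, and the rows present at a given moment stand in for the instance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")

def schema(connection):
    # The table definition that SQLite keeps in its catalog.
    return connection.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'student'"
    ).fetchone()[0]

def instance(connection):
    # The rows stored in the table at this point in time.
    return connection.execute("SELECT * FROM student ORDER BY roll_no").fetchall()

schema_before, instance_before = schema(conn), instance(conn)

conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.execute("INSERT INTO student VALUES (2, 'Ravi')")
conn.execute("DELETE FROM student WHERE roll_no = 1")

schema_after, instance_after = schema(conn), instance(conn)

print("schema unchanged:", schema_before == schema_after)
print("instance changed:", instance_before != instance_after)
```

Inserting and deleting rows produces a new instance each time, while the schema stays the same until a definition statement changes it.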
3. System Analyst :
System Analyst is a user who analyzes the requirements of parametric end users. They
check whether all the requirements of end users are satisfied.
4. Sophisticated Users :
Sophisticated users can be engineers, scientists, or business analysts who are familiar with
the database. They can develop their own database applications according to their
requirements. They don’t write program code; instead, they interact with the database by
writing SQL queries directly through the query processor.
6. Application Programmers :
Application programmers are the back-end programmers who write the code for the
application programs. They are computer professionals. These programs could be
written in programming languages such as Visual Basic, Developer, C, FORTRAN,
COBOL, etc.
Database Languages
Database languages are used to read, update, manipulate, and store data in a database. The following are
the database languages −
• Data Definition Language
• Data Manipulation Language
• Data Control Language
• Transaction Control Language
Let us begin with Data Definition Language:
Data Definition Language
This language is used to create databases and tables, alter them, and so on. With it, you can also rename
or drop them. It specifies the database schema.
The DDL statements include CREATE, ALTER, RENAME, and DROP.
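As a minimal sketch of DDL in use, the following Python snippet runs create, alter, rename, and drop statements through the built-in sqlite3 module. SQLite's dialect is an assumption of this example, so exact syntax may differ in other DBMS products. The table names `course` and `subject` and the helper `table_names` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names(connection):
    # List the tables currently defined in the schema.
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]

# CREATE: define a new table.
conn.execute("CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT)")
after_create = table_names(conn)

# ALTER: add a column, then rename the table.
conn.execute("ALTER TABLE course ADD COLUMN credits INTEGER")
conn.execute("ALTER TABLE course RENAME TO subject")
after_rename = table_names(conn)

# DROP: remove the table and its definition.
conn.execute("DROP TABLE subject")
after_drop = table_names(conn)

print(after_create, after_rename, after_drop)
```

Each statement changes the schema itself and none of them inserts or reads rows, which is what separates DDL from data manipulation.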
The logical model concentrates on the data requirements and the data to be stored independent of
physical considerations. It does not concern itself with how the data will be stored or where it will be
stored physically.
The physical data design model involves translating the logical design of the database onto physical
media using hardware resources and software systems such as database management systems (DBMS).
The database development life cycle has a number of stages that are followed when developing
database systems.
The steps in the development life cycle do not necessarily have to be followed religiously in a
sequential manner.
On small database systems, the process of database design is usually very simple and does not involve
a lot of steps.
To fully appreciate the above diagram, let’s look at the individual components listed in each
step.
Requirements analysis
• Planning – This stage is concerned with planning the entire
Database Development Life Cycle. It takes into consideration the Information Systems
strategy of the organization.
• System definition – This stage defines the scope and boundaries of the proposed database
system.
Database designing
• Logical model – This stage is concerned with developing a database model based on
requirements. The entire design is on paper without any physical implementations or specific
DBMS considerations.
• Physical model – This stage implements the logical model of the database taking into
account the DBMS and physical implementation factors.
Implementation
• Data conversion and loading – This stage is concerned with
importing and converting data from the old system into the new database.
• Testing – This stage is concerned with the identification of errors in the newly implemented
system. It checks the database against requirement specifications.
1. Normalization
2. ER Modeling
Design process
Database design process is a series of instructions detailing the creation of tables, attributes, domains,
views, indexes, security constraints, and storage and performance guidelines. In this process, you
stage, more details about the data model design are determined and documented. You could think of
the conceptual design as the overall data as seen by the end-user, the logical design as the data as seen
by the DBMS, and the physical design as the data as seen by the operating system’s storage
management devices.
1. Conceptual design
Conceptual design is the process of designing a conceptual data model that describes the main data
entities, attributes, relationships, and constraints of a given problem domain and represents real-world
objects in the most realistic way possible. It must embody a clear understanding of the business and its
The process should follow a minimal data rule: All that is needed is there, and all that is there is needed.
However, as you apply the minimal data rule, avoid excessive short-term bias. Focus not only on the
immediate data needs of the business but on future data needs. Thus, the database design must leave
room for future modifications and additions, ensuring that the business’s investment in information
In order to design appropriate data element characteristics that can be transformed into appropriate
In this step, we could communicate and enforce appropriate standards to be used in the documentation
of the design (diagrams and symbols, documentation writing style, layout, and any other conventions).
Designers often overlook this very important requirement, especially when they are working as
After that, the designer could incorporate business rules into the conceptual model using ER diagrams
The activities often take place in parallel, and the process is iterative until you are satisfied that the ER
model accurately represents a database design that can meet the required system demands
Below is the array of design tools and information sources that the designer can use to produce the
conceptual model
All objects (entities, attributes, relations, views, and so on) are defined in a data dictionary, which is
used in tandem with the normalization process to help eliminate data anomalies and redundancy
The naming conventions requirement is important, yet it is frequently ignored at the designer’s risk.
Real database design is generally accomplished by teams. Therefore, it is important to ensure that team
members work in an environment in which naming standards are defined and enforced. Proper
documentation is crucial to the successful completion of the design, and adherence to the naming
conventions serves database designers well. In fact, a common refrain from users seems to be: “I didn’t
know why you made such a fuss over naming conventions, but now that I’m doing this stuff for real,
It is one of the last steps in the conceptual design stage, and it is one of the most critical. In this step,
the ER model must be verified against the proposed system processes to corroborate that they can be
supported by the database model. Verification requires that the model be run through a series of tests
against:
• End-user data views.
• All required transactions: SELECT, INSERT, UPDATE, and DELETE operations.
• Access rights and security.
• Business-imposed data requirements and constraints.
Because real-world database design is generally done by teams, the database design is probably divided
into major components known as modules. A module is an information system component that handles
a specific business function, such as inventory, orders, or payroll. Under these conditions, each module
As useful as modules are, they represent a loose collection of ER model fragments that could wreak
havoc in the database if left unchecked. For example, the ER model fragments:
• Might present overlapping, duplicated, or conflicting views of the same data.
• Might not be able to support all processes in the system’s modules.
To avoid these problems, it is better if the modules’ ER fragments are merged into a single enterprise
ER model. This process starts by selecting a central ER model segment and iteratively adding more ER
model segments one at a time. At each stage, for each new entity added to the model, you need to
validate that the new entity does not overlap or conflict with a previously identified entity in the
enterprise ER model.
Merging the ER model segments into an enterprise ER model triggers a careful reevaluation of the
entities, followed by a detailed examination of the attributes that describe those entities. This process
After finishing the merging process, the resulting enterprise ER model is verified against each of the
module’s processes
1. Identify the ER model’s central entity.
2. Identify each module and its components.
3. Identify each module’s transaction requirements:
Internal: updates/inserts/deletes/queries/reports
External: module interfaces
4. Verify all processes against system requirements.
5. Make all necessary changes suggested in Step 4.
6. Repeat Steps 2–5 for all modules.
Keep in mind that this process requires the continuous verification of business transactions as well as
system and user requirements. The verification sequence must be repeated for each of the system’s
modules.
The verification process starts with selecting the central (most important) entity, which is the focus for
To identify the central entity, the designer selects the entity involved in the greatest number of the
model’s relationships. In the ER diagram, it is the entity with more lines connected to it than any other.
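The rule for picking the central entity can be sketched in a few lines of Python. The relationship list below is a made-up college model and the function name `central_entity` is hypothetical. The sketch counts how many relationships each entity takes part in and returns the entity with the highest count.

```python
from collections import Counter

# Hypothetical relationships, each given as a pair of entity names.
relationships = [
    ("Student", "Course"),
    ("Lecturer", "Course"),
    ("Department", "Course"),
    ("Department", "Lecturer"),
]

def central_entity(pairs):
    """Return the entity that takes part in the most relationships."""
    degree = Counter()
    for left, right in pairs:
        degree[left] += 1
        degree[right] += 1
    entity, _count = degree.most_common(1)[0]
    return entity

center = central_entity(relationships)
print(center)
```

When two entities are tied, this sketch returns whichever was counted first. In practice the designer would settle a tie by judging which entity matters most to the business.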
The next step is to identify the module or subsystem to which the central entity belongs and to define
that module’s boundaries and scope. The entity belongs to the module that uses it most frequently.
Once each module is identified, the central entity is placed within the module’s framework to let you
All identified processes must be verified against the ER model. If necessary, appropriate changes are
implemented. The process verification is repeated for all of the model’s modules. You can expect that
additional entities and attributes will be incorporated into the conceptual model during its validation.
At this point, a conceptual model has been defined as hardware- and software-independent. Such
independence ensures the system’s portability across platforms. Portability can extend the database’s
Although not a requirement for most databases, some may need to be distributed among multiple
geographical locations. Processes that access the database may also vary from one location to another.
For example, a retail process and a warehouse storage process are likely to be found in different
physical locations. If the database data and processes will be distributed across the system, portions of
a database, known as database fragments, may reside in several physical locations. A database
fragment is a subset of a database that is stored at a given location. The database fragment may be a
Distributed database design defines the optimum allocation strategy for database fragments to ensure
database integrity, security, and performance. The allocation strategy determines how to partition the
DBMS software selection is critical, so the advantages and disadvantages of each candidate should be studied carefully.
The goal of logical design is to design an enterprise-wide database that is based on a specific data model but
independent of physical-level details. It requires that all objects in the conceptual model be mapped to
Physical design is the process of determining the data storage organization and data access
characteristics of the database to ensure its integrity, security, and performance. The storage
characteristics are a function of the types of devices supported by the hardware, the type of data access
• Entities − An entity is a real-world thing, which can be a person, place, or even a concept. For example:
Department, Admin, Courses, Teachers, Students, Building, etc. are some of the entities of a
School Management System.
• Attributes − An entity has real-world properties called attributes. For example:
the entity employee has properties such as employee id, salary, age, etc.
• Relationship − A relationship tells how two entities are related. For example: an employee
works for a department.
An entity has a real-world property called attribute and these attributes are defined by a set of values
called domain.
Example 1
In a university,
• A student is an entity,
• The relationships among entities define the logical association between entities.
Example 2
Given below is another example of ER:
• Database Design − This model helps the database designers to build the database.
Advantages
The advantages of ER are as follows −
• This model is widely used by database designers for communicating their ideas.
• This model can easily be converted to any other model, such as the network model or the hierarchical model.
ER diagram
ER Diagram Symbols
• Entities
• Attributes
• Relationships
ER Diagram Examples
For example, in a University database, we might have entities for Students, Courses, and Lecturers.
Students entity can have attributes like Rollno, Name, and DeptID. They might have relationships
with Courses and Lecturers.
Components of the ER Diagram
WHAT IS ENTITY?
An entity is a real-world thing, either living or non-living, that can be identified, whether or not it is easily
recognizable. It is anything in the enterprise that is to be represented in our database. It may be a physical
thing or simply a fact about the enterprise or an event that happens in the real world.
An entity can be a place, person, object, event, or concept about which data is stored in the database.
Every entity must have attributes and a unique key. Every entity is made up of
some ‘attributes’ which represent that entity.
Examples of entities:
Notation of an Entity
Entity set:
Student
An entity set is a group of entities of a similar kind. It may contain entities whose attributes share similar
values. Entities are represented by their properties, which are also called attributes. Each attribute has
its own value. For example, a student entity may have name, age, and class as attributes.
Example of Entities:
A university may have some departments. All these departments employ various lecturers and offer
several programs.
Some courses make up each program. Students register in a particular program and enroll in various
courses. A lecturer from the specific department takes each course, and each lecturer teaches a different
group of students.
Relationship
Relationship is nothing but an association among two or more entities. E.g., Tom works in the
Chemistry department.
Entities take part in relationships. We can often identify relationships with verbs or verb phrases.
For example:
Weak Entities
A weak entity is a type of entity that doesn’t have a key attribute of its own. It can be identified uniquely only by
considering the primary key of another entity. For that, weak entity sets need to have total participation in
the identifying relationship.
In above ER Diagram examples, “Trans No” is a discriminator within a group of transactions in an
ATM.
Let’s learn more about a weak entity by comparing it with a Strong Entity
Strong Entity Set | Weak Entity Set
It contains a primary key, represented by an underline. | It contains a partial key, represented by a dashed underline.
A member of a strong entity set is called a dominant entity. | A member of a weak entity set is called a subordinate entity.
The primary key is one of its attributes and identifies its members. | The primary key is a combination of the primary key of the strong entity set and the partial key of the weak entity set.
The line connecting the strong entity set with the relationship is single. | The line connecting the weak entity set with the identifying relationship is double.
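A weak entity set can be sketched in SQL through Python's built-in sqlite3 module. The tables `atm` and `atm_transaction` and their columns are hypothetical names loosely based on the ATM example above. The partial key `trans_no` is unique only within one ATM, so the primary key of the weak entity combines it with the owner's key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    -- Strong entity set: has its own primary key.
    CREATE TABLE atm (atm_id INTEGER PRIMARY KEY, location TEXT);

    -- Weak entity set: trans_no alone is not unique.
    CREATE TABLE atm_transaction (
        atm_id   INTEGER NOT NULL REFERENCES atm (atm_id),
        trans_no INTEGER NOT NULL,
        amount   REAL,
        PRIMARY KEY (atm_id, trans_no)
    );
    INSERT INTO atm VALUES (1, 'Main St'), (2, 'Campus');
    -- The same trans_no may appear under different ATMs.
    INSERT INTO atm_transaction VALUES (1, 1, 50.0), (2, 1, 20.0);
""")

try:
    # Same ATM and same trans_no: violates the combined key.
    conn.execute("INSERT INTO atm_transaction VALUES (1, 1, 75.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

try:
    # No owning ATM with id 99: a weak entity cannot exist on its own.
    conn.execute("INSERT INTO atm_transaction VALUES (99, 1, 10.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

stored = conn.execute(
    "SELECT atm_id, trans_no FROM atm_transaction ORDER BY atm_id"
).fetchall()
print(duplicate_rejected, orphan_rejected, stored)
```

The NOT NULL foreign key reflects the requirement that every weak entity takes part in the relationship with its owning strong entity.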
Attributes
It is a single-valued property of either an entity-type or a relationship-type.
For example, a lecture might have attributes: time, date, duration, place, etc.
Multivalued attribute: Multivalued attributes can have more than one value. For example,
a student can have more than one mobile number, email address,
etc.
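One common way to store a multivalued attribute in a relational database is to give it a table of its own. This is a design choice assumed for the sketch, not something the notes prescribe. It uses Python's built-in sqlite3 module, and the tables `student` and `student_phone` and the phone numbers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- One row per (student, phone) pair holds the multivalued attribute.
    CREATE TABLE student_phone (
        roll_no INTEGER NOT NULL REFERENCES student (roll_no),
        phone   TEXT NOT NULL,
        PRIMARY KEY (roll_no, phone)
    );

    INSERT INTO student VALUES (1, 'Asha');
    INSERT INTO student_phone VALUES (1, '555-0101'), (1, '555-0102');
""")

phones = [row[0] for row in conn.execute(
    "SELECT phone FROM student_phone WHERE roll_no = 1 ORDER BY phone"
)]
print(phones)
```

The student row itself stays single-valued, and any number of phone numbers can be attached without changing the `student` table.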
Cardinality
Defines the numerical attributes of the relationship between two entities or entity sets.
• One-to-One Relationships
• One-to-Many Relationships
• Many-to-One Relationships
• Many-to-Many Relationships
1.One-to-one:
One entity from entity set X can be associated with at most one entity of entity set Y and vice versa.
Example: Each course is taught by a single professor, and a professor delivers only one course.
2.One-to-many:
One entity from entity set X can be associated with multiple entities of entity set Y, but an entity from
entity set Y can be associated with at most one entity of entity set X.
Example: One student can register for numerous courses. However, all those courses have a single
line back to that one student.
3.Many-to-one:
More than one entity from entity set X can be associated with at most one entity of entity set Y.
However, an entity from entity set Y may or may not be associated with more than one entity from
entity set X.
4. Many to Many:
One entity from X can be associated with more than one entity from Y and vice versa.
For example, Students as a group are associated with multiple faculty members, and faculty members
can be associated with multiple students.
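A many-to-many relationship is usually stored in a separate linking table, which the sketch below illustrates with Python's built-in sqlite3 module. The tables `student`, `faculty`, and `student_faculty` and the sample names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE faculty (faculty_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Linking table: one row per (student, faculty member) association.
    CREATE TABLE student_faculty (
        student_id INTEGER NOT NULL REFERENCES student (student_id),
        faculty_id INTEGER NOT NULL REFERENCES faculty (faculty_id),
        PRIMARY KEY (student_id, faculty_id)
    );

    INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO faculty VALUES (10, 'Rao'), (20, 'Iyer');
    INSERT INTO student_faculty VALUES (1, 10), (1, 20), (2, 10);
""")

# One student is associated with several faculty members...
faculty_of_asha = [row[0] for row in conn.execute("""
    SELECT f.name FROM faculty AS f
    JOIN student_faculty AS sf ON sf.faculty_id = f.faculty_id
    WHERE sf.student_id = 1 ORDER BY f.name
""")]

# ...and one faculty member with several students.
students_of_rao = [row[0] for row in conn.execute("""
    SELECT s.name FROM student AS s
    JOIN student_faculty AS sf ON sf.student_id = s.student_id
    WHERE sf.faculty_id = 10 ORDER BY s.name
""")]

print(faculty_of_asha, students_of_rao)
```

Neither `student` nor `faculty` holds a reference to the other. The association lives entirely in `student_faculty`, so both sides can have many partners.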
In a university, a Student enrolls in Courses. A student must be assigned to at least one
Course. Each course is taught by a single Professor. To maintain instruction quality, a Professor can
deliver only one course.
Step 1) Entity Identification
We have three entities
• Student
• Course
• Professor
Once you have a list of attributes, you need to map them to the identified entities. Ensure that each attribute
is paired with exactly one entity. If you think an attribute should belong to more than one entity,
use a modifier to make it unique.
Once the mapping is done, identify the primary Keys. If a unique key is not readily available, create
one.
For Course Entity, attributes could be Duration, Credits, Assignments, etc. For the sake of ease we
have considered just one attribute.
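One possible translation of the Student, Course, and Professor example into tables is sketched below with Python's built-in sqlite3 module. The column names, the `enrollment` table, and the sample rows are assumptions made for the illustration. The UNIQUE constraint on `professor_id` expresses the rule that a professor delivers only one course. The rule that every student has at least one course is not enforced by this schema and would need a separate check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE professor (professor_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Each course has exactly one professor; UNIQUE stops a professor
    -- from being attached to a second course.
    CREATE TABLE course (
        course_id    INTEGER PRIMARY KEY,
        course_name  TEXT NOT NULL,
        professor_id INTEGER NOT NULL UNIQUE REFERENCES professor (professor_id)
    );

    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- A student may enroll in several courses.
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL REFERENCES student (student_id),
        course_id  INTEGER NOT NULL REFERENCES course (course_id),
        PRIMARY KEY (student_id, course_id)
    );

    INSERT INTO professor VALUES (1, 'Prof. Rao'), (2, 'Prof. Iyer');
    INSERT INTO course VALUES (100, 'Databases', 1), (200, 'Networks', 2);
    INSERT INTO student VALUES (1, 'Asha');
    INSERT INTO enrollment VALUES (1, 100), (1, 200);
""")

try:
    # Professor 1 already teaches Databases, so a second course is refused.
    conn.execute("INSERT INTO course VALUES (300, 'Compilers', 1)")
    second_course_rejected = False
except sqlite3.IntegrityError:
    second_course_rejected = True

courses_of_asha = [row[0] for row in conn.execute("""
    SELECT c.course_name FROM course AS c
    JOIN enrollment AS e ON e.course_id = c.course_id
    WHERE e.student_id = 1 ORDER BY c.course_name
""")]
print(second_course_rejected, courses_of_asha)
```

The primary keys `student_id`, `course_id`, and `professor_id` are created keys of the kind the text suggests when no natural unique key is readily available.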