Unit 1
INTRODUCTION TO DATABASE:
Data are binary computer representations of stored logical entities. They are distinct pieces of
information, usually formatted in a special way. Data is the plural of datum, a single piece of
information.
In a DBMS, data files are the files that store database information, whereas other files such as index
files and data dictionaries store administrative information known as metadata.
Data Items
Relationships
Constraints
Schema
Internal Schema
UNIT-1 BCA402: DATABASE MANAGEMENT SYSTEM(DBMS)
The Schema defines various views of the database for the use of the various components of the DBMS
and for the applications’ security. A schema separates the physical aspects of data storage from the
logical aspects of data representation.
The Internal schema defines how and where data are organized in physical data storage.
The Conceptual schema defines the stored data structures in terms of database model used.
The External schema defines a view or views of the database for particular users.
A Database Management System provides services for accessing the database while maintaining the
required correctness and consistency of the stored data.
DBMS. Moreover, it also restores the database after a crash or system failure to its previous
condition.
7. Data Consistency: Data consistency is easier to ensure in a database because data
redundancy is controlled. All data appears consistently across the database, and the data is the
same for all users viewing the database. Moreover, any changes made to the database are
immediately reflected to all users, so no data inconsistency arises.
To retrieve a field’s data, we need to traverse through each tree until the record is found.
The hierarchical database structure was developed by IBM in the 1960s. While
the hierarchical structure is simple, it is inflexible because of the parent-child one-to-many
relationship. Hierarchical databases are widely used to build high-performance, high-availability
applications, usually in the banking and telecommunications industries.
The IBM Information Management System (IMS) and Windows Registry are two popular
examples of hierarchical databases.
Advantage
A hierarchical database can be accessed and updated rapidly. As shown in the figure above,
its model structure is like a tree and the relationships between records are defined in advance.
This feature is a double-edged sword.
Disadvantage
The drawback of this type of database structure is that each child in the tree may have only
one parent. Relationships or linkages between children are not permitted, even if they make
sense from a logical standpoint. This rigidity is inherent in the design of hierarchical
databases. Adding a new field or record requires that the entire database be redefined.
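The single-parent rule and the traversal-based lookup described above can be sketched in a few lines of Python. This is only an illustration, not how IMS or any real hierarchical DBMS is implemented; the `Record` class, the `find` function, and the record names are all hypothetical.

```python
from typing import List, Optional


class Record:
    """One node of the hierarchy: many children, but at most one parent."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.parent: Optional["Record"] = None
        self.children: List["Record"] = []

    def add_child(self, child: "Record") -> None:
        # Enforce the single-parent rule of the hierarchical model.
        if child.parent is not None:
            raise ValueError(f"{child.name} already has a parent")
        child.parent = self
        self.children.append(child)


def find(root: Record, name: str) -> Optional[Record]:
    """Depth-first traversal from the root until the record is found."""
    if root.name == name:
        return root
    for child in root.children:
        hit = find(child, name)
        if hit is not None:
            return hit
    return None


# Hypothetical sample hierarchy: Company -> departments -> employee.
company = Record("Company")
sales = Record("Sales")
finance = Record("Finance")
employee = Record("Employee-1")
company.add_child(sales)
company.add_child(finance)
sales.add_child(employee)

print(find(company, "Employee-1").parent.name)  # Sales
```

Attaching `employee` to `finance` as well raises a `ValueError`, which is exactly the limitation that the network model removes.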
2. NETWORK MODEL: The Network Model replaces the hierarchical tree with a graph, thus
allowing more general connections among the nodes. The main difference between the network
model and the hierarchical model is its ability to handle many-to-many (m:n) relationships. In
other words, it allows records to have more than one parent. Suppose an employee works
for two departments. The strict hierarchical arrangement is not possible here, and the tree
becomes a more generalized graph, a network. Logical proximity fails because you cannot
place a data item simultaneously in two locations in the list.
Although it is possible to handle such situations in a hierarchical model, the model becomes
more complicated and difficult to comprehend. The network model evolved specifically to
handle non-hierarchical relationships.
In network database terminology, a relationship is a set. Each set is made up of at least two
types of records: an owner record (equivalent to the parent in the hierarchical model) and a
member record (similar to the child record in the hierarchical model). The difference between
the hierarchical model and the network model is that the network model allows a record to
appear as a member in more than one set, thus facilitating many-to-many relationships.
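The employee who works for two departments can be sketched as a record that is a member of two sets. This is a loose Python illustration of the owner/member idea, not a real network DBMS; the `works_in` mapping, the helper functions, and the employee names are hypothetical.

```python
from collections import defaultdict

# Each department (owner record) owns a set of employees (member records).
works_in = defaultdict(list)


def connect(owner: str, member: str) -> None:
    """Add a member record to the set owned by the given owner record."""
    works_in[owner].append(member)


def owners_of(member: str) -> list:
    """Return every owner whose set contains the member."""
    return sorted(o for o, members in works_in.items() if member in members)


# Hypothetical data: Asha works for two departments.
connect("Sales", "Asha")
connect("Finance", "Asha")
connect("Finance", "Ravi")

print(owners_of("Asha"))  # ['Finance', 'Sales']
```

Because the same member can appear under several owners, the structure is a graph rather than a tree.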
Structured Query Language (SQL) is the language used to query an RDBMS, including inserting,
updating, deleting, and searching records. In a relational database, each table has a
key field that uniquely identifies each row. These key fields can be used to connect one table
of data to another.
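As a minimal sketch of how key fields connect tables, the following uses Python's built-in `sqlite3` module with an in-memory database. The table names, column names, and rows are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,   -- key field of department
        dept_name TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,     -- key field of employee
        name    TEXT NOT NULL,
        dept_id INTEGER REFERENCES department(dept_id)
    );
    -- Hypothetical sample rows.
    INSERT INTO department VALUES (1, 'Finance'), (2, 'Marketing');
    INSERT INTO employee VALUES (10, 'Asha', 1), (11, 'Ravi', 2);
""")

# The shared dept_id key connects each employee row to its department row.
rows = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee AS e
    JOIN department AS d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
""").fetchall()

print(rows)  # [('Asha', 'Finance'), ('Ravi', 'Marketing')]
```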
Relational databases are the most popular and widely used databases. Some of the popular
RDBMS are Oracle, SQL Server, MySQL, SQLite, and IBM DB2.
4. OBJECT-ORIENTED MODEL: Object databases are quite different. For the most part, object
database design is a fundamental part of the overall application design process. The object
classes used by the programming language are the classes used by the ODBMS. Because their
models are consistent, there is no need to transform the program’s object model into something
unique for the database manager.
The object-oriented model represents an entity as a class. A class represents both the
attributes and the behavior of the entity. For example, a Book class will have not only
attributes such as ISBN, Title, Author, etc., but also procedures that imitate actions expected
of a book, such as UpdatePrice (updating the price). Instances of the class (objects) correspond
to individual books. Within an object, the class attributes take specific values, which distinguish
one book (object) from another. However, the behavior patterns of the class are shared by all
the objects belonging to the class.
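The Book example can be written as a small Python class. This is a sketch of the idea only; the `price` attribute, the validation rule, and the sample values are assumptions, and `update_price` stands in for the UpdatePrice procedure mentioned above.

```python
class Book:
    """An entity modeled as a class: attributes plus shared behavior."""

    def __init__(self, isbn: str, title: str, author: str, price: float) -> None:
        self.isbn = isbn
        self.title = title
        self.author = author
        self.price = price

    def update_price(self, new_price: float) -> None:
        # Behavior defined once in the class and shared by every Book object.
        if new_price < 0:
            raise ValueError("price cannot be negative")
        self.price = new_price


# Two objects of the same class, distinguished by their attribute values.
# The ISBNs, titles, authors, and prices below are placeholders.
first = Book("ISBN-A", "Title A", "Author A", 300.0)
second = Book("ISBN-B", "Title B", "Author B", 450.0)

first.update_price(350.0)
print(first.price, second.price)  # 350.0 450.0
```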
Object-oriented databases manage objects (abstract data types). An object-oriented DBMS
(OODBMS) is suited for multimedia applications as well as data with complex relationships that
are difficult to model and process in a relational DBMS. Because any type of data can be
stored, an OODBMS allows for fully integrated databases that hold data, text, pictures, voice, and
video.
5. OBJECT-RELATIONAL MODEL: Object-relational systems combine the advantages of modern
object-oriented programming languages with relational database features such as multiple
views of data and a high-level, nonprocedural query language. An object-relational system is
a good long-term investment because its extenders provide the capabilities you need to
manage today’s specialized objects, and its object infrastructure gives you the ability to define
new types, functions, and rules to deal with the evolving needs of businesses. Some
of the object-relational systems available in the market are IBM’s DB2 Universal Servers,
Oracle Corporation’s Oracle8, Microsoft Corporation’s SQL Server 7, and so on.
DATA MODELS
Data models define how the logical structure of a database is modeled. Data models are the fundamental
means of introducing abstraction in a DBMS. They define how data items are connected to each other
and how they are processed and stored inside the system.
Entity-Relationship Model
o ER model stands for Entity-Relationship model. It is a high-level data model. This model is
used to define the data elements and relationships for a specified system.
o It develops a conceptual design for the database. It also provides a very simple,
easy-to-design view of data.
o In ER modeling, the database structure is portrayed as a diagram called an entity-relationship
diagram.
The Entity-Relationship (ER) Model is based on the notion of real-world entities and relationships
among them. When a real-world scenario is formulated as a database model, the ER Model creates
entity sets, relationship sets, general attributes, and constraints.
ER Model is best used for the conceptual design of a database.
ER Model is based on −
Entities and their attributes.
Relationships among entities.
These concepts are explained below.
database, a student is considered as an entity. Student has various attributes like name, age,
class, etc.
Relationship − The logical association among entities is called a relationship. Relationships are
mapped with entities in various ways. Mapping cardinalities define the number of associations
between two entities.
Mapping cardinalities −
o one to one
o one to many
o many to one
o many to many
Relational Model
The most popular data model in DBMS is the Relational Model. It is more formally grounded than the
other models: it is based on first-order predicate logic and defines a table as an n-ary relation.
Relational Model represents how data is stored in Relational Databases. A relational database stores
data in the form of relations (tables). Consider a relation STUDENT with attributes ROLL_NO, NAME,
ADDRESS, PHONE and AGE shown in Table 1.
STUDENT
ROLL_NO   NAME     ADDRESS   PHONE        AGE
4         SURESH   DELHI                  18
IMPORTANT TERMINOLOGIES
1. Attribute: Attributes are the properties that define a relation, e.g., ROLL_NO, NAME.
2. Relation Schema: A relation schema represents the name of the relation with its attributes, e.g.,
STUDENT (ROLL_NO, NAME, ADDRESS, PHONE, AGE) is the relation schema for STUDENT. If a
schema has more than one relation, it is called a Relational Schema.
3. Tuple: Each row in the relation is known as a tuple. The above relation contains 4 tuples, one of
which is shown as:
1 RAM DELHI 9455123451 18
4. Relation Instance: The set of tuples of a relation at a particular instant of time is called the
relation instance. Table 1 shows the relation instance of STUDENT at a particular time. It can
change whenever there is an insertion, deletion, or update in the database.
5. Degree: The number of attributes in the relation is known as degree of the relation.
The STUDENT relation defined above has degree 5.
6. Cardinality: The number of tuples in a relation is known as cardinality. The STUDENT relation
defined above has cardinality 4.
7. Column: Column represents the set of values for a particular attribute. The column ROLL_NO is
extracted from relation STUDENT.
ROLL_NO
8. NULL Values: A value that is not known or is unavailable is called a NULL value. It is represented
by a blank space, e.g., PHONE of the STUDENT having ROLL_NO 4 is NULL.
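The terminology above can be checked against a small in-memory version of the STUDENT relation. This Python sketch holds only the two tuples that are visible in this text (ROLL_NO 1 and ROLL_NO 4), so its cardinality is 2 rather than the 4 of the full Table 1; representing NULL as Python's `None` is an assumption of the sketch.

```python
# Relation schema: STUDENT (ROLL_NO, NAME, ADDRESS, PHONE, AGE)
ATTRIBUTES = ("ROLL_NO", "NAME", "ADDRESS", "PHONE", "AGE")

# Relation instance: only the two tuples shown in the text; None models NULL.
student = [
    (1, "RAM", "DELHI", "9455123451", 18),
    (4, "SURESH", "DELHI", None, 18),
]

degree = len(ATTRIBUTES)        # number of attributes
cardinality = len(student)      # number of tuples in this instance

# Column: the set of values for one attribute.
roll_no_index = ATTRIBUTES.index("ROLL_NO")
roll_no_column = [t[roll_no_index] for t in student]

# NULL values: tuples whose PHONE is unknown.
phone_index = ATTRIBUTES.index("PHONE")
null_phone = [t[roll_no_index] for t in student if t[phone_index] is None]

print(degree, cardinality)  # 5 2
print(roll_no_column)       # [1, 4]
print(null_phone)           # [4]
```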
For example, let's say we have a single table student in the database. Today the table has 100 records,
so today the instance of the database has 100 records. If we add another 100 records to this table
by tomorrow, the instance of the database tomorrow will have 200 records in the table.
In short, the data stored in the database at a particular moment is called the instance, and it changes
over time as we add or delete data from the database.
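The 100-then-200 records example can be reproduced with Python's built-in `sqlite3` module. In this sketch the column names and the generated student names are assumptions; the point is that the instance grows while the table definition stays the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")


def table_definition(conn):
    """Column names and types of the student table (its structure)."""
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]


def record_count(conn):
    """Number of records in the current instance."""
    return conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]


def add_students(conn, first, last):
    # Placeholder names; only the number of records matters here.
    rows = [(i, f"student-{i}") for i in range(first, last + 1)]
    conn.executemany("INSERT INTO student VALUES (?, ?)", rows)


add_students(conn, 1, 100)
today_count = record_count(conn)
today_definition = table_definition(conn)

add_students(conn, 101, 200)
tomorrow_count = record_count(conn)
tomorrow_definition = table_definition(conn)

print(today_count, tomorrow_count)              # 100 200
print(today_definition == tomorrow_definition)  # True
```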
3. PHYSICAL LEVEL: At the lowest level, certain physical components organize and store
the raw data. Physical database design is the process of producing a description of
the implementation of the database on secondary storage; it describes the storage
structures and access methods used to achieve efficient access to the data. It is during
the physical database design process that the database designer decides how the
database is to be implemented.
DBMS ARCHITECTURE
The design of a DBMS depends on its architecture. It can be centralized, decentralized, or
hierarchical. The architecture of a DBMS can be seen as either single-tier or multi-tier. An
n-tier architecture divides the whole system into n related but independent modules, which
can be independently modified or replaced.
In 1-tier architecture, the DBMS is the only entity: the user works directly on the DBMS
and uses it. Any changes made here are applied directly to the DBMS itself. It does not
provide handy tools for end-users. Database designers and programmers normally prefer to
use single-tier architecture.
If the architecture of DBMS is 2-tier, then it must have an application through which the
DBMS can be accessed. Programmers use 2-tier architecture where they access the DBMS
by means of an application. Here the application tier is entirely independent of the database
in terms of operation, design, and programming.
3-tier Architecture
A 3-tier architecture separates its tiers from each other based on the complexity of the users
and how they use the data present in the database. It is the most widely used architecture
to design a DBMS.
Database (Data) Tier − At this tier, the database resides along with its query
processing languages. We also have the relations that define the data and their
constraints at this level.
Application (Middle) Tier − At this tier reside the application server and the programs
that access the database. For a user, this application tier presents an abstracted view
of the database. End-users are unaware of any existence of the database beyond the
application. At the other end, the database tier is not aware of any other user beyond
the application tier. Hence, the application layer sits in the middle and acts as a
mediator between the end-user and the database.
User (Presentation) Tier − End-users operate on this tier and they know nothing
about any existence of the database beyond this layer. At this layer, multiple views
of the database can be provided by the application. All views are generated by
applications that reside in the application tier.
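The separation between the three tiers can be sketched in Python, with one function or class per tier. This is a toy illustration under stated assumptions, not a production design: `open_database`, `StudentService`, and `render` are hypothetical names, and an in-memory SQLite database stands in for the data tier.

```python
import sqlite3


# Database (data) tier: holds the relations and their constraints.
def open_database():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
    )
    conn.execute("INSERT INTO student VALUES (1, 'RAM', 18)")
    return conn


# Application (middle) tier: the only code that talks to the database.
class StudentService:
    def __init__(self, conn):
        self._conn = conn

    def student_name(self, roll_no):
        row = self._conn.execute(
            "SELECT name FROM student WHERE roll_no = ?", (roll_no,)
        ).fetchone()
        return row[0] if row else None


# User (presentation) tier: sees only what the application tier exposes.
def render(service, roll_no):
    name = service.student_name(roll_no)
    if name is None:
        return f"Student {roll_no}: not found"
    return f"Student {roll_no}: {name}"


service = StudentService(open_database())
print(render(service, 1))  # Student 1: RAM
print(render(service, 2))  # Student 2: not found
```

The presentation function never issues SQL, and the database never sees the end-user, which mirrors the mediator role of the application tier.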
DATA INDEPENDENCE
A database system normally contains a lot of data in addition to users’ data. For example, it
stores data about data, known as metadata, to locate and retrieve data easily. It is rather
difficult to modify or update a set of metadata once it is stored in the database. But as a
DBMS expands, it needs to change over time to satisfy the requirements of the users. If all
the data were interdependent, making such changes would become a tedious and highly complex job.
Metadata itself follows a layered architecture, so that when we change data at one layer, it
does not affect the data at another layer. The layers are independent but mapped to each other.
Logical Data Independence
Logical data is data about the database; that is, it stores information about how data is
managed inside. For example, a table (relation) stored in the database and all the
constraints applied on that relation.
Logical data independence is a mechanism that decouples the logical data from the actual
data stored on the disk. If we make changes to the table format, it should not change the
data residing on the disk.
Physical Data Independence
All the schemas are logical, and the actual data is stored in bit format on the disk. Physical
data independence is the power to change the physical data without impacting the
schema or logical data.
For example, in case we want to change or upgrade the storage system itself − suppose
we want to replace hard-disks with SSD − it should not have any impact on the logical
data or schemas.
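A small-scale analogue of physical data independence is adding an index: the storage structures change, but the logical schema, the query, and its result do not. The sketch below uses Python's built-in `sqlite3` module; the index name and the sample rows are assumptions, and an index is only a stand-in for the larger storage changes described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "RAM", 18), (4, "SURESH", 18)],
)

QUERY = "SELECT roll_no FROM student WHERE name = ?"

before = conn.execute(QUERY, ("SURESH",)).fetchall()

# Physical change: a new access structure is added on disk.
conn.execute("CREATE INDEX idx_student_name ON student(name)")

# The same query, written against the same logical schema, still works.
after = conn.execute(QUERY, ("SURESH",)).fetchall()

print(before, after)  # [(4,)] [(4,)]
```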
DESIGN CONSTRAINTS
Database systems are designed to represent real-world systems. A database system
requires certain controls and limits for it to truly represent the real-world system’s behavior.
In other words, constraints enforce limits on the data or type of data that can be
inserted into, updated in, or deleted from a table. The whole purpose of constraints is to maintain
data integrity during an update, delete, or insert on a table.
Types of Constraints:
1. Structural Constraints: The structure of the information within the database gives an idea
about entities in the database. For example, simple data structures are represented using
simple structures while complex data structures will need advanced structures. Structural
constraints are specified to force the placement of information into structures that best
matches the application.
2. Type Constraints: A type constraint limits the application to only one representation of
information for an entity’s attribute. For example, the database designer might want to
limit the name attribute to a fixed-length character string, the age attribute to a number, etc.
3. Range Constraints: A range constraint limits the values an attribute can take. It refers to
the possible values a particular data item can have. Range constraints can be used to keep
the value of a particular attribute within a given range.
4. Relationship Constraints: A relationship constraint represents a relationship on values between
entities. For example, there could be a relationship constraint between the entities MANAGER
and EMPLOYEE that the maximum bonus of the manager should not be greater than 6 times
that of the employees.
5. Temporal Constraints: These indicate the time period for which some information is
valid. For example, the value of the attribute sales tax or excise duty is valid only for a
specific period.
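Type and range constraints of the kind listed above can be approximated with SQL CHECK clauses. The sketch below uses Python's built-in `sqlite3` module; the employee table, the 30-character name limit, and the 18 to 65 age range are assumptions chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT    NOT NULL CHECK (length(name) <= 30),       -- type: bounded string
        age    INTEGER NOT NULL CHECK (typeof(age) = 'integer'),  -- type: must be a number
        CHECK (age BETWEEN 18 AND 65)                             -- range constraint
    )
""")


def try_insert(row):
    """Return True if the row satisfies every constraint, else False."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        # The DBMS rejected the row to protect data integrity.
        return False


# Hypothetical rows: one valid, one out of range, one of the wrong type.
print(try_insert((1, "Asha", 30)))     # True
print(try_insert((2, "Ravi", 150)))    # False
print(try_insert((3, "Mina", "old")))  # False
```

Only the first row is stored; the other two are rejected before they can reach the table.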
FUNCTIONAL DEPENDENCIES
Functional dependency in DBMS, as the name suggests, is a relationship between attributes of a table
that depend on each other. Introduced by E. F. Codd, it helps in preventing data redundancy and in
identifying bad designs.
To understand the concept, consider a relation P with attributes A and B.
A functional dependency is represented by -> (an arrow sign).
The functional dependency of B on A is then written as:
A -> B
Example
The following is an example that would make it easier to understand functional dependency:
We have a <Department> table with two attributes: DeptId and DeptName.
DeptId = Department ID
DeptName = Department Name
The DeptId is our primary key. Here, DeptId uniquely identifies the DeptName attribute. This is
because if you want to know the department name, then at first you need to have the DeptId.
DeptId DeptName
001 Finance
002 Marketing
003 HR
Therefore, the functional dependency between DeptId and DeptName can be stated
as DeptName is functionally dependent on DeptId:
DeptId -> DeptName
Example: We consider the same <Department> table with two attributes to understand the
concept of trivial dependency.
The following is a trivial functional dependency since DeptId is a subset of DeptId and DeptName:
{DeptId, DeptName} -> DeptId
Example
DeptId -> DeptName
The above is a non-trivial functional dependency since DeptName is not a subset of DeptId.
Completely Non-Trivial Functional Dependency
It occurs when the intersection of A and B is empty in:
A -> B
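The definitions above can be tested mechanically. The Python sketch below checks whether a dependency holds in the Department rows shown earlier and classifies a dependency as trivial, non-trivial, or completely non-trivial. The function names `holds` and `classify` are hypothetical, and `classify` returns the most specific label, so a dependency reported as completely non-trivial is also non-trivial.

```python
def holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds: equal lhs values imply equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


def classify(lhs, rhs):
    """Classify A -> B by comparing the attribute sets A and B."""
    a, b = set(lhs), set(rhs)
    if b <= a:
        return "trivial"                 # B is a subset of A
    if a & b:
        return "non-trivial"             # B is not a subset of A, but they overlap
    return "completely non-trivial"      # A and B share no attributes


# Rows of the Department table from the text.
department = [
    {"DeptId": "001", "DeptName": "Finance"},
    {"DeptId": "002", "DeptName": "Marketing"},
    {"DeptId": "003", "DeptName": "HR"},
]

print(holds(department, ["DeptId"], ["DeptName"]))   # True
print(classify(["DeptId", "DeptName"], ["DeptId"]))  # trivial
print(classify(["DeptId"], ["DeptName"]))            # completely non-trivial
```

Note that `holds` can only show that a dependency is not violated by the rows at hand; whether it is a real rule of the database is a design decision.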