Unit 1
What is Data?
Data is a collection of distinct small units of information. It can take a variety of
forms, such as text, numbers, media, or bytes, and it can be stored on paper or in electronic
memory.
The word 'data' originates from the word 'datum', which means 'a single piece of information'.
It is the plural of the word datum.
In computing, Data is information that can be translated into a form for efficient movement
and processing. Data is interchangeable.
What is a Database?
A database is an organized collection of inter-related data, arranged so that it can be easily
accessed and managed and so that data can be retrieved, inserted, and deleted efficiently. A
database also organizes the data in the form of tables, schemas, views, reports, etc.
You can organize data into tables, rows, columns, and index it to make it easier to find relevant
information.
For example, a college database organizes the data about the admin, staff, students,
faculty, etc.
Using the database, you can easily retrieve, insert, and delete the information.
o Data Updation: It is used for the insertion, modification, and deletion of the actual
data in the database.
o Data Retrieval: It is used to retrieve the data from the database which can be used
by applications for various purposes.
o User Administration: It is used for registering and monitoring users, maintaining data
integrity, enforcing data security, dealing with concurrency control, monitoring
performance, and recovering information corrupted by unexpected failure.
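The update and retrieval tasks listed above can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration, not part of the original notes: the student table, its columns, and the sample rows are assumptions chosen for the example.

```python
import sqlite3


def demo_basic_operations():
    """Insert, modify, delete, and retrieve rows in a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table used only for this illustration.
    conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, course TEXT)")

    # Data updation: insertion
    conn.executemany(
        "INSERT INTO student (id, name, course) VALUES (?, ?, ?)",
        [(1, "Asha", "Physics"), (2, "Ravi", "Maths")],
    )
    # Data updation: modification
    conn.execute("UPDATE student SET course = ? WHERE id = ?", ("Chemistry", 2))
    # Data updation: deletion
    conn.execute("DELETE FROM student WHERE id = ?", (1,))

    # Data retrieval
    rows = conn.execute("SELECT id, name, course FROM student ORDER BY id").fetchall()
    conn.close()
    return rows


print(demo_basic_operations())
```

Only the row with id 2 remains at the end, carrying its updated course.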
Characteristics of DBMS
o It uses a digital repository established on a server to store and manage the information.
o It can provide a clear and logical view of the process that manipulates data.
o DBMS contains automatic backup and recovery procedures.
o It supports the ACID properties (atomicity, consistency, isolation, durability), which keep
data in a consistent state in case of failure.
o It can reduce the complex relationship between data.
o It is used to support manipulation and processing of data.
o It is used to provide security of data.
o It lets users view the database from different viewpoints according to the requirements of
the user.
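The atomicity part of the ACID properties mentioned above can be illustrated with a short sketch using Python's sqlite3 module. The account table, the CHECK constraint, and the helper names (make_bank, transfer, balances) are assumptions made for this example; a production DBMS offers the same guarantee through its own transaction commands.

```python
import sqlite3


def make_bank():
    """Create a hypothetical two-account database for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " id INTEGER PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Credit dst, then debit src; both changes apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated CHECK (balance >= 0); the credit was rolled back too.
        return False


def balances(conn):
    return conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
```

A transfer that would overdraw the source account fails as a whole: the credit that was already applied inside the transaction is undone, so the balances stay as they were before the attempt.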
Advantages of DBMS
o Controls database redundancy: It can control data redundancy because all the data is
stored in one centralized database instead of being duplicated across separate files.
o Data sharing: In a DBMS, the authorized users of an organization can share the
data with one another.
o Easy maintenance: It is easy to maintain due to the centralized nature of
the database system.
o Reduces time: It reduces development time and maintenance needs.
o Backup: It provides backup and recovery subsystems which automatically back up
data to protect against hardware and software failures and restore the data if
required.
Data Base Management System 4
o Multiple user interfaces: It provides different types of user interfaces, such as graphical
user interfaces and application program interfaces.
Disadvantages of DBMS
o Cost of Hardware and Software: It requires a high-speed processor and a
large memory to run the DBMS software.
o Size: It occupies a large amount of disk space and memory to run efficiently.
o Complexity: A database system creates additional complexity and requirements.
o Higher impact of failure: A failure has a high impact because, in most
organizations, all the data is stored in a single database; if that database is
damaged due to an electrical failure or database corruption, the data may be lost
forever.
DBMS
The database approach keeps a well-organized collection of data that are related in a meaningful
way; the data can be accessed by different users but is stored only once in the system. The
operations performed by a DBMS include insertion, deletion, selection, sorting, etc.
The differences between DBMS and file systems are as follows:
Basis | DBMS approach | File system approach
Sharing of data | Due to the centralized approach, data sharing is easy. | Data is distributed in many files, which may be of different formats, so it isn't easy to share data.
Data Abstraction | DBMS gives an abstract view of data that hides the details. | The file system exposes the details of data representation and storage.
Security and Protection | DBMS provides a good protection mechanism. | It isn't easy to protect a file under the file system.
Recovery Mechanism | DBMS provides a crash recovery mechanism, i.e., DBMS protects the user from system failure. | The file system doesn't have a crash recovery mechanism, i.e., if the system crashes while entering some data, the content of the file will be lost.
Manipulation Techniques | DBMS contains a wide variety of sophisticated techniques to store and retrieve the data. | The file system can't efficiently store and retrieve the data.
Concurrency Problems | DBMS takes care of concurrent access of data using some form of locking. | In the file system, concurrent access has many problems, like redirecting the file while deleting or updating some information.
Where to use | The database approach is used in large systems which interrelate many files. | The file system approach is suited to small systems with few interrelated files.
Cost | The database system is expensive to design. | The file system approach is cheaper to design.
Data Redundancy and Inconsistency | Due to the centralization of the database, the problems of data redundancy and inconsistency are controlled. | The files and application programs are created by different programmers, so there is a lot of duplication of data, which may lead to inconsistency.
Structure | The database structure is complex to design. | The file system approach has a simple structure.
Data Independence | Data independence exists, and it can be of two types: logical data independence and physical data independence. | There is no data independence.
Integrity Constraints | Integrity constraints are easy to apply. | Integrity constraints are difficult to implement.
Data Models | Three types of data models exist: hierarchical, network, and relational data models. | There is no concept of data models.
Flexibility | Changes to the content of the stored data are often necessary, and these changes are made more easily with a database approach. | The flexibility of the system is less as compared to the DBMS approach.
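The remark in the comparison that integrity constraints are easy to apply in a DBMS can be made concrete with a small sketch. It uses Python's sqlite3 module; the student table, its three constraints, and the helper names are assumptions made for illustration, not something given in the notes.

```python
import sqlite3


def make_constrained_db():
    """Create a hypothetical table whose rules are declared once, in the schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student ("
        " id INTEGER PRIMARY KEY,"        # entity integrity: unique identifier
        " name TEXT NOT NULL,"            # a name must always be present
        " age INTEGER CHECK (age >= 0))"  # domain rule on the value
    )
    return conn


def try_insert(conn, row):
    """Return 'accepted' or a 'rejected: ...' message for one candidate row."""
    try:
        with conn:
            conn.execute("INSERT INTO student (id, name, age) VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"
```

With plain files, each application program would have to repeat these checks itself; here the database refuses a duplicate id, a missing name, or a negative age regardless of which program attempts the insert.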
1. Internal Level
o The internal level has an internal schema which describes the physical storage
structure of the database.
o The internal schema is also known as a physical schema.
o It uses the physical data model. It defines how the data will be stored
in blocks.
o The physical level is used to describe complex low-level data structures in detail.
2. Conceptual Level
o The conceptual schema describes the design of a database at the conceptual
level. Conceptual level is also known as logical level.
o The conceptual schema describes the structure of the whole database.
o The conceptual level describes what data are to be stored in the database and
also describes what relationship exists among those data.
o In the conceptual level, internal details such as an implementation of the data
structure are hidden.
o Programmers and database administrators work at this level.
3. External Level
o At the external level, a database contains several schemas, sometimes called
subschemas. A subschema describes a particular view of the database.
o An external schema is also known as a view schema.
o Each view schema describes the part of the database that a particular user group is
interested in and hides the remaining database from that user group.
o The view schema describes the end user interaction with database systems.
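One common way to realize an external schema in a relational system is an SQL view. The sketch below uses Python's sqlite3 module; the employee table, the employee_directory view, and the idea of hiding the salary column are assumptions chosen for illustration.

```python
import sqlite3


def make_employee_db():
    """Build a hypothetical conceptual-level table plus one external view of it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employee (
            id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary INTEGER
        );
        INSERT INTO employee VALUES (1, 'Asha', 'Sales', 50000);
        INSERT INTO employee VALUES (2, 'Ravi', 'HR', 45000);
        -- External schema for a user group that must not see salaries.
        CREATE VIEW employee_directory AS
            SELECT id, name, department FROM employee;
        """
    )
    return conn


def column_names(conn, relation):
    # relation is a trusted, hard-coded name here; never interpolate user input.
    cursor = conn.execute(f"SELECT * FROM {relation}")
    return [col[0] for col in cursor.description]
```

Users who query employee_directory see only the id, name, and department columns, while the underlying employee table still stores the salary.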
Data Models
Data Model is the modeling of the data description, data semantics, and consistency
constraints of the data. It provides the conceptual tools for describing the design of a
database at each level of data abstraction.
1) Relational Data Model: This type of model designs the data in the form of rows and
columns within a table. Thus, a relational model uses tables for representing both data and
the relationships among the data. Tables are also called relations. This model was initially
described by Edgar F. Codd in 1969. The relational data model is the most widely used model
and is primarily used by commercial data processing applications.
A schema diagram can display only some aspects of a schema like the name of record type,
data type, and constraints. Other aspects can't be specified through the schema diagram.
For example, the given figure shows neither the data type of each data item nor the
relationship among various files.
In the database, actual data changes quite frequently. For example, in the given figure, the
database changes whenever we add a new grade or add a student. The data at a particular
moment of time is called the instance of the database.
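The distinction between the schema and an instance can be sketched in a few lines with Python's sqlite3 module. The student table and its grade column are assumptions standing in for the figure, which is not reproduced here.

```python
import sqlite3


def schema(conn):
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]


def instance(conn):
    return conn.execute("SELECT id, name, grade FROM student ORDER BY id").fetchall()


def demo_schema_vs_instance():
    """Return (schema, instance) snapshots taken before and after adding a student."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, grade TEXT)")
    before = (schema(conn), instance(conn))
    conn.execute("INSERT INTO student VALUES (1, 'Asha', 'A')")
    after = (schema(conn), instance(conn))
    conn.close()
    return before, after
```

Adding a student changes the instance (the rows present at that moment) but leaves the schema (the column names and types) untouched.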
Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers to the ability to modify the schema at one
level of the database system without altering the schema at the next higher level.
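Physical data independence can be illustrated with a small sketch: an index is added at the storage level, and the query written against the table does not change. The example uses Python's sqlite3 module; the student table, the city column, and the index name are assumptions made for illustration.

```python
import sqlite3

# The application's query is written once and never edited below.
QUERY = "SELECT name FROM student WHERE city = ? ORDER BY name"


def make_city_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO student VALUES (?, ?, ?)",
        [(1, "Asha", "Pune"), (2, "Ravi", "Delhi"), (3, "Meena", "Pune")],
    )
    conn.commit()
    return conn


def names_in_city(conn, city):
    return [row[0] for row in conn.execute(QUERY, (city,))]


def add_city_index(conn):
    # A change to the physical storage structure only; the table definition is untouched.
    conn.execute("CREATE INDEX idx_student_city ON student (city)")
```

Calling names_in_city before and after add_city_index returns the same rows, which is the point: the access path changed, the schema seen by the application did not.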
Responsibilities of DBA
The responsibilities of DBA are as follows −
• Makes decisions concerning the content of the database.
• Plans the storage structure and access strategy.
• Provides support to the users.
• Defines the security and integrity checks.
• Interprets backup and recovery strategies.
• Monitors the performance and responds to changes in the requirements.
For example, suppose we design a school database. In this database, the student will be an
entity with attributes like address, name, id, age, etc. The address can be another entity
with attributes like city, street name, pin code, etc., and there will be a relationship between
them.
ER Diagrams
An Entity Relationship Diagram is a diagram that represents relationships among entities
in a database. It is commonly known as an ER Diagram. An ER Diagram in DBMS plays a
crucial role in designing the database: in practice, the requirements demanded by the
users are commonly captured in the form of an ER Diagram.
Components of ER Diagram
1. Entity:
An entity may be any object, class, person or place. In the ER diagram, an entity is
represented as a rectangle.
Consider an organization as an example: manager, product, employee, department, etc. can
be taken as entities.
Strong Entity
A strong entity has a primary key, and its existence does not depend on any other entity.
Weak entities depend on a strong entity.
A strong entity is represented by a single rectangle.
For example, Professor is a strong entity, and its primary key is Professor_ID.
Weak Entity
A weak entity in a DBMS does not have a primary key of its own and depends on a parent
(strong) entity for its existence.
A weak entity is represented by a double rectangle.
Continuing the example, Professor is a strong entity with the primary key Professor_ID,
and Professor_Dependents is a weak entity:
<Professor_Dependents>
Name DOB Relation
Professor_Dependents is a weak entity because its existence depends on another entity,
Professor: a Professor has Dependents.
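When a weak entity is turned into a table, one common approach is to identify each row by the owner's key together with a partial key of its own. The sketch below uses Python's sqlite3 module. The Name, DOB, and Relation columns come from the example above; the composite key on (professor_id, name), the ON DELETE CASCADE rule, and the sample rows are assumptions made for illustration.

```python
import sqlite3


def make_professor_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.executescript(
        """
        CREATE TABLE professor (
            professor_id INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE professor_dependents (
            professor_id INTEGER NOT NULL
                REFERENCES professor (professor_id) ON DELETE CASCADE,
            name TEXT NOT NULL,
            dob TEXT,
            relation TEXT,
            PRIMARY KEY (professor_id, name)  -- owner key plus partial key
        );
        INSERT INTO professor VALUES (1, 'Dr. Rao');
        INSERT INTO professor_dependents VALUES (1, 'Anil', '2010-04-02', 'Son');
        """
    )
    return conn


def add_dependent(conn, professor_id, name, dob, relation):
    """Return True if the dependent was stored, False if it has no owning professor."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO professor_dependents VALUES (?, ?, ?, ?)",
                (professor_id, name, dob, relation),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def remove_professor(conn, professor_id):
    with conn:
        conn.execute("DELETE FROM professor WHERE professor_id = ?", (professor_id,))


def dependent_count(conn):
    return conn.execute("SELECT COUNT(*) FROM professor_dependents").fetchone()[0]
```

A dependent cannot be inserted for a professor who does not exist, and removing a professor removes that professor's dependents, which mirrors the existence dependency described above.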
2. Attribute
An attribute is used to describe a property of an entity. An ellipse is used to represent
an attribute.
For example, id, age, contact number, name, etc. can be attributes of a student.
a. Key Attribute
The key attribute is used to represent the main characteristics of an entity. It
represents a primary key. The key attribute is represented by an ellipse with the text
underlined.
b. Composite Attribute
An attribute that is composed of many other attributes is known as a composite attribute.
The composite attribute is represented by an ellipse, and the ellipses of its component
attributes are connected to that ellipse.
c. Multivalued Attribute
An attribute can have more than one value. Such an attribute is known as a
multivalued attribute. A double oval is used to represent a multivalued attribute.
For example, a student can have more than one phone number.
d. Derived Attribute
An attribute that can be derived from another attribute is known as a derived attribute. It is
represented by a dashed ellipse.
For example, a person's age changes over time and can be derived from another attribute
like Date of birth.
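Because a derived attribute is computed rather than stored, it can be expressed as a small function. The sketch below derives age from date of birth in Python; the function name age_on and the choice to pass the reference date explicitly are assumptions made for illustration.

```python
from datetime import date


def age_on(date_of_birth: date, today: date) -> int:
    """Derive age in completed years from the stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not been reached yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years
```

For instance, someone born on 15 June 2000 is 23 on 14 June 2024 and 24 on 15 June 2024. Storing only the date of birth avoids having to update a stored age every year.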
3. Relationship
A relationship is used to describe the relation between entities. A diamond or rhombus is
used to represent the relationship.
a. One-to-One Relationship
When only one instance of an entity on each side is associated with the relationship, it is
known as a one-to-one relationship.
For example, a female can marry one male, and a male can marry one female.
b. One-to-many relationship
When only one instance of the entity on the left and more than one instance of the entity
on the right are associated with the relationship, it is known as a one-to-many
relationship.
For example, a scientist can invent many inventions, but each invention is made by
only one specific scientist.
c. Many-to-one relationship
When more than one instance of the entity on the left and only one instance of the entity
on the right are associated with the relationship, it is known as a many-to-one relationship.
For example, a student enrolls for only one course, but a course can have many students.
d. Many-to-many relationship
When more than one instance of the entity on the left and more than one instance of the
entity on the right are associated with the relationship, it is known as a many-to-many
relationship.
For example, an employee can be assigned to many projects, and a project can have many employees.
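In a relational database, a many-to-many relationship is usually stored in a separate linking table that holds one row per pairing. The sketch below shows this for the employee and project example using Python's sqlite3 module; the table names, the assignment table, and the sample rows are assumptions made for illustration.

```python
import sqlite3


def make_assignment_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE project (project_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        -- Linking table: one row per (employee, project) pairing.
        CREATE TABLE assignment (
            employee_id INTEGER NOT NULL REFERENCES employee (employee_id),
            project_id INTEGER NOT NULL REFERENCES project (project_id),
            PRIMARY KEY (employee_id, project_id)
        );
        INSERT INTO employee VALUES (1, 'Asha');
        INSERT INTO employee VALUES (2, 'Ravi');
        INSERT INTO project VALUES (10, 'Payroll');
        INSERT INTO project VALUES (20, 'Website');
        INSERT INTO assignment VALUES (1, 10);
        INSERT INTO assignment VALUES (1, 20);
        INSERT INTO assignment VALUES (2, 20);
        """
    )
    return conn


def projects_of(conn, employee_name):
    rows = conn.execute(
        "SELECT p.title FROM project p"
        " JOIN assignment a ON a.project_id = p.project_id"
        " JOIN employee e ON e.employee_id = a.employee_id"
        " WHERE e.name = ? ORDER BY p.title",
        (employee_name,),
    )
    return [row[0] for row in rows]


def employees_on(conn, project_title):
    rows = conn.execute(
        "SELECT e.name FROM employee e"
        " JOIN assignment a ON a.employee_id = e.employee_id"
        " JOIN project p ON p.project_id = a.project_id"
        " WHERE p.title = ? ORDER BY e.name",
        (project_title,),
    )
    return [row[0] for row in rows]
```

The same assignment table answers the question in both directions: which projects an employee works on, and which employees a project has.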
Generalization
o Generalization is like a bottom-up approach in which two or more entities of lower level
combine to form a higher level entity if they have some attributes in common.
o In generalization, an entity of a higher level can also combine with the entities of the lower level
to form a further higher level entity.
o Generalization is more like subclass and superclass system, but the only difference is the
approach. Generalization uses the bottom-up approach.
o In generalization, entities are combined to form a more generalized entity, i.e., subclasses are
combined to make a superclass.
For example, Faculty and Student entities can be generalized to create a higher level entity Person.
Specialization
o Specialization is a top-down approach, and it is opposite to Generalization. In specialization, one
higher level entity can be broken down into two lower level entities.
o Specialization is used to identify the subset of an entity set that shares some distinguishing
characteristics.
o Normally, the superclass is defined first, the subclasses and their related attributes are defined next,
and the relationship sets are then added.
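One possible way to store a superclass with its subclasses is to keep the shared attributes in one table and give each subclass its own table that reuses the superclass key. The sketch below applies this to Person, Faculty, and Student using Python's sqlite3 module; the column names and sample rows are assumptions, and other mappings (for example, a single table with a type column) are equally valid.

```python
import sqlite3


def make_person_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Superclass: attributes common to every person.
        CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        -- Subclasses: each row reuses the key of its person row.
        CREATE TABLE faculty (
            person_id INTEGER PRIMARY KEY REFERENCES person (person_id),
            department TEXT NOT NULL
        );
        CREATE TABLE student (
            person_id INTEGER PRIMARY KEY REFERENCES person (person_id),
            course TEXT NOT NULL
        );
        INSERT INTO person VALUES (1, 'Dr. Rao');
        INSERT INTO person VALUES (2, 'Asha');
        INSERT INTO faculty VALUES (1, 'Physics');
        INSERT INTO student VALUES (2, 'B.Sc.');
        """
    )
    return conn


def faculty_members(conn):
    return conn.execute(
        "SELECT p.name, f.department FROM faculty f"
        " JOIN person p ON p.person_id = f.person_id ORDER BY p.name"
    ).fetchall()


def students(conn):
    return conn.execute(
        "SELECT p.name, s.course FROM student s"
        " JOIN person p ON p.person_id = s.person_id ORDER BY p.name"
    ).fetchall()
```

Read bottom-up, the person table is the generalization of faculty and student; read top-down, faculty and student are specializations of person.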
Aggregation
In aggregation, the relation between two entities is treated as a single entity. In aggregation,
relationship with its corresponding entities is aggregated into a higher level entity.
For example, the relationship in which the Center entity offers the Course entity acts as a single entity,
which is in turn in a relationship with another entity, Visitor. In the real world, a visitor to a coaching
center will not enquire about the Course alone or about the Center alone; instead, the visitor will
enquire about both.
A database can be represented using ER notations, and these notations can be reduced to a collection
of tables.
In the database, every entity set or relationship set can be represented in tabular form.
The ER diagram is given below:
In the given ER diagram, LECTURE, STUDENT, SUBJECT and COURSE form individual tables.
In the STUDENT entity, STUDENT_NAME and STUDENT_ID form the columns of the STUDENT table.
Similarly, COURSE_NAME and COURSE_ID form the columns of the COURSE table, and so on.
In the given ER diagram, COURSE_ID, STUDENT_ID, SUBJECT_ID, and LECTURE_ID are the key
attributes of their respective entities.
In the given ER diagram, student address is a composite attribute. It contains CITY, PIN, DOOR#,
STREET, and STATE. In the STUDENT table, each of these components becomes an individual column.
In the STUDENT table, Age is a derived attribute. It can be calculated at any point in time as
the difference between the current date and the Date of Birth.
Using these rules, you can convert the ER diagram to tables and columns and assign the mapping
between the tables. The table structure for the given ER diagram is as below.
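The conversion rules can be sketched for two of the entities with Python's sqlite3 module. This is an illustration under stated assumptions rather than the table structure from the original figure: the column types, the DOB column, and the name DOOR_NO (standing in for DOOR#) are assumed, and Age is left out because it is derived.

```python
import sqlite3


def make_er_tables():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE student (
            student_id   INTEGER PRIMARY KEY,  -- key attribute becomes the primary key
            student_name TEXT NOT NULL,
            dob          TEXT NOT NULL,        -- stored; age is derived from it
            door_no      TEXT,                 -- composite address stored as
            street       TEXT,                 -- its individual components
            city         TEXT,
            state        TEXT,
            pin          TEXT
        );
        CREATE TABLE course (
            course_id   INTEGER PRIMARY KEY,
            course_name TEXT NOT NULL
        );
        """
    )
    return conn


def columns(conn, table):
    # table is a trusted, hard-coded name here; never interpolate user input.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
```

Each entity becomes one table, each simple attribute becomes one column, the composite address is replaced by its parts, and no column is created for the derived attribute.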