Unit 1 CS 502
Data: Data are characteristics, usually numerical, that are collected through observation. In a more
technical sense, data are a set of values of qualitative or quantitative variables about one or more persons
or objects, while a datum (singular of data) is a single value of a single variable. Any facts and figures
about an entity are called data.
Information: Data that has been processed, organized, structured, or presented in a given context so that
it becomes useful is called information.
Database: A database is an organized collection of data, generally stored and accessed electronically
from a computer system. It is a collection of inter-related data from which data can be retrieved, inserted,
and deleted efficiently. It also organizes the data in the form of tables, schemas, views,
reports, etc. For example, a college database organizes the data about the admin, staff, students,
and faculty. Using the database, you can easily retrieve, insert, and
delete this information.
A DBMS supports the following tasks:
⚫ Data Definition: It is used for the creation, modification, and removal of the definitions that describe the
organization of data in the database.
⚫ Data Update: It is used for the insertion, modification, and deletion of the actual data in the database.
⚫ Data Retrieval: It is used to retrieve the data from the database, which can then be used by applications for
various purposes.
⚫ User Administration: It is used for registering and monitoring users, maintaining data integrity, enforcing
data security, handling concurrency control, monitoring performance, and recovering information
corrupted by unexpected failure.
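The definition, update, and retrieval tasks listed above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration only; the student table, its columns, and the sample names are assumptions made for the example and do not come from these notes.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")

# Data definition: create the structure that will hold the data.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Data update: insert, modify, and delete actual rows.
conn.executemany(
    "INSERT INTO student (roll_no, name) VALUES (?, ?)",
    [(1, "Asha"), (2, "Ravi"), (3, "Meena")],
)
conn.execute("UPDATE student SET name = ? WHERE roll_no = ?", ("Ravi Kumar", 2))
conn.execute("DELETE FROM student WHERE roll_no = ?", (3,))

# Data retrieval: read the rows back for use by an application.
rows = conn.execute("SELECT roll_no, name FROM student ORDER BY roll_no").fetchall()
print(rows)  # [(1, 'Asha'), (2, 'Ravi Kumar')]
```

User administration is not shown because SQLite has no user accounts; server-based systems handle that task with their own commands.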
Characteristics of DBMS
⚫ It uses a digital repository established on a server to store and manage the information.
⚫ It can provide a clear and logical view of the processes that manipulate data.
⚫ It supports ACID properties, which keep data in a consistent state in case of failure.
⚫ It lets users view the database from different viewpoints according to their requirements.
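As a rough illustration of the atomicity part of ACID, the sketch below uses sqlite3 to attempt a transfer that fails halfway and is rolled back as a whole. The account table, the CHECK constraint, and the transfer helper are hypothetical names introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

# The debit would make account 1 negative, so the earlier credit is undone too.
ok = transfer(conn, 1, 2, 500)
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
print(ok, balances)  # False [(1, 100), (2, 50)]
```

The credit is deliberately issued first so that there is a successful statement to undo when the debit violates the constraint.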
Advantages of DBMS
⚫ Controls data redundancy: It can control data redundancy because all the data is stored in a
single database rather than being duplicated across separate files.
⚫ Data sharing: In a DBMS, the data can be shared among multiple authorized users of an
organization.
⚫ Easy maintenance: It is easy to maintain due to the centralized nature of the database system.
⚫ Backup: It provides backup and recovery subsystems, which automatically back up data to guard against
hardware and software failures and restore the data if required.
⚫ Multiple user interfaces: It provides different types of user interfaces, such as graphical user interfaces
and application program interfaces.
Disadvantages of DBMS
⚫ Cost of hardware and software: It requires a high-speed processor and a large memory to run the
DBMS software.
⚫ Size: It occupies a large amount of disk space and needs a large memory to run efficiently.
⚫ Higher impact of failure: A failure has a severe impact because, in most organizations,
all the data is stored in a single database; if that database is damaged due to a power failure or database
corruption, the data may be lost forever.
DBMS Architecture
⚫ The design of a DBMS depends upon its architecture. The basic client/server architecture is used to deal
with a large number of PCs, web servers, database servers, and other components that are connected by
networks.
⚫ The client/server architecture consists of many PCs and workstations that are connected via the
network.
⚫ DBMS architecture depends upon how users are connected to the database to get their requests served.
1-Tier Architecture
⚫ In this architecture, the database is directly available to the user. This means the user works directly on
the DBMS.
⚫ Any changes made here are applied directly to the database itself. It doesn't provide a handy tool for
end users.
⚫ The 1-Tier architecture is used for the development of local applications, where programmers can
communicate directly with the database for a quick response.
ADVANTAGES:
• Simple Architecture: 1-Tier Architecture is the simplest architecture to set up, as only a single
machine is required to maintain it.
• Cost-Effective: No additional hardware is required for implementing 1-Tier Architecture, which
makes it cost-effective.
• Easy to Implement: 1-Tier Architecture can be easily deployed, and hence it is mostly used in
small projects.
2-Tier Architecture
⚫ The 2-Tier architecture is the same as the basic client-server architecture. In the two-tier architecture,
applications on the client end communicate directly with the database on the server side. For this
interaction, APIs such as ODBC and JDBC are used.
⚫ The user interfaces and application programs run on the client side.
⚫ The server side is responsible for providing functionality such as query processing and transaction
management.
⚫ To communicate with the DBMS, the client-side application establishes a connection with the server side.
ADVANTAGES:
• Easy to Access: 2-Tier Architecture gives direct access to the database, which makes retrieval fast.
• Scalable: We can scale the database easily, by adding clients or upgrading hardware.
• Low Cost: 2-Tier Architecture is cheaper than 3-Tier Architecture and Multi-Tier Architecture.
• Easy Deployment: 2-Tier Architecture is easier to deploy than 3-Tier Architecture.
• Simple: 2-Tier Architecture is simple and easy to understand because it has only two
components.
3-Tier Architecture
⚫ The 3-Tier architecture contains another layer between the client and the server. In this architecture, the
client can't communicate directly with the server.
⚫ The application on the client end interacts with an application server, which in turn communicates with
the database system.
⚫ The end user has no idea about the existence of the database beyond the application server. The database
also has no idea about any user beyond the application.
ADVANTAGES:
• Enhanced scalability: Scalability is enhanced due to the distributed deployment of application
servers. Individual connections need not be made between each client and the server.
• Data Integrity: 3-Tier Architecture helps maintain data integrity. Since there is a middle layer between
the client and the server, data corruption can be avoided or removed.
• Security: 3-Tier Architecture improves security. This type of model prevents direct interaction of
the client with the server, thereby reducing unauthorized access to data.
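The separation of the three tiers can be sketched in a few lines of Python. In a real deployment the tiers would usually run as separate processes or machines; here they are plain classes in one program, and every class, method, and table name is a hypothetical choice made for illustration.

```python
import sqlite3

class Database:
    """Data tier: owns the connection; nothing else talks to SQLite."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
        self._conn.execute("INSERT INTO student VALUES (1, 'Asha')")

    def fetch_name(self, roll_no):
        row = self._conn.execute(
            "SELECT name FROM student WHERE roll_no = ?", (roll_no,)
        ).fetchone()
        return row[0] if row else None

class ApplicationServer:
    """Middle tier: validates requests and is the only caller of the data tier."""

    def __init__(self, db):
        self._db = db

    def student_name(self, roll_no):
        if not isinstance(roll_no, int) or roll_no <= 0:
            raise ValueError("roll_no must be a positive integer")
        return self._db.fetch_name(roll_no)

def client(server, roll_no):
    """Client tier: knows only the application server's interface."""
    name = server.student_name(roll_no)
    return f"Student {roll_no}: {name}" if name else f"Student {roll_no}: not found"

server = ApplicationServer(Database())
print(client(server, 1))  # Student 1: Asha
print(client(server, 2))  # Student 2: not found
```

The client function never sees SQL or the connection, which mirrors the point that the end user is unaware of the database behind the application server.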
Three Schema Architecture
⚫ The three schema architecture is also called ANSI/SPARC architecture or three-level architecture.
⚫ This framework is used to describe the structure of a specific database system.
⚫ The three schema architecture is also used to separate the user applications from the physical database.
⚫ The three schema architecture contains three levels. It breaks the database down into three different
categories:
1. Internal Level:
⚫ The internal level has an internal schema which describes the physical storage structure of the
database.
⚫ It uses the physical data model. It defines how the data will be stored in blocks.
⚫ The physical level is used to describe complex low-level data structures in detail.
2. Conceptual Level:
⚫ The conceptual schema describes the design of a database at the conceptual level. Conceptual
level is also known as logical level.
⚫ The conceptual level describes what data are to be stored in the database and also describes
what relationship exists among those data.
⚫ In the conceptual level, internal details such as an implementation of the data structure are
hidden.
3. External Level :
⚫ At the external level, a database contains several schemas, sometimes called subschemas. A
subschema is used to describe a particular view of the database.
⚫ Each view schema describes the part of the database that a particular user group is interested in and hides
the remaining database from that user group.
⚫ The view schema describes the end user's interaction with the database system.
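One common way to realize external-level subschemas is with SQL views. The sketch below, using sqlite3, defines two views over one table so that each user group sees only its own part. The employee table and the directory and payroll view names are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: one table describing what is stored.
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "CS", 50000), (2, "Ravi", "EE", 45000)],
)

# External level: each view exposes only what one user group needs.
conn.execute("CREATE VIEW directory AS SELECT emp_id, name, dept FROM employee")
conn.execute("CREATE VIEW payroll AS SELECT emp_id, salary FROM employee")

# cursor.description lists the columns a query returns.
directory_cols = [c[0] for c in conn.execute("SELECT * FROM directory").description]
payroll_cols = [c[0] for c in conn.execute("SELECT * FROM payroll").description]
print(directory_cols)  # ['emp_id', 'name', 'dept']
print(payroll_cols)    # ['emp_id', 'salary']
```

Users of the directory view cannot see salaries, which is the hiding described for the external level.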
Data Independence
Data independence means that a change made at one level should not affect another level. Two types of
data independence are present in this architecture:
• Physical Data Independence: Any change in the physical location of tables and indexes should
not affect the conceptual level or the external view of data. This data independence is easy to achieve
and is implemented by most DBMSs.
• Conceptual Data Independence (also called logical data independence): The data at the conceptual
level schema and the external level schema must be independent. This means a change in the
conceptual schema should not affect the external schema. For example, adding or deleting attributes
of a table should not affect the user's view of the table. This type of independence is difficult to
achieve compared to physical data independence because changes in the conceptual schema tend to
be reflected in the user's view.
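Both kinds of independence can be observed in a small sqlite3 experiment. The sketch runs the same query against a view before and after a physical change (adding an index) and a conceptual change (adding a column). The table, view, and index names are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ravi", "Delhi")],
)
# External view used by an application.
conn.execute("CREATE VIEW student_names AS SELECT roll_no, name FROM student")

query = "SELECT * FROM student_names ORDER BY roll_no"
before = conn.execute(query).fetchall()

# Physical change: a new index alters storage and access paths only.
conn.execute("CREATE INDEX idx_student_city ON student (city)")
after_index = conn.execute(query).fetchall()

# Conceptual change: a new attribute is added to the table.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after_column = conn.execute(query).fetchall()

print(before == after_index == after_column)  # True
```

The view survives here because it names its columns explicitly. Dropping a column that the view uses would break it, which is why this kind of independence is the harder one to guarantee.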
Phases of Database Design
Database design for a real-world application runs from capturing the requirements to physical
implementation using DBMS software, and consists of the following phases:
DBMS Phases
• Conceptual Design: The requirements of the database are captured using a high-level conceptual data
model. For example, the ER model is used for the conceptual design of the database.
• Logical Design: Logical design represents the data in the form of the relational model. The ER diagram
produced in the conceptual design phase is converted into the relational model.
• Physical Design: In physical design, the data in the relational model is implemented using a commercial
DBMS such as Oracle or DB2.
Database Language:
⚫ A DBMS has appropriate languages and interfaces to express database queries and updates.
⚫ Database languages can be used to read, store, and update the data in the database.
⚫ Using DDL (Data Definition Language) statements, you can create the skeleton of the database.
⚫ The data definition language is used to store metadata such as the number of tables and
schemas, their names, indexes, the columns in each table, constraints, etc. Here are some tasks that come
under DDL:
⚫ Comment: It is used to add comments to the data dictionary. These commands are used to update the
database schema, which is why they come under the data definition language.
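To make the idea of schema-changing statements concrete, here is a small sqlite3 sketch that uses three standard DDL statements: CREATE, ALTER, and DROP. They are chosen as assumptions for illustration because the list of tasks above is incomplete; SQLite itself has no COMMENT statement. The course table and the two helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names(conn):
    """List user tables recorded in SQLite's schema catalogue."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [r[0] for r in rows]

def column_names(conn, table):
    """PRAGMA table_info returns one row per column; index 1 is the column name."""
    return [r[1] for r in conn.execute(f"PRAGMA table_info({table})")]

# CREATE builds the skeleton of the database.
conn.execute("CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT)")
created = table_names(conn)

# ALTER changes an existing definition.
conn.execute("ALTER TABLE course ADD COLUMN credits INTEGER")
altered = column_names(conn, "course")

# DROP removes the definition, and its data, entirely.
conn.execute("DROP TABLE course")
dropped = table_names(conn)

print(created, altered, dropped)
# ['course'] ['course_id', 'title', 'credits'] []
```

None of these statements touches individual rows; each one changes the metadata that describes the database.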
Database Administrator
1. A Database Administrator (DBA) is a person responsible for controlling, maintaining,
coordinating, and operating a database management system. Managing, securing, and taking care of the
database systems is the DBA's prime responsibility.
2. DBAs are in charge of authorizing access to the database, coordination, capacity
planning, installation, monitoring usage, and acquiring software and hardware resources
as and when needed.
3. Their role also ranges over configuration, database design, migration, security, troubleshooting, backup,
and data recovery.
4. The DBA performs a key function in any firm or organization that relies on one or more databases. DBAs
are the overall commanders of the database system.
1. The Database Administrator manages and controls the three levels of the database management system
architecture (internal level, conceptual level, and external level) and, in discussion with the wider
user community, defines the world view of the database. The DBA then provides external views for
different users and applications.
2. The Database Administrator is held responsible for maintaining the integrity and security of the database
by restricting unauthorized users. The DBA grants permissions to users of the database and keeps a profile
of every user in the database.
3. Database Administrators are also held accountable for ensuring that the database is protected and secured
and that any chance of data loss is kept to a minimum.
4. The Database Administrator is solely responsible for reducing the risk of data loss by backing up the data
at regular intervals.
2. Manages Data Integrity and Security: Data integrity needs to be checked and managed accurately, as it
protects data from unauthorized use. The DBA keeps an eye on relationships within the data to maintain data
integrity.
3. Database Accessibility: The Database Administrator is solely responsible for giving permission to access
the data available in the database. The DBA also decides who has the right to change the content.
4. Database Design: The DBA is held responsible and accountable for logical design, physical design,
external model design, and integrity and security control.
5. Database Implementation: The DBA implements the DBMS and checks database loading at the time of its
implementation.
6. Query Processing Performance: The DBA enhances query processing by improving speed, performance,
and accuracy.
7. Tuning Database Performance: If users are not able to get data quickly and accurately, the
organization may lose business. By tuning SQL commands, the DBA can enhance the performance of the
database.
4. The DBA selects appropriate DBMS software such as Oracle, SQL Server, or MySQL.
6. The DBA decides the user access levels and security checks for accessing, modifying, or manipulating data.
7. The DBA is responsible for specifying various techniques for monitoring the database performance.
DATA MODEL:
The Data Model gives us an idea of how the final system will look after it has been fully
implemented. It specifies the data items as well as the relationships between them. In a database
management system, data models are often used to show how data is connected, stored, accessed,
and changed. The information is portrayed using a set of symbols and notation so that members
of an organisation can understand it and communicate about it.
A Data Model in a Database Management System (DBMS) is a set of conceptual tools
developed to summarize the description of the database. Data models provide us with a
clear picture of the data, which helps us in creating an actual database. A data model takes us from the
design of the data to its proper implementation.
Types of Models:
1. Conceptual Data Model
2. Representational Data Model
3. Physical Data Model
ER Model:
The ER model was created to provide a simple and understandable model for representing the
structure and logic of databases. It has since evolved into variations such as the Enhanced ER
Model and the Object Relationship Model.
The Entity-Relationship (ER) Model is a model for identifying the entities to be represented in the database
and representing how those entities are related. The ER data model specifies an enterprise
schema that represents the overall logical structure of a database graphically.
• ER diagrams represent the E-R model in a database, making them easy to convert into
relations (tables).
• ER diagrams model real-world objects, which makes them
very useful.
• ER diagrams require no technical knowledge and no hardware support.
• These diagrams are very easy to understand and easy to create, even for a naive user.
• They give a standard way of visualizing the data logically.
The ER Model is used to model the logical view of the system from a data perspective, using
these symbols:
• Rectangles: Rectangles represent Entities in the ER Model.
• Ellipses: Ellipses represent Attributes in the ER Model.
• Diamond: Diamonds represent Relationships among Entities.
• Lines: Lines link attributes to entities and entity sets to relationship types.
• Double Ellipse: Double Ellipses represent Multi-Valued Attributes.
• Double Rectangle: Double Rectangles represent Weak Entities.
Entity: An Entity may be an object with a physical existence – a particular person, car,
house, or employee – or it may be an object with a conceptual existence – a company, a
job, or a university course.
Entity Set: An entity is an object of an entity type, and the set of all entities of that type is called an
entity set. For example, E1 is an entity having entity type Student, and the set of all students is called an
entity set. In an ER diagram, an entity type is represented as:
Types of Entity
There are two types of entity:
1. Strong Entity
A Strong Entity is a type of entity that has a key attribute. A strong entity does not depend on
any other entity in the schema. It has a primary key that helps in identifying it uniquely, and it is
represented by a rectangle. Such entity types are called Strong Entity Types.
2. Weak Entity
An entity type normally has a key attribute that uniquely identifies each entity in the entity set. But some
entity types exist for which a key attribute can't be defined. These are called Weak Entity Types.
For example, a company may store the information of the dependents (parents, children, spouse)
of an employee. But the dependents can't exist without the employee. So Dependent will be
a Weak Entity Type and Employee will be the identifying entity type for Dependent, which means
Employee is a Strong Entity Type.
A weak entity type is represented by a double rectangle. The participation of a weak entity type in its
identifying relationship is always total. The relationship between the weak entity type and its identifying
strong entity type is called the identifying relationship, and it is represented by a double diamond.
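When the Employee and Dependent example above is mapped to tables, the weak entity usually borrows the key of its identifying entity. The sqlite3 sketch below shows one possible mapping; the column names, the composite key, and the ON DELETE CASCADE rule are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Strong entity: has its own key.
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Weak entity: identified by the owner's key plus a partial key (dep_name).
conn.execute("""
    CREATE TABLE dependent (
        emp_id   INTEGER NOT NULL REFERENCES employee (emp_id) ON DELETE CASCADE,
        dep_name TEXT NOT NULL,
        relation TEXT,
        PRIMARY KEY (emp_id, dep_name)
    )
""")
conn.execute("INSERT INTO employee VALUES (1, 'Asha')")
conn.executemany(
    "INSERT INTO dependent VALUES (?, ?, ?)",
    [(1, "Kiran", "Child"), (1, "Mohan", "Spouse")],
)

# Total participation: a dependent without an employee is rejected.
try:
    conn.execute("INSERT INTO dependent VALUES (99, 'Nobody', 'Child')")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

# Existence dependency: removing the employee removes the dependents.
conn.execute("DELETE FROM employee WHERE emp_id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM dependent").fetchone()[0]
print(orphan_allowed, remaining)  # False 0
```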
Attributes:
Attributes are the properties that define an entity type. For example, Roll_No, Name, DOB, Age,
Address, and Mobile_No are the attributes that define the entity type Student. In an ER
diagram, an attribute is represented by an oval.
Attribute
Types of Attributes
1. Key Attribute
The attribute that uniquely identifies each entity in the entity set is called the key attribute.
For example, Roll_No will be unique for each student. In an ER diagram, the key attribute is
represented by an oval with the attribute name underlined.
Key Attribute
2. Composite Attribute
An attribute composed of many other attributes is called a composite attribute. For example,
the Address attribute of the student Entity type consists of Street, City, State, and Country. In
ER diagram, the composite attribute is represented by an oval connected to further ovals.
Composite Attribute
3. Multivalued Attribute
An attribute that can hold more than one value for a given entity is called a multivalued attribute.
For example, Phone_No (there can be more than one for a given student). In an ER diagram, a
multivalued attribute is represented by a double oval.
Multivalued Attribute
4. Derived Attribute
An attribute that can be derived from other attributes of the entity type is known as a derived
attribute. For example, Age (can be derived from DOB). In an ER diagram, the derived attribute is
represented by a dashed oval.
Derived Attribute
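The four attribute types described above have rough analogues in ordinary code. The Python sketch below models a Student with a key, a composite, a multivalued, and a derived attribute. The class layout and sample values are assumptions for illustration, and the reference date is passed in explicitly so that the result does not depend on when the code runs.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Address:
    """Composite attribute: made up of simpler attributes."""
    street: str
    city: str
    state: str
    country: str

@dataclass
class Student:
    roll_no: int                                         # key attribute
    name: str
    dob: date
    address: Address                                     # composite attribute
    phone_nos: list[str] = field(default_factory=list)   # multivalued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from dob rather than stored."""
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)

s = Student(
    roll_no=1,
    name="Asha",
    dob=date(2004, 8, 15),
    address=Address("MG Road", "Pune", "Maharashtra", "India"),
    phone_nos=["555-0101", "555-0102"],
)
print(s.age(date(2024, 8, 14)))  # 19
print(s.age(date(2024, 8, 15)))  # 20
```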
The Complete Entity Type Student with its Attributes can be represented as:
Relationship: A Relationship Type represents the association between entity types. The
number of times an entity of an entity set participates in a relationship set is known
as cardinality.
One-to-One: When each entity in each entity set can take part only once in the
relationship, the cardinality is one-to-one.
One-to-Many: When each entity in one entity set can be related to more than one entity
in the other entity set, the cardinality is one-to-many. Two tables are used to represent it.
Many-to-One: When entities in one entity set can take part only once in the
relationship set and entities in other entity sets can take part more than once in the
relationship set, the cardinality is many-to-one.
Many-to-Many: When entities in all entity sets can take part more than once in the
relationship, the cardinality is many-to-many. Three tables are used to represent
it.
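The three-table layout for a many-to-many relationship can be sketched with sqlite3: one table per entity set plus a third table that records which pairs are related. The student, course, and enrolment names are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (course_id INTEGER PRIMARY KEY, title TEXT);
    -- Third table: one row per (student, course) pair.
    CREATE TABLE enrolment (
        roll_no   INTEGER REFERENCES student (roll_no),
        course_id INTEGER REFERENCES course (course_id),
        PRIMARY KEY (roll_no, course_id)
    );
    INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO course  VALUES (10, 'DBMS'), (20, 'Networks');
    INSERT INTO enrolment VALUES (1, 10), (1, 20), (2, 10);
""")

# One student takes many courses ...
courses_of_asha = [r[0] for r in conn.execute(
    "SELECT c.title FROM course c JOIN enrolment e ON e.course_id = c.course_id "
    "WHERE e.roll_no = 1 ORDER BY c.title")]

# ... and one course has many students.
students_in_dbms = [r[0] for r in conn.execute(
    "SELECT s.name FROM student s JOIN enrolment e ON e.roll_no = s.roll_no "
    "WHERE e.course_id = 10 ORDER BY s.name")]

print(courses_of_asha)   # ['DBMS', 'Networks']
print(students_in_dbms)  # ['Asha', 'Ravi']
```

A one-to-many relationship needs only two tables because the foreign key can sit directly in the table on the "many" side.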
Using the ER model for larger data sets creates a lot of complexity while designing a database
model. To minimize this complexity, Generalization, Specialization, and Aggregation
were introduced into the ER model. They are used for data abstraction, in which an
abstraction mechanism hides the details of a set of objects.
Generalization:
Generalization is a bottom-up approach in which two or more lower-level entities combine
to form a higher-level entity if they have some attributes in common.
In generalization, a higher-level entity can also combine with lower-level entities
to form a still higher-level entity.
Generalization resembles a subclass and superclass system; the only difference is the approach:
generalization works bottom-up.
In generalization, entities are combined to form a more generalized entity, i.e., subclasses are
combined to make a superclass.
For example, the Faculty and Student entities can be generalized to create a higher-level entity Person.
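The Faculty, Student, and Person example maps naturally onto class inheritance. In the sketch below, the attributes the two lower-level entities have in common are pulled up into Person. The specific attributes (name, address, salary, roll_no) are assumptions added for illustration.

```python
from dataclasses import dataclass

@dataclass
class Person:
    """Higher-level entity holding the attributes Faculty and Student share."""
    name: str
    address: str

@dataclass
class Faculty(Person):
    salary: int      # attribute specific to Faculty

@dataclass
class Student(Person):
    roll_no: int     # attribute specific to Student

people = [Faculty("Dr. Rao", "Pune", 90000), Student("Asha", "Delhi", 1)]

# Code written against Person works for every lower-level entity.
names = [p.name for p in people]
print(names)  # ['Dr. Rao', 'Asha']
```

Read top-down, the same hierarchy illustrates specialization: Person is defined first and then split into subclasses that add their own attributes.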
Specialization:
Specialization is a top-down approach, the opposite of generalization. In specialization, one
higher-level entity can be broken down into two or more lower-level entities.
Specialization is used to identify the subset of an entity set that shares some distinguishing
characteristics.
Normally, the superclass is defined first, the subclasses and their related attributes are defined next,
and the relationship sets are then added.
For example, in an employee management system, the EMPLOYEE entity can be specialized as
TESTER or DEVELOPER based on the role played in the company.
Aggregation:
In aggregation, the relationship between two entities is treated as a single entity. A
relationship together with its corresponding entities is aggregated into a higher-level entity.
For example, the relationship in which the Center entity offers the Course entity acts as a single entity,
which is in a relationship with another entity, Visitor. In the real world, a visitor to a coaching center
will never enquire about the Course only or just about the Center; instead, the visitor will enquire
about both.
Study Of Various Data Models:
A Database model defines the logical design and structure of a database. It defines how data will
be stored, accessed, and updated in a database management system.
• As per your application's requirement, you can use a database model to define your
database.
• The database model sets the rules, relationships, constraints, etc. that define how data is
stored in the database.
• It's like creating a blueprint of your Database.
• There are different types of Database models and each one has its own set of features.
• You can define how you want to structure the application data using a database model.
Hierarchical Model
• The hierarchical database model organizes data into a tree-like structure, with a single
root, to which all the other data is linked.
• The hierarchy starts from the root data and expands like a tree, adding child nodes to
the parent nodes.
• In this model, a child node has only a single parent node.
• This model efficiently describes many real-world relationships, such as the index of a
book.
• IBM's Information Management System (IMS) is based on this model.
• Data is organized into a tree-like structure with a one-to-many relationship between two
different types of data.
The main advantages and disadvantages of the Hierarchical database
model are:
1. Because it has one-to-many relationships between different types of data, fetching
data is easy and fast.
2. The Hierarchical model is less flexible.
3. It doesn't support many-to-many relationships.
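The single-parent rule of the hierarchical model can be sketched with a plain Python dictionary that stores exactly one parent per node. The node names (College, Department, and so on) are hypothetical and chosen only for illustration.

```python
# Each node records exactly one parent; the root has None.
parent = {
    "College": None,
    "Department": "College",
    "Course": "Department",
    "Student": "Department",
}

def path_from_root(node):
    """Follow parent links upward, then reverse to get the root-to-node path."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return list(reversed(path))

def children(node):
    """One-to-many: a parent may have any number of children."""
    return sorted(n for n, p in parent.items() if p == node)

print(path_from_root("Course"))  # ['College', 'Department', 'Course']
print(children("Department"))    # ['Course', 'Student']
```

Because each key maps to a single parent, the structure cannot express a record that belongs to two parents, which is the many-to-many limitation noted above.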
Network Model
• The Network Model is an extension of the Hierarchical model.
• In this model, data is organized more like a graph, and a node is allowed to have more than one
parent node.
• In the network database model, the data is more interrelated because more relationships are
established.
• Because the data is more interrelated, accessing it is also easier and faster.
• This database model uses many-to-many data relationships.
• Integrated Data Store (IDS) is based on this database model.
• This was the most widely used database model before the Relational Model was introduced.
• The implementation of the Network model is complex, and it is very difficult to
maintain.
• The Network model is also difficult to modify.
• You may want to explore this model if you are developing a social networking application,
although the newer Graph Database model is far better suited than the Network Database
model.
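The key difference from a tree, a record with more than one parent, can be sketched by storing a list of owners per record instead of a single one. The record names below are hypothetical, and the sketch illustrates only the graph shape, not how a network DBMS such as IDS is actually implemented.

```python
# Each record lists its owners; unlike a tree, a record may have several.
owners = {
    "Department": [],
    "Instructor": [],
    "Course": ["Department", "Instructor"],  # two parent records
    "Section": ["Course"],
}

def ancestors(record):
    """Collect every record reachable by following owner links upward."""
    seen = set()
    stack = list(owners[record])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(owners[current])
    return sorted(seen)

print(ancestors("Section"))  # ['Course', 'Department', 'Instructor']
```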
Object-oriented Model:
3. Define database management system. What are the major components of this system? Explain
each component.
4. What is a data model? List a few data models that you know.
7. What is an entity type? What is an entity set? Explain the differences among an entity, an
entity type, and an entity set.
8. Explain the advantages of a database management system over a file management system.
9. What is an entity? What is a relationship? Explain E-R modelling with the help of a database for a
student management system.
11. What is the difference between a database user and a database administrator? Explain the various
functions of a database administrator.
12. Define and explain the following terms: Levels of data abstraction, Instances, Schema, Physical
data independence, Logical data independence.