CS502 DBMS Unit-1 - 1692116690
DBMS Concepts
Data: - Data is “facts or pieces of information”. Data can be defined as a representation of facts, concepts or
instructions in a formalized manner suitable for communication, interpretation, or
processing by humans or electronic machines.
Types of Data: Text, Numbers & Multimedia
Information: -
Information is organized or classified data that has some meaningful value for the receiver.
Information is the processed data on which decisions and actions are based.
For the decision to be meaningful, the processed data must have the following characteristics:
Timely - Information should be available when required.
Accuracy - Information should be accurate.
Completeness - Information should be complete.
Architecture of DBMS-
An early proposal for a standard terminology and general architecture for a database system was produced in
1971 by the DBTG (Data Base Task Group) appointed by the Conference on Data Systems Languages (CODASYL).
The DBTG recognized the need for a two-level approach with a system view called the schema and a user
view called the subschema. The American National Standards Institute (ANSI) Standards Planning and
Requirements Committee (SPARC) produced a similar terminology and architecture in 1975. ANSI-SPARC
recognized the need for a three-level approach with a system catalog.
The three levels or layers of DBMS architecture are as follows:
1. External Level
2. Conceptual Level
3. Internal Level
1. External Level: - The external level is described by a schema, i.e. it consists of the definitions of the logical
records and relationships in the external view. It also contains the method of deriving the objects in the
external view from the objects in the conceptual view.
2. Conceptual Level: - The conceptual level represents the entire database. The conceptual schema describes the
records and relationships included in the conceptual view. It also contains the method of deriving the
objects in the conceptual view from the objects in the internal view.
3. Internal Level: - The internal level indicates how the data will be stored and describes the data structures
and access methods to be used by the database. It contains the definition of the stored records, the method of
representing the data fields and the access aids used.
A mapping between the external and conceptual views gives the correspondence among the records and
relationships of the two views. The external view is an abstraction of the conceptual view,
which in turn is an abstraction of the internal view. Each external view describes the contents of the
database as perceived by the user or application program of that view.
Figure 1.1 Three layers of DBMS architecture
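The three levels can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module; the table student, the view student_directory and the sample rows are hypothetical names chosen only for illustration. The table plays the role of the conceptual level, the view plays the role of one external view, and the storage details (the internal level) are left to SQLite.

```python
import sqlite3

# Internal level: how the rows are physically stored is decided by SQLite.
conn = sqlite3.connect(":memory:")

# Conceptual level: the entire database, here one hypothetical table.
conn.execute(
    "CREATE TABLE student ("
    "roll_no INTEGER PRIMARY KEY, name TEXT, branch TEXT, fee_due REAL)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "Asha", "CS", 0.0), (2, "Ravi", "EC", 1500.0)],
)

# External level: one user's view, derived from the conceptual objects.
# This user does not see the fee_due data item at all.
conn.execute(
    "CREATE VIEW student_directory AS "
    "SELECT roll_no, name, branch FROM student"
)

directory_rows = conn.execute(
    "SELECT * FROM student_directory ORDER BY roll_no"
).fetchall()
print(directory_rows)
```

The view definition is the "method of deriving" the external objects from the conceptual ones: a different user could be given a different view over the same table.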
Three-tier architecture typically comprises a presentation tier, a business or data access tier, and
a data tier. Three layers in the three-tier architecture are as follows:
1) Client layer
2) Business layer
3) Data layer
1) Client layer:
It is also called the Presentation layer and contains the UI part of the application. This layer is used for
design purposes: data is presented to the user or input is taken from the user.
Example - Designing a registration form which contains text boxes, labels, buttons etc.
2) Business layer:
In this layer all business logic is written, such as validation of data, calculations, data insertion etc. This
layer acts as an interface between the Client layer and the Data layer. It is also called the intermediate layer
and helps to make communication between the client and data layers faster.
3) Data layer:
This layer contains the actual database. The Data Access Layer contains methods to connect to the
database and to insert, update, delete and retrieve data based on the input data.
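The separation of the three layers can be sketched as follows. This is a minimal sketch and not a complete application: the names UserRepository, RegistrationService and submit_form are hypothetical, and an in-memory SQLite database stands in for the data tier.

```python
import sqlite3


# Data layer: the only code that talks to the database.
class UserRepository:
    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")

    def insert(self, name, age):
        self.conn.execute("INSERT INTO users VALUES (?, ?)", (name, age))

    def count(self):
        return self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


# Business layer: validation and rules; no SQL and no UI code here.
class RegistrationService:
    def __init__(self, repository):
        self.repository = repository

    def register(self, name, age):
        if not name.strip():
            raise ValueError("name is required")
        if age < 0:
            raise ValueError("age must not be negative")
        self.repository.insert(name.strip(), age)


# Client (presentation) layer: takes input and reports the result.
def submit_form(service, form):
    try:
        service.register(form["name"], int(form["age"]))
        return "Registered"
    except ValueError as error:
        return f"Error: {error}"


repo = UserRepository()
service = RegistrationService(repo)
ok_message = submit_form(service, {"name": "Asha", "age": "20"})
bad_message = submit_form(service, {"name": " ", "age": "20"})
print(ok_message, "|", bad_message, "|", repo.count())
```

The client layer never touches the database directly; it only calls the business layer, which in turn calls the data layer.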
Advantages
1. High performance, lightweight persistent objects
2. Scalability – Each tier can scale horizontally
3. Performance – Because the Presentation tier can cache requests, network utilization is minimized,
and the load is reduced on the Application and Data tiers.
4. High degree of flexibility in deployment platform and configuration
5. Better re-use
6. Improved data integrity
7. Improved security – The client does not have direct access to the database.
8. Easy to maintain – Modifying one module does not affect the other modules.
9. Good application performance.
Disadvantages
1. Increased complexity/effort
File Processing System | Database Management System (DBMS)
---------------------- | ---------------------------------
Uses a separate data file for each application. | All applications share a pool of related and integrated data.
Data redundancy – independent data files include a lot of duplicated data. | Minimal data redundancy – separate data files are integrated into a single, logical structure.
Same data is recorded and stored in several files. | Each occurrence of a data item is recorded only once.
Data inconsistency – several versions of the same data may exist. | A single version of the data exists.
Same update must be done in all occurrences of the same data item in each file. | A single update is required.
Users have very little opportunity to share data outside of their own application. | A database is developed to share the data among the users who access it.
There is no centralized control for overall data in different files. | There is centralized control for overall data in the database.
Data dependence – descriptions of files, records and data items are embedded within individual application programs. | Data independence – the database system separates data descriptions from the application programs that use the data in it.
Modification to data files requires the programs which access those files to be modified. | Data structure can be modified without changing the programs accessing the data.
High program maintenance. | Less program maintenance.
Lack of data integration – accessing data in several files is difficult. | Data are organized into a single logical structure with logical relationships defined between associated data.
Difficult to manipulate data. | Easy to manipulate data.
Database Instance
A database schema is the skeleton of the database. It is designed before the database exists. Once
the database is operational, it is very difficult to make any changes to the schema. A database schema does not
contain any data or information.
A database instance is a state of operational database with data at any given time. It contains a snapshot of
the database. Database instances tend to change with time.
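The difference between schema and instance can be seen in a short sketch. It assumes Python's built-in sqlite3 module; the table course and the helper functions schema and instance are hypothetical names used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT)")


def schema(conn):
    # The stored table definitions: the schema, which holds no data.
    rows = conn.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")
    return [row[0] for row in rows]


def instance(conn):
    # The rows present right now: one snapshot (instance) of the database.
    return conn.execute("SELECT code, title FROM course ORDER BY code").fetchall()


schema_before = schema(conn)
instance_before = instance(conn)

conn.execute("INSERT INTO course VALUES ('CS502', 'DBMS')")

schema_after = schema(conn)
instance_after = instance(conn)

print(schema_before == schema_after)    # the schema did not change
print(instance_before, instance_after)  # the instance did
```

Inserting a row changes the instance but leaves the schema exactly as it was.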
Data Independence
1. Logical data independence is the capacity to change the conceptual schema without having to change
external schemas or application programs. Users may change the conceptual schema to expand the database (by
adding a record type or data item), to change constraints, or to reduce the database (by removing a record
type or data item). In the last case, external schemas that refer only to the remaining data should not be
affected. Only the view definitions and the mappings need to be changed in a DBMS that supports
logical data independence.
2. Physical data independence is the capacity to change the internal schema without having to change the
conceptual schema. For example, the internal schema may be changed by creating additional access structures
to improve the performance of retrieval or update. If the same data as before remains in the database, users
should not have to change the
conceptual schema. Generally, physical data independence exists in most databases and file environments
where physical details such as the exact location of data on disk, and hardware details of storage encoding,
placement, compression, splitting, merging of records, and so on are hidden from the user. Applications
remain unaware of these details.
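Both kinds of data independence can be demonstrated in a small sketch, assuming Python's built-in sqlite3 module. The table employee, the view sales_staff, the index name and the function application_query are hypothetical. The application query is run three times: before any change, after a physical change (a new index), and after a logical change (a new data item).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", "Sales"), (2, "Ravi", "HR"), (3, "Meena", "Sales")],
)

# External schema: a view used by an application program.
conn.execute(
    "CREATE VIEW sales_staff AS SELECT id, name FROM employee WHERE dept = 'Sales'"
)


def application_query(conn):
    # The application only knows the view, not the table or its storage.
    return conn.execute("SELECT name FROM sales_staff ORDER BY id").fetchall()


before = application_query(conn)

# Physical change: add an access structure. The conceptual schema is untouched.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")
after_index = application_query(conn)

# Logical change: expand the conceptual schema with a new data item.
conn.execute("ALTER TABLE employee ADD COLUMN email TEXT")
after_new_column = application_query(conn)

print(before == after_index == after_new_column)
```

The application code is never edited, which is the point of both kinds of independence. Removing a data item that the view refers to would, of course, require the view definition to be changed.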
Interfaces in DBMS
A database management system (DBMS) interface is a user interface that allows users to submit
queries to a database without using the query language itself.
User-friendly interfaces provided by a DBMS include the following:
Menu-Based Interfaces for Web Clients or Browsing - These interfaces present the user with lists of
options (called menus) that lead the user through the formation of a request.
Forms-Based Interfaces - A forms-based interface displays a form to each user. Users can fill out all of the
form entries to insert new data, or they can fill out only certain entries, in which case the DBMS will
retrieve matching data for the remaining entries.
Graphical User Interface - A GUI typically displays a schema to the user in diagrammatic form. The user
can then specify a query by manipulating the diagram.
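How a forms-based interface might turn a partly filled form into a query can be sketched as below. This is only an assumption about one possible implementation, using Python's built-in sqlite3 module; the table book, the constant ALLOWED_FIELDS and the function search_by_form are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (title TEXT, author TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [
        ("Book A", "Author X", 2001),
        ("Book B", "Author X", 2010),
        ("Book C", "Author Y", 2010),
    ],
)

ALLOWED_FIELDS = ("title", "author", "year")  # the entries shown on the form


def search_by_form(conn, form):
    # Keep only the entries the user filled in; blank entries match anything.
    filled = {f: form[f] for f in ALLOWED_FIELDS if form.get(f) not in (None, "")}
    # Field names come from ALLOWED_FIELDS; user values are passed as parameters.
    where = " AND ".join(f"{field} = ?" for field in filled) or "1 = 1"
    sql = f"SELECT title, author, year FROM book WHERE {where} ORDER BY title"
    return conn.execute(sql, list(filled.values())).fetchall()


matches = search_by_form(conn, {"title": "", "author": "Author X", "year": ""})
print(matches)
```

The user never writes SQL: the filled-in entries become the search condition, and the data for the blank entries comes back from the database.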
Database Languages
Database languages are sets of statements that are used to define and manipulate a database. A
database language has a Data Definition Language (DDL), which is used to construct a database, and a Data
Manipulation Language (DML), which is used to access a database.
Database languages provide the tools to implement and manipulate a database. DDL and DML are not
two distinct languages; together they form a database language. The most common example of a database
language is SQL, which is supported by DBMS products such as MS Access and Oracle.
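The two parts of a database language can be seen in a short SQL session. The sketch below assumes Python's built-in sqlite3 module as the host; the table student and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of the database.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# DML: insert, update, delete and query the stored data.
conn.execute("INSERT INTO student (roll_no, name) VALUES (1, 'Asha')")
conn.execute("INSERT INTO student (roll_no, name) VALUES (2, 'Ravi')")
conn.execute("UPDATE student SET name = 'Ravi Kumar' WHERE roll_no = 2")
conn.execute("DELETE FROM student WHERE roll_no = 1")

students = conn.execute("SELECT roll_no, name FROM student").fetchall()
print(students)
```

CREATE TABLE belongs to the DDL part, while INSERT, UPDATE, DELETE and SELECT belong to the DML part of the same language.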
Data models
A data model gives us an idea of how the final system will look after its complete implementation. It
defines the data elements and the relationships between the data elements. Data models are used to
show how data is stored, connected, accessed and updated in the database management system. A
set of symbols and text is used to represent the information so that members of the organisation can
communicate and understand it. Although many data models are in use nowadays, the
Relational model is the most widely used model.
Generalization
It is a bottom-up approach in which two or more lower-level entities combine to form a higher-level entity. In
generalization, the higher-level entity can also combine with other lower-level entities to make
a further higher-level entity.
Generalization proceeds from the recognition that a number of entity sets share some common
features. On the basis of the commonalities, generalization synthesizes these entity sets into a
single, higher-level entity set.
Generalization is used to emphasize the similarities among lower-level entity sets and to hide the
differences in the schema.
Specialization
It is the opposite of Generalization. It is a top-down approach in which one higher-level entity is
broken down into lower-level entities.
Example: The specialization of student allows us to distinguish among students according to
whether they are Ex-Student or Current Student.
Specialization can be repeatedly applied to refine a design schema.
Figure 1.4 Specialization
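The student example can be sketched with a small class hierarchy. This is only an analogy in Python, not a database design: the attributes roll_no, name, semester and passing_year are hypothetical, since the notes do not list the attributes of these entity sets.

```python
from dataclasses import dataclass


@dataclass
class Student:
    # Higher-level entity set: attributes shared by all students.
    roll_no: int
    name: str


@dataclass
class CurrentStudent(Student):
    # Lower-level entity set with its own attribute.
    semester: int


@dataclass
class ExStudent(Student):
    # Lower-level entity set with its own attribute.
    passing_year: int


students = [CurrentStudent(1, "Asha", 5), ExStudent(2, "Ravi", 2020)]
names = [s.name for s in students]
kinds = [type(s).__name__ for s in students]
print(names, kinds)
```

Read top-down (Student is split into CurrentStudent and ExStudent) this is specialization; read bottom-up (the common attributes are pulled up into Student) it is generalization.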
Aggregation
One limitation of the E-R model is that it cannot express relationships among relationships.
Without such a construct, these situations have to be modeled with quaternary relationships, which lead to
redundancy in data storage. The best way to model such situations is to use aggregation.
Aggregation is an abstraction through which relationships are treated as higher-level entities.
An example of aggregation is the relationship between offer (which is a binary relationship between
center and course) and visitor.
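One possible relational mapping of this aggregation is sketched below, assuming Python's built-in sqlite3 module. The column names and the relationship name enquiry are hypothetical, since the notes do not name the relationship between visitor and offer. The point of the sketch is that enquiry refers to an offer as a whole (a center and course pair), not to a center and a course separately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE center  (center_id  INTEGER PRIMARY KEY, name  TEXT);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE visitor (visitor_id INTEGER PRIMARY KEY, name  TEXT);

-- offer: binary relationship between center and course.
CREATE TABLE offer (
    center_id INTEGER REFERENCES center,
    course_id INTEGER REFERENCES course,
    PRIMARY KEY (center_id, course_id)
);

-- enquiry: relates a visitor to an offer treated as one higher-level entity.
CREATE TABLE enquiry (
    visitor_id INTEGER REFERENCES visitor,
    center_id  INTEGER,
    course_id  INTEGER,
    FOREIGN KEY (center_id, course_id) REFERENCES offer (center_id, course_id)
);
""")

conn.execute("INSERT INTO center VALUES (1, 'City Center')")
conn.execute("INSERT INTO course VALUES (10, 'DBMS')")
conn.execute("INSERT INTO visitor VALUES (100, 'Asha')")
conn.execute("INSERT INTO offer VALUES (1, 10)")
conn.execute("INSERT INTO enquiry VALUES (100, 1, 10)")  # a course that is offered

try:
    # Course 99 is not offered by center 1, so no such offer exists.
    conn.execute("INSERT INTO enquiry VALUES (100, 1, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

enquiry_count = conn.execute("SELECT COUNT(*) FROM enquiry").fetchone()[0]
print(rejected, enquiry_count)
```

A visitor can be linked only to a center and course combination that exists in offer, which is what treating the relationship as a higher-level entity means.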
Data-model
Data models define how the logical structure of a database is modeled. Data models are fundamental
entities for introducing abstraction in a DBMS. Data models define how data items are connected to each other and
how they are processed and stored inside the system.
The very first data models were flat data models, in which all the data is kept in the same
plane. Earlier data models were not so scientific, hence they were prone to introducing a lot of duplication
and update anomalies.
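The duplication and update anomaly of a flat model can be shown with plain Python data. The rows, names and phone numbers below are hypothetical.

```python
# A flat model: every row repeats the department's phone number.
flat_rows = [
    {"employee": "Asha", "dept": "Sales", "dept_phone": "1111"},
    {"employee": "Ravi", "dept": "Sales", "dept_phone": "1111"},
    {"employee": "Meena", "dept": "HR", "dept_phone": "2222"},
]

# Update anomaly: the phone number is changed in one row but not in the other.
flat_rows[0]["dept_phone"] = "9999"


def phones_for(rows, dept):
    # Collect every phone number recorded for the given department.
    return {row["dept_phone"] for row in rows if row["dept"] == dept}


sales_phones = phones_for(flat_rows, "Sales")  # two conflicting values

# Storing each fact only once removes the duplication.
departments = {"Sales": "1111", "HR": "2222"}
employees = [("Asha", "Sales"), ("Ravi", "Sales"), ("Meena", "HR")]
departments["Sales"] = "9999"  # a single update is enough
phones_seen = {departments[dept] for _, dept in employees if dept == "Sales"}

print(sorted(sales_phones), sorted(phones_seen))
```

In the flat rows the Sales department ends up with two different phone numbers, while in the second structure one update is seen by every employee of the department.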