KKW Unit 1 Database System Concepts
The collection of data, usually referred to as the database, contains information relevant to an
enterprise.
The primary goal of a DBMS is to provide a way to store and retrieve database information that is both
convenient and efficient.
Management of data involves both defining structures for storage of information and providing
mechanisms for the manipulation of information.
In addition, the database system must ensure the safety of the information stored, despite system crashes
or attempts at unauthorized access.
If data are to be shared among several users, the system must avoid possible anomalous results.
Data Isolation:
Because data are scattered across different files, locations, and formats, writing new application programs to
retrieve them is a tedious task.
Data Integrity:
The database must satisfy certain consistency constraints, i.e. rules that keep the data accurate. E.g. in a patient
database, there may be a rule that x-ray results must be stored for a minimum of five years before they can be
deleted, so that they remain accessible and accurate.
Atomicity Problems:
If a failure occurs while a transaction is in progress, the transaction must either be executed completely or not at
all, so that the database stays in a consistent state. E.g. if a failure occurs while money is being transferred from
account A to account B, the transfer must either complete in full or leave both accounts unchanged. Guaranteeing
this is a major problem in a file-processing system.
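The all-or-nothing behaviour described above can be sketched with Python's built-in sqlite3 module. This is only an illustrative sketch: the account table, the transfer function, and the simulated failure are assumptions made for the example, not part of these notes.

```python
import sqlite3

# Hypothetical schema: one table holding two accounts, A and B.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 100)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
            if fail_midway:
                # Stand-in for a crash between the debit and the credit.
                raise RuntimeError("simulated failure after the debit")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
        return True
    except RuntimeError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))


failed = transfer(conn, "A", "B", 200, fail_midway=True)
after_failure = balances(conn)   # the debit was rolled back
succeeded = transfer(conn, "A", "B", 200)
after_success = balances(conn)   # both updates were applied together
```

After the simulated failure the balances are unchanged, because the partial debit is rolled back; only the second call moves the money.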
Security problems:
Data should be secured against unauthorised access and protected from malicious attacks, which is difficult to
enforce in a file-processing system.
In this architecture the application is partitioned into a component that resides on the client machine, which
invokes database system functionality on the server machine.
Application program interface standards like ODBC and JDBC are used for interaction between the client
and the server.
2] 3-tier Architecture
In this architecture the client machine acts as a front end and does not contain any direct database calls.
Instead, the client communicates with an application server. The application server in turn communicates with
the database system to access data.
The main logic for all application programs is embedded within the application server.
3-tier architecture is more appropriate for large applications, e.g. applications that run on the World Wide Web
and carry substantial business logic.
Physical level.
The lowest level of abstraction describes how the data are actually stored. The physical level describes
complex low-level data structures in detail.
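Logical level.
The next-higher level of abstraction describes what data are stored in the database and what relationships
exist among those data.
View level.
The highest level of abstraction describes only part of the entire database.
Even though the logical level uses simpler structures, complexity remains because of the variety of
information stored in a large database.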
Many users of the database system do not need all this information; instead, they need to access only a
part of the database.
The view level of abstraction exists to simplify their interaction with the system. The system may provide
many views for the same database.
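A view can be illustrated with a small sqlite3 sketch. The employee table and the employee_directory view below are hypothetical names chosen for the example; the point is only that a user of the view sees part of the database, here without the salary column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Logical level: the full table, including a sensitive column.
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO employee VALUES (1, 'Asha', 'Sales', 40000), (2, 'Ravi', 'HR', 35000);
    -- View level: expose only the columns a directory user needs.
    CREATE VIEW employee_directory AS SELECT name, dept FROM employee;
""")

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY name")
directory_columns = [column[0] for column in cursor.description]
directory = cursor.fetchall()
```

Several such views can be defined over the same table, one per group of users.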
Data Independence
It is the ability to change the storage structure and access strategy without affecting the applications that use
the data.
This is a prime advantage of a database. In conventional systems, applications are data-dependent.
For example, if a file is stored in indexed sequential form, then an application must know that the index
exists and must know the file sequence (as defined by the index), and the internal structure of the application
will be built around this knowledge.
If, for example, the file were to be replaced by a hash-addressed file, major modifications would have to be
made to the application.
Such an application is data-dependent. It is undesirable to allow applications to be data-dependent: the DBA
must have the freedom to change the storage structure or access strategy in response to changing requirements
without having to modify existing applications.
It is divided into 2 types:
Logical Data Independence: The ability to change the conceptual schema without having to change the
external schemas and their application programs. It is more difficult to achieve than physical data
independence, because application programs depend heavily on the logical structure of the data they access.
Ex. New fields can be added to the database without disturbing existing records or programs.
Physical Data Independence: The ability to change the internal schema without having to change the
conceptual schema. It is comparatively easy to achieve.
Ex. The file organization can be changed, or an index added to improve performance, without changing the
conceptual schema.
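Both kinds of data independence can be sketched with sqlite3. In this illustrative example (the student table, the index name, and the find_name function are assumptions), the same application query keeps working after a physical change (adding an index) and after a logical change (adding a column).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (rno INTEGER, name TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])


def find_name(conn, rno):
    """The 'application': it names columns, not storage structures."""
    row = conn.execute("SELECT name FROM student WHERE rno = ?", (rno,)).fetchone()
    return row[0] if row else None


before = find_name(conn, 2)

# Physical change: add an index. The access strategy changes, the query does not.
conn.execute("CREATE INDEX idx_student_rno ON student (rno)")
after_index = find_name(conn, 2)

# Logical change: add a new field. Existing records and the query are undisturbed.
conn.execute("ALTER TABLE student ADD COLUMN branch TEXT")
after_new_column = find_name(conn, 2)
```

The function find_name is never edited, yet it returns the same answer in all three states of the database.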
Instances and Schema
Instances –
The collection of information stored in the database at a particular moment is called an instance of the database.
Schema –
The overall design of the database is called the database schema.
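The difference between the two can be seen in a short sqlite3 sketch (the account table is a hypothetical example). The schema is the stored table definition, which stays fixed, while the instance is whatever rows exist at a given moment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (number INTEGER PRIMARY KEY, balance INTEGER)")


def schema(conn):
    """Schema: the table definition that SQLite keeps in its catalog."""
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'account'"
    ).fetchone()[0]


def instance(conn):
    """Instance: the rows stored in the table right now."""
    return conn.execute("SELECT number, balance FROM account ORDER BY number").fetchall()


schema_before, instance_before = schema(conn), instance(conn)
conn.execute("INSERT INTO account VALUES (101, 500)")
schema_after, instance_after = schema(conn), instance(conn)
```

Inserting a row changes the instance but leaves the schema exactly as it was.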
1) Storage Manager
The storage manager is important because databases typically require a large amount of storage space.
A storage manager is a program module that provides the interface between the low-level data stored in the
database and the application programs and queries submitted to the system.
Thus, the storage manager is responsible for storing, retrieving, and updating data in the database.
The storage manager components include:
Authorization and Integrity Manager - which tests for the satisfaction of integrity constraints and checks
the authority of users to access data.
Transaction Manager - which ensures that the database remains in a consistent (correct) state despite
system failures, and that concurrent transaction executions proceed without conflicting.
File Manager - which manages the allocation of space on disk storage and the data structures used to
represent information stored on disk.
Buffer Manager - which is responsible for fetching data from disk storage into main memory, and deciding
what data to cache in main memory.
The buffer manager is a critical part of the database system, since it enables the database to handle data sizes
that are much larger than the size of main memory.
2) Query Processor
The query processor is important because it helps the database system simplify and facilitate access to data.
It is the job of the database system to translate updates and queries written in a nonprocedural language, at
the logical level, into an efficient sequence of operations at the physical level.
DDL interpreter - which interprets DDL statements and records the definitions in the data dictionary.
DML compiler - which translates DML statements in a query language into an evaluation plan consisting of
low-level instructions that the query evaluation engine understands. A query can usually be translated into
any of a number of alternative evaluation plans that all give the same result. The DML compiler also
performs query optimization - that is, it picks the lowest-cost evaluation plan from among the alternatives.
Query evaluation engine - which executes low-level instructions generated by the DML compiler.
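The choice between alternative evaluation plans can be observed in SQLite through its EXPLAIN QUERY PLAN statement. The sketch below is illustrative only: the account table and the index name are assumptions, and the exact plan wording differs between SQLite versions, so the code only checks whether the index is mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (number INTEGER, owner TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [(n, f"user{n}", n * 10) for n in range(1000)],
)

QUERY = "SELECT balance FROM account WHERE number = 42"


def plan(conn, sql):
    """Return the plan text; the last field of each row is the readable detail."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_without_index = plan(conn, QUERY)   # only a full table scan is possible
conn.execute("CREATE INDEX idx_account_number ON account (number)")
plan_with_index = plan(conn, QUERY)      # the optimizer can now use the index
result = conn.execute(QUERY).fetchall()  # same answer under either plan
```

The query text is nonprocedural and never changes; only the plan chosen for it does.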
3) Database Users
i) Naive users
They are unsophisticated users who interact with the system by invoking one of the application programs that have
been written previously. Consider a user who wishes to find her account balance over the World Wide Web. Such a
user may access a form, where she enters her account number. An application program at the Web server then
retrieves the account balance, using the given account number, and passes this information back to the user.
ii) Application programmers
They are computer professionals who write application programs. Application programmers can choose from many
tools to develop user interfaces. Rapid application development (RAD) tools are tools that enable an application
programmer to construct forms and reports with minimal programming effort.
iii) Sophisticated users
They interact with the system without writing programs. Instead, they form their requests in a database query
language. They submit each such query to a query processor, whose function is to break down DML statements into
instructions that the storage manager understands. Analysts who submit queries to explore data in the database fall in
this category.
iv) Database Administrator
1) Schema definition - The DBA creates the original database schema by executing a set of data definition
statements in the DDL.
3) Schema and physical-organization modification - The DBA carries out changes to the schema and physical
organization to reflect the changing needs of the organization, or to alter the physical organization to improve
performance.
4) Granting of authorization for data access - By granting different types of authorization, the database
administrator can regulate which parts of the database various users can access. The authorization information is
kept in a special system structure that the database system consults whenever someone attempts to access the data in
the system.
5) Routine Maintenance
Examples of the database administrator's routine maintenance activities are:
Periodically backing up the database, either onto tapes or onto remote servers, to prevent loss of data in case
of disasters such as flooding.
Ensuring that enough free disk space is available for normal operations, and upgrading disk space as
required.
Monitoring jobs running on the database and ensuring that performance is not degraded by very expensive
tasks submitted by some users.
A) Network Model
Advantages:
• Network Model is able to model complex relationships and represents semantics of add/delete on the
relationships.
• Can handle most situations for modelling using record types and relationship types.
• Language is navigational; uses constructs like FIND, FIND member, FIND owner, FIND NEXT within set,
GET etc. Programmers can do optimal navigation through the database.
Disadvantages:
• Navigational and procedural nature of processing
• Database contains a complex array of pointers that thread through a set of records.
• Little scope for automated "query optimization"
B) Hierarchical Model
In this data model, data are organized in a top-down (inverted tree) structure that represents parent-child
relationships.
It allows 1:N mapping between record types.
Advantages:
• Hierarchical Model is simple to construct and operate on
• Corresponds to a number of natural hierarchically organized domains - e.g., assemblies in manufacturing,
personnel organization in companies
• Language is simple; uses constructs like GET, GET UNIQUE, GET NEXT, GET NEXT WITHIN PARENT
etc.
Disadvantages:
• Navigational and procedural nature of processing
• Database is visualized as a linear arrangement of records
• Little scope for "query optimization"
C) Relational Model
– It was introduced by Dr. E. F. Codd in 1970.
– Data is represented with the help of tables that describe persons, places, things, events, etc.
– All data elements are placed in two-dimensional tables called relations, which are comparable to files.
It supports the following operations:
Selecting: Data manipulation that eliminates rows according to certain criteria
Projecting: Data manipulation that eliminates columns in a table
Joining: Data manipulation that combines two or more tables
Linking: Relating tables in a relational database together
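The first three operations can be sketched as SQL queries run through Python's sqlite3 module. The emp and dept tables and their contents are hypothetical examples, not taken from these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, e_name TEXT, e_city TEXT, dept_id INTEGER);
    INSERT INTO dept VALUES (10, 'Sales'), (20, 'HR');
    INSERT INTO emp VALUES (1, 'Asha', 'Pune', 10),
                           (2, 'Ravi', 'Nashik', 20),
                           (3, 'Meena', 'Pune', 10);
""")

# Selecting: keep only the rows that satisfy a condition.
selected = conn.execute(
    "SELECT * FROM emp WHERE e_city = 'Pune' ORDER BY emp_id"
).fetchall()

# Projecting: keep only some of the columns.
projected = conn.execute("SELECT e_name, e_city FROM emp ORDER BY emp_id").fetchall()

# Joining: combine rows of two tables that agree on dept_id.
joined = conn.execute(
    "SELECT e_name, dept_name FROM emp JOIN dept ON emp.dept_id = dept.dept_id "
    "ORDER BY emp_id"
).fetchall()
```

Here the shared dept_id column is what links the two tables together.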
2. Foreign Key
– A foreign key is a column or combination of columns whose values are based on the primary key values of another table.
– A foreign key constraint, also known as a referential integrity constraint, requires each foreign key value to
correspond to an actual primary key value in the other table. Foreign keys link to data in other tables.
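A foreign key constraint can be tried out with sqlite3. This is a sketch with hypothetical dept and emp tables. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.executescript("""
    CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE emp (
        emp_id  INTEGER PRIMARY KEY,
        e_name  TEXT,
        dept_id INTEGER REFERENCES dept (dept_id)  -- foreign key to dept
    );
    INSERT INTO dept VALUES (10, 'Sales');
    INSERT INTO emp VALUES (1, 'Asha', 10);        -- dept 10 exists: accepted
""")

try:
    # Department 99 does not exist, so this row would violate the constraint.
    conn.execute("INSERT INTO emp VALUES (2, 'Ravi', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

emp_rows = conn.execute("SELECT emp_id, dept_id FROM emp").fetchall()
```

The second insert is refused, so every dept_id stored in emp still matches a row of dept.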
3. Super key
– It is a set of one or more attributes that, taken collectively, uniquely identify an entity in the entity set. For example,
SSN is a superkey.
4. Candidate key
Sometimes more than one attribute (or set of attributes) in a relation can uniquely identify a tuple. Each such minimal
set of attributes is known as a candidate key. E.g. consider Student (RNO, NAME, PER, BRANCH): if RNO and NAME are each
unique, then both are candidate keys.
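The relation between superkeys and candidate keys can be sketched in plain Python. The sample rows below are invented for illustration. Strictly, a key is a property of the schema and not of one set of rows, so this check only shows which attribute sets are unique in the sample data.

```python
from itertools import combinations

# Hypothetical sample instance of Student (RNO, NAME, PER, BRANCH).
ATTRS = ("RNO", "NAME", "PER", "BRANCH")
ROWS = [
    (1, "Asha", 78, "CO"),
    (2, "Ravi", 78, "IT"),
    (3, "Meena", 81, "CO"),
    (4, "Kiran", 78, "CO"),
]


def is_superkey(attrs, rows=ROWS, schema=ATTRS):
    """True if no two rows agree on all of the given attributes."""
    positions = [schema.index(a) for a in attrs]
    projected = [tuple(row[i] for i in positions) for row in rows]
    return len(set(projected)) == len(projected)


def candidate_keys(rows=ROWS, schema=ATTRS):
    """Minimal superkeys: superkeys that contain no smaller superkey."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(schema, size):
            contains_smaller_key = any(set(k) <= set(combo) for k in keys)
            if not contains_smaller_key and is_superkey(combo, rows, schema):
                keys.append(combo)
    return keys


keys = candidate_keys()
```

In this sample, (RNO, PER) is a superkey but not a candidate key, because RNO alone is already unique.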
► Attribute:
– It is the name of a column.
– Ex. For the Emp table, E-Name, E-City etc. are the attributes of the table.
– The number of attributes in a relation or table is called the degree (also arity or order) of the relation.
► Tuple:
– It is a row in a table. Ex. A particular record in a table represents a tuple.
– The number of tuples in a relation or table is called the cardinality of the relation.
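Degree and cardinality can both be read off a query result in sqlite3. The emp table below, with three columns and two rows, is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (e_name TEXT, e_city TEXT, e_salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [("Asha", "Pune", 40000), ("Ravi", "Nashik", 35000)],
)

cursor = conn.execute("SELECT * FROM emp")
degree = len(cursor.description)      # number of attributes (columns)
cardinality = len(cursor.fetchall())  # number of tuples (rows)
```

Adding a row changes the cardinality; only a change to the table definition changes the degree.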
The E-R model is based on a perception of the real world that consists of a set of basic objects called entities and
relationships among these objects.
Entity:
o An individual object in the real world is called an entity.
o An entity is a thing in the real world with an independent existence.
o E.g. in a relation Stud_Info, "Stud" is an entity.
Entity Set:
o A group of objects in a system is called an entity set.
o An entity set is a set of entities of the same type that share the same properties or attributes.
o Ex. In a relation Stud_Info, "Stud" and "Dept" are entity sets.
Attributes:
o The properties of an entity or entities are called attributes.
o Ex. In a relation Stud_Info, "Stud" has the attribute Stud-Name.
Relationship:
o A meaningful association among two or more entity sets is called a relationship.
o Ex. The relationship "Works" shows the association between "Stud_Info" and "Dept".
Keys:
o A column or set of columns used to uniquely identify a record in a relation.
Types of Attributes
Each attribute may take any value from its domain.
In an E-R diagram, an attribute is shown by an ellipse.
Attributes are divided into the following types:
1. Simple or Composite Attributes
Simple attributes cannot be divided into subparts, whereas composite attributes can be divided into subparts.
A simple attribute has a simple, indivisible structure. Ex. DOB
A composite attribute is formed from several other attributes. Ex. Name
2. Single Valued or Multivalued Attributes
Single-valued attributes are attributes that have only a single value for a particular entity. Ex. StudId
Multi-valued attributes are attributes that can have multiple values for a particular entity. Ex. Ph. No
3. Derived Attribute
The value for this type of attribute can be derived from the values of other related attributes or entities.
Ex. Age is calculated from DOB
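The attribute types above can be sketched with Python dataclasses. The Student and Name classes, their field names, and the sample values are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Name:
    """Composite attribute: it can be divided into subparts."""
    first: str
    last: str


@dataclass
class Student:
    stud_id: int   # simple, single-valued attribute
    name: Name     # composite attribute
    dob: date      # simple, single-valued attribute
    phone_numbers: List[str] = field(default_factory=list)  # multivalued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from dob instead of being stored."""
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)


student = Student(1, Name("Asha", "Patil"), date(2000, 6, 15), ["555-0100", "555-0101"])
age_before_birthday = student.age(date(2024, 6, 14))
age_on_birthday = student.age(date(2024, 6, 15))
```

Because age is computed on demand from DOB, it never goes stale the way a stored age value would.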
3. It can only relate one table to another table. | 3. It can relate one database to another database.
4. Its data security is low as compared to RDBMS. | 4. Its level of data security is very high as compared to DBMS.
Difference between Network Model & Hierarchical Model & Relational Model
Summer-2014 Summer-2016 Winter-2016
Network Model | Hierarchical Model | Relational Model
--------------|--------------------|-----------------
Supports many-to-many relationships. | Supports one-to-many or one-to-one relationships. | Supports one-to-one, one-to-many, and many-to-many relationships.
A record can have many parents as well as many children. | Based on parent-child relationship. | Based on relational data structures.
CODASYL (Conference on Data Systems Languages). | Does not provide an independent standalone query interface. | Relational databases are what bring many sources into a common query language (such as SQL).
Retrieve algorithms are complex and symmetric. | Retrieve algorithms are complex and asymmetric. | Retrieve algorithms are simple and symmetric.
Does not suffer from any insertion anomaly. | Cannot insert the information of a child who does not have any parent. | Does not suffer from any insertion anomaly.
Free from update anomalies. | Multiple occurrences of child records lead to problems of inconsistency during the update operation. | Free from update anomalies.
Free from delete anomalies. | Deletion of parent results in deletion of child records. | Free from delete anomalies.