Database Management Systems
Course Outcomes:
After completion of this course students will be able to:
1. Understand the different data models.
2. Implement relational database design from any data model.
3. Create database systems for real-world applications.
4. Handle the transaction management system.
CCA & LCA Component & Evaluation
Unit 1: Introduction to DBMS and Data Modelling
History of Database Management System
History (Contd.)
• 1980s:
– Research relational prototypes evolve into commercial systems
• SQL becomes industrial standard
– Parallel and distributed database systems
– Object-oriented database systems
• 1990s:
– Large decision support and data-mining applications
– Large multi-terabyte data warehouses
– Emergence of Web commerce
• Early 2000s:
– XML and XQuery standards
– Automated database administration
• Later 2000s:
– Giant data storage systems
• Google BigTable, Yahoo PNUTS, Amazon, ...
Basics of Database Management System
• DBMS contains information about a particular enterprise
– Collection of interrelated data
– Set of programs to access the data
– An environment that is both convenient and efficient to use
– Databases can be very large.
• Database Applications:
– Banking: transactions
– Airlines: reservations, schedules
– Universities: registration, grades
– Sales: customers, products, purchases
– Online retailers: order tracking, customized recommendations
– Manufacturing: production, inventory, orders, supply chain
– Human resources: employee records, salaries, tax deductions
• Example: University Database
– Application program examples
• Add new students, instructors, and courses
• Register students for courses, and generate class rosters
• Assign grades to students, compute grade point averages (GPA), and generate transcripts
DBMS Vs File Systems
• Drawbacks of using file systems to store data
– Data redundancy and inconsistency
• Multiple file formats, duplication of information in different files
– Difficulty in accessing data
• Need to write a new program to carry out each new task
– Data isolation
• Multiple files and formats
– Integrity problems
• Integrity constraints (e.g., account balance > 0) become “buried” in program code rather than being stated explicitly
• Hard to add new constraints or change existing ones
– Atomicity of updates
• Failures may leave database in an inconsistent state with partial updates carried out
• Example: Transfer of funds from one account to another should either complete or not happen at all
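The integrity and atomicity points above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a production design: the account table, the transfer helper, and the account names 'A' and 'B' are assumptions made for the example. The constraint balance > 0 is declared once in the schema instead of being buried in program code, and the transfer runs inside a transaction so a failure undoes the partial update.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            # Credit first, so a later failure would leave a partial update
            # if the transaction were not rolled back.
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was rolled back.
        return False

conn = sqlite3.connect(":memory:")
# The integrity constraint is stated explicitly in the schema.
conn.execute(
    "CREATE TABLE account ("
    "id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance > 0))"
)
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 20)")
conn.commit()

ok = transfer(conn, "A", "B", 50)       # succeeds
failed = transfer(conn, "A", "B", 500)  # would overdraw A, so nothing changes
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

After the failed transfer, account B does not keep the 500 credit, which is the "complete or not happen at all" behaviour described above.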
DBMS Vs File Systems contd.
• Concurrent access by multiple users
– Concurrent access needed for performance
– Uncontrolled concurrent accesses can lead to inconsistencies
• Example: Two people reading a balance (say 100) and updating it by withdrawing money (say 50 each) at the same time
• Security problems
– Hard to provide user access to some, but not all, data
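The concurrent-withdrawal example above can be traced with a small deterministic simulation. This is only a sketch: the function names are hypothetical, and the interleaving is fixed by hand rather than produced by real threads, so that the lost update is reproducible.

```python
def withdraw_uncontrolled(balance, amounts):
    """Every user reads the same starting balance before anyone writes."""
    reads = [balance for _ in amounts]      # all reads happen first
    for read_value, amount in zip(reads, amounts):
        balance = read_value - amount       # each write overwrites the last
    return balance

def withdraw_serial(balance, amounts):
    """Each user reads the balance only after the previous write finished."""
    for amount in amounts:
        balance = balance - amount
    return balance

lost_update = withdraw_uncontrolled(100, [50, 50])
correct = withdraw_serial(100, [50, 50])
print(lost_update, correct)
```

With uncontrolled access the final balance is 50 even though 100 was withdrawn in total; run one after the other, the withdrawals leave 0.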
Database System Architectures
• Centralized Systems
• Client-Server Systems
• Parallel Systems
• Distributed Systems
Centralized Systems
• Run on a single computer system and do not interact with other computer systems.
• General-purpose computer system: one to a few CPUs and a number of device controllers that are connected through a common bus that provides access to shared memory.
• Single-user system (e.g., personal computer or workstation): desktop unit, single user, usually has only one CPU and one or two hard disks; the OS may support only one user.
• Multi-user system: more disks, more memory, multiple CPUs, and a multi-user OS. Serves a large number of users who are connected to the system via terminals. Often called server systems.
Client-Server Systems
• Server systems satisfy requests generated at m client systems, whose general structure is shown in the diagram.
• Database functionality can be divided into:
– Back-end: manages access structures, query evaluation and optimization, concurrency control and recovery.
– Front-end: consists of tools such as forms, report-writers, and graphical user interface facilities.
• The interface between the front-end and the back-end is through SQL or through an application program interface.
Parallel Systems
• Parallel database systems consist of multiple processors and multiple disks connected by a fast interconnection network.
• A coarse-grain parallel machine consists of a small number of powerful processors.
• A massively parallel or fine-grain parallel machine utilizes thousands of smaller processors.
• Two main performance measures:
– Throughput: the number of tasks that can be completed in a given time interval
– Response time: the amount of time it takes to complete a single task from the time it is submitted
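The two performance measures above can be worked through on a small example. The task timings below are made-up numbers chosen for illustration, and the helper names are hypothetical.

```python
def throughput(completed_tasks, interval_seconds):
    """Tasks completed per second over a given time interval."""
    return completed_tasks / interval_seconds

def response_times(tasks):
    """Per-task response time: finish time minus submission time."""
    return [finish - submit for submit, finish in tasks]

# Each task is (submit_time, finish_time) in seconds; illustrative values.
tasks = [(0, 4), (1, 3), (2, 8)]
window = 10  # length of the observation interval in seconds

tput = throughput(len(tasks), window)
times = response_times(tasks)
mean_response = sum(times) / len(times)
print(tput, times, mean_response)
```

A system can improve one measure without improving the other: running more tasks in parallel raises throughput even if each individual task takes just as long.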
Distributed Systems
• Data spread over multiple machines (also referred to as sites or nodes).
• Network interconnects the machines
• Data shared by users on multiple machines
• Homogeneous distributed databases
– Same software/schema on all sites, data may be partitioned among sites
– Goal: provide a view of a single database, hiding details of distribution
• Heterogeneous distributed databases
– Different software/schema on different sites
– Goal: integrate existing databases to provide useful functionality
• Differentiate between local and global transactions
– A local transaction accesses data in the single site at which the transaction was initiated.
– A global transaction either accesses data in a site different from the one at which the transaction was initiated or accesses data in several different sites.
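The local/global distinction above reduces to a simple set test. The function below is a hypothetical helper written for illustration; the site names "S1", "S2", and "S3" are assumptions.

```python
def classify_transaction(origin_site, accessed_sites):
    """Return 'local' if every accessed site is the originating site,
    otherwise 'global'."""
    return "local" if set(accessed_sites) <= {origin_site} else "global"

local_case = classify_transaction("S1", ["S1"])
remote_case = classify_transaction("S1", ["S2"])        # one different site
multi_case = classify_transaction("S1", ["S1", "S3"])   # several sites
print(local_case, remote_case, multi_case)
```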
Levels of Abstraction
Instances and Schemas
Data Independence
Types of Data Independence:
• Physical Data Independence: the ability to modify the physical schema without changing the logical schema
⮚ Applications depend on the logical schema
⮚ In general, the interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others.
• Logical Data Independence: the ability to change the conceptual schema without changing
⮚ External views
⮚ External API or programs
⮚ Any change made will be absorbed by the mapping between external and conceptual levels.
⮚ Logical data independence is harder to achieve than physical data independence.
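Both kinds of data independence can be sketched with Python's built-in sqlite3 module. The table, column, and index names here are assumptions made for the example. The same application query is run three times: on the original table, after a physical change (adding an index), and after a conceptual change (splitting the table in two) that is absorbed by a view with the old name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE instructor (ID INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
INSERT INTO instructor VALUES (1, 'Asha', 90000), (2, 'Ravi', 75000);
""")

# The application only knows this query; it never changes below.
query = "SELECT name FROM instructor WHERE salary > 80000"
before = conn.execute(query).fetchall()

# Physical change: add an index. The logical schema is untouched.
conn.execute("CREATE INDEX idx_salary ON instructor(salary)")
after_index = conn.execute(query).fetchall()

# Conceptual change: split the table, then map the old shape onto the
# new tables with a view (the external/conceptual mapping).
conn.executescript("""
CREATE TABLE instructor_name AS SELECT ID, name FROM instructor;
CREATE TABLE instructor_pay AS SELECT ID, salary FROM instructor;
DROP TABLE instructor;
CREATE VIEW instructor AS
    SELECT n.ID, n.name, p.salary
    FROM instructor_name n JOIN instructor_pay p ON p.ID = n.ID;
""")
after_view = conn.execute(query).fetchall()
print(before, after_index, after_view)
```

The view handles reads only; keeping updates working through such a mapping takes more effort, which is one reason logical data independence is the harder of the two.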
Data Models
• A collection of tools for describing
– Data
– Data relationships
– Data semantics
– Data constraints
• Relational model
• Entity-Relationship data model (mainly for database design)
• Object-based data models (Object-oriented and Object-relational)
• Semistructured data model (XML)
• Other older models:
– Network model
– Hierarchical model
Database System Languages
Database System Structure/Architecture
Important Components of Database System:
⮚ Database Users
⮚ Query Processing
⮚ Storage Management
⮚ Transaction Management
Database System Components : Database Users
Database System Components : Query Processing
DML Pre-Compiler:
• It converts DML statements embedded in an application program to normal procedure calls in the host language.
• The pre-compiler must interact with the query processor in order to generate the appropriate code.
DDL Compiler:
• The DDL compiler converts the data definition statements into a set of tables. These tables contain information concerning the database and are in a form that can be used by other components of the DBMS.
File Manager:
• The file manager manages the allocation of space on disk storage and the data structures used to represent information stored on disk.
Query Processor:
• The query processor interprets an online user’s query and converts it into an efficient series of operations in a form capable of being sent to the data manager for execution.
• The query processor uses the data dictionary to find the details of the data files and, using this information, creates a query plan (access plan) to execute the query.
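The roles of the DDL compiler and the query processor described above can be observed in SQLite through Python's sqlite3 module. This is a sketch on one particular engine, not a description of every DBMS: sqlite_master is SQLite's catalog table, EXPLAIN QUERY PLAN is SQLite's way of showing the access plan, and the student table and idx_dept index are names assumed for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL statements are recorded as rows in the catalog table.
conn.execute("CREATE TABLE student (ID INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.execute("CREATE INDEX idx_dept ON student(dept)")
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# The query processor consults the catalog and produces an access plan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM student WHERE dept = ?", ("CS",)
).fetchall()
plan_text = " ".join(row[-1] for row in plan)  # last column holds the detail text

print(catalog)
print(plan_text)  # the exact wording depends on the SQLite version
```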
Database System Components : Query Processing
Database System Components : Query Processing
Data Dictionary:
• Data dictionary is the table which contains the information about database objects. It contains
information like
Database System Components : Storage Management
• Issues:
– Storage access
– File organization
– Indexing and hashing
Database System Components : Transaction Management
Note: ER diagram notation in 6th edition of Database System Concepts changed from earlier
editions; now based on UML class diagram notation with some modifications.
Relationship Sets With Attributes
• An attribute can also be a property of a relationship set.
• For instance, the advisor relationship set between entity sets instructor and student may have
the attribute date
– E.g. date may track when the student started being associated with the advisor
Attributes
• An entity is represented by a set of attributes, that is, descriptive properties possessed by all members of an entity set.
– Example:
instructor = (ID, name, street, city, salary )
course= (course_id, title, credits)
• Domain – the set of permitted values for each attribute
• Attribute types:
– Simple and composite attributes.
– Single-valued and multivalued attributes
• Example: multivalued attribute: phone_numbers
– Derived attributes
• Can be computed from other attributes
• Example: age, given date_of_birth
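The attribute types above can be sketched as a Python data structure. The class layout is an assumption made for illustration: Address stands in for a composite attribute built from the street and city components, phone_numbers is the multivalued attribute, and age is derived from date_of_birth instead of being stored.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

@dataclass
class Address:
    """Composite attribute: made up of component attributes."""
    street: str
    city: str

@dataclass
class Instructor:
    ID: str                      # simple, single-valued
    name: str                    # simple, single-valued
    address: Address             # composite
    date_of_birth: date          # stored base attribute
    phone_numbers: List[str] = field(default_factory=list)  # multivalued

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        years = today.year - self.date_of_birth.year
        # Subtract one if this year's birthday has not happened yet.
        if (today.month, today.day) < (self.date_of_birth.month,
                                       self.date_of_birth.day):
            years -= 1
        return years

inst = Instructor(
    ID="10101",
    name="Srinivasan",
    address=Address(street="Main", city="Pune"),
    date_of_birth=date(1980, 6, 15),
    phone_numbers=["555-0100", "555-0101"],
)
age_before_birthday = inst.age(date(2024, 6, 14))
age_on_birthday = inst.age(date(2024, 6, 15))
print(age_before_birthday, age_on_birthday)
```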
Composite Attributes
Entity With Composite, Multivalued, and Derived Attributes
Degree of a Relationship Set
• Binary relationship
– Involves two entity sets (degree two).
– Most relationship sets in a database system are binary.
• Relationships between more than two entity sets are rare. (More on this later.)
• Ternary relationship
⮚ Example: students work on research projects under the guidance of an instructor.
⮚ The relationship proj_guide is a ternary relationship between instructor, student, and project.
Mapping Cardinality Constraints
• Express the number of entities to which another entity can be associated via a relationship set.
• Most useful in describing binary relationship sets.
• For a binary relationship set the mapping cardinality must be one of the following types:
– One to one
– One to many
– Many to one
– Many to many
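The four mapping cardinalities can be told apart by checking how many partners each entity has on either side. The sketch below classifies one set of relationship instances given as (a, b) pairs from entity set A to entity set B. The function name and sample identifiers are hypothetical. A real mapping cardinality is a constraint on every legal instance, so this only reports what a particular sample is consistent with.

```python
def mapping_cardinality(pairs):
    """Classify a binary relationship instance given as (a, b) pairs."""
    a_to_b, b_to_a = {}, {}
    for a, b in pairs:
        a_to_b.setdefault(a, set()).add(b)
        b_to_a.setdefault(b, set()).add(a)
    # Does some entity in A relate to more than one entity in B?
    a_has_many = any(len(bs) > 1 for bs in a_to_b.values())
    # Does some entity in B relate to more than one entity in A?
    b_has_many = any(len(as_) > 1 for as_ in b_to_a.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"

one_one = mapping_cardinality([("i1", "s1"), ("i2", "s2")])
one_many = mapping_cardinality([("i1", "s1"), ("i1", "s2")])
many_one = mapping_cardinality([("i1", "s1"), ("i2", "s1")])
many_many = mapping_cardinality([("i1", "s1"), ("i1", "s2"), ("i2", "s1")])
print(one_one, one_many, many_one, many_many)
```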
Mapping Cardinalities
● Total participation (indicated by double line): every entity in the entity set participates in at least one relationship in the relationship set
● E.g. participation of section in sec_course is total
– every section must have an associated course
● Partial participation: some entities may not participate in any relationship in the relationship set
● Example: participation of instructor in advisor is partial
Alternative Notation for Cardinality Limits
Weak Entity Sets
• An entity set that does not have a primary key is referred to as a weak entity set.
• The existence of a weak entity set depends on the existence of an identifying entity set
– it must relate to the identifying entity set via a total, one-to-many relationship set from the identifying to the weak entity set
– Identifying relationship depicted using a double diamond
• The discriminator (or partial key) of a weak entity set is the set of attributes that distinguishes among all the entities of a weak entity set.
• The primary key of a weak entity set is formed by the primary key of the strong entity set on which the weak entity set is existence dependent, plus the weak entity set’s discriminator.
Weak Entity Sets (Cont.)
• We underline the discriminator of a weak entity set with a dashed line.
• We put the identifying relationship of a weak entity in a double diamond.
• Primary key for section: (course_id, sec_id, semester, year)
Weak Entity Sets (Cont.)
• Note: the primary key of the strong entity set is not explicitly stored with the weak entity set,
– it is implicit in the identifying relationship.
• If course_id were explicitly stored, section could be made a strong entity
– but then there is an implicit relationship defined by the attribute course_id common to course
and section
– and the implicit relationship duplicates the explicit relationship between section and course
• Example:
– Strong Entity: Professor(ID, Name, City, Salary)
– Weak Entity: Dependent(Name, DOB, Relation)
• The Dependent entity will share the ID attribute of Professor.
• Resultant Schema:
– Dependent(ID, Name, DOB, Relation)
– The primary key for the weak entity Dependent will be ID + Name, as Name is the discriminator attribute.
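The Professor/Dependent schema above can be written as table definitions using Python's built-in sqlite3 module. This is a sketch under a few assumptions: the sample rows are invented, and ON DELETE CASCADE is one possible way to express existence dependence that the slides do not prescribe. The composite primary key (ID, name) combines the strong entity's key with the discriminator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE professor (
    ID INTEGER PRIMARY KEY, name TEXT, city TEXT, salary INTEGER
);
CREATE TABLE dependent (
    ID INTEGER NOT NULL REFERENCES professor(ID) ON DELETE CASCADE,
    name TEXT NOT NULL,          -- discriminator (partial key)
    dob TEXT,
    relation TEXT,
    PRIMARY KEY (ID, name)       -- strong entity's key + discriminator
);
INSERT INTO professor VALUES (1, 'Rao', 'Pune', 90000), (2, 'Iyer', 'Mumbai', 85000);
-- The same dependent name may appear under different professors.
INSERT INTO dependent VALUES (1, 'Anu', '2010-01-05', 'daughter'),
                             (2, 'Anu', '2012-03-09', 'daughter');
""")

def rejected(row):
    """Return True if inserting `row` into dependent violates a constraint."""
    try:
        conn.execute("INSERT INTO dependent VALUES (?, ?, ?, ?)", row)
        return False
    except sqlite3.IntegrityError:
        return True

duplicate_rejected = rejected((1, "Anu", "2011-01-01", "son"))  # same (ID, name)
orphan_rejected = rejected((99, "Ben", "2015-07-07", "son"))    # no such professor

# Deleting the strong entity removes its dependents.
conn.execute("DELETE FROM professor WHERE ID = 1")
remaining = conn.execute("SELECT ID, name FROM dependent").fetchall()
print(duplicate_rejected, orphan_rejected, remaining)
```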
E-R Diagram for a University Enterprise
E-R Diagram Example
• Question: Design an ER diagram for the airline reservation scenario given below. The flight database stores details about an airline’s fleet, flights, and seat bookings.
• An airplane has a model number, a registration number, and the capacity to take one or more passengers.
• An airplane flight has a unique flight number, a departure airport, a destination airport, a departure date and time, and an arrival date and time.
• A passenger has given names (first name, last name), contact and a unique email address.
Extended ER Features : Specialization
• Top-down design process; we designate subgroupings within an entity set that are distinctive from other entities in the set.
• In this top-down approach, one higher-level entity set can be broken down into two or more lower-level entity sets.
• These subgroupings become lower-level entity sets that have attributes or participate in relationships that do not apply to the higher-level entity set.
• Depicted by a triangle component labeled ISA (E.g. instructor “is a” person).
• Attribute inheritance: a lower-level entity set inherits all the attributes and relationship participation of the higher-level entity set to which it is linked.
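Attribute inheritance under an ISA relationship resembles class inheritance, which gives a compact way to picture it. The sketch below is an analogy and not a database schema: the Person, Instructor, and Student classes and their attributes (salary, tot_credits) are assumptions made for the example.

```python
from dataclasses import dataclass, fields

@dataclass
class Person:
    """Higher-level entity set."""
    ID: str
    name: str

@dataclass
class Instructor(Person):
    """Lower-level entity set: instructor 'is a' person."""
    salary: int          # applies only to instructors

@dataclass
class Student(Person):
    """Lower-level entity set: student 'is a' person."""
    tot_credits: int     # applies only to students

instructor = Instructor(ID="10101", name="Srinivasan", salary=65000)

# The lower-level set has the inherited attributes plus its own.
instructor_attrs = [f.name for f in fields(Instructor)]
person_attrs = [f.name for f in fields(Person)]
print(instructor_attrs, person_attrs, isinstance(instructor, Person))
```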
Specialization Example
Extended ER Features : Generalization
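Generalization is a bottom-up process: lower-level entity sets that share attributes are combined into a higher-level entity set. That higher-level entity set can in turn combine with other lower-level entity sets to form a still higher-level entity set.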
Generalization Example
Extended ER Features : Aggregation
Reduction to Relational Schemas
Representing Entity Sets With Simple Attributes