Unit 1 PART A

The document discusses database management systems (DBMS) including their purpose, components, data models, and advantages. A DBMS is software that allows for the creation, maintenance and use of large collections of data and includes programs to access and update the stored data.

Uploaded by

vinodnangare01

Unit-I

Introduction to DBMS
Syllabus
• Introduction: Introduction to database systems application,
purpose of database system. Introduction to Data models,
Three-schema architecture of a database, Components of a
DBMS.
• E-R model: modeling, entity, attributes, relationships, constraints,
components of E-R model.
• Relational model: basic concepts, attributes and domains,
concept of integrity and referential constraints, schema diagram.
What is a DBMS?
• A database is a collection of data, typically describing the activities of
one or more related organizations.
• Entities:
• students,
• faculty,
• courses, and classrooms.
• Relationships between entities:
• students' enrollment in courses,
• faculty teaching courses, and
• the use of rooms for courses.
• A database management system, or DBMS, is software designed to assist in
maintaining and utilizing large collections of data.
DBMS
• DBMS contains information about a particular enterprise
• Collection of interrelated data
• Set of programs to access the data
• An environment that is both convenient and efficient to use
• Database systems are used to manage collections of data that are:
• Highly valuable
• Relatively large
• Accessed by multiple users and applications, often at the same time.
• A modern database system is a complex software system whose task is
to manage a large, complex collection of data.
• Databases touch all aspects of our lives
Examples of DBMS
• Enterprise Information
• Sales: customers, products, purchases
• Accounting: payments, receipts, assets
• Human Resources: Information about employees, salaries, payroll taxes.
• Manufacturing: management of production, inventory, orders, supply chain.
• Banking and finance
• customer information, accounts, loans, and banking transactions.
• Credit card transactions
• Finance: sales and purchases of financial instruments (e.g., stocks and bonds); storing
real-time market data
• Universities:
• registration, grades
Examples of DBMS
• Airlines:
• reservations, schedules
• Telecommunication:
• records of calls, texts, and data usage, generating monthly bills, maintaining balances on
prepaid calling cards
• Web-based services
• Online retailers: order tracking, customized recommendations
• Online advertisements
• Navigation systems:
• For maintaining the locations of various places of interest along with the exact routes of
roads, train systems, buses, etc.
• Document databases
Need of DBMS
In the early days, database applications were built directly on top of file
systems, which led to:
• Data redundancy and inconsistency: data is stored in multiple file
formats, resulting in duplication of information in different files
• Difficulty in accessing data
• Need to write a new program to carry out each new task
• Data isolation
• Multiple files and formats
• Integrity problems
• Integrity constraints (e.g., account balance > 0) become “buried” in program code rather
than being stated explicitly
• Hard to add new constraints or change existing ones
Need of DBMS
• Atomicity of updates
• Failures may leave database in an inconsistent state with partial updates carried out
• Example: Transfer of funds from one account to another should either complete or not happen at
all
• Concurrent access by multiple users
• Concurrent access needed for performance
• Uncontrolled concurrent accesses can lead to inconsistencies
• Ex: Two people reading a balance (say 100) and updating it by withdrawing money (say 50
each) at the same time
• Security problems
• Hard to provide user access to some, but not all, data

Database systems offer solutions to all the above problems
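
The funds-transfer example can be sketched in a few lines. This is a minimal illustration only, assuming Python's built-in sqlite3 module as the DBMS; the table layout, the account numbers, and the helper names make_bank, transfer and balances are hypothetical. The constraint is written as balance >= 0 so that an account may be emptied.

```python
import sqlite3

def make_bank():
    """Create an in-memory bank with two accounts (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    # The integrity constraint is stated once, in the schema, not buried in code.
    conn.execute(
        "create table account ("
        " account_no integer primary key,"
        " balance integer not null check (balance >= 0))"
    )
    conn.executemany("insert into account values (?, ?)", [(1, 100), (2, 20)])
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "update account set balance = balance + ? where account_no = ?",
                (amount, dst),
            )
            conn.execute(
                "update account set balance = balance - ? where account_no = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the credit to dst was rolled back as well

def balances(conn):
    """Return {account_no: balance} for all accounts."""
    return dict(conn.execute("select account_no, balance from account"))
```

A transfer that would overdraw the source account fails the check constraint on the second update, and the first update is undone with it, so no partial update remains visible.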


ADVANTAGES OF A DBMS
• Data Independence
• Efficient Data Access
• Data Integrity and Security
• Data Administration
• Concurrent Access and Crash Recovery
• Reduced Application Development Time
University Database Example
• Data consists of information about:
• Students
• Instructors
• Classes
• Application program examples:
• Add new students, instructors, and courses
• Register students for courses, and generate class rosters
• Assign grades to students, compute grade point averages (GPA) and generate transcripts
View of Data
• A database system is a collection of interrelated data and a set of programs
that allow users to access and modify these data.
• A major purpose of a database system is to provide users with an abstract
view of the data.
• Data models
• A collection of conceptual tools for describing data, data relationships,
data semantics, and consistency constraints.
• Data abstraction
• Hides the complexity of the data structures used to represent
data in the database from users, through several levels of data
abstraction.
View of Data
• An architecture for a database system: 3-Schema Architecture
• View level (application programs): shows only specific data, e.g., the
address of a student, and hides the other information
• Logical level: describes the data and the relationships among the data,
e.g., Roll: int, name: string
• Physical level: how a record, e.g., a student record, is stored


Physical Data Independence
• The ability to modify the physical schema without changing the logical
or view level schema
• Like an interface and its implementation in OOP
• Applications depend on the logical schema
• In general, the interfaces between the various levels and components should be
well defined so that changes in some parts do not seriously influence others.

• Performance tuning – a modification at the physical level, e.g., creating a new index.
• Physical data independence is achieved because the modification is localized
• achieved by suitably modifying the physical-level to logical-level (PL-LL) mapping.
• a very important feature of a modern DBMS
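
A small sketch of physical data independence, assuming Python's built-in sqlite3 module. The student table, its rows, and the index name idx_student_city are hypothetical, and "explain query plan" is an SQLite-specific command. The same query text is run before and after an index is created; only the access plan is expected to differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table student (roll integer, name text, city text)")
conn.executemany(
    "insert into student values (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ravi", "Shirdi"), (3, "Meena", "Pune")],
)

# The application only knows the logical schema: table and column names.
QUERY = "select name from student where city = 'Pune' order by roll"

def run_query():
    return [row[0] for row in conn.execute(QUERY)]

def plan():
    # The last column of each plan row is a human-readable description.
    return " ".join(row[-1] for row in conn.execute("explain query plan " + QUERY))

before_rows, before_plan = run_query(), plan()
conn.execute("create index idx_student_city on student(city)")  # physical change
after_rows, after_plan = run_query(), plan()
```

Comparing before_rows with after_rows shows that the application sees the same answer, while before_plan and after_plan show how the DBMS chose to reach the data.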
Logical Data Independence
• The ability to change the logical-level schema without affecting the
view-level schemas or application programs
• Adding a new attribute to some relation
• no need to change the programs or views that do not use the
new attribute
• Deleting an attribute
• no need to change the programs or views that use the remaining data
• view definitions in the VL-LL mapping only need to be changed for views that
use the deleted attribute
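
Logical data independence can be illustrated the same way. This is a sketch under stated assumptions: Python's sqlite3 module, a hypothetical instructor table with made-up rows, and a hypothetical view named instructor_names. A program that reads through the view is not affected when a new attribute is added to the underlying relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table instructor (ID text, name text, dept_name text)")
conn.execute("insert into instructor values ('10101', 'Asha', 'Info Tech')")

# View-level schema used by one group of users (hypothetical view name).
conn.execute("create view instructor_names as select ID, name from instructor")

def program_using_view():
    """An application program that depends only on the view."""
    return conn.execute("select ID, name from instructor_names").fetchall()

before = program_using_view()
# Logical-level change: add a new attribute to the relation.
conn.execute("alter table instructor add column salary numeric")
after = program_using_view()
```

Here before and after hold the same rows, because the view definition does not mention the new salary attribute.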
Instances and Schemas
• Similar to types and variables in programming languages
• Logical Schema – the overall logical structure of the database
• Example: The database consists of information about a set of customers
and accounts in a bank and the relationship between them
• Analogous to type information of a variable in a program
• Physical schema:
• The overall physical structure of the database
• Instance
• The actual content of the database at a particular point in time
• Analogous to the value of a variable
Metadata
• Data and schema are stored separately
• In a relational DBMS
• Schema: table name, attribute names with their data types, constraints
• Schema: fixed when the database is designed and rarely changed afterwards
• Instance: the content at a given time; changes whenever a record is inserted or updated
• Metadata: data about data

• Bank Database

• Account Schema
• Account No
• Account Type
• Balance
• Min Bal
• Branch Name

• Account Instance
Account No   Account Type   Balance   Min Bal   Branch Name
2647859      Saving         2300      1000      Kopargaon
2354698      Current        5000      5000      Shirdi

• Customer Schema
• Account No
• Name of Customer
• PAN Card
• Address
• Mobile No

• Customer Instance
Account No   Name          PAN Card   Address     Mobile No
2647859      Shon Jadhav   AXXX566J   Kopargaon   98XXXXXX56
2354698      Rama          RMGXXXXR   Pune        76XXXXXX67
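
The split between schema and instance can be seen directly in a running system. The following is only a sketch, assuming Python's sqlite3 module; the column names are lower-case renderings of the Account schema above, and "pragma table_info" is an SQLite-specific way to read the catalog.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table account ("
    " account_no integer primary key,"
    " account_type text,"
    " balance integer,"
    " min_bal integer,"
    " branch_name text)"
)
conn.executemany(
    "insert into account values (?, ?, ?, ?, ?)",
    [
        (2647859, "Saving", 2300, 1000, "Kopargaon"),
        (2354698, "Current", 5000, 5000, "Shirdi"),
    ],
)

# Schema (metadata): column names and declared types, read from the catalog.
schema = [(row[1], row[2]) for row in conn.execute("pragma table_info(account)")]

# Instance: the rows stored at this moment.
instance = conn.execute("select * from account order by account_no").fetchall()
```

Inserting another row changes instance but leaves schema as it was, which mirrors the type versus value analogy from programming languages.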
Data Models
• Collection of conceptual tools to describe the database at a certain level of abstraction.
• A collection of tools for describing data, data relationships, data semantics, and data constraints
• Conceptual Data Model
• a high level description
• useful for requirements understanding.
• Entity-Relationship data model (mainly for database design)
• Representational Data Model
• Describing the logical representation of data without giving details of physical representation.
• Relational model
• Physical Data Model
• Description giving details about record formats, file structures etc.
• Object-based data models (Object-oriented and Object-relational)
• Semi-structured data model (XML)
• Other older models:
• Network model
• Hierarchical model
Representational Data Model
• Relational Model:
• Provides the concept of a relation
• All the data is stored in various tables.
• Example of tabular data in the relational model: columns are attributes, rows are tuples
• Relation schema: the attribute names of the relation (columns)
• Relation data / instance: the set of data tuples (rows)
Development Process of a Database System (1/2)
Step 1. Requirements collection
• Data model requirements
• various pieces of data to be stored and the interrelationships.
• presented using a conceptual data model such as E/R model.
• Functional requirements
• various operations that need to be performed as part of running the
enterprise.
• Example:
• Acquiring a new book, enrolling a new user, issuing a book to the user, recording the
return of a book etc
Development Process of a Database System (2/2)
Step 2. Convert the data model into a representational level model
• typically relational data model.
• choose an RDBMS system and create the database.

Step 3. Convert the functional requirements into application programs
• programs in a high-level language that use embedded SQL to interact with
the database and carry out the required tasks.
Data Definition Language (DDL)
• Specification notation for defining the database schema
Example:
create table instructor (
    ID char(5),
    name varchar(20),
    dept_name varchar(20),
    salary numeric(8,2));
• DDL compiler generates a set of table templates stored in a data dictionary
• Data dictionary contains metadata (i.e., data about data)
• Database schema
• Integrity constraints
• Primary key (ID uniquely identifies instructors)
• Authorization
• Who can access what
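
As a rough illustration of DDL, the data dictionary and an integrity constraint working together, the sketch below runs the instructor definition through Python's sqlite3 module. Declaring ID as the primary key, the sample rows, and the helper insert_duplicate are assumptions made for the example; sqlite_master is SQLite's own catalog table, and other systems name their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    create table instructor (
        ID        char(5) primary key,
        name      varchar(20),
        dept_name varchar(20),
        salary    numeric(8,2)
    )
    """
)

# The DDL statement is recorded in the catalog (the data dictionary).
catalog = conn.execute(
    "select name, type from sqlite_master where name = 'instructor'"
).fetchall()

conn.execute("insert into instructor values ('10101', 'Asha', 'Info Tech', 65000)")

def insert_duplicate():
    """Try to reuse an existing ID; the primary-key constraint should refuse."""
    try:
        conn.execute("insert into instructor values ('10101', 'Ravi', 'Physics', 1)")
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"
```

The constraint is enforced by the DBMS itself, so every application that inserts instructors gets the same check without repeating it in program code.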
Data Manipulation Language (DML) (1/2)
• Language for accessing and updating the data organized by the
appropriate data model
• A DML is also loosely known as a query language
• There are basically two types of data-manipulation language
• Procedural DML: requires a user to specify what data are needed and
how to get those data.
• Declarative DML: requires a user to specify what data are needed without
specifying how to get those data.
• Declarative DMLs are usually easier to learn and use than are
procedural DMLs.
• Declarative DMLs are also referred to as non-procedural DMLs
• The portion of a DML that involves information retrieval is
called a query language.
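
The difference between the two styles can be sketched side by side. This is an illustration only: the rows are made up, the function names procedural and declarative are hypothetical, and Python's sqlite3 module stands in for the DBMS.

```python
import sqlite3

# Hypothetical sample data: (ID, name, dept_name)
ROWS = [
    ("10101", "Asha", "Info Tech"),
    ("12121", "Ravi", "Finance"),
    ("15151", "Meena", "Info Tech"),
]

def procedural(rows, dept):
    """Procedural style: we spell out how to scan and filter the records."""
    names = []
    for _id, name, dept_name in rows:
        if dept_name == dept:
            names.append(name)
    return sorted(names)

def declarative(rows, dept):
    """Declarative style: we state what we want; the DBMS decides how."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table instructor (ID text, name text, dept_name text)")
    conn.executemany("insert into instructor values (?, ?, ?)", rows)
    result = conn.execute(
        "select name from instructor where dept_name = ? order by name", (dept,)
    ).fetchall()
    conn.close()
    return [name for (name,) in result]
```

Both functions return the same names for a given department; only the second leaves the choice of access method to the system.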
Data Manipulation Language (DML) (2/2)
• DML: Query language
• Two classes of languages:
• Pure: used for proving properties about computational power and for
optimization
• Relational algebra
• Tuple relational calculus
• Domain relational calculus
• Commercial:
• SQL is the most widely used commercial language
SQL Query Language
• SQL query language is nonprocedural.
• A query takes as input several tables (possibly only one) and always returns a single
table.
• Example: find all instructors in the 'Info Tech' dept.
select name
from instructor
where dept_name = 'Info Tech';
• SQL is NOT a Turing machine equivalent language
• To be able to compute complex functions SQL is usually embedded in some
higher-level language
• Application programs generally access databases through one of:
• Language extensions to allow embedded SQL
• An application program interface (e.g., ODBC/JDBC), which allows SQL queries to be sent to
a database
Database Access from Application Program
• Non-procedural query languages such as SQL are not as powerful
as a universal Turing machine.
• SQL does not support actions such as input from users, output to
displays, or communication over the network.
• Such computations and actions must be written in a host language,
such as C/C++, Java or Python, with embedded SQL queries that
access the data in the database.
• Application programs are programs that are used to interact
with the database in this fashion.
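
A minimal sketch of such an application program, assuming Python as the host language and its sqlite3 module as the database interface. The helper names open_db and dept_report and the sample rows are hypothetical. The host language handles the parameter and the output formatting, and SQL handles the data access.

```python
import sqlite3

def open_db():
    """Build a small in-memory instructor table (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table instructor (ID text, name text, dept_name text, salary real)"
    )
    conn.executemany(
        "insert into instructor values (?, ?, ?, ?)",
        [("1", "Asha", "Info Tech", 65000.0), ("2", "Ravi", "Finance", 90000.0)],
    )
    return conn

def dept_report(conn, dept):
    """Host-language code: takes a parameter, sends SQL, formats the output."""
    cur = conn.execute(
        "select name, salary from instructor where dept_name = ? order by name",
        (dept,),  # the value is passed as a parameter, not pasted into the SQL text
    )
    return [f"{name}: {salary:.2f}" for name, salary in cur]
```

For example, dept_report(open_db(), "Info Tech") produces one formatted line per matching instructor, ready to be printed or sent over the network by the host program.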
Database Design
• The process of designing the general structure of the database:
• Logical Design:
• Deciding on the database schema.
• Database design requires that we find a “good” collection of relation schemas.
• Business decision: What attributes should we record in the database?
• Computer Science decision: What relation schemas should we
have and how should the attributes be distributed among the
various relation schemas?
• Physical Design:
• Deciding on the physical layout of the database
Database Engine
• A database system is partitioned into modules that deal with each
of the responsibilities of the overall system.
• The functional components of a database system can be divided
into
• The storage manager,
• The query processor component,
• The transaction management component.
1. Storage Manager
• A program module that provides the interface between the
low-level data stored in the database and the application
programs and queries submitted to the system.
• The storage manager is responsible for the following tasks:
• Interaction with the OS file manager
• Efficient storing, retrieving and updating of data
• The storage manager components include:
• Authorization and integrity manager
• Transaction manager
• File manager
• Buffer manager
1. Storage Manager (contd.)
• The storage manager implements several data structures as part
of the physical system implementation:
• Data files :
• store the database itself
• Data dictionary :
• stores metadata about the structure of the database, in particular the
schema of the database.
• Indices :
• can provide fast access to data items. A database index
provides pointers to those data items that hold a particular
value.
2. Query Processor (1/2)
• The query processor components include:
• DDL interpreter:
• Interprets DDL statements and records the definitions in the data dictionary.
• DML compiler:
• Translates DML statements in a query language into an evaluation plan
consisting of low-level instructions that the query evaluation engine
understands.
• It performs query optimization
• i.e. it picks the lowest cost evaluation plan from among the various
alternatives.
• Query evaluation engine:
• Executes low-level instructions generated by the DML compiler.
2. Query Processing (2/2)
1. Parsing and translation
2. Optimization
3. Evaluation
3. Transaction Management: what if the database fails?
• A transaction is a collection of operations that performs a single
logical function in a database application
• The transaction-management component ensures that the database
remains in a consistent (correct) state despite
• system failures (e.g., power failures and operating system
crashes) and
• transaction failures.
• The concurrency-control manager controls the interaction among the
concurrent transactions to ensure the consistency of the database.
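
The kind of inconsistency that concurrency control prevents can be simulated with the balance example used earlier (two withdrawals of 50 from a balance of 100). The sketch below is a single-connection simulation of the interleaving, not real concurrent execution; it assumes Python's sqlite3 module, and the function names are hypothetical.

```python
import sqlite3

def fresh_account():
    """One account with balance 100 (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table account (account_no integer primary key, balance integer)"
    )
    conn.execute("insert into account values (1, 100)")
    conn.commit()
    return conn

def read_balance(conn):
    return conn.execute(
        "select balance from account where account_no = 1"
    ).fetchone()[0]

def uncontrolled_withdrawals():
    """Two users read 100, then each writes back 100 - 50: one update is lost."""
    conn = fresh_account()
    seen_by_a = read_balance(conn)
    seen_by_b = read_balance(conn)  # B reads before A has written
    conn.execute(
        "update account set balance = ? where account_no = 1", (seen_by_a - 50,)
    )
    conn.execute(
        "update account set balance = ? where account_no = 1", (seen_by_b - 50,)
    )
    return read_balance(conn)

def serialized_withdrawals():
    """Each withdrawal is one read-modify-write step, run one after the other."""
    conn = fresh_account()
    for _ in range(2):
        with conn:  # one transaction per withdrawal
            conn.execute(
                "update account set balance = balance - 50 where account_no = 1"
            )
    return read_balance(conn)
```

In the uncontrolled interleaving the final balance is 50 although 100 was withdrawn; when the two withdrawals run as separate transactions one after the other, the final balance is 0.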
Database Architecture
• Centralized databases
• One to a few cores, shared memory
• Client-server
• One server machine executes work on behalf of multiple client machines.
• Parallel databases
• Many-core shared memory
• Shared disk
• Shared nothing
• Distributed databases
• Geographical distribution
• Schema/data heterogeneity
Database Architecture
(Centralized/Shared-Memory)
Database Applications
• Database applications are usually partitioned into two or three parts
• Two-tier architecture:
• The application resides at the client machine,
• where it invokes database system functionality at the server machine
• Three-tier architecture:
• The client machine acts as a front end and does not contain any direct database
calls.
• The client end communicates with an application server, usually through a
forms interface.
• The application server in turn communicates with a database system to access
data.
Two-tier and three-tier architectures
Three Schema Architecture
• View Level Schema
• Each view describes an aspect of the database relevant to a particular
group of users.
• For instance, in the context of a library database, views are:
• Books Purchase Section
• Issue/Returns Management Section
• Users Management Section
• Each section views/uses a portion of the entire data.
• Views can be set up for each section of users.
Three Schema Architecture
• Logical Level Schema
• Describes the logical structure of the entire database.
• No physical level details are given.
• Physical Level Schema
• Describes the physical structure of data in terms of record formats, file
structures, indexes etc.
• Remarks
• Views are optional
• Can be set up if the DB system is very large and if easily identifiable user-groups
exist
• The logical schema is essential
• Modern RDBMSs hide details of the physical layer
Database Users
Roles for people in an Info System management (1/2)
• Naive users / Data entry operators
• Use the GUI provided by an application program
• Feed-in the data and invoke an operation
• e.g., person at the train reservation counter, person at library issue / return counter
• No deep knowledge of the Information System required
• Application Programmers
• Knowledge of the logical schema of the entire database
• Embed SQL in a high-level language and develop programs to
handle functional requirements of an IS
• Should thoroughly understand the logical schema or relevant
views
• Meticulous testing of programs is necessary
Roles for people in an Info System management (2/2)
• Sophisticated user / data analyst:
• Uses SQL to generate answers for complex queries
• DBA (Database Administrator)
• Designing the logical schema
• Creating the structure of the entire database
• Monitoring usage and creating the necessary index structures to speed up query
execution
• Granting / revoking data access permissions to other users, etc.
Database Administrator
• A person who has central control over the system is called a database
administrator (DBA).
• Functions of a DBA include:
• Schema definition
• Storage structure and access-method definition
• Schema and physical-organization modification
• Granting of authorization for data access
• Routine maintenance
• Periodically backing up the database
• Ensuring that enough free disk space is available for normal operations, and
upgrading disk space as required
• Monitoring jobs running on the database
Architecture of RDBMS System
Architecture (1/3)
• Disk Storage:
• Meta-data / schema
• table definitions, view definitions, mappings
• Data – relation instances, index structures
• statistics about data
• Log – record of database update operations essential for failure
recovery
• DDL and other SQL command processor (DDL: the data
definition language part of SQL):
• Commands for relation schema creation, constraint setting, etc.
• Commands for handling authorization and data access control
Architecture (2/3)
• Query compiler
• Compiles ad hoc SQL queries and update / delete
commands
• Query optimizers
• Selects a near optimal plan for executing a query
• relation properties and index structures are utilized
• Application Program Compiler
• Preprocesses the program to separate embedded SQL commands
• Uses the host language compiler to compile the rest of the program
• Integrates the compiled program with the libraries for SQL commands
supplied by the RDBMS
Architecture (3/3)
• RDBMS Run Time System:
• Executes Compiled queries, Compiled application programs
• Interacts with Transaction Manager, Buffer Manager
• Transaction Manager:
• Keeps track of start, end of each transaction
• Enforces concurrency control protocols
• Buffer Manager:
• Manages disk space
• Implements paging mechanism
• Recovery Manager:
• Takes control at restart after a failure
• Brings the system to a consistent state before it can be resumed
History
• Early 1960s: The first general-purpose DBMS, designed by Charles Bachman at General Electric, was
called the Integrated Data Store.
• It formed the basis for the network data model.
• Late 1960s: IBM developed the Information Management System (IMS) DBMS, used even today in many
major installations.
• IMS formed the basis for an alternative data representation framework called the hierarchical data model.
• In 1970: Edgar Codd, at IBM's San Jose Research Laboratory, proposed a new data representation
framework called the relational data model.
• In the 1980s: SQL, developed as part of IBM's System R project, became the standard query language
for relational databases.
• SQL:1999 was adopted by the American National Standards Institute (ANSI) and the International
Organization for Standardization (ISO).
• IBM's DB2, Oracle 8, Informix UDS: with the ability to store new data types such as images and text, and
to ask more complex queries
• Data warehouses: consolidating data from several databases, and for carrying out specialized analysis.
• Enterprise Resource Planning (ERP) and management resource planning (MRP) packages:
include systems from Baan, Oracle, PeopleSoft, SAP, and Siebel.
History of Database Systems
• 1950s and early 1960s:
• Data processing using magnetic tapes for storage
• Tapes provided only sequential access
• Punched cards for input
• Late 1960s and 1970s:
• Hard disks allowed direct access to data
• Network and hierarchical data models in widespread use
• Ted Codd defines the relational data model
• Would win the ACM Turing Award for this work
• IBM Research begins System R prototype
• UC Berkeley (Michael Stonebraker) begins Ingres prototype
• Oracle releases first commercial relational database
• High-performance (for the era) transaction processing
History of Database Systems (contd.)
• 1980s:
• Research relational prototypes evolve into commercial systems
• SQL becomes industrial standard
• Parallel and distributed database systems
• Wisconsin, IBM, Teradata
• Object-oriented database systems
• 1990s:
• Large decision support and data-mining applications
• Large multi-terabyte data warehouses
• Emergence of Web commerce
• 2000s
• Big data storage systems
• Google BigTable, Yahoo PNuts, Amazon
• “NoSQL” systems.
• Big data analysis: beyond SQL
• MapReduce and friends
• 2010s
• SQL reloaded
• SQL front end to MapReduce systems
• Massively parallel database systems
• Multi-core main-memory databases
Course Outcome
• CO1: Exploring the fundamental concepts of database management.
• You understand the:
• Basics of data models
• DBMS architecture
• DBMS users
Reflection Quiz!!
Complete the sentence: Logical Data Independence is the ability to modify...
• physical-level schema without affecting the logical-level schema
• the logical-level schema with no effect on view-level schema.
• view-level schema without affecting logical-level schema.
• logical-level schema without affecting physical-level schema.

Complete the sentence: Physical Data Independence is the ability to modify...
• physical-level schema without affecting the logical-level schema
• the logical-level schema with no effect on view-level schema.
• view-level schema without affecting logical-level schema.
• logical-level schema without affecting physical-level schema.
Reflection Quiz!!
An Entity-Relationship (ER) Model represents:
• The various entity types of interest and the relationships among them in the domain being
modeled.
• Various tables and links among them in the domain being modeled.
• The various entity types of interest and the relationships among them in the domain being
modeled along with operations to be performed on data.
• Various tables and links among them in the domain being modeled along with operations to be
performed on data.

A person who develops a high-level language program that meets a functional requirement
of the database is usually called:
• A naive user
• An application programmer
• A data analyst
• A DB administrator.
Reflection Quiz!!
The people playing the following role need NOT have an understanding of the complete
logical schema of the database:
• Data-entry Operator
• Application Programmer
• Data Analyst
• Database Administrator
Revised
• State True or False:
• A database is a collection of related pieces of data. (True)
• Database systems are designed to manage large amounts of information. (True)
• DBMS is a software system for database management and manages a single database. (False)
• Many users can concurrently access a particular database. (True)
• The information stored in a database is actually stored in secondary storage. (True)
• DBMS guarantees the availability of data when a user tries to retrieve the data. (False)
• In DBMS, record structures are hard-coded into its internal programs. (False)
• A “log” in the RDBMS keeps track of update operations of all transactions. (True)
1. What is the most appropriate matching between the following sets,
where set S1 represents a type of enterprise and
set S2 represents the type of information that an enterprise wants to store as a database?

S1: {w: Airline; x: Telecommunication; y: Banking; z: Universities}
S2: {p: reservation and schedule information;
q: customers, accounts, loans, and associated information;
r: information about the communication networks;
s: information about students, courses, grades, etc.}
Answer: w--p, x--r, y--q, z--s

2. Consider the statements given below: State which is True or False
S1: Data abstraction is the DBMS characteristic that allows program-data independence. (True)
S2: Data models allow representation of a database at different levels of detail. (True)

3. Consider the statements given below: State which is True or False
S1: Meta-data is the description of the relation schemas and associated constraints. (True)
S2: DBMS stores data and meta-data in the database catalog. (True)

Consider the statements given below: State which is True or False
S1: Database schema is specified during the design stage and describes the database. (True)
S2: The schema of a database changes frequently. (False)
What is the most appropriate matching between the following sets w.r.t. data abstraction:
S1: {w: physical level;
x: logical level;
y: view level}
S2: {p: describes what data is stored;
q: describes how the data is stored;
r: describes only part of the database}
Answer: w--q, x--p, y--r

Typically, a database administrator (DBA) is responsible for:
• Schema definition
• Schema modification
• Granting of authorization for data access
• All of the above
Consider a typical data retrieval request in DBMS. Find the statement which is TRUE.
• The data retrieval query always returns the records in sorted order.
• The query formulation is based on the conceptual schema.
• The query formulation is based on the physical schema.
• None of the above is TRUE.

What is FALSE regarding the relational data model:


• A relational database consists of a collection of tables.
• The term “tuple” refers to an element in a relation instance.
• The tuples in a relation instance appear in a sorted order.
• For each attribute of a relation, there is a set of permitted values.
