IT5351.L001 Intro To DBMS
Lecture 1
Introduction to Databases
Reading
Abraham Silberschatz, Henry F. Korth, S. Sudarshan, “Database System Concepts”,
Sixth Edition, Tata McGraw Hill, 2014.
Sections to study
What is “Data”?
Early Data Management – Ancient History
File Processing – More Recent History
Data are stored in files, with an interface between programs and files.
Various access methods exist (e.g., sequential, indexed, random).
One file corresponds to one or several programs.
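A minimal Python sketch of file processing, contrasting sequential and random access. The record layout (a 4-byte account number followed by an 8-byte balance), the file name, and every function name are assumptions made for illustration only.

```python
import os
import struct
import tempfile

# Hypothetical fixed-length record: 4-byte account number, 8-byte balance.
RECORD = struct.Struct("<Iq")

def write_accounts(path, accounts):
    """Write (account_no, balance) pairs as fixed-length records."""
    with open(path, "wb") as f:
        for account_no, balance in accounts:
            f.write(RECORD.pack(account_no, balance))

def find_sequential(path, wanted):
    """Sequential access: read records in order until the key matches."""
    with open(path, "rb") as f:
        while chunk := f.read(RECORD.size):
            account_no, balance = RECORD.unpack(chunk)
            if account_no == wanted:
                return balance
    return None

def read_at(path, position):
    """Random access: jump straight to the record at a known position."""
    with open(path, "rb") as f:
        f.seek(position * RECORD.size)
        return RECORD.unpack(f.read(RECORD.size))

path = os.path.join(tempfile.mkdtemp(), "accounts.dat")
write_accounts(path, [(917322, 7812), (372912, 291820)])
```

Every program that touches this file must repeat the record layout, which is one reason a format change in a file-processing system tends to ripple through all of its programs.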
Purpose of Database Systems
As discussed, early database applications were built directly on top of file systems, which leads to:
Data redundancy and inconsistency: data are stored in multiple file formats, resulting in duplication of information in different files
Difficulty in accessing data: a new program must be written to carry out each new task
Data isolation: data are scattered across multiple files and formats
Integrity problems:
Integrity constraints (e.g., account balance > 0) become “buried” in program code rather than being stated explicitly
Hard to add new constraints or change existing ones
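To illustrate what stating a constraint explicitly can look like, here is a small sketch using Python's built-in sqlite3 module. The table name, column names, and the helper try_insert are assumptions; the rule balance > 0 is the example from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The rule lives in the table definition, not in each application program.
conn.execute("""
    CREATE TABLE account (
        account_no INTEGER PRIMARY KEY,
        balance    INTEGER NOT NULL CHECK (balance > 0)
    )
""")

def try_insert(account_no, balance):
    """Return True if the row is accepted, False if the constraint rejects it."""
    try:
        conn.execute("INSERT INTO account VALUES (?, ?)", (account_no, balance))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert(917322, 7812)   # satisfies balance > 0
rejected = try_insert(827183, -50)    # violates balance > 0
```

Because the database enforces the rule, changing it means editing one definition rather than hunting through program code.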
Purpose of Database Systems (Cont.)
Atomicity problems:
Failures may leave the database in an inconsistent state, with only part of an update carried out
Example: a transfer of funds from one account to another should either complete or not happen at all
Concurrent-access anomalies:
Concurrent access is needed for performance, but uncontrolled concurrent access can lead to inconsistencies
Security problems:
Hard to provide user access to some, but not all, data
Database systems offer solutions to all the above problems
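As one illustration, the funds-transfer example can be sketched with a transaction in Python's sqlite3 module. The table layout and the functions transfer and balances are assumptions; the point is only that a failure partway through undoes the partial update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_no INTEGER PRIMARY KEY, "
             "balance INTEGER NOT NULL CHECK (balance > 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [(917322, 7812), (372912, 291820)])
conn.commit()

def transfer(src, dst, amount):
    """Move amount from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE account SET balance = balance + ? "
                         "WHERE account_no = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? "
                         "WHERE account_no = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    """Current balance of every account, keyed by account number."""
    return dict(conn.execute("SELECT account_no, balance FROM account"))

# The debit would overdraw the source account, so the earlier credit is undone too.
failed = transfer(917322, 372912, 10000)
```

The credit is deliberately issued first so that the rollback is visible: after the failed call, neither balance has changed.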
Database
What is a database?
Organized collection of related data
A database may be as simple as a text or CSV file, or as complex as a large, integrated relational collection of data.
Examples of databases
Bank account database; payroll database; AU student database; Amazon’s product database; Hotel
reservation database; your notes for this class
Why do we need databases (in general)?
They contain details about the organization or application domain
They manage large amounts of data, including “big data”
Types of Databases
Operational databases
Collect, modify, and maintain data
Backbone of companies
Store dynamic data (i.e., data that change constantly and reflect up-to-the-minute information)
Analytical databases
Store and track historical and time-dependent data
Asset for tracking trends, viewing statistical data over a long period, and making strategic business projections
Think about the past
Database Applications:
Banking: transactions
Airlines: reservations, schedules
Universities: registration, grades
Sales: customers, products, purchases
Online retailers: order tracking, customized recommendations
Manufacturing: production, inventory, orders, supply chain
Human resources: employee records, salaries, tax deductions
…
University Database Example
In the early days, database applications were built directly on top of file
systems
Database Approach
What’s a Database Management System?
Physical construction
Manipulation
Sharing/Protecting
Persistence/Recovery
Database Management Systems
A database system is a collection of interrelated data and a set of programs that allow users
to access and modify these data.
A major purpose of a database system is to provide users with an abstract view of the data.
Data models: A collection of conceptual tools for describing data, data relationships, data semantics,
and consistency constraints.
Data abstraction: hides the complexity of the data structures used to represent data in the database from users, through several levels of abstraction.
Levels of Abstraction
Schemas and Instances
Schemas and Instances (Cont.)
Instance
The actual content of the database at a particular point in time
Analogous to the value of a variable
Customer Instance
Name            Customer ID   Account #   Aadhaar #      Mobile #
Dinesh          6728          917322      182719289372   9830100291
Kavitha         8912          827183      918291204829   7189203928
Chandra Sekar   6617          372912      127837291021   8892021892
Account Instance
Account #   Account Type   Interest Rate   Min. Bal.   Balance
917322      Savings        4.0%            5000        7812
372912      Current        0.0%            0           291820
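The difference between a schema and an instance can be sketched with Python's sqlite3 module. The SQL column names and types below are assumptions chosen to match the Account Instance table; schema and instance are hypothetical helper names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: the structure of the account relation (analogous to a variable's type).
conn.execute("""
    CREATE TABLE account (
        account_no    INTEGER PRIMARY KEY,
        account_type  TEXT,
        interest_rate REAL,
        min_balance   INTEGER,
        balance       INTEGER
    )
""")

def schema():
    """Column names of the account table, read from SQLite's catalog."""
    return [row[1] for row in conn.execute("PRAGMA table_info(account)")]

def instance():
    """The content of the account table at this point in time."""
    return conn.execute("SELECT * FROM account ORDER BY account_no").fetchall()

schema_before = schema()
conn.execute("INSERT INTO account VALUES (917322, 'Savings', 4.0, 5000, 7812)")
first_instance = instance()
conn.execute("INSERT INTO account VALUES (372912, 'Current', 0.0, 0, 291820)")
second_instance = instance()
schema_after = schema()
```

The two snapshots differ while the schema is the same before and after, just as a variable's value changes while its declaration does not.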
Logical data independence – the ability to modify the logical schema without changing the external models
Physical data independence – the ability to modify the physical schema without changing the logical schema
Analogous to the independence of “interface” and “implementation” in object-oriented systems
In general, the interfaces between the various levels and components should be well defined, so that changes in some parts do not seriously influence others.
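A rough sketch of both kinds of independence using Python's sqlite3 module. It assumes a view stands in for an external model, an index stands in for a physical-level change, and an added column stands in for a logical-level change; all table, view, index, and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, "
             "name TEXT, mobile TEXT)")
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)",
                 [(6728, "Dinesh", "9830100291"), (8912, "Kavitha", "7189203928")])

# External view: what one class of users sees.
conn.execute("CREATE VIEW customer_names AS SELECT customer_id, name FROM customer")

def view_rows():
    """What a user of the external view gets back."""
    return conn.execute(
        "SELECT customer_id, name FROM customer_names ORDER BY customer_id"
    ).fetchall()

before = view_rows()

# Physical change: add an index, a storage-level access structure.
conn.execute("CREATE INDEX idx_customer_name ON customer (name)")
after_physical = view_rows()

# Logical change: add a column to the base table.
conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")
after_logical = view_rows()
```

The query against the view is written once and returns the same rows after each change, which is the practical benefit of well-defined interfaces between levels.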
Data Modelling and Data Models
Data modelling: Iterative and progressive process of creating a specific data model for a
determined problem domain
Data models: Simple representations of complex real-world data structures
Useful for supporting a specific problem domain
Importance of Data Models
Data Models
Hierarchical Model
Network Model
Hierarchical and Network Models
Relational Model
[Figure label: tuples (or rows)]
A Sample Relational Database
Relational Model
Advantages
Structural independence is promoted using independent tables
Tabular view improves conceptual simplicity
Ad hoc query capability is based on SQL
Isolates the end user from physical-level details
Improves implementation and management simplicity
Disadvantages
Requires substantial hardware and system software overhead
Conceptual simplicity gives untrained people the tools to use a good system poorly
May promote information problems
Interfacing to the DBMS
Interfacing to the DBMS (Cont.)
Data Manipulation Language (DML): for accessing and manipulating the data organized by
the appropriate data model
A DML is also known as a query language
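For illustration, the four basic kinds of DML statement in SQL can be run from Python's sqlite3 module. The table and its columns are assumptions; the CREATE TABLE line is schema definition rather than DML and is there only to set up the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_no INTEGER PRIMARY KEY, "
             "account_type TEXT, balance INTEGER)")

# Insert new rows.
conn.executemany("INSERT INTO account VALUES (?, ?, ?)",
                 [(917322, "Savings", 7812), (372912, "Current", 291820)])

# Modify an existing row.
conn.execute("UPDATE account SET balance = balance + 188 WHERE account_no = 917322")

# Retrieve rows (the query part of the DML).
savings = conn.execute(
    "SELECT account_no, balance FROM account WHERE account_type = 'Savings'"
).fetchall()

# Delete a row.
conn.execute("DELETE FROM account WHERE account_no = 372912")
remaining = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
```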
Database Design
Database Design (Cont.)
Design Approaches
Need to come up with a methodology to ensure that each of the relations in the database is
“good”
Two ways of doing so:
Entity-Relationship Model (will discuss later)
Models an enterprise as a collection of entities and relationships
Represented diagrammatically by an entity-relationship diagram
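One possible way to picture entities and relationships in code is to give each entity set its own table and to use a separate table for the relationship that links them. This is a sketch under assumptions: the relationship name depositor and all column names are hypothetical, and it uses Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Entity sets
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE account  (account_no  INTEGER PRIMARY KEY, balance INTEGER NOT NULL);

    -- Relationship set linking the two entity sets (hypothetical name)
    CREATE TABLE depositor (
        customer_id INTEGER REFERENCES customer (customer_id),
        account_no  INTEGER REFERENCES account (account_no),
        PRIMARY KEY (customer_id, account_no)
    );

    INSERT INTO customer  VALUES (6728, 'Dinesh');
    INSERT INTO account   VALUES (917322, 7812);
    INSERT INTO depositor VALUES (6728, 917322);
""")

# Follow the relationship from customers to the accounts they hold.
owners = conn.execute("""
    SELECT c.name, a.account_no
    FROM customer AS c
    JOIN depositor AS d ON d.customer_id = c.customer_id
    JOIN account   AS a ON a.account_no  = d.account_no
""").fetchall()
```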
Object-Relational Data Models
XML: Extensible Markup Language
Summary
A major purpose of a database system is to provide users with an abstract view of the data.
That is, the system hides certain details of how the data are stored and maintained.
A schema is a description of the data interface to the database (i.e., how the data are organized). A schema can have many instances.
A database instance is a database (real data) that conforms to a given schema.
Underlying the structure of a database is the data model: a collection of conceptual tools for
describing data, data relationships, data semantics, and data constraints.
The relational data model is the most widely deployed model for storing data in databases.
A data-manipulation language (DML) is a language that enables users to access or
manipulate data. Nonprocedural DMLs, which require a user to specify only what data are
needed, without specifying exactly how to get those data, are widely used today.
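A small sketch of the contrast between nonprocedural and procedural access, using Python's sqlite3 module. The table layout and the 5000 threshold are assumptions, and the balance for account 827183 is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_no INTEGER, balance INTEGER)")
rows = [(917322, 7812), (372912, 291820), (827183, 1200)]  # last balance is invented
conn.executemany("INSERT INTO account VALUES (?, ?)", rows)

# Nonprocedural (declarative): state what is wanted; the DBMS decides how to get it.
declarative = conn.execute(
    "SELECT account_no FROM account WHERE balance > 5000 ORDER BY account_no"
).fetchall()

# Procedural: spell out how to get it, step by step.
procedural = []
for account_no, balance in rows:
    if balance > 5000:
        procedural.append((account_no,))
procedural.sort()
```

Both approaches produce the same answer here; the declarative form leaves the choice of access path to the system.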
Summary (Cont.)
A data-definition language (DDL) is a language for specifying the database schema and other
properties of the data.
The entity-relationship (E-R) data model is a widely used model for database design. It provides a
convenient graphical representation to view data, relationships, and constraints.