337 Lecture-01


COMP 337

Database Management Systems


Fall 2023-2024
by
Dr. Ferhun Yorgancıoğlu

Lecture-1
Introduction to the Database Systems

Introduction
• A database-management system (DBMS) is a collection of interrelated data and a set
of programs to access those data.
• The collection of data, usually referred to as the database, contains information
relevant to an enterprise.
• The primary goal of a DBMS is to provide a way to store and retrieve database
information that is both convenient and efficient.
• Database systems are designed to manage large bodies of information.
• Management of data involves both defining structures for storage of information and
providing mechanisms for the manipulation of information.
• In addition, the database system must ensure the safety of the information stored,
despite system crashes or attempts at unauthorized access.
• Because information is so important, a large body of concepts and techniques has
been developed for managing data.

Lecture-1 COMP337 by Dr. Ferhun Yorgancıoğlu 3

Database-System Applications
• The earliest database systems arose in the 1960s in response to the computerized
management of commercial data. Those earlier applications were relatively simple
compared to modern database applications.
• Modern applications include highly sophisticated, worldwide enterprises.
• All database applications, old and new, share important common elements.
• The central aspect of the application is not a program performing some calculation,
but rather the data themselves.
• Database systems are used to manage collections of data that are:
o highly valuable,
o relatively large, and
o accessed by multiple users and applications, often at the same time.

Database-System Applications (cont.)
• The first database applications had only simple, precisely formatted, structured data.
• Today, database applications may include data with complex relationships and a more
variable structure.
• As an example of an application with structured data, consider a university’s records
regarding courses, students, and course registration.
• The university keeps the same type of information about each course: course
identifier, title, department, course number, etc., and similarly for students: student
identifier, name, address, phone, etc.
• Course registration is a collection of pairs: one course identifier and one student
identifier.
• Information of this sort has a standard, repeating structure and is representative of
the type of database applications that go back to the 1960s.
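
As a rough sketch of this repeating structure, the records can be modelled as fixed-field tuples and registration as a set of (course identifier, student identifier) pairs. The field names, identifiers, and sample values below are illustrative assumptions, not taken from the lecture.

```python
from typing import NamedTuple


class Course(NamedTuple):
    course_id: str
    title: str
    department: str


class Student(NamedTuple):
    student_id: str
    name: str


# Hypothetical sample records: every record of a kind has the same fields.
courses = [Course("C1", "Databases", "Computing")]
students = [Student("S1", "Ada"), Student("S2", "Alan")]

# Course registration: one course identifier paired with one student identifier.
registrations = {("C1", "S1"), ("C1", "S2")}


def roster(course_id):
    """Return the sorted names of the students registered for a course."""
    names = {s.student_id: s.name for s in students}
    return sorted(names[sid] for cid, sid in registrations if cid == course_id)
```

Calling `roster("C1")` lists the students registered for that course by following the pairs from course identifier to student identifier.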


Database-System Applications (cont.)

• [Figure: screenshots of a university database application (file-processing system)]
• The application program opens text files, processes them, and produces a transcript
file for each student.

Database-System Applications (cont.)
• Contrast this simple university database application with a social-networking site.
• Users of the site post varying types of information about themselves ranging from
simple items such as name or date of birth, to complex posts consisting of text,
images, videos, and links to other users.
• There is only a limited amount of common structure among these data.
• Both of these applications, however, share the basic features of a database.
• Modern database systems exploit commonalities in the structure of data to gain
efficiency but also allow for weakly structured data and for data whose formats are
highly variable.
• As a result, a database system is a large, complex software system whose task is to
manage a large, complex collection of data.

→ This course aims to teach the design and implementation of relational databases (!)


Database Application Examples


• Here are some representative applications:
o Enterprise information: sales, accounting, human resources
o Manufacturing: management of the supply chain, tracking production of items,
inventories of items in warehouses, and orders for items
o Banking and finance: customer information, accounts, loans, banking transactions, real-
time sales and purchases of financial instruments
o Universities: course registrations, grades, human resources and accounting
o Airlines: reservations and schedule information
o Telecommunication: keeping records of calls, texts, and data usage; generating monthly
bills
o Web-based services: social media, online retailers, online advertisements
o Document databases: maintaining collections of news articles, patents, published
research papers, etc.
o Navigation systems: maintaining locations of various places of interest with routes

→ As this list illustrates, databases form an essential part not only of every enterprise but
also of a large part of a person’s daily activities.
Database Interaction
• The ways in which people interact with databases have changed over time.
• Early databases were maintained as back-office systems with which users interacted
via printed reports and paper forms for input.
• As database systems became more sophisticated, better languages were developed
for programmers to use in interacting with the data, along with user interfaces that
allowed end users within the enterprise to query and update data.
• Today, virtually every enterprise employs web applications or mobile applications to
allow its customers to interact directly with the enterprise’s database, and, thus, with
the enterprise itself.
• For instance, when you read a social-media post, or access an online bookstore and
browse a book or music collection, you are accessing data stored in a database.
• When you enter an order online, your order is stored in a database. When you access
a bank web site and retrieve your bank balance and transaction information, the
information is retrieved from the bank’s database system. When you access a web
site, information about you may be retrieved from a database to select which
advertisements you should see.
• Almost every interaction with a smartphone results in some sort of database access.


Purpose of Database Systems


• To understand the purpose of database systems, consider part of a university
organization that, among other data, keeps information about all instructors,
students, departments, and course offerings.
• One way to keep the information on a computer is to store it in operating-system files.
• To allow users to manipulate the information, the system has a number of application
programs that manipulate the files, including programs to:
o Add new students, instructors, and courses.
o Register students for courses and generate class rosters.
o Assign grades to students, compute grade point averages (GPA), and generate
transcripts.
• Programmers develop these application programs to meet the needs of the university.
New application programs are added to the system as the need arises.
• This typical file-processing system is supported by a conventional operating system.
• The system stores permanent records in various files, and it needs different
application programs to extract records from, and add records to, the appropriate
files.

Purpose of Database Systems (cont.)
• Keeping organizational information in a file-processing system has a number of major
disadvantages:

o Data redundancy and inconsistency: data is stored in multiple file formats, resulting in
duplication of information in different files

o Difficulty in accessing data
▪ Need to write a new program to carry out each new task

o Data isolation
▪ Multiple files and formats

o Integrity problems
▪ Integrity constraints (e.g., account balance ≥ 0) become “buried” in program code rather
than being stated explicitly
▪ Hard to add new constraints or change existing ones


Purpose of Database Systems (cont.)


o Atomicity of updates
▪ Failures may leave database in an inconsistent state with partial updates carried out
▪ Example: Transfer of funds from one account to another should either complete or not
happen at all

o Concurrent access by multiple users
▪ Concurrent access needed for performance
▪ Uncontrolled concurrent accesses can lead to inconsistencies

o Security problems
▪ Hard to provide user access to some, but not all, data

→ These difficulties, among others, prompted both the initial development of database
systems and the transition of file-based applications to database systems, back in the
1960s and 1970s.
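
To make two of these points concrete, here is a minimal sketch using Python's standard `sqlite3` module. It assumes a hypothetical `account` table; the table name, account identifiers, and amounts are invented for illustration. The integrity constraint (balance ≥ 0) is stated explicitly in the schema rather than buried in program code, and the funds transfer runs as one transaction so that it either completes or does not happen at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The integrity constraint is declared once, in the schema (hypothetical table).
conn.execute(
    "CREATE TABLE account ("
    " id TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was rolled back too.
        return False


def balances(conn):
    """Return the current balances as a dictionary keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

The credit is deliberately issued before the debit: when the debit would drive a balance below zero, the rollback also undoes the credit that had already been applied, which is the all-or-nothing behaviour described above.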

View of Data
• A database system is a collection of interrelated data and a set of programs that allow
users to access and modify these data.

• A major purpose of a database system is to provide users with an abstract view of the
data. That is, the system hides certain details of how the data are stored and
maintained.
o Data models
▪ A collection of conceptual tools for describing data, data relationships, data semantics,
and consistency constraints.

o Data abstraction
▪ Hides from users the complexity of the data structures used to represent data in the
database, through several levels of data abstraction.


Data Models
• Underlying the structure of a database is the data model: a collection of conceptual
tools for describing:
o data,
o data relationships,
o data semantics, and
o consistency constraints.

• There are several models proposed in the literature:


o Relational model (most popular)
o Entity-Relationship model (mainly for database design)
o Semi-structured data model (XML, JSON)
o Object-based data models
▪ Object-oriented programming (OOP) has become the dominant software-development methodology
▪ This led initially to the development of a distinct object-oriented data model
▪ Today the concept of object is well integrated into relational databases
o Other older models: network model, hierarchical model
Relational Data Model
• In the relational model, data are represented in the form of tables.
• Each table has multiple columns, and each column has a unique name.
• Each row of the table represents one piece of information.
• Columns are also called attributes; rows are also called tuples.
• [Photo: Ted Codd, Turing Award 1981]


Sample Relational Database
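
A minimal sketch of what a small relational table might look like, built with Python's standard `sqlite3` module. This is not the figure from the slide: the `instructor` table, its columns, and all values are hypothetical and chosen only to show columns (attributes) and rows (tuples).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Each column (attribute) has a unique name within the table.
conn.execute(
    "CREATE TABLE instructor ("
    " id TEXT PRIMARY KEY, name TEXT, dept_name TEXT, salary INTEGER)"
)
# Each row (tuple) is one piece of information: here, one instructor.
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    [
        ("I1", "Ada", "Computing", 90000),
        ("I2", "Alan", "Mathematics", 85000),
        ("I3", "Grace", "Computing", 95000),
    ],
)

cursor = conn.execute("SELECT * FROM instructor ORDER BY id")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()
```

After running this, `columns` holds the attribute names and `rows` holds one tuple per instructor.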

Data Abstraction
• Managing complexity is challenging, not only in the management of data but in any
domain.
• Key to the management of complexity is the concept of abstraction.
• Abstraction allows a person to use a complex device or system without having to
know the details of how that device or system is constructed.
• A person is able, for example, to drive a car by knowing how to operate its controls.
However, the driver does not need to know how the motor was built nor how it
operates. All the driver needs to know is an abstraction of what the motor does.
• Similarly, for a large, complex collection of data, a database system provides a simpler,
abstract view of the information so that users and application programmers do not
need to be aware of the underlying details of how data are stored and organized.
• By providing a high level of abstraction, a database system makes it possible for an
enterprise to combine data of various types into a unified repository of the
information needed to run the enterprise.


Levels of Data Abstraction


• For the system to be usable, it must retrieve data efficiently. The need for efficiency has led
database system developers to use complex data structures to represent data in the database.

o Physical level: describes how a record is stored
o Logical level: describes what data is stored in the database, and the
relationships among the data
o View level: application program
hides details of data types, and it
can also hide information for
security purposes
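
The logical and view levels can be sketched with Python's standard `sqlite3` module. The `instructor` table and the `instructor_public` view are hypothetical names invented for this illustration. The table definition plays the role of the logical level, the view exposes only some columns, and the physical level (how SQLite lays out records on disk) never appears in the code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Logical level: what data is stored and how it is typed.
conn.execute(
    "CREATE TABLE instructor ("
    " id TEXT PRIMARY KEY, name TEXT, dept_name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO instructor VALUES ('I1', 'Ada', 'Computing', 90000)")

# View level: a restricted window onto the same data, without the salary column.
conn.execute(
    "CREATE VIEW instructor_public AS"
    " SELECT id, name, dept_name FROM instructor"
)

cursor = conn.execute("SELECT * FROM instructor_public")
view_columns = [description[0] for description in cursor.description]
view_rows = cursor.fetchall()
```

Note that SQLite has no user accounts, so here the view only narrows what a query sees. In a multi-user database system, a view like this would typically be combined with access permissions to actually enforce the restriction.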

Instances and Schemas
• Databases change over time as information is inserted and deleted.

• The collection of information stored in the database at a particular moment is called
an instance of the database.

• The overall design of the database is called the database schema.

• The concept of database schemas and instances can be understood by analogy to a
program written in a programming language.

• A database schema corresponds to the variable declarations (along with associated
type definitions) in a program.

• Each variable has a particular value at a given instant. The values of the variables in a
program at a point in time correspond to an instance of a database schema.
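
The distinction can be observed directly in a small sketch with Python's standard `sqlite3` module. The `student` table and its contents are hypothetical. Inserting a row changes the instance, while the schema stays the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id TEXT PRIMARY KEY, name TEXT)")


def schema(conn):
    """Return the table definitions: the overall design of the database."""
    query = "SELECT sql FROM sqlite_master WHERE type = 'table'"
    return [row[0] for row in conn.execute(query)]


def instance(conn):
    """Return the rows stored in the student table at this moment."""
    return conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()


schema_before = schema(conn)
instance_before = instance(conn)

conn.execute("INSERT INTO student VALUES ('S1', 'Ada')")

schema_after = schema(conn)
instance_after = instance(conn)
```

Comparing the snapshots shows that `schema_before` and `schema_after` are equal, whereas `instance_before` and `instance_after` differ, much as a variable's declaration stays fixed while its value changes.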


Database Languages
• A database system provides a data-definition language (DDL) to specify the database
schema and a data-manipulation language (DML) to express database queries and
updates.
• In practice, the data-definition and data-manipulation languages are not two separate
languages; instead they simply form parts of a single database language, such as the
SQL language.

• There are basically two types of data-manipulation language:


o Procedural DMLs require a user to specify what data are needed and how to get those
data.
o Declarative DMLs (also referred to as nonprocedural DMLs) require a user to specify
what data are needed without specifying how to get those data.

• Declarative DMLs are usually easier to learn and use than are procedural DMLs.
• However, since a user does not have to specify how to get the data, the database
system has to figure out an efficient means of accessing data.
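
The contrast can be sketched in Python. The first function below says how to get the data (scan, test, collect, sort), in the spirit of a procedural DML; the second hands a declarative SQL query to SQLite through the standard `sqlite3` module and leaves the access strategy to the database system. The `student` table, the sample rows, and the GPA threshold are hypothetical.

```python
import sqlite3

# Hypothetical sample data: (student id, name, GPA).
rows = [("S1", "Ada", 3.9), ("S2", "Alan", 2.8), ("S3", "Grace", 3.5)]


def honor_students_procedural(rows, threshold):
    """Procedural style: spell out, step by step, how to find the data."""
    result = []
    for _student_id, name, gpa in rows:
        if gpa >= threshold:
            result.append(name)
    result.sort()
    return result


def honor_students_declarative(rows, threshold):
    """Declarative style: state what is wanted; the system decides how."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE student (id TEXT, name TEXT, gpa REAL)")
    conn.executemany("INSERT INTO student VALUES (?, ?, ?)", rows)
    query = "SELECT name FROM student WHERE gpa >= ? ORDER BY name"
    names = [row[0] for row in conn.execute(query, (threshold,))]
    conn.close()
    return names
```

Both functions return the same names for the same input. Only the declarative version would benefit, unchanged, if the database system later chose a faster access path such as an index.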

Database Access from Application Programs
• Nonprocedural query languages such as SQL are not as powerful as a universal Turing
machine; that is, there are some computations that are possible using a general-
purpose programming language but are not possible using SQL.
• SQL also does not support actions such as input from users, output to displays, or
communication over the network. Such computations and actions must be written in
a host language, such as C/C++, Java, or Python, with embedded SQL queries that
access the data in the database.
• Application programs are programs that are used to interact with the database in this
fashion. Examples in a university system are programs that allow students to register
for courses, generate class rosters, calculate student GPA, generate payroll checks, and
perform other tasks.
• To access the database, DML statements need to be sent from the host to the
database where they will be executed. This is most commonly done by using an
application-program interface (set of procedures) that can be used to send DML and
DDL statements to the database and retrieve the results.
• The Open Database Connectivity (ODBC) standard defines application program
interfaces for use with C and several other languages. The Java Database Connectivity
(JDBC) standard defines a corresponding interface for the Java language.
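
As an illustration of this pattern, here is a minimal sketch in Python, where the standard `sqlite3` module plays a role analogous to ODBC or JDBC (it is neither of them). The host language sends DDL and DML statements through the interface and handles the formatting that SQL itself does not do. The `takes` table, its columns, and the sample grades are hypothetical, and the GPA is a plain unweighted average, which is a simplifying assumption.

```python
import sqlite3


def open_database():
    """Create a small in-memory database by sending DDL and DML through the API."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE takes (student_id TEXT, course_id TEXT, grade_points REAL)"
    )
    conn.executemany(
        "INSERT INTO takes VALUES (?, ?, ?)",
        [("S1", "C1", 4.0), ("S1", "C2", 3.0), ("S2", "C1", 2.0)],
    )
    conn.commit()
    return conn


def gpa_report(conn, student_id):
    """Send a query to the database and format the result in the host language."""
    row = conn.execute(
        "SELECT AVG(grade_points) FROM takes WHERE student_id = ?",
        (student_id,),  # the value is passed as a parameter, not pasted into the SQL
    ).fetchone()
    if row[0] is None:  # AVG over zero rows yields NULL
        return f"{student_id}: no grades recorded"
    return f"{student_id}: GPA {row[0]:.2f}"


conn = open_database()
```

The query computes the average inside the database, while the decision about what to display for a student without grades is ordinary host-language code.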


Database Design
• Database design mainly involves the design of the database schema.
• The design of a “complete” database application environment that meets the needs of
the enterprise being modelled requires attention to a broader set of issues.
• A high-level data model provides the database designer with a conceptual framework
in which to specify the data requirements of the database users and how the database
will be structured to fulfil these requirements.
• The initial phase of database design, then, is to characterize fully the data needs of
the prospective database users.
o The database designer needs to interact extensively with domain experts and users to
carry out this task.
o The outcome of this phase is a specification of user requirements.
• Next, the designer chooses a data model, and by applying the concepts of the chosen
data model, translates these requirements into a conceptual schema of the database.
• The schema developed at this conceptual-design phase provides a detailed overview of
the enterprise.
• The designer reviews the schema to confirm that all data requirements are indeed
satisfied and are not in conflict with one another.
• The designer can also examine the design to remove any redundant features. The focus
at this point is on describing the data and their relationships, rather than on specifying
physical storage details.

Database Design (cont.)
• In terms of the relational model, the conceptual-design process involves decisions on
what attributes we want to capture in the database and how to group these attributes
to form the various tables.

• The “what” part is basically a business decision, and we shall not discuss it further in
this course.

• The “how” part is mainly a computer-science problem. There are principally two ways
to tackle the problem:
o The first one is to use the entity-relationship model;
o The other is to employ a set of algorithms (collectively known as normalization) that
takes as input the set of all attributes and generates a set of tables.
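
The sketch below illustrates only the outcome of grouping attributes into tables; it is not a normalization algorithm. It assumes a hypothetical flat table in which a student identifier determines the student name and a course identifier determines the course title, and it picks the grouping by hand. All names and values are invented.

```python
# Hypothetical flat table: (student_id, student_name, course_id, course_title).
# Student names and course titles are repeated on every registration row.
flat = [
    ("S1", "Ada", "C1", "Databases"),
    ("S1", "Ada", "C2", "Networks"),
    ("S2", "Alan", "C1", "Databases"),
]

# Group the attributes into three tables by projecting the flat table.
students = sorted({(sid, sname) for sid, sname, _, _ in flat})
courses = sorted({(cid, title) for _, _, cid, title in flat})
takes = sorted({(sid, cid) for sid, _, cid, _ in flat})

# Joining the three tables back together recovers the original rows,
# so no information was lost by the decomposition.
student_name = dict(students)
course_title = dict(courses)
rejoined = sorted(
    (sid, student_name[sid], cid, course_title[cid]) for sid, cid in takes
)
```

After the split, each student name and each course title is stored once, which removes the redundancy that the flat table carried.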


Architecture of Database Applications


• Database applications are usually partitioned into two or three parts:
o Two-tier architecture: the application resides at the client machine, where it invokes
database system functionality at the server machine
o Three-tier architecture: the client machine acts as a front end and does not contain
any direct database calls
▪ The client end communicates with an application server, usually through a forms
interface
▪ The application server in turn communicates with a database system to access data

Database Users
• There are four different types of database-system users:
o Naïve users
▪ unsophisticated users who interact with the system by using predefined user interfaces,
such as web or mobile applications
o Application programmers
▪ computer professionals who write application programs
▪ can choose from many tools to develop user interfaces
o Sophisticated users
▪ interact with the system without writing programs
▪ form their requests either using a database query language or by using tools such as data
analysis software
▪ analysts who submit queries to explore data in the database fall into this category
o Specialized users
▪ write specialized database applications that do not fit into the traditional
data-processing framework
▪ for example: CAD, graphic data, audio, video


Database Administrators
• A person who has central control over the system is called a database administrator
(DBA), whose functions are:
o Schema definition
o Storage structure and access-method definition
o Schema and physical-organization modification
o Granting of authorization for data access
o Routine maintenance
o Periodically backing up the database
o Ensuring that enough free disk space is available for normal operations, and upgrading
disk space as required
o Monitoring jobs running on the database and ensuring that performance is not
degraded by very expensive tasks submitted by some users

History of Database Systems
• 1950s and early 1960s
o Data processing using magnetic tapes for storage
o Punch cards for input
• Late 1960s and 1970s
o Hard disks allowed direct access to data
o Network and hierarchical data models in widespread use
o Ted Codd defines the relational data model (would win the ACM Turing Award for this work)
o IBM Research begins System R prototype
o Oracle releases the first commercial relational database
• 1980s
o SQL becomes the industry standard
o Parallel and distributed database systems (Wisconsin, IBM, Teradata)
o Object-oriented database systems
• 1990s
o Large decision support and data-mining applications
o Large multi-terabyte data warehouses
o Emergence of web commerce


History of Database Systems (cont.)


• 2000s
o Big data storage systems
▪ Google Bigtable, Yahoo PNUTS, Amazon AWS
▪ NoSQL systems
o Big data analysis: beyond SQL
• 2010s
o SQL reloaded
▪ SQL front-end to MapReduce systems
▪ Massively parallel database systems
▪ Multi-core main-memory databases
