Lesson 1 - Database System Overview

Uploaded by

Erwin Javier

LESSON 1 - DATABASE SYSTEM OVERVIEW

Learning Outcomes

By the end of this lesson, you should be able to:


1. Describe the primary applications of databases in real-world scenarios.
2. Differentiate between various database languages.
3. Summarize the historical milestones of database systems.

Outline
1. Database-System Applications
2. Purpose of Database Systems
3. View of Data
4. Database Languages
5. Database Design
6. Database Engine
7. Database Architecture
8. Database Users and Administrators

Database Systems

• A Database Management System (DBMS) contains information about a particular enterprise. It includes:
o Collection of Interrelated Data - The database itself.
o Set of Programs - Used to access and manage the data.
o Environment - Designed to be both convenient and efficient for users.

• Uses of Database Systems
o Manage collections of data that are:
▪ Highly Valuable - Data critical to the enterprise.
▪ Relatively Large - Large volumes of data.
▪ Accessed by Multiple Users and Applications - Often concurrently.

• Modern Database Systems
o A complex software system that manages large, complex collections of data.
o Databases play a crucial role in various aspects of modern life, influencing nearly every sector.

Database Applications Examples

• Enterprise Information
o Sales: Managing customer information, products, and purchases.
o Accounting: Handling payments, receipts, and assets.
o Human Resources: Storing and managing information about employees, salaries, and payroll
taxes.

• Manufacturing
o Managing production processes, inventory, orders, and supply chains.

Prepared by: Jun Y. Ercia


• Banking and Finance
o Customer Information - Managing accounts, loans, and banking transactions.
o Credit Card Transactions - Tracking and processing payments.
o Finance - Handling sales and purchases of financial instruments (e.g., stocks and bonds) and
storing real-time market data.

• Universities
o Managing student registration and grades.

• Airlines
o Managing reservations and flight schedules.

• Telecommunication
o Recording calls, texts, and data usage.
o Generating monthly bills and maintaining balances on prepaid calling cards.

• Web-Based Services
o Online Retailers: Tracking orders and providing customized recommendations.
o Online Advertisements: Managing ad placements and targeting.

Purpose of Database Systems

In the early days, database applications were built directly on top of file systems, which led to several issues:

• Data Redundancy and Inconsistency
o Data was stored in multiple file formats, resulting in duplication of information across different
files.

• Difficulty in Accessing Data
o A new program had to be written for each new task, making data access cumbersome.

• Data Isolation
o Data was scattered across multiple files and formats, making it difficult to combine and use
efficiently.

• Integrity Problems
o Integrity Constraints: For example, rules like "account balance > 0" were often buried in
program code rather than being explicitly stated.
o Difficulty in Modifications: Adding new constraints or modifying existing ones was challenging
because they were hard-coded into applications.

• Atomicity of Updates
o Failures may leave the database in an inconsistent state with only partial updates carried out.

Example: A fund transfer from one account to another should either be fully completed or not
occur at all to maintain consistency.

• Concurrent Access by Multiple Users
o Concurrent access is necessary for performance, especially in multi-user environments.

• Uncontrolled Concurrent Access
o Can lead to inconsistencies.

Example: Two people each read a balance of 100 and each withdraw 50 at the same time. If both
write back 100 - 50 = 50, the final balance is 50 instead of the correct 0, so one withdrawal
is lost.

• Security Problems
o It is difficult to provide user access to only specific parts of data, leading to potential security
vulnerabilities.

• Solution
o Modern database systems are designed to address and solve these problems, providing robust
solutions for atomicity, concurrency, and security.
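The fund-transfer example from the atomicity bullet can be sketched with Python's standard-library sqlite3 module. This is a minimal sketch under stated assumptions: the account table, the account IDs "A" and "B", and the transfer function are hypothetical names chosen for illustration, and the CHECK constraint uses balance >= 0 so that an account may be emptied.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to `dst` as one all-or-nothing unit."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # If this debit would overdraw `src`, the CHECK constraint raises
            # IntegrityError and the credit above is rolled back as well.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))

# Hypothetical sample data; the constraint is declared in the schema
# rather than buried in application code.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 20)])
conn.commit()

ok = transfer(conn, "A", "B", 50)       # completes fully
failed = transfer(conn, "A", "B", 500)  # fails midway, leaves no partial update
```

After both calls the balances are A = 50 and B = 70: the failed transfer leaves no trace of its first update.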

View of Data

• A database system is a collection of interrelated data and a set of programs that allow users to access
and modify these data.
• The primary purpose of a database system is to provide users with an abstract view of the data,
simplifying interaction with complex data structures.

• Components
o Data Models - A collection of conceptual tools for describing data, data relationships, data
semantics, and consistency constraints.
o Data Abstraction - Hides the complexity of data structures by representing data in the database
through several levels of abstraction, making it easier for users to interact with the data without
needing to understand the underlying complexities.

Data Models

Data models are essential tools used for describing and managing data within a database system. They
encompass several key aspects:

Components of Data Models

o Data - The actual information stored in the database.
o Data Relationships - The connections and interactions between different data entities.
o Data Semantics - The meaning and interpretation of the data.
o Data Constraints - The rules that ensure the integrity and validity of the data.

Types of Data Models

o Relational Model - Organizes data into tables (relations); each table typically represents an
entity or a relationship among entities.
o Entity-Relationship Data Model - Primarily used for database design, this model focuses on
entities and their relationships.
o Object-Based Data Models - Include object-oriented and object-relational models, combining
data with behavior (methods).
o Semi-Structured Data Model (XML) - Handles data that does not fit neatly into a relational
model, often used for documents and web data.
o Older Models
▪ Network Model - Uses a graph structure to represent relationships between entities.
▪ Hierarchical Model - Organizes data in a tree-like structure with parent-child
relationships.

[Figure: Example of Relational Model - A Sample Relational Database]
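To make the relational model concrete, here is a minimal sketch of a two-table database built with Python's standard-library sqlite3 module. The department table, the building column, and all sample rows are assumptions made for illustration; they are not taken from the lesson's figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (
        dept_name TEXT PRIMARY KEY,
        building  TEXT
    );
    CREATE TABLE instructor (
        ID        TEXT PRIMARY KEY,
        name      TEXT,
        -- Declares the link between the two tables. SQLite enforces it
        -- only when PRAGMA foreign_keys is switched on.
        dept_name TEXT REFERENCES department(dept_name),
        salary    NUMERIC
    );
""")

# Assumed sample rows.
conn.executemany("INSERT INTO department VALUES (?, ?)", [
    ("Comp. Sci.", "Taylor"),
    ("Physics", "Watson"),
])
conn.executemany("INSERT INTO instructor VALUES (?, ?, ?, ?)", [
    ("10101", "Srinivasan", "Comp. Sci.", 65000),
    ("22222", "Einstein", "Physics", 95000),
])

# The shared dept_name column relates rows in one table to rows in the other.
name_and_building = conn.execute("""
    SELECT i.name, d.building
    FROM instructor AS i
    JOIN department AS d ON i.dept_name = d.dept_name
    ORDER BY i.name
""").fetchall()
```

Each table holds rows with the same set of columns, and the relationship between an instructor and a department is expressed through a common value, not through pointers.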

Levels of Abstraction

In a database system, there are three levels of abstraction that help manage the complexity of data:

1. Physical Level
o Describes how a record (e.g., an instructor) is stored in the database.

2. Logical Level
o Describes the data stored in the database and the relationships among the data.

Example:

type instructor = record
    ID : string;
    name : string;
    dept_name : string;
    salary : integer;
end;

3. View Level
o Application programs hide details of data types from the user.
o Views can also hide specific information (such as an employee’s salary) for security purposes.
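The view level can be sketched as a SQL view that leaves out the salary column. This is a minimal sketch using Python's sqlite3 module; the view name instructor_public and the sample row are assumptions. SQLite has no user accounts, so the sketch only shows what the view exposes. In a multi-user DBMS, access would be granted on the view and not on the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instructor (
        ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT, salary NUMERIC
    );
    INSERT INTO instructor VALUES ('10101', 'Srinivasan', 'Comp. Sci.', 65000);

    -- Hypothetical view: every column of instructor except salary.
    CREATE VIEW instructor_public AS
        SELECT ID, name, dept_name FROM instructor;
""")

cur = conn.execute("SELECT * FROM instructor_public")
view_columns = [col[0] for col in cur.description]  # column names of the view
view_rows = cur.fetchall()
```

A user who queries only instructor_public sees the ID, name, and department of each instructor, while the salary stays hidden.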

Instances and Schemas

• Logical Schema - The overall logical structure of the database, like a blueprint.
o Example - Consider a banking database where the logical schema defines entities like
"Customers" and "Accounts," and the relationships between them, such as which customers
hold which accounts.

• Physical Schema - The actual physical storage of the database data.
o Example - The physical schema specifies how the customer and account data are stored on
disk, for example which files and index structures are used.

• Instance - The specific content of the database at a particular point in time, akin to the current value
of a variable.
o Example - At a given moment, the instance could include all customer details and their
corresponding account balances as they exist right now.
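The difference between a schema and an instance can be sketched as follows. The account table, its columns, and the sample customer are hypothetical; the sketch reads the schema through SQLite's table_info pragma and the instance through an ordinary query.

```python
import sqlite3

def schema(conn):
    """Return (column name, declared type) pairs for the account table."""
    # PRAGMA table_info rows: (cid, name, type, notnull, default, pk)
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(account)")]

def instance(conn):
    """Return the current contents of the account table."""
    return conn.execute("SELECT * FROM account ORDER BY account_no").fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (account_no TEXT PRIMARY KEY, customer TEXT, balance INTEGER)"
)

schema_before = schema(conn)
instance_before = instance(conn)  # no rows yet

# Ordinary updates change the instance but not the schema.
conn.execute("INSERT INTO account VALUES ('A-101', 'Cruz', 500)")
conn.execute("UPDATE account SET balance = balance + 250 WHERE account_no = 'A-101'")

schema_after = schema(conn)
instance_after = instance(conn)
```

The schema is identical before and after, like a variable's declaration, while the instance moves from empty to one row with a balance of 750, like the variable's current value.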

Physical Data Independence

• Physical data independence is the ability to modify the physical schema (how data is stored) without
affecting the logical schema (how data is structured logically).

• Key Points
o Applications depend on the logical schema
▪ This means that even if the physical storage of data changes (like moving from one
type of storage system to another), the applications that use the database won’t need
to change as long as the logical schema remains the same.

o Well-defined interfaces
▪ To achieve physical data independence, the interfaces between the different levels of
the database system (physical, logical, and view) must be clearly defined so that
changes in one level do not negatively impact the others.
• Example
o Suppose a company decides to migrate its database from using traditional hard drives to faster
solid-state drives (SSDs). This change involves modifying the physical storage method, but
thanks to physical data independence, the logical schema remains unchanged, and the
applications accessing the data continue to work as before without requiring any adjustments.
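A hardware migration cannot be shown in a few lines, so the sketch below swaps in a different physical-level change: adding an index. It assumes a small instructor table with made-up rows and a hypothetical index name. The application's query text and its result stay the same before and after the change.

```python
import sqlite3

# The application only knows the logical schema: table and column names.
QUERY = "SELECT name FROM instructor WHERE dept_name = ? ORDER BY name"

def names_in(conn, dept):
    """Application code: unchanged no matter how the data is stored."""
    return [row[0] for row in conn.execute(QUERY, (dept,))]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT, salary NUMERIC)"
)
conn.executemany("INSERT INTO instructor VALUES (?, ?, ?, ?)", [
    ("10101", "Srinivasan", "Comp. Sci.", 65000),
    ("45565", "Katz", "Comp. Sci.", 75000),
    ("22222", "Einstein", "Physics", 95000),
])

before = names_in(conn, "Comp. Sci.")

# Physical-level change: a new access structure is added on disk.
# No table or column is added, removed, or renamed.
conn.execute("CREATE INDEX idx_instructor_dept ON instructor (dept_name)")

after = names_in(conn, "Comp. Sci.")
```

Both calls return the same names, which is the point of physical data independence: storage can be tuned without touching the application.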

Data Definition Language (DDL)

• DDL is used to define the structure of the database schema.

Example:
CREATE TABLE instructor (
    ID        char(5),
    name      varchar(20),
    dept_name varchar(20),
    salary    numeric(8,2)
);
This example creates a table named instructor with columns for ID, name, dept_name, and salary.

• DDL Compiler - The DDL compiler processes these definitions and stores the table templates in a
data dictionary.

• Data Dictionary
o Contains metadata—data about the data, such as:
▪ Database Schema - The structure of the database.
▪ Integrity Constraints - Rules to ensure data accuracy, such as a primary key that
uniquely identifies each instructor.
▪ Authorization - Information about who has access to what data.
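As a rough illustration of a data dictionary, SQLite keeps its metadata in a catalog table named sqlite_master and exposes column details through the table_info pragma. The sketch below runs the lesson's CREATE TABLE statement with one assumed addition, a PRIMARY KEY on ID, so that an integrity constraint shows up in the metadata. Other database systems store their dictionaries differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE instructor (
        ID        CHAR(5) PRIMARY KEY,   -- PRIMARY KEY is an assumed addition
        name      VARCHAR(20),
        dept_name VARCHAR(20),
        salary    NUMERIC(8,2)
    )
""")

# sqlite_master is SQLite's catalog: one row per table, index, or view.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE name = 'instructor'"
).fetchall()

# PRAGMA table_info rows: (cid, name, declared type, notnull, default, pk).
# Map each column name to (declared type, is part of the primary key).
columns = {
    row[1]: (row[2], bool(row[5]))
    for row in conn.execute("PRAGMA table_info(instructor)")
}
```

The DDL statement stores no instructor data at all. It only adds entries to the catalog, which later queries and updates are checked against.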

Data Manipulation Language (DML)

• DML is used for accessing and updating (retrieving, inserting, deleting, and modifying) the data
organized by the appropriate data model. In practice, the term query language is often used as a
synonym for DML.

• Types of DML
o Procedural DML - Requires the user to specify what data is needed and how to get that data.
o Declarative DML - Requires the user to specify what data is needed without needing to specify
how to retrieve it.

• Key Points
o Declarative DMLs are usually easier to learn and use compared to procedural DMLs.
o Declarative DMLs are also known as non-procedural DMLs.
o The portion of a DML that involves information retrieval is specifically called a query language.
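The contrast between the two styles can be sketched by answering the same question twice. The procedural function spells out how to find the data by scanning records in a loop; the declarative function only states what is wanted in SQL. The function names and sample rows are assumptions for illustration.

```python
import sqlite3

# Assumed sample rows: (ID, name, dept_name, salary)
ROWS = [
    ("10101", "Srinivasan", "Comp. Sci.", 65000),
    ("22222", "Einstein", "Physics", 95000),
    ("33456", "Gold", "Physics", 87000),
]

def procedural(rows, dept):
    """Specify HOW: visit every record, test it, collect and sort matches."""
    result = []
    for _id, name, dept_name, _salary in rows:
        if dept_name == dept:
            result.append(name)
    return sorted(result)

def declarative(conn, dept):
    """Specify WHAT: the DBMS decides how to locate the matching rows."""
    sql = "SELECT name FROM instructor WHERE dept_name = ? ORDER BY name"
    return [row[0] for row in conn.execute(sql, (dept,))]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary NUMERIC)"
)
conn.executemany("INSERT INTO instructor VALUES (?, ?, ?, ?)", ROWS)

physics_procedural = procedural(ROWS, "Physics")
physics_declarative = declarative(conn, "Physics")
```

Both calls give the same answer, but only the declarative version leaves the DBMS free to pick a different access path later, for example an index, without any change to the request.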

SQL Query Language

• SQL is a nonprocedural query language. A query in SQL takes one or more tables as input and always
returns a single table.

Example:
SELECT name
FROM instructor
WHERE dept_name = 'Comp. Sci.';

This query retrieves the names of all instructors in the Computer Science department.
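The statement that a query returns a single table can be checked with a short sketch. It runs the query above through Python's sqlite3 module against assumed sample rows, then uses the query result as the input table of a second query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor ("
    "ID CHAR(5), name VARCHAR(20), dept_name VARCHAR(20), salary NUMERIC(8,2))"
)
# Assumed sample rows.
conn.executemany("INSERT INTO instructor VALUES (?, ?, ?, ?)", [
    ("10101", "Srinivasan", "Comp. Sci.", 65000),
    ("45565", "Katz", "Comp. Sci.", 75000),
    ("22222", "Einstein", "Physics", 95000),
])

cur = conn.execute("""
    SELECT name
    FROM instructor
    WHERE dept_name = 'Comp. Sci.'
""")
result_columns = [col[0] for col in cur.description]  # the result has columns
result_rows = sorted(cur.fetchall())                  # and rows, like any table

# Because the result is a table, it can serve as input to another query.
count = conn.execute("""
    SELECT COUNT(*)
    FROM (SELECT name FROM instructor WHERE dept_name = 'Comp. Sci.')
""").fetchone()[0]
```

The result is a one-column table named name with two rows, and the outer query counts those rows.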

• Key Points
o SQL is not equivalent in power to a Turing machine - some computations cannot be expressed
in SQL alone.
o To perform more complex operations, SQL is often embedded in a higher-level programming
language.
o Application programs typically access databases using:
▪ Language extensions - Allow embedding SQL directly within another programming
language.
▪ Application Program Interfaces (APIs) - Such as ODBC or JDBC, which allow SQL
queries to be sent to the database.

Database Access from Application Program

• Non-procedural query languages like SQL are not as powerful as a universal Turing machine, meaning
they can't handle all computational tasks.

• Lack of Support in SQL
o SQL does not support direct actions such as:
▪ Input from users
▪ Output to displays
▪ Communication over a network

• Embedding in Host Languages
o To perform these tasks, SQL must be embedded within a host programming language,
such as C/C++, Java, or Python.
o The host language handles the computations and actions, while SQL is used to query and
manipulate data in the database.

• Application Programs
o Programs that interact with the database in this way are known as application programs.
They combine the strengths of a host language and SQL to provide a full range of
functionality.
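A small application program might divide the work as sketched below, with Python as the host language and the standard-library sqlite3 module playing the role that ODBC or JDBC play elsewhere. The function names, the sample rows, and the report layout are all assumptions for illustration.

```python
import sqlite3

def open_database():
    """Create an in-memory database with assumed sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary NUMERIC)"
    )
    conn.executemany("INSERT INTO instructor VALUES (?, ?, ?, ?)", [
        ("10101", "Srinivasan", "Comp. Sci.", 65000),
        ("45565", "Katz", "Comp. Sci.", 75000),
        ("22222", "Einstein", "Physics", 95000),
    ])
    return conn

def instructors_in(conn, dept_name):
    """SQL part: retrieve the data. The API binds the parameter safely."""
    sql = "SELECT name, salary FROM instructor WHERE dept_name = ? ORDER BY name"
    return conn.execute(sql, (dept_name,)).fetchall()

def report(conn, dept_name):
    """Host-language part: decisions and output formatting."""
    rows = instructors_in(conn, dept_name)
    if not rows:
        return f"No instructors found in {dept_name}."
    return "\n".join(f"{name:<12}{salary:>10,.2f}" for name, salary in rows)

conn = open_database()
# In a real program the department would come from user input or a web form,
# which Python handles and SQL cannot.
print(report(conn, "Comp. Sci."))
```

Passing the department as a bound parameter instead of pasting it into the SQL string keeps user input from being interpreted as SQL.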

Database Design

The process of designing the general structure of the database involves two key aspects:

• Logical Design
o Deciding on the database schema, which requires identifying a "good" collection of relation
schemas.
▪ Business Decision - Determining what attributes (pieces of data) should be recorded in
the database.
▪ Computer Science Decision - Deciding what relation schemas should be created and
how the attributes should be distributed among them.

• Physical Design
o Deciding on the physical layout of the database, including how the data will be stored on disk
and how it will be accessed efficiently.

Database Engine

• A database system is divided into modules, each responsible for a specific part of the overall system.

Functional Components
1. Storage Manager - Handles the storage, retrieval, and update of data in the database.
2. Query Processor Component - Interprets and executes database queries, optimizing them for
efficiency.
3. Transaction Management Component - Ensures that all database transactions are processed
reliably and that the database remains in a consistent state, even in the case of failures.

Storage Manager

• The storage manager is a program module that acts as the interface between the low-level data
stored in the database and the application programs and queries submitted to the system.

• Responsibilities
o Interaction with the OS File Manager - Coordinates with the operating system's file
manager to handle data storage and retrieval.
o Efficient Storing, Retrieving, and Updating of Data - Ensures that data operations are
performed efficiently.

• Components of the Storage Manager
o Authorization and Integrity Manager - Manages user permissions and ensures data
integrity.
o Transaction Manager - Handles transactions to ensure data consistency.
o File Manager - Manages the storage of files on disk.
o Buffer Manager - Manages the database's memory buffer, optimizing access to data.

• Data Structures Implemented by the Storage Manager
o Data Files - Store the actual database data.
o Data Dictionary - Stores metadata about the structure of the database, particularly the
schema. This includes details like table definitions, column data types, and constraints.

o Indices - Provide fast access to data items. A database index contains pointers to data
items that hold particular values, allowing for quicker retrieval of information.

Query Processor

The query processor is responsible for interpreting and executing database queries. It includes the
following components:

• DDL Interpreter
o Interprets Data Definition Language (DDL) statements and records the definitions in the
data dictionary.

• DML Compiler
o Translates Data Manipulation Language (DML) statements into an evaluation plan. This
plan consists of low-level instructions that the query evaluation engine can execute.
o Query Optimization - The DML compiler performs query optimization by selecting the
most cost-effective evaluation plan from various alternatives.

• Query Evaluation Engine
o Executes the low-level instructions generated by the DML compiler.

These components work together to process and optimize queries, ensuring efficient retrieval and
manipulation of data.

Query Processing

Query processing involves three key steps:

1. Parsing and Translation
o The query is parsed and translated into a relational-algebra expression by the parser and
translator.

2. Optimization
o The relational-algebra expression is optimized by selecting the most efficient execution plan.
The optimizer uses statistics about the data to choose the best approach.

3. Evaluation
o The evaluation engine executes the query based on the optimized execution plan, resulting in
the query output.

This process ensures that database queries are executed efficiently, minimizing resource use and
speeding up response times.
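One way to observe the optimization step is to ask the DBMS which plan it chose. The sketch below uses SQLite's EXPLAIN QUERY PLAN statement through Python's sqlite3 module. The table and the index name are assumptions, and the wording of the plan text differs between SQLite versions, so the sketch only inspects it loosely.

```python
import sqlite3

QUERY = "SELECT name FROM instructor WHERE dept_name = 'Physics'"

def plan(conn, sql):
    """Return the plan steps SQLite chose for `sql`, as text."""
    # The last column of each EXPLAIN QUERY PLAN row describes one step.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT, salary NUMERIC)"
)

# Without an index on dept_name, the only option is to scan the whole table.
plan_without_index = plan(conn, QUERY)

# With an index available, the optimizer can pick a cheaper alternative.
conn.execute("CREATE INDEX idx_instructor_dept ON instructor (dept_name)")
plan_with_index = plan(conn, QUERY)

for step in plan_without_index + plan_with_index:
    print(step)
```

The query text is the same in both cases. Only the evaluation plan changes, from a full scan to a search that uses idx_instructor_dept.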

Transaction Management

• A transaction is a collection of operations that performs a single logical function in a database
application.

• Transaction Management
o The transaction management component ensures that the database remains in a
consistent (correct) state despite system failures (e.g., power failures, operating system
crashes) and transaction failures.

• Concurrency Control
o The concurrency-control manager oversees the interaction among concurrent transactions
to ensure the consistency of the database, preventing issues such as data conflicts.
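The kind of conflict that concurrency control prevents can be simulated in a few lines. This sketch does not use real concurrent sessions: it replays two withdrawals of 50 from a balance of 100 on a single sqlite3 connection, first in an interleaved read-read-write-write order and then as two single-step updates applied in turn. The table and function names are hypothetical, and a real concurrency-control manager would typically rely on techniques such as locking.

```python
import sqlite3

def fresh_account():
    """Create a database holding one account 'A' with a balance of 100."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO account VALUES ('A', 100)")
    conn.commit()
    return conn

def read_balance(conn):
    return conn.execute("SELECT balance FROM account WHERE id = 'A'").fetchone()[0]

def uncontrolled():
    """Both users read first, then both write: one withdrawal is lost."""
    conn = fresh_account()
    seen_by_user1 = read_balance(conn)  # reads 100
    seen_by_user2 = read_balance(conn)  # also reads 100
    conn.execute("UPDATE account SET balance = ? WHERE id = 'A'", (seen_by_user1 - 50,))
    conn.execute("UPDATE account SET balance = ? WHERE id = 'A'", (seen_by_user2 - 50,))
    return read_balance(conn)

def controlled():
    """Each withdrawal is one read-modify-write step, applied one after another."""
    conn = fresh_account()
    for _ in range(2):
        with conn:  # one transaction per withdrawal
            conn.execute("UPDATE account SET balance = balance - 50 WHERE id = 'A'")
    return read_balance(conn)

lost_update_balance = uncontrolled()
correct_balance = controlled()
```

The interleaved order ends with a balance of 50 even though 100 was withdrawn, while the controlled order ends with the correct balance of 0.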

Database Architecture

• Centralized Databases
o Typically run on one to a few cores with shared memory. All data is stored and managed in a
single location.

• Client-Server Architecture
o Involves one server machine that executes tasks on behalf of multiple client machines. This
setup allows for efficient resource management and centralized control.

• Parallel Databases
o Utilize many processors or cores working together to handle large-scale data processing.
o Types:
▪ Shared Memory - Many cores share a common main memory.
▪ Shared Disk - Processors have their own memory but share common disk storage.
▪ Shared Nothing - Each processor has its own disk and memory, which avoids
contention for shared resources.

• Distributed Databases
o Geographical Distribution - Data is distributed across different geographical locations.
o Schema/Data Heterogeneity - Different sites may use different schemas and data formats,
requiring integration.

Database Architecture (Centralized/Shared-Memory)

In a centralized or shared-memory database architecture, all components of the database system are
closely integrated and share the same memory space. It is structured as follows:

• Query Processor
o Compiler and Linker - Compiles and links application program object code.
o DML Compiler and Organizer - Translates and organizes DML (Data Manipulation Language)
queries for execution.
o DDL Interpreter - Interprets DDL (Data Definition Language) statements, recording them in the
data dictionary.
o Query Evaluation Engine - Executes the optimized query plan.

• Storage Manager
o Buffer Manager - Manages the memory buffer, optimizing data access.
o File Manager - Manages the storage of data on disk.
o Authorization and Integrity Manager - Controls access to data and ensures data integrity.
o Transaction Manager - Ensures that transactions are processed reliably and maintains the
consistency of the database.

• Disk Storage
o Data - The actual data stored in the database.
o Indices - Structures that improve the speed of data retrieval.
o Data Dictionary - Metadata that describes the structure of the database, including tables,
columns, and constraints.
o Statistical Data - Information used by the optimizer to choose the best execution plan.
