
Faculty of Engineering and Technology

Department of Computer Engineering

CEF 438 - Advanced Databases and Administration

Chapter 1: Introduction to Advanced Database Concepts

1.1. Overview of Database Systems
1.2. Evolution of Database Systems
1.3. Key Principles in Database Design
1.4. Core Database Components
1.5. Database Administration
1.6. Tutorials

Chapter 1: Introduction to Advanced Database Concepts

At the core of any modern information system lies a database, responsible for ensuring data
integrity, availability, and security. A well-designed database system enables organizations to
manage data efficiently and supports complex decision-making processes. Given the critical
role of databases in business, government, healthcare, finance, and various industries,
professionals skilled in database design, implementation, and administration are essential in
today’s IT landscape (Fig 1.1).

The power of databases comes from a body of knowledge and technology that has developed
over several decades and is embodied in specialized software called a database management
system, or DBMS, or more colloquially a “database system.” A DBMS is a powerful tool for
creating and managing large amounts of data efficiently and allowing it to persist over long
periods of time, safely. These systems are among the most complex types of software
available.

Fig. 1.1. On the verge of a disruptive century

The volume of data generated worldwide is skyrocketing, highlighting the need for advanced
database techniques:

Over 4 million emails are sent every second.


More than 500 hours of video are uploaded to YouTube every minute.
Google processes over 40 petabytes of data daily.
Over 600 million tweets are generated per day.
Users spend over 1 trillion minutes on Facebook each month.
Nearly 100 items are ordered on Amazon every second.

In Cameroon, the increasing reliance on digital solutions across government, business,
healthcare, education, agriculture, and financial services has made advanced database
management a critical skill. As the country moves towards a more data-driven economy, the
ability to store, process, analyze, and secure vast amounts of data is essential for efficiency,
security, and informed decision-making. This course provides students with the practical and
theoretical knowledge necessary to develop and manage high-performance database systems
that address local and national challenges:

The Cameroonian government is actively digitalizing public services through initiatives
such as e-taxation, biometric identification systems (National Identity Cards, Passports),
and land registration databases. Advanced database management ensures the security and
integrity of citizen data, financial records, and administrative processes.
With the expansion of mobile banking (CAMTEL, MTN Money, Orange Money,
YUP, Express Union Mobile...) and fintech solutions, secure and high-performance
databases are essential for real-time transactions, fraud detection, and financial data
security. This course equips students with the knowledge to manage and optimize
financial databases and digital payment systems.
The healthcare sector in Cameroon is adopting electronic health records (EHR),
patient management systems, and telemedicine platforms. Effective database
administration ensures data security, patient confidentiality, and seamless information
sharing between hospitals, clinics, and pharmacies.
Agriculture remains a pillar of the Cameroonian economy, and data-driven solutions
can enhance market linkages, crop yield analysis, and logistics. Advanced databases
support agriculture platforms, price tracking systems, and real-time weather data
analysis to empower farmers and agribusinesses.
E-Commerce and local online marketplaces rely on well-structured databases to
handle product listings, customer transactions, and logistics. Database expertise is
essential for scalability, fraud prevention, and recommendation systems.
Cameroon's urban centers, particularly Douala and Yaoundé, are adopting smart city
technologies for traffic management, waste collection, and public transport systems.
Managing such large-scale data requires efficient database design, distributed systems,
and real-time analytics.

This course, Advanced Databases and Administration, builds upon foundational database
principles and introduces advanced topics such as object-oriented databases, query
optimization, data warehousing, data mining, and database security. It provides a practical,
hands-on approach, ensuring that students not only understand theoretical concepts but can
also apply them in real-world database systems.

1.1. Overview of Database Systems


A database is a structured collection of data that is stored and accessed electronically. It is
designed to efficiently store, manage, and retrieve data. Databases are typically organized into
tables, which can contain rows and columns (Fig 1.2).

Fig. 1.2. Database management system

A database management system (DBMS) is the software that facilitates the creation,
maintenance, and use of databases, providing users with the tools to add, modify, and query
data (Fig 1.2). Databases can range from simple systems, like flat files, to complex systems
that manage large amounts of data across distributed environments. The DBMS acts as an
interface between the user and the database, ensuring efficient storage, retrieval, and
management of the data.

Various users extract and manipulate the data for different business and personal
requirements. Administrators ensure that the data stored in the system is secure and restrict
access by other users. Database designers work on the design of the database itself, keeping it
flexible and well suited to the data it must hold. Finally, end users access the database to
collect and analyze the existing data for their needs.

A DBMS acts as a middleman between the user and the database. It simplifies the data
management process and ensures the data’s security, consistency, and integrity. It works in
the following way:

Organizing Data: The DBMS organizes your database files, categorizing them in a
way that makes them easily retrievable. Data organization can be based on various
factors such as the data type, relationships between the data, etc.
Access Control: The system governs user access to the data. It ensures appropriate
security measures are in place so that only authorized persons can access specific parts
of the database.
Data Interaction: System users are given facilities to perform various operations on
the data, such as creating new data, updating existing data, deleting unnecessary data,
and retrieving required data.
Concurrency: The DBMS also manages concurrency, which means it allows multiple
users to access and modify the data simultaneously without compromising data
integrity.
Backup and Recovery: Another crucial function of a DBMS is managing backups
and recovery. It ensures that the data can be recovered from backups without any
discrepancies in case of any failure or data loss.
Data Integrity: The DBMS ensures data integrity by enforcing rules on the data.
These rules ensure that the data remains consistent and accurate.

There are many kinds of DBMS available. The most popular are RDBMS, NoSQL DBMS,
and Object-oriented DBMS:

Relational (RDBMS):

Relational databases are the most popular type of DBMS. They store data in tables with rows
and columns, where each table has a unique key for identification, making them easy to query
and update. Data can be related across tables through foreign keys. Examples of RDBMS
include MySQL, PostgreSQL, Oracle, and Microsoft SQL Server. They are particularly suitable
for applications requiring complex queries, transaction support, and data integrity. They’re often
used for storing financial data, customer information, and other types of business data.

Object-Oriented Databases (OODBMS)

Object-oriented databases store data as objects rather than in tables of rows and columns,
similar to how objects are structured in object-oriented programming. These databases are well-suited for
applications where complex data structures need to be managed, and they support inheritance,
polymorphism, and encapsulation. Examples include db4o and ObjectDB.

NoSQL Databases:

NoSQL (Not Only SQL) databases are designed for flexibility, scalability, and the ability to
handle unstructured or semi-structured data. NoSQL databases are preferred in scenarios where
high scalability, fast performance, and handling diverse data types are critical. They come in
various types:

o Document-oriented databases (e.g., MongoDB) store data in JSON-like documents.
o Key-Value stores (e.g., Redis) store data as key-value pairs.
o Column-family stores (e.g., Cassandra) organize data in columns rather than
rows.
o Graph databases (e.g., Neo4j) store data as nodes and edges for representing
relationships.

Other Specialized Databases


Time-Series Databases (e.g., InfluxDB) are designed for managing time-stamped data,
commonly used in IoT and monitoring systems.
Spatial Databases (e.g., PostGIS) store and manage geographic data, supporting
queries like distance calculations and area searches.
Hierarchical databases are based on a parent-child relationship between data records.
They’re typically used when data is structured and doesn’t often change, such as in
EHR systems.
Flat-file DBMSs are another type of database; they store data in simple text files
instead of using more complex structures like tables and objects.
Network databases are similar to hierarchical ones but have a more flexible structure
allowing multiple paths between records. This makes them well-suited for applications
where data is constantly changing, such as inventory management.

1.2. Evolution of Database Systems
DBMSs have undergone significant transformations over the decades. What began as rudimentary
systems to catalog and retrieve data has evolved into sophisticated platforms that underpin
a vast array of business operations. In the early days, data was often stored in flat files or
hierarchical databases, which had a set structure and required significant manual effort for data
retrieval. But as organizations grew and data volumes surged, there emerged a need for more
efficient ways to store and access data (Fig 1.3).

This led to the development of the relational model in the 1970s, a watershed moment in the
history of databases. Systems based on this model, known as Relational Database Management
Systems (RDBMS), revolutionized data storage and retrieval by representing data in tables,
allowing for more versatile and efficient querying.

The turn of the century brought with it the challenges of handling the Internet’s explosion and
the vast amounts of unstructured data it generated. Modern database systems began to diversify,
incorporating NoSQL databases to cater to needs that traditional RDBMS couldn’t address.
Today, we have a plethora of database systems, each tailored for specific use cases – from
object-oriented and graph databases to distributed systems and more.

Fig. 1.3. Evolution of Database Management Systems (DBMS)

As organizations sought more structured ways to store and manage data, hierarchical
databases emerged in the 1960s. These databases organize data into a tree-like structure, with
a root element and a hierarchy of parent-child relationships (Fig 1.4). Each child record has
one parent, and records are organized into a tree where each parent node can have multiple
child nodes, but each child has only one parent. The most notable example of a hierarchical
database is IBM's Information Management System (IMS).
While hierarchical databases were better at handling structured data and relationships
compared to flat files, they still had significant limitations:
Data redundancy: Information could be repeated in different branches of the hierarchy.
Inflexibility: Changes in data structures were difficult because altering the hierarchy
could require significant reorganization of the entire database.

Fig. 1.4. Hierarchical model in DBMS

In 1970, Edgar F. Codd, a researcher at IBM, introduced the relational model for databases in
his seminal paper "A Relational Model of Data for Large Shared Data Banks." The relational
model represented a breakthrough in database design, as it allowed data to be stored in tables
(also known as relations), each consisting of rows (records) and columns (attributes), as
shown in Fig 1.5. Key innovations of the relational model include:
Data Independence: Changes to the data structure (e.g., adding new tables or
columns) could be made without disrupting the overall system.

Use of SQL: The introduction of Structured Query Language (SQL) made querying,
updating, and managing relational data much easier. SQL became the standard for
interacting with relational databases.
Normalization: Data could be organized into multiple related tables, reducing
redundancy and ensuring consistency.
Primary and Foreign Keys: These keys allowed different tables to be related,
enabling complex queries and data integrity.

Fig. 1.5. Relational database management system


The impact of the relational model was transformative. It provided a much more flexible,
efficient, and manageable way to store and retrieve data, allowing databases to grow in both
size and complexity. Major relational databases such as Oracle, MySQL, PostgreSQL, and
Microsoft SQL Server became foundational to many industries, particularly in business,
finance, and government.

Object-Oriented Database Management Systems (OODBMS) are designed for object-oriented
programming: they keep data as "objects," combining data and the methods that operate on it
(Fig 1.6). OODBMS bring together the worlds of object-oriented programming and database
management, representing real-world entities efficiently, which makes them a good fit for
modern, dynamic applications. As the name implies, an OODBMS represents data as objects,
encapsulating both their attributes and their behavior. This allows for seamless integration
of data and the operations associated with it.

Fig. 1.6. Object-oriented database management system

Objects in an OODBMS can have relationships, inheritance hierarchies, and methods
associated with them, providing rich data modeling capabilities. This makes OODBMS ideal
for applications where the structure and behavior of data entities are important, such as
scientific research, social media platforms, and complex simulations. OODBMS provide
transparent persistence, meaning that objects can be stored and retrieved directly from the
database without the need for complex mapping mechanisms. This simplifies the development
process and allows for a more natural interaction with the data. However, it’s important to
note that an OODBMS is not a one-size-fits-all solution. In scenarios where data consistency,
scalability, and complex queries are the primary concerns, other types of DBMS, such as
RDBMS or DDBMS, may be more appropriate.

While relational databases dominated for decades, the rise of the Internet, the big data
revolution, and the demand for highly scalable systems led to the development of NoSQL
databases in the early 2000s. NoSQL databases are designed to address some of the
limitations of relational databases, particularly in handling large-scale, unstructured, or semi-
structured data. Key characteristics of NoSQL systems include:

Flexibility: NoSQL databases allow developers to store data in a variety of formats,
such as documents, key-value pairs, wide-columns, or graphs. This flexibility supports
more diverse data types and structures, including JSON, XML, or binary data.
Scalability: NoSQL databases are designed for horizontal scaling, meaning they can
spread data across many servers or even across geographic regions. This makes them
ideal for handling the immense volume of data generated by modern applications.
High Availability: NoSQL databases often emphasize availability and fault tolerance,
ensuring systems remain operational even in the event of hardware failures or network
issues.

Fig. 1.7. NoSQL Database Management Systems

Unlike the rigid structure of RDBMS, NoSQL databases take a schema-less or schema-flexible
approach, enabling dynamic and agile data modeling. NoSQL DBMSs are designed
to handle multiple data formats, including JSON, XML, key-value pairs, graphs, and
documents, making them a versatile choice for modern applications that deal with
unstructured or evolving data schemas. This flexibility allows developers to adapt their data
models on the fly to meet changing business needs and reduce the need for costly and time-
consuming schema migrations.

NoSQL databases often offer powerful query capabilities tailored to the specific data model
they support. Keep in mind that there are four main types of NoSQL database systems: graph
databases, document databases, key-value stores, and wide-column stores. Each type uses a
different data model, resulting in significant differences between each NoSQL type. For
example, document-oriented databases like MongoDB provide rich query languages that
enable flexible search and aggregation of JSON-like documents. On the other hand, graph
databases like Neo4j offer specialized query languages optimized for traversing and analyzing
complex relationships between entities.

As data storage and processing demands grew, the need for distributed databases became
evident. Distributed databases are systems where data is stored across multiple physical
locations, either within a single data center or across multiple data centers globally (Fig 1.8).
These systems are designed to improve performance, reliability, and scalability by distributing
both the data and workload across many machines. Key features of distributed databases
include:
Replication: Copies of data are maintained on multiple servers, ensuring availability
and fault tolerance.
Sharding: Data is partitioned (or "sharded") across different servers to improve
performance and allow for horizontal scaling.
CAP Trade-offs: Distributed systems must balance consistency (ensuring all nodes have
the same data), availability (ensuring the system is always responsive), and partition
tolerance, as outlined in the CAP Theorem.

Fig. 1.8. Distributed database management system

The rise of cloud computing in the 2010s further accelerated the adoption of distributed
databases. Cloud platforms like Amazon Web Services (AWS), Microsoft Azure, and Google
Cloud provide scalable infrastructure that enables companies to deploy distributed databases
easily (Fig 1.9). These cloud-based databases are designed to offer elastic scaling, automatic
backups, and seamless integration with cloud-native applications. Popular cloud-based and
distributed databases include:
Amazon Aurora: A relational database that combines the best features of both
relational and NoSQL databases.
Google Cloud Spanner: A distributed relational database designed for global
applications requiring high scalability.
Cassandra: A NoSQL database that provides high availability and scalability for
large, distributed systems.

Fig. 1.9. Cloud database management systems

In addition to relational databases, cloud platforms also support specialized databases, such as
time-series databases (e.g., InfluxDB), document stores (e.g., MongoDB Atlas), and graph
databases (e.g., Neo4j Aura). These databases allow organizations to quickly deploy and
manage data storage solutions without needing to maintain physical infrastructure.

Vector database management systems (DBMS) like Vector or Pinecone represent the newest
and most innovative approach to data management (Fig 1.10). These systems are specifically
designed to handle high-performance analytics and complex data processing tasks. The main
feature of a vector DBMS is its ability to leverage vectorized processing, which allows
multiple data elements to be processed simultaneously, resulting in significantly faster query
performance. In today’s data market, where organizations are dealing with massive volumes
of data and demanding real-time analytics, vector DBMS plays a crucial role in enabling
efficient data processing and analysis. The practical implications of using Vector DBMS
include accelerated data-driven decision-making, improved operational efficiency, and
enhanced scalability.

Fig. 1.10. Vector database management systems

Its advanced capabilities make it a valuable solution for industries such as finance,
telecommunications, and e-commerce, where rapid analysis of large datasets is vital. By
leveraging the power of Vector DBMS, organizations can gain actionable insights and stay at
the forefront of data-driven innovation.

1.3. Key Principles in Database Design


The key principles of database design include the importance of data integrity, data
availability, data security, scalability, and performance optimization. These principles are
fundamental to ensuring that a database system performs well, remains reliable, and meets the
needs of users and organizations alike.

a) Data Integrity
Data Integrity refers to the accuracy, consistency, and reliability of data stored in a database.
It is a fundamental principle in database design because ensuring data integrity guarantees that
the information in the database is trustworthy and valid throughout its lifecycle. There are
various mechanisms employed to enforce data integrity, including:

Entity Integrity: Ensures that each record in a table is unique and can be identified by
a unique identifier, usually a primary key. No two rows in a table should have the
same primary key value.
Referential Integrity: Ensures that relationships between tables remain consistent. For
example, if a foreign key in one table refers to a primary key in another table, the
foreign key value must either be null or correspond to an existing value in the related
table. This prevents orphaned records (records that refer to non-existent entries in
another table).
Domain Integrity: Ensures that data entries are of the correct type, format, and range.
For example, a database might enforce a rule that the "age" field must be a positive
integer within a certain range.
User-Defined Integrity: Allows businesses to set specific rules based on business logic
that must be followed for data to be considered valid. For instance, a company might
have a rule that an employee’s salary cannot exceed a specific threshold.
Data integrity is enforced through constraints in the database, such as NOT NULL, CHECK,
UNIQUE, and FOREIGN KEY constraints. Maintaining high levels of data integrity prevents
errors, reduces data redundancy, and ensures that queries and reports based on the data will be
accurate.
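As a hedged sketch, the table definition below shows how these constraints might be declared in standard SQL; the tables and the specific rules (such as the age range and salary ceiling) are illustrative assumptions:

    CREATE TABLE Departments (
        DeptID    INT PRIMARY KEY,
        DeptName  VARCHAR(100) NOT NULL
    );

    CREATE TABLE Employees (
        EmployeeID  INT PRIMARY KEY,                          -- entity integrity: unique identifier
        Name        VARCHAR(100) NOT NULL,                    -- NOT NULL: a value must be supplied
        Email       VARCHAR(100) UNIQUE,                      -- UNIQUE: no duplicate addresses
        Age         INT CHECK (Age BETWEEN 18 AND 65),        -- domain integrity via CHECK
        Salary      DECIMAL(12,2) CHECK (Salary <= 5000000),  -- user-defined business rule
        DeptID      INT,
        FOREIGN KEY (DeptID) REFERENCES Departments(DeptID)   -- referential integrity
    );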

b) Data Availability
Data Availability refers to the ability to access the data when needed, without interruptions,
ensuring that users and applications can interact with the database consistently and efficiently.
This is especially critical for mission-critical applications in industries such as finance,
healthcare, and e-commerce, where downtime can lead to significant business loss or
disruption. Key strategies to ensure high data availability include:
Replication: Copying data across multiple servers or data centers ensures that, if one
server fails, another can take over. This redundancy allows applications to continue
accessing the data even if one node in the system is down.
Clustering: Involves grouping multiple database servers (or nodes) to work together as
a single system. Clusters can automatically distribute requests among multiple nodes,
ensuring load balancing and reducing the risk of downtime.

Failover Mechanisms: Automated failover systems detect when a database server goes
down and immediately reroute traffic to a backup server, minimizing downtime and
ensuring continuous access to the data.
Disaster Recovery: Having an effective disaster recovery plan is essential to ensure
that data can be restored after catastrophic events, such as server failures or natural
disasters. Cloud-based databases often offer built-in backup and recovery solutions for
enhanced availability.
Maintaining high availability often involves trade-offs, balancing between performance,
redundancy, and cost. Some systems may opt for high availability within a region, while
others may spread across multiple geographic regions for disaster recovery and business
continuity.

c) Data Security
Data Security is crucial to ensure that sensitive and private information within the database is
protected from unauthorized access, modification, or destruction. With increasing data
breaches and cyber threats, protecting database security has become one of the top priorities
for organizations. Key principles of data security include:
Authentication and Authorization: Ensuring that only authorized users can access the
database through proper authentication (e.g., username, password, biometric data) and
authorization (e.g., user roles, permissions). Databases can be set to restrict access to
certain users or user groups for specific tables or operations.
Encryption: Encrypting data ensures that even if an unauthorized party gains access to
the data, it remains unreadable without the decryption key. Data can be encrypted at
rest (when stored on disk) and in transit (when transmitted across networks).
Auditing and Monitoring: Continuously tracking database access and changes can help
detect potential security breaches. Auditing logs monitor who accessed the database
and what actions they took. These logs are essential for compliance requirements and
for investigating suspicious activities.
Backup and Recovery Security: Ensuring that backup data is stored securely and is
also encrypted. Backup systems should be safeguarded against tampering and
unauthorized access to prevent attackers from exploiting backup data.
Data Masking and Tokenization: Masking or tokenizing sensitive information, such as
credit card numbers or personal identification numbers (PINs), ensures that sensitive
data remains hidden or replaced with surrogate values for non-privileged users or
applications.

By adhering to security best practices, such as least privilege access, regular patching of
vulnerabilities, and implementing multi-factor authentication, databases can significantly
reduce the risk of data breaches and ensure compliance with privacy regulations (e.g., GDPR,
HIPAA).
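A minimal sketch of authorization at the SQL level, using hypothetical roles and tables; the statements for creating users and roles, and the available privilege names, vary between DBMS products:

    -- Give a reporting role read-only access to a single table
    GRANT SELECT ON Customers TO reporting_role;

    -- Allow a billing application to read and modify invoices, but nothing more
    GRANT SELECT, INSERT, UPDATE ON Invoices TO billing_app;

    -- Withdraw a privilege that is no longer needed (principle of least privilege)
    REVOKE UPDATE ON Invoices FROM billing_app;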

d) Scalability
Scalability refers to the ability of a database system to handle increasing amounts of data and
traffic while maintaining performance. As organizations grow and the volume of data
increases, a database must be capable of scaling to meet the demands of larger datasets, more
users, and higher transaction volumes. Scalability can be achieved in two ways:
Vertical Scaling (Scaling Up): Involves upgrading the existing database hardware,
such as increasing CPU, RAM, or storage capacity, to handle more data or users. This
is often simpler but has limitations in terms of physical hardware capacity.
Horizontal Scaling (Scaling Out): Involves adding more machines or nodes to the
system. For databases, this might involve sharding (partitioning data across multiple
servers) or using distributed databases where data and processing are spread across
many nodes, enhancing performance as demand increases.

Key aspects of scalable database design include:


Sharding: Breaking up a large dataset into smaller, more manageable chunks (called
shards), which are distributed across multiple servers. This allows for efficient load
balancing and can improve query performance by limiting the amount of data a server
must handle.
Load Balancing: Distributing incoming queries and transactions evenly across
multiple database servers helps prevent individual servers from becoming
overwhelmed and improves the overall response time.
Elasticity in Cloud Databases: Cloud databases offer dynamic scalability, allowing
resources to be added or removed based on demand. This elasticity ensures that
organizations only pay for the resources they use, while also ensuring the database can
grow as needed.

Scalability is critical for organizations dealing with ever-growing datasets, such as social
media platforms, e-commerce websites, or IoT applications, where high volumes of
concurrent users and data must be processed efficiently.

e) Performance Optimization
Performance Optimization is essential for ensuring that databases respond quickly to queries
and handle large datasets efficiently. Performance issues can arise when a database is slow to
process complex queries, retrieve large datasets, or handle high-concurrency workloads. Key
strategies for optimizing database performance include:
Indexing: Indexes are data structures that allow for fast lookup of records based on
specific columns. By creating indexes on frequently queried columns, databases can
reduce the time it takes to find records, particularly for large tables. However, indexes
come with trade-offs: while they improve read performance, they can slow down write
operations, as the index must also be updated.
Query Optimization: The database query engine uses various strategies to optimize
queries, such as reordering operations or using more efficient access paths. Database
administrators can also write optimized SQL queries, avoid subqueries or unnecessary
joins, and use caching mechanisms to improve query speed.
Normalization and Denormalization: Normalization organizes data into smaller,
related tables, reducing redundancy and improving storage efficiency. However,
excessive normalization can result in complex joins that slow query performance. In
some cases, denormalization (the process of merging tables or adding redundant data)
may be used to optimize read performance, particularly for reporting or analytical
queries.
Caching: Frequently accessed data or query results can be cached in memory, reducing
the need for repeated database queries and significantly improving response times for
commonly accessed data.
Partitioning: Partitioning a large table into smaller, more manageable sections can
improve performance by reducing the amount of data that needs to be processed in a
query. Partitioning can be done by range, list, hash, or composite methods, depending
on the use case.
Regular performance monitoring, query profiling, and database tuning are essential for
maintaining optimal database performance as applications evolve and grow.
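As one concrete illustration of the partitioning strategy mentioned above, the sketch below uses PostgreSQL-style declarative range partitioning on a hypothetical sales table; other systems express partitioning with different syntax:

    -- Parent table partitioned by sale date
    CREATE TABLE Sales (
        SaleID    BIGINT,
        SaleDate  DATE NOT NULL,
        Amount    NUMERIC(10,2)
    ) PARTITION BY RANGE (SaleDate);

    -- Each partition holds one year; queries filtered by date scan only the relevant partition
    CREATE TABLE Sales_2024 PARTITION OF Sales
        FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

    CREATE TABLE Sales_2025 PARTITION OF Sales
        FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');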

1.4. Core Database Components
The core components of a DBMS covered here are its core functions; the database schema and
architecture; tables, views, and relationships; and indexes, constraints, and keys.

a) Core functions of DBMS


The core functions of a DBMS include:
Data Definition: The DBMS defines the structure of the data through a data definition
language (DDL), allowing users to specify the types of data to be stored (such as
integers, strings, or dates) and how they should be organized in tables.
Data Manipulation: Through a data manipulation language (DML), the DBMS enables
users to insert, update, delete, and query data. SQL is the most common DML used in
relational databases.
Data Retrieval: A key function of the DBMS is to enable the querying of data,
allowing users to extract useful information from the database using SQL queries.
Transaction Management: The DBMS manages transactions to ensure that data
modifications occur in a consistent and reliable manner. This involves enforcing
ACID (Atomicity, Consistency, Isolation, Durability) properties to ensure the integrity
of the database during transaction processing.
Concurrency Control: When multiple users or applications access the database
concurrently, the DBMS manages their interactions to prevent conflicts, such as two
users trying to update the same data simultaneously.
Backup and Recovery: DBMS systems provide mechanisms for backing up data and
recovering it in case of hardware failure, corruption, or other disasters.
Security: The DBMS provides authentication and authorization mechanisms to control
user access to sensitive data. Users can be granted different privileges to perform
actions like read, write, or delete data.
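The sketch below, written in generic SQL against an assumed Accounts table, illustrates three of these functions together: data definition (DDL), data manipulation (DML), and transaction management with atomic commit:

    -- Data definition (DDL)
    CREATE TABLE Accounts (
        AccountID  INT PRIMARY KEY,
        Owner      VARCHAR(100) NOT NULL,
        Balance    DECIMAL(12,2) NOT NULL CHECK (Balance >= 0)
    );

    -- Data manipulation (DML)
    INSERT INTO Accounts (AccountID, Owner, Balance) VALUES (1, 'Alice', 1000.00);
    INSERT INTO Accounts (AccountID, Owner, Balance) VALUES (2, 'Bob', 500.00);

    -- Transaction management: both updates succeed or neither does (atomicity)
    START TRANSACTION;
    UPDATE Accounts SET Balance = Balance - 200 WHERE AccountID = 1;
    UPDATE Accounts SET Balance = Balance + 200 WHERE AccountID = 2;
    COMMIT;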

b) Database Schema and Architecture


The schema of a database defines its structure, including the tables, fields, relationships,
views, and other elements that make up the database. A schema provides a blueprint for how
data is organized and how the elements are interrelated. The types of database schema (Fig 1.11) include:

Physical Schema: Defines how data is stored on the storage medium (disk, cloud, etc.),
including the data structures (files, indexes, partitions) used to physically store the
data.
Logical Schema: Describes the logical view of the data, including tables, views, and
relationships. It abstracts the physical storage details and focuses on the organization
of the data.
External Schema (View Schema): Represents different user views of the database.
Different users may need access to different parts of the database, and the external
schema defines these perspectives. Views are created to restrict or present a subset of
the data to the user, making it more relevant and secure.

Fig. 1.11. Three-Tier Architecture

The architecture of a DBMS refers to its design and structure (Fig 1.11). Commonly, database
systems are designed using the three-tier architecture (Fig 1.12):
Internal Level (Physical Tier): This is the lowest level of the DBMS and deals with the
physical storage of data. It defines how data is stored on hardware and how it is
indexed and retrieved efficiently.

Conceptual Level (Logical Tier): The conceptual schema represents the logical
structure of the entire database, abstracting away the physical storage details. It
defines tables, columns, relationships, and constraints.
External Level (View Tier): This level describes the different ways the data is viewed
by users. It allows for customization of how data is presented to different users or
applications, with security measures such as data hiding or restricted access.

Fig. 1.12. DBMS Architecture


c) Tables, Views, and Relationships
In a relational database, tables are the fundamental units of data storage. Each table consists of
rows and columns, where each row represents a record (or tuple) and each column represents
an attribute (or field) of the data. For example, a "Customers" table might have columns like
CustomerID, Name, Email, and Address. Key characteristics of tables:
Primary Key: A column (or a set of columns) that uniquely identifies each record
in the table. No two rows can have the same primary key value.
Foreign Key: A column that establishes a link between two tables. It refers to the
primary key in another table, ensuring referential integrity between related tables.
A view is a virtual table based on the result of a query. It doesn't store data itself but rather
displays data stored in other tables. Views can be used to:
Present a specific subset of the data to users.

Simplify complex queries by encapsulating them into a single object.
Restrict access to sensitive data by exposing only necessary columns.

For example, a view called "Customer_Orders_View" might combine data from both the
"Customers" and "Orders" tables to show a combined result, without actually storing the
combined data physically.
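A hedged sketch of such a view, assuming Customers and Orders tables like those used earlier in this chapter:

    CREATE VIEW Customer_Orders_View AS
    SELECT c.CustomerID, c.Name, o.OrderID, o.OrderDate
    FROM Customers c
    JOIN Orders o ON o.CustomerID = c.CustomerID;

    -- Users query the view like a table, without seeing columns such as Email
    SELECT Name, OrderID
    FROM Customer_Orders_View
    WHERE OrderDate >= '2025-01-01';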
In relational databases, relationships are used to link tables based on common attributes
(keys). There are three types of relationships:
One-to-One: A record in one table is related to only one record in another table. For
example, each Employee has exactly one Office.
One-to-Many: A record in one table can be related to multiple records in another table.
For instance, a Customer can have multiple Orders, but each Order is associated with
only one Customer.
Many-to-Many: A record in one table can be related to multiple records in another
table, and vice versa. For example, a Student can enroll in multiple Courses, and a
Course can have multiple Students. This is typically implemented using a junction
table (also known as a bridge table), which contains foreign keys from both related
tables.
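A minimal sketch of the Student/Course example as a junction table; the names are illustrative:

    CREATE TABLE Students (
        StudentID  INT PRIMARY KEY,
        Name       VARCHAR(100) NOT NULL
    );

    CREATE TABLE Courses (
        CourseID  INT PRIMARY KEY,
        Title     VARCHAR(100) NOT NULL
    );

    -- Junction (bridge) table implementing the many-to-many relationship
    CREATE TABLE Enrollments (
        StudentID  INT NOT NULL,
        CourseID   INT NOT NULL,
        PRIMARY KEY (StudentID, CourseID),                       -- composite key: one row per pair
        FOREIGN KEY (StudentID) REFERENCES Students(StudentID),
        FOREIGN KEY (CourseID)  REFERENCES Courses(CourseID)
    );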

d) Indexes, Constraints, and Keys


An index is a database object that improves the speed of data retrieval operations on a
table. It is created on one or more columns of a table to allow quick access to rows based
on the values in those columns. Key points about indexes:
B-tree Indexes: The most common type of index, which allows for fast retrieval by
using a balanced tree structure.
Hash Indexes: Used for exact-match lookups where the value is hashed, and
retrieval is faster.
Unique Indexes: These indexes ensure that all values in the indexed column are
unique, preventing duplicate entries.
Composite Indexes: Indexes that involve multiple columns to improve query
performance on multi-column searches.

However, while indexes speed up data retrieval, they can slow down insert, update, and
delete operations, as the index must be updated whenever the underlying data changes.
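For illustration, the statements below create the index kinds mentioned above on a hypothetical Products table; in most relational systems the default index structure is a B-tree:

    -- Single-column index to speed up lookups by product name
    CREATE INDEX idx_products_name ON Products (ProductName);

    -- Composite index supporting searches that filter on both columns
    CREATE INDEX idx_products_name_category ON Products (ProductName, Category);

    -- Unique index that also prevents duplicate product codes
    CREATE UNIQUE INDEX idx_products_code ON Products (ProductCode);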

Constraints are rules that ensure the validity and integrity of the data in a database. Common
constraints include:
NOT NULL: Ensures that a column cannot have a null value.
UNIQUE: Ensures that all values in a column (or set of columns) are unique.
CHECK: Ensures that values in a column meet specific conditions (e.g., age >=
18).
DEFAULT: Specifies a default value for a column when no value is provided.
FOREIGN KEY: Ensures referential integrity by linking columns to primary keys
in other tables.

Keys are fundamental components that enforce data integrity in relational databases. They
help establish relationships between tables and ensure that data remains unique, valid, and
consistent.
Primary Key: A unique identifier for records in a table. Every table should have one
primary key.
Foreign Key: A reference to a primary key in another table, establishing relationships
between tables.
Unique Key: Similar to the primary key but allows for a column to accept null values.
Candidate Key: A set of columns that could be used as a primary key. One of these
keys will be chosen as the primary key.
Composite Key: A primary key that consists of more than one column.

1.5. Database Administration


Database administration is a critical function in modern IT environments, ensuring that
databases remain secure, efficient, and available. The role of a Database Administrator (DBA)
involves managing databases throughout their lifecycle, from development and deployment to
ongoing maintenance, backup, and recovery. Proper database administration is essential to
prevent data loss, optimize performance, and support business operations.

a) Roles and Responsibilities of a Database Administrator (DBA)
A DBA plays a vital role in designing, implementing, and maintaining a database system.
Their responsibilities include ensuring data security, optimizing database performance, and
maintaining system integrity.
One of the primary duties of a DBA is database design and implementation. They work
closely with developers and system architects to create efficient database structures, define
schemas, and establish tables, relationships, and indexes. Proper database design ensures that
data is stored efficiently and retrieved quickly.
Security is another critical aspect of a DBA’s role. They implement user authentication and
access controls to protect sensitive data from unauthorized access. Security measures such as
encryption, firewall configurations, and role-based access control (RBAC) help safeguard
data from cyber threats, including SQL injection attacks and malware.
Performance tuning is essential for maintaining a fast and responsive database system. DBAs
analyze queries, optimize indexing strategies, and manage system resources to prevent
bottlenecks. They also monitor database caching and ensure that queries are executed
efficiently.
A DBA is also responsible for backup and recovery management, ensuring that databases are
regularly backed up to prevent data loss. In the event of system failure, they must be able to
restore the database quickly using backup files and disaster recovery plans.
Regular maintenance and monitoring are crucial for database stability. DBAs apply patches,
update software, and perform routine health checks to prevent failures. By using monitoring
tools, they can detect issues such as deadlocks, slow queries, and index fragmentation,
ensuring the database runs optimally.

b) Database Lifecycle: Development, Deployment, and Maintenance


The database lifecycle consists of several stages, starting from the initial development phase
and continuing through deployment and maintenance. Each stage is essential for ensuring a
well-functioning and reliable database system (Fig 1.13).

Fig. 1.13. Database Life Cycle

During the development phase, business requirements are analyzed to determine the data that
needs to be stored. DBAs collaborate with developers to design a database schema, create
Entity-Relationship Diagrams (ERD), and ensure normalization to reduce redundancy.
Security considerations, such as encryption and access controls, are also planned during this
phase.
In the deployment phase, the database is implemented in a live environment. This involves
migrating data from legacy systems, setting up database structures, and optimizing
performance. Load testing is conducted to ensure the database can handle real-world usage.
DBAs also configure replication and failover mechanisms to ensure high availability.
The maintenance phase involves continuous monitoring and optimization. Regular backups
are scheduled to prevent data loss, and security patches are applied to protect against
vulnerabilities. DBAs also monitor performance metrics such as query execution time, CPU
usage, and disk I/O operations to keep the database running smoothly.

c) Backup and Recovery Strategies

A well-planned backup and recovery strategy is essential to protect data from loss due to
hardware failures, cyberattacks, or accidental deletions. There are different types of backups,
each serving specific purposes.
A full backup creates a complete copy of the entire database and is typically performed
periodically. Incremental backups save only the data that has changed since the last backup,
reducing storage requirements. Differential backups store all changes made since the last full
backup, providing a balance between storage efficiency and recovery speed.
Logical backups export data as SQL scripts, allowing for easy restoration, while physical
backups copy database files directly from disk storage. To ensure data security, backups
should be encrypted and stored in multiple locations, such as cloud storage, offsite data
centers, and external drives.
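As one hedged illustration, SQL Server's T-SQL dialect expresses full and differential backups directly in SQL, while other systems rely on external utilities (for example, pg_dump for logical backups in PostgreSQL). The database name and file paths below are placeholders:

    -- Full backup: complete copy of the database
    BACKUP DATABASE UniversityDB
    TO DISK = 'D:\backups\universitydb_full.bak';

    -- Differential backup: only the changes since the last full backup
    BACKUP DATABASE UniversityDB
    TO DISK = 'D:\backups\universitydb_diff.bak'
    WITH DIFFERENTIAL;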
Database recovery strategies include Point-in-Time Recovery (PITR), which allows
administrators to restore the database to a specific moment before a failure. Redo logs and
transaction logs help reconstruct lost transactions, ensuring minimal data loss. In critical
environments, failover mechanisms are used to switch operations to a standby database in
case of primary database failure.
Regular testing of backup restoration procedures is necessary to verify that backups are
functional. Organizations should implement disaster recovery plans (DRP) to ensure business
continuity in case of unexpected failures.

d) Monitoring and Troubleshooting Databases


Effective database monitoring and troubleshooting are essential for ensuring performance,
availability, and reliability. DBAs use monitoring tools to track critical performance metrics,
such as query execution time, CPU and memory usage, disk I/O operations, and concurrent
connections.
Common database performance issues include slow queries, deadlocks, high CPU usage, and
data corruption. Slow queries can result from missing indexes or inefficient joins, which can
be resolved by optimizing SQL statements and using EXPLAIN plans. Deadlocks occur when
multiple transactions block each other, requiring the use of row-level locking and optimized
transaction management.
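A small hedged example of this workflow, reusing the hypothetical Products table from earlier; the EXPLAIN keyword and the format of its output differ between systems such as MySQL and PostgreSQL:

    -- Inspect how the optimizer plans to execute a slow search
    EXPLAIN
    SELECT ProductName, Price
    FROM Products
    WHERE Category = 'Electronics';

    -- If the plan shows a full table scan, an index on the filtered column
    -- usually lets the optimizer switch to a much faster index lookup
    CREATE INDEX idx_products_category ON Products (Category);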
To troubleshoot high CPU usage, DBAs analyze query execution plans, optimize indexes, and
allocate additional system resources if needed. In case of database crashes, recovery strategies
such as restoring from backups, using failover clustering, and performing integrity checks
help restore normal operations.
Several tools are available for database monitoring and troubleshooting. Oracle Enterprise
Manager, SQL Server Management Studio (SSMS), pgAdmin, MySQL Workbench,
Prometheus, and Datadog are commonly used to analyze performance and detect issues in
different database environments.
By continuously monitoring database health, DBAs can proactively address issues before they
impact users. Regular maintenance, security updates, and optimization techniques ensure that
databases remain stable and perform efficiently.

1.6. Tutorials

Exercise 1:
Today, many institutions and businesses rely on databases to manage their information
systems. Consider the following sectors: Education (Universities and Schools), Healthcare
(Hospitals and Clinics), Finance (Banks and Microfinance Institutions), Business
(Supermarkets and E-commerce Platforms). For each sector, identify a type of database
system that could be used and explain why that type of database is appropriate for the sector.

Exercise 2:
You are hired to design a database for a supermarket chain to store data. Identify three important
tables that should be in the database. Describe two constraints you would apply to ensure data
integrity. What indexing strategy would you use to speed up product search queries?

Exercise 3:
The University of HMK maintains a student database containing personal details, academic records,
and financial transactions. Recently, there was a cyberattack on the university’s system, leading to data
breaches and temporary loss of student records. What security measures should be implemented to
protect student data? What type of backup strategy would you recommend for the university’s
database? Explain how the university can use monitoring tools to detect potential threats.

Exercise 4:
Mobile money services rely on databases for real-time transactions in 6G networks. Imagine you are a
Database Administrator (DBA) for a mobile money service, and customers report delays in processing
transactions. What could be three possible causes of slow transaction processing? Suggest two
optimization techniques to improve database performance. How would you use monitoring tools to
detect and resolve performance issues?

Exercise 5:
A hospital wants to manage patient records. A patient’s visit includes prescribed treatments, which are
stored in a separate Treatments table. Identify and design the tables needed to manage Patients,
Doctors, and Treatments effectively. Define Primary Keys, Foreign Keys, and Relationships.
Implement constraints to ensure that a doctor cannot have more than 50 active patients and that a patient
must have at least one assigned doctor. Write SQL statements to create the database structure.

Exercise 6:
An online marketplace experiences slow search times when retrieving products. The database contains
over 1 million products, and users often search based on Product Name and Category. Analyze the
indexing strategy to optimize search speed. Decide where to apply indexes. Consider the trade-offs of
using indexes on large datasets. Write SQL statements to create the necessary indexes.

Exercise 7:
A bank manages customer accounts, transactions, and user logins. To protect sensitive financial data,
the bank needs to implement strict security measures at the database level. Identify three critical tables.
Implement constraints to prevent unauthorized data modifications. Apply encryption for storing
passwords and sensitive information. Define a view that allows bank employees to see only necessary
customer data (excluding sensitive information like account balances). Write SQL statements to
enforce these security measures.

Exercise 8:
A telecom company relies on a cloud-based database to store customer subscription data. Any
downtime could lead to revenue loss and customer dissatisfaction. Propose a high-availability strategy
to ensure continuous database access. Design a backup and recovery plan to prevent data loss.
Implement replication to synchronize data across multiple locations. Write SQL statements to
configure automatic backups and replication.

Exercise 9:
A university database stores student grades, courses, and faculty assignments. Different users
(students, professors, and administrators) need varying levels of access. Design a UserRoles
system with different permissions for students, professors, and administrators. Implement
row-level security so that, Students can see only their own grades; Professors can update
grades only for their assigned courses; Administrators have full access. Write SQL statements
to enforce these access rules using Views and GRANT/REVOKE privileges.

Exercise 10:
A university offers multiple courses, and each student can enroll in multiple courses. A direct Many-
to-Many relationship between Students and Courses cannot be represented directly in relational databases. Design the
database. Write an SQL statement to create and enforce the relationships.

Exercise 11:
A startup wants to develop a local social media platform that allows villagers to share posts, comment,
and interact with multimedia content. Since users interact with various types of data (text, images,
videos, comments, and reactions), a traditional relational model is inefficient. Instead, an Object-
Oriented Database Management System (OODBMS) will be used to model these interactions as
objects with attributes and behaviors. Design the OODBMS schema using object classes with
attributes and methods. Define inheritance, such as a MediaPost class inheriting from Post to store
video/image metadata. Illustrate relationships. Explain how persistence works, i.e., how objects are
stored/retrieved in an OODBMS without conversion. How does inheritance in OODBMS improve
code reusability in this application? What are the advantages of using an OODBMS over a relational
DB for a dynamic app like social media? How would you handle scalability issues if the platform
grows rapidly in users and interactions? Install OODBMS tools to implement the solution.
