Data P sss3 Web
Contact1
INDEXES
An index is a copy of selected columns of data from a table that is designed to enable quick
searching.
Indexes are powerful tools used in the background of a database to speed up querying.
Contact2
Types of Indexes
1. Bitmap Index
Explanation: Bitmap indexes use a series of bits to represent data, which is particularly
efficient for columns with a small number of distinct values (e.g., gender, yes/no fields).
Example: If a table has a column for gender with values ‘Male’ and ‘Female,’ a bitmap
index would create two bitmaps to represent these values.
Use Case: Ideal for read-heavy databases with infrequent updates.
2. Dense Index
Explanation: In a dense index, every search key value in the database is included in the
index and points to a data record.
Use Case: Efficient for smaller databases, as it requires more space but provides fast access.
3. Sparse Index
Explanation: In a sparse index, not every search key is included. It indexes only a subset of
search keys, and each entry points to a block of data.
Use Case: Useful when the database is very large because it uses less space than a dense
index but requires more steps to find some data.
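The bitmap index from item 1 above can be sketched in a few lines of Python. This is only an illustration, assuming a small made-up list of gender values where the position in the list stands for the row number; the function names are hypothetical and a real database stores its bitmaps in a more compact way.

```python
# Hypothetical rows; the position in the list acts as the row number.
genders = ["Male", "Female", "Female", "Male", "Female"]

def build_bitmap_index(values):
    """Return one bitmap (stored as an int) per distinct value."""
    bitmaps = {}
    for row, value in enumerate(values):
        # Set bit number `row` in the bitmap that belongs to this value.
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps

def rows_matching(bitmap, row_count):
    """List the row numbers whose bit is set in the bitmap."""
    return [row for row in range(row_count) if bitmap & (1 << row)]

index = build_bitmap_index(genders)
female_rows = rows_matching(index["Female"], len(genders))
```

With two distinct values the index holds exactly two bitmaps, and finding all 'Female' rows only needs the bits of one of them.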
CLASS WORK
Differentiate Between Dense and Sparse Indexes
Assignment
What is data security?
Contact3
Data processing
indexes
Differences Between Bitmap, Dense, and Sparse Indexes:
Bitmap indexes use bits to represent data and are efficient for columns with limited unique
values.
Dense indexes have an entry for every data value, ensuring faster retrieval.
Sparse indexes store fewer entries but save storage space.
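The difference between a dense and a sparse index can be shown with a small Python sketch. It assumes a made-up, sorted data file split into blocks of three records; the student IDs, names and function names are hypothetical and only serve the illustration.

```python
from bisect import bisect_right

# Hypothetical sorted data file split into blocks of three records each.
blocks = [
    [(101, "Ada"), (104, "Bola"), (107, "Chidi")],
    [(110, "Dayo"), (113, "Efe"), (116, "Funmi")],
    [(119, "Gbenga"), (122, "Hauwa"), (125, "Ife")],
]

# Dense index: one entry per search key, pointing at (block, slot).
dense_index = {
    key: (b, s)
    for b, block in enumerate(blocks)
    for s, (key, _) in enumerate(block)
}

# Sparse index: only the first key of each block is stored.
sparse_keys = [block[0][0] for block in blocks]

def dense_lookup(key):
    position = dense_index.get(key)
    if position is None:
        return None
    b, s = position
    return blocks[b][s][1]

def sparse_lookup(key):
    # Find the last block whose first key is <= key, then scan that block.
    b = bisect_right(sparse_keys, key) - 1
    if b < 0:
        return None
    for record_key, name in blocks[b]:
        if record_key == key:
            return name
    return None
```

Here the dense index keeps nine entries and goes straight to the record, while the sparse index keeps only three entries and has to scan one block after finding it.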
Data Organization for Efficient Retrieval
How data entries are organized in an index to support efficient retrieval:
In an index, the key values (such as names, IDs, or dates) are stored along with pointers to
their actual data location.
When a query is made, the system searches the index instead of scanning the entire dataset,
which drastically improves speed.
Tree Structure: Many indexes are organized in a hierarchical structure called a B-tree,
which allows for quick searching, insertion, and deletion of records.
Primary Index
o A primary index is typically a clustered index, meaning the data is physically stored in the order of
the primary key.
Secondary Index
Explanation: A secondary index is any index created on columns other than the primary
key. It helps improve the speed of queries on non-key columns.
o Example: In the student database, you might have a secondary index on the "last
name" column to make it easier to find students by last name.
Features:
o Secondary indexes are typically non-clustered, meaning the physical order of the
data remains unchanged.
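A secondary index on the "last name" column can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, assuming a made-up students table and a hypothetical index name (idx_students_last_name); other database systems use similar SQL but may report their query plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE students ("
    " id INTEGER PRIMARY KEY,"  # primary key: rows are kept in id order
    " first_name TEXT,"
    " last_name TEXT)"
)
conn.executemany(
    "INSERT INTO students (id, first_name, last_name) VALUES (?, ?, ?)",
    [(1, "Ada", "Okafor"), (2, "Bola", "Adeyemi"), (3, "Chidi", "Okafor")],
)

# Secondary index on a non-key column.
conn.execute("CREATE INDEX idx_students_last_name ON students (last_name)")

# Ask SQLite how it plans to run the query; the last column describes the plan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT first_name FROM students WHERE last_name = ?",
    ("Okafor",),
).fetchall()
plan_text = " ".join(row[-1] for row in plan)

matches = conn.execute(
    "SELECT first_name FROM students WHERE last_name = ? ORDER BY id",
    ("Okafor",),
).fetchall()
```

The plan text names the index, which shows that the search by last name goes through the secondary index instead of scanning every row.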
Class work
Give one difference between a primary index and a secondary index.
Assignment
Mention three methods of ensuring access control.
Week2contact1
Definition of Data Security
Data security involves protecting data from threats such as unauthorized access, theft, corruption,
and deletion.
Encryption and Backup Compared
Definition
o Encryption: The process of converting data into a coded format to prevent unauthorized access.
o Backup: The process of creating a copy of data to restore in case of loss or corruption.
Purpose
o Encryption: To ensure data privacy and security by making it unreadable to unauthorized users.
o Backup: To create a redundant copy of data for recovery in case of data loss.
Data Access
o Encryption: Only accessible with a decryption key.
o Backup: Original data can be accessed normally; backup is only used if needed.
Primary Focus
o Encryption: Security and confidentiality of data.
o Backup: Data recovery and availability.
Authentication
Authentication is the process of verifying a user's identity before granting access.
Examples of Authentication Methods:
Single-Factor Authentication (SFA): Using one form of verification, such as a password.
Multi-Factor Authentication (MFA): Requires two or more verification methods, such as a
password and a fingerprint scan.
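The two methods above can be compared in a short Python sketch. It is an illustration only: the stored password, the one-time code used as the second factor (standing in for something like a fingerprint scan) and the function names are all assumptions, not part of any real login system.

```python
import hashlib
import hmac
import os

def hash_password(password, salt):
    # PBKDF2 is a standard way to derive a slow, salted password hash.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

# Hypothetical stored account: only the salt and hash are kept, never the password.
salt = os.urandom(16)
stored_hash = hash_password("correct horse", salt)

def single_factor_login(password):
    """SFA: one check, the password."""
    return hmac.compare_digest(hash_password(password, salt), stored_hash)

def multi_factor_login(password, code, expected_code):
    """MFA: the password AND a second factor must both pass."""
    second_factor_ok = hmac.compare_digest(code, expected_code)
    return single_factor_login(password) and second_factor_ok
```

With MFA, a stolen password alone is not enough, because the second check still fails.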
Class work
What is decryption?
Assignment
State the types of encryption.
Contact 3
Creating a database
Passwording a database
Using indexes
Week3
ROLES OF A DATABASE ADMINISTRATOR (DBA) IN ENSURING DATABASE SECURITY
A Database Administrator (DBA) is responsible for managing the operation and security of a
database system. Their primary role in ensuring database security includes the following
responsibilities:
User Access Control: A DBA manages who can access the database by creating and controlling user
accounts. They implement authentication mechanisms such as usernames and passwords to verify
the identity of users and assign specific privileges to control access levels. Only authorized users
should be able to view, modify, or delete data.
Data Backup and Recovery: The DBA ensures that backups are regularly made to prevent data loss
due to hardware failure, software malfunctions, or cyberattacks. In case of a disaster, they are
responsible for recovering the database to its previous state using these backups.
Encryption: DBAs apply data encryption techniques to secure sensitive information stored in the
database.
Database Patching and Updates: To avoid vulnerabilities, the DBA regularly updates the database
management software with security patches released by the vendor. These updates often fix bugs or
security loopholes that can be exploited by attackers.
Monitoring and Auditing: A DBA regularly monitors database activity to detect suspicious
behavior, unauthorized access, or potential threats. They also conduct audits to review how data is
being accessed and used, ensuring that security policies are being followed.
COMMON SECURITY THREATS TO DATABASES
Databases face numerous security threats. Some of the most common include:
SQL Injection Attacks: This occurs when an attacker enters malicious SQL statements into an input
field in a web application. If the input is not properly validated, attackers can gain access to or
manipulate the database.
Unauthorized Access: When users or attackers gain access to a database without proper
permissions, they can view, steal, or manipulate data, leading to data breaches.
Data Breaches: Data breaches happen when sensitive information, such as customer details or
financial data, is exposed to unauthorized individuals, leading to identity theft, fraud, and other
criminal activities.
Malware and Ransomware Attacks: Malware, such as viruses or ransomware, can corrupt or
encrypt database files, making them unusable unless a ransom is paid. These types of attacks can
cripple an organization’s operations.
Internal Threats: Employees or users with access to the database may misuse their privileges to
alter or steal data. This kind of threat is particularly dangerous because it comes from trusted
sources.
Unpatched Software: Database software that is not kept up to date with the latest patches remains
exposed to known vulnerabilities, which attackers can exploit.
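The SQL injection threat listed above can be demonstrated safely with Python's built-in sqlite3 module. This is a minimal sketch, assuming a made-up users table and a made-up malicious input; it compares a query built by joining strings with one that uses a placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("ada", "ada-secret"), ("bola", "bola-secret")],
)

malicious_input = "nobody' OR '1'='1"

# Unsafe: the input is pasted into the SQL text, so it can change the query.
unsafe_sql = "SELECT secret FROM users WHERE name = '" + malicious_input + "'"
leaked = conn.execute(unsafe_sql).fetchall()

# Safer: a placeholder keeps the input as plain data, never as SQL.
safe = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (malicious_input,)
).fetchall()
```

The unsafe query returns every user's secret, while the query with a placeholder returns nothing because no user has that odd name.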
Class work
List five security threats to a database.
Assignment
Research and explain the concept of SQL injection and how it can be prevented in database
management.
Unit two
Class work
Write two practical applications of encryption.
Assignment
Differentiate between public key and private key?
Correction
The public key is used to encrypt data, while the private key is used to decrypt it.
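The idea of encrypting with one key and decrypting with the other can be worked through with very small numbers. The sketch below is a toy version of the RSA method, chosen here only as an illustration; the primes, the exponent and the function names are assumptions, and numbers this small give no real security. It needs Python 3.8 or later for the modular inverse.

```python
# Toy key generation with tiny primes (illustration only, not secure).
p, q = 61, 53
n = p * q                  # 3233, shared by both keys
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent: inverse of e modulo phi

public_key = (e, n)
private_key = (d, n)

def encrypt(message, key):
    exponent, modulus = key
    return pow(message, exponent, modulus)

def decrypt(ciphertext, key):
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)

message = 65
ciphertext = encrypt(message, public_key)      # anyone may do this
recovered = decrypt(ciphertext, private_key)   # only the key owner can do this
```

The public key can be shared freely, because recovering the message needs the private exponent d, which is kept secret.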
WEEK 4
Crash Recovery
Crash recovery is the way a database fixes itself after something goes wrong, like when the
computer crashes or loses power. It helps make sure no data is lost, and the database goes back to
how it was before the problem happened. Crash recovery makes sure all the work done before the
crash is either finished correctly or undone if it wasn’t finished.
Functions of the Undo Pass
The undo pass helps the database recover after a crash by checking that everything that was completed
is saved, and anything that wasn’t finished is removed. Think of it like a cleaner who makes sure
only the correct information stays and everything else is cleaned up.
Log Sequence Number (LSN)
A Log Sequence Number (LSN) is like a serial number for every action that happens in the
database. Each change or update gets its own number, which helps the database know what
happened and in what order. If the system crashes, the LSN helps the database figure out what
needs to be fixed, making sure everything is in the correct order.
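The way an LSN keeps changes in order can be sketched in Python. This is a simplified illustration, assuming a made-up log of three changes and made-up pages that each remember the LSN of the last change saved to them; real database systems keep more information in every log record.

```python
import itertools

lsn_counter = itertools.count(1)  # each log record gets the next number
log = []

def write_log(page, new_value):
    """Append a change record and return its LSN."""
    lsn = next(lsn_counter)
    log.append({"lsn": lsn, "page": page, "value": new_value})
    return lsn

write_log("A", 10)  # LSN 1
write_log("B", 20)  # LSN 2
write_log("A", 30)  # LSN 3

# Hypothetical state on disk after a crash: page A was saved after LSN 1,
# page B was never saved. Each page remembers the LSN of its last saved change.
disk = {"A": {"value": 10, "page_lsn": 1}, "B": {"value": None, "page_lsn": 0}}

def redo(disk, log):
    """Replay, in LSN order, only the changes a page has not yet seen."""
    for record in sorted(log, key=lambda r: r["lsn"]):
        page = disk[record["page"]]
        if record["lsn"] > page["page_lsn"]:
            page["value"] = record["value"]
            page["page_lsn"] = record["lsn"]
    return disk

redo(disk, log)
```

Comparing LSNs tells the database which changes are already on disk, so nothing is applied twice and nothing is applied out of order.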
CLASS WORK
List five security threats to a database.
ASSIGNMENT
Research and explain the concept of SQL injection and how it can be prevented in database
management.
Unit 2
Backup
A backup is a copy of the database that is saved at a specific time, so that if something goes wrong,
you can restore your data from that backup. There are different types of backups:
Full Backup: This is a copy of the entire database.
Incremental Backup: This only saves changes made after the last backup.
Differential Backup: This saves all changes made after the last full backup.
Backups are very important because they allow us to recover data if the system crashes.
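The three backup types differ only in which changes they copy. The Python sketch below shows this with a made-up timeline, where each item carries the time of its last change as a plain number; the item names, times and function names are hypothetical.

```python
# Hypothetical timeline: the "time" of each item's last change, as a plain number.
last_modified = {"students": 1, "teachers": 5, "results": 8}

def full_backup(items):
    """Copy everything."""
    return sorted(items)

def incremental_backup(items, last_backup_time):
    """Copy only what changed after the most recent backup of any type."""
    return sorted(name for name, t in items.items() if t > last_backup_time)

def differential_backup(items, last_full_backup_time):
    """Copy everything that changed after the most recent FULL backup."""
    return sorted(name for name, t in items.items() if t > last_full_backup_time)

# A full backup was taken at time 2 and an incremental backup at time 6.
full = full_backup(last_modified)
incremental = incremental_backup(last_modified, last_backup_time=6)
differential = differential_backup(last_modified, last_full_backup_time=2)
```

In this timeline the incremental backup copies only "results", while the differential backup copies both "results" and "teachers" because both changed after the full backup.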
Importance of Backup
Here are the main reasons why backups are important:
1. Data Protection: Backups ensure that important data is not permanently lost.
2. Disaster Recovery: Backups allow the system to be restored to a previous state, minimizing downtime and
losses.
3. Business Continuity: Backups help businesses quickly recover and continue operations after
a data loss incident.
4. Protection Against Ransomware and Viruses
5. Version Control: By keeping multiple backups, you can restore the database to a specific
time or version.
6. Minimizes Downtime: Having up-to-date backups allows faster recovery.
Unit 3
Transactions
A transaction is a small unit of work that happens in a database. Think of it like a task that needs to
be completed. For example, when you save a document or make a payment, that's a transaction. In
databases, transactions must follow a set of rules known as ACID:
Atomicity: Either everything in the transaction happens, or nothing happens.
Consistency: The database must stay correct before and after the transaction.
Isolation: Transactions should not interfere with each other.
Durability: Once a transaction is finished, it stays saved, even if the system crashes.
These rules help make sure that the database stays reliable and accurate.
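Atomicity can be seen in action with Python's built-in sqlite3 module. This is a minimal sketch, assuming a made-up accounts table with a rule that a balance may not go below zero; the names and amounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("ada", 100), ("bola", 50)])
conn.commit()

def transfer(conn, sender, receiver, amount):
    """Move money as one transaction: both updates happen, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK rule failed, so the whole transfer was undone

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))

ok = transfer(conn, "ada", "bola", 30)       # succeeds
failed = transfer(conn, "ada", "bola", 500)  # would make ada's balance negative
```

After the failed transfer, bola has not kept the 500 that was credited in the first step, because the whole transaction was rolled back.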
Techniques of Crash Recovery
There are several techniques used to recover from a crash in a database:
1. Logging: Logs keep track of all changes made to the database. If a crash happens, the
logs help determine what was completed and what needs to be undone. The most important
logs are:
o Redo Log: Replays changes to make sure the database has all the correct
information.
o Undo Log: Reverts changes that were not fully completed before the crash.
2. Checkpointing: The database takes snapshots at specific times (called checkpoints), saving
the current state. If a crash occurs, the recovery process can start from the last checkpoint,
which speeds up the process because it doesn’t have to go back to the very beginning.
3. Roll-back and Roll-forward:
o Rollback: This happens when a transaction fails and the database needs to undo all
changes made during that transaction. It returns the database to its previous state.
o Roll forward: This technique is used when the database uses a log to reapply
changes that were interrupted during a crash, ensuring that the database gets back
to its correct state.
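The logging, roll-forward and rollback ideas above can be put together in one small Python sketch. It is a simplified illustration, assuming a made-up log in which transaction T1 committed and T2 did not; checkpoints are left out to keep it short, and real recovery routines are considerably more detailed.

```python
# Hypothetical log written before a crash. T1 committed; T2 did not.
log = [
    {"txn": "T1", "op": "update", "key": "x", "old": 0, "new": 5},
    {"txn": "T2", "op": "update", "key": "y", "old": 1, "new": 9},
    {"txn": "T1", "op": "commit"},
    {"txn": "T2", "op": "update", "key": "x", "old": 5, "new": 7},
]

def recover(data, log):
    """Bring `data` to a state with all committed changes and no others."""
    committed = {r["txn"] for r in log if r["op"] == "commit"}

    # Roll forward (redo): replay every logged update in order.
    for r in log:
        if r["op"] == "update":
            data[r["key"]] = r["new"]

    # Roll back (undo): reverse the updates of unfinished transactions,
    # newest first, restoring the old values.
    for r in reversed(log):
        if r["op"] == "update" and r["txn"] not in committed:
            data[r["key"]] = r["old"]
    return data

# State found on disk after the crash (some changes saved, some not).
on_disk = {"x": 0, "y": 9}
recovered = recover(on_disk, log)
```

After recovery, x holds the value written by the committed transaction T1, and y is back to the value it had before the unfinished transaction T2 touched it.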
Importance of Crash Recovery
Crash recovery is extremely important for several reasons:
Data Integrity: It ensures that the data in the database remains correct and reliable, even
after a failure.
Prevents Data Loss: Crash recovery helps prevent data from being permanently lost, which
is crucial for businesses and organizations that rely on accurate data.
Ensures Continuity: By quickly restoring the database to a usable state, crash recovery
allows systems to get back up and running, minimizing downtime and keeping operations
smooth.
Maintains Consistency: Recovery processes ensure that the database remains consistent,
meaning it only contains valid, up-to-date information, even after an unexpected crash.
Lesson Objectives:
1. Define Parallel Databases:
2. Understand the Need for Parallel Databases:
3. Identify the Characteristics of Parallel Databases:
4. Describe the Benefits of Parallel Databases:
5. Explain the Architecture of Parallel Databases:
6. Differentiate Between the Types of Parallel Architectures:
7. Discuss Data Partitioning in Parallel Databases:
8. Examine the Challenges of Parallel Databases:
9. Evaluate the Role of Parallel Databases in Modern Systems:
Definition of Parallel Databases
Parallel databases use multiple processors to perform database tasks concurrently. This setup
allows for faster query processing and efficient handling of large data sets by distributing tasks
across processors.
Why Parallel Databases?
Parallel databases are used to improve performance by reducing query processing time. They allow
multiple tasks to run at the same time, speeding up data retrieval and analysis, especially in large-
scale systems.
Characteristics of Parallel Databases
Parallel databases are characterized by:
Concurrency: Multiple queries processed simultaneously.
Data Partitioning: Dividing data into smaller parts for faster processing.
Scalability: Ability to grow by adding more processors.
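Data partitioning and concurrency can be sketched in Python. In this illustration, worker threads stand in for the separate processors of a parallel database, and the data is a made-up column of 100 numbers; the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

scores = list(range(1, 101))  # hypothetical table column: the numbers 1 to 100

def partition(rows, parts):
    """Data partitioning: split rows into `parts` roughly equal slices."""
    return [rows[i::parts] for i in range(parts)]

def partial_sum(rows):
    """The piece of the query that one worker runs on its own partition."""
    return sum(rows)

def parallel_sum(rows, workers=4):
    # Each worker handles one partition; results are combined at the end.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, partition(rows, workers)))

total = parallel_sum(scores)
```

Changing the workers value changes how many partitions are made, which mirrors how a parallel database scales by adding processors.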
Class work
Define a parallel database in your own words.
Assignment
Study the architecture of parallel databases on page….
Lesson Objectives:
1. Define Distributed Databases:
2. Explain the Need for Distributed Databases:
3. Identify the Characteristics of Distributed Databases:
4. Describe Data Fragmentation in Distributed Databases:
5. Discuss Data Replication in Distributed Databases:
6. Explain the Architecture of Distributed Databases:
7. Discuss the Challenges of Distributed Databases:
8. Examine Data Synchronization in Distributed Databases:
9. Explore Real-World Applications of Distributed Databases:
Distributed Databases (SSS3)
Definition of Distributed Databases:
A distributed database is a collection of databases stored across different physical locations
that communicate via a network. These databases operate independently but appear as a
single system to users.
Need for Distributed Databases:
Distributed databases are used to ensure data availability and provide fault tolerance by
storing data in multiple locations. This setup allows for continuous access to data even if one
location fails and is particularly useful for large-scale organizations.
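Continuous access after a site failure can be sketched in Python. This is only an illustration, assuming three made-up sites that each keep a full copy of the data; the site names, the record and the function names are hypothetical, and a real system would also have to handle network delays and sites that come back online.

```python
class Site:
    """One location in a hypothetical distributed database."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.online = True

def replicate_write(sites, key, value):
    """Store the same record at every site that is reachable."""
    for site in sites:
        if site.online:
            site.data[key] = value

def read(sites, key):
    """Return (value, site name) from the first reachable site that has the key."""
    for site in sites:
        if site.online and key in site.data:
            return site.data[key], site.name
    raise LookupError(f"{key!r} is not available at any reachable site")

sites = [Site("Lagos"), Site("Abuja"), Site("Kano")]
replicate_write(sites, "student:1", "Ada")

sites[0].online = False  # simulate the Lagos site failing
value, served_by = read(sites, "student:1")
```

The read still succeeds after the first site fails, because another site holds a copy of the same record.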
Characteristics of Distributed Databases:
Data Distribution: Data is stored across multiple sites, improving access and reducing
network load.
Autonomy: Each site can manage its database independently, but data can still be shared
across the system.
Network-Based: A network connects the databases, ensuring data can be accessed and
managed from different locations.
Class work
Can you use a distributed database offline?
Assignment
What is data replication? (You may use any AI tool, but paraphrase your answers to avoid
plagiarism.)