Chapter 1 - Database Concepts
What is a Database?
Definition
Purpose of Databases
• Data Organization: Databases organize data in a way that makes it easily accessible,
manageable, and updateable.
• Data Integrity: They ensure the accuracy and consistency of data over its entire
lifecycle.
• Concurrent Access: Multiple users can access and manipulate data simultaneously
without compromising data integrity.
• Security: Databases provide mechanisms to control access and protect data from
unauthorized users.
Real-World Examples
Types of Databases
Databases can be broadly classified into two categories: relational and non-relational (NoSQL).
Relational Databases
A relational database is a type of database that stores and provides access to data points that are
related to one another. Data is organized into tables (also known as relations) consisting of rows
and columns.
Characteristics
• Structured Schema: Requires a predefined schema that outlines the tables, fields, and
relationships.
• Tables and Relationships: Data is stored in tables, and relationships between data are
established through primary and foreign keys.
• SQL (Structured Query Language): Uses SQL for querying and maintaining the
database.
• ACID Compliance: Ensures Atomicity, Consistency, Isolation, and Durability in
transactions.
When to Use
• Structured Data: When data is highly structured and relationships between data are
well-defined.
• Complex Queries: Ideal for complex queries and transactions that require joins and
referential integrity.
• Data Integrity: When data accuracy and integrity are critical.
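The points above (tables linked by primary and foreign keys, joins, referential integrity) can be sketched in a few statements. This is a minimal illustration only; the departments and staff tables and all of their columns are hypothetical names, not part of the lesson's own schema.

```sql
-- Hypothetical schema for illustration; table and column names are assumptions.
CREATE TABLE departments (
    department_id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(50) NOT NULL
);

CREATE TABLE staff (
    staff_id INT AUTO_INCREMENT PRIMARY KEY,
    full_name VARCHAR(100) NOT NULL,
    department_id INT NOT NULL,
    -- Referential integrity: every staff row must point at an existing department.
    FOREIGN KEY (department_id) REFERENCES departments(department_id)
);

-- A join combines rows from both tables through the key relationship.
SELECT s.full_name, d.name AS department
FROM staff AS s
JOIN departments AS d ON d.department_id = s.department_id
ORDER BY d.name, s.full_name;
```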
NoSQL Databases
Characteristics
• Flexible Schema: A schemaless or dynamic schema allows flexibility in how data is stored.
• Horizontal Scalability: Designed to scale out by adding more servers.
• High Performance: Optimized for read/write performance with large volumes of data.
• Distributed Architecture: Data is distributed across multiple nodes for fault tolerance.
Objectives
What is SQL?
Structured Query Language (SQL) is a standardized programming language designed for
managing and manipulating relational databases. SQL allows you to create, retrieve, update, and
delete data within a database.
What is an RDBMS?
A relational database management system (RDBMS) is a software system that enables the
creation, management, and manipulation of relational databases. It provides the interface
between users or applications and the database itself.
Key Components
• Database Engine: Core service for storing, processing, and securing data.
• Database Schema: Defines the logical structure, including tables, views, and indexes.
• Query Processor: Interprets and executes SQL queries.
• Storage Manager: Handles data storage on disk.
• Transaction Manager: Ensures data integrity through ACID properties.
ACID Properties
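A transaction is the usual way to see the ACID properties at work. The sketch below assumes a hypothetical accounts table (the table, its columns, and the amounts are invented for illustration) and a transactional storage engine such as InnoDB, which is the default in current MySQL versions.

```sql
-- Hypothetical accounts table; names and amounts are assumptions for illustration.
CREATE TABLE accounts (
    account_id INT PRIMARY KEY,
    balance DECIMAL(12, 2) NOT NULL
);

INSERT INTO accounts (account_id, balance) VALUES (1, 500.00), (2, 100.00);

-- Atomicity: both updates take effect together, or neither does.
START TRANSACTION;

UPDATE accounts SET balance = balance - 50.00 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 50.00 WHERE account_id = 2;

-- Durability: once COMMIT returns, the change survives a crash.
-- Issuing ROLLBACK instead would undo both updates.
COMMIT;
```

Isolation and consistency are not visible in a single session: isolation governs what other sessions see while the transaction is open, and consistency means the constraints declared on the table hold before and after it.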
Introduction to MySQL
MySQL is an open-source RDBMS that uses SQL for data access. It is widely used due to its
reliability, performance, and ease of use.
Features
• Open Source: The Community Edition is freely available under the GNU General Public License (GPL).
• Cross-Platform Support: Runs on various operating systems like Windows, Linux, and
macOS.
• High Performance: Optimized for speed, especially in web applications.
• Scalability: Suitable for small applications to large-scale enterprise systems.
• Security: Provides robust security features for data protection.
• Replication: Supports data replication for redundancy and failover.
Editions
Selecting a Database
USE company_db;
Creating a Table
CREATE TABLE employees (
    employee_id INT AUTO_INCREMENT PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    department VARCHAR(50),
    salary DECIMAL(10, 2)
);
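The UPDATE and DELETE statements later in this lesson refer to rows with employee_id 1 and 2, so the table needs some data first. A minimal sketch of inserting rows follows; the names, departments, and salaries are made-up sample values.

```sql
-- Sample rows are invented for illustration.
-- employee_id is omitted so AUTO_INCREMENT assigns it; on an empty table
-- with default settings the rows receive 1, 2, and 3 in order.
INSERT INTO employees (first_name, last_name, department, salary)
VALUES
    ('Alice', 'Nguyen', 'Engineering', 75000.00),
    ('Bob', 'Martinez', 'Sales', 62000.00),
    ('Carol', 'Smith', 'Engineering', 81000.00);
```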
Querying Data
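A few representative queries against the employees table defined above are sketched here. The department value 'Engineering' is an assumed sample value, not something the lesson specifies.

```sql
-- All columns, all rows.
SELECT * FROM employees;

-- Selected columns, filtered and sorted.
SELECT first_name, last_name, salary
FROM employees
WHERE department = 'Engineering'
ORDER BY salary DESC;

-- Aggregate: head count and average salary per department.
SELECT department, COUNT(*) AS employee_count, AVG(salary) AS average_salary
FROM employees
GROUP BY department;
```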
Updating Records
UPDATE employees
SET salary = 80000.00
WHERE employee_id = 2;
Deleting Records
DELETE FROM employees
WHERE employee_id = 1;
Introduction
In the previous lessons, we explored the fundamentals of databases, SQL, and relational database
management systems like MySQL. Building upon that foundation, this lesson delves into the
core components of relational database architecture: tables, rows, and columns. Understanding
these elements is crucial for designing efficient databases, performing data manipulation, and
ensuring data integrity.
Objectives
By the end of this lesson, you should be able to:
A relational database organizes data into tables, which are composed of rows and columns.
This structure allows for efficient data storage, retrieval, and manipulation. The relational model
was introduced by E.F. Codd in 1970 and has become the foundation for modern database
systems.
Key Components
2. Tables
A table is a collection of related data organized into rows and columns. Each table represents a
specific entity or subject within the database, such as customers, products, or orders.
Structure of a Table
Example
Customers Table
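One possible shape for a customers table is sketched below. Only the table name and the last_name column appear elsewhere in this lesson; the remaining columns and the sample rows are assumptions for illustration.

```sql
-- Hypothetical definition; columns other than last_name are assumptions.
CREATE TABLE IF NOT EXISTS customers (
    customer_id INT AUTO_INCREMENT PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE
);

-- Each VALUES tuple becomes one row; each position maps to one column.
INSERT INTO customers (first_name, last_name, email)
VALUES
    ('Maria', 'Lopez', 'maria.lopez@example.com'),
    ('James', 'Chen', 'james.chen@example.com');

SELECT customer_id, first_name, last_name, email FROM customers;
```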
3. Columns (Fields/Attributes)
A column represents a specific data attribute within a table. It defines the kind of data that can
be stored in that column through data types and constraints.
Data Types
Data types specify the type of data that can be stored in a column.
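To make this concrete, here is a minimal sketch of column definitions that combine data types with constraints. The products table and its columns are hypothetical. Note that MySQL enforces CHECK constraints only from version 8.0.16 onward; earlier versions parse and ignore them.

```sql
-- Hypothetical table; each column pairs a data type with zero or more constraints.
CREATE TABLE products (
    product_id INT AUTO_INCREMENT PRIMARY KEY,     -- whole number, generated by the server
    name VARCHAR(100) NOT NULL,                    -- variable-length text, required
    description TEXT,                              -- long text, optional (NULL allowed)
    price DECIMAL(10, 2) NOT NULL CHECK (price >= 0), -- exact numeric, never negative
    in_stock BOOLEAN NOT NULL DEFAULT TRUE,        -- MySQL stores BOOLEAN as TINYINT(1)
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP  -- filled in when the row is inserted
);
```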
4. Rows (Records/Tuples)
A row represents a single record in a table, containing data for each column.
Characteristics
Example
Primary Key
A primary key is a column or a set of columns that uniquely identifies each row in a table.
Characteristics
Example
Foreign Key
A foreign key is a column or a set of columns in one table that references the primary key in
another table.
Characteristics
Example
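A foreign key could be declared as in the sketch below, where each order points at the customer who placed it. The column names and the rejected sample value are assumptions for illustration.

```sql
-- Parent table (hypothetical columns); skipped if it already exists.
CREATE TABLE IF NOT EXISTS customers (
    customer_id INT AUTO_INCREMENT PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE
);

-- Child table: customer_id must match an existing customers.customer_id.
CREATE TABLE orders (
    order_id INT AUTO_INCREMENT PRIMARY KEY,
    customer_id INT NOT NULL,
    order_date DATE NOT NULL,
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

-- The foreign key rejects this insert if no customer with customer_id = 999 exists:
-- INSERT INTO orders (customer_id, order_date) VALUES (999, '2024-01-15');
```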
Implementing Relationships
7. Indexes
Definition
An index is a database object that improves the speed of data retrieval operations on a table at
the cost of additional writes and storage space.
Types of Indexes
• Single-Column Index: Index based on a single column.
• Composite Index: Index based on multiple columns.
• Unique Index: Ensures that all values in the index are unique.
Creating an Index
CREATE INDEX idx_customer_last_name ON customers(last_name);
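The other index types listed above follow the same pattern. This sketch assumes the customers table also has first_name and email columns, which the lesson does not confirm; the index names are likewise hypothetical.

```sql
-- Assumed table shape; skipped if customers already exists.
CREATE TABLE IF NOT EXISTS customers (
    customer_id INT AUTO_INCREMENT PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE
);

-- Composite index: helps queries that filter on last_name,
-- or on last_name and first_name together.
CREATE INDEX idx_customer_name ON customers(last_name, first_name);

-- Unique index: rejects duplicate email values.
CREATE UNIQUE INDEX idx_customer_email ON customers(email);

-- EXPLAIN shows whether MySQL plans to use an index for a query.
EXPLAIN SELECT * FROM customers WHERE last_name = 'Lopez';

-- List the indexes defined on the table.
SHOW INDEX FROM customers;
```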
Benefits of Indexes
Considerations
Entities:
• Books
• Authors
• Members
• Loans
• DATE: YYYY-MM-DD.
• TIME: HH:MM:SS.
• Accuracy: Ensure the data type can represent all possible values.
• Storage Efficiency: Use appropriate data types to conserve space.
• Performance: Some data types are faster to process.
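The three considerations above can be applied column by column. The table below is a hypothetical example, with a comment on each column explaining the choice.

```sql
-- Hypothetical table; each type is chosen against the criteria above.
CREATE TABLE loan_payments (
    payment_id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY, -- ids are never negative
    amount DECIMAL(8, 2) NOT NULL,          -- accuracy: exact to the cent, up to 999999.99
    installments TINYINT UNSIGNED NOT NULL, -- storage: 1 byte, range 0 to 255
    paid_on DATE NOT NULL,                  -- storage: no time of day needed
    status ENUM('pending', 'paid', 'late') NOT NULL DEFAULT 'pending' -- small fixed set
);
```

DECIMAL is used for the amount because FLOAT and DOUBLE are approximate types and can introduce rounding differences in monetary values.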
• students
• courses
• enrollments
Relationships:
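One plausible reading of the three tables above is a many-to-many relationship: a student takes many courses, a course has many students, and enrollments links the two. The sketch below follows that assumption; all column names are hypothetical.

```sql
CREATE TABLE students (
    student_id INT AUTO_INCREMENT PRIMARY KEY,
    full_name VARCHAR(100) NOT NULL
);

CREATE TABLE courses (
    course_id INT AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(100) NOT NULL
);

-- Junction table: one row per (student, course) pair.
CREATE TABLE enrollments (
    student_id INT NOT NULL,
    course_id INT NOT NULL,
    enrolled_on DATE NOT NULL,
    PRIMARY KEY (student_id, course_id),
    FOREIGN KEY (student_id) REFERENCES students(student_id),
    FOREIGN KEY (course_id) REFERENCES courses(course_id)
);

-- Courses taken by each student.
SELECT s.full_name, c.title
FROM enrollments AS e
JOIN students AS s ON s.student_id = e.student_id
JOIN courses AS c ON c.course_id = e.course_id;
```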
Objectives
By the end of this lesson, you should be able to:
1. Database Normalization
Definition
Goals of Normalization
Normal Forms
Normalization is achieved through a series of rules known as normal forms. The most
commonly applied normal forms are:
First Normal Form (1NF)
• Criteria:
o Each table cell contains a single (atomic) value.
o Each record is unique.
• Implementation:
o Eliminate repeating groups of data.
o Create separate tables for each group of related data.
Example Violation:
Solution in 1NF:
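As an illustration of the 1NF criteria, consider a hypothetical contacts table that stores several phone numbers in one column. The tables and columns below are invented for this sketch.

```sql
-- Violates 1NF: phone_numbers packs several values into one cell,
-- for example '555-0100, 555-0199'.
CREATE TABLE contacts_unnormalized (
    contact_id INT PRIMARY KEY,
    full_name VARCHAR(100) NOT NULL,
    phone_numbers VARCHAR(255)
);

-- 1NF: one atomic phone number per row, held in a separate table.
CREATE TABLE contacts (
    contact_id INT PRIMARY KEY,
    full_name VARCHAR(100) NOT NULL
);

CREATE TABLE contact_phones (
    contact_id INT NOT NULL,
    phone_number VARCHAR(20) NOT NULL,
    PRIMARY KEY (contact_id, phone_number),   -- keeps each record unique
    FOREIGN KEY (contact_id) REFERENCES contacts(contact_id)
);
```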
Second Normal Form (2NF)
• Criteria:
o Must be in 1NF.
o All non-key attributes are fully functionally dependent on the primary key.
• Implementation:
o Remove partial dependencies.
o Create separate tables for sets of values that apply to multiple records.
Example Violation:
A table with a composite primary key where non-key attributes depend only on part of the key.
Solution in 2NF:
Separate the table into two, ensuring that each non-key attribute is fully dependent on the
primary key.
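The split described above might look like the following sketch. The order line tables and their columns are hypothetical names chosen for illustration.

```sql
-- Violates 2NF: the key is (order_id, product_id), but product_name
-- depends on product_id alone (a partial dependency).
CREATE TABLE order_lines_unnormalized (
    order_id INT NOT NULL,
    product_id INT NOT NULL,
    product_name VARCHAR(100) NOT NULL,
    quantity INT NOT NULL,
    PRIMARY KEY (order_id, product_id)
);

-- 2NF: product_name moves to a table keyed by product_id.
CREATE TABLE catalog_products (
    product_id INT PRIMARY KEY,
    product_name VARCHAR(100) NOT NULL
);

CREATE TABLE order_lines (
    order_id INT NOT NULL,
    product_id INT NOT NULL,
    quantity INT NOT NULL,   -- depends on the whole key
    PRIMARY KEY (order_id, product_id),
    FOREIGN KEY (product_id) REFERENCES catalog_products(product_id)
);
```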
Third Normal Form (3NF)
• Criteria:
o Must be in 2NF.
o No transitive dependencies (non-key attributes depend only on the primary key,
not on other non-key attributes).
• Implementation:
o Remove columns that depend on other non-key columns rather than directly on the
primary key.
o Create separate tables for related data.
Example Violation:
Solution in 3NF:
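A transitive dependency and its removal can be sketched as follows. The workers and department tables are hypothetical and exist only to illustrate the rule.

```sql
-- Violates 3NF: department_name depends on department_id,
-- which is itself a non-key column (a transitive dependency).
CREATE TABLE workers_unnormalized (
    worker_id INT PRIMARY KEY,
    full_name VARCHAR(100) NOT NULL,
    department_id INT NOT NULL,
    department_name VARCHAR(50) NOT NULL
);

-- 3NF: department details live in their own table.
CREATE TABLE work_departments (
    department_id INT PRIMARY KEY,
    department_name VARCHAR(50) NOT NULL
);

CREATE TABLE workers (
    worker_id INT PRIMARY KEY,
    full_name VARCHAR(100) NOT NULL,
    department_id INT NOT NULL,
    FOREIGN KEY (department_id) REFERENCES work_departments(department_id)
);
```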
While 1NF, 2NF, and 3NF are commonly used, higher normal forms like Boyce-Codd Normal
Form (BCNF), Fourth Normal Form (4NF), and Fifth Normal Form (5NF) exist for more
complex normalization needs.
2. Primary Keys
Definition
A primary key is a column or a combination of columns that uniquely identifies each row in a
table.
Characteristics
• Uniqueness: No two rows can have the same primary key value.
• Non-nullable: Primary key columns cannot contain NULL values.
• Immutable: The value of a primary key should not change over time.
• Single Column or Composite: Can consist of one column or multiple columns
(composite key).
Best Practices
• Use surrogate keys (like auto-incrementing integers) when natural keys are complex.
• Ensure the primary key is minimal and contains only necessary columns.
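The two practices above can be illustrated side by side: a surrogate key where the natural key is awkward, and a composite key that contains only the columns needed to identify a row. Both tables are hypothetical.

```sql
-- Surrogate key: a generated integer stands in for a complex natural key.
CREATE TABLE library_members (
    member_id INT AUTO_INCREMENT PRIMARY KEY,
    email VARCHAR(100) NOT NULL UNIQUE,   -- natural candidate key, still kept unique
    full_name VARCHAR(100) NOT NULL
);

-- Composite key: the three columns together identify a row;
-- dropping any one of them would allow duplicates.
CREATE TABLE seat_reservations (
    flight_number VARCHAR(10) NOT NULL,
    flight_date DATE NOT NULL,
    seat_code VARCHAR(4) NOT NULL,
    passenger_name VARCHAR(100) NOT NULL,
    PRIMARY KEY (flight_number, flight_date, seat_code)
);
```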
3. Foreign Keys
Definition
A foreign key is a column or a set of columns in one table that references the primary key in
another table, establishing a relationship between the two tables.
Purpose
You can define actions that occur when the referenced data changes:
• ON DELETE CASCADE: Delete rows in the child table when the parent row is deleted.
• ON UPDATE CASCADE: Update the foreign key in the child table when the parent key
changes.
Example:
CREATE TABLE order_items (
    order_item_id INT PRIMARY KEY,
    order_id INT,
    product_id INT,
    FOREIGN KEY (order_id) REFERENCES orders(order_id) ON DELETE CASCADE
);
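To see both cascade actions take effect, a parent table and some rows are needed. The sketch below uses separate hypothetical tables (demo_orders and demo_order_items) so that it can run on its own, and assumes the InnoDB storage engine, which enforces foreign keys.

```sql
CREATE TABLE demo_orders (
    order_id INT PRIMARY KEY
);

CREATE TABLE demo_order_items (
    order_item_id INT PRIMARY KEY,
    order_id INT NOT NULL,
    FOREIGN KEY (order_id) REFERENCES demo_orders(order_id)
        ON DELETE CASCADE
        ON UPDATE CASCADE
);

INSERT INTO demo_orders (order_id) VALUES (1), (2);
INSERT INTO demo_order_items (order_item_id, order_id)
VALUES (10, 1), (11, 1), (12, 2);

-- Deleting the parent row also deletes its child rows (items 10 and 11).
DELETE FROM demo_orders WHERE order_id = 1;

-- Changing the parent key rewrites order_id in the matching child row.
UPDATE demo_orders SET order_id = 20 WHERE order_id = 2;

-- Only item 12 remains, now with order_id = 20.
SELECT order_item_id, order_id FROM demo_order_items;
```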
Best Practices
6. Practical Examples
Example 1: Normalization Process
Unnormalized Table:
Normalization Steps:
Resulting Tables:
Introduction
In previous lessons, we've explored databases, SQL, and key concepts like normalization and keys.
This lesson focuses on Database Management Systems (DBMS), the software that allows us to
create, manage, and interact with databases. Understanding how a DBMS works is crucial for
effectively handling data storage, retrieval, and manipulation in various applications.
Purpose
Data Definition
• Schema Definition: Allows users to define the structure of the data, including tables,
fields, and relationships.
• Data Types and Constraints: Supports specifying data types and constraints to ensure
data integrity.
Data Manipulation
Data Security
Transaction Management
• ACID Properties: Ensuring transactions are Atomic, Consistent, Isolated, and Durable.
• Concurrency Control: Managing simultaneous operations without conflicts.
Data Integrity
3. Components of a DBMS
Database Engine
Query Processor
Storage Manager
Transaction Manager
User Interface
• Provides tools for users to interact with the database.
• Includes command-line interfaces, graphical user interfaces, and APIs.
Distributed DBMS
Client-Server DBMS
Parallel DBMS
Cloud-Based DBMS
NoSQL Databases
• Designed for unstructured or semi-structured data with flexible schemas.
• Include document stores, key-value stores, wide-column stores, and graph databases.
• Examples: MongoDB (document store), Cassandra (wide-column store), Redis (key-value
store).
Object-Oriented Databases
Hierarchical Databases
Network Databases
• Allow more complex relationships; records can have multiple parent and child records.
• Use a network structure to represent relationships.
• Optimizes data retrieval and manipulation using indexes and query optimization
techniques.
Data Administration
Cost
Performance Overhead
Maintenance
Relational DBMS
• MySQL
o Open-source, widely used for web applications.
• PostgreSQL
o Open-source, known for advanced features and compliance.
• Oracle Database
o Commercial, robust features for large enterprises.
• Microsoft SQL Server
o Commercial, integrated with Microsoft technologies.
NoSQL DBMS
• MongoDB
o Document-oriented, scalable and flexible schema.
• Apache Cassandra
o Wide-column store, designed for high availability.
• Redis
o In-memory key-value store, used for caching and real-time analytics.
Cloud-Based DBMS
• Amazon RDS
o Managed relational database service.
• Google Cloud SQL
o Managed database service for MySQL, PostgreSQL, and SQL Server.
• Azure SQL Database
o Managed relational cloud database service.
Healthcare
Education
Social Media
Data Encryption