AWS Cloud Practitioner Essentials
Chapter 5 : Storage and Databases
Video 1 : Introduction
Now that your AWS environment is scalable, secure, and global, it's time to store data properly and track
customer activity, for example to support a loyalty program.
This requires two key components:
1. Storage: For files, logs, images, backups, etc.
2. Databases: For customer data, orders, and transactional records.
Since different data types and usage scenarios exist, AWS provides many purpose-built storage and
database services to help you architect the best solution for each use case.
Video 2 : Instance Stores and Amazon Elastic Block Store (Amazon EBS)
EC2 Instance Store:
• Temporary storage physically attached to the host machine.
• Fast, but non-persistent: Data is lost if the instance is stopped or terminated.
• Use case: temporary files, caches, scratch data.
Amazon Elastic Block Store (EBS):
• Persistent block-level storage for EC2.
• EBS volumes persist independently of the EC2 instance lifecycle.
• Data remains intact after stopping/starting instances.
• Volume type and size are chosen to match the workload.
• Supports snapshots (incremental backups) for recovery.
Use EBS when you need data durability for applications, databases, or OS storage.
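The incremental nature of snapshots can be illustrated with a small in-memory model. This is only a sketch, assuming a volume is a list of blocks; the take_snapshot helper and the block layout are hypothetical and do not reflect the EBS API or its internal format.

```python
def take_snapshot(volume, previous_full=None):
    """Return only the blocks that changed since the previous snapshot.

    `volume` is a list of block contents; `previous_full` is the full
    block list as of the last snapshot (None for the first snapshot).
    """
    if previous_full is None:
        # First snapshot: every block must be copied.
        return {i: block for i, block in enumerate(volume)}
    # Later snapshots: copy only blocks that are new or whose content differs.
    return {
        i: block
        for i, block in enumerate(volume)
        if i >= len(previous_full) or previous_full[i] != block
    }


volume = ["boot", "app-v1", "data-a", "data-b"]
first = take_snapshot(volume)             # copies all 4 blocks
baseline = list(volume)                   # state captured by the first snapshot

volume[1] = "app-v2"                      # change a single block
second = take_snapshot(volume, baseline)  # copies only the changed block

print(len(first), len(second))            # 4 1
```

The second snapshot holds one block instead of four, which is why incremental backups tend to be faster and cheaper than full copies.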
Video 3 : Amazon Simple Storage Service (Amazon S3)
Overview:
• Object storage for virtually unlimited data.
• Store files like images, videos, logs, documents.
• Data stored as objects in buckets.
• Max object size = 5 TB.
Features:
• Versioning: Retain older versions of files.
• Access Control: Define who can read/write objects.
• Storage Classes:
o S3 Standard: General purpose, frequent access. Offers 11 9s (99.999999999%) durability.
o S3 Standard-IA: Infrequent access, cost-optimized.
o S3 Glacier Flexible Retrieval: Archival data, multiple retrieval speeds.
o S3 Glacier Deep Archive: Cheapest, for long-term storage.
o S3 One Zone-IA: Cheaper, but stored in one AZ.
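To get a feel for what 11 9s means, the arithmetic can be worked through directly. The sketch below assumes the figure is read as a per-object, per-year durability target; the object count of 10 million is just an illustrative input.

```python
from fractions import Fraction

# 99.999999999% durability, kept as an exact fraction to avoid float error.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)
annual_loss_probability = 1 - DURABILITY  # 1 in 100 billion per object


def expected_annual_losses(object_count):
    """Expected number of objects lost per year under the assumption above."""
    return object_count * annual_loss_probability


losses = expected_annual_losses(10_000_000)
years_per_lost_object = 1 / losses

print(losses, years_per_lost_object)  # 1/10000 10000
```

Under this reading, storing 10 million objects would mean losing, on average, one object every 10,000 years.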
Lifecycle Policies:
• Automate data transition between storage classes.
• Example: Keep objects in S3 Standard for 90 days, move them to Standard-IA for the next 30 days, then archive them to S3 Glacier.
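The example schedule can be modeled as a simple age-to-class lookup. This is a sketch of the idea only: real lifecycle rules are configured on the bucket and applied by S3 itself, and the storage_class_for helper and class labels here are illustrative assumptions.

```python
# Transition schedule from the example: (object age in days at which the
# transition happens, storage class it moves to). Labels are illustrative.
TRANSITIONS = [
    (0, "STANDARD"),
    (90, "STANDARD_IA"),  # after 90 days in Standard
    (120, "GLACIER"),     # after a further 30 days in Standard-IA
]


def storage_class_for(age_days):
    """Return the storage class an object of the given age would be in."""
    current = TRANSITIONS[0][1]
    for threshold, storage_class in TRANSITIONS:
        if age_days >= threshold:
            current = storage_class
    return current


for age in (10, 90, 119, 400):
    print(age, storage_class_for(age))
```

Once the rule is in place, no one has to move objects by hand; the age of each object decides where it lives.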
Bonus:
• Host static websites directly from S3 buckets.
Comparing Amazon EBS and Amazon S3:
Amazon EBS is block storage attached to EC2. It is durable and well suited to active read/write workloads,
such as editing large files, because a write updates only the blocks that changed.
Amazon S3 is object storage with virtually unlimited capacity. Great for infrequent changes, backup, and
static hosting. Each object is stored whole, so changing any part of it means uploading the entire object again.
Round 1 (S3 Wins): A photo analysis website with millions of viewable images. S3 offers web-enabled URLs,
cost savings, no EC2 needed, and 11 9s durability.
Round 2 (EBS Wins): Editing an 80 GB video file. EBS updates only the changed blocks. In S3, the whole file
would need reuploading each time.
Conclusion: Use S3 for static, infrequently changed files. Use EBS for complex read/write operations. Your
use case decides the winner!
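The Round 2 argument comes down to how many bytes move per edit. The sketch below works through that arithmetic; the 1 MB block size and the 100 changed blocks are assumptions chosen for illustration, not EBS or S3 parameters.

```python
GB = 10 ** 9
MB = 10 ** 6


def block_storage_bytes(changed_blocks, block_size):
    """Bytes written when only the changed blocks are updated."""
    return changed_blocks * block_size


def object_storage_bytes(object_size):
    """Bytes uploaded when the whole object must be replaced."""
    return object_size


file_size = 80 * GB   # the 80 GB video file from Round 2
block_size = 1 * MB   # assumed block size
changed_blocks = 100  # assumed size of one edit (100 MB of changes)

block_bytes = block_storage_bytes(changed_blocks, block_size)
object_bytes = object_storage_bytes(file_size)

print(block_bytes // MB, "MB vs", object_bytes // GB, "GB")
print("ratio:", object_bytes // block_bytes)  # ratio: 800
```

With these assumed numbers, each edit moves 800 times more data on object storage than on block storage.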
Video 4 : Amazon Elastic File System (Amazon EFS)
Shared File System:
• Fully-managed, scalable file system.
• Multiple EC2 instances can read/write simultaneously.
• Ideal for shared storage and Linux-based workloads.
Differences from EBS:
• EBS: A volume normally attaches to one EC2 instance at a time, in the same AZ as the volume.
• EFS: Multiple EC2 instances across the AZs of a Region can connect at the same time.
• EFS scales automatically as you write more data.
• EFS is better suited to workloads that need parallel access.
Video 5 : Amazon Relational Database Service (Amazon RDS)
Relational Databases:
• Store structured data using tables.
• Supports relationships between data (e.g., customers & orders).
• Use SQL to query data.
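The customers-and-orders relationship can be tried out locally. The sketch below uses Python's built-in sqlite3 module as a stand-in for an RDS engine; the table and column names are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for a managed relational engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 15.0), (12, 2, 40.0);
""")

# Join the two tables to get the order count and total spend per customer.
rows = conn.execute("""
    SELECT c.name, COUNT(o.id), SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id
    ORDER BY c.name
""").fetchall()

print(rows)  # [('Ana', 2, 40.0), ('Ben', 1, 40.0)]
```

The join is what makes the data "relational": orders point back to customers through customer_id, and SQL combines them on demand.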
Amazon RDS:
• Managed service for relational databases.
• Supports: MySQL, PostgreSQL, SQL Server, Oracle, MariaDB.
• Automates much of the patching, backup, scaling, and failover work.
Amazon Aurora:
• Highly performant RDS-compatible database.
• Supports MySQL & PostgreSQL engines.
• Cost-effective: about 1/10th the cost of commercial databases.
• Stores six copies of your data across three AZs.
• Supports up to 15 read replicas for scalability.
• Continuous backups to S3 + point-in-time recovery.
Video 6 : Amazon DynamoDB
NoSQL Database:
• Fully managed, serverless, and non-relational.
• Data stored as items with attributes in a table.
• Schema-less: Each item can have different attributes.
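The item-and-attribute model can be pictured with plain dictionaries. This is a conceptual sketch only: the put_item and get_item helpers and the user_id key are hypothetical and are not the DynamoDB API.

```python
# A "table" modeled as a dict keyed by a partition key (here: "user_id").
table = {}


def put_item(item):
    """Store an item under its key; no fixed schema is enforced."""
    table[item["user_id"]] = item


def get_item(user_id):
    """Look an item up directly by key; returns None if it is absent."""
    return table.get(user_id)


# Two items in the same table with different sets of attributes.
put_item({"user_id": "u1", "name": "Ana", "loyalty_points": 120})
put_item({"user_id": "u2", "name": "Ben", "favorite_drinks": ["latte", "mocha"]})

print(sorted(get_item("u1")))  # ['loyalty_points', 'name', 'user_id']
print(sorted(get_item("u2")))  # ['favorite_drinks', 'name', 'user_id']
```

Only the key is required on every item; everything else can vary from item to item.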
High Performance:
• Single-digit millisecond latency, even at massive scale.
• Built-in redundancy across AZs.
• Automatically scales.
Use Cases:
• Real-time apps, gaming, mobile apps.
• Where relational DBs struggle with flexibility/performance.
• Example: Prime Day 2019 saw 7.11 trillion API calls to DynamoDB.
Comparing Amazon RDS and Amazon DynamoDB:
Amazon RDS is ideal for structured, relational data that requires SQL queries and relationships across
tables. Great for analytics and business logic.
DynamoDB is great for speed and flexibility. Handles large-scale, single-table access patterns where
relationships aren't critical.
Round 1 (RDS Wins): Sales supply chain analysis with complex joins. RDS excels here.
Round 2 (DynamoDB Wins): Employee directory with flat data. RDS features add overhead, while
DynamoDB offers speed and simplicity.
Conclusion: Choose RDS for relationship-heavy apps. Choose DynamoDB for simple, high-throughput, and
flexible workloads.
Video 7 : Amazon Redshift
Data Warehousing:
• For analyzing large volumes of historical data.
• Ideal for Business Intelligence (BI) and analytics.
Features:
• Supports petabyte-scale data.
• Query structured and semi-structured data with SQL.
• Redshift Spectrum allows querying directly from S3 data lakes.
• Up to 10x the performance of traditional data warehouses.
• No server management required.
Use When:
• You need to analyze trends, create reports, or perform large-scale data analysis.
Video 8 : AWS Database Migration Service
Migrate Databases Easily:
• Supports migration from on-premises or cloud to AWS.
• Minimal downtime: Source stays operational during migration.
Types of Migrations:
1. Homogeneous: Same DB engines (e.g., MySQL → RDS MySQL).
2. Heterogeneous: Different engines (e.g., Oracle → Aurora). The AWS Schema Conversion Tool converts the schema first, then DMS migrates the data.
Other Use Cases:
• Test/dev migrations
• Database consolidation
• Continuous replication for DR or cross-region availability
Video 9 : Additional Database Services
DocumentDB:
• For document-oriented data (e.g., content, profiles).
• Great for content management systems.
Neptune:
• A graph database.
• Good for social networks, recommendations, fraud detection.
Amazon QLDB:
• Immutable ledger database.
• Tamper-proof, append-only record system.
• Ideal for audits, banking, compliance.
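One common way to make an append-only record tamper-evident is to chain entries together with hashes. The sketch below illustrates that general idea; it is not QLDB's API or internal design, and the append and verify helpers are hypothetical.

```python
import hashlib
import json


def entry_hash(prev_hash, record):
    """Hash a record together with the hash of the entry before it."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + payload).hexdigest()


def append(ledger, record):
    """Add a record to the end of the ledger, linked to the previous entry."""
    prev_hash = ledger[-1]["hash"] if ledger else ""
    ledger.append({"record": record, "hash": entry_hash(prev_hash, record)})


def verify(ledger):
    """Recompute every hash in order; any edited entry breaks the chain."""
    prev_hash = ""
    for entry in ledger:
        if entry["hash"] != entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


ledger = []
append(ledger, {"account": "A", "amount": 100})
append(ledger, {"account": "B", "amount": -40})
print(verify(ledger))  # True

ledger[0]["record"]["amount"] = 999  # tamper with history
print(verify(ledger))  # False
```

Because each hash depends on everything before it, changing an old record is detectable without trusting whoever holds the data.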
Amazon Managed Blockchain:
• Decentralized blockchain infrastructure.
• Use cases: multi-party business networks, supply chains.
Performance Boosters:
• ElastiCache: In-memory cache (Redis/Memcached) for faster reads.
• DAX (DynamoDB Accelerator): Caching for DynamoDB.
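The way a cache speeds up reads can be sketched with the cache-aside pattern: check the cache first, and go to the database only on a miss. The CacheAside class and slow_db_lookup function below are hypothetical stand-ins, not ElastiCache or DAX client code.

```python
class CacheAside:
    """Minimal cache-aside wrapper around a slower lookup function."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._cache:
            # Cache hit: answer from memory, no database call.
            self.hits += 1
            return self._cache[key]
        # Cache miss: read from the database, then remember the result.
        self.misses += 1
        value = self._load_from_db(key)
        self._cache[key] = value
        return value


db_calls = []


def slow_db_lookup(key):
    """Stand-in for a database read; records each call for inspection."""
    db_calls.append(key)
    return f"profile-for-{key}"


cache = CacheAside(slow_db_lookup)
cache.get("u1")  # miss, goes to the database
cache.get("u1")  # hit, served from memory
cache.get("u2")  # miss, goes to the database

print(cache.hits, cache.misses, db_calls)  # 1 2 ['u1', 'u2']
```

Three reads resulted in only two database calls; the more often the same keys are read, the larger the saving.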
Video 10 : Summary
Key Takeaways:
• Amazon EBS: Persistent, block-level storage for EC2.
• Amazon S3: Object storage with multiple storage classes and static website hosting.
• Amazon EFS: Shared, scalable file system for Linux EC2s.
• Amazon RDS/Aurora: Managed relational databases.
• Amazon DynamoDB: Serverless, NoSQL database.
• Amazon Redshift: Data warehouse for analytics.
• AWS DMS: Easy, low-downtime database migration.
• Specialty Databases: DocumentDB, Neptune, QLDB, Managed Blockchain.
• Caching: ElastiCache, DAX.