AWS Databases

Uploaded by

Kaustubh Negi

Amazon RDS

AWS RDS Overview

● RDS stands for Relational Database Service

● It’s a managed service for databases that use SQL as a query language.
● It allows you to create databases in the cloud that are managed by AWS:
○ Postgres
○ MySQL
○ MariaDB
○ Oracle
○ Microsoft SQL Server
○ Aurora (AWS Proprietary database)
Advantages of using RDS versus deploying a DB on EC2
● RDS is a managed service:
○ Automated provisioning, OS patching
○ Continuous backups and restore to a specific timestamp (Point in Time Restore)
○ Monitoring dashboards
○ Read replicas for improved read performance
○ Multi AZ setup for DR (Disaster Recovery)
○ Maintenance windows for upgrades
○ Scaling capability (vertical and horizontal)
○ Storage backed by EBS (gp2 or io1)
● BUT you can’t SSH into your instances
RDS Solution Architecture - EC2
RDS – Storage Auto Scaling
● Helps you increase storage on your RDS DB instance dynamically
● When RDS detects you are running out of free database storage, it scales automatically
● Avoid manually scaling your database storage
● You have to set Maximum Storage Threshold (maximum limit for DB storage)
● Automatically modify storage if:
○ Free storage is less than 10% of allocated storage
○ Low-storage lasts at least 5 minutes
○ 6 hours have passed since last modification
● Useful for applications with unpredictable workloads
● Supports all RDS database engines (MariaDB, MySQL, PostgreSQL, SQL Server, Oracle)
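The three trigger conditions for storage auto scaling can be read as a single boolean check. The sketch below is only an illustration of that rule in plain Python, not how RDS implements it; the `StorageState` type, its field names and the `should_autoscale` function are hypothetical, and the cap check against the Maximum Storage Threshold is an assumption about how the limit is applied.

```python
from dataclasses import dataclass

# Thresholds taken from the bullets above.
FREE_STORAGE_RATIO = 0.10   # free storage is less than 10% of allocated storage
LOW_STORAGE_MINUTES = 5     # low-storage condition lasts at least 5 minutes
COOLDOWN_HOURS = 6          # 6 hours have passed since the last modification


@dataclass
class StorageState:
    """Hypothetical snapshot of an instance's storage metrics."""
    allocated_gib: float
    free_gib: float
    low_storage_minutes: float
    hours_since_last_modification: float
    max_storage_threshold_gib: float


def should_autoscale(state: StorageState) -> bool:
    """Return True only when every documented condition holds."""
    low_on_space = state.free_gib < FREE_STORAGE_RATIO * state.allocated_gib
    sustained = state.low_storage_minutes >= LOW_STORAGE_MINUTES
    cooled_down = state.hours_since_last_modification >= COOLDOWN_HOURS
    # Assumption: no scaling once the configured maximum is reached.
    below_cap = state.allocated_gib < state.max_storage_threshold_gib
    return low_on_space and sustained and cooled_down and below_cap
```

For example, an instance with 100 GiB allocated and 5 GiB free for 10 minutes, last modified 7 hours ago, satisfies all conditions; the same instance modified 2 hours ago does not.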

RDS Read Replicas for read scalability
● Up to 5 Read Replicas
● Within AZ, Cross AZ or Cross Region
● Replication is ASYNC, so reads are eventually consistent
● Replicas can be promoted to their own DB
● Applications must update the connection string to leverage read replicas
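Because replicas have their own endpoints, the application has to decide which connection string to use for each statement. The sketch below shows one possible way to do that in plain Python; the `EndpointRouter` class and the host names are hypothetical, and the "starts with SELECT" test is a deliberate simplification.

```python
import itertools


class EndpointRouter:
    """Pick a database endpoint per statement (illustrative only)."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Rotate through replicas so reads are spread evenly.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, sql):
        # Simplification: treat any statement starting with SELECT as a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        # Writes (and reads when no replica exists) go to the primary.
        return self.primary


# Hypothetical endpoints, not real RDS host names.
router = EndpointRouter(
    primary="mydb.example.internal",
    replicas=["mydb-replica-1.example.internal", "mydb-replica-2.example.internal"],
)
```

Since replication is asynchronous, a read that must see the latest write would still need to be sent to the primary.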
RDS Read Replicas – Network Cost
● In AWS there’s a network cost when data goes from one AZ to another
● For RDS Read Replicas within the same region, you don’t pay that fee
Amazon Aurora
● Aurora is a proprietary technology from AWS (not open sourced)
● PostgreSQL and MySQL are both supported as Aurora DB engines
● Aurora is “AWS cloud optimized” and claims 5x performance improvement over MySQL on RDS, and over 3x the performance of Postgres on RDS
● Aurora storage automatically grows in increments of 10GB, up to 64 TB.
● Aurora costs more than RDS (20% more) – but is more efficient

NOTE: Not in the free tier

Aurora DB Cluster
Features of Aurora
● Automatic fail-over
● Backup and Recovery
● Isolation and security
● Industry compliance
● Push-button scaling
● Automated Patching with Zero Downtime
● Advanced Monitoring
● Routine Maintenance
● Backtrack: restore data at any point of time without using backups
Aurora Replicas - Auto Scaling
Aurora – Custom Endpoints
● Define a subset of Aurora Instances as a Custom Endpoint
● Example: Run analytical queries on specific replicas
● The Reader Endpoint is generally not used after defining Custom Endpoints
Aurora Serverless
● Automated database instantiation and auto-scaling based on actual usage
● Good for infrequent, intermittent or unpredictable workloads
● No capacity planning needed
● Pay per second, can be more cost-effective
Backups
RDS
● Automated backups:
○ Daily full backup of the database (during the backup window)
○ Transaction logs are backed up by RDS every 5 minutes
○ => ability to restore to any point in time (from oldest backup to 5 minutes ago)
○ 1 to 35 days of retention, set 0 to disable automated backups
● Manual DB Snapshots
○ Manually triggered by the user
○ Retention of backup for as long as you want
● Trick: in a stopped RDS database, you will still pay for storage. If you plan on stopping it for a long time, you should snapshot & restore instead
Aurora
● Automated backups
○ 1 to 35 days (cannot be disabled)
○ point-in-time recovery in that timeframe
● Manual DB Snapshots
○ Manually triggered by the user
○ Retention of backup for as long as you want
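The RDS automated-backup rules above imply a restore window that can be worked out with simple date arithmetic. The following sketch assumes the oldest available backup is exactly `retention_days` old, which is an approximation; the `restore_window` function is hypothetical and is not an AWS API.

```python
from datetime import datetime, timedelta, timezone

# Transaction logs are backed up every 5 minutes (from the notes above).
LOG_BACKUP_INTERVAL = timedelta(minutes=5)


def restore_window(now, retention_days):
    """Return (earliest, latest) restorable times, or None if backups are off."""
    if retention_days == 0:
        return None  # a retention of 0 disables automated backups on RDS
    if not 1 <= retention_days <= 35:
        raise ValueError("retention must be 0 or between 1 and 35 days")
    # Assumption: the oldest retained backup is exactly retention_days old.
    earliest = now - timedelta(days=retention_days)
    latest = now - LOG_BACKUP_INTERVAL
    return earliest, latest


now = datetime(2024, 1, 8, 12, 0, tzinfo=timezone.utc)
window = restore_window(now, retention_days=7)
```

With a 7-day retention and the timestamp above, the window runs from 2024-01-01 12:00 UTC to 2024-01-08 11:55 UTC.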
Amazon ElastiCache Overview
● Just as RDS provides managed relational databases, ElastiCache provides managed Redis or Memcached
● Caches are in-memory databases with high performance, low latency
● Helps reduce load on databases for read-intensive workloads
● AWS takes care of OS maintenance / patching, optimizations, setup,
configuration, monitoring, failure recovery and backups
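One common way a cache takes read load off a database is the cache-aside pattern: look in the cache first, and only query the database on a miss. The sketch below uses a plain dictionary as a stand-in for Redis or Memcached, so it shows the control flow only; the `CacheAside` class, its parameters and the default TTL are assumptions made for illustration.

```python
import time


class CacheAside:
    """Cache-aside lookup with a time-to-live, backed by a dict."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db      # function that queries the database
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}             # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]            # cache hit: no database call
        value = self._load(key)        # cache miss or expired: read from the DB
        self._entries[key] = (self._clock() + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key, for example after the underlying row was updated."""
        self._entries.pop(key, None)
```

Repeated `get` calls for the same key reach the database once per TTL period, which is where the load reduction comes from. The application is responsible for invalidating entries when the data changes.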
ElastiCache Solution Architecture - Cache
ElastiCache – Redis vs Memcached
DynamoDB
● Fully managed, highly available with replication across 3 AZs
● NoSQL database - not a relational database
● Scales to massive workloads, distributed “serverless” database
● Millions of requests per second, trillions of rows, 100s of TB of storage
● Fast and consistent in performance
● Single-digit millisecond latency – low latency retrieval
● Integrated with IAM for security, authorization and administration
● Low cost and auto scaling capabilities
● Standard & Infrequent Access (IA) Table Class
DynamoDB Accelerator - DAX
● Fully managed in-memory cache for DynamoDB
● 10x performance improvement – single-digit millisecond latency to microseconds latency – when accessing your DynamoDB tables
● Secure, highly scalable & highly available
● Difference with ElastiCache at the CCP level: DAX is only used for and is integrated with DynamoDB, while ElastiCache can be used for other databases
DynamoDB – Global Tables
● Make a DynamoDB table accessible with low latency in multiple Regions
● Active-Active replication (read/write to any AWS Region)
Redshift Overview
● Redshift is based on PostgreSQL, but it’s not used for OLTP
● It’s OLAP – online analytical processing (analytics and data warehousing)
● Load data once every hour, not every second
● 10x better performance than other data warehouses, scale to PBs of data
● Columnar storage of data (instead of row based)
● Massively Parallel Query Execution (MPP), highly available
● Pay as you go based on the instances provisioned
● Has a SQL interface for performing the queries
● BI tools such as Amazon QuickSight or Tableau integrate with it
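To see why columnar storage suits analytics, compare a row layout with a column layout of the same data. The sketch below is a toy illustration in plain Python and says nothing about Redshift's actual storage format; the sample rows and the `to_columns` helper are made up for this example.

```python
# Row-based layout: each record is stored together (typical for OLTP).
rows = [
    {"order_id": 1, "region": "eu", "amount": 20.0},
    {"order_id": 2, "region": "us", "amount": 35.5},
    {"order_id": 3, "region": "eu", "amount": 4.5},
]


def to_columns(records):
    """Pivot a list of records into one list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


# Column-based layout: each column is stored together (typical for OLAP).
columns = to_columns(rows)

# An aggregate over one column only needs to read that column's values.
total_amount = sum(columns["amount"])
```

An analytical query such as "total amount" touches a single column here, whereas the row layout would require reading every field of every record.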
Amazon EMR
● EMR stands for “Elastic MapReduce”
● EMR helps create Hadoop clusters (Big Data) to analyze and process vast amounts of data
● The clusters can be made of hundreds of EC2 instances
● Also supports Apache Spark, HBase, Presto, Flink…
● EMR takes care of all the provisioning and configuration
● Auto-scaling and integrated with Spot instances
● Use cases: data processing, machine learning, web indexing, big data…
Amazon Athena
● Serverless query service to analyze data stored in Amazon S3
● Uses standard SQL language to query the files
● Supports CSV, JSON, ORC, Avro, and Parquet (built on Presto)
● Pricing: $5.00 per TB of data scanned
● Use compressed or columnar data for cost-savings (less scan)
● Use cases: Business intelligence / analytics / reporting, analyze & query VPC Flow Logs, ELB Logs, CloudTrail trails, etc…

● Exam Tip: analyze data in S3 using serverless SQL, use Athena
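The per-TB pricing above makes the cost of a query easy to estimate, and shows why scanning less data saves money. This is a rough sketch using the $5.00 per TB figure from these notes; the `athena_query_cost` function is hypothetical, and treating 1 TB as 10**12 bytes is an assumption (actual billing units and minimums may differ).

```python
PRICE_PER_TB_USD = 5.00    # price quoted in the notes above
BYTES_PER_TB = 10**12      # assumption: decimal terabyte


def athena_query_cost(bytes_scanned):
    """Estimate the cost in USD of a query that scans bytes_scanned bytes."""
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must be non-negative")
    return bytes_scanned / BYTES_PER_TB * PRICE_PER_TB_USD


# Scanning 2 TB of raw CSV versus 200 GB of the same data stored in a
# compressed columnar format (the sizes are illustrative, not measured).
raw_cost = athena_query_cost(2 * 10**12)
columnar_cost = athena_query_cost(200 * 10**9)
```

Under these assumptions the first query costs about $10 and the second about $1, which is the saving the "compressed or columnar data" bullet refers to.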

Amazon QuickSight
● Serverless machine learning-powered business intelligence service to create
interactive dashboards
● Fast, automatically scalable, embeddable, with per-session pricing
● Use cases:
○ Business analytics
○ Building visualizations
○ Perform ad-hoc analysis
○ Get business insights using data
● Integrated with RDS, Aurora, Athena, Redshift, S3…
DocumentDB
● Aurora is an “AWS-implementation” of PostgreSQL / MySQL …
● DocumentDB is the same for MongoDB (which is a NoSQL database)
● MongoDB is used to store, query, and index JSON data
● Similar “deployment concepts” as Aurora
● Fully Managed, highly available with replication across 3 AZ
● DocumentDB storage automatically grows in increments of 10GB, up to 64 TB.
● Automatically scales to workloads with millions of requests per second
Amazon Neptune
● Fully managed graph database
● A popular graph dataset would be a social network
○ Users have friends
○ Posts have comments
○ Comments have likes from users
○ Users share and like posts…
● Highly available across 3 AZ, with up to 15 read replicas
● Build and run applications working with highly connected datasets – optimized for complex and hard queries
● Can store up to billions of relations and query the graph with milliseconds latency
● Highly available with replication across multiple AZs
● Great for knowledge graphs (Wikipedia), fraud detection, recommendation engines, social networking
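The social-network example can be made concrete with a tiny in-memory graph. The sketch below is plain Python and not a Neptune query; the `friends` data and the `within_hops` function are hypothetical. It shows the kind of "friends of friends" traversal that a graph database is designed to answer efficiently.

```python
from collections import deque

# Users have friends: an undirected graph stored as adjacency sets.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice"},
    "dave": {"bob"},
}


def within_hops(graph, start, max_hops):
    """Return every user reachable from start in at most max_hops steps."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        user, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in graph.get(user, ()):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    seen.discard(start)
    return seen
```

Here `within_hops(friends, "alice", 1)` gives Alice's direct friends, and raising the limit to 2 also includes friends of friends.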
Amazon QLDB
● QLDB stands for “Quantum Ledger Database”
● A ledger is a book recording financial transactions
● Fully managed, serverless, highly available, replication across 3 AZs
● Used to review history of all the changes made to your application data over time
● Immutable system: no entry can be removed or modified, cryptographically verifiable
● 2-3x better performance than common ledger blockchain frameworks, manipulate data using SQL
● Difference with Amazon Managed Blockchain: no decentralization component, in accordance with financial regulation rules
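"Immutable and cryptographically verifiable" can be illustrated with a simple hash chain, where each entry stores the hash of the previous one. The sketch below is a generic illustration in plain Python and does not describe QLDB's internal design; the function names, the entry layout and the all-zero `GENESIS` value are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(previous_hash, data):
    """Hash the entry's data together with the previous entry's hash."""
    payload = json.dumps({"prev": previous_hash, "data": data}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(journal, data):
    """Add an entry that is chained to the current tail of the journal."""
    prev = journal[-1]["hash"] if journal else GENESIS
    journal.append({"data": data, "prev": prev, "hash": entry_hash(prev, data)})


def verify(journal):
    """Recompute every hash; any edited or removed entry breaks the chain."""
    prev = GENESIS
    for entry in journal:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["data"]):
            return False
        prev = entry["hash"]
    return True
```

If someone changes the data of an earlier entry, its recomputed hash no longer matches the stored one, so `verify` returns `False` and the tampering is detectable.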
Amazon Managed Blockchain
● Blockchain makes it possible to build applications where multiple parties can
execute transactions without the need for a trusted, central authority.
● Amazon Managed Blockchain is a managed service to:
○ Join public blockchain networks

○ Or create your own scalable private network

● Compatible with the frameworks Hyperledger Fabric & Ethereum

AWS Glue
● Managed extract, transform, and load (ETL) service
● Useful to prepare and transform data for analytics
● Fully serverless service
● AWS Glue is a serverless data integration service that makes it easier to discover, prepare, move, and integrate data from multiple sources for analytics, machine learning (ML), and application development.
● AWS Glue can run your extract, transform, and load (ETL) jobs as new data arrives.
● For example, you can configure AWS Glue to initiate your ETL jobs to run as soon as new data becomes available in Amazon Simple Storage Service (S3).
DMS – Database Migration Service
● Quickly and securely migrate databases to AWS, resilient, self-healing
● The source database remains available during the migration
● Supports:
○ Homogeneous migrations: ex Oracle to Oracle
○ Heterogeneous migrations: ex Microsoft SQL Server to Aurora
Databases & Analytics Summary in AWS
● Relational Databases - OLTP: RDS & Aurora (SQL)
● Differences between Multi-AZ, Read Replicas, Multi-Region
● In-memory Database: ElastiCache
● Key/Value Database: DynamoDB (serverless) & DAX (cache for DynamoDB)
● Warehouse - OLAP: Redshift (SQL)
● Hadoop Cluster: EMR
● Athena: query data on Amazon S3 (serverless & SQL)
● QuickSight: dashboards on your data (serverless)
● DocumentDB: “Aurora for MongoDB” (JSON – NoSQL database)
● Amazon QLDB: Financial Transactions Ledger (immutable journal, cryptographically verifiable)
● Amazon Managed Blockchain: managed Hyperledger Fabric & Ethereum blockchains
● Glue: Managed ETL (Extract Transform Load) and Data Catalog service
● Database Migration: DMS
● Neptune: graph database
