Migrate Your On-Premise Data Warehouse To Amazon Redshift: Noman Jaffery


AWS Data, Databases, and Analytics Online Series

Migrate your on-premise data warehouse to Amazon Redshift

Noman Jaffery
Specialist Data Warehouse Solutions Architect, AWS

© 2020, Amazon Web Services, Inc. or its Affiliates.


Data warehousing trends

• Exponential growth of event data
• End-to-end insights from analyzing all your data
• Migrations to the cloud



Benefits of a cloud data warehouse

• Get insights from all your data
• Scale, elasticity, and flexibility
• Increases in productivity
• No infrastructure costs & pay-as-you-go



Customers moving to data lake architectures

• Data warehouse (business data)
• Data lake (event data)

Amazon Redshift enables you to have a lake house approach



Amazon Redshift benefits
Tens of thousands of customers use Redshift & process over 2 EB of data per day

• Data lake & AWS integrated: AWS Lake Formation catalogue & security, Exabyte querying, AWS integrated (e.g., AWS Database Migration Service, Amazon CloudWatch)
• Best performance: Up to 3x faster than other cloud data warehouses
• Best value: Up to 75% less than other cloud data warehouses and predictable costs
• Most scalable: Virtually unlimited elastic linear scaling
• Most secure & compliant: AWS-grade security, e.g., VPC, encryption with AWS Key Management Service (KMS), AWS CloudTrail, certifications such as SOC, PCI DSS, ISO, FedRAMP, HIPAA
• Easy to manage: Easy to provision & manage, automated backups, AWS support, 99.9% SLAs



Fannie Mae migrated to Amazon Redshift to save cost & maximize performance

Challenge
Fannie Mae wanted to modernize their data warehouse and migrate from on-premises DW (Netezza). They needed to reduce TCO with a cloud-based data warehouse that meets security and compliance requirements, and can scale out/scale in seamlessly based on usage without compromising on performance at large scale.

Solution
Fannie Mae migrated over 600 TB of uncompressed data to Amazon Redshift. Fannie Mae embraced the data lake architecture with Amazon Redshift for analytics and are leveraging Amazon Redshift Spectrum to access data from Amazon Simple Storage Service (Amazon S3) and easily share data across teams. They use security features like KMS, and Concurrency Scaling to handle user and application growth.

Benefits
The team is now able to spend more time on strategic data work, and less time maintaining the data warehouse. With Concurrency Scaling, they achieved similar or better performance with 50% of the compute resources.



Features delivered to meet customer needs
200+ new features in the past 18 months

• Large # of tables support (~20000)
• Copy command support for ORC, Parquet
• Robust result set caching
• IAM role chaining
• Elastic resize
• Groups
• Amazon Redshift Spectrum: date formats, scalar json and ION file formats support, region expansion, predicate filtering
• Auto analyze
• Health and performance monitoring w/Amazon CloudWatch
• Automatic table distribution style
• CloudWatch support for WLM queues
• Performance enhancements: hash join, vacuum, window functions, resize ops, aggregations, console, union all, efficient compile code cache
• ~25 Query Monitoring Rules (QMR) support
• Auto WLM
• AQUA (Advanced Query Accelerator)
• Resiliency of ROLLBACK processing
• Concurrency Scaling
• DC1 migration to DC2
• Auto analyze for incremental changes on table
• Manage multi-part query in AWS console
• Spectrum Request Accelerator
• Apply new distribution key
• Amazon Redshift Spectrum: Row group filtering in Parquet and ORC, Nested data support, Enhanced VPC Routing, Multiple partitions
• Performance: Bloom filters in joins, complex queries
• Faster Classic resize with optimized data transfer protocol
• Amazon Redshift Spectrum: Concurrency scaling
• AWS Lake Formation integration
• Auto-Vacuum sort, Auto-Analyze and Auto Table Sort
• Auto WLM with query priorities
• Snapshot scheduler
• Performance: join pushdowns to subquery, mixed workloads temporary tables, rank functions, null handling in join, single row insert
• Advisor recommendations for distribution keys
• AZ64 compression encoding
• Console redesign
• Stored procedures
• Spatial Processing
• Performance of Inter-Region Snapshot Transfers
• Federated Query
• Column level access control with AWS Lake Formation
• Materialized Views
• Pause and Resume
• RA3



Ready for the benefits of a cloud data warehouse?
Next steps

• ANALYZE AND PLAN: Preparation
• WORKSHOP AND PILOT: Proof of Value and Proof of Concept
• WORKLOAD MIGRATIONS: Migration

AWS SUPPORT THROUGHOUT THE JOURNEY



What to expect during the migration journey

• ANALYZE AND PLAN (2 WEEKS): Preparation
• WORKSHOP AND PILOT (6 WEEKS): Proof of Value and Proof of Concept
• WORKLOAD MIGRATIONS (8+ WEEKS): Migration

Activities along the journey: Discovery & Planning; Migration Business Case; Migration Expertise; Migration Plan; Skills/CoE; Create new target; Modify/Develop ETL to dual target; Modify/Develop BI and apps to dual target; Migrate; Integrate; Test; Optimize; Monitor; Realize cloud benefits

AWS SUPPORT THROUGHOUT THE JOURNEY



DMS and SCT

• AWS Database Migration Service (DMS) easily and securely migrates and/or replicates your databases and data warehouses to AWS
• AWS Schema Conversion Tool (SCT) converts your commercial database and data warehouse schemas to open-source engines or AWS-native services, such as Amazon Aurora and Amazon Redshift



AWS SCT data extractors
Extract data from your data warehouse and migrate to Amazon Redshift
• Extracts data through local migration agents
• Data is optimized for Amazon Redshift and saved in local files
• Files are loaded to an Amazon S3 bucket (through network or AWS Snowball Edge) and
then to Amazon Redshift
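
The last step in the bullets above, loading staged files from Amazon S3 into Amazon Redshift, comes down to a Redshift COPY statement. The following is a minimal sketch that only builds the SQL text; it does not connect to anything. The table name, bucket prefix, and IAM role ARN are hypothetical placeholders, and gzip-compressed CSV is an assumed file format, not something the slides state. SCT can run the load itself, so this only illustrates what that step amounts to.

```python
def build_copy_statement(table: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Build a Redshift COPY statement for files staged under an S3 prefix.

    Assumes the staged files are gzip-compressed CSV (an assumption made
    for this sketch, not a documented SCT output format).
    """
    if not s3_prefix.startswith("s3://"):
        raise ValueError("s3_prefix must start with 's3://'")
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV\n"
        "GZIP;"
    )


if __name__ == "__main__":
    # All three values below are hypothetical examples.
    print(
        build_copy_statement(
            table="sales.orders",
            s3_prefix="s3://example-bucket/extract/orders/",
            iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole",
        )
    )
```

The generated statement would be executed against the target cluster with whatever SQL client you already use; the IAM role must be attached to the cluster and allowed to read the bucket.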

Source DW (e.g., Microsoft SQL Server, Netezza) → AWS Schema Conversion Tool → Amazon S3 bucket → Amazon Redshift



DMS versus SCT extractors
For DW migrations, when to use DMS vs. SCT extractors?

Depends on customer use case:


                 Sources   Full load   CDC   Managed Service   Configuration Options
DMS              Many      Yes         Yes   Yes               Some
SCT Extractors   DW only   Yes         No    No                Many



Supported Sources

• DMS only: Amazon S3, PostgreSQL, Azure SQL, MySQL, DB2 LUW, SAP ASE, Mongo
• DMS and SCT extractors: Oracle, SQL Server
• SCT extractors only: Teradata, Netezza, Greenplum, Vertica



Oracle and SQL Server Considerations
Guidelines

• If the use case requires CDC then use DMS

• If the source data volume is large (> 10TB) then use SCT extractors

• If the network is slow or unreliable use SCT extractors + AWS Snowball
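
The three guidelines above can be sketched as a small decision helper. This is an illustrative sketch, not an AWS tool: the `Workload` fields and the `recommend_tools` function are hypothetical names. Two points are assumptions of this sketch, not statements from the slides: that the rules may combine (for example DMS for CDC alongside SCT extractors for a large initial load), and that either tool is acceptable when no rule applies.

```python
from dataclasses import dataclass
from typing import List

LARGE_VOLUME_TB = 10  # threshold taken from the guideline "> 10TB"


@dataclass
class Workload:
    needs_cdc: bool         # ongoing change data capture required?
    volume_tb: float        # source data volume in terabytes
    network_reliable: bool  # False if the link is slow or unreliable


def recommend_tools(workload: Workload) -> List[str]:
    """Apply the three guidelines to one Oracle or SQL Server source."""
    tools: List[str] = []
    if workload.needs_cdc:
        tools.append("DMS")
    if not workload.network_reliable:
        tools.append("SCT extractors + AWS Snowball")
    elif workload.volume_tb > LARGE_VOLUME_TB:
        tools.append("SCT extractors")
    if not tools:
        # Assumption: no guideline applies, so either tool is a candidate.
        tools.append("DMS or SCT extractors")
    return tools


if __name__ == "__main__":
    example = Workload(needs_cdc=True, volume_tb=50, network_reliable=True)
    print(recommend_tools(example))
```

For a 50 TB source that needs CDC over a reliable network, the helper returns both "DMS" and "SCT extractors", reflecting the combined-use assumption stated above.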



AWS Training and Certification

• Training for the Whole Team: Explore tailored Data or Database learning paths for customers and partners
• Flexibility to Learn Your Way: Build cloud skills with free digital Data training courses such as “The elements of Data Science”, or dive deep with classroom training
• Validate Skills with AWS Certification: Demonstrate expertise with an industry-recognized credential (Data analytics and Database Specialty AWS Certifications)

aws.amazon.com/training/



Visit the Data, Databases, and Analytics
Resource Hub for more resources
Dive deeper with these newly created whitepapers and e-books
to help you uncover new insights and value from your data.

• An introduction to cloud databases
• Enter the purpose-built database era
• Harness the power of data
• Creating a modern analytics architecture
• The data-driven enterprise
• … and more!

https://tinyurl.com/aws-data-databases-analytics

Visit resource hub »



Thank you for attending
AWS Data, Databases, and Analytics Online Series
We hope you found it interesting! A kind reminder to complete the survey.
Let us know what you thought of today’s event and how we can improve the event
experience for you in the future.

twitter.com/AWSCloud

facebook.com/AmazonWebServices
youtube.com/user/AmazonWebServices

slideshare.net/AmazonWebServices
twitch.tv/aws



Thank you!

