AWS Data Engineering Cheatsheet


Nataindata
May 20, 2024 • 11 min read

Hello dears, here you can find cheat sheets for the most commonly used AWS services in data engineering:

Amazon Redshift Cheat Sheet

Amazon S3 Cheat Sheet

Amazon Athena Cheat Sheet

Amazon Kinesis Cheat Sheet

Amazon Redshift Cheat Sheet


Overview
Amazon Redshift is a fully managed, petabyte-scale data warehouse service
that extends data warehouse queries to your data lake. It allows you to run
analytic queries against petabytes of data stored locally in Redshift and
directly against exabytes of data stored in S3. Redshift is designed for OLAP
(Online Analytical Processing).

Redshift historically supported only Single-AZ deployments; Multi-AZ deployments are now available for RA3 clusters.

Features
Columnar Storage: Redshift uses columnar storage, data
compression, and zone maps to minimize the amount of I/O needed
for queries.

Parallel Processing: It utilizes a massively parallel processing (MPP) data warehouse architecture to distribute SQL operations across multiple nodes.

Machine Learning: Redshift leverages machine learning to optimize throughput based on workloads.

Result Caching: Provides sub-second response times for repeat queries.

Automated Backups: Redshift continuously backs up your data to S3 and can replicate snapshots to another region for disaster recovery.

Components
Cluster: Comprises a leader node and one or more compute nodes.
A database is created upon provisioning a cluster for loading data
and running queries.

Scaling: Clusters can be scaled in/out by adding/removing nodes and scaled up/down by changing node types.

Maintenance Window: Redshift assigns a 30-minute maintenance window randomly within an 8-hour block per region each week. During this time, clusters are unavailable.

Deployment Platforms: Supports both EC2-VPC and EC2-Classic platforms for launching clusters.

Redshift Nodes
Leader Node: Manages client connections, parses queries, and
coordinates execution plans with compute nodes.

Compute Nodes: Execute query plans, exchange data, and send intermediate results to the leader node for aggregation.

Node Types
Dense Storage (DS): For large data workloads using HDD storage.

Dense Compute (DC): Optimized for performance-intensive workloads using SSD storage.

RA3: Current-generation nodes that decouple compute from managed storage; required for features such as data sharing.

Parameter Groups
Parameter groups apply to all databases within a cluster. The default
parameter group has preset values and cannot be modified.
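
As a sketch, you can create and tune a custom parameter group via the CLI; the group name is a placeholder, and enable_user_activity_logging is one real parameter you might set:

aws redshift create-cluster-parameter-group \
--parameter-group-name my-custom-params \
--parameter-group-family redshift-1.0 \
--description "Custom cluster parameters"

aws redshift modify-cluster-parameter-group \
--parameter-group-name my-custom-params \
--parameters ParameterName=enable_user_activity_logging,ParameterValue=true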

Database Querying Options

Query Editor: Use the AWS Management Console to connect to your cluster and run queries.

SQL Client Tools: Connect via standard ODBC and JDBC connections.

Enhanced VPC Routing: Manages data flow between your cluster and other resources using VPC features.

Redshift Spectrum
Query Exabytes of Data: Run queries against data in S3 without
loading or transforming it.

Columnar Format: Scans only the needed columns for your query,
reducing data processing.

Compression Algorithms: Scans less data when data is compressed with supported algorithms.
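
A minimal Spectrum sketch (the schema name, table, IAM role, and S3 path are placeholders): register an external schema backed by the Glue Data Catalog, define an external table over S3, and query the data in place:

CREATE EXTERNAL SCHEMA spectrum_schema
FROM DATA CATALOG
DATABASE 'mydb'
IAM_ROLE 'arn:aws:iam::123456789012:role/MySpectrumRole'
CREATE EXTERNAL DATABASE IF NOT EXISTS;

CREATE EXTERNAL TABLE spectrum_schema.sales (
id INT,
amount DECIMAL(10,2)
)
STORED AS PARQUET
LOCATION 's3://my-bucket/sales/';

SELECT COUNT(*) FROM spectrum_schema.sales;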

Redshift Streaming Ingestion


https://www.nataindata.com/blog/aws-data-engineering-cheat-sheet/ 4/27
2/22/25, 7:28 PM AWS Data Engineering Cheatsheet

Streaming Data: Consume and process data directly from streaming sources like Amazon Kinesis Data Streams and Amazon Managed Streaming for Apache Kafka (MSK).

Low Latency: Provides high-speed ingestion without staging data in S3.
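
As a hedged sketch (stream name, IAM role, and view name are placeholders): map the Kinesis stream as an external schema, then build an auto-refreshing materialized view over it:

CREATE EXTERNAL SCHEMA kinesis_schema
FROM KINESIS
IAM_ROLE 'arn:aws:iam::123456789012:role/MyStreamingRole';

CREATE MATERIALIZED VIEW clicks_mv AUTO REFRESH YES AS
SELECT approximate_arrival_timestamp,
       JSON_PARSE(kinesis_data) AS payload
FROM kinesis_schema."my-stream";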

Redshift ML
Machine Learning: Train and deploy machine learning models using
SQL commands within Redshift.

In-Database Inference: Perform in-database predictions without moving data.

SageMaker Integration: Utilizes Amazon SageMaker Autopilot to find the best model for your data.
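
For illustration, training and inference are both plain SQL; the table, columns, role, and bucket below are assumptions:

CREATE MODEL churn_model
FROM (SELECT age, monthly_spend, churned FROM customer_history)
TARGET churned
FUNCTION predict_churn
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftMLRole'
SETTINGS (S3_BUCKET 'my-ml-artifacts');

-- Once trained, run in-database predictions with the generated function:
SELECT predict_churn(age, monthly_spend) FROM new_customers;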

Redshift Data Sharing


Live Data Sharing: Securely share live data across Redshift clusters
within an AWS account without copying data.

Up-to-Date Information: Users always access the most current data in the warehouse.

No Additional Cost: Available on Redshift RA3 clusters without extra charges.
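
A minimal data sharing sketch across producer and consumer clusters (the share name and namespace GUIDs are placeholders):

-- On the producer cluster:
CREATE DATASHARE sales_share;
ALTER DATASHARE sales_share ADD SCHEMA public;
ALTER DATASHARE sales_share ADD TABLE public.sales;
GRANT USAGE ON DATASHARE sales_share TO NAMESPACE 'consumer-namespace-guid';

-- On the consumer cluster:
CREATE DATABASE sales_db FROM DATASHARE sales_share OF NAMESPACE 'producer-namespace-guid';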

Redshift Cross-Database Query

https://www.nataindata.com/blog/aws-data-engineering-cheat-sheet/ 5/27
2/22/25, 7:28 PM AWS Data Engineering Cheatsheet

Query Across Databases: Allows querying across different databases within a Redshift cluster, regardless of the database you are connected to. This feature is available on Redshift RA3 node types at no extra cost.
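
Cross-database queries use three-part database.schema.table notation; for example (database and table names are illustrative):

SELECT c.name, o.total
FROM salesdb.public.orders o
JOIN customersdb.public.customers c ON o.customer_id = c.id;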

Cluster Snapshots
Types: There are two types of snapshots, automated and manual,
stored in S3 using SSL.

Automated Snapshots: Taken every 8 hours or every 5 GB per node of data change, and enabled by default. They are deleted at the end of a one-day retention period, which can be modified.

Manual Snapshots: Retained indefinitely unless manually deleted. Can be shared with other AWS accounts.

Cross-Region Snapshots: Snapshots can be copied to another AWS Region for disaster recovery, with a default retention period of seven days.
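
Cross-region copying is enabled per cluster; a sketch (the destination region is illustrative):

aws redshift enable-snapshot-copy \
--cluster-identifier my-redshift-cluster \
--destination-region us-east-1 \
--retention-period 7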

Monitoring
Audit Logging: Tracks authentication attempts, connections, disconnections, user definition changes, and queries. Logs are stored in S3 (see the CLI example after this list).

Event Tracking: Redshift retains information about events for several weeks.

Performance Metrics: Uses CloudWatch to monitor physical aspects like CPU utilization, latency, and throughput.

Query/Load Performance Data: Helps monitor database activity and performance.

CloudWatch Alarms: Optionally configured to monitor disk space usage across cluster nodes.
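
A sketch of enabling audit logging to S3 (the bucket name and prefix are placeholders):

aws redshift enable-logging \
--cluster-identifier my-redshift-cluster \
--bucket-name my-audit-log-bucket \
--s3-key-prefix redshift-logs/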

Security
Access Control: By default, only the AWS account that creates the
cluster can access it.

IAM Integration: Create user accounts and manage permissions using IAM.

Security Groups: Use Redshift security groups for EC2-Classic platforms and VPC security groups for EC2-VPC platforms.

Encryption: Optionally encrypt clusters upon provisioning. Snapshots of encrypted clusters are also encrypted.

Pricing
Billing: Pay per second based on the type and number of nodes in
your cluster.

Spectrum Scanning: Pay for the number of bytes scanned by Redshift Spectrum.

Reserved Instances: Save costs by committing to 1- or 3-year terms.

Cluster Management


Creating a Cluster

aws redshift create-cluster \
--cluster-identifier my-redshift-cluster \
--node-type dc2.large \
--master-username masteruser \
--master-user-password masterpassword \
--cluster-type multi-node \
--number-of-nodes 2

Deleting a Cluster

aws redshift delete-cluster \
--cluster-identifier my-redshift-cluster \
--skip-final-cluster-snapshot

Describing a Cluster
aws redshift describe-clusters \
--cluster-identifier my-redshift-cluster

Database Management
Connecting to the Database
Use a PostgreSQL-compatible tool such as psql or a SQL client:

psql -h my-cluster.cduijjmc4xkx.us-west-2.redshift.amazonaws.com -p 5439 -U masteruser -d dev

Creating a Database

CREATE DATABASE mydb;

Dropping a Database

DROP DATABASE mydb;

User Management
Creating a User
CREATE USER myuser WITH PASSWORD 'mypassword';

Dropping a User

DROP USER myuser;

Granting Permissions

GRANT ALL PRIVILEGES ON DATABASE mydb TO myuser;

Revoking Permissions

REVOKE ALL PRIVILEGES ON DATABASE mydb FROM myuser;

Table Management


Creating a Table

CREATE TABLE mytable (
id INT PRIMARY KEY,
name VARCHAR(50),
age INT
);

Dropping a Table

DROP TABLE mytable;

Inserting Data

INSERT INTO mytable (id, name, age) VALUES (1, 'John Doe', 30);

Updating Data

UPDATE mytable SET age = 31 WHERE id = 1;

Deleting Data

DELETE FROM mytable WHERE id = 1;

Querying Data

SELECT * FROM mytable;


Performance Tuning
Analyzing a Table

ANALYZE mytable;

Vacuuming a Table

VACUUM mytable;

Redshift Distribution Styles

AUTO: Redshift assigns and adjusts the distribution style automatically (the default).

KEY: Distributes rows based on the values in one column.

EVEN: Distributes rows evenly across all nodes.

ALL: Copies the entire table to each node.

Example: Creating a Table with Distribution Key

CREATE TABLE mytable (
id INT,
name VARCHAR(50),
age INT
)
DISTSTYLE KEY
DISTKEY(id);

Backup and Restore


Creating a Snapshot


aws redshift create-cluster-snapshot \
--snapshot-identifier my-snapshot \
--cluster-identifier my-redshift-cluster

Restoring from a Snapshot

aws redshift restore-from-cluster-snapshot \
--snapshot-identifier my-snapshot \
--cluster-identifier my-new-cluster

Security
Enabling SSL
In psql or your SQL client, use the sslmode parameter:

psql "host=my-cluster.cduijjmc4xkx.us-west-2.redshift.amazonaws.com dbname=dev

Managing Cluster Security Groups

aws redshift create-cluster-security-group \
--cluster-security-group-name my-security-group \
--description "Redshift cluster security group"

aws redshift authorize-cluster-security-group-ingress \
--cluster-security-group-name my-security-group \
--cidrip 10.0.0.0/24

Maintenance
Resizing a Cluster

aws redshift modify-cluster \
--cluster-identifier my-redshift-cluster \
--node-type dc2.large \
--number-of-nodes 4

Monitoring Cluster Performance


Use Amazon CloudWatch to monitor:

CPU Utilization

Database Connections

Read/Write IOPS

Network Traffic

Viewing Cluster Events

aws redshift describe-events \
--source-identifier my-redshift-cluster \
--source-type cluster

Amazon S3 Cheat Sheet


Overview
Amazon S3 (Simple Storage Service) stores data as objects within buckets.
Each object includes a file and optional metadata that describes the file. A
key is a unique identifier for an object within a bucket, and storage capacity
is virtually unlimited.


Buckets
Access Control: For each bucket, you can control access, create,
delete, and list objects, view access logs, and choose the
geographical region for storage.

Naming: Bucket names must be unique DNS-compliant names across all existing S3 buckets. Once created, the name cannot be changed and is visible in the URL pointing to the objects in the bucket.

Limits: By default, you can create up to 100 buckets per AWS account. The region of a bucket cannot be changed after creation.

Static Website Hosting: Buckets can be configured to host static websites.

Deletion Restrictions: Buckets with 100,000 or more objects cannot be deleted via the S3 console. Buckets with versioning enabled cannot be deleted via the AWS CLI.
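
A few everyday bucket operations as a sketch (the bucket name is a placeholder and must be globally unique):

aws s3 mb s3://my-unique-bucket-name --region us-west-2
aws s3 cp data.csv s3://my-unique-bucket-name/raw/data.csv
aws s3 ls s3://my-unique-bucket-name/raw/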

Data Consistency Model


Read-After-Write Consistency: For PUTs of new objects in all regions.

Strong Consistency: For read-after-write HEAD or GET requests, overwrite PUTs, and DELETEs in all regions.

Eventual Consistency: For listing all buckets after deletion and for enabling versioning on a bucket for the first time.

https://www.nataindata.com/blog/aws-data-engineering-cheat-sheet/ 14/27
2/22/25, 7:28 PM AWS Data Engineering Cheatsheet

Storage Classes
Frequently Accessed Objects
S3 Standard: General-purpose storage for frequently accessed
data.

S3 Express One Zone: High-performance, single-AZ storage class for latency-sensitive applications, offering improved access speeds and reduced request costs compared to S3 Standard.

Infrequently Accessed Objects

S3 Standard-IA: For long-lived but less frequently accessed data, with redundant storage across multiple AZs.

S3 One Zone-IA: Less expensive, stores data in one AZ, and is not resilient to AZ loss. Suitable for objects over 128 KB stored for at least 30 days.

Amazon S3 Intelligent-Tiering
Automatic Cost Optimization: Moves data between frequent and
infrequent access tiers based on access patterns.

Monitoring: Moves objects to the infrequent access tier after 30 days without access, and to archive tiers after 90 and 180 days without access.

No Retrieval Fees: Optimizes costs without performance impact.
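
Transitions between storage classes can also be driven by lifecycle rules; a sketch (bucket name, prefix, and timings are illustrative):

aws s3api put-bucket-lifecycle-configuration \
--bucket my-unique-bucket-name \
--lifecycle-configuration '{
  "Rules": [{
    "ID": "archive-old-data",
    "Status": "Enabled",
    "Filter": {"Prefix": "raw/"},
    "Transitions": [
      {"Days": 30, "StorageClass": "STANDARD_IA"},
      {"Days": 90, "StorageClass": "GLACIER"}
    ]
  }]
}'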

S3 Glacier
Long-Term Archive: Provides storage classes like Glacier Instant Retrieval, Glacier Flexible Retrieval, and Glacier Deep Archive for long-term archiving.

Access: Archived objects must be restored before access and are only visible through S3.

Retrieval Options
Expedited: Access data within 1-5 minutes for urgent requests.

Standard: Default option, typically completes within 3-5 hours.

Bulk: Lowest-cost option for retrieving large amounts of data, typically completes within 5-12 hours.
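
A sketch of initiating a restore (bucket, key, and tier are illustrative):

aws s3api restore-object \
--bucket my-unique-bucket-name \
--key archive/data.csv \
--restore-request '{"Days": 7, "GlacierJobParameters": {"Tier": "Expedited"}}'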

Additional Information
Object Storage: For S3 Standard, Standard-IA, and Glacier classes,
objects are stored across multiple devices in at least three AZs.

Amazon Athena Cheat Sheet

Overview
Amazon Athena is an interactive query service that allows you to analyze
data directly in Amazon S3 and other data sources using SQL. It is serverless and uses Presto, an open-source, distributed SQL query engine optimized for low-latency, ad hoc analysis.

Features
Serverless: No infrastructure to manage.

Built-in Query Editor: Allows you to write and execute queries directly in the Athena console.

Wide Data Format Support: Supports formats such as CSV, JSON, ORC, Avro, and Parquet.

Parallel Query Execution: Executes queries in parallel to provide fast results, even for large datasets.

Amazon S3 Integration: Uses S3 as the underlying data store, ensuring high availability and durability.

Data Visualization: Integrates with Amazon QuickSight.

AWS Glue Integration: Works seamlessly with AWS Glue for data cataloging.

Managed Data Catalog: Stores metadata and schemas for your S3-stored data.

Queries
Geospatial Data: You can query geospatial data.

Log Data: Supports querying various log types.

Query Results: Results are stored in S3.

Query History: Retains history for 45 days.


User-Defined Functions (UDFs): Supports scalar UDFs, executed with AWS Lambda, to process records or groups of records.

Data Types: Supports both simple (e.g., INTEGER, DOUBLE, VARCHAR) and complex (e.g., MAP, ARRAY, STRUCT) data types.

Requester Pays Buckets: Supports querying data in S3 Requester Pays buckets.
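
Running a query from the CLI, as a sketch (the database, table, and results bucket are placeholders):

aws athena start-query-execution \
--query-string "SELECT * FROM mydb.mytable LIMIT 10" \
--work-group primary \
--result-configuration OutputLocation=s3://my-athena-results/

# Fetch results once execution completes; the ID comes from the previous call:
aws athena get-query-results --query-execution-id <query-execution-id>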

Athena Federated Queries


Data Connectors: Allows querying data sources beyond S3 using
data connectors implemented in Lambda functions via the Athena
Query Federation SDK.

Pre-built Connectors: Available for popular data sources like MySQL, PostgreSQL, Oracle, SQL Server, DynamoDB, MSK, Redshift, OpenSearch, CloudWatch Logs, CloudWatch metrics, and DocumentDB.

Custom Connectors: You can write custom data connectors or customize pre-built ones using the Athena Query Federation SDK.

Optimizing Query Performance

Data Partitioning: Partitioning data by column values (e.g., date, country, region) reduces the amount of data scanned by a query (see the example after this list).

Columnar Formats: Converting data to columnar formats like Parquet and ORC improves performance.

File Compression: Compressing files reduces the amount of data scanned.

Splittable Files: Using splittable files allows Athena to read them in parallel, speeding up query completion. Formats like Avro, Parquet, and ORC are splittable regardless of the compression codec; among text files, only those compressed with BZIP2 or LZO are splittable.
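
An illustrative sketch combining partitioning with a columnar format (the database, table, columns, and S3 path are assumptions):

CREATE EXTERNAL TABLE mydb.events (
user_id STRING,
action STRING
)
PARTITIONED BY (event_date STRING)
STORED AS PARQUET
LOCATION 's3://my-bucket/events/';

-- Load partition metadata, then prune by partition in queries:
MSCK REPAIR TABLE mydb.events;
SELECT COUNT(*) FROM mydb.events WHERE event_date = '2024-05-01';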

Cost Controls
Workgroups: Isolate queries by teams, applications, or workloads
and enforce cost controls.

Per-Query Limit: Sets a threshold for the total amount of data scanned per query, canceling any query that exceeds this limit.

Per-Workgroup Limit: Limits the total amount of data scanned by all queries within a specified timeframe, with multiple limits based on hourly or daily data scan totals.
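
A sketch of a workgroup with a per-query scan limit (names are placeholders; 1073741824 bytes is 1 GB):

aws athena create-work-group \
--name analytics-team \
--configuration '{
  "ResultConfiguration": {"OutputLocation": "s3://my-athena-results/"},
  "BytesScannedCutoffPerQuery": 1073741824
}'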

Amazon Athena Security

Access Control: Use IAM policies, access control lists, and S3 bucket policies to control data access.

Encrypted Data: Queries can be performed directly on encrypted data in S3.

Amazon Athena Pricing

Pay Per Query: Charged based on the amount of data scanned by each query.

No Charge for Failed Queries: You are not charged for queries that fail.

Cost Savings: Compressing, partitioning, or converting data to columnar formats reduces the amount of data scanned, leading to cost savings and performance gains.

Amazon Kinesis Cheat Sheet


Overview
Amazon Kinesis makes it easy to collect, process, and analyze real-time
streaming data. It can ingest real-time data such as video, audio, application
logs, website clickstreams, and IoT telemetry data for machine learning,
analytics, and other applications.

Kinesis Video Streams


A fully managed service for streaming live video from devices to the AWS
Cloud or building applications for real-time video processing or batch-
oriented video analytics.

Benefits
Device Connectivity: Connect and stream from millions of devices.

Custom Retention Periods: Configure video streams to durably store media data for custom retention periods, generating an index based on timestamps.

Serverless: No infrastructure setup or management required.

Security: Enforces TLS-based encryption for data streaming and encrypts all data at rest using AWS KMS.

Components
Producer: Source that puts data into a Kinesis video stream.

Kinesis Video Stream: Enables the transportation, optional storage, and real-time or batch consumption of live video data.

Consumer: Retrieves data from a Kinesis video stream to view, process, or analyze it.

Fragment: A self-contained sequence of frames with no dependencies on other fragments.

Video Playbacks
HLS (HTTP Live Streaming): For live playback.

GetMedia API: For building custom applications to process video streams in real time with low latency.

Metadata
Nonpersistent Metadata: Ad hoc metadata for specific fragments.

Persistent Metadata: Metadata for consecutive fragments.

Pricing
Pay for the volume of data ingested, stored, and consumed.


Kinesis Data Stream


A scalable, durable data ingestion and processing service optimized for
streaming data.

Components
Data Producer: Application emitting data records to a Kinesis data
stream, assigning partition keys to records.

Data Consumer: Application or AWS service retrieving data from all shards in a stream for real-time analytics or processing.

Data Stream: A logical grouping of shards retaining data for 24 hours, or up to 7 days with extended retention.

Shard: The base throughput unit, ingesting up to 1,000 records or 1 MB per second. Provides ordered records by arrival time.

Data Record
Record: Unit of data in a stream with a sequence number, partition
key, and data blob (max 1 MB).

Partition Key: Identifier (e.g., user ID, timestamp) used to route records to shards (see the CLI example below).

Sequence Number
Unique identifier for each data record, assigned by Kinesis when
data is added.
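
Putting a record from the CLI, as a sketch (stream name, partition key, and payload are placeholders):

aws kinesis put-record \
--stream-name my-stream \
--partition-key user-42 \
--data '{"event": "click"}' \
--cli-binary-format raw-in-base64-out

# The last flag tells AWS CLI v2 to base64-encode the raw payload for you.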


Monitoring
Monitor shard-level metrics using CloudWatch, Kinesis Agent, and
Kinesis libraries. Log API calls with CloudTrail.

Security
Automatically encrypt sensitive data with AWS KMS.

Use IAM for access control and VPC endpoints to keep traffic within
the Amazon network.

Pricing
Charged per shard hour, PUT Payload Unit, and enhanced fan-out
usage. Extended data retention incurs additional charges.

Kinesis Data Firehose


The easiest way to load streaming data into data stores and analytics tools.

Features
Scalable: Automatically scales to match data throughput.

Data Transformation: Can batch, compress, and encrypt data before loading it.

Destination Support: Captures, transforms, and loads data into S3, Redshift, Elasticsearch, HTTP endpoints, and service providers like Datadog, New Relic, MongoDB, and Splunk.


Batch Size and Interval: Control data upload frequency and size.

Data Delivery and Transformation

Lambda Integration: Transforms incoming data before delivery.

Format Conversion: Converts JSON to Parquet or ORC for storage in S3.

Buffer Configuration: Controls data buffering before delivery to destinations.
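
Sending a record to a delivery stream, as a sketch (the stream name is a placeholder; the Data blob must be base64-encoded):

aws firehose put-record \
--delivery-stream-name my-delivery-stream \
--record '{"Data": "eyJldmVudCI6ICJjbGljayJ9"}'

# "eyJldmVudCI6ICJjbGljayJ9" is base64 for {"event": "click"}.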

Pricing
Pay for the volume of data transmitted. Additional charges for data
format conversion.

Kinesis Data Analytics


Analyze streaming data, gain insights, and respond to business needs in real
time.

General Features
Serverless: Automatically manages infrastructure.

Scalable: Elastically scales to handle data volume.

Low Latency: Provides sub-second processing latencies.

SQL Features

Standard ANSI SQL: Integrates with Kinesis Data Streams and Firehose.

Input Types: Supports streaming and reference data sources.

Schema Editor: Recognizes standard formats like JSON and CSV.
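
A sketch in the SQL dialect, counting events per symbol in one-minute windows (the column names assume the demo ticker schema; "SOURCE_SQL_STREAM_001" is the default input stream name):

CREATE OR REPLACE STREAM "DESTINATION_SQL_STREAM" (
ticker_symbol VARCHAR(4),
ticker_count INTEGER
);

CREATE OR REPLACE PUMP "STREAM_PUMP" AS
INSERT INTO "DESTINATION_SQL_STREAM"
SELECT STREAM ticker_symbol, COUNT(*) AS ticker_count
FROM "SOURCE_SQL_STREAM_001"
GROUP BY ticker_symbol,
         STEP("SOURCE_SQL_STREAM_001".ROWTIME BY INTERVAL '60' SECOND);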

Java Features
Apache Flink: Uses open-source libraries for building streaming
applications.

State Management: Stores state in encrypted, incrementally saved running application storage.

Exactly-Once Processing: Ensures processed records affect results exactly once.

Components
Input: Streaming source for the application.

Application Code: SQL statements processing input data.

In-Application Streams: Stores data for processing.

Kinesis Processing Units (KPU): Provide memory, computing, and networking resources.

Pricing
Charged based on the number of KPUs used. Additional charges for
Java application orchestration and storage.
