Data Engineering by AWS
Internship by AWS
Data Warehousing
Centralized storage of data for analytical purposes, enabling
comprehensive insights and business intelligence.
Building Data Pipelines with
AWS Services
AWS Athena
A serverless query service enabling interactive analysis of
data stored in S3 using SQL, eliminating the need for
complex infrastructure.
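
As a rough illustration of querying S3 data with SQL, the sketch below submits a query through the Athena API and waits for it to finish. It assumes a boto3 Athena client is passed in (for example `boto3.client("athena")`). The function name `run_athena_query` and its arguments are hypothetical, as are any database, table, or bucket names a caller supplies.

```python
import time


def run_athena_query(athena, sql, database, output_location, poll_seconds=1.0):
    """Run a SQL query through Athena and return rows as lists of strings.

    `athena` is assumed to be a boto3 Athena client; `output_location` is an
    S3 URI where Athena writes its result files.
    """
    started = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Athena runs queries asynchronously, so poll until a terminal state.
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query {query_id} ended in state {state}")

    # Only the first page of results is fetched in this sketch.
    results = athena.get_query_results(QueryExecutionId=query_id)
    rows = results["ResultSet"]["Rows"]
    # For SELECT queries the first row holds the column names.
    return [[cell.get("VarCharValue") for cell in row["Data"]] for row in rows]
```

A caller might pass a query such as `SELECT status, COUNT(*) FROM logs GROUP BY status`, where `logs` stands in for a table already defined over files in S3.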
Leveraging Amazon S3 for Data
Storage
Object Storage
S3 offers secure and scalable object storage for a wide range of
data, from raw logs to processed files.
Data Durability
S3 is designed for high durability and availability: in the standard
storage classes, each object is stored redundantly across multiple
Availability Zones within a Region.
Data Access
S3 provides flexible data access through APIs and SDKs, allowing
seamless integration with various applications.
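
A minimal sketch of SDK-based access, assuming a boto3 S3 client is passed in (for example `boto3.client("s3")`). The helper names and the date-partitioned key layout are hypothetical, chosen only to show how raw logs might be written and read back.

```python
from datetime import date


def raw_log_key(source, day, filename):
    """Build a date-partitioned object key (a hypothetical layout)."""
    return f"raw/{source}/year={day:%Y}/month={day:%m}/day={day:%d}/{filename}"


def put_text(s3, bucket, key, text):
    """Upload a UTF-8 string as an S3 object."""
    s3.put_object(Bucket=bucket, Key=key, Body=text.encode("utf-8"))


def get_text(s3, bucket, key):
    """Download an S3 object and decode it as UTF-8."""
    response = s3.get_object(Bucket=bucket, Key=key)
    # The body is a streaming object; read() returns the full content as bytes.
    return response["Body"].read().decode("utf-8")
```

Keys with `year=`/`month=`/`day=` segments are one common convention that query engines can treat as partitions; other layouts work equally well with the same two API calls.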
Scaling with Amazon EMR and Apache Spark
Distributed Processing
Apache Spark enables distributed data processing, allowing parallel execution
of tasks for faster insights.
Data Analytics
Spark provides a powerful engine for data analytics, supporting
various data processing and machine learning tasks.
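
One way a Spark job might be handed to an EMR cluster is as a step that runs `spark-submit`. The sketch below assumes a boto3 EMR client is passed in (for example `boto3.client("emr")`) and that the cluster already exists. The function names, the step name, and the script location are hypothetical.

```python
def spark_step(name, script_uri, script_args=()):
    """Describe an EMR step that runs a PySpark script via spark-submit."""
    return {
        "Name": name,
        # Keep the cluster running if this step fails.
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            # command-runner.jar lets a step invoke commands such as spark-submit.
            "Jar": "command-runner.jar",
            "Args": [
                "spark-submit",
                "--deploy-mode", "cluster",
                script_uri,
                *script_args,
            ],
        },
    }


def submit_spark_step(emr, cluster_id, step):
    """Add the step to a running cluster and return the new step ID."""
    response = emr.add_job_flow_steps(JobFlowId=cluster_id, Steps=[step])
    return response["StepIds"][0]
```

Separating the step description from the submission call keeps the configuration easy to inspect before anything is sent to the cluster.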
Securing and Monitoring Data Workloads
Data Encryption
AWS offers encryption options for data at rest (for example, server-side encryption with AWS KMS keys) and in transit (TLS), protecting data confidentiality and integrity.
Access Control
IAM policies define which users and roles may perform which actions on which resources,
while security groups restrict network traffic, so only authorized users and systems can reach sensitive data.
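
The two controls above can be sketched together: an IAM policy document that grants read-only access to one S3 prefix, and an upload that requests server-side encryption with a KMS key. This is an illustrative sketch, not a complete security setup. The function names, bucket, prefix, and key ID are hypothetical, and `s3` is assumed to be a boto3 S3 client.

```python
import json


def read_only_prefix_policy(bucket, prefix):
    """Build an IAM policy document allowing read-only access to one prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                # Limit listing to keys under the given prefix.
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
            },
            {
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }


def put_encrypted(s3, bucket, key, body, kms_key_id):
    """Upload an object and ask S3 to encrypt it at rest with a KMS key."""
    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=body,
        ServerSideEncryption="aws:kms",
        SSEKMSKeyId=kms_key_id,
    )


def policy_json(bucket, prefix):
    """Serialize the policy as JSON, the form IAM expects for a policy document."""
    return json.dumps(read_only_prefix_policy(bucket, prefix), indent=2)
```

Granting only `s3:ListBucket` and `s3:GetObject` on a single prefix follows the least-privilege idea: the policy allows what a read-only consumer needs and nothing else.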
Practical Application
Implement data pipelines and analyze data using real-world case
studies, applying the knowledge gained during the internship.
Problem-Solving
Develop problem-solving skills by tackling real-world data challenges
and identifying solutions using AWS services.