An AWS Data Lake with S3 Explained!
By David Hundley | Towards Data Science
If you’ve had any connection to the data world, you’ve probably heard some memorable,
often quirky phrase about how valuable data is. I’m thinking of phrases like…
“Without big data, you are deaf and blind in the middle of a freeway.”
And truthfully, the hype is merited. (Except I’m not sure I’d agree with that one about
the bacon…) From an analytical perspective, data helps us to make informed
decisions about the next steps we should take in our businesses. This can manifest in
anything from tabular reports to data dashboards to this new thing getting a lot of hype
called machine learning. Where people in "ye olden days" were very much left in the
dark on how to make the best business decisions, we today have a LOT of data resources
at our disposal to help. Using my back-catalog of icons I’ve created for former blog posts
(😃), I’m not exaggerating when I say that all the things in the visual below can produce
valuable data.
So given that all this stuff is creating data, the next logical question is… how do we
make the best use of it?
This is where the concept of a data lake comes to mind. Put simply, a data lake is a
unified space to place all of your data — both structured and unstructured — to
build analytical solutions from. And because I’m a picture guy, here’s a simple picture
that illustrates that.
Of course, you want to manage this data lake to be sure it doesn’t become a data
dumping ground. Data governance is super important, so you’ll want to be sure you
manage things like metadata, data quality, and more with the stuff that you put into this
data lake. That’s sort of out of the scope for this post, but I would be doing you a
disservice if I didn’t at least mention it!
For this post in particular, I want to focus on what it means to build a data lake within
Amazon Web Services (AWS). With cloud solutions being all the rage these days, it
makes sense that people would want to build out their own data lake within AWS. More
specifically, it would make sense that people would want to use AWS’s Simple Storage
Service (S3) as the basis for the data lake.
Here’s the problem… no offense to AWS, but I don’t think they do a great job at
explaining how S3 differs from “old world” concepts. I hold four AWS certifications —
including the Big Data Specialty — and none of the study materials I came across while
preparing for them explained what I'm about to cover here particularly well. What I'm
going to share in this post will likely radically change your thinking
about how to properly design your AWS data lake on S3.
But BEFORE we get into that, let’s talk about those “old world” concepts…
So if you had data in one physical environment that had to be used for analytical
purposes in another physical environment, you probably had to copy that data over to
the new replica environment. Of course, you probably also kept a tie to the source
environment to ensure that the stuff in the replica environment stayed up-to-date.
That little image above represents copying data from one operational source to an
analytical replica. Of course, your operational source data most likely isn’t in one single
environment. It’s likely that you have tens — if not hundreds — of those operational
sources where you gather data. That’s a lot of data movement! But due to literal physical
limitations, that copy has to be done. The data can’t literally be in two places at the same
time, right?
Well, here's where things get different with AWS and their S3 buckets… If you've ever
tried to create a new S3 bucket, you may have noticed that bucket names have to be
globally unique: you can't use a name that any other AWS account anywhere has already
claimed. But do you know why that is the case? Don't fret if you don't! Maybe a simple
picture will help illustrate this…
In a sense, it's not unfair to think of AWS in general as already being one GIANT data
lake! The reason you can't create a bucket called "dkhundley" is that there's already one
present in this massive "data lake" we call S3. The physical infrastructure has been
abstracted away from us, so logically speaking, it’s like every single company’s data is
one big happy family.
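To make that concrete, here's a tiny sketch using Python's boto3 library. The bucket name and region here are just placeholders, but the idea is that trying to create a bucket whose name anybody else in the world has already claimed fails with a "BucketAlreadyExists" error.

```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3", region_name="us-east-1")

try:
    # Bucket names are shared across ALL AWS accounts, not just your own.
    s3.create_bucket(Bucket="dkhundley")
except ClientError as error:
    # If anyone, anywhere has already claimed this name, S3 refuses to create it.
    if error.response["Error"]["Code"] in ("BucketAlreadyExists", "BucketAlreadyOwnedByYou"):
        print("That bucket name is already taken somewhere in S3!")
    else:
        raise
```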
Now, don’t let me scare you! By this logic, you might jump to the natural conclusion that
the data in your S3 buckets can be readily accessed by somebody else’s company and
their respective AWS account. This thankfully is NOT true. AWS has been very
intentional about putting the proper security around everything in AWS, including S3
buckets, so you can only access S3 buckets if you have the right credentials to do so.
Here’s the real kicker and why S3 is so different than on-premise infrastructure (this is
VERY important): you don’t necessarily have to be within the same account that
produced the data in the S3 bucket to have access to that data. For example, if you
set me up with the right credentials, I can see the contents of your company’s S3
bucket(s) right from my own personal AWS account, NO PHYSICAL COPYING
REQUIRED. That’s right, folks. This is a HUGE shift in mindset from how we do things in
the on-premise world. Where we focus our time on isolating data with physical
infrastructure, cloud computing shifts our attention to isolating data using security
policies.
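To make that a little more tangible, here's a rough sketch of what security-based sharing can look like with Python's boto3 library. The bucket name and account ID below are made-up placeholders; the point is simply that the bucket owner attaches a policy granting read access to a principal in a completely different AWS account, and no data ever gets copied.

```python
import json
import boto3

s3 = boto3.client("s3")

BUCKET = "your-company-data-lake"        # hypothetical bucket name
CONSUMER_ACCOUNT = "111122223333"        # hypothetical "other" AWS account ID

# Grant another account permission to list the bucket and read its objects.
# Isolation comes from this policy, not from physical separation.
cross_account_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowCrossAccountList",
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{CONSUMER_ACCOUNT}:root"},
            "Action": ["s3:ListBucket"],
            "Resource": f"arn:aws:s3:::{BUCKET}",
        },
        {
            "Sid": "AllowCrossAccountGetObject",
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{CONSUMER_ACCOUNT}:root"},
            "Action": ["s3:GetObject"],
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        },
    ],
}

s3.put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(cross_account_policy))
```

(The consuming account still needs IAM permissions on its own side, but the takeaway stands: access is a policy decision, not a copy job.)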
Given that AWS adopts a pay-as-you-go model, you want to design things in such a way
that maximizes performance and minimizes costs. Considering that both storage and
data movement have their associated costs, replicating data from one S3 bucket to
another is both cost-prohibitive AND inefficient from a performance perspective.
Remember, in the context of S3, AWS accounts DO NOT physically isolate resources. Do
not make the mistake of thinking you have to copy data from one S3 bucket to another
just because you might not share the same account as others in your company! There
will still be specific use cases where
you do want to move data between S3 buckets, but if your analytical data is already good
to go in one S3 bucket, physically copying it to another “data lake account” S3 bucket is
probably not needed.
Test vs. Production Data: When you create a new IT solution that makes changes to
data, it’s natural to want to protect your production-level data from being negatively
impacted by that new solution. In most on-premise infrastructures, that means
physically isolating the environments between test and production. How you isolate
test vs. production data in S3 needs to be considered, and it can be done in a number
of ways. The safest and easiest way is to wholly isolate test and production data into
their own respective buckets, and you can manage organization using either bucket
naming standards or AWS tags. (Or both!) But if you don't want to separate everything
into different buckets,
there are ways to isolate certain things within each bucket. That’s going to take more
work on your end, but if cost management is a big factor to you, it might be worth it.
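As a rough illustration of the "separate buckets plus naming standards and tags" approach, here's a small boto3 sketch. All the bucket names and tag values are hypothetical.

```python
import boto3

s3 = boto3.client("s3", region_name="us-east-1")

# Hypothetical naming standard: <company>-<domain>-<environment>
buckets = {
    "acme-sales-data-test": "test",
    "acme-sales-data-prod": "prod",
}

for bucket_name, environment in buckets.items():
    # One bucket per environment keeps test workloads away from production data.
    s3.create_bucket(Bucket=bucket_name)

    # Tags make it easy to filter, report on, and apply policies per environment.
    s3.put_bucket_tagging(
        Bucket=bucket_name,
        Tagging={"TagSet": [{"Key": "Environment", "Value": environment}]},
    )
```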
Sensitive Data Protection: This is a lot like the test vs. production isolation that we
just discussed in the point above. The easiest thing again is to isolate sensitive data
into its own bucket and really lock that down with lots of security measures, but
again, it is possible to lock down sensitive data even when it sits alongside non-sensitive
data in the same bucket. I probably wouldn't want to mess with the hassle of that, but you do
you, friends.
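If you do go the dedicated-bucket route, a couple of those "lots of security measures" might look something like the sketch below: blocking all public access and defaulting every object to KMS encryption. The bucket and key names are placeholders, and this is nowhere near an exhaustive list of controls.

```python
import boto3

s3 = boto3.client("s3")

SENSITIVE_BUCKET = "acme-sensitive-data"          # hypothetical bucket
KMS_KEY_ID = "alias/acme-sensitive-data-key"      # hypothetical KMS key

# Make absolutely sure nothing in this bucket can be exposed publicly.
s3.put_public_access_block(
    Bucket=SENSITIVE_BUCKET,
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)

# Encrypt every object with a customer-managed KMS key by default.
s3.put_bucket_encryption(
    Bucket=SENSITIVE_BUCKET,
    ServerSideEncryptionConfiguration={
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": KMS_KEY_ID,
                }
            }
        ]
    },
)
```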
Data Lake vs. Data Warehouse: Let’s be clear here… a data lake is NOT
synonymous with a data warehouse. A data warehouse generally contains only
structured or semi-structured data, whereas a data lake contains the whole shebang:
structured, semi-structured, and unstructured. Data lakes and data warehouses often
coexist, with the warehouse built on top of the lake. In terms of AWS, the most common
implementation of this is using S3 as the data lake and
Redshift as the data warehouse. Of course, there’s more than one way to skin a cat in
AWS, so don’t think you’re only limited to Redshift for your warehousing needs.
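Just to make the "warehouse on top of the lake" idea a bit more tangible, here's a hedged sketch of loading a Redshift table straight from S3 using the Redshift Data API via boto3. Every identifier below (cluster, database, table, bucket, and IAM role) is a made-up placeholder.

```python
import boto3

redshift_data = boto3.client("redshift-data")

# Hypothetical COPY statement: Redshift reads the data right out of the S3 data lake.
copy_sql = """
    COPY analytics.daily_sales
    FROM 's3://acme-sales-data-prod/daily_sales/'
    IAM_ROLE 'arn:aws:iam::111122223333:role/acme-redshift-s3-read'
    FORMAT AS PARQUET;
"""

redshift_data.execute_statement(
    ClusterIdentifier="acme-warehouse",   # hypothetical Redshift cluster
    Database="analytics",
    DbUser="etl_user",
    Sql=copy_sql,
)
```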
Data Management & Governance: I already brushed past this once in this post, but
I think it’s worth bringing up again. A data lake can become a data dump VERY
quickly without proper data management and governance. When you design your
data lake, AWS does offer services like AWS Glue to help you manage things like a
Data Catalog, but it still puts a lot on you to figure that out for yourself. If you
really want extra help in this space, there are also many third party vendors that will
provide a lot of oomph here. (Oomph is a technical term. 😂) Depending on your
company’s needs, it might be worth that extra investment to bring in a third party
vendor to help you organize your data lake. (I’m not overly familiar with it, but AWS
does also offer a service called Lake Formation that may also be worth looking into.)
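For a taste of what "AWS Glue helping you manage a Data Catalog" can look like, here's a minimal sketch that points a Glue crawler at a lake prefix so it can infer schemas into the catalog. The crawler name, IAM role, database, and S3 path are all hypothetical.

```python
import boto3

glue = boto3.client("glue")

# A crawler scans the S3 prefix, infers schemas, and registers tables in the Data Catalog.
glue.create_crawler(
    Name="acme-sales-crawler",                              # hypothetical crawler name
    Role="arn:aws:iam::111122223333:role/acme-glue-role",   # hypothetical IAM role
    DatabaseName="acme_data_lake",                          # catalog database to populate
    Targets={"S3Targets": [{"Path": "s3://acme-sales-data-prod/daily_sales/"}]},
)

# Kick off the first crawl; afterwards the tables show up in the Glue Data Catalog.
glue.start_crawler(Name="acme-sales-crawler")
```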
Lake Consumption: Things can get a little bit tricky when you want to build
analytical solutions on top of your data lake. Whereas AWS accounts don’t
necessarily matter when putting data into a data lake, they do matter more for your
consumption solutions. Multiple accounts can draw from the same data lake, but
you have to ensure that they all have the proper security credentials to access those
underlying S3 buckets. And chances are, you don’t want to give blanket access to
everybody for every S3 bucket in your data lake. Again, this is where data
management and governance is extremely important, so it again may be worth the
investment to leverage those same third party governance tools to help divvy out
security credentials appropriately.
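One common pattern on the consumption side is for each consuming account to assume a narrowly scoped role that only grants access to the buckets (or prefixes) it actually needs. Here's a rough boto3 sketch of that idea; the role ARN, bucket, and prefix are placeholders.

```python
import boto3

sts = boto3.client("sts")

# Assume a role in the data lake account that only allows reading a specific prefix.
assumed = sts.assume_role(
    RoleArn="arn:aws:iam::111122223333:role/acme-lake-sales-readonly",  # hypothetical role
    RoleSessionName="sales-dashboard-session",
)

credentials = assumed["Credentials"]

# Use the temporary credentials to read from the data lake bucket.
s3 = boto3.client(
    "s3",
    aws_access_key_id=credentials["AccessKeyId"],
    aws_secret_access_key=credentials["SecretAccessKey"],
    aws_session_token=credentials["SessionToken"],
)

response = s3.list_objects_v2(Bucket="acme-sales-data-prod", Prefix="daily_sales/")
for obj in response.get("Contents", []):
    print(obj["Key"])
```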
Alrighty, that wraps up this post! This was a pretty foreign concept to me until fairly
recently, so don’t beat yourself up if you didn’t fully grasp this even if you have an AWS
certification. If you enjoyed this post, you might also appreciate some of my other posts,
including last week’s post on my five tips for getting started in AWS. Thanks for reading!