
Success stories

Energy – Building a uniform data platform & data lake, including real-time capabilities

ENERGY
July 16, 2021
Read on to see how the experts of Lucy in the Cloud created a new solution for a big player in the energy sector: one that crunches huge datasets, including load curves, in near real time and exposes them for analytics and ML use.
What was the challenge that we were approached with?
Our client came to the experts of Lucy in the Cloud with a clear challenge to help solve: they were dealing with very large data sets that had no clean update markers in their proprietary ERP solutions. They needed a solution that could meet the following requirements:

 A uniform data platform and data lake, including real-time capabilities


 Responsive self-service BI and Dashboarding capabilities
 Ability of the platform to address more ML/AI oriented used cases
 Agility in deployment in a controlled and governed landscape
Our experts wasted no time in getting to work on a strategy and technical solution that would solve our client's main issues.
So what did we come up with?

Our solution: a data lake built on Redshift

An architecture based on key AWS data services

Data lake solution with Redshift and Tableau for DataViz


Our experts relied on Tableau's map capabilities to visualize any smart meter in the network, including all of its consumption metrics.

Tableau's rich dashboarding capabilities make it possible to report and visualize all business metrics in one uniform set, which is exactly what our customer was looking for.

Redshift provides a unified business data model, with a low sync time from the source systems
The data warehouse runs on Amazon Redshift and follows the Data Vault 2.0 methodology. Data Vault objects are very standardized and have strict modelling rules, which allows a high level of standardization and automation. The data model is generated from metadata stored in an Amazon Aurora database on RDS. The Data Vault model is generated by Orion, a Data Vault automation engine developed by Lucy in the Cloud that runs in serverless mode. The serverless mode is achieved by generating AWS Step Functions state machines that execute Lambda functions and run Redshift queries dispatched through the Redshift Data API.

This makes the solution very scalable and able to process in near real time.
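
To make this dispatch pattern concrete, here is a minimal sketch of a pair of Lambda handlers that submit a generated statement through the Redshift Data API and let a Step Functions state machine poll for completion. The handler names, event fields, and cluster details are assumptions for illustration, not the actual internals of Orion.

```python
# Illustrative sketch: Lambda handlers that a Step Functions state machine
# could call to run a generated Data Vault load statement on Redshift
# through the Redshift Data API. Event field names are hypothetical.
import boto3

client = boto3.client("redshift-data")

def submit_statement(event, context):
    # The event carries SQL generated from the metadata model, for example
    # an INSERT into a hub, link, or satellite table.
    response = client.execute_statement(
        ClusterIdentifier=event["cluster_id"],
        Database=event["database"],
        SecretArn=event["secret_arn"],  # credentials kept in Secrets Manager
        Sql=event["sql"],
    )
    # The statement id lets the state machine poll asynchronously.
    return {"statement_id": response["Id"]}

def check_statement(event, context):
    # Called by the state machine in a wait/poll loop until the query
    # reaches FINISHED, FAILED, or ABORTED.
    desc = client.describe_statement(Id=event["statement_id"])
    return {"status": desc["Status"]}
```

Because the queries are submitted asynchronously and the orchestration lives in Step Functions, no long-running compute has to stay up between loads, which is what makes the serverless mode possible.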

Real-time flow based on AWS MSK and Kafka Connect


To sync database data based on Change Data Capture (CDC), Kafka Connect is used running Debezium. The Kafka Connect nodes run under Docker in high-availability mode, and the schema registry is deployed on ECS, also in HA mode. The data is serialized in Avro format, which makes it fast to process and compact to store. AWS MSK gives the data platform a solid backbone to handle any real-time data case, now and in the future.
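
As a minimal sketch of this setup, the snippet below registers a Debezium source connector with Avro serialization through the Kafka Connect REST API. Everything here is illustrative: the hostnames, ports, credentials, connector name, and table list are placeholders, and PostgreSQL is assumed as the source purely for the example.

```python
# Illustrative only: register a Debezium CDC source connector (PostgreSQL
# assumed) with Avro serialization via the Kafka Connect REST API.
# All hostnames, credentials, and topic/table names are placeholders.
import json
import requests

connector = {
    "name": "erp-cdc-source",  # hypothetical connector name
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "erp-db.internal",
        "database.port": "5432",
        "database.user": "cdc_user",
        "database.password": "********",
        "database.dbname": "erp",
        "topic.prefix": "erp",  # Debezium 2.x naming; older versions use database.server.name
        "table.include.list": "public.load_curves",
        # Serialize record values as Avro and register schemas centrally
        "value.converter": "io.confluent.connect.avro.AvroConverter",
        "value.converter.schema.registry.url": "http://schema-registry.internal:8081",
    },
}

resp = requests.post(
    "http://kafka-connect.internal:8083/connectors",
    headers={"Content-Type": "application/json"},
    data=json.dumps(connector),
)
resp.raise_for_status()
print(resp.json())  # Kafka Connect echoes the created connector definition
```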

And much more!


The uniform data platform we have built for this key player in the energy sector is growing fast and enables them to address all sorts of data-related use cases, including Machine Learning and Artificial Intelligence capabilities.

Want to know more about the capabilities of Amazon Redshift?


Read all about it here: Amazon Redshift: the fastest and most widely used cloud data warehouse
Take the lead with AWS, contact us!

CONTACT US
