
SOLUTION SHEET: HADOOP DATA INGEST

End-to-End Data Integration for Hadoop Data Lakes

Deliver timely, high-quality and well-governed transactional data to the business
Data Lakes enable enterprises to process vast data volumes and address use cases that range from batch
analysis to streaming analytics and machine learning. Whether on premises or in the cloud, Data Lakes
provide an efficient, scalable and centralized foundation for modern analytics.
But traditional tools for integrating this data are neither efficient nor scalable for Data Lake implementations.
IT organizations often struggle to ingest data from hundreds or even thousands of sources that require custom
coding and intrusive triggers and agents, tying up your most talented programmers with repetitive
and error-prone work.
A related challenge is efficiently transforming data into accurate, consistent and analytics-ready systems of
record. Scarce programming resources are one obstacle. Another is the lack of metadata and lineage views,
which forces users to individually collect, assemble and refine data for analytics.
Attunity solutions remove these obstacles and create an efficient, automated data pipeline that
reduces time to analytics.
Data Lake Ingestion with Attunity
Attunity Replicate is a simple, universal and real-time data ingestion solution that delivers data
efficiently to any major Hadoop/Data Lake platform. With Attunity Replicate, architects and database
administrators can eliminate manual coding with a 100% automated interface that quickly and easily
configures, controls and monitors bulk loads as well as real-time updates. You can ingest data
across hundreds or thousands of end points – including any major RDBMS, legacy system, data
warehouse, Data Lake distribution or streaming platform – through a single pane of glass. Attunity
Replicate also minimizes production impact and administrative burden by copying source updates from
transaction logs, with no need for agents.

CUSTOMER SUCCESS

“Using Attunity, we were able to create our strategic analytical platform, insights analytics,
which allows us to make important operational decisions that benefit our staff and students.”

JUERGEN STEGMAIR, LEAD FOR DATABASE ADMIN, UNIVERSITY OF NORTH TEXAS
Data Lake Transformation with Attunity
Attunity Compose for Data Lakes automates the creation and loading of Hadoop Hive structures, as
well as the transformation of enterprise data within them. Our solution fully automates the
pipeline of BI-ready data into Hive, enabling you to automatically create both Operational Data
Stores (ODS) and Historical Data Stores (HDS). And we leverage the latest innovations in Hadoop,
such as the new ACID Merge SQL capabilities available today in Apache Hive (part of the Hortonworks
2.6 distribution), to automatically and efficiently process data insertions, updates and deletions.
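As an illustration of the ACID Merge pattern that Compose automates, the sketch below issues a
MERGE statement through the open-source PyHive client. The connection details and all schema,
table and column names (landing.orders__ct, ods.orders, the op flag) are assumptions for the
example, not Attunity's generated SQL, and it presumes a Hive 2.x server with ACID transactions
enabled.

    # Minimal sketch: apply a landed change table to an ODS with Hive's ACID
    # MERGE. All names here are hypothetical, not Compose's actual output.
    from pyhive import hive  # pip install 'pyhive[hive]'

    MERGE_SQL = """
    MERGE INTO ods.orders AS t
    USING landing.orders__ct AS s          -- change table landed by Replicate
    ON t.order_id = s.order_id
    WHEN MATCHED AND s.op = 'D' THEN DELETE
    WHEN MATCHED THEN UPDATE SET status = s.status, amount = s.amount
    WHEN NOT MATCHED THEN INSERT VALUES (s.order_id, s.status, s.amount)
    """

    conn = hive.connect(host="hive-server", port=10000, database="default")
    cur = conn.cursor()
    cur.execute(MERGE_SQL)  # inserts, updates and deletes applied in one pass
    cur.close()
    conn.close()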
Attunity Replicate integrates with Attunity Compose for Data Lakes to simplify and accelerate data ingestion,
data landing, SQL schema creation, data transformation and ODS and HDS creation/updates. Here is a sample
architecture and description of how a combined Attunity solution can manage data flows at each stage of a
data lake pipeline.


Your Data Lake Pipeline

[Figure: pipeline diagram. Sources (SAP, RDBMS, data warehouse, files, mainframe) flow as
continuous transactional data streaming through five stages: Source, Land, Assemble, Provision,
Consume. The Land stage captures raw deltas, full snapshots and change history with formatting and
partitioning; the Assemble stage merges and standardizes them into the ODS and HDS; the Provision
stage subsets, joins and enriches views; the Consume stage covers cleansing, preparation and
analysis.]

BUSINESS BENEFITS
Faster Data Lake operational readiness
Reduced development time
Reduced reliance on Hadoop skills
Easier compliance

• Landing Zone
First, Attunity Replicate copies data, often from traditional sources such as Oracle, SAP and
mainframe, then lands it in raw form in the Hadoop File System (or cloud equivalent). This process
enjoys all the advantages of Attunity Replicate, including full load/CDC capabilities, time-based
partitioning for transactional consistency and auto-propagation of source DDL changes. Data is now
ingested and available as change tables, but not yet ready for analytics.
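The exact layout of these change tables is product-defined; purely to make the zones that follow
concrete, here is one landed change record sketched with hypothetical field names.

    # Illustrative shape of a landed change record; the field names are
    # hypothetical, not Attunity Replicate's actual change-table schema.
    change_record = {
        "op": "U",                            # I = insert, U = update, D = delete
        "commit_ts": "2018-06-15T10:42:03Z",  # commit time from the source transaction log
        "order_id": 1001,                     # source primary key
        "status": "SHIPPED",
        "amount": 24.99,
    }

    # Time-based partitioning keeps each landed partition transactionally
    # consistent; the path layout below is likewise only illustrative.
    partition_path = "/datalake/landing/orders__ct/capture_ts=2018-06-15-10-40/"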
• Assembly Zone
Next, Attunity Compose standardizes and combines change streams into a single transformation-ready
data store. It automatically merges the multi-table and/or multi-sourced data into a flexible
format and structure, retaining full history to rewind and identify/remediate bugs if needed. The
resulting persisted history provides consumers with rapid access to trusted data, with no need to
understand or execute the structuring that has taken place. Data managers and architects,
meanwhile, maintain central control of the entire process.
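A minimal sketch of the assembly idea, reusing the illustrative record shape above: fold a
time-ordered change stream into an ODS holding the latest state and an HDS retaining every
version. This is a toy model, not Compose's internal algorithm.

    # Fold change records into an ODS (current image per key) and an HDS
    # (every version, so history can be rewound); illustrative only.
    def assemble(changes):
        ods, hds = {}, []
        for c in sorted(changes, key=lambda r: r["commit_ts"]):
            hds.append(c)                     # persist the full history
            if c["op"] == "D":
                ods.pop(c["order_id"], None)  # a delete removes the current image
            else:
                ods[c["order_id"]] = c        # insert/update: latest image wins
        return ods, hds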
• Provisioning Zone
Finally, data managers and architects provision an enriched data subset to a target, potentially a
structured data warehouse, for consumption (curation, preparation, visualization, modeling and
analytics) by data scientists and analysts. These targets can be updated continuously so the data
stays fresh.
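Provisioning an enriched subset can be as simple as publishing a view over the HDS, as in the
hypothetical statement below (the schema, columns and is_current flag are assumptions; it could be
executed with the same client as the merge sketch earlier).

    # Hypothetical provisioning step: expose an analytics-ready subset of
    # the HDS as a view for downstream consumers; all names illustrative.
    PROVISION_SQL = """
    CREATE VIEW IF NOT EXISTS mart.orders_current AS
    SELECT order_id, status, amount
    FROM hds.orders
    WHERE is_current = true   -- latest version of each record only
    """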
• Metadata Integration and Management
Attunity provides automated metadata management capabilities to help enterprise users
better understand, utilize and trust their data as it flows into and is transformed within
their data lake pipeline. With Attunity Replicate and Attunity Compose you can add, view
and edit entities (e.g., tables) and attributes (i.e., columns). Attunity Enterprise Manager
centralizes all this technical metadata so the lineage of any piece of data can be tracked from source
to target, and the potential impact of table/column changes across data zones can be assessed.
In addition, Attunity Enterprise Manager collects and shares operational metadata from Attunity
Replicate with third-party reporting tools for enterprise-wide discovery and reporting. Attunity
continues to enrich its metadata management capabilities and contribute to industry initiatives such
as ODPi to help simplify and standardize Big Data ecosystems with common reference specifications.
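To show what centralized technical metadata makes possible, here is a toy model of
source-to-target lineage records and a downstream impact check across zones; the record layout is
invented for the example and is not Attunity Enterprise Manager's format.

    # Toy lineage records and a downstream-impact walk; the layout is
    # invented for illustration, not Attunity Enterprise Manager's format.
    lineage = [
        {"source": "oracle.sales.orders.amount", "target": "hive.landing.orders__ct.amount"},
        {"source": "hive.landing.orders__ct.amount", "target": "hive.ods.orders.amount"},
        {"source": "hive.ods.orders.amount", "target": "hive.hds.orders.amount"},
    ]

    def impact(column):
        """Every column downstream of `column`, across all data zones."""
        hits = [e["target"] for e in lineage if e["source"] == column]
        return hits + [c for h in hits for c in impact(h)]

    print(impact("oracle.sales.orders.amount"))
    # ['hive.landing.orders__ct.amount', 'hive.ods.orders.amount',
    #  'hive.hds.orders.amount']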

Contact Attunity today to learn how we can help you streamline your Data Lake pipeline and speed
your analytics readiness.

www.attunity.com
Americas: 866-288-8648, [email protected]
Europe / Middle East / Africa: 44 (0) 1932-895024, [email protected]
Asia Pacific: (852) 2756-9233, [email protected]

© 2018 ATTUNITY LTD ALL RIGHTS RESERVED 20180615
