Oracle Data Integrator (ODI Formerly Sunopsis)
Table of Contents
1 - About
2 - Articles Related
3 - Publish-and-subscribe model
4 - Process
5 - The Journalizing Components
6 - Simple vs. Consistent Set Journalizing
7 - Implementation
7.1 - Tracking changes
7.2 - Processing the change
8 - Ensuring Data Consistency
9 - Documentation / Reference
1 - About
The goal of Change Data Capture is to track changes in the source data. When running an integration interface, ODI-EE can reduce the volume of source
data processed in the flow by extracting only the changed data.
Reducing the volume of source data is useful in many fields, such as:
• synchronization
• replication
These changes are captured by Oracle Data Integrator and transformed into events that are propagated throughout the information system.
Changes tracked by Changed Data Capture constitute data events. The ability to track these events and process them regularly in batches or in real time
is key to the success of an event-driven integration architecture.
Changed Data Capture is performed by journalizing models. Journalizing a model consists of setting up the infrastructure to capture the changes (inserts,
updates and deletes) made to the records of this model's datastores.
2 - Articles Related
• ODI - Staging Area (or Work area)
• ODI - Setting up Journalizing/CDC
• ODI - Typical Applications / Goals
3 - Publish-and-subscribe model
Changed Data Capture uses a publish-and-subscribe model. This model works in three steps:
• An identified subscriber, usually an integration process, subscribes to changes that might occur in a datastore. Multiple subscribers can subscribe
to these changes.
• The Changed Data Capture framework captures changes in the datastore and then publishes them for the subscriber.
• The subscriber—an integration process—can process the tracked changes at any time and consume these events. Once consumed, events are no
longer available for this subscriber.
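The three steps above can be sketched in a few lines of Python. This is only a toy model of the publish-and-subscribe idea, not ODI's implementation: the `ChangeFeed` class, its method names, and the subscriber names are all hypothetical.

```python
from collections import defaultdict


class ChangeFeed:
    """Toy publish-and-subscribe journal for one datastore (illustrative only)."""

    def __init__(self):
        self._subscribers = set()
        # One list of pending events per subscriber name.
        self._pending = defaultdict(list)

    def subscribe(self, name):
        # Step 1: an identified subscriber registers for changes.
        self._subscribers.add(name)

    def publish(self, event):
        # Step 2: a captured change is made available to every subscriber.
        for name in self._subscribers:
            self._pending[name].append(event)

    def consume(self, name):
        # Step 3: hand over the pending events and clear them
        # for this subscriber only.
        events = self._pending[name]
        self._pending[name] = []
        return events


feed = ChangeFeed()
feed.subscribe("ods_load")        # hypothetical subscriber names
feed.subscribe("catalog_sync")
feed.publish(("ORDER", 42, "insert"))

first = feed.consume("ods_load")      # the event is delivered once
second = feed.consume("ods_load")     # nothing left for this subscriber
other = feed.consume("catalog_sync")  # the other subscriber still sees it
```

Each subscriber keeps its own pending list, so consuming an event for one subscriber does not hide it from the others.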
4 - Process
ODI-EE processes datastore changes in two ways:
• Regularly in batches (pull mode)—for example, processes new orders from the Web site every five minutes and loads them into the operational
datastore (ODS)
• In real time (push mode) as the changes occur—for example, when a product is changed in the enterprise resource planning (ERP) system,
immediately updates the on-line catalog.
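The difference between the two modes can be illustrated with a minimal Python sketch. It does not use any ODI API: `drain_batch`, `PushSource`, and the sample records are assumptions made for the example.

```python
import queue


def drain_batch(changes):
    """Pull mode: called on a schedule, takes whatever has accumulated."""
    batch = []
    while True:
        try:
            batch.append(changes.get_nowait())
        except queue.Empty:
            return batch


class PushSource:
    """Push mode: the handler runs as soon as a change is reported."""

    def __init__(self, handler):
        self._handler = handler

    def report(self, change):
        self._handler(change)


# Pull: two orders accumulate, then one scheduled run picks both up.
pending = queue.Queue()
pending.put({"order_id": 1})
pending.put({"order_id": 2})
pulled = drain_batch(pending)

# Push: the catalog handler is invoked once per change, immediately.
catalog = []
erp = PushSource(catalog.append)
erp.report({"product_id": 7, "price": 9.5})
```

In pull mode the latency is bounded by the schedule (five minutes in the example above); in push mode the work happens inside the call that reports the change.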
5 - The Journalizing Components
• Journals: Where changes are recorded. Journals only contain references to the changed records along with the type of changes (insert/update,
delete).
• Capture processes: Journalizing captures the changes in the source datastores either by creating triggers on the data tables, or by using database-
specific programs to retrieve log data from data server log files. See the documentation on journalizing knowledge modules for more information
on the capture processes used.
• Subscribers: CDC uses a publish/subscribe model. Subscribers are entities (applications, integration processes, etc.) that use the changes tracked
on a datastore or on a consistent set. They subscribe to a model's CDC to have the changes tracked for them. Changes are captured only if there is
at least one subscriber to the changes. When all subscribers have consumed the captured changes, these changes are discarded from the journals.
• Journalizing views: Provide access to the changes and the changed data captured. They are used by the user to view the changes captured, and by
integration processes to retrieve the changed data.
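How a journal, a subscriber, and a journalizing view fit together can be sketched with SQLite from Python. The table, view, and column names (`j_orders`, `jv_orders`, `flag`, `subscriber`) and the flag values 'I' and 'D' are assumptions chosen for the example, not the objects that an ODI knowledge module generates.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, status TEXT);

    -- Journal: only the key of the changed row, the change type
    -- ('I' = insert/update, 'D' = delete) and the subscriber.
    CREATE TABLE j_orders (subscriber TEXT, flag TEXT, order_id INTEGER);

    -- Journalizing view: journal entries joined to the current row data.
    CREATE VIEW jv_orders AS
        SELECT j.subscriber, j.flag, j.order_id, o.status
        FROM j_orders AS j
        LEFT JOIN orders AS o ON o.order_id = j.order_id;

    INSERT INTO orders VALUES (1, 'NEW'), (2, 'SHIPPED');
    INSERT INTO j_orders VALUES ('ods_load', 'I', 1), ('ods_load', 'D', 3);
""")

# The subscriber reads its changes, with the changed data, through the view.
rows = con.execute(
    "SELECT flag, order_id, status FROM jv_orders "
    "WHERE subscriber = 'ods_load' ORDER BY order_id"
).fetchall()

# Once the subscriber has consumed its changes, they leave the journal.
con.execute("DELETE FROM j_orders WHERE subscriber = 'ods_load'")
remaining = con.execute("SELECT COUNT(*) FROM j_orders").fetchone()[0]
```

Because the journal stores only a reference to the row, a deleted record appears in the view with its key and flag but without data.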
6 - Simple vs. Consistent Set Journalizing
Simple journalizing tracks the changes of each datastore separately. This approach has a limitation, illustrated in the following example: Say you need
to process changes in the ORDER and ORDER_LINE datastores (with a referential integrity constraint based on the fact that an ORDER_LINE record should
have an associated ORDER record). If you have captured insertions into ORDER_LINE, you have no guarantee that the associated new records in ORDER
have also been captured. Processing ORDER_LINE records with no associated ORDER records may cause referential constraint violations in the integration process.
Consistent Set Journalizing provides the guarantee that when you have an ORDER_LINE change captured, the associated ORDER change has also been
captured, and vice versa. Note that consistent set journalizing guarantees the consistency of the captured changes. The set of available changes for which
consistency is guaranteed is called the Consistency Window. Changes in this window should be processed in the correct sequence (ORDER followed by
ORDER_LINE) by designing and sequencing integration interfaces into packages.
Although consistent set journalizing is more powerful, it is also more difficult to set up. It should be used when referential integrity constraints need to
be ensured when capturing the data changes. For performance reasons, consistent set journalizing is also recommended when a large number of
subscribers are required.
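The ORDER / ORDER_LINE example can be made concrete with a short Python sketch. It only illustrates the reasoning (orphan detection and parent-first sequencing); the function names and the dictionary layout of a change are hypothetical and unrelated to ODI's internals.

```python
# Parent datastores must be processed before their children.
PROCESSING_ORDER = ["ORDER", "ORDER_LINE"]


def orphan_lines(order_changes, line_changes, existing_orders):
    """Return ORDER_LINE changes whose ORDER is neither captured nor loaded."""
    known = set(existing_orders) | {c["order_id"] for c in order_changes}
    return [c for c in line_changes if c["order_id"] not in known]


def in_processing_order(changes):
    """Sort a consistent set of changes so parents come before children."""
    return sorted(changes, key=lambda c: PROCESSING_ORDER.index(c["datastore"]))


line_changes = [{"datastore": "ORDER_LINE", "order_id": 10, "line": 1}]

# Simple journalizing: the ORDER change may not have been captured yet,
# so the line would violate the referential constraint on the target.
missing = orphan_lines([], line_changes, existing_orders=[])

# Consistent set: both changes are in the window; process ORDER first.
window = line_changes + [{"datastore": "ORDER", "order_id": 10}]
ordered = in_processing_order(window)
```

In ODI the sequencing itself is done by ordering integration interfaces inside a package; the sort above only stands in for that design step.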
7 - Implementation
7.1 - Tracking changes
Changes can be tracked with two methods:
• triggers
• relational database management system (RDBMS) log mining
The triggers method creates triggers on the source tables to track changes as data is inserted, updated, or deleted. This method can be implemented on
most RDBMS, but it can have an impact on the transactional performance of the source systems.
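The trigger method can be tried out with SQLite from Python. This is a minimal sketch of the technique under assumed names (`product`, `j_product`, the `t_product_*` triggers, and the flags 'I' and 'D'); the triggers that a journalizing knowledge module generates for a given database will differ.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE j_product (flag TEXT, product_id INTEGER);

    -- Inserts and updates are both recorded with flag 'I', deletes with 'D'.
    CREATE TRIGGER t_product_ins AFTER INSERT ON product
    BEGIN
        INSERT INTO j_product VALUES ('I', NEW.product_id);
    END;
    CREATE TRIGGER t_product_upd AFTER UPDATE ON product
    BEGIN
        INSERT INTO j_product VALUES ('I', NEW.product_id);
    END;
    CREATE TRIGGER t_product_del AFTER DELETE ON product
    BEGIN
        INSERT INTO j_product VALUES ('D', OLD.product_id);
    END;
""")

# Ordinary DML on the source table; the triggers fill the journal.
con.execute("INSERT INTO product VALUES (1, 'widget')")
con.execute("UPDATE product SET name = 'gadget' WHERE product_id = 1")
con.execute("DELETE FROM product WHERE product_id = 1")

journal = con.execute(
    "SELECT flag, product_id FROM j_product ORDER BY rowid"
).fetchall()
```

Every statement against `product` now performs an extra write to `j_product`, which is the source of the transactional overhead mentioned above.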
The second method involves mining the RDBMS logs, which are the internal change history of the database engine. This method has no effect on the
system’s transactional performance, but it is database-specific. This method is supported out-of-the-box for:
The Changed Data Capture framework used to manage changes is generic and open. The change tracking method can be customized, and any third-
party change provider can be used to load the framework with changes.
8 - Ensuring Data Consistency
ODI provides a mode of tracking changes, called Consistent Set Changed Data Capture, for this purpose. This mode allows you to process sets of
changes that guarantee data consistency.
9 - Documentation / Reference
• Changed Data Capture/Journalizing Documentation Reference
• Improve Data Integration with Changed Data Capture (PDF)