CDC Presentation

This document describes the architecture and components of IBM InfoSphere Change Data Capture (CDC) software. It discusses the CDC instance, data store, access server, management console, replication types (refresh and mirroring), apply methods (live audit, adaptive apply, standard apply), and hardware architecture considerations for deploying CDC. The key points are:
- A CDC instance represents the CDC services configured for a specific database. It creates metadata tables and can act as a source or target.
- A data store maps an instance to database tables for replication. An instance can have multiple data stores.
- The access server manages user access and maps users to data stores. The management console provides configuration and monitoring.

Uploaded by

Rajeswar Guin

Change Data Capture

Play Back Session


CDC Single Scrape Architecture
[Diagram. Labels: CDC Instance (source), Database logs, Log reader, Log parser, Transaction queues, Change log (staging store), Scraper, Subscriptions 1-3; CDC (target), Target for Subscriptions 1-3, JDBC.]

• The Log Reader thread captures changes for all active tables across all subscriptions.
• Captured changes are stored in the staging store.
• The staging store is backed by memory and disk. Where possible, data is stored and accessed in memory.
• The maximum staging store size is set during CDC instance configuration.
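
The single-scrape idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how InfoSphere CDC is implemented: the class `StagingStore`, its methods, the `max_in_memory` parameter and the `ORDERS` table are hypothetical, and a plain list stands in for the disk-backed part of the store.

```python
from collections import deque


class StagingStore:
    """Toy staging store: one shared change log read by many subscriptions.

    Hypothetical sketch; names and behaviour are illustrative assumptions.
    """

    def __init__(self, max_in_memory):
        self.max_in_memory = max_in_memory
        self.memory = deque()   # most recent changes, kept in memory
        self.disk = []          # older changes (a list stands in for disk)
        self.positions = {}     # subscription name -> next sequence number to read
        self.next_seq = 0

    def subscribe(self, name):
        # A new subscription starts reading at the current end of the log.
        self.positions[name] = self.next_seq

    def capture(self, table, operation):
        # Called by the single log reader: each change is stored once,
        # no matter how many subscriptions replicate the table.
        self.memory.append((self.next_seq, table, operation))
        self.next_seq += 1
        while len(self.memory) > self.max_in_memory:
            self.disk.append(self.memory.popleft())  # spill the oldest entry

    def read(self, name, tables):
        # Return the unread changes for the tables this subscription replicates.
        start = self.positions[name]
        changes = [c for c in self.disk + list(self.memory)
                   if c[0] >= start and c[1] in tables]
        self.positions[name] = self.next_seq
        return changes


store = StagingStore(max_in_memory=2)
store.subscribe("Subscription 1")
store.subscribe("Subscription 2")
for table, op in [("CUSTOMER", "insert"), ("ORDERS", "insert"), ("CUSTOMER", "update")]:
    store.capture(table, op)

sub1_changes = store.read("Subscription 1", {"CUSTOMER"})
sub2_changes = store.read("Subscription 2", {"CUSTOMER", "ORDERS"})
```

Each change is captured once and both subscriptions read it from the same store, which is the point of a single scrape: the database log is not read again for every subscription.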

4/18/19
CDC Software Components


Instance:

• When InfoSphere Change Data Capture services are initialized and configured against a specific database, the result is called an instance.

• On configuration, it creates a set of metadata tables in the specified schema.

• Once configured, the services can be started and stopped.

• An instance acts as either a source or a target CDC agent, depending on whether its database is the source or the target of replication. Instances are therefore sometimes called source or target agent servers.

• A machine with InfoSphere Change Data Capture installed can run multiple instances (e.g., edwps013/015).

Data Store:

• A data store is an instance mapped, through a database connection, to the schema containing the tables for replication.

• An instance can have multiple datastores.


CDC Instance and Datastore:

Setting            Instance VOTXP02    Instance SOTXP02
Server Port        10109               10109
DB Name            SOTXP02             SOTXP02
DB User            TSUSER              TSUSER
DB Password        <password>          <password>
Metadata Schema    <Name>              <Name>

Metadata tables in each instance's metadata schema (Metadata Schema @ VOTXP02, Metadata Schema @ SOTXP02):
TS_AUTH, TS_BOOKMARK, TS_CONFAUD, TS_JRN, TS_JRNCLEANUPLSN, TS_JRNCLEANUPTEMP, TS_JRNLSN, TS_JRNOPID, TS_JRNTXID, TS_JRNTXIDORDER, TS_JRNTXS, TS_JRNUSE

[Diagram labels: Oracle Client (Tnsnames.ora), CDC Installation (EDWPS015)]

DataStore SOTXP02:
SPR_REPL.CUSTOMER
SPR_REPL.CAMPAIGNORGANISATION

Access server:
• It manages user access, datastore creation, and database connectivity, and maps users to datastores using the Access Manager feature of Management Console.

• When the access server is installed, it is configured with a user that has the System Administrator role. This user is needed for the first login to Management Console.

• The system administrator can subsequently create users with the required roles.

CDC User Roles


• System Administrator: can perform all available operations in Management Console.
• Administrator: can access the Monitoring and Configuration perspectives but not Access Manager.
• Operator: can start, stop, and monitor replication activities but cannot edit subscriptions or replication.
• Monitor: can only view events, statistics, replication state, subscription status, and table mappings.
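
The role descriptions above can be read as a small permission table. The sketch below is one way to encode them; the capability names (`manage_access`, `configure`, `operate`, `view`) are illustrative labels invented for this example, not identifiers from the product.

```python
# Capabilities per role, as described in the list above.
# The capability names are hypothetical labels chosen for this sketch.
ROLE_CAPABILITIES = {
    "System Administrator": {"manage_access", "configure", "operate", "view"},
    "Administrator": {"configure", "operate", "view"},   # no Access Manager
    "Operator": {"operate", "view"},                     # start/stop/monitor only
    "Monitor": {"view"},                                 # read-only
}


def is_allowed(role, capability):
    """Return True if the role includes the capability; unknown roles get nothing."""
    return capability in ROLE_CAPABILITIES.get(role, set())


operator_can_edit = is_allowed("Operator", "configure")        # False
admin_can_manage_users = is_allowed("Administrator", "manage_access")  # False
```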


Management Console:
Management Console is the end-user client interface for InfoSphere Change Data Capture. Management Console has three perspectives, each covering a set of tasks:

1. Access Manager perspective:
• Add/Edit/Delete users and datastores
• Map users to datastores
• A user can access only the datastores assigned to that user

2. Configuration perspective:
• Add/Edit/Delete datastores
• Add/Edit/Delete subscriptions
• Add/Import/Export projects
• Map and customize tables, and configure replication

3. Monitoring perspective:
• Start, end, and monitor replication
• View events, statistics, and table mappings
• View the replication state and the status of a subscription
• View notifications sent by subscriptions and datastores


Replication Type: Replication is the process of maintaining ongoing synchronization by sending changes from source tables to target tables. There are two main replication methods:
• Refresh: a process that synchronizes the target table with the current contents of the source table.
• Mirroring: the continuous replication of changed data from the source system to the target system.

Apply Methods:

• Live Audit: data is inserted into the target table irrespective of the source operation, so every change becomes a new row. The operations are identified by Aud_Type (PT=Insert, UP=Update, RR=Delete).

• Adaptive Apply: updates the record if it already exists in the target table, otherwise inserts a new record, irrespective of the source operation. Adaptive Apply provides flexibility when the source and target tables are not synchronized.

• Standard Apply: a row insert, update, or delete on the source table results in the corresponding insert, update, or delete on the target table. With this method the source and target tables need to be in sync with each other.
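
The difference between the three apply methods is easiest to see by running the same changes through each. The sketch below is a simplified model, not the product's logic: the function names, the sample keys and rows, and `OutOfSyncError` are hypothetical. Two behaviours are assumptions made for the example: Adaptive Apply silently ignores a delete for a missing row, and Standard Apply raises an error when source and target disagree. The Aud_Type codes are the ones given in the list above.

```python
class OutOfSyncError(Exception):
    """Raised by the standard-apply sketch when source and target disagree."""


def live_audit(target, op, key, row):
    # Every source operation becomes a new audit row tagged with its Aud_Type.
    aud_type = {"insert": "PT", "update": "UP", "delete": "RR"}[op]
    target.append({"key": key, **(row or {}), "Aud_Type": aud_type})


def adaptive_apply(target, op, key, row):
    # Insert and update both act as "update if present, else insert".
    if op == "delete":
        target.pop(key, None)   # assumption: a missing row is ignored
    else:
        target[key] = row


def standard_apply(target, op, key, row):
    # Each operation is replayed as-is, so the target must match the source.
    if op == "insert":
        if key in target:
            raise OutOfSyncError(f"row {key} already exists")
        target[key] = row
    else:
        if key not in target:
            raise OutOfSyncError(f"row {key} not found")
        if op == "update":
            target[key] = row
        else:
            del target[key]


# Row 2 was never inserted on the target, so the tables are out of sync.
changes = [("insert", 1, {"name": "Ann"}),
           ("update", 2, {"name": "Bob"}),
           ("delete", 1, None)]

audit_rows = []
adaptive_table = {}
for change in changes:
    live_audit(audit_rows, *change)
    adaptive_apply(adaptive_table, *change)

standard_table, standard_failed = {}, False
try:
    for change in changes:
        standard_apply(standard_table, *change)
except OutOfSyncError:
    standard_failed = True
```

In this run Live Audit ends up with three audit rows, Adaptive Apply absorbs the update of the missing row by inserting it, and Standard Apply stops at that same update.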

[Figures: Live Audit and Adaptive Apply examples]

H3G CDC Hardware Architecture

Design Consideration:

1. Firewalls must be disabled between the servers that communicate over TCP/IP.

2. The Oracle source database must run in ARCHIVELOG mode with minimal supplemental logging enabled, so that the Oracle redo log contains the information required to describe all data changes completely.

3. Supplemental logging for all columns must be enabled on the source databases.

4. Target tables must have the same schema definition and unique constraints as the source tables.

5. OS user privileges: a dedicated OS user for running InfoSphere CDC, with read access to the archive and online logs.

6. Oracle client 9i or later must be installed on all CDC servers.

7. Resources required for creation of a new instance:

    1. Enough memory (> 1 GB)
    2. Enough disk storage (> 5 GB)
    3. A new port for the CDC listener (e.g., 10109 for Sprint instance SOTXP02)
    4. A new OS user for CDC (optional, e.g., tsadmin)
    5. An Oracle user for CDC (TSUSER)
    6. For a target instance (edwps013/015): a metadata schema name that is distinct for each application's database (e.g., Sprint: SOTXP02)
    7. For a source instance: a way of accessing the archive logs

Failure Management

1. When the OTS database crashes and is then recovered to a previous state, a data mismatch may be found between OTS and Clone. The data can then be recovered using one of the following options:

    a. Repeating the CDC scraping: set the CDC bookmark to the restored timestamp and allow CDC to repeat the scraping into the OTS table.
    b. Copying missing data from Clone: use this option when only a small amount of data is missing.

2. In case of a database crash or outage on the source system, CDC has to wait, and the archive log retention period may need to be increased (Wipro).

Maintenance and Operation Support

1. Check Subscription Status:

    1. Active Normal:
    The subscription is active and replication is happening properly.
    2. Inactive Normal:
    Replication is not happening because the subscription has been stopped normally.
    3. Inactive Error:
    The subscription is not active and replication is not happening. An automatic email alert is sent for any error.
    Check the error log to investigate the cause of the error.

2. Check Replication Status:

    Check that replication is happening between source and target using Management Console.

3. Check the Table Status:

    Check whether any tables, apart from the predefined parked tables, are in parked status.

4. Check the DDL Errors:

    Check for DDL errors caused by changes to a source table that leave it out of sync with the target table.
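
The subscription statuses listed above map naturally to a follow-up action, which a daily check could encode as a lookup. This is a minimal sketch: the function name, the wording of the actions and the sample subscription names are assumptions, and the statuses would in practice be read from Management Console rather than typed in.

```python
# Follow-up action per subscription status (wording is illustrative).
STATUS_ACTIONS = {
    "Active Normal": "No action: replication is running.",
    "Inactive Normal": "Stopped normally: start it again if it should be running.",
    "Inactive Error": "Check the error log to investigate the cause.",
}


def triage(statuses):
    """Return (subscription, action) pairs for everything that is not Active Normal."""
    return [
        (name, STATUS_ACTIONS.get(status, "Unknown status: check Management Console."))
        for name, status in sorted(statuses.items())
        if status != "Active Normal"
    ]


# Hypothetical subscription names and statuses, for illustration only.
observed = {
    "SUB_A": "Active Normal",
    "SUB_B": "Inactive Error",
    "SUB_C": "Inactive Normal",
}
todo = triage(observed)
```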


Weekly Cleanup Activity:


An automated script is executed for each instance from the bin path of the source and target CDC servers. It cleans up unnecessary log files, cache files, etc.
1. Stop the replication manually from Management Console.
2. Log in to the instance server (source/target).
3. Make sure you are in the bin path.
4. Execute CDC_Slow_Resolution.sh for each instance.
5. Repeat steps 2-4 for the source/target servers.
6. Start replication from Management Console.

Weekly Subscription Backup:


Take a backup of each subscription:
1. Go to MC -> Configuration -> right-click the subscription -> Export subscription -> XML file.
2. Store the XML file in /opt/dsadm/CDCBackup.

CDC Commands

A) To stop an instance
./dmshutdown -I <INSTANCE NAME> # (the dmterminate command can be used to force down any instance)
B) To start an instance
nohup ./dmts64 -I <INSTANCE NAME> &
C) To start the Access Server
nohup dmaccessserver &
D) To kill the Access Server
kill -9 <process id> # obtain the process id of the Access Server using the ps command
E) To remove all the contents from the staging store
./dmclearstagingstore -I <instance name>
F) To change the scraping point for a subscription
dmsetbookmark -I <INSTANCE_NAME> -s <SUBSCRIPTION_NAME ...> [-a]
[-a] is optional; if you use this option, the scraping point is applied to all the subscriptions of the instance.
G) To check the status of instances
./dmconfigurets
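
When these commands are driven from a script, it helps to build the argument list in one place so the instance name is always passed the same way. The helper below is a hypothetical sketch: `instance_command` and its action names are invented for this example, and it only covers the three commands above that take nothing but `-I <instance>`. It builds the command without running it, and does not add `nohup` or `&`.

```python
import shlex

# Executables taken from the command list above; the action names are made up.
INSTANCE_TOOLS = {
    "stop": "./dmshutdown",
    "start": "./dmts64",
    "clear_staging_store": "./dmclearstagingstore",
}


def instance_command(action, instance):
    """Return the argv list for an instance-level command (does not run it)."""
    if action not in INSTANCE_TOOLS:
        raise ValueError(f"unsupported action: {action}")
    if not instance:
        raise ValueError("instance name must not be empty")
    return [INSTANCE_TOOLS[action], "-I", instance]


# Render as a shell line for a runbook or log; shlex.join quotes when needed.
stop_line = shlex.join(instance_command("stop", "SOTXP02"))
clear_line = shlex.join(instance_command("clear_staging_store", "SOTXP02"))
```

The resulting list can be passed to `subprocess.run` from the instance's bin directory; keeping it as a list rather than a string avoids quoting problems with unusual instance names.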

Automated Replication Control Using Batchman

CDC replication runs continuously but pauses during the daily ETL batch load of the stage1 layer.
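
The pause rule above amounts to a time-window check. The sketch below shows that check in isolation; it is not how Batchman is configured, and the window times are hypothetical placeholders since the actual ETL schedule is not given here.

```python
from datetime import time


def in_window(now, start, end):
    """True if `now` falls in [start, end); also handles windows crossing midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def replication_should_run(now, etl_start, etl_end):
    # Replication runs at all times except during the daily ETL batch load.
    return not in_window(now, etl_start, etl_end)


# Hypothetical ETL window, for illustration only.
ETL_START = time(1, 0)
ETL_END = time(4, 30)

running_at_noon = replication_should_run(time(12, 0), ETL_START, ETL_END)
running_during_etl = replication_should_run(time(2, 0), ETL_START, ETL_END)
```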

Frequent CDC Issue and Fixes
Replication is slow or stuck at a log
• Work with the DBA if the database is down or performing slowly, since this delays archive log generation.
• In case of high transaction volume (e.g., festive season), wait for replication to complete.

Replication repeats in a loop, moving from one node to another
• Skip the archive logs that switched to another node, start from the next archive log, and set the bookmark.

Replication is lagging for one node due to a network issue, database issue, etc.
• Wait until the replication ends.

Table definition for the subscription has changed
• Recreate the table mappings for the subscription from Management Console.


DDL error
When there is a mismatch in the table structures between the source and target for a subscription, CDC throws a DDL error. Take the following actions to resolve it:
1. End replication of the erroneous instance from MC.
2. Update the ddl_restart.sh script with the name of the table for which the error is thrown.
3. Run the updated ddl_restart.sh script and check the log to confirm whether it succeeded.
4. Start the subscription.

Unable to de-allocate database transaction, Fatal Error, or Staging store corrupted:

1. Make sure the agent is up and running before proceeding to step 2.
2. Clear the staging store --> ./dmclearstagingstore -I <Instance_Name>
3. Start the subscription in Mirror Continuous mode through MC.

Access server is down

Connect to edwps013/015 (192.168.76.15) and perform the steps below to restart the Access server:
1. cd $ASHOME/bin
2. nohup dmaccessserver &
3. cat the nohup.out file to check the output.

Thank You
