Lab - Qlik Replicate Oracle To Azure Synapse
• Execute Task
Introduction
Replicate tasks manage the extraction of data from source systems or databases and its loading
into target databases. Although a single Replicate task handles only one source and one target
system, a project can contain any number of tasks. For example, a client may have three different
database or file systems in their environment (Oracle, SAP HANA, and IBM VSAM files) and want to
extract all of that data into a centralized database environment (Microsoft Synapse) for analysis.
In this scenario, three Replicate tasks would be required.
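The one-source, one-target rule can be pictured with a small model. This is only an illustrative sketch: `ReplicateTask` and `plan_tasks` are hypothetical names invented here, not part of any Qlik Replicate API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicateTask:
    """Hypothetical model: a task pairs exactly one source with one target."""
    name: str
    source: str
    target: str


def plan_tasks(sources, target):
    """Return one task per source system, all loading into the same target."""
    return [ReplicateTask(f"{src} to {target}", src, target) for src in sources]


# The three source systems from the example above, one shared target.
tasks = plan_tasks(["Oracle", "SAP HANA", "IBM VSAM"], "Microsoft Synapse")
for task in tasks:
    print(task.name)
```

Each additional source system adds one more task to the list; the target can be reused across all of them.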
Those who have worked with other data extraction tools may be surprised by how seamlessly
Qlik Replicate performs these processes.
In this lab, you will configure Qlik Replicate to extract data from an Oracle database and load it
into Microsoft Synapse. The required steps are below.
1. Ensure that the URL for the Replicate Server is available and that access has been granted.
- This will be provided by your Systems Administrator.
6. Enter a meaningful Endpoint Name and Description for the Endpoint Connector.
You will notice as we proceed that the content of the configuration window is context sensitive.
Server:
Port:
User:
Password:
Security/SSL Mode:
Look for the “Test Connection succeeded” message. Any other message means something may be
incorrect with your Server/Database definitions, or the Server/Database is unavailable.
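When the test fails, it can help to first rule out basic network reachability from a machine on the same network as the Replicate Server. The sketch below is a plain TCP check using the Python standard library. It is not what Replicate's Test Connection does internally, and it says nothing about credentials or SSL settings; the function name `can_reach` is made up for this example.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `can_reach("oracle.example.com", 1521)` checks a hypothetical host on port 1521, the usual default for an Oracle listener. Substitute the Server and Port values entered in the endpoint definition.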
3. Enter a meaningful Endpoint Name and Description for the Endpoint Connector.
6. Select Save.
1. A source endpoint.
2. A target endpoint.
• Unidirectional
• Full Load: enabled (Blue highlight is enabled; Select to enable / disable.)
• Apply Changes: enabled (Blue highlight is enabled; Select to enable / disable.)
• Store Changes: disabled (Blue highlight is enabled; Select to enable / disable.)
5. Select OK.
- This closes the New Task dialog box.
Once completed, the following window will appear:
We will now assign the newly created endpoints, source (Oracle) and target (Azure Synapse), to the
task so that Replicate can extract and load the data as defined.
2. Locate the Source Endpoint created above or one which meets your Source definitions.
5. Locate the Target Endpoint created above or one which meets your Target definitions.
7. Select Save.
Replicate is ready to extract the data from Oracle into Azure Synapse.
Steps
3. Enter HR in Schema.
4. Select Search.
A list of available files/tables will appear.
9. Select OK.
That completes configuration of the task. We are now ready to save our task and run it.
• If this is not the first time this Task is being used to extract data, Reload Target must be used;
Start Processing will not be an option.
is DML activity running in the background. Select the Change Processing tab to
see it in action.
Note: Changes to the tables occur somewhat randomly in the background. You may
need to wait a few minutes before you will see changes appear in the tables that we
selected.
4. Open Azure Data Studio and select Create new Connection.
When you have seen enough, you can declare Victory! for this part of the Test Drive.
7. Press Stop in the top left corner of the Replicate console to end the task.
9. Close the Oracle to Azure Synapse tab or select the TASKS tab to return to the main window.
There are two new Oracle target advanced properties, similar to the options
exposed by the SQL Loader:
bulkUseParallel - default true – Use Parallel hint for bulk DML statements
Task settings:
"batch_apply_use_direct_insert"=true
"batch_apply_direct_insert_min_events"=100 (default)
notes:
An option was added to the provider syntax to support append INSERT statements
when working in bulk apply. This causes Oracle to use Direct Path when
inserting rows from the net changes table into the target table and should
therefore give a boost when there are many inserts, especially with Exadata.
To make the target Oracle endpoint use append INSERT statements in bulk apply,
add the following to the connection string:
$info.query_syntax.bulk_insert_syntax=USE_APPEND
Please try to export the repository, play with the following parameters and
import back:
"common_settings": {
"lob_max_size": 1,
"change_table_settings": {
"table_suffix": "__ct",
"column_prefix": "header__"
},
"audit_table_settings": {
"table_name": ""
},
"stream_buffers_number" : 5,
"stream_buffer_size" : 20
· "stream_buffers_number" : 3
· "stream_buffer_size" : 8 (MB)
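The export, edit, import cycle for these two settings can be scripted. The sketch below works on a trimmed-down, hypothetical stand-in for the exported "common_settings" block (a real export contains many more keys). It assumes "stream_buffer_size" is expressed in MB, as the "(MB)" note suggests, and treats number times size as a rough estimate of the buffer memory involved.

```python
import json

# Hypothetical, trimmed-down stand-in for part of an exported repository file.
exported = """
{
  "common_settings": {
    "lob_max_size": 1,
    "stream_buffers_number": 5,
    "stream_buffer_size": 20
  }
}
"""


def set_stream_buffers(doc: dict, number: int, size_mb: int) -> int:
    """Update both settings in place and return number * size_mb (rough MB estimate)."""
    settings = doc["common_settings"]
    settings["stream_buffers_number"] = number
    settings["stream_buffer_size"] = size_mb
    return number * size_mb


doc = json.loads(exported)
current = doc["common_settings"]
before = current["stream_buffers_number"] * current["stream_buffer_size"]
after = set_stream_buffers(doc, 3, 8)

print(f"estimated stream buffer memory: {before} MB -> {after} MB")
print(json.dumps(doc, indent=2))  # text to paste back before importing
```

Changing one value at a time between test runs makes it easier to see which setting actually affects throughput.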
5. Bi Directional privileges In an Oracle target, when the target bi-directional schema is different from the
Replicate user account, you have two options:
6. Set varchar(char n) in target database To enable the fix, add "charLengthSemantics" to the target
connection string.
Values: DEFAULT, CHAR, BYTE
BYTE (default): create target tables with VARCHAR (XX BYTE)
CHAR: create target tables with VARCHAR (XX CHAR)
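The difference between the two semantics matters as soon as the data contains multi-byte characters. The sketch below illustrates it in Python, assuming the target database stores text as UTF-8 (for example AL32UTF8); `fits` is a hypothetical helper, not a Replicate or Oracle function.

```python
def fits(value: str, limit: int, semantics: str) -> bool:
    """Check whether value fits a VARCHAR(limit BYTE) or VARCHAR(limit CHAR) column."""
    if semantics == "CHAR":
        # CHAR semantics: the limit counts characters.
        return len(value) <= limit
    if semantics == "BYTE":
        # BYTE semantics: the limit counts encoded bytes (UTF-8 assumed here).
        return len(value.encode("utf-8")) <= limit
    raise ValueError(f"unknown semantics: {semantics}")


sample = "Zürich"  # 6 characters, but 7 bytes in UTF-8 because of the umlaut
print(fits(sample, 6, "CHAR"))
print(fits(sample, 6, "BYTE"))
```

A value that fills a source column exactly can therefore overflow a target column created with BYTE semantics, which is the situation the CHAR setting is meant to avoid.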
8. createNetChangesTempTable
9. EmptyStringValue By default, an Oracle target changes an empty string on the source to a space.
If an empty string is preferred (which will become a NULL in Oracle), it cannot be
entered directly in the GUI, because the GUI interprets an empty field as an attempt to
revert to the default.
- Edit the JSON: look for 'Hello World' and replace it with nothing, leaving
"emptyStringValue": ""
- Save.
- Import task.
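The JSON edit can also be done with a few lines of Python instead of a text editor. This is a minimal sketch: the one-line `exported_text` is a hypothetical stand-in for the exported task file, and it assumes 'Hello World' was previously entered as a placeholder value, as the steps above imply.

```python
import json

# Hypothetical stand-in for the relevant part of an exported task definition.
exported_text = '{"emptyStringValue": "Hello World"}'


def blank_placeholder(text: str, placeholder: str = "Hello World") -> str:
    """Replace the quoted placeholder with an empty JSON string."""
    edited = text.replace(f'"{placeholder}"', '""')
    json.loads(edited)  # raises ValueError if the edit broke the JSON
    return edited


result = blank_placeholder(exported_text)
print(result)
```

Parsing the edited text before importing it catches accidental damage to the file, such as a removed quote.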
10. max_transaction_size This is the JSON variable behind the full-load tuning setting "Commit rate
during full load:". Contrary to that description, it is also in effect for CDC.
Values like that will cause the task to use excessive memory (28 GB !?), as visible
in the Windows Task Manager and in the new (5.2) "memory_usage_kb" property in
gettaskstatus. With REPCTL GETSNAPSHOT I observed
the ORACLE_CDC pool ORACLE_STREAM_COMPONENT grow linearly with this:
10K --> 41033728 bytes, 100K --> 410394624 bytes, 1M --> 4104003584 bytes.
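The linear growth is easy to confirm from the three measurements. The sketch below divides each observed pool size by the corresponding setting, reading the left-hand figures (10K, 100K, 1M) as max_transaction_size values, which is an interpretation of the note above rather than something it states explicitly.

```python
# Observed ORACLE_STREAM_COMPONENT pool sizes in bytes, keyed by the setting value.
observations = {
    10_000: 41_033_728,
    100_000: 410_394_624,
    1_000_000: 4_104_003_584,
}

# Bytes of pool memory per unit of the setting, for each observation.
per_unit = {size: pool_bytes / size for size, pool_bytes in observations.items()}

for size, ratio in per_unit.items():
    print(f"{size:>9,} -> {ratio:,.1f} bytes per unit")
```

All three ratios come out at roughly 4.1 KB per unit, so multiplying the setting by about 4 KB gives a quick estimate of the pool size to expect.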