DataStage Dynamic Relational Stage Guide
Dynamic Relational Stage (DRS) reads data from any DataStage stage
and writes it to one of the supported relational databases. It also reads
data from any of the supported relational databases and writes it to
any DataStage stage. It supports the following relational databases:
DB2/UDB, Informix, Microsoft SQL Server, Oracle, and Sybase. It also
supports a generic ODBC interface. Version 1.0 of DRS is compatible
with Ascential DataStage Release 7.5.1.
Audience
This guide is intended for DataStage designers who create or modify
jobs that use DRS.
Related Documentation
For documentation on other Ascential products and third-party
products as it relates to DRS, refer to the following tables.
Guide                                Description
Ascential DataStage Designer Guide   General principles for designing jobs
Third-Party Documentation
Conventions
Convention      Used for…
bold            Field names, button names, menu items, and
                keystrokes. Also used to indicate filenames, and
                window and dialog box names.
user input      Information that you need to enter as is.
code            Code examples.
variable or     Placeholders for information that you need to enter.
<variable>      Do not type the greater-/less-than brackets as part
                of the variable.
Contacting Support
To reach Customer Care, please refer to the information below:
Call toll-free: 1-866-INFONOW (1-866-463-6669)
Email: [email protected]
Ascential Developer Net: http://developernet.ascential.com
Please consult your support agreement for the location and
availability of customer support personnel.
To find the location and telephone number of the nearest Ascential
Software office outside of North America, please visit the Ascential
Software Corporation website at http://www.ascentialsoftware.com.
Chapter 1
The DRS Stage
Functionality 1-2
Configuration Requirements 1-4
General Requirements 1-4
Configuration Requirements for DB2/UDB 1-4
Configuration Requirements for Informix 1-5
Configuration Requirements for Oracle 8i 1-6
Configuration Requirements for Oracle 9i 1-9
Configuration Requirements for Sybase 1-10
Installing the Stage 1-10
Defining the DRS Connection 1-10
Connecting to a Database 1-11
Defining Character Set Mapping 1-12
Defining Input Data 1-13
About the Input Page 1-13
Reject Row Handling 1-21
Appendix A
Best Practice for Handling Long Data Types in DataStage Jobs
Introduction A-1
Recommended Job Design A-1
The Batch Job A-2
Minimum Job Parameters A-2
Template Code A-3
Other Job Parameters A-5
The Preliminary Job A-6
Specifying Column Length A-7
Extracting the Length A-7
Identifying the Name of the File A-8
Determining Optimal Array Size and Maximum Length of Data A-8
The ETL Job A-9
Using Description as an Override to Specify Precision A-9
Managing Failures A-10
DRS lets you rapidly and efficiently prepare and load streams of
tabular data from any DataStage stage (for example, the ODBC stage,
the Sequential File stage, and so forth) to and from tables of the target
database. The Oracle database client on Windows or UNIX, for
example, uses SQL*Net to access an Oracle server on Windows or UNIX.
Each DRS stage is a passive stage that can have any number of input,
output, and reference output links:
Input links specify the data you are writing, which is a stream of
rows to be loaded into a database. You can specify the data on an
input link using an SQL statement constructed by Ascential
DataStage or a user-defined SQL statement.
Output links specify the data you are extracting, which is a stream
of rows to be read from a database. You can also specify the data
on an output link using an SQL statement constructed by
Ascential DataStage or a user-defined SQL statement.
Each reference output link represents a row that is read from the
database by key (that is, it reads the record using the key field in
the WHERE clause of the SQL SELECT statement).
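The key read described above can be sketched as SQL generation: given a table, the selected columns, and the key columns, the stage builds a SELECT with a parameterized WHERE clause. The following Python sketch is illustrative only; the function and column names are not the stage's internal API:

```python
def build_key_read(table, columns, keys):
    """Build the SELECT a reference output link would issue: the key
    fields become the parameterized WHERE clause."""
    select_list = ", ".join(columns)
    where = " AND ".join(f"{k} = ?" for k in keys)
    return f"SELECT {select_list} FROM {table} WHERE {where}"

# One lookup per reference-link row (hypothetical table and columns):
sql = build_key_read("CUSTOMER", ["CUST_ID", "NAME", "CITY"], ["CUST_ID"])
print(sql)  # → SELECT CUST_ID, NAME, CITY FROM CUSTOMER WHERE CUST_ID = ?
```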
Functionality
DRS has the following functionality and benefits:
Support for the following relational databases:
– DB2/UDB
– Informix
– Microsoft SQL Server
– Oracle
– Sybase
Support for generic ODBC.
Support for transaction grouping to control a group of input links
from a Transformer stage. This lets you write a set of data to a
database in one transaction. DRS opens one database session per
transaction group.
Support for reject row handling. Link reject variables tell the
Transformer stage the DBMS error code when an error occurs in
the DRS stage for insert, update, and so forth, for control of job
execution. For more information on how the Transformer stage
works with link reject variables, see Ascential DataStage Server
Job Developer’s Guide.
Support for some specific LOB data types. See the following table:
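The transaction grouping listed above, where the rows from a group of input links are written to the database in one transaction over a single session, can be sketched in Python. Here sqlite3 stands in for the target DBMS, and the function name is an illustration, not the stage's implementation:

```python
import sqlite3

def write_transaction_group(conn, link_rows):
    """Write the rows from every link in the group inside a single
    transaction: one failure rolls back the whole group."""
    cur = conn.cursor()
    try:
        for table, rows in link_rows.items():
            for row in rows:
                placeholders = ", ".join("?" for _ in row)
                cur.execute(f"INSERT INTO {table} VALUES ({placeholders})", row)
        conn.commit()          # the whole group commits together
    except sqlite3.Error:
        conn.rollback()        # or rolls back together
        raise

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amt REAL)")
conn.execute("CREATE TABLE audit (id INTEGER, note TEXT)")
write_transaction_group(conn, {
    "orders": [(1, 9.99), (2, 4.50)],
    "audit":  [(1, "loaded")],
})
print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0])  # → 2
```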
Configuration Requirements
General Requirements
For general configuration requirements, see Ascential DataStage
Plug-In Installation and Configuration Guide.
Include the following driver settings in the .odbc.ini file:
CancelDetectInterval=0
TrimBlankFromIndexName=1
ApplicationUsingThreads=1
The following example shows sample DSN entries for Tru64 and
Linux platforms for Ascential DataStage 7.5.1:
[hpds_stores]
Driver=/u1/informix/lib/cli/iclis09b.so
Description=INFORMIX 3.3 32-BIT
Database=<stores7>
LogonID=<userid>
pwd=<password>
Servername=<hpds.1>
CursorBehavior=0
CLIENT_LOCALE=en_us.8859-1
DB_LOCALE=en_us.8859-1
TRANSLATIONDLL=/u1/informix/lib/esql/igo4a304.so
Every Informix data source to which your jobs connect must have an
entry in the .odbc.ini file. The only required fields in the data source
specification are the Database and Server name. If you choose to
include the login ID (UID) and/or password (PWD), you can leave the
User Name and Password properties blank. If you enter values for
these properties, the values in .odbc.ini are ignored. For more
information on the format of the .odbc.ini file, see Informix CLI
Programmer’s Manual.
You can also use this .odbc.ini file for other ODBC applications
including jobs using the ODBC stage.
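A data-source entry in .odbc.ini is an ini-style section whose name is the DSN, as in the sample above. The following sketch reads one entry with Python's configparser; the file content and field names mirror the example, and are assumptions rather than a full .odbc.ini grammar:

```python
import configparser
from io import StringIO

ODBC_INI = """\
[hpds_stores]
Database=stores7
Servername=hpds.1
LogonID=informix
"""

def read_dsn(text, dsn):
    """Read one data-source entry: each [section] is a DSN, each
    key=value line a driver keyword."""
    cp = configparser.ConfigParser()
    cp.optionxform = str      # keep keyword case as written in the file
    cp.read_file(StringIO(text))
    return dict(cp[dsn])

entry = read_dsn(ODBC_INI, "hpds_stores")
print(entry["Database"], entry["Servername"])  # the two required fields
```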
UNIX Platforms
The following sections specify information about library
requirements, depending on your platform.
Solaris.
A one-time site linking to build a replacement Oracle client shared
library is required for Oracle Clients 8.0.3, 8.0.4, 8.0.5, 8.1.6, and 8.1.7
on Solaris. This site linking binds your unique Oracle client
configuration into the file that is used by the Oracle OCI 8i stage to
access local and remote Oracle databases.
Before you build the Oracle client shared library, install Oracle, and set
the environment variable ORACLE_HOME to the directory where you
installed Oracle.
In order to build the library, copy the genclnt.tar file, which is located
at the root level of the DataStage installation media, into a directory
on the local hard disk and extract it before running the script.
Use the following commands to build the shared library:
# cp /cdrom/genclnt.tar .
# tar -xvf genclnt.tar
# cd solaris
# ./genclntsh8 (or ./genclntsh816 or ./genclntsh817)
Tru64
A one-time site linking to build a replacement Oracle client shared
library is required for Oracle Client 8.1.6, and 8.1.7 on Tru64. This is
required to expose the necessary direct path loading API symbols.
Before you build the Oracle shared library, install Oracle and set the
environment variable ORACLE_HOME to the directory where you
installed Oracle.
In order to build the library, you must copy the GENCLNT.TAR;1 file,
which is located at the root level of the DataStage installation media,
into a directory on the local hard disk and extract it before running the
script.
Use the following commands to build the shared library for version
8.1.6:
# cp "/cdrom/GENCLNT.TAR;1" ./genclnt.tar
# tar -xvf genclnt.tar
# cd tru64
# ./genclntsh816
Use the following commands to build the shared library for version
8.1.7:
# cp "/cdrom/GENCLNT.TAR;1" ./genclnt.tar
# tar -xvf genclnt.tar
# cd tru64
# ./genclntsh817
HP-UX 11
A one-time site linking to build a replacement Oracle client shared
library is required for Oracle Client 8.1.6 and 8.1.7 on HP-UX 11. This
site linking binds your unique Oracle client configuration into the file
that is used by the Oracle OCI 8i stage to access local and remote
Oracle databases.
Note Oracle 8.0.n is not supported for HP-UX.
Before you build the Oracle client shared library:
1 Install Oracle
2 Set the ORACLE_HOME environment variable to the directory
where you installed Oracle
3 Set the SHLIB_PATH shared library path environment variable to
reference $ORACLE_HOME/lib.
To build the library, copy the GENCLNT.TAR;1 file, which is located at
the root level of the DataStage installation media, into a directory on
the local hard disk and extract it before running the script.
Use the following sequence of commands to build the shared library
for version 8.1.6:
# cp "/cdrom/GENCLNT.TAR;1" ./genclnt.tar
# tar -xvf genclnt.tar
# cd hpux
# ./genclntsh816 (or ./genclntsh817)
Use the same sequence of commands to build the shared library for
version 8.1.7, replacing 816 with 817 in the last command.
AIX 4.3
You must set the LINK_CNTRL environment variable to the value
L_PTHREADS_D7 before installing the Oracle 8.0 client software to
prevent a thread-library incompatibility between Oracle 8.0 and
AIX 4.3.
Otherwise, you must rerun the genclntsh script provided by Oracle to
generate a new shared library after setting this environment variable
to the proper value. In particular, you must set this environment
variable before:
Installing 8.0.3 on AIX 4.3
Upgrading an existing version of Oracle to Release 8.0.3
Relinking executables
Installing new Oracle patches
The Oracle 8.0.5 Installation Guide contains the same text in the
“Special Considerations for AIX 4.3” section. This guide can also be
found on the Oracle web site referenced earlier, as can the Oracle8
Installation Guide for AIX-Based Systems. You must also set this
environment variable before relinking any applications that link
against Oracle 8.0.3 libraries.
Install Oracle, and set the environment variable ORACLE_HOME to the
directory where you installed Oracle. On AIX 4.3, set the environment
variable LINK_CNTRL to the value L_PTHREADS_D7.
Input. This page is displayed only if you have an input link to this
stage. It specifies the SQL table to use and the associated column
definitions for each data input link. This page also specifies the
type of update action and transaction isolation level information
for concurrency control and performance tuning. It also contains
the SQL statement used to write the data and lets you enable case
sensitivity for SQL statements.
Output. This page is displayed only if you have an output link to
this stage. It specifies the SQL tables to use and the associated
column definitions for each data output link. This page also
specifies the type of query and transaction isolation level
information for concurrency control and performance tuning. It
also contains the SQL SELECT statement used to extract the data,
and lets you enable case sensitivity for SQL statements.
To edit a DRS stage from the DRS stage dialog box:
1 Define the connection (see the following section).
2 Optional. Define a character set map (see "Defining Character Set
Mapping" on page 1-12).
3 Define the data on the input links (see "Defining Input Data" on
page 1-13).
4 Define the data on the output links (see "Defining Output Data" on
page 1-24).
Connecting to a Database
Set the connection parameters on the General tab on the Stage page
of the GUI. To connect to one of the supported databases:
1 Identify the Database type by selecting a database management
system from the DBMS Type drop-down list. Alternatively select
Use Job Parameter…. Provide the name of the job parameter in
the box provided in the Use Job Parameter dialog box.
Database type is required.
2 Enter the name of the database alias to access in the Connection
name field.
– DB2/UDB. The database name is configured with the DB2
Client Configuration Assistant. The database name is all that is
needed to determine the location of the target database on the
network.
– Informix. This is the Informix data source.
Windows. Define using the ODBC Administrator.
HP-UX 11.0 UNIX platform. Define data sources (DSN) for
Informix databases in the .odbc.ini file.
General Tab
This tab is displayed by default. It contains the following fields:
Table name. This required field is editable when the update
action is not User-defined SQL (otherwise, it is read-only). It is
the name of the target database table the data is written to, and
the table must exist or be created by choosing generate DDL
from the Create table action list. You must have insert, update,
or delete privileges, depending on input mode. You must specify
Table name if you do not specify User-defined SQL. There is no
default.
Click … (Browse button) to browse the Repository to select the
table.
Update action. Specifies which SQL statements are used to
update the target table. Some update actions require key columns
to update or delete rows. Choose the option you want from the
list:
– Clear table then insert rows. Deletes the contents of the
table and adds the new rows, with slower performance
because of transaction logging. This is the default value.
– Truncate table then insert rows. Truncates the table with
no transaction logging and faster performance. For DB2/UDB
and Informix, this option is the same as Clear table then
insert rows.
– Insert rows without clearing. Inserts the new rows in the
table.
– Delete existing rows only. Deletes existing rows in the
target table that have identical keys in the source files.
– Replace existing rows completely. Deletes the existing
rows, then adds the new rows to the table.
– Update existing rows only. Updates the existing data rows.
Any rows in the data that do not exist in the table are ignored.
– Update existing rows or insert new rows. Updates the
existing data rows before adding new rows. It is faster to
update first when you have a large number of records.
– Insert new rows or update existing rows. Inserts the new
rows before updating existing rows. It is faster to insert first if
you have only a few records.
– Insert new rows only. Inserts the new rows in the table but
does not report duplicate rows to the log.
– Truncate only. Ignores any incoming data and truncates the
target table.
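A few of the update actions above can be sketched as the shapes of the SQL statements they generate. The Python below is illustrative; the exact SQL the stage emits depends on the database and the column metadata:

```python
def statements_for(update_action, table, columns, keys):
    """Sketch of the statement sequences behind some update actions:
    the order of INSERT versus UPDATE follows the action name."""
    sets = ", ".join(f"{c} = ?" for c in columns if c not in keys)
    where = " AND ".join(f"{k} = ?" for k in keys)
    insert = (f"INSERT INTO {table} ({', '.join(columns)}) "
              f"VALUES ({', '.join('?' for _ in columns)})")
    update = f"UPDATE {table} SET {sets} WHERE {where}"
    actions = {
        "Insert rows without clearing": [insert],
        "Update existing rows only": [update],
        "Update existing rows or insert new rows": [update, insert],
        "Insert new rows or update existing rows": [insert, update],
    }
    return actions[update_action]

stmts = statements_for("Update existing rows or insert new rows",
                       "EMP", ["ID", "NAME"], ["ID"])
print(stmts[0])  # the UPDATE is tried first for this action
```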
Columns Tab
This tab contains the column definitions for the data written to the
table.
The column definitions appear in the same order as in the Columns
grid:
The Columns tab behaves the same way as the Columns tab in the
ODBC stage. For a description of how to enter and edit column
definitions, see Ascential DataStage Designer Guide.
SQL Tab
The SQL tab contains the Generated, User-defined, Before, After,
Generated DDL, and User-defined DDL tabs.
Use these tabs to display the stage-generated SQL statement and the
SQL statement that you can enter.
Note For information about SQL Meta Tags, see "SQL Meta Tags"
on page 1-31.
Before. Contains the SQL statements executed before the stage
processes any job data rows.
The SQL statement on the Before tab is the first SQL statement to
be executed. Depending on your choice, the job can continue or
abort after failing to execute a Before statement. It does not affect
the transaction grouping scheme. The commit or rollback is
performed on a per-link basis.
Each SQL statement is executed as a separate transaction if the
statement separator is a double semi-colon ( ;; ). All SQL
statements are executed in a single transaction if a semi-colon ( ; )
is the separator.
If the property value begins with FILE=, the remaining text is
interpreted as a pathname, and the contents of the file supplies
the property value.
Note For information about SQL Meta Tags, see "SQL Meta
Tags" on page 1-31.
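The two rules above, splitting statements into transactions on a double semi-colon and resolving a property that begins with FILE=, can be sketched in Python. This is a reading of the behavior described here, not the stage's parser:

```python
import tempfile

def resolve_property(value):
    """If the property begins with FILE=, the remaining text is a
    pathname whose contents supply the actual property value."""
    if value.startswith("FILE="):
        with open(value[5:]) as f:
            return f.read()
    return value

def transaction_batches(sql_text):
    """Double semi-colons separate independent transactions; with only
    single semi-colons everything runs as one transaction."""
    if ";;" in sql_text:
        return [s.strip() for s in sql_text.split(";;") if s.strip()]
    return [sql_text.strip()]

print(transaction_batches("DELETE FROM t;; DROP TABLE u"))  # two transactions
print(transaction_batches("DELETE FROM t; DELETE FROM u"))  # one transaction

# FILE= resolution, demonstrated with a temporary file:
with tempfile.NamedTemporaryFile("w", suffix=".sql", delete=False) as f:
    f.write("DELETE FROM staging")
    path = f.name
resolved = resolve_property("FILE=" + path)
print(resolved)
```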
Treat errors as non-fatal. If selected, errors caused by Before SQL
are logged as warnings, and processing continues with the next
command batch. Each separate execution is treated as a separate
transaction. If cleared, errors are treated as fatal to the job, and
result in a transaction rollback. The transaction is committed only
if all statements successfully execute.
The SQL statement on the After tab is the last SQL statement to
be executed. Depending on your choice, the job can continue or
abort after failing to execute an After SQL statement. It does not
affect the transaction grouping scheme. The commit or rollback is
performed on a per-link basis.
Each SQL statement is executed as a separate transaction if the
statement separator is a double semi-colon ( ;; ). All SQL
statements are executed in a single transaction if a semi-colon ( ; )
is the separator.
If the property value begins with FILE=, the remaining text is
interpreted as a pathname, and the contents of the file supplies
the property value.
The behavior of Treat errors as non-fatal is the same as for Before.
Note For information about SQL Meta Tags, see "SQL Meta
Tags" on page 1-31.
You can reprocess rejected rows from the input source once you
resolve the issues with the offending row values.
The following sections describe the differences when you use SQL
SELECT statements for generated or user-defined queries that you
define on the Output page in the DRS stage dialog box of the GUI.
General Tab
This tab is displayed by default. It contains the following parameters:
Table names (SQL FROM clause). Specifies the name of the
source table. Use comma-separated table names to indicate an
inner join. Use outer-join SQL Meta Tags to define an outer join.
Browse. Allows you to work with the DataStage Repository to
obtain column information.
Transaction Isolation. Specifies the transaction isolation levels
that provide the necessary consistency and concurrency control
between transactions in the job and other transactions for optimal
performance. For more information on using these levels, see
your database documentation. Use one of the following
transaction isolation levels:
– Read Uncommitted. Takes exclusive locks on modified data.
These locks are held until a commit or rollback is executed.
However, other transactions can still read but not modify the
uncommitted changes. No other locks are taken.
– Read Committed. Takes exclusive locks on modified data and
sharable locks on all other data. This is the default. Exclusive
locks are held until a commit or rollback is executed.
Uncommitted changes are not readable by other transactions.
Shared locks are released immediately after the data has been
processed, allowing other transactions to modify it.
– Repeatable Read. Identical to serializable except that
phantom rows may be seen.
– Serializable. Takes exclusive locks on modified data and
sharable locks on all data. All locks are held until a commit or
rollback is executed, preventing other transactions from
modifying any data that has been referenced during the
transaction.
Note Not all platforms support all four levels of transaction
isolation, and those that do have different names for
them. DRS maps the value for the property to the
closest supported value defined for the database
platform selected. See the applicable database
documentation to determine what levels are supported.
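The closest-supported mapping the Note describes can be sketched as a lookup that prefers an equal or stricter level before falling back to a weaker one. The per-platform lists below are assumptions for illustration; check your database documentation for the levels it actually supports:

```python
# Illustrative supported-level lists; verify against your DBMS docs.
SUPPORTED = {
    "oracle": ["Read Committed", "Serializable"],
    "db2udb": ["Read Uncommitted", "Read Committed",
               "Repeatable Read", "Serializable"],
}
ORDER = ["Read Uncommitted", "Read Committed",
         "Repeatable Read", "Serializable"]

def closest_isolation(dbms, requested):
    """Map the requested level to the closest one the platform
    supports, preferring the same or a stricter level."""
    levels = SUPPORTED[dbms]
    idx = ORDER.index(requested)
    for candidate in ORDER[idx:]:            # same or stricter first
        if candidate in levels:
            return candidate
    for candidate in reversed(ORDER[:idx]):  # otherwise nearest weaker
        if candidate in levels:
            return candidate

print(closest_isolation("oracle", "Repeatable Read"))  # → Serializable
```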
Array size. Specifies the number of rows read from the database
at a time. Enter a positive integer to indicate the number of rows
to prefetch in one call. The default value 1 means that prefetching
is turned off.
Larger numbers use more memory on the client to cache the
rows. This minimizes server round trips and maximizes
performance.
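Array-style fetching can be sketched with a database cursor's fetchmany: rows come back in batches of the array size, trading client memory for fewer round trips. Here sqlite3 stands in for the source database:

```python
import sqlite3

def read_in_arrays(conn, sql, array_size):
    """Fetch rows array_size at a time: larger arrays mean fewer
    round trips at the cost of more client-side memory."""
    cur = conn.cursor()
    cur.execute(sql)
    while True:
        batch = cur.fetchmany(array_size)
        if not batch:
            break
        yield batch

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10)])
batches = list(read_in_arrays(conn, "SELECT n FROM t", 4))
print([len(b) for b in batches])  # → [4, 4, 2]
```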
Columns Tab
The Columns tab behaves the same way as the Columns tab in
the ODBC stage, and it specifies which columns are aggregated. For a
description of how to enter and edit column definitions, see Ascential
DataStage Designer Guide.
The column definitions for output links contain a key field. Key fields
are used to join primary and reference inputs to a Transformer stage.
For a reference output link, the DRS key reads the data by using a
WHERE clause in the SQL SELECT statement. For details on how key
fields are specified and used, see Ascential DataStage Designer
Guide.
The Derivation cell on the Columns tab contains fully-qualified
column names when table definitions are loaded from the DataStage
Repository. If the Derivation cell has no value, DRS uses only the
column names to generate the SELECT statement displayed in the
Generated tab of the SQL tab. Otherwise, it uses the content of the
Derivation cell. Depending on the format used in the Repository, the
format is ownername.tablename.columnname or tablename.columnname.
The column definitions for reference links require a key field. Key
fields join reference inputs to a Transformer stage. DRS key reads the
data by using a WHERE clause in the SQL SELECT statement.
Selection Tab
Use this tab to build column-generated SQL queries. It contains
optional SQL clauses for the conditional extraction of data.
SQL Tab
The SQL tab contains the Generated, User-defined, Before, and
After tabs.
Use these tabs to display the stage-generated SQL statement and the
SQL statement that you can enter.
Generated. Displays the SQL statements that read data from a
database. You cannot edit these statements, but you can use
Copy to copy them to the Clipboard for use elsewhere.
User-defined. Contains the user-defined SQL SELECT statement
executed to read data from a database. With this choice, you can
edit the SQL statements.
Note For information about SQL Meta Tags, see "SQL Meta
Tags" on page 1-31.
Before. Contains the SQL statements executed before the stage
processes any job data rows.
The Before SQL statement is the first SQL statement to be
executed, and you can specify whether the job continues or aborts
after failing to execute a Before SQL statement. The
commit/rollback is performed on a per-link basis.
If the property value begins with FILE=, the remaining text is
interpreted as a pathname, and the contents of the file supply the
property value.
Note For information about SQL Meta Tags, see "SQL Meta
Tags" on page 1-31.
After. Contains the After SQL statement executed after the stage
processes any job data rows.
In this case, the meta tag is not like a simple macro. The meta tag
expansion inserts the join condition in front of the existing WHERE
condition, if any, with an AND, putting the join condition and the
existing WHERE condition each in parentheses. The expansion
process then scans the resulting composite WHERE condition for
references to fields of the table on the right side of the left outer join.
After each of these references, it inserts a “(+).”
Warning There are special restrictions for using outer join SQL
Meta Tags with Oracle 8i databases:
In order for the meta tag expansion process to identify the field
references, you must fully qualify all field references in join
conditions and in the WHERE condition.
You can nest joins only on the left, as in
%LeftJoin (%LeftJoin (Table1,,Table2,,Table1.x = Table2.x),, Table3,,
Table2.q = Table3.q)
Aliases
You can provide aliases for the tables in the second and fourth
parameters. For example, in SQL-92
SELECT Table1Alias.x,Table1Alias.y
FROM %LeftJoin (Table1, Table1Alias, Table2, Table2Alias,
Table1Alias.x = Table2Alias.x)
WHERE Table1Alias.y > 2
produces
SELECT Table1Alias.x,Table1Alias.y
FROM Table1 Table1Alias LEFT OUTER JOIN Table2 Table2Alias ON
Table1Alias.x = Table2Alias.x
WHERE Table1Alias.y > 2
In the example above, you need RTRIM on SQL Server and DB2/UDB.
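The SQL-92 expansion shown above can be sketched as simple string assembly. This is only the alias-and-ON-clause form from the example; the Oracle 8i (+) rewriting described earlier is not modeled here:

```python
def left_join(left, left_alias, right, right_alias, condition):
    """Expand a %LeftJoin meta tag into SQL-92 LEFT OUTER JOIN syntax,
    as in the example above (aliases may be empty strings)."""
    lhs = f"{left} {left_alias}".strip()
    rhs = f"{right} {right_alias}".strip()
    return f"{lhs} LEFT OUTER JOIN {rhs} ON {condition}"

from_clause = left_join("Table1", "Table1Alias", "Table2", "Table2Alias",
                        "Table1Alias.x = Table2Alias.x")
print(from_clause)
```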
Conversion
Ascential DataStage uses a conversion of YYYY-MM-DD HH24:MI:SS
when reading or writing an Oracle date. If the DataStage data type is
Timestamp, DataStage uses the to_date function for this column
when it generates the INSERT statement to write an Oracle date. If the
DataStage data type is Timestamp or Date, Ascential DataStage uses
the to_char function for this column when it generates the SELECT
statement to read an Oracle date.
The following example creates a table with a DATE data type on an
Oracle server. The imported DataStage data type is Timestamp.
create table dsdate (one date);
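The conversion described above can be sketched as statement generation: Timestamp columns are wrapped in TO_DATE on the INSERT and in TO_CHAR on the SELECT, using the YYYY-MM-DD HH24:MI:SS format model. These are illustrative SQL shapes, not the stage's exact generated text:

```python
FMT = "YYYY-MM-DD HH24:MI:SS"

def oracle_insert(table, columns, date_columns):
    """Wrap Timestamp columns in TO_DATE when generating the INSERT."""
    values = [f"TO_DATE(?, '{FMT}')" if c in date_columns else "?"
              for c in columns]
    return (f"INSERT INTO {table} ({', '.join(columns)}) "
            f"VALUES ({', '.join(values)})")

def oracle_select(table, columns, date_columns):
    """Wrap date columns in TO_CHAR when generating the SELECT."""
    items = [f"TO_CHAR({c}, '{FMT}')" if c in date_columns else c
             for c in columns]
    return f"SELECT {', '.join(items)} FROM {table}"

# Using the dsdate table from the example above:
print(oracle_insert("dsdate", ["one"], {"one"}))
print(oracle_select("dsdate", ["one"], {"one"}))
```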
Truncation
If you choose DataStage data type DATE or TIME, a portion of the
Oracle column value is lost. For example, given a source value of:

Date         Time
2004-11-25   09:14:37

the full values that are stored in Oracle on November 23, 2004, are:

DataStage DATE         DataStage TIME
2004-11-25 12:00:00    2004-11-01 09:14:37
2004-11-25 23:59:59    1900-01-01 09:14:37

The lost portion of each value is filled in with a default by the
database.
4 DataStage FLOAT values have a maximum precision of 15 digits. Some loss of precision occurs when
reading data from Sybase float(p) columns where p is greater than 15.
6 The DRS Plug-in supports the BLOB data type by mapping the LONGVARBINARY data type with a precision
greater than 4 KB to Oracle’s BLOB data type. To work with a BLOB column definition, choose DataStage’s
LONGVARBINARY as the column’s data type and provide a Length of more than 4 KB in the Columns tab.
The maximum size supported by DataStage is 2 GB. A column with a data type of BLOB cannot be used as
a key.
7 The DRS Plug-in supports the CLOB data type by mapping the LONGVARCHAR data type with a precision
greater than 32 K to DB2/UDB’s CLOB data type. To work with a CLOB column definition, choose
DataStage’s LONGVARCHAR as the column’s data type and provide a Length of more than 32 K in the
Columns tab. If the Length is less than or equal to 32 K, DataStage’s LONGVARCHAR maps to
LONGVARCHAR.
8 The DRS Plug-in supports the CLOB data type by mapping the LONGVARCHAR data type with a precision
greater than 4 KB to Oracle’s CLOB data type. To work with a CLOB column definition, choose DataStage’s
LONGVARCHAR as the column’s data type and provide a Length of more than 4 KB in the Columns tab. The
maximum size supported by DataStage is 2 GB. A column with a data type of CLOB cannot be used as a
key.
9 The date component of a Sybase DATETIME or SMALLDATETIME value is lost when converted to a
DataStage TIME value. When writing a DataStage TIME value to a Sybase DATETIME or SMALLDATETIME
value, the date component is set to the current date on the DataStage server machine.
10 If the size of the column is less than 32 K, the DB2/UDB data type for SQL_WLONGVARCHAR is
LONGVARCHAR.
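The Oracle mapping rules in the notes above (LONGVARBINARY over 4 KB maps to BLOB, LONGVARCHAR over 4 KB maps to CLOB, and smaller lengths keep their original type) can be sketched as a small lookup. The function name is illustrative, and the DB2/UDB 32 K threshold from note 7 is not modeled:

```python
KB = 1024

def oracle_lob_mapping(datastage_type, length):
    """Map a DataStage long type to the Oracle target type, following
    the 4 KB thresholds described in the notes above."""
    if datastage_type == "LONGVARBINARY":
        return "BLOB" if length > 4 * KB else "LONGVARBINARY"
    if datastage_type == "LONGVARCHAR":
        return "CLOB" if length > 4 * KB else "LONGVARCHAR"
    return datastage_type

print(oracle_lob_mapping("LONGVARCHAR", 8 * KB))  # → CLOB
```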
Note in particular that the key in this statement ($A#) is specified
using the external name.
Introduction
The Dynamic Relational Stage now supports the various long data
types provided by the major database vendors. DataStage represents
data types such as Oracle's CLOB and BLOB types with the Long
Varchar SQL type. In most cases, these types can support object sizes
of 2GB or greater. In DataStage, object sizes must be specified in a job
design and cannot be changed at runtime. Therefore, you must design
a job with the largest potential size of a large object rather than with
the largest actual size that may be encountered during job execution.
This results in unnecessary memory allocations which, together with
a large array size in the various database stages, can rapidly exhaust
available system memory.
The following describes a mechanism that assures DataStage users a
high degree of success in processing large objects on a system that is
adequately configured for the task.
For a sample job, use DataStage Manager to import the file
LongJobControl.dsx from the Samples directory on the DataStage
installation media.
Template Code
The BASIC code is in the Job Control tab of the batch job. The
template code is shown below.
REM PARAMETERFILE is the output file of QUERYJOB, and contains the maximum data length of the LONG column
* Setup QUERYJOB, run it, wait for it to finish, and test for success
hJob1 = DSAttachJob(QUERYJOB, DSJ.ERRFATAL)
If NOT(hJob1) Then
Call DSLogFatal("Job Attach Failed: ":QUERYJOB:"", "JobControl")
Abort
End
ErrCode = DSSetParam(hJob1, "DBMS_MAX", DBMS_MAX)
ErrCode = DSSetParam(hJob1, "DSN_MAX", DSN_MAX)
ErrCode = DSSetParam(hJob1, "UID_MAX", UID_MAX)
ErrCode = DSSetParam(hJob1, "PWD_MAX", PWD_MAX)
ErrCode = DSRunJob(hJob1, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob1)
Status = DSGetJobInfo(hJob1, DSJ.JOBSTATUS)
If Status = DSJS.RUNFAILED Or Status = DSJS.CRASHED Then
* Fatal Error - No Return
Call DSLogFatal("Job Failed: ":QUERYJOB:"", "JobControl")
End
* Determine optimal Array size and column length for ETL JOB
PARAMETERFILE = QUERYJOB:".pf"
CALL DSLogInfo("PARAMETERFILE = ":PARAMETERFILE:"", "JobControl")
*Get the maximum length of LONG column data
*If $mam is nonzero, use it to determine array size
IF ($mam NE 0) THEN
ARG1 = $mam*1048576
OPENSEQ PARAMETERFILE TO PFILE ELSE CALL DSLogInfo("File not found ":PARAMETERFILE:"", "JobControl")
READSEQ ColLength FROM PFILE ELSE CALL DSLogInfo("Error reading ":PARAMETERFILE:"", "JobControl")
DATALENGTH = TRIMB(ColLength)
CLOSESEQ PFILE
MAXMEM = ARG1*0.3
* 6000 is for the row size of the other columns
ROWSIZE = DATALENGTH + 6000
* If the maximum length of the long column data exceeds MAXMEM,
* MAXMEM will be the column precision for the ETL job
IF (DATALENGTH > MAXMEM) THEN
DATALENGTH = MAXMEM
END
ARG2 = DATALENGTH
END
* Setup ETLJOB, run it, wait for it to finish, and test for success
hJob2 = DSAttachJob(ETLJOB, DSJ.ERRFATAL)
If NOT(hJob2) Then
Call DSLogFatal("Job Attach Failed: ":ETLJOB:"", "JobControl")
Abort
End
ErrCode = DSSetParam(hJob2, "DBMS_SRC", DBMS_SRC)
ErrCode = DSSetParam(hJob2, "DSN_SRC", DSN_SRC)
ErrCode = DSSetParam(hJob2, "UID_SRC", UID_SRC)
ErrCode = DSSetParam(hJob2, "PWD_SRC", PWD_SRC)
IF ($mam NE 0) THEN
ErrCode = DSSetParam(hJob2, "ARSZ_SRC", ARG1)
END
IF ($mam EQ 0) THEN
ErrCode = DSSetParam(hJob2, "ARSZ_SRC", ARSZ_SRC)
END
ErrCode = DSSetParam(hJob2, "LongLength", ARG2)
ErrCode = DSSetParam(hJob2, "DBMS_TRG", DBMS_TRG)
ErrCode = DSSetParam(hJob2, "DSN_TRG", DSN_TRG)
ErrCode = DSSetParam(hJob2, "UID_TRG", UID_TRG)
ErrCode = DSSetParam(hJob2, "PWD_TRG", PWD_TRG)
IF ($mam NE 0) THEN
ErrCode = DSSetParam(hJob2, "ARSZ_TRG", ARG1)
END
IF ($mam EQ 0) THEN
ErrCode = DSSetParam(hJob2, "ARSZ_TRG", ARSZ_TRG)
END
ErrCode = DSRunJob(hJob2, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob2)
Status = DSGetJobInfo(hJob2, DSJ.JOBSTATUS)
If Status = DSJS.RUNFAILED Or Status = DSJS.CRASHED Then
* Fatal Error - No Return
Call DSLogFatal("Job Failed: ":ETLJOB:"", "JobControl")
End
Managing Failures
DRS currently allocates its memory based on the column meta data
when the stage is initialized. Therefore, it is unlikely that a job will
encounter memory issues mid-run. However, it is possible that
truncations of long data occur if the maximum long column size
exceeds the memory available to process these rows. In such rare
cases, you need to log these truncated rows for later processing. The
information that uniquely identifies the target row is in the format of
"Table|Column|Key Values".
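Building and parsing the "Table|Column|Key Values" identifier described above can be sketched as below. The helper names and sample values are illustrative:

```python
def reject_identifier(table, column, key_values):
    """Build the "Table|Column|Key Values" string that uniquely
    identifies a truncated row for later reprocessing."""
    return "|".join([table, column,
                     ",".join(str(v) for v in key_values)])

def parse_identifier(text):
    """Split the identifier back into its three parts."""
    table, column, keys = text.split("|")
    return table, column, keys.split(",")

ident = reject_identifier("CUSTOMER", "NOTES", [1042])
print(ident)  # → CUSTOMER|NOTES|1042
```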
N
NLS tab 1-12

O
Oracle 8i
  client shared library 1-6
  configuration requirements 1-6
  data type considerations 1-41
  data types 1-43
  joins 1-38
Oracle 9i
  configuration requirements 1-9
  data type considerations 1-41
  data types 1-43
Output page
  Columns tab 1-26
  description 1-11, 1-24
  General tab 1-25
  Selection tab 1-27

S
Selection tab 1-27
specifying object sizes A-1
SQL meta tags 1-31
SQL Server
  data types 1-43
SQL tab
  Input page 1-18
  Output page 1-28
Stage page 1-10
  description 1-10
  General tab 1-11
  NLS tab 1-12
Sybase
  configuration requirements 1-10
  data types 1-43

T
third-party documentation iv

U
User-defined DDL tab 1-21
user-defined queries 1-31
user-defined SQL statements 1-23
User-defined tab
  Input page 1-18
  Output page 1-28

W
writing to a database 1-22