Teradata Utilities
This article describes various Teradata utilities and their usage.
BTEQ:
BTEQ is a batch-mode utility for submitting SQL requests to the Teradata database. It runs on every
supported platform – laptop to mainframe. It can both export data from and import data to a client system.
Instead of checking error code or error level after each statement, use the following setting at the
beginning of the script. This sets the error severity level at which BTEQ terminates to 1.
.MAXERROR 1
Thus, any error condition which results in an error level of 1 or greater will cause the program to
terminate, returning the error code that caused the MAXERROR condition to trigger. (MAXERROR is
NOT the maximum number of errors allowed).
Under certain conditions, it may be desirable for a particular error not to terminate the session. A
common example is dropping a table or view without aborting the session if that table or view does
not exist. This can be accomplished by making the DROP statement the first statement in the script
and .MAXERROR 1 the second statement.
If you need the DROP statement to appear after other SQL statements (so it cannot be the first
statement), the method to accomplish this is to set the severity associated with the error code.
.SET ERRORLEVEL 3807 SEVERITY 0
Then to reset the error level to be checked again you do the following:
.SET ERRORLEVEL 3807 SEVERITY 8
(8 is the default error level of the 3807 error, table/view does not exist).
You can find ERRORLEVEL for specific ERRORCODE in the BTEQ manual.
If you are going to use this functionality, it is very important to reset the ERRORLEVEL back to its
default before processing any statement where you want that error condition to abort the script.
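Putting this together, a sketch of the tolerant-DROP pattern (database, table, and column names are illustrative):
.MAXERROR 1
UPDATE mydb.control_table SET run_date = CURRENT_DATE;
.SET ERRORLEVEL 3807 SEVERITY 0
DROP TABLE mydb.stage_table;
.SET ERRORLEVEL 3807 SEVERITY 8
CREATE TABLE mydb.stage_table (col1 INTEGER, col2 CHAR(10));
Here the DROP is tolerated even if mydb.stage_table does not exist, while any later 3807 error will again abort the script.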
If you need loop-like logic in a BTEQ script, use ACTIVITYCOUNT with .IF/.GOTO for conditional
branching (BTEQ branches forward only), and .REPEAT to execute an import request repeatedly. For
example:
SELECT * FROM mydb.control_table;
.IF ACTIVITYCOUNT = 0 THEN .GOTO SKIPLOAD
(statements executed only when rows were found)
.LABEL SKIPLOAD
Bteq Export
• All BTEQ export processes should use the CLOSE option of the .EXPORT command and SET RETRY OFF
to ensure that the process aborts immediately upon a DBMS restart. Otherwise, the export will reconnect
its sessions when Teradata is available again, retransmitting rows already sent.
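A minimal export script illustrating these settings (the logon string, file name, and query are placeholders):
.LOGON tdpid/userid,password
.SET RETRY OFF
.EXPORT DATA FILE = export_file.dat, CLOSE
SELECT col1, col2 FROM mydb.source_table;
.EXPORT RESET
.LOGOFF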
Bteq Import
• All BTEQ import processes populating empty tables should be preceded by a DELETE of that table for
restartability. Import will not automatically reconnect sessions after a Teradata restart; the job must
be manually restarted.
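A sketch of a restartable import following this convention (names and record layout are illustrative):
.LOGON tdpid/userid,password
DELETE FROM mydb.target_table ALL;
.IMPORT DATA FILE = import_file.dat
.QUIET ON
.REPEAT *
USING (col1 INTEGER, col2 CHAR(10))
INSERT INTO mydb.target_table (col1, col2)
VALUES (:col1, :col2);
.LOGOFF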
FASTLOAD:
Purpose: Loads large amounts of data from an external file into an empty table at high speed.
Restrictions:
• Only one empty table can be loaded per FastLoad job.
• Tables defined with referential integrity, secondary indexes, join indexes, hash indexes, or triggers
cannot be loaded with FastLoad.
• Duplicate rows cannot be loaded into a multiset table with FastLoad.
FastLoad has two phases – Data Acquisition and Application. It requires a separate error table for each
phase. Use the BEGIN LOADING statement to specify (and create) the two error tables for the
operation. The error tables can be in the same database as the data table or in a different
database.
We must remove the error tables before we re-run the same load job or it will terminate in an error
condition.
The CHECKPOINT option defines points in a job where FastLoad pauses to record that Teradata has
processed a specified number of input records. When we use checkpoints, we do not have to re-run
the entire FastLoad job if it stops before completion; FastLoad will use the checkpoint information in the
restart log table to determine the restart location. Checkpoints slow FastLoad processing, so set the
CHECKPOINT interval large enough that checkpoints are taken every 10 to 15 minutes.
An errlimit count should be specified on all fastload utility control cards to pause the load when an
unusually high number of constraint violations are directed to error table 1. For large tables (such as
fact tables), this number should be approximately 1000. The errlimit for smaller tables, such as
dimension tables, would be table specific.
END LOADING indicates that all data rows have been transmitted and the second phase can begin.
Its omission means the load is incomplete and will be restarted later. This leaves the table that is
being loaded in a "FastLoad Paused" state, and we cannot access the table via SQL. If END
LOADING was omitted by mistake, submit a script containing only the BEGIN LOADING and END
LOADING statements; this restarts phase 2 only.
We can specify the max and min number of sessions that can be used by the Fastload script. Specify
the max value equal to the number of AMPs. If we specify a max value larger than the number of
available AMPs, Fastload limits the sessions to one per working AMP. Min is optional and defaulted to
1.
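The points above can be combined into a skeleton FastLoad script (names, record layout, and values are illustrative):
SESSIONS 8;
ERRLIMIT 1000;
LOGON tdpid/userid,password;
DEFINE col1 (INTEGER),
       col2 (CHAR(10))
FILE = load_file.dat;
BEGIN LOADING mydb.target_table
   ERRORFILES mydb.target_err1, mydb.target_err2
   CHECKPOINT 1000000;
INSERT INTO mydb.target_table (col1, col2)
VALUES (:col1, :col2);
END LOADING;
LOGOFF;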
FAST EXPORT:
Purpose: Exports large volumes of formatted data from Teradata to a host file or user-written
application.
Restrictions (the SELECT statements cannot use):
• Equality conditions on a Primary Index or USI (single-AMP requests).
• The WITH option to generate total or subtotal response rows.
The FastExport job may consist of multiple SELECT statements, which will be executed sequentially
by FastExport.
In the .EXPORT statement, we need to specify the definition of the output data file. The different data file
formats supported are:
TEXT – Fixed width format
VARTEXT – Character delimiter format
BINARY – Binary format, a 2-byte length indicator followed by n bytes of data
The data can be exported in the sorted order. But note that requesting sorted data adds additional
work (overhead and time) to Teradata.
We can specify the max and min number of sessions that can be used by the FastExport script.
Specify the max value equal to the number of AMPs. If we specify a max value larger than the number
of available AMPs, FastExport limits the sessions to one per working AMP. Min is optional and
defaulted to 1.
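A skeleton FastExport script illustrating these options (log table, logon string, and file name are placeholders):
.LOGTABLE mydb.fexp_log;
.LOGON tdpid/userid,password;
.BEGIN EXPORT SESSIONS 8;
.EXPORT OUTFILE export_file.dat MODE RECORD FORMAT TEXT;
SELECT col1, col2
FROM mydb.source_table;
.END EXPORT;
.LOGOFF;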
MULTI LOAD:
Purpose: Used for loading, updating or deleting data to and from populated tables, typically with batch
inputs from a host file.
Restrictions:
• Cannot process tables defined with USI’s, Referential Integrity, Join Indexes, Hash Indexes, or
Triggers.
• No data retrieval capability.
• Import tasks require use of Primary Index.
MultiLoad supports up to five target tables per script. Tables may contain pre-existing data, and
MultiLoad can perform INSERTs, UPDATEs, DELETEs, and UPSERTs.
In the BEGIN MLOAD statement, define one work table for each target data table. If work tables are
not defined, they are created with the table name prefixed by 'WT_'. Make sure that the database
where the work tables are defined has enough space: MultiLoad loads the data into the work table
before applying it to the target table's data blocks.
Two error tables need to be defined for each target table. If they are not defined in the BEGIN
MLOAD, they default to an 'ET_' or 'UV_' prefix plus the table name: 'ET_tablename' is the
acquisition-phase error table and 'UV_tablename' is the application-phase error table.
BEGIN MLOAD
TABLES tname1, tname2,…
WORKTABLES wt_table1, wt_table2,…
ERRORTABLES et_table1 uv_table1, et_table2 uv_table2, …
[CHECKPOINT integer]
[INDICATORS];
.
.
.
.
END MLOAD;
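Between BEGIN MLOAD and END MLOAD go the layout, DML, and IMPORT commands. A sketch of an upsert task (all names and the record layout are illustrative):
.LAYOUT file_layout;
.FIELD col1 * INTEGER;
.FIELD col2 * CHAR(10);
.DML LABEL upsert_tag
DO INSERT FOR MISSING UPDATE ROWS;
UPDATE mydb.tname1 SET col2 = :col2 WHERE col1 = :col1;
INSERT INTO mydb.tname1 (col1, col2) VALUES (:col1, :col2);
.IMPORT INFILE import_file.dat
LAYOUT file_layout
APPLY upsert_tag;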
Use the RELEASE MLOAD tablename; statement if the MLOAD did not complete successfully and there is
no desire to restart it. This will return the target table to general availability. Error tables
should be dropped manually, as RELEASE MLOAD will not drop them automatically.
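For example, assuming the default work and error table prefixes (table names illustrative):
RELEASE MLOAD mydb.target_table;
DROP TABLE mydb.WT_target_table;
DROP TABLE mydb.ET_target_table;
DROP TABLE mydb.UV_target_table;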
In the .IMPORT statement, we need to specify the definition of the input data file. The different data file
formats supported are:
• TEXT – Fixed width format
• VARTEXT – Character delimiter format
We can specify the max and min number of sessions that can be used by the MLOAD script. Specify
the max value equal to the number of AMPs. If we specify a max value larger than the number of
available AMPs, MLOAD limits the sessions to one per working AMP. Min is optional and defaulted to
1.
MultiLoad jobs may or may not benefit from dropping and recreating indexes. This depends on the
size of the table, the number of indexes, and the size of the file being loaded. Tests will need to
be conducted to determine which method performs best. If drop/create index is not done, these
members should exist, but contain comments only.
An errlimit count should be specified on all multiload utility control cards to pause the load when an
unusually high number of constraint violations are directed to the ‘et’ error table. For large tables
(such as fact tables), this number should be approximately 1000. The errlimit for smaller tables, such
as dimension tables, would be table specific.
• MultiLoad is a Teradata load utility used mostly to append rows to a populated table. It is the fastest
way to do this, but is substantially slower than FastLoad: although runtime is affected by the number of
duplicate primary index values, MultiLoad runs approximately 2-3 times slower than FastLoad. A
sample runtime on a 2-node, 12-AMP NCR box is 1,000,000 300-byte rows in 1 hour (unique primary
index).
• Multiload usually runs faster when secondary indexes are dropped prior to the load and then
recreated afterwards. Multiload does not support unique secondary indexes and they must be
dropped prior to a multiload.
• MultiLoad restart options depend on whether the job abends in the acquisition or the application
phase. The syslist will indicate which phase it is in; if you are unsure, run the RELEASE MLOAD
command. If the job is in the application phase, this command will not be accepted. Please see the
MultiLoad restart procedures for more details on restart options.
• Multiload can also be used to perform a large number of deletes in an efficient manner, using the
multiload delete statement.
• Any operation performed using MLOAD cannot be rolled back.
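A sketch of a MultiLoad delete task (names and the condition are illustrative; note that the WHERE clause of a delete task cannot be an equality condition on the unique primary index):
.LOGTABLE mydb.mld_log;
.LOGON tdpid/userid,password;
.BEGIN DELETE MLOAD TABLES mydb.fact_table;
DELETE FROM mydb.fact_table
WHERE load_date < DATE '2005-01-01';
.END MLOAD;
.LOGOFF;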