Netezza User-Defined Functions Developer's Guide
3 and Later
20444-5 Rev. 4
Note: Before using this information and the product that it supports, read the information in Notices and Trademarks on
page G-1.
Contents
Preface
1 User-Defined Functions
Introduction ... 1-1
  User-Defined Functions ... 1-1
  User-Defined Table Functions ... 1-2
  User-Defined Aggregates ... 1-2
  User-Defined Shared Libraries ... 1-2
  UDX ... 1-3
  Fenced and Unfenced Mode ... 1-3
  SPUPad ... 1-3
  Netezza Developer Network ... 1-3
Important Cautions for User Code on Netezza Systems ... 1-4
Netezza System Prerequisites ... 1-4
  UDX Programming Language Requirements ... 1-4
  User Account Permissions ... 1-5
  UDX API Version 1 and 2 ... 1-5
How to Create a UDX ... 1-6
  Design a UDX ... 1-6
  Review the Existing Built-In Functions ... 1-6
  Create a UDX ... 1-7
Information for UDX Developers ... 1-7
  Registering UDXs in Netezza Databases ... 1-7
  How to Convert and Use Netezza Temporal Values ... 1-7
  UDFs in Table Columns and Views ... 1-7
  UDX Development and Compilation Environments ... 1-8
  UDX Object File Install Location ... 1-8
  Stored Procedures and UDXs ... 1-8
Information for UDX Users and Netezza Administrators ... 1-9
  How to Call a UDX ... 1-9
  Cross-Database Access to UDXs ... 1-9
  International/Unicode Character Support ... 1-10
  How to Back up and Restore UDX Code ... 1-10
  How to Upgrade and Patch Netezza Systems That Have UDX Code ... 1-10
  Outputs ... B-4
  Description ... B-5
  Usage ... B-6
ALTER FUNCTION ... B-6
  Synopsis ... B-6
  Inputs ... B-6
  Outputs ... B-9
  Description ... B-10
  Usage ... B-11
ALTER LIBRARY ... B-11
  Synopsis ... B-11
  Inputs ... B-11
  Outputs ... B-12
  Description ... B-12
  Usage ... B-13
CREATE [OR REPLACE] AGGREGATE ... B-13
  Synopsis ... B-13
  Inputs ... B-13
  Outputs ... B-15
  Description ... B-16
  Usage ... B-17
CREATE [OR REPLACE] FUNCTION ... B-17
  Synopsis ... B-18
  Inputs ... B-18
  Outputs ... B-21
  Description ... B-22
  Usage ... B-22
CREATE [OR REPLACE] LIBRARY ... B-23
  Synopsis ... B-23
  Inputs ... B-23
  Outputs ... B-24
  Description ... B-24
  Usage ... B-25
DROP AGGREGATE ... B-25
  Synopsis ... B-25
  Inputs ... B-25
  Outputs ... B-25
  Description ... B-26
  Usage ... B-26
DROP FUNCTION ... B-27
  Synopsis ... B-27
  Inputs ... B-27
  Outputs ... B-27
  Description ... B-28
  Usage ... B-28
DROP LIBRARY ... B-28
  Synopsis ... B-28
  Inputs ... B-29
  Outputs ... B-29
  Description ... B-29
  Usage ... B-30
SHOW AGGREGATE ... B-30
  Synopsis ... B-30
  Inputs ... B-30
  Outputs ... B-30
  Description ... B-31
  Usage ... B-31
SHOW FUNCTION ... B-32
  Synopsis ... B-32
  Inputs ... B-32
  Outputs ... B-32
  Description ... B-32
  Usage ... B-33
SHOW LIBRARY ... B-33
  Synopsis ... B-34
  Inputs ... B-34
  Description ... B-34
  Usage ... B-34
Index
Preface
This guide describes how to create functions, aggregates, and shared libraries, which
increase the analysis and query capabilities of the IBM Netezza data warehouse appliance. You can create these custom objects, add them to the Netezza system, and make
them available for other users to include in their queries.
Note: In previous releases, this feature was called OnStream Functions. In release 6.0.x,
the feature name changed to user-defined functions.
Throughout this guide, the terms user-defined function (UDF) and user-defined
aggregate (UDA) distinguish user-created functions and aggregates from the built-in functions and aggregates that ship with the Netezza software. The term UDX represents
a user-defined object in general. For more information about the terminology and concepts,
refer to Chapter 1, User-Defined Functions.
Topic: Using SPUPad to allocate a named, unique area of memory as a temporary storage area and workpad
See: Creating Memory Workpads Using the SPUPad on page A-1
Topic: Detailed descriptions of the Netezza SQL commands for creating, altering, and dropping UDXs
See: Netezza SQL Reference on page B-1
Topic: Descriptions of helper routines that you can use in your UDXs to convert data and time values
Refer to your Netezza maintenance agreement for details about your support plan choices
and coverage.
The name and version of the manual that you are using
CHAPTER 1
User-Defined Functions
What's in this chapter
Introduction
Important Cautions for User Code on Netezza Systems
Netezza System Prerequisites
How to Create a UDX
Information for UDX Developers
Information for UDX Users and Netezza Administrators
This chapter introduces the user-defined functions support in the Netezza environment.
Review this chapter to learn about the key concepts and definitions, as well as the benefits,
prerequisites, and important information.
Introduction
The Netezza user-defined functions feature allows you to create custom functions, aggregates, and shared libraries that run on Netezza systems and perform specific types of
analysis for your business reporting and data queries. They allow you to leverage the
Netezza massively parallel processing (MPP) environment to accelerate analysis of data, as
well as to offer new and unique types of analysis. Because user-defined functions enable
data processing directly on the Netezza system, you can reduce or eliminate data movement to other systems for analysis, which reduces the overall processing time.
Netezza's support for user-defined functions generally follows the model used in the 2003
SQL Standard for SQL-Invoked Routines.
User-Defined Functions
A user-defined function (UDF) is user-supplied code that is executed by the Netezza system
in response to SQL invocation syntax. UDFs provide new types of data analysis actions
which are not currently available with the built-in functions such as upper(), sqr(), or
length(). A user-defined function is a scalar function; that is, it returns one value.
A UDF invocation may appear anywhere inside a SQL statement where a built-in function
can appear, which includes restrictions (where clauses), join conditions, projections (select
from lists), and HAVING conditions. A UDF can accept zero or more input values but produces one output value. Input values to a UDF can be literals, column references, or
expressions. The data types of inputs and output must be Netezza built-in data types.
User-Defined Aggregates
A user-defined aggregate (UDA) is user-supplied code which implements the various phases
of aggregate evaluation, such as initialization, accumulation, and merging, on the Netezza
system.
UDAs provide new types of aggregation functions that are not currently available with the
built-in aggregates such as count(), sum(), avg(), max(), or min(). UDAs are able to take
multiple arguments, but they are also scalar and produce one output value. UDAs may be
used in a SQL statement anywhere a built-in aggregate may appear as either grand,
grouped, or windowed aggregates.
You can control whether a UDA is allowed in a grouped aggregate query, a window (analytical) aggregate query, or either type when you define the UDA. For example, if you
restrict an aggregate to grouped aggregates only, the Netezza system does not allow users to include the aggregate in an analytic (windowed)
aggregate query. The restriction may reflect the intended design of the UDA itself, or it could be a performance optimization to control memory impacts on the Netezza system. If an aggregate is defined
as ANY type, then it can be used in either aggregation type. For more information about
window and grouped aggregates and the performance implications of window aggregates,
see the IBM Netezza Database User's Guide.
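The phases of aggregate evaluation can be pictured with a small model in plain C++. This is a minimal sketch that does not use the UDX API: the AvgState structure and all four function names are hypothetical, and the final step that turns the merged state into one output value is an assumption added so that the example is complete.

```cpp
// Illustrative model of aggregate phases; none of these names come from
// the Netezza UDX API.
struct AvgState
{
    long long sum;    // running total of the input values
    long long count;  // number of input values seen
};

// Initialization: reset the state before any rows are seen.
void initializeState(AvgState &state)
{
    state.sum = 0;
    state.count = 0;
}

// Accumulation: fold one input row into the state.
void accumulate(AvgState &state, int value)
{
    state.sum += value;
    ++state.count;
}

// Merging: combine a partial state computed elsewhere into this one.
void merge(AvgState &into, const AvgState &from)
{
    into.sum += from.sum;
    into.count += from.count;
}

// Final step (assumed for this sketch): produce the single output value.
double finalResult(const AvgState &state)
{
    if (state.count == 0) {
        return 0.0;
    }
    return static_cast<double>(state.sum) / static_cast<double>(state.count);
}
```

In this model, each partial state could be built independently (for example, one per data slice) and then merged, which is why the state carries both the sum and the count rather than a running average.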
library routines need be made only in the shared library that defines them. In addition,
users who have UDXs that leverage third-party libraries may be able to migrate those routines and libraries more easily to the Netezza system for analysis.
UDX
Throughout this guide, the term UDX is a generic reference to user-defined functions of any
kind, including user-defined functions, aggregates, or shared libraries. The term is also
used in code or commands that operate on these user-defined object types, such as
nzudxcompile.
SPUPad
The SPUPad feature allows you to reserve temporary areas of memory on the Netezza SPUs
(also known as S-Blades). The SPUPads are typically used to hold data for use with user-defined functions. When the query or transaction block that created the SPUPad finishes,
the Netezza system releases the memory allocated for each SPUPad. SPUPads typically
reside in memory on the Netezza SPUs where the user data tables reside, but they can also
reside on the host if the UDX is operating on external tables or host-based system views.
Note: SPUPads are not supported for UDXs that run in fenced mode.
If you want to create UDXs that can be used in both 5.0.x and 6.0.x environments, you can
define your objects with API VERSION 1. If you include version-2-only features, the object
will error out when you use the CREATE OR REPLACE command to add the UDX as an
object in a database.
Design a UDX
Create a UDX
Design a UDX
The first step in designing a UDX is to identify the type of action that you need the user-defined function or aggregate to perform. For example, you might want to implement functions that perform tasks such as specialized string operations or comparisons; custom
mathematical analysis; or conversions such as metric to English measurements, Celsius to
Fahrenheit, or currency conversions. In addition, it is important to identify any user-defined
shared libraries that your UDXs may require or that you could develop to create more efficient UDXs that share common routines.
Before the user-defined functions feature, conversion and analysis tasks might have
required you to export data from the Netezza system to another host server to perform the
processing and then load the converted data back into the Netezza system for storage. With
user-defined functions, you may be able to perform many or all of these analytical steps
directly on the Netezza system.
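As an illustration of the kind of logic a conversion UDF might wrap, the following minimal sketch shows a Celsius-to-Fahrenheit calculation in plain C++. It does not use the UDX API, and the function name celsiusToFahrenheit is hypothetical; in a real UDF, the same arithmetic would sit inside the class methods described in Chapter 2.

```cpp
// Hypothetical helper, not part of the Netezza UDX API.
// Converts a temperature from degrees Celsius to degrees Fahrenheit.
double celsiusToFahrenheit(double celsius)
{
    return celsius * 9.0 / 5.0 + 32.0;
}
```

Keeping the calculation in a small free function like this makes it easy to check on a development workstation before it is placed in a UDF.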
Create a UDX
The following process outlines the steps to create a user-defined function or aggregate:
1. Write C++ code that implements the necessary class methods.
2. Compile the C++ program using nzudxcompile to create object files that can be registered with the Netezza system.
3. Use Netezza SQL CREATE commands to register the UDX as an object in the Netezza
system.
4. Debug the C++ program to look for and resolve any errors in the processing. Netezza
provides a test harness to assist you with debugging.
5. Test the UDX on a development system to confirm that it performs as designed.
6. Deploy the UDX to one or more production Netezza systems.
7. Give users permission to execute the user-defined function or aggregate in queries, and
possibly alter the UDX to run in unfenced mode to improve performance.
Chapters 2 through 6 describe the steps to create UDXs, including requirements for the
code, best practices for testing, and some limitations and restrictions for UDXs.
Use the compiler and specify compatible flags, which can include some of the following:
-shared
-Wa,--32
-fPIC
-fexceptions -fsigned-char
-Wno-invalid-offsetof
Some of these flags may be valid only for C++ or C libraries. Use the -shared flag only when
linking the shared library and the -fPIC flag only when creating object files. The -Wa,--32 flag
makes 32-bit object files instead of 64-bit.
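The split between compile-time and link-time flags can be sketched with a small library source file. This is only an illustration: the file name helper.cpp, the function countChar, and the use of g++ as the compiler are all assumptions, and the -Wa,--32 flag is left out because whether 32-bit objects are needed depends on the target environment.

```cpp
// helper.cpp: a hypothetical routine intended for a shared library.
//
// Assumed build steps (g++ is an assumption; use the compiler that
// matches your development environment):
//   g++ -fPIC -fexceptions -fsigned-char -c helper.cpp -o helper.o
//   g++ -shared helper.o -o libhelper.so
//
// -fPIC appears only on the compile step and -shared only on the link step.
#include <cstddef>

// Counts how many times ch occurs in the first len bytes of data.
// The buffer does not need to be null-terminated.
extern "C" int countChar(const char *data, std::size_t len, char ch)
{
    int count = 0;
    for (std::size_t i = 0; i < len; ++i) {
        if (data[i] == ch) {
            ++count;
        }
    }
    return count;
}
```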
Stored procedures can call UDXs in the same way that they call built-in
functions. You can also use UDXs to perform tasks such as extending the NZPLSQL language.
These UDFs must be invoked using SQL that is designed to run only on the Netezza host
inside of Postgres. For more information about these capabilities, refer to Appendix E,
Using UDXs with Stored Procedures.
Using fully-qualified object names when calling a UDX object that resides within a different database, for example:
MYDB(MYUSER)=> SELECT * FROM customers WHERE
OTHERDB..CustomerName(b) = 1;
Using the PATH SQL session variable to specify the databases to search to find the
UDX. To use the PATH session variable, you enter a command similar to the following
at the nzsql command prompt:
MYDB(MYUSER)=> SET PATH = <elem> [, <elem>];
The Netezza system uses this variable during the lookup of any unqualified UDXs. It
searches the current database if PATH is not set; otherwise it searches the databases
specified in PATH, in the order that they are specified. The Netezza system uses the
first match (or potential match) it finds, even if a better match might exist in a subsequent database. A poorer match is one that might require implicit casting of arguments
or that causes an error due to multiple potential matches. Note that PATH searches
databases, not schemas, as there is no schema support for this capability. Also, the
PATH session variable supports only UDFs, UDAs, and stored procedures (which are
described in the IBM Netezza Stored Procedures Developer's Guide). Other object
types are not supported.
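The search order described above can be modeled in a few lines of plain C++. This is a simplified sketch, not Netezza code: the Catalog type and the resolveUdx function are hypothetical, and the model matches on the UDX name only, so it ignores argument types and implicit casting.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Illustrative model only; none of these names are Netezza APIs.
// Maps a database name to the set of UDX names registered in it.
typedef std::map<std::string, std::set<std::string> > Catalog;

// Returns the database that an unqualified UDX name resolves to,
// or an empty string if no searched database contains it.
std::string resolveUdx(const Catalog &catalog,
                       const std::string &currentDb,
                       const std::vector<std::string> &path,
                       const std::string &udxName)
{
    // With PATH unset, only the current database is searched.
    std::vector<std::string> searchOrder = path;
    if (searchOrder.empty()) {
        searchOrder.push_back(currentDb);
    }
    // Otherwise the PATH databases are searched in order; the first match wins.
    for (std::size_t i = 0; i < searchOrder.size(); ++i) {
        Catalog::const_iterator db = catalog.find(searchOrder[i]);
        if (db != catalog.end() && db->second.count(udxName) > 0) {
            return db->first;
        }
    }
    return std::string();
}
```

Because the first match wins, the order of the databases in PATH matters: a function with the same name in a later database is never reached in this model.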
How to Upgrade and Patch Netezza Systems That Have UDX Code
After you and other permitted users register UDX code with your Netezza system, there are
no special requirements or procedures necessary to preserve those user-defined objects
during a service pack update or an upgrade to a new release. In most cases, the objects continue to operate in the same manner on the newly updated or upgraded system as on the
previous release.
Note: If the UDX base class changes in the new release, the older object files will no longer
work. You must recompile your object files from the C++ sources and use the CREATE OR
REPLACE [FUNCTION | AGGREGATE] commands to update the object files in your Netezza
system. If you obtained your UDFs or UDAs from a third-party resource, you will need to
obtain updated objects that have been recompiled for the new release before you use the
CREATE OR REPLACE command. Be sure to review the IBM Netezza Release Notes to see
if the UDX base class changed or if there are special compilation issues for the Netezza
release.
If the new release or service pack introduces any new features or changes that could affect
the operation of UDXs, Netezza will describe the changes in the release notes for the service pack or the release. Before you install any new release or service pack, you should
carefully review the release notes to familiarize yourself with any new features, changes,
fixes, and known issues for that release. After you upgrade, if the later release has a new
base class, you may need to recompile your UDXs to replace the object files with versions
that support the new features.
If you downgrade your Netezza release, note that downgrades could result in a loss of support for features that are available in the later release. If you downgrade to a release that
supports only UDX version 1, your UDX version 1 code should continue to work following
the downgrade; however, UDX version 2 objects will not work and must be dropped. If the
earlier release uses a different base class, you may need to recompile your UDXs and/or
obtain the objects that were compiled on the earlier base class.
As a best practice, make sure that you have recent backups of your Netezza system, which
will also include any UDX code registered with the Netezza system. In the event of a problem or failure situation during the upgrade, the backups provide you with the ability to
restore the system to the point of the backup image.
CHAPTER 2
Creating User-Defined Functions
What's in this chapter
Creating the C++ File for the UDF
Compiling the UDF
Registering the UDF with the Netezza System
UDX Environment
Understanding Size-Specific, Generic, and Variable Argument UDXs
Using the UDF in a SQL Query
Altering and Dropping UDFs
Return Value Sizer API
This chapter describes the steps to create a scalar user-defined function (UDF) and to
register it for use on a Netezza system.
In addition, make sure that you include any of the standard C++ library header files that
your function may require. If your UDF requires any user-defined shared libraries, make
sure you note the name of the libraries as you will need them when you register the UDF in
the database. For example:
#include "udxinc.h"
#include <string.h>
Note: User-defined shared libraries must exist in the database before you can register the
UDF and specify those libraries as dependencies. You could register the UDF without specifying any library dependencies, and after the libraries are added, use the ALTER
FUNCTION command to update the UDF definition with the correct dependencies. For
more information about user-defined shared libraries, see Creating a User-Defined Shared
Library on page 5-1.
The UDX classes and functions for API version 2 are defined in a namespace called
nz::udx_ver2. (The API version 1 UDXs use the nz::udx namespace.) Your C++ program
must reference the correct namespace. For example:
#include "udxinc.h"
#include <string.h>
using namespace nz::udx_ver2;
Note: This chapter uses udx_ver2 as the default namespace for the examples that follow.
The sections note the differences with UDX version 1, and Appendix F, Sample User-Defined Functions and Aggregates Reference contains examples of version 1 and version
2 definitions. You can continue to create UDX version 1 functions as well as new version 2
functions; both will operate on Release 6.0.x systems. However, the version 1 functions will
work on Netezza Release 5.0.x and later systems and thus may be more portable for your
Netezza systems.
To implement a UDF, you create a new class object derived from the Udf base class. Continuing the customername example:
#include "udxinc.h"
#include <string.h>
using namespace nz::udx_ver2;
class CCustomerName: public nz::udx_ver2::Udf
{
public:
};
implementation must be outside the class definition. In UDX version 2, the instantiate
method takes one argument (UdxInit *pInit), which enables access to the memory
specification, the log setting, and the UDX environment (see UDX Environment on
page 2-7). The constructor must take a UdxInit object as well and pass it to the base
class constructor. The instantiate method creates a new object of the derived class type
using the new operator and returns it (as base class type Udf) to the runtime engine.
The runtime engine will delete the object when it is no longer needed. An example of
the instantiate method for API version 2 follows:
#include "udxinc.h"
#include <string.h>
using namespace nz::udx_ver2;
class CCustomerName: public nz::udx_ver2::Udf
{
public:
    CCustomerName(UdxInit *pInit) : Udf(pInit)
    {
    }
    static nz::udx_ver2::Udf* instantiate(UdxInit *pInit);
};
nz::udx_ver2::Udf* CCustomerName::instantiate(UdxInit *pInit)
{
    return new CCustomerName(pInit);
}
evaluate() is the method called once for each row of data during execution.
#include "udxinc.h"
#include <string.h>
using namespace nz::udx_ver2;
class CCustomerName: public nz::udx_ver2::Udf
{
public:
    CCustomerName(UdxInit *pInit) : Udf(pInit)
    {
    }
    static nz::udx_ver2::Udf* instantiate(UdxInit *pInit);
    virtual nz::udx_ver2::ReturnValue evaluate()
    {
        // Code to be inserted here
    }
};
nz::udx_ver2::Udf* CCustomerName::instantiate(UdxInit *pInit)
{
    return new CCustomerName(pInit);
}
You can implement constructors and destructors as necessary. In API version 1, constructors are optional. In API version 2, constructors are required; you must specify a
constructor even if it only invokes the base class constructor, as in the previous example.
Some common things to include in constructors are memory reservation routines (since
new and delete are relatively expensive), and setting up any structures needed for computation (for example, setting up a matrix for encryption routines).
For the customername example, the UDF takes a string and returns the integer 1 if the
string starts with "Customer A"; otherwise, it returns the integer 0. The code for the evaluate method follows:
virtual nz::udx_ver2::ReturnValue evaluate()
{
    StringArg *str;
    str = stringArg(0);                            // 4
    int lengths = str->length;                     // 5
    char *datas = str->data;                       // 6
    int32 retval = 0;
    if (lengths >= 10)
        if (memcmp("Customer A",datas,10) == 0)
            retval = 1;
    NZ_UDX_RETURN_INT32(retval);                   // 11
}
In the sample program, line 4 uses a StringArg structure to retrieve the argument
passed to the UDF. Arguments to UDFs are retrieved using functions such as stringArg. If this UDF
took a second string argument, the second argument would be retrieved with stringArg(1).
For a complete list of argument types supported and their associated helper functions, refer
to Appendix C, Datatype Helper API Reference. Lines 5 and 6 extract the length and
character pointer (char*) from the StringArg structure.
Note: The sample program uses memcmp (not strcmp) because StringArg structures are
not null-terminated (\0) in user-defined functions or aggregates. Therefore, strcmp, strcpy,
strlen, atol, and other functions which depend on the presence of a null terminator will not
work. If you need to use those functions, you must copy the string into a buffer and manually append the null terminator.
Line 11 uses a UDX macro to return the computed value to the Netezza engine. The
NZ_UDX_RETURN_INT32 macro helps to confirm that the return value is of the expected type.
For a list of the available return macros, refer to UDX Return Value Macros on page D-8.
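The comparison at the heart of the sample can also be exercised away from the appliance. The following sketch restates it as a free function in plain C++, with a pointer and a length standing in for the StringArg data and length fields; the function name startsWithCustomerA is hypothetical.

```cpp
#include <cstring>

// Hypothetical stand-alone version of the check made in the sample.
// data and length stand in for the StringArg data and length fields;
// data does not need to be null-terminated.
int startsWithCustomerA(const char *data, int length)
{
    static const char prefix[] = "Customer A";
    const int prefixLen = static_cast<int>(sizeof(prefix)) - 1; // excludes '\0'

    // Check the length first so memcmp never reads past the input.
    if (length >= prefixLen && std::memcmp(prefix, data, prefixLen) == 0) {
        return 1;
    }
    return 0;
}
```

The length check comes before memcmp for the same reason as in the sample: the input carries no terminator, so its length is the only guard against reading past the end.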
To use the function, you must compile and register it as an available function on the
Netezza system.
To compile the customername.cpp file and create the output object files:
nzudxcompile /home/nz/udx_files/customername.cpp
customername.o_x86 is the object file for the Netezza host (i386 Linux platform on
x86).
customername.o_spu10 is the object file for the Linux-based Rev10 SPUs on IBM
Netezza 1000 and Netezza 100 models (formerly called Netezza TwinFin and Skimmer
systems).
For a complete description of the nzudxcompile command and its options and usage, refer
to nzudxcompile Command Syntax on page 6-18.
After you create the compiled object files, you must register the UDF with the Netezza system. You can register a user-defined function using the CREATE FUNCTION command, as
described in the next section.
Note: Optionally, you can also compile and register a UDF in one step using the nzudxcompile command. For example, to compile the customername C++ file and also register it in a
sample database called mydb:
nzudxcompile customername.cpp -o customername.o
-sig "CustomerName(varchar(64000))" --version 2 -return INT4
-class CCustomerName -user myuser -pw password -db mydb
In this example, the quotes are required to ensure that the shell properly handles the
parentheses characters. This example also shows that you must include the --version 2
syntax when you are using the command to compile and register an API version 2 UDF.
the object files and read and execute access to every directory in the path from the root to
the object file.
For example, to register the sample function customername to the Netezza system, start an
nzsql session to your database (which is named mydb in this example):
nzsql mydb myuser password
Next, use the CREATE FUNCTION Netezza SQL command to register the function:
MYDB(MYUSER)=> CREATE FUNCTION CustomerName(varchar(64000))
RETURNS int4 LANGUAGE CPP PARAMETER STYLE NPSGENERIC API VERSION 2
EXTERNAL CLASS NAME 'CCustomerName'
EXTERNAL HOST OBJECT '/home/nz/udx_files/customername.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/customername.o_spu10';
If the command is successful, it returns the message CREATE FUNCTION. It creates the
function in the mydb database, and the function is owned by myuser. To create a function,
your user account must have Create Function administration permission or you must be
logged in as the admin user. For a description of the required privileges for these commands, refer to Managing User Account Permissions on page 6-1.
When you register a UDF with the Netezza system, the specified object files are copied into
the Netezza database directories. This allows the functions to be used in queries by all permitted users, and it also ensures that the UDFs are backed up and restored with the user
data in the database. If you change the C++ program for any reason (such as adding debug
messages or changing the operation of the function), you must re-compile the program and
re-run the CREATE OR REPLACE FUNCTION command to copy the updated object files
into the Netezza database.
Note the following characteristics of the CREATE FUNCTION command:
If you use the command CREATE FUNCTION, instead of CREATE OR REPLACE FUNCTION, the command will fail if a user-defined function with the same name and
signature already exists in the database.
You can create multiple UDFs that use the same name, but they must have different
signatures if they reside in the same database. The name must meet the character
restrictions for a legal Netezza SQL keyword or identifier, and it does not have to match
or relate to anything defined in the C++ file (that is, the name is not used for binding).
The value you specify for EXTERNAL CLASS NAME must match the class in the C++
file exactly, as this is how the runtime engine creates and calls the UDF object method.
The command will fail if the DEPENDENCIES argument references the name of a user-defined shared library which is not defined in the current database.
For string arguments, use caution in choosing a string size. In general you should follow these best practices for strings:
If the string input is naturally bounded, specify a string size that matches the largest string needed. For the customername example, varchar(10) is sufficient.
If the string input length could vary widely, use generic size arguments. For more
information, see Generic Arguments in the UDX Signature on page 2-9.
There can be a performance penalty for specifying a large string when the input
passed is a CHAR/NCHAR type and the argument is specified as VARCHAR/NVARCHAR. In this case, the argument will be implicitly converted to the variable-sized
argument, including all of the trailing spaces.
Dependencies
The following command defines a UDF named myfunc that depends upon a user-defined
shared library named mylib:
MYDB(MYUSER)=> CREATE FUNCTION myfunc(int)
RETURNS int4 LANGUAGE CPP PARAMETER STYLE NPSGENERIC
EXTERNAL CLASS NAME 'CMyFunc' DEPENDENCIES mylib
EXTERNAL HOST OBJECT '/home/nz/udx_files/myfunc.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/myfunc.o_spu';
If a user calls greatest_value with two input values, the system uses the first (two argument) function. If the user specifies three input values, the system uses the second
function that accepts three input values.
Overloading allows you to support different combinations of input values and/or return
types. However, overloading and uniquely named but similar functions and aggregates have
a maintenance overhead; if you need to update or redesign the body of the UDX, you have
to update each UDX with the changes that you want to make.
UDX Environment
UDX API version 2 includes support for the UDX environment. The UDX environment consists of a list of one or more variable name and value pairs that you can specify for a UDF,
UDA, or UDTF when you register it. Using environment variables, you can conditionalize
the UDX behavior in the UDX definition.
The environment variables simplify the process for changing the behavior of the UDX when
necessary. To change a variable, you alter the UDX to specify new variables, new values, or
to clear the variables. Although you can define the variable values within the source code
for the UDX, changing the values would require you to edit the source code, recompile, and
reregister the UDX to implement the new behavior. Similarly, if you defined variables as an
argument to the UDX when you invoke it, changing the variable would require changes to
the SQL query that invokes the UDX.
For example, assume that you have a UDF that performs a currency conversion from U.S.
dollars to Euros. If you hardcode an exchange rate variable within the UDF source code, you
would have to update the source code whenever the exchange rate changes, then recompile
and reregister the UDF. Instead, you could use an environment variable to define the
exchange rate for the UDF when you register it, as in the following example:
CREATE OR REPLACE FUNCTION usdToEurosFunc(int) RETURNS int4 LANGUAGE
CPP PARAMETER STYLE NPSGENERIC ENVIRONMENT 'USD_EURO'='0.7268'
EXTERNAL CLASS NAME 'CusdToEuros'
EXTERNAL HOST OBJECT '/home/nz/udx_files/usdEuroFunc.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/usdEuroFunc.o_spu';
When the exchange rate changes, you can simply re-register the UDX with the new value in
the ENVIRONMENT clause:
CREATE OR REPLACE FUNCTION usdToEurosFunc(int) RETURNS int4 LANGUAGE
CPP PARAMETER STYLE NPSGENERIC ENVIRONMENT 'USD_EURO'='0.8241'
EXTERNAL CLASS NAME 'CusdToEuros'
EXTERNAL HOST OBJECT '/home/nz/udx_files/usdEuroFunc.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/usdEuroFunc.o_spu';
You can specify and alter the environment variables using the CREATE OR REPLACE commands for functions and aggregates. You can use the ALTER command to alter variable
values as well as to clear the environment of all variable settings using the NO ENVIRONMENT syntax. To alter an existing set of one or more environment pairs, you must specify
all the environment settings; the ALTER command replaces the current list with the list
specified in the ALTER command.
Within the UDF, you can retrieve the UdxEnvironment class to obtain and use the environment variables:
UdxEnvironment* getEnvironment()
The getNumEntries() method returns the number of environment variables in the UdxEnvironment object. For example:
int numEntries = env->getNumEntries();
The findEntry() method takes an input string and matches it against the variable
names. It returns the key number of the first matching entry in the environment structure, or -1 if the string is not found.
The getEntry() method takes an input index or variable name value and returns the
matching UdxEnvironmentEntry or a null value if not found.
The getKey() method returns the matching environment variable name from a
UdxEnvironmentEntry.
The getValue() method returns the matching environment variable value from a
UdxEnvironmentEntry.
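As a rough illustration of how these lookup methods combine for the currency example, the following standalone sketch reads USD_EURO and falls back to a default when the variable is missing. EnvEntry and Env are hypothetical stand-ins that imitate the method names described above; the exact signatures in the Netezza headers may differ, and add() and readRate() are illustrative names only.

```cpp
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for UdxEnvironmentEntry (signatures are assumptions).
class EnvEntry {
public:
    EnvEntry(std::string key, std::string value)
        : key_(std::move(key)), value_(std::move(value)) {}
    const char *getKey() const { return key_.c_str(); }
    const char *getValue() const { return value_.c_str(); }
private:
    std::string key_;
    std::string value_;
};

// Hypothetical stand-in for UdxEnvironment.
class Env {
public:
    void add(const std::string &k, const std::string &v) { entries_.emplace_back(k, v); }
    int getNumEntries() const { return static_cast<int>(entries_.size()); }
    // Index of the first entry whose name matches, or -1 if none does.
    int findEntry(const char *name) const {
        for (int i = 0; i < getNumEntries(); ++i)
            if (std::string(entries_[i].getKey()) == name)
                return i;
        return -1;
    }
    // Entry at the given index, or nullptr when the index is out of range.
    const EnvEntry *getEntry(int index) const {
        if (index < 0 || index >= getNumEntries())
            return nullptr;
        return &entries_[index];
    }
private:
    std::vector<EnvEntry> entries_;
};

// Reads a floating-point setting, returning fallback when the variable is
// missing or its value does not start with a number.
double readRate(const Env &env, const char *name, double fallback) {
    const EnvEntry *entry = env.getEntry(env.findEntry(name));
    if (entry == nullptr)
        return fallback;
    const char *text = entry->getValue();
    char *end = nullptr;
    double rate = std::strtod(text, &end);
    return (end == text) ? fallback : rate;
}
```

Because environment values are registered as quoted strings, the sketch converts the text to a number at the point of use and treats an unparsable value the same as a missing one.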
Size-specific arguments
Generic arguments
Variable arguments
The following sections describe these three formats and the benefits and considerations for
using each type.
Size-Specific Arguments
With size-specific UDXs such as the customerName example, you must declare the type
and size of all input arguments, as well as the type and size of the return value. Specific
datatype size declarations are useful for resource planning as well as for error-checking of
the input arguments and return values, but they can be somewhat limiting if your UDXs
process strings or numerics that could vary in size each time you run a query.
Constant datatype sizes often require you to use larger datatype sizes (and thus more storage resources) to support the maximum input values and/or return values. They can also
result in implicit casts, such as casting a smaller input value to fit the larger declared size
(for example, it could increase the precision of a numeric or add padding to strings). If you
choose too small a size, you risk loss of precision if the Netezza system casts a larger input
numeric to the smaller numeric or truncates input strings which exceed the defined string
input size.
Generic-Size Arguments
Generic-size (or any-size) arguments offer more flexibility for strings and numerics. You can
declare character strings or numerics using the ANY keyword in the signature (or in the
return value). For example:
MYDB(MYUSER)=> CREATE FUNCTION CustomerName(varchar(ANY))
RETURNS int4 LANGUAGE CPP PARAMETER STYLE NPSGENERIC API VERSION 2
EXTERNAL CLASS NAME 'CCustomerName'
EXTERNAL HOST OBJECT '/home/nz/udx_files/customername.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/customername.o_spu10';
The function accepts a character string of up to 64,000 characters (the maximum for a
VARCHAR). Within the body of the function, the code must process the strings and numerics with the plan that you could receive a string of any valid length. That is, you can check
and obtain their size, process them as needed, and return the value for the function.
Generic-size arguments help you to avoid setting specific limits for the input strings and numerics,
or using overly large or maximum size values that result in unnecessary resource allocation
for the function. This format can also reduce the need to register and maintain similar
functions that take different input arguments or have different return values, as well as
possible casting of input values.
Note: UDFs support generic arguments as well as generic return values. UDAs, however,
support only generic arguments. The return value and state arguments of a UDA must specify constant data sizes. UDTFs support shapers, which are similar to generic return values
for scalar UDFs.
CHAR or NCHAR
VARCHAR or NVARCHAR
NUMERIC
For example, to specify a numeric datatype of precision 10 and scale 2, you specify it as
NUMERIC(10,2). To specify a numeric datatype that can take any size, you specify it as
NUMERIC(ANY). Likewise, to specify a variable character string that can take any size, you
declare it as VARCHAR(ANY).
}
return sizerNumericSizeValue(prec, scale); //let return value
//precision and scale be "max"
}
For a complete example of a UDF that uses generic input arguments as well as a generic
return value and a calculateSize() method, see Generic UDF Example on page F-1.
In this example, the number() function takes an input numeric datatype of any valid size
and returns a numeric datatype of a valid size that will be calculated by the UDF.
An example for a UDA follows (UDAs allow generic arguments only):
CREATE OR REPLACE AGGREGATE char20 (CHAR(ANY))
RETURNS CHAR(20) STATE (CHAR(20))
LANGUAGE CPP PARAMETER STYLE NPSGENERIC
EXTERNAL CLASS NAME 'Char20'
EXTERNAL HOST OBJECT '/tmp/udx_test/UDX_CharMax.o_x86'
EXTERNAL SPU OBJECT '/tmp/udx_test/UDX_CharMax.o_spu10';
Note: UDAs which have large string state variables can impact performance.
Variable Arguments
Variable-argument functions and aggregates offer even more flexibility than generic-size
arguments. With variable argument UDFs, UDAs, and UDTFs, you specify only the
VARARGS keyword in the argument_type_list. Users can specify from 0 to 64 input values
of any supported data type as input arguments. For example, using the greatest_value function from a previous section:
MYDB(MYUSER)=> CREATE FUNCTION greatest_value(VARARGS) RETURNS
INT64...
Within the body of the function, the code must process the input values and manage them
as needed. For example, the function body should verify the data types of the input arguments and either cast or error out as applicable. You must design your UDX code to handle
the native data type of the input values, such as managing Numeric32Val versus double
data types. If you were hard-coding the input values, you could declare the input as
func(double) and when invoked with a numeric, the system would cast it to double for you.
Variable argument signatures allow you to create one function or aggregate that can be
used for different combinations of input types. This simplifies the development of your
UDFs, UDAs, and UDTFs and reduces the need to create overloaded definitions that perform the same task for different types and numbers of arguments.
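The type-checking duty described above can be pictured with a small standalone sketch. It does not use the Netezza argument accessors: ArgValue is a hypothetical tagged value standing in for one VARARGS input, and greatestValue() is an illustrative function that widens integers to double and rejects any other type. For simplicity it returns a double, not the return type shown in the CREATE FUNCTION example.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical tag for the native type of one input argument.
enum class ArgKind { Int32, Double, Text };

// Hypothetical stand-in for a single VARARGS input value.
struct ArgValue {
    ArgKind kind;
    std::int32_t i;  // valid when kind == ArgKind::Int32
    double d;        // valid when kind == ArgKind::Double
};

ArgValue makeInt(std::int32_t v) { return ArgValue{ArgKind::Int32, v, 0.0}; }
ArgValue makeDouble(double v) { return ArgValue{ArgKind::Double, 0, v}; }
ArgValue makeText() { return ArgValue{ArgKind::Text, 0, 0.0}; }

// Returns the largest input. Integers are cast to double; an empty argument
// list or an unsupported type raises an error instead of being cast silently.
double greatestValue(const std::vector<ArgValue> &args) {
    if (args.empty())
        throw std::invalid_argument("at least one argument is required");
    bool first = true;
    double best = 0.0;
    for (const ArgValue &a : args) {
        double v;
        switch (a.kind) {
        case ArgKind::Int32:  v = static_cast<double>(a.i); break;
        case ArgKind::Double: v = a.d; break;
        default:
            throw std::invalid_argument("unsupported argument type");
        }
        if (first || v > best) {
            best = v;
            first = false;
        }
    }
    return best;
}
```

The explicit switch on the type tag is the part that a fixed signature such as func(double) would otherwise get for free from the system's implicit casts.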
CREATE ... VARCHAR(200));
INSERT ... 'Customer A');
INSERT ... 'Customer B');
INSERT ... 'Customer CBA');
INSERT ... 'Customer ABC');
sizerReturnType Method
Returns a datatype based on the declared UDF return type.
Syntax
The method has the following syntax:
int sizerReturnType() const;
Description
The method returns a datatype such as UDX_NUMERIC, UDX_FIXED, UDX_VARIABLE,
UDX_NATIONAL_FIXED or UDX_NATIONAL_VARIABLE. For a description of these
datatypes, see Supported Data Types on page D-1.
numSizerArgs Method
Specifies the number of arguments in the UDF signature.
Syntax
The method has the following syntax:
int numSizerArgs() const;
Description
This method indicates the number of arguments that the function is called with.
sizerArgType Method
Specifies the datatype of the specified argument of the function.
Syntax
The method has the following syntax:
int sizerArgType(int n) const;
Description
The method specifies the datatype of the argument of the function. The datatype can be
any of the enumerated types. The enumerated types are described
in Supported Data Types on page D-1.
Throws
The method throws an exception if n is out of range.
sizerStringArgSize Method
Returns the string size in characters of the specified argument.
Syntax
The method has the following syntax:
int sizerStringArgSize(int n) const;
Description
This method returns the size of the specified generic argument. This size is in characters,
not bytes.
Throws
The method throws exceptions if n is out of range, if the specified argument is not a string
type, or the specified argument does not have a size.
sizerNumericArgPrecision Method
Returns the precision of the specified numeric argument.
Syntax
The method has the following syntax:
int sizerNumericArgPrecision(int n) const;
Description
The method returns the precision component of the specified numeric argument.
Throws
The method throws exceptions if n is out of range, if the specified argument is not a
numeric type, or the specified argument does not have a size.
sizerNumericArgScale Method
Returns the scale of the specified numeric argument.
Syntax
The method has the following syntax:
int sizerNumericArgScale(int n) const;
Description
The method returns the scale component of the specified numeric argument.
Throws
The method throws exceptions if n is out of range, if the specified argument is not a
numeric type, or the specified argument does not have a size.
sizerStringSizeValue Method
Builds a string return value size for the specified string length.
Syntax
The method has the following syntax:
uint64 sizerStringSizeValue(int len) const;
Description
The method builds a return value for the calculateSize method using the specified string
length len. The string length must be in characters, not bytes.
Throws
The method throws an exception if the return type is not a string.
sizerNumericSizeValue Method
Builds a numeric return value for the specified precision and scale values.
Syntax
The method has the following syntax:
uint64 sizerNumericSizeValue(int prec, int scale) const;
Description
The method builds a numeric return value size for the calculateSize() method as a uint64
(which is a format that Netezza recognizes) using the values specified in prec and scale.
Throws
The method throws an exception if the return type is not a numeric.
isSizerArgConstant Method
Returns true if the specified argument is a constant integer.
Syntax
The method has the following syntax:
bool isSizerArgConstant(int n) const;
Description
To support certain methods like round(val, scale), this method provides a mechanism for
passing constant arguments to the sizer as long as they are of int32 type. This method
returns true if the specified argument is a constant integer (0 or any positive or negative
number, except -1), or false if the constant specified is -1.
Throws
The method throws exceptions if n is out of range or if the specified argument is not an
int32 datatype.
sizerGetConstantArg Method
Returns the specified constant int32 argument.
Syntax
The method has the following syntax:
int32 sizerGetConstantArg(int n) const;
Description
This method returns the specified constant int32 argument.
Throws
The method throws exceptions if n is out of range, the specified argument is not an int32,
or the specified argument is not constant.
calculateSize Method
Provides the sizing calculations for strings and numerics in generic UDFs.
Syntax
The method has the following syntax:
virtual uint64 calculateSize() const
{
return 0xFFFFFFFFFFFFFFFFLL;
}
Description
If your UDF uses the ANY keyword as the size specified for a numeric or string return value,
you must override this method to provide the sizing capabilities.
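To make the override pattern concrete, here is a standalone sketch for a hypothetical string concatenation function. SizerStub is a stand-in that imitates only the sizer method names and signatures listed in this section; it is not the real base class, and its sizerStringSizeValue() simply returns the length because the real encoding is internal to the system. ConcatSizer and the 64,000-character cap (the VARCHAR maximum mentioned earlier) are illustrative assumptions.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for the sizer interface of a generic UDF.
class SizerStub {
public:
    explicit SizerStub(std::vector<int> argSizes) : argSizes_(std::move(argSizes)) {}
    virtual ~SizerStub() {}
    int numSizerArgs() const { return static_cast<int>(argSizes_.size()); }
    // Declared size, in characters, of argument n (throws if n is out of range).
    int sizerStringArgSize(int n) const { return argSizes_.at(n); }
    // Stub encoding: the real method builds a system-specific size value.
    std::uint64_t sizerStringSizeValue(int len) const {
        return static_cast<std::uint64_t>(len);
    }
    // Default behavior, matching the syntax shown above.
    virtual std::uint64_t calculateSize() const { return 0xFFFFFFFFFFFFFFFFULL; }
private:
    std::vector<int> argSizes_;
};

// Sizer for a concatenation: the result can be as long as the sum of the
// string arguments, capped at 64,000 characters.
class ConcatSizer : public SizerStub {
public:
    using SizerStub::SizerStub;
    std::uint64_t calculateSize() const override {
        int total = 0;
        for (int n = 0; n < numSizerArgs(); ++n)
            total += sizerStringArgSize(n);
        return sizerStringSizeValue(std::min(total, 64000));
    }
};
```

The return size is derived from the argument sizes alone, not from any row data, which is the information the sizer methods in this section expose.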
CHAPTER 3
Creating User-Defined Table Functions
What's in this chapter
Creating the C++ File for the UDTF
Compiling the UDTF
Registering the UDTF
Using the UDTF in a SQL Query
Altering and Dropping a UDTF
This chapter describes the steps to create a user-defined table function (UDTF) and to
register it for use on a Netezza system.
In addition, make sure that you declare any of the standard C++ library header files that
your table function may require. If your UDTF requires any user-defined shared libraries,
make sure you note the name of the libraries as you will need them when you register the
UDTF in the database. For example:
#include "udxinc.h"
#include <string.h>
Note: User-defined shared libraries must exist in the database before you can register the
UDTF and specify those libraries as dependencies. You could register the UDTF without
specifying any library dependencies, and after the libraries are added, use the ALTER
FUNCTION command to update the UDTF definition with the correct dependencies. For
more information about user-defined shared libraries, see Creating a User-Defined Shared
Library on page 5-1.
The UDX classes and functions for API version 2 are defined in a namespace called
nz::udx_ver2. Your C++ program must reference the correct namespace. For example:
#include "udxinc.h"
using namespace nz::udx_ver2;
To implement a UDTF, you create a new class object derived from the Udtf base class. For
example:
#include "udxinc.h"
using namespace nz::udx_ver2;
class parseNames : public Udtf {
public:
};
The parseNames UDTF takes an input table of string fields which are separated by spaces
or commas, and returns a table where each field of the requested string is output on its
own row. As with other UDXs, you define the variables required for the UDTF algorithm at
the class level. For example:
#include "udxinc.h"
using namespace nz::udx_ver2;
class parseNames : public Udtf {
private:
char value[1000];
int valuelen;
int i;
public:
};
Each UDTF must implement the instantiate() method and a constructor, as well as two additional UDTF-specific methods: newInputRow() and nextOutputRow(). The
nextEoiOutputRow() UDTF-specific method is optional. An example of the methods and
their purpose follows.
As with UDFs, you call the instantiate() method to create the UDTF object dynamically.
In UDX version 2, the instantiate method takes one argument (UdxInit *pInit), which
enables access to the memory specification, the log setting, and the UDX environment
(see UDX Environment on page 2-7). The constructor must take a UdxInit object as
well and pass it to the base class constructor. An example follows:
#include "udxinc.h"
using namespace nz::udx_ver2;
class parseNames : public Udtf {
private:
char value[1000];
int valuelen;
int i;
public:
parseNames(UdxInit *pInit) : Udtf(pInit) {}
static Udtf* instantiate(UdxInit*);
};
For a UDTF, you use the newInputRow() method to perform initialization actions such
as copying input arguments, initializing class variables, and managing situations such
as null input variables. The method is called once for each input row. For the parseNames UDTF example, the following sample code copies the input list to the variable
value, sets valuelen to the length of the input string, and initializes the variable i to
zero (0):
virtual void newInputRow() {
    StringArg *valuesa = stringArg(0);
    bool valuesaNull = isArgNull(0);
    if (valuesaNull)
        valuelen = 0;
    else {
        if (valuesa->length >= 1000)
            throwUdxException("Input value must be less than 1000 characters.");
        memcpy(value, valuesa->data, valuesa->length);
        value[valuesa->length] = 0;
        valuelen = valuesa->length;
    }
    i = 0;
}
You use the nextOutputRow() method to create and return the next output row of the
table. The method should also detect whether there is no more data to return and then
return Done. NPS calls this method at least once per input row. Sample code follows:
virtual DataAvailable nextOutputRow() {
    if (i >= valuelen)
        return Done;
    // save starting position of name
    int start = i;
    // scan string for next comma
    while ((i < valuelen) && value[i] != ',')
        i++;
    // return word
    StringReturn *rk = stringReturnColumn(0);
    if (rk->size < i-start)
        throwUdxException("Value exceeds return size");
    memcpy(rk->data, value+start, i-start);
    rk->size = i-start;
    i++;
    return MoreData;
}
As shown in the example above, you create a column using the appropriate column
return type such as stringReturnColumn() or intReturnColumn() and you specify the
zero-based position of the column such as 0, 1, 2, and so on. The return MoreData syntax indicates that there is another row to process. When the counter variable i reaches the end
of the input string, there is no more data to process and nextOutputRow() returns Done.
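The calling sequence can be easier to follow outside the UDX framework. The following standalone sketch models the same protocol with ordinary C++ strings: one newInputRow() call per input value, then repeated nextOutputRow() calls until Done. CommaSplitter, RowStatus, and splitAll() are hypothetical names for illustration and are not part of the UDX API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative counterpart of the Done and MoreData return values.
enum class RowStatus { Done, MoreData };

// Standalone model of the parseNames row protocol.
class CommaSplitter {
public:
    // Called once per input value: stores the input and resets the scan position.
    void newInputRow(const std::string &input) {
        value_ = input;
        pos_ = 0;
    }
    // Writes the next comma-separated field to out and returns MoreData, or
    // returns Done once the scan position has passed the end of the input.
    RowStatus nextOutputRow(std::string &out) {
        if (pos_ >= value_.size())
            return RowStatus::Done;
        std::size_t start = pos_;
        while (pos_ < value_.size() && value_[pos_] != ',')
            ++pos_;
        out = value_.substr(start, pos_ - start);
        ++pos_;  // step over the comma (or past the end of the input)
        return RowStatus::MoreData;
    }
private:
    std::string value_;
    std::size_t pos_ = 0;
};

// Drives the splitter the way a caller would for a single input row.
std::vector<std::string> splitAll(const std::string &input) {
    CommaSplitter splitter;
    splitter.newInputRow(input);
    std::vector<std::string> rows;
    std::string field;
    while (splitter.nextOutputRow(field) == RowStatus::MoreData)
        rows.push_back(field);
    return rows;
}
```

As in the sample code, an empty input produces no rows and a trailing comma does not produce an extra empty row, because the Done check runs before each scan.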
If your UDTF supports the TABLE WITH FINAL syntax, you use the nextEoiOutputRow()
method at least once after the end of the input to process and output all the data. The
base class has a default implementation of this method that returns no rows when
called. It is similar to nextOutputRow() except that newInputRow() is not called before
it. A sample method follows:
virtual DataAvailable nextEoiOutputRow() {
    return Done;
}
To compile the parseNames.cpp file and create the output object files:
nzudxcompile parseNames.cpp
parseNames.o_x86 is the object file for the Netezza host (i386 Linux platform on x86).
parseNames.o_spu10 is the object file for the Linux-based Rev10 SPUs on IBM
Netezza 1000 and Netezza 100 models.
For a complete description of the nzudxcompile command and its options and usage, refer
to nzudxcompile Command Syntax on page 6-18.
After you create the compiled object files, you must register the UDTF with the Netezza
system. You can register a user-defined function using the CREATE FUNCTION command,
as described in the next section.
Optionally, you can also compile and register a UDTF in one step using the nzudxcompile
command. For example, to compile the parseNames C++ file and also register it in a sample database called mydb:
nzudxcompile --sig "parseNames(VARCHAR(ANY))" --return
"TABLE(product_id VARCHAR(200))" --class parseNames --version 2
parseNames.cpp -user myuser -pw password -db mydb
For example, to register the sample function parseNames to the Netezza system, start an
nzsql session to your database (which is named mydb in this example):
nzsql mydb myuser password
Next, use the CREATE FUNCTION Netezza SQL command to register the UDTF:
MYDB(MYUSER)=> CREATE FUNCTION ParseNames(varchar(ANY))
RETURNS TABLE(product_id VARCHAR(200)) API VERSION 2 LANGUAGE CPP
PARAMETER STYLE NPSGENERIC EXTERNAL CLASS NAME 'parseNames'
EXTERNAL HOST OBJECT '/home/nz/udx_files/parseNames.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/parseNames.o_spu10';
If the command is successful, it returns the message CREATE FUNCTION. It creates the
UDTF in the mydb database, and the UDTF is owned by myuser. To create a function, your
user account must have Create Function administration permission or you must be logged
in as the admin user. For a description of the required privileges for these commands, refer
to Managing User Account Permissions on page 6-1.
By default, the admin user account has execute access to all user-defined functions and
aggregates. The user account that registered a UDTF also has execute access to that UDTF.
Other users can be given permission to run specific or all UDTFs.
behavior of the TABLE WITH FINAL syntax depends on the locus, that is the location,
where the UDTF runs. If the UDTF executes on the S-Blades, for example, the TABLE WITH
FINAL post-processing occurs once per dataslice.
When you register a UDTF, you can control whether the user can invoke the UDTF with the
WITH FINAL syntax. If you register the UDTF as TABLE, TABLE FINAL ALLOWED, the user
can specify either TABLE or TABLE WITH FINAL syntax. If the UDTF is registered as
TABLE ALLOWED, for example, the user can specify only the TABLE syntax option. Likewise, if the UDTF is registered as TABLE FINAL ALLOWED, the user must use the TABLE
WITH FINAL syntax option.
The arguments that you specify (not the join qualifier) determine the type of correlation
that occurs.
In the catalog: table functions can execute only on the host or a SPU (in the case of
a correlated table function).
In a materialized view: materialized views operate using data which is stored on disk.
For the parseNames UDTF, the following query shows how the function could be invoked as
an uncorrelated table function:
mydb(usr1)=> SELECT * FROM TABLE(parseNames('1,2,3,4,5'));
 PRODUCT_ID
------------
 1
 2
 3
 4
 5
(5 rows)
With inner correlation, the table function is invoked once for each input row. The table
output contains all of the output rows produced for that input row plus the corresponding input row. If the table function does not produce an output row for a given input
row, the input row will be omitted from the table output. Also, if the table function produces an output row for a given input row, but the join qualifier evaluates to false, the
input row will be omitted from the combined output. If the UDTF can be called using
the TABLE WITH FINAL syntax, note that there may be additional output rows as a
result of the WITH FINAL processing. Two examples follow:
With left outer correlation, the major difference is that in cases where the input row
does not produce output or cases where the join qualifier evaluates to false, the UDTF
displays the result of the table function with NULL values in its columns. For example:
It cannot occur in a RIGHT OUTER JOIN where it is laterally correlated to the table
being joined on.
It cannot occur in a FULL OUTER JOIN where it is laterally correlated to the table
being joined on.
When used in a LEFT OUTER or INNER JOIN, where it is laterally correlated to the
table in the join clause, you will get correlation behavior and not join behavior.
To identify the behavior of the chain of correlated functions, it can be helpful to divide the
query into parts and examine the results for each part. For example, the first part is the
query that joins the orders table with the parseNames UDTF as follows. For brevity, the output shows only the results for the first two order IDs (120 and 124):
mydb(usr1)=> SELECT t.order_id, t.prod_codes, f.product_id FROM orders
t JOIN TABLE(parseNames(prod_codes)) AS f ON TRUE ORDER BY order_id;
 ORDER_ID |      PROD_CODES       | PRODUCT_ID
----------+-----------------------+------------
      120 | 28,36,80              | 28
      120 | 28,36,80              | 36
      120 | 28,36,80              | 80
      124 | 124,6,12,121          | 124
      124 | 124,6,12,121          | 6
      124 | 124,6,12,121          | 12
      124 | 124,6,12,121          | 121
...
(34 rows)
As the output shows, the parseNames UDTF returns a table with a row for each unique
value in the prod_codes string of values. This initial result set is fed into the next join,
which invokes the parseNames function for each unique value in the f.product_id column,
as follows:
mydb(usr1)=> SELECT t.order_id, t.prod_codes, f.product_id,
x.product_id FROM orders t JOIN TABLE(parseNames(prod_codes)) AS f ON TRUE JOIN
TABLE (parseNames(f.product_id)) x ON TRUE ORDER BY order_id;
 ORDER_ID |      PROD_CODES       | PRODUCT_ID | PRODUCT_ID
----------+-----------------------+------------+------------
      120 | 28,36,80              | 28         | 28
      120 | 28,36,80              | 36         | 36
      120 | 28,36,80              | 80         | 80
      124 | 124,6,12,121          | 124        | 124
      124 | 124,6,12,121          | 6          | 6
      124 | 124,6,12,121          | 12         | 12
      124 | 124,6,12,121          | 121        | 121
...
(34 rows)
The Netezza system offers a number of shaper methods that you can use to collect information on the input columns (including constant values) and to provide information about the
output shape. For a description of the methods, see UDTF Shaper Methods on
page D-10.
An example of a calculateShape() method follows:
void calculateShape(UdxOutputShaper *shaper) {
    if (shaper->numArgs() != 1)
        throwUdxException("Expecting only one argument");
    int nType = shaper->argType(0);
    if ((UDX_FIXED == nType) || (UDX_VARIABLE == nType)) {
        int len = shaper->stringArgSize(0);
        char ucstr[] = "UPPER_CASE";   // For column names on systems that
        char lcstr[] = "lower_case";   // use lowercase naming
        char tcstr[] = "Title_Case";
        char ucstrU[] = "UPPER_CASE";  // For column names on systems
        char lcstrU[] = "LOWER_CASE";  // that use uppercase naming
        char tcstrU[] = "TITLE_CASE";
        if (shaper->isSystemCaseUpper()) {
            shaper->addOutputColumn(nType, ucstrU, len);
            shaper->addOutputColumn(nType, lcstrU, len);
            shaper->addOutputColumn(nType, tcstrU, len);
        }
        else {
            shaper->addOutputColumn(nType, ucstr, len);
            shaper->addOutputColumn(nType, lcstr, len);
            shaper->addOutputColumn(nType, tcstr, len);
        }
    }
    else {
        throwUdxException("Only CHAR and VARCHAR types are supported");
    }
}
In this example, note that the UDTF is designed to take an input string and output three
columns of data: an uppercase version, a lowercase version, and a title case or initial-cap
version of the string. The shaper verifies that only one string is input at a time, and that the
string is of type CHAR or VARCHAR. The function also displays the column headings
using a mixed capitalization on systems where the system casing is lowercase, or all uppercase characters on systems where the case is uppercase.
For a complete example of the UcLcTc UDTF, see Sample UDTF with Generic Return
Value on page F-13.
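The shaper only declares the three output columns; the values themselves are produced per row. As a rough standalone illustration of that row-level work, the sketch below derives the three variants from one input string. CaseVariants and makeCaseVariants() are hypothetical names, and the title-case rule used here (capitalize a character that follows a non-alphanumeric character) is an assumption, not necessarily the rule used in the complete sample.

```cpp
#include <cctype>
#include <string>

// The three values that would fill the uppercase, lowercase, and title-case
// output columns for one input row.
struct CaseVariants {
    std::string upper;
    std::string lower;
    std::string title;
};

// Builds all three variants in a single pass over the input.
CaseVariants makeCaseVariants(const std::string &input) {
    CaseVariants out;
    bool startOfWord = true;
    for (char ch : input) {
        // Cast through unsigned char, as the <cctype> functions require.
        unsigned char c = static_cast<unsigned char>(ch);
        char up = static_cast<char>(std::toupper(c));
        char lo = static_cast<char>(std::tolower(c));
        out.upper += up;
        out.lower += lo;
        out.title += startOfWord ? up : lo;
        // The next character starts a word if this one is not alphanumeric.
        startOfWord = (std::isalnum(c) == 0);
    }
    return out;
}
```

Each variant has the same length as the input, which is consistent with the shaper declaring all three columns with the input argument's size.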
calculateShape Method
Specifies the shape of the table returned by the UDTF.
Syntax
The method has the following syntax:
virtual void calculateShape(UdxOutputShaper *shaper)
Description
UDTFs that return a generic table size (that is, which specify RETURNS TABLE(ANY)) must
include a calculateShape() method to define the shape and content of the return table.
The UdxOutputShaper object has methods that you can use to retrieve information about
the input to the table function as well as to set the shape of the output.
addOutputColumn Method
Defines an output column for the table.
Syntax
This method has the following syntax:
void addOutputColumn(int nType, const char* strName, int nSize);
void addOutputColumn(int nType, const char* strName, int precision,
int scale);
void addOutputColumn(int nType, const char* strName);
Description
The addOutputColumn method operates on the UdxOutputShaper object to build an output
column definition for the table using the specified input values. The version that you invoke
depends on the data type you are defining. For example, use the precision and scale variant
for numerics (but not doubles or floats), the size version for strings, and the other for all
other data types.
numOutputColumns Method
Returns the number of output columns for the table.
Syntax
This method has the following syntax:
int numOutputColumns();
Description
The numOutputColumns method operates on the UdxOutputShaper object and returns the
number of output columns specified for the table.
getOutputColumn Method
Returns the specified table column.
Syntax
This method has the following syntax:
const UdxColumnInfo* getOutputColumn(int n);
Description
The getOutputColumn method operates on the UdxOutputShaper object and returns the
specified table column. The n value specifies the column that you want to return.
isSystemCaseUpper Method
Verifies whether the Netezza database system case is in uppercase or lowercase.
Syntax
This method has the following syntax:
bool isSystemCaseUpper();
Description
The isSystemCaseUpper method operates on the UdxOutputShaper object and returns true
if the Netezza system case is in uppercase, or false if it is lowercase.
getType Method
Returns the datatype of the column.
Syntax
This method has the following syntax:
int getType();
Description
The getType method operates on the UdxColumnInfo object and returns the datatype of the
table column.
getSize Method
Returns the size of the column.
Syntax
This method has the following syntax:
int getSize();
Description
The getSize method operates on the UdxColumnInfo object. The method returns the length
of the string for a column that contains a string value.
getPrecision Method
Returns the precision value for a column.
Syntax
This method has the following syntax:
int getPrecision();
Description
The getPrecision method operates on the UdxColumnInfo object. For columns that contain
numeric data, the method returns the precision value.
getScale Method
Returns the scale value for a column.
Syntax
This method has the following syntax:
int getScale();
Description
The getScale method operates on the UdxColumnInfo object. For columns that contain
numeric data, the method returns the scale value.
getName Method
Returns the name of a column.
Syntax
This method has the following syntax:
const char* getName();
Description
The getName method operates on the UdxColumnInfo object. The method returns the
name of the column.
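As a rough illustration of how the column accessors fit together, the following stand-alone sketch formats a column description from its type, name, size, precision, and scale. MockColumnInfo and the MOCK_* type tags are hypothetical stand-ins written for this example; only the accessor names (getType, getSize, getPrecision, getScale, getName) mirror the methods described above.

```cpp
#include <string>

// Hypothetical type tags; the real API uses its own UDX_* constants.
enum MockType { MOCK_STRING, MOCK_NUMERIC, MOCK_OTHER };

// Hypothetical stand-in that exposes the same accessor names as UdxColumnInfo.
class MockColumnInfo {
public:
    MockColumnInfo(MockType type, const std::string& name,
                   int size, int precision, int scale)
        : type_(type), name_(name), size_(size),
          precision_(precision), scale_(scale) {}
    int getType() const { return type_; }
    int getSize() const { return size_; }
    int getPrecision() const { return precision_; }
    int getScale() const { return scale_; }
    const char* getName() const { return name_.c_str(); }
private:
    MockType type_;
    std::string name_;
    int size_;
    int precision_;
    int scale_;
};

// Builds "NAME(size)" for strings, "NAME(precision,scale)" for numerics,
// and just "NAME" for every other type.
std::string describeColumn(const MockColumnInfo& col) {
    std::string text = col.getName();
    switch (col.getType()) {
    case MOCK_STRING:
        text += "(" + std::to_string(col.getSize()) + ")";
        break;
    case MOCK_NUMERIC:
        text += "(" + std::to_string(col.getPrecision()) + "," +
                std::to_string(col.getScale()) + ")";
        break;
    default:
        break;
    }
    return text;
}
```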
CHAPTER 4
Creating User-Defined Aggregates
What's in this chapter
Creating the C++ File for the UDA
Compiling the UDA
Registering the UDA with the Netezza System
Using the UDA in a SQL Query
Altering and Dropping UDAs
This chapter describes how to create a user-defined aggregate and register it on a Netezza
system. This chapter creates a simple aggregate called PenMax, which returns the second-greatest or second-largest value encountered. If there is not a second-greatest value, the
aggregate returns NULL.
In addition, make sure that you declare any of the standard C++ library header files that
your aggregate may require. If your UDA requires any user-defined shared libraries, make
sure you note the name of the libraries as you will need them when you register the UDA in
the database.
Note: User-defined shared libraries must exist in the database before you can register the
UDA and specify those libraries as dependencies. You could register the UDA without specifying any library dependencies, and after the libraries are added, use the ALTER
AGGREGATE command to update the UDA definition with the correct dependencies. For
more information about user-defined shared libraries, see Creating a User-Defined Shared
Library on page 5-1.
The UDX classes for API version 2 are defined in a namespace called nz::udx_ver2. (The
API version 1 UDXs use the nz::udx namespace.) Your C++ program must reference the correct namespace. For example:
#include "udxinc.h"
using namespace nz::udx_ver2;
Note: This chapter uses udx_ver2 as the default namespace for the examples that follow.
The sections note the differences with UDX version 1, and Appendix F, Sample User-Defined Functions and Aggregates Reference contains examples of version 1 and version
2 definitions. You can continue to create UDX version 1 UDAs as well as new version 2
UDAs; both will operate on Release 6.0.x systems. However, the version 1 UDAs will work
on Netezza Release 5.0.x and later systems and thus may be more portable for your
Netezza systems.
To implement a UDA, you create a new class object derived from the Uda base class. Continuing the PenMax example:
#include "udxinc.h"
using namespace nz::udx_ver2;
class CPenMax: public nz::udx_ver2::Uda
{
public:
};
Each UDA must implement the following five methods in addition to its constructor and
destructor. An example of the class header for the PenMax UDA follows:
class CPenMax : public nz::udx_ver2::Uda
{
public:
    static nz::udx_ver2::Uda* instantiate(UdxInit *pInit);
    virtual void initializeState();
    virtual void accumulate();
    virtual void merge();
    virtual ReturnValue finalResult();
};
nz::udx_ver2::Uda* CPenMax::instantiate(UdxInit *pInit)
{
    return new CPenMax(pInit);
}
instantiate() is called by the runtime engine to create the object dynamically. The static
implementation must be outside of the class definition. In UDX version 2, the instantiate method takes one argument (UdxInit *pInit), which enables access to the memory
specification, the log setting, and the UDX environment in the constructor (see UDX
Environment on page 2-7). It creates a new object of the derived class type using the
new operator and returns it (as base class type Uda) to the runtime engine. The
runtime engine deletes the object when it is no longer needed. An example follows:
class CPenMax : public nz::udx_ver2::Uda
{
public:
CPenMax(UdxInit *pInit) : Uda(pInit)
{
}
static nz::udx_ver2::Uda* instantiate(UdxInit *pInit);
virtual void initializeState();
virtual void accumulate();
virtual void merge();
virtual ReturnValue finalResult();
};
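The factory pattern behind instantiate() can be sketched without the Netezza headers. In the following stand-alone example, MockUda, MockPenMax, UdaFactory, and createUseAndDelete are hypothetical names; createUseAndDelete only imitates the role that the text assigns to the runtime engine (create the object through the static factory, use it, then delete it through the base-class pointer).

```cpp
#include <string>

// Hypothetical stand-in for the Uda base class; not the Netezza type.
class MockUda {
public:
    virtual ~MockUda() {}   // virtual, so deleting through the base pointer is safe
    virtual std::string name() const = 0;
};

class MockPenMax : public MockUda {
public:
    static MockUda* instantiate();   // declared inside the class
    virtual std::string name() const { return "PenMax"; }
};

// Defined outside the class definition, as described for instantiate().
MockUda* MockPenMax::instantiate() {
    return new MockPenMax;
}

typedef MockUda* (*UdaFactory)();

// Imitates the runtime engine: create through the factory, use, delete.
std::string createUseAndDelete(UdaFactory factory) {
    MockUda* uda = factory();
    std::string result = uda->name();
    delete uda;
    return result;
}
```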
initializeState() is called to allow the implementer to initialize the necessary state used
in the UDA. The state of a UDA is one or more values which must be valid Netezza
datatypes. The state is automatically preserved by the runtime engine between snippets, if necessary. To calculate the penultimate maximum, the function must keep track
of the largest two numbers in state variables. initializeState() sets both the variables to
NULL. The states are declared in the CREATE AGGREGATE command, which is
described later. An example follows:
void CPenMax::initializeState()
{
setStateNull(0, true); // set current max to null
setStateNull(1, true); // set current penmax to null
}
accumulate() is called once per row and adds the contribution of its arguments to the
aggregate's accumulator state. It updates the states to keep the highest two values in
the correct states. In addition to getting the argument through int32Arg(0), the
method retrieves the two state variables using the int32State(int) and
isStateNull(int) functions. The accumulate method updates the states as required.
void CPenMax::accumulate()
{
    int *pCurMax = int32State(0);
    bool curMaxNull = isStateNull(0);
    int *pCurPenMax = int32State(1);
    bool curPenMaxNull = isStateNull(1);
    int curVal = int32Arg(0);
    bool curValNull = isArgNull(0);
    if ( !curValNull ) {            // do nothing if argument is null - can't
                                    // affect max or penmax
        if ( curMaxNull ) {         // if current max is null, this arg
                                    // becomes current max
            setStateNull(0, false); // current max no longer null
            *pCurMax = curVal;
        } else {
            if ( curVal > *pCurMax ) {  // if arg is new max
                setStateNull(1, false); // then prior current max
                                        // becomes current penmax
                *pCurPenMax = *pCurMax;
                *pCurMax = curVal;      // and current max gets arg
            } else if ( curPenMaxNull || curVal > *pCurPenMax ) {
                // arg might be greater than current penmax
                setStateNull(1, false); // it is
                *pCurPenMax = curVal;
            }
        }
    }
}
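To experiment with the accumulate logic away from a Netezza system, the same decision order can be modeled in plain C++. This is only a sketch: NullableInt, PenMaxState, and the free functions below are hypothetical stand-ins for the state and argument accessors, with an isNull flag taking the place of isStateNull() and isArgNull().

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of a nullable INT4 value (state variable or argument).
struct NullableInt {
    bool isNull;
    int value;
};
NullableInt nullInt() { NullableInt n = {true, 0}; return n; }
NullableInt intValue(int v) { NullableInt n = {false, v}; return n; }

// Hypothetical model of the two PenMax state variables.
struct PenMaxState {
    NullableInt max;
    NullableInt penMax;
};

// Mirrors initializeState(): both states start as NULL.
void initializeState(PenMaxState& s) {
    s.max = nullInt();
    s.penMax = nullInt();
}

// Mirrors the decision order of CPenMax::accumulate() for one row.
void accumulate(PenMaxState& s, const NullableInt& arg) {
    if (arg.isNull) return;                    // NULL cannot affect either state
    if (s.max.isNull) {
        s.max = arg;                           // first non-NULL value seen
    } else if (arg.value > s.max.value) {
        s.penMax = s.max;                      // old max becomes the penmax
        s.max = arg;
    } else if (s.penMax.isNull || arg.value > s.penMax.value) {
        s.penMax = arg;                        // new second-greatest value
    }
}

// Feeds every row through accumulate(), as the engine does once per row.
PenMaxState accumulateRows(const std::vector<NullableInt>& rows) {
    PenMaxState s;
    initializeState(s);
    for (std::size_t i = 0; i < rows.size(); ++i) accumulate(s, rows[i]);
    return s;
}
```

For the rows 3, NULL, 9, 7 the model ends with a max of 9 and a penmax of 7; for a single row the penmax stays NULL.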
merge() is called with arguments of a second set of state variables and merges this second state into its own state variables. This method is necessary because the Netezza
system is a parallel-processing architecture, and the aggregate states from all SPUs
will be sent to the host, where they will be consolidated into a single merged aggregation state. The merge() method merges two states, handling all NULL states
correctly. One of the states is passed in normally, as in accumulate(). The second state
is passed in as arguments, so you retrieve it with argument retrieval functions such as
int32Arg(int) and isArgNull(int). An example follows:
void CPenMax::merge()
{
    int *pCurMax = int32State(0);
    bool curMaxNull = isStateNull(0);
    int *pCurPenMax = int32State(1);
    bool curPenMaxNull = isStateNull(1);
    int nextMax = int32Arg(0);
    bool nextMaxNull = isArgNull(0);
    int nextPenMax = int32Arg(1);
    bool nextPenMaxNull = isArgNull(1);
    if ( !nextMaxNull ) {   // if next max is null, then so is
                            // next penmax and we do nothing
        if ( curMaxNull ) {
            setStateNull(0, false); // current max was null,
                                    // so save next max
            *pCurMax = nextMax;
        } else {
            if ( nextMax > *pCurMax ) {
                setStateNull(1, false);
                // next max is greater than current, so save next
                *pCurPenMax = *pCurMax;
                // and make current penmax prior current max
                *pCurMax = nextMax;
            } else if ( curPenMaxNull || nextMax > *pCurPenMax ) {
                // next max may be greater than current penmax
                setStateNull(1, false); // it is
                *pCurPenMax = nextMax;
            }
        }
        if ( !nextPenMaxNull ) {
            if ( isStateNull(1) ) {
                // can't rely on curPenMaxNull here, might have
                // changed state var null flag above
                setStateNull(1, false); // first non-null penmax,
                                        // save it
                *pCurPenMax = nextPenMax;
            } else {
                if ( nextPenMax > *pCurPenMax ) {
                    *pCurPenMax = nextPenMax;
                    // next penmax greater than current, save it
                }
            }
        }
    }
}
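The merge step can also be modeled in plain C++ to check that combining partial states gives the expected result. As before, this is a sketch under assumptions: NullableInt, PenMaxState, and mergeInto are hypothetical names, and the "next" state is passed as a second structure rather than through int32Arg() and isArgNull().

```cpp
// Hypothetical model of a nullable INT4 value.
struct NullableInt {
    bool isNull;
    int value;
};

// Hypothetical model of the two PenMax state variables.
struct PenMaxState {
    NullableInt max;
    NullableInt penMax;
};

// Folds the state "next" (for example, from another SPU) into "cur",
// following the same order of checks as CPenMax::merge().
void mergeInto(PenMaxState& cur, const PenMaxState& next) {
    if (next.max.isNull) return;   // then next.penMax is NULL too
    if (cur.max.isNull) {
        cur.max = next.max;
    } else if (next.max.value > cur.max.value) {
        cur.penMax = cur.max;      // prior max becomes the penmax
        cur.max = next.max;
    } else if (cur.penMax.isNull || next.max.value > cur.penMax.value) {
        cur.penMax = next.max;
    }
    // Re-read cur.penMax here: it may have changed in the branch above.
    if (!next.penMax.isNull &&
        (cur.penMax.isNull || next.penMax.value > cur.penMax.value)) {
        cur.penMax = next.penMax;
    }
}
```

Merging the state (max 8, penmax 2) into (max 9, penmax 7) leaves a max of 9 and a penmax of 8, because the incoming max is the second-greatest value overall.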
finalResult() returns the final aggregation value from the accumulated state. A simple
example might be a UDA implementation of an average aggregation, where the finalResult() method divides the sum by the count to produce an average. In this example, the
finalResult() method gathers one of the states and returns it using the
NZ_UDX_RETURN_INT32 macro in a similar fashion to evaluate() in the UDF case.
ReturnValue CPenMax::finalResult()
{
    int curPenMax = int32Arg(1);
    bool curPenMaxNull = isArgNull(1);
    if ( curPenMaxNull )
        NZ_UDX_RETURN_NULL();
    setReturnNull(false);
    NZ_UDX_RETURN_INT32(curPenMax);
}
The NZ_UDX_RETURN_INT32 macro helps to confirm that the return value is of the
expected type. For a list of the available return macros, refer to UDX Return Value Macros on page D-8. The finalResult method can access all of the datatype helper API calls,
as well as a list of state arguments that are listed in UDA State Arguments on page D-8.
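The average aggregation mentioned above as a simple finalResult() case can be sketched in stand-alone C++. AvgState, NullableDouble, and the free functions are hypothetical names that only mirror the roles of the four UDA methods; returning NULL for an empty input is an assumption of this sketch, not a statement about any Netezza built-in.

```cpp
// Hypothetical state for an average: a running sum and a row count.
struct AvgState {
    long long sum;
    long long count;
};

// Hypothetical nullable result type.
struct NullableDouble {
    bool isNull;
    double value;
};

void initializeState(AvgState& s) { s.sum = 0; s.count = 0; }

// One row contributes its value to the sum and one to the count.
void accumulate(AvgState& s, int value) { s.sum += value; ++s.count; }

// Partial states from different SPUs combine by simple addition.
void merge(AvgState& s, const AvgState& other) {
    s.sum += other.sum;
    s.count += other.count;
}

// Divides the sum by the count; yields NULL when no rows were seen.
NullableDouble finalResult(const AvgState& s) {
    NullableDouble r = {true, 0.0};
    if (s.count > 0) {
        r.isNull = false;
        r.value = static_cast<double>(s.sum) / static_cast<double>(s.count);
    }
    return r;
}
```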
To compile the PenMax C++ file and create the API version 2 object files:
nzudxcompile penmax.cpp
penmax.o_x86 is the object file for the Netezza host (i386 Linux platform on x86).
penmax.o_spu10 is the object file for the Linux-based Rev10 SPUs on IBM Netezza
1000 and Netezza 100 models.
For a complete description of the nzudxcompile command and its options and usage, refer
to nzudxcompile Command Syntax on page 6-18.
After you create the compiled object files, you must register the UDA with the Netezza system so that query writers can include the UDA in their queries. The next section,
Registering the UDA with the Netezza System, describes how to register the UDA.
Optionally, you can also compile and register the UDA in one step using the nzudxcompile
command. This example also shows that you must include the --version 2 syntax when
you are using the command to compile and register an API version 2 UDA.
To compile the PenMax C++ file and also register it in a sample database called mydb:
nzudxcompile /home/nz/udx_files/PenMax.cpp -o PenMax.o
-sig "PenMax(int4)" --version 2 -return INT4 -class CPenMax
--state "(int4, int4)" -user myuser -pw password -db mydb
If the command is successful, it creates the aggregate in the default database. The UDA
will be owned by the user account that issues the SQL command. To create an aggregate,
your user account must have Create Aggregate permission, or you must be logged in as the
admin user.
Note: Each user-defined aggregate must also have a unique signature. For a description of
signatures and how the Netezza system processes them, see Function and Aggregate Signatures on page 2-6.
CHAPTER 5
Creating User-Defined Shared Libraries
What's in this chapter
Creating a User-Defined Shared Library
Library Loading Options
Compiling and Linking the Shared Library
Registering the Shared Library in a Database
Using the Shared Library with a UDX
Altering and Dropping Shared Libraries
Clearing Dependencies
This chapter describes the steps to create a user-defined shared library and to register it for
use on a Netezza system.
An automatic load library is automatically loaded into the system and added to the global space. At snippet execution time, the system ensures that automatic load libraries
are automatically opened, and library symbols are available for use. The library is automatically closed after the snippet finishes. Automatic load is the default method for
user-defined shared libraries.
Manual load means that a user-defined shared library is directly managed by a UDX.
The UDX must use the dlopen(), dlsym(), and dlclose() functions to load the library, reference symbols, and to close the library when finished. UDXs are responsible for
opening and closing the manual load libraries when they are needed.
If you create a shared library that has dependencies on other user-defined shared libraries,
you should define the top-level library as AUTOMATIC LOAD. The subsequent or referenced
libraries should also be AUTOMATIC LOAD.
2. Link the compiled object into a shared library for the host:
nzudxcompile --objs /home/nz/libs/mylib.o --host -o
/home/nz/libs/host/mylib.so
4. Link the compiled object into a shared library for the SPU:
nzudxcompile --objs /home/nz/libs/mylib.o --dynamic --spu
-o /home/nz/libs/spu/mylib.so
Note: The --dynamic switch is used only when compiling shared libraries for the SPU
environment.
mylib.o_x86 is the object file for the Netezza host (i386 Linux platform on x86).
mylib.o_spu10 is the object file for the Linux-based Rev10 SPUs on IBM Netezza
1000 and Netezza 100 models.
For a complete description of the nzudxcompile command and its options and usage, refer
to nzudxcompile Command Syntax on page 6-18.
After you create the compiled object files, you must register the library with the Netezza
system so that other UDXs can specify it as a dependency. The next section, Registering
the Shared Library in a Database, describes how to register the library.
If the command is successful, it creates the user-defined shared library in the default database. The library will be owned by the user account that issues the SQL command. To
create a library, your user account must have Create Library permission, or you must be
logged in as the admin user.
Clearing Dependencies
If you create a UDX and declare dependencies for it, you can remove the dependencies
using the NO DEPENDENCIES option of the ALTER FUNCTION|AGGREGATE|LIBRARY commands or the CREATE [OR REPLACE] FUNCTION|AGGREGATE|LIBRARY commands. The
NO DEPENDENCIES option is the default for these commands. It indicates that the UDX
does not have any dependencies, and if the UDX object already exists, it clears any previous dependencies for the object.
For example, to clear the dependencies for the sample UDF myfunc:
MYDB(MYUSER)=> CREATE OR REPLACE FUNCTION myfunc(int)
RETURNS int4 LANGUAGE CPP PARAMETER STYLE NPSGENERIC
EXTERNAL CLASS NAME 'CMyFunc' NO DEPENDENCIES
EXTERNAL HOST OBJECT '/home/nz/udx_files/myfunc.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/myfunc.o_spu';
CHAPTER 6
Common UDX Development Topics
What's in this chapter
Managing User Account Permissions
Documenting a UDX
UDX Development Best Practices
nzudxcompile Command Syntax
Migrating UDXs from API Version 1 to API Version 2
This chapter describes some UDX development topics and best practices for the Netezza
system. These topics generally apply to all types of UDXs, such as functions, aggregates,
and shared libraries.
For example, the following command grants Create Function permissions to the user
myuser:
GRANT CREATE FUNCTION TO myuser;
For example, the following command grants all permissions for aggregate objects to the
group analysts:
GRANT ALL ON AGGREGATE TO GROUP analysts;
For example, the following command revokes Create Library permissions from the group
analysts:
REVOKE CREATE LIBRARY FROM GROUP analysts;
Always specify a complete signature for the object value. For example, to grant Alter permissions for the sample function CustomerName (described later in this chapter) to the
user myuser:
GRANT ALTER ON CustomerName(varchar(64000)) TO myuser;
To revoke Alter permissions on the CustomerName function from the group sales:
REVOKE ALTER ON CustomerName(varchar(64000)) FROM GROUP sales;
Always use a complete signature for the object. For example, to grant Execute permissions
for the sample function CustomerName to the user myuser, you can use the following
command:
GRANT EXECUTE ON CustomerName(varchar(64000)) TO myuser;
To revoke Execute permissions for the sample aggregate PenMax (described in Chapter 4)
from the group sales:
REVOKE EXECUTE ON PenMax(int4) FROM GROUP sales;
Always specify a complete signature for the object value. For example, to grant Drop permissions for the sample function CustomerName (described later in this chapter) to the
user newuser:
GRANT DROP ON CustomerName(varchar(64000)) TO newuser;
To revoke Drop permissions on the CustomerName function from the user myuser:
REVOKE DROP ON CustomerName(varchar(64000)) FROM myuser;
Documenting a UDX
As a best practice, Netezza recommends that you follow a documentation and comments
convention for each UDX. One method is to include all the necessary DDL commands and
sample usage at the top of the C++ file as a comment. This can help you to reconstruct the
purpose, compilation code, Netezza registration arguments, and any additional information
such as table format for your functions.
The following example shows a complete listing of the customername.cpp file with both the
code and documentation comments at the beginning of the file.
/*
Copyright (c) 2007-2009 Netezza Corporation, an IBM Company
All rights reserved.
Function CustomerName takes a string and returns an integer 1 if it
begins with 'Customer A' and 0 otherwise.
REGISTRATION:
create or replace function
CustomerName(varchar(64000))
returns INT4
language cpp
parameter style npsgeneric
EXTERNAL CLASS NAME 'CCustomerName'
EXTERNAL HOST OBJECT '/home/nz/udx_files/customername.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/customername.o_spu10';
USAGE:
create
insert
insert
insert
insert
};
Udf* CCustomerName::instantiate()
{
return new CCustomerName;
}
A Netezza SQL query user can display these comments using the nzsql \dd <name> command switch, or the \dd switch which will show all comments for all functions.
As a best practice, you should create comments for all UDXs, including information about
the author, version, and description, in a format similar to the following:
COMMENT ON FUNCTION <function name> (<argument type list>) IS
'Author: <name>
Version: <version>
Description: <description>';
For example:
COMMENT ON FUNCTION CustomerName(varchar(64000)) IS 'Author: name
Version: 1.0 Description: Sample UDF from Dev Guide';
To comment on a UDX, you must either be the Netezza admin user, the owner of the UDX,
or you must have COMMENT permissions for the objects. For more information about COMMENT ON, refer to the IBM Netezza Database User's Guide.
For UDFs and UDAs, make sure that you specify a full signature name (argument type list),
including correct sizes for numeric and string datatypes. Otherwise you will receive an error
similar to the following:
Error: CommentAggregate: existing UDX name(argument type list)
differs in size of string/numeric arguments
For functions, you can use the SHOW FUNCTION SQL command or the \df nzsql command argument.
For aggregates, you can use the SHOW AGGREGATE SQL command or the \da nzsql
command argument.
For shared libraries, you can use the SHOW LIBRARY SQL command or the \dl nzsql
command argument.
The SQL commands and the nzsql switches provide the same output information; they
show information for all functions or aggregates: the standard Netezza SQL built-in functions and aggregates as well as the UDFs and UDAs in the database.
A sample \df command and its output follows. Note that the output has been truncated for
easier viewing. (The output is identical for the SHOW FUNCTION command.)
MYDB(MYUSER)=> \df
                           List of functions
      RESULT      |   FUNCTION   | BUILTIN |          ARGUMENTS
------------------+--------------+---------+-----------------------------
 BIGINT           | ABS          | t       | (BIGINT)
 DOUBLE PRECISION | ABS          | t       | (DOUBLE PRECISION)
 INTEGER          | ABS          | t       | (INTEGER)
 DOUBLE PRECISION | COS          | t       | (DOUBLE PRECISION)
 DOUBLE PRECISION | COT          | t       | (DOUBLE PRECISION)
 INTEGER          | CUSTOMERNAME | f       | (CHARACTER VARYING (64000))
In the output above, note that the UDF customername shows f (false) for the BUILTIN
value. Standard functions and aggregates that are supplied with Netezza SQL by default
are called built-ins, and they show a value of t (true).
Use the plus sign switch (\df+, \da+, and \dl+) or the VERBOSE option for the SQL commands to obtain verbose output, which will include the comments specified for them if you
follow the best practice described in the previous section.
To create a synonym in the current database for a UDX that resides in a different database
on the same Netezza system, follow the same steps as for a Netezza table, view, or object.
For example:
CREATE SYNONYM <name> FOR <object name>
The <object name> can be a local UDX or a fully qualified name for a UDX in a different
database. Query users can then invoke the UDX using its synonym <name>.
Also note that UDFs and UDAs can be used in a view, just as built-in functions can. If you
use them in a view, note that permission-checking will be based on the view permissions,
and the Execute object permissions are bypassed. For a complete description of how to create and use synonyms, as well as for the use of functions in views, refer to the IBM Netezza
Database User's Guide.
Although the TRUNC and trunc functions are different functions in the database, this generally leads to user confusion over the two similarly named functions. Both functions
appear to be the same function, but they could have very different definitions and purposes. Also, if the system administrator should ever change the system case to lowercase,
an identifier name collision would result.
The _v_dual_dslice view returns the dataslice ID for each dataslice at the dataslice.
The _v_dual view returns one row and causes the UDX to be evaluated on the host.
In previous releases, it was a common practice for users to create a single-row table or a
multi-row table to control whether a UDX would be evaluated on one or all SPUs. You can
use these views with a UDTF as well, but the behavior of the UDTF is also controlled by the
execution locus options specified with the parallel or parallel-not-allowed syntax. That is, a
parallel-allowed UDTF will usually run on a SPU even if invoked with _v_dual, and a parallel-not-allowed UDTF will run on the host even if invoked with _v_dual_dslice. The
_v_dual_dslice view can be useful for a parallel UDTF so that it has at least one row on
each dataslice.
To avoid the linker errors for identically named (but different) symbols, declare functions as static, and use namespaces to help uniquely identify the symbols.
To avoid the linker errors for code which is reused among several UDXs, you can compile the code for the UDXs and the shared code into one object file using the
nzudxcompile command. You could also move the shared code into a user-defined
shared library and then have the UDXs all depend on that library.
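The first suggestion (static functions plus namespaces) can be illustrated with a short stand-alone sketch. The namespace names udx_trim and udx_upper and both normalize helpers are hypothetical; the point is only that two helpers with the same base name stay distinct symbols with internal linkage, so they should not collide at link time.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

namespace udx_trim {
// Helper private to one UDX: strips leading and trailing blanks and tabs.
static std::string normalize(const std::string& s) {
    const std::string blanks = " \t";
    std::string::size_type first = s.find_first_not_of(blanks);
    if (first == std::string::npos) return std::string();
    std::string::size_type last = s.find_last_not_of(blanks);
    return s.substr(first, last - first + 1);
}
}  // namespace udx_trim

namespace udx_upper {
// Same base name, different UDX and purpose: converts to uppercase.
static std::string normalize(const std::string& s) {
    std::string out = s;
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(out[i])));
    }
    return out;
}
}  // namespace udx_upper
```

Each caller qualifies the helper it wants, for example udx_trim::normalize(text) or udx_upper::normalize(text).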
getCurrentLocus
getCurrentDatasliceId
getCurrentTransaction
getCurrentHardwareId
getCurrentUsername
getCurrentSessionId
getNumberDataslices
getNumberSpus
udxLibraryName
Note: If you build these functions into your UDFs and you later downgrade the Netezza
release to a 4.5 version, these functions will not exist in that version. You will need to
rewrite those C++ files to no longer use these functions, or users will encounter link errors
if they try to run them in the earlier environments.
getCurrentLocus
Detects the locus of execution for a UDF.
Description
Returns A value that indicates whether the UDF is running in Postgres, DBOS, or on a
SPU. The valid values are UDX_LOCUS_POSTGRES (0), UDX_LOCUS_DBOS (1), or
UDX_LOCUS_SPU (2).
getCurrentDatasliceId
Returns the value of the dataslice ID on which the UDX is operating.
Description
Returns A dataslice ID when the UDF is running on a SPU, or 0 if the UDF is running
elsewhere such as the host.
getCurrentTransaction
Returns the current Netezza transaction ID.
Description
Returns
getCurrentHardwareId
Returns the value of the hardware ID on which the UDX is operating.
Description
Returns A hardware ID when the UDF is running on a SPU, or 0 if the UDF is running
elsewhere such as the host.
getCurrentUsername
Returns the name of the user who is running the UDX.
Description
Returns
getCurrentSessionId
Returns the current session ID value.
Description
Returns
getNumberDataslices
Returns the number of dataslices.
Description
Returns The number of dataslices on which the UDF is operating. This number could be
larger than the number of SPUs.
getNumberSpus
Returns the number of SPUs.
Description
Returns
udxLibraryName
Given a library name as used in the DEPENDENCIES clause of the DDL, returns the actual
path on disk of the corresponding shared library. The function returns the appropriate file
pathname depending on the context (host vs SPU).
Description
Returns The function returns the names of libraries on which the given snippet depends
due to its UDXs, including indirect (or nested) dependencies.
The caseSensitive flag allows you to specify a case-sensitive or case-insensitive lookup. In
most cases, a case insensitive lookup works best, but if there are two libraries with the
same name but different cases, you need to use the case sensitive flag to distinctly identify
the libraries.
If the library name is not found, the function returns NULL.
This function is primarily used for libraries that are registered as MANUAL LOAD. After the
pathname has been recovered, the user can use dlopen, dlsym, and dlclose as normal. In
the case of C++ libraries, the user is responsible for providing a mangled name to dlsym.
Additionally, some C++ functionality requires that dlopen be invoked with RTLD_GLOBAL
for run time type information (RTTI).
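The manual-load sequence (dlopen, dlsym, dlclose) can be sketched as follows. This is a minimal POSIX example, not Netezza-specific code: callLengthFunction and LengthFn are hypothetical names, and libraryPath is assumed to be a pathname that has already been obtained (on a Netezza system, presumably from udxLibraryName). Passing a null path opens the running program itself, which makes the sketch easy to try anywhere. The looked-up symbol is assumed to have C linkage, so no mangled name is needed.

```cpp
#include <dlfcn.h>
#include <cstddef>

// Signature of the function the sketch expects to find in the library.
typedef std::size_t (*LengthFn)(const char*);

// Opens the library, looks up `symbol`, calls it on `text`, and closes the
// library again. Returns false if the library or the symbol is not found.
bool callLengthFunction(const char* libraryPath, const char* symbol,
                        const char* text, std::size_t* result) {
    // RTLD_GLOBAL is included because the guide notes that some C++
    // features, such as RTTI, require it.
    void* handle = dlopen(libraryPath, RTLD_NOW | RTLD_GLOBAL);
    if (handle == nullptr) return false;

    void* address = dlsym(handle, symbol);
    bool found = (address != nullptr);
    if (found) {
        LengthFn fn = reinterpret_cast<LengthFn>(address);
        *result = fn(text);
    }
    dlclose(handle);   // the UDX is responsible for closing the library
    return found;
}
```

For a C++ function that is not declared extern "C", the symbol argument would have to be the mangled name, as noted above. On some older Linux toolchains the program may also need to be linked with -ldl.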
_f_fgt
_f_fle
_f_flt
_f_fne
_f_ftod
_f_ftoi
_f_ftoll
_f_ftou
_f_ftoull
_f_itof
_f_lltof
_f_mul
_f_sub
_f_ulltof
_f_utof
_fp_round
_isinf
_isnan
_logb
_logbf
_nextafter
_nextafterf
_scalb
a64l
abs
acos
acosf
asctime
asctime_r
asin
asinf
atan
atan2
atan2f
atanf
atof
atoi
atol
bsearch
calloc
ceil
ceilf
clock
cos
cosf
cosh
coshf
d_fge
div
drand48
ecvt
erand48
erf
erfc
erfcf
erff
exp
expf
fabs
fabsf
fcvt
floor
floorf
fmod
fmodf
free
frexp
frexpf
gcvt
gettimeofday
gmtime
gmtime_r
hcreate
hdestroy
hsearch
hypot
hypotf
isalnum
isalpha
isascii
iscntrl
isdigit
isgraph
islower
isnan
isprint
ispunct
isspace
isupper
isxdigit
j0
j1
jn
jrand48
l64a
labs
lcong48
ldexp
ldexpf
ldiv
lfind
lgamma
lgammaf
localeconv
log
rand_r
log10
log10f
logf
lrand48
lsearch
malloc
mblen
mbstowcs
memchr
memcmp
memcpy
memmove
memset
modf
modff
mrand48
nrand48
pow
powf
printf
qsort
rand
realloc
rint
round
roundf
scalbln
scalblnf
scalbn
scalbnf
seed48
sin
sinf
sinh
sinhf
snprintf
sprintf
sqrt
sqrtf
srand
srand48
sscanf
strcasecmp
strcat
strchr
strcmp
strcoll
strcpy
strcspn
strdup
strerror
strlen
strncasecmp
strncat
strncmp
strncpy
strpbrk
strrchr
strspn
strstr
strtod
strtok
strtok_r
strtol
strtoul
strxfrm
swab
tan
tanf
tanh
tanhf
tdelete
tfind
time
times
toascii
tolower
toupper
trunc
truncf
tsearch
twalk
vprintf
vsnprintf
vsprintf
vsscanf
wcstombs
wctomb
y0
y1
yn
UDFs and UDAs can also allocate memory with the malloc/free functions or new/delete
operators. However, use caution to carefully consider the memory allocations and include
the memory as part of the MAXIMUM MEMORY argument for the function or aggregate.
Any function or aggregate that exceeds its MAXIMUM MEMORY setting could negatively
impact the system performance.
When you define a UDF as RETURNS NULL ON NULL INPUT, if the Netezza system
detects a NULL input value to the UDF, it skips the UDF and automatically returns a
NULL value.
When you define a UDF as DETERMINISTIC, the Netezza system may call the UDF
only once during statement preparation time rather than once for each row it operates
on during the query execution. This will only happen if the UDF takes all literal arguments or no arguments, or if the UDF RETURNS NULL ON NULL INPUT and it is given
at least one literal NULL as an argument.
If your query uses the same UDF more than once, and the UDF takes the same arguments and is DETERMINISTIC, the Netezza query algorithms could apply common
subexpression elimination (CSE) to improve the query performance. With CSE, the
Netezza system calls the function only once for a common result that it can apply to
the other uses of the function within the query.
The Netezza Just In Time (JIT) statistics process can also increase the number of UDX
invocations. JIT statistics runs very fast sample queries on the affected tables to assess
query performance. Thus, the process could invoke the UDXs in the query several times
as it seeks the best plan for the query.
The last two examples are Netezza query performance optimizations, and should not be of
concern. For the first two example situations, you can change the query optimization
behavior if necessary by changing the UDF registration settings.
If you register the UDF as NON DETERMINISTIC, the Netezza system always invokes the
function to obtain a value. (The NON DETERMINISTIC setting may also be the reason why
the log shows that a UDF was invoked more than you expected.)
If you register the UDF as CALLED ON NULL INPUT, the Netezza system invokes the function for one or more NULL input values. Your function must then be designed to handle
input NULL values appropriately.
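One way a function body might guard against NULL input is sketched below in stand-alone C++. The customerNameFlag function is a hypothetical model of the CustomerName sample logic (1 if the string begins with 'Customer A', 0 otherwise); a null pointer stands in for a SQL NULL argument, and treating NULL as "no match" is an assumption of this sketch rather than required behavior.

```cpp
#include <cstring>

// Returns 1 if text begins with "Customer A", otherwise 0.
// A null pointer models a SQL NULL argument.
int customerNameFlag(const char* text) {
    static const char prefix[] = "Customer A";
    if (text == nullptr) {
        return 0;   // assumption: NULL input is treated as "no match"
    }
    // sizeof(prefix) - 1 excludes the terminating '\0' from the comparison.
    return std::strncmp(text, prefix, sizeof(prefix) - 1) == 0 ? 1 : 0;
}
```

A function registered as RETURNS NULL ON NULL INPUT would not need the null check, because the system would skip the call for NULL arguments.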
Carefully consider the performance implications for these changes; if your UDF really is
DETERMINISTIC or it should return NULL on NULL input, there are performance benefits
to the resulting query optimizations. You might want to use different settings for these registration options in your test environment than in the production environment. For more
details about these arguments, refer to CREATE [OR REPLACE] FUNCTION on
page B-17.
For example, if you attempt to drop a UDF named fileupdate that is used in a table named
customers, an error similar to the following is returned:
DEV(USER1)=> DROP FUNCTION fileupdate(int4);
ERROR: Can't delete function FILEUPDATE - table CUSTOMERS (col 1)
depends on it
The error reports the table and the specific column that refer to the function that you wanted to
drop.
Similarly, if you try to drop a UDX that is used in a view, the command returns an error. For
example, if you try to drop a UDA named mysum which is used in a view named TOTAL_VW,
the following error is returned:
DEV(MYUSER)=> DROP AGGREGATE mysum(int4);
ERROR: Can't delete aggregate MYSUM - view TOTAL_VW depends on it
To resolve these error messages and drop the UDX, you must change the default value of
each table column that references the UDF by modifying the default value clause using the
ALTER [ COLUMN ] column { SET DEFAULT value | DROP DEFAULT } clause of the ALTER TABLE command. For
views, use the CREATE OR REPLACE VIEW command to remove the UDX from
the view definition.
If you try to drop a user-defined shared library that is a dependency of any existing UDX,
you must resolve those dependencies before you can drop the library. For example:
DEV(MYUSER)=> DROP LIBRARY mymathlib;
ERROR: Can't delete library mymathlib - function MYFUNC(integer)
depends on it
If you try to drop a database which contains objects that are referenced by objects in other
databases, the DROP DATABASE command displays errors and exits. The error messages
display up to 5 object dependencies, plus the total number of dependencies which must be
resolved. You must resolve all the dependency issues before you can drop the database.
Note: If you have tables or views from a previous Netezza release that reference a UDX,
note that after you upgrade to 4.6, the Netezza system will allow a DROP command on that
UDX. The older tables and views are not added as dependencies until you recreate the view
or modify the default expression using the 4.6 software. If you have an upgraded system,
you should use the nzudxvalidate command after you drop a UDX to check for any tables
and views which might contain unresolved references to the dropped UDX. For more information, see the next section Checking for Unreferenced or Invalid UDXs.
nzudxvalidate command
Checks tables and views for references to dropped UDXs, and validates the existing UDXs
for any problems such as missing or invalid object files. You must be logged in as the nz
user account to run this command, and NZ_USER and NZ_PASSWORD must be set to the
admin account and password.
Syntax
The -h option displays help for the command. The -d option allows you to specify one database to check. By default, the command checks all the databases on the Netezza system. If
your Netezza user account has limited access to the databases on the Netezza system, the
command can check only the databases to which you have access.
Description
The nzudxvalidate command locates any references to dropped UDXs or
invalid UDXs within all of the databases or a specific database. The Netezza system must
be online when you run this command. The command displays a list of any tables and
views that reference dropped UDXs, as well as any UDXs that have issues with their object
files, such as missing object files, invalid object files, or object files that fail a CRC checksum match. A description of these problems and how to resolve them follows the example.
Processing tables
Table DEV.T2.C2 - default value references stale UDF(s): 'ONE()'
Processing views
View DEV.VAS uses stale UDA UDA_SUM (oid 214389)
View DEV.VAS2 uses stale UDF UDF_LENGTH (oid 214444)
View DEV.VAS2 uses stale UDA UDA_SUM (oid 214389)
Processing udfs
UDF CONVERT.STRING_SIZE_VARCHAR(CHARACTER VARYING(64000)) is missing
its EXTERNAL HOST OBJECT file
UDF CONVERT.STRING_SIZE_VARCHAR(CHARACTER VARYING(64000)) is missing
its EXTERNAL SPU OBJECT file
UDF CONVERT.CHARID(CHARACTER(ANY)) has invalid checksum for its
EXTERNAL HOST OBJECT file
UDF CONVERT.ONE() has invalid EXTERNAL HOST OBJECT file
UDF CONVERT.ONE() has invalid EXTERNAL SPU OBJECT file
Processing udas
UDA small.CHARMAX2(CHARACTER(20)) is missing its EXTERNAL HOST OBJECT
file
Invalid object file errors. These errors typically occur because the user specified the
wrong object file pathname in the CREATE OR REPLACE or ALTER command for the
UDX. For example, the user most likely specified the SPU object file pathname for the
host object file argument, or vice versa.
Missing object file errors. These errors (which occur very rarely) indicate that the object
file has somehow been deleted from the /nz/data directory.
Invalid checksum errors. These errors (which occur very rarely) indicate that there has
been a corruption or unexpected change to the object file, and it no longer matches the
one with which the UDX was registered.
To correct any of these errors, use the ALTER [FUNCTION|AGGREGATE|LIBRARY] command or CREATE OR REPLACE [FUNCTION|AGGREGATE|LIBRARY] command to update
the UDX with the correct object files. Make sure that you specify the correct external object
file pathname (either host or SPU) for the object file arguments.
Note: If your UDF or UDA uses a function such as strftime (which formats a local time/date
according to LC_* locale settings), keep in mind the best practices from the previous section, Using C Runtime Library Functions on page 6-10. Though the function returns a
value on the Netezza host, results are inconsistent on the SPUs because the SPUs do not
support the LC* variables. The value returned on the host is based on the LC_TIME value
when the Netezza system was started, which can cause some unexpected time settings for
the UDX code. As a best practice, avoid the use of LC* variables or functions that use
them.
The msg message string is returned to the SQL session. For example:
throwUdxException( "Invalid value" );
Memory Registration
When you register a user-defined function or aggregate, you can use an optional parameter
called MAXIMUM MEMORY to specify the amount of memory, in bytes, that the function or
aggregate is expected to require as it runs. The size value can be an empty value or a value
in the form of a number, or a number with the letters b (bytes), k (kilobytes), m (megabytes), or g (gigabytes). This is not a memory limit threshold; instead, this value is a
performance indicator used during scheduling and planning. The stated memory allocation
helps the Netezza system to schedule the UDX better, which will run the queries that use
the UDX more efficiently.
For each UDX, try to estimate the overall memory consumption and use the MAXIMUM
MEMORY parameter to specify the total memory usage. The test harness (described in
UDX Test Harness on page 7-5) provides information that can help you to assess the
memory consumption of a UDX.
You can use the following function within your UDX to display the memory that the UDF or
UDA was registered with (its MAXIMUM MEMORY value):
getMemory()
API version 2 UDXs have access to the memory registration within the constructor.
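To make the MAXIMUM MEMORY size notation concrete, the following sketch converts a size string into a byte count. The parseMemorySize function is hypothetical and is not part of the UDX API. It assumes that k, m, and g are multiples of 1024 and that an empty value means 0 (unspecified); neither assumption is confirmed by this guide.

```cpp
#include <cctype>
#include <cstdint>
#include <stdexcept>
#include <string>

// Converts a size such as "512", "512b", "64k", "2m", or "1g" to bytes.
// Assumptions: k/m/g are powers of 1024; an empty string means 0.
// No overflow checking is done, to keep the sketch short.
std::uint64_t parseMemorySize(const std::string& text)
{
    if (text.empty()) {
        return 0;
    }
    std::size_t pos = 0;
    std::uint64_t number = 0;
    while (pos < text.size() &&
           std::isdigit(static_cast<unsigned char>(text[pos]))) {
        number = number * 10 + static_cast<std::uint64_t>(text[pos] - '0');
        ++pos;
    }
    if (pos == 0) {
        throw std::invalid_argument("size must start with a digit");
    }
    if (pos == text.size()) {
        return number;                       // plain number: bytes
    }
    if (pos + 1 != text.size()) {
        throw std::invalid_argument("unexpected trailing characters");
    }
    switch (std::tolower(static_cast<unsigned char>(text[pos]))) {
        case 'b': return number;
        case 'k': return number * 1024ULL;
        case 'm': return number * 1024ULL * 1024ULL;
        case 'g': return number * 1024ULL * 1024ULL * 1024ULL;
        default:  throw std::invalid_argument("unknown size suffix");
    }
}
```

Under these assumptions, a registration value of 64k corresponds to 65536 bytes. Because the value is a scheduling hint rather than a limit, a rough but honest estimate is usually more useful than an exact figure.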
Similarly, to compile the multiple sources to create one SPU object file, use commands like
the following:
nzudxcompile helloworld.cpp --spu -o helloworld_temp.o_spu10
nzudxcompile parser.c --spu -o parser.o_spu10
nzudxcompile --spu --objs helloworld_temp.o_spu10
--objs parser.o_spu10 -o helloworld.o_spu10
Conditional Compilation
Within your C++ source files, you can mark code that should run only on the Netezza host
or the SPUs using the FOR_SPU conditional compilation. For example:
#ifdef FOR_SPU
    // Code that should run only on the SPUs
#else
    // Code that should run only on the Netezza host
#endif
Syntax
The nzudxcompile command has the following syntax:
nzudxcompile [OPTIONS]... srcfile
Inputs
The nzudxcompile command takes the following input options. Note that some of the
options are general options, while some apply when compiling UDXs, or when registering
UDFs or UDAs.
Table 6-2: nzudxcompile General Options
Option
Description
--base base
--user username
--pw password
--db database
-h or --help
Table 6-3 describes the nzudxcompile options used for compiling UDX source files.
Table 6-3: nzudxcompile Compile Options
Option
Description
srcfile
--dynamic
Create a shared library for SPUs. You must specify --objs with
this argument.
-g
--host
Creates a compiled object file for the Linux host only. Used
when combining multiple .o files using the --objs option.
--spu
--sputype type
Compiles for the specified SPU type. Valid values are spu7 for
z-series SPUs and spu10 for IBM Netezza 1000 and Netezza
100 model SPUs.
--print-compiler
--print-spu-file
-o outputobjectfile
--args args
--objs inputobjectfile
Specifies input object file(s) that will be linked into one shared
object file. If you specify this option, you must also specify
either --spu, --sputype, or --host.
Table 6-4 describes the options used when registering either a UDF or UDA.
Table 6-4: nzudxcompile General Registration Options
Option
Description
[ --spufile file ]
[ --hostfile file ]
--return return
Specifies the return type for the function or aggregate. You must
specify a valid Netezza data type for the return type.
--class class
--deps libs
--mask args
--mem mem
Specifies an indication of the potential memory use of the function. The mem value can be an empty value or a value in the
form of a number, or a number with the letters b (bytes), k (kilobytes), m (megabytes), or g (gigabytes).
--fenced
Specifies that the UDX should run in fenced mode. This is the
default mode.
--unfenced
--varargs
Table 6-5 describes the registration options that are available for UDX API version 2
objects.
Table 6-5: nzudxcompile API Version 2 Registration Options
Option
Description
--environment val
--version ver
Table 6-6 describes the options used only when registering a UDF, in addition to those
described in Table 6-4.
Table 6-6: nzudxcompile UDF Registration Options
Option
Description
--nondet
For a user-defined function, specifies that the function is non-deterministic. A deterministic function, which is the default,
is a pure function: one that always
returns the same value given the same argument values and
that has no side effects. A non-deterministic function could
return different results depending on where it is called
or on other conditions; therefore, the function is always called even
if it has multiple instances in your query. If Netezza observes
multiple instances of a deterministic function in a query, it
could reduce all the calls to one call (a common subexpression
elimination) to improve the query performance.
--nullcall
Table 6-7 describes the options used only when registering user-defined table functions.
Table 6-7: nzudxcompile Table Function Registration Options
Option
Description
--noparallel
Specifies that the table function will be created with parallel not
allowed. The default is parallel allowed.
--lastcall args
Option
Description
--state state
--type aggtype
Description
The nzudxcompile command has these additional descriptions:
Privileges Required
To run nzudxcompile, you must be logged in to the Netezza system as the nz user account.
If you use the command to also register the user-defined function or aggregate in one step,
you must specify a SQL user such as admin or one who has Create Function | Aggregate and
List privileges to the target database.
Common Tasks
Use the command to compile C++ code files for a user-defined function or aggregate into
object files that can be used in SQL queries on the Netezza system. If you use this command only to compile the object files, you will need to register the functions and
aggregates using the CREATE FUNCTION or CREATE AGGREGATE Netezza SQL commands. You can also create the compiled objects and register them at the same time using
the nzudxcompile command; you must specify the --sig, --return, --class, and --state arguments, which will then cause the Netezza system to call the related CREATE OR REPLACE
function, as applicable.
Usage
The following are examples of nzudxcompile command usage:
To compile a sample C++ file for a function named cube and create the output object
files:
nzudxcompile /home/nz/udx_files/cube.cpp
To compile the cube C++ file and save it as mycube object files:
nzudxcompile /home/nz/udx_files/cube.cpp -o mycube
To compile the cube C++ file and also register it in the mydb database:
nzudxcompile /home/nz/udx_files/cube.cpp --sig "Cube(int4)"
--return INT8 --class Cube --user myuser --pw password --db mydb
To create a shared library called mylib from the mylib.cpp file, run the following two
commands:
nzudxcompile /home/nz/udx_files/mylib.cpp --sputype spu10 -o
mylib.so
nzudxcompile --objs /home/nz/libs/mylib.o_spu10 --dynamic --sputype
spu10 -o mylib.so
The constructor and instantiator for a UDX must be revised to take a UdxInit object.
For example, in API version 1, a UDX could have the following form:
using namespace nz::udx;
class CCustomerName: public Udf
{
public:
    static Udf* instantiate();
};
Udf* CCustomerName::instantiate()
{
    return new CCustomerName;
}
In API version 2, the instantiator and constructor take a UdxInit argument. As a result,
you must declare the constructor, even if it does not take any arguments, as in this
example:
using namespace nz::udx_ver2;
class CCustomerName: public nz::udx_ver2::Udf
{
public:
    CCustomerName(UdxInit *pInit) : Udf(pInit){}
    static nz::udx_ver2::Udf* instantiate(UdxInit *pInit);
};
nz::udx_ver2::Udf* CCustomerName::instantiate(UdxInit *pInit)
{
    return new CCustomerName(pInit);
}
CHAPTER 7
Debugging User-Defined Functions and Aggregates
What's in this chapter
Message Logging
UDX Test Harness
Debugging Using UDX Stubs
This chapter describes how to debug and test user-defined functions, aggregates, and
shared libraries using two debugging aids:
Message logging
UDX test harness
This chapter also describes how to disable UDXs within your nzsql session so that you can
troubleshoot problems such as changes in query performance on the Netezza system.
Message Logging
You can use the logMsg() facility to include operational messages and debugging hints
within your UDXs. The logMsg facility is similar to printf-style logging. You add the messages that you want to track the operation of the UDX. Each message has a flag value
(LOG_DEBUG, LOG_TRACE, or both values ORed together) to help you control the verbosity of the output.
You can control how much detail is output for a specific UDF or UDA using the LOGMASK
attribute when you register the function or aggregate, or when you run the UDX using the
test harness.
logMsg Function
Adds a logging message to your user-defined function or aggregate.
Syntax
The logMsg function has the following syntax:
logMsg(flag, fmt-string, args...)
The flag argument specifies the output level for the message. This allows you to control the
verbosity of the debugging output. If logging is enabled at the specified flag level, all of the
messages with that flag level will be output to standard output as well as the specified log
file. The valid values are LOG_DEBUG, LOG_TRACE, or both values ORed together.
DEBUG is usually a higher-level tracing category which provides messages for actions in
the main body. TRACE is typically used in lower-level areas of the code, such as loops or
other subareas of code. For example, you might put a logMsg statement with LOG_DEBUG
in the main body, and several more detailed statements with a LOG_TRACE level inside a
loop. For messages that you want to display under either output mode, you can specify
LOG_DEBUG|LOG_TRACE as an ORed value.
The fmt-string value specifies the logging message, enclosed in double quotes, and must
end with a newline character (\n). If you want to include substitution values in the message, you can do so and then specify the values using the args value. Note that the fmt-string value can include vsnprintf() formatted conversion specifications such as %i (optionally signed integer), %lld (long long decimal), %llu (long long unsigned int), and the like.
For a description of the available options, refer to the vsnprintf() documentation or man
pages.
The args value specifies zero or more substitution arguments that you want to specify in the
output message. The args values must correspond in type and number to substitution
switches in the fmt-string.
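The way a fmt-string and its args combine can be tried outside the Netezza system with the standard vsnprintf() function. The following sketch is a stand-in for the formatting step only; formatLogMessage is a hypothetical helper, not part of the UDX API, and the 512-byte buffer size is an arbitrary choice for illustration.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the formatting that a printf-style logger does.
// Expands fmt with the variable arguments and returns the resulting text.
std::string formatLogMessage(const char* fmt, ...)
{
    char buffer[512];                 // arbitrary size for this sketch
    va_list args;
    va_start(args, fmt);
    int written = std::vsnprintf(buffer, sizeof(buffer), fmt, args);
    va_end(args);
    if (written < 0) {
        return std::string();         // formatting error
    }
    // vsnprintf reports the untruncated length, so clamp it to the buffer.
    std::size_t length = static_cast<std::size_t>(written);
    if (length >= sizeof(buffer)) {
        length = sizeof(buffer) - 1;
    }
    return std::string(buffer, length);
}
```

For example, formatLogMessage("Found a match of length %d\n", 12) yields the text "Found a match of length 12" followed by a newline. Each argument must match its conversion specification; passing a long long value to %d, for instance, is undefined behavior in C and C++.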
Usage
The logMsg function specifies an output message that you can use to follow the operational
steps of a UDX, which can help you to identify debugging steps and other information
about the function. You enable the logging using the nzudxdbg command.
After you change a UDX to add logMsg() calls, you must recompile and re-register it. To
recompile the function:
nzudxcompile customername.cpp -o customername.o
After you re-register the function, you set the log verbosity for the function by altering the
function using Netezza SQL commands:
ALTER FUNCTION CustomerName(varchar(64000)) LOGMASK DEBUG,TRACE;
The values for LOGMASK can be NONE, DEBUG, TRACE, or both DEBUG,TRACE. The
value NONE disables output; DEBUG outputs any calls to logMsg that contain LOG_DEBUG;
TRACE outputs any messages with LOG_TRACE as the flag; DEBUG,TRACE outputs messages that have either or both DEBUG and TRACE flags.
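The LOGMASK rules above amount to a bitwise test between the flags on a message and the mask set for the UDX. The sketch below models that test. The constant names and values are hypothetical: the values follow the numeric encoding that the test harness control file uses for logmask (1 for TRACE, 2 for DEBUG), and the actual LOG_DEBUG and LOG_TRACE values in the UDX headers may differ.

```cpp
// Hypothetical flag values for illustration only.
const unsigned kLogNone  = 0;
const unsigned kLogTrace = 1;
const unsigned kLogDebug = 2;

// Returns true when a message carrying 'messageFlags' would be output
// for a UDX whose mask is 'logmask'. A message flagged with both values
// is output if either one is enabled in the mask.
bool shouldEmit(unsigned messageFlags, unsigned logmask)
{
    return (messageFlags & logmask) != 0;
}
```

In this model, a message flagged with both values is output under a mask of TRACE alone, while a message flagged only with the debug value is suppressed under that same mask.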
If you use the --file argument of the nzudxdbg command, the output messages will be written to the standard log files. As a best practice, you should log the messages to files to help
with debugging and comparisons of the output messages following the test runs.
After you enable the LOGMASK, if you want the message output from the UDX to appear in
your terminal or shell window, you must stop and restart the database. To stop and restart
the Netezza database without disconnecting any Netezza processes:
nzstop
nzstart -i
You do not have to restart the database after each time you change the function, only after
the first time that you enable message logging in that specific terminal window.
Use the nzudxdbg command to enable message logging:
nzudxdbg --user admin --pw password --on --file
You should see log messages like the following in the sysmgr.log log file and optionally the
shell window in which you ran nzstart -i:
(event002.1001) [d,udx ] Found a match of length 12
(event002.1003) [d,udx ] Found a match of length 10
The (event002.1001) prefix identifies where the function was run; it means that it was in process event002 on the SPU with hardware id 1001 (or 1003 in the second message).
To see the log message output for the function when it runs on the host, run the query:
select x.*, customername(x.b) from customers x, customers y;
16:29:07 (dbos.24072) [d,udx ] Found a match of length 10
16:29:07 (dbos.24072) [d,udx ] Found a match of length 10
16:29:07 (dbos.24072) [d,udx ] Found a match of length 10
16:29:07 (dbos.24072) [d,udx ] Found a match of length 12
16:29:07 (dbos.24072) [d,udx ] Found a match of length 12
The (dbos.24072) value indicates that the message occurred in the dbos process with process ID (pid) 24072.
nzudxdbg Command
Enables or disables message logging for user-defined functions or aggregates. The command displays messages to standard output and optionally to the standard log files. You
must run this command as the nz user.
The command has the following syntax:
nzudxdbg [--all | --id hwid ] [--on | --off] [--file] [--user user] [--pw password] [-h]
Description
--all
--id hwid
--on
--off
--file
on the host
/nz/kit/log/postgres/pg.log for the functions that operate on the
SPUs
--user user
Specifies a SQL user for the command. Use the admin user or a
SQL user who has Manage Hardware privileges. The default is the
value of NZ_USER.
--pw password
Specifies the password for the SQL user account. The default is the
value of NZ_PASSWORD.
-h
This command runs the customername function 100 times with randomly generated data
and displays output similar to the following:
(clientmgr) Info: admin: login successful
Selected only choice
1 - customername(VARCHAR(64000)) RETURNS INT4
Executing /nz/kit/bin/adm/udxharness -f customername_func.harness -k /nz/kit.6.0.B4.14104
starting execution
Elapsed time: 0m0.039s
External references
logvprint(char const*, char*)
vtable for __cxxabiv1::__class_type_info
vtable for __cxxabiv1::__si_class_type_info
operator delete[](void*)
operator delete(void*)
operator new(unsigned int)
__cxa_pure_virtual
__gxx_personality_v0
free
memcmp
sprintf
strcmp
strdup
throwError
Our UDX object used 262144 bytes (may be rounded up to nearest page
4096)
Our UDX return value takes up 4 bytes, with 669 bytes for miscellaneous
Our UDX arguments take up 64012 bytes, with 14 bytes for miscellaneous
Our UDX state values take up 0 bytes, with 8 bytes for miscellaneous
State information may be doubled, since we need two states for merge
If the function runs with no errors, the External references section of the output lists any
external library functions that the function uses. sprintf and throwUdxException() should
always be listed as they are included by the support functions, such as int32Arg(int).
The last section of the output displays estimated memory usage for the UDF.
Although the sample customername function is simple and operating correctly, assume
that the UDF has a problem. The following code for the function has a deliberate error that
would cause a buffer overrun when the function runs:
virtual ReturnValue evaluate()
{
    StringArg *str = stringArg(0);
    int lengths = str->length;
    char *datas = str->data;
    // Deliberate error: writes 4000 bytes backward from the start of the
    // argument structure, overwriting memory that the UDF does not own.
    char* ptr = (char*)str;
    for (int i=0; i < 4000; i++)
    {
        *(ptr-i) = 5;
    }
    int32 retval = 1;
    if (lengths >= 10)
        if (memcmp("Customer A",datas,10) == 0)
        {
            logMsg(LOG_DEBUG, "Found a match of length %d\n",
                   lengths);
            retval = 1;
        }
    NZ_UDX_RETURN_INT32(retval);
}
If you have a new function, or one that you have found to have an error in processing, you
can compile the UDX with a debugging option, as follows:
nzudxcompile customername.cpp -o customername.o -g
Assume that this incorrect function has been registered. The next step is to run it using the
test harness, as follows:
nzudxrunharness --user admin --pw password --db mydb
--dir /nz/data.1.0 --name customername --unfenced
When the test harness detects a buffer overwrite, the output displays the structure order
and where the error occurred. In this case, the error occurred after returnType. To try and
identify the cause of the problem, you could debug it in GDB as follows:
nzudxrunharness --user admin --pw password --db mydb
--dir /nz/data.1.0 --name customername --dbg
This command launches gdb before your UDF is executed. A good place to set a
breakpoint is on the evaluate() method by typing:
(gdb) break CCustomerName::evaluate
For more information about debugging best practices to isolate programming problems,
consult the GDB documentation on the http://www.gnu.org/software/gdb web site.
nzudxrunharness Command
Runs a user-defined function or aggregate within a simulation test environment. The harness displays information on the memory usage of the objects, and it can detect buffer
overwrites when the UDF/UDA is called. The command displays messages to standard output and optionally to the standard log files. You must be logged in as the nz user to run this
command.
The command has the following syntax:
nzudxrunharness [OPTION]...
Table 7-2 describes the options that you can use for any instance of the command.
Table 7-2: nzudxrunharness General Options
Option
Description
--dir datadir
--base base
--user user
--pw password
--db database
-h
Table 7-3 describes the options that you use to specify the input file for the command.
Description
--file testfile
--grp col
Specifies the column number in the test data file that is used to
group by (for aggregates). The test data file must already be
grouped.
--sep separator
--escape escape
Specifies the escape character to use in the test file. The default
is none.
--quoting
--hexinput
--generate
Generates a control file, but does not run the harness. For more
information, see Test Harness Control File on page 7-10.
Table 7-4 describes the other random input options for the command.
Table 7-4: nzudxrunharness Random Input Options
Option
Description
--rows rows
--groups groups
--nulls nulls
Specifies the null arguments. The value is specified as a colon-separated string of field numbers, for example 1:2:3:5.
The default is no nulls.
Option
Description
--hex
--novalidate
--dbg
Table 7-6 describes the options that specify the UDX to test.
Table 7-6: nzudxrunharness UDX Options
Option
Description
--name name
--func
--agg
Operates on an aggregate.
Table 7-7 describes the options that allow you to override defined UDX values.
Table 7-7: nzudxrunharness UDX Override Options
Option
Description
--mask NONE,
DEBUG, TRACE
--over override
--varargs cols
Specifies the argument info for VARARGS UDX. You specify the
value as a colon-separated series of types. For example:
VARCHAR(100):NUMERIC(10,3):INT4
--fenced
--unfenced
--object file
--nodlclose
Specifies that the test harness should not invoke the C library
function dlclose() to close references to UNIX shared libraries
that were made available with dlopen(). The test harness invokes
dlclose() by default.
If you are running the test harness within a debugging tool such
as valgrind or callgrind, specify this option so that the harness
does not invoke dlclose() automatically. This allows you to access
symbol names and other values that can be useful for debugging,
but which may not be available after dlclose() has been called. (For
more information about the valgrind debugging environment, see
http://valgrind.org.)
--final
The test harness runs the specified UDX using either a supplied data file or by creating random data based on the --rows, --groups, and --nulls flags. With the --nulls flag, the specified
columns will be null about 50% of the time. Also, when using random data, strings will be
filled to maximum capacity, which will either be based on the argument signature, or the
overrides specified by --over. Using a supplied data file is the best way to test the correctness of your algorithm.
Using the --mask flags shows the results of logMsg calls. The --print and --hex flags show
the results of evaluate or performFinalResult. The harness also prints out external routines
found in the object file. The --dbg flag invokes the debugger so that the actual object can
be debugged. If you use the debugger, make sure that the host object file for the UDX has
been compiled with debugging symbols. (Typically, you would compile using the optimized
mode instead.)
In the data file, types like interval and timetz with more than one piece of information must
have the fields separated by a colon (:). Nulls can be specified by <NULL>.
classname:CCustomerName
fenced:t
deterministic:t
nullcall:t
memory:0
logmask:0
nulls:
numdependencies:0
undefined:vtable for __cxxabiv1::__class_type_info
undefined:vtable for __cxxabiv1::__si_class_type_info
undefined:operator delete[](void*)
undefined:operator delete(void*)
undefined:operator new(unsigned int)
undefined:__cxa_pure_virtual
undefined:__gxx_personality_v0
undefined:free
undefined:memcmp
undefined:sprintf
undefined:strcmp
undefined:strdup
undefined:throwError
inputdelim:,
inputquote: true
hexinput: false
printoutput: none
numrows: 100
validate: true
You can edit the control file parameters to change the test environment. You can also use
the udxharness binary and specify one or more control files to test a UDX or several UDXs
and their interactions in the same transaction scope. For example:
[nz@nzhost udx]$ udxharness -f customername_func.harness -f penmaxv2_agg.harness -k /nz/kit
This sample command runs both the customername UDF and penmax UDA in the same
test environment to evaluate the impact on the system.
Table 7-8 describes the harness control file parameters.
Table 7-8: Control File Parameters
Parameter
Description
udxtype: type
Specifies the UDX type. The type must be either udf, uda, or
udtf. This parameter must be the first one in the control file.
numarguments:
num
argument: info
of a numeric.
The scale value is -1 or the scale of a numeric.
classname: class
Specifies the C++ class for the UDX. This parameter is required.
datafile: file
numdependencies:
num
dependency: libinfo
Specifies a library dependency for the UDX. You must specify the
libraries in correct order; that is, the libraries that depend on
other libraries must be listed after the libraries that they depend
on. The format of the libinfo value is auto,file,name.
The auto value is t for automatic load and f for manual load.
The file value is the .so library object file.
The name value is the library name.
Spaces before or after the commas are not allowed, unless they
are part of the file or name.
fenced: value
shaper: value
hexinput: value
Specifies whether the data in the input file is in hex format. The
default is false (not in hex format).
You can specify a boolean value such as true, t, on, yes, y, or 1;
or false, f, off, no, n, or 0.
inputdelim: delim
Specifies the delimiter for the input file. The default is comma.
inputescape: escape
Specifies the escape character for the input file. The default is
no escape character.
inputquote: value
Specifies whether string data in the input file will be quoted. The
default is false (not quoted).
You can specify a boolean value such as true, t, on, yes, y, or 1;
or false, f, off, no, n, or 0.
logmask: mask
Specifies the log mask for the UDX. The valid values are 1 for
TRACE, 2 for DEBUG, 3 for both DEBUG and TRACE or 0 for
NONE. The default is 0.
memory: mem
Specifies the maximum memory for the UDX. The size value can
be an empty value or a value in the form of a number and the letters b (bytes), k (kilobytes), m (megabytes), or g (gigabytes). The
default is 0.
nulls: cols
Specifies the columns that will be null randomly when using randomly generated test data (no input data file specified). The cols
value is a comma-separated list of column numbers. The column
numbers start at 1.
numreturns: num
Specifies the number of return columns for the UDX. The value
is 1 for UDFs and UDAs, but it can be 1 or more for UDTFs. It
must be immediately followed by the return info. You must specify exactly num return values. This parameter is required.
returninfo: info
Specifies the return info. The info value has the form
type:typmod:scale:name.
The type value is one of the DataType enums from the Udx-
of a numeric.
The scale value is -1 or the scale of a numeric.
The name value is used only for a UDTF, where it specifies the
numrows: num
objectfile: file
printoutput: type
Specifies how to print the output of the UDX. Possible values are
normal, hex, or none. Normal prints the normally expected output; hex prints strings in their hex representation instead of
string representation; none does not print any output. The
default is none.
validate: value
udxname: name
undefined: symbol
version: ver
Specifies the UDX API version. The ver value can be 1 or 2. The
default is 1.
groups: num
numstate: num
state: info
Specifies one of the state values for the UDA. The format of the
info field is type:typmod:scale.
The type value is one of the DataType enums from the Udx-
of a numeric.
The scale value is -1 or the scale of a numeric.
nullcall: value
UDTF-Specific Parameters
lastcall: value
Specifies whether the UDTF is called after the last input row.
The default is false.
You can specify a boolean value such as true, t, on, yes, y, or 1;
or false, f, off, no, n, or 0.
environment: info
setting.
The value is the value of the environment setting.
close:value
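If you generate or edit control files with your own tooling, the parameter conventions in Table 7-8 can be checked with a few lines of code. The sketch below is not the harness's actual parser; the function names are hypothetical. It assumes that the parameter name ends at the first colon, that spaces around the value are ignored (the sample file shows both fenced:t and inputquote: true), and that boolean values are case-insensitive.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <utility>

// Removes leading and trailing whitespace.
std::string trim(const std::string& s)
{
    const char* ws = " \t\r\n";
    std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) {
        return std::string();
    }
    std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Splits "name: value" at the first colon, so that values which contain
// colons themselves (for example C++ symbol names) stay intact.
std::pair<std::string, std::string> splitControlLine(const std::string& line)
{
    std::size_t colon = line.find(':');
    if (colon == std::string::npos) {
        throw std::invalid_argument("control line has no ':' separator");
    }
    return std::make_pair(trim(line.substr(0, colon)),
                          trim(line.substr(colon + 1)));
}

// Interprets the boolean spellings listed in Table 7-8.
bool parseControlBool(const std::string& value)
{
    std::string v = trim(value);
    for (std::size_t i = 0; i < v.size(); ++i) {
        v[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(v[i])));
    }
    if (v == "true" || v == "t" || v == "on" || v == "yes" || v == "y" || v == "1") {
        return true;
    }
    if (v == "false" || v == "f" || v == "off" || v == "no" || v == "n" || v == "0") {
        return false;
    }
    throw std::invalid_argument("not a boolean value: " + value);
}
```

Splitting at the first colon matters for lines such as undefined:vtable for __cxxabiv1::__class_type_info, where the value contains further colons.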
If you enable the UDX stub processing, the output for the same query appears as follows:
 A | B
---+--------------
 3 | Customer CBA
 1 | Customer A
 4 | Customer ABC
 2 | Customer B
(4 rows)
Essentially, with the UDX stub enabled, the WHERE clause does not restrict the output;
therefore, the command displayed the entire customers table.
To disable the UDX stub processing and enable your user-defined functions and aggregates, set the udx_stub session variable to false (0):
MYDB(MYUSER)=> set udx_stub=0;
APPENDIX
This appendix describes the SPUPad feature, which allows UDX developers to allocate a
named, unique area of memory as a temporary storage area and workpad. A SPUPad typically resides in memory on the Netezza S-Blades, but it can also reside in memory on the
host. Its location depends upon the location of the user tables on which it operates. When
a SPUPad runs on an S-Blade, the system creates one SPUPad for each dataslice managed
by the S-Blade.
A user-defined function or user-defined aggregate can call the SPUPad routines to create a
SPUPad, write data to it, and read data from it. The SPUPad is temporary because it persists only for the lifetime of the transaction or transaction block which called the function
that created it. When the transaction completes, the memory used for the SPUPad is automatically freed.
The SPUPad feature allows UDX developers to allocate and write data directly to the memory of the S-Blade or the Netezza host. Use caution when using this feature. You should be
very familiar with the Netezza architecture and verify your code and memory allocations, as
problems in the code could create out-of-memory situations, S-Blade resets, and other
impacts that would affect the performance and availability of your Netezza system.
Return multiple result columns; currently, UDFs and UDAs cannot return multiple
results unless they are encoded in a string and you have UDXs that return the string as
well as extract values from that string.
Process using a lookup table of values that can speed processing for the UDX. For
example, you might want to create a temporary table of facts or values that might help
the UDFs or UDAs to run more quickly.
Your UDX could be designed to create a SPUPad, write and manipulate the content, and
return data all in one function call. You could also design several UDXs to perform some or
all of those tasks separately within a query transaction block.
The following sections show some examples of the code that you can add to your C++
programs.
Note: To review the entire sample program and its comments, see the section
string_pad_create.cpp on page A-14.
A structure shows an interesting aspect of root objects; for example, the following sample
code shows a root object that implements a simple dictionary:
struct MyValue
{
char* name;
int value;
};
struct MyLookup
{
MyValue *values;
int numallocated;
int numused;
};
In this example, the root object is an instance of MyLookup, which can contain an arbitrary
number of MyValue objects. All of the objects, plus the char* strings, are allocated through
the SPUPad allocation mechanisms, but the only way to get to a value (or the name in a
value) is through the MyLookup root object.
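A lookup against this kind of root object amounts to a linear scan of the values array. The following is a minimal sketch that reuses the two structures above; the findValue helper is hypothetical and is not part of the UDX API. It uses only standard C++, so it does not show how the objects would be allocated in the SPUPad.

```cpp
#include <cstring>

struct MyValue
{
    char* name;
    int value;
};

struct MyLookup
{
    MyValue *values;
    int numallocated;
    int numused;
};

// Hypothetical helper: returns the entry whose name matches, or NULL if the
// name is not among the first numused entries of the dictionary.
const MyValue* findValue(const MyLookup* root, const char* name)
{
    if (!root || !name)
        return 0;
    for (int i = 0; i < root->numused; i++)
    {
        if (std::strcmp(root->values[i].name, name) == 0)
            return &root->values[i];
    }
    return 0;
}
```

Because every access starts at the root, a UDX that retrieves the MyLookup root object could pass it straight to a helper of this kind. Entries beyond numused are treated as allocated but not yet populated.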
If you create multiple C++ files to define UDXs that will manipulate the data in the same
SPUPad, make sure that you repeat your Root structure definition in each C++ file. If you
define a number of common structures or definitions, you could create an include file to
define all these objects in one location.
Although there is no maximum number of objects that a SPUPad can hold, as a best practice try to limit the number of objects that you create to only those that you really need. The
SPUPad keeps track of each object using a pointer per object, which adds to the memory
consumed by the SPUPad.
Create a SPUPad
Within the UDF evaluate method, you create a named pad of type SPUPad for your function. An example follows which creates a SPUPad named stringpad which is a string
storage pad. The UDF called string_pad_create takes an input string and length, then saves
each character in the string in an array.
class StringPadCreate: public Udf
{
private:
public:
static Udf* instantiate();
virtual ReturnValue evaluate()
{
CPad* pad = getPad("stringpad");
Root *ro = (Root*) pad->getRootObject(sizeof(Root)); //Line 4.
if(!ro) // If false, stringpad does not exist; safe to create it.
{
ro=PAD_NEW(pad,Root);
int32 size = int32Arg(0);
StringArg* a = stringArg(1);
int32 stringSize=a->length;
if (size<1 || size > 64000)
{
throwUdxException("Given size is out of range.");
}
if(stringSize > size)
{
throwUdxException("Given string bigger than given size.");
}
ro->size=size;
ro->data=PAD_NEW(pad,char)[size]; // PAD_NEW creates an array to
// hold each character in the
// input string.
for(int i=0; i<size; i++)
{
if(i<stringSize)
{
ro->data[i]=a->data[i];
}
else
{
ro->data[i]=' ';
}
}
pad->setRootObject(ro, sizeof(Root)); //Line 34.
NZ_UDX_RETURN_BOOL(true);
}
else // stringpad already exists; stop processing the create task.
{
NZ_UDX_RETURN_BOOL(false);
}
}
};
Udf* StringPadCreate::instantiate()
{
return new StringPadCreate;
}
On Line 34, the program sets the root object for the SPUPad so that subsequent calls
or functions can reference the SPUPad.
(1 row)
MYDB(MYUSER)=> SELECT string_pad_get(1) FROM one_dslice;
STRING_PAD_GET
---------------
e
(1 row)
MYDB(MYUSER)=> SELECT string_pad_get(2) FROM one_dslice;
STRING_PAD_GET
---------------
t
(1 row)
MYDB(MYUSER)=> COMMIT;
COMMIT
If you run these functions as single SELECT statements, the first function (string_pad_create)
would run, create the SPUPad and the character array, and then exit. When the
function returns, the Netezza system automatically cleans up the SPUPad and frees the
memory. If you then run string_pad_get in a single select, you would see the following error
because the SPUPad no longer exists:
MYDB(MYUSER)=> select string_pad_get(0) from one_dslice;
ERROR: Pad does not exist
SPUPad-Related API
The following functions and macros are used for creating and managing the SPUPads
within your UDX code.
allocate Function
Allocates the specified amount of memory and returns it.
Syntax
The function has the following syntax:
virtual void *allocate(const size_t sz, bool array=false)
Description
allocate() uses NzAllocObject to allocate memory from the heap for the SPUPad. The only
size restriction is the amount of available heap memory. Instead of using the allocate()
function to allocate memory, review the PAD_NEW macro, which can help you to perform
the allocations and also manage constructors for you. Use the allocate() function only if you
are using C-style code and want to replace calls to malloc/calloc instead of calls to new.
Throws
allocate() throws an exception if it cannot allocate the memory.
deallocate Function
Deallocates the memory at the specified pointer, which must have been previously allocated by
allocate().
Syntax
The function has the following syntax:
virtual void deallocate(void *ptr)
Description
deallocate() uses NzFreeObject to deallocate memory for the SPUPad and return it to the
heap. Instead of using the deallocate() function to free memory, review the PAD_DELETE
macro, which can help you to free the memory and also manage destructors for you. Use
the deallocate() function if you are using C-style code and want to replace calls to free
instead of calls to delete or delete[].
Throws
deallocate() throws an exception if the specified object was not allocated by the pad using
allocate().
setRootObject Function
Sets the root object and size for the pad.
Syntax
The function has the following syntax:
virtual void setRootObject(void *ptr, size_t size)
Description
This function can be called only once per SPUPad instance; subsequent calls will throw
an exception. The size is the size of the root object, not the root object plus all its children.
The size argument must correspond to the size of the object as it was allocated using allocate(), and as such is subject to the same restrictions.
Throws
setRootObject() throws an exception if the pad already has a root object. (You cannot reset
the root once it is set.)
getRootObject Function
Gets the root object of a pad, or returns NULL if no root object has been set.
Syntax
The function has the following syntax:
virtual void * getRootObject(size_t size)
Description
This size value must match the size specified in the setRootObject() call to ensure that the
function is retrieving the root for the expected object. getRootObject() will return the root
object if it is set, NULL otherwise.
Throws
getRootObject() throws an exception if the root object is set but the size specified is not
equal to the size that the object was registered with using setRootObject().
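The set-once and size-check rules described for setRootObject() and getRootObject() can be modeled in a few lines. The following is a toy sketch in standard C++ under stated assumptions: the ToyPadRoot class and the exception types are hypothetical stand-ins, not the Netezza CPad class or the exceptions it actually throws.

```cpp
#include <cstddef>
#include <stdexcept>

// Toy model of a pad's root-object bookkeeping; NOT the Netezza CPad class.
class ToyPadRoot
{
public:
    ToyPadRoot() : root_(0), size_(0), isSet_(false) {}

    // Models the documented rule that the root can be set only once.
    void setRootObject(void* ptr, std::size_t size)
    {
        if (isSet_)
            throw std::logic_error("root object already set");
        root_ = ptr;
        size_ = size;
        isSet_ = true;
    }

    // Returns NULL when no root is set; throws when the requested size
    // differs from the size that was registered with setRootObject().
    void* getRootObject(std::size_t size) const
    {
        if (!isSet_)
            return 0;
        if (size != size_)
            throw std::invalid_argument("root object size mismatch");
        return root_;
    }

private:
    void* root_;
    std::size_t size_;
    bool isSet_;
};
```

Passing sizeof(Root) to both calls, as the sample programs do, is what lets the size check catch two UDXs that disagree about the layout of the root structure.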
getTotalSize Function
Returns the total size in bytes of all objects allocated by a SPUPad.
Syntax
The function has the following syntax:
virtual int32 getTotalSize()
Description
getTotalSize() returns the current size in bytes allocated by the SPUPad. The function
always returns a positive number if the pad is not empty, or 0 if the pad is empty. The sizes
that make up the total are subject only to the restrictions of allocate()/deallocate().
getPad Function
Returns a SPUPad object of the specified name.
Syntax
The function has the following syntax:
extern CPad* getPad(const char* strName)
Description
getPad() returns a SPUPad object or creates a new SPUPad object if it does not already
exist. The Netezza system will be responsible for cleaning up and freeing all objects allocated using the pad when the current transaction ends.
PAD_NEW Macro
PAD_NEW() is a macro that allocates the memory for a new object in a SPUPad.
Syntax
The macro has the following syntax:
PAD_NEW(pad, type)
Description
The PAD_NEW macro allocates memory using the allocate() function, and can also be used
to invoke constructors and destructors. PAD_NEW invokes helper templates to ensure that
allocate() and the related constructors are invoked appropriately.
PAD_NEW can be used in array and non-array contexts as follows:
MyObject *pObj = PAD_NEW(pad, MyObject);
char * pStr = PAD_NEW(pad, char)[10];
The array style helps to properly support the calling of constructors when allocating an array
of objects that have a constructor.
PAD_DELETE Macro
PAD_DELETE() is a macro that deallocates or frees the memory used by an object in a SPUPad.
Syntax
The macro has the following syntax:
PAD_DELETE(pad, ptr)
Description
The PAD_DELETE macro deallocates memory that was used for a SPUPad by calling the
deallocate() function and invoking the necessary destructors. PAD_DELETE invokes helper
templates to ensure that deallocate() and the destructors are invoked appropriately.
PAD_DELETE can be used in array and non-array contexts as follows:
PAD_DELETE(pad, pObj);
PAD_DELETE(pad, pStr);
The array style helps to properly support the calling of destructors when freeing an array of
objects that have a destructor.
isUserQuery Function
Verifies that the SPUPad is being called by a user query, not an internal routine such as
Just-in-Time Statistics which is running the function for query optimization planning.
Syntax
The function has the following syntax:
extern "C" bool isUserQuery()
Description
As a best practice, call this function and confirm that it returns true before operating on the
SPUPad. The function returns true when it is called during a user
query, and false when it is called as part of an internal process such as
JIT statistics. See also Best Practices for UDXs with SPUPads on page A-11.
If a UDX modifies the contents of a SPUPad, the UDX should use the isUserQuery()
function to guard against Just In Time (JIT) statistics impacts if the SPUPad contents
could be used across queries or across rows. The JIT statistics process runs sample
tests of user queries to identify the best performance plan for the query. For queries
which modify SPUPad contents, the JIT statistics sampling could cause unintended
modifications of the SPUPad contents. Thus, make sure that SPUPad operations occur
only when isUserQuery() returns true.
If the contents are used only by other UDFs or UDAs within the same row of the current
query (such as for caching a complicated calculation or returning multiple columns),
the UDX does not need to call isUserQuery() to guard the SPUPad.
If a UDF or UDA uses a SPUPad, but it does not modify the contents, you do not need
to guard against JIT statistics impacts.
Any UDF that uses a SPUPad and that guards against JIT statistics should not be used
in a WHERE clause of a query. The JIT statistics evaluation of the query would ignore
the UDF and the query plan might not reflect the actual cost or size of the query.
In all SPUPad cases, the UDXs should be robust enough to error gracefully when the
SPUPad is not populated.
If you have two or more UDXs that could be used in the same query, and one or more of
them uses a SPUPad, use caution to avoid symbol name overlaps with the struct and
class objects that are placed in the SPUPad. Symbol name collisions can cause SPU
resets.
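The first practice above, guarding SPUPad modifications with isUserQuery(), can be sketched as follows. This is a minimal illustration in standard C++ and not a complete UDX: CounterRoot and countRowIfUserQuery are hypothetical names modeled on the padcounter sample, and the result of isUserQuery() is passed in as a parameter so that the sketch compiles without the Netezza headers.

```cpp
// Hypothetical root object, modeled on the padcounter sample.
struct CounterRoot
{
    long long myCount;
};

// Sketch of a guarded update. In a real UDX the flag would be the result of
// isUserQuery(); it is a parameter here only to keep the sketch standalone.
bool countRowIfUserQuery(CounterRoot* root, bool userQuery)
{
    if (!userQuery)
        return false;      // JIT statistics sampling: leave the pad untouched
    root->myCount += 1;    // user query: safe to modify the pad contents
    return true;
}
```

Without the guard, any rows sampled by JIT statistics would also increment the counter, so the final count could exceed the true row count.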
table, and one dataslice will contain two rows. If you run the same BEGIN/COMMIT transaction commands shown in Running the stringpad UDFs on page A-6, the sample output
is as follows:
MYDB(MYUSER)=> BEGIN TRANSACTION;
BEGIN
MYDB(MYUSER)=> SELECT string_pad_create(10, 'netezza') FROM multi_dslice;
STRING_PAD_CREATE
-------------------
t
t
f
t
t
t
t
t
t
(9 rows)
As shown in the sample output, there are eight true (t) rows and one false (f) row, because
the query created eight SPUPads, one for each dataslice where a row of the table resides.
The false response was returned by the dataslice that has two rows of the table, because
the string_pad_create function checks for the existence of a SPUPad before it creates one.
It found the SPUPad from the processing of the first table row on that dataslice, so it did
not create another SPUPad.
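The relationship between row placement and the t/f results can be modeled outside the database. The following sketch is a simulation in standard C++, not UDX code; simulatePadCreate and the dataslice IDs are hypothetical, and the sketch processes rows in the order given, whereas the order of rows in real query output is not guaranteed.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Simulates the result column of string_pad_create for a set of rows.
// rowDataslice[i] is the (hypothetical) ID of the dataslice that holds row i.
std::vector<bool> simulatePadCreate(const std::vector<int>& rowDataslice)
{
    std::set<int> slicesWithPad;   // dataslices that already have a pad
    std::vector<bool> results;
    for (std::size_t i = 0; i < rowDataslice.size(); i++)
    {
        // insert().second is true only the first time a dataslice is seen,
        // which corresponds to the pad being created (t) rather than found (f).
        bool created = slicesWithPad.insert(rowDataslice[i]).second;
        results.push_back(created);
    }
    return results;
}
```

For nine rows spread over eight dataslices, with one dataslice holding two rows, the simulation yields eight true values and one false value, which matches the sample output.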
The next query returns the character at index value 4 of the string netezza which resides
in the SPUPads:
MYDB(MYUSER)=> SELECT string_pad_get(4) FROM multi_dslice;
STRING_PAD_GET
---------------
z
z
z
z
z
z
z
z
z
(9 rows)
MYDB(MYUSER)=> COMMIT;
COMMIT
If you specify an external table as the FROM clause, the Netezza system runs the query on
the host and creates the SPUPad on the host. For example, assume that one_dslice_ext is
an external table version of the one_dslice table. You could run the UDX as follows:
In this example, the function creates the SPUPad on the host in memory, then frees the
SPUPad and its memory when the function completes.
If your SPUPad operations read information from a distributed user table, each SPUPad on
the S-Blades has access only to the data that resides on that S-Blade or that is sent to it by
the UDX. If your analysis algorithm requires that the SPUPads have some uniform data
across all S-Blades, you could use a mechanism in the UDX to write common data to the
SPUPad, or you could create a table that contains the needed rows and also has a datasliceid identification, for example:
CREATE TABLE foo_brdcst AS SELECT d.ds_id-1 AS dsid_, t.* FROM foo t,
_t_dslice d DISTRIBUTE ON (dsid_);
For extremely complex queries, where data is redistributed or sent to the host, you may not
get meaningful or predictable results. To avoid this, be very explicit with the distribution of
tables.
Transaction Restarts
When the Netezza system restarts a transaction due to a state change, S-Blade restart, or
other reasons, the SPUPad UDXs are affected. A SELECT statement can be restarted if it is
not in a multi-statement transaction and no results have been returned yet. A multi-statement transaction that is between statements can be restarted if it has not modified any
data so far.
For a single-select query, there should be no implications other than making sure that any
SPUPads that might have been created are properly freed. In the case of the multi-statement select, if any of the UDXs in the statements use the SPUPad, they should be written
to ensure that they error and exit if the SPUPad does not exist.
Make sure that you register SPUPad UDXs as NOT FENCED; fenced UDXs cannot use
SPUPads.
Make sure that you register a UDF as NOT DETERMINISTIC when the UDF uses a
SPUPad.
Make sure that you add the memory requirements of the SPUPad to the memory needs
of the UDX in the MAXIMUM MEMORY argument. You need to add in the SPUPad
memory for any UDX that creates a SPUPad as well as any UDX that uses a SPUPad
created by another UDX. Starting in Release 6.0, you can specify a wider range of values for the MAXIMUM MEMORY input. The previous limit was 10MB. Use caution with
the SPUPad memory consumption, as very large SPUPads can impact the Netezza system performance as well as result in out-of-memory errors for your UDXs.
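When you size the MAXIMUM MEMORY value, it can help to estimate the SPUPad footprint from the objects that the UDX allocates. The following is a rough sketch for the stringpad sample, in standard C++; estimateStringPadBytes is a hypothetical helper, and the overhead of one tracking pointer per object is an assumption based on the earlier description of how a SPUPad tracks its objects, not a documented figure.

```cpp
#include <cstddef>

// Mirrors the Root structure from the string_pad_create sample.
struct Root
{
    char* data;
    int size;
};

// Rough lower bound, in bytes, for the stringpad SPUPad: the root object, the
// character array, and one tracking pointer for each of the two objects.
// The actual per-object overhead is an assumption, not a documented value.
std::size_t estimateStringPadBytes(std::size_t padChars)
{
    const std::size_t objectCount = 2;   // Root plus the character array
    return sizeof(Root) + padChars * sizeof(char) + objectCount * sizeof(void*);
}
```

Treat the result as a floor rather than an exact figure, and add it to the memory that the UDX itself needs before you choose the MAXIMUM MEMORY value.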
Examples
The following examples show C++ programs that use SPUPad definitions.
string_pad_create.cpp
The string_pad_create.cpp sample program takes an input string size and string and creates a SPUPad to store the string as an array of characters.
/**
* UDF: string_pad_create(int4, varchar(64000)) -> bool
*
* Creates a SPUPad called "stringpad" with size and content
* as specified in the arguments. The data from "stringpad" can
* then be accessed by string_pad_get() and string_pad_size().
*
* argument1: the initial size of the pad. Must be between 1 and 64000.
* argument2: the initial data to put in the pad. Must be between
* 0 and argument1 characters.
*
* returns true if pad created successfully. returns false if
* "stringpad" already exists when called.
*
* throws error if argument1 is null or if argument2 is null or
* the arguments are out of range.
*
* REGISTRATION:
* CREATE OR REPLACE FUNCTION string_pad_create (int4, varchar(64000))
* RETURNS bool
* LANGUAGE CPP
* PARAMETER STYLE NPSGENERIC NOT FENCED
* CALLED ON NULL INPUT
* NOT DETERMINISTIC
* EXTERNAL CLASS NAME 'StringPadCreate'
* EXTERNAL HOST OBJECT '/tmp/test/UDX_StringPadCreate.o_x86'
* EXTERNAL SPU OBJECT '/tmp/test/UDX_StringPadCreate.o_spu10';
*
* -->>Do NOT register any spu-pad related UDFs as 'deterministic'
*
* USAGE
* You need a user table that is defined on at least one SPU, such as:
* CREATE TABLE one_dslice(c1 int4);
* INSERT INTO one_dslice VALUES(1);
* SELECT string_pad_create(10, 'netezza') FROM one_dslice;
* This select creates a SPUPad on the SPU that manages the dataslice
* where one_dslice resides.
*
* To create the pad on multiple SPUs, create a table
* T with X distinct values where X>=NUM_DATASLICES. Then issue
* 'SELECT string_pad_create(...,...) FROM T;'
* expect NUM_SPUS 't' values and X-NUM_SPUS 'f' values in the
* result set.
*
* Copyright (c) 2007-2010 Netezza Corporation, an IBM Company
* All rights reserved.
*
*/
#include "udxinc.h"
#include <string.h>
using namespace nz::udx;
struct Root
{
char* data;
int size;
};
class StringPadCreate: public Udf
{
private:
public:
static Udf* instantiate();
virtual ReturnValue evaluate()
{
if(isArgNull(0)||isArgNull(1))
{
throwUdxException("cannot accept null arguments.");
}
if(argType(0)!=UDX_INT32)
{
throwUdxException("First argument must be int4.");
}
if(argType(1)!=UDX_VARIABLE)
{
throwUdxException("2nd argument must be a varchar.");
}
CPad* pad = getPad("stringpad");
Root *ro = (Root*) pad->getRootObject(sizeof(Root));
if(!ro)
{
ro=PAD_NEW(pad,Root);
int32 size = int32Arg(0);
StringArg* a = stringArg(1);
int32 stringSize=a->length;
if (size<1 || size > 64000)
{
throwUdxException("Given size is out of range.");
}
if(stringSize > size)
{
throwUdxException("Given string bigger than given size.");
}
ro->size=size;
ro->data=PAD_NEW(pad,char)[size];
for(int i=0; i<size; i++)
{
if(i<stringSize)
{
ro->data[i]=a->data[i];
}
else
{
ro->data[i]=' ';
}
}
pad->setRootObject(ro, sizeof(Root));
NZ_UDX_RETURN_BOOL(true);
}
else
{
NZ_UDX_RETURN_BOOL(false);
}
}
};
Udf* StringPadCreate::instantiate()
{
return new StringPadCreate;
}
string_pad_get.cpp
The string_pad_get.cpp sample program takes an input string position value and returns
the character stored in stringpad at that position of a character array. The stringpad must
be created and populated by the string_pad_create function.
/**
* UDF string_pad_get(int4) -> char(1)
* Gets a character at index from the pad "stringpad".
* The pad must be created first by using string_pad_create.
*
* argument1 = index at which to return the character
*
* returns = the character at index
*
* throws error if the spu pad "stringpad" is not found or if index
* is out of range.
*
* COMPILATION:
* nzudxcompile UDX_StringPadGet.cpp -o /tmp/test/UDX_StringPadGet.o
*
* REGISTRATION:
* CREATE OR REPLACE FUNCTION string_pad_get (int4)
* RETURNS char(1)
* LANGUAGE CPP
* PARAMETER STYLE NPSGENERIC NOT FENCED
* CALLED ON NULL INPUT
* NOT DETERMINISTIC
* EXTERNAL CLASS NAME 'StringPadGet'
* EXTERNAL HOST OBJECT '/tmp/test/UDX_StringPadGet.o_x86'
* EXTERNAL SPU OBJECT '/tmp/test/UDX_StringPadGet.o_spu10';
*
* -->Do not register any spu-pad related UDFs as 'deterministic'
*
* USAGE:
* CREATE TABLE one_dslice(c1 int4);
* INSERT INTO one_dslice VALUES(1);
* SELECT string_pad_create(10, 'netezza') FROM one_dslice;
* SELECT string_pad_get(1) FROM one_dslice;
*
* Copyright (c) 2007-2010 Netezza Corporation, an IBM Company
* All rights reserved.
*/
#include "udxinc.h"
#include <string.h>
using namespace nz::udx;
struct Root
{
char* data;
int size;
};
class StringPadGet: public Udf
{
private:
public:
static Udf* instantiate();
virtual ReturnValue evaluate()
{
if(isArgNull(0))
{
throwUdxException("must not accept null arguments.");
}
if(argType(0)!=UDX_INT32)
{
throwUdxException("1st argument must be int4 (int32).");
}
CPad* pad = getPad("stringpad");
Root *ro = (Root*) pad->getRootObject(sizeof(Root));
if(!ro)
{
throwUdxException("Pad does not exist");
}
else
{
int index = int32Arg(0);
if(index<0||index>=ro->size)
{
throwUdxException("Index out of bounds");
}
StringReturn *ret = stringReturnInfo();
ret->size=1;
ret->data[0]=ro->data[index];
NZ_UDX_RETURN_STRING(ret);
}
}
};
Udf* StringPadGet::instantiate()
{
return new StringPadGet;
}
string_pad_size.cpp
The string_pad_size.cpp program defines a UDF that returns the size of the sample stringpad created by the string_pad_create function.
/**
* UDF string_pad_size() -> int4
*
* Returns the size of the pad. -1 if no pad is found.
*
* COMPILATION:
* nzudxcompile UDX_StringPadSize.cpp -o UDX_StringPadSize.o
*
* REGISTRATION:
* CREATE OR REPLACE FUNCTION string_pad_size()
* RETURNS int4 LANGUAGE CPP PARAMETER STYLE NPSGENERIC NOT FENCED
padcounter.cpp
The padcounter sample program contains two UDFs, padcounter() and getpadcount(),
which use a SPUPad to obtain a row count of a table.
#include "udxinc.h"
/*
* These functions obtain a row count of a table using a simple SPUPad.
*
* To compile and register the functions:
*
* nzudxcompile padcounter.cpp --sig "padcounter()" --ret int4
*     --class PadCounter
* then alter function padcounter() to make it NOT DETERMINISTIC
*
* nzudxcompile padcounter.cpp --sig "getpadcount()" --ret int8
*     --class GetPadCount
* then alter function getpadcount() to make it NOT DETERMINISTIC
*
* You need a table with 1 row per SPU; for example: one_per;
*
* This will return the row count in table <TBL>:
* select sum(getpadcount()) from one_per where exists
*     (select count(padcounter()) from <TBL>);
*/
using namespace nz::udx;
struct Root
{
int64 myCount;
};
class PadCounter: public Udf
{
private:
public:
static Udf* instantiate();
virtual ReturnValue evaluate()
{
CPad* pad = getPad("PadCount");
Root *ro = (Root*) pad->getRootObject(sizeof(Root));
setReturnNull(false);
if(!ro)
{
ro=PAD_NEW(pad,Root);
ro->myCount = 1;
pad->setRootObject(ro, sizeof(Root));
NZ_UDX_RETURN_INT32(1);
}
else
{
ro->myCount += 1;
NZ_UDX_RETURN_INT32(1);
}
}
};
Udf* PadCounter::instantiate()
{
return new PadCounter;
}
class GetPadCount: public Udf
{
private:
public:
static Udf* instantiate();
virtual ReturnValue evaluate()
{
CPad* pad = getPad("PadCount");
Root *ro = (Root*) pad->getRootObject(sizeof(Root));
setReturnNull(false);
if(!ro)
{
NZ_UDX_RETURN_INT64(0);
}
else
{
NZ_UDX_RETURN_INT64(ro->myCount);
}
}
};
Udf* GetPadCount::instantiate()
{
return new GetPadCount;
}
APPENDIX
Description
More Information
ALTER AGGREGATE
Changes a UDA.
ALTER FUNCTION
Changes a UDF.
ALTER LIBRARY
CREATE LIBRARY
DROP AGGREGATE
DROP FUNCTION
DROP LIBRARY
SHOW AGGREGATE
Displays information
about aggregates (built-ins as well as UDAs).
SHOW FUNCTION
Displays information
about functions (built-ins
as well as UDFs).
SHOW LIBRARY
Displays information
about shared libraries.
If you issue one of the alter, create, or drop commands for a UDX or shared library that is
currently in use by an active query, the Netezza system waits for that query's transaction to
complete before it executes the command.
This guide also discusses other Netezza SQL commands such as GRANT, REVOKE, and
COMMENT. For a description of these commands, see the IBM Netezza Database User's
Guide.
ALTER AGGREGATE
Use the ALTER AGGREGATE command to change the aggregate object files, state, return
value, memory usage options, or logging level. The aggregate must be defined in the current database. You can also use this command to change the owner of the UDA.
You cannot change the aggregate name or argument type list using this command. To
change an aggregate's name or argument type list, you must drop the aggregate and
create an aggregate with the new name or argument type list.
Synopsis
Syntax:
ALTER AGGREGATE aggregate_name(argument_types)
[RETURNS return_type] [STATE (state_types)]
[FENCED | NOT FENCED] [MAXIMUM MEMORY mem]
[LOGMASK mask] [TYPE ANY | ANALYTIC | GROUPED]
[NO DEPENDENCIES| DEPENDENCIES deplibs]
[API VERSION [ 1 | 2 ] ]
[NO ENVIRONMENT | ENVIRONMENT 'name' = 'value' , 'name2' = 'value2' ]
[EXTERNAL CLASS NAME 'class_name']
[EXTERNAL HOST OBJECT 'host_object_filename']
[EXTERNAL SPU OBJECT 'SPU_object_filename']
ALTER AGGREGATE aggregate_name(argument_types) OWNER TO name
Inputs
The ALTER AGGREGATE command takes the following inputs:
Table B-2: ALTER AGGREGATE Input
Input
Description
aggregate_name
The name of the aggregate that you want to change. You cannot
change the aggregate name using this command.
argument_types
RETURNS return_
type
Specifies the aggregate's return value as one fully specified argument and type. All Netezza data types are supported. Strings must
include a size and NUMERIC types must include precision and
scale.
STATE state_types
FENCED
NOT FENCED
MAXIMUM MEMORY
Specifies an indication of the potential memory use of the aggregate. The size value can be an empty value or a value in the form of
a number and the letters b (bytes), k (kilobytes), m (megabytes), or
g (gigabytes). For example, valid values could be '0', '1k', '100k',
'1g', or '10m'. The default is 0.
LOGMASK mask
Specifies the logging control level for the aggregate. Valid values
are NONE, DEBUG, and TRACE, or a comma-separated combination of DEBUG and TRACE.
TYPE
DEPENDENCIES
deplibs
Specifies an optional list of user-defined shared library dependencies for the UDX. You can specify one or a comma-separated list of
library names.
NO DEPENDENCIES
Specifies that there are no dependencies for the UDX, which is the
default if DEPENDENCIES deplibs is omitted. You can use this
option to clear any previous dependencies declared for the UDX.
API VERSION [1 | 2]
Specifies the version of the UDX interface used by the aggregate.
The API VERSION must match the compiled version of the object
files for the host and SPU. The default is 1. If you include version
2 compiled objects, you must specify API VERSION 2.
ENVIRONMENT
NO ENVIRONMENT
EXTERNAL CLASS
NAME 'class_name'
Specifies the name of the C++ class that implements the aggregate. The class must derive from the Uda base class and must
implement a static method that instantiates an instance of the
class.
EXTERNAL HOST
OBJECT 'host_
object_filename'
EXTERNAL SPU
OBJECT 'SPU_
object_filename'
Specifies the pathname for the Linux SPUs compiled object file.
Specify the spu10 compiled object for Rev10 SPUs on IBM
Netezza 1000 and Netezza 100 models.
Outputs
The ALTER AGGREGATE command has the following outputs:
Table B-3: ALTER AGGREGATE Output
Output
Description
ALTER AGGREGATE
This error indicates that a UDX exists with the name but
has different sizes specified for string or numeric arguments. To alter the aggregate, make sure that you
specify the exact argument type list with correct sizes.
ERROR: lookupLibrary: library libname does not exist
The message that the system returns if it cannot find the user-defined shared library specified as a dependency.
Description
You cannot alter a user-defined aggregate that is currently in use in an active query. After
the active query's transaction completes, the Netezza system will process the ALTER
AGGREGATE command to update the aggregate.
Privileges Required
To alter a UDA, you must meet one of the following criteria:
You must have the Alter privilege on the specific UDA object.
To alter an aggregate to be unfenced, you must have the Unfence admin privilege.
Note: When you issue an ALTER AGGREGATE command and specify new object files, the
database processes the HOST OBJECT and the SPU OBJECT files as the user nz. The user
nz must have read access to the object files and read and execute access to every directory
in the path from the root to the object file.
Common Tasks
You can use the ALTER AGGREGATE command to change the owner of an aggregate. Make
sure that you specify the full signature of the aggregate (name and argument type list) as
follows:
ALTER AGGREGATE aggregate_name(argument_types) OWNER TO name
Related Commands
See CREATE [OR REPLACE] AGGREGATE on page B-13 to create aggregates.
Usage
The following provides sample usage.
ALTER FUNCTION
Use the ALTER FUNCTION command to change the function object files, return value,
memory usage options, or logging level. You can also use this command to change the
owner of the UDF.
You cannot change the function name or argument type list using this command. To change
a function's name or argument type list, you must drop the function and then create a
function with the new name or argument type list.
Synopsis
The ALTER FUNCTION command has the following syntax:
ALTER FUNCTION function_name(argument_types)
[RETURNS return_type] [FENCED | NOT FENCED]
[DETERMINISTIC | NOT DETERMINISTIC]
[RETURNS NULL ON NULL INPUT | CALLED ON NULL INPUT]
[MAXIMUM MEMORY mem]
[LOGMASK mask] [NO DEPENDENCIES| DEPENDENCIES deplibs]
[API VERSION [ 1 | 2 ] ]
[NO ENVIRONMENT | ENVIRONMENT 'name' = 'value' , 'name2' = 'value2']
[TABLE, TABLE FINAL ALLOWED | TABLE ALLOWED | TABLE FINAL ALLOWED]
[PARALLEL ALLOWED | PARALLEL NOT ALLOWED]
[EXTERNAL CLASS NAME 'class_name']
[EXTERNAL HOST OBJECT 'host_object_filename']
[EXTERNAL SPU OBJECT 'SPU_object_filename']
ALTER FUNCTION function_name(argument_types) OWNER TO name
Inputs
The ALTER FUNCTION command takes the following inputs:
Table B-4: ALTER FUNCTION Input
Input
Description
function_name
Specifies the name of the function that you want to change. You
cannot change the name of the function.
argument_types
RETURNS return_
type
Specifies the function's return value as one fully specified argument and type. All Netezza data types are supported. Strings must
include either a size or ANY for generic sizes. NUMERIC types
must include precision and scale or ANY for generic sizes.
FENCED
NOT FENCED
[DETERMINISTIC |
NOT
DETERMINISTIC]
[RETURNS NULL
ON NULL INPUT |
CALLED ON NULL
INPUT]
MAXIMUM MEMORY Specifies an indication of the potential memory use of the function. The size value can be an empty value or a value in the form of
a number and the letters b (bytes), k (kilobytes), m (megabytes), or
g (gigabytes). For example, valid values could be '0', '1k', '100k',
'1g', or '10m'. The default is 0.
LOGMASK mask
Specifies the logging control level for the function. Valid values are
NONE, DEBUG, and TRACE, or a comma-separated combination of
DEBUG and TRACE.
DEPENDENCIES deplibs
Specifies an optional list of user-defined shared library dependencies for the UDX. You can specify one or a comma-separated list of
library names.
NO DEPENDENCIES
Specifies that there are no dependencies for the UDX, which is the
default if DEPENDENCIES deplibs is omitted. You can use this
option to clear any previous dependencies declared for the UDX.
API VERSION [1 | 2] Specifies the version of the UDX interface used by the function.
The API VERSION must match the compiled version of the object
files for the host and SPU. The default is 1. If you include version
2 compiled objects, you must specify API VERSION 2.
ENVIRONMENT
NO ENVIRONMENT
TABLE, TABLE
FINAL ALLOWED
Specifies the options that control how the user-defined table function can be invoked.
The TABLE, TABLE FINAL ALLOWED option specifies that you
TABLE ALLOWED
TABLE FINAL
ALLOWED
PARALLEL
ALLOWED
PARALLEL NOT
ALLOWED
EXTERNAL CLASS
NAME 'class_name'
Specifies the name of the C++ class that implements the function.
The class must derive from the Udf base class and must implement
a static method that instantiates an instance of the class.
EXTERNAL HOST OBJECT 'host_object_filename'
EXTERNAL SPU OBJECT 'SPU_object_filename'
Specifies the pathname for the Linux SPUs compiled object file.
Specify the spu10 compiled object for Rev10 SPUs on IBM
Netezza 1000 and Netezza 100 models.
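The MAXIMUM MEMORY format described in the table above can be checked before you issue a statement. The following is a minimal sketch, not part of the Netezza software: the function name parse_max_memory is hypothetical, and the 1024-based multipliers and the handling of an empty value or a bare number are assumptions, because this guide names the units without giving their exact byte counts.

```python
import re

# Assumed multipliers: the guide names the units but not their byte counts.
_UNITS = {"b": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_max_memory(value):
    """Return the number of bytes described by a MAXIMUM MEMORY value.

    An empty value is treated as the default of 0, and a bare number is
    treated as a byte count (both are assumptions of this sketch).
    """
    text = value.strip().lower()
    if text == "":
        return 0
    match = re.fullmatch(r"(\d+)([bkmg]?)", text)
    if match is None:
        raise ValueError(f"not a valid MAXIMUM MEMORY value: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit or "b"]
```

For example, under these assumptions parse_max_memory('100k') gives 102400, and a value such as '10x' is rejected with a ValueError.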
Outputs
The ALTER FUNCTION command has the following output:
Table B-5: ALTER FUNCTION Output
Output
Description
ALTER FUNCTION
This error indicates that a UDX exists with the name but
has different sizes specified for string or numeric arguments. To alter the function, make sure that you specify
the exact argument type list with correct sizes.
Description
You cannot alter a user-defined function that is currently in use in an active query. After the
active query's transaction completes, the Netezza system will process the ALTER FUNCTION command to update the function.
Privileges Required
To alter a UDF, you must meet one of the following criteria:
You must have the Alter privilege on the specific UDF object.
To alter a function to be unfenced, you must have the Unfence admin privilege.
Note: When you issue an ALTER FUNCTION command and specify new object files, the
database processes the HOST OBJECT and the SPU OBJECT files as the user nz. The user
nz must have read access to the object files and read and execute access to every directory
in the path from the root to the object file.
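The access requirement in the note above can be checked ahead of time. The following is a minimal sketch with hypothetical helper names (directories_to_root and access_problems); it tests the permissions of whichever user runs it, so it would need to be run as the user nz to reflect what the database sees.

```python
import os


def directories_to_root(object_file):
    """List every directory from the root down to the one holding object_file."""
    current = os.path.dirname(os.path.abspath(object_file))
    dirs = []
    while True:
        dirs.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return list(reversed(dirs))


def access_problems(object_file):
    """Report missing permissions for the user that runs this script."""
    problems = []
    for directory in directories_to_root(object_file):
        # Every directory in the path needs read and execute access.
        if not os.access(directory, os.R_OK | os.X_OK):
            problems.append(f"{directory}: need read and execute access")
    # The object file itself needs read access.
    if not os.access(object_file, os.R_OK):
        problems.append(f"{object_file}: need read access")
    return problems
```

An empty list from access_problems means that the user running the script can read the file and traverse every directory above it.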
Common Tasks
You can use the ALTER FUNCTION command to change the owner of a function. Make sure
that you specify the full signature of the function (name and argument type list) as follows:
ALTER FUNCTION function_name(argument_types) OWNER TO name
Related Commands
See CREATE [OR REPLACE] FUNCTION on page B-17 to create functions.
See DROP FUNCTION on page B-27 to drop functions.
Usage
The following provides sample usage.
ALTER LIBRARY
Use the ALTER LIBRARY command to change a user-defined shared library. You can use
this command to change properties such as the loading method, dependencies, owner, and
object files.
You cannot change the library name using this command. To change a library's name, you
must drop the library and create a library with the new name.
Synopsis
The ALTER LIBRARY command has the following syntax:
ALTER LIBRARY library_name
[ AUTOMATIC LOAD | MANUAL LOAD ]
[NO DEPENDENCIES| DEPENDENCIES deplibs]
[ EXTERNAL HOST OBJECT 'host_object_filename' ]
[ EXTERNAL SPU OBJECT 'SPU_object_filename' ]
ALTER LIBRARY library_name OWNER TO name
Inputs
The ALTER LIBRARY command takes the following inputs:
Table B-6: ALTER LIBRARY Input
Input
Description
library_name
The name of the library that you want to change. You must be connected to the database where the library is defined. You cannot
change the name using this command.
[AUTOMATIC LOAD | MANUAL LOAD]
Automatic load specifies that the Netezza system will automatically open the library before any objects that depend upon it are
used. Manual load specifies that the UDX is responsible for opening and closing manual load libraries when they are needed.
DEPENDENCIES
deplibs
Specifies an optional list of user-defined shared library dependencies for the UDX. You can specify one or a comma-separated list of
library names.
NO DEPENDENCIES Specifies that there are no dependencies for the UDX, which is the
default if DEPENDENCIES deplibs is omitted. You can use this
option to clear any previous dependencies declared for the UDX.
EXTERNAL HOST OBJECT 'host_object_filename'
EXTERNAL SPU OBJECT 'SPU_object_filename'
Outputs
The ALTER LIBRARY command has the following output:
Table B-7: ALTER LIBRARY Output
Output
Description
ALTER LIBRARY
Description
You cannot alter a user-defined library that is currently in use in an active query. After the
active query's transaction completes, the Netezza system will process the ALTER LIBRARY
command to update the library.
Privileges Required
To alter a library, you must meet one of the following criteria:
You must have the Alter privilege on the specific library object.
Note: When you issue an ALTER LIBRARY command and specify new object files, the database processes the HOST OBJECT and the SPU OBJECT files as the user nz. The user nz
must have read access to the object files and read and execute access to every directory in
the path from the root to the object file.
Common Tasks
You can use the ALTER LIBRARY command to change the owner of a library. For example:
ALTER LIBRARY library_name OWNER TO name
Related Commands
See CREATE [OR REPLACE] LIBRARY on page B-23 to create or replace shared
libraries.
See DROP LIBRARY on page B-28 to drop shared libraries.
See SHOW LIBRARY on page B-33 to display information about shared libraries.
Usage
The following provides sample usage.
To alter a sample library named mylib to set the load option to MANUAL LOAD, enter:
MYDB(MYUSER)=> ALTER LIBRARY mylib MANUAL LOAD;
CREATE [OR REPLACE] AGGREGATE
Synopsis
Syntax for creating a new user-defined aggregate:
CREATE [OR REPLACE] AGGREGATE aggregate_name(argument_types)
RETURNS return_type STATE (state_types)
LANGUAGE CPP PARAMETER STYLE NPSGENERIC [FENCED | NOT FENCED]
[MAXIMUM MEMORY mem ] [LOGMASK mask]
[NO DEPENDENCIES| DEPENDENCIES deplibs]
[ TYPE ANY | ANALYTIC | GROUPED] [API VERSION [1 | 2]]
[ENVIRONMENT 'name'='value', 'name'='value']
[EXTERNAL CLASS NAME 'class_name']
[EXTERNAL HOST OBJECT 'host_object_filename']
[EXTERNAL SPU OBJECT 'SPU_object_filename']
Inputs
The CREATE [OR REPLACE] AGGREGATE command takes the following inputs:
Table B-8: CREATE [OR REPLACE] AGGREGATE Input
Input
Description
aggregate_name
Specifies the name of the aggregate that you want to create. This is
the SQL identifier that will be used to invoke the aggregate in a
SQL expression.
If the aggregate already exists, you cannot change the name using
the CREATE OR REPLACE command.
argument_types
RETURNS return_type
Specifies the aggregate's return value as one fully specified argument and type. All Netezza data types are supported. Strings must
include a size and NUMERIC types must include precision and
scale.
STATE state_types
LANGUAGE
PARAMETER STYLE Specifies the parameter style for the aggregate. The default and
only valid value is NPSGENERIC.
FENCED
NOT FENCED
MAXIMUM MEMORY Specifies an indication of the potential memory use of the aggregate. The size value can be an empty value or a value in the form of
a number and the letters b (bytes), k (kilobytes), m (megabytes), or
g (gigabytes). For example, valid values could be '0', '1k', '100k',
'1g', or '10m'. The default is 0.
LOGMASK mask
Specifies the logging control level for the aggregate. Valid values
are NONE, DEBUG, and TRACE, or a comma-separated combination of DEBUG and TRACE.
DEPENDENCIES
deplibs
Specifies an optional list of user-defined shared library dependencies for the UDX. You can specify one or a comma-separated list of
library names.
NO DEPENDENCIES Specifies that there are no dependencies for the UDX, which is the
default if DEPENDENCIES deplibs is omitted. You can use this
option to clear any previous dependencies declared for the UDX.
TYPE
API VERSION [1 | 2] Specifies the version of the UDX interface used by the aggregate.
The API VERSION must match the compiled version of the object
files for the host and SPU. The default is 1. If you include version
2 compiled objects, you must specify API VERSION 2.
ENVIRONMENT
EXTERNAL CLASS
NAME 'class_name'
Specifies the name of the C++ class that implements the aggregate. The class must derive from the Uda base class and must
implement a static method that instantiates an instance of the
class.
EXTERNAL HOST OBJECT 'host_object_filename'
EXTERNAL SPU OBJECT 'SPU_object_filename'
Specifies the pathname for the Linux SPUs compiled object file.
Specify the spu10 compiled object for Rev10 SPUs on IBM
Netezza 1000 and Netezza 100 models.
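The LOGMASK values listed in the table above can be validated before you build a statement. The following is a minimal sketch with a hypothetical function name (normalize_logmask); accepting lower-case input and rejecting repeated values are assumptions of the sketch, not documented behavior.

```python
def normalize_logmask(mask):
    """Validate a LOGMASK value and return it in upper case.

    Accepts NONE, DEBUG, TRACE, or a comma-separated combination of
    DEBUG and TRACE, as listed in the input table.
    """
    parts = [part.strip().upper() for part in mask.split(",")]
    if parts == ["NONE"]:
        return "NONE"
    allowed = ("DEBUG", "TRACE")
    no_repeats = len(set(parts)) == len(parts)
    if no_repeats and all(part in allowed for part in parts):
        return ",".join(parts)
    raise ValueError(f"not a valid LOGMASK value: {mask!r}")
```

With this sketch, 'debug, trace' is returned as 'DEBUG,TRACE', while a mix such as 'NONE,DEBUG' raises a ValueError.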
Outputs
The CREATE [OR REPLACE] AGGREGATE command has the following output:
Table B-9: CREATE [OR REPLACE] AGGREGATE Output
Output
Description
CREATE AGGREGATE
ERROR: User 'username' is not allowed to create/drop aggregates.
The system returns this message if your user account does not have Create Aggregate permission.
ERROR: Synonym 'name'
already exists
ERROR: AggregateCreate:
aggregate name already exists
with the same arguments
This error is returned when you issue a CREATE AGGREGATE command and an aggregate with the same name
and argument type list already exists in the database.
Use CREATE OR REPLACE AGGREGATE instead.
NOTICE: AggregateCreate:
existing UDX name(argument_
types) differs in size of string/
numeric arguments
Description
When you create an aggregate, note that the aggregate's signature (that is, its name and
argument type list) must be unique within its database. No other UDX can have the same
name and argument type list in the same database.
You cannot change the aggregate name or the argument type list using the CREATE OR
REPLACE command. You can change some aspects of the argument types; for example,
you can change the size of a string or the precision and scale of a numeric value. To change
an aggregate's name and/or argument type list, you must drop the aggregate and then create an aggregate with the new name and/or argument type list.
You cannot replace a user-defined aggregate that is currently in use in an active query.
After the active query's transaction completes, the Netezza system will process the CREATE
OR REPLACE AGGREGATE command to update the aggregate.
Privileges Required
You must have Create Aggregate permission to use the CREATE AGGREGATE command.
Also, if you use CREATE OR REPLACE AGGREGATE to change a UDA, you must have Create Aggregate permission and Alter permission for the UDA to change it. To create an
unfenced aggregate, you must have the Unfence admin privilege.
Note: When you issue a CREATE AGGREGATE command, the database processes the HOST
OBJECT and the SPU OBJECT files as the user nz. The user nz must have read access to
the object files and read and execute access to every directory in the path from the root to
the object file.
Common Tasks
Use the CREATE AGGREGATE command to create and become the owner of a new user-defined aggregate. You must create the aggregate's C++ files and compile them using
nzudxcompile before you can use this command to register the aggregate with the Netezza
system.
Netezza has some special processing to deal with string fields used in aggregates. If your
aggregate returns a string type that is larger than 512 bytes, there must be a string type in
the state that is larger than 255 bytes, or multiple ones which have combined lengths that
are greater than 255. Otherwise, the command returns an error similar to the following:
ERROR: Records trailing string space set to 512 is too small: Bump
it up using the environment variable NZ_SPRINGFIELD_SIZE
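The string-size rule described above can be expressed as a simple check. The following is a minimal sketch of one reading of that rule, with a hypothetical function name and parameters; it does not reproduce the actual Netezza validation code.

```python
def state_supports_return_string(return_bytes, state_string_bytes):
    """Check the string-size rule for an aggregate definition.

    return_bytes is the declared size of the string return type, and
    state_string_bytes lists the declared sizes of the string types in
    the STATE clause. Both names are hypothetical.
    """
    if return_bytes <= 512:
        # The rule applies only to return strings larger than 512 bytes.
        return True
    # One state string over 255 bytes, or several whose combined length
    # exceeds 255, both mean that the total is greater than 255.
    return sum(state_string_bytes) > 255
```

For example, a 1000-byte return string with two 100-byte state strings fails the check, while the same return string with state strings of 200 and 100 bytes passes it.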
Related Commands
See ALTER AGGREGATE on page B-2 to alter an aggregate.
See DROP AGGREGATE on page B-25 to remove a user-defined aggregate.
See SHOW AGGREGATE on page B-30 to display information about aggregates.
Usage
The following provides sample usage.
CREATE [OR REPLACE] FUNCTION
Synopsis
Syntax for creating a new user-defined function:
CREATE [OR REPLACE] FUNCTION function_name(argument_types)
RETURNS return_type LANGUAGE CPP PARAMETER STYLE NPSGENERIC
[FENCED | NOT FENCED] [DETERMINISTIC | NOT DETERMINISTIC]
[RETURNS NULL ON NULL INPUT | CALLED ON NULL INPUT]
[MAXIMUM MEMORY mem] [LOGMASK mask]
[NO DEPENDENCIES| DEPENDENCIES deplibs]
[API VERSION [ 1 | 2 ]]
[ENVIRONMENT 'name' = 'value', 'name2' = 'value2']
[TABLE, TABLE FINAL ALLOWED | TABLE ALLOWED | TABLE FINAL ALLOWED]
[PARALLEL ALLOWED | PARALLEL NOT ALLOWED]
[EXTERNAL CLASS NAME 'class_name']
[EXTERNAL HOST OBJECT 'host_object_filename']
[EXTERNAL SPU OBJECT 'SPU_object_filename']
Inputs
The CREATE [OR REPLACE] FUNCTION command takes the following inputs:
Table B-10: CREATE [OR REPLACE] FUNCTION Input
Input
Description
function_name
Specifies the name of the function that you want to create. This is
the SQL identifier that will be used to invoke the function in a SQL
expression. The name must meet the naming criteria for keywords
and identifiers, which are described in the IBM Netezza Database
User's Guide.
If the function already exists, you cannot change the name using
the CREATE OR REPLACE command.
argument_types
RETURNS return_type
Specifies the function's return value as one fully specified argument and type. All Netezza data types are supported. Strings must
include either a size or ANY for generic sizes. NUMERIC types
must include precision and scale or ANY for generic sizes.
LANGUAGE
PARAMETER STYLE The default and only supported value at this time is NPSGENERIC.
FENCED
NOT FENCED
[DETERMINISTIC |
NOT
DETERMINISTIC]
MAXIMUM MEMORY Specifies an indication of the potential memory use of the function. The size value can be an empty value or a value in the form of
a number and the letters b (bytes), k (kilobytes), m (megabytes), or
g (gigabytes). For example, valid values could be '0', '1k', '100k',
'1g', or '10m'. The default is 0.
LOGMASK mask
Specifies the logging control level for the function. Valid values are
NONE, DEBUG, and TRACE, or a comma-separated combination of
DEBUG and TRACE.
DEPENDENCIES
deplibs
Specifies an optional list of user-defined shared library dependencies for the UDX. You can specify one or a comma-separated list of
library names.
API VERSION [1 | 2] Specifies the version of the UDX interface used by the function.
The API VERSION must match the compiled version of the object
files for the host and SPU. The default is 1. If you include version
2 compiled objects, you must specify API VERSION 2.
ENVIRONMENT
TABLE, TABLE
FINAL ALLOWED
Specifies the options that control how the user-defined table function can be invoked.
The TABLE, TABLE FINAL ALLOWED option specifies that you
TABLE ALLOWED
TABLE FINAL
ALLOWED
PARALLEL
ALLOWED
PARALLEL NOT
ALLOWED
EXTERNAL CLASS
NAME 'class_name'
Specifies the name of the C++ class that implements the function.
The class must derive from the Udf base class and must implement
a static method that instantiates an instance of the class.
EXTERNAL HOST OBJECT 'host_object_filename'
EXTERNAL SPU OBJECT 'SPU_object_filename'
Specifies the pathname for the Linux SPUs compiled object file.
Specify the spu10 compiled object for Rev10 SPUs on IBM
Netezza 1000 and Netezza 100 models.
Outputs
The CREATE [OR REPLACE] FUNCTION command has the following output:
Table B-11: CREATE [OR REPLACE] FUNCTION Output
Output
Description
CREATE FUNCTION
ERROR: User 'username' is not allowed to create/drop functions.
The system returns this message if your user account does not have Create Function permission.
ERROR: Synonym 'name'
already exists
ERROR: function name already exists with the same signature
This error is returned when you issue a CREATE FUNCTION command and a function with the same name and
argument type list already exists in the database. Use
CREATE OR REPLACE FUNCTION instead.
ERROR: function name already exists with the same signature
The system returns this message if a function already exists with the name that you specified for the function.
ERROR: ProcedureCreate:
Can't use version 2 features
without specifying API VERSION 2 for udx_name
Description
When you create a function, note that the function's signature (that is, its name and argument type list) must be unique within its database. No other user-defined function or
aggregate can have the same name and argument type list in the same database.
You cannot change the function's name or the argument type list using the CREATE OR
REPLACE command. You can change some aspects of the argument types; for example,
you can change the size of a string or the precision and scale of a numeric value. To change
a function's name and/or argument type list, you must drop the function and then create a
function with the new name and/or argument type list.
You cannot replace a user-defined function that is currently in use in an active query. After
the active query's transaction completes, the Netezza system will process the CREATE OR
REPLACE FUNCTION command to update the function.
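The distinction drawn above, between a different argument type list and the same list with different string or numeric sizes, can be illustrated as follows. This is a minimal sketch and not how the Netezza system compares signatures: the names split_type and compare_signatures are hypothetical, and treating names and type keywords as case-insensitive is an assumption.

```python
import re


def split_type(arg_type):
    """Split 'VARCHAR(20)' into ('VARCHAR', '20'); unsized types give ('INT4', '')."""
    match = re.fullmatch(r"\s*([A-Za-z][A-Za-z0-9_ ]*?)\s*(?:\((.*)\))?\s*", arg_type)
    if match is None:
        raise ValueError(f"cannot parse argument type: {arg_type!r}")
    base = " ".join(match.group(1).upper().split())
    size = (match.group(2) or "").replace(" ", "").upper()
    return base, size


def compare_signatures(first, second):
    """Compare two (name, [argument types]) pairs.

    Returns 'same', 'size-differs' (same base types, different sizes),
    or 'different'.
    """
    first_name, first_args = first
    second_name, second_args = second
    if first_name.upper() != second_name.upper():
        return "different"
    if len(first_args) != len(second_args):
        return "different"
    first_parts = [split_type(arg) for arg in first_args]
    second_parts = [split_type(arg) for arg in second_args]
    first_bases = [base for base, _ in first_parts]
    second_bases = [base for base, _ in second_parts]
    if first_bases != second_bases:
        return "different"
    return "same" if first_parts == second_parts else "size-differs"
```

In this sketch, a result of 'size-differs' corresponds to the case in which you change only the size of a string or the precision and scale of a numeric argument, while 'different' corresponds to a change that requires dropping and re-creating the function.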
Privileges Required
You must have Create Function permission to use the CREATE FUNCTION command. Also,
if you use CREATE OR REPLACE FUNCTION to change a UDF, you must have Create Function and Alter permission for the UDF to change it. To create an unfenced function, you
must have the Unfence admin privilege.
Note: When you issue a CREATE FUNCTION command, the database processes the HOST
OBJECT and the SPU OBJECT files as the user nz. The user nz must have read access to
the object files and read and execute access to every directory in the path from the root to
the object file.
Common Tasks
Use the CREATE FUNCTION command to create and become the owner of a new user-defined function. You must create the function's C++ files and compile them using nzudxcompile before you can use this command to register the function with the Netezza system.
The function is defined as an object in the current database.
Related Commands
See ALTER FUNCTION on page B-6 to change a UDF.
See DROP FUNCTION on page B-27 to drop a UDF.
See SHOW FUNCTION on page B-32 to display information about functions.
Usage
The following provides sample usage.
CREATE [OR REPLACE] LIBRARY
Synopsis
The CREATE [OR REPLACE] LIBRARY command has the following syntax:
CREATE [OR REPLACE] LIBRARY library_name
[ AUTOMATIC LOAD | MANUAL LOAD ]
[NO DEPENDENCIES| DEPENDENCIES deplibs]
[ EXTERNAL HOST OBJECT 'host_object_filename' ]
[ EXTERNAL SPU OBJECT 'SPU_object_filename' ]
Inputs
The CREATE [OR REPLACE] LIBRARY command takes the following inputs:
Table B-12: CREATE [OR REPLACE] LIBRARY Input
Input
Description
library_name
The name of the library that you want to create or replace. The
name must be unique within the current database. You must be
connected to the database where the library is defined. You cannot
change the name using the CREATE OR REPLACE LIBRARY
command.
[AUTOMATIC LOAD | MANUAL LOAD]
Automatic load specifies that the Netezza system will automatically open the library before any objects that depend upon it are
used. Manual load specifies that the UDX is responsible for opening and closing manual load libraries when they are needed.
DEPENDENCIES deplibs
Specifies an optional list of user-defined shared library dependencies for the UDX. You can specify one or a comma-separated list of
library names.
NO DEPENDENCIES
Specifies that there are no dependencies for the UDX, which is the
default if DEPENDENCIES deplibs is omitted. You can use this
option to clear any previous dependencies declared for the UDX.
EXTERNAL HOST OBJECT 'host_object_filename'
EXTERNAL SPU OBJECT 'SPU_object_filename'
Outputs
The CREATE [OR REPLACE] LIBRARY command has the following output:
Table B-13: CREATE [OR REPLACE] LIBRARY Output
Output
Description
CREATE LIBRARY
ERROR: Object with name 'libname' already exists
The message returned if you use the CREATE LIBRARY command for a library name that already exists. Use
CREATE OR REPLACE or specify a unique library name.
ERROR: lookupLibrary: library
libname does not exist
Description
The user-defined shared library is created in the current database. You cannot replace a
user-defined library that is currently in use in an active query. After the active query's transaction completes, the Netezza system will process the CREATE OR REPLACE LIBRARY
command to replace the library.
Privileges Required
You must have Create Library permission to use the CREATE LIBRARY command. Also, if
you use CREATE OR REPLACE LIBRARY to change a library, you must have Create Library
and Alter permission for the library to change it.
Note: When you issue a CREATE LIBRARY command, the database processes the HOST
OBJECT and the SPU OBJECT files as the user nz. The user nz must have read access to
the object files and read and execute access to every directory in the path from the root to
the object file.
Common Tasks
You can use the CREATE [OR REPLACE] LIBRARY command to create and become the
owner of a new shared library. You must create the library and any of its dependencies and
compile them using nzudxcompile before you can use this command to register the shared
library with the Netezza system. The library is defined as an object in the current database.
Related Commands
See ALTER LIBRARY on page B-11 to alter shared libraries.
See DROP LIBRARY on page B-28 to drop shared libraries.
See SHOW LIBRARY on page B-33 to display information about shared libraries.
Usage
The following provides sample usage.
DROP AGGREGATE
Use the DROP AGGREGATE command to remove an existing user-defined aggregate from a
database. When you drop an aggregate, the aggregate's object files will also be removed
from the user code object repository.
Synopsis
Syntax for dropping a user-defined aggregate:
DROP AGGREGATE aggregate_name(argument_types)
Inputs
The DROP AGGREGATE command takes the following inputs:
Table B-14: DROP AGGREGATE Input
Input
Description
aggregate_name
argument_types
Outputs
The DROP AGGREGATE command has the following output:
Table B-15: DROP AGGREGATE Output
Output
Description
DROP AGGREGATE
This error indicates that a UDX exists with the name but
has different sizes specified for string or numeric arguments. To drop the aggregate, make sure that you
specify the exact argument type list with correct sizes.
ERROR: Can't delete aggregate name - view viewName depends on it
The message that the system returns if a UDA is referenced in a view. You cannot drop the UDA until the
dependency from the view is resolved.
Description
You cannot drop a user-defined aggregate that is currently in use in an active query. After
the active query's transaction completes, the Netezza system will process the DROP
AGGREGATE command to drop the aggregate. The aggregate must be defined in the current database.
You cannot drop a UDA that is referenced by an existing view. Review the section Dependency Checks before Dropping UDXs on page 6-13 for more information about resolving
dependencies to UDAs that you want to drop.
Privileges Required
To drop a UDA, you must meet one of the following criteria:
You must have the Drop privilege on the specific UDA object.
Common Tasks
Use the DROP AGGREGATE command to drop an existing aggregate from a database.
Related Commands
See CREATE [OR REPLACE] AGGREGATE on page B-13 for information on how to create
aggregates.
See ALTER AGGREGATE on page B-2 to alter an aggregate.
See SHOW AGGREGATE on page B-30 to display information about aggregates.
Usage
The following is sample usage.
DROP FUNCTION
Use the DROP FUNCTION command to remove an existing user-defined function from a
database. When you drop a function, the function's object files will also be removed from
the user code object repository.
Synopsis
Syntax for dropping a user-defined function:
DROP FUNCTION function_name(argument_types)
Inputs
The DROP FUNCTION command takes the following inputs:
Table B-16: DROP FUNCTION Input
Input
Description
function_name
argument_types
Outputs
The DROP FUNCTION command has the following output:
Table B-17: DROP FUNCTION Output
Output
Description
DROP FUNCTION
This error indicates that a UDX exists with the name but
has different sizes specified for string or numeric arguments. To drop the function, make sure that you specify
the exact argument type list with correct sizes.
The message that the system returns if a UDF is referenced in a table or a view. You cannot drop the UDF
until the dependency is resolved.
Description
You cannot drop a user-defined function that is currently in use in an active query. After the
active query's transaction completes, the Netezza system will process the DROP FUNCTION command to drop the function. The function must be defined in the current
database.
You cannot drop a UDF that is referenced by an existing table or view. Review the section
Dependency Checks before Dropping UDXs on page 6-13 for more information about
resolving dependencies to UDFs that you want to drop.
Privileges Required
To drop a UDF, you must meet one of the following criteria:
You must have the Drop privilege on the specific UDF object.
Common Tasks
Use the DROP FUNCTION command to drop an existing function from a database.
Related Commands
See CREATE [OR REPLACE] FUNCTION on page B-17 for information on how to create
functions.
See ALTER FUNCTION on page B-6 to alter a function.
See SHOW FUNCTION on page B-32 to display information about functions.
Usage
The following is sample usage.
DROP LIBRARY
Use the DROP LIBRARY command to remove an existing user-defined shared library from a
database. When you drop a shared library, the shared library's object files will also be
removed from the user code object repository.
Synopsis
Syntax for dropping a user-defined shared library:
DROP LIBRARY library_name
Inputs
The DROP LIBRARY command takes the following inputs:
Table B-18: DROP LIBRARY Input
Input
Description
library_name
Outputs
The DROP LIBRARY command has the following output:
Table B-19: DROP LIBRARY Output
Output
Description
DROP LIBRARY
ERROR: RemoveLibrary:
library libname does not exist
Description
You cannot drop a user-defined shared library that is currently in use in an active query.
After the active query's transaction completes, the Netezza system will process the DROP
LIBRARY command to drop the shared library. The shared library must be defined in the
current database.
Privileges Required
To drop a shared library, you must meet one of the following criteria:
You must have the Drop privilege on the specific shared library object.
Common Tasks
Use the DROP LIBRARY command to drop an existing shared library from a database.
Related Commands
See ALTER LIBRARY on page B-11 to alter shared libraries.
See CREATE [OR REPLACE] LIBRARY on page B-23 to create or replace shared
libraries.
See SHOW LIBRARY on page B-33 to display information about shared libraries.
Usage
The following is sample usage.
SHOW AGGREGATE
Use the SHOW AGGREGATE command to display information about one or more aggregates
(built-in as well as UDAs). The command checks your user account privileges to ensure that
you are permitted to see information about the UDAs defined in the database.
Synopsis
Syntax:
SHOW AGGREGATE [ALL | ident] [VERBOSE]
Inputs
The SHOW AGGREGATE command takes the following inputs:
Table B-20: SHOW AGGREGATE Input
Input
Description
ALL
ident
Show information about a specific aggregate defined in the database that begins with ident. You can specify a partial name, but the
command will error if you specify a full signature.
VERBOSE
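The way the ident input selects aggregates can be modeled outside the database. The following is a minimal sketch that only imitates the documented behavior (prefix matching, and an error for a full signature); the function name show_matches and the sample name PENMIN are hypothetical, and the case-insensitive matching and sorted result are assumptions.

```python
def show_matches(names, ident=None):
    """Return the names that a SHOW command with this ident would select.

    With no ident, every name is returned, as with ALL.
    """
    if ident is None:
        return sorted(names)
    if "(" in ident or ")" in ident:
        # The command reports an error when given a full signature.
        raise ValueError("specify a name or partial name, not a full signature")
    prefix = ident.upper()
    return sorted(name for name in names if name.upper().startswith(prefix))
```

For example, show_matches(['PENMAX', 'PENMIN', 'SUM'], 'pen') selects PENMAX and PENMIN, while an ident such as 'penmax(integer)' raises an error.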
Outputs
The SHOW AGGREGATE command has the following output:
Table B-21: SHOW AGGREGATE Output
Output
Description
Description
The SHOW AGGREGATE command is identical in behavior to the nzsql \da and \da+
commands.
Privileges Required
Any user can run the SHOW AGGREGATE command; however, you must be the admin user,
own the UDA, or have object privileges on UDAs (such as Execute, List, Alter, or Drop) to
see information about UDAs in the output.
Common Tasks
Use the SHOW AGGREGATE command to display information about the aggregates in a
database.
Related Commands
See ALTER AGGREGATE on page B-2 to alter UDAs.
See CREATE [OR REPLACE] AGGREGATE on page B-13 to create UDAs.
See DROP AGGREGATE on page B-25 to drop a UDA.
Usage
The following provides sample usage.
To show the sample UDA named PenMax, use the following command:
DEV(MYUSER)=> show aggregate penmax;
  NAME  | BUILTIN | ARGUMENTS | RETURNTYPE | DESCRIPTION
--------+---------+-----------+------------+-------------
 PENMAX | f       | (INTEGER) | INT4       |
(1 row)
To show verbose information for the PenMax UDA, use the following command:
To list all the aggregates in a database, use the following command. (The output is
abbreviated for presentation in the document.)
SHOW FUNCTION
Use the SHOW FUNCTION command to display information about one or more functions
(built-in as well as UDFs). The command checks your user account privileges to ensure that
you are permitted to see information about the UDFs defined in the database.
Synopsis
Syntax:
SHOW FUNCTION [ALL | ident] [VERBOSE]
Inputs
The SHOW FUNCTION command takes the following inputs:
Table B-22: SHOW FUNCTION Input
Input
Description
ALL
ident
Show information about one or more functions defined in the database that begin with ident. You can specify a partial name, but the
command will error if you specify a full signature.
VERBOSE
Outputs
The SHOW FUNCTION command has the following output:
Table B-23: SHOW FUNCTION Output
Output
Description
Description
The SHOW FUNCTION command is identical in behavior to the nzsql \df and \df+
commands.
Privileges Required
Any user can run the command SHOW FUNCTION; however, you must be the admin user,
own the UDF, or have object privileges on UDFs (such as Execute, List, Alter, or Drop) to
see information about UDFs in the output.
Common Tasks
Use the SHOW FUNCTION command to display information about the functions in a
database.
Related Commands
See ALTER FUNCTION on page B-6 to alter UDFs.
See CREATE [OR REPLACE] FUNCTION on page B-17 to create UDFs.
See DROP FUNCTION on page B-27 to drop a UDF.
Usage
The following provides sample usage.
To show all the functions, use the following command. (The output is abbreviated for
presentation in the document.)
MYDB(MYUSER)=> SHOW FUNCTION;
                           List of functions
      RESULT      |   FUNCTION   | BUILTIN |          ARGUMENTS
------------------+--------------+---------+-----------------------------
 BIGINT           | ABS          | t       | (BIGINT)
 DOUBLE PRECISION | ABS          | t       | (DOUBLE PRECISION)
 INTEGER          | ABS          | t       | (INTEGER)
 DOUBLE PRECISION | COS          | t       | (DOUBLE PRECISION)
 DOUBLE PRECISION | COT          | t       | (DOUBLE PRECISION)
 INTEGER          | CUSTOMERNAME | f       | (CHARACTER VARYING (64000))
To show verbose information for the sample UDF named customername, use the following command.
| (CHARACTER VARYING(64000)) | t | ADMIN | f | t | | 1 | | | |
SHOW LIBRARY
Use the SHOW LIBRARY command to display information about one or more user-defined
shared libraries. The command checks your user account privileges to ensure that you are
permitted to see information about the shared libraries defined in the database.
Synopsis
Syntax:
SHOW LIBRARY [ALL | ident] [VERBOSE]
Inputs
The SHOW LIBRARY command takes the following inputs:
Table B-24: SHOW LIBRARY Input
Input
Description
ALL
ident
Show information about one or more libraries defined in the database that begin with ident. You can specify a partial name.
VERBOSE
Description
The SHOW LIBRARY command is identical in behavior to the nzsql \dl command.
Privileges Required
Any user can run the command SHOW LIBRARY; however, you must be the admin user,
own the library, or have object privileges on libraries (such as Execute, List, Alter, or Drop)
to see information about libraries in the output.
Common Tasks
Use the SHOW LIBRARY command to display information about the shared libraries in a
database.
Related Commands
See ALTER LIBRARY on page B-11 to alter shared libraries.
See CREATE [OR REPLACE] LIBRARY on page B-23 to create or replace shared
libraries.
See DROP LIBRARY on page B-28 to drop shared libraries.
Usage
The following provides sample usage.
MYMATHLIB
MYSQLTOOLSLIB
(6 rows)
| t
| f
To show verbose information for the sample library named mylib, or any libraries that
begin with the mylib string, use the following command.
APPENDIX
This appendix describes helper functions that you can use to manage, verify, and convert
Netezza-specific datatype values. There are helper functions available for processing two
types of datatypes: temporal (date/time) values and numeric values.
Convert datatypes from their internal Netezza formats into developer-style data, a process known as decoding.
Convert data from developer-style data into Netezza internal formats for storage on and
use by the Netezza system, a process known as encoding.
Verify that a value is within the valid range for its specified datatype.
Developer-style data refers to simple data structures from which programmers can infer
useful information. For example, the API function decodeDate converts a Netezza date,
which is an integer number of days after 1/1/2000, to a format that programs more commonly
use: Gregorian calendar day, month, and year. The API does not provide any kind of
text (presentation) formatting or "end-user friendly" formats.
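The day-count arithmetic behind this kind of decoding can be illustrated with a minimal standalone sketch. It does not call the helper API: the name sketchDecodeDate and the CivilDate struct are hypothetical, and the conversion is a widely used days-to-civil-date algorithm for the proleptic Gregorian calendar, which is not necessarily how the Netezza library implements decodeDate.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical decoded form; this is not a type from udxinc.h.
struct CivilDate {
    int year;
    unsigned month; // 1 to 12
    unsigned day;   // 1 to 31
};

// Converts a day count relative to 2000-01-01 (Netezza day zero) into a
// proleptic Gregorian year, month, and day.
CivilDate sketchDecodeDate(int32_t nzDays)
{
    // Documented Netezza Date range: 1/1/0001 to 12/31/9999.
    assert(nzDays >= -730119 && nzDays <= 2921939);

    // 10957 days separate 1970-01-01 from 2000-01-01; 719468 more days move
    // the origin to 0000-03-01, which puts each leap day at the end of a year.
    const int64_t z = static_cast<int64_t>(nzDays) + 10957 + 719468;
    const int64_t era = z / 146097; // 400-year cycles; z is positive in range
    const unsigned doe = static_cast<unsigned>(z - era * 146097); // day of era
    const unsigned yoe =
        (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;    // year of era
    const int64_t y = static_cast<int64_t>(yoe) + era * 400;
    const unsigned doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year
    const unsigned mp = (5 * doy + 2) / 153;                      // March = 0
    const unsigned d = doy - (153 * mp + 2) / 5 + 1;
    const unsigned m = mp < 10 ? mp + 3 : mp - 9;

    CivilDate out = { static_cast<int>(m <= 2 ? y + 1 : y), m, d };
    return out;
}
```

With this sketch, sketchDecodeDate(0) yields 1/1/2000, and the range limits -730119 and 2921939 yield 1/1/0001 and 12/31/9999.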
This API currently processes temporal data types such as: Date, Time, Timestamp, TimeTZ
and Interval. The API generally complies with the standard for the ISO C time_t datatype.
The API also complies with the standard for the struct tm datatype, with two exceptions:
Leap seconds are not supported. The range for tm_sec is reduced to [0, 59] for the API
functions.
When converting from NZ Date or NZ Timestamp to struct tm, the tm_yday is set to 0,
and the tm_gmtoff field is not supported on Netezza SPUs.
Since you must include the udxinc.h header file in the C++ UDX source code, you automatically have access to the datatype helper API functions. The helper API functions are
contained within namespace nz::udx::dthelpers.
Internal Representation
Range
Date
Time
Min: 0 (00:00:00.000000)
Max: 86,399,999,999 (23:59:59.999999)
TimeTZ
Timestamp
Min: -63,082,281,600,000,000
(00:00:00, 1/1/0001)
Max: 252,455,615,999,999,999
(23:59:59.999999, 12/31/9999)
Interval
Each conversion function provides an optional boolean error write-to argument for easy
checking. The optional argument is set to true when the given data is out of range and false
when the given data is in range (and there are no other errors). The conversion routines will
throw an error when any of the passed references or pointers are null, or when the optional
error argument is not supplied and there is an error.
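The optional error argument convention can be sketched with a standalone function. The name sketchEncodeTime is hypothetical and is not the helper API. Returning 0 when the error flag is set is an assumption made only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical routine that mirrors the error-reporting convention described
// above. It encodes h:m:s as microseconds since midnight.
int64_t sketchEncodeTime(int h, int m, int s, bool* errorFlag = nullptr)
{
    const bool outOfRange =
        h < 0 || h > 23 || m < 0 || m > 59 || s < 0 || s > 59;

    if (errorFlag != nullptr) {
        *errorFlag = outOfRange; // true when out of range, false otherwise
        if (outOfRange) {
            return 0;            // assumption: the result is not meaningful
        }
    } else if (outOfRange) {
        // No error argument was supplied, so the error is raised instead.
        throw std::out_of_range("time value out of range");
    }

    const int64_t encoded =
        ((static_cast<int64_t>(h) * 60 + m) * 60 + s) * 1000000LL;
    assert(encoded >= 0 && encoded <= 86399999999LL); // documented Time range
    return encoded;
}
```

A caller that passes the address of a bool can test the flag after the call. A caller that omits the argument must be prepared to catch the exception.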
Gregorian Calendar Date and Time                        | time_t value at UTC + 0:00 (GMT) | NZ Date Value | NZ Timestamp Value
--------------------------------------------------------+----------------------------------+---------------+-------------------------
Lowest supported NZ date: 1/1/0001, 00:00:00            | UNDEFINED                        | -730,119      | -63,082,281,600,000,000
                                                        |                                  | -10,957       | -946,684,800,000,000
NZ day zero: 1/1/2000, 00:00:00                         | 946,684,800                      |               |
                                                        | 2,147,483,647                    | 13,898        | 1,200,798,847,000,000
Highest supported NZ date: 12/31/9999, 23:59:59.999999  | UNDEFINED                        | 2,921,939     | 252,455,615,999,999,999
Data used by NZ
TIMESTAMP
tm_sec (0 to 61)
Ignored
Seconds (0 to 59)
tm_min (0 to 59)
Ignored
Minutes
tm_hour (0 to 23)
Ignored
Hours
Day
Day
Month
Month
Year
Year
Week Day
Week Day
Year Day
Year Day
Set to -1
Set to -1
For conversions to and from struct tm, Netezza does not use or support leap seconds. Thus,
the seconds fields of struct tm that are written out by the API will never exceed 59. If a
conversion routine is called with a struct tm containing a 60 or 61 tm_sec field, the API
throws an error.
When you are encoding from a struct tm format, ignored fields can contain any data. When
you are decoding to a struct tm, the ignored fields will have a value of zero (0).
The tm_isdst flag will be ignored on input and will be set to -1 for 'unknown' on output.
Setting tm_isdst to -1 is not within the standard, but is a typical industry practice. In all
other respects, the API conforms to all the time_t specifications as listed in the ANSI C++
standard.
To decode a Netezza time to h:m:s only, and ignore microseconds, you can use the IgnoreBuffer structure as follows:
uint8 h,m,s;
IgnoreBuffer ignore;
decodeTime(givenTime, &h, &m, &s, &ignore.u32);
In another example, to decode a date value into only the day and month and ignore the
year, you can use IgnoreBuffer as follows:
uint8 month,day;
IgnoreBuffer ignore;
decodeDate(givenDate, &month, &day, &ignore.u16);
Note that an instance of the buffer never contains meaningful data; use it only as a target for values that you want to discard.
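The discard-target idea can be sketched without the helper API as follows. The union SketchIgnoreBuffer and the function sketchDecodeTime are hypothetical stand-ins modeled on the calls shown above. The real IgnoreBuffer definition may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical discard target; the real IgnoreBuffer definition may differ.
union SketchIgnoreBuffer {
    uint8_t  u8;
    uint16_t u16;
    uint32_t u32;
    uint64_t u64;
};

// Hypothetical decoder with the same out-parameter shape as the decodeTime
// call above. Splits microseconds since midnight into h:m:s and microseconds.
void sketchDecodeTime(int64_t nzTime, uint8_t* h, uint8_t* m, uint8_t* s,
                      uint32_t* micros)
{
    assert(h != nullptr && m != nullptr && s != nullptr && micros != nullptr);
    assert(nzTime >= 0 && nzTime <= 86399999999LL); // documented Time range

    *micros = static_cast<uint32_t>(nzTime % 1000000);
    const int64_t seconds = nzTime / 1000000;
    *s = static_cast<uint8_t>(seconds % 60);
    *m = static_cast<uint8_t>((seconds / 60) % 60);
    *h = static_cast<uint8_t>(seconds / 3600);
}
```

Because every member of the union shares the same storage, one instance can absorb any out-parameter that the caller does not need, and nothing should ever be read back from it.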
Description
Value
-730,119
2,921,939
86,399,999,999
The minimum value that the Netezza-encoded TimeTZ Offset part can have
(+13:00:00).
-46800
The maximum value that the Netezza-encoded TimeTZ Offset part can have
(-12:59:00).
46740
-63,082,281,600,000,000
252,455,615,999,999,999
The minimum value that the Netezza-encoded Interval Month part can
have.
-3,000,000
The maximum value that the Netezza-encoded Interval Month part can
have.
3,000,000
9999
-779
780
-11,323
13,898
-946,684,800,000,000
1,200,798,847,999,999
isValidDate
Verifies whether a Netezza-encoded Date value is valid and within the Netezza Date range.
Description
isValidEpochDate
Verifies whether a Netezza-encoded Date value is valid and within the time_t Epoch range.
Description
isValidTime
Verifies whether a Netezza-encoded Time value is valid and within range.
Description
isValidTimeTzOffset
Verifies whether the offset part of a Netezza-encoded TimeTZ value is valid and within
range.
Description
isValidTimeTz
Verifies whether a Netezza-encoded TimeTZ value is valid and within range.
Description
isValidTimestamp
Verifies whether a Netezza-encoded Timestamp value is valid and within range.
Description
isValidEpochTimestamp
Verifies whether a Netezza-encoded Timestamp value is valid and within the time_t Epoch
range.
Description
isValidInterval
Verifies whether a Netezza-encoded Interval value is valid and within range.
Description
isValidDate
Verifies whether a decoded m/d/y Date value is valid and within the Netezza Date range.
Description
isValidTime
Verifies whether a decoded h:m:s:micros Time value is valid and within the Netezza Time
range.
Description
isValidSqlOffset
Verifies whether a time offset value is within the valid API range.
Description
isValidTimeTz
Verifies whether a decoded h:m:s:micros+offset TimeTZ value is valid and within the
Netezza TimeTZ range.
Description
isValidTimestamp
Verifies whether a decoded m/d/y, h:m:s:micros Timestamp value is valid and within the
Netezza Timestamp range.
Description
isValidEpoch
Verifies whether a decoded time_t value is valid and can be decoded to NZ Timestamp or
Date.
Description
isValidTimeValUsecs
Verifies whether the microseconds part of a timeval structure is valid.
Description
isValidTimeVal
Verifies whether a given timeval structure is valid and can be encoded to a Netezza
Timestamp.
Description
isValidTimeStruct
Verifies whether a given tm structure can be encoded to a Netezza Date or Timestamp
value.
Description
decodeTime
Converts a Netezza-encoded Time value to h:m:s:micros.
Description
decodeTimeTz
Converts a Netezza-encoded TimeTz value to h:m:s:micros.
Description
encodeTime
Converts an h:m:s:micros Time value to a Netezza-encoded Time value.
Description
encodeTimeTZ
Converts an h:m:s:micros TimeTZ value to a Netezza-encoded TimeTZ.
Description
Miscellaneous Functions
The miscellaneous datatype helper functions provide additional capabilities that you can
use within your UDX programs.
isLeapYear
Verifies whether the specified year is a leap year.
Description
Returns
The function returns a value of true when both of the following are true:
year%4 is 0
year%100 is not 0, or year%400 is 0
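The Gregorian leap-year rule can be written as a single expression. The following standalone sketch uses the hypothetical name sketchIsLeapYear. It illustrates the standard rule and is not the library's source code.

```cpp
#include <cassert>

// Hypothetical stand-in that applies the Gregorian leap-year rule.
bool sketchIsLeapYear(int year)
{
    assert(year >= 1 && year <= 9999); // year range of the Netezza Date type
    return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
}
```

Under this rule, 2000 and 2004 are leap years, while 1900 and 2001 are not.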
offsetTimestamp
Applies an offset in the range [SQL_OFFSET_MIN, SQL_OFFSET_MAX] to a Netezza Timestamp value.
Description
offsetTime
Applies an offset to a Netezza Time value. If nzTime with offset crosses 23:59:59.999999,
the value will reset (wrap) back to zero. For example, applying "+120 minutes" to the
encoded equivalent of "23:00:00" returns the encoded equivalent of "01:00:00".
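The wrap-around behavior amounts to arithmetic modulo the number of microseconds in a day. The following standalone sketch shows it under two assumptions: the hypothetical function sketchOffsetTime takes its offset in seconds, and an offset that moves the value before 00:00:00 wraps backward in the same way. Neither detail is confirmed for the real offsetTime function.

```cpp
#include <cassert>
#include <cstdint>

const int64_t kMicrosPerDay = 86400LL * 1000000LL;

// Hypothetical stand-in: applies an offset, given in seconds, to a time that
// is encoded as microseconds since midnight, wrapping at the day boundary.
int64_t sketchOffsetTime(int64_t nzTime, int32_t offsetSeconds)
{
    assert(nzTime >= 0 && nzTime < kMicrosPerDay);

    int64_t shifted =
        (nzTime + static_cast<int64_t>(offsetSeconds) * 1000000LL)
        % kMicrosPerDay;
    if (shifted < 0) {
        shifted += kMicrosPerDay; // wrap backward past midnight (assumption)
    }
    return shifted;
}
```

Applying 7200 seconds (120 minutes) to the encoding of 23:00:00, which is 82,800,000,000, returns 3,600,000,000, the encoding of 01:00:00.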
Description
offsetEpoch
Applies an offset to a time_t value. The function treats time_t as if it allows offsets, which is
slightly outside the time_t specification but makes the value easier to use.
Description
offsetTimeStruct
Applies an offset to a struct tm.
Description
convertNumeric32
Converts an input numeric value from its current storage size, precision, and scale to a 32-bit numeric with a new precision and scale.
Description
The function has three forms of syntax:
int32 convertNumeric32(int32 value, int curPrec, int curScale, int
desiredPrec, int desiredScale);
int32 convertNumeric32(int64 value, int curPrec, int curScale, int
desiredPrec, int desiredScale);
int32 convertNumeric32(CNumeric128 value, int curPrec, int curScale,
int desiredPrec, int desiredScale);
value specifies the integer part of the input numeric, which can be either a 32-bit, 64-bit,
or 128-bit value.
curPrec and curScale specify the current precision and scale for the input numeric specified in value.
desiredPrec and desiredScale specify the new precision and scale for the converted
numeric. For a 32-bit numeric, the desiredPrec value can range from 1 to 9, and the
desiredScale value can range from 0 to (9-desiredPrec). If your desired precision is in the
range of 10 to 18, use the convertNumeric64 function, or if the desired precision is in the
range of 19-38, use the convertNumeric128 function. This helps to ensure that you select
the right storage size for the resulting integer part of the numeric.
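The effect of changing the scale on the integer part can be illustrated with a standalone sketch. The function sketchRescaleNumeric is hypothetical and works only on 64-bit integers. It truncates digits when the scale shrinks, which is an assumption because the rounding behavior of the real convertNumeric functions is not stated here.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical helper: rescales the integer part of a numeric to a new scale
// and checks that the result fits in desiredPrec decimal digits.
int64_t sketchRescaleNumeric(int64_t value, int curScale,
                             int desiredPrec, int desiredScale)
{
    assert(curScale >= 0 && desiredScale >= 0);
    assert(desiredPrec >= 1 && desiredPrec <= 18); // keeps 10^prec in int64

    int64_t limit = 1; // becomes 10^desiredPrec
    for (int i = 0; i < desiredPrec; ++i) {
        limit *= 10;
    }

    int64_t result = value;
    for (int s = curScale; s < desiredScale; ++s) {
        // Multiplying by 10 would reach or exceed the limit, so stop early.
        if (result >= limit / 10 || result <= -(limit / 10)) {
            throw std::out_of_range("Numeric value out of range");
        }
        result *= 10;
    }
    for (int s = curScale; s > desiredScale; --s) {
        result /= 10; // truncates toward zero (assumption)
    }
    if (result >= limit || result <= -limit) {
        throw std::out_of_range("Numeric value out of range");
    }
    return result;
}
```

For example, the numeric 123.45 is stored as the integer 12345 with scale 2. Rescaling it to scale 4 gives 1234500, and rescaling it to scale 1 gives 1234.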
Returns
The function returns a 32-bit integer that is compatible with the desired precision and
scale values.
Throws
The function throws the following exceptions:
Numeric value out of range usually indicates that the input value is outside
the range of the current precision and scale values, or outside the range of a 128-bit
integer.
convertNumeric64
Converts an input numeric value from its current storage size, precision, and scale to a 64-bit numeric with a new precision and scale.
Description
The function has three forms of syntax:
int64 convertNumeric64(int32 value, int curPrec, int curScale, int
desiredPrec, int desiredScale);
int64 convertNumeric64(int64 value, int curPrec, int curScale, int
desiredPrec, int desiredScale);
int64 convertNumeric64(CNumeric128 value, int curPrec, int curScale,
int desiredPrec, int desiredScale);
value specifies the integer part of the input numeric, which can be either a 32-bit, 64-bit,
or 128-bit value.
curPrec and curScale specify the current precision and scale for the input numeric specified in value.
desiredPrec and desiredScale specify the new precision and scale for the converted
numeric value. For a 64-bit numeric, the desiredPrec value can range from 10 to 18, and
the desiredScale value can range from 0 to (18-desiredPrec). If your desired precision is in
the range of 1 to 9, use the convertNumeric32 function, or if the desired precision is in the
range of 19-38, use the convertNumeric128 function. This helps to ensure that you select
the right storage size for the resulting integer portion of the numeric.
Returns
The function returns a 64-bit integer that is compatible with the desired precision and
scale values.
Throws
For a description of the exceptions, see the exceptions for convertNumeric32 on
page C-22.
convertNumeric128
Converts an input numeric value from its current storage size, precision, and scale to a
128-bit numeric with a new precision and scale.
Description
The function has three forms of syntax:
CNumeric128 convertNumeric128(int32 value, int curPrec, int curScale,
int desiredPrec, int desiredScale);
CNumeric128 convertNumeric128(int64 value, int curPrec, int curScale,
int desiredPrec, int desiredScale);
CNumeric128 convertNumeric128(CNumeric128 value, int curPrec, int
curScale, int desiredPrec, int desiredScale);
value specifies the integer part of the input numeric, which can be either a 32-bit, 64-bit,
or 128-bit value.
curPrec and curScale specify the current precision and scale for the input numeric value.
desiredPrec and desiredScale specify the new precision and scale for the converted
numeric value. For a 128-bit numeric, the desiredPrec value can range from 19 to 38, and
the desiredScale value can range from 0 to (38-desiredPrec). If your desired precision is in
the range of 1 to 9, use the convertNumeric32 function, or if the desired precision is in the
range of 10 to 18, use the convertNumeric64 function. This helps to ensure that you select
the right storage size for the resulting integer portion of the numeric.
Returns
The function returns a 128-bit integer that is compatible with the desired precision and
scale values.
Throws
For a description of the exceptions, see the exceptions for convertNumeric32 on
page C-22.
CheckPrecision38Limit
Verifies that an input numeric value is within the 38-digit limit of a numeric. Netezza storage limits require that a numeric cannot have more than 38 combined digits before and
after the decimal point.
Description
The function has the following syntax:
CNumeric128 const& CheckPrecision38Limit(CNumeric128 const& value)
Returns
The function returns a reference to the input 128-bit numeric value or throws an error.
Throws
The function throws the error Numeric value requires more than 38 digits if the input
numeric value has more than 38 digits.
UTF8CharCount
Returns a quick UTF-8 character count of a string.
Description
The function has the following syntax:
inline int UTF8CharCount(const char* bytes, int length)
Returns
The function returns the number of UTF-8 characters. The result may be indeterminate if
bytes is not a valid UTF-8 string. As a best practice, use the isValidUTF8 helper function to
confirm that the string is composed of valid UTF-8 characters before you call this function
to count the characters.
Throws
The function throws an opaque exception object if length < 0 or if bytes is NULL.
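A quick character count of this kind is commonly done by counting every byte that is not a UTF-8 continuation byte. The following standalone sketch shows that technique under the hypothetical name sketchUtf8CharCount. Whether the real UTF8CharCount works the same way is an assumption. Like the real function, the sketch gives a meaningful count only for valid UTF-8 input.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in: counts the bytes that start a character, that is,
// every byte whose top two bits are not 10 (the continuation-byte pattern).
int sketchUtf8CharCount(const char* bytes, int length)
{
    if (bytes == nullptr || length < 0) {
        throw std::invalid_argument("bytes is null or length is negative");
    }
    int count = 0;
    for (int i = 0; i < length; ++i) {
        if ((static_cast<unsigned char>(bytes[i]) & 0xC0) != 0x80) {
            ++count;
        }
    }
    assert(count <= length); // a character needs at least one byte
    return count;
}
```

For the 6-byte string "h\xC3\xA9llo", in which the second character occupies 2 bytes, the sketch returns 5.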
isValidUTF8
Checks if a given string represents valid UTF-8 characters.
Description
The function has the following syntax:
inline bool isValidUTF8(const char* bytes, int length, int*
charLength= NULL)
Returns
The function returns true if length is 0 or bytes[0...length-1] is a valid UTF-8 string. Otherwise, the function returns false.
Throws
The function throws an opaque exception object if bytes is NULL or if length < 0.
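A strict UTF-8 check can be sketched as follows. The function sketchIsValidUtf8 is hypothetical. It rejects truncated sequences, stray continuation bytes, overlong encodings, surrogates, and code points above U+10FFFF, and the real isValidUTF8 may apply different rules. Writing the character count to charLength is an assumption about the purpose of that parameter.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for a strict UTF-8 validity check.
bool sketchIsValidUtf8(const char* bytes, int length,
                       int* charLength = nullptr)
{
    if (bytes == nullptr || length < 0) {
        throw std::invalid_argument("bytes is null or length is negative");
    }
    const unsigned char* p = reinterpret_cast<const unsigned char*>(bytes);
    int chars = 0;
    int i = 0;
    while (i < length) {
        const unsigned char b = p[i];
        int extra;      // number of continuation bytes that must follow
        uint32_t cp;    // code point being assembled
        uint32_t minCp; // smallest code point allowed for this length
        if (b < 0x80)                { extra = 0; cp = b;        minCp = 0; }
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; minCp = 0x80; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; minCp = 0x800; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; minCp = 0x10000; }
        else return false; // stray continuation byte or invalid lead byte

        if (i + extra >= length) return false; // sequence is truncated
        for (int k = 1; k <= extra; ++k) {
            if ((p[i + k] & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (p[i + k] & 0x3F);
        }
        if (cp < minCp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            return false; // overlong form, out of range, or surrogate
        }
        i += extra + 1;
        ++chars;
    }
    assert(chars <= length);
    if (charLength != nullptr) {
        *charLength = chars; // assumption: reports the character count
    }
    return true;
}
```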
APPENDIX
UDX Arguments
Logging Methods
char
nchar
nvarchar
varchar
boolean
date
time
numeric
real
double precision
interval
integer
bigint
smallint
byteint
timestamp
char
DDL info: CHAR(n)
C++ info: UdxBase::UDX_FIXED
struct StringArg
{
char* data;
int length; // Bytes used by string data (not characters).
int dec_length; // Character declared length ie NCHAR(20) = 20. Will
// also be set if VARCHAR or NVARCHAR.
};
struct StringReturn
{
char* data;
int size; // On enter it is the size (in bytes) allocated for string data.
// On return it is the size (in bytes) actually used.
};
The char type often has implicit spaces at the end when passed as an argument. The difference between the specified length and the dec_length indicates how many trailing spaces
must be accounted for. Note that length is in bytes and dec_length is in characters.
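The trailing-space calculation can be sketched without udxinc.h as follows. SketchStringArg is a local mirror of the StringArg layout shown above, and both function names are hypothetical. The sketch assumes single-byte characters, so that a byte count and a character count can be compared directly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Local mirror of the StringArg layout so that the sketch compiles on its
// own; it is not the type from udxinc.h.
struct SketchStringArg {
    char* data;
    int length;     // bytes used by the string data
    int dec_length; // declared length in characters
};

// Number of implicit trailing spaces, assuming single-byte characters.
int sketchTrailingSpaces(const SketchStringArg& arg)
{
    assert(arg.length >= 0 && arg.dec_length >= 0);
    return arg.dec_length > arg.length ? arg.dec_length - arg.length : 0;
}

// Rebuilds the full CHAR(n) value, padded with spaces to the declared length.
std::string sketchPaddedChar(const SketchStringArg& arg)
{
    assert(arg.data != nullptr && arg.length >= 0);
    std::string value(arg.data, static_cast<std::size_t>(arg.length));
    value.append(static_cast<std::size_t>(sketchTrailingSpaces(arg)), ' ');
    return value;
}
```

For a CHAR(5) argument that holds the 3 bytes "abc", the sketch reports 2 trailing spaces and rebuilds the 5-character value.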
nchar
DDL info: NCHAR(n)
C++ info: UdxBase::UDX_NATIONAL_FIXED
struct StringArg
{
char* data;
int length; // Bytes used by string data (not characters).
int dec_length; // Character declared length ie NCHAR(20) = 20. Will
// also be set if VARCHAR or NVARCHAR.
};
struct StringReturn
{
char* data;
int size; // On enter it is the size (in bytes) allocated for string data.
// On return it is the size (in bytes) actually used.
};
The nchar type often has implicit spaces at the end when passed as an argument. The difference between the specified length and the dec_length indicates how many trailing
spaces must be accounted for. Note that length is in bytes and dec_length is in characters.
nvarchar
DDL info: NVARCHAR(n)
varchar
DDL info: VARCHAR(n)
boolean
DDL info: BOOL
// 1 = true, 0 = false
date
DDL info: DATE
C++ info: UdxBase::UDX_DATE
int32 date; // Day resolution spans January 1, 0001 to December
// 31, 9999 (centered around 2000-01-01).
time
DDL info: TIME
Uses the int64 time value and adds an int32 time zone as well. The time zone is represented in seconds.
numeric
DDL info: NUMERIC(p,s)
struct Numeric32Val
{
CNumeric32 *value;
int precision; // Number of digits (both sides of decimal point)
int scale; // Number of decimal digits
};
struct Numeric64Val
{
CNumeric64 *value;
int precision; // Number of digits (both sides of decimal point)
int scale; // Number of decimal digits
};
struct Numeric128Val
{
CNumeric128 *value;
int precision; // Number of digits (both sides of decimal point)
int scale; // Number of decimal digits
};
The precision determines which of the three variations is used: 1 to 9 digits use
Numeric32Val, 10 to 18 digits use Numeric64Val, and 19 to 38 digits use Numeric128Val.
The scale value is necessary to interpret the numeric, because the value is presented as
an integer and the scale indicates where the decimal point is placed.
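The relationship between precision, storage size, and scale can be sketched in standalone code. Both function names are hypothetical, and the formatter handles only values that fit in 64 bits. It is an illustration of how the scale positions the decimal point, not part of the UDX API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Returns the storage width in bits for a precision, or 0 if out of range.
int sketchNumericStorageBits(int precision)
{
    if (precision >= 1 && precision <= 9) return 32;
    if (precision >= 10 && precision <= 18) return 64;
    if (precision >= 19 && precision <= 38) return 128;
    return 0;
}

// Renders the integer part of a numeric as text by placing the decimal
// point scale digits from the right.
std::string sketchNumericToString(int64_t value, int scale)
{
    assert(scale >= 0);
    const bool negative = value < 0;
    // Take the magnitude in unsigned arithmetic so the minimum int64 is safe.
    const uint64_t magnitude = negative
        ? uint64_t(0) - static_cast<uint64_t>(value)
        : static_cast<uint64_t>(value);
    std::string digits = std::to_string(magnitude);

    const std::size_t places = static_cast<std::size_t>(scale);
    if (places > 0) {
        if (digits.size() <= places) {
            // Pad with zeros so that one digit remains before the point.
            digits = std::string(places - digits.size() + 1, '0') + digits;
        }
        const std::size_t pointPos = digits.size() - places;
        digits.insert(pointPos, ".");
    }
    return negative ? "-" + digits : digits;
}
```

For example, the integer 12345 with scale 2 represents 123.45, and the integer 5 with scale 3 represents 0.005.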
real
DDL info: FLOAT4
double precision
DDL info: FLOAT8
interval
DDL info: INTERVAL
{
Interval *value;
};
It has microsecond resolution and a range of +/- 178000000 years. The time part represents everything except months and years (in microseconds), and the month part represents
months and years.
integer
DDL info: INT4
bigint
DDL info: INT8
smallint
DDL info: INT2
C++ info: UdxBase::UDX_INT16
int16 smallInt;
byteint
DDL info: INT1
timestamp
DDL info: TIMESTAMP
C++ info: UdxBase::UDX_TIMESTAMP
int64 timestamp;
The value represents the number of microseconds since midnight 2000-01-01. The value can
be positive or negative; the low value is January 1, 0001 and the high value is December
31, 9999.
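The relationship between this microsecond count and a time_t-style count of seconds since 1970-01-01 can be sketched as follows. The function names are hypothetical and the code does not use the helper API. The constant 946,684,800 is the time_t value at UTC for 2000-01-01 00:00:00.

```cpp
#include <cassert>
#include <cstdint>

const int64_t kEpochSecondsAt2000 = 946684800LL;
const int64_t kMinNzTimestamp = -63082281600000000LL; // 1/1/0001 00:00:00
const int64_t kMaxNzTimestamp = 252455615999999999LL; // 12/31/9999 23:59:59.999999

// Converts microseconds since 2000-01-01 to whole seconds since 1970-01-01,
// rounding toward the past so that negative fractions behave consistently.
int64_t sketchTimestampToEpochSeconds(int64_t nzTimestamp)
{
    assert(nzTimestamp >= kMinNzTimestamp && nzTimestamp <= kMaxNzTimestamp);
    int64_t seconds = nzTimestamp / 1000000LL;
    if (nzTimestamp % 1000000LL < 0) {
        --seconds; // floor instead of truncating toward zero
    }
    return seconds + kEpochSecondsAt2000;
}

// Converts whole seconds since 1970-01-01 to microseconds since 2000-01-01.
int64_t sketchEpochSecondsToTimestamp(int64_t epochSeconds)
{
    return (epochSeconds - kEpochSecondsAt2000) * 1000000LL;
}
```

A timestamp of 0 corresponds to 946,684,800 seconds, and a timestamp of -946,684,800,000,000 corresponds to 0 seconds, the start of the time_t epoch.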
UDX Arguments
You can use the following argument methods in your UDX code:
bool isArgNull(int n)
int argType(int n)
bool isArgConst(int n)
int numArgs()
int64 timestampArg(int n)
int64 timeArg(int n)
int32 dateArg(int n)
bool boolArg(int n)
int64 int64Arg(int n)
int32 int32Arg(int n)
int16 int16Arg(int n)
int8 int8Arg(int n)
double doubleArg(int n)
float floatArg(int n)
struct Interval* intervalArg(int n)
TimeTzADT* timetzArg(int n)
Numeric128Val* numeric128Arg(int n)
Numeric64Val* numeric64Arg(int n)
Numeric32Val* numeric32Arg(int n)
StringArg* stringArg(int n)
int stringArgSize(int n)
int numericArgPrecision(int n)
int numericArgScale(int n)
Logging Methods
You can use the following message logging methods in your UDXs:
bool isLoggingEnabled(int8 val);
int8 getLogMask();
void logMsg(int8 flags, const char* fmt, ...);
The isLoggingEnabled and getLogMask methods are available only in UDXs that use API
Version 2.
NZ_UDX_RETURN_TIMETZ(x)
NZ_UDX_RETURN_NUMERIC32(x)
NZ_UDX_RETURN_NUMERIC64(x)
NZ_UDX_RETURN_NUMERIC128(x)
NZ_UDX_RETURN_FLOAT(x)
NZ_UDX_RETURN_DOUBLE(x)
NZ_UDX_RETURN_INTERVAL(x)
NZ_UDX_RETURN_INT64(x)
NZ_UDX_RETURN_INT32(x)
NZ_UDX_RETURN_INT16(x)
NZ_UDX_RETURN_INT8(x)
NZ_UDX_RETURN_TIMESTAMP(x)
You can use the following methods to obtain information from UdxEnvironmentEntry :
const char* getKey();
const char* getValue();
The following methods operate on the UdxOutputShaper object to define an output column:
void addOutputColumn(int nType, const char* strName, int nSize);
void addOutputColumn(int nType, const char* strName, int precision,
int scale);
void addOutputColumn(int nType, const char* strName);
The following methods operate on the UdxOutputShaper object to obtain information about
the return columns and system casing:
int numOutputColumns();
const UdxColumnInfo* getOutputColumn(int n);
bool isSystemCaseUpper();
You can use the following methods on the UdxColumnInfo object to obtain information
about a column:
int getType();
int getSize();
int getPrecision();
int getScale();
const char* getName();
You can use the following methods on the shaper object to get input arguments and input
metadata. Many of these methods work the same way as the standard UDX Arguments
methods.
int numArgs();
int argType(int n);
int stringArgSize(int n);
int numericArgPrecision(int n);
int numericArgScale(int n);
bool isArgConst(int n);
bool isArgNull(int n);
Numeric32Val* numeric32Arg(int n);
Numeric64Val* numeric64Arg(int n);
Numeric128Val* numeric128Arg(int n);
StringArg* stringArg(int n);
TimeTzADT* timetzArg(int n);
struct Interval* intervalArg(int n);
bool boolArg(int n);
int32 dateArg(int n);
int64 timeArg(int n);
int64 timestampArg(int n);
int8 int8Arg(int n);
For more information on generic UDTFs, see Registering Generic Return Type UDTFs on
page 3-12.
APPENDIX
This appendix describes some advanced development topics that show how user-defined
functions and aggregates can be used with stored procedures and to extend the NZPLSQL
language. For general information about creating and using stored procedures, refer to the
IBM Netezza Stored Procedures Developer's Guide.
<udxinc.h>
<dirent.h>
<sys/types.h>
<string.h>
<errno.h>
Note: The sample dir.cpp file stores several pointers into an int32 field. If the Netezza
operating system changes to a 64-bit version in the future, note that these pointers would
have to switch to use int64 instead.
You can compile and register the three UDFs in the dir.cpp file using the following three
commands or using CREATE OR REPLACE FUNCTION commands:
nzudxcompile dir.cpp --sig "opendir(varchar(any))" --return int4
--class OpenDir --unfenced
nzudxcompile dir.cpp --sig "readdir(int4)" --return "varchar(512)"
--class ReadDir --unfenced
nzudxcompile dir.cpp --sig "closedir(int4)" --return "bool"
--class CloseDir --unfenced
nm varchar(512);
cl bool;
dir varchar(1024);
num int4;
r record;
BEGIN
select count(*) INTO num from _t_object where upper(objname) =
'SORTER' and objclass = 4905 and objdb = current_db;
IF num = 1 THEN
DROP TABLE SORTER;
END IF;
CREATE TABLE SORTER (grp int4, name varchar(2000));
dir := '/tmp/udx_known';
dirp := opendir(dir);
LOOP
nm = readdir(dirp);
exit when nm is null;
EXECUTE IMMEDIATE 'INSERT INTO SORTER VALUES (1, ' ||
quote_literal(nm) || ')';
END LOOP;
FOR r in SELECT name from sorter order by name LOOP
RAISE NOTICE 'got %/%', dir, r.name;
END LOOP;
cl = closedir(dirp);
DROP TABLE SORTER;
RETURN cl;
EXCEPTION WHEN OTHERS THEN
IF dirp is not NULL THEN
cl = closedir(dirp);
RETURN cl;
END IF;
END;
END_PROC;
The sample procedure calls the new UDFs opendir(), readdir(), and closedir() to operate on
a directory named /tmp/udx_known. As an example, if udx_known contains the dir.cpp program and the object files from nzudxcompile, a sample sp_listdirs01() call returns the
following information:
DEV(MYUSER)=> CALL sp_listdirs01();
call sp_listdirs01();
NOTICE: got /tmp/udx_known/.
NOTICE: got /tmp/udx_known/..
NOTICE: got /tmp/udx_known/dir.cpp
NOTICE: got /tmp/udx_known/dir.o_diab_ppc
NOTICE: got /tmp/udx_known/dir.o_ppc
NOTICE: got /tmp/udx_known/dir.o_x86
 SP_LISTDIRS01
---------------
 t
(1 row)
If you attempt to run any of the UDFs OpenDir, ReadDir, or CloseDir on the SPUs or in
DBOS, the Netezza system reports an error similar to the following:
DEV(MYUSER)=> SELECT readdir(grp) FROM customers;
ERROR: readdir only supported in frontend
APPENDIX
This appendix contains some sample user-defined functions and aggregates. In addition to
the examples in this appendix, other samples are available on the Demo page of the NDN
Developers web site at https://developer.netezza.com.
LANGUAGE CPP
PARAMETER STYLE NPSGENERIC
CALLED ON NULL INPUT
NOT DETERMINISTIC
EXTERNAL CLASS NAME 'Concat'
EXTERNAL HOST OBJECT '/tmp/udx_test/UDX_Concat.o_x86'
EXTERNAL SPU OBJECT '/tmp/udx_test/UDX_Concat.o_spu10';
*
* USAGE:
select var_concat('str1','str2');
*
* Copyright (c) 2007-2009 Netezza Corporation, an IBM Company
* All rights reserved.
*/
#include "udxinc.h"
#include <string.h>
using namespace nz::udx;
class Concat: public Udf
{
public:
static Udf* instantiate();
inline bool isValidArgType(int at) const
{
return at==UDX_FIXED||at==UDX_VARIABLE||at==UDX_NATIONAL_FIXED
||at==UDX_NATIONAL_VARIABLE;
}
virtual ReturnValue evaluate()
{
if(numArgs()!=2)
{
throwUdxException("var_concat number of arguments is not 2");
}
if (isArgNull(0))
NZ_UDX_RETURN_NULL();
if (isArgNull(1))
NZ_UDX_RETURN_NULL();
setReturnNull(false);
int argType0=argType(0);
int argType1=argType(1);
if(isValidArgType(argType0)&&isValidArgType(argType1))
{
StringArg *a = stringArg(0);
StringArg *b = stringArg(1);
StringReturn *ret = stringReturnInfo();
ret->size=a->length+b->length;
memcpy(ret->data,a->data, a->length);
memcpy(ret->data+a->length, b->data, b->length);
NZ_UDX_RETURN_STRING(ret);
}
else
{
throwUdxException("Datatype mismatch.");
}
}
virtual uint64 calculateSize() const
{
int argType0=sizerArgType(0);
int argType1=sizerArgType(1);
if(isValidArgType(argType0)&&isValidArgType(argType1))
{
return sizerStringSizeValue(sizerStringArgSize(0)
+sizerStringArgSize(1));
}
else
{
throwUdxException("Datatype mismatch.");
}
}
};
Udf* Concat::instantiate()
{
return new Concat;
}
NZ_UDX_RETURN_STRING(ret);
}
};
Udf* CHexToBin::instantiate()
{
return new CHexToBin;
}
class CBinToHex : public Udf
{
char unconvert(char inp)
{
if (inp <= 9)
return inp + '0';
return inp + 'A' - 10;
}
public:
char *m_pBuf;
CBinToHex()
{
m_pBuf = new char[32000];
}
~CBinToHex()
{
delete[] m_pBuf; // m_pBuf was allocated with new[]
}
static Udf* instantiate();
virtual ReturnValue evaluate()
{
StringReturn* ret = stringReturnInfo();
StringArg *input = stringArg(0);
int numbytes = input->length * 2;
for (int i=0; i < input->length; i++)
{
m_pBuf[i*2] = unconvert(((unsigned char)(input->data[i]) &
0xF0) >> 4);
m_pBuf[i*2+1] = unconvert((unsigned char)(input->data[i]) &
0x0F);
}
ret->size = numbytes;
memcpy(ret->data, m_pBuf, numbytes);
NZ_UDX_RETURN_STRING(ret);
}
};
Udf* CBinToHex::instantiate()
{
return new CBinToHex;
}
#include "udxinc.h"
using namespace nz::udx;
using namespace nz::udx::dthelpers;
static const uint8 WORK_START_HOUR=9;
static const uint8 WORK_END_HOUR=18;
class IsBusinessHours : public Udf
{
public:
static Udf* instantiate();
virtual ReturnValue evaluate()
{
if(isArgNull(0))
NZ_UDX_RETURN_NULL();
int64 ts = timestampArg(0);
if(!isValidTimestamp(ts)) //if this test does not pass, we won't
// be able to decode ts
throwUdxException("invalid timestamp passed");
struct tm decomp;
bool err=false;
decodeTimestamp(ts, &decomp, &err);
if(err) //if isValidTimestamp(ts) is true, err should be false,
//but better safe than sorry
public:
static Uda* instantiate();
void initializeState()
{
StringArg *s = stringState(0);
s->length = 0;
setStateNull(0, false);
}
virtual void accumulate()
{
StringArg *s = stringState(0);
int32 value = int32Arg(0);
if (s->length < MAXCHILDREN * 4)
{
*((int32*)(s->data+s->length)) = value;
s->length = s->length + 4;
}
}
virtual void merge()
{
/* Destination */
StringArg *s = stringState(0);
/* Source */
StringArg *s2 = stringArg(0);
if (s->length + s2->length <= MAXCHILDREN * 4)
{
memcpy(s->data+s->length, s2->data, s2->length);
s->length = s->length + s2->length;
}
}
virtual ReturnValue finalResult()
{
setReturnNull(false);
StringReturn *ret = stringReturnInfo();
StringArg *s = stringArg(0);
printf("got %d\n", s->length);
ret->size = s->length;
memcpy(ret->data,s->data,s->length);
NZ_UDX_RETURN_STRING(ret);
}
};
Uda* CPackChildren::instantiate()
{
return new CPackChildren;
}
REGISTRATION:
CREATE AGGREGATE PENMAX(INT4) RETURNS INT4 STATE (INT4, INT4)
LANGUAGE CPP API VERSION 2 PARAMETER STYLE NPSGENERIC
EXTERNAL CLASS NAME 'CPenMax'
EXTERNAL HOST OBJECT '/home/nz/udx_files/penmax.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/penmax.o_spu10'
Usage:
CREATE
INSERT
INSERT
INSERT
INSERT
INSERT
INSERT
*pCurPenMax = curVal;
}
}
}
}
void CPenMax::merge()
{
int *pCurMax = int32State(0);
bool curMaxNull = isStateNull(0);
int *pCurPenMax = int32State(1);
bool curPenMaxNull = isStateNull(1);
int nextMax = int32Arg(0);
bool nextMaxNull = isArgNull(0);
int nextPenMax = int32Arg(1);
bool nextPenMaxNull = isArgNull(1);
if ( !nextMaxNull ) { // if next max is null, then so is
//next penmax and we do nothing
if ( curMaxNull ) {
setStateNull(0, false); // current max was null,
// so save next max
*pCurMax = nextMax;
} else {
if ( nextMax > *pCurMax ) {
setStateNull(1, false);
// next max is greater than current, so save next
*pCurPenMax = *pCurMax;
// and make current penmax prior current max
*pCurMax = nextMax;
} else if ( curPenMaxNull || nextMax > *pCurPenMax ) {
// next max may be greater than current penmax
setStateNull(1, false); // it is
*pCurPenMax = nextMax;
}
}
if ( !nextPenMaxNull ) {
if ( isStateNull(1) ) {
// can't rely on curPenMaxNull here, might have
// change state var null flag above
setStateNull(1, false); // first non-null penmax,
// save it
*pCurPenMax = nextPenMax;
} else {
if ( nextPenMax > *pCurPenMax ) {
*pCurPenMax = nextPenMax;
// next penmax greater than current, save it
}
}
}
}
}
ReturnValue CPenMax::finalResult()
{
int curPenMax = int32Arg(1);
{
case UDX_FIXED:
case UDX_VARIABLE:
case UDX_NATIONAL_FIXED:
case UDX_NATIONAL_VARIABLE:
{
StringReturn *ret = stringReturnColumn(i);
if (ret->size)
memset(ret->data, ' ', ret->size);
sprintf(temp, "%d", val);
ret->size = strlen(temp);
memcpy(ret->data, temp, strlen(temp));
break;
}
case UDX_BOOL:
*boolReturnColumn(i) = true;
break;
case UDX_DATE:
*dateReturnColumn(i) = val;
break;
case UDX_TIME:
*timeReturnColumn(i) = val;
break;
case UDX_TIMETZ:
{
TimeTzADT *ret = timetzReturnColumn(i);
ret->time = val;
ret->zone = 0;
break;
}
case UDX_NUMERIC32:
{
Numeric32Val *ret = numeric32ReturnColumn(i);
*ret->value = val;
break;
}
case UDX_NUMERIC64:
{
Numeric64Val *ret = numeric64ReturnColumn(i);
*ret->value = val;
break;
}
case UDX_NUMERIC128:
{
Numeric128Val *ret = numeric128ReturnColumn(i);
*ret->value = val;
break;
}
case UDX_FLOAT:
*floatReturnColumn(i) = val * 1.0;
break;
case UDX_DOUBLE:
*doubleReturnColumn(i) = val * 1.0;
break;
case UDX_INTERVAL:
{
APPENDIX G
Notices and Trademarks
This section describes some important notices, trademarks, and compliance information.
Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other
countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service
is not intended to state or imply that only that IBM product, program, or service may be
used. Any functionally equivalent product, program, or service that does not infringe any
IBM intellectual property right may be used instead. However, it is the user's responsibility
to evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in
this document. The furnishing of this document does not grant you any license to these
patents. You can send license inquiries, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785 U.S.A.
For license inquiries regarding double-byte (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to:
IBM World Trade Asia Corporation
Licensing 2-31 Roppongi 3-chome, Minato-ku
Tokyo 106-0032, Japan
The following paragraph does not apply to the United Kingdom or any other country where
such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES
CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY
KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate
programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of
developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are
written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.
Each copy or any portion of these sample programs or any derivative work, must include a
copyright notice as follows:
© (your company name) (year). Portions of this code are derived from IBM Corp. Sample
Programs.
© Copyright IBM Corp. _enter the year or years_.
If you are viewing this information softcopy, the photographs and color illustrations may not
appear.
Trademarks
IBM, the IBM logo, ibm.com and Netezza are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If
these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or
common law trademarks owned by IBM at the time this information was published. Such
trademarks may also be registered or common law trademarks in other countries. A current
list of IBM trademarks is available on the Web at "Copyright and trademark information" at
ibm.com/legal/copytrade.shtml.
Adobe is a registered trademark of Adobe Systems Incorporated in the United States, and/or
other countries.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or
both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
NEC is a registered trademark of NEC Corporation.
UNIX is a registered trademark of The Open Group in the United States and other
countries.
Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United
States, other countries, or both.
Red Hat is a trademark or registered trademark of Red Hat, Inc. in the United States and/or
other countries.
D-CC, D-C++, Diab+, FastJ, pSOS+, SingleStep, Tornado, VxWorks, Wind River, and the
Wind River logo are trademarks, registered trademarks, or service marks of Wind River Systems, Inc. Tornado patent pending.
APC and the APC logo are trademarks or registered trademarks of American Power Conversion Corporation.
Other company, product or service names may be trademarks or service marks of others.
This is a Class A product based on the standard of the Voluntary Control Council for Interference (VCCI). If this equipment is used in a domestic environment, radio interference
may occur, in which case the user may be required to take corrective actions.
Japan Electronics and Information Technology Industries Association (JEITA) Statement
This is electromagnetic wave compatibility equipment for business (Type A). Sellers and
users need to pay attention to it. This is for any areas other than home.
Russia Electromagnetic Interference (EMI) Class A Statement
Index
Symbols
/nz/extensions directory, about 1-8
\da switch, showing UDAs 6-5
\df switch, showing UDFs 6-5
\dl command 6-5
_v_dual view 6-7
_v_dual_dslice view 6-7
A
access permissions 1-5
account permissions 1-5
about 1-5
managing 6-1
addOutputColumn Method 3-13
admin user, permissions 6-1
aggregate function. See user-defined aggregates.
aggregate objects A-2
aggregates
altering B-2
creating or replacing B-13
dropping B-25
allocate function A-7
ALTER AGGREGATE command B-2
ALTER FUNCTION command B-6
ALTER LIBRARY command B-11
ANY keyword 2-9
API versions, about 1-5
automatic load, user-defined shared libraries 5-2
B
backups, Netezza and UDX code 1-10
best practices
registering UDXs with SPUPads A-14
UDX development 6-6
bigint datatype D-6
bintohex UDF example F-3
boolean datatype D-4
built-in
aggregates 1-2
functions 1-1
functions, checking 1-6
byteint datatype D-6
C
C++
files with multiple functions 6-17
functions, support for 6-10
library header files, declaring 2-1, 3-1
objects, aggregates and non-aggregates A-2
calculateShape method 3-11
calculateSize method
definition 2-16
examples 2-10
casting, input values to match signature sizes 2-9
D
datatype
helper API
about 1-7
functions C-1
Netezza C-2
supported D-1
UDX_BOOL D-4
UDX_DATE D-4
UDX_DOUBLE D-5
UDX_FIXED D-2
UDX_FLOAT D-5
UDX_INT16 D-6
UDX_INT32 D-6
UDX_INT64 D-6
UDX_INT8 D-6
UDX_INTERVAL D-5
UDX_NATIONAL_FIXED D-2
UDX_NATIONAL_VARIABLE D-3
UDX_NUMERIC128 D-4
UDX_NUMERIC32 D-4
UDX_NUMERIC64 D-4
UDX_TIME D-4
UDX_TIMESTAMP D-6
UDX_TIMETZ D-4
UDX_VARIABLE D-3
datatype conversion
functions C-12
ignoring values C-5
Date datatype C-2
date datatype D-4
deallocate function A-8
debugging
flags 7-1
hints 7-7
decoded range-checking functions C-9
decodeDate (m/d/y Output) function C-12
decodeDate (struct tm Output) function C-13
decodeDate (time_t Output) function C-12
decodeTime function C-13
decodeTimestamp (mdy h:m:s:m Output) function C-14
decodeTimestamp (struct timeval Output) function C-15
decodeTimestamp (struct tm Output) function C-15
decodeTimestamp (time_t Output) function C-14
decodeTimeTz function C-13
decoding functions C-12
Demo page, NDN web site F-1
dependencies
clearing 5-4
viewing and resolving 6-13
Developer-style data C-1
development test environment 1-4
double precision datatype D-5
downgrade cautions 1-11
DROP AGGREGATE command B-25
DROP FUNCTION command B-27
DROP LIBRARY command B-28
dynamic memory, best practices 6-8
E
encoded range-checking functions C-7
encodeDate (m/d/y Values) function C-15
encodeDate (struct tm Values) function C-16
encodeDate (time_t Values) function C-16
encodeTime function C-17
encodeTimestamp (m/d/y Input Format) function C-18
encodeTimestamp (struct tm Input Format) function C-19
encodeTimestamp (time_t Input Format) function C-18
encodeTimestamp (timeval Input Format) function C-19
encodeTimeTZ function C-17
encoding functions C-15
error checking, in UDXs 6-16
errors, record size exceeded 6-8
examples of UDXs F-1
execution locus, specifying for UDX 6-7
EXTERNAL CLASS NAME, requirements 2-5
F
fenced mode 1-3
fencing, impacts on query performance 1-3
findEntry method 2-8
fixed-point numeric datatypes, conversions C-22
flags, debug 7-1
FOR_SPU compiler code 6-18
fully-qualified names, UDX 6-6
fully-qualified object names, for stored procedures 1-9
functions
altering B-6
creating or replacing B-17
dropping B-27
in table column expressions or views 1-7
multiple in one C++ file 6-17
G
generic return value for UDTFs 3-11
generic UDTFs
ANY return value 3-11
calculateShape() method for UDTF return value 3-11
registering 3-12
generic UDXs
ANY keyword 2-9
calculateSize() method for UDF return value 2-10
input arguments 2-9
registering 2-10
return value for UDFs 2-10
See also user-defined functions, generic.
getCurrentDatasliceId function 6-9
getCurrentHardwareId function 6-9
getCurrentLocus function
description 6-8
using in language-extension UDFs E-1
getCurrentSessionId function 6-9
getCurrentTransaction function 6-9
getCurrentUsername function 6-9
getEntry method 2-8
getKey method 2-8
getName Method 3-15
getNumberDataslices function 6-9
getNumberSpus function 6-10
getNumEntries method 2-7
getOutputColumn Method 3-14
getPad function A-9
getpadcount example A-20
getPrecision Method 3-15
getRootObject function A-9
getScale Method 3-15
getSize Method 3-14
getTotalSize function A-9
getType Method 3-14
getValue method 2-8
global objects 1-4
GRANT ALL command, create permission 6-2
GRANT command
alter permission 6-2
create permission 6-2
drop permission 6-3
execute permission 6-3
unfence permission 6-3
H
helper functions C-1
helper routines
datatype C-1
numerics C-22
temporal C-1
UTF-8 datatypes C-25
hextobin UDF example F-3
I
identifier collisions, avoiding 6-7
IgnoreBuffer C-5
implicit castings, for UDX input values 2-9
L
lateral subquery 3-8
laterally correlated table function, restrictions 3-9
left outer correlation 3-9
LIBC, support on SPUs 6-10
LIBRARY
command B-23
objects 1-5
linker errors, avoiding for common symbols 6-7
locale environment variables 6-10
locale-aware functions 6-10
locus of UDTFs 3-10
log mask, checking settings 7-2
logging
messages from UDXs 7-1
methods D-7
LOGMASK
attribute 7-1
using 7-3
logMsg
example 7-2
facility 7-1
function 7-1
M
macros, return values D-8
manual load, user-defined shared libraries 5-2
MAXIMUM MEMORY
determining 6-17
including SPUPad memory in A-14
memcmp use 2-3
memory
allocating with SPUPad A-1
calculating for SPUPad A-13
freeing SPUPad memory A-13
N
names for UDXs, avoiding built-in function names 6-7
namespace, about version 1 and 2 2-2
nchar datatype D-2
Netezza
datatypes C-2
Developer Network (NDN) 1-3
SQL commands B-1
temporal values, converting 1-7
Web site 1-3
new and delete operators 6-8
newInputRow() method 3-3
nextEoiOutputRow() method 3-4
nextOutputRow() method 3-3
non-aggregate objects A-2
null checking 6-16
numeric datatype D-4
numerics, conversions C-22
numOutputColumns Method 3-13
numSizerArgs method 2-13
nvar_concat UDF example F-1
nvarchar datatype D-3
nz::udx::dthelpers namespace C-2
NZPLSQL language, extending with UDFs E-1
nzsql command
comments for UDXs 6-5
help on UDX 6-5
nzudxcompile
command 6-18
syntax 2-4
UDA example 4-5
UDF example 2-4
nzudxrunharness command 7-5, 7-7
nzudxvalidate command 6-14
O
object files
multiple, compiling 6-17
resolving problems with 6-15
objects, limiting in SPUPad A-4
offsetEpoch function C-21
offsetTime function C-20
offsetTimestamp function C-20
offsetTimeStruct function C-21
CREATE B-23
outputs
ALTER GROUP command B-4, B-9, B-12, B-24
CREATE DATABASE command B-15, B-21
DROP VIEW command B-25, B-27, B-29
SHOW PROCEDURE command B-30, B-32
overloading, functions and aggregates 2-6
P
packChildren UDA example F-7
PAD_DELETE macro A-10
PAD_NEW macro A-9
padcounter example A-20
patches, and UDX code 1-10
PATH SQL session variable 1-9
PenMax example 4-1
permissions
account 1-5
granting
all 6-2
alter permission 6-2
create 6-2
drop permission 6-3
execute permission 6-3
unfence permission 6-3
managing 6-1
revoking
alter permission 6-2
create permission 6-2
drop permission 6-3
execute permission 6-3
unfence permission 6-3
plus switch 6-6
privileges, commands
ALTER AGGREGATE command B-5
ALTER FUNCTION command B-10
ALTER LIBRARY command B-12
CREATE AGGREGATE command B-17
CREATE FUNCTION command B-22
CREATE LIBRARY command B-24
Q
queries, using UDFs in 2-11
query cancellation, using throwUdxException() 6-16
query optimization, and UDXs 6-12
R
range specifier constants C-5
range-checking functions C-7
real datatype D-5
record function. See user-defined functions.
record size exceeded errors 6-8
repeating subquery 3-8
restores, Netezza and UDX code 1-10
return value macros D-8
return value sizer API 2-12
return value sizer methods
calculateSize 2-16
isSizerArgConstant 2-15
numSizerArgs 2-13
sizerArgType 2-13
sizerGetConstantArg 2-15
sizerNumericArgPrecision 2-14
sizerNumericArgScale 2-14
sizerNumericSizeValue 2-15
sizerReturnType 2-12
sizerStringArgSize 2-13
sizerStringSizeValue 2-14
REVOKE command
alter permission 6-3
create permission 6-2
drop permission 6-3
execute permission 6-3
unfence permission 6-3
root object, SPUPad A-3
Root structure A-3
S
setReturnNull, checking for null returns 6-16
setRootObject function A-8
shaper methods 3-11
shared libraries
altering B-11
creating B-23
showing B-33
shell window, writing messages to 7-3
SHOW AGGREGATE command B-30
SHOW FUNCTION command B-32
SHOW LIBRARY command B-33
signature
about 6-1
format 2-6
sizerArgType method 2-13
sizerGetConstantArg method 2-15
sizerNumericArgPrecision method 2-14
sizerNumericArgScale method 2-14
sizerNumericSizeValue method 2-15
sizerReturnType method 2-12
sizerStringArgSize method 2-13
sizerStringSizeValue method 2-14
smallint datatype D-6
SPUPad
about 1-3, A-1
accessing A-5
best practices for registering A-14
calculating memory use A-13
content restrictions A-2
creating A-2, A-4
creating on one SPU versus multiple SPUs A-11
creating on the Netezza host A-12
define content of A-3
examples A-14
freeing memory used by A-13
functions
allocate A-7
deallocate A-8
getPad A-9
getRootObject A-9
getTotalSize A-9
isUserQuery A-10
setRootObject A-8
getpadcount example A-20
limiting number of objects A-4
macros
PAD_DELETE A-10
PAD_NEW A-9
non-virtual destructors A-2
padcounter example A-20
process data in A-5
root object, about A-3
Root structure A-3
running functions A-6
string_pad_get example A-5
stringpad example A-4
transaction restarts A-14
understanding return values A-11
uses A-1
using NOT DETERMINISTIC A-14
Standard C Library (LIBC) support 6-10
standard log files, writing messages to 7-3
stored procedures
about 1-8
fully qualified name of 1-9
PATH session variable 1-9
string sizes, best practices 2-5
string_pad_create
code A-14
example A-4
string_pad_get
code A-17
example A-5
stringpad example A-4
struct tm
conversion restrictions C-4
implementation C-4
structure C-4
symbols, multiple definitions errors 6-7
synonyms, creating for UDXs 6-6
system prerequisites 1-4
T
table function. See user-defined table functions.
table shaper methods 3-13
TABLE WITH FINAL syntax 3-6
tables
resolving references to dropped UDFs 6-15
resolving references to UDFs 6-13
temporal
types C-2
values, about 1-7
test harness
about 7-5
control file 7-10
example 7-5
Time datatype C-2
time datatype D-4
time with time zone datatype D-4
time_t structure C-3
Timestamp datatype C-2
timestamp datatype D-6
TimeTZ datatype C-2
timeval support C-4
U
Uda base class 4-2
UDA. See user-defined aggregates.
Udf base class 2-2
UDF. See user-defined functions.
Udtf base class 3-2
UDTF. See user-defined table functions.
UDX
avoiding built-in names 6-7
commenting on 6-5
compiling 6-18
controlling access to 1-4
creation steps 1-7
cross-database access to 6-6
definition 1-3
environment
about 2-7
class 2-7
methods D-9
error checking 6-16
examples F-1
how to call 1-9
how to plan and create 1-6
installation location best practices 1-8
macros 2-4
migrating version 1 to version 2 6-23
planning steps 1-6
record size exceeded errors 6-8
resolving object file problems 6-15
signature 6-1
UDX_BOOL datatype D-4
UDX_DATE datatype D-4
UDX_DOUBLE datatype D-5
UDX_FIXED datatype D-2
UDX_FLOAT datatype D-5
UDX_INT16 datatype D-6
UDX_INT32 datatype D-6
UDX_INT64 datatype D-6
UDX_INT8 datatype D-6
UDX_INTERVAL datatype D-5
UDX_LOCUS_POSTGRES, example of E-1
UDX_NATIONAL_FIXED datatype D-2
UDX_NATIONAL_VARIABLE datatype D-3
UDX_NUMERIC128 datatypes D-4
UDX_NUMERIC32 datatypes D-4
UDX_NUMERIC64 datatypes D-4
UDX_TIME datatype D-4
UDX_TIMESTAMP datatype D-6
UDX_TIMETZ datatype D-4
UDX_VARIABLE datatype D-3
udx_ver2 namespace 2-2
UdxEnvironmentEntry values 2-7
udxinc.h header file
in user-defined aggregates 4-1
in user-defined functions 2-1
in user-defined table functions 3-1
udxLibraryName function 6-10
UDXs
cross-database access to 1-9
references to user-defined shared libraries 6-13
uncorrelated table function 3-7
Unfence privilege 1-3
V
_v_depend view 6-13
var_concat UDF example F-1
varchar datatype D-3
views
resolving references to dropped UDAs 6-15
resolving references to UDAs 6-13