Netezza User-Defined Functions Developer's Guide


IBM Netezza 6.0.3 and Later

IBM Netezza User-Defined Functions Developer's Guide

20444-5 Rev. 4

Revised: February 14, 2012

Note: Before using this information and the product that it supports, read the information in Notices and Trademarks on
page G-1.

Copyright IBM Corporation 2007, 2012.


US Government Users Restricted Rights: Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM
Corp.

Contents
Preface
1 User-Defined Functions
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1
User-Defined Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1
User-Defined Table Functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2
User-Defined Aggregates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2
User-Defined Shared Libraries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2
UDX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
Fenced and Unfenced Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
SPUPad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
Netezza Developer Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
Important Cautions for User Code on Netezza Systems . . . . . . . . . . . . . . . . . . . . . . . 1-4
Netezza System Prerequisites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-4
UDX Programming Language Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-4
User Account Permissions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5
UDX API Version 1 and 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5
How to Create a UDX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-6
Design a UDX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-6
Review the Existing Built-In Functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-6
Create a UDX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-7
Information for UDX Developers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-7
Registering UDXs in Netezza Databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-7
How to Convert and Use Netezza Temporal Values . . . . . . . . . . . . . . . . . . . . . . . 1-7
UDFs in Table Columns and Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-7
UDX Development and Compilation Environments. . . . . . . . . . . . . . . . . . . . . . . . 1-8
UDX Object File Install Location . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-8
Stored Procedures and UDXs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-8
Information for UDX Users and Netezza Administrators . . . . . . . . . . . . . . . . . . . . . . . 1-9
How to Call a UDX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-9
Cross-Database Access to UDXs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-9
International/Unicode Character Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10
How to Back up and Restore UDX Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10
How to Upgrade and Patch Netezza Systems That Have UDX Code . . . . . . . . . . . 1-10


2 Creating User-Defined Functions


Creating the C++ File for the UDF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1
Compiling the UDF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-4
Registering the UDF with the Netezza System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-4
Dependencies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6
Function and Aggregate Signatures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6
Overloading Functions and Aggregates. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6
UDX Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-7
Understanding Size-Specific, Generic, and Variable Argument UDXs . . . . . . . . . . . . . 2-8
Size-Specific Arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-8
Generic-Size Arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-8
Variable Arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-11
Using the UDF in a SQL Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-11
Altering and Dropping UDFs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-12
Return Value Sizer API. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-12
sizerReturnType Method. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-12
numSizerArgs Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-13
sizerArgType Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-13
sizerStringArgSize Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-13
sizerNumericArgPrecision Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-14
sizerNumericArgScale Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-14
sizerStringSizeValue Method. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-14
sizerNumericSizeValue Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-15
isSizerArgConstant Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-15
sizerGetConstantArg Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-15
calculateSize Method. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-16

3 Creating User-Defined Table Functions


Creating the C++ File for the UDTF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-1
Compiling the UDTF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-4
Registering the UDTF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-4
Using the UDTF in a SQL Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-5
UDTF Invocation Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6
Specifying UDTF Arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-7
Defining the Execution Locus of the UDTF . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-10
Specifying UDTF Arguments and Return Values . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-11
Registering Generic Return Type UDTFs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-12
Altering and Dropping a UDTF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-12
Table Shaper API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-13
calculateShape Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-13
addOutputColumn Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-13
numOutputColumns Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-13
getOutputColumn Method. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-14
isSystemCaseUpper Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-14
getType Method. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-14
getSize Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-14
getPrecision Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-15
getScale Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-15
getName Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-15

4 Creating User-Defined Aggregates


Creating the C++ File for the UDA. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-1
Compiling the UDA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-5
Registering the UDA with the Netezza System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-6
Using the UDA in a SQL Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-6
Altering and Dropping UDAs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-7

5 Creating User-Defined Shared Libraries


Creating a User-Defined Shared Library . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1
Library Loading Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-2
Compiling and Linking the Shared Library . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-2
Registering the Shared Library in a Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3
Using the Shared Library with a UDX. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3
Altering and Dropping Shared Libraries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-4
Clearing Dependencies. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-4

6 Common UDX Development Topics


Managing User Account Permissions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-1
Granting Create Permission . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-2
Granting All Permissions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-2
Revoking Create Permission . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-2
Managing Alter Permission . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-2
Managing Execute Permission. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-3
Managing Drop Permission . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-3
Managing Unfence Permission . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-3
Documenting a UDX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-4
Adding Comments for a UDX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-5
Netezza SQL Command Help . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-5
UDX Development Best Practices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-6
Cross-Database Access to UDXs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-6
Avoiding UDX Name Collisions with Built-In Functions . . . . . . . . . . . . . . . . . . . . 6-7
Specifying the Execution Locus for UDFs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-7
Avoiding UDX Linkage Symbol Collisions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-7
Avoiding Record Size Exceeded Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-8
Managing Dynamic Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-8
Obtaining UDX System Information Programmatically . . . . . . . . . . . . . . . . . . . . . 6-8
Using C Runtime Library Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-10
Netezza Query Optimization and UDX Calls . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-12
Dependency Checks before Dropping UDXs . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-13
Checking for Unreferenced or Invalid UDXs . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-14
Time Zone Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-15
Error Reporting within a UDX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-16
Checking for Nulls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-16
Memory Registration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-17
Compiling Multiple Object Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-17
Managing C++ Files That Contain Multiple Functions . . . . . . . . . . . . . . . . . . . . 6-17
Conditional Compilation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-18
nzudxcompile Command Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-18
Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-18
Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-18
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-22
Usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-22
Migrating UDXs from API Version 1 to API Version 2. . . . . . . . . . . . . . . . . . . . . . . . 6-23

7 Debugging User-Defined Functions and Aggregates


Message Logging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1
logMsg Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1
Checking the Log Mask Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2
Example: Adding logMsg() to the Sample UDF . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2
nzudxdbg Command . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-4
UDX Test Harness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-5
nzudxrunharness Command . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-7
Test Harness Control File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10
Debugging Using UDX Stubs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-15

Appendix A: Creating Memory Workpads Using the SPUPad


Uses of the SPUPad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-1
Content Restrictions for the SPUPad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-2
How to Define and Use a SPUPad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-2
Define the SPUPad Content . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-3
Create a SPUPad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-4
Process Data In a SPUPad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-5
Running the stringpad UDFs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-6
SPUPad-Related API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-7
allocate Function. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-7
deallocate Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-8
setRootObject Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-8
getRootObject Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-9
getTotalSize Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-9
getPad Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-9
PAD_NEW Macro. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-9
PAD_DELETE Macro . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-10
isUserQuery Function. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-10
Special Considerations for UDXs that Have SPUPads . . . . . . . . . . . . . . . . . . . . . . . A-11
Best Practices for UDXs with SPUPads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-11
Tables and SPUPad Operations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-11
Calculating the Memory Use of a SPUPad . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-13
Automatic Memory Cleanup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-13
Transaction Restarts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-14
Best Practices for Registering UDXs that Use SPUPads. . . . . . . . . . . . . . . . . . . . . . A-14
Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-14
string_pad_create.cpp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-14
string_pad_get.cpp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-17
string_pad_size.cpp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-18
padcounter.cpp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-20

Appendix B: Netezza SQL Reference


ALTER AGGREGATE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-2
Synopsis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-2
Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-2
Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-4
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-5
Usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-6
ALTER FUNCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-6
Synopsis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-6
Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-6
Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-9
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-10
Usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-11
ALTER LIBRARY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-11
Synopsis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-11
Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-11
Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-12
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-12
Usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-13
CREATE [OR REPLACE] AGGREGATE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-13
Synopsis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-13
Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-13
Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-15
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-16
Usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-17
CREATE [OR REPLACE] FUNCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-17
Synopsis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-18
Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-18
Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-21
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-22
Usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-22
CREATE [OR REPLACE] LIBRARY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-23
Synopsis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-23
Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-23
Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-24
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-24
Usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-25
DROP AGGREGATE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-25
Synopsis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-25
Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-25
Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-25
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-26
Usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-26
DROP FUNCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-27
Synopsis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-27
Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-27
Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-27
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-28
Usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-28
DROP LIBRARY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-28
Synopsis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-28
Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-29
Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-29
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-29
Usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-30
SHOW AGGREGATE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-30
Synopsis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-30
Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-30
Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-30
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-31
Usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-31
SHOW FUNCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-32
Synopsis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-32
Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-32
Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-32
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-32
Usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-33
SHOW LIBRARY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-33
Synopsis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-34
Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-34
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-34
Usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-34

Appendix C: Datatype Helper API Reference


Temporal Datatype Helper Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-1
Netezza Date/Time Datatype Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . C-2
Using the IgnoreBuffer to Skip Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-5
Range Specifier Constants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-5
Encoded Range-Checking Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-7
Decoded Range-Checking Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-9
Decoder Conversion Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-12
Encoder Conversion Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-15
Miscellaneous Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-19
Numeric Datatype Helper Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-22
convertNumeric32. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-22
convertNumeric64. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-23
convertNumeric128. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-24
CheckPrecision38Limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-24
UTF-8 Datatype Helper Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-25
UTF8CharCount . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-25
isValidUTF8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-25

Appendix D: UDX Datatypes Reference Information


Supported Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-1
char . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-2
nchar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-2
nvarchar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-3
varchar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-3
boolean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-4
date . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-4
time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-4
time with time zone . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-4
numeric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-4
real . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-5
double precision . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-5
interval. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-5
integer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-6
bigint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-6
smallint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-6
byteint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-6
timestamp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-6
UDX Arguments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-7
Logging Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-7
Memory Management Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-7
UDA State Arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-8
UDX Return Value Macros . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-8
UDX Environment Methods. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-9

UDF Sizer Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-9


UDTF Shaper Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-10
UDTF Column Return Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-11

Appendix E: Using UDXs with Stored Procedures


Using UDXs to Extend the NZPLSQL Language. . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-1

Appendix F: Sample User-Defined Functions and Aggregates Reference


Sample User-Defined Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-1
Generic UDF Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-1
Binary to Hexadecimal Converter Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-3
Business Hours Verification Function. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-6
Sample User-Defined Aggregates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-7
PackChildren Aggregate (UDX Version 1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-7
PenMax Example (UDX Version 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-8
Sample User-Defined Table Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-11
Sample UDTF with Generic Return Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-13

Appendix G: Notices and Trademarks


Notices. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . G-1
Trademarks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . G-3
Electronic Emission Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . G-4
Regulatory and Compliance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . G-7

Index


Tables
Table 6-1: Supported LIBC Functions (for Releases Before 5.0) . . . . . 6-11
Table 6-2: nzudxcompile General Options . . . . . 6-18
Table 6-3: nzudxcompile Compile Options . . . . . 6-19
Table 6-4: nzudxcompile General Registration Options . . . . . 6-19
Table 6-5: nzudxcompile API Version 2 Registration Options . . . . . 6-20
Table 6-6: nzudxcompile UDF Registration Options . . . . . 6-21
Table 6-7: nzudxcompile Table Function Registration Options . . . . . 6-21
Table 6-8: nzudxcompile UDA Registration Options . . . . . 6-21
Table 7-1: nzudxdbg Command Arguments . . . . . 7-4
Table 7-2: nzudxrunharness General Options . . . . . 7-7
Table 7-3: nzudxrunharness Input File Options . . . . . 7-8
Table 7-4: nzudxrunharness Random Input Options . . . . . 7-8
Table 7-5: nzudxrunharness Output Options . . . . . 7-8
Table 7-6: nzudxrunharness UDX Options . . . . . 7-9
Table 7-7: nzudxrunharness UDX Override Options . . . . . 7-9
Table 7-8: Control File Parameters . . . . . 7-11
Table B-1: UDX Netezza SQL Commands . . . . . B-1
Table B-2: ALTER AGGREGATE Input . . . . . B-2
Table B-3: ALTER AGGREGATE Output . . . . . B-4
Table B-4: ALTER FUNCTION Input . . . . . B-6
Table B-5: ALTER FUNCTION Output . . . . . B-9
Table B-6: ALTER LIBRARY Input . . . . . B-11
Table B-7: ALTER LIBRARY Output . . . . . B-12
Table B-8: CREATE [OR REPLACE] AGGREGATE Input . . . . . B-13
Table B-9: CREATE [OR REPLACE] AGGREGATE Output . . . . . B-15
Table B-10: CREATE [OR REPLACE] FUNCTION Input . . . . . B-18
Table B-11: CREATE [OR REPLACE] FUNCTION Output . . . . . B-21
Table B-12: CREATE [OR REPLACE] LIBRARY Input . . . . . B-23
Table B-13: CREATE [OR REPLACE] LIBRARY Output . . . . . B-24
Table B-14: DROP AGGREGATE Input . . . . . B-25
Table B-15: DROP AGGREGATE Output . . . . . B-25
Table B-16: DROP FUNCTION Input . . . . . B-27
Table B-17: DROP FUNCTION Output . . . . . B-27
Table B-18: DROP LIBRARY Input . . . . . B-29
Table B-19: DROP LIBRARY Output . . . . . B-29
Table B-20: SHOW AGGREGATE Input . . . . . B-30
Table B-21: SHOW AGGREGATE Output . . . . . B-30
Table B-22: SHOW FUNCTION Input . . . . . B-32
Table B-23: SHOW FUNCTION Output . . . . . B-32
Table B-24: SHOW LIBRARY Input . . . . . B-34
Table C-1: Netezza Temporal Datatypes . . . . . C-2
Table C-2: Examples of time_t and Netezza Date and Timestamp Conversions . . . . . C-3
Table C-3: Examples of struct tm and Netezza Datatype Properties . . . . . C-4
Table C-4: Range Specifier Constants . . . . . C-5

Preface
This guide describes how to create functions, aggregates, and shared libraries, which
increase the analysis and query capabilities of the IBM Netezza data warehouse appliance. You can create these custom objects, add them to the Netezza system, and make
them available for other users to include in their queries.
Note: In previous releases, this feature was called OnStream Functions. In release 6.0.x,
the feature name changed to user-defined functions.
Throughout this guide, the terms user-defined function (UDF) and user-defined
aggregate (UDA) distinguish user-created functions and aggregates from the built-in functions and aggregates that ship with the Netezza software. The term UDX represents
a user-defined object in general. For more information about the terminology and concepts,
refer to Chapter 1, User-Defined Functions.

About This Guide


The IBM Netezza User-Defined Functions Developer's Guide is intended for users who want
to create objects such as functions, aggregates, or libraries to extend the capabilities of
Netezza systems.
The guide is intended for select partners, resellers, and customers who are members of the
Netezza Developer Network (NDN) program. To use this guide, you should be very familiar
with programming in C++, Netezza SQL commands, and Netezza systems.
Topic: Introduction to user-defined functions
See: User-Defined Functions on page 1-1

Topic: Step-by-step process to create a UDF
See: Creating User-Defined Functions on page 2-1

Topic: Step-by-step process to create a UDTF
See: Creating User-Defined Table Functions on page 3-1

Topic: Step-by-step process to create a UDA
See: Creating User-Defined Aggregates on page 4-1

Topic: Step-by-step process to create a user-defined shared library
See: Creating User-Defined Shared Libraries on page 5-1

Topic: General developer topics and best practices
See: Common UDX Development Topics on page 6-1

Topic: Steps to test and debug UDXs
See: Debugging User-Defined Functions and Aggregates on page 7-1

Topic: Using SPUPad to allocate a named, unique area of memory as a temporary storage area and workpad
See: Creating Memory Workpads Using the SPUPad on page A-1

Topic: Detailed descriptions of the Netezza SQL commands for creating, altering, and dropping UDXs
See: Netezza SQL Reference on page B-1

Topic: Descriptions of helper routines that you can use in your UDXs to convert date and time values
See: Datatype Helper API Reference on page C-1

Topic: Datatypes reference information
See: UDX Datatypes Reference Information on page D-1

Topic: Example of extending the NZPLSQL stored procedure language with UDXs
See: Using UDXs with Stored Procedures on page E-1

Topic: Examples of user-defined functions
See: Sample User-Defined Functions and Aggregates Reference on page F-1

Symbols and Conventions


This guide uses the following typographical conventions:

Italics for terms, and user-defined variables such as file names

Upper case for SQL commands; for example INSERT, DELETE

Bold for command line input; for example, nzsystem stop

If You Need Help


If you are having trouble using the Netezza appliance, you should:
1. Retry the action, carefully following the instructions given for that task in the
documentation.
2. Go to the Netezza Knowledge Base at https://knowledge.netezza.com. Enter your support username and password. You can search the knowledge base or the latest updates
to the product documentation. Click Netezza HelpDesk to submit a support request.
3. If you are unable to access the Netezza Knowledge Base, you can also contact Netezza
Support at the following telephone numbers:

North American Toll-Free: +1.877.810.4441

United Kingdom Free-Phone: +0.800.032.8382

International Direct: +1.508.620.2281

Refer to your Netezza maintenance agreement for details about your support plan choices
and coverage.

Comments on the Documentation


We welcome any questions, comments, or suggestions that you have for the IBM Netezza
documentation. Please send us an e-mail message at [email protected]
and include the following information:

The name and version of the manual that you are using

Any comments that you have about the manual

Your name, address, and phone number

We appreciate your comments on the documentation.


CHAPTER 1
User-Defined Functions
What's in this chapter
Introduction
Important Cautions for User Code on Netezza Systems
Netezza System Prerequisites
How to Create a UDX
Information for UDX Developers
Information for UDX Users and Netezza Administrators

This chapter introduces the user-defined functions support in the Netezza environment.
Review this chapter to learn about the key concepts and definitions, as well as the benefits,
prerequisites, and important information.

Introduction
The Netezza user-defined functions feature allows you to create custom functions, aggregates, and shared libraries that run on Netezza systems and perform specific types of
analysis for your business reporting and data queries. They allow you to leverage the
Netezza massively parallel processing (MPP) environment to accelerate analysis of data, as
well as to offer new and unique types of analysis. Because user-defined functions enable
data processing directly on the Netezza system, you can reduce or eliminate data movement to other systems for analysis, which reduces the overall processing time.
Netezza's support for user-defined functions generally follows the model used in the 2003
SQL Standard for SQL-Invoked Routines.

User-Defined Functions
A user-defined function (UDF) is user-supplied code that is executed by the Netezza system
in response to SQL invocation syntax. UDFs provide new types of data analysis actions
which are not currently available with the built-in functions such as upper(), sqrt(), or
length(). A user-defined function is a scalar function; that is, it returns one value.
A UDF invocation may appear anywhere inside a SQL statement where a built-in function
can appear, which includes restrictions (where clauses), join conditions, projections (select
from lists), and HAVING conditions. A UDF can accept zero or more input values but produces one output value. Input values to a UDF can be literals, column references, or
expressions. The data types of inputs and output must be Netezza built-in data types.


User-Defined Table Functions


A user-defined table function (UDTF) is a function that you can invoke in a FROM clause of
a SQL statement. It returns a table shape with columns that have names and types. Unlike
a user-defined function (which is a scalar function), a table function can return zero or
more rows. You can invoke a table function with arguments, including literals and non-literal expressions containing columns from other tables.
You can use table functions for such tasks as expanding data from one row into many rows,
or to produce a summary by combining data from many rows. This can help you to create
custom summaries in table form, as well as to perform unpivot operations such as combining the data from several related tables into a new combined table.
A table function can appear in a SQL clause almost anywhere a table would normally
appear. You invoke the table function using a form similar to TABLE (func-name (args)).
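
The one-row-to-many-rows expansion described above can be pictured with a small standalone C++ sketch. It does not use the Netezza UDX API: the Row struct and the expandRow function are hypothetical names chosen only for this illustration, and a real UDTF would return its rows through the API classes described later in this guide.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical output row shape: a position and a token (not a Netezza type).
struct Row {
    int position;
    std::string token;
};

// Expands one delimited input value into zero or more output rows,
// mirroring how a table function can return many rows for one input row.
std::vector<Row> expandRow(const std::string& input, char delimiter) {
    std::vector<Row> rows;
    if (input.empty()) {
        return rows;  // an empty input yields zero rows
    }
    std::size_t start = 0;
    int position = 1;
    while (true) {
        std::size_t end = input.find(delimiter, start);
        if (end == std::string::npos) {
            // Last (or only) token: take the rest of the string.
            rows.push_back({position, input.substr(start)});
            break;
        }
        rows.push_back({position, input.substr(start, end - start)});
        start = end + 1;
        ++position;
    }
    // A non-empty input always produces at least one row.
    assert(!rows.empty());
    return rows;
}
```

In this sketch, calling expandRow with the value a,b,c and a comma delimiter produces three rows, while an empty string produces none, which matches the "zero or more rows" behavior of a table function.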

User-Defined Aggregates
A user-defined aggregate (UDA) is user-supplied code which implements the various phases
of aggregate evaluation, such as initialization, accumulation, and merging, on the Netezza
system.
UDAs provide new types of aggregation functions that are not currently available with the
built-in aggregates such as count(), sum(), avg(), max(), or min(). UDAs are able to take
multiple arguments, but they are also scalar and produce one output value. UDAs may be
used in a SQL statement anywhere a built-in aggregate may appear as either grand,
grouped, or windowed aggregates.
You can control whether a UDA is allowed in a grouped aggregate query, a window (analytical) aggregate query, or either type, when you define the UDA. If you restrict an
aggregate to grouped aggregates only, for example, the Netezza system does not allow users to include the aggregate in an analytic (windowed)
aggregate query. The restriction may reflect the intended design of the UDA itself, or it could be a performance optimization to control memory impacts on the Netezza system. If an aggregate is defined
as ANY type, then it can be used in either aggregation type. For more information about
window and grouped aggregates and the performance implications of window aggregates,
see the IBM Netezza Database User's Guide.
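
The aggregate phases named above (initialization, accumulation, and merging, followed by a final result) can be sketched in plain C++ as follows. This is a minimal illustration that assumes a "second-largest value" aggregate; TopTwoState and every function name here are hypothetical and are not part of the Netezza UDX API.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical aggregate state: the two largest values seen so far.
struct TopTwoState {
    int largest;
    int second;
};

// Initialization phase: start from an "empty" state.
void initializeState(TopTwoState& state) {
    state.largest = INT_MIN;
    state.second = INT_MIN;
}

// Accumulation phase: fold one input value into the state.
// Duplicate values count separately, so {5, 5} has second-largest 5.
void accumulate(TopTwoState& state, int value) {
    if (value > state.largest) {
        state.second = state.largest;
        state.largest = value;
    } else if (value > state.second) {
        state.second = value;
    }
}

// Merge phase: combine a partial state that was computed elsewhere.
void mergeState(TopTwoState& state, const TopTwoState& partial) {
    accumulate(state, partial.largest);
    accumulate(state, partial.second);
}

// Final phase: produce the single output value of the aggregate.
int finalResult(const TopTwoState& state) {
    assert(state.largest >= state.second);
    return state.second;
}

// Drives the phases the way a parallel system might: each partition is
// accumulated independently, then the partial states are merged.
int secondLargest(const std::vector<std::vector<int> >& partitions) {
    TopTwoState total;
    initializeState(total);
    for (std::size_t p = 0; p < partitions.size(); ++p) {
        TopTwoState partial;
        initializeState(partial);
        for (std::size_t i = 0; i < partitions[p].size(); ++i) {
            accumulate(partial, partitions[p][i]);
        }
        mergeState(total, partial);
    }
    return finalResult(total);
}
```

The merge step is what lets partial results from separate partitions be combined into one answer. In this sketch an input with fewer than two values simply returns the INT_MIN sentinel; a real aggregate would more likely return NULL.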

User-Defined Shared Libraries


In prior releases, you could call a subset of the standard C libraries for common routines and operations without having to define or duplicate those routines within your UDXs.
If a UDX required a custom or open source set of routines or code, you would have to
encode those routines into the UDX C++ file itself. If several UDXs required a routine, you
either had to copy that code to multiple UDX files, or create one combined, larger UDX C++
file that contained all the common routines plus all the UDX definitions.
UDXs now support user-defined shared libraries, which allow users to create their own
libraries of routines for use with their UDXs. User-defined shared libraries are objects in the
Netezza database; users must create, compile, and register their user-defined shared libraries to allow other UDXs to reference them.
User-defined libraries help to keep the UDX sources more compact and easier to maintain.
You do not have to replicate code for common routines across multiple UDXs, nor create
large UDX object files for a family of related UDXs that use similar code or processing. As a
result, compiled UDX objects are smaller, and are often easier to maintain, as changes to


library routines need be made only in the shared library that defines them. In addition,
users who have UDXs that leverage third-party libraries may be able to migrate those routines and libraries more easily to the Netezza system for analysis.

UDX
Throughout this guide, the term UDX is a generic reference to user-defined objects of any
kind, including user-defined functions, aggregates, or shared libraries. The term is also
used in code or commands that operate on these user-defined object types, such as
nzudxcompile.

Fenced and Unfenced Mode


Starting in Netezza Release 6.0.x, UDXs operate in fenced mode, which is a safety feature
that provides a separate process environment for user-defined objects. Fencing allows
developers to create and run their UDXs in a protected address space on the host and snippet processing units (SPUs); it minimizes the impact of an incorrectly designed UDX by
helping to protect against Netezza software or system crashes. The SPUPad feature is not
supported with fenced UDXs.
Fenced mode is the default for the system. Any new UDXs that you create will be fenced
unless you specifically register them as not fenced. (Your database user account must have
the Unfence administrative privilege to create and/or alter UDXs to be unfenced.) If you
upgrade a system from Release 5.0.x to 6.0.x or later, any UDXs on the system continue to
run in unfenced mode, which was the default in the previous releases.
It is important to note that fencing has negative impacts on the performance of a UDX.
After you test and debug your fenced UDXs, you can alter them to run unfenced to improve
the performance of those UDXs and to take advantage of SPUPads if applicable.

SPUPad
The SPUPad feature allows you to reserve temporary areas of memory on the Netezza SPUs
(also known as S-Blades). The SPUPads are typically used to hold data for use with user-defined functions. When the query or transaction block that created the SPUPad finishes,
the Netezza system releases the memory allocated for each SPUPad. SPUPads typically
reside in memory on the Netezza SPUs where the user data tables reside, but they can also
reside on the host if the UDX is operating on external tables or host-based system views.
Note: SPUPads are not supported for UDXs that run in fenced mode.

Netezza Developer Network


The Netezza Developer Network (NDN) program enables members to leverage Netezza's
high-performance, analytic appliance platform to design and develop specialized applications and algorithms for data analysis. The NDN is a growing community of users who focus
on extending the capabilities of the Netezza family; current participants include customers,
technology partners, systems integrators, as well as members of the academic community
who focus on analytical applications.
The NDN support team maintains a wiki server site to assist in the communication and
sharing of best practices, sample code, documentation updates, and frequently asked
questions. For more information about the NDN, visit the Netezza Web site at
http://www.netezza.com for program descriptions and membership information.
Note: This document refers to NDN web sites that require a user account to access.


Important Cautions for User Code on Netezza Systems


UDXs are powerful tools for extending the data analysis capabilities of the Netezza systems.
The introduction of user-defined code also introduces risks to the stability of the Netezza
system, and to the integrity of the data stored within the Netezza. Make certain that only
authorized users have the access permissions required to create and alter UDXs. This helps
you to guard against users introducing unapproved or unauthorized code into the Netezza
environment. User accounts and access permissions are described in detail in the IBM
Netezza System Administrator's Guide; refer to User Account Permissions on page 1-5
for details.
You can use the Netezza access controls and permissions to make sure that only authorized
SQL query users have execute permission for any UDXs. The access permissions help to
lock down functions and aggregates that may be intended for specific purposes or users,
that could impact the performance of the Netezza system, or that may still be undergoing
development or testing.
As a developer of user-defined code, make sure that you carefully test your UDXs to ensure
that they perform in the manner that you expect. Use caution to avoid memory leaks in your
code, as well as uninitialized pointers. Because you have the ability to read, alter, and save
data processed by your UDXs, you could unintentionally change the data on your Netezza
system. Make sure that you review and follow the best practices within this guide, and that
you understand the limitations and recommendations for UDXs.
As a best practice, test your UDXs in fenced mode and within a development environment
before you deploy them to production systems. Create test plans to verify that they operate
correctly. Also as a best practice, run reports to verify data within your test Netezza system
before and after deployment of your UDXs. This helps you to confirm that your functions
did not change or delete data that should not have been affected, and that they are processing the target data correctly.

Netezza System Prerequisites


UDX support became available in Netezza Release 4.5. This revision of the guide describes
updates available in Release 6.0 and later. For a complete description of requirements and
changes for the release, refer to the IBM Netezza Release Notes.

UDX Programming Language Requirements


Your user-defined functions and aggregates must be written in C++. They can use the full
standard C library (LIBC), but they should avoid interprocess communication (IPC) calls
and other low-level operations.
Note: In releases prior to 5.0, the Netezza SPUs used an operating system called Nucleus
which supported a subset of the standard C library. As of Release 5.0, SPUs use a Linux
operating system which supports the standard C library and which provides stronger error
support and control for UDXs.
In this release, the Netezza SPUs do not support the use of global objects (static or otherwise) with constructors or destructors.
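
One way to work within the restriction on global objects with constructors or destructors is to keep shared lookup data in plain-old-data form, which needs no constructor call at load time. The following is a minimal sketch under that assumption; UnitFactor, kUnitFactors, and metersPerUnit are hypothetical names, and you should confirm against the release notes which constructs are acceptable on your release.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Plain-old-data entry: holds only a string literal pointer and a double,
// so no constructor or destructor runs for it (unlike a global std::string
// or std::map).
struct UnitFactor {
    const char* unit;
    double metersPerUnit;
};

// Constant-initialized table; the values are fixed at compile time.
static const UnitFactor kUnitFactors[] = {
    {"inch", 0.0254},
    {"foot", 0.3048},
    {"mile", 1609.344},
};

// Returns the conversion factor for a unit name, or 0.0 if it is unknown.
double metersPerUnit(const char* unit) {
    assert(unit != nullptr);
    const std::size_t count = sizeof(kUnitFactors) / sizeof(kUnitFactors[0]);
    for (std::size_t i = 0; i < count; ++i) {
        if (std::strcmp(kUnitFactors[i].unit, unit) == 0) {
            return kUnitFactors[i].metersPerUnit;
        }
    }
    return 0.0;
}
```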


User Account Permissions


To create user-defined functions, aggregates, and shared libraries, you must be able to log
in as the Netezza admin user account, or your user account must have the respective Create Function, Create Aggregate, and/or Create Library administration privilege.
To change a user-defined function, aggregate, and/or shared library, your account must
have the Alter privilege for the function, aggregate, or library objects or for the specific
UDX.
As a best practice, do not use the admin user account to create and manage user-defined
objects. You should create special user accounts or groups, and give them the necessary
permissions to create, manage, and/or execute user-defined objects. The admin user has
root-like access to all the Netezza databases and objects and special permissions for work
priority. You should restrict access to the admin account and use it only for the critical work
or tasks on the system. This guide uses a sample user account named myuser to perform
the tasks in the guide.
To use a user-defined function or aggregate in your SQL queries, you must be the admin
user, or your user account must have Execute privilege for the function and/or aggregate
objects, or for the specific UDF or UDA.
If the UDF or UDA uses shared libraries, you must have Execute access to the shared libraries. You must be the admin user, or your user account must have Execute privilege for
library objects or for the specific shared library. If your account has Unfence privilege, you
can create or alter a UDX to run in unfenced mode, which can offer improved performance
for the UDX. For more information about user accounts and Netezza SQL commands for
managing privileges, refer to Managing User Account Permissions on page 6-1.
You can use any SQL tool that supports ODBC, JDBC, or OLE DB, as well as nzsql, to create,
modify, and execute Netezza UDXs. This document provides examples that use the nzsql
command line tool. Any SQL tool which can connect to the Netezza database can be used
to run queries that include UDXs.

UDX API Version 1 and 2


For releases before Netezza Release 6.0.x, the UDXs used API version 1 functionality.
Release 6.0.x introduces API version 2, which adds support for features such as user-defined table functions and environment values. The APIs have different namespaces: API
version 1 uses the nz::udx namespace, and API version 2 uses the nz::udx_ver2
namespace. The different namespaces allow older UDXs to run on the newer Netezza
releases without code changes. If you want to migrate a version 1 UDX to version 2 functionality, you must change the C++ code for the UDX to convert it to a version 2 file. For
more information, see Migrating UDXs from API Version 1 to API Version 2 on page 6-23.
Release 6.0.x supports both UDX version 1 and 2 objects. By default, when you create a
UDX, the UDX will use API version 1 support. If you include version 2 features such as
ENVIRONMENT in the UDX definition, you must also specify API VERSION 2 in the UDX
definition.
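
The role of the two namespaces can be illustrated with a standalone C++ sketch. The namespaces demo_udx and demo_udx_ver2 below are hypothetical stand-ins for nz::udx and nz::udx_ver2, and the Udf structs are invented for the example; the real API classes are not reproduced here.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the version 1 API namespace.
namespace demo_udx {
struct Udf {
    std::string describe() const { return "version 1"; }
};
}  // namespace demo_udx

// Hypothetical stand-in for the version 2 API namespace.
namespace demo_udx_ver2 {
struct Udf {
    std::string describe() const { return "version 2"; }
    // A version-2-only capability, standing in for features such as
    // environment values.
    bool supportsEnvironment() const { return true; }
};
}  // namespace demo_udx_ver2

// Older code written against version 1 keeps compiling unchanged,
// because the version 1 names still resolve in their own namespace.
std::string legacyFunction() {
    using namespace demo_udx;
    Udf udf;
    return udf.describe();
}

// Migrated code opts in to version 2 by naming the other namespace.
std::string migratedFunction() {
    using namespace demo_udx_ver2;
    Udf udf;
    assert(udf.supportsEnvironment());
    return udf.describe();
}
```

Because both namespaces define a type with the same name, the only change at the call site in this sketch is the namespace that the code names. A real migration may require further code changes.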
If you create UDXs that take advantage of UDX version 2 features, those UDXs will not run
in an environment that supports only UDX version 1. If you attempt to downgrade from
Release 6.0.x to 5.0.x, the downgrade process returns errors if it detects UDX version 2
features; you must drop the UDX version 2 features before you can downgrade.


If you want to create UDXs that can be used in both 5.0.x and 6.0.x environments, you can
define your objects with API VERSION 1. If you include version-2-only features, the
CREATE OR REPLACE command returns an error when you try to add the UDX as an
object in a database.

How to Create a UDX


The process to create a user-defined function or aggregate is very straightforward; typically,
the time-consuming work lies within the design of the UDX, the coding and debugging of
the program, and operational testing. This section describes the process with the following
high-level steps:

Design a UDX

Review the Existing Built-In Functions

Create a UDX

Design a UDX
The first step in designing a UDX is to identify the type of action that you need the user-defined function or aggregate to perform. For example, you might want to implement functions that perform tasks such as specialized string operations or comparisons; custom
mathematical analysis; or conversions such as metric to English measurements, Celsius to
Fahrenheit, or currency conversions. In addition, it is important to identify any user-defined
shared libraries that your UDXs may require or that you could develop to create more efficient UDXs that share common routines.
Before the user-defined functions feature, conversion and analysis tasks might have
required you to export data from the Netezza system to another host server to perform the
processing and then load the converted data back into the Netezza system for storage. With
user-defined functions, you may be able to perform many or all of these analytical steps
directly on the Netezza system.
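
The conversion tasks mentioned above are typical scalar logic: one input value in, one output value out. The following standalone C++ sketch shows only the core arithmetic that such a function would wrap; the function names are hypothetical, and the Netezza API wrapper that a real UDF requires is omitted.

```cpp
#include <cassert>

// Celsius to Fahrenheit: one input value in, one output value out.
double celsiusToFahrenheit(double celsius) {
    // Temperatures below absolute zero are treated as a caller error here.
    assert(celsius >= -273.15);
    return celsius * 9.0 / 5.0 + 32.0;
}

// Metric-to-English example: kilometers to statute miles
// (1 mile = 1.609344 kilometers).
double kilometersToMiles(double kilometers) {
    return kilometers / 1.609344;
}
```

Keeping the arithmetic in small helper functions like these makes it easier to unit test the logic on its own before wrapping it in a UDF.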

Review the Existing Built-In Functions


The second step in planning a UDX is to confirm that your function or aggregate is not
already available as a built-in to Netezza SQL. Netezza SQL offers a wide set of string,
mathematical, analytical, and conversion functions. For example, if you want a function
that changes the letter case of a string from uppercase to lowercase, or vice versa, Netezza
already provides built-in LOWER() and UPPER() functions for that conversion.
Also, it is highly recommended that you never create a UDX that has the same name, but
different letter-casing, as a built-in function. That is, do not create a UDX named upper()
because the system has a built-in function UPPER(). Although you are permitted to do this
using delimited (quoted) identifiers, the practice often results in confusion for query users.
It could also result in identifier collisions if the Netezza administrator changes the default
system casing to lowercase, for example. Review the IBM Netezza Database User's Guide
for a list of existing built-in functions and names.


Create a UDX
The following process outlines the steps to create a user-defined function or aggregate:
1. Write C++ code that implements the necessary class methods.
2. Compile the C++ program using nzudxcompile to create object files that can be registered with the Netezza system.
3. Use Netezza SQL CREATE commands to register the UDX as an object in the Netezza
system.
4. Debug the C++ program to look for and resolve any errors in the processing. Netezza
provides a test harness to assist you with debugging.
5. Test the UDX on a development system to confirm that it performs as designed.
6. Deploy the UDX to one or more production Netezza systems.
7. Give users permission to execute the user-defined function or aggregate in queries, and
possibly alter the UDX to run in unfenced mode to improve performance.
Chapters 2 through 6 describe the steps to create UDXs, including requirements for the
code, best practices for testing, and some limitations and restrictions for UDXs.

Information for UDX Developers


The following sections describe some additional concepts and implementation practices for
UDX developers.

Registering UDXs in Netezza Databases


When you register a UDX using the CREATE [OR REPLACE] [FUNCTION | AGGREGATE |
LIBRARY] command, the command adds the UDX to a specific Netezza database. (This is
usually the database to which you are connected or logged in via the nzsql command, for
example.) You can register the UDXs to more than one database, but the UDXs are accessible only to queries within that database unless you use cross-database access methods to
run the UDXs. For more information, see Cross-Database Access to UDXs on page 1-9.

How to Convert and Use Netezza Temporal Values


Netezza uses a special encoding format to store date, time, and interval values in the user
data tables. Your user-defined functions and aggregates may need to parse or compute
these temporal fields, so Netezza offers a datatype helper API that you can use within your
C++ source files. Appendix C, Datatype Helper API Reference, describes the datatype
helper API routines that you can use; the sample UDX code in this guide also includes
examples that use these API routines.
In addition to the date/time conversion functions, there are several numeric conversion
functions that are also available. These functions are also described in Appendix C.

UDFs in Table Columns and Views


If you include a DETERMINISTIC UDF in a table column default expression or a view, note
that Netezza stores the results of the function (not the function itself). If the function
changes, the Netezza system does not update the table or view with the new result.


UDX Development and Compilation Environments


To compile the source object files for the UDX code, Netezza requires you to use the
nzudxcompile command, which runs only on a Netezza system. Netezza does not recommend or support the use of other third-party C++ compilers on generic Linux systems. As a
best practice, develop your UDXs and compile them on Netezza development system environments first. You can then copy the compiled object files to production Netezza systems
to register them for use in queries.
The nzudxcompile command may not be practical if you are migrating third-party libraries
that use a complex existing build system. In these special cases, you may need to invoke the
Netezza compiler directly and specify your own compilation flags.
To obtain the path to the compiler, use the following command:
nzudxcompile -print-compiler

Use the compiler and specify compatible flags, which can include some of the following:

-shared

-Wa,--32

-fPIC

-fexceptions -fsigned-char

-Wno-invalid-offsetof

Some of these flags may be valid only for C++ or C libraries. Use the -shared flag only when
linking the shared library and the -fPIC flag only when creating object files. The -Wa,--32
flag produces 32-bit object files instead of 64-bit.

UDX Object File Install Location


If you create a package of UDXs, there is a recommended framework for the installation
location and practices. For example, it is highly recommended that you install your UDX
software kit to a new directory under the /nz directory named
/nz/extensions/company-name/product-name/version, where company-name and product-name are unique identifier
strings for your company and UDX capability set, and version is a unique version number
for the UDX set.
As a best practice, you should create a setup script that would create the subdirectory for
your UDX set and save the object files for your UDXs as well as any necessary readme files
or documentation. You could also use scripts to call the SQL commands to create your
UDXs automatically, as well as possibly to remove them in the event that you wish to
remove the UDXs from the Netezza system.

Stored Procedures and UDXs


Starting in Netezza Release 4.6, the Netezza system supports stored procedures, which are
applications that you can define as objects on the Netezza host. Stored procedures combine the benefits of SQL to query and manipulate database information with the benefits of
a procedural programming language (called NZPLSQL) to handle data processing, transaction logic, and application branching behaviors. For more information about stored
procedures and the NZPLSQL language, see the IBM Netezza Stored Procedures Developers Guide.


Stored procedures can be designed to call UDXs in the same way that they can call built-in
functions. You can also use UDXs to perform tasks such as extending the NZPLSQL language.
These UDFs must be invoked using SQL that is designed to run only on the Netezza host
inside of Postgres. For more information about these capabilities, refer to Appendix E,
Using UDXs with Stored Procedures.

Information for UDX Users and Netezza Administrators


The following sections describe some additional concepts and implementation practices for
UDX users and Netezza administrators who manage systems that include UDX code.

How to Call a UDX


Users who have the appropriate permissions to run user-defined functions and aggregates
on their Netezza system can include them in their Netezza SQL queries in the same way
that they would use any of the Netezza SQL built-in functions or aggregates. This guide
provides several examples of how to call the sample functions. For a complete description
of how to use the built-in functions, refer to the IBM Netezza Database Users Guide.

Cross-Database Access to UDXs


Typically, most users are logged in to the database that contains the UDX that they plan to
run. However, the Netezza system allows for cross-database access of UDXs using two
methods:

Using fully-qualified object names when calling a UDX object that resides within a different database, for example:
MYDB(MYUSER)=> SELECT * FROM customers WHERE
OTHERDB..CustomerName(b) = 1;

Using the PATH SQL session variable to specify the databases to search to find the
UDX. To use the PATH session variable, you enter a command similar to the following
at the nzsql command prompt:
MYDB(MYUSER)=> SET PATH = <elem> [, <elem>];

The <elem> value can be a database name or the variables CURRENT_CATALOG,
CURRENT_USER, CURRENT_SCHEMA or CURRENT_PATH. (Anything you specify as
<elem> must resolve to a database name.)
For example:
MYDB(MYUSER)=> SET PATH = mydb, nzdb, customer;
SET VARIABLE

To display the PATH value, use the following command:
MYDB(MYUSER)=> SELECT CURRENT_PATH;
    CURRENT_PATH
--------------------
 MYDB,NZDB,CUSTOMER
(1 row)

The Netezza system uses this variable during the lookup of any unqualified UDXs. It
searches the current database if PATH is not set; otherwise it searches the databases
specified in PATH, in the order that they are specified. The Netezza system uses the
first match (or potential match) it finds, even if a better match might exist in a subsequent database. A poorer match is one that might require implicit casting of arguments


or that causes an error due to multiple potential matches. Note that PATH searches
databases, not schemas, as there is no schema support for this capability. Also, the
PATH session variable supports only UDFs, UDAs, and stored procedures (which are
described in the IBM Netezza Stored Procedures Developers Guide). Other object
types are not supported.
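The first-match rule described above can be illustrated with a small standalone C++ sketch. This is only a model of the documented behavior, not Netezza code: the Catalog type and the resolveUdx function are hypothetical names, and the sketch matches on the UDX name alone, ignoring argument types and implicit casting.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: database name -> names of the UDXs registered in it.
typedef std::map<std::string, std::set<std::string> > Catalog;

// Returns the first database, in search order, that defines udxName,
// or an empty string if none does. When the path is empty, only the
// current database is searched, mirroring the "PATH is not set" case.
std::string resolveUdx(const Catalog& catalog,
                       const std::vector<std::string>& path,
                       const std::string& currentDb,
                       const std::string& udxName)
{
    assert(!udxName.empty());  // precondition: a name to look up

    std::vector<std::string> order = path;
    if (order.empty())
        order.push_back(currentDb);

    for (std::size_t i = 0; i < order.size(); ++i) {
        Catalog::const_iterator db = catalog.find(order[i]);
        if (db != catalog.end() && db->second.count(udxName) > 0)
            return order[i];  // first match wins; later databases are not checked
    }
    return "";
}
```

In this model, if both NZDB and CUSTOMER define the same UDX name and PATH is mydb, nzdb, customer, the lookup stops at NZDB even when the definition in CUSTOMER would fit the call better.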

International/Unicode Character Support


UDXs can be named using any of the supported characters for SQL identifier names, as
described in the IBM Netezza Database Users Guide. UDFs support NCHAR and NVARCHAR argument and return types. UDAs support NCHAR and NVARCHAR argument, state,
and return types.

How to Back up and Restore UDX Code


As a best practice, you should keep backup copies of your source C++ programs in a safe
location outside of the Netezza system. Make sure that you have recent backups of your
Netezza systems in the event that you need to recover from an accidental change to your
data, or to restore Netezza services as part of a disaster recovery situation.
There are no special requirements or procedures needed to back up UDX objects on a
Netezza system. After you register UDFs, UDAs, or user-defined shared libraries with a
Netezza system, they (and their associated compiled objects) are backed up during the normal Netezza nzbackup operations.
After a UDX is altered, the next incremental backup also captures the object files for the
altered UDX. Backup ignores the object files for any unchanged UDXs to improve the
backup performance. If you later attempt a -schema-only restore on an increment that does
not have the UDX object files (because they had not been altered during this time), the
restore process creates zero-length placeholder object files for those UDXs and logs the
signatures of the incomplete UDXs in the restoresvr log file. The resulting UDXs are defined
in the database, but they cannot be executed because their object files have not been
restored. You must use CREATE OR REPLACE commands to update the UDXs with their
necessary object files. For a -schema-only restore, you can use the
nzrestore -allincs argument, which restores the object files from all available increments so
that any referenced UDXs will be created and executable following the restore.

How to Upgrade and Patch Netezza Systems That Have UDX Code
After you and other permitted users register UDX code with your Netezza system, there are
no special requirements or procedures necessary to preserve those user-defined objects
during a service pack update or an upgrade to a new release. In most cases, the objects continue to operate in the same manner on the newly updated or upgraded system as on the
previous release.
Note: If the UDX base class changes in the new release, the older object files will no longer
work. You must recompile your object files from the C++ sources and use the CREATE OR
REPLACE [FUNCTION | AGGREGATE] commands to update the object files in your Netezza
system. If you obtained your UDFs or UDAs from a third-party resource, you will need to
obtain updated objects that have been recompiled for the new release before you use the
CREATE OR REPLACE command. Be sure to review the IBM Netezza Release Notes to see
if the UDX base class changed or if there are special compilation issues for the Netezza
release.


If the new release or service pack introduces any new features or changes that could affect
the operation of UDXs, Netezza will describe the changes in the release notes for the service pack or the release. Before you install any new release or service pack, you should
carefully review the release notes to familiarize yourself with any new features, changes,
fixes, and known issues for that release. After you upgrade, if the later release has a new
base class, you may need to recompile your UDXs to replace the object files with versions
that support the new features.
If you downgrade your Netezza release, note that downgrades could result in a loss of support for features that are available in the later release. If you downgrade to a release that
supports only UDX version 1, your UDX version 1 code should continue to work following
the downgrade; however, UDX version 2 objects will not work and must be dropped. If the
earlier release uses a different base class, you may need to recompile your UDXs and/or
obtain the objects that were compiled on the earlier base class.
As a best practice, make sure that you have recent backups of your Netezza system, which
will also include any UDX code registered with the Netezza system. In the event of a problem or failure situation during the upgrade, the backups provide you with the ability to
restore the system to the point of the backup image.


CHAPTER 2
Creating User-Defined Functions
What's in this chapter
Creating the C++ File for the UDF
Compiling the UDF
Registering the UDF with the Netezza System
UDX Environment
Understanding Size-Specific, Generic, and Variable Argument UDXs
Using the UDF in a SQL Query
Altering and Dropping UDFs
Return Value Sizer API

This chapter describes the steps to create a scalar user-defined function (UDF) and to
register it for use on a Netezza system.

Creating the C++ File for the UDF


To begin, use any text editor to create your C++ file. The file name must have a .cpp extension. You might want to create a new directory such as /home/nz/udx_files as your area for
UDX code files.
Your C++ file must include the udxinc.h header file, which contains the required declarations for user-defined functions and processing on the Netezza SPUs.
#include "udxinc.h"

In addition, make sure that you declare any of the standard C++ library header files that
your function may require. If your UDF requires any user-defined shared libraries, make
sure you note the name of the libraries as you will need them when you register the UDF in
the database. For example:
#include "udxinc.h"
#include <string.h>

Note: User-defined shared libraries must exist in the database before you can register the
UDF and specify those libraries as dependencies. You could register the UDF without specifying any library dependencies, and after the libraries are added, use the ALTER
FUNCTION command to update the UDF definition with the correct dependencies. For
more information about user-defined shared libraries, see Creating a User-Defined Shared
Library on page 5-1.


The UDX classes and functions for API version 2 are defined in a namespace called
nz::udx_ver2. (The API version 1 UDXs use the nz::udx namespace.) Your C++ program
must reference the correct namespace. For example:
#include "udxinc.h"
#include <string.h>
using namespace nz::udx_ver2;

Note: This chapter uses udx_ver2 as the default namespace for the examples that follow.
The sections note the differences with UDX version 1, and Appendix F, Sample User-Defined
Functions and Aggregates Reference, contains examples of version 1 and version
2 definitions. You can continue to create UDX version 1 functions as well as new version 2
functions; both will operate on Release 6.0.x systems. However, the version 1 functions
also work on Netezza Release 5.0.x and later systems and thus may be more portable across your
Netezza systems.
To implement a UDF, you create a new class object derived from the Udf base class. Continuing the customername example:
#include "udxinc.h"
#include <string.h>
using namespace nz::udx_ver2;
class CCustomerName: public nz::udx_ver2::Udf
{
public:
};

Each UDF must implement the following two methods:


instantiate() is called by the runtime engine to create the object dynamically. The static
implementation must be outside the class definition. In UDX version 2, the instantiate
method takes one argument (UdxInit *pInit), which enables access to the memory
specification, the log setting, and the UDX environment (see UDX Environment on
page 2-7). The constructor must take a UdxInit object as well and pass it to the base
class constructor. The instantiate method creates a new object of the derived class type
using the new operator and returns it (as base class type Udf) to the runtime engine.
The runtime engine will delete the object when it is no longer needed. An example of
the instantiate method for API version 2 follows:
#include "udxinc.h"
#include <string.h>
using namespace nz::udx_ver2;
class CCustomerName: public nz::udx_ver2::Udf
{
public:
CCustomerName(UdxInit *pInit) : Udf(pInit)
{
}
static nz::udx_ver2::Udf* instantiate(UdxInit *pInit);
};
nz::udx_ver2::Udf* CCustomerName::instantiate(UdxInit *pInit)
{
return new CCustomerName(pInit);
}


evaluate() is the method called once for each row of data during execution.
#include "udxinc.h"
#include <string.h>
using namespace nz::udx_ver2;
class CCustomerName: public nz::udx_ver2::Udf
{
public:
CCustomerName(UdxInit *pInit) : Udf(pInit)
{
}
static nz::udx_ver2::Udf* instantiate(UdxInit *pInit);
virtual nz::udx_ver2::ReturnValue evaluate()
{
// Code to be inserted here
}
};
nz::udx_ver2::Udf* CCustomerName::instantiate(UdxInit *pInit)
{
return new CCustomerName(pInit);
}

You can implement constructors and destructors as necessary. In API Version 1, constructors are optional. In API version 2, constructors are required; you must specify a
constructor even if it only invokes the base class constructor, as in the previous example.
Some common things to include in constructors are memory reservation routines (since
new and delete are relatively expensive), and setting up any structures needed for computation (for example, setting up a matrix for encryption routines).
For the customername example, the UDF takes a string and returns the integer 1 if the
string starts with Customer A; otherwise, it returns the integer 0. The code for the evaluate method follows:
virtual nz::udx_ver2::ReturnValue evaluate()
{
    StringArg *str;
    str = stringArg(0);                          // 4
    int lengths = str->length;                   // 5
    char *datas = str->data;                     // 6
    int32 retval = 0;
    if (lengths >= 10)
        if (memcmp("Customer A",datas,10) == 0)
            retval = 1;
    NZ_UDX_RETURN_INT32(retval);                 // 11
}

In the sample program, line 4 retrieves the argument passed to the UDF as a StringArg
structure. Arguments to UDFs are retrieved using functions such as stringArg(). If this UDF
took a second string argument, the second argument would be referenced by stringArg(1).
For a complete list of argument types supported and their associated helper functions, refer
to Appendix C, Datatype Helper API Reference. Lines 5 and 6 extract the length and
character pointer (char*) from the StringArg structure.
Note: The sample program uses memcmp (not strcmp) because StringArg structures are
not null-terminated (\0) in user-defined functions or aggregates. Therefore, strcmp, strcpy,
strlen, atol, and other functions which depend on the presence of a null terminator will not


work. If you need to use those functions, you must copy the string into a buffer and manually append the null terminator.
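As a minimal sketch of the copy-and-terminate step mentioned in the note, the following standalone helper copies a length-delimited string into a caller-supplied buffer and appends the null terminator. The helper name copyWithTerminator and its parameters are illustrative assumptions and are not part of the UDX API; inside a UDF, the data pointer and length would come from the StringArg structure.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: copies at most cap-1 bytes of a string that is NOT
// null-terminated into buf, then appends '\0'. Returns the number of bytes
// copied (input longer than the buffer is truncated).
int copyWithTerminator(const char* data, int length, char* buf, int cap)
{
    assert(buf != NULL && cap > 0);  // precondition: room for the terminator

    int n = (length < cap - 1) ? length : cap - 1;
    if (n < 0)
        n = 0;                       // treat a negative length as empty
    std::memcpy(buf, data, n);
    buf[n] = '\0';                   // now safe for strlen, strcmp, atol, ...
    return n;
}
```

Using a fixed buffer that is reserved once (for example, as a class member) avoids a new and delete on every row; the trade-off is that input longer than the buffer is truncated, so the buffer should be sized to the declared argument length.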
Line 11 uses a UDX macro to return the computed value to the Netezza engine. The
NZ_UDX_RETURN_INT32 macro helps to confirm that the return value is of the expected type.
For a list of the available return macros, refer to UDX Return Value Macros on page D-8.
To use the function, you must compile and register it as an available function on the
Netezza system.

Compiling the UDF


After you create your C++ file for your new UDF, compile the C++ file using the nzudxcompile command. The command is located in the /nz/kit/bin/adm directory. The compilation
process creates the object files that will run on the Netezza host as well as on the Netezza
SPUs.

To compile the customername.cpp file and create the output object files:
nzudxcompile /home/nz/udx_files/customername.cpp

The nzudxcompile command creates the following object files:

customername.o_x86 is the object file for the Netezza host (i386 Linux platform on
x86).

customername.o_spu10 is the object file for the Linux-based Rev10 SPUs on IBM
Netezza 1000 and Netezza 100 models (formerly called Netezza TwinFin and Skimmer
systems).

For a complete description of the nzudxcompile command and its options and usage, refer
to nzudxcompile Command Syntax on page 6-18.
After you create the compiled object files, you must register the UDF with the Netezza system. You can register a user-defined function using the CREATE FUNCTION command, as
described in the next section.
Note: Optionally, you can also compile and register a UDF in one step using the nzudxcompile command. For example, to compile the customername C++ file and also register it in a
sample database called mydb:
nzudxcompile customername.cpp -o customername.o
-sig "CustomerName(varchar(64000))" --version 2 -return INT4
-class CCustomerName -user myuser -pw password -db mydb
In this example, the quotes are required to ensure that the shell properly handles the
parentheses characters. This example also shows that you must include the --version 2
syntax when you are using the command to compile and register an API version 2 UDF.

Registering the UDF with the Netezza System


To register a UDF, you use the Netezza SQL command CREATE FUNCTION. For a complete
description of the CREATE FUNCTION command, refer to CREATE [OR REPLACE] FUNCTION on page B-17.
Note: When you issue a CREATE FUNCTION command, the database processes the HOST
OBJECT and the SPU OBJECT files as the nz user. The nz user must have read access to


the object files and read and execute access to every directory in the path from the root to
the object file.
For example, to register the sample function customername to the Netezza system, start an
nzsql session to your database (which is named mydb in this example):
nzsql mydb myuser password

Next, use the CREATE FUNCTION Netezza SQL command to register the function:
MYDB(MYUSER)=> CREATE FUNCTION CustomerName(varchar(64000))
RETURNS int4 LANGUAGE CPP PARAMETER STYLE NPSGENERIC API VERSION 2
EXTERNAL CLASS NAME 'CCustomerName'
EXTERNAL HOST OBJECT '/home/nz/udx_files/customername.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/customername.o_spu10';

If the command is successful, it returns the message CREATE FUNCTION. It creates the
function in the mydb database, and the function is owned by myuser. To create a function,
your user account must have Create Function administration permission or you must be
logged in as the admin user. For a description of the required privileges for these commands, refer to Managing User Account Permissions on page 6-1.
When you register a UDF with the Netezza system, the specified object files are copied into
the Netezza database directories. This allows the functions to be used in queries by all permitted users, and it also ensures that the UDFs are backed up and restored with the user
data in the database. If you change the C++ program for any reason (such as adding debug
messages or changing the operation of the function), you must re-compile the program and
re-run the CREATE OR REPLACE FUNCTION command to copy the updated object files
into the Netezza database.
Note the following characteristics of the CREATE FUNCTION command:


If you use the command CREATE FUNCTION, instead of CREATE OR REPLACE FUNCTION, the command will fail if a user-defined function with the same name and
signature already exists in the database.

You can create multiple UDFs that use the same name, but they must have different
signatures if they reside in the same database. The name must meet the character
restrictions for a legal Netezza SQL keyword or identifier, and it does not have to match
or relate to anything defined in the C++ file (that is, the name is not used for binding).

The value you specify for EXTERNAL CLASS NAME must match the class in the C++
file exactly, as this is how the runtime engine creates and calls the UDF object method.

The command will fail if the DEPENDENCIES argument references the name of a user-defined shared library which is not defined in the current database.

For string arguments, use caution in choosing a string size. In general you should follow these best practices for strings:

If the string input is naturally bounded, specify a string size that matches the largest string needed. For the customername example, varchar(10) is sufficient.

If the string input length could vary widely, use generic size arguments. For more
information, see Generic Arguments in the UDX Signature on page 2-9.

There can be a performance penalty for specifying a large string when the input
passed is a CHAR/NCHAR type and the argument is specified as VARCHAR/NVARCHAR. In this case, the argument will be implicitly converted to the variable-sized
argument, including all of the trailing spaces.


Dependencies
The following command defines a UDF named myfunc that depends upon a user-defined
shared library named mylib:
MYDB(MYUSER)=> CREATE FUNCTION myfunc(int)
RETURNS int4 LANGUAGE CPP PARAMETER STYLE NPSGENERIC
EXTERNAL CLASS NAME 'CMyFunc' DEPENDENCIES mylib
EXTERNAL HOST OBJECT '/home/nz/udx_files/myfunc.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/myfunc.o_spu';

Function and Aggregate Signatures


Each user-defined function and aggregate has a signature; that is, a unique identification
that is of the form <udx-name>(<argument type name list>). Signatures must be unique
within the same database, and they cannot duplicate the signature of another UDF, UDA, or
a built-in function or aggregate. The <argument type name list> component does not consider data type sizes to be differentiators. For example, you cannot create two functions
called Myfunc( numeric(3,2) ) and Myfunc( numeric(4,1) ) in the same database.
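To make the uniqueness rule concrete, the following standalone sketch reduces a signature to the form in which sizes no longer matter, so that two signatures differing only in size compare equal. It is an illustration of the rule as stated here, not the comparison that the Netezza system performs; the stripSizes function is a hypothetical name, and the sketch does not fold case or recognize type aliases.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: removes size specifiers (anything nested inside the
// argument list's own parentheses) and all spaces from a signature, so that
// "Myfunc( numeric(3,2) )" becomes "Myfunc(numeric)".
// Precondition: parentheses in sig are balanced.
std::string stripSizes(const std::string& sig)
{
    std::string out;
    int depth = 0;
    for (std::size_t i = 0; i < sig.size(); ++i) {
        char c = sig[i];
        if (c == '(') {
            ++depth;
            if (depth > 1) continue;    // opening of a size specifier
        } else if (c == ')') {
            --depth;
            if (depth >= 1) continue;   // closing of a size specifier
        } else if (depth > 1 || c == ' ') {
            continue;                   // size digits, commas, and spaces
        }
        out += c;
    }
    assert(depth == 0);                 // balanced-parentheses precondition
    return out;
}
```

Under this model, Myfunc( numeric(3,2) ) and Myfunc( numeric(4,1) ) both reduce to Myfunc(numeric), which is why the second CREATE FUNCTION would conflict with the first.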
If there are common use-cases where a function or aggregate must accept different sized
strings or numerics, you could design the UDX to accept the largest of the possible values
(such as the CustomerName example in the previous section), or you could create a new
UDX with a different name to process a different data size, for example:
MYDB(MYUSER)=> CREATE FUNCTION CustomerNameShort(varchar(256))
RETURNS int4 LANGUAGE CPP PARAMETER STYLE NPSGENERIC API VERSION 2
EXTERNAL CLASS NAME 'CCustomerName'
EXTERNAL HOST OBJECT '/home/nz/udx_files/customername.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/customername.o_spu10';

Overloading Functions and Aggregates


You can create UDXs that have the same name, but which have different argument signatures and return types. This process is referred to as overloading the definition. For
example, assume that you have a function called greatest_value that could take two or
three input integers and which returns the greatest of the input values.
Some sample (partial) function definitions follow:
MYDB(MYUSER)=> CREATE FUNCTION greatest_value(INT64, INT64) RETURNS
INT64...
MYDB(MYUSER)=> CREATE FUNCTION greatest_value(INT64, INT64, INT64)
RETURNS INT64...

If a user calls greatest_value with two input values, the system uses the first (two argument) function. If the user specifies three input values, the system uses the second
function that accepts three input values.
Overloading allows you to support different combinations of input values and/or return
types. However, overloading and uniquely named but similar functions and aggregates have
a maintenance overhead; if you need to update or redesign the body of the UDX, you have
to update each UDX with the changes that you want to make.


UDX Environment
UDX API version 2 includes support for the UDX environment. The UDX environment consists of a list of one or more variable name and value pairs that you can specify for a UDF,
UDA, or UDTF when you register it. Using environment variables, you can conditionalize
the UDX behavior in the UDX definition.
The environment variables simplify the process for changing the behavior of the UDX when
necessary. To change a variable, you alter the UDX to specify new variables, new values, or
to clear the variables. Although you can define the variable values within the source code
for the UDX, changing the values would require you to edit the source code, recompile, and
reregister the UDX to implement the new behavior. Similarly, if you defined variables as an
argument to the UDX when you invoke it, changing the variable would require changes to
the SQL query that invokes the UDX.
For example, assume that you have a UDF that performs a currency conversion from U.S.
dollars to Euros. If you hardcode an exchange rate variable within the UDF source code, you
would have to update the source code whenever the exchange rate changes, then recompile
and reregister the UDF. Instead, you could use an environment variable to define the
exchange rate for the UDF when you register it, as in the following example:
CREATE OR REPLACE FUNCTION usdToEurosFunc(int) RETURNS int4 LANGUAGE
CPP PARAMETER STYLE NPSGENERIC ENVIRONMENT 'USD_EURO'='0.7268'
EXTERNAL CLASS NAME 'CusdToEuros'
EXTERNAL HOST OBJECT '/home/nz/udx_files/usdEuroFunc.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/usdEuroFunc.o_spu';

When the exchange rate changes, you can simply re-register the UDX with the new
USD_EURO value:
CREATE OR REPLACE FUNCTION usdToEurosFunc(int) RETURNS int4 LANGUAGE
CPP PARAMETER STYLE NPSGENERIC ENVIRONMENT 'USD_EURO'='0.8241'
EXTERNAL CLASS NAME 'CusdToEuros'
EXTERNAL HOST OBJECT '/home/nz/udx_files/usdEuroFunc.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/usdEuroFunc.o_spu';

You can specify and alter the environment variables using the CREATE OR REPLACE commands for functions and aggregates. You can use the ALTER command to alter variable
values as well as to clear the environment of all variable settings using the NO ENVIRONMENT syntax. To alter an existing set of one or more environment pairs, you must specify
all the environment settings; the ALTER command replaces the current list with the list
specified in the ALTER command.
Within the UDF, you can retrieve the UdxEnvironment class to obtain and use the environment variables:
UdxEnvironment* getEnvironment()

The UDX environment is a structure of UdxEnvironmentEntry values that store each
name/value pair that you specify when you register the UDX. The environment variables are
available to the UDX constructor. You can use the following methods to obtain information
about the entries in the environment as well as to obtain a specific entry in the environment
by its name, its position (key), or its value:

The getNumEntries() method returns the number of environment variables in the UdxEnvironment object. For example:
int numEntries = env->getNumEntries();


The findEntry() method takes an input string and matches it against the variable
names. It returns the key number of the first matching entry in the environment structure, or -1 if the string is not found.

The getEntry() method takes an input index or variable name value and returns the
matching UdxEnvironmentEntry or a null value if not found.

The getKey() method returns the matching environment variable name from a
UdxEnvironmentEntry.

The getValue() method returns the matching environment variable value from a
UdxEnvironmentEntry.
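Environment values are registered as text (for example, '0.7268'), so a UDF such as the currency converter would need to turn the retrieved value into a number before using it. The following standalone sketch shows only that conversion step and deliberately leaves out the UdxEnvironment calls, whose exact signatures are not shown in this section. The names parseRate and usdToEuros are hypothetical, and rounding to the nearest whole euro is an assumption made for the example.

```cpp
#include <cassert>
#include <cerrno>
#include <cmath>
#include <cstdlib>
#include <string>

// Hypothetical helper: converts environment text such as "0.7268" into a
// rate. Returns false (leaving rate untouched) if the text is empty, has
// trailing characters, is out of range, or is not greater than zero.
bool parseRate(const std::string& text, double& rate)
{
    if (text.empty())
        return false;
    errno = 0;
    char* end = NULL;
    double value = std::strtod(text.c_str(), &end);
    if (end == text.c_str() || *end != '\0' || errno == ERANGE || !(value > 0.0))
        return false;
    rate = value;
    return true;
}

// Hypothetical helper: converts whole dollars to whole euros, rounding to
// the nearest integer. Overflow of the int result is not checked here.
int usdToEuros(int usd, double rate)
{
    assert(rate > 0.0);  // precondition: a validated rate
    return static_cast<int>(std::floor(usd * rate + 0.5));
}
```

Because the environment is available to the constructor, one reasonable design is to parse and validate the rate once there and keep it in a member variable, so that evaluate() does not repeat the conversion for every row.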

Understanding Size-Specific, Generic, and Variable Argument UDXs


In the signature of a user-defined function or aggregate, the <argument_type_list> can use
three general forms:

Size-specific arguments

Generic arguments

Variable arguments

The following sections describe these three formats and the benefits and considerations for
using each one.

Size-Specific Arguments
With size-specific UDXs such as the customerName example, you must declare the type
and size of all input arguments, as well as the type and size of the return value. Specific
datatype size declarations are useful for resource planning as well as for error-checking of
the input arguments and return values, but they can be somewhat limiting if your UDXs
process strings or numerics that could vary in size each time you run a query.
Constant datatype sizes often require you to use larger datatype sizes (and thus more storage resources) to support the maximum input values and/or return values. They can also
result in implicit casts, such as casting a smaller input value to fit the larger declared size
(for example, it could increase the precision of a numeric or add padding to strings). If you
choose too small a size, you risk loss of precision if the Netezza system casts a larger input
numeric to the smaller numeric or truncates input strings which exceed the defined string
input size.

Generic-Size Arguments
Generic-size (or any-size) arguments offer more flexibility for strings and numerics. You can
declare character strings or numerics using the ANY keyword in the signature (or in the
return value). For example:
MYDB(MYUSER)=> CREATE FUNCTION CustomerName(varchar(ANY))
RETURNS int4 LANGUAGE CPP PARAMETER STYLE NPSGENERIC API VERSION 2
EXTERNAL CLASS NAME 'CCustomerName'
EXTERNAL HOST OBJECT '/home/nz/udx_files/customername.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/customername.o_spu10';


The function accepts a character string of up to 64,000 characters (the maximum for a
VARCHAR). Within the body of the function, the code must process the strings and numerics with the plan that you could receive a string of any valid length. That is, you can check
and obtain their size, process them as needed, and return the value for the function.
Generic-size arguments help you avoid imposing specific limits on the input strings and numerics,
or using overly large or maximum size values that result in unnecessary resource allocation
for the function. This format can also reduce the need to register and maintain similar
functions that take different input arguments or have different return values, and it
reduces the casting of input values.
Note: UDFs support generic arguments as well as generic return values. UDAs, however,
support only generic arguments. The return value and state arguments of a UDA must specify constant data sizes. UDTFs support shapers, which are similar to generic return values
for scalar UDFs.

Supported Generic Argument Types


You use the ANY keyword to indicate that an argument or return value is generic. The following datatypes support the ANY keyword as a size specifier:

CHAR or NCHAR

VARCHAR or NVARCHAR

NUMERIC

For example, to specify a numeric datatype of precision 10 and scale 2, you specify it as
NUMERIC(10,2). To specify a numeric datatype that can take any size, you specify it as
NUMERIC(ANY). Likewise, to specify a variable character string that can take any size, you
declare it as VARCHAR(ANY).

Generic Arguments in the UDX Signature


You can define both generic arguments and standard datatype-and-size-specific
arguments in the signature of a UDX. The Netezza software verifies that all the input arguments match the required number and datatypes of the signature.
For arguments that have a specific size, the Netezza software also confirms that the size of
the input value matches the defined signature size. If necessary, the Netezza software casts
the input values to match the size specified in the signature. For example, if you declare a
string of 20 characters [CHAR(20)] in a signature, the Netezza software implicitly truncates an input string that is longer than 20 characters or adds padding if the input string is
less than 20 characters.
For generic arguments, the argument values are passed to the function without any casting
or changes. For example, if you declare a CHAR(ANY) input value, the function accepts
character strings of any length up to the supported maximum; it checks to make sure that
the input value is a valid character string and that it occurs in the expected place of the
signature.
The Netezza software performs some implicit castings for the input values. For example, if
you define an input argument as VARCHAR(ANY) in the signature, but you pass an input of
CHAR(200) to the function, the UDF casts the CHAR(200) to VARCHAR(200). The UDF
uses the datatype of the signature and the size of the input value to determine the casting
change.
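The difference between size-specific and generic arguments can be pictured with a small standalone sketch. It does not use the Netezza API: fitToChar and passGeneric are hypothetical helper names, the strings are assumed to be single-byte ASCII, and padding with blanks is an assumption based on ordinary CHAR behavior.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: mimic the implicit cast to a size-specific CHAR(n).
// Longer input is truncated; shorter input is padded (assumed: with blanks).
std::string fitToChar(const std::string& input, std::size_t n) {
    if (input.size() >= n)
        return input.substr(0, n);                       // truncate to n characters
    return input + std::string(n - input.size(), ' ');   // pad up to n characters
}

// Hypothetical helper: a generic CHAR(ANY) argument reaches the function unchanged.
std::string passGeneric(const std::string& input) {
    return input;
}
```

With a CHAR(5) signature, an eight-character input would arrive as its first five characters, whereas the generic form would deliver all eight.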


Generic UDF Return Value


If you use ANY for a return value size, your UDF must calculate the size of the numeric or
string return value. To perform this task, your UDF must override the calculateSize()
method to define the sizing operation.
The calculateSize() method provides an upper limit of the return value size. It specifies the
amount of memory that the system should allocate for the result; however, the actual return
value length still needs to be set. The Netezza system offers a number of sizer methods
that you can use to process the string and numeric datatypes. For a description of the
methods, see Return Value Sizer API on page 2-12.
An example of a calculateSize() method for a string datatype follows:
virtual uint64 calculateSize() const
{
int len = 0;
for (int i=0; i < numSizerArgs(); i++) //for each input argument
{
if (sizerArgType(i) == UDX_VARIABLE)
len += sizerStringArgSize(i); //add the input argument sizes
}
return sizerStringSizeValue(len); //let return value be the sum
}

An example of a calculateSize() method for a numeric datatype follows:


virtual uint64 calculateSize() const
{
    int prec = 0;
    int scale = 0;
    for (int i=0; i < numSizerArgs(); i++) //for each input argument
    {
        if (sizerArgType(i) == UDX_NUMERIC64) //if the argument is a numeric64 value
        {
            if (sizerNumericArgPrecision(i) > prec) //compute maximum
                prec = sizerNumericArgPrecision(i); //precision and
            if (sizerNumericArgScale(i) > scale) //scale as "max"
                scale = sizerNumericArgScale(i);
        }
    }
    return sizerNumericSizeValue(prec, scale); //let return value
                                               //precision and scale be "max"
}

For a complete example of a UDF that uses generic input arguments as well as a generic
return value and a calculateSize() method, see Generic UDF Example on page F-1.
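To experiment with the numeric sizing rule away from a Netezza system, the same loop can be modeled in plain C++. This is only a sketch: ArgInfo and maxPrecisionScale are hypothetical names that stand in for the information the sizer methods expose, and they are not part of udxinc.h.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the per-argument details that the sizer API exposes.
struct ArgInfo {
    bool isNumeric;   // true for numeric arguments
    int  precision;   // total number of digits
    int  scale;       // digits after the decimal point
};

// Return the largest precision and the largest scale among the numeric
// arguments, mirroring the loop in the calculateSize() example.
std::pair<int, int> maxPrecisionScale(const std::vector<ArgInfo>& args) {
    int prec = 0;
    int scale = 0;
    for (std::size_t i = 0; i < args.size(); i++) {
        if (!args[i].isNumeric)
            continue;                       // skip non-numeric arguments
        if (args[i].precision > prec)
            prec = args[i].precision;
        if (args[i].scale > scale)
            scale = args[i].scale;
    }
    return std::make_pair(prec, scale);
}
```

For inputs declared as NUMERIC(10,2) and NUMERIC(8,4), this rule yields precision 10 and scale 4, because the two maximums are taken independently.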

Registering Generic UDXs


When you register a generic UDX using the CREATE [OR REPLACE] command, you use the
keyword ANY to declare character or numeric datatypes as generic. For a complete description of the command, refer to CREATE [OR REPLACE] AGGREGATE on page B-13 or
CREATE [OR REPLACE] FUNCTION on page B-17.


An example for a UDF follows:


MYDB(MYUSER)=> CREATE FUNCTION number(num NUMERIC(ANY))
RETURNS NUMERIC(ANY) LANGUAGE CPP PARAMETER STYLE NPSGENERIC
EXTERNAL CLASS NAME 'Nnumber'
EXTERNAL HOST OBJECT '/home/nz/udx_files/number.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/number.o_spu10';

In this example, the number() function takes an input numeric datatype of any valid size
and returns a numeric datatype of a valid size that will be calculated by the UDF.
An example for a UDA follows (UDAs allow generic arguments only):
CREATE OR REPLACE AGGREGATE char20 (CHAR(ANY))
RETURNS CHAR(20) STATE (CHAR(20))
LANGUAGE CPP PARAMETER STYLE NPSGENERIC
EXTERNAL CLASS NAME 'Char20'
EXTERNAL HOST OBJECT '/tmp/udx_test/UDX_CharMax.o_x86'
EXTERNAL SPU OBJECT '/tmp/udx_test/UDX_CharMax.o_spu10';

Note: UDAs which have large string state variables can impact performance.

Variable Arguments
Variable-argument functions and aggregates offer even more flexibility than generic-size
arguments. With variable argument UDFs, UDAs, and UDTFs, you specify only the
VARARGS keyword in the argument_type_list. Users can specify from 0 to 64 input values
of any supported data type as input arguments. For example, using the greatest_value function from a previous section:
MYDB(MYUSER)=> CREATE FUNCTION greatest_value(VARARGS) RETURNS
INT64...

Within the body of the function, the code must process the input values and manage them
as needed. For example, the function body should verify the data types of the input arguments and either cast or error out as applicable. You must design your UDX code to handle
the native data type of the input values, such as managing Numeric32Val versus double
data types. By contrast, if you declared a fixed signature such as func(double) and the
function were invoked with a numeric, the system would cast the value to double for you; with VARARGS, your code receives the value in its native type.
Variable argument signatures allow you to create one function or aggregate that can be
used for different combinations of input types. This simplifies the development of your
UDFs, UDAs, and UDTFs and reduces the need to create overloaded definitions that perform the same task for different types and numbers of arguments.
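The type checking that a variable-argument function body must perform can be sketched in standalone C++. The example below does not use the Netezza API: ArgKind, ArgValue, and greatestValue are hypothetical names, and returning a double (rather than the INT64 of the greatest_value registration) is a simplification chosen to show the cast-or-error decision.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical tag describing the native type of one VARARGS input argument.
enum ArgKind { KIND_INT, KIND_DOUBLE, KIND_STRING };

// Hypothetical tagged value standing in for one input argument.
struct ArgValue {
    ArgKind   kind;
    long long intVal;   // valid when kind == KIND_INT
    double    dblVal;   // valid when kind == KIND_DOUBLE
};

// Return the greatest numeric input. Each argument is inspected, cast to a
// common type when that is possible, and rejected otherwise.
double greatestValue(const std::vector<ArgValue>& args) {
    if (args.empty())
        throw std::invalid_argument("at least one argument is required");
    bool first = true;
    double best = 0.0;
    for (std::size_t i = 0; i < args.size(); i++) {
        double v = 0.0;
        switch (args[i].kind) {
            case KIND_INT:    v = static_cast<double>(args[i].intVal); break;
            case KIND_DOUBLE: v = args[i].dblVal; break;
            default:
                throw std::invalid_argument("unsupported argument type");
        }
        if (first || v > best) {
            best = v;
            first = false;
        }
    }
    return best;
}
```

The point of the sketch is that the function, not the system, decides what to do with each input type.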

Using the UDF in a SQL Query


After you register a UDF with your Netezza system, you and other permitted users can call
the function in the same manner as the Netezza SQL built-in functions. To use a UDF,
users must have Execute permission for the FUNCTION object or for the specific UDF.
Note: By default, the admin user account has execute access to all user-defined functions
and aggregates. The user account that registered a UDF also has execute access to that
UDF. Other users can be given permission to run specific or all UDFs.
For the sample customername function, first create a sample table that contains the data
to be processed by the function. For example:


CREATE TABLE customers (a INT, b VARCHAR(200));
INSERT INTO customers VALUES (1, 'Customer A');
INSERT INTO customers VALUES (2, 'Customer B');
INSERT INTO customers VALUES (3, 'Customer CBA');
INSERT INTO customers VALUES (4, 'Customer ABC');

Then you can run the sample customername function, as follows:


MYDB(MYUSER)=> SELECT * FROM customers WHERE CustomerName(b) = 1;

Sample output follows:


 A |      B
---+--------------
 1 | Customer A
 4 | Customer ABC
(2 rows)

Altering and Dropping UDFs


After you register a UDF, you can use the ALTER FUNCTION command to change the function object files, return value, memory usage options, or logging level. You cannot change
the function name or argument type list; you must drop the existing function and then create a new function with the new name and/or argument type list. For more information
about the command, see ALTER FUNCTION on page B-6.
You can remove a UDF using the DROP FUNCTION command. For more information about
the command, see DROP FUNCTION on page B-27. The Netezza system does not allow
you to drop UDFs which are referenced in any tables or views. You must resolve those
dependencies before you can drop the UDX, as described in Dependency Checks before
Dropping UDXs on page 6-13.
Note: The dependency checks will not stop a DROP FUNCTION operation if the UDF was
used in a table or view that was created on an earlier Netezza release environment which
was subsequently upgraded to 4.6. You would have to recreate the old view or modify the
default expression after the 4.6 upgrade to take advantage of the dependency checks.

Return Value Sizer API


As described in Generic UDF Return Value on page 2-10, UDFs that use a generic return
value must include a calculateSize() routine. (UDAs do not support a generic return value.)
The calculateSize() method provides an upper limit of the return value size. It is called to
allocate the memory for the result; the actual return value length still needs to be set. You
can use the following methods within the sizer routine to calculate the return value size.

sizerReturnType Method
Returns a datatype based on the declared UDF return type.

Syntax
The method has the following syntax:
int sizerReturnType() const;


Description
The method returns a datatype such as UDX_NUMERIC, UDX_FIXED, UDX_VARIABLE,
UDX_NATIONAL_FIXED or UDX_NATIONAL_VARIABLE. For a description of these
datatypes, see Supported Data Types on page D-1.

numSizerArgs Method
Specifies the number of arguments in the UDF signature.

Syntax
The method has the following syntax:
int numSizerArgs() const;

Description
This method indicates the number of arguments that the function is called with.

sizerArgType Method
Specifies the datatype of the specified argument of the function.

Syntax
The method has the following syntax:
int sizerArgType(int n) const;

Description
The method specifies the datatype of the argument of the function. The datatype can be
any of the enumerated types except UDX_NUMERIC. The enumerated types are described
in Supported Data Types on page D-1.

Throws
The method throws an exception if n is out of range.

sizerStringArgSize Method
Returns the string size in characters of the specified argument.

Syntax
The method has the following syntax:
int sizerStringArgSize(int n) const;

Description
This method returns the size of the specified generic argument. This size is in characters,
not bytes.


Throws
The method throws exceptions if n is out of range, if the specified argument is not a string
type, or the specified argument does not have a size.
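Because the size is reported in characters rather than bytes, code that works with multibyte data needs to keep the two counts apart. The following standalone sketch assumes the data is UTF-8 encoded (an assumption for illustration); utf8CharCount is a hypothetical helper, not a udxinc.h function.

```cpp
#include <cstddef>
#include <string>

// Count characters (code points) in a UTF-8 encoded byte string by
// skipping continuation bytes, which have the bit pattern 10xxxxxx.
std::size_t utf8CharCount(const std::string& bytes) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < bytes.size(); i++) {
        unsigned char c = static_cast<unsigned char>(bytes[i]);
        if ((c & 0xC0) != 0x80)
            count++;   // an ASCII byte or a lead byte starts a new character
    }
    return count;
}
```

For pure ASCII input the two counts are equal; for accented or non-Latin text the byte count is larger than the character count.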

sizerNumericArgPrecision Method
Returns the precision of the specified numeric argument.

Syntax
The method has the following syntax:
int sizerNumericArgPrecision(int n) const;

Description
The method returns the precision component of the specified numeric argument.

Throws
The method throws exceptions if n is out of range, if the specified argument is not a
numeric type, or the specified argument does not have a size.

sizerNumericArgScale Method
Returns the scale of the specified numeric argument.

Syntax
The method has the following syntax:
int sizerNumericArgScale(int n) const;

Description
The method returns the scale component of the specified numeric argument.

Throws
The method throws exceptions if n is out of range, if the specified argument is not a
numeric type, or the specified argument does not have a size.

sizerStringSizeValue Method
Builds a string return value size for the specified string length.

Syntax
The method has the following syntax:
uint64 sizerStringSizeValue(int len) const;

Description
The method builds a return value for the calculateSize method using the specified string
length len. The string length must be in characters, not bytes.


Throws
The method throws an exception if the return type is not a string.

sizerNumericSizeValue Method
Builds a numeric return value for the specified precision and scale values.

Syntax
The method has the following syntax:
uint64 sizerNumericSizeValue(int prec, int scale) const;

Description
The method builds a numeric return value size for the calculateSize() method as an int64
(which is a format that Netezza recognizes) using the values specified in prec and scale.

Throws
The method throws an exception if the return type is not a numeric.

isSizerArgConstant Method
Returns true if the specified argument is a constant integer.

Syntax
The method has the following syntax:
bool isSizerArgConstant(int n) const;

Description
To support certain methods like round(val, scale), this method provides a mechanism for
passing constant arguments to the sizer as long as they are of int32 type. This method
returns true if the specified argument is a constant integer (0 or any positive or negative
number, except -1), or false if the constant specified is -1.

Throws
The method throws exceptions if n is out of range or if the specified argument is not an
int32 datatype.

sizerGetConstantArg Method
Returns the specified constant int32 argument.

Syntax
The method has the following syntax:
int32 sizerGetConstantArg(int n) const;

Description
This method returns the specified constant int32 argument.


Throws
The method throws exceptions if n is out of range, the specified argument is not an int32,
or the specified argument is not constant.

calculateSize Method
Provides the sizing calculations for strings and numerics in generic UDFs.

Syntax
The method has the following syntax:
virtual uint64 calculateSize() const
{
return 0xFFFFFFFFFFFFFFFFLL;
}

Description
If your UDF uses the ANY keyword as the size specifier for a numeric or string return value,
you must override this method to provide the sizing capabilities.


CHAPTER 3
Creating User-Defined Table Functions
What's in this chapter
Creating the C++ File for the UDTF
Compiling the UDTF
Registering the UDTF
Using the UDTF in a SQL Query
Altering and Dropping a UDTF

This chapter describes the steps to create a user-defined table function (UDTF) and to
register it for use on a Netezza system.

Creating the C++ File for the UDTF


To begin, use any text editor to create your C++ file. The file name must have a .cpp extension. You might want to create a new directory such as /home/nz/udx_files as your area for
UDX code files.
Your C++ file must include the udxinc.h header file, which contains the required declarations for user-defined table functions and processing on the Netezza SPUs.
#include "udxinc.h"

In addition, make sure that you declare any of the standard C++ library header files that
your table function may require. If your UDTF requires any user-defined shared libraries,
make sure you note the name of the libraries as you will need them when you register the
UDTF in the database. For example:
#include "udxinc.h"
#include <string.h>

Note: User-defined shared libraries must exist in the database before you can register the
UDTF and specify those libraries as dependencies. You could register the UDTF without
specifying any library dependencies, and after the libraries are added, use the ALTER
FUNCTION command to update the UDTF definition with the correct dependencies. For
more information about user-defined shared libraries, see Creating a User-Defined Shared
Library on page 5-1.
The UDX classes and functions for API version 2 are defined in a namespace called
nz::udx_ver2. Your C++ program must reference the correct namespace. For example:
#include "udxinc.h"
using namespace nz::udx_ver2;


To implement a UDTF, you create a new class object derived from the Udtf base class. For
example:
#include "udxinc.h"
using namespace nz::udx_ver2;
class parseNames : public Udtf {
public:
};

The parseNames UDTF takes an input table of string fields that are separated by spaces
or commas, and returns a table where each field of the requested string is output on its
own row. As with other UDXs, you define the variables required for the UDTF algorithm at
the class level. For example:
#include "udxinc.h"
using namespace nz::udx_ver2;
class parseNames : public Udtf {
private:
char value[1000];
int valuelen;
int i;
public:
};

The parseNames UDTF uses the following variables:

The value variable contains a copy of the input parameter.

The valuelen variable contains the length of the input string.

The i variable is a counter.

Each UDTF must implement the instantiate() method and a constructor, as well as two additional UDTF-specific methods: newInputRow() and nextOutputRow(). The
nextEoiOutputRow() UDTF-specific method is optional. An example of the methods and
their purpose follows.

As with UDFs, you call the instantiate() method to create the UDTF object dynamically.
In UDX version 2, the instantiate method takes one argument (UdxInit *pInit), which
enables access to the memory specification, the log setting, and the UDX environment
(see UDX Environment on page 2-7). The constructor must take a UdxInit object as
well and pass it to the base class constructor. An example follows:
#include "udxinc.h"
using namespace nz::udx_ver2;
class parseNames : public Udtf {
private:
char value[1000];
int valuelen;
int i;
public:
parseNames(UdxInit *pInit) : Udtf(pInit) {}
static Udtf* instantiate(UdxInit*);
};

Udtf* parseNames::instantiate (UdxInit* pInit) {
    return new parseNames(pInit);
}

For a UDTF, you use the newInputRow() method to perform initialization actions such
as copying input arguments, initializing class variables, and managing situations such
as null input variables. The method is called once for each input row. For the parseNames UDTF example, the following sample code copies the input list to the variable
value, sets valuelen to the length of the input string, and initializes the variable i to
zero (0):
virtual void newInputRow() {
    StringArg *valuesa = stringArg(0);
    bool valuesaNull = isArgNull(0);
    if (valuesaNull)
        valuelen = 0;   // treat a null input as an empty string
    else {
        // leave room for the terminating null in value[1000]
        if (valuesa->length >= 1000)
            throwUdxException("Input value must be less than 1000 characters.");
        memcpy(value, valuesa->data, valuesa->length);
        value[valuesa->length] = 0;
        valuelen = valuesa->length;
    }
    i = 0;
}

You use the nextOutputRow() method to create and return the next output row of the
table. The method should also detect whether there is no more data to return and then
return Done. NPS calls this method at least once per input row. Sample code follows:
virtual DataAvailable nextOutputRow() {
if (i >= valuelen)
return Done;
// save starting position of name
int start = i;
// scan string for next comma
while ((i < valuelen) && value[i] != ',')
i++;
// return word
StringReturn *rk = stringReturnColumn(0);
if (rk->size < i-start)
throwUdxException("Value exceeds return size");
memcpy(rk->data, value+start, i-start);
rk->size = i-start;
i++;
return MoreData;
}

As shown in the example above, you create a column using the appropriate column
return type such as stringReturnColumn() or intReturnColumn() and you specify the
position of the column such as 1, 2, 3, and so on; the example passes 0 for its first and only output column. The return MoreData syntax indicates that there is another row to process. When the counter variable i reaches the end
of the input string, there is no more data to process and nextOutputRow() returns Done.


If your UDTF supports the TABLE WITH FINAL syntax, you use the nextEoiOutputRow()
method at least once after the end of the input to process and output all the data. The
base class has a default implementation of this method that returns no rows when
called. It is similar to nextOutputRow() except that newInputRow() is not called before
it. A sample method follows:
virtual DataAvailable nextEoiOutputRow() {
    return Done;
}
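The interplay between newInputRow() and nextOutputRow() can be tried out away from a Netezza system with a small model. The sketch below is not the UDTF itself: CommaSplitter is a hypothetical class that keeps the same per-row state (a copy of the input and a cursor), uses std::string in place of the fixed buffer, and returns true or false where the UDTF returns MoreData or Done.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the UDTF: it holds the per-row state and follows
// the newInputRow()/nextOutputRow() calling pattern.
class CommaSplitter {
public:
    CommaSplitter() : pos(0) {}

    // Corresponds to newInputRow(): copy the input and reset the cursor.
    void newInputRow(const std::string& input) {
        value = input;
        pos = 0;
    }

    // Corresponds to nextOutputRow(): fill 'out' and return true when a field
    // is available (MoreData), or return false when the input is used up (Done).
    bool nextOutputRow(std::string& out) {
        if (pos >= value.size())
            return false;
        std::size_t start = pos;                         // start of this field
        while (pos < value.size() && value[pos] != ',')  // scan for next comma
            pos++;
        out = value.substr(start, pos - start);
        pos++;                                           // step over the comma
        return true;
    }

private:
    std::string value;   // copy of the current input row
    std::size_t pos;     // cursor into value
};
```

Calling nextOutputRow() repeatedly after newInputRow("124,6,12,121") yields the four fields in order and then reports that no data remains; an empty input yields no rows at all.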

Compiling the UDTF


After you create your C++ file for your new UDTF, compile the C++ file using the nzudxcompile command. The command is located in the /nz/kit/bin/adm directory. The compilation
process creates the object files that will run on the Netezza host as well as on the Netezza
SPUs.

To compile the parseNames.cpp file and create the output object files:
nzudxcompile parseNames.cpp

The nzudxcompile command creates the following object files:

parseNames.o_x86 is the object file for the Netezza host (i386 Linux platform on x86).

parseNames.o_spu10 is the object file for the Linux-based Rev10 SPUs on IBM
Netezza 1000 and Netezza 100 models.

For a complete description of the nzudxcompile command and its options and usage, refer
to nzudxcompile Command Syntax on page 6-18.
After you create the compiled object files, you must register the UDTF with the Netezza
system. You can register a user-defined function using the CREATE FUNCTION command,
as described in the next section.
Optionally, you can also compile and register a UDTF in one step using the nzudxcompile
command. For example, to compile the parseNames C++ file and also register it in a sample database called mydb:
nzudxcompile --sig "parseNames(VARCHAR(ANY))" --return
"TABLE(product_id VARCHAR(200))" --class parseNames --version 2
parseNames.cpp -user myuser -pw password -db mydb

Registering the UDTF


To register a UDTF, you use the Netezza SQL command CREATE FUNCTION. For a complete description of the CREATE FUNCTION command, refer to CREATE [OR REPLACE]
FUNCTION on page B-17.
Note: When you issue a CREATE FUNCTION command, the database processes the HOST
OBJECT and the SPU OBJECT files as the nz user. The nz user must have read access to
the object files and read and execute access to every directory in the path from the root to
the object file.


For example, to register the sample function parseNames to the Netezza system, start an
nzsql session to your database (which is named mydb in this example):
nzsql mydb myuser password

Next, use the CREATE FUNCTION Netezza SQL command to register the UDTF:
MYDB(MYUSER)=> CREATE FUNCTION ParseNames(varchar(ANY))
RETURNS TABLE(product_id VARCHAR(200)) API VERSION 2 LANGUAGE CPP
PARAMETER STYLE NPSGENERIC EXTERNAL CLASS NAME 'parseNames'
EXTERNAL HOST OBJECT '/home/nz/udx_files/parseNames.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/parseNames.o_spu10';

If the command is successful, it returns the message CREATE FUNCTION. It creates the
UDTF in the mydb database, and the UDTF is owned by myuser. To create a function, your
user account must have Create Function administration permission or you must be logged
in as the admin user. For a description of the required privileges for these commands, refer
to Managing User Account Permissions on page 6-1.

Using the UDTF in a SQL Query


After you register a UDTF with your Netezza system, you and other permitted users can use
the UDTF in a FROM clause where a table would normally appear in a query. To use a
UDTF, users must have Execute permission for the FUNCTION object or for the specific
UDTF.
For the parseNames example, assume that you have a table named orders that contains
records for each customer sale of items, where the items are a comma-separated list of
product ID codes, for example:
CREATE TABLE orders(order_id INTEGER, cust_id VARCHAR(200), sale_date
DATE, prod_codes VARCHAR(1000)) DISTRIBUTE ON (order_id);
INSERT INTO orders(order_ID, cust_ID, sale_date, PROD_CODES) VALUES
(124, 'AB123456', '20100826', '124,6,12,121');
INSERT INTO orders(order_ID, cust_id, sale_date, PROD_CODES) VALUES
(125, 'AB987657', '20100826', '8');
INSERT INTO orders(order_ID, cust_id, sale_date, PROD_CODES) VALUES
(126, 'AB456754', '20100901', '32,5,76,65,121,98');
INSERT INTO orders(order_ID, cust_id, sale_date, PROD_CODES) VALUES
(131, 'AB643623', '20100902', '12,88,41');
INSERT INTO orders(order_ID, cust_id, sale_date, PROD_CODES) VALUES
(142, 'AB664353', '20100904', '1,145,52,53,93,98,100');
INSERT INTO orders(order_ID, cust_id, sale_date, PROD_CODES) VALUES
(132, 'AB643623', '20100904', '121');
INSERT INTO orders(order_ID, cust_id, sale_date, PROD_CODES) VALUES
(143, 'AB123456', '20100905', '87,182');
INSERT INTO orders(order_ID, cust_id, sale_date, PROD_CODES) VALUES
(120, 'AB876123', '20100905', '28,36,80');

INSERT INTO orders(order_ID, cust_id, sale_date, PROD_CODES) VALUES
(150, 'CD876543', '20100905', '80,43,55,12,4,67,92');

The orders table appears as:


select * from orders;
 ORDER_ID | CUST_ID  | SALE_DATE  |      PROD_CODES
----------+----------+------------+-----------------------
      120 | AB876123 | 2010-09-05 | 28,36,80
      126 | AB456754 | 2010-09-01 | 32,5,76,65,121,98
      142 | AB664353 | 2010-09-04 | 1,145,52,53,93,98,100
      150 | CD876543 | 2010-09-05 | 80,43,55,12,4,67,92
      143 | AB123456 | 2010-09-05 | 87,182
      124 | AB123456 | 2010-08-26 | 124,6,12,121
      132 | AB643623 | 2010-09-04 | 121
      125 | AB987657 | 2010-08-26 | 8
      131 | AB643623 | 2010-09-02 | 12,88,41
(9 rows)

Then you can run the sample parseNames UDTF, as follows:


MYDB(MYUSER)=> SELECT t.cust_id, f.product_id FROM orders AS t, TABLE
( parseNames(prod_codes) ) AS f;

Sample output follows:


 CUST_ID  | PRODUCT_ID
----------+------------
 AB876123 | 28
 AB876123 | 36
 AB876123 | 80
 AB987657 | 8
 AB123456 | 87
 AB123456 | 182
 AB643623 | 12
 AB643623 | 88
 AB643623 | 41
 AB123456 | 124
 AB123456 | 6
 ...

By default, the admin user account has execute access to all user-defined functions and
aggregates. The user account that registered a UDTF also has execute access to that UDTF.
Other users can be given permission to run specific or all UDTFs.

UDTF Invocation Forms


You can invoke a table function using either of the following syntax forms:
TABLE(func_name(args))
or
TABLE WITH FINAL (func_name(args))
The TABLE syntax causes the query to use normal table behavior; that is, the UDTF is
invoked once for each input row. The TABLE WITH FINAL syntax also invokes the UDTF
once for each input row, but in addition, the UDTF is invoked again after all the input rows
are processed, which allows it to output more rows such as summary rows. Note that the
behavior of the TABLE WITH FINAL syntax depends on the locus, that is the location,
where the UDTF runs. If the UDTF executes on the S-Blades, for example, the TABLE WITH
FINAL post-processing occurs once per dataslice.
When you register a UDTF, you can control whether the user can invoke the UDTF with the
WITH FINAL syntax. If you register the UDTF as TABLE, TABLE FINAL ALLOWED, the user
can specify either TABLE or TABLE WITH FINAL syntax. If the UDTF is registered as
TABLE ALLOWED, for example, the user can specify only the TABLE syntax option. Likewise, if the UDTF is registered as TABLE FINAL ALLOWED, the user must use the TABLE
WITH FINAL syntax option.
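The two invocation forms differ only in whether the end-of-input method is called. The following standalone sketch models that calling sequence for a single dataslice. It is a simplification, not the NPS runtime: CountingFunc and runTable are hypothetical names, and the summary row format is invented for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a UDTF that echoes each input row and, when
// end-of-input processing is requested, adds one summary row.
class CountingFunc {
public:
    CountingFunc() : rows(0), pending(false), finalSent(false) {}

    void newInputRow(const std::string& input) {
        current = input;
        pending = true;
        rows++;
    }
    bool nextOutputRow(std::string& out) {
        if (!pending) return false;      // Done
        out = current;
        pending = false;
        return true;                     // MoreData
    }
    bool nextEoiOutputRow(std::string& out) {
        if (finalSent) return false;     // Done
        out = "rows=" + std::to_string(rows);
        finalSent = true;
        return true;                     // MoreData
    }

private:
    std::string current;   // the row being echoed
    int  rows;             // number of input rows seen so far
    bool pending;          // true until the current row has been output
    bool finalSent;        // true once the summary row has been output
};

// Simplified calling sequence; withFinal selects TABLE WITH FINAL behavior.
std::vector<std::string> runTable(const std::vector<std::string>& input,
                                  bool withFinal) {
    CountingFunc f;
    std::vector<std::string> result;
    std::string out;
    for (std::size_t i = 0; i < input.size(); i++) {
        f.newInputRow(input[i]);             // once per input row
        while (f.nextOutputRow(out))         // until the function reports Done
            result.push_back(out);
    }
    if (withFinal) {
        while (f.nextEoiOutputRow(out))      // only after all input rows
            result.push_back(out);
    }
    return result;
}
```

With two input rows, the TABLE form produces two output rows in this model, and the TABLE WITH FINAL form produces the same two rows followed by one summary row.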

Specifying UDTF Arguments


The arguments that you specify for a UDTF can be in one of two forms:

All literal expression arguments

A combination of literal expression and column expression arguments (also referred to
as a correlated table function)

The arguments that you specify (not the join qualifier) determine the type of correlation
that occurs.

Literal Expression Arguments


When you invoke a UDTF with all literal expression arguments, you invoke an uncorrelated
table function. You can specify an uncorrelated table function anywhere in a query where a
table could appear. An uncorrelated table function always executes on the host. Although
you should be able to use an uncorrelated table function in most supported contexts, there
are a couple of exceptions where you cannot use it:

In the catalog: table functions can execute only on the host or a SPU (in the case of
a correlated table function).

In a materialized view: materialized views operate using data which is stored on disk.

For the parseNames UDTF, the following query shows how the function could be invoked as
an uncorrelated table function:
mydb(usr1)=> SELECT * FROM TABLE(parseNames('1,2,3,4,5'));
 PRODUCT_ID
------------
 1
 2
 3
 4
 5
(5 rows)

Other examples of uncorrelated UDTFs include:

SELECT * FROM mytbl, TABLE(myfunc(1, 2));

SELECT * FROM TABLE WITH FINAL(tfunc(1));

Literal and Columnar Expression Arguments


When you use a combination of literal and column expression arguments, you invoke a correlated table function.
Note: A correlated subquery (also known as a repeating subquery) is a special kind of table,
row, or scalar subquery that includes a reference to an outer table. You should not use correlated subqueries because they can be very slow; they evaluate the subquery once per
input row of the outer table. Netezza has very limited support for correlated subqueries.
A lateral subquery is a special form of correlated subquery that appears in a FROM clause
and includes a reference to an outer table that appears earlier in the same FROM clause. The
LATERAL keyword typically identifies this type of correlation, although the TABLE keyword
can be used as well. Netezza does not support the LATERAL keyword and supports the
TABLE keyword only for UDTFs (not for SQL subqueries).
The tables which contain the columns referenced by the table function must appear before
the table function invocation in the SQL query. A table function is laterally correlated with
the tables whose columns it uses, even when the invocation appears to be that of a JOIN
operation. There are two forms of lateral correlations for UDTFs, inner and left outer
correlation:

With inner correlation, the table function is invoked once for each input row. The table
output contains all of the output rows produced for that input row plus the corresponding input row. If the table function does not produce an output row for a given input
row, the input row will be omitted from the table output. Also, if the table function produces an output row for a given input row, but the join qualifier evaluates to false, the
input row will be omitted from the combined output. If the UDTF can be called using
the TABLE WITH FINAL syntax, note that there may be additional output rows as a
result of the WITH FINAL processing. Two examples follow:

mydb(usr1)=> SELECT t.cust_id, f.product_id FROM orders AS t, TABLE (
parseNames(prod_codes) ) AS f WHERE t.cust_id='AB123456';
 cust_id  | product_id
----------+------------
 AB123456 | 124
 AB123456 | 6
 AB123456 | 12
 AB123456 | 121
 AB123456 | 87
 AB123456 | 182
(6 rows)

mydb(usr1)=> SELECT t.cust_id, f.product_id FROM orders AS t, TABLE (
parseNames(prod_codes) ) AS f WHERE t.cust_id='AB223456';
 cust_id  | product_id
----------+------------
(0 rows)


With left outer correlation, the major difference is that in cases where the input row
does not produce output or cases where the join qualifier evaluates to false, the query
still returns the input row, with NULL values in the columns of the table function. For example:

mydb(usr1)=> SELECT * FROM orders AS f LEFT OUTER JOIN
TABLE(parsenames(f.prod_codes)) ON product_id='121';
 ORDER_ID | CUST_ID  | SALE_DATE  |      PROD_CODES       | PRODUCT_ID
----------+----------+------------+-----------------------+------------
      150 | CD876543 | 2010-09-05 | 80,43,55,12,4,67,92   |
      142 | AB664353 | 2010-09-04 | 1,145,52,53,93,98,100 |
      124 | AB123456 | 2010-08-26 | 124,6,12,121          | 121
      143 | AB123456 | 2010-09-05 | 87,182                |
      120 | AB876123 | 2010-09-05 | 28,36,80              |
      131 | AB643623 | 2010-09-02 | 12,88,41              |
      132 | AB643623 | 2010-09-04 | 121                   | 121
      125 | AB987657 | 2010-08-26 | 8                     |
      126 | AB456754 | 2010-09-01 | 32,5,76,65,121,98     | 121
(9 rows)

A laterally correlated table function has the following additional restrictions:

It cannot occur in a RIGHT OUTER JOIN where it is laterally correlated to the table
being joined on.

It cannot occur in a FULL OUTER JOIN where it is laterally correlated to the table
being joined on.

When used in a LEFT OUTER or INNER JOIN, where it is laterally correlated to the
table in the join clause, you will get correlation behavior and not join behavior.

Table Functions and Visibility Rules


If you use both explicit and implicit join syntax, the query could result in an error due to
visibility rules. For example, the following query results in an error:
select * from feeder, feeder2, feeder3 left join
table(tfunc(feeder.i, feeder2.i)) on true where feeder.i =
feeder2.i;
The error occurs because the tfunc table function appears in an explicit join clause that has visibility only of feeder3, yet it is laterally correlated with feeder and feeder2. In this case, you
can specify only the tables that appear on the left of the join expression, or constants, as
arguments to the table function. Also, the left join is not a table function lateral correlation,
but a standard left join, because feeder3 does not appear as a table function argument. The
correct way to construct such a query is as follows:
select * from feeder3 left join (feeder join feeder2 on feeder.i =
feeder2.i join table(tfunc(feeder.i, feeder2.i)) on true) on true;

Chaining Table Functions


You can create a query that invokes a table function that is correlated on another table function;
this is referred to as chaining table functions. The results of the first invocation of the
UDTF can be input to subsequent UDTFs for more processing. For example, the following
query shows how the results from one correlation are fed to another correlation:
mydb(usr1)=> SELECT t.order_id, t.prod_codes, f.product_id, x.product_id
FROM orders t JOIN TABLE(parseNames(prod_codes)) AS f ON TRUE JOIN
TABLE (parseNames(f.product_id)) x ON TRUE ORDER BY order_id;

To identify the behavior of the chain of correlated functions, it can be helpful to divide the
query into parts and examine the results for each part. For example, the first part is the
query that joins the orders table with the parseNames UDTF, as follows. For brevity, the output shows only the results for the first two order IDs (120 and 124):
mydb(usr1)=> SELECT t.order_id, t.prod_codes, f.product_id FROM orders
t JOIN TABLE(parseNames(prod_codes)) AS f ON TRUE ORDER BY order_id;
 ORDER_ID |      PROD_CODES       | PRODUCT_ID
----------+-----------------------+------------
      120 | 28,36,80              | 28
      120 | 28,36,80              | 36
      120 | 28,36,80              | 80
      124 | 124,6,12,121          | 124
      124 | 124,6,12,121          | 6
      124 | 124,6,12,121          | 12
      124 | 124,6,12,121          | 121
...
(34 rows)

As the output shows, the parseNames UDTF returns a table with a row for each unique
value in the prod_codes string of values. This initial result set is fed into the next join,
which invokes the parseNames function for each unique value in the f.product_id column,
as follows:
mydb(usr1)=> SELECT t.order_id, t.prod_codes, f.product_id, x.product_id
FROM orders t JOIN TABLE(parseNames(prod_codes)) AS f ON TRUE JOIN
TABLE (parseNames(f.product_id)) x ON TRUE ORDER BY order_id;
 ORDER_ID |      PROD_CODES       | PRODUCT_ID | PRODUCT_ID
----------+-----------------------+------------+------------
      120 | 28,36,80              | 28         | 28
      120 | 28,36,80              | 36         | 36
      120 | 28,36,80              | 80         | 80
      124 | 124,6,12,121          | 124        | 124
      124 | 124,6,12,121          | 6          | 6
      124 | 124,6,12,121          | 12         | 12
      124 | 124,6,12,121          | 121        | 121
...
(34 rows)
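
The data flow of the chained query can be sketched in plain C++ as two nested loops. This is an illustrative model only, not UDX code: parseList is a hypothetical stand-in for the parseNames UDTF, and ChainedRow is an assumed structure that mirrors the PROD_CODES and the two PRODUCT_ID columns in the output.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the parseNames UDTF: returns one element per
// comma-separated value in the input string.
std::vector<std::string> parseList(const std::string& codes) {
    std::vector<std::string> out;
    std::stringstream ss(codes);
    std::string item;
    while (std::getline(ss, item, ',')) {
        if (!item.empty()) out.push_back(item);
    }
    return out;
}

struct ChainedRow {
    std::string prodCodes;   // t.prod_codes
    std::string first;       // f.product_id (first correlation)
    std::string second;      // x.product_id (second correlation)
};

// Models the chain for one input row: each output row of the first table
// function invocation becomes the input of the second invocation.
std::vector<ChainedRow> chain(const std::string& prodCodes) {
    std::vector<ChainedRow> rows;
    for (const std::string& f : parseList(prodCodes)) {
        for (const std::string& x : parseList(f)) {
            rows.push_back({prodCodes, f, x});
        }
    }
    return rows;
}
```

Because each value produced by the first invocation is a single product ID, the second invocation returns exactly one row for it, which is why the two PRODUCT_ID columns in the sample output are identical and the row count stays the same.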

Defining the Execution Locus of the UDTF


The system invokes UDTFs on the host, on one SPU, or on all the SPUs, at the discretion
of the optimizer. When you create or alter the UDTF, you can specify the preferred execution locus of the UDTF to the optimizer.
If you register the UDTF as PARALLEL ALLOWED, the table function can be invoked on
either the host or a SPU. The optimizer chooses the locus based on its calculations for optimal performance. In general, if you specify PARALLEL ALLOWED and use non-literal
arguments, the UDTF executes on the SPU. If you register the function as PARALLEL NOT ALLOWED (or --noparallel for the nzudxcompile command), the system invokes
the UDTF on the host or a single SPU, but not on all the SPUs. The default behavior is
PARALLEL ALLOWED.
An uncorrelated table function (one with all literal arguments) is always executed on the
host, because the query could otherwise behave inconsistently. (The output would vary
based on the number of dataslices.)

Specifying UDTF Arguments and Return Values


Like scalar UDFs, UDTFs can take advantage of size-specific, generic, and variable-size arguments. For details about argument types and how to specify them, see Understanding Size-Specific, Generic, and Variable Argument UDXs on page 2-8.
UDTFs also support size-specific and generic return values; that is, you can define a specific table shape to return when you register the UDTF, or you can register the UDTF as
RETURNS TABLE (ANY). The ANY keyword indicates that the UDTF will define and return
the table shape based on the input arguments and the UDTF program design. When you
specify ANY as the table return value, Netezza instantiates the UDTF and invokes the calculateShape method to get the table definition from the UDTF:
virtual void calculateShape(UdxOutputShaper *shaper)

The Netezza system offers a number of shaper methods that you can use to collect information on the input columns (including constant values) and to provide information about the
output shape. For a description of the methods, see UDTF Shaper Methods on
page D-10.
An example of a calculateShape() method follows:
void calculateShape(UdxOutputShaper *shaper) {
    if (shaper->numArgs() != 1)
        throwUdxException("Expecting only one argument");
    int nType = shaper->argType(0);
    if ((UDX_FIXED == nType) || (UDX_VARIABLE == nType)) {
        int len = shaper->stringArgSize(0);
        char ucstr[] = "UPPER_CASE";  // For column names on systems that
        char lcstr[] = "lower_case";  // use lowercase naming
        char tcstr[] = "Title_Case";
        char ucstrU[] = "UPPER_CASE"; // For column names on systems
        char lcstrU[] = "LOWER_CASE"; // that use uppercase naming
        char tcstrU[] = "TITLE_CASE";
        if (shaper->isSystemCaseUpper()) {
            shaper->addOutputColumn(nType, ucstrU, len);
            shaper->addOutputColumn(nType, lcstrU, len);
            shaper->addOutputColumn(nType, tcstrU, len);
        }
        else {
            shaper->addOutputColumn(nType, ucstr, len);
            shaper->addOutputColumn(nType, lcstr, len);
            shaper->addOutputColumn(nType, tcstr, len);
        }
    }
    else {
        throwUdxException("Only CHAR and VARCHAR types are supported");
    }
}

In this example, the UDTF is designed to take an input string and output three
columns of data: an uppercase version, a lowercase version, and a title case or initial-cap
version of the string. The shaper verifies that exactly one argument is passed, and that the
argument is of type CHAR or VARCHAR. The function also names the output columns
using mixed capitalization on systems where the system casing is lowercase, or all uppercase characters on systems where the case is uppercase.

For a complete example of the UcLcTc UDTF, see Sample UDTF with Generic Return
Value on page F-13.
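
To make the three output columns concrete, the following standalone C++ sketch computes the uppercase, lowercase, and title case forms of a string. It is not the UcLcTc sample from Appendix F and does not use the UDX API; CaseRow and makeCaseRow are hypothetical names, and the sketch assumes that "title case" means capitalizing the first character of each whitespace-separated word.

```cpp
#include <cctype>
#include <string>

// One output row of the illustrative table: three versions of the input.
struct CaseRow {
    std::string upper;
    std::string lower;
    std::string title;
};

// Builds the three case variants of the input string. Title case here
// (an assumption) capitalizes the first character of every word and
// lowercases the remaining characters.
CaseRow makeCaseRow(const std::string& in) {
    CaseRow row;
    bool startOfWord = true;
    for (unsigned char c : in) {
        row.upper += static_cast<char>(std::toupper(c));
        row.lower += static_cast<char>(std::tolower(c));
        if (std::isspace(c)) {
            row.title += static_cast<char>(c);
            startOfWord = true;
        } else {
            row.title += static_cast<char>(
                startOfWord ? std::toupper(c) : std::tolower(c));
            startOfWord = false;
        }
    }
    return row;
}
```

All three outputs have the same length as the input, which is consistent with the calculateShape() example reusing the input string size (len) for every output column.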

Registering Generic Return Type UDTFs


When you register a generic UDTF using the CREATE [OR REPLACE] FUNCTION command, you use
the keyword ANY to declare the return table shape as generic. For a complete description of
the command, refer to CREATE [OR REPLACE] FUNCTION on page B-17.
An example command for the sample UDTF follows:
MYDB(MYUSER)=> CREATE OR REPLACE FUNCTION UcLcTc(VARARGS) RETURNS
TABLE(ANY) LANGUAGE CPP PARAMETER STYLE NPSGENERIC NOT FENCED API
VERSION 2 EXTERNAL CLASS NAME 'UcLcTc'
EXTERNAL HOST OBJECT '/home/usr/udtf/uclctc.o_x86'
EXTERNAL SPU OBJECT '/home/usr/udtf/uclctc.o_spu10';
CREATE FUNCTION

An example of the nzudxcompile registration command follows:


nzudxcompile --sig "UcLcTc()" --varargs --return "TABLE(ANY)" --class
"UcLcTc" --version 2 --unfenced --db test UcLcTc.cpp

Altering and Dropping a UDTF


After you register a UDTF, you can use the ALTER FUNCTION command to change the
function object files, return value, memory usage options, or logging level. You cannot
change the function name or argument type list; you must drop the existing function and
then create a new function with the new name and/or argument type list. For more information about the command, see ALTER FUNCTION on page B-6.
You can remove a UDTF using the DROP FUNCTION command. For more information about
the command, see DROP FUNCTION on page B-27. The Netezza system does not allow
you to drop UDFs which are referenced in any tables or views. You must resolve those
dependencies before you can drop the UDX, as described in Dependency Checks before
Dropping UDXs on page 6-13.
The dependency checks will not stop a DROP FUNCTION operation if the UDF was used in
a table or view that was created on an earlier Netezza release environment which was subsequently upgraded to 4.6. You would have to recreate the old view or modify the default
expression after the 4.6 upgrade to take advantage of the dependency checks.

Table Shaper API


The following sections describe how to use the table shaper methods to define the table
that will be returned by a UDTF. In addition to these table shaper methods, see the UDTF
Shaper Methods on page D-10 for a list of other methods that you can use for input
options and other actions on shaper objects.

calculateShape Method
Specifies the shape of the table returned by the UDTF.

Syntax
The method has the following syntax:
virtual void calculateShape(UdxOutputShaper *shaper)

Description
UDTFs that return a generic table shape (that is, which specify RETURNS TABLE(ANY)) must
include a calculateShape() method to define the shape and content of the return table.
The UdxOutputShaper object has methods that you can use to retrieve information about
the input to the table function as well as to set the shape of the output.

addOutputColumn Method
Defines an output column for the table.

Syntax
This method has the following syntax:
void addOutputColumn(int nType, const char* strName, int nSize);
void addOutputColumn(int nType, const char* strName, int precision,
int scale);
void addOutputColumn(int nType, const char* strName);

Description
The addOutputColumn method operates on the UdxOutputShaper object to build an output
column definition for the table using the specified input values. The version that you invoke
depends on the data type you are defining. For example, use the precision and scale variant
for numerics (but not doubles or floats), the size version for strings, and the other for all
other data types.

numOutputColumns Method
Returns the number of output columns for the table.

Syntax
This method has the following syntax:
int numOutputColumns();

Description
The numOutputColumns method operates on the UdxOutputShaper object and returns the
number of output columns specified for the table.

getOutputColumn Method
Returns the specified table column.

Syntax
This method has the following syntax:
const UdxColumnInfo* getOutputColumn(int n);

Description
The getOutputColumn method operates on the UdxOutputShaper object and returns the
specified table column. The n value specifies the column that you want to return.

isSystemCaseUpper Method
Verifies whether the Netezza database system case is in uppercase or lowercase.

Syntax
This method has the following syntax:
bool isSystemCaseUpper();

Description
The isSystemCaseUpper method operates on the UdxOutputShaper object and returns true
if the Netezza system case is in uppercase, or false if it is lowercase.

getType Method
Returns the datatype of the column.

Syntax
This method has the following syntax:
int getType();

Description
The getType method operates on the UdxColumnInfo object and returns the datatype of the
table column.

getSize Method
Returns the size of the column.

Syntax
This method has the following syntax:
int getSize();

Description
The getSize method operates on the UdxColumnInfo object. The method returns the length
of the string for a column that contains a string value.

getPrecision Method
Returns the precision value for a column.

Syntax
This method has the following syntax:
int getPrecision();

Description
The getPrecision method operates on the UdxColumnInfo object. For columns that contain
numeric data, the method returns the precision value.

getScale Method
Returns the scale value for a column.

Syntax
This method has the following syntax:
int getScale();

Description
The getScale method operates on the UdxColumnInfo object. For columns that contain
numeric data, the method returns the scale value.

getName Method
Returns the name of a column.

Syntax
This method has the following syntax:
const char* getName();

Description
The getName method operates on the UdxColumnInfo object. The method returns the
name of the column.

CHAPTER 4
Creating User-Defined Aggregates
What's in this chapter
Creating the C++ File for the UDA
Compiling the UDA
Registering the UDA with the Netezza System
Using the UDA in a SQL Query
Altering and Dropping UDAs

This chapter describes how to create a user-defined aggregate and register it on a Netezza
system. This chapter creates a simple aggregate called PenMax, which returns the second-greatest (second-largest) value encountered. If there is no second-greatest value, the
aggregate returns NULL.

Creating the C++ File for the UDA


To begin, use any text editor to create your C++ file. Your C++ file must include the
udxinc.h header file, which contains the required declarations for user-defined aggregates
and processing on the Netezza SPUs.
#include "udxinc.h"

In addition, make sure that you declare any of the standard C++ library header files that
your aggregate may require. If your UDA requires any user-defined shared libraries, make
sure you note the name of the libraries as you will need them when you register the UDA in
the database.
Note: User-defined shared libraries must exist in the database before you can register the
UDA and specify those libraries as dependencies. You could register the UDA without specifying any library dependencies, and after the libraries are added, use the ALTER
AGGREGATE command to update the UDA definition with the correct dependencies. For
more information about user-defined shared libraries, see Creating a User-Defined Shared
Library on page 5-1.
The UDX classes for API version 2 are defined in a namespace called nz::udx_ver2. (The
API version 1 UDXs use the nz::udx namespace.) Your C++ program must reference the correct namespace. For example:
#include "udxinc.h"
using namespace nz::udx_ver2;

Note: This chapter uses udx_ver2 as the default namespace for the examples that follow.
The sections note the differences with UDX version 1, and Appendix F, Sample User-Defined Functions and Aggregates Reference, contains examples of version 1 and version
2 definitions. You can continue to create UDX version 1 UDAs as well as new version 2
UDAs; both will operate on Release 6.0.x systems. However, version 1 UDAs also work
on Netezza Release 5.0.x and later systems, and thus may be more portable across your
Netezza systems.
To implement a UDA, you create a new class object derived from the Uda base class. Continuing the PenMax example:
#include "udxinc.h"
using namespace nz::udx_ver2;
class CPenMax: public nz::udx_ver2::Uda
{
public:
};

Each UDA must implement the following five methods in addition to its constructor and
destructor. An example of the class header for the PenMax UDA follows:
class CPenMax : public nz::udx_ver2::Uda
{
public:
    static nz::udx_ver2::Uda* instantiate(UdxInit *pInit);
    virtual void initializeState();
    virtual void accumulate();
    virtual void merge();
    virtual ReturnValue finalResult();
};
nz::udx_ver2::Uda* CPenMax::instantiate(UdxInit *pInit)
{
    return new CPenMax(pInit);
}

instantiate() is called by the runtime engine to create the object dynamically. The static
implementation must be outside of the class definition. In UDX version 2, the instantiate method takes one argument (UdxInit *pInit), which enables access to the memory
specification, the log setting, and the UDX environment in the constructor (see UDX
Environment on page 2-7). It creates a new object of the derived class type using the
new operator and returns it (as base class type Uda) to the runtime engine. The
runtime engine deletes the object when it is no longer needed. An example follows:
class CPenMax : public nz::udx_ver2::Uda
{
public:
    CPenMax(UdxInit *pInit) : Uda(pInit)
    {
    }
    static nz::udx_ver2::Uda* instantiate(UdxInit *pInit);
    virtual void initializeState();
    virtual void accumulate();
    virtual void merge();
    virtual ReturnValue finalResult();
};
nz::udx_ver2::Uda* CPenMax::instantiate(UdxInit *pInit)
{
    return new CPenMax(pInit);
}

initializeState() is called to allow the implementer to initialize the state used
in the UDA. The state of a UDA is one or more values, which must be valid Netezza
datatypes. The state is automatically preserved by the runtime engine between snippets, if necessary. To calculate the penultimate maximum, the function must keep track
of the largest two numbers in state variables. initializeState() sets both variables to
NULL. The states are declared in the CREATE AGGREGATE command, which is
described later. An example follows:
void CPenMax::initializeState()
{
    setStateNull(0, true); // set current max to null
    setStateNull(1, true); // set current penmax to null
}

accumulate() is called once per row and adds the contribution of its arguments to the
aggregate's accumulator state. It updates the states to keep the highest two values in
the correct states. In addition to getting the argument through int curVal =
int32Arg(0);, the method retrieves the two state variables using the int32State(int) and
isStateNull(int) functions. The accumulate method updates the states as required.
void CPenMax::accumulate()
{
    int *pCurMax = int32State(0);
    bool curMaxNull = isStateNull(0);
    int *pCurPenMax = int32State(1);
    bool curPenMaxNull = isStateNull(1);
    int curVal = int32Arg(0);
    bool curValNull = isArgNull(0);
    if ( !curValNull ) { // do nothing if argument is null - can't
                         // affect max or penmax
        if ( curMaxNull ) { // if current max is null, this arg
                            // becomes current max
            setStateNull(0, false); // current max no longer null
            *pCurMax = curVal;
        } else {
            if ( curVal > *pCurMax ) { // if arg is new max
                setStateNull(1, false); // then prior current max
                                        // becomes current penmax
                *pCurPenMax = *pCurMax;
                *pCurMax = curVal; // and current max gets arg
            } else if ( curPenMaxNull || curVal > *pCurPenMax ) {
                // arg might be greater than current penmax
                setStateNull(1, false); // it is
                *pCurPenMax = curVal;
            }
        }
    }
}

merge() is called with arguments of a second set of state variables and merges this second state into its own state variables. This method is necessary because the Netezza
system is a parallel-processing architecture, and the aggregate states from all SPUs
will be sent to the host, where they will be consolidated into a single merged aggregation state. The merge() method merges two states, handling all null states
correctly. One of the states is passed in normally, as in accumulate(). The second state
is passed in as arguments, so you retrieve it with argument retrieval functions such as
int32Arg(int) and isArgNull(int). An example follows:
void CPenMax::merge()
{
    int *pCurMax = int32State(0);
    bool curMaxNull = isStateNull(0);
    int *pCurPenMax = int32State(1);
    bool curPenMaxNull = isStateNull(1);
    int nextMax = int32Arg(0);
    bool nextMaxNull = isArgNull(0);
    int nextPenMax = int32Arg(1);
    bool nextPenMaxNull = isArgNull(1);
    if ( !nextMaxNull ) { // if next max is null, then so is
                          // next penmax and we do nothing
        if ( curMaxNull ) {
            setStateNull(0, false); // current max was null,
                                    // so save next max
            *pCurMax = nextMax;
        } else {
            if ( nextMax > *pCurMax ) {
                setStateNull(1, false);
                // next max is greater than current, so save next
                *pCurPenMax = *pCurMax;
                // and make current penmax prior current max
                *pCurMax = nextMax;
            } else if ( curPenMaxNull || nextMax > *pCurPenMax ) {
                // next max may be greater than current penmax
                setStateNull(1, false); // it is
                *pCurPenMax = nextMax;
            }
        }
        if ( !nextPenMaxNull ) {
            if ( isStateNull(1) ) {
                // can't rely on curPenMaxNull here, might have
                // changed state var null flag above
                setStateNull(1, false); // first non-null penmax,
                                        // save it
                *pCurPenMax = nextPenMax;
            } else {
                if ( nextPenMax > *pCurPenMax ) {
                    *pCurPenMax = nextPenMax;
                    // next penmax greater than current, save it
                }
            }
        }
    }
}

finalResult() returns the final aggregation value from the accumulated state. A simple
example might be a UDA implementation of an average aggregation, where the finalResult() method divides the sum by the count to produce an average. In this example, the
finalResult() method gathers one of the states and returns it using the NZ_UDX_
RETURN_INT32 macro in a similar fashion to evaluate() in the UDF case.
ReturnValue CPenMax::finalResult()
{
    int curPenMax = int32Arg(1);
    bool curPenMaxNull = isArgNull(1);
    if ( curPenMaxNull )
        NZ_UDX_RETURN_NULL();
    setReturnNull(false);
    NZ_UDX_RETURN_INT32(curPenMax);
}

The NZ_UDX_RETURN_INT32 macro helps to confirm that the return value is of the
expected type. For a list of the available return macros, refer to UDX Return Value Macros on page D-8. The finalResult method can access all of the datatype helper API calls,
as well as a list of state arguments that are listed in UDA State Arguments on page D-8.
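
To check the accumulate and merge logic without a Netezza system, the state handling can be modeled in plain C++. The following sketch is an assumption-laden model, not UDX code: PenMaxState, accumulate, merge, and accumulateAll are hypothetical free-standing names that mirror the two nullable INT4 state variables and the branches of the CPenMax methods shown above. Null input arguments are not modeled.

```cpp
#include <vector>

// Model of the two nullable INT4 state variables (max and penmax).
struct PenMaxState {
    bool maxNull = true;
    bool penMaxNull = true;
    int max = 0;
    int penMax = 0;
};

// Mirrors CPenMax::accumulate() for a non-null argument.
void accumulate(PenMaxState& s, int value) {
    if (s.maxNull) {                       // first value becomes the max
        s.maxNull = false;
        s.max = value;
    } else if (value > s.max) {            // new max; old max becomes penmax
        s.penMaxNull = false;
        s.penMax = s.max;
        s.max = value;
    } else if (s.penMaxNull || value > s.penMax) {
        s.penMaxNull = false;              // value is the new penmax
        s.penMax = value;
    }
}

// Mirrors CPenMax::merge(): folds a second partial state into s.
void merge(PenMaxState& s, const PenMaxState& other) {
    if (other.maxNull) return;             // other state saw no rows
    accumulate(s, other.max);              // same branches as the first half of merge()
    if (!other.penMaxNull &&
        (s.penMaxNull || other.penMax > s.penMax)) {
        s.penMaxNull = false;
        s.penMax = other.penMax;
    }
}

// Builds the partial state for the rows of one (simulated) dataslice.
PenMaxState accumulateAll(const std::vector<int>& values) {
    PenMaxState s;
    for (int v : values) accumulate(s, v);
    return s;
}
```

For example, splitting the values 2, 4, 6, 8, 10, and 12 across two simulated dataslices as {2, 8, 12} and {4, 6, 10} and merging the two partial states leaves a maximum of 12 and a penultimate maximum of 10, the same result as a single pass over all six values. A state built from a single value keeps a NULL penultimate maximum, which corresponds to the aggregate returning NULL.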

Compiling the UDA


After you create and debug your C++ file for your new UDA, you need to compile the C++
file. The compilation process creates the object files that will run on the Netezza host as
well as on the Netezza SPUs.

To compile the PenMax C++ file and create the API version 2 object files:
nzudxcompile penmax.cpp

The nzudxcompile command creates the following object files:

penmax.o_x86 is the object file for the Netezza host (i386 Linux platform on x86).

penmax.o_spu10 is the object file for the Linux-based Rev10 SPUs on IBM Netezza
1000 and Netezza 100 models.

For a complete description of the nzudxcompile command and its options and usage, refer
to nzudxcompile Command Syntax on page 6-18.
After you create the compiled object files, you must register the UDA with the Netezza system so that query writers can include the UDA in their queries. The next section,
Registering the UDA with the Netezza System, describes how to register the UDA.
Optionally, you can also compile and register the UDA in one step using the nzudxcompile
command. This example also shows that you must include the --version 2 syntax when
you are using the command to compile and register an API version 2 UDA.

To compile the PenMax C++ file and also register it in a sample database called mydb:
nzudxcompile /home/nz/udx_files/PenMax.cpp -o PenMax.o
--sig "PenMax(int4)" --version 2 --return INT4 --class CPenMax
--state "(int4, int4)" --user myuser --pw password --db mydb

Registering the UDA with the Netezza System


If you choose to compile but not register your UDAs with the nzudxcompile command, you
must register the UDA using the Netezza SQL command CREATE AGGREGATE. For a complete description, refer to CREATE [OR REPLACE] AGGREGATE on page B-13.
Note: When you issue a CREATE AGGREGATE command, the database processes the HOST
OBJECT and the SPU OBJECT files as the user nz. The user nz must have read access to
the object files and read and execute access to every directory in the path from the root to
the object file.
For example, to register the sample aggregate penmax, use the following command:
CREATE AGGREGATE PENMAX(INT4) RETURNS INT4 STATE (INT4, INT4)
LANGUAGE CPP PARAMETER STYLE NPSGENERIC API VERSION 2
EXTERNAL CLASS NAME 'CPenMax'
EXTERNAL HOST OBJECT '/home/nz/udx_files/penmax.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/penmax.o_spu10'

If the command is successful, it creates the aggregate in the default database. The UDA
will be owned by the user account that issues the SQL command. To create an aggregate,
your user account must have Create Aggregate permission, or you must be logged in as the
admin user.
Note: Each user-defined aggregate must also have a unique signature. For a description of
signatures and how the Netezza system processes them, see Function and Aggregate Signatures on page 2-6.

Using the UDA in a SQL Query


After you register a UDA, you and other permitted users can call it in the same manner as
the Netezza SQL aggregates. To use a UDA, users must have Execute permission for
AGGREGATE objects or for the specific UDA.
Note: By default, the admin user account has execute access to all user-defined aggregates. The user who registered a UDA also has execute access to that UDA. Other users can
be given permission to run specific or all UDAs.
For the sample PenMax aggregate, first create a sample table that contains the data to be
processed by the UDA. For example:
CREATE TABLE myints (a int, b int);
INSERT INTO myints VALUES (1,2);
INSERT INTO myints VALUES (1,4);
INSERT INTO myints VALUES (1,6);
INSERT INTO myints VALUES (2,8);
INSERT INTO myints VALUES (2,10);
INSERT INTO myints VALUES (2,12);

Then you can run the sample PenMax aggregate, as follows:


SELECT penmax(b) FROM myints;

Sample output follows:


 PENMAX
--------
     10

Another example follows:


SELECT a,penmax(b) FROM myints GROUP BY a;

Sample output follows:


 A | PENMAX
---+--------
 1 |      4
 2 |     10

Altering and Dropping UDAs


After you register a UDA, you can use the ALTER AGGREGATE command to change the
aggregate object files, state, return value, or logging level. You cannot change the aggregate
name or argument type list; you must drop the existing aggregate and then create a new
aggregate with the new name and/or argument type list. For more information about the
command, see ALTER AGGREGATE on page B-2.
You can remove a UDA using the DROP AGGREGATE command. For more information
about the command, see DROP AGGREGATE on page B-25. The Netezza system does
not allow you to drop UDAs which are referenced in any views. You must resolve those
dependencies before you can drop the UDA, as described in Dependency Checks before
Dropping UDXs on page 6-13.
Note: The dependency checks will not stop a DROP AGGREGATE operation if the UDA was
used in a view that was created on an earlier Netezza release environment which was subsequently upgraded to 4.6. You would have to recreate the old view after the 4.6 upgrade to
take advantage of the dependency checks.

CHAPTER 5
Creating User-Defined Shared Libraries
What's in this chapter
Creating a User-Defined Shared Library
Library Loading Options
Compiling and Linking the Shared Library
Registering the Shared Library in a Database
Using the Shared Library with a UDX
Altering and Dropping Shared Libraries
Clearing Dependencies

This chapter describes the steps to create a user-defined shared library and to register it for
use on a Netezza system.

Creating a User-Defined Shared Library


To help make UDX code more efficient and easier to maintain, or to help make your UDXs
aware of specific processing algorithms that you want to use, you can use shared libraries
to define that code. In the Netezza environment, user-defined shared libraries are objects
in the database. You compile and register them just as you do a UDF or a UDA. The shared
library objects are saved on the Netezza host.
Note: You must compile and register user-defined shared libraries before you register other
UDFs, UDAs, or user-defined shared libraries that depend on them. A UDX object cannot
have more than 64 direct dependencies.
In addition, make sure that you include any of the standard C++ library header files that
your user-defined shared library may require. If your shared library depends on other user-defined shared libraries, be sure to note those dependencies, because you must specify them
when you register this user-defined shared library.
The process to define a shared library for your UDXs consists of the following steps:
1. Create/obtain the C++ shared library. Make sure that it is debugged and ready to use.
2. Compile the shared library.
3. Link the shared library.
4. Register the library as an object in the Netezza database.
The following sections describe these process steps in more detail.

Library Loading Options


User-defined shared libraries support two loading methods, automatic and manual load:

An automatic load library is automatically loaded into the system and added to the global space. At snippet execution time, the system ensures that automatic load libraries
are automatically opened, and library symbols are available for use. The library is automatically closed after the snippet finishes. Automatic load is the default method for
user-defined shared libraries.

Manual load means that a user-defined shared library is directly managed by a UDX.
The UDX must use the dlopen(), dlsym(), and dlclose() functions to load the library, reference symbols, and to close the library when finished. UDXs are responsible for
opening and closing the manual load libraries when they are needed.
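
The manual load sequence follows the standard POSIX dynamic loading pattern. The following C++ sketch shows only that pattern; it is not taken from the Netezza API. The helper name callFromLibrary, the assumption that the library exports a function taking and returning a double, and the way the library path is obtained are all hypothetical and would differ in a real UDX.

```cpp
#include <dlfcn.h>
#include <string>

// Assumed signature of the exported function, for illustration only.
typedef double (*UnaryFn)(double);

// Opens the library at libPath, looks up symbol, calls it with arg, and
// closes the library again. Returns true and stores the function's return
// value in *result on success; returns false and fills *error on failure.
bool callFromLibrary(const char* libPath, const char* symbol,
                     double arg, double* result, std::string* error) {
    void* handle = dlopen(libPath, RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        const char* msg = dlerror();
        if (error) *error = msg ? msg : "dlopen failed";
        return false;
    }
    dlerror();                              // clear any stale error state
    void* sym = dlsym(handle, symbol);
    const char* msg = dlerror();
    if (msg != nullptr || sym == nullptr) {
        if (error) *error = msg ? msg : "symbol not found";
        dlclose(handle);                    // always release the handle
        return false;
    }
    UnaryFn fn = reinterpret_cast<UnaryFn>(sym);
    *result = fn(arg);
    dlclose(handle);                        // close the library when finished
    return true;
}
```

In practice a UDX would more likely open the library once, cache the handle and function pointers for the lifetime of the object, and call dlclose() when the object is destroyed, rather than reopening the library for every call as this short sketch does.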

If you create a shared library that has dependencies on other user-defined shared libraries,
you should define the top-level library as AUTOMATIC LOAD. The subsequent or referenced
libraries should also be AUTOMATIC LOAD.

Compiling and Linking the Shared Library


After you create the C++ file for your new user-defined shared library, compile and link it
using the nzudxcompile command so that the UDXs that reference the library can use it. The command is located in the /nz/kit/bin/adm directory. The process creates the objects that will run on the Netezza host as
well as on the Netezza SPUs.
Note: Unlike other UDXs, you cannot compile and register a shared library in one step with
the nzudxcompile command. You must register the library using CREATE LIBRARY.
To compile and link a shared library named mylib:
1. Create a compiled object for the Netezza host environment:
nzudxcompile /home/nz/libs/mylib.cpp --host
-o /home/nz/libs/mylib.o

2. Link the compiled object into a shared library for the host:
nzudxcompile --objs /home/nz/libs/mylib.o --host -o
/home/nz/libs/host/mylib.so

3. Create a compiled object for the Netezza SPU environment:


nzudxcompile /home/nz/libs/mylib.cpp --spu -o
/home/nz/libs/mylib.o

4. Link the compiled object into a shared library for the SPU:
nzudxcompile --objs /home/nz/libs/mylib.o --dynamic --spu
-o /home/nz/libs/spu/mylib.so

Note: The --dynamic switch is used only when compiling shared libraries for the SPU
environment.

The nzudxcompile command creates the following object files:

mylib.o_x86 is the object file for the Netezza host (i386 Linux platform on x86).

mylib.o_spu10 is the object file for the Linux-based Rev10 SPUs on IBM Netezza
1000 and Netezza 100 models.

For a complete description of the nzudxcompile command and its options and usage, refer
to nzudxcompile Command Syntax on page 6-18.
After you create the compiled object files, you must register the library with the Netezza
system so that other UDXs can specify it as a dependency. The next section, Registering
the Shared Library in a Database, describes how to register the library.

Registering the Shared Library in a Database


After you create the compiled objects for the host and SPU environment, connect to the
SQL database and use the CREATE LIBRARY command to register the library in a Netezza
database.
Note: When you issue a CREATE LIBRARY command, the database processes the HOST
OBJECT and the SPU OBJECT files as the user nz. The user nz must have read access to
the object files and read and execute access to every directory in the path from the root to
the object file.
A sample command follows:
MYDB(USER)=> CREATE OR REPLACE LIBRARY myudxlib AUTOMATIC LOAD
EXTERNAL HOST OBJECT '/home/nz/libs/host/mylib.so'
EXTERNAL SPU OBJECT '/home/nz/libs/spu/mylib.so';
CREATE LIBRARY

If the command is successful, it creates the user-defined shared library in the default database. The library will be owned by the user account that issues the SQL command. To
create a library, your user account must have Create Library permission, or you must be
logged in as the admin user.

Using the Shared Library with a UDX


When you create a UDX that depends on a shared library, you must specify the shared
library as a dependency of the UDX. A sample command follows:
MYDB(USER)=> CREATE OR REPLACE FUNCTION appendfile(INT4)
RETURNS INT4 LANGUAGE CPP PARAMETER STYLE NPSGENERIC
CALLED ON NULL INPUT NOT DETERMINISTIC
EXTERNAL CLASS NAME 'Append' DEPENDENCIES myudxlib
EXTERNAL HOST OBJECT '/home/nz/udx/append.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx/append.o_spu10';
CREATE FUNCTION

Altering and Dropping Shared Libraries


After you register a user-defined library, you can use the ALTER LIBRARY command to
change the library object files, loading option, dependencies, or owner. You cannot change
the library name; you must drop the existing library and then create a new library with the
new name. For more information about the command, see ALTER LIBRARY on
page B-11.
You can remove a user-defined shared library using the DROP LIBRARY command. For
more information about the command, see DROP LIBRARY on page B-28. You cannot
drop libraries which are referenced in any existing UDXs. You must resolve those dependencies before you can drop the library, as described in Dependency Checks before Dropping
UDXs on page 6-13.

Clearing Dependencies
If you create a UDX and declare dependencies for it, you can remove the dependencies
using the NO DEPENDENCIES option of the ALTER FUNCTION|AGGREGATE|LIBRARY commands or the CREATE [OR REPLACE] FUNCTION|AGGREGATE|LIBRARY commands. The
NO DEPENDENCIES option is the default for these commands. It indicates that the UDX
does not have any dependencies, and if the UDX object already exists, it clears any previous dependencies for the object.
For example, to clear the dependencies for the sample UDF myfunc:
MYDB(MYUSER)=> CREATE OR REPLACE FUNCTION myfunc(int)
RETURNS int4 LANGUAGE CPP PARAMETER STYLE NPSGENERIC
EXTERNAL CLASS NAME 'CMyFunc' NO DEPENDENCIES
EXTERNAL HOST OBJECT '/home/nz/udx_files/myfunc.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/myfunc.o_spu';

CHAPTER 6
Common UDX Development Topics
What's in this chapter
Managing User Account Permissions
Documenting a UDX
UDX Development Best Practices
nzudxcompile Command Syntax
Migrating UDXs from API Version 1 to API Version 2

This chapter describes some UDX development topics and best practices for the Netezza
system. These topics generally apply to all types of UDXs, such as functions, aggregates,
and shared libraries.

Managing User Account Permissions


Before you create your UDXs, make sure that you familiarize yourself with the required
account permissions necessary to create and manage these objects. The Netezza admin
user has permission to manage and execute UDXs. In addition, the owner (that is, the user
who created the UDF, UDA, or user-defined shared library) has permission to manage and
execute the objects that he or she creates.
As the admin or any user who has account management permissions, you can grant other
users permission to register, manage, or execute the user-defined functions and aggregates
on a Netezza system. You can assign permissions using the NzAdmin user interface, or by
using the GRANT and REVOKE Netezza SQL commands. For details on managing users
and groups, as well as assigning permissions using the NzAdmin interface, refer to the IBM
Netezza System Administrator's Guide.
If you use Netezza SQL commands to manage account permissions, note some special formats and syntax for the commands. In the following examples, the term entity represents
either a user or a user group. The term object represents the word FUNCTION (for all
UDFs), AGGREGATE (for all UDAs), or LIBRARY (for all user-defined shared libraries). The
term object could also be a specific user-defined shared library, or a specific UDF or UDA if
it is specified as a full signature name (argument type list). Make sure that you specify the
correct signature including the sizes for numeric and string datatypes, otherwise you will
receive an error similar to the following:
Error: GrantAggregate: existing UDX name(argument type list) differs
in size of string/numeric arguments

Granting Create Permission


To grant Create administration permission for UDXs:

GRANT CREATE FUNCTION TO entity;

GRANT CREATE AGGREGATE TO entity;

GRANT CREATE LIBRARY TO entity;

For example, the following command grants Create Function permissions to the user
myuser:
GRANT CREATE FUNCTION TO myuser;

Granting All Permissions


To grant a user or a group all object permissions (alter, drop, execute, and list) for
UDXs:

GRANT ALL ON FUNCTION TO entity;

GRANT ALL ON AGGREGATE TO entity;

GRANT ALL ON LIBRARY TO entity;

For example, the following command grants all permissions for aggregate objects to the
group analysts:
GRANT ALL ON AGGREGATE TO analysts;

Revoking Create Permission


To revoke Create administration permission:

REVOKE CREATE FUNCTION FROM entity;

REVOKE CREATE AGGREGATE FROM entity;

REVOKE CREATE LIBRARY FROM entity;

For example, the following command revokes Create Library permissions from the group
analysts:
REVOKE CREATE LIBRARY FROM GROUP analysts;

Managing Alter Permission


To grant or revoke Alter permissions on an object:

GRANT ALTER ON object TO entity;

REVOKE ALTER ON object FROM entity;

Always specify a complete signature for the object value. For example, to grant Alter permissions for the sample function CustomerName (described later in this chapter) to the
user myuser:
GRANT ALTER ON CustomerName(varchar(64000)) TO myuser;

To grant Alter permission on all aggregates to the user newuser:


GRANT ALTER ON AGGREGATE TO newuser;

To revoke Alter permissions on the CustomerName function from the group sales:
REVOKE ALTER ON CustomerName(varchar(64000)) FROM GROUP sales;

Managing Execute Permission


To grant or revoke Execute permission on an object:

GRANT EXECUTE ON object TO entity;

REVOKE EXECUTE ON object FROM entity;

Always use a complete signature for the object. For example, to grant Execute permissions
for the sample function CustomerName to the user myuser, you can use the following
command:
GRANT EXECUTE ON CustomerName(varchar(64000)) TO myuser;

To grant Execute permission on all functions to the user newuser:


GRANT EXECUTE ON FUNCTION TO newuser;

To grant Execute permission on the library mylib to the user newuser:


GRANT EXECUTE ON mylib TO newuser;

To revoke Execute permissions for the sample aggregate PenMax (described later in this
chapter) from the group sales:
REVOKE EXECUTE ON PenMax(int4) FROM GROUP sales;

Managing Drop Permission


To grant or revoke Drop permissions on an object:

GRANT DROP ON object TO entity;

REVOKE DROP ON object FROM entity;

Always specify a complete signature for the object value. For example, to grant Drop permissions for the sample function CustomerName (described later in this chapter) to the
user newuser:
GRANT DROP ON CustomerName(varchar(64000)) TO newuser;

To grant Drop permission on all aggregates to the user newuser:


GRANT DROP ON AGGREGATE TO newuser;

To revoke Drop permissions on the CustomerName function from the user myuser:
REVOKE DROP ON CustomerName(varchar(64000)) FROM myuser;

Managing Unfence Permission


A user who has Unfence privileges can create or alter a UDF, UDA, or UDTF to run in
unfenced mode. To grant or revoke Unfence permissions to users or groups:

GRANT UNFENCE TO entity;

REVOKE UNFENCE FROM entity;

Documenting a UDX
As a best practice, Netezza recommends that you follow a documentation and comments
convention for each UDX. One method is to include all the necessary DDL commands and
sample usage at the top of the C++ file as a comment. This can help you to reconstruct the
purpose, compilation code, Netezza registration arguments, and any additional information
such as table format for your functions.
The following example shows a complete listing of the customername.cpp file with both the
code and documentation comments at the beginning of the file.
/*
Copyright (c) 2007-2009 Netezza Corporation, an IBM Company
All rights reserved.
Function CustomerName takes a string and returns an integer 1 if it
begins with 'Customer A' and 0 otherwise.
REGISTRATION:
create or replace function
CustomerName(varchar(64000))
returns INT4
language cpp
parameter style npsgeneric
EXTERNAL CLASS NAME 'CCustomerName'
EXTERNAL HOST OBJECT '/home/nz/udx_files/customername.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/customername.o_spu10';
USAGE:
create table customers (a int, b varchar(200));
insert into customers values (1,'Customer A');
insert into customers values (2,'Customer B');
insert into customers values (3,'Customer CBA');
insert into customers values (4,'Customer ABC');
select * from customers where CustomerName(b) = 1;
*/
#include "udxinc.h"
#include <string.h>

using namespace nz::udx;

class CCustomerName: public Udf
{
public:
    static Udf* instantiate();

    virtual ReturnValue evaluate()
    {
        StringArg *str;
        str = stringArg(0);
        int lengths = str->length;
        char *datas = str->data;
        int32 retval = 0;
        if (lengths >= 10)
            if (memcmp("Customer A",datas,10) == 0)
                retval = 1;
        NZ_UDX_RETURN_INT32(retval);
    }
};

Udf* CCustomerName::instantiate()
{
    return new CCustomerName;
}
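
The prefix test inside evaluate() can be exercised on a development workstation before you compile and register the UDF. The following is a minimal sketch, assuming a hypothetical free function named beginsWithCustomerA that is not part of the Netezza UDX API; it takes the same data pointer and length that the StringArg supplies and returns the value that the UDF would return.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper (not part of the Netezza UDX API): mirrors the
// comparison performed in CCustomerName::evaluate().
// `data` does not need to be NUL-terminated; `length` is the number of
// valid bytes, as with StringArg::data and StringArg::length.
int beginsWithCustomerA(const char* data, int length)
{
    static const char kPrefix[] = "Customer A";
    const int kPrefixLen = static_cast<int>(sizeof(kPrefix) - 1);  // 10 bytes

    assert(data != nullptr || length == 0);

    // Check the length first so that memcmp never reads past the input.
    if (length >= kPrefixLen && std::memcmp(kPrefix, data, kPrefixLen) == 0)
        return 1;
    return 0;
}
```

With the sample rows from the USAGE comment, the helper returns 1 for 'Customer A' and 'Customer ABC' and 0 for 'Customer B' and 'Customer CBA', which matches the rows that the sample SELECT is expected to return.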

Adding Comments for a UDX


You can comment on UDXs using the Netezza SQL COMMENT ON command capability. For
example:
COMMENT ON FUNCTION <function name> (<argument type list>) IS 'text'
COMMENT ON AGGREGATE <aggregate name> (<argument type list>) IS 'text'
COMMENT ON LIBRARY <library name> IS 'text'

A Netezza SQL query user can display these comments using the nzsql \dd <name> command switch, or the \dd switch, which shows all comments for all functions.
As a best practice, you should create comments for all UDXs, including information about
the author, version, and description, in a format similar to the following:
COMMENT ON FUNCTION <function name> (<argument type list>) IS
'Author: <name>
Version: <version>
Description: <description>';

For example:
COMMENT ON FUNCTION CustomerName(varchar(64000)) IS 'Author: name
Version: 1.0 Description: Sample UDF from Dev Guide';

To comment on a UDX, you must be the Netezza admin user or the owner of the UDX,
or you must have COMMENT permissions for the objects. For more information about COMMENT ON, refer to the IBM Netezza Database User's Guide.
For UDFs and UDAs, make sure that you specify a full signature name (argument type list),
including correct sizes for numeric and string datatypes. Otherwise you will receive an error
similar to the following:
Error: CommentAggregate: existing UDX name(argument type list)
differs in size of string/numeric arguments

Netezza SQL Command Help


There are several commands that you can use to display information about functions and
aggregates.

For functions, you can use the SHOW FUNCTION SQL command or the \df nzsql command argument.

For aggregates, you can use the SHOW AGGREGATE SQL command or the \da nzsql
command argument.

For shared libraries, you can use the SHOW LIBRARY SQL command or the \dl nzsql
command argument.

The SQL commands and the nzsql switches provide the same output information; they
show information for all functions or aggregates, that is, the standard Netezza SQL built-in functions and aggregates as well as the UDFs and UDAs in the database.

A sample \df command and its output follows. Note that the output has been truncated for
easier viewing. (The output is identical for the SHOW FUNCTION command.)
MYDB(MYUSER)=> \df
                         List of functions
      RESULT      |   FUNCTION   | BUILTIN |          ARGUMENTS
------------------+--------------+---------+-----------------------------
 BIGINT           | ABS          | t       | (BIGINT)
 DOUBLE PRECISION | ABS          | t       | (DOUBLE PRECISION)
 INTEGER          | ABS          | t       | (INTEGER)
 DOUBLE PRECISION | COS          | t       | (DOUBLE PRECISION)
 DOUBLE PRECISION | COT          | t       | (DOUBLE PRECISION)
 INTEGER          | CUSTOMERNAME | f       | (CHARACTER VARYING (64000))

In the output above, note that the UDF customername shows f (false) for the BUILTIN
value. Standard functions and aggregates that are supplied with Netezza SQL by default
are called built-ins, and they show a value of t (true).
Use the plus sign switch (\df+, \da+, and \dl+) or the VERBOSE option for the SQL commands to obtain verbose output, which will include the comments specified for them if you
follow the best practice described in the previous section.

UDX Development Best Practices


The following sections describe some common best practices that can help you to create
more robust and trouble-free user-defined functions and aggregates.

Cross-Database Access to UDXs


UDFs, UDAs, and user-defined shared libraries reside in specific databases on the Netezza
system. Like tables, views, and other objects, UDXs also support cross-database access, so
you can use fully qualified names and synonyms to access UDXs that reside in a database
other than the one that you are currently logged in to for your queries.
The syntax for fully-qualified names is the same as the syntax used for cross-database
access to tables or other objects. For example:
SYSTEM(MYUSER)=> SELECT dev.admin.cube(id), id from dev.admin.emp;

To create a synonym in the current database for a UDX that resides in a different database
on the same Netezza system, follow the same steps as for a Netezza table, view, or object.
For example:
CREATE SYNONYM <name> FOR <object name>

The <object name> can be a local UDX or a fully qualified name for a UDX in a different
database. Query users can then invoke the UDX using its synonym <name>.
Also note that UDFs and UDAs can be used in a view, just as built-in functions can. If you
use them in a view, note that permission-checking will be based on the view permissions,
and the Execute object permissions are bypassed. For a complete description of how to create and use synonyms, as well as for the use of functions in views, refer to the IBM Netezza
Database User's Guide.

Avoiding UDX Name Collisions with Built-In Functions


When you create a UDX, check the existing list of built-in (or default) functions to ensure
that you choose unique names for your UDXs. Do not rely on case-sensitive letters alone to
distinguish the names of your UDXs, as problems can result if the Netezza administrator
changes the system case. By default, the Netezza system converts all identifiers, such as
database, table, and column names, to the default system case, which is uppercase. If you
want to use mixed-case names for UDX names, you can place the name in double quotes
when you register the UDX.
So, for example, there is a built-in function named TRUNC. You could create a UDF named
trunc using a command such as (a partial command follows for this example):
CREATE FUNCTION "trunc"(numeric(ANY)) RETURNS numeric(ANY) ...

Although the TRUNC and trunc functions are different functions in the database, this generally leads to user confusion over the two similarly named functions. Both functions
appear to be the same function, but they could have very different definitions and purposes. Also, if the system administrator should ever change the system case to lowercase,
an identifier name collision would result.
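
The effect of the system case on quoted and unquoted names can be illustrated with a small model. The following is a simplified sketch only, not the Netezza parser; the function storedName and its parameters are hypothetical names chosen for this illustration. It assumes that unquoted identifiers are folded to the system case and that double-quoted identifiers keep their spelling, as described above.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Simplified, hypothetical model of identifier handling:
//  - an unquoted name is folded to the system case
//  - a double-quoted name keeps its spelling
std::string storedName(const std::string& name, bool quoted, bool systemCaseUpper)
{
    assert(!name.empty());
    if (quoted)
        return name;

    std::string folded = name;
    for (char& c : folded) {
        unsigned char uc = static_cast<unsigned char>(c);
        c = static_cast<char>(systemCaseUpper ? std::toupper(uc) : std::tolower(uc));
    }
    return folded;
}
```

In this model, with an uppercase system case the built-in is stored as TRUNC and the quoted UDF as trunc, so the two names differ. With a lowercase system case both resolve to trunc, which is the collision that the text warns about.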

Specifying the Execution Locus for UDFs


Starting in Release 6.0, there are two new views that you can use to control the execution
locus and behavior of a UDX:

The _v_dual_dslice view returns the dataslice ID of each dataslice, and the view is evaluated at the dataslices themselves.

The _v_dual view returns one row and causes the UDX to be evaluated on the host.

In previous releases, it was a common practice for users to create a single-row table or a
multi-row table to control whether a UDX would be evaluated on one or all SPUs. You can
use these views with a UDTF as well, but the behavior of the UDTF is also controlled by the
execution locus options specified with the parallel or parallel-not-allowed syntax. That is, a
parallel-allowed UDTF will usually run on a SPU even if invoked with _v_dual, and a parallel-not-allowed UDTF will run on the host even if invoked with _v_dual_dslice. The
_v_dual_dslice view can be useful for a parallel UDTF so that it has at least one row on
each dataslice.

Avoiding UDX Linkage Symbol Collisions


As you develop your UDXs, carefully consider the name choices for the symbols within your
UDXs. If you use the same symbol names in different UDXs, a query that uses one of the
UDXs will run without any problems; however, if a query uses two or more UDXs that happen to share a common symbol name, the query could return linker errors for symbols with
multiple definitions. Common symbol names could result from an accidental reuse of the
same symbol name for different purposes, or you might have common code that performs
the same type of operation in different UDXs.

To avoid the linker errors for identically named (but different) symbols, declare functions as static, and use namespaces to help uniquely identify the symbols.

To avoid the linker errors for code which is reused among several UDXs, you can compile the code for the UDXs and the shared code into one object file using the
nzudxcompile command. You could also move the shared code into a user-defined
shared library and then have the UDXs all depend on that library.
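
The first technique can be sketched in a single file. The namespaces and function names below are hypothetical and stand in for helpers that would normally live in two separate UDX source files; the point is that both helpers are named normalize, yet neither is visible to the linker under a shared external name.

```cpp
#include <cassert>

// Stand-in for the helpers of one UDX source file.
namespace customer_udx {
namespace {                       // unnamed namespace: internal linkage
int normalize(int value) { return value < 0 ? 0 : value; }
}  // unnamed namespace

int clampScore(int value) { return normalize(value); }
}  // namespace customer_udx

// Stand-in for the helpers of a second UDX source file.
namespace order_udx {
static int normalize(int value)   // `static`: also internal linkage
{
    return value > 100 ? 100 : value;
}

int capQuantity(int value) { return normalize(value); }
}  // namespace order_udx
```

Either mechanism on its own is usually enough for a private helper. The named namespace additionally protects the symbols that must keep external linkage, such as clampScore and capQuantity in this sketch.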

Avoiding Record Size Exceeded Errors


When you register a UDX using the CREATE [OR REPLACE] command, Netezza sums the
argument list and return type sizes. If the sum of the argument and return type sizes in the
signature of a UDF (or the argument signature, return type, and state size of a UDA) is
greater than the maximum row size of the Netezza system, the CREATE [OR REPLACE]
returns a record size exceeded error. The maximum row size is 65,535 bytes.
To avoid the record size exceeded error, you can use generic UDXs to help reduce the size
of the initial UDX definitions. For UDFs, you can use generic arguments and generic return
values. For UDAs, you can use generic arguments only.
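
The size check can be approximated before you register a UDX. The following is a rough sketch under stated assumptions: the function exceedsRecordSize is hypothetical, and it sums only the declared maximum sizes (for example, 64000 for varchar(64000) and 4 for INT4). Any per-type overhead that the Netezza system adds is not modeled, so treat a result close to the limit with caution.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Maximum row size stated in this guide, in bytes.
const std::size_t kMaxRowSizeBytes = 65535;

// Sums the declared byte sizes of the arguments and the return type (or,
// for a UDA, the arguments, return type, and state) and reports whether
// the total is greater than the maximum row size.
// Assumption: per-type storage overhead is ignored.
bool exceedsRecordSize(const std::vector<std::size_t>& argSizes,
                       std::size_t returnSize)
{
    std::size_t total =
        std::accumulate(argSizes.begin(), argSizes.end(), returnSize);
    return total > kMaxRowSizeBytes;
}
```

For example, the sample CustomerName(varchar(64000)) function that returns INT4 sums to 64,004 bytes in this model, which is under the limit. A function that takes varchar(64000) and also returns varchar(64000) sums to 128,000 bytes, which is over the limit.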

Managing Dynamic Memory


If you use the C++ new operator to allocate dynamic memory for variables, make sure that
you call the delete operator to release the memory as part of your program cleanup routines. (This practice also applies if you use the malloc/calloc functions; make sure that
you call the free function to release the memory.) If you do not release the dynamic memory, the Netezza system could experience performance degradations because swap/memory
is consumed but not released by the UDXs.
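
One way to make the release automatic is to let an owning object free the memory when it goes out of scope. The following is a minimal sketch, assuming a compiler with C++11 support, which might not match the toolchain that nzudxcompile uses on your system. The functions countSpaces and sumValues and the Accumulator type are hypothetical and are not part of the UDX API.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical scratch-buffer helper: std::vector owns the allocation and
// frees it when `buffer` goes out of scope, including on an early return.
std::size_t countSpaces(const char* data, std::size_t length)
{
    assert(data != nullptr || length == 0);
    std::vector<char> buffer(data, data + length);  // heap copy, freed automatically
    std::size_t spaces = 0;
    for (char c : buffer)
        if (c == ' ')
            ++spaces;
    return spaces;
}

// Hypothetical per-call state held through std::unique_ptr.
struct Accumulator { long total = 0; };

long sumValues(const std::vector<int>& values)
{
    std::unique_ptr<Accumulator> acc(new Accumulator);  // deleted when acc is destroyed
    for (int v : values)
        acc->total += v;
    return acc->total;
}
```

If your toolchain does not support these library types, the same discipline applies with explicit delete or free calls on every exit path.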

Obtaining UDX System Information Programmatically


The user-defined functions API provides a set of functions that you can use to obtain some
types of information about the Netezza system and the UDX processing.

getCurrentLocus

getCurrentDatasliceId

getCurrentTransaction

getCurrentHardwareId

getCurrentUsername

getCurrentSessionId

getNumberDataslices

getNumberSpus

udxLibraryName

Note: If you build these functions into your UDFs and you later downgrade the Netezza
release to a 4.5 version, these functions will not exist in that version. You will need to
rewrite those C++ files to no longer use these functions, or users will encounter link errors
if they try to run them in the earlier environments.

getCurrentLocus
Detects the locus of execution for a UDF.
Description

The function has the following syntax:

extern "C" int getCurrentLocus();

Returns

A value that indicates whether the UDF is running in Postgres, DBOS, or on a
SPU. The valid values are UDX_LOCUS_POSTGRES (0), UDX_LOCUS_DBOS (1), or
UDX_LOCUS_SPU (2).
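
The returned value can be mapped to a label for log messages or used to branch on the execution locus. The following is a minimal sketch that compiles on its own: the SKETCH_LOCUS_* constants are hypothetical stand-ins that repeat the values listed above (in a real UDX you would use the UDX_LOCUS_* constants from the UDX headers), and the helper functions are not part of the API. The sketch also assumes that Postgres and DBOS are both host-side processes.

```cpp
#include <cassert>

// Values as listed in this guide, redefined under hypothetical names only
// so that the sketch compiles without the UDX headers.
enum SketchLocus {
    SKETCH_LOCUS_POSTGRES = 0,
    SKETCH_LOCUS_DBOS     = 1,
    SKETCH_LOCUS_SPU      = 2
};

// Maps a locus value to a label suitable for a log message.
const char* locusLabel(int locus)
{
    switch (locus) {
        case SKETCH_LOCUS_POSTGRES: return "host (Postgres)";
        case SKETCH_LOCUS_DBOS:     return "host (DBOS)";
        case SKETCH_LOCUS_SPU:      return "SPU";
        default:                    return "unknown";
    }
}

// True when the value denotes execution on the host rather than on a SPU.
// Assumption: Postgres and DBOS both run on the host.
bool runsOnHost(int locus)
{
    return locus == SKETCH_LOCUS_POSTGRES || locus == SKETCH_LOCUS_DBOS;
}
```

Inside a UDX you would pass the result of getCurrentLocus() to helpers of this kind, for example locusLabel(getCurrentLocus()).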

getCurrentDatasliceId
Returns the value of the dataslice ID on which the UDX is operating.
Description

The function has the following syntax:

extern "C" int getCurrentDatasliceId();

Returns

A dataslice ID when the UDF is running on a SPU, or 0 if the UDF is running
elsewhere such as the host.

getCurrentTransaction
Returns the current Netezza transaction ID.
Description

The function has the following syntax:

extern "C" int64 getCurrentTransaction();

Returns

A transaction ID value if one exists.

getCurrentHardwareId
Returns the value of the hardware ID on which the UDX is operating.
Description

The function has the following syntax:

extern "C" int getCurrentHardwareId();

Returns

A hardware ID when the UDF is running on a SPU, or 0 if the UDF is running
elsewhere such as the host.

getCurrentUsername
Returns the name of the user who is running the UDX.
Description

The function has the following syntax:

extern "C" int getCurrentUsername();

Returns

A database user account name.

getCurrentSessionId
Returns the current session ID value.
Description

The function has the following syntax:

extern "C" int getCurrentSessionId();

Returns

A session ID for the current session.

getNumberDataslices
Returns the number of dataslices.
Description

The function has the following syntax:

extern "C" int getNumberDataslices();

Returns

The number of dataslices on which the UDF is operating. This number could be
larger than the number of SPUs.

getNumberSpus
Returns the number of SPUs.
Description

The function has the following syntax:

extern "C" int getNumberSpus();

Returns

The number of SPUs on which the UDF is operating.

udxLibraryName
Given a library name as used in the DEPENDENCIES clause of the DDL, returns the actual
path on disk of the corresponding shared library. The function returns the appropriate file
pathname depending on the context (host vs SPU).
Description

The function has the following syntax:

const char* udxLibraryName(const char* name, bool caseSensitive);

Returns

The function returns the names of libraries on which the given snippet depends
due to its UDXs, including indirect (or nested) dependencies.
The caseSensitive flag allows you to specify a case-sensitive or case-insensitive lookup. In
most cases, a case-insensitive lookup works best, but if there are two libraries with the
same name but different cases, you need to use the case-sensitive flag to distinctly identify
the libraries.
If the library name is not found, the function returns NULL.
This function is primarily used for libraries that are registered as MANUAL LOAD. After the
pathname has been recovered, the user can use dlopen, dlsym, and dlclose as normal. In
the case of C++ libraries, the user is responsible for providing a mangled name to dlsym.
Additionally, some C++ functionality requires that dlopen be invoked with RTLD_GLOBAL
for run time type information (RTTI).
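
The dlopen, dlsym, and dlclose sequence described above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the helper callFromLibrary and the UnaryMathFn type are hypothetical, and the looked-up symbol is assumed to be an extern "C" function that takes and returns a double, so no mangled name is needed. In a UDX, the path argument would be the value returned by udxLibraryName.

```cpp
#include <cassert>
#include <dlfcn.h>

// Hypothetical signature of the symbol that this sketch looks up.
typedef double (*UnaryMathFn)(double);

// Opens the shared library at `path`, looks up `symbol`, calls it once with
// `input`, stores the value in *result, and closes the library.
// Returns false if the path is NULL or the library or symbol is not found.
bool callFromLibrary(const char* path, const char* symbol,
                     double input, double* result)
{
    assert(symbol != nullptr && result != nullptr);
    if (path == nullptr)          // library name was not found
        return false;

    // RTLD_GLOBAL is included because some C++ functionality requires it.
    void* handle = dlopen(path, RTLD_NOW | RTLD_GLOBAL);
    if (handle == nullptr)
        return false;

    dlerror();                    // clear any stale error state
    void* address = dlsym(handle, symbol);
    bool found = (address != nullptr);
    if (found) {
        UnaryMathFn fn = reinterpret_cast<UnaryMathFn>(address);
        *result = fn(input);
    }
    dlclose(handle);
    return found;
}
```

A library that is called many times per query would normally be opened once and kept open rather than reopened on every call; the open-call-close shape is used here only to keep the sketch short. Depending on the platform, the program might need to be linked with the dl library.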

Using C Runtime Library Functions


In Netezza releases before 5.0, user-defined functions and aggregates could use only a subset
of the C runtime library functions, as described in Table 6-1.
These restrictions resulted from the Nucleus operating system used in the Rev 7 SPU environments for those releases. In those releases, the Netezza Linux host supported a
slightly larger subset of LIBC functions, but you should take care to avoid using any functions outside the common subset listed in Table 6-1.
In Release 5.0 and later, SPUs use the Linux operating system, and therefore support the
full standard C Library (LIBC) functions.
As a best practice, do not use the LC_* locale variables. If you set the LC_* locale environment variables on the Netezza host, the locale-aware functions may not return similar
results when they run on the host and the SPUs. The Netezza host cannot communicate
LC_* variable values to the SPUs, and the SPUs cannot interpret the LC_* settings. Likewise, avoid locale-aware functions such as strftime, strcoll, and
similar string functions.

Table 6-1: Supported LIBC Functions (for Releases Before 5.0)

__asr64       __div64       __dtoll       __dtoull      __lsl64       __lsr64
__rem64       __udiv64      __urem64      _copysign     _copysignf    _ctype
_d_add        _d_div        _d_dtof       _d_dtoi       _d_dtoll      _d_dtou
_d_dtoull     _d_feq        _d_fgt        _d_fle        _d_flt        _d_fne
_d_itod       _d_lltod      _d_mul        _d_sub        _d_ulltod     _d_utod
_f_add        _f_div        _f_feq        _f_fge        _f_fgt        _f_fle
_f_flt        _f_fne        _f_ftod       _f_ftoi       _f_ftoll      _f_ftou
_f_ftoull     _f_itof       _f_lltof      _f_mul        _f_sub        _f_ulltof
_f_utof       _fp_round     _isinf        _isnan        _logb         _logbf
_nextafter    _nextafterf   _scalb        a64l          abs           acos
acosf         asctime       asctime_r     asin          asinf         atan
atan2         atan2f        atanf         atof          atoi          atol
bsearch       calloc        ceil          ceilf         clock         cos
cosf          cosh          coshf         d_fge         div           drand48
ecvt          erand48       erf           erfc          erfcf         erff
exp           expf          fabs          fabsf         fcvt          floor
floorf        fmod          fmodf         free          frexp         frexpf
gcvt          gettimeofday  gmtime        gmtime_r      hcreate       hdestroy
hsearch       hypot         hypotf        isalnum       isalpha       isascii
iscntrl       isdigit       isgraph       islower       isnan         isprint
ispunct       isspace       isupper       isxdigit      j0            j1
jn            jrand48       l64a          labs          lcong48       ldexp
ldexpf        ldiv          lfind         lgamma        lgammaf       localeconv
log           log10         log10f        logf          lrand48       lsearch
malloc        mblen         mbstowcs      memchr        memcmp        memcpy
memmove       memset        modf          modff         mrand48       nrand48
pow           powf          printf        qsort         rand          rand_r
realloc       rint          round         roundf        scalbln       scalblnf
scalbn        scalbnf       seed48        sin           sinf          sinh
sinhf         snprintf      sprintf       sqrt          sqrtf         srand
srand48       sscanf        strcasecmp    strcat        strchr        strcmp
strcoll       strcpy        strcspn       strdup        strerror      strlen
strncasecmp   strncat       strncmp       strncpy       strpbrk       strrchr
strspn        strstr        strtod        strtok        strtok_r      strtol
strtoul       strxfrm       swab          tan           tanf          tanh
tanhf         tdelete       tfind         time          times         toascii
tolower       toupper       trunc         truncf        tsearch       twalk
vprintf       vsnprintf     vsprintf      vsscanf       wcstombs      wctomb
y0            y1            yn

UDFs and UDAs can also allocate memory with the malloc/free functions or new/delete
operators. However, carefully consider the memory allocations and include
the memory as part of the MAXIMUM MEMORY argument for the function or aggregate.
Any function or aggregate that exceeds its MAXIMUM MEMORY setting could negatively
impact the system performance.

Netezza Query Optimization and UDX Calls


The Netezza system has many internal performance algorithms intended to make queries
as fast and as efficient as possible. These internal algorithms, in combination with the
UDX registration settings and query design, can result in some unexpected behaviors for
when, and how often, a UDX is invoked during a query. For example, when you review the
log messages or plan files for test queries, you might find that the UDX was not called, or
perhaps a UDF was called more or less often than you expected. The following situations can cause this behavior:

When you define a UDF as RETURNS NULL ON NULL INPUT, if the Netezza system
detects a NULL input value to the UDF, it skips the UDF and automatically returns a
NULL value.

When you define a UDF as DETERMINISTIC, the Netezza system may call the UDF
only once during statement preparation time rather than once for each row it operates
on during the query execution. This will only happen if the UDF takes all literal arguments or no arguments, or if the UDF RETURNS NULL ON NULL INPUT and it is given
at least one literal NULL as an argument.

If your query uses the same UDF more than once, and the UDF takes the same arguments and is DETERMINISTIC, the Netezza query algorithms could apply common
subexpression elimination (CSE) to improve the query performance. With CSE, the
Netezza system calls the function only once for a common result that it can apply to
the other uses of the function within the query.

The Netezza Just In Time (JIT) statistics process can also increase the number of UDX
invocations. JIT statistics runs very fast sample queries on the affected tables to assess
query performance. Thus, the process could invoke the UDXs in the query several times
as it seeks the best plan for the query.

The last two examples are Netezza query performance optimizations, and should not be of
concern. For the first two example situations, you can change the query optimization
behavior if necessary by changing the UDF registration settings.
If you register the UDF as NOT DETERMINISTIC, the Netezza system always invokes the
function to obtain a value. (The NOT DETERMINISTIC setting may also be the reason why
the log shows that a UDF was invoked more than you expected.)
If you register the UDF as CALLED ON NULL INPUT, the Netezza system invokes the function for one or more NULL input values. Your function must then be designed to handle
input NULL values appropriately.
Carefully consider the performance implications for these changes; if your UDF really is
DETERMINISTIC or it should return NULL on NULL input, there are performance benefits
to the resulting query optimizations. You might want to use different settings for these registration options in your test environment than in the production environment. For more
details about these arguments, refer to CREATE [OR REPLACE] FUNCTION on
page B-17.
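
The effect of common subexpression elimination on invocation counts can be illustrated with a small model. The following sketch is a simplified illustration only and is not the Netezza optimizer; the functions cubeValue and evaluateRowWithCse and the counter g_invocations are hypothetical. It models one row of a query that uses the same DETERMINISTIC function twice, evaluating each distinct argument once and reusing the result.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical stand-in for a DETERMINISTIC UDF body; it counts its own calls.
static int g_invocations = 0;

int cubeValue(int x)
{
    ++g_invocations;
    return x * x * x;
}

// Simplified model of common subexpression elimination for one row:
// computes cubeValue(a) + cubeValue(b), but calls the function only once
// per distinct argument and reuses the cached result.
int evaluateRowWithCse(int a, int b)
{
    std::map<int, int> cache;
    const int args[2] = { a, b };
    int sum = 0;
    for (int i = 0; i < 2; ++i) {
        std::map<int, int>::iterator hit = cache.find(args[i]);
        if (hit == cache.end())
            hit = cache.insert(std::make_pair(args[i], cubeValue(args[i]))).first;
        sum += hit->second;
    }
    return sum;
}
```

In this model, a row in which both uses receive the same argument causes one invocation instead of two. A UDF that relies on a side effect for every use, such as writing a log line, would therefore observe fewer calls than the query text suggests.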

Dependency Checks before Dropping UDXs


Starting in Release 4.6, the Netezza system tracks dependencies on UDXs. That is, if a
table, view, or other UDX references a UDX that you want to drop, you will not be allowed to
drop it until the dependency is removed.
The dependency check prevents any problems with future queries. You can view the dependency using the _v_depend view, for example:
MYDB(MYUSER)=> SELECT * FROM _v_depend;
       REFERENCING        |          REFERENCED          | DEPTYPE
--------------------------+------------------------------+---------
 table CUSTOMERS col(1)   | function FILEUPDATE(INTEGER) | n
 view TOTAL_VW            | aggregate MYSUM(INTEGER)     | n
 function MYFUNC(INTEGER) | library MYMATHLIB            | n
(3 rows)

For example, if you attempt to drop a UDF named fileupdate that is used in a table named
customers, an error similar to the following is returned:
DEV(USER1)=> DROP FUNCTION fileupdate(int4);
ERROR: Can't delete function FILEUPDATE - table CUSTOMERS (col 1)
depends on it

The error reports the table and specific column that refers to the function that you wanted to
drop.
Similarly, if you try to drop a UDX that is used in a view, the command returns an error. For
example, if you try to drop a UDA named mysum which is used in a view named TOTAL_VW,
the following error is returned:
DEV(MYUSER)=> DROP AGGREGATE mysum(int4);
ERROR: Can't delete aggregate MYSUM - view TOTAL_VW depends on it

To resolve these error messages and drop the UDX, you must change the default value of
each table column that references the UDF by using the ALTER TABLE command with the
ALTER [ COLUMN ] column { SET DEFAULT value | DROP DEFAULT } clause. For
views, you need to use the CREATE OR REPLACE VIEW command to remove the UDX from
the view definition.
If you try to drop a user-defined shared library that is a dependency of any existing UDX,
you must resolve those dependencies before you can drop the library. For example:
DEV(MYUSER)=> DROP LIBRARY mymathlib;
ERROR: Can't delete library mymathlib - function MYFUNC(integer)
depends on it

If you try to drop a database which contains objects that are referenced by objects in other
databases, the DROP DATABASE command displays errors and exits. The error messages
display up to 5 object dependencies, plus the total number of dependencies which must be
resolved. You must resolve all the dependency issues before you can drop the database.
Note: If you have tables or views from a previous Netezza release that reference a UDX,
note that after you upgrade to 4.6, the Netezza system will allow a DROP command on that
UDX. The older tables and views are not added as dependencies until you recreate the view
or modify the default expression using the 4.6 software. If you have an upgraded system,
you should use the nzudxvalidate command after you drop a UDX to check for any tables
and views which might contain unresolved references to the dropped UDX. For more information, see the next section Checking for Unreferenced or Invalid UDXs.


Checking for Unreferenced or Invalid UDXs


For Netezza Release 4.5 and 4.6 UDXs, it was an important best practice to check tables
and views for any references to UDXs that had been dropped from the database. With
Release 5.0 and later, the system uses dependency checks to help protect against the
removal of UDXs that are referenced by other objects.
The nzudxvalidate command also checks existing UDXs and reports any problems such as
missing or invalid object files. After you perform a task such as restoring a database from a
backup, it is a good practice to run this command to check for any problems with the
defined UDXs on the system.

nzudxvalidate command
Checks tables and views for references to dropped UDXs, and validates the existing UDXs
for any problems such as missing or invalid object files. You must be logged in as the nz
user account to run this command, and NZ_USER and NZ_PASSWORD must be set to the
admin account and password.
Syntax

The nzudxvalidate command has the following syntax:

nzudxvalidate [-h] [-d dbname]

The -h option displays help for the command. The -d option allows you to specify one database to check. By default, the command checks all the databases on the Netezza system. If
your Netezza user account has limited access to the databases on the Netezza system, the
command can check only the databases to which you have access.
Description

The nzudxvalidate command locates any references to dropped UDXs or
invalid UDXs within all of the databases or a specific database. The Netezza system must
be online when you run this command. The command displays a list of any tables and
views that reference dropped UDXs, as well as any UDXs that have issues with their object
files, such as missing object files, invalid object files, or object files that fail a CRC checksum match. A description of these problems and how to resolve them follows the example.
Processing tables
Table DEV.T2.C2 - default value references stale UDF(s): 'ONE()'
Processing views
View DEV.VAS uses stale UDA UDA_SUM (oid 214389)
View DEV.VAS2 uses stale UDF UDF_LENGTH (oid 214444)
View DEV.VAS2 uses stale UDA UDA_SUM (oid 214389)
Processing udfs
UDF CONVERT.STRING_SIZE_VARCHAR(CHARACTER VARYING(64000)) is missing
its EXTERNAL HOST OBJECT file
UDF CONVERT.STRING_SIZE_VARCHAR(CHARACTER VARYING(64000)) is missing
its EXTERNAL SPU OBJECT file
UDF CONVERT.CHARID(CHARACTER(ANY)) has invalid checksum for its
EXTERNAL HOST OBJECT file
UDF CONVERT.ONE() has invalid EXTERNAL HOST OBJECT file
UDF CONVERT.ONE() has invalid EXTERNAL SPU OBJECT file
Processing udas
UDA small.CHARMAX2(CHARACTER(20)) is missing its EXTERNAL HOST OBJECT
file
UDA small.CHARMAX2(CHARACTER(20)) is missing its EXTERNAL SPU OBJECT
file
Processing libraries
LIBRARY mydb.myudxlib is missing its EXTERNAL SPU OBJECT file
LIBRARY mydb.mymathlib has invalid checksum for its EXTERNAL SPU
OBJECT file
LIBRARY mydb.mysqllib has invalid EXTERNAL SPU OBJECT file
Done

Resolving View References to Stale UDXs


If the nzudxvalidate command locates views that contain references to stale UDXs, you
can refresh each view, for example by re-creating it with the CREATE OR REPLACE VIEW
command, to update it with the latest object ID for the UDX.

Resolving Invalid Object File Errors


The nzudxvalidate command checks the UDX object files and reports the following types of
errors:

Invalid object file errors. These errors typically occur because the user specified the
wrong object file pathname in the CREATE OR REPLACE or ALTER command for the
UDX. For example, the user most likely specified the SPU object file pathname for the
host object file argument, or vice versa.

Missing object file errors. These errors (which occur very rarely) indicate that the object
file has somehow been deleted from the /nz/data directory.

Invalid checksum errors. These errors (which occur very rarely) indicate that there has
been a corruption or unexpected change to the object file, and it no longer matches the
one with which the UDX was registered.

To correct any of these errors, use the ALTER [FUNCTION|AGGREGATE|LIBRARY] command or CREATE OR REPLACE [FUNCTION|AGGREGATE|LIBRARY] command to update
the UDX with the correct object files. Make sure that you specify the correct external object
file pathname (either host or SPU) for the object file arguments.

Time Zone Support


The SPU automatically has its time zone set to that of the host. The time zone is reset each
time a new snippet runs on the SPU, which helps to ensure that the UDX operates in a consistent environment regardless of where it is running. The reset also helps to clear any time
zone settings that may have been made by a UDX in a previous snippet.
The following time zone-related variables can be used normally:
extern char *tzname[2];
extern long timezone;
extern int daylight;

The following time functions can be used normally:


extern "C" time_t mktime(struct tm *);
extern "C" char *ctime(const time_t *);
extern "C" char *ctime_r(const time_t *, char *);
extern "C" struct tm *localtime(const time_t *);
extern "C" struct tm *localtime_r(const time_t *, struct tm *);
extern "C" size_t strftime(char *, size_t, const char *, const struct tm *);
extern "C" void tzset();

Note: If your UDF or UDA uses a function such as strftime (which formats a local time/date
according to LC_* locale settings), keep in mind the best practices from the previous section, Using C Runtime Library Functions on page 6-10. Although the function returns a
value on the Netezza host, results are inconsistent on the SPUs because the SPUs do not
support the LC_* variables. The value returned on the host is based on the LC_TIME value
in effect when the Netezza system was started, which can cause unexpected time settings for
the UDX code. As a best practice, avoid the use of LC_* variables or functions that use
them.
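
As a minimal sketch of the advice in the note, the following helper formats a timestamp with localtime_r() and fixed numeric fields, so the result does not depend on any LC_* setting. The function name formatLocalTimestamp and the output layout are illustrative assumptions, not part of the Netezza API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <ctime>
#include <string>

// Hypothetical helper (not part of the UDX API): formats a time_t as
// "YYYY-MM-DD HH:MM:SS" in the local time zone without consulting any
// LC_* locale setting, unlike locale-dependent strftime() conversions.
std::string formatLocalTimestamp(time_t when)
{
    struct tm parts;
    // localtime_r() is the reentrant variant; it writes into caller storage.
    if (localtime_r(&when, &parts) == nullptr)
        return std::string();

    char buf[32];
    int n = std::snprintf(buf, sizeof(buf), "%04d-%02d-%02d %02d:%02d:%02d",
                          parts.tm_year + 1900, parts.tm_mon + 1, parts.tm_mday,
                          parts.tm_hour, parts.tm_min, parts.tm_sec);
    assert(n > 0 && n < static_cast<int>(sizeof(buf)));
    return std::string(buf, static_cast<std::size_t>(n));
}
```

Because every field is written with an explicit numeric conversion, the host and the SPUs should produce the same text for the same time zone setting.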

Error Reporting within a UDX


Within a UDX, you can build in error checking and reporting capabilities using the
throwUdxException() method. The throwUdxException() method can report an error and
then cancel/abort the execution of the query. throwUdxException() supports messages up to
300 characters. Messages that exceed the limit are truncated to the first 300 characters.
throwUdxException() signals an error in the UDF/UDA and returns control back to the system. Because control does not return to your code, release any memory that you allocated before you call this routine.
To invoke this method, you specify throwUdxException() as follows:
throwUdxException( const char* msg );

The msg message string is returned to the SQL session. For example:
throwUdxException( "Invalid value" );
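
The following sketch illustrates the pattern described above: build the message, release any memory the routine owns, then report the error. Because the UDX headers are not available outside a Netezza system, it uses a stand-in throwUdxException() that throws std::runtime_error and truncates the message to 300 characters. The stand-in and the checkedSqrt() routine are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <stdexcept>
#include <string>

// Stand-in for the real throwUdxException(); an assumption for illustration.
// It mimics the documented truncation of messages to 300 characters.
void throwUdxException(const char *msg)
{
    assert(msg != nullptr);
    throw std::runtime_error(std::string(msg).substr(0, 300));
}

// Hypothetical routine: rejects negative input with a formatted message.
double checkedSqrt(double value)
{
    double *work = new double[1];  // heap memory that this routine owns
    if (value < 0.0) {
        char msg[64];              // stack buffer, needs no cleanup
        std::snprintf(msg, sizeof(msg),
                      "Invalid value: %.2f is negative", value);
        delete[] work;             // clean up BEFORE reporting the error
        throwUdxException(msg);
    }
    work[0] = std::sqrt(value);
    double result = work[0];
    delete[] work;
    return result;
}
```

Formatting the message into a stack buffer keeps the error path free of further allocations, so nothing is left to release once the error is raised.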

Checking for Nulls


All classes, including the UDF class, are contained in the UDX namespaces. There are two
namespaces: the API version 1 namespace is nz::udx; the API version 2 namespace is
nz::udx_ver2.
Before returning from the evaluate method (for UDFs) or the finalResult method (for
UDAs), your function should call setReturnNull to set whether the return value is NULL.
You configure how UDFs should behave when database NULL values are encountered when
you register the UDF with the CREATE FUNCTION command option:
[{RETURNS NULL ON NULL INPUT | CALLED ON NULL INPUT}]
The default setting is RETURNS NULL ON NULL INPUT. This option specifies that the
evaluate() method will not be called when any of the arguments are NULL. This setting has
certain performance advantages if you want to skip your function if it is passed NULL
values.
The CALLED ON NULL INPUT option specifies that the function will be called even if
NULL values are encountered. If you choose this option, you should use the following function within the evaluate() or finalResult() method to check whether an argument is
NULL:
bool isArgNull(int n)
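
A minimal sketch of NULL handling under CALLED ON NULL INPUT follows. MockUdf is a hypothetical stand-in for the Udf base class so that the example compiles outside a Netezza system; only the names isArgNull() and setReturnNull() come from this guide, and their exact signatures in the real headers may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the UDX base class (assumption for illustration).
class MockUdf
{
public:
    explicit MockUdf(const std::vector<const int *> &args)
        : m_args(args), m_returnNull(false) {}

    bool isArgNull(int n) const { return m_args.at(static_cast<std::size_t>(n)) == nullptr; }
    void setReturnNull(bool isNull) { m_returnNull = isNull; }
    bool returnsNull() const { return m_returnNull; }
    int intArg(int n) const { return *m_args.at(static_cast<std::size_t>(n)); }

private:
    std::vector<const int *> m_args;  // nullptr models a database NULL
    bool m_returnNull;
};

// Adds two integers, treating a NULL first argument as 0 and returning
// NULL when the second argument is NULL.
class AddWithDefault : public MockUdf
{
public:
    explicit AddWithDefault(const std::vector<const int *> &args)
        : MockUdf(args) {}

    int evaluate()
    {
        if (isArgNull(1)) {
            setReturnNull(true);
            return 0;  // the value is ignored when the result is NULL
        }
        setReturnNull(false);
        int first = isArgNull(0) ? 0 : intArg(0);
        return first + intArg(1);
    }
};
```

The sketch checks each argument before reading it and always states explicitly whether the result is NULL before returning.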


Memory Registration
When you register a user-defined function or aggregate, you can use an optional parameter
called MAXIMUM MEMORY to specify the amount of memory, in bytes, that the function or
aggregate is expected to require as it runs. The size value can be an empty value or a value
in the form of a number, or a number followed by one of the letters b (bytes), k (kilobytes), m (megabytes), or g (gigabytes). This is not a memory limit threshold; instead, this value is a
performance indicator used during scheduling and planning. The stated memory allocation
helps the Netezza system to schedule the UDX better, so that queries that use
the UDX run more efficiently.
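
To make the size notation concrete, the sketch below converts a MAXIMUM MEMORY style value such as 512k or 2g into bytes. The parseMemorySize() name is hypothetical, and the sketch assumes binary multiples (1k = 1024 bytes); this guide does not state which multiple the Netezza system uses.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: converts "<number>[b|k|m|g]" to a byte count.
// Assumption: k, m, and g are binary (1024-based) multiples.
// Returns -1 for malformed input; an empty string yields 0.
int64_t parseMemorySize(const std::string &text)
{
    if (text.empty())
        return 0;

    std::size_t pos = 0;
    int64_t number = 0;
    while (pos < text.size() &&
           std::isdigit(static_cast<unsigned char>(text[pos]))) {
        if (pos == 9)
            return -1;        // keep the result well inside int64_t range
        number = number * 10 + (text[pos] - '0');
        ++pos;
    }
    if (pos == 0)
        return -1;            // no leading digits
    if (pos == text.size())
        return number;        // plain number: bytes
    if (pos + 1 != text.size())
        return -1;            // more than one trailing character

    switch (std::tolower(static_cast<unsigned char>(text[pos]))) {
    case 'b': return number;
    case 'k': return number * 1024;
    case 'm': return number * 1024 * 1024;
    case 'g': return number * 1024 * 1024 * 1024;
    default:  return -1;      // unknown unit letter
    }
}
```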
For each UDX, try to estimate the overall memory consumption and use the MAXIMUM
MEMORY parameter to specify the total memory usage. The test harness (described in
UDX Test Harness on page 7-5) provides information that can help you to assess the
memory consumption of a UDX.
You can use the following function within your UDX to obtain the amount of memory that the UDF or
UDA was registered with (its MAXIMUM MEMORY value):
getMemory()

API version 2 UDXs have access to the memory registration within the constructor.

Compiling Multiple Object Files


If your user-defined function or aggregate has more than one source file, you can compile
the multiple sources into one object file using a series of nzudxcompile steps. You compile
the sources for the host and SPU objects in separate steps.
For example, if you have two C++ source files called helloworld.cpp and parser.cpp,
compile each file individually for the host, then combine the resulting object files
into one output object file, as follows:
nzudxcompile --host helloworld.cpp -o helloworld_temp.o_x86
nzudxcompile --host parser.cpp -o parser.o_x86
nzudxcompile --host --objs helloworld_temp.o_x86 --objs parser.o_x86
-o helloworld.o_x86

Similarly, to compile the multiple sources to create one SPU object file, use commands like
the following:
nzudxcompile helloworld.cpp --spu -o helloworld_temp.o_spu10
nzudxcompile parser.cpp --spu -o parser.o_spu10
nzudxcompile --spu --objs helloworld_temp.o_spu10
--objs parser.o_spu10 -o helloworld.o_spu10

Managing C++ Files That Contain Multiple Functions


If you create source files that contain more than one user-defined function or aggregate,
you can register the functions and aggregates individually to the Netezza system. For example, if you create a C++ program that has three user-defined functions, you first create the
object files for the host and SPUs using the nzudxcompile command.
After you have the .o object files for the C++ program, you can then issue three CREATE
FUNCTION command statements (which each use the same set of object code files) to register each of the three functions to the Netezza system. The CREATE FUNCTION
commands add three copies of the same object files, one for each unique function, to the
Netezza data directory.


Conditional Compilation
Within your C++ source files, you can mark code that should run only on the Netezza host
or the SPUs using the FOR_SPU conditional compilation. For example:
#ifdef FOR_SPU
// Code that runs only on the SPUs
#else
// Code that runs only on the host
#endif
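
As a small sketch, the function below returns a different label depending on whether the FOR_SPU macro is defined at compile time. The function name executionTarget() is hypothetical, and the sketch assumes that FOR_SPU is defined only when the SPU object file is being built.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: reports which build of the object file is running.
const char *executionTarget()
{
#ifdef FOR_SPU
    return "spu";   // compiled only into the SPU object file
#else
    return "host";  // compiled only into the host object file
#endif
}
```

Keeping the conditional inside one small function, rather than scattering #ifdef blocks through the UDX, makes it easier to see which code paths differ between the two builds.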

nzudxcompile Command Syntax


Use the nzudxcompile command to compile source C++ files for UDXs into the object files
that will run on the Netezza host and the SPUs. The command resides in the /nz/kit/bin/adm
directory.
For UDFs and UDAs, you can use the command to compile and register the files in one
step, or you can also register UDF and UDA object files that were compiled previously. For
user-defined shared libraries, you can use the command only to compile the source files for
a library; you cannot register shared libraries using this command. You must use the CREATE [OR REPLACE] LIBRARY command to register them.

Syntax
The nzudxcompile command has the following syntax:
nzudxcompile [OPTIONS]... srcfile

Inputs
The nzudxcompile command takes the following input options. Note that some of the
options are general options, while some apply when compiling UDXs, or when registering
UDFs or UDAs.
Table 6-2: nzudxcompile General Options

Option                    Description
--base base               Specifies the base or home directory for the Netezza software.
                          The default is /nz/kit.
--user username           Specifies the Netezza user account to use if you are registering
                          a UDF or UDA with the command. The account must have Create
                          Function privilege (for UDFs) or Create Aggregate privilege
                          (for UDAs) as well as List privilege to the target database. The
                          default is the NZ_USER environment variable value.
--pw password             Specifies the password for the Netezza user account. The
                          default is the NZ_PASSWORD environment variable value.
--db database             Specifies the database in which to register the UDF or UDA. The
                          default is the NZ_DATABASE environment variable value.
-h or --help              Displays the usage for the command and exits.

Table 6-3 describes the nzudxcompile options used for compiling UDX source files.
Table 6-3: nzudxcompile Compile Options

Option                    Description
srcfile                   Required argument that specifies the pathname of the input C
                          or C++ file that you want to compile.
--dynamic                 Create a shared library for SPUs. You must specify --objs with
                          this argument.
-g                        Generates/compiles the debug objects.
--host                    Creates a compiled object file for the Linux host only. Used
                          when combining multiple .o files using the --objs option.
--spu                     Compiles the input .o files for the native SPU environment.
                          Used when combining multiple .o files using the --objs option.
--sputype type            Compiles for the specified SPU type. Valid values are spu7 for
                          z-series SPUs and spu10 for IBM Netezza 1000 and Netezza
                          100 model SPUs.
--print-compiler          Displays the pathname of the compiler used by nzudxcompile.
--print-spu-file          Displays the SPU output file.
-o outputobjectfile       Specifies the name of the output object filename. If you do not
                          specify an output object file name, the command uses the input
                          source filename (without the .c or .cpp suffix) and appends the
                          suffixes .o_x86, .o_spu7, and .o_spu10.
--args args               Specifies optional additional arguments to pass to the compiler,
                          as is.
--objs inputobjectfile    Specifies input object file(s) that will be linked into one shared
                          object file. If you specify this option, you must also specify
                          either --spu, --sputype, or --host.

Table 6-4 describes the options used when registering either a UDF or UDA.
Table 6-4: nzudxcompile General Registration Options

Option                    Description
[ --spufile file ]        Specifies the SPU object file for registration-only operations.
                          This option bypasses the step to create the compiled object file
                          for the SPU and just registers the specified object file.
[ --hostfile file ]       Specifies the host object file for registration-only operations.
                          This option bypasses the step to create the compiled object file
                          for the host and just registers the specified object file.
[ --sig args [, ...] ]    Specifies the argument signature for the function or aggregate,
                          and must be enclosed in double-quotation marks. For example:
                          "UDXname(arg1, arg2, ...)"
--return return           Specifies the return type for the function or aggregate. You must
                          specify a valid Netezza data type for the return type.
--class class             Specifies the class name for the function or aggregate.
--deps libs               Specifies the library dependencies for this UDX. The libraries
                          can be a single library name or a comma-separated list of
                          libraries.
--mask args               Specifies the debugging mask. The valid values are DEBUG or
                          TRACE. You can specify the argument multiple times to specify
                          multiple values.
--mem mem                 Specifies an indication of the potential memory use of the
                          function. The mem value can be an empty value or a value in the
                          form of a number, or a number with the letters b (bytes),
                          k (kilobytes), m (megabytes), or g (gigabytes).
--fenced                  Specifies that the UDX should run in fenced mode. This is the
                          default mode.
--unfenced                Specifies that the UDX should run in unfenced mode. The
                          default is fenced mode. The database user who owns the UDX
                          must have Unfence admin privileges.
--varargs                 Specifies that the UDX is a variable arguments UDX.

Table 6-5 describes the registration options that are available for UDX API version 2
objects.
Table 6-5: nzudxcompile API Version 2 Registration Options

Option                    Description
--environment val         Specifies one or more environment settings, where val is a string
                          in the form of 'name' = 'value' pairs.
--version ver             Specifies which version of the object files to compile/create.
                          The default is 1 for API version 1. Specify 2 to create objects
                          that can leverage the API version 2 features.

Table 6-6 describes the options used only when registering a UDF, in addition to those
described in Table 6-4.
Table 6-6: nzudxcompile UDF Registration Options

Option                    Description
--nondet                  For a user-defined function, specifies that the function is
                          non-deterministic. A deterministic function, which is the default,
                          is a pure function: one that always returns the same value given
                          the same argument values and that has no side effects. A
                          non-deterministic function could return different results
                          depending on where it is called or on other circumstances;
                          therefore, the function is always called even if it has multiple
                          instances in your query. If Netezza observes multiple instances
                          of a deterministic function in a query, it could reduce all the
                          calls to one call (a common subexpression elimination) to
                          improve the query performance.
--nullcall                For a user-defined function, specifies that the function is
                          CALLED ON NULL INPUT, which means that the function will
                          be called normally when some of its arguments are NULL. The
                          default, RETURNS NULL ON NULL INPUT, indicates that the
                          function always returns NULL whenever any of its arguments are
                          NULL.

Table 6-7 describes the options used only when registering user-defined table functions.
Table 6-7: nzudxcompile Table Function Registration Options

Option                    Description
--noparallel              Specifies that the table function will be created with parallel not
                          allowed. The default is parallel allowed.
--lastcall args           Specifies that the table function will be created with <args>
                          ALLOWED. The valid values follow:
                            TABLE
                            TABLE FINAL
                            TABLE, TABLE FINAL
                          The default is TABLE, TABLE FINAL


Table 6-8 describes the options used only when registering a UDA, in addition to those
described in Table 6-4.
Table 6-8: nzudxcompile UDA Registration Options

Option                    Description
--state state             Specifies the state signature for a user-defined aggregate, and
                          must be enclosed in double-quotation marks. For example:
                          "(state1, state2, ...)"
--type aggtype            For a user-defined aggregate, specifies whether the aggregate
                          can be invoked only in a windowed aggregate (ANALYTIC) or a
                          grouped aggregate (GROUPED) or both (ANY). The default is
                          ANY.

Description
The nzudxcompile command has these additional descriptions:

Privileges Required
To run nzudxcompile, you must be logged in to the Netezza system as the nz user account.
If you use the command to also register the user-defined function or aggregate in one step,
you must specify a SQL user such as admin or one who has Create Function | Aggregate and
List privileges to the target database.

Common Tasks
Use the command to compile C++ code files for a user-defined function or aggregate into
object files that can be used in SQL queries on the Netezza system. If you use this command only to compile the object files, you will need to register the functions and
aggregates using the CREATE FUNCTION or CREATE AGGREGATE Netezza SQL commands. You can also create the compiled objects and register them at the same time using
the nzudxcompile command; to do so, you must specify the --sig, --return, and --class arguments (and, for aggregates, the --state argument), which causes the command to issue the related CREATE OR REPLACE
FUNCTION or CREATE OR REPLACE AGGREGATE command, as applicable.

Usage
The following are examples of nzudxcompile command usage:

To compile a sample C++ file for a function named cube and create the output object
files:
nzudxcompile /home/nz/udx_files/cube.cpp

To compile the cube C++ file and save it as mycube object files:
nzudxcompile /home/nz/udx_files/cube.cpp -o mycube

To compile the cube C++ file and also register it in the mydb database:
nzudxcompile /home/nz/udx_files/cube.cpp --sig "Cube(int4)"
--return INT8 --class Cube --user myuser --pw password --db mydb

To create a shared library called mylib from the mylib.cpp file, run the following two
commands:
nzudxcompile /home/nz/udx_files/mylib.cpp --sputype spu10 -o
mylib.so
nzudxcompile --objs /home/nz/libs/mylib.o_spu10 --dynamic --sputype
spu10 -o mylib.so


Migrating UDXs from API Version 1 to API Version 2


If you have UDXs that were created for UDX version 1 and which use the nz::udx
namespace, those UDXs will continue to run following an upgrade to Release 6.0 or later. If
you want to migrate your UDXs to version 2 to take advantage of features such as the new
version 2 API methods, environment access, or others, there are a few small changes that
must be made to the C++ source code files:

Change the namespace from nz::udx to nz::udx_ver2.

The constructor and instantiator for a UDX must be revised to take a UdxInit object.
For example, in API version 1, a UDX could have the following form:
using namespace nz::udx;
class CCustomerName: public Udf
{
public:
    static Udf* instantiate();
};
Udf* CCustomerName::instantiate()
{
    return new CCustomerName;
}

In API version 2, the instantiator and constructor take a UdxInit argument. As a result,
you must declare the constructor, even if it does not take any arguments, as in this
example:
using namespace nz::udx_ver2;
class CCustomerName: public nz::udx_ver2::Udf
{
public:
    CCustomerName(UdxInit *pInit) : Udf(pInit){}
    static nz::udx_ver2::Udf* instantiate(UdxInit *pInit);
};
nz::udx_ver2::Udf* CCustomerName::instantiate(UdxInit *pInit)
{
    return new CCustomerName(pInit);
}


CHAPTER 7
Debugging User-Defined Functions and Aggregates
What's in this chapter
Message Logging
UDX Test Harness
Debugging Using UDX Stubs

This chapter describes how to debug and test user-defined functions, aggregates, and
shared libraries using two debugging aids:

Message logging

UDX test harness

This chapter also describes how to disable UDXs within your nzsql session so that you can
troubleshoot problems such as changes in query performance on the Netezza system.

Message Logging
You can use the logMsg() facility to include operational messages and debugging hints
within your UDXs. The logMsg facility is similar to printf-style logging. You add the messages that you want in order to track the operation of the UDX. Each message has a flag value
(LOG_DEBUG, LOG_TRACE, or both values ORed together) to help you control the verbosity of the output.
You can control how much detail is output for a specific UDF or UDA using the LOGMASK
attribute when you register the function or aggregate, or when you run the UDX using the
test harness.

logMsg Function
Adds a logging message to your user-defined function or aggregate.

Syntax
The logMsg function has the following syntax:
logMsg(flag, fmt-string, args...)

The flag argument specifies the output level for the message. This allows you to control the
verbosity of the debugging output. If logging is enabled at the specified flag level, all of the
messages with that flag level will be output to standard output as well as the specified log
file. The valid values are LOG_DEBUG, LOG_TRACE, or both values ORed together.


DEBUG is usually a higher-level tracing category which provides messages for actions in
the main body. TRACE is typically used in lower-level areas of the code, such as loops or
other subareas of code. For example, you might put a logMsg statement with LOG_DEBUG
in the main body, and several more detailed statements with a LOG_TRACE level inside a
loop. For messages that you want to display under either output mode, you can specify
LOG_DEBUG|LOG_TRACE as an ORed value.
The fmt-string value specifies the logging message, enclosed in double quotes, and must
end with a newline character (\n). If you want to include substitution values in the message, you can do so and then specify the values using the args value. Note that the fmt-string value can include vsnprintf() formatted conversion specifications such as %i (optionally signed integer), %lld (long long decimal), %llu (long long unsigned int), and the like.
For a description of the available options, refer to the vsnprintf() documentation or man
pages.
The args value specifies zero or more substitution arguments that you want to specify in the
output message. The args values must correspond in type and number to substitution
switches in the fmt-string.
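
As a rough illustration of how a printf-style facility expands a format string and its substitution arguments, the sketch below wraps vsnprintf(). The formatLogLine() function is a hypothetical stand-in; it is not the logMsg() implementation, and the 256-byte buffer size is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in (not the Netezza logMsg() implementation): expands a
// printf-style format string and its arguments into a std::string.
std::string formatLogLine(const char *fmt, ...)
{
    assert(fmt != nullptr);
    char buf[256];
    va_list args;
    va_start(args, fmt);
    int n = std::vsnprintf(buf, sizeof(buf), fmt, args);
    va_end(args);
    if (n < 0)
        return std::string();  // encoding error
    // vsnprintf() reports the untruncated length; clamp to what fits.
    std::size_t len = static_cast<std::size_t>(n) < sizeof(buf)
                          ? static_cast<std::size_t>(n)
                          : sizeof(buf) - 1;
    return std::string(buf, len);
}
```

The argument types must match the conversion specifications exactly, for example a long long value for %lld, because the variadic call performs no type checking.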

Usage
The logMsg function specifies an output message that you can use to follow the operational
steps of a UDX, which can help you to identify debugging steps and other information
about the function. You enable the logging using the nzudxdbg command.

Checking the Log Mask Settings


API version 2 includes two additional methods for checking the log mask. You can use the
isLoggingEnabled() method to check whether logging has been enabled, and you can use
the getLogMask() method to obtain the log mask.

Example: Adding logMsg() to the Sample UDF


As one method of debugging, this example uses logMsg to validate that a certain area of
the sample function customername has been reached. The new code is the logMsg() call:
virtual ReturnValue evaluate()
{
    StringArg *str;
    str = stringArg(0);
    int lengths = str->length;
    char *datas = str->data;
    int32 retval = 0;
    if (lengths >= 10)
        if (memcmp("Customer A", datas, 10) == 0)
        {
            logMsg(LOG_DEBUG, "Found a match of length %d\n", lengths);
            retval = 1;
        }
    NZ_UDX_RETURN_INT32(retval);
}


After you change a UDX to add logMsg() calls, you must recompile and re-register it. To
recompile the function:
nzudxcompile customername.cpp -o customername.o

To re-register the function:


CREATE OR REPLACE FUNCTION CustomerName(varchar(64000))
RETURNS int4 LANGUAGE cpp PARAMETER STYLE npsgeneric
EXTERNAL CLASS NAME 'CCustomerName'
EXTERNAL HOST OBJECT '/home/nz/myfirstudx/customername.o_x86'
EXTERNAL SPU OBJECT '/home/nz/myfirstudx/customername.o_spu10';

After you re-register the function, you set the log verbosity for the function by altering the
function using Netezza SQL commands:
ALTER FUNCTION CustomerName(varchar(64000)) LOGMASK DEBUG,TRACE;

The values for LOGMASK can be NONE, DEBUG, TRACE, or both DEBUG,TRACE. The
value NONE disables output; DEBUG outputs any calls to logMsg that contain LOG_DEBUG;
TRACE outputs any messages with LOG_TRACE as the flag; DEBUG,TRACE outputs messages that have either or both DEBUG and TRACE flags.
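
The filtering rule above can be sketched as a bitmask test. The numeric values chosen for the flags and the shouldLog() helper are assumptions for illustration (hence the SKETCH_ prefix); the real LOG_DEBUG and LOG_TRACE values are defined by the UDX headers.

```cpp
#include <cassert>

// Illustrative flag values only; the real LOG_DEBUG and LOG_TRACE constants
// come from the UDX headers and may differ.
enum SketchLogFlag {
    SKETCH_LOG_NONE  = 0,
    SKETCH_LOG_DEBUG = 1 << 0,
    SKETCH_LOG_TRACE = 1 << 1
};

// Returns true when a message tagged with messageFlags should be written
// under the given LOGMASK setting: any shared bit enables the message.
bool shouldLog(int logMask, int messageFlags)
{
    return (logMask & messageFlags) != 0;
}
```

Under this model, a message tagged with both flags ORed together appears whenever either DEBUG or TRACE is enabled, which matches the behavior described for LOG_DEBUG|LOG_TRACE messages.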
If you use the --file argument of the nzudxdbg command, the output messages will be written to the standard log files. As a best practice, you should log the messages to files to help
with debugging and comparisons of the output messages following the test runs.
After you enable the LOGMASK, if you want the message output from the UDX to appear in
your terminal or shell window, you must stop and restart the database. To stop and restart
the Netezza database without disconnecting any Netezza processes:
nzstop
nzstart -i

You do not have to restart the database after each time you change the function, only after
the first time that you enable message logging in that specific terminal window.
Use the nzudxdbg command to enable message logging:
nzudxdbg --user admin --pw password --on --file

Then, run the query again as follows:


select * from customers where CustomerName(b) = 1;

You should see log messages like the following in the sysmgr.log log file and optionally the
shell window in which you ran nzstart -i:
(event002.1001) [d,udx ] Found a match of length 12
(event002.1003) [d,udx ] Found a match of length 10

The (event002.1001) value identifies where the function ran: in process event002 on the SPU with hardware ID 1001 (or 1003 in the second message).
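
When comparing log files across test runs, it can help to extract this origin field programmatically. The sketch below is one possible approach; the LogOrigin structure and the parseLogOrigin() function are hypothetical names, and the sketch assumes that the first parenthesized field on a line has the form (name.number) as in the sample messages.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical holder for the "(process.id)" field of a UDX log line.
struct LogOrigin {
    std::string process;  // for example "event002" or "dbos"
    int id;               // SPU hardware ID or host process ID
};

// Extracts the first "(name.number)" field from a log line.
// Returns false when the line contains no such field.
bool parseLogOrigin(const std::string &line, LogOrigin &origin)
{
    std::string::size_type open = line.find('(');
    if (open == std::string::npos)
        return false;

    char name[64];
    int id = 0;
    // %63[^.] reads up to 63 characters that are not a period.
    if (std::sscanf(line.c_str() + open, "(%63[^.].%d)", name, &id) != 2)
        return false;

    origin.process = name;
    origin.id = id;
    return true;
}
```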
To see the log message output for the function when it runs on the host, run the query:
select x.*, customername(x.b) from customers x, customers y;

The log messages should be similar to the following:


07-24-07 16:29:07 (dbos.24072) [d,udx ] Found a match of length 10
07-24-07 16:29:07 (dbos.24072) [d,udx ] Found a match of length 10
07-24-07 16:29:07 (dbos.24072) [d,udx ] Found a match of length 10
07-24-07 16:29:07 (dbos.24072) [d,udx ] Found a match of length 12
07-24-07 16:29:07 (dbos.24072) [d,udx ] Found a match of length 12
07-24-07 16:29:07 (dbos.24072) [d,udx ] Found a match of length 12
07-24-07 16:29:07 (dbos.24072) [d,udx ] Found a match of length 12

The (dbos.24072) value indicates that the message occurred in the dbos process with process ID (pid) 24072.

nzudxdbg Command
Enables or disables message logging for user-defined functions or aggregates. The command displays messages to standard output and optionally to the standard log files. You
must run this command as the nz user.
The command has the following syntax:
nzudxdbg [--all | --id hwid] [--on | --off] [--file] [--user user] [--pw password] [-h]

Table 7-1 describes the usage for the command.


Table 7-1: nzudxdbg Command Arguments

Argument          Description
--all             Displays messages from routines run on all the SPUs.
--id hwid         Displays messages from routines that run only on the
                  specified SPU. The hwid value must be a valid SPU ID.
--on              Enables message logging for UDXs.
--off             Disables message logging for UDXs.
--file            When logging is enabled, specifies that the log messages
                  will be written to the standard log files in addition to
                  standard output. The standard log files include the
                  following:
                  /nz/kit/log/dbos/dbos.log for the user-defined functions
                  that run on the host
                  /nz/kit/log/postgres/pg.log for the functions that operate
                  on the system catalog or at statement preparation time
                  /nz/kit/log/sysmgr/sysmgr.log for the functions that run
                  on the SPUs
--user user       Specifies a SQL user for the command. Use the admin user or
                  a SQL user who has Manage Hardware privileges. The default
                  is the value of NZ_USER.
--pw password     Specifies the password for the SQL user account. The
                  default is the value of NZ_PASSWORD.
-h                Displays the command usage.

UDX Test Harness


The test harness allows you to run your UDXs in a test environment outside the Netezza
runtime engine. It is useful to debug user code, to collect useful statistics on the user
code, and to check for certain programming errors.
For complicated UDAs and UDFs it is recommended that you run the test harness first as it
may catch errors such as buffer overruns. Always run the UDX in fenced mode first, then in
unfenced mode after you have debugged any issues with the UDX.
For the sample UDF customername, you can run the UDF in the test harness using the following command:
nzudxrunharness --user admin --pw password --db mydb
--dir /nz/data.1.0 --name customername --fenced

This command runs the customername function 100 times with randomly generated data
and displays output similar to the following:
(clientmgr) Info: admin: login successful
Selected only choice
1 - customername(VARCHAR(64000)) RETURNS INT4
Executing /nz/kit/bin/adm/udxharness -f customername_func.harness -k /
nz/kit.6.0.B4.14104
starting execution
Elapsed time: 0m0.039s
External references
logvprint(char const*, char*)
vtable for __cxxabiv1::__class_type_info
vtable for __cxxabiv1::__si_class_type_info
operator delete[](void*)
operator delete(void*)
operator new(unsigned int)
__cxa_pure_virtual
__gxx_personality_v0
free
memcmp
sprintf
strcmp
strdup
throwError
Our UDX object used 262144 bytes (may be rounded up to nearest page
4096)
Our UDX return value takes up 4 bytes, with 669 bytes for miscellaneous
Our UDX arguments take up 64012 bytes, with 14 bytes for miscellaneous
Our UDX state values take up 0 bytes, with 8 bytes for miscellaneous
State information may be doubled, since we need two states for merge

For a complete description of the command, refer to "nzudxrunharness Command" later in this chapter.
The first section of the output shows the possible UDFs that match customername. If there
is more than one (in the case of multiple UDFs that share the same name but have different arguments), the command prompts you to select which UDF to use. This command
then displays some status information about the actual call that is executed, along with a
starting and executing message.

If the function runs with no errors, the External references section of the output lists any
external library functions that the function uses. sprintf and throwUdxException() should
always be listed as they are included by the support functions, such as int32Arg(int).
The last section of the output displays estimated memory usage for the UDF.
Although the sample customername function is simple and operating correctly, assume
that the UDF has a problem. The following code for the function has a deliberate error that
would cause a buffer overrun when the function runs:
virtual ReturnValue evaluate()
{
    StringArg *str = stringArg(0);
    int lengths = str->length;
    char *datas = str->data;
    char* ptr = (char*)str;
    for (int i=0; i < 4000; i++)
    {
        *(ptr-i) = 5;
    }
    int32 retval = 1;
    if (lengths >= 10)
        if (memcmp("Customer A",datas,10) == 0)
        {
            logMsg(LOG_DEBUG, "Found a match of length %d\n",
                   lengths);
            retval = 1;
        }
    NZ_UDX_RETURN_INT32(retval);
}
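For contrast with the deliberate overrun in the listing above, the following minimal sketch shows the same prefix check written so that it never reads or writes outside the argument buffer. It is a plain function so that it can be compiled without the UDX headers. The TextView type and the startsWithCustomerA function are hypothetical stand-ins for this illustration; returning 0 when there is no match is an assumption about the intended behavior, based on the WHERE CustomerName(b) = 1 usage of the sample UDF.

```cpp
#include <cstring>

// Hypothetical stand-in for a string argument: a pointer plus a length,
// with no terminating null character assumed.
struct TextView {
    const char* data;
    int length;
};

// Returns 1 when the text starts with "Customer A", otherwise 0.
int startsWithCustomerA(const TextView& text) {
    static const char kPrefix[] = "Customer A";
    const int prefixLength = static_cast<int>(sizeof(kPrefix) - 1);  // 10

    // Check the length before touching the data, so memcmp never reads
    // past the end of a short argument.
    if (text.data == 0 || text.length < prefixLength)
        return 0;

    return std::memcmp(kPrefix, text.data, prefixLength) == 0 ? 1 : 0;
}
```

The key difference from the faulty listing is that every memory access stays inside the range described by data and length.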

If you have a new function, or one that you have found to have an error in processing, you
can compile the UDX with a debugging option, as follows:
nzudxcompile customername.cpp -o customername.o -g

Assume that this incorrect function has been registered. The next step is to run it using the
test harness, as follows:
nzudxrunharness --user admin --pw password --db mydb
--dir /nz/data.1.0 --name customername --unfenced

The output from the test harness is similar to the following:


(clientmgr) Info: admin: login successful
Selected only choice
1 - customername(VARCHAR(64000)) RETURNS INT4
Executing /nz/kit/bin/adm/udxharness -f customername_func.harness -k /
nz/kit.6.0.B4.14104
starting execution
layout of overwrite structure is:
pReturnInfo
pReturnNull
returnType
numArgs
argTypes
args
argNulls
argConsts
ERROR: (After function evaluate): Mismatch after retNulls between
bytes 9 and 999 0
Caught exception

When the test harness detects a buffer overwrite, the output displays the structure order
and where the error occurred. In this case, the error occurred after returnType. To try to
identify the cause of the problem, you could debug it in GDB as follows:
nzudxrunharness --user admin --pw password --db mydb
--dir /nz/data.1.0 --name customername --dbg

This command launches gdb before your UDF executes. A good place to set a
breakpoint is on the evaluate() method, which you can do by typing:
(gdb) break CCustomerName::evaluate

For more information about debugging best practices to isolate programming problems,
consult the GDB documentation on the http://www.gnu.org/software/gdb web site.

nzudxrunharness Command
Runs a user-defined function or aggregate within a simulation test environment. The harness displays information on the memory usage of the objects, and it can detect buffer
overwrites when the UDF/UDA is called. The command displays messages to standard output and optionally to the standard log files. You must be logged in as the nz user to run this
command.
The command has the following syntax:
nzudxrunharness [OPTION]...

Table 7-2 describes the options that you can use for any instance of the command.
Table 7-2: nzudxrunharness General Options

Option            Description
--dir datadir     Specifies the location of the data directory, which is
                  usually /nz/data.
--base base       Specifies the base or home directory for the Netezza
                  software. The default is /nz/kit.
--user user       Specifies the Netezza database user to run this command.
                  The user must be admin or a user who owns the target
                  database where the UDX resides. The default is the value
                  of NZ_USER.
--pw password     Specifies the Netezza user's password. The default is the
                  value of NZ_PASSWORD.
--db database     Specifies the database in which to run the UDX. The
                  default is the NZ_DATABASE environment variable value.
-h                Displays the usage for the command.

Table 7-3 describes the options that you use to specify the input file for the command.

Table 7-3: nzudxrunharness Input File Options

Option            Description
--file testfile   Specifies the pathname of a test data file. The columns
                  must be in the order expected for the arguments and
                  separated by commas (,).
--grp col         Specifies the column number in the test data file that is
                  used to group by (for aggregates). The test data file must
                  already be grouped.
--sep separator   Specifies the character used as a separator in the test
                  file. The default is comma (,).
--escape escape   Specifies the escape character to use in the test file.
                  The default is none.
--quoting         Uses double quotes for the test file.
--hexinput        Specifies that the data in the file is in hexadecimal
                  format.
--generate        Generates a control file, but does not run the harness.
                  For more information, see "Test Harness Control File."

Table 7-4 describes the random input options for the command.

Table 7-4: nzudxrunharness Random Input Options

Option            Description
--rows rows       Specifies the number of rows to simulate. The default is
                  100.
--groups groups   Specifies the number of groups to simulate. The default
                  is 5.
--nulls nulls     Specifies the null arguments. The value is specified as a
                  colon-separated string of field numbers, for example,
                  1:2:3:5. The default is no nulls.

Table 7-5 describes the command's output options.

Table 7-5: nzudxrunharness Output Options

Option            Description
--print           Prints return values.
--hex             Prints return values, with strings in hexadecimal format.
--novalidate      Skips the buffer overrun validation steps. (The harness
                  runs faster when you run without validation.)
--dbg             Launches the debugger.

Table 7-6 describes the options that specify the UDX to test.
Table 7-6: nzudxrunharness UDX Options

Option            Description
--name name       Specifies the function or aggregate name. You can specify
                  just the function or aggregate name as it is, if the name
                  is unique, or use a full signature such as func(return).
--func            Operates on a function (default).
--agg             Operates on an aggregate.

Table 7-7 describes the options that allow you to override defined UDX values.
Table 7-7: nzudxrunharness UDX Override Options

Option            Description
--mask NONE,      Specifies the logging mask override. Valid values are
DEBUG, TRACE      NONE, DEBUG, or TRACE. You can also specify a
                  comma-separated combination of DEBUG, TRACE to log both
                  types of messages.
--over override   Specifies the string column size overrides. The value is
                  specified as 1-40:2-400 where the first number is the
                  column (1 based). The second number is the character size
                  (not byte size).
--varargs cols    Specifies the argument info for VARARGS UDX. You specify
                  the value as a colon-separated series of types. For
                  example: VARCHAR(100):NUMERIC(10,3):INT4
--fenced          Runs the harness in fenced mode. This setting overrides
                  the fencing defined for the UDX in the database.
--unfenced        Runs the harness in unfenced mode. This setting overrides
                  the fencing defined for the UDX in the database.
--object file     Specifies an object file to use instead of the object file
                  specified for the UDX in the database. You can also use
                  the --data option with the --object option because it may
                  be necessary for library object file locations.
--nodlclose       Specifies that the test harness should not invoke the C
                  library function dlclose() to close references to UNIX
                  shared libraries that were made available with dlopen().
                  The test harness invokes dlclose() by default.
                  If you are running the test harness within a debugging
                  tool such as valgrind or callgrind, specify this option so
                  that the harness does not invoke dlclose() automatically.
                  This allows you to access symbol names and other values
                  that can be useful for debugging, but which may not be
                  available after dlclose() has been called. (For more
                  information about the valgrind debugging environment, see
                  http://valgrind.org.)
--final           For user-defined table functions, invokes the table
                  function using the TABLE WITH FINAL behavior.

The test harness runs the specified UDX using either a supplied data file or random data that it creates based on the --rows, --groups, and --nulls flags. With the --nulls flag, the specified
columns will be null about 50% of the time. Also, when using random data, strings will be
filled to maximum capacity, which will either be based on the argument signature, or the
overrides specified by --over. Using a supplied data file is the best way to test the correctness of your algorithm.
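As an illustration of the --over value format described in Table 7-7 (for example, 1-40:2-400), the following minimal sketch splits such a value into column and character-size pairs. The parseOverrides function is a hypothetical helper written for this illustration; it is not how the harness itself parses the option, and the rejection rules for malformed input are assumptions.

```cpp
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Parses a value such as "1-40:2-400" into (column, character size) pairs.
// Columns are 1 based. Returns false if any item is malformed.
bool parseOverrides(const std::string& spec,
                    std::vector<std::pair<int, int> >& out) {
    const char* kDigits = "0123456789";
    out.clear();
    std::string::size_type start = 0;
    while (start <= spec.size()) {
        std::string::size_type end = spec.find(':', start);
        if (end == std::string::npos) end = spec.size();

        // One item has the form column-size.
        const std::string item = spec.substr(start, end - start);
        const std::string::size_type dash = item.find('-');
        if (dash == std::string::npos || dash == 0 || dash + 1 == item.size())
            return false;

        const std::string column = item.substr(0, dash);
        const std::string size = item.substr(dash + 1);
        if (column.find_first_not_of(kDigits) != std::string::npos ||
            size.find_first_not_of(kDigits) != std::string::npos)
            return false;

        const int columnNumber = std::atoi(column.c_str());
        if (columnNumber < 1) return false;  // columns start at 1
        out.push_back(std::make_pair(columnNumber, std::atoi(size.c_str())));
        start = end + 1;
    }
    return true;
}
```

For the value 1-40:2-400, this sketch yields column 1 with a size of 40 characters and column 2 with a size of 400 characters.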
Using the --mask flags shows the results of logMsg calls. The --print and --hex flags show
the results of evaluate or performFinalResult. The harness also prints out external routines
found in the object file. The --dbg flag invokes the debugger so that the actual object can
be debugged. If you use the debugger, make sure that the host object file for the UDX has
been compiled with debugging symbols. (Typically, you would compile using the optimized
mode instead.)
In the data file, types like interval and timetz with more than one piece of information must
have the fields separated by a colon (:). Nulls can be specified by <NULL>.
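To make the data file conventions concrete, the following minimal sketch splits one line of a test data file into fields and recognizes the <NULL> marker. The Field type and the splitRow function are hypothetical names for this illustration. The sketch deliberately ignores quoting, escape characters, and hexadecimal input, so it covers only the simplest case of a separator-delimited file.

```cpp
#include <string>
#include <vector>

// Hypothetical representation of one field from a test data line.
struct Field {
    bool isNull;       // true when the field was written as <NULL>
    std::string text;  // raw field text; empty for nulls
};

// Splits one line on the separator character. Quoting and escape
// characters are not handled in this sketch.
std::vector<Field> splitRow(const std::string& line, char separator) {
    std::vector<Field> fields;
    std::string::size_type start = 0;
    for (;;) {
        std::string::size_type end = line.find(separator, start);
        if (end == std::string::npos) end = line.size();

        Field field;
        field.text = line.substr(start, end - start);
        field.isNull = (field.text == "<NULL>");
        if (field.isNull) field.text.clear();
        fields.push_back(field);

        if (end == line.size()) break;  // consumed the last field
        start = end + 1;
    }
    return fields;
}
```

A line such as 1,Customer A,<NULL> would produce three fields, the last of which is marked as null.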

Test Harness Control File


When you run the nzudxrunharness command, the command creates a control file called
udfname_func.harness or udaname_agg.harness. A control file is a text file that specifies
the details of the test and simulation environment.
The nzudxrunharness command writes the harness control file to the current directory. For
the CustomerName UDF example in the section "UDX Test Harness," the command created and used the following control file:
[nz@nzhost udx]$ more customername_func.harness
udxtype: udf
objectfile:/nz/data.1.0/base/2547310/udf/2928850.oh
udxname:customername
version: 2
numenvironments:0
numreturns:1
returninfo:16:-1:-1
numarguments:1
argument:1:64000:-1
classname:CCustomerName
fenced:t
deterministic:t
nullcall:t
memory:0
logmask:0
nulls:
numdependencies:0
undefined:vtable for __cxxabiv1::__class_type_info
undefined:vtable for __cxxabiv1::__si_class_type_info
undefined:operator delete[](void*)
undefined:operator delete(void*)
undefined:operator new(unsigned int)
undefined:__cxa_pure_virtual
undefined:__gxx_personality_v0
undefined:free
undefined:memcmp
undefined:sprintf
undefined:strcmp
undefined:strdup
undefined:throwError
inputdelim:,
inputquote: true
hexinput: false
printoutput: none
numrows: 100
validate: true

You can edit the control file parameters to change the test environment. You can also use
the udxharness binary and specify one or more control files to test a UDX or several UDXs
and their interactions in the same transaction scope. For example:
[nz@nzhost udx]$ udxharness -f customername_func.harness -f penmaxv2_agg.harness -k /nz/kit

This sample command runs both the customername UDF and penmax UDA in the same
test environment to evaluate the impact on the system.
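If you edit control files by hand or generate them from scripts, it can be useful to read them back programmatically. The following minimal sketch shows one way to split a control file line at its first colon and to decode a type:typmod:scale value such as the one on the argument and returninfo lines. The Setting, TypeInfo, trim, parseSetting, and parseTypeInfo names are hypothetical and exist only in this illustration; the sketch does not reproduce how udxharness itself reads the file.

```cpp
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Hypothetical pair of (parameter name, raw value text).
typedef std::pair<std::string, std::string> Setting;

// Removes leading and trailing whitespace.
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Splits one control file line at its first colon. The value may be empty
// (as in "nulls:") and may itself contain colons.
bool parseSetting(const std::string& line, Setting& out) {
    const std::string::size_type colon = line.find(':');
    if (colon == std::string::npos || colon == 0) return false;
    out.first = trim(line.substr(0, colon));
    out.second = trim(line.substr(colon + 1));
    return !out.first.empty();
}

// Hypothetical holder for the first three fields of an info value.
struct TypeInfo {
    int type;
    int typmod;
    int scale;
};

// Decodes "type:typmod:scale". A fourth field (the column name on a UDTF
// returninfo line) is ignored by this sketch.
bool parseTypeInfo(const std::string& value, TypeInfo& out) {
    int parts[3];
    std::string::size_type start = 0;
    for (int i = 0; i < 3; ++i) {
        std::string::size_type end = value.find(':', start);
        if (end == std::string::npos) {
            if (i < 2) return false;  // fewer than three fields
            end = value.size();
        }
        const std::string piece = value.substr(start, end - start);
        if (piece.empty()) return false;
        char* stop = 0;
        const long number = std::strtol(piece.c_str(), &stop, 10);
        if (*stop != '\0') return false;  // not a whole number
        parts[i] = static_cast<int>(number);
        start = end + 1;
    }
    out.type = parts[0];
    out.typmod = parts[1];
    out.scale = parts[2];
    return true;
}
```

For the line argument:1:64000:-1 from the sample control file, parseSetting would yield the name argument and the value 1:64000:-1, and parseTypeInfo would then yield a type of 1, a typmod of 64000, and a scale of -1.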
Table 7-8 describes the harness control file parameters.
Table 7-8: Control File Parameters

Parameter             Description

Options common to UDFs, UDAs, and UDTFs

udxtype: type         Specifies the UDX type. The type must be udf, uda, or
                      udtf. This parameter must be the first one in the
                      control file.
numarguments: num     Specifies the number of arguments for the UDX. It must
                      be immediately followed by the arguments. You must
                      specify exactly num arguments.
argument: info        Specifies one of the arguments referenced by num. The
                      format of the info value is type:typmod:scale.
                      The type value is one of the DataType enums from the
                      UdxBase C++ class.
                      The typmod value is -1, or the size of a string, or
                      the precision of a numeric.
                      The scale value is -1 or the scale of a numeric.
classname: class      Specifies the C++ class for the UDX. This parameter is
                      required.
datafile: file        Specifies the input data file.
numdependencies: num  Specifies the number of library dependencies for the
                      UDX. It must be immediately followed by the
                      dependencies. You must specify exactly num
                      dependencies.
dependency: libinfo   Specifies a library dependency for the UDX. You must
                      specify the libraries in correct order; that is, the
                      libraries that depend on other libraries must be
                      listed after the libraries that they depend on. The
                      format of the libinfo value is auto,file,name.
                      The auto value is t for automatic load and f for
                      manual load.
                      The file value is the .so library object file.
                      The name value is the library name.
                      Spaces before or after the commas are not allowed,
                      unless they are part of the file or name.

fenced: value         Specifies whether the UDX should be run in fenced mode
                      or unfenced mode.
                      You can specify a boolean value such as true, t, on,
                      yes, y, or 1 to run the UDX in fenced mode. Specify a
                      value of false, f, off, no, n, or 0 to run the UDX in
                      unfenced mode.
shaper: value         Specifies whether the UDX should call a sizer or
                      shaper. The default is false (do not call a
                      sizer/shaper).
                      You can specify a boolean value such as true, t, on,
                      yes, y, or 1; or false, f, off, no, n, or 0.
hexinput: value       Specifies whether the data in the input file is in hex
                      format. The default is false (not in hex format).
                      You can specify a boolean value such as true, t, on,
                      yes, y, or 1; or false, f, off, no, n, or 0.
inputdelim: delim     Specifies the delimiter for the input file. The
                      default is comma.
inputescape: escape   Specifies the escape character for the input file. The
                      default is no escape character.

inputquote: value     Specifies whether string data in the input file will
                      be quoted. The default is false (not quoted).
                      You can specify a boolean value such as true, t, on,
                      yes, y, or 1; or false, f, off, no, n, or 0.
logmask: mask         Specifies the log mask for the UDX. The valid values
                      are 1 for TRACE, 2 for DEBUG, 3 for both DEBUG and
                      TRACE, or 0 for NONE. The default is 0.
memory: mem           Specifies the maximum memory for the UDX. The mem
                      value can be an empty value or a value in the form of
                      a number and one of the letters b (bytes),
                      k (kilobytes), m (megabytes), or g (gigabytes). The
                      default is 0.
nulls: cols           Specifies the columns that will be null randomly when
                      using randomly generated test data (no input data file
                      specified). The cols value is a comma-separated list
                      of column numbers. The column numbers start at 1.
numreturns: num       Specifies the number of return columns for the UDX.
                      The value is 1 for UDFs and UDAs, but it can be 1 or
                      more for UDTFs. It must be immediately followed by the
                      return info. You must specify exactly num return
                      values. This parameter is required.
returninfo: info      Specifies the return info. The info value has the form
                      type:typmod:scale:name.
                      The type value is one of the DataType enums from the
                      UdxBase C++ class.
                      The typmod value is -1, or the size of a string, or
                      the precision of a numeric.
                      The scale value is -1 or the scale of a numeric.
                      The name value is used only for a UDTF, where it
                      specifies the column name. For a UDTF, the return info
                      must be specified in the correct order; that is, the
                      same order in which the columns are declared for the
                      table function.
numrows: num          Specifies the number of randomly generated rows to
                      produce for the test. This option is only used when
                      you do not specify an input data file. The default is
                      100.
objectfile: file      Specifies the host UDX object file. This value is
                      required.
printoutput: type     Specifies how to print the output of the UDX. Possible
                      values are normal, hex, or none. Normal prints the
                      normally expected output; hex prints strings in their
                      hex representation instead of string representation;
                      none does not print any output. The default is none.

validate: value       Specifies whether to validate the results for buffer
                      overruns, which adds execution time. The default is
                      true.
                      You can specify a boolean value such as true, t, on,
                      yes, y, or 1; or false, f, off, no, n, or 0.
udxname: name         Specifies the name of the UDX as registered in the
                      database. This value is required.
undefined: symbol     Specifies an undefined symbol from the object file.
                      There can be 0 or more symbols. This is only printed
                      out by the program to show issues such as potentially
                      unresolved references. This parameter is not required.
version: ver          Specifies the UDX API version. The ver value can be 1
                      or 2. The default is 1.

Options specific to UDAs

columnorder: column   Specifies the column to use for grouping when
                      executing an aggregate. This parameter triggers the
                      calling of InitializeState when the group changes.
                      Column is a 1-based number specifying the position in
                      the data file. This parameter is required if you
                      specify a data input file.
groups: num           Specifies the number of groups to simulate when using
                      random data instead of a data input file. The default
                      is 5.
numstate: num         Specifies the number of state values for the UDA. It
                      must be immediately followed by the states. You must
                      specify exactly num states. This parameter is
                      required.
state: info           Specifies one of the state values for the UDA. The
                      format of the info field is type:typmod:scale.
                      The type value is one of the DataType enums from the
                      UdxBase C++ class.
                      The typmod value is -1, or the size of a string, or
                      the precision of a numeric.
                      The scale value is -1 or the scale of a numeric.

Options specific to UDTFs

deterministic: value  Specifies whether the UDF is deterministic. The
                      default is true.
                      You can specify a boolean value such as true, t, on,
                      yes, y, or 1; or false, f, off, no, n, or 0.
nullcall: value       Specifies whether the UDF is called when one of the
                      arguments is null. The default is false.
                      You can specify a boolean value such as true, t, on,
                      yes, y, or 1; or false, f, off, no, n, or 0.

UDTF-Specific Parameters

lastcall: value       Specifies whether the UDTF is called after the last
                      input row. The default is false.
                      You can specify a boolean value such as true, t, on,
                      yes, y, or 1; or false, f, off, no, n, or 0.

UDX Version 2-Specific Options

numenvironments: num  Specifies the number of environment values for the
                      UDX. It must be immediately followed by the
                      environment values. You must specify exactly num
                      variables.
environment: info     Specifies an environment entry for the UDX. The format
                      of the info value is name, value.
                      The name value specifies the name of the environment
                      setting.
                      The value is the value of the environment setting.
close: value          Specifies whether the test harness invokes the C
                      library function dlclose() to close references to UNIX
                      shared libraries that were made available with
                      dlopen().
                      You can specify a boolean value such as true, t, on,
                      yes, y, or 1; or false, f, off, no, n, or 0. The
                      default is true to call dlclose().
                      If you are running the test harness within a debugging
                      tool such as valgrind or callgrind, set this value to
                      false so that the harness does not invoke dlclose()
                      automatically. This allows you to access symbol names
                      and other values that can be useful for debugging, but
                      which may not be available after dlclose() has been
                      called. (For more information about the valgrind
                      debugging environment, see http://valgrind.org.)
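Several control file parameters accept the same set of boolean spellings, and the logmask value combines TRACE (1) and DEBUG (2). As a minimal sketch of those two conventions, the following hypothetical helpers accept the documented boolean values and decode a log mask. The parseHarnessBool and decodeLogMask functions and the LogMask type are names invented for this illustration. The sketch matches lowercase spellings exactly, and treating logmask as a pair of bit flags is an interpretation of the values 1, 2, and 3; whether the harness accepts other letter case is not stated here.

```cpp
#include <string>

// Accepts the documented spellings: true, t, on, yes, y, 1 and
// false, f, off, no, n, 0. Returns false for anything else.
bool parseHarnessBool(const std::string& text, bool& result) {
    static const char* const kTrue[] = {"true", "t", "on", "yes", "y", "1"};
    static const char* const kFalse[] = {"false", "f", "off", "no", "n", "0"};
    for (int i = 0; i < 6; ++i) {
        if (text == kTrue[i]) {
            result = true;
            return true;
        }
        if (text == kFalse[i]) {
            result = false;
            return true;
        }
    }
    return false;
}

// Hypothetical decoded form of the logmask parameter.
struct LogMask {
    bool trace;
    bool debug;
};

// Decodes 0 (NONE), 1 (TRACE), 2 (DEBUG), or 3 (both).
bool decodeLogMask(int mask, LogMask& out) {
    if (mask < 0 || mask > 3) return false;
    out.trace = (mask & 1) != 0;
    out.debug = (mask & 2) != 0;
    return true;
}
```

For example, the line fenced:t from the sample control file would decode to true, and a logmask of 3 would enable both TRACE and DEBUG messages.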

Debugging Using UDX Stubs


You can replace calls to UDFs or UDAs with a stub object that returns a trivial value. This
allows you to turn off your user-defined functions or aggregates if a table query fails. Disabling UDXs may also be helpful during times when you need to troubleshoot Netezza
query performance or system issues such as SPU resets. The UDX stub capability is not
supported for queries that run in the Postgres environment on the host.
The UDX stub capability is controlled using an nzsql session variable called udx_stub. If
you set the variable to true (or 1), UDXs will not be executed fully. Instead, they return a
basic value that essentially causes them to be ignored.

To enable the UDX stub processing, do the following:


1. Within your nzsql session, set the session variable as follows:
MYDB(MYUSER)=> set udx_stub=1;

2. The command returns the message SET VARIABLE if successful.


When you next run a SQL query that calls a UDX, you should notice that the UDX did not
execute. For example, use the CustomerName UDF example from Chapter 2. If you enter
the following query:
MYDB(MYUSER)=> SELECT * FROM customers WHERE CustomerName(b) = 1;

The output would normally appear as follows:


 A |      B
---+--------------
 1 | Customer A
 4 | Customer ABC
(2 rows)

If you enable the UDX stub processing, the output for the same query appears as follows:
 A |      B
---+--------------
 3 | Customer CBA
 1 | Customer A
 4 | Customer ABC
 2 | Customer B
(4 rows)

Essentially, with the UDX stub enabled, the WHERE clause does not restrict the output;
therefore, the command displayed the entire customers table.

To disable the UDX stub processing and enable your user-defined functions and aggregates, set the udx_stub session variable to false (0):
MYDB(MYUSER)=> set udx_stub=0;


APPENDIX

Creating Memory Workpads Using the SPUPad


What's in this appendix
Uses of the SPUPad
Content Restrictions for the SPUPad
How to Define and Use a SPUPad
SPUPad-Related API
Special Considerations for UDXs that Have SPUPads
Best Practices for Registering UDXs that Use SPUPads
Examples

This appendix describes the SPUPad feature, which allows UDX developers to allocate a
named, unique area of memory as a temporary storage area and workpad. A SPUPad typically resides in memory on the Netezza S-Blades, but it can also reside in memory on the
host. Its location depends upon the location of the user tables on which it operates. When
a SPUPad runs on an S-Blade, the system creates one SPUPad for each dataslice managed
by the S-Blade.
A user-defined function or user-defined aggregate can call the SPUPad routines to create a
SPUPad, write data to it, and read data from it. The SPUPad is temporary because it persists only for the lifetime of the transaction or transaction block which called the function
that created it. When the transaction completes, the memory used for the SPUPad is automatically freed.
The SPUPad feature allows UDX developers to allocate and write data directly to the memory of the S-Blade or the Netezza host. Use caution when using this feature. You should be
very familiar with the Netezza architecture and verify your code and memory allocations, as
problems in the code could create out-of-memory situations, S-Blade resets, and other
impacts that would affect the performance and availability of your Netezza system.

Uses of the SPUPad


The SPUPad may be a helpful option if you are designing UDFs or UDAs that perform the
following types of actions:

Return multiple result columns; currently, UDFs and UDAs cannot return multiple
results unless they are encoded in a string and you have UDXs that return the string as
well as extract values from that string.

Process using a lookup table of values that can speed processing for the UDX. For
example, you might want to create a temporary table of facts or values that might help
the UDFs or UDAs to run more quickly.

Serve as (or leverage) a common engine-style UDF to initialize a program or routine
that is passed as an argument of a UDF or UDA.

Content Restrictions for the SPUPad


Because SPUPads are areas of Netezza system memory that you can write to, read from,
and from which you can run programs or engines, use extreme caution with the data that
you write to the SPUPad.
The SPUPad cannot be used to store some types of code objects. Attempting to store these
objects could cause the S-Blade to reset. Therefore, while C++ objects can be stored in the
SPUPad, some methods and C++ virtual function tables on those objects will not work.
C++ objects of type class, struct, or union can be divided into two groups: aggregates and
non-aggregates. An aggregate is a class, struct, or union with no constructors, no private or
protected members, no base classes, and no virtual functions. All other classes are non-aggregates. Aggregates can have static member functions and static class members.
Aggregate classes can be stored in the SPUPad. Although it is not supported, you should be
able to store a non-aggregate class if there is no base class and no virtual methods.
This has particular implications for destructors on objects. Non-virtual destructors on the
objects are useful only if you manually invoke PAD_DELETE on an object and want the
object to invoke PAD_DELETE on its children. (No heap memory used by the object or its
children can be allocated using anything other than PAD_NEW.)
When the object is destroyed automatically after the transaction, the destructor is not
invoked, but the child objects are freed automatically to avoid a memory leak.
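The distinction between aggregates and classes that carry virtual function tables can be checked at compile time. The following minimal sketch uses standard C++11 type traits as a conservative approximation; the PadPoint, PadShape, and LooksPadSafe names are hypothetical and exist only in this illustration. The check is not identical to the aggregate definition given above (for example, it does not look for constructors), so treat it as a guard against the clearest problem case, virtual functions, rather than as a complete test.

```cpp
#include <type_traits>

// An aggregate in the sense described above: public data only, with no
// constructors, no base class, and no virtual functions.
struct PadPoint {
    int x;
    int y;
};

// A non-aggregate: the virtual functions mean that each instance carries
// a pointer to a virtual function table.
struct PadShape {
    virtual ~PadShape() {}
    virtual int area() const { return width * height; }
    int width;
    int height;
};

// Hypothetical compile-time guard: rejects polymorphic types and types
// that are not standard layout. This is stricter in some respects and
// looser in others than the aggregate definition in the text.
template <typename T>
struct LooksPadSafe
    : std::integral_constant<bool,
                             !std::is_polymorphic<T>::value &&
                                 std::is_standard_layout<T>::value> {};

static_assert(LooksPadSafe<PadPoint>::value,
              "a plain aggregate should pass the guard");
static_assert(!LooksPadSafe<PadShape>::value,
              "a type with virtual functions should be rejected");
```

A guard of this kind could be placed next to the structure definitions that you intend to store, so that an accidental virtual function is reported when the code is compiled rather than when it runs.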

How to Define and Use a SPUPad


The process to define a UDX that has a SPUPad is the same process as described in Chapter 2; you create a C++ program, which now includes special functions and code for a
SPUPad, then you compile and register your UDX.
Note: SPUPads are not supported for UDXs that operate in fenced mode.
To use a SPUPad in your C++ programs, follow these high-level steps (described in later
sections):

Define the data type contents of a SPUPad object.

Define a SPUPad with a specific name.

Write data to the SPUPad.

Read data from the SPUPad.

Your UDX could be designed to create a SPUPad, write and manipulate the content, and
return data all in one function call. You could also design several UDXs to perform some or
all of those tasks separately within a query transaction block.
The following sections show some examples of the code that you can add to your C++
programs.

Define the SPUPad Content


As the first step in creating a SPUPad within your C++ program, identify the types of data
that will be saved in your SPUPad. If you plan to have several different types of data in the
SPUPad, consider creating a structure type to define the content. (In the SPUPad examples, this structure is referred to as the Root structure type. Each SPUPad gives you access
to a portion of memory that can be thought of as the root of a tree. Basically, all the objects
in the SPUPad can be reached only through the root object.)
In a root object, you could store a single object (int, double, etc.), an array (char*, int*,
etc.), or a structure. As an example, assume that you plan to use a SPUPad to store a string
of characters. The Root structure might contain a pointer to the string, and perhaps a size
value to help set some boundaries for the string size (as well as the memory that will be
consumed by the SPUPad).
An example of a Root structure for this SPUPad follows:
#include "udxinc.h"
#include <string.h>
using namespace nz::udx;
struct Root
{
    char* data;
    int size;
};

Note: To review the entire sample program and its comments, see the section "string_pad_create.cpp."
A structure shows an interesting aspect of root objects; for example, the following sample
code shows a root object that implements a simple dictionary:
struct MyValue
{
    char* name;
    int value;
};
struct MyLookup
{
    MyValue *values;
    int numallocated;
    int numused;
};

In this example, the root object is an instance of MyLookup, which can contain an arbitrary
number of MyValue objects. All of the objects, plus the char* strings, are allocated through
the SPUPad allocation mechanisms, but the only way to get to a value (or the name in a
value) is through the MyLookup root object.
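The following minimal sketch shows how entries in such a dictionary are reached only through the root object. It reuses the MyValue and MyLookup structures from the sample, but the createLookup, addValue, findValue, and destroyLookup functions are hypothetical helpers for this illustration. The sketch uses ordinary new and delete as a stand-in so that it can be compiled outside Netezza; a real UDX would allocate every object through the SPUPad allocation mechanisms instead.

```cpp
#include <cstring>

struct MyValue
{
    char* name;
    int value;
};
struct MyLookup
{
    MyValue *values;
    int numallocated;
    int numused;
};

// Creates a root object with room for a fixed number of entries.
// Plain new is a stand-in for pad allocation in this sketch.
MyLookup* createLookup(int capacity)
{
    MyLookup* root = new MyLookup;
    root->values = new MyValue[capacity];
    root->numallocated = capacity;
    root->numused = 0;
    return root;
}

// Adds a name and value; returns false when the table is full.
bool addValue(MyLookup* root, const char* name, int value)
{
    if (root->numused >= root->numallocated)
        return false;
    char* copy = new char[std::strlen(name) + 1];
    std::strcpy(copy, name);
    root->values[root->numused].name = copy;
    root->values[root->numused].value = value;
    ++root->numused;
    return true;
}

// Every lookup starts at the root object and walks its values array.
const MyValue* findValue(const MyLookup* root, const char* name)
{
    for (int i = 0; i < root->numused; ++i)
    {
        if (std::strcmp(root->values[i].name, name) == 0)
            return &root->values[i];
    }
    return 0;
}

// Needed only in this stand-alone sketch, which owns its memory.
void destroyLookup(MyLookup* root)
{
    for (int i = 0; i < root->numused; ++i)
        delete[] root->values[i].name;
    delete[] root->values;
    delete root;
}
```

Because findValue takes only the root pointer, any function that can obtain the root object can reach every name and value stored beneath it.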
If you create multiple C++ files to define UDXs that will manipulate the data in the same
SPUPad, make sure that you repeat your Root structure definition in each C++ file. If you
define a number of common structures or definitions, you could create an include file to
define all these objects in one location.

Although there is no maximum number of objects that a SPUPad can hold, as a best practice try to limit the number of objects that you create to only those that you really need. The
SPUPad keeps track of each object using a pointer per object, which adds to the memory
consumed by the SPUPad.

Create a SPUPad
Within the UDF evaluate method, you create a named pad of type SPUPad for your function. The following example creates a SPUPad named stringpad, which is a string
storage pad. The UDF called string_pad_create takes an input size and string, then saves
each character in the string in an array.
class StringPadCreate: public Udf
{
private:
public:
    static Udf* instantiate();
    virtual ReturnValue evaluate()
    {
        CPad* pad = getPad("stringpad");
        Root *ro = (Root*) pad->getRootObject(sizeof(Root)); //Line 4.
        if(!ro) // ro is NULL: stringpad does not exist; safe to create it.
        {
            ro=PAD_NEW(pad,Root);
            int32 size = int32Arg(0);
            StringArg* a = stringArg(1);
            int32 stringSize=a->length;
            if (size<1 || size > 64000)
            {
                throwUdxException("Given size is out of range.");
            }
            if(stringSize > size)
            {
                throwUdxException("Given string bigger than given size.");
            }
            ro->size=size;
            ro->data=PAD_NEW(pad,char)[size]; // PAD_NEW creates an array to
                                              // hold each character in the
                                              // input string.
            for(int i=0; i<size; i++)
            {
                if(i<stringSize)
                {
                    ro->data[i]=a->data[i];
                }
                else
                {
                    ro->data[i]=' '; // Fill the rest of the array with spaces.
                }
            }
            pad->setRootObject(ro, sizeof(Root)); //Line 34.
            NZ_UDX_RETURN_BOOL(true);
        }
        else // stringpad already exists; stop processing the create task.
        {
            NZ_UDX_RETURN_BOOL(false);
        }
    }
};
Udf* StringPadCreate::instantiate()
{
    return new StringPadCreate;
}

In the sample code, note the following important practices:

On Line 4, there is a call to getRootObject which verifies whether stringpad already
exists. If the pad does not exist, the program creates it and returns true; otherwise the
function does not create another pad and returns false. As a best practice, your UDX
should check for the presence of the SPUPad before it creates the SPUPad.

On Line 34, the program sets the root object for the SPUPad so that subsequent calls
or functions can reference the SPUPad.
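The check-then-create pattern described above can be sketched outside the Netezza environment as follows. MockPad is a hypothetical stand-in that only mimics the documented getRootObject()/setRootObject() behavior; it is not the CPad class, and createIfMissing is an illustrative helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for a pad; it is not the Netezza CPad class.
class MockPad
{
public:
    MockPad() : root_(NULL), rootSize_(0) {}
    void* getRootObject(std::size_t size)
    {
        if (root_ != NULL && size != rootSize_)
        {
            throw std::runtime_error("root size mismatch");
        }
        return root_; // NULL when no root object has been set
    }
    void setRootObject(void* ptr, std::size_t size)
    {
        assert(ptr != NULL);
        if (root_ != NULL)
        {
            throw std::runtime_error("root object already set");
        }
        root_ = ptr;
        rootSize_ = size;
    }
private:
    void* root_;
    std::size_t rootSize_;
};

struct Root
{
    char* data;
    int size;
};

// Returns true if this call set the root, false if a root already existed.
bool createIfMissing(MockPad& pad, Root& storage)
{
    Root* ro = static_cast<Root*>(pad.getRootObject(sizeof(Root)));
    if (ro != NULL)
    {
        return false;
    }
    storage.data = NULL;
    storage.size = 0;
    pad.setRootObject(&storage, sizeof(Root));
    return true;
}
```

Calling createIfMissing twice on the same MockPad returns true the first time and false the second time, which mirrors the t and f results that string_pad_create returns.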

Process Data In a SPUPad


After you define the code to create and save data into a SPUPad, the next step is to process
the SPUPad's data in some way.
Continuing the stringpad example, the following sample code is a new UDF called string_pad_get
that takes an input position. The UDF uses the position value to return the character in that position of the array saved in the SPUPad. If the stringpad does not exist, the
function exits with an error.
#include "udxinc.h"
#include <string.h>
using namespace nz::udx;
struct Root
{
    char* data;
    int size;
};
class StringPadGet: public Udf
{
private:
public:
    static Udf* instantiate();
    virtual ReturnValue evaluate()
    {
        if(isArgNull(0))
        {
            throwUdxException("cannot accept null arguments.");
        }
        if(argType(0)!=UDX_INT32)
        {
            throwUdxException("First argument must be int4.");
        }
        CPad* pad = getPad("stringpad");
        Root *ro = (Root*) pad->getRootObject(sizeof(Root));
        if(!ro)
        {
            throwUdxException("Pad does not exist");
        }
        else
        {
            int index = int32Arg(0);
            if(index<0||index>=ro->size)
            {
                throwUdxException("Index out of bounds");
            }
            StringReturn *ret = stringReturnInfo();
            ret->size=1;
            ret->data[0]=ro->data[index];
            NZ_UDX_RETURN_STRING(ret);
        }
    }
};
Udf* StringPadGet::instantiate()
{
    return new StringPadGet;
}

Running the stringpad UDFs


After you create the C++ program(s) that contain your SPUPad calls, you compile them
using nzudxcompile and register them on the Netezza system. For a detailed description of
the command, see nzudxcompile Command Syntax on page 6-18. For an example of the
CREATE [OR REPLACE] FUNCTION commands to register these sample UDFs, see string_pad_create.cpp
on page A-14 and string_pad_get.cpp on page A-17. These functions
could easily be saved in one C++ program; after you compile the program, you would use
the CREATE [OR REPLACE] FUNCTION command to register each of the functions as separate UDFs.
After you compile and register the UDFs, you can run them. For example, the following
sequence of commands entered at the nzsql prompt shows how these functions process the
data. The commands are processed within a BEGIN/COMMIT transaction block so that the
SPUPad still exists for the string_pad_get function.
Note: The one_dslice table has one row and thus is saved on only one dataslice. Its content
has no bearing on the string processing in this example, but its presence and content cause
the UDF to create the SPUPad on the S-Blade that manages the dataslice where the single
row of the table resides. For more information about how the table can impact the operation of the SPUPad, see Tables and SPUPad Operations on page A-11.
MYDB(MYUSER)=> BEGIN TRANSACTION;
BEGIN
MYDB(MYUSER)=> SELECT string_pad_create(10, 'netezza') FROM one_dslice;
 STRING_PAD_CREATE
-------------------
 t
(1 row)
MYDB(MYUSER)=> SELECT string_pad_get(0) FROM one_dslice;
 STRING_PAD_GET
----------------
 n
(1 row)
MYDB(MYUSER)=> SELECT string_pad_get(1) FROM one_dslice;
 STRING_PAD_GET
----------------
 e
(1 row)
MYDB(MYUSER)=> SELECT string_pad_get(2) FROM one_dslice;
 STRING_PAD_GET
----------------
 t
(1 row)
MYDB(MYUSER)=> COMMIT;
COMMIT

If you run these functions as single select statements, note that the first function (string_pad_create)
would run, create the SPUPad and the character array, and then exit. When the
function returns, the Netezza system automatically cleans up the SPUPad and frees the
memory. If you then run string_pad_get in a single select, you would see the following error
because the SPUPad no longer exists:
MYDB(MYUSER)=> select string_pad_get(0) from one_dslice;
ERROR: Pad does not exist
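The lifetime rule above can be modeled with a small sketch. ToySession is a hypothetical class invented for this illustration; it is not how the Netezza system implements pads, and it only shows why a pad created in one standalone SELECT is gone by the time the next one runs.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model of transaction-scoped pads; not the Netezza runtime.
class ToySession
{
public:
    // Stores text under a pad name; returns false if the pad already exists.
    bool create(const std::string& name, const std::string& text)
    {
        assert(!name.empty());
        if (pads_.count(name) != 0)
        {
            return false;
        }
        pads_[name] = text;
        return true;
    }
    bool exists(const std::string& name) const
    {
        return pads_.count(name) != 0;
    }
    // Models the automatic cleanup when a transaction (or a lone SELECT) ends.
    void endTransaction()
    {
        pads_.clear();
    }
private:
    std::map<std::string, std::string> pads_;
};
```

Inside a BEGIN/COMMIT block, endTransaction() would be reached only at COMMIT, so both UDFs see the same pad. For standalone statements, the cleanup happens between the create call and the get call.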

SPUPad-Related API
The following functions and macros are used for creating and managing the SPUPads
within your UDX code.

allocate Function on page A-7

deallocate Function on page A-8

setRootObject Function on page A-8

getRootObject Function on page A-9

getTotalSize Function on page A-9

getPad Function on page A-9

PAD_NEW Macro on page A-9

PAD_DELETE Macro on page A-10

isUserQuery Function on page A-10

allocate Function
Allocates the specified amount of memory and returns it.

Syntax
The function has the following syntax:
virtual void *allocate(const size_t sz, bool array=false)


Description
allocate() uses NzAllocObject to allocate memory from the heap for the SPUPad. The only
size restriction is the amount of available heap memory. Instead of using the allocate()
function to allocate memory, review the PAD_NEW macro, which can help you to perform
the allocations and also manage constructors for you. Use the allocate() function only if you
are using C-style code and want to replace calls to malloc/calloc instead of calls to new.

Throws
allocate() throws an exception if it cannot allocate the memory.

deallocate Function
Deallocates the memory at the specified pointer, which must have been previously allocated by
allocate().

Syntax
The function has the following syntax:
virtual void deallocate(void *ptr)

Description
deallocate() uses NzFreeObject to deallocate memory for the SPUPad and return it to the
heap. Instead of using the deallocate() function to free memory, review the PAD_DELETE
macro, which can help you to free the memory and also manage destructors for you. Use
the deallocate() function if you are using C-style code and want to replace calls to free
instead of calls to delete or delete[].

Throws
deallocate() throws an exception if the specified object was not allocated by the pad using
allocate().

setRootObject Function
Sets the root object and size for the pad.

Syntax
The function has the following syntax:
virtual void setRootObject(void *ptr, size_t size)

Description
This function can be called only once per SPUPad instance; subsequent calls will throw
an exception. The size is the size of the root object, not the root object plus all its children.
The size argument must correspond to the size of the object as it was allocated using allocate(), and as such is subject to the same restrictions.

Throws
setRootObject() throws an exception if the pad already has a root object. (You cannot reset
the root once it is set.)


getRootObject Function
Gets the root object of a pad, or returns NULL if no root object has been set for the pad.

Syntax
The function has the following syntax:
virtual void * getRootObject(size_t size)

Description
This size value must match the size specified in the setRootObject() call to ensure that the
function is retrieving the root for the expected object. getRootObject() will return the root
object if it is set, NULL otherwise.

Throws
getRootObject() throws an exception if the root object is set but the size specified is not
equal to the size that the object was registered with using setRootObject().

getTotalSize Function
Returns the total size in bytes of all objects allocated by a SPUPad.

Syntax
The function has the following syntax:
virtual int32 getTotalSize()

Description
getTotalSize() returns the current size in bytes allocated by the SPUPad. The function
always returns a positive number if the pad is not empty, or 0 if the pad is empty. The sizes
that make up the total are subject only to the restrictions of allocate()/deallocate().

getPad Function
Returns a SPUPad object of the specified name.

Syntax
The function has the following syntax:
extern CPad* getPad(const char* strName)

Description
getPad() returns a SPUPad object or creates a new SPUPad object if it does not already
exist. The Netezza system will be responsible for cleaning up and freeing all objects allocated using the pad when the current transaction ends.

PAD_NEW Macro
PAD_NEW() is a macro that allocates the memory for a new object in a SPUPad.


Syntax
The macro has the following syntax:
PAD_NEW(pad, type)

Description
The PAD_NEW macro allocates memory using the allocate() function, and can also be used
to invoke constructors and destructors. PAD_NEW invokes helper templates to ensure that
allocate() and the related constructors are invoked appropriately.
PAD_NEW can be used in array and non-array contexts as follows:
MyObject *pObj = PAD_NEW(pad, MyObject);
char * pStr = PAD_NEW(pad, char)[10];

The array style helps to properly support the calling of constructors when allocating an array
of objects that have a constructor.

PAD_DELETE Macro
PAD_DELETE() is a macro that deallocates or frees the memory used by a SPUPad.

Syntax
The macro has the following syntax:
PAD_DELETE(pad, ptr)

Description
The PAD_DELETE macro deallocates memory that was used for a SPUPad by calling the
deallocate() function and invoking the necessary destructors. PAD_DELETE invokes helper
templates to ensure that deallocate() and the destructors are invoked appropriately.
PAD_DELETE can be used in array and non-array contexts as follows:
PAD_DELETE(pad, pObj);
PAD_DELETE(pad, pStr);

The array style helps to properly support the calling of destructors when freeing an array of
objects that have a destructor.
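As a rough illustration of the general technique that PAD_NEW and PAD_DELETE are described as wrapping (allocate raw memory and run the constructor; run the destructor and release the memory), consider the following sketch. It is not the macros' actual implementation. ToyPad, toyNew, and toyDelete are hypothetical names, and std::malloc stands in for the pad's allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical allocator standing in for a pad's allocate()/deallocate().
struct ToyPad
{
    int liveBlocks;
    ToyPad() : liveBlocks(0) {}
    void* allocate(std::size_t sz)
    {
        void* p = std::malloc(sz);
        if (p == NULL)
        {
            throw std::bad_alloc();
        }
        ++liveBlocks;
        return p;
    }
    void deallocate(void* p)
    {
        std::free(p);
        --liveBlocks;
    }
};

// Allocate raw memory from the pad, then run T's constructor in place.
template <typename T>
T* toyNew(ToyPad& pad)
{
    void* raw = pad.allocate(sizeof(T));
    return new (raw) T();
}

// Run T's destructor, then hand the memory back to the pad.
template <typename T>
void toyDelete(ToyPad& pad, T* obj)
{
    assert(obj != NULL);
    obj->~T();
    pad.deallocate(obj);
}

struct Counter
{
    int value;
    Counter() : value(42) {}
};
```

The point of the sketch is only the ordering: construction happens after allocation, and destruction happens before deallocation, which is what plain allocate() and deallocate() calls would skip.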

isUserQuery Function
Verifies that the SPUPad is being called by a user query, not an internal routine such as
Just-in-Time Statistics which is running the function for query optimization planning.

Syntax
The function has the following syntax:
extern "C" bool isUserQuery()

Description
As a best practice, call this function and confirm that it returns true before operating on the
SPUPad. The function returns true when the UDX is being executed during a user
query, and false when the UDX is being executed as part of an internal process such as
JIT statistics. See also Best Practices for UDXs with SPUPads on page A-11.


Special Considerations for UDXs that Have SPUPads


The following sections describe additional considerations for UDX developers who incorporate SPUPads into their functions.

Best Practices for UDXs with SPUPads


When your UDX uses a SPUPad, consider these important programming practices:

If a UDX modifies the contents of a SPUPad, the UDX should use the isUserQuery()
function to guard against Just In Time (JIT) statistics impacts if the SPUPad contents
could be used across queries or across rows. The JIT statistics process runs sample
tests of user queries to identify the best performance plan for the query. For queries
which modify SPUPad contents, the JIT statistics sampling could cause unintended
modifications of the SPUPad contents. Thus, make sure that SPUPad operations occur
only when isUserQuery() returns true.
If the contents are used only by other UDFs or UDAs within the same row of the current
query (such as for caching a complicated calculation or returning multiple columns),
the UDX does not need to call isUserQuery() to guard the SPUPad.

If a UDF or UDA uses a SPUPad, but it does not modify the contents, you do not need
to guard against JIT statistics impacts.

Any UDF that uses a SPUPad and that guards against JIT statistics should not be used
in a WHERE clause of a query. The JIT statistics evaluation of the query would ignore
the UDF and the query plan might not reflect the actual cost or size of the query.

In all SPUPad cases, the UDXs should be robust enough to error gracefully when the
SPUPad is not populated.

If you have two or more UDXs that could be used in the same query, and one or more of
them uses a SPUPad, use caution to avoid symbol name overlaps with the struct and
class objects that are placed in the SPUPad. Symbol name collisions can cause SPU
resets.
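The isUserQuery() guard recommended in these practices might look like the following sketch. Because the real function is available only inside the Netezza environment, the sketch substitutes a hypothetical isUserQueryStub() driven by a settable flag; guardedIncrement is also an illustrative name.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the real isUserQuery(); settable for illustration.
static bool g_userQuery = true;
bool isUserQueryStub()
{
    return g_userQuery;
}

struct Root
{
    long myCount;
};

// Increments the counter only for user queries; returns true if it changed.
bool guardedIncrement(Root* ro)
{
    assert(ro != NULL);
    if (!isUserQueryStub())
    {
        return false; // e.g. a JIT statistics sampling run: leave the pad alone
    }
    ro->myCount += 1;
    return true;
}
```

With this guard in place, a sampling run leaves the counter untouched, so the value stored in the pad reflects only the rows that the user query processed.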

Tables and SPUPad Operations


The stringpad example uses the one_dslice table to essentially ground the SPUPad to a
single S-Blade and dataslice location. Since one_dslice has only one row and thus resides
in one dataslice, the query creates one SPUPad on the S-Blade that manages that
dataslice. If you use a larger table that is distributed across all the dataslices, a similar
query would create multiple SPUPads; on each S-Blade, there would be a unique SPUPad
for each dataslice that it manages.
Note: Starting in Release 6.0, you can use the _v_dual_dslice view to ground the SPUPad
to all the dataslices, rather than creating a user table such as multi_dslice to perform that
execution locus control.
If you use a table that has many rows with data evenly distributed across the dataslices, the
example function returns the answer multiple times. For example, assume that multi_dslice
is a simple table with nine rows which are distributed evenly over eight dataslices on
a Skimmer system (which has one S-Blade). Seven dataslices will contain one row of the
table, and one dataslice will contain two rows. If you run the same BEGIN/COMMIT transaction commands shown in Running the stringpad UDFs on page A-6, the sample output
is as follows:
MYDB(MYUSER)=> BEGIN TRANSACTION;
BEGIN
MYDB(MYUSER)=> SELECT string_pad_create(10, 'netezza') FROM multi_dslice;
 STRING_PAD_CREATE
-------------------
 t
 t
 f
 t
 t
 t
 t
 t
 t
(9 rows)

As shown in the sample output, there are eight true (t) rows and one false (f) row, because
the query created eight SPUPads, one for each dataslice where a row of the table resides.
The false response was returned by the dataslice that has two rows of the table, because
the string_pad_create function checks for the existence of a SPUPad before it creates one.
It found the SPUPad from the processing of the first table row on that dataslice, so it did
not create another SPUPad.
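The eight-true, one-false result can be reproduced with a small simulation. The sketch below is a hypothetical model, not Netezza code: it assumes each row is tagged with the dataslice that holds it and that a pad is created the first time a row on a given dataslice is processed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct CreateTally
{
    int created;        // rows that would return t
    int alreadyExisted; // rows that would return f
};

// rowSlices[i] is the (hypothetical) dataslice that holds row i.
CreateTally tallyCreates(const std::vector<int>& rowSlices, int numSlices)
{
    std::vector<bool> padExists(numSlices, false);
    CreateTally t = {0, 0};
    for (std::size_t i = 0; i < rowSlices.size(); i++)
    {
        int slice = rowSlices[i];
        assert(slice >= 0 && slice < numSlices);
        if (padExists[slice])
        {
            t.alreadyExisted++;
        }
        else
        {
            padExists[slice] = true;
            t.created++;
        }
    }
    return t;
}
```

For nine rows spread over eight dataslices, the tally is eight created pads and one row that finds an existing pad, matching the sample output.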
The next query returns the character at index value 4 of the string netezza which resides
in the SPUPads:
MYDB(MYUSER)=> SELECT string_pad_get(4) FROM multi_dslice;
 STRING_PAD_GET
----------------
 z
 z
 z
 z
 z
 z
 z
 z
 z
(9 rows)
MYDB(MYUSER)=> COMMIT;
COMMIT

This query returns 9 rows, one for each row in multi_dslice.


If you do not specify a FROM clause, as in the following example, the Netezza system
attempts to run the query in the Postgres environment on the host. SPUPads are not supported in the Postgres environment.
MYDB(MYUSER)=> SELECT string_pad_create(10, 'netezza');
ERROR: CPad not supported in postgres

If you specify an external table as the FROM clause, the Netezza system runs the query on
the host and creates the SPUPad on the host. For example, assume that one_dslice_ext is
an external table version of the one_dslice table. You could run the UDX as follows:

MYDB(MYUSER)=> SELECT string_pad_create(10, 'netezza') FROM one_dslice_ext;
 STRING_PAD_CREATE
-------------------
 t
(1 row)

In this example, the function creates the SPUPad on the host in memory, then frees the
SPUPad and its memory when the function completes.
If your SPUPad operations read information from a distributed user table, each SPUPad on
the S-Blades has access only to the data that resides on that S-Blade or that is sent to it by
the UDX. If your analysis algorithm requires that the SPUPads have some uniform data
across all S-Blades, you could use a mechanism in the UDX to write common data to the
SPUPad, or you could create a table that contains the needed rows and also has a datasliceid identification, for example:
CREATE TABLE foo_brdcst AS SELECT d.ds_id-1 AS dsid_, t.* FROM foo t,
_t_dslice d DISTRIBUTE ON (dsid_);

For extremely complex queries, where data is redistributed or sent to the host, you may not
get meaningful or predictable results. To avoid this, be very explicit with the distribution of
tables.

Calculating the Memory Use of a SPUPad


When you incorporate SPUPads into your UDX applications, make sure that you include the
memory use of the SPUPad into the MAXIMUM MEMORY setting for your UDX. This helps
to ensure correct performance scheduling for the queries that use the UDX.
Based on the number of objects and size of the objects that you store in the SPUPad, you
can obtain a rough estimate of its memory allocation. For each object, be sure to add the
space consumed by the pointers that refer to it as well; so, add an additional int for each
declared object in the SPUPad.
For SPUPads that run in S-Blade memory, make sure that you consider the number of
dataslices managed by the S-Blade; typically, an S-Blade manages 8 dataslices by default
(sometimes 6, and sometimes more if one or more S-Blades have failed within the SPA).
The system creates one SPUPad for each dataslice which contains rows that are being
queried.
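The estimate described above can be written down as a short calculation. The following sketch is only a rule-of-thumb helper with hypothetical function names: it adds one int of bookkeeping per stored object, as the text suggests, and multiplies by the number of dataslices that an S-Blade manages.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Rough per-pad estimate: each stored object plus one int of bookkeeping,
// following the rule of thumb in the text. Not an exact accounting.
std::size_t estimatePadBytes(const std::vector<std::size_t>& objectSizes)
{
    std::size_t total = 0;
    for (std::size_t i = 0; i < objectSizes.size(); i++)
    {
        total += objectSizes[i] + sizeof(int);
    }
    return total;
}

// One pad can exist per dataslice, so scale by the dataslices on the S-Blade.
std::size_t estimateBladeBytes(const std::vector<std::size_t>& objectSizes,
                               std::size_t dataslicesPerBlade)
{
    assert(dataslicesPerBlade > 0);
    return estimatePadBytes(objectSizes) * dataslicesPerBlade;
}
```

For the stringpad example, the object list would hold two entries: the size of the Root structure and the size of the character array. The result is a starting point for MAXIMUM MEMORY; the value reported by getTotalSize() on a test system should take precedence.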
You can also use the getTotalSize call in a UDX to determine the size of the SPUPad, and
then use that information to update the MAXIMUM MEMORY value of the UDX. Memory
calculation is often an iterative process; for example, as you debug your UDX in the test
harness, you could use getTotalSize to calculate the memory and return the value using
logMsg(). You could also use the call within your UDFs as they run on a development system, and as the memory allocation becomes better known, you can use the ALTER
FUNCTION or ALTER AGGREGATE commands to modify the MAXIMUM MEMORY setting
appropriately.

Automatic Memory Cleanup


The memory in use for the SPUPads is freed when the transaction or transaction block that
called the UDX ends. You do not have to explicitly deallocate or free memory for the SPUPads if you use the PAD_NEW macro to create the pad.


Transaction Restarts
When the Netezza system restarts a transaction due to a state change, S-Blade restart, or
other reasons, the SPUPad UDXs are affected. A SELECT statement can be restarted if it is
not in a multi-statement transaction and no results have been returned yet. A multi-statement transaction that is between statements can be restarted if it has not modified any
data so far.
For a single-select query, there should be no implications other than making sure that any
SPUPads that might have been created are properly freed. In the case of the multi-statement select, if any of the UDXs in the statements use the SPUPad, they should be written
to ensure that they error and exit if the SPUPad does not exist.

Best Practices for Registering UDXs that Use SPUPads


When you register UDXs that take advantage of SPUPad features, note the following best
practices:

Make sure that you register SPUPad UDXs as NOT FENCED; fenced UDXs cannot use
SPUPads.

Make sure that you register a UDF as NOT DETERMINISTIC when the UDF uses a
SPUPad.

Make sure that you add the memory requirements of the SPUPad to the memory needs
of the UDX in the MAXIMUM MEMORY argument. You need to add in the SPUPad
memory for any UDX that creates a SPUPad as well as any UDX that uses a SPUPad
created by another UDX. Starting in Release 6.0, you can specify a wider range of values for the MAXIMUM MEMORY input. The previous limit was 10MB. Use caution with
the SPUPad memory consumption, as very large SPUPads can impact the Netezza system performance as well as result in out-of-memory errors for your UDXs.

Examples
The following examples show C++ programs that use SPUPad definitions.

string_pad_create.cpp
The string_pad_create.cpp sample program takes an input string size and string and creates a SPUPad to store the string as an array of characters.
/**
* UDF: string_pad_create(int4, varchar(64000)) -> bool
*
* Creates a SPUPad called "stringpad" with size and content
* as specified in the arguments. The data from "stringpad" can
* then be accessed by string_pad_get() and string_pad_size().
*
* argument1: the initial size of the pad. Must be between 1 and 64000.
* argument2: the initial data to put in the pad. Must be between
* 0 and argument1 characters.
*
* returns true if pad created successfully. returns false if
* "stringpad" already exists when called.


*
* throws error if argument1 is null or if argument2 is null or
* the arguments are out of range.
*
* REGISTRATION:
* CREATE OR REPLACE FUNCTION string_pad_create (int4, varchar(64000))
* RETURNS bool
* LANGUAGE CPP
* PARAMETER STYLE NPSGENERIC NOT FENCED
* CALLED ON NULL INPUT
* NOT DETERMINISTIC
* EXTERNAL CLASS NAME 'StringPadCreate'
* EXTERNAL HOST OBJECT '/tmp/test/UDX_StringPadCreate.o_x86'
* EXTERNAL SPU OBJECT '/tmp/test/UDX_StringPadCreate.o_spu10';
*
* -->>Do NOT register any spu-pad related UDFs as 'deterministic'
*
* USAGE
* You need a user table that is defined on at least one SPU, such as:
* CREATE TABLE one_dslice(c1 int4);
* INSERT INTO one_dslice VALUES(1);
* SELECT string_pad_create(10, 'netezza') FROM one_dslice;
* This select creates a SPUPad on the SPU that manages the dataslice
* where one_dslice resides.
*
* To create the pad on multiple SPUs, create a table
* T with X distinct values where X>=NUM_DATASLICES. Then issue
* 'SELECT string_pad_create(...,...) FROM T;'
* expect NUM_SPUS 't' values and X-NUM_SPUS 'f' values in the
* result set.
*
* Copyright (c) 2007-2010 Netezza Corporation, an IBM Company
* All rights reserved.
*
*/
#include "udxinc.h"
#include <string.h>
using namespace nz::udx;
struct Root
{
    char* data;
    int size;
};
class StringPadCreate: public Udf
{
private:
public:
    static Udf* instantiate();
    virtual ReturnValue evaluate()
    {
        if(isArgNull(0)||isArgNull(1))
        {
            throwUdxException("cannot accept null arguments.");
        }
        if(argType(0)!=UDX_INT32)
        {
            throwUdxException("First argument must be int4.");
        }
        if(argType(1)!=UDX_VARIABLE)
        {
            throwUdxException("2nd argument must be a varchar.");
        }
        CPad* pad = getPad("stringpad");
        Root *ro = (Root*) pad->getRootObject(sizeof(Root));
        if(!ro)
        {
            ro=PAD_NEW(pad,Root);
            int32 size = int32Arg(0);
            StringArg* a = stringArg(1);
            int32 stringSize=a->length;
            if (size<1 || size > 64000)
            {
                throwUdxException("Given size is out of range.");
            }
            if(stringSize > size)
            {
                throwUdxException("Given string bigger than given size.");
            }
            ro->size=size;
            ro->data=PAD_NEW(pad,char)[size];
            for(int i=0; i<size; i++)
            {
                if(i<stringSize)
                {
                    ro->data[i]=a->data[i];
                }
                else
                {
                    ro->data[i]=' ';
                }
            }
            pad->setRootObject(ro, sizeof(Root));
            NZ_UDX_RETURN_BOOL(true);
        }
        else
        {
            NZ_UDX_RETURN_BOOL(false);
        }
    }
};
Udf* StringPadCreate::instantiate()
{
    return new StringPadCreate;
}


string_pad_get.cpp
The string_pad_get.cpp sample program takes an input string position value and returns
the character stored in stringpad at that position of a character array. The stringpad must
be created and populated by the string_pad_create function.
/**
* UDF string_pad_get(int4) -> char(1)
* Gets a character at index from the pad "stringpad".
* The pad must be created first by using string_pad_create.
*
* argument1 = index at which to return the character
*
* returns = the character at index
*
* throws error if the spu pad "stringpad" is not found or if index
* is out of range.
*
* COMPILATION:
* nzudxcompile UDX_StringPadGet.cpp -o /tmp/test/UDX_StringPadGet.o
*
* REGISTRATION:
* CREATE OR REPLACE FUNCTION string_pad_get (int4)
* RETURNS char(1)
* LANGUAGE CPP
* PARAMETER STYLE NPSGENERIC NOT FENCED
* CALLED ON NULL INPUT
* NOT DETERMINISTIC
* EXTERNAL CLASS NAME 'StringPadGet'
* EXTERNAL HOST OBJECT '/tmp/test/UDX_StringPadGet.o_x86'
* EXTERNAL SPU OBJECT '/tmp/test/UDX_StringPadGet.o_spu10';
*
* -->Do not register any spu-pad related UDFs as 'deterministic'
*
* USAGE:
* CREATE TABLE one_dslice(c1 int4);
* INSERT INTO one_dslice VALUES(1);
* SELECT string_pad_create(10, 'netezza') FROM one_dslice;
* SELECT string_pad_get(1) FROM one_dslice;
*
* Copyright (c) 2007-2010 Netezza Corporation, an IBM Company
* All rights reserved.
*/

#include "udxinc.h"
#include <string.h>
using namespace nz::udx;
struct Root
{
    char* data;
    int size;
};
class StringPadGet: public Udf
{
private:
public:
    static Udf* instantiate();
    virtual ReturnValue evaluate()
    {
        if(isArgNull(0))
        {
            throwUdxException("must not accept null arguments.");
        }
        if(argType(0)!=UDX_INT32)
        {
            throwUdxException("1st argument must be int4 (int32).");
        }
        CPad* pad = getPad("stringpad");
        Root *ro = (Root*) pad->getRootObject(sizeof(Root));
        if(!ro)
        {
            throwUdxException("Pad does not exist");
        }
        else
        {
            int index = int32Arg(0);
            if(index<0||index>=ro->size)
            {
                throwUdxException("Index out of bounds");
            }
            StringReturn *ret = stringReturnInfo();
            ret->size=1;
            ret->data[0]=ro->data[index];
            NZ_UDX_RETURN_STRING(ret);
        }
    }
};
Udf* StringPadGet::instantiate()
{
    return new StringPadGet;
}

string_pad_size.cpp
The string_pad_size.cpp program defines a UDF that returns the size of the sample stringpad created by the string_pad_create function.
/**
* UDF string_pad_size() -> int4
*
* Returns the size of the pad. -1 if no pad is found.
*
* COMPILATION:
* nzudxcompile UDX_StringPadSize.cpp -o UDX_StringPadSize.o
*
* REGISTRATION:
* CREATE OR REPLACE FUNCTION string_pad_size()
* RETURNS int4 LANGUAGE CPP PARAMETER STYLE NPSGENERIC NOT FENCED


* CALLED ON NULL INPUT NOT DETERMINISTIC


* EXTERNAL CLASS NAME 'StringPadSize'
* EXTERNAL HOST OBJECT '/tmp/test/UDX_StringPadSize.o_x86'
* EXTERNAL SPU OBJECT '/tmp/test/UDX_StringPadSize.o_spu10';
*
* -->>Do not register any spu-pad related UDFs as 'deterministic'
*
* USAGE:
*
* CREATE TABLE one_dslice(c1 int4);
* INSERT INTO one_dslice VALUES(1);
* SELECT string_pad_create(10, 'netezza') FROM one_dslice;
* SELECT string_pad_size() FROM one_dslice;
*
* Copyright (c) 2007-2010 Netezza Corporation, an IBM Company
* All rights reserved.
*/
#include "udxinc.h"
#include <string.h>
using namespace nz::udx;
struct Root
{
    char* data;
    int size;
};
class StringPadSize: public Udf
{
private:
public:
    static Udf* instantiate();
    virtual ReturnValue evaluate()
    {
        CPad* pad = getPad("stringpad");
        Root *ro = (Root*) pad->getRootObject(sizeof(Root));
        if(!ro)
        {
            NZ_UDX_RETURN_INT32(-1);
        }
        else
        {
            NZ_UDX_RETURN_INT32(ro->size);
        }
    }
};
Udf* StringPadSize::instantiate()
{
    return new StringPadSize;
}


padcounter.cpp
The padcounter sample program contains two UDFs, padcounter() and getpadcount(),
which use a SPUPad to obtain a row count of a table.
#include "udxinc.h"
/*
 * These functions obtain a row count of a table using a simple SPUPad.
 *
 * To compile and register the functions:
 *
 * nzudxcompile padcounter.cpp --sig "padcounter()" --ret int4
 *     --class PadCounter
 * then alter function padcounter() to make it NOT DETERMINISTIC
 *
 * nzudxcompile padcounter.cpp --sig "getpadcount()" --ret int8
 *     --class GetPadCount
 * then alter function getpadcount() to make it NOT DETERMINISTIC
 *
 * You need a table with 1 row per SPU; for example: one_per;
 *
 * This will return the row count in table <TBL>:
 * select sum(getpadcount()) from one_per where exists
 *     (select count(padcounter()) from <TBL>);
 */
using namespace nz::udx;
struct Root
{
    int64 myCount;
};
class PadCounter: public Udf
{
private:
public:
    static Udf* instantiate();
    virtual ReturnValue evaluate()
    {
        CPad* pad = getPad("PadCount");
        Root *ro = (Root*) pad->getRootObject(sizeof(Root));
        setReturnNull(false);
        if(!ro)
        {
            ro=PAD_NEW(pad,Root);
            ro->myCount = 1;
            pad->setRootObject(ro, sizeof(Root));
            NZ_UDX_RETURN_INT32(1);
        }
        else
        {
            ro->myCount += 1;
            NZ_UDX_RETURN_INT32(1);
        }
    }
};
Udf* PadCounter::instantiate()
{
    return new PadCounter;
}
class GetPadCount: public Udf
{
private:
public:
    static Udf* instantiate();
    virtual ReturnValue evaluate()
    {
        CPad* pad = getPad("PadCount");
        Root *ro = (Root*) pad->getRootObject(sizeof(Root));
        setReturnNull(false);
        if(!ro)
        {
            NZ_UDX_RETURN_INT64(0);
        }
        else
        {
            NZ_UDX_RETURN_INT64(ro->myCount);
        }
    }
};
Udf* GetPadCount::instantiate()
{
    return new GetPadCount;
}
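To see why summing getpadcount() over one row per pad yields the table's row count, the following standalone sketch models the per-pad counters. ToyPadCounters is a hypothetical class used only for this illustration; the real counters live in the myCount field of each pad's Root object.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: one counter per dataslice, standing in for the
// myCount field kept in each dataslice's pad.
class ToyPadCounters
{
public:
    explicit ToyPadCounters(int numSlices) : counts_(numSlices, 0) {}

    // Models padcounter(): bump the counter on the row's dataslice.
    void countRow(int slice)
    {
        assert(slice >= 0 && slice < static_cast<int>(counts_.size()));
        counts_[slice] += 1;
    }

    // Models sum(getpadcount()) over a table with one row per pad.
    long total() const
    {
        long sum = 0;
        for (std::size_t i = 0; i < counts_.size(); i++)
        {
            sum += counts_[i];
        }
        return sum;
    }
private:
    std::vector<long> counts_;
};
```

Each row contributes to exactly one counter, so the sum across counters equals the number of rows regardless of how the rows are distributed.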


APPENDIX

Netezza SQL Reference


This appendix provides reference information for the Netezza SQL commands that relate to
the creation and management of user-defined functions and aggregates. Table B-1 lists the
UDX-related commands.
Table B-1: UDX Netezza SQL Commands

Command                        Description                                                         More Information
ALTER AGGREGATE                Changes a UDA.                                                      See ALTER AGGREGATE on page B-2.
ALTER FUNCTION                 Changes a UDF.                                                      See ALTER FUNCTION on page B-6.
ALTER LIBRARY                  Changes a shared library.                                           See ALTER LIBRARY on page B-11.
CREATE [OR REPLACE] AGGREGATE  Adds or updates a UDA.                                              See CREATE [OR REPLACE] AGGREGATE on page B-13.
CREATE [OR REPLACE] FUNCTION   Adds or updates a UDF.                                              See CREATE [OR REPLACE] FUNCTION on page B-17.
CREATE LIBRARY                 Adds a shared library.                                              See CREATE [OR REPLACE] LIBRARY on page B-23.
DROP AGGREGATE                 Drops or deletes a UDA.                                             See DROP AGGREGATE on page B-25.
DROP FUNCTION                  Drops or deletes a UDF.                                             See DROP FUNCTION on page B-27.
DROP LIBRARY                   Drops or deletes a shared library.                                  See DROP LIBRARY on page B-28.
SHOW AGGREGATE                 Displays information about aggregates (built-ins as well as UDAs).  See SHOW AGGREGATE on page B-30.
SHOW FUNCTION                  Displays information about functions (built-ins as well as UDFs).   See SHOW FUNCTION on page B-32.
SHOW LIBRARY                   Displays information about shared libraries.                        See SHOW LIBRARY on page B-33.


If you issue one of the alter, create, or drop commands for a UDX or shared library that is
currently in use by an active query, the Netezza system waits for that query's transaction to
complete before it executes the command.
This guide also discusses other Netezza SQL commands such as GRANT, REVOKE, and
COMMENT. For a description of these commands, see the IBM Netezza Database User's
Guide.

ALTER AGGREGATE
Use the ALTER AGGREGATE command to change the aggregate object files, state, return
value, memory usage options, or logging level. The aggregate must be defined in the current database. You can also use this command to change the owner of the UDA.
You cannot change the aggregate name or argument type list using this command. To
change an aggregate's name and/or argument type list, you must drop the aggregate and
create an aggregate with the new name and/or argument type list.

Synopsis
The ALTER AGGREGATE command has the following syntax:
ALTER AGGREGATE aggregate_name(argument_types)
[RETURNS return_type] [STATE (state_types)]
[FENCED | NOT FENCED] [MAXIMUM MEMORY mem]
[LOGMASK mask] [TYPE ANY | ANALYTIC | GROUPED]
[NO DEPENDENCIES| DEPENDENCIES deplibs]
[API VERSION [ 1 | 2 ] ]
[NO ENVIRONMENT | ENVIRONMENT 'name' = 'value' , 'name2' = 'value2' ]
[EXTERNAL CLASS NAME 'class_name']
[EXTERNAL HOST OBJECT 'host_object_filename']
[EXTERNAL SPU OBJECT 'SPU_object_filename']
ALTER AGGREGATE aggregate_name(argument_types) OWNER TO name

Inputs
The ALTER AGGREGATE command takes the following inputs:
Table B-2: ALTER AGGREGATE Input

Input: aggregate_name
Description: The name of the aggregate that you want to change. You cannot change the aggregate name using this command.

Input: argument_types
Description: A list of fully specified arguments and types that uniquely identifies the aggregate. You can also specify the VARARGS value for a variable-argument aggregate, where users can input up to 64 values of any supported data type. VARARGS is a mutually exclusive value; you cannot specify any other arguments in the list.
You cannot change the argument list or sizes using this command. You can, however, remove VARARGS from the argument list, or add it to an otherwise empty argument list.
All Netezza data types are supported. Strings must include either a size or ANY for generic sizes. NUMERIC types must include precision and scale or ANY for generic sizes.

Input: RETURNS return_type
Description: Specifies the aggregate's return value as one fully specified argument and type. All Netezza data types are supported. Strings must include a size and NUMERIC types must include precision and scale.

Input: STATE state_types
Description: Specifies a list of fully specified state data types, which cannot be empty. All Netezza data types are supported. Strings must include a size and NUMERIC types must include precision and scale.
These data items serve as the aggregator's running accumulators. The Netezza system maintains this aggregation state outside of the implementation class's internal state for efficiency reasons.

Input: FENCED, NOT FENCED
Description: Specifies whether the aggregate is executed in a separate process in protected address space (fenced mode). To create an unfenced aggregate, you must have the Unfence admin privilege.

Input: MAXIMUM MEMORY
Description: Specifies an indication of the potential memory use of the aggregate. The size value can be an empty value or a number followed by one of the letters b (bytes), k (kilobytes), m (megabytes), or g (gigabytes). For example, valid values could be '0', '1k', '100k', '1g', or '10m'. The default is 0.

Input: LOGMASK mask
Description: Specifies the logging control level for the aggregate. Valid values are NONE, DEBUG, and TRACE, or a comma-separated combination of DEBUG and TRACE.

Input: TYPE
Description: The context in which the UDA can be called. Specify ANALYTIC if the UDA is allowed only for window aggregates, GROUPED if the UDA is allowed in grouped or grand aggregates, or ANY if the UDA is allowed in both contexts. For more information about windowing, refer to the IBM Netezza Database User's Guide.

Input: DEPENDENCIES deplibs
Description: Specifies an optional list of user-defined shared library dependencies for the UDX. You can specify one library name or a comma-separated list of library names.

Input: NO DEPENDENCIES
Description: Specifies that there are no dependencies for the UDX, which is the default if DEPENDENCIES deplibs is omitted. You can use this option to clear any previous dependencies declared for the UDX.

Input: API VERSION [1 | 2]
Description: Specifies the version of the UDX interface used by the aggregate. The API VERSION must match the compiled version of the object files for the host and SPU. The default is 1. If you include version 2 compiled objects, you must specify API VERSION 2.

Input: ENVIRONMENT, NO ENVIRONMENT
Description: Specifies a name/value pair that is available to the aggregate when executing. You can specify several comma-separated name/value pairs.
To alter an existing set of one or more environment pairs, you must specify all the environment settings; the ALTER command replaces the current list with the list that you specify. To clear the environment list, specify NO ENVIRONMENT.

Input: EXTERNAL CLASS NAME 'class_name'
Description: Specifies the name of the C++ class that implements the aggregate. The class must derive from the Uda base class and must implement a static method that instantiates an instance of the class.

Input: EXTERNAL HOST OBJECT 'host_object_filename'
Description: Specifies the pathname to the implementation's object file as compiled for host execution.

Input: EXTERNAL SPU OBJECT 'SPU_object_filename'
Description: Specifies the pathname to the compiled object file for the Linux SPUs. Specify the spu10 compiled object for Rev10 SPUs on IBM Netezza 1000 and Netezza 100 models.

Outputs
The ALTER AGGREGATE command has the following output:
Table B-3: ALTER AGGREGATE Output

Output: ALTER AGGREGATE
Description: The message that the system returns if the command is successful.

Output: Error: AlterAggregate: existing UDX name(argument_types) differs in size of string/numeric arguments
Description: This error indicates that a UDX exists with the name but has different sizes specified for string or numeric arguments. To alter the aggregate, make sure that you specify the exact argument type list with correct sizes.

Output: ERROR: lookupLibrary: library libname does not exist
Description: The message that the system returns if it cannot find the user-defined shared library specified as a dependency.

Output: ERROR: Version mismatch for function udx_name. Specified version 2, but provided version 1 object file
Description: The compiled object files use API version 1 support, but the SQL command uses version 2 functionality. You must either create version 2 compiled objects, or remove options in the ALTER command that specify version 2 features.

Output: ERROR: Version mismatch for function udx_name. Specified version 1, but provided version 2 object file
Description: The compiled object files use API version 2 support, but the SQL command uses version 1 functionality. You must either specify version 1 compiled objects, or change the ALTER command to specify version 2 syntax.

Output: ERROR: Environment names can't be empty
Description: The name value of an environment setting cannot be an empty string.

Output: ERROR: type 'type' is not yet defined
Description: The specified return type is not a known Netezza data type.

Description
You cannot alter a user-defined aggregate that is currently in use in an active query. After
the active query's transaction completes, the Netezza system will process the ALTER
AGGREGATE command to update the aggregate.

Privileges Required
To alter a UDA, you must meet one of the following criteria:

You must have the Alter privilege on the AGGREGATE object.

You must have the Alter privilege on the specific UDA object.

You must own the UDA.

You must be the database admin user.

To alter an aggregate to be unfenced, you must have the Unfence admin privilege.

Note: When you issue an ALTER AGGREGATE command and specify new object files, the
database processes the HOST OBJECT and the SPU OBJECT files as the user nz. The user
nz must have read access to the object files and read and execute access to every directory
in the path from the root to the object file.

Common Tasks
You can use the ALTER AGGREGATE command to change the owner of an aggregate. Make
sure that you specify the full signature of the aggregate (name and argument type list) as
follows:
ALTER AGGREGATE aggregate_name(argument_types) OWNER TO name
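As an illustration only, the ownership change might look like the following for the sample aggregate mycalc(int4) that appears in the Usage section. The user name jdoe is an assumption made for the example.

```sql
-- Hypothetical names: mycalc(int4) is the sample aggregate, jdoe is an assumed user.
-- The full signature (name plus argument type list) identifies the aggregate.
ALTER AGGREGATE mycalc(int4) OWNER TO jdoe;
```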

Related Commands
See CREATE [OR REPLACE] AGGREGATE on page B-13 to create aggregates.

See DROP AGGREGATE on page B-25 to drop aggregates.


See SHOW AGGREGATE on page B-30 to display information about aggregates.

Usage
The following provides sample usage.

To change the message logging to DEBUG level on a sample aggregate mycalc(int4),


enter:
DEV(MYUSER)=> ALTER AGGREGATE mycalc(int4) LOGMASK DEBUG;
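The following sketches apply other options from the synopsis to the same sample aggregate. The environment names and values are placeholders, not settings that the aggregate is known to read, and the options follow the order shown in the synopsis.

```sql
-- Replace the whole environment list. Any pair that is not repeated here
-- is dropped, because ALTER replaces the current list with the new one.
ALTER AGGREGATE mycalc(int4) ENVIRONMENT 'MODE' = 'strict', 'LIMIT' = '10';

-- Clear any declared shared library dependencies and the environment list.
ALTER AGGREGATE mycalc(int4) NO DEPENDENCIES NO ENVIRONMENT;

-- Raise the memory-use indication to 10 megabytes and turn logging off.
ALTER AGGREGATE mycalc(int4) MAXIMUM MEMORY '10m' LOGMASK NONE;
```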

ALTER FUNCTION
Use the ALTER FUNCTION command to change the function object files, return value,
memory usage options, or logging level. You can also use this command to change the
owner of the UDF.
You cannot change the function name or argument type list using this command. To change
a function's name and/or argument type list, you must drop the function and then create a
function with the new name and/or argument type list.

Synopsis
The ALTER FUNCTION command has the following syntax:
ALTER FUNCTION function_name(argument_types)
[RETURNS return_type] [FENCED | NOT FENCED]
[DETERMINISTIC | NOT DETERMINISTIC]
[RETURNS NULL ON NULL INPUT | CALLED ON NULL INPUT]
[MAXIMUM MEMORY mem]
[LOGMASK mask] [NO DEPENDENCIES| DEPENDENCIES deplibs]
[API VERSION [ 1 | 2 ] ]
[NO ENVIRONMENT | ENVIRONMENT 'name' = 'value' , 'name2' = 'value2']
[TABLE, TABLE FINAL ALLOWED | TABLE ALLOWED | TABLE FINAL ALLOWED]
[PARALLEL ALLOWED | PARALLEL NOT ALLOWED]
[EXTERNAL CLASS NAME 'class_name']
[EXTERNAL HOST OBJECT 'host_object_filename']
[EXTERNAL SPU OBJECT 'SPU_object_filename']
ALTER FUNCTION function_name(argument_types) OWNER TO name

Inputs
The ALTER FUNCTION command takes the following inputs:
Table B-4: ALTER FUNCTION Input

Input: function_name
Description: Specifies the name of the function that you want to change. You cannot change the name of the function.

Input: argument_types
Description: A list of fully specified arguments and types that uniquely identifies the function. You can also specify the VARARGS value for a variable-argument function, where users can input up to 64 values of any supported data type. VARARGS is a mutually exclusive value; you cannot specify any other arguments in the list.
You cannot change the argument list or sizes using this command. You can, however, remove VARARGS from the argument list, or add it to an otherwise empty argument list.
All Netezza data types are supported. Strings must include either a size or ANY for generic sizes. NUMERIC types must include precision and scale or ANY for generic sizes.

Input: RETURNS return_type
Description: Specifies the function's return value as one fully specified argument and type. All Netezza data types are supported. Strings must include either a size or ANY for generic sizes. NUMERIC types must include precision and scale or ANY for generic sizes.

Input: FENCED, NOT FENCED
Description: Specifies whether the function is executed in a separate process in protected address space (fenced mode). To create an unfenced function, you must have the Unfence admin privilege.

Input: [DETERMINISTIC | NOT DETERMINISTIC]
Description: DETERMINISTIC indicates that the UDF is a pure function, one which always returns the same value given the same argument values and which has no side effects. The system may consider multiple instances of a deterministic UDF that have identical argument lists to be candidates for common subexpression elimination (CSE).
If a function is DETERMINISTIC, it will be called once at statement preparation time instead of once per row if either of the following is true:
- It RETURNS NULL ON NULL INPUT and one or more of its arguments are NULL (the literal NULL).
- It has all constant arguments.
An argument is constant if it is a SQL literal, or the result of a UDF or built-in that has been evaluated once at statement preparation time instead of once per row. For more information on query optimization impacts for this setting, see Netezza Query Optimization and UDX Calls on page 6-12.

Input: [RETURNS NULL ON NULL INPUT | CALLED ON NULL INPUT]
Description: RETURNS NULL ON NULL INPUT indicates that the function always returns NULL whenever any of its arguments are NULL. If you specify this parameter, the function will not be executed when there are NULL arguments; instead a NULL result is assumed automatically.
CALLED ON NULL INPUT (the default) indicates that the function will be called normally when some of its arguments are NULL. It is then the function creator's responsibility to check for NULL values if necessary and respond appropriately. For more information on query optimization impacts for this setting, see Netezza Query Optimization and UDX Calls on page 6-12.

Input: MAXIMUM MEMORY
Description: Specifies an indication of the potential memory use of the function. The size value can be an empty value or a number followed by one of the letters b (bytes), k (kilobytes), m (megabytes), or g (gigabytes). For example, valid values could be '0', '1k', '100k', '1g', or '10m'. The default is 0.

Input: LOGMASK mask
Description: Specifies the logging control level for the function. Valid values are NONE, DEBUG, and TRACE, or a comma-separated combination of DEBUG and TRACE.

Input: DEPENDENCIES deplibs
Description: Specifies an optional list of user-defined shared library dependencies for the UDX. You can specify one library name or a comma-separated list of library names.

Input: NO DEPENDENCIES
Description: Specifies that there are no dependencies for the UDX, which is the default if DEPENDENCIES deplibs is omitted. You can use this option to clear any previous dependencies declared for the UDX.

Input: API VERSION [1 | 2]
Description: Specifies the version of the UDX interface used by the function. The API VERSION must match the compiled version of the object files for the host and SPU. The default is 1. If you include version 2 compiled objects, you must specify API VERSION 2.

Input: ENVIRONMENT, NO ENVIRONMENT
Description: Specifies a name/value pair that is available to the function when executing. You can specify several comma-separated name/value pairs.
To alter an existing set of one or more environment pairs, you must specify all the environment settings; the ALTER command replaces the current list with the list that you specify. To clear the environment list, specify NO ENVIRONMENT.

Input: TABLE, TABLE FINAL ALLOWED; TABLE ALLOWED; TABLE FINAL ALLOWED
Description: Specifies the options that control how the user-defined table function can be invoked.
The TABLE, TABLE FINAL ALLOWED option specifies that you can invoke the table function using either TABLE(func()) or TABLE WITH FINAL(func()).
You can also specify either TABLE ALLOWED or TABLE FINAL ALLOWED to allow the user-defined table function to be invoked using only one of these forms.
FINAL means that the table function will be invoked after all of the input rows are processed, which allows it to output more rows.

Input: PARALLEL ALLOWED, PARALLEL NOT ALLOWED
Description: PARALLEL ALLOWED specifies that a user-defined table function can be invoked on either the host or the SPU, at the discretion of the optimizer.
PARALLEL NOT ALLOWED specifies that the table function will always be invoked on the host or one selected SPU at the discretion of the optimizer.

Input: EXTERNAL CLASS NAME 'class_name'
Description: Specifies the name of the C++ class that implements the function. The class must derive from the Udf base class and must implement a static method that instantiates an instance of the class.

Input: EXTERNAL HOST OBJECT 'host_object_filename'
Description: Specifies the pathname to the implementation's object file as compiled for host execution.

Input: EXTERNAL SPU OBJECT 'SPU_object_filename'
Description: Specifies the pathname to the compiled object file for the Linux SPUs. Specify the spu10 compiled object for Rev10 SPUs on IBM Netezza 1000 and Netezza 100 models.

Outputs
The ALTER FUNCTION command has the following output:
Table B-5: ALTER FUNCTION Output

Output: ALTER FUNCTION
Description: The message returned if the command is successful.

Output: Error: AlterFunction: existing UDX name(argument_types) differs in size of string/numeric arguments
Description: This error indicates that a UDX exists with the name but has different sizes specified for string or numeric arguments. To alter the function, make sure that you specify the exact argument type list with correct sizes.

Output: ERROR: lookupLibrary: library libname does not exist
Description: The message that the system returns if it cannot find the user-defined shared library specified as a dependency.

Output: ERROR: Version mismatch for function udx_name. Specified version 2, but provided version 1 object file
Description: The compiled object files use API version 1 support, but the SQL command uses version 2 functionality. You must either create version 2 compiled objects, or remove options in the ALTER command that specify version 2 features.

Output: ERROR: Version mismatch for function udx_name. Specified version 1, but provided version 2 object file
Description: The compiled object files use API version 2 support, but the SQL command uses version 1 functionality. You must either specify version 1 compiled objects, or change the ALTER command to specify version 2 syntax.

Output: ERROR: Environment names can't be empty
Description: The name value of an environment setting cannot be an empty string.

Output: ERROR: type 'type' is not yet defined
Description: The specified return type is not a known Netezza data type.

Description
You cannot alter a user-defined function that is currently in use in an active query. After the
active query's transaction completes, the Netezza system will process the ALTER FUNCTION command to update the function.

Privileges Required
To alter a UDF, you must meet one of the following criteria:

You must have the Alter privilege on the FUNCTION object.

You must have the Alter privilege on the specific UDF object.

You must own the UDF.

You must be the database admin user.

To alter a function to be unfenced, you must have the Unfence admin privilege.

Note: When you issue an ALTER FUNCTION command and specify new object files, the
database processes the HOST OBJECT and the SPU OBJECT files as the user nz. The user
nz must have read access to the object files and read and execute access to every directory
in the path from the root to the object file.

Common Tasks
You can use the ALTER FUNCTION command to change the owner of a function. Make sure
that you specify the full signature of the function (name and argument type list) as follows:
ALTER FUNCTION function_name(argument_types) OWNER TO name

Related Commands
See CREATE [OR REPLACE] FUNCTION on page B-17 to create functions.
See DROP FUNCTION on page B-27 to drop functions.

See SHOW FUNCTION on page B-32 to display information about functions.

Usage
The following provides sample usage.

To alter a sample function named myfunc(char(12)) to set the MAXIMUM MEMORY


option to 100k, enter:
MYDB(MYUSER)=> ALTER FUNCTION myfunc(char(12)) MAXIMUM MEMORY
'100k';
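The sketches below show a few other ALTER FUNCTION options from the synopsis applied to the same sample function. The user name jdoe and the library name mylib are assumptions made for the example, and the options follow the order shown in the synopsis.

```sql
-- Declare the sample function as a pure function and skip execution
-- whenever an argument is NULL (a NULL result is assumed instead).
ALTER FUNCTION myfunc(char(12)) DETERMINISTIC RETURNS NULL ON NULL INPUT;

-- Declare a dependency on a hypothetical user-defined shared library.
ALTER FUNCTION myfunc(char(12)) DEPENDENCIES mylib;

-- Change the owner; jdoe is an assumed user name.
ALTER FUNCTION myfunc(char(12)) OWNER TO jdoe;

-- With DETERMINISTIC set, a call whose arguments are all SQL literals is a
-- candidate for evaluation once at statement preparation time.
SELECT myfunc('abcdefghijkl');
```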

ALTER LIBRARY
Use the ALTER LIBRARY command to change a user-defined shared library. You can use
this command to change properties such as the loading method, dependencies, and
object files. You can also use this command to change the owner of the library.
You cannot change the library name using this command. To change a library's name, you
must drop the library and create a library with the new name.

Synopsis
The ALTER LIBRARY command has the following syntax:
ALTER LIBRARY library_name
[ AUTOMATIC LOAD | MANUAL LOAD ]
[NO DEPENDENCIES| DEPENDENCIES deplibs]
[ EXTERNAL HOST OBJECT 'host_object_filename' ]
[ EXTERNAL SPU OBJECT 'SPU_object_filename' ]
ALTER LIBRARY library_name OWNER TO name

Inputs
The ALTER LIBRARY command takes the following inputs:
Table B-6: ALTER LIBRARY Input

Input: library_name
Description: The name of the library that you want to change. You must be connected to the database where the library is defined. You cannot change the name using this command.

Input: [AUTOMATIC LOAD | MANUAL LOAD]
Description: Automatic load specifies that the Netezza system will automatically open the library before any objects that depend upon it are used. Manual load specifies that the UDX is responsible for opening and closing manual load libraries when they are needed.

Input: DEPENDENCIES deplibs
Description: Specifies an optional list of user-defined shared library dependencies for the UDX. You can specify one library name or a comma-separated list of library names.

Input: NO DEPENDENCIES
Description: Specifies that there are no dependencies for the UDX, which is the default if DEPENDENCIES deplibs is omitted. You can use this option to clear any previous dependencies declared for the UDX.

Input: EXTERNAL HOST OBJECT 'host_object_filename'
Description: Specifies the pathname to the shared library's compiled host object file.

Input: EXTERNAL SPU OBJECT 'SPU_object_filename'
Description: Specifies the pathname to the shared library's compiled object file for the Linux SPU environment. Specify the spu10 compiled object for Rev10 SPUs on IBM Netezza 1000 and Netezza 100 models.

Outputs
The ALTER LIBRARY command has the following output:
Table B-7: ALTER LIBRARY Output

Output: ALTER LIBRARY
Description: The message returned if the command is successful.

Output: ERROR: Unable to calculate cksum for file filename
Description: The message returned if the object file pathname cannot be resolved to a file.

Output: ERROR: lookupLibrary: library libname does not exist
Description: The message that the system returns if the specified shared library does not exist in the current database.

Description
You cannot alter a user-defined library that is currently in use in an active query. After the
active query's transaction completes, the Netezza system will process the ALTER LIBRARY
command to update the library.

Privileges Required
To alter a library, you must meet one of the following criteria:

You must have the Alter privilege on the LIBRARY object.

You must have the Alter privilege on the specific library object.

You must own the library.

You must be the database admin user.

Note: When you issue an ALTER LIBRARY command and specify new object files, the database processes the HOST OBJECT and the SPU OBJECT files as the user nz. The user nz
must have read access to the object files and read and execute access to every directory in
the path from the root to the object file.

Common Tasks
You can use the ALTER LIBRARY command to change the owner of a library. For example:
ALTER LIBRARY library_name OWNER TO name

Related Commands
See CREATE [OR REPLACE] LIBRARY on page B-23 to create or replace shared
libraries.
See DROP LIBRARY on page B-28 to drop shared libraries.
See SHOW LIBRARY on page B-33 to display information about shared libraries.

Usage
The following provides sample usage.

To alter a sample library named mylib to set the load option to MANUAL LOAD, enter:
MYDB(MYUSER)=> ALTER LIBRARY mylib MANUAL LOAD;
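A few further sketches for the same sample library follow. The dependency names, file paths, and user name are assumptions made for illustration; use the names and compiled files from your own environment.

```sql
-- Have the system open the library automatically before dependent objects run.
ALTER LIBRARY mylib AUTOMATIC LOAD;

-- Declare dependencies on two hypothetical libraries, then clear them again.
ALTER LIBRARY mylib DEPENDENCIES baselib1, baselib2;
ALTER LIBRARY mylib NO DEPENDENCIES;

-- Point the library at newly compiled files (hypothetical paths). The user nz
-- must be able to read these files and traverse every directory in the path.
ALTER LIBRARY mylib
    EXTERNAL HOST OBJECT '/home/nz/libs/host/mylib.so'
    EXTERNAL SPU OBJECT '/home/nz/libs/spu10/mylib.so';

-- Change the owner; jdoe is an assumed user name.
ALTER LIBRARY mylib OWNER TO jdoe;
```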

CREATE [OR REPLACE] AGGREGATE


Use the CREATE AGGREGATE command to create a new user-defined aggregate. Use CREATE OR REPLACE AGGREGATE to create a new aggregate or to update an existing
aggregate with new object files, state, return value, memory usage, or logging level.

Synopsis
Syntax for creating a new user-defined aggregate:
CREATE [OR REPLACE] AGGREGATE aggregate_name(argument_types)
RETURNS return_type STATE (state_types)
LANGUAGE CPP PARAMETER STYLE NPSGENERIC [FENCED | NOT FENCED]
[MAXIMUM MEMORY mem ] [LOGMASK mask]
[NO DEPENDENCIES| DEPENDENCIES deplibs]
[ TYPE ANY | ANALYTIC | GROUPED] [API VERSION [1 | 2]]
[ENVIRONMENT 'name'='value', 'name'='value']
[EXTERNAL CLASS NAME 'class_name']
[EXTERNAL HOST OBJECT 'host_object_filename']
[EXTERNAL SPU OBJECT 'SPU_object_filename']
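
A minimal sketch of a complete statement that follows this synopsis is shown below. The aggregate name penmax, the class name CPenMax, and the object file paths are assumptions made for illustration; substitute the class name and the compiled files from your own build.

```sql
-- Hypothetical aggregate: one int4 argument, an int4 result, and two int4
-- state values that act as running accumulators.
CREATE OR REPLACE AGGREGATE penmax(int4)
    RETURNS int4
    STATE (int4, int4)
    LANGUAGE CPP PARAMETER STYLE NPSGENERIC
    FENCED
    MAXIMUM MEMORY '1k'
    LOGMASK NONE
    TYPE ANY
    EXTERNAL CLASS NAME 'CPenMax'
    EXTERNAL HOST OBJECT '/home/nz/udx_files/penmax.o_x86'
    EXTERNAL SPU OBJECT '/home/nz/udx_files/penmax.o_spu10';
```

The optional clauses are listed in the same order as in the synopsis; omit any that you want to leave at their defaults.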

Inputs
The CREATE [OR REPLACE] AGGREGATE command takes the following inputs:
Table B-8: CREATE [OR REPLACE] AGGREGATE Input

Input: aggregate_name
Description: Specifies the name of the aggregate that you want to create. This is the SQL identifier that will be used to invoke the aggregate in a SQL expression.
If the aggregate already exists, you cannot change the name using the CREATE OR REPLACE command.

Input: argument_types
Description: Specifies a list of fully specified aggregate argument data types. All Netezza data types are supported. Strings must include either a size or ANY for generic sizes. NUMERIC types must include precision and scale or ANY for generic sizes.
You can also specify the VARARGS value to create a variable-argument aggregate, where users can input up to 64 values of any supported data type. VARARGS is a mutually exclusive value; you cannot specify any other arguments in the list.
If the aggregate already exists, you cannot change the argument type list using the CREATE OR REPLACE command. You can, however, change some aspects of a UDA's argument types; for example, you can change the size of a string or the precision and scale of a numeric value. You can remove VARARGS from the argument list, or add it to an otherwise empty argument list.

Input: RETURNS return_type
Description: Specifies the aggregate's return value as one fully specified argument and type. All Netezza data types are supported. Strings must include a size and NUMERIC types must include precision and scale.

Input: STATE state_types
Description: Specifies a list of fully specified state data types, which cannot be empty. All Netezza data types are supported. Strings must include a size and NUMERIC types must include precision and scale.
These data items serve as the aggregator's running accumulators. The Netezza system maintains this aggregation state outside of the implementation class's internal state for efficiency reasons.

Input: LANGUAGE
Description: Specifies the programming language used for the aggregate. The default and only supported value at this time is CPP (C++).

Input: PARAMETER STYLE
Description: Specifies the parameter style for the aggregate. The default and only valid value is NPSGENERIC.

Input: FENCED, NOT FENCED
Description: Specifies whether the aggregate is executed in a separate process in protected address space (fenced mode). To create an unfenced aggregate, you must have the Unfence admin privilege.

Input: MAXIMUM MEMORY
Description: Specifies an indication of the potential memory use of the aggregate. The size value can be an empty value or a number followed by one of the letters b (bytes), k (kilobytes), m (megabytes), or g (gigabytes). For example, valid values could be '0', '1k', '100k', '1g', or '10m'. The default is 0.

Input: LOGMASK mask
Description: Specifies the logging control level for the aggregate. Valid values are NONE, DEBUG, and TRACE, or a comma-separated combination of DEBUG and TRACE.

Input: DEPENDENCIES deplibs
Description: Specifies an optional list of user-defined shared library dependencies for the UDX. You can specify one library name or a comma-separated list of library names.

Input: NO DEPENDENCIES
Description: Specifies that there are no dependencies for the UDX, which is the default if DEPENDENCIES deplibs is omitted. You can use this option to clear any previous dependencies declared for the UDX.

Input: TYPE
Description: The context in which the UDA can be called. Specify ANALYTIC if the UDA is allowed only for window aggregates, GROUPED if the UDA is allowed in grouped or grand aggregates, or ANY if the UDA is allowed in both contexts. For more information about windowing, refer to the IBM Netezza Database User's Guide.

Input: API VERSION [1 | 2]
Description: Specifies the version of the UDX interface used by the aggregate. The API VERSION must match the compiled version of the object files for the host and SPU. The default is 1. If you include version 2 compiled objects, you must specify API VERSION 2.

Input: ENVIRONMENT
Description: Specifies a name/value pair that is available to the aggregate when executing. You can specify several comma-separated name/value pairs.
To replace an existing set of one or more environment pairs, you must specify all the environment settings; the command replaces the current list with the list specified in the CREATE OR REPLACE command.

Input: EXTERNAL CLASS NAME 'class_name'
Description: Specifies the name of the C++ class that implements the aggregate. The class must derive from the Uda base class and must implement a static method that instantiates an instance of the class.

Input: EXTERNAL HOST OBJECT 'host_object_filename'
Description: Specifies the pathname to the implementation's object file as compiled for host execution.

Input: EXTERNAL SPU OBJECT 'SPU_object_filename'
Description: Specifies the pathname to the compiled object file for the Linux SPUs. Specify the spu10 compiled object for Rev10 SPUs on IBM Netezza 1000 and Netezza 100 models.

Outputs
The CREATE [OR REPLACE] AGGREGATE command has the following output:
Table B-9: CREATE [OR REPLACE] AGGREGATE Output

Output: CREATE AGGREGATE
Description: The message that the system returns if the command is successful.

Output: ERROR: User 'username' is not allowed to create/drop aggregates.
Description: The system returns this message if your user account does not have Create Aggregate permission.

Output: ERROR: Synonym 'name' already exists
Description: The system returns this message if a synonym already exists with the name that you specified for the aggregate.

Output: ERROR: AggregateCreate: aggregate name already exists with the same arguments
Description: This error is returned when you issue a CREATE AGGREGATE command and an aggregate with the same name and argument type list already exists in the database. Use CREATE OR REPLACE AGGREGATE instead.

Output: NOTICE: AggregateCreate: existing UDX name(argument_types) differs in size of string/numeric arguments
Description: This message indicates that a UDX already exists with the name but has different sizes specified for string or numeric arguments. If you did not intend to change the aggregate signature, you should check the signature and ensure that it is correct.

Output: ERROR: lookupLibrary: library libname does not exist
Description: The message that the system returns if it cannot find the user-defined shared library specified as a dependency.

Output: ERROR: Version mismatch for function udx_name. Specified version 2, but provided version 1 object file
Description: The compiled object files use API version 1 support, but the SQL command uses version 2 functionality. You must either create version 2 compiled objects, or remove options in the command that specify version 2 features.

Output: ERROR: Version mismatch for function udx_name. Specified version 1, but provided version 2 object file
Description: The compiled object files use API version 2 support, but the SQL command uses version 1 functionality. You must either specify version 1 compiled objects, or change the command to specify version 2 syntax.

Output: ERROR: Environment names can't be empty
Description: The name value of an environment setting cannot be an empty string.

Output: ERROR: type 'type' is not yet defined
Description: The specified return type is not a known Netezza data type.

Description
When you create an aggregate, note that the aggregate's signature (that is, its name and
argument type list) must be unique within its database. No other UDX can have the same
name and argument type list in the same database.
You cannot change the aggregate name or the argument type list using the CREATE OR
REPLACE command. You can change some aspects of the argument types; for example,
you can change the size of a string or the precision and scale of a numeric value. To change
an aggregate's name and/or argument type list, you must drop the aggregate and then create an aggregate with the new name and/or argument type list.

You cannot replace a user-defined aggregate that is currently in use in an active query.
After the active query's transaction completes, the Netezza system will process the CREATE
OR REPLACE AGGREGATE command to update the aggregate.
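
The drop-and-create sequence for a changed argument type list might look like the following
sketch. It reuses the sample PENMAX names from this guide and assumes, purely for illustration,
that the CPenMax class has been reworked and recompiled to accept INT8 input; the INT8 types are
not part of the original sample.

```sql
-- Illustrative only: assumes CPenMax was rewritten and recompiled for INT8 input.
-- The argument type list changes from (INT4) to (INT8), so CREATE OR REPLACE
-- cannot be used; drop the old signature first.
DROP AGGREGATE PENMAX(INT4);

-- Register the aggregate again under the new signature.
CREATE AGGREGATE PENMAX(INT8) RETURNS INT8
STATE (INT8, INT8) LANGUAGE CPP PARAMETER STYLE NPSGENERIC
EXTERNAL CLASS NAME 'CPenMax'
EXTERNAL HOST OBJECT '/home/nz/udx_files/penmax.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/penmax.o_spu10';
```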

Privileges Required
You must have Create Aggregate permission to use the CREATE AGGREGATE command.
Also, if you use CREATE OR REPLACE AGGREGATE to change a UDA, you must have Create Aggregate permission and Alter permission for the UDA to change it. To create an
unfenced aggregate, you must have the Unfence admin privilege.
Note: When you issue a CREATE AGGREGATE command, the database processes the HOST
OBJECT and the SPU OBJECT files as the user nz. The user nz must have read access to
the object files and read and execute access to every directory in the path from the root to
the object file.

Common Tasks
Use the CREATE AGGREGATE command to create and become the owner of a new user-defined
aggregate. You must create the aggregate's C++ files and compile them using
nzudxcompile before you can use this command to register the aggregate with the Netezza
system.
Netezza has some special processing to deal with string fields used in aggregates. If your
aggregate returns a string type that is larger than 512 bytes, the state must contain a string
type that is larger than 255 bytes, or multiple string types whose combined lengths are
greater than 255 bytes. Otherwise, the command returns an error similar to the following:
ERROR: Records trailing string space set to 512 is too small: Bump
it up using the environment variable NZ_SPRINGFIELD_SIZE
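
As a minimal sketch of a definition that satisfies this rule, the following statement returns a
1000-byte string and keeps a 1000-byte string in the state. The names LONGCONCAT and
CLongConcat and the object file paths are hypothetical and do not appear elsewhere in this guide.

```sql
-- Hypothetical aggregate: LONGCONCAT, CLongConcat, and the paths are illustrative names.
-- The return type is larger than 512 bytes, so the state carries a string type
-- that is larger than 255 bytes.
CREATE AGGREGATE LONGCONCAT(VARCHAR(1000)) RETURNS VARCHAR(1000)
STATE (VARCHAR(1000)) LANGUAGE CPP PARAMETER STYLE NPSGENERIC
EXTERNAL CLASS NAME 'CLongConcat'
EXTERNAL HOST OBJECT '/home/nz/udx_files/longconcat.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/longconcat.o_spu10';
```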

Related Commands
See ALTER AGGREGATE on page B-2 to alter an aggregate.
See DROP AGGREGATE on page B-25 to remove a user-defined aggregate.
See SHOW AGGREGATE on page B-30 to display information about aggregates.

Usage
The following provides sample usage.

To create the sample PenMax aggregate (described in Chapter 4):


MYDB(MYUSER)=> CREATE AGGREGATE PENMAX(INT4) RETURNS INT4
STATE (INT4, INT4) LANGUAGE CPP PARAMETER STYLE NPSGENERIC
EXTERNAL CLASS NAME 'CPenMax'
EXTERNAL HOST OBJECT '/home/nz/udx_files/penmax.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/penmax.o_spu10';

CREATE [OR REPLACE] FUNCTION


Use the CREATE FUNCTION command to create a new user-defined function. The CREATE
OR REPLACE FUNCTION command creates a new function or replaces an existing function of the
same name with new object files, return value, function behaviors, or logging level.

Synopsis
Syntax for creating a new user-defined function:
CREATE [OR REPLACE] FUNCTION function_name(argument_types)
RETURNS return_type LANGUAGE CPP PARAMETER STYLE NPSGENERIC
[FENCED | NOT FENCED] [DETERMINISTIC | NOT DETERMINISTIC]
[RETURNS NULL ON NULL INPUT | CALLED ON NULL INPUT]
[MAXIMUM MEMORY mem] [LOGMASK mask]
[NO DEPENDENCIES | DEPENDENCIES deplibs]
[API VERSION [ 1 | 2 ]]
[ENVIRONMENT 'name' = 'value', 'name2' = 'value2']
[TABLE, TABLE FINAL ALLOWED | TABLE ALLOWED | TABLE FINAL ALLOWED]
[PARALLEL ALLOWED | PARALLEL NOT ALLOWED]
[EXTERNAL CLASS NAME 'class_name']
[EXTERNAL HOST OBJECT 'host_object_filename']
[EXTERNAL SPU OBJECT 'SPU_object_filename']

Inputs
The CREATE [OR REPLACE] FUNCTION command takes the following inputs:
Table B-10: CREATE [OR REPLACE] FUNCTION Input

Input

Description

function_name

Specifies the name of the function that you want to create. This is
the SQL identifier that will be used to invoke the function in a SQL
expression. The name must meet the naming criteria for keywords
and identifiers, which are described in the IBM Netezza Database
User's Guide.
If the function already exists, you cannot change the name using
the CREATE OR REPLACE command.

argument_types

Specifies a list of fully specified function argument data types. All
Netezza data types are supported. Strings must include either a
size or ANY for generic sizes. NUMERIC types must include precision and scale or ANY for
generic sizes.
You can also specify the VARARGS value to create a variable
argument function where users can input up to 64 values of any
supported data type. VARARGS is a mutually exclusive value; you
cannot specify any other arguments in the list.
You can remove VARARGS from the argument list, or add it to an otherwise empty
argument list.
If the function already exists, you cannot otherwise change the argument type
list using the CREATE OR REPLACE command. You can, however, use
CREATE OR REPLACE to alter some aspects of a UDF's argument
types; for example, you can change the size of a string or the precision and scale of a
numeric value.

RETURNS return_type

Specifies the function's return value as one fully specified data type. All Netezza data types
are supported. Strings must include either a size or ANY for generic sizes. NUMERIC types
must include precision and scale or ANY for generic sizes.

LANGUAGE

Specifies the programming language used for the function. The
default and only supported value at this time is CPP (C++).

PARAMETER STYLE

The default and only supported value at this time is NPSGENERIC.

[FENCED | NOT FENCED]

Specifies whether the function is executed in a separate process in
protected address space (fenced mode). To create an unfenced
function, you must have the Unfence admin privilege.

[DETERMINISTIC | NOT DETERMINISTIC]

DETERMINISTIC indicates that the UDF is a pure function, one
which always returns the same value given the same argument values and which has no side
effects. The system may consider multiple instances of a deterministic UDF having identical
argument lists to be candidates for common subexpression elimination (CSE). The default is
DETERMINISTIC.
If a function is DETERMINISTIC, it will be called once at statement preparation time instead
of once per row if either of the following is true:
It RETURNS NULL ON NULL INPUT and one or more of its arguments are NULL (the literal NULL).
It has all constant arguments.
An argument is constant if it is a SQL literal, or the result of a UDF
or built-in that has been evaluated once at statement preparation
time instead of once per row.

[RETURNS NULL ON NULL INPUT | CALLED ON NULL INPUT]

RETURNS NULL ON NULL INPUT indicates that the function
always returns NULL whenever any of its arguments are NULL. If
you specify this parameter, the function will not be executed when
there are NULL arguments; instead a NULL result is assumed
automatically.
CALLED ON NULL INPUT (the default) indicates that the function
will be called normally when some of its arguments are NULL. It is
then the function creator's responsibility to check for NULL values
if necessary and respond appropriately. For more information on
query optimization impacts for this setting, see Netezza Query
Optimization and UDX Calls on page 6-12.

MAXIMUM MEMORY mem

Specifies an indication of the potential memory use of the function. The mem value can be an
empty value or a number followed by the letter b (bytes), k (kilobytes), m (megabytes), or
g (gigabytes). For example, valid values could be '0', '1k', '100k',
'1g', or '10m'. The default is 0.

LOGMASK mask

Specifies the logging control level for the function. Valid values are
NONE, DEBUG, and TRACE, or a comma-separated combination of
DEBUG and TRACE.

DEPENDENCIES deplibs

Specifies an optional list of user-defined shared library dependencies for the UDX. You can
specify one or a comma-separated list of library names.

API VERSION [1 | 2]

Specifies the version of the UDX interface used by the function.
The API VERSION must match the compiled version of the object
files for the host and SPU. The default is 1. If you include version
2 compiled objects, you must specify API VERSION 2.

ENVIRONMENT

Specifies a name/value pair that is available to the function when
executing. You can specify several comma-separated name/value
pairs.
To change an existing set of one or more environment pairs, you must
specify all the environment settings; the command replaces
the current list with the list that you specify.

[TABLE, TABLE FINAL ALLOWED | TABLE ALLOWED | TABLE FINAL ALLOWED]

Specifies the options that control how the user-defined table function can be invoked.
The TABLE, TABLE FINAL ALLOWED option specifies that you
can invoke the table function using either TABLE(func()) or
TABLE WITH FINAL(func()).
You can also specify either TABLE ALLOWED or TABLE FINAL
ALLOWED to allow the user-defined table function to be invoked
using only one of these forms.
FINAL means that the table function will be invoked after all of the
input rows are processed, thus allowing for it to output more rows.

[PARALLEL ALLOWED | PARALLEL NOT ALLOWED]

PARALLEL ALLOWED specifies that a user-defined table function
can be invoked on either the host or the SPU, at the discretion of
the optimizer.
PARALLEL NOT ALLOWED specifies that the table function will
always be invoked on the host or one selected SPU at the discretion of the optimizer.

EXTERNAL CLASS NAME 'class_name'

Specifies the name of the C++ class that implements the function.
The class must derive from the Udf base class and must implement
a static method that instantiates an instance of the class.

EXTERNAL HOST OBJECT 'host_object_filename'

Specifies the pathname to the implementation's compiled object
file, as compiled for host execution.

EXTERNAL SPU OBJECT 'SPU_object_filename'

Specifies the pathname to the compiled object file for the Linux SPU environment.
Specify the spu10 compiled object for Rev10 SPUs on IBM
Netezza 1000 and Netezza 100 models.
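
The following sketch combines several of the optional inputs in the order shown in the synopsis.
It reuses the sample CustomerName function from this guide; the option values, the library name
mylib, and the environment name MODE are assumptions chosen for illustration, and the statement
presumes that the object files were compiled as version 2 objects.

```sql
-- Illustrative values only: mylib, 'MODE', and the option settings are assumptions.
-- API VERSION 2 must match how the object files were compiled.
CREATE OR REPLACE FUNCTION CustomerName(VARCHAR(64000))
RETURNS INT4 LANGUAGE CPP PARAMETER STYLE NPSGENERIC
FENCED NOT DETERMINISTIC
RETURNS NULL ON NULL INPUT
MAXIMUM MEMORY '1m' LOGMASK DEBUG
DEPENDENCIES mylib
API VERSION 2
ENVIRONMENT 'MODE' = 'test'
EXTERNAL CLASS NAME 'CCustomerName'
EXTERNAL HOST OBJECT '/home/nz/udx_files/customername.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/customername.o_spu10';
```

Because RETURNS NULL ON NULL INPUT is specified, the function body is not executed for rows
where the argument is NULL.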

Outputs
The CREATE [OR REPLACE] FUNCTION command has the following output:
Table B-11: CREATE [OR REPLACE] FUNCTION Output
Output

Description

CREATE FUNCTION

The message that the system returns if the command is successful.

ERROR: User 'username' is not allowed to create/drop functions.

The system returns this message if your user account does not have Create Function
permission.

ERROR: Synonym 'name' already exists

The system returns this message if a synonym already exists with the name that you specified
for the function.

ERROR: function name already exists with the same signature

This error is returned when you issue a CREATE FUNCTION command and a function with the
same name and argument type list already exists in the database. Use
CREATE OR REPLACE FUNCTION instead.

ERROR: function name already exists with the same signature

The system returns this message if a function already exists with the name that you
specified for the function.

NOTICE: FunctionCreate: existing UDX name(argument_types) differs in size of string/numeric
arguments

This message indicates that a UDX already exists with the name but has different sizes
specified for string or numeric arguments. If you did not intend to change the function
signature, you should check the signature and ensure that it is correct.

ERROR: lookupLibrary: library libname does not exist

The message that the system returns if it cannot find the user-defined shared library
specified as a dependency.

ERROR: ProcedureCreate: Can't use version 2 features without specifying API VERSION 2 for
udx_name

The message indicates that you specified version 2 options for the SQL command, but you did
not also specify API VERSION 2 in the SQL command.

ERROR: Version mismatch for function udx_name. Specified version 2, but provided version 1
object file

The compiled object files use API version 1 support, but the SQL command uses version 2
functionality. You must either create version 2 compiled objects, or remove options in the
CREATE command that specify version 2 features.

ERROR: Version mismatch for function udx_name. Specified version 1, but provided version 2
object file

The compiled object files use API version 2 support, but the SQL command uses version 1
functionality. You must either specify version 1 compiled objects, or change the CREATE
command to specify version 2 syntax.

ERROR: Environment names can't be empty

The name value of an environment setting cannot be an empty string.

ERROR: type 'type' is not yet defined

The specified return type is not a known Netezza data type.

Description
When you create a function, note that the function's signature (that is, its name and
argument type list) must be unique within its database. No other user-defined function or
aggregate can have the same name and argument type list in the same database.
You cannot change the function's name or the argument type list using the CREATE OR
REPLACE command. You can change some aspects of the argument types; for example,
you can change the size of a string or the precision and scale of a numeric value. To change
a function's name and/or argument type list, you must drop the function and then create a
function with the new name and/or argument type list.
You cannot replace a user-defined function that is currently in use in an active query. After
the active query's transaction completes, the Netezza system will process the CREATE OR
REPLACE FUNCTION command to update the function.
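
As a sketch of the resizing case, the following statement assumes that the sample CustomerName
function already exists with a VARCHAR(64000) argument and replaces it with a smaller string
size. The 32000 size is an arbitrary value chosen for illustration.

```sql
-- Assumes CustomerName(VARCHAR(64000)) already exists; 32000 is an arbitrary new size.
-- Only the string size differs, so CREATE OR REPLACE is permitted. The system reports
-- the size difference with a NOTICE so that you can confirm the change was intended.
CREATE OR REPLACE FUNCTION CustomerName(VARCHAR(32000))
RETURNS INT4 LANGUAGE CPP PARAMETER STYLE NPSGENERIC
EXTERNAL CLASS NAME 'CCustomerName'
EXTERNAL HOST OBJECT '/home/nz/udx_files/customername.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/customername.o_spu10';
```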

Privileges Required
You must have Create Function permission to use the CREATE FUNCTION command. Also,
if you use CREATE OR REPLACE FUNCTION to change a UDF, you must have Create Function and Alter permission for the UDF to change it. To create an unfenced function, you
must have the Unfence admin privilege.
Note: When you issue a CREATE FUNCTION command, the database processes the HOST
OBJECT and the SPU OBJECT files as the user nz. The user nz must have read access to
the object files and read and execute access to every directory in the path from the root to
the object file.

Common Tasks
Use the CREATE FUNCTION command to create and become the owner of a new user-defined
function. You must create the function's C++ files and compile them using nzudxcompile
before you can use this command to register the function with the Netezza system.
The function is defined as an object in the current database.

Related Commands
See ALTER FUNCTION on page B-6 to change a UDF.
See DROP FUNCTION on page B-27 to drop a UDF.
See SHOW FUNCTION on page B-32 to display information about functions.

Usage
The following provides sample usage.

To create the sample function CustomerName (described in Chapter 2):


MYDB(MYUSER)=> CREATE FUNCTION CustomerName(varchar(64000))
RETURNS int4 LANGUAGE CPP PARAMETER STYLE npsgeneric
EXTERNAL CLASS NAME 'CCustomerName'
EXTERNAL HOST OBJECT '/home/nz/udx_files/customername.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/customername.o_spu10';

CREATE [OR REPLACE] LIBRARY


Use the CREATE [OR REPLACE] LIBRARY command to create a user-defined shared
library. After the library is added to the database, it is immediately available for use by a
UDX.
You cannot change the library name using this command. To change a library's name, you
must drop the library and create a library with the new name.

Synopsis
The CREATE [OR REPLACE] LIBRARY command has the following syntax:
CREATE [OR REPLACE] LIBRARY library_name
[ AUTOMATIC LOAD | MANUAL LOAD ]
[ NO DEPENDENCIES | DEPENDENCIES deplibs ]
[ EXTERNAL HOST OBJECT 'host_object_filename' ]
[ EXTERNAL SPU OBJECT 'SPU_object_filename' ]

Inputs
The CREATE [OR REPLACE] LIBRARY command takes the following inputs:
Table B-12: CREATE [OR REPLACE] LIBRARY Input
Input

Description

library_name

The name of the library that you want to create or replace. The
name must be unique within the current database. You must be
connected to the database where the library is defined. You cannot
change the name using the CREATE OR REPLACE LIBRARY
command.

[AUTOMATIC LOAD | MANUAL LOAD]

AUTOMATIC LOAD specifies that the Netezza system will automatically open the library before
any objects that depend upon it are used. MANUAL LOAD specifies that the UDX is responsible
for opening and closing manual load libraries when they are needed.

NO DEPENDENCIES

Specifies that there are no dependencies for the UDX, which is the
default if DEPENDENCIES deplibs is omitted. You can use this
option to clear any previous dependencies declared for the UDX.

DEPENDENCIES deplibs

Specifies an optional list of user-defined shared library dependencies for the UDX. You can
specify one or a comma-separated list of library names.

EXTERNAL HOST OBJECT 'host_object_filename'

Specifies the pathname to the shared library's compiled host object file.

EXTERNAL SPU OBJECT 'SPU_object_filename'

Specifies the pathname to the shared library's compiled object file
for the Linux SPU environment. Specify the spu10 compiled object
for Rev10 SPUs on IBM Netezza 1000 and Netezza 100 models.

Outputs
The CREATE [OR REPLACE] LIBRARY command has the following output:
Table B-13: CREATE [OR REPLACE] LIBRARY Output
Output

Description

CREATE LIBRARY

The message returned if the command is successful.

ERROR: Object with name 'libname' already exists

The message returned if you use the CREATE LIBRARY command for a library name that already
exists. Use CREATE OR REPLACE or specify a unique library name.

ERROR: lookupLibrary: library libname does not exist

The message that the system returns if the specified shared library does not exist in the
current database.

ERROR: Unable to calculate cksum for file filename

The message returned if the object file pathname cannot be resolved to a file.

Description
The user-defined shared library is created in the current database. You cannot replace a
user-defined library that is currently in use in an active query. After the active query's
transaction completes, the Netezza system will process the CREATE OR REPLACE LIBRARY
command to replace the library.

Privileges Required
You must have Create Library permission to use the CREATE LIBRARY command. Also, if
you use CREATE OR REPLACE LIBRARY to change a library, you must have Create Library
and Alter permission for the library to change it.
Note: When you issue a CREATE LIBRARY command, the database processes the HOST
OBJECT and the SPU OBJECT files as the user nz. The user nz must have read access to
the object files and read and execute access to every directory in the path from the root to
the object file.

Common Tasks
You can use the CREATE [OR REPLACE] LIBRARY command to create and become the
owner of a new shared library. You must create the library and any of its dependencies and
compile them using nzudxcompile before you can use this command to register the shared
library with the Netezza system. The library is defined as an object in the current database.

Related Commands
See ALTER LIBRARY on page B-11 to alter shared libraries.
See DROP LIBRARY on page B-28 to drop shared libraries.
See SHOW LIBRARY on page B-33 to display information about shared libraries.

Usage
The following provides sample usage.

To create a new sample library named mylib, enter:


MYDB(MYUSER)=> CREATE LIBRARY mylib AUTOMATIC LOAD
EXTERNAL HOST OBJECT '/home/nz/libs/mylib.o_x86'
EXTERNAL SPU OBJECT '/home/nz/libs/mylib.o_spu10';
CREATE LIBRARY
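
A library can itself declare dependencies on other libraries. The following sketch assumes that
mylib from the sample above already exists and registers a second, manually loaded library that
depends on it. The name mylib2 and its object file paths are hypothetical.

```sql
-- Hypothetical library: mylib2 and its object file paths are illustrative names.
-- mylib must already exist in the current database, or the command fails with
-- the lookupLibrary error listed in the Outputs table.
CREATE LIBRARY mylib2 MANUAL LOAD
DEPENDENCIES mylib
EXTERNAL HOST OBJECT '/home/nz/libs/mylib2.o_x86'
EXTERNAL SPU OBJECT '/home/nz/libs/mylib2.o_spu10';
```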

DROP AGGREGATE
Use the DROP AGGREGATE command to remove an existing user-defined aggregate from a
database. When you drop an aggregate, the aggregate's object files will also be removed
from the user code object repository.

Synopsis
Syntax for dropping a user-defined aggregate:
DROP AGGREGATE aggregate_name(argument_types)

Inputs
The DROP AGGREGATE command takes the following inputs:
Table B-14: DROP AGGREGATE Input
Input

Description

aggregate_name

Specifies the name of an existing user-defined aggregate that you want to drop.

argument_types

Specifies a list of fully-specified argument data types to uniquely
identify the aggregate. All Netezza data types are supported.
Strings must include either a size or ANY for generic sizes.
NUMERIC types must include precision and scale or ANY for
generic sizes. You could also specify the VARARGS value to drop a
variable argument aggregate.

Outputs
The DROP AGGREGATE command has the following output:
Table B-15: DROP AGGREGATE Output

Output

Description

DROP AGGREGATE

The message that the system returns if the command is successful.

ERROR: Name: No such aggregate

The message that the system returns if the specified aggregate does not exist in the current
database.

Error: DropAggregate: existing UDX name(argument_types) differs in size of string/numeric
arguments

This error indicates that a UDX exists with the name but has different sizes specified for
string or numeric arguments. To drop the aggregate, make sure that you specify the exact
argument type list with correct sizes.

ERROR: Can't delete aggregate name - view viewName depends on it

The message that the system returns if a UDA is referenced in a view. You cannot drop the
UDA until the dependency from the view is resolved.

Description
You cannot drop a user-defined aggregate that is currently in use in an active query. After
the active query's transaction completes, the Netezza system will process the DROP
AGGREGATE command to drop the aggregate. The aggregate must be defined in the current
database.
You cannot drop a UDA that is referenced by an existing view. Review the section
Dependency Checks before Dropping UDXs on page 6-13 for more information about resolving
dependencies to UDAs that you want to drop.

Privileges Required
To drop a UDA, you must meet one of the following criteria:

You must have the Drop privilege on the AGGREGATE object.

You must have the Drop privilege on the specific UDA object.

You must own the UDA.

You must be the database admin user.

Common Tasks
Use the DROP AGGREGATE command to drop an existing aggregate from a database.

Related Commands
See CREATE [OR REPLACE] AGGREGATE on page B-13 for information on how to create
aggregates.
See ALTER AGGREGATE on page B-2 to alter an aggregate.
See SHOW AGGREGATE on page B-30 to display information about aggregates.

Usage
The following is sample usage.

To drop a sample aggregate named mycalc, enter:


MYDB(MYUSER)=> DROP AGGREGATE mycalc(int4);
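
For aggregates with string or numeric arguments, the sizes in the DROP command must match the
registered signature. The following sketch assumes a hypothetical aggregate that was created as
MYCONCAT(VARCHAR(200)); the name and sizes are illustrative only.

```sql
-- Hypothetical aggregate registered as MYCONCAT(VARCHAR(200)).
-- A different string size does not identify the aggregate; the system reports
-- that the existing UDX differs in the size of its string/numeric arguments.
DROP AGGREGATE MYCONCAT(VARCHAR(100));

-- The exact argument type list, including the size, identifies the aggregate.
DROP AGGREGATE MYCONCAT(VARCHAR(200));
```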

DROP FUNCTION
Use the DROP FUNCTION command to remove an existing user-defined function from a
database. When you drop a function, the function's object files will also be removed from
the user code object repository.

Synopsis
Syntax for dropping a user-defined function:
DROP FUNCTION function_name(argument_types)

Inputs
The DROP FUNCTION command takes the following inputs:
Table B-16: DROP FUNCTION Input
Input

Description

function_name

Specifies the name of an existing user-defined function.

argument_types

Specifies a list of fully-specified argument data types to
uniquely identify the function. All Netezza data types are
supported. Strings must include either a size or ANY for
generic sizes. NUMERIC types must include precision
and scale or ANY for generic sizes. You could also specify the VARARGS value to drop a
variable argument function.

Outputs
The DROP FUNCTION command has the following output:
Table B-17: DROP FUNCTION Output

Output

Description

DROP FUNCTION

The message that the system returns if the command is successful.

ERROR: Name: No such function

The message that the system returns if the specified function does not exist in the current
database.

Error: DropFunction: existing UDX name(argument_types) differs in size of string/numeric
arguments

This error indicates that a UDX exists with the name but has different sizes specified for
string or numeric arguments. To drop the function, make sure that you specify the exact
argument type list with correct sizes.

ERROR: Can't delete function name - object depends on it

The message that the system returns if a UDF is referenced in a table or a view. You cannot
drop the UDF until the dependency is resolved.

Description
You cannot drop a user-defined function that is currently in use in an active query. After the
active query's transaction completes, the Netezza system will process the DROP FUNCTION
command to drop the function. The function must be defined in the current database.
You cannot drop a UDF that is referenced by an existing table or view. Review the section
Dependency Checks before Dropping UDXs on page 6-13 for more information about
resolving dependencies to UDFs that you want to drop.

Privileges Required
To drop a UDF, you must meet one of the following criteria:

You must have the Drop privilege on the FUNCTION object.

You must have the Drop privilege on the specific UDF object.

You must own the UDF.

You must be the database admin user.

Common Tasks
Use the DROP FUNCTION command to drop an existing function from a database.

Related Commands
See CREATE [OR REPLACE] FUNCTION on page B-17 for information on how to create
functions.
See ALTER FUNCTION on page B-6 to alter a function.
See SHOW FUNCTION on page B-32 to display information about functions.

Usage
The following is sample usage.

To drop a sample function myfunc(char(12)), enter:


MYDB(MYUSER)=> DROP FUNCTION myfunc(char(12));
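
A variable argument function is dropped by giving VARARGS in place of the argument type list.
The following sketch first registers a hypothetical function and then drops it; the names
myvarfunc and CMyVarFunc and the object file paths are assumptions for illustration.

```sql
-- Hypothetical names: myvarfunc, CMyVarFunc, and the object file paths.
CREATE FUNCTION myvarfunc(VARARGS) RETURNS INT4
LANGUAGE CPP PARAMETER STYLE NPSGENERIC
EXTERNAL CLASS NAME 'CMyVarFunc'
EXTERNAL HOST OBJECT '/home/nz/udx_files/myvarfunc.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/myvarfunc.o_spu10';

-- VARARGS takes the place of the argument type list when dropping the function.
DROP FUNCTION myvarfunc(VARARGS);
```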

DROP LIBRARY
Use the DROP LIBRARY command to remove an existing user-defined shared library from a
database. When you drop a shared library, the shared library's object files will also be
removed from the user code object repository.

Synopsis
Syntax for dropping a user-defined shared library:
DROP LIBRARY library_name

Inputs
The DROP LIBRARY command takes the following inputs:
Table B-18: DROP LIBRARY Input
Input

Description

library_name

Specifies the name of an existing user-defined shared library.

Outputs
The DROP LIBRARY command has the following output:
Table B-19: DROP LIBRARY Output
Output

Description

DROP LIBRARY

The message that the system returns if the command is successful.

ERROR: RemoveLibrary: library libname does not exist

The message that the system returns if the specified shared library does not exist in the
current database.

ERROR: Can't delete library mylib - name depends on it

The message that the system returns if you try to drop a user-defined shared library that is
referenced by an existing UDX. The name value could be the name of
another library or the signature of a UDF or UDA.

Description
You cannot drop a user-defined shared library that is currently in use in an active query.
After the active query's transaction completes, the Netezza system will process the DROP
LIBRARY command to drop the shared library. The shared library must be defined in the
current database.

Privileges Required
To drop a shared library, you must meet one of the following criteria:

You must have the Drop privilege on the LIBRARY object.

You must have the Drop privilege on the specific shared library object.

You must own the shared library.

You must be the database admin user.

Common Tasks
Use the DROP LIBRARY command to drop an existing shared library from a database.

Related Commands
See ALTER LIBRARY on page B-11 to alter shared libraries.
See CREATE [OR REPLACE] LIBRARY on page B-23 to create or replace shared libraries.
See SHOW LIBRARY on page B-33 to display information about shared libraries.

Usage
The following is sample usage.

To drop a sample library mylib, enter:


MYDB(MYUSER)=> DROP LIBRARY mylib;
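
When a UDX still references the library, the dependent object has to be removed (or its
dependency cleared) before the library can be dropped. The following sketch assumes a
hypothetical function myfunc(INT4) that was created with DEPENDENCIES mylib.

```sql
-- Hypothetical: myfunc(INT4) was created with DEPENDENCIES mylib.
-- Dropping mylib first would fail because myfunc(INT4) depends on it.
DROP FUNCTION myfunc(INT4);

-- With no remaining dependents, the library can be dropped.
DROP LIBRARY mylib;
```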

SHOW AGGREGATE
Use the SHOW AGGREGATE command to display information about one or more aggregates
(built-in as well as UDAs). The command checks your user account privileges to ensure that
you are permitted to see information about the UDAs defined in the database.

Synopsis
Syntax:
SHOW AGGREGATE [ALL | ident] [VERBOSE]

Inputs
The SHOW AGGREGATE command takes the following inputs:
Table B-20: SHOW AGGREGATE Input
Input

Description

ALL

Show information about all the aggregates defined in the database.


This is the default.

ident

Show information about one or more aggregates defined in the database whose names begin with
ident. You can specify a partial name, but the
command returns an error if you specify a full signature.

VERBOSE

Display detailed information about the aggregates.

Outputs
The SHOW AGGREGATE command has the following output:
Table B-21: SHOW AGGREGATE Output

Output

Description

error found "(" (at char num) syntax error, unexpected '(', expecting $end

The message that the system returns if you specify a full signature, for example:
SHOW AGGREGATE penmax(integer);

Description
The SHOW AGGREGATE command is identical in behavior to the nzsql \da and \da+
commands.
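
As a brief sketch of the ident and VERBOSE inputs, the following statements use the partial name
pen, which is chosen only because it matches the sample PENMAX aggregate. The pairing of each
statement with a particular nzsql shortcut in the comments is an assumption based on the
equivalence described above.

```sql
-- Partial name: lists aggregates whose names begin with "pen".
SHOW AGGREGATE pen;

-- All aggregates with the detailed columns; presumably the counterpart of nzsql \da+.
SHOW AGGREGATE ALL VERBOSE;
```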

Privileges Required
Any user can run the SHOW AGGREGATE command; however, you must be the admin user,
own the UDA, or have object privileges on UDAs (such as Execute, List, Alter, or Drop) to
see information about UDAs in the output.

Common Tasks
Use the SHOW AGGREGATE command to display information about the aggregates in a
database.

Related Commands
See ALTER AGGREGATE on page B-2 to alter UDAs.
See CREATE [OR REPLACE] AGGREGATE on page B-13 to create UDAs.
See DROP AGGREGATE on page B-25 to drop a UDA.

Usage
The following provides sample usage.

To show the sample UDA named PenMax, use the following command:
DEV(MYUSER)=> show aggregate penmax;
 NAME   | BUILTIN | ARGUMENTS | RETURNTYPE | DESCRIPTION
--------+---------+-----------+------------+-------------
 PENMAX | f       | (INTEGER) | INT4       |
(1 row)

To show verbose information for the PenMax UDA, use the following command:

DEV(MYUSER)=> SHOW AGGREGATE penmax VERBOSE;
 NAME   | BUILTIN | ARGUMENTS | RETURNTYPE | VARARGS | FENCED | VERSION | DESCRIPTION | STATE            | LOGMASK | MEMORY | AGGTYPE | DEPENDENCIES | ENV
--------+---------+-----------+------------+---------+--------+---------+-------------+------------------+---------+--------+---------+--------------+-----
 PENMAX | f       | (INTEGER) | INT4       | f       | t      |       1 |             | INTEGER, INTEGER | NONE    | 0      | ANY     |              |
(1 row)

To list all the aggregates in a database, use the following command. (The output is
abbreviated for presentation in the document.)

DEV(MYUSER)=> SHOW AGGREGATE ALL;
 NAME       | BUILTIN | ARGUMENTS                        | RETURNTYPE | DESCRIPTION
------------+---------+----------------------------------+------------+-------------
 COUNT      | t       | ()                               | INT8       |
 DENSE_RANK | t       | ()                               | INT8       |
 RANK       | t       | ()                               | INT8       |
 PENMAX     | f       | (INTEGER)                        | INT4       |
 UDA_SUM    | f       | (INTEGER)                        | INT8       |

SHOW FUNCTION
Use the SHOW FUNCTION command to display information about one or more functions
(built-in as well as UDFs). The command checks your user account privileges to ensure that
you are permitted to see information about the UDFs defined in the database.

Synopsis
Syntax:
SHOW FUNCTION [ALL | ident] [VERBOSE]

Inputs
The SHOW FUNCTION command takes the following inputs:
Table B-22: SHOW FUNCTION Input
Input   | Description
--------+-------------------------------------------------------------------------
ALL     | Show information about all the functions defined in the database. This is the default.
ident   | Show information about one or more functions defined in the database whose names begin with ident. You can specify a partial name, but the command returns an error if you specify a full signature.
VERBOSE | Display detailed information about the functions.

Outputs
The SHOW FUNCTION command has the following output:
Table B-23: SHOW FUNCTION Output
Output | Description
-------+------------
error found "(" (at char num) syntax error, unexpected '(', expecting $end | The message that the system returns if you specify a full signature, for example: show FUNCTION returntwo();

Description
The SHOW FUNCTION command is identical in behavior to the nzsql \df and \df+
commands.

Privileges Required
Any user can run the command SHOW FUNCTION; however, you must be the admin user,
own the UDF, or have object privileges on UDFs (such as Execute, List, Alter, or Drop) to
see information about UDFs in the output.

Common Tasks
Use the SHOW FUNCTION command to display information about the functions in a
database.

Related Commands
See ALTER FUNCTION on page B-6 to alter UDFs.
See CREATE [OR REPLACE] FUNCTION on page B-17 to create UDFs.
See DROP FUNCTION on page B-27 to drop a UDF.

Usage
The following provides sample usage.

To show all the functions, use the following command. (The output is abbreviated for
presentation in the document.)
MYDB(MYUSER)=> SHOW FUNCTION;
                         List of functions
      RESULT      |   FUNCTION   | BUILTIN |         ARGUMENTS
------------------+--------------+---------+----------------------------
 BIGINT           | ABS          | t       | (BIGINT)
 DOUBLE PRECISION | ABS          | t       | (DOUBLE PRECISION)
 INTEGER          | ABS          | t       | (INTEGER)
 DOUBLE PRECISION | COS          | t       | (DOUBLE PRECISION)
 DOUBLE PRECISION | COT          | t       | (DOUBLE PRECISION)
 INTEGER          | CUSTOMERNAME | f       | (CHARACTER VARYING(64000))

To show verbose information for the sample UDF named customername, use the following command.

DEV(MYUSER)=> SHOW FUNCTION customername VERBOSE;


 RESULT  |   FUNCTION   | BUILTIN |         ARGUMENTS          | NULLONNULLINPUT |
 DETERMINISTIC | LOGMASK | MEMORY | OWNER | VARARGS | FENCED | VERSION | LASTCALL |
 DESCRIPTION | DEPENDENCIES | LOCATION | ENV
---------+--------------+---------+----------------------------+-----------------+
---------------+---------+--------+-------+---------+--------+---------+----------+
-------------+--------------+----------+-----
 INTEGER | CUSTOMERNAME | f       | (CHARACTER VARYING(64000)) | t               |
 t             | NONE    | 0      | ADMIN | f       | t      |       1 |          |
             |              |          |
(1 row)

SHOW LIBRARY
Use the SHOW LIBRARY command to display information about one or more user-defined
shared libraries. The command checks your user account privileges to ensure that you are
permitted to see information about the shared libraries defined in the database.

Synopsis
Syntax:
SHOW LIBRARY [ALL | ident] [VERBOSE]

Inputs
The SHOW LIBRARY command takes the following inputs:
Table B-24: SHOW LIBRARY Input
Input   | Description
--------+-------------------------------------------------------------------------
ALL     | Show information about all the libraries defined in the database. This is the default.
ident   | Show information about one or more libraries defined in the database that begin with ident. You can specify a partial name.
VERBOSE | Display detailed information about the libraries.

Description
The SHOW LIBRARY command is identical in behavior to the nzsql \dl command.

Privileges Required
Any user can run the command SHOW LIBRARY; however, you must be the admin user,
own the library, or have object privileges on libraries (such as Execute, List, Alter, or Drop)
to see information about libraries in the output.

Common Tasks
Use the SHOW LIBRARY command to display information about the shared libraries in a
database.

Related Commands
See ALTER LIBRARY on page B-11 to alter shared libraries.
See CREATE [OR REPLACE] LIBRARY on page B-23 to create or replace shared
libraries.
See DROP LIBRARY on page B-28 to drop shared libraries.

Usage
The following provides sample usage.

To show all the libraries, use the following command.


MYDB(MYUSER)=> SHOW LIBRARY;
    LIBRARY    | AUTOMATICLOAD
---------------+---------------
 MYLIB         | t
 MYMATHLIB     | t
 MYSQLTOOLSLIB | f
(6 rows)

To show verbose information for the sample library named mylib, or any libraries that
begin with the mylib string, use the following command.

DEV(MYUSER)=> SHOW LIBRARY mylib VERBOSE;


    LIBRARY    | AUTOMATICLOAD | OWNER | DESCRIPTION | DEPENDENCIES
---------------+---------------+-------+-------------+--------------
 MYLIB         | t             | USER  |             |
 MYLIBMATH     | t             | USER  |             |
 MYLIBSPECIALS | t             | USER  |             | MYLIBMATH
(3 rows)

APPENDIX C

Datatype Helper API Reference

What's in this appendix
Temporal Datatype Helper Functions
Numeric Datatype Helper Functions
UTF-8 Datatype Helper Functions

This appendix describes helper functions that you can use to manage, verify, and convert
Netezza-specific datatype values. Helper functions are available for processing three
kinds of values: temporal (date/time) values, numeric values, and UTF-8 values.

Temporal Datatype Helper Functions


This section describes the temporal datatype helper API functions. These functions can
help you to do the following types of tasks within your user-defined functions and
aggregates:

Convert datatypes from their internal Netezza formats into developer-style data, a process
known as decoding.

Convert data from developer-style data into Netezza internal formats for storage on and
use by the Netezza system, a process known as encoding.

Verify that a value is within the valid range for its specified datatype.

Developer-style data refers to simple data structures from which programmers can infer
useful information. For example, the API function decodeDate converts a Netezza date,
which is an integer number of days before or after 1/1/2000, to a format that is more
commonly used in programs: Gregorian calendar day, month, and year. The API does not
provide any kind of text (presentation) formatting or "end-user friendly" formats.
This API currently processes temporal data types such as: Date, Time, Timestamp, TimeTZ
and Interval. The API generally complies with the standard for the ISO C time_t datatype.
The API also complies with the standard for the struct tm datatype, with two exceptions:

Leap seconds are not supported. The range for tm_sec is reduced to [0, 59] for the API
functions.

When converting from NZ Date or NZ Timestamp to struct tm, the tm_yday is set to 0,
and the tm_gmtoff field is not supported on Netezza SPUs.

Since you must include the udxinc.h header file in the C++ UDX source code, you automatically have access to the datatype helper API functions. The helper API functions are
contained within namespace nz::udx::dthelpers.

Netezza Date/Time Datatype Representations


Table C-1 lists the Netezza datatypes for date and time values addressed by the datatype
helper API, and the "maximum safe" range values for each datatype. Like all datatypes,
Netezza temporal types have a specific range of supported values. Not all of the range limits are strictly enforced, but violating the range usually causes subsequent errors which
may not be thrown immediately.
Table C-1: Netezza Temporal Datatypes
Type      | Internal Representation | Range
----------+-------------------------+------
Date      | int32 which represents the number of days before (-) or after (+) 1/1/2000 | Min: -730,119 (1/1/0001). Max: 2,921,939 (12/31/9999).
Time      | int64 which represents the number of microseconds since midnight, up to one microsecond before the following midnight | Min: 0 (00:00:00.000000). Max: 86,399,999,999 (23:59:59.999999).
TimeTZ    | int64 which represents time (see Time above), plus an int32 which represents the offset, in seconds, sign reversed. For example, the offset of "+ 1 hour" is stored as -3600 (-1*60*60). | Time Min: 0 (00:00:00.000000). Time Max: 86,399,999,999 (23:59:59.999999). Offset Min: -46800 (+13:00:00). Offset Max: 46740 (-12:59:00). Offset must be a whole number of minutes (offset%60 = 0). Note the sign reversal.
Timestamp | int64 which represents the number of microseconds before (-) or after (+) 00:00:00.0, 1/1/2000 | Min: -63,082,281,600,000,000 (00:00:00, 1/1/0001). Max: 252,455,615,999,999,999 (23:59:59.999999, 12/31/9999).
Interval  | int32 which represents the number of months (+/-), plus an int64 which represents the number of microseconds (+/-). Note that a configuration of a negative (-) months value and a positive (+) microseconds value, and vice-versa, is possible and supported by the Netezza system. | Months Min: -3,000,000 (-250000 years). Months Max: 3,000,000 (250000 years). Microseconds Min: NONE (min signed int64). Microseconds Max: NONE (max signed int64). The microsecond value can be as large as the int64 datatype allows and will overflow into negatives without error. A month is always considered to contain 30 days. The months and microseconds values are stored separately and there is no information exchange between them.
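
As an illustration of the Date and TimeTZ encodings in Table C-1, the following self-contained C++ sketch computes the encoded values by hand. The names sketchEncodeDate and sketchEncodeOffset are hypothetical and are not part of the datatype helper API; the sketch uses the standard <cstdint> types in place of the Netezza int32 and int16 typedefs, and it assumes the proleptic Gregorian calendar.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a Netezza API function): number of days from
// 1/1/2000 to the given Gregorian date; negative for earlier dates.
inline int32_t sketchEncodeDate(unsigned month, unsigned day, int year)
{
    assert(month >= 1 && month <= 12 && day >= 1 && day <= 31);
    // Well-known "days from civil" arithmetic, counted from 1/1/1970.
    const int y = (month <= 2) ? year - 1 : year;
    const int era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);   // year of era
    const unsigned doy =
        (153 * (month > 2 ? month - 3 : month + 9) + 2) / 5 + day - 1;
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;  // day of era
    const int32_t daysSince1970 =
        era * 146097 + static_cast<int32_t>(doe) - 719468;
    // 10,957 days separate 1/1/1970 from Netezza day zero (1/1/2000).
    return daysSince1970 - 10957;
}

// Hypothetical helper: a TimeTZ offset is stored in seconds with the sign
// reversed, so "+1 hour" (60 minutes) becomes -3600.
inline int32_t sketchEncodeOffset(int16_t offsetMinutes)
{
    return -60 * static_cast<int32_t>(offsetMinutes);
}
```

With these definitions, sketchEncodeDate(1, 1, 1) yields -730119 and sketchEncodeDate(12, 31, 9999) yields 2921939, which match the Date range limits in Table C-1.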

Each conversion function provides an optional boolean error write-to argument for easy
checking. The optional argument is set to true when the given data is out of range and false
when the given data is in range (and there are no other errors). The conversion routines will
throw an error when any of the passed references or pointers are null, or when the optional
error argument is not supplied and there is an error.

Support for time_t Temporal Structures


The API provides functions for working with the C++ time_t type. Implementations of
time_t differ on various systems; Netezza's implementation uses a signed int32. This is
the implementation used by the gcc 3.4.5 and the diab compilers, which are used by Netezza
for UDX compilation. This implementation has a limitation: it supports dates only between
1/1/1970 and 1/19/2038, which means values from 0 to 2147483647, inclusive. Although the
time_t standard supports values outside that range, the behavior of those other values is
not guaranteed. The remaining sections of this document refer to this specific
implementation simply as "time_t".
The datatype helper API supports only decoding and encoding between time_t and the
Netezza Date and Timestamp datatypes. Since the range of time_t is a subset of the range
for Netezza Date and Timestamp, this conversion allows for only values in the range of
time_t. Table C-2 lists several correlated values between time_t, Netezza Date and Netezza
Timestamp. Note that using API functions to encode or decode Date or Timestamp values
out of the range of time_t will result in errors.
Table C-2: Examples of time_t and Netezza Date and Timestamp Conversions
Gregorian Calendar Date and Time | time_t value at UTC + 0:00 (GMT) | NZ Date Value | NZ Timestamp Value
---------------------------------+----------------------------------+---------------+-------------------
Lowest supported NZ date: 1/1/0001, 00:00:00 | UNDEFINED | -730,119 | -63,082,281,600,000,000
time_t Epoch start: 1/1/1970, 00:00:00 | 0 | -10,957 | -946,684,800,000,000
NZ day zero: 1/1/2000, 00:00:00 | 946,684,800 | 0 | 0
time_t Epoch end: 1/19/2038, 03:14:07 | 2,147,483,647 | 13,898 | 1,200,798,847,000,000
Highest supported NZ date: 12/31/9999, 23:59:59.999999 | UNDEFINED | 2,921,939 | 252,455,615,999,999,999
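
The correlated values in Table C-2 follow from simple arithmetic on the two epochs. The following sketch shows that arithmetic under the signed int32 time_t convention described above. The function names are hypothetical stand-ins rather than API functions, and the constants are derived from the table rather than taken from udxinc.h.

```cpp
#include <cassert>
#include <cstdint>

// Seconds and days between the time_t epoch (1/1/1970) and Netezza day
// zero (1/1/2000); both values appear in Table C-2.
const int64_t kSecondsFrom1970To2000 = 946684800LL;
const int32_t kDaysFrom1970To2000 = 10957;

// Hypothetical helper: time_t seconds to Netezza Timestamp microseconds.
inline int64_t sketchTimestampFromTimeT(int32_t t)
{
    assert(t >= 0);  // the supported time_t range starts at 0
    return (static_cast<int64_t>(t) - kSecondsFrom1970To2000) * 1000000LL;
}

// Hypothetical helper: time_t seconds to Netezza Date (whole days only).
inline int32_t sketchDateFromTimeT(int32_t t)
{
    assert(t >= 0);
    return t / 86400 - kDaysFrom1970To2000;
}
```

For example, a time_t of 0 maps to a Date of -10957 and a Timestamp of -946684800000000, and the largest int32 value, 2147483647, maps to a Date of 13898.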

Support for struct tm Temporal Structures


In addition to supporting time_t, the datatype helper API also supports the common C++
struct tm structure. Netezza Date and Timestamp values can be converted to a tm; however,
Time and TimeTz values cannot because there is no date information included in those
values.
The API uses the strict implementation of struct tm which does not contain the additional
fields tm_zone and tm_gmtoff, commonly seen in GCC. The strict implementation is necessary for consistency between the host and SPU C++ libraries. The API provides functions
for conversions between struct tm and the Netezza Date and Timestamp datatypes. Since
struct tm uses more fields than are needed to store Date or Timestamp information, several
fields will be ignored by conversion. Table C-3 lists the mappings between tm fields and
their Netezza Datatype counterparts.
Table C-3: Examples of struct tm and Netezza Datatype Properties
struct tm Fields | Data used by NZ DATE | Data used by NZ TIMESTAMP
-----------------+----------------------+--------------------------
tm_sec (0 to 61) | Ignored | Seconds (0 to 59)
tm_min (0 to 59) | Ignored | Minutes
tm_hour (0 to 23) | Ignored | Hours
tm_mday (1 to 31) day of the month | Day | Day
tm_mon (0 to 11) months since January | Month | Month
tm_year since 1900 | Year | Year
tm_wday (0 to 6) week day | Week Day | Week Day
tm_yday (0 to 365) day of the year | Year Day | Year Day
tm_isdst daylight savings time flag | Set to -1 | Set to -1

For conversions to and from struct tm, Netezza does not use or support leap seconds. Thus,
the seconds fields of struct tm that are written out by the API will never exceed 59. If a
conversion routine is called with a struct tm containing a 60 or 61 tm_sec field, the API
throws an error.
When you are encoding from a struct tm format, ignored fields can contain any data. When
you are decoding to a struct tm, the ignored fields will have a value of zero (0).
The tm_isdst flag will be ignored on input and will be set to -1 for 'unknown' on output.
Setting tm_isdst to -1 is not within the standard, but is a typical industry practice. In all
other respects, the API conforms to all the time_t specifications as listed in the ANSI C++
standard.
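
The field conventions above can be sketched with the standard <ctime> struct tm alone. The following example is a minimal illustration, not the API's implementation: sketchMakeTm and sketchSecondsInRange are hypothetical names, and the sketch leaves tm_wday and tm_yday at zero instead of computing them.

```cpp
#include <cassert>
#include <cstring>
#include <ctime>

// Hypothetical helper: fill a struct tm from decoded m/d/y h:m:s values,
// following the conventions in Table C-3.
inline std::tm sketchMakeTm(int month, int day, int year,
                            int hour, int minute, int second)
{
    assert(month >= 1 && month <= 12);
    std::tm t;
    std::memset(&t, 0, sizeof(t));  // unused fields stay at zero
    t.tm_mon = month - 1;           // struct tm counts months from 0
    t.tm_mday = day;
    t.tm_year = year - 1900;        // struct tm counts years from 1900
    t.tm_hour = hour;
    t.tm_min = minute;
    t.tm_sec = second;
    t.tm_isdst = -1;                // daylight saving status unknown
    return t;
}

// Hypothetical check: leap seconds are not supported, so tm_sec values of
// 60 or 61 are rejected.
inline bool sketchSecondsInRange(const std::tm& t)
{
    return t.tm_sec >= 0 && t.tm_sec <= 59;
}
```

A caller that builds a struct tm for February 14, 2012 in this way ends up with tm_mon set to 1 and tm_year set to 112.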

Support for struct timeval Temporal Structures


The conversion between struct timeval and Netezza Timestamp is also supported. For the
tv_sec field, the API enforces the exact same constraints as those used for time_t. The
tv_usec field is allowed to be between 0 and 999,999 only.
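
As a rough illustration of these constraints, the following sketch combines a seconds value and a microseconds value into a Netezza Timestamp. It takes the two fields as plain integers instead of a struct timeval so that it stays portable; sketchEncodeTimeVal is a hypothetical name, and returning false for out-of-range input is a simplification of the API's error-flag behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: encode tv_sec/tv_usec style values as a Netezza
// Timestamp (microseconds relative to 00:00:00, 1/1/2000).
// Returns false if either field is out of range.
inline bool sketchEncodeTimeVal(int32_t tvSec, int32_t tvUsec,
                                int64_t* encodedTimestamp)
{
    assert(encodedTimestamp != NULL);
    if (tvSec < 0) return false;                       // same rule as time_t
    if (tvUsec < 0 || tvUsec > 999999) return false;   // 0 to 999,999 only
    *encodedTimestamp =
        (static_cast<int64_t>(tvSec) - 946684800LL) * 1000000LL + tvUsec;
    return true;
}
```

Encoding the largest supported pair, 2147483647 seconds and 999999 microseconds, produces 1200798847999999.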

Using the IgnoreBuffer to Skip Values


You can use the IgnoreBuffer structure in your user-defined functions to skip or avoid processing arguments that you do not require in your datatype conversions. For example, you
might want to convert a Netezza timestamp to hours, minutes, and seconds, but if you do
not need the microseconds component, you can ignore it.
IgnoreBuffer has the following syntax:
union IgnoreBuffer
{
    uint8 u8;
    uint16 u16;
    uint32 u32;
    uint64 u64;
    int8 s8;
    int16 s16;
    int32 s32;
    int64 s64;
};

To decode a Netezza time to h:m:s only, and ignore microseconds, you can use the IgnoreBuffer structure as follows:
uint8 h,m,s;
IgnoreBuffer ignore;
decodeTime(givenTime, &h, &m, &s, &ignore.u32);

In another example, to decode a date value into only the day and month and ignore the
year, you can use IgnoreBuffer as follows:
uint8 month,day;
IgnoreBuffer ignore;
decodeDate(givenDate, &month, &day, &ignore.u16);

Note that an IgnoreBuffer instance never holds meaningful data. Use it only as a write target for values that you intend to discard, and do not read from it.
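
The following self-contained sketch shows the same pattern outside a Netezza build. It declares a local copy of the union with <cstdint> types and uses sketchDecodeTime, a hypothetical stand-in that only mimics the argument order of decodeTime; neither definition is the one supplied by udxinc.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Local copy of the union for illustration; the real one comes from the API.
union IgnoreBuffer
{
    uint8_t u8;   uint16_t u16;  uint32_t u32;  uint64_t u64;
    int8_t s8;    int16_t s16;   int32_t s32;   int64_t s64;
};

// Hypothetical stand-in for decodeTime: splits microseconds since midnight
// into hour, minute, second, and microsecond parts.
inline void sketchDecodeTime(int64_t encodedTime, uint8_t* hour,
                             uint8_t* minute, uint8_t* second, uint32_t* mcrs)
{
    assert(hour != NULL && minute != NULL && second != NULL && mcrs != NULL);
    const int64_t secs = encodedTime / 1000000;
    *mcrs = static_cast<uint32_t>(encodedTime % 1000000);
    *second = static_cast<uint8_t>(secs % 60);
    *minute = static_cast<uint8_t>((secs / 60) % 60);
    *hour = static_cast<uint8_t>(secs / 3600);
}

// Decode to h:m:s only; the microseconds land in the buffer and are dropped.
inline void sketchHoursMinutesSeconds(int64_t encodedTime, uint8_t* h,
                                      uint8_t* m, uint8_t* s)
{
    IgnoreBuffer ignore;
    sketchDecodeTime(encodedTime, h, m, s, &ignore.u32);
}
```

Because every member of the union shares the same storage, one IgnoreBuffer can absorb any of the integer output arguments without a dedicated throwaway variable per type.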

Range Specifier Constants


Table C-4 describes the range specifier constants. You can use these constants to represent
the maximum or minimum values for the temporal fields. For a description of the meaning
of the various Netezza-encoded temporal datatypes, refer to Table C-1 on page C-2.
Table C-4: Range Specifier Constants
Constant | Description | Value
---------+-------------+------
static const int32 ENC_DATE_MIN | The minimum value that the Netezza-encoded Date can have. | -730,119
static const int32 ENC_DATE_MAX | The maximum value that the Netezza-encoded Date can have. | 2,921,939
static const int64 ENC_TIME_MIN | The minimum value that the Netezza-encoded Time can have. | 0
static const int64 ENC_TIME_MAX | The maximum value that the Netezza-encoded Time can have. | 86,399,999,999
static const int32 ENC_TIMETZ_OFFSET_MIN | The minimum value that the Netezza-encoded TimeTZ Offset part can have (+13:00:00). | -46800
static const int32 ENC_TIMETZ_OFFSET_MAX | The maximum value that the Netezza-encoded TimeTZ Offset part can have (-12:59:00). | 46740
static const int64 ENC_TIMESTAMP_MIN | The minimum value that the Netezza-encoded Timestamp can have. | -63,082,281,600,000,000
static const int64 ENC_TIMESTAMP_MAX | The maximum value that the Netezza-encoded Timestamp can have. | 252,455,615,999,999,999
static const int32 ENC_INTERVAL_MONTH_MIN | The minimum value that the Netezza-encoded Interval Month part can have. | -3,000,000
static const int32 ENC_INTERVAL_MONTH_MAX | The maximum value that the Netezza-encoded Interval Month part can have. | 3,000,000
static const uint16 SQL_YEAR_MIN | The minimum value that the decoded Year can have. Does not apply to the 'years' value of Interval. | 1
static const uint16 SQL_YEAR_MAX | The maximum value that the decoded Year can have. Does not apply to the 'years' value of Interval. | 9999
static const int16 SQL_OFFSET_MIN | The minimum value that the decoded Time Zone Offset part can have (-12:59). | -779
static const int16 SQL_OFFSET_MAX | The maximum value that the decoded Time Zone Offset part can have (+13:00). | 780
static const int32 EPOCH_START_AS_DATE | The start of the time_t Epoch, represented as an NZ Date (1/1/1970). | -10,957
static const int32 EPOCH_END_AS_DATE | The end of the time_t Epoch, represented as an NZ Date (1/19/2038). | 13,898
static const int64 EPOCH_START_AS_TIMESTAMP | The start of the time_t Epoch, represented as an NZ Timestamp (00:00:00, 1/1/1970). | -946,684,800,000,000
static const int64 EPOCH_END_AS_TIMESTAMP | The end of the time_t Epoch, represented as an NZ Timestamp (03:14:07.999999, 1/19/2038). | 1,200,798,847,999,999

Encoded Range-Checking Functions


You can use the following functions to verify whether specified arguments are within the
valid ranges of the Netezza-encoded values.

isValidDate
Verifies whether a Netezza-encoded Date value is valid and within the Netezza Date range.
Description

The function has the following syntax:

inline bool isValidDate(int32 encodedDate)

encodedDate is a value encoded in Netezza Date format.


Returns False if encodedDate < ENC_DATE_MIN or encodedDate > ENC_DATE_MAX.
Otherwise, returns true.

isValidEpochDate
Verifies whether a Netezza-encoded Date value is valid and within the time_t Epoch range.
Description

The function has the following syntax:

inline bool isValidEpochDate(int32 encodedDate)

encodedDate specifies a value encoded in Netezza Date format.


Returns False if encodedDate < EPOCH_START_AS_DATE or
encodedDate > EPOCH_END_AS_DATE. Otherwise, returns true.

isValidTime
Verifies whether a Netezza-encoded Time value is valid and within range.
Description

The function has the following syntax:

inline bool isValidTime(uint64 encodedTime)

encodedTime specifies a value encoded in Netezza Time format.


Returns False if encodedTime < ENC_TIME_MIN, or encodedTime >ENC_TIME_MAX.
Otherwise, returns true.

isValidTimeTzOffset
Verifies whether the offset part of a Netezza-encoded TimeTZ value is valid and within
range.
Description

The function has the following syntax:

inline bool isValidTimeTzOffset(int32 encodedZone)

encodedZone specifies a value encoded in Netezza Time Offset format.


Returns False if encodedZone > ENC_TIMETZ_OFFSET_MAX, or
encodedZone < ENC_TIMETZ_OFFSET_MIN, or encodedZone%60 is not 0. Otherwise,
returns true.
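
The three conditions in this check can be written out as follows. This is a sketch of the documented rule, not the API source: the k-prefixed constants mirror the values listed for ENC_TIMETZ_OFFSET_MIN and ENC_TIMETZ_OFFSET_MAX in Table C-4, and the sketch-prefixed function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from Table C-4 (encoded seconds, sign reversed).
const int32_t kEncTimeTzOffsetMin = -46800;  // +13:00:00
const int32_t kEncTimeTzOffsetMax = 46740;   // -12:59:00

// Hypothetical re-statement of the documented isValidTimeTzOffset rule.
inline bool sketchIsValidTimeTzOffset(int32_t encodedZone)
{
    return encodedZone >= kEncTimeTzOffsetMin &&
           encodedZone <= kEncTimeTzOffsetMax &&
           encodedZone % 60 == 0;  // whole minutes only
}

// Usage example: a one-hour offset passes, a stray second does not.
inline void sketchOffsetUsage()
{
    assert(sketchIsValidTimeTzOffset(-3600));
    assert(!sketchIsValidTimeTzOffset(-3601));
}
```

The whole-minute test works for negative inputs as well, because the remainder of a negative value that is not a multiple of 60 is nonzero in C++.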

isValidTimeTz
Verifies whether a Netezza-encoded TimeTZ value is valid and within range.
Description

The function has the following syntax:

inline bool isValidTimeTz(uint64 encodedTime, int32 encodedZone)

encodedTime specifies a value encoded in Netezza Time format.


encodedZone specifies a value encoded in Netezza Time Offset format.
Returns False if isValidTime(encodedTime) is false, or isValidTimeTzOffset(encodedZone)
is false. Otherwise, returns true.

isValidTimestamp
Verifies whether a Netezza-encoded Timestamp value is valid and within range.
Description

The function has the following syntax:

inline bool isValidTimestamp(int64 encodedTimestamp)

encodedTimestamp specifies a value encoded in Netezza Timestamp format.


Returns False if encodedTimestamp < ENC_TIMESTAMP_MIN or
encodedTimestamp > ENC_TIMESTAMP_MAX. Otherwise, returns true.

isValidEpochTimestamp
Verifies whether a Netezza-encoded Timestamp value is valid and within the time_t Epoch
range.
Description

The function has the following syntax:

inline bool isValidEpochTimestamp(int64 encodedTimestamp)

encodedTimestamp specifies a value encoded in Netezza Timestamp format.


Returns False if encodedTimestamp < EPOCH_START_AS_TIMESTAMP or
encodedTimestamp > EPOCH_END_AS_TIMESTAMP. Otherwise, returns true.

isValidInterval
Verifies whether a Netezza-encoded Interval value is valid and within range.
Description

The function has the following syntax:

inline bool isValidInterval(int32 intervalMonth, int64 intervalTime)

intervalMonth specifies the Month part of an Interval encoded in Netezza format.


intervalTime specifies the Time part of an Interval encoded in Netezza format.
Returns False if intervalMonth < ENC_INTERVAL_MONTH_MIN or
intervalMonth > ENC_INTERVAL_MONTH_MAX. Otherwise, returns true.

Decoded Range-Checking Functions


You can use the following functions to verify whether specified arguments that use developer-style data values fall within the ranges of Netezza datatypes.

isValidDate
Verifies whether a decoded m/d/y Date value is valid and within the Netezza Date range.
Description

The function has the following syntax:

inline bool isValidDate(uint32 month, uint32 day, uint32 year)

month specifies the month in the range of 1 (January) to 12 (December).


day specifies the day of the month in the range of 1 to 31 inclusive.
year specifies the year, which must be in the range of SQL_YEAR_MIN to SQL_YEAR_MAX,
inclusive.
Returns False if (month<1 or month>12), or (day<1 or day>31), or (year<SQL_YEAR_MIN
or year > SQL_YEAR_MAX). Also false if (month is in (4, 6, 9, 11) and day>30), or
(isLeapYear(year) and month=2 and day>29), or (!isLeapYear(year) and month=2 and
day>28). Otherwise, returns true.
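
The rule above translates directly into code. The following sketch assumes that isLeapYear applies the usual Gregorian rule and that SQL_YEAR_MIN and SQL_YEAR_MAX are 1 and 9999; both are assumptions made for illustration, and the sketch-prefixed names are hypothetical.

```cpp
#include <cassert>

// Assumed Gregorian leap-year rule (the API's isLeapYear is not shown here).
inline bool sketchIsLeapYear(unsigned year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

// Hypothetical re-statement of the documented decoded isValidDate rule.
inline bool sketchIsValidDate(unsigned month, unsigned day, unsigned year)
{
    if (month < 1 || month > 12) return false;
    if (day < 1 || day > 31) return false;
    if (year < 1 || year > 9999) return false;  // assumed SQL_YEAR_MIN/MAX
    if ((month == 4 || month == 6 || month == 9 || month == 11) && day > 30)
        return false;
    if (month == 2 && day > (sketchIsLeapYear(year) ? 29u : 28u))
        return false;
    return true;
}

// Usage example: February 29 exists in 2000 but not in 1900.
inline void sketchDateUsage()
{
    assert(sketchIsValidDate(2, 29, 2000));
    assert(!sketchIsValidDate(2, 29, 1900));
}
```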

isValidTime
Verifies whether a decoded h:m:s:micros Time value is valid and within the Netezza Time
range.
Description

The function has the following syntax:

inline bool isValidTime(uint32 hour, uint32 minute, uint32 second,


uint32 mcrs)

hour specifies the hour in the range of 0 to 23 inclusive.


minute specifies the minute in the range of 0 to 59 inclusive.
second specifies the second in the range of 0 to 59 inclusive.
mcrs specifies the microsecond in the range of 0 to 999999 inclusive.
Returns False if hour >23 or minute > 59 or second >59 or mcrs > 999999. Otherwise,
returns true.

isValidSqlOffset
Verifies whether a time offset value is within the valid API range.
Description

The function has the following syntax:

inline bool isValidSqlOffset(int32 offset)

offset specifies a time offset in minutes in the range of SQL_OFFSET_MIN to


SQL_OFFSET_MAX inclusive.
Returns False if offset > SQL_OFFSET_MAX or offset < SQL_OFFSET_MIN. Otherwise,
returns true.

isValidTimeTz
Verifies whether a decoded h:m:s:micros+offset TimeTZ value is valid and within the
Netezza TimeTZ range.
Description

The function has the following syntax:

inline bool isValidTimeTz(uint32 hour, uint32 minute, uint32 second,


uint32 mcrs, uint32 sqlOffset)

hour specifies the hour in the range of 0 to 23 inclusive.


minute specifies the minute in the range of 0 to 59 inclusive.
second specifies the second in the range of 0 to 59 inclusive.
mcrs specifies the microsecond in the range of 0 to 999999 inclusive.
sqlOffset specifies the time zone offset in minutes in the range of SQL_OFFSET_MIN to
SQL_OFFSET_MAX inclusive.
Returns False if isValidTime(hour, minute, second, mcrs) is false or
isValidSqlOffset(sqlOffset) is false. Otherwise, returns true.

isValidTimestamp
Verifies whether a decoded m/d/y, h:m:s:micros Timestamp value is valid and within the
Netezza Timestamp range.
Description

The function has the following syntax:

inline bool isValidTimestamp(uint32 month, uint32 day, uint32 year,


uint32 hour, uint32 minute, uint32 second, uint32 mcrs)

month specifies the month in the range of 1 (January) to 12 (December) inclusive.


day specifies the day in the range of 1 to 31 inclusive.
year specifies the year of the date in the range of SQL_YEAR_MIN to SQL_YEAR_MAX
inclusive.
hour specifies the hour in the range of 0 to 23 inclusive.
minute specifies the minute in the range of 0 to 59 inclusive.
second specifies the second in the range of 0 to 59 inclusive.
mcrs specifies the microsecond in the range of 0 to 999999 inclusive.
Returns False if isValidDate(month, day, year) is false or isValidTime(hour, minute, second, mcrs) is false. Otherwise, returns true.

isValidEpoch
Verifies whether a decoded time_t value is valid and can be decoded to NZ Timestamp or
Date.
Description

The function has the following syntax:

inline bool isValidEpoch(int32 time)

time specifies the C++ time value, forced to be a signed int32.


Returns

False if time < 0. Otherwise, returns true.

isValidTimeValUsecs
Verifies whether the microseconds part of a timeval structure is valid.
Description

The function has the following syntax:

inline bool isValidTimeValUsecs(uint32 usecs)

usecs specifies the tv_usec part of a timeval data structure.


Returns

False if usecs < 0 or usecs > 999,999. Otherwise, returns true.

isValidTimeVal
Verifies whether a given timeval structure is valid and can be encoded to a Netezza
Timestamp.
Description

The function has the following syntax:

inline bool isValidTimeVal(const struct timeval& tv)

tv specifies the timeval structure to verify.


Returns False if isValidEpoch(tv.tv_sec) is false or isValidTimeValUsecs(tv.tv_usec) is
false. Otherwise, returns true.
Throws

The function throws an opaque exception object if &tv is NULL.

isValidTimeStruct
Verifies whether a given tm structure can be encoded to a Netezza Date or Timestamp
value.
Description

The function has the following syntax:

inline bool isValidTimeStruct(const struct tm& ts)

ts specifies the tm structure to verify.


Returns false if any ts.(tm_mon, tm_mday, tm_hour, tm_min, tm_sec) is negative or
ts.tm_year+1900<SQL_YEAR_MIN or isValidDate(ts.tm_mon+1, ts.tm_mday,
ts.tm_year+1900) is false or isValidTime(ts.tm_hour, ts.tm_min, ts.tm_sec) is false.
Otherwise, returns true.
Throws

The function throws an opaque exception object if &ts is NULL.

Decoder Conversion Functions


You can use the following functions to decode Netezza internal format datatype values.

decodeDate (m/d/y Output)


Converts a Netezza-encoded Date value to m/d/y.
Description

The function has the following syntax:

inline void decodeDate(int32 encodedDate, uint8* month, uint8* day,


uint16* year, bool* errorFlag = NULL)

encodedDate specifies a value encoded in Netezza Date format.


day specifies the parameter in which to record the day count (1 to 31 inclusive).
month specifies the parameter in which to record the month number (1 to 12 inclusive).
year specifies the parameter in which to record the year number (SQL_YEAR_MIN to
SQL_YEAR_MAX inclusive).
errorFlag is an optional argument. If not NULL, it is set to true if isValidDate(encodedDate)
is false. Otherwise it is set to false.
Throws The function throws an opaque exception object if any(month,day,year) is NULL
or (errorFlag is NULL and isValidDate(encodedDate) is false).

decodeDate (time_t Output)


Converts a Netezza-encoded Date value to time_t. The function treats encodedDate as if it
is UTC. The resulting time_t represents the time 00:00:00 on the specified date.
Description

The function has the following syntax:

inline void decodeDate(int32 encodedDate, int32* result, bool*


errorFlag = NULL)

encodedDate specifies a value encoded in Netezza Date format.


result specifies the parameter in which to record the time_t date representation, forced to
be a signed int32.
errorFlag is an optional argument. If not NULL, it is set to true if isValidEpochDate(encodedDate) is false. Otherwise it is set to false.
Throws The function throws an opaque exception object if result is NULL or (errorFlag is
NULL and isValidEpochDate(encodedDate) is false).
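
The errorFlag convention shared by the decoder functions can be sketched with this Date-to-time_t case, because the arithmetic is short. The sketch below is only an approximation: sketchDecodeDateToTimeT is a hypothetical name, the range limits are the Epoch values from Table C-2, and standard exceptions stand in for the opaque exception object that the API throws.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for decodeDate (time_t output).
inline void sketchDecodeDateToTimeT(int32_t encodedDate, int32_t* result,
                                    bool* errorFlag = NULL)
{
    if (result == NULL)
        throw std::invalid_argument("result is NULL");

    // Epoch range as NZ Dates: 1/1/1970 through 1/19/2038.
    const bool outOfRange = encodedDate < -10957 || encodedDate > 13898;
    if (errorFlag != NULL)
        *errorFlag = outOfRange;  // caller checks the flag

    if (outOfRange) {
        if (errorFlag == NULL)    // no flag supplied, so report by throwing
            throw std::out_of_range("date is outside the time_t Epoch");
        return;                   // this sketch leaves *result untouched
    }
    // Midnight (00:00:00) on the given date, in seconds since 1/1/1970.
    *result = static_cast<int32_t>(
        (static_cast<int64_t>(encodedDate) + 10957) * 86400);
}

// Usage example: NZ day zero decodes to 946684800 with no error.
inline void sketchDecodeUsage()
{
    int32_t t = 0;
    bool err = true;
    sketchDecodeDateToTimeT(0, &t, &err);
    assert(!err && t == 946684800);
}
```

Passing the flag lets a caller handle out-of-range rows inline, while omitting it turns the same condition into an exception.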

decodeDate (struct tm Output)


Converts a Netezza-encoded Date value to a struct tm. The resulting tm represents the time
00:00:00 on the specified date, with an unknown daylight savings status.
Description

The function has the following syntax:

inline void decodeDate(int32 encodedDate, struct tm* result, bool*


errorFlag = NULL)

encodedDate specifies a value encoded in Netezza Date format.


result specifies the structure where the decoded Date is written, such that result->tm_year,
result->tm_mon, result-> tm_mday, result->tm_yday and result->tm_wday contain the
appropriate fields in tm format. result->tm_isdst is set to -1. When applicable, all the other
fields of result are set to 0.
errorFlag is an optional argument. If not NULL, it is set to true if isValidDate(encodedDate)
is false. Otherwise it is set to false.
Throws The function throws an opaque exception object if result is NULL or (errorFlag is
NULL and isValidDate(encodedDate) is false).

decodeTime
Converts a Netezza-encoded Time value to h:m:s:micros.
Description

The function has the following syntax:

inline void decodeTime(int64 encodedTime, uint8* hour, uint8* minute,
uint8* second, uint32* mcrs, bool* errorFlag = NULL)

encodedTime specifies a value encoded in Netezza Time format.


hour specifies the parameter in which to record the hour (0 to 23 inclusive).
minute specifies the parameter in which to record the minute (0 to 59 inclusive).
second specifies the parameter in which to record the second (0 to 59 inclusive).
mcrs specifies the parameter in which to record the microsecond (0 to 999,999 inclusive).
errorFlag is an optional argument. If not NULL, it is set to true if isValidTime(encodedTime)
is false. Otherwise it is set to false.
Throws The function throws an opaque exception object if any(hour,minute,second,mcrs) is
NULL or (errorFlag is NULL and isValidTime(encodedTime) is false).

decodeTimeTz
Converts a Netezza-encoded TimeTZ value to h:m:s:micros and a time zone offset in minutes.
Description

The function has the following syntax:

inline void decodeTimeTz(int64 encodedTime, int32 encodedZone, uint8*


hour, uint8* minute, uint8* second, uint32* mcrs, int16* sqlOffset,
bool* errorFlag = NULL)

encodedTime specifies a value encoded in Netezza Time format.


encodedZone specifies a value encoded in Netezza Time Offset format.
hour specifies the parameter in which to record the hour (0 to 23 inclusive).
minute specifies the parameter in which to record the minute (0 to 59 inclusive).
second specifies the parameter in which to record the second (0 to 59 inclusive).
mcrs specifies the parameter in which to record the microsecond (0 to 999,999 inclusive).
sqlOffset specifies the parameter in which to record the offset in minutes (SQL_OFFSET_
MIN to SQL_OFFSET_MAX, inclusive).
errorFlag is an optional argument. If not NULL, it is set to true if isValidTimeTz(encodedTime, encodedZone) is false. Otherwise it is set to false.
Throws The function throws an opaque exception object if any(hour,minute,second,mcrs,sqlOffset) is NULL or (errorFlag is NULL and isValidTimeTz(encodedTime,
encodedZone) is false).

decodeTimestamp (m/d/y h:m:s:m Output)


Converts a Netezza-encoded Timestamp value to m/d/y, h:m:s:micros.
Description

The function has the following syntax:

inline void decodeTimestamp(int64 encodedTimestamp, uint8* month,
uint8* day, uint16* year, uint8* hour, uint8* minute, uint8* second,
uint32* mcrs, bool* errorFlag = NULL)

encodedTimestamp specifies a value encoded in Netezza Timestamp format.


month specifies the parameter in which to record the month number (1 to 12 inclusive).
day specifies the parameter in which to record the day count (1 to 31 inclusive).
year specifies the parameter in which to record the year number (SQL_YEAR_MIN to SQL_YEAR_MAX inclusive).
hour specifies the parameter in which to record the hour (0 to 23 inclusive).
minute specifies the parameter in which to record the minute (0 to 59 inclusive).
second specifies the parameter in which to record the second (0 to 59 inclusive).
mcrs specifies the parameter in which to record the microsecond (0 to 999,999 inclusive).
errorFlag is an optional argument. If not NULL, it is set to true if isValidTimestamp(encodedTimestamp) is false. Otherwise it is set to false.
Notes The function throws an opaque exception object if
any(month,day,year,hour,minute,second,mcrs) is NULL or (errorFlag is NULL and isValidTimestamp(encodedTimestamp) is false).

decodeTimestamp (time_t Output)


Converts a Netezza-encoded Timestamp value to a time_t value. The function drops the
microseconds after the last whole second of the timestamp value.
Description

The function has the following syntax:

inline void decodeTimestamp(int64 encodedTimestamp, int32* result,
bool* errorFlag = NULL)

encodedTimestamp specifies a value encoded in Netezza Timestamp format.


result specifies the parameter in which to return the resulting time_t value, which is forced
to be a signed int32 implementation.

errorFlag is an optional argument. If not NULL, it is set to true if
isValidEpochTimestamp(encodedTimestamp) is false. Otherwise it is set to false.
Notes The function throws an opaque exception object if result is NULL or (errorFlag is
NULL and isValidEpochTimestamp(encodedTimestamp) is false).

decodeTimestamp (struct timeval Output)


Converts a Netezza-encoded Timestamp value to struct timeval.
Description

The function has the following syntax:

inline void decodeTimestamp(int64 encodedTimestamp,
struct timeval* result, bool* errorFlag = NULL)

encodedTimestamp specifies a value encoded in Netezza Timestamp format.


result specifies the structure where the decoded Timestamp is written.
errorFlag is an optional argument. If not NULL, it is set to true if
isValidEpochTimestamp(encodedTimestamp) is false. Otherwise it is set to false.
Notes The function throws an opaque exception object if result is NULL or (errorFlag is
NULL and isValidEpochTimestamp(encodedTimestamp) is false).

decodeTimestamp (struct tm Output)


Converts a Netezza-encoded Timestamp value to struct tm. The function drops the microseconds after the last whole second of the timestamp value.
Description

The function has the following syntax:

inline void decodeTimestamp(int64 encodedTimestamp, struct tm* result,
bool* errorFlag = NULL)

encodedTimestamp specifies a value encoded in Netezza Timestamp format.


result specifies the structure where the decoded Timestamp is written, such that result->tm_hour, result->tm_min, result->tm_sec, result->tm_year, result->tm_mon, result->tm_mday, result->tm_yday, and result->tm_wday contain the appropriate fields in tm format.
result->tm_isdst is set to -1. If applicable, all other fields of result are set to 0.
errorFlag is an optional argument. If not NULL, it is set to true if
isValidTimestamp(encodedTimestamp) is false. Otherwise it is set to false.
Notes The function throws an opaque exception object if result is NULL or (errorFlag is
NULL and isValidTimestamp(encodedTimestamp) is false).

Encoder Conversion Functions


You can use the following functions to encode developer-style data into Netezza internal
formats for storage on the Netezza system.

encodeDate (m/d/y Values)


The encodeDate function converts a m/d/y Date value to a Netezza-encoded Date value.
Description

The function has the following syntax:

inline void encodeDate(uint32 month, uint32 day, uint32 year,
int32* encodedDate, bool* errorFlag = NULL)

day specifies the day count from 1 to 31 inclusive.


month specifies the month number from 1 to 12 inclusive.
year specifies the year number from SQL_YEAR_MIN to SQL_YEAR_MAX inclusive.
encodedDate is the parameter in which to record the Date in Netezza-encoded format.
errorFlag is an optional argument. If not NULL, it is set to true if isValidDate(month,day,year) is false. Otherwise it is set to false.
Throws The function throws an opaque exception object if encodedDate is NULL or (errorFlag is NULL and isValidDate(month, day, year) is false).

encodeDate (time_t Values)


The encodeDate function converts a time_t Date value to a Netezza-encoded Date. The
function drops the hours, minutes, and seconds elapsed after the last whole day in the
time_t value to round the time_t value down to the last whole day.
Description

The function has the following syntax:

inline void encodeDate(int32 date, int32* encodedDate,
bool* errorFlag = NULL)

date specifies the time_t date value.


encodedDate specifies the parameter in which to record the Date encoded in Netezza
format.
errorFlag is an optional argument. If not NULL, it is set to true if isValidEpoch(date) is
false. Otherwise it is set to false.
Throws The function throws an opaque exception object if encodedDate is NULL or (errorFlag is NULL and isValidEpoch(date) is false).
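The rounding described above (dropping everything after the last whole day) can be illustrated with a short sketch. The function name dayNumberFromEpochSketch is hypothetical, and the code assumes that an encoded Date is a day count relative to 2000-01-01; the actual internal encoding might differ.

```cpp
#include <cstdint>

const int64_t kSecondsPerDay   = 86400;
const int64_t kEpochDaysTo2000 = 10957;  // days from 1970-01-01 to 2000-01-01

// Hypothetical helper: rounds a time_t value down to the last whole day and
// expresses it as a day number relative to 2000-01-01 (an assumed encoding).
int32_t dayNumberFromEpochSketch(int32_t t)
{
    int64_t days = t / kSecondsPerDay;   // truncates toward zero
    if (t % kSecondsPerDay < 0) --days;  // floor instead of truncate for t < 0
    return static_cast<int32_t>(days - kEpochDaysTo2000);
}
```

With this sketch, any time_t value that falls on 2000-01-01 UTC maps to day 0, and the last second of 1999-12-31 maps to day -1.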

encodeDate (struct tm Values)


Converts a struct tm value to a Netezza-encoded Date. The function uses only the tm.tm_year,
tm.tm_mon, and tm.tm_mday fields of date and ignores the other fields. The date value should pass the isValidTimeStruct boolean test, but this is not a
requirement.
Description

The function has the following syntax:

inline void encodeDate(const struct tm& date, int32* encodedDate,
bool* errorFlag = NULL)

date specifies the struct tm date value.


encodedDate specifies the parameter in which to record the Date encoded in Netezza
format.
errorFlag is an optional argument. If not NULL, it is set to true if date.tm_mon<0 or
date.tm_mday<1 or date.tm_year+1900<SQL_YEAR_MIN or isValidDate(date.tm_mon+1,
date.tm_mday, date.tm_year+1900) is false. Otherwise it is set to false.
Throws The function throws an opaque exception object if &date is NULL or encodedDate
is NULL or (errorFlag is NULL and (isValidDate(date.tm_mon+1, date.tm_mday,
date.tm_year+1900) is false or date.tm_mon<0 or date.tm_mday<1 or
date.tm_year+1900<SQL_YEAR_MIN)).

encodeTime
The encodeTime function converts a h:m:s:micros Time value to a Netezza-encoded Time
value.
Description

The function has the following syntax:

inline void encodeTime(uint32 hour, uint32 minute, uint32 second,
uint32 mcrs, int64* encodedTime, bool* errorFlag = NULL)

hour specifies the hour in the range 0 to 23 inclusive.


minute specifies the minute in the range 0 to 59 inclusive.
second specifies the second in the range 0 to 59 inclusive.
mcrs specifies the microsecond in the range 0 to 999,999 inclusive.
encodedTime specifies the parameter in which to record the Time encoded in Netezza
format.
errorFlag is an optional argument. If not NULL, it is set to true if
isValidTime(hour,minute,second,mcrs) is false. Otherwise it is set to false.
Throws The function throws an opaque exception object if encodedTime is NULL or
(errorFlag is NULL and isValidTime(hour, minute, second, mcrs) is false).
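The encoder and decoder helpers share one error contract: pass an errorFlag pointer to have errors reported through the flag, or omit it to have the function throw. The following sketch shows both calling styles. The names encodeTimeSketch, tryEncode, and encodeOrCatch are hypothetical, the microseconds-since-midnight encoding is an assumption, and standard exceptions stand in for the opaque exception object.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for encodeTime.
// Assumption: Time encodes as microseconds since midnight.
void encodeTimeSketch(uint32_t hour, uint32_t minute, uint32_t second,
                      uint32_t mcrs, int64_t* encodedTime, bool* errorFlag = nullptr)
{
    if (!encodedTime) throw std::invalid_argument("encodedTime is NULL");

    const bool valid = hour <= 23 && minute <= 59 && second <= 59 && mcrs <= 999999;
    if (errorFlag) *errorFlag = !valid;
    if (!valid) {
        if (!errorFlag) throw std::out_of_range("invalid time fields");
        return;  // flag style: report the error and leave the output untouched
    }
    *encodedTime = ((hour * 60LL + minute) * 60LL + second) * 1000000LL + mcrs;
}

// Style 1: pass a flag and test it after the call.
bool tryEncode(uint32_t h, uint32_t m, uint32_t s, int64_t* out)
{
    bool failed = false;
    encodeTimeSketch(h, m, s, 0, out, &failed);
    return !failed;
}

// Style 2: omit the flag and catch the exception.
bool encodeOrCatch(uint32_t h, uint32_t m, uint32_t s, int64_t* out)
{
    try {
        encodeTimeSketch(h, m, s, 0, out);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

The flag style avoids exception handling inside a per-row evaluation path; the exception style keeps the call site shorter when invalid input is not expected.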

encodeTimeTZ
Converts a h:m:s:micros TimeTZ value to a Netezza-encoded TimeTZ.
Description

The function has the following syntax:

inline void encodeTimeTZ(uint32 hour, uint32 minute, uint32 second,
uint32 mcrs, uint32 sqlOffset, int64* encodedTime, int32* encodedZone,
bool* errorFlag = NULL)

hour specifies the hour in the range 0 to 23 inclusive.


minute specifies the minute in the range 0 to 59 inclusive.
second specifies the second in the range 0 to 59 inclusive.
mcrs specifies the microsecond in the range 0 to 999,999 inclusive.
sqlOffset specifies the time zone offset in minutes in the range of SQL_OFFSET_MIN to
SQL_OFFSET_MAX inclusive.
encodedTime specifies the parameter in which to record the Time encoded in Netezza
format.
encodedZone specifies the parameter in which to record the Time Offset in Netezza format.
errorFlag is an optional argument. If not NULL, it is set to true if
isValidTimeTz(hour, minute, second, mcrs, sqlOffset) is false. Otherwise it is set to false.
Throws The function throws an opaque exception object if encodedTime or encodedZone is NULL or
(errorFlag is NULL and isValidTimeTZ(hour, minute, second, mcrs, sqlOffset) is false).

encodeTimestamp (m/d/y Input Format)


Converts a m/d/y, h:m:s:micros Timestamp value to a Netezza-encoded Timestamp.
Description

The function has the following syntax:

inline void encodeTimestamp(uint32 month, uint32 day, uint32 year,
uint32 hour, uint32 minute, uint32 second, uint32 mcrs,
int64* encodedTimestamp, bool* errorFlag = NULL)

month specifies the month in the range of 1 (January) to 12 (December).


day specifies the day in the range of 1 to 31.
year specifies the year of the date in the range of SQL_YEAR_MIN to SQL_YEAR_MAX
inclusive.
hour specifies the hour in the range 0 to 23 inclusive.
minute specifies the minute in the range 0 to 59 inclusive.
second specifies the second in the range 0 to 59 inclusive.
mcrs specifies the microsecond in the range 0 to 999,999 inclusive.
encodedTimestamp specifies the parameter in which to record the Timestamp in Netezza
format.
errorFlag is an optional argument. If not NULL, it is set to true if isValidTimestamp(month,
day, year, hour, minute, second, mcrs) is false. Otherwise it is set to false.
Throws The function throws an opaque exception object if encodedTimestamp is NULL or
(errorFlag is NULL and isValidTimestamp(month, day, year, hour, minute, second, mcrs) is
false).

encodeTimestamp (time_t Input Format)


Converts a time_t value to a Netezza-encoded Timestamp. Encodes the value in UTC and
applies no offsets. It also adds zero (0) microseconds to the encoded value.
Description

The function has the following syntax:

inline void encodeTimestamp(int32 ts, int64* encodedTimestamp,
bool* errorFlag = NULL)

ts specifies the time_t Timestamp value.


encodedTimestamp specifies the parameter in which to record the Timestamp encoded in
Netezza format.
errorFlag is an optional argument. If not NULL, it is set to true if isValidEpoch(ts) is false.
Otherwise it is set to false.
Throws The function throws an opaque exception object if encodedTimestamp is NULL or
(errorFlag is NULL and isValidEpoch(ts) is false).
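The time_t conversion can be sketched as simple arithmetic, based on the description of the timestamp data type in this guide (microseconds since midnight 2000-01-01). The function name timestampFromEpochSketch is hypothetical, and range validation is omitted for brevity.

```cpp
#include <cstdint>

const int64_t kUnixSecondsAt2000 = 946684800;  // 2000-01-01 00:00:00 UTC as time_t

// Hypothetical helper: shifts a UTC time_t value to the 2000-01-01 origin and
// scales it to microseconds. The microsecond part of the result is always zero.
int64_t timestampFromEpochSketch(int32_t ts)
{
    return (static_cast<int64_t>(ts) - kUnixSecondsAt2000) * 1000000LL;
}
```

The cast to a 64-bit integer before the multiplication matters: the microsecond count overflows a 32-bit integer after about 35 minutes.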

encodeTimestamp (timeval Input Format)


Converts a struct timeval value to a Netezza-encoded Timestamp.
Description

The function has the following syntax:

inline void encodeTimestamp(const struct timeval& ts,
int64* encodedTimestamp, bool* errorFlag = NULL)

ts specifies the struct timeval Timestamp value.


encodedTimestamp specifies the parameter in which to record the Timestamp encoded in
Netezza format.
errorFlag is an optional argument. If not NULL, it is set to true if isValidTimeVal(ts) is false.
Otherwise it is set to false.
Throws The function throws an opaque exception object if encodedTimestamp is NULL or
(errorFlag is NULL and isValidTimeVal(ts) is false).

encodeTimestamp (struct tm Input Format)


Converts a struct tm value to a Netezza-encoded Timestamp. The function uses only the
tm.tm_year, tm.tm_mon, tm.tm_mday, tm.tm_hour, tm.tm_min, and tm.tm_sec fields of the
ts structure, and ignores the rest. The ts structure must pass the isValidTimeStruct() boolean test. The function also adds zero (0) microseconds to the encoded value.
Description

The function has the following syntax:

inline void encodeTimestamp(const struct tm& ts,
int64* encodedTimestamp, bool* errorFlag = NULL)

ts specifies the struct tm Timestamp value.


encodedTimestamp specifies the parameter in which to record the Timestamp encoded in
Netezza format.
errorFlag is an optional argument. If not NULL, it is set to true if isValidTimeStruct(ts) is
false. Otherwise it is set to false.
Throws The function throws an opaque exception object if &ts is NULL or
encodedTimestamp is NULL or (errorFlag is NULL and isValidTimeStruct(ts) is false).

Miscellaneous Functions
The miscellaneous datatype helper functions provide additional capabilities that you can
use within your UDX programs.

isLeapYear
Verifies whether the specified year is a leap year.
Description

The function has the following syntax:

inline bool isLeapYear(uint32 year)

year specifies a year number in the range of SQL_YEAR_MIN to SQL_YEAR_MAX inclusive.

Returns

The function returns a value of true when year%4 is 0 and either of the following is true:

year%100 is not 0

year%400 is 0

Otherwise, the function returns a value of false.
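As a point of reference, the standard Gregorian leap-year test (divisible by 4, and not divisible by 100 unless also divisible by 400) fits in one expression. The name isLeapYearSketch is hypothetical; this is a sketch of the rule, not the helper's source.

```cpp
// Hypothetical stand-in that applies the Gregorian leap-year rule.
bool isLeapYearSketch(unsigned year)
{
    return (year % 4 == 0) && (year % 100 != 0 || year % 400 == 0);
}
```

For example, 2000 and 2012 are leap years under this rule, while 1900 and 2011 are not.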

offsetTimestamp
Applies an offset [SQL_OFFSET_MIN, SQL_OFFSET_MAX] to a Netezza Timestamp value.
Description

The function has the following syntax:

inline int64 offsetTimestamp(int64 nzTimestamp, int32 sqlOffset,
bool* errorFlag = NULL)

nzTimestamp specifies a value in Netezza Timestamp format.


sqlOffset specifies the time offset in minutes in the range of SQL_OFFSET_MIN to
SQL_OFFSET_MAX, inclusive.
errorFlag is an optional argument. If not NULL, it is set to true if isValidSqlOffset(sqlOffset)
is false, or isValidTimestamp(nzTimestamp) is false or isValidTimestamp(nzTimestamp+sqlOffset*60*1,000,000) is false. Otherwise it is set to false.
Returns ts=nzTimestamp + sqlOffset*60*1,000,000 if errorFlag is NULL and an exception is not thrown, or if errorFlag is not NULL and, after the call, *errorFlag is false.
Otherwise the function returns an indeterminate value.
Throws The function throws an opaque exception object if errorFlag is NULL and (isValidSqlOffset(sqlOffset) is false, or isValidTimestamp(nzTimestamp) is false or
isValidTimestamp(nzTimestamp+sqlOffset*60*1,000,000) is false).

offsetTime
Applies an offset to a Netezza Time value. If nzTime with offset crosses 23:59:59.999999,
the value will reset (wrap) back to zero. For example, applying "+120 minutes" to the
encoded equivalent of "23:00:00" returns the encoded equivalent of "01:00:00".
Description

The function has the following syntax:

inline int64 offsetTime(int64 nzTime, int32 sqlOffset,
bool* errorFlag = NULL)

nzTime specifies the Time value that you want to offset.


sqlOffset specifies the time offset, in minutes, in the range of SQL_OFFSET_MIN to
SQL_OFFSET_MAX inclusive.
errorFlag is an optional argument. If not NULL, it is set to true if isValidSqlOffset(sqlOffset)
is false or isValidTime(nzTime) is false. Otherwise it is set to false.
Returns t= (nzTime + sqlOffset*60*1,000,000) mod (ENC_TIME_MAX+1) if errorFlag is
NULL and an exception is not thrown, or if errorFlag is not NULL and, after the call, *errorFlag is false. Otherwise, the function returns an indeterminate value.
Throws The function throws an opaque exception object if errorFlag is NULL and
(isValidSqlOffset(sqlOffset) is false or isValidTime(nzTime) is false).
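The wrap-around behavior can be sketched as follows. The function offsetTimeSketch is a hypothetical stand-in, and the value used for ENC_TIME_MAX (one microsecond before midnight, counted in microseconds) is an assumption. Validation of the inputs is omitted.

```cpp
#include <cstdint>

// Assumed value of ENC_TIME_MAX: 23:59:59.999999 in microseconds since midnight.
const int64_t kEncTimeMax = 86400LL * 1000000LL - 1;

// Hypothetical stand-in for offsetTime: applies an offset in minutes and wraps
// the result into the range 0 to kEncTimeMax.
int64_t offsetTimeSketch(int64_t nzTime, int32_t sqlOffsetMinutes)
{
    const int64_t modulus = kEncTimeMax + 1;
    int64_t t = (nzTime + sqlOffsetMinutes * 60LL * 1000000LL) % modulus;
    if (t < 0) t += modulus;  // the C++ % operator can yield a negative remainder
    return t;
}
```

The extra adjustment for negative remainders is what makes a negative offset wrap backward past midnight, for example from 00:30:00 to 23:30:00 with an offset of -60 minutes.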

offsetEpoch
Applies an offset to a time_t value. The function treats time_t as if it allows offsets, which is
slightly outside the time_t specification, but it allows for easy usage.
Description

The function has the following syntax:

inline int32 offsetEpoch(int32 time, int32 sqlOffset,
bool* errorFlag = NULL)

time specifies a value in time_t format.


sqlOffset specifies the time offset in minutes in the range of SQL_OFFSET_MIN to
SQL_OFFSET_MAX inclusive.
errorFlag is an optional argument. If not NULL, it is set to true if isValidSqlOffset(sqlOffset)
is false or isValidEpoch(time) is false or isValidEpoch(time+sqlOffset*60) is false. Otherwise it is set to false.
Returns t=time+sqlOffset*60 if errorFlag is NULL and an exception is not thrown, or if
errorFlag is not NULL and, after the call, *errorFlag is false. Otherwise, the function
returns an indeterminate value.
Throws The function throws an opaque exception object if errorFlag is NULL and (isValidSqlOffset(sqlOffset) is false or isValidEpoch(time) is false or
isValidEpoch(time+sqlOffset*60) is false).

offsetTimeStruct
Applies an offset to a struct tm.
Description

The function has the following syntax:

inline struct tm offsetTimeStruct(const struct tm& time,
int32 sqlOffset, bool* errorFlag = NULL)

time specifies a time value to offset.


sqlOffset specifies the time offset in minutes in the range of SQL_OFFSET_MIN to
SQL_OFFSET_MAX inclusive.
errorFlag is an optional argument. If not NULL, it is set to true if isValidSqlOffset(sqlOffset)
is false or isValidTimeStruct(time) is false or isValidTimeStruct(offset_time) is false. Otherwise it is set to false.
Returns Assume offset_time = time with the appropriate offset applied, which could result in a
different day, month, and so on. The fields set are offset_time.tm_mday, offset_time.tm_yday,
offset_time.tm_year, offset_time.tm_mon, offset_time.tm_hour, offset_time.tm_min, and
offset_time.tm_sec. offset_time.tm_isdst is set to -1. When applicable, all other
fields of offset_time are set to 0.
The function returns offset_time if errorFlag is NULL and an exception is not thrown, or if errorFlag is
not NULL and, after the call, *errorFlag is false. Otherwise, the function returns an indeterminate value.
Throws The function throws an opaque exception object if errorFlag is NULL and
(isValidSqlOffset(sqlOffset) is false or isValidTimeStruct(time) is false or
isValidTimeStruct(offset_time) is false).

Numeric Datatype Helper Functions


For fixed-point numeric datatypes, the Netezza system stores numeric data as integers
using three different value sizes: 32-bit, 64-bit, and 128-bit. The storage size depends
upon the numeric precision, or the total number of digits before and after the decimal
point. Numeric values can have up to 38 digits in total.
For example, Netezza stores the value 129.456 as the integer 129456 using a
numeric(6,3) datatype. The precision (6) specifies the number of digits in the numeric,
and the scale (3) specifies the number of digits that follow the decimal point.
You can use the following helper functions to convert numeric datatypes to different precisions and/or scales, and also to different storage sizes. In addition, there is a helper
function that you can use to verify whether an input numeric value has 38 digits or less.
Note: These helper functions are contained within the global namespace.
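The relationship between precision, storage size, and the stored integer can be illustrated with a short sketch. Both functions are hypothetical and are not part of the helper API. The sketch covers only an increase in scale, because the rounding behavior of the conversion helpers when the scale decreases is not described here.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical helper: returns the storage width, in bits, that corresponds to
// a numeric precision (1 to 9 -> 32, 10 to 18 -> 64, 19 to 38 -> 128).
int storageBitsForPrecision(int precision)
{
    if (precision < 1 || precision > 38)
        throw std::out_of_range("precision must be in the range 1 to 38");
    if (precision <= 9)  return 32;
    if (precision <= 18) return 64;
    return 128;
}

// Hypothetical helper: rescales a stored integer to a larger scale by
// multiplying by a power of ten. Overflow checking is omitted.
int64_t rescaleUp(int64_t stored, int curScale, int newScale)
{
    if (newScale < curScale)
        throw std::invalid_argument("this sketch only increases the scale");
    for (int i = curScale; i < newScale; ++i)
        stored *= 10;
    return stored;
}
```

For the value 129.456 stored as 129456 in a numeric(6,3), the storage width is 32 bits, and rescaling to a scale of 5 gives the stored integer 12945600.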

convertNumeric32
Converts an input numeric value from its current storage size, precision, and scale to a 32-bit numeric with a new precision and scale.

Description
The function has three forms of syntax:
int32 convertNumeric32(int32 value, int curPrec, int curScale, int
desiredPrec, int desiredScale);
int32 convertNumeric32(int64 value, int curPrec, int curScale, int
desiredPrec, int desiredScale);
int32 convertNumeric32(CNumeric128 value, int curPrec, int curScale,
int desiredPrec, int desiredScale);

value specifies the integer part of the input numeric, which can be either a 32-bit, 64-bit,
or 128-bit value.
curPrec and curScale specify the current precision and scale for the input numeric specified in value.
desiredPrec and desiredScale specify the new precision and scale for the converted
numeric. For a 32-bit numeric, the desiredPrec value can range from 1 to 9, and the
desiredScale value can range from 0 to (9-desiredPrec). If your desired precision is in the
range of 10 to 18, use the convertNumeric64 function, or if the desired precision is in the
range of 19-38, use the convertNumeric128 function. This helps to ensure that you select
the right storage size for the resulting integer part of the numeric.

Returns
The function returns a 32-bit integer that is compatible with the desired precision and
scale values.

Throws
The function throws the following exceptions:

"Numeric value out of range" usually indicates that the input value is outside
the range of the current precision and scale values, or outside the range of a 128-bit
integer.

"overflow in 128-bit arithmetic", for the convertNumeric32 and
convertNumeric64 functions, indicates that the precision is larger than the datatype
supports.

"%d: precision range error, exponent" usually indicates that a precision
value is greater than the datatype supports, or that the precision is greater than the
maximum of 38 for a 128-bit integer.

convertNumeric64
Converts an input numeric value from its current storage size, precision, and scale to a 64-bit numeric with a new precision and scale.

Description
The function has three forms of syntax:
int64 convertNumeric64(int32 value, int curPrec, int curScale, int
desiredPrec, int desiredScale);
int64 convertNumeric64(int64 value, int curPrec, int curScale, int
desiredPrec, int desiredScale);
int64 convertNumeric64(CNumeric128 value, int curPrec, int curScale,
int desiredPrec, int desiredScale);

value specifies the integer part of the input numeric, which can be either a 32-bit, 64-bit,
or 128-bit value.
curPrec and curScale specify the current precision and scale for the input numeric specified in value.
desiredPrec and desiredScale specify the new precision and scale for the converted
numeric value. For a 64-bit numeric, the desiredPrec value can range from 10 to 18, and
the desiredScale value can range from 0 to (18-desiredPrec). If your desired precision is in
the range of 1 to 9, use the convertNumeric32 function, or if the desired precision is in the
range of 19-38, use the convertNumeric128 function. This helps to ensure that you select
the right storage size for the resulting integer portion of the numeric.

Returns
The function returns a 64-bit integer that is compatible with the desired precision and
scale values.

Throws
For a description of the exceptions, see the exceptions for convertNumeric32.

convertNumeric128
Converts an input numeric value from its current storage size, precision, and scale to a
128-bit numeric with a new precision and scale.

Description
The function has three forms of syntax:
CNumeric128 convertNumeric128(int32 value, int curPrec, int curScale,
int desiredPrec, int desiredScale);
CNumeric128 convertNumeric128(int64 value, int curPrec, int curScale,
int desiredPrec, int desiredScale);
CNumeric128 convertNumeric128(CNumeric128 value, int curPrec, int
curScale, int desiredPrec, int desiredScale);

value specifies the integer part of the input numeric, which can be either a 32-bit, 64-bit,
or 128-bit value.
curPrec and curScale specify the current precision and scale for the input numeric value.
desiredPrec and desiredScale specify the new precision and scale for the converted
numeric value. For a 128-bit numeric, the desiredPrec value can range from 19 to 38, and
the desiredScale value can range from 0 to (38-desiredPrec). If your desired precision is in
the range of 1 to 9, use the convertNumeric32 function, or if the desired precision is in the
range of 10 to 18, use the convertNumeric64 function. This helps to ensure that you select
the right storage size for the resulting integer portion of the numeric.

Returns
The function returns a 128-bit integer that is compatible with the desired precision and
scale values.

Throws
For a description of the exceptions, see the exceptions for convertNumeric32.

CheckPrecision38Limit
Verifies that an input numeric value is within the 38-digit limit of a numeric. Netezza storage limits require that a numeric cannot have more than 38 combined digits before and
after the decimal point.

Description
The function has the following syntax:
CNumeric128 const& CheckPrecision38Limit(CNumeric128 const& value)

value specifies a fixed-point numeric value.

Returns
The function returns a reference to the input 128-bit numeric value or throws an error.

Throws
The function throws the error Numeric value requires more than 38 digits if the input
numeric value has more than 38 digits.

UTF-8 Datatype Helper Functions


This section describes the UTF-8 (8-bit UCS/Unicode Transformation Format) datatype
helper API functions. These functions can help you to manage NCHAR and NVARCHAR
datatypes.

UTF8CharCount
Returns a quick UTF-8 character count of a string.

Description
The function has the following syntax:
inline int UTF8CharCount(const char* bytes, int length)

bytes specifies the string of UTF-8 characters, which cannot be null-terminated.


length specifies the number of bytes to review.

Returns
The function returns the number of UTF-8 characters. The result may be indeterminate if
bytes is not a valid UTF-8 string. As a best practice, use the isValidUTF8 helper function to
confirm that the string is composed of valid UTF-8 characters before you call this function
to count the characters.

Throws
The function throws an opaque exception object if length < 0 or if bytes
is NULL.
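A quick character count of this kind is commonly done by counting every byte that is not a UTF-8 continuation byte. The following sketch shows that technique under the name utf8CharCountSketch, which is hypothetical; whether UTF8CharCount uses the same approach is an assumption. Like the helper, the sketch gives a meaningful answer only for valid UTF-8 input.

```cpp
#include <stdexcept>

// Hypothetical stand-in: counts characters by skipping continuation bytes,
// which always have the bit pattern 10xxxxxx.
int utf8CharCountSketch(const char* bytes, int length)
{
    if (bytes == nullptr || length < 0)
        throw std::invalid_argument("bytes is NULL or length is negative");

    int count = 0;
    for (int i = 0; i < length; ++i) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        if ((b & 0xC0) != 0x80)
            ++count;  // lead byte or single-byte character
    }
    return count;
}
```

Because the loop never decodes a character, its cost is one comparison per byte, but malformed input is counted without complaint.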

isValidUTF8
Checks if a given string represents valid UTF-8 characters.

Description
The function has the following syntax:
inline bool isValidUTF8(const char* bytes, int length,
int* charLength = NULL)

bytes specifies the string of UTF-8 characters, which cannot be null-terminated.


length specifies the number of bytes to review.
charLength is an optional argument. If you specify this argument and the bytes argument is
valid, charLength is set to the number of UTF-8 characters in the bytes string. If you specify this argument and the bytes argument is not valid, charLength is set to -1.

Returns
The function returns true if length is 0 or bytes[0...length-1] is a valid UTF8 string. Otherwise, the function returns false.

Throws
The function throws an opaque exception object if bytes is NULL or if length < 0.
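To make the validity check concrete, the following sketch validates a byte range against the usual UTF-8 well-formedness rules: correct lead and continuation bytes, no overlong encodings, no UTF-16 surrogates, and no code points above U+10FFFF. The name isValidUtf8Sketch is hypothetical, and the exact set of rules that isValidUTF8 enforces is an assumption.

```cpp
#include <stdexcept>

// Hypothetical stand-in for illustration. Returns true when bytes[0..length-1]
// is well-formed UTF-8 or length is 0. If charLength is not NULL, it receives
// the character count on success or -1 on failure.
bool isValidUtf8Sketch(const char* bytes, int length, int* charLength = nullptr)
{
    if (bytes == nullptr || length < 0)
        throw std::invalid_argument("bytes is NULL or length is negative");

    int count = 0;
    int i = 0;
    bool ok = true;
    while (ok && i < length) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        int extra = 0;           // continuation bytes expected after the lead byte
        unsigned int cp = 0;     // code point being decoded
        unsigned int minCp = 0;  // smallest code point legal for this sequence length

        if (lead < 0x80)                { extra = 0; cp = lead;        minCp = 0; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; minCp = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; minCp = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; minCp = 0x10000; }
        else { ok = false; break; }  // stray continuation byte or invalid lead byte

        if (i + extra >= length) { ok = false; break; }  // sequence is truncated

        for (int k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) break;

        // Reject overlong encodings, surrogates, and values past U+10FFFF.
        if (cp < minCp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            ok = false;
            break;
        }
        i += extra + 1;
        ++count;
    }
    if (charLength != nullptr) *charLength = ok ? count : -1;
    return ok;
}
```

A caller would typically run a check like this once on an NCHAR or NVARCHAR argument and then reuse the returned character count instead of counting again.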

APPENDIX

UDX Datatypes Reference Information


This appendix includes the following reference information that supports the C++ program-development process:

Supported Data Types

UDX Arguments

Logging Methods

Memory Management Methods

UDA State Arguments

UDX Return Value Macros

UDX Environment Methods

UDF Sizer Methods

UDTF Shaper Methods

UDTF Column Return Methods

Supported Data Types


This section describes the supported data types for your UDFs, UDAs, and UDTFs. The data
types include the following:

char

nchar

nvarchar

varchar

boolean

date

time

time with time zone

numeric

real

double precision

interval

integer

bigint

smallint

byteint

timestamp

char
DDL info: CHAR(n)
C++ info: UdxBase::UDX_FIXED
struct StringArg
{
    char* data;
    int length;     // Bytes used by string data (not characters).
    int dec_length; // Character declared length ie NCHAR(20) = 20. Will
                    // also be set if VARCHAR or NVARCHAR.
};
struct StringReturn
{
    char* data;
    int size; // On enter it is the size (in bytes) allocated for string data.
              // On return it is the size (in bytes) actually used.
};

The char type often has implicit spaces at the end when passed as an argument. The difference between the specified length and the dec_length indicates how many trailing spaces
must be accounted for. Note that length is in bytes and dec_length is in characters.
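For a CHAR argument that holds only single-byte characters, the trailing spaces can be restored as in the following sketch. The StringArg definition is a local copy of the layout shown above so that the example compiles on its own, and paddedCharValue is a hypothetical helper. Multibyte data would need a character count in place of length.

```cpp
#include <string>

// Local copy of the StringArg layout, for a self-contained example.
struct StringArg
{
    char* data;
    int length;      // bytes used by the string data
    int dec_length;  // declared length in characters
};

// Hypothetical helper: rebuilds the full CHAR(n) value, including the implicit
// trailing spaces. Assumes one byte per character.
std::string paddedCharValue(const StringArg& arg)
{
    std::string value(arg.data, arg.length);
    const int pad = arg.dec_length - arg.length;
    if (pad > 0)
        value.append(pad, ' ');
    return value;
}
```

For a CHAR(5) argument that arrives with length 3 and data "abc", the helper returns "abc" followed by two spaces.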

nchar
DDL info: NCHAR(n)
C++ info: UdxBase::UDX_NATIONAL_FIXED
struct StringArg
{
    char* data;
    int length;     // Bytes used by string data (not characters).
    int dec_length; // Character declared length ie NCHAR(20) = 20. Will
                    // also be set if VARCHAR or NVARCHAR.
};
struct StringReturn
{
    char* data;
    int size; // On enter it is the size (in bytes) allocated for string data.
              // On return it is the size (in bytes) actually used.
};

The nchar type often has implicit spaces at the end when passed as an argument. The difference between the specified length and the dec_length indicates how many trailing
spaces must be accounted for. Note that length is in bytes and dec_length is in characters.

nvarchar
DDL info: NVARCHAR(n)

C++ info: UdxBase::UDX_NATIONAL_VARIABLE


struct StringArg
{
    char* data;
    int length;     // Bytes used by string data (not characters).
    int dec_length; // Character declared length ie NCHAR(20) = 20. Will
                    // also be set if VARCHAR or NVARCHAR.
};
struct StringReturn
{
    char* data;
    int size; // On enter it is the size (in bytes) allocated for string data.
              // On return it is the size (in bytes) actually used.
};

varchar
DDL info: VARCHAR(n)

C++ info: UdxBase::UDX_VARIABLE


struct StringArg
{
    char* data;
    int length;     // Bytes used by string data (not characters).
    int dec_length; // Character declared length ie NCHAR(20) = 20. Will
                    // also be set if VARCHAR or NVARCHAR.
};
struct StringReturn
{
    char* data;
    int size; // On enter it is the size (in bytes) allocated for string data.
              // On return it is the size (in bytes) actually used.
};

boolean
DDL info: BOOL

C++ info: UdxBase::UDX_BOOL


int8 boolval; // 1 = true, 0 = false

date
DDL info: DATE
C++ info: UdxBase::UDX_DATE
int32 date; // Day resolution spans January 1, 0001 to December
// 31, 9999 (centered around 2000-01-01).

time
DDL info: TIME

C++ info: UdxBase::UDX_TIME


int64 time; // Microsecond resolution that represents the time of
// day only (midnight to one microsecond before midnight).

time with time zone


DDL info: TIMETZ

C++ info: UdxBase::UDX_TIMETZ


struct TimeTzADT
{
int64 time;
int32 zone;
};
struct TimeTzReturn
{
TimeTzADT *value;
};

Uses the int64 time value and adds an int32 time zone as well. The time zone is represented in seconds.

numeric
DDL info: NUMERIC(p,s)

C++ info: UdxBase::UDX_NUMERIC32, UdxBase::UDX_NUMERIC64, UdxBase::UDX_NUMERIC128

struct Numeric32Val
{
CNumeric32 *value;
int precision; // Number of digits (both sides of decimal point)
int scale; // Number of decimal digits
};
struct Numeric64Val
{
CNumeric64 *value;
int precision; // Number of digits (both sides of decimal point)
int scale; // Number of decimal digits
};
struct Numeric128Val
{
CNumeric128 *value;
int precision; // Number of digits (both sides of decimal point)
int scale; // Number of decimal digits
};

The precision determines which of the three variations is used: 1 - 9 digits use the 32-bit
form (Numeric32Val), 10 - 18 digits use the 64-bit form (Numeric64Val), and 19 - 38 digits use
the 128-bit form (Numeric128Val). The scale value is necessary to determine the meaning of the
numeric, because the value is presented as an integer and the scale indicates where the decimal point is placed.
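The role of the scale can be shown by rendering a stored integer as decimal text. In this sketch, a plain 64-bit integer stands in for the CNumeric32 and CNumeric64 value types, and formatNumericSketch is a hypothetical name; neither is part of the UDX API.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: renders a stored integer and its scale as decimal text.
std::string formatNumericSketch(int64_t stored, int scale)
{
    const bool negative = stored < 0;
    // Work on the magnitude as an unsigned value so that INT64_MIN is safe.
    const uint64_t magnitude = negative ? 0 - static_cast<uint64_t>(stored)
                                        : static_cast<uint64_t>(stored);
    std::string digits = std::to_string(magnitude);

    if (scale > 0) {
        const std::size_t s = static_cast<std::size_t>(scale);
        if (digits.size() <= s)
            digits.insert(0, s + 1 - digits.size(), '0');  // pad to "0.00x" form
        digits.insert(digits.size() - s, ".");
    }
    return negative ? "-" + digits : digits;
}
```

With a scale of 3, the stored integer 129456 renders as 129.456 and the stored integer 5 renders as 0.005.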

real
DDL info: FLOAT4

C++ info: UdxBase::UDX_FLOAT


float floatval;

double precision
DDL info: FLOAT8

C++ info: UdxBase::UDX_DOUBLE


double dblval;

interval
DDL info: INTERVAL

C++ info: UdxBase::UDX_INTERVAL


struct Interval
{
    int64 time;
    int32 month;
};
struct IntervalReturn
{
    Interval *value;
};

It has microsecond resolution and a range of +/- 178000000 years. The time part represents everything except months and years, in microseconds, and the month part represents
months and years.
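
The two-part layout can be pictured with the following sketch. It assumes that the month field holds a total month count (so 14 means one year and two months) and that days are carried in the time field; both are assumptions drawn from the description above, not confirmed details. IntervalParts, IntervalBreakdown, and splitInterval are hypothetical names for this example.

```cpp
#include <cstdint>

// Stand-in with the same two fields as the Interval struct shown above.
struct IntervalParts
{
    int64_t time;   // microseconds (everything except months and years)
    int32_t month;  // assumed: total months, including whole years
};

// Hypothetical result type for this example.
struct IntervalBreakdown
{
    int years;
    int months;
    int64_t days;
    int64_t microsOfDay;
};

// Splits an interval into years, months, days, and leftover microseconds.
// Integer division truncates toward zero, so negative intervals produce
// negative (or zero) fields.
IntervalBreakdown splitInterval(const IntervalParts& iv)
{
    const int64_t kMicrosPerDay = 86400LL * 1000000LL;
    IntervalBreakdown b;
    b.years = iv.month / 12;
    b.months = iv.month % 12;
    b.days = iv.time / kMicrosPerDay;
    b.microsOfDay = iv.time % kMicrosPerDay;
    return b;
}
```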

integer
DDL info: INT4

C++ info: UdxBase::UDX_INT32


int32 intValue;

bigint
DDL info: INT8

C++ info: UdxBase::UDX_INT64


int64 bigInt;

smallint
DDL info: INT2
C++ info: UdxBase::UDX_INT16
int16 smallInt;

byteint
DDL info: INT1

C++ info: UdxBase::UDX_INT8


int8 byteInt;

timestamp
DDL info: TIMESTAMP
C++ info: UdxBase::UDX_TIMESTAMP
int64 timestamp;

The value represents the number of microseconds since midnight 2000-01-01. It can
be positive or negative; the low value is January 1, 0001 and the high value is December
31, 9999.
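
Because the TIMESTAMP origin is 2000-01-01 rather than the 1970-01-01 origin used by many C library calls, a conversion is sometimes useful. The following sketch shows the arithmetic only. The function names are hypothetical, and the sketch assumes that both values are expressed in the same time zone frame, which this appendix does not specify.

```cpp
#include <cstdint>

// Seconds between 1970-01-01 00:00:00 and 2000-01-01 00:00:00:
// 30 years with 7 leap days = 10957 days * 86400 seconds.
const int64_t kEpochOffsetSeconds = 946684800LL;
const int64_t kMicrosPerSecond = 1000000LL;

// Hypothetical helper: TIMESTAMP microseconds -> microseconds since 1970.
int64_t timestampToUnixMicros(int64_t ts)
{
    return ts + kEpochOffsetSeconds * kMicrosPerSecond;
}

// Hypothetical helper: microseconds since 1970 -> TIMESTAMP microseconds.
int64_t unixMicrosToTimestamp(int64_t unixMicros)
{
    return unixMicros - kEpochOffsetSeconds * kMicrosPerSecond;
}
```

A TIMESTAMP value of 0 therefore corresponds to 946684800 seconds after the 1970 origin, and values before 2000-01-01 are negative.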


UDX Arguments
You can use the following methods to access arguments in your UDX code:
bool isArgNull(int n)
int argType(int n)
bool isArgConst(int n)
int numArgs()
int64 timestampArg(int n)
int64 timeArg(int n)
int32 dateArg(int n)
bool boolArg(int n)
int64 int64Arg(int n)
int32 int32Arg(int n)
int16 int16Arg(int n)
int8 int8Arg(int n)
double doubleArg(int n)
float floatArg(int n)
struct Interval* intervalArg(int n)
TimeTzADT* timetzArg(int n)
Numeric128Val* numeric128Arg(int n)
Numeric64Val* numeric64Arg(int n)
Numeric32Val* numeric32Arg(int n)
StringArg* stringArg(int n)
int stringArgSize(int n)
int numericArgPrecision(int n)
int numericArgScale(int n)

Logging Methods
You can use the following message logging methods in your UDXs:
bool isLoggingEnabled(int8 val);
int8 getLogMask();
void logMsg(int8 flags, const char* fmt, ...);

The isLoggingEnabled and getLogMask methods are available only in UDXs that use API
Version 2.

Memory Management Methods


You can use the following memory management method in your UDXs:
int getMemory();


UDA State Arguments


The following values are valid for UDA state arguments and can be used in the finalResult()
method of the UDA:
int numStateVars()
int stateType(int n)
bool setStateNull(int n, bool val)
bool isStateNull(int n)
StringArg* stringState(int n)
Numeric32Val* numeric32State(int n)
Numeric64Val* numeric64State(int n)
Numeric128Val* numeric128State(int n)
TimeTzADT* timetzState(int n)
struct Interval* intervalState(int n)
float* floatState(int n)
double* doubleState(int n)
int8* int8State(int n)
int16* int16State(int n)
int32* int32State(int n)
int64* int64State(int n)
bool* boolState(int n)
int32* dateState(int n)
int64* timeState(int n)
int64* timestampState(int n)

UDX Return Value Macros


There are macros that you can use to ensure that return values from the evaluate() and
finalResult() methods are valid values. These macros are defined in the
/nz/kit/sys/include/udxbase.h header file.
int returnType()
IntervalReturn* intervalReturnInfo()
TimeTzReturn* timetzReturnInfo()
StringReturn* stringReturnInfo()
Numeric128Val* numeric128ReturnInfo()
Numeric64Val* numeric64ReturnInfo()
Numeric32Val* numeric32ReturnInfo()
void setReturnNull(bool val)
NZ_UDX_RETURN_NULL()
NZ_UDX_RETURN_STRING(x)
NZ_UDX_RETURN_BOOL(x)
NZ_UDX_RETURN_DATE(x)
NZ_UDX_RETURN_TIME(x)
NZ_UDX_RETURN_TIMETZ(x)
NZ_UDX_RETURN_NUMERIC32(x)
NZ_UDX_RETURN_NUMERIC64(x)
NZ_UDX_RETURN_NUMERIC128(x)
NZ_UDX_RETURN_FLOAT(x)
NZ_UDX_RETURN_DOUBLE(x)
NZ_UDX_RETURN_INTERVAL(x)
NZ_UDX_RETURN_INT64(x)
NZ_UDX_RETURN_INT32(x)
NZ_UDX_RETURN_INT16(x)
NZ_UDX_RETURN_INT8(x)
NZ_UDX_RETURN_TIMESTAMP(x)

UDX Environment Methods


The following methods operate on the API version 2 UdxEnvironment object to obtain information about the number of environment entries as well as their name and index value:
int getNumEntries();
int findEntry(const char* strKey);
const UdxEnvironmentEntry* getEntry(int idx);
const UdxEnvironmentEntry* getEntry(const char* strKey);

You can use the following methods to obtain information from UdxEnvironmentEntry :
const char* getKey();
const char* getValue();

For more information, see UDX Environment on page 2-7.

UDF Sizer Methods


You can use the following methods in UDFs that use a generic return value to calculate
the size of the return value:
int sizeReturnType();
int numSizerArgs();
int sizerArgType(int n);
int sizerNumericArgPrecision(int n);
int sizerNumericArgScale(int n);
uint64 sizerStringSizeValue(int len);
uint64 sizerNumericSizeValue (int prec, int scale);
bool isSizerArgConstant(int n);
int32 sizerGetConstantArg(int n);
virtual uint64 calculateSize();


UDTF Shaper Methods


The calculateShape() method defines the shape and content of a return table for UDTFs:
calculateShape(UdxOutputShaper *shaper);

The following methods operate on the UdxOutputShaper object to define an output column:
void addOutputColumn(int nType, const char* strName, int nSize);
void addOutputColumn(int nType, const char* strName, int precision,
int scale);
void addOutputColumn(int nType, const char* strName);

The following methods operate on the UdxOutputShaper object to obtain information about
the return columns and system casing:
int numOutputColumns();
const UdxColumnInfo* getOutputColumn(int n);
bool isSystemCaseUpper();

You can use the following methods on the UdxColumnInfo object to obtain information
about a column:
int getType();
int getSize();
int getPrecision();
int getScale();
const char* getName();

You can use the following methods on the shaper object to get input arguments and input
metadata. Many of these methods work the same way as the standard UDX Arguments
methods.
int numArgs();
int argType(int n);
int stringArgSize(int n);
int numericArgPrecision(int n);
int numericArgScale(int n);
bool isArgConst(int n);
bool isArgNull(int n);
Numeric32Val* numeric32Arg(int n);
Numeric64Val* numeric64Arg(int n);
Numeric128Val* numeric128Arg(int n);
StringArg* stringArg(int n);
TimeTzADT* timetzArg(int n);
struct Interval* intervalArg(int n);
bool boolArg(int n);
int32 dateArg(int n);
int64 timeArg(int n);
int64 timestampArg(int n);
int8 int8Arg(int n);
int16 int16Arg(int n);
int32 int32Arg(int n);
int64 int64Arg(int n);
float floatArg(int n);
double doubleArg(int n);

For more information on generic UDTFs, see Registering Generic Return Type UDTFs on
page 3-12.

UDTF Column Return Methods


You can use the following methods to return multiple columns in a UDTF:
int numReturnColumns();
int returnTypeColumn(int n);
void setReturnColumnNull(int n, bool val);
bool isReturnColumnNull(int n);
StringReturn* stringReturnColumn(int n);
Numeric32Val* numeric32ReturnColumn(int n);
Numeric64Val* numeric64ReturnColumn(int n);
Numeric128Val* numeric128ReturnColumn(int n);
TimeTzADT* timetzReturnColumn(int n);
struct Interval* intervalReturnColumn(int n);
float* floatReturnColumn(int n);
double* doubleReturnColumn(int n);
int8* int8ReturnColumn(int n);
int16* int16ReturnColumn(int n);
int32* int32ReturnColumn(int n);
int64* int64ReturnColumn(int n);
bool* boolReturnColumn(int n);
int32* dateReturnColumn(int n);
int64* timeReturnColumn(int n);
int64* timestampReturnColumn(int n);


APPENDIX

Using UDXs with Stored Procedures


What's in this appendix
Using UDXs to Extend the NZPLSQL Language

This appendix describes some advanced development topics that show how user-defined
functions and aggregates can be used with stored procedures and to extend the NZPLSQL
language. For general information about creating and using stored procedures, refer to the
IBM Netezza Stored Procedures Developers Guide.

Using UDXs to Extend the NZPLSQL Language


Users create stored procedures by writing applications that use the NZPLSQL language.
NZPLSQL is an interpreted language that is based on the Postgres PL/pgSQL language and
is designed for the Netezza host environment.
You can extend the NZPLSQL language with the C++ UDF functionality of the user-defined
functions feature. These UDFs must be invoked using SQL that is structured in such a way
as to ensure that the UDFs always run on the Netezza host inside Postgres. Since the UDFs
are restricted to the host, they can take advantage of the full range of LIBC functions, features, and other standard libraries which are not present on the Nucleus-based SPUs.
These UDFs must also be registered to run in unfenced mode.
Note: Table 6-1 on page 6-11 describes the recommended set of LIBC supported libraries
for UDFs that can run on the SPUs as well as the host. This is a subset of the full LIBC
library set.
To ensure that your language-extension UDFs run only on the Netezza host inside Postgres,
you use the getCurrentLocus() function. This function detects the locus or place of execution of the UDF. Design your UDFs to throw an exception if the return value is anything
other than UDX_LOCUS_POSTGRES. (The other possible values are UDX_LOCUS_DBOS or
UDX_LOCUS_SPU.)
As an example, the following UDF C++ file named dir.cpp defines three functions called
OpenDir, ReadDir and CloseDir (based on the LIBC functions), which can be used to open a
local directory on the host, read its contents, and close the connection to the directory.
These functions will be defined so that they can be called from NZPLSQL, and must run in
unfenced mode to perform these actions. The dir.cpp file follows:
#include <udxinc.h>
#include <dirent.h>
#include <sys/types.h>
#include <string.h>
#include <errno.h>


using namespace nz::udx;


class OpenDir : public Udf
{
public:
static Udf* instantiate();
ReturnValue evaluate() {
if (getCurrentLocus() != UDX_LOCUS_POSTGRES)
throwUdxException("opendir only supported in frontend");
#ifndef FOR_SPU
if (isArgNull(0))
NZ_UDX_RETURN_NULL();
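StringArg *arg = stringArg(0);
char path[2048];
// Reject paths that would overflow the local buffer.
if (arg->length >= (int)sizeof(path))
throwUdxException("opendir path too long");
memcpy(path, arg->data, arg->length);
path[arg->length] = 0;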
DIR *dir = opendir(path);
if (dir == NULL) {
char format[2500];
sprintf(format, "Can't open dir %s: %s", path,
strerror(errno));
throwUdxException(format);
}
NZ_UDX_RETURN_INT32((int32)dir);
#endif
}
};
Udf* OpenDir::instantiate()
{
return new OpenDir;
}
class ReadDir : public Udf
{
public:
static Udf* instantiate();
ReturnValue evaluate() {
if (getCurrentLocus() != UDX_LOCUS_POSTGRES)
throwUdxException("readdir only supported in frontend");
#ifndef FOR_SPU
if (isArgNull(0))
NZ_UDX_RETURN_NULL();
int32 arg = int32Arg(0);
DIR *dir = (DIR*)arg;
struct dirent *dp;
dp = readdir(dir);
if (dp == NULL)
NZ_UDX_RETURN_NULL();
StringReturn* info = stringReturnInfo();
memcpy(info->data, dp->d_name, strlen(dp->d_name));
info->size = strlen(dp->d_name);
NZ_UDX_RETURN_STRING(info);
#endif
}
};
Udf* ReadDir::instantiate()
{
return new ReadDir;
}
class CloseDir : public Udf
{
public:
static Udf* instantiate();
ReturnValue evaluate() {
if (getCurrentLocus() != UDX_LOCUS_POSTGRES)
throwUdxException("closedir only supported in frontend");
#ifndef FOR_SPU
if (isArgNull(0))
NZ_UDX_RETURN_NULL();
int32 arg = int32Arg(0);
DIR *dir = (DIR*)arg;
closedir(dir);
NZ_UDX_RETURN_BOOL(true);
#endif
}
};
Udf* CloseDir::instantiate()
{
return new CloseDir;
}

Note: The sample dir.cpp file stores several pointers into an int32 field. If the Netezza
operating system changes to a 64-bit version in the future, note that these pointers would
have to switch to use int64 instead.
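
One way to avoid depending on the pointer width is to hand out small integer handles and keep the real pointers in a lookup table. The following is a minimal sketch of that idea, not part of the Netezza API: HandleTable is a hypothetical helper, and whether such a table would persist between UDF invocations in a session is an assumption that would need to be verified on the target system.

```cpp
#include <cstdint>
#include <map>

// Hypothetical helper: maps 32-bit handles to pointers of any width, so
// that the value passed through an INT4 never has to hold a raw pointer.
template <typename T>
class HandleTable
{
public:
    HandleTable() : m_next(1) {}

    // Stores ptr and returns a nonzero handle that fits in an INT4.
    int32_t add(T* ptr)
    {
        int32_t h = m_next++;
        m_table[h] = ptr;
        return h;
    }

    // Returns the pointer for handle h, or a null pointer if h is unknown.
    T* get(int32_t h) const
    {
        typename std::map<int32_t, T*>::const_iterator it = m_table.find(h);
        return it == m_table.end() ? nullptr : it->second;
    }

    // Forgets handle h; returns false if it was not registered.
    bool remove(int32_t h)
    {
        return m_table.erase(h) == 1;
    }

private:
    int32_t m_next;
    std::map<int32_t, T*> m_table;
};
```

A lookup that fails returns a null pointer, which also gives the UDF a chance to reject a stale or invalid handle instead of dereferencing an arbitrary address.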
You can compile and register the three UDFs in the dir.cpp file using the following three
commands or using CREATE OR REPLACE FUNCTION commands:
nzudxcompile dir.cpp --sig "opendir(varchar(any))" --return int4
--class OpenDir --unfenced
nzudxcompile dir.cpp --sig "readdir(int4)" --return "varchar(512)"
--class ReadDir --unfenced
nzudxcompile dir.cpp --sig "closedir(int4)" --return "bool"
--class CloseDir --unfenced

Then, create a stored procedure similar to the following:


DEV(MYUSER)=> CREATE OR REPLACE PROCEDURE sp_listdirs01() RETURNS BOOL
LANGUAGE NZPLSQL AS
BEGIN_PROC
DECLARE
dirp int4;
nm varchar(512);
cl bool;
dir varchar(1024);
num int4;
r record;
BEGIN
select count(*) INTO num from _t_object where upper(objname) =
'SORTER' and objclass = 4905 and objdb = current_db;
IF num = 1 THEN
DROP TABLE SORTER;
END IF;
CREATE TABLE SORTER (grp int4, name varchar(2000));
dir := '/tmp/udx_known';
dirp := opendir(dir);
LOOP
nm = readdir(dirp);
exit when nm is null;
EXECUTE IMMEDIATE 'INSERT INTO SORTER VALUES (1, ' ||
quote_literal(nm) || ')';
END LOOP;
FOR r in SELECT name from sorter order by name LOOP
RAISE NOTICE 'got %/%', dir, r.name;
END LOOP;
cl = closedir(dirp);
DROP TABLE SORTER;
RETURN cl;
EXCEPTION WHEN OTHERS THEN
IF dirp is not NULL THEN
cl = closedir(dirp);
RETURN cl;
END IF;
END;
END_PROC;

The sample procedure calls the new UDFs opendir(), readdir(), and closedir() to operate on
a directory named /tmp/udx_known. As an example, if udx_known contains the dir.cpp program and the object files from nzudxcompile, a sample sp_listdirs01() call returns the
following information:
DEV(MYUSER)=> CALL sp_listdirs01();
call sp_listdirs01();
NOTICE: got /tmp/udx_known/.
NOTICE: got /tmp/udx_known/..
NOTICE: got /tmp/udx_known/dir.cpp
NOTICE: got /tmp/udx_known/dir.o_diab_ppc
NOTICE: got /tmp/udx_known/dir.o_ppc
NOTICE: got /tmp/udx_known/dir.o_x86
SP_LISTDIRS01
---------------
t
(1 row)

If you attempt to run any of the UDFs OpenDir, ReadDir, or CloseDir on the SPUs or in
DBOS, the Netezza system reports an error similar to the following:
DEV(MYUSER)=> SELECT readdir(grp) FROM customers;
ERROR: readdir only supported in frontend


APPENDIX

Sample User-Defined Functions and Aggregates Reference


What's in this appendix
Sample User-Defined Functions
Sample User-Defined Aggregates
Sample User-Defined Table Function

This appendix contains some sample user-defined functions and aggregates. In addition to
the examples in this appendix, other samples are available on the Demo page of the NDN
Developers web site at https://developer.netezza.com.

Sample User-Defined Functions


The following sections provide some examples of UDFs.

Generic UDF Example


The following UDF, var_concat, takes two input VARCHAR strings of any length, concatenates them, and returns the combined VARCHAR string, which is also declared as a generic
return value to support the combined string length.
The function can also concatenate national character strings (NVARCHAR). Within the
sample, note the additional registration command definition to create an nvar_concat UDF.
/**
* UDF var_concat(Xvarchar(any), Xvarchar(any)) -> Xvarchar(any)
*
* COMPILATION:
nzudxcompile UDX_Concat.cpp -o /tmp/udx_test/UDX_Concat.o
*
* REGISTRATION:
CREATE OR REPLACE FUNCTION var_concat(VARCHAR(ANY), VARCHAR(ANY))
RETURNS VARCHAR(ANY)
LANGUAGE CPP
PARAMETER STYLE NPSGENERIC
CALLED ON NULL INPUT
NOT DETERMINISTIC
EXTERNAL CLASS NAME 'Concat'
EXTERNAL HOST OBJECT '/tmp/udx_test/UDX_Concat.o_x86'
EXTERNAL SPU OBJECT '/tmp/udx_test/UDX_Concat.o_spu10';
CREATE OR REPLACE FUNCTION nvar_concat(NVARCHAR(ANY), NVARCHAR(ANY))
RETURNS NVARCHAR(ANY)
LANGUAGE CPP
PARAMETER STYLE NPSGENERIC
CALLED ON NULL INPUT
NOT DETERMINISTIC
EXTERNAL CLASS NAME 'Concat'
EXTERNAL HOST OBJECT '/tmp/udx_test/UDX_Concat.o_x86'
EXTERNAL SPU OBJECT '/tmp/udx_test/UDX_Concat.o_spu10';
*
* USAGE:
select var_concat('str1','str2');
*
* Copyright (c) 2007-2009 Netezza Corporation, an IBM Company
* All rights reserved.
*/
#include "udxinc.h"
#include <string.h>
using namespace nz::udx;
class Concat: public Udf
{
public:
static Udf* instantiate();
inline bool isValidArgType(int at) const
{
return at==UDX_FIXED||at==UDX_VARIABLE||at==UDX_NATIONAL_FIXED||
at==UDX_NATIONAL_VARIABLE;
}
virtual ReturnValue evaluate()
{
if(numArgs()!=2)
{
throwUdxException("var_concat number of arguments is not 2");
}
if (isArgNull(0))
NZ_UDX_RETURN_NULL();
if (isArgNull(1))
NZ_UDX_RETURN_NULL();
setReturnNull(false);
int argType0=argType(0);
int argType1=argType(1);
if(isValidArgType(argType0)&&isValidArgType(argType1))
{
StringArg *a = stringArg(0);
StringArg *b = stringArg(1);
StringReturn *ret = stringReturnInfo();
ret->size=a->length+b->length;
memcpy(ret->data,a->data, a->length);
memcpy(ret->data+a->length, b->data, b->length);
NZ_UDX_RETURN_STRING(ret);
}
else
{
throwUdxException("Datatype mismatch.");
}
}
virtual uint64 calculateSize() const
{
int argType0=sizerArgType(0);
int argType1=sizerArgType(1);
if(isValidArgType(argType0)&&isValidArgType(argType1))
{
return sizerStringSizeValue(sizerStringArgSize(0)
+sizerStringArgSize(1));
}
else
{
throwUdxException("Datatype mismatch.");
}
}
};
Udf* Concat::instantiate()
{
return new Concat;
}

Binary to Hexadecimal Converter Functions


The following sample program defines two UDFs, bintohex and hextobin, which perform
some common binary and hexadecimal data conversions.
/*
Copyright (c) 2007-2009 Netezza Corporation, an IBM Company
All rights reserved.
Functions bintohex and hextobin allow packing and unpacking of binary
data into varchar fields.
This file must be placed in the /home/nz/osf/ directory.
create or replace function bintohex(varchar(16000)) returns
varchar(32000)
language cpp parameter style npsgeneric
EXTERNAL CLASS NAME 'CBinToHex'
EXTERNAL HOST OBJECT '/home/nz/osf/hexbin.o_x86'
EXTERNAL SPU OBJECT '/home/nz/osf/hexbin.o_spu10';
create or replace function hextobin(varchar(32000)) returns
varchar(16000)
language cpp parameter style npsgeneric
EXTERNAL CLASS NAME 'CHexToBin'
EXTERNAL HOST OBJECT '/home/nz/osf/hexbin.o_x86'
EXTERNAL SPU OBJECT '/home/nz/osf/hexbin.o_spu10';
(SIMPLE USAGE)
select bintohex(field) from a;
select bintohex('abcd') from a;
select hextobin(field) from b;
select hextobin('3A3B0F1141') from b;
(COMPLEX USAGE)
create table bintest (a varchar(100), b varchar(50));
insert into bintest values ('000102030405060708090A0B0C0D0E0F','');
update bintest set B = hextobin(A);
select bintohex(B) from bintest;
Note: select * from bintest will not properly display column B.
*/
#include "udxinc.h"
#include <string.h>
using namespace nz::udx;
class CHexToBin : public Udf
{
char convert(char inp)
{
if (inp >= '0' && inp <= '9')
return inp - '0';
if (inp >= 'a' && inp <= 'f')
return inp - 'a' + 10;
if (inp >= 'A' && inp <= 'F')
return inp - 'A' + 10;
throwUdxException("Bad Hex");
return 0;
}
public:
char *m_pBuf;
CHexToBin()
{
m_pBuf = new char[16000];
}
~CHexToBin()
{
delete[] m_pBuf;
}
static Udf* instantiate();
virtual ReturnValue evaluate()
{
StringReturn* ret = stringReturnInfo();
StringArg *input = stringArg(0);
int numbytes = input->length / 2;
if ((input->length%2) != 0)
throwUdxException("Bad Hex (Dangling)");
for (int i = 0; i < numbytes; i++)
{
char b1 = input->data[i*2];
char b2 = input->data[i*2+1];
m_pBuf[i] = convert(b1) * 16 + convert(b2);
}
ret->size = numbytes;
memcpy(ret->data, m_pBuf, numbytes);
NZ_UDX_RETURN_STRING(ret);
}
};
Udf* CHexToBin::instantiate()
{
return new CHexToBin;
}
class CBinToHex : public Udf
{
char unconvert(char inp)
{
if (inp <= 9)
return inp + '0';
return inp + 'A' - 10;
}
public:
char *m_pBuf;
CBinToHex()
{
m_pBuf = new char[32000];
}
~CBinToHex()
{
delete[] m_pBuf;
}
static Udf* instantiate();
virtual ReturnValue evaluate()
{
StringReturn* ret = stringReturnInfo();
StringArg *input = stringArg(0);
int numbytes = input->length * 2;
for (int i=0; i < input->length; i++)
{
m_pBuf[i*2] = unconvert(((unsigned char)(input->data[i]) &
0xF0) >> 4);
m_pBuf[i*2+1] = unconvert((unsigned char)(input->data[i]) &
0x0F);
}
ret->size = numbytes;
memcpy(ret->data, m_pBuf, numbytes);
NZ_UDX_RETURN_STRING(ret);
}
};
Udf* CBinToHex::instantiate()
{
return new CBinToHex;
}
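
To try the conversion logic outside a Netezza system, the same two transformations can be sketched on ordinary strings. This is a standalone illustration, not the sample itself: binToHex, hexToBin, and hexDigitValue are hypothetical names, and standard exceptions stand in for throwUdxException.

```cpp
#include <stdexcept>
#include <string>

// Returns the value 0-15 of one hexadecimal digit, or throws on bad input.
static int hexDigitValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    throw std::invalid_argument("Bad Hex");
}

// Expands each input byte into two uppercase hexadecimal digits.
std::string binToHex(const std::string& bin)
{
    static const char kDigits[] = "0123456789ABCDEF";
    std::string hex;
    hex.reserve(bin.size() * 2);
    for (size_t i = 0; i < bin.size(); ++i) {
        unsigned char byte = static_cast<unsigned char>(bin[i]);
        hex.push_back(kDigits[byte >> 4]);
        hex.push_back(kDigits[byte & 0x0F]);
    }
    return hex;
}

// Packs each pair of hexadecimal digits into one output byte.
std::string hexToBin(const std::string& hex)
{
    if (hex.size() % 2 != 0)
        throw std::invalid_argument("Bad Hex (Dangling)");
    std::string bin;
    bin.reserve(hex.size() / 2);
    for (size_t i = 0; i < hex.size(); i += 2) {
        int value = hexDigitValue(hex[i]) * 16 + hexDigitValue(hex[i + 1]);
        bin.push_back(static_cast<char>(value));
    }
    return bin;
}
```

Converting a value to binary and back returns the original hexadecimal text in uppercase, which is a convenient round-trip check.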


Business Hours Verification Function


The following sample function isBusinessHours verifies that a timestamp reflects a time
that is in the business hours range of 9:00AM to 6:00PM, Monday through Friday. If the
time is within the range, the function returns a value of true. Otherwise, it returns a value
of false. This example uses the datatype helpers API to verify values and decode the timestamps for verification against constants that use more common developer-style values.
/*
Copyright (c) 2007-2009 Netezza Corporation, an IBM Company
All rights reserved.
Function isBusinessHours takes a timestamp and returns true if the
timestamp is between 9 AM and 6 PM on a weekday. Returns false
otherwise. Returns NULL on NULL input.
REGISTRATION:
create or replace function
isBusinessHours(timestamp)
returns bool
language cpp
parameter style npsgeneric
returns null on null input
EXTERNAL CLASS NAME 'IsBusinessHours'
EXTERNAL HOST OBJECT '[host .o_x86 file]'
EXTERNAL SPU OBJECT '[SPU .o_spu10 file]'
USAGE:
create table times (c1 timestamp);
insert into times values ('12/15/2000, 1:50:00 PM');
insert into times values ('3/10/2008, 4:30:05.32 PM');
insert into times values ('3/10/2008, 6:00 PM');
select c1, isBusinessHours(c1) from times;
*/

#include "udxinc.h"
using namespace nz::udx;
using namespace nz::udx::dthelpers;
static const uint8 WORK_START_HOUR=9;
static const uint8 WORK_END_HOUR=18;
class IsBusinessHours : public Udf
{
public:
static Udf* instantiate();
virtual ReturnValue evaluate()
{
if(isArgNull(0))
NZ_UDX_RETURN_NULL();
int64 ts = timestampArg(0);
if(!isValidTimestamp(ts)) //if this test does not pass, we won't
// be able to decode ts
throwUdxException("invalid timestamp passed");
struct tm decomp;
bool err=false;
decodeTimestamp(ts, &decomp, &err);
if(err) //if isValidTimestamp(ts) is true, err should be false,
//but better safe than sorry
throwUdxException("error decoding timestamp");


if(decomp.tm_wday==0||decomp.tm_wday==6||
decomp.tm_hour<WORK_START_HOUR||decomp.tm_hour>=WORK_END_HOUR)
NZ_UDX_RETURN_BOOL(false);
NZ_UDX_RETURN_BOOL(true);
}
private:
};
Udf* IsBusinessHours::instantiate()
{
return new IsBusinessHours;
}
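
The range check can be exercised on its own with a standard struct tm, which makes the boundary cases easy to try. The following sketch is a standalone restatement under the rule described above (Monday through Friday, from 9:00 AM up to but not including 6:00 PM); it does not use the datatype helpers API, and the function name is reused here only for illustration.

```cpp
#include <ctime>

// Returns true for Monday-Friday, 09:00:00 up to but not including 18:00:00.
// tm_wday uses 0 for Sunday and 6 for Saturday.
bool isBusinessHours(const std::tm& t)
{
    const int kStartHour = 9;
    const int kEndHour = 18;
    bool weekend = (t.tm_wday == 0 || t.tm_wday == 6);
    return !weekend && t.tm_hour >= kStartHour && t.tm_hour < kEndHour;
}
```

Under this rule, 9:30 AM on a Monday is inside the range, while exactly 6:00 PM is outside it.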

Sample User-Defined Aggregates


Several examples of UDAs follow.

PackChildren Aggregate (UDX Version 1)


/*
Copyright (c) 2007-2009 Netezza Corporation, an IBM Company
All rights reserved.
The aggregate PackChildren allows us to "pack" multiple children of one
node into a varchar (binary representation)
REGISTRATION:
create or replace aggregate PackChildren (INT4) returns VARCHAR(400)
state (VARCHAR(400))
language cpp parameter style npsgeneric
EXTERNAL CLASS NAME 'CPackChildren'
EXTERNAL HOST OBJECT '/home/nz/osf/packchildren.o_x86'
EXTERNAL SPU OBJECT '/home/nz/osf/packchildren.o_spu10';
Usage:
create table packtest (a int4);
insert into packtest values (1);
insert into packtest values (2);
insert into packtest values (3);
select PackChildren(a) from packtest;


*/
#include "udxinc.h"
#include <stdio.h>
#include <string.h>
using namespace nz::udx;
#define MAXCHILDREN 4000
class CPackChildren : public Uda
{
public:
static Uda* instantiate();
void initializeState()
{
StringArg *s = stringState(0);
s->length = 0;
setStateNull(0, false);
}
virtual void accumulate()
{
StringArg *s = stringState(0);
int32 value = int32Arg(0);
if (s->length < MAXCHILDREN * 4)
{
*((int32*)(s->data+s->length)) = value;
s->length = s->length + 4;
}
}
virtual void merge()
{
/* Destination */
StringArg *s = stringState(0);
/* Source */
StringArg *s2 = stringArg(0);
if (s->length + s2->length <= MAXCHILDREN * 4)
{
memcpy(s->data+s->length, s2->data, s2->length);
s->length = s->length + s2->length;
}
}
virtual ReturnValue finalResult()
{
setReturnNull(false);
StringReturn *ret = stringReturnInfo();
StringArg *s = stringArg(0);
printf("got %d\n", s->length);
ret->size = s->length;
memcpy(ret->data,s->data,s->length);
NZ_UDX_RETURN_STRING(ret);
}
};
Uda* CPackChildren::instantiate()
{
return new CPackChildren;
}
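
The packed result is a sequence of 4-byte integers in the byte order of the machine that produced it. As a minimal sketch of how such a buffer could be built and read back, the following standalone code uses std::string as the byte container. The function names are hypothetical, and memcpy is used in place of the pointer cast so that the code does not depend on alignment.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Appends the raw bytes of value to buf in host byte order.
void packInt32(std::string& buf, int32_t value)
{
    char bytes[sizeof(value)];
    std::memcpy(bytes, &value, sizeof(value));
    buf.append(bytes, sizeof(value));
}

// Reads back every complete 4-byte integer stored in buf.
std::vector<int32_t> unpackInt32s(const std::string& buf)
{
    std::vector<int32_t> out;
    for (size_t off = 0; off + sizeof(int32_t) <= buf.size();
         off += sizeof(int32_t)) {
        int32_t v;
        std::memcpy(&v, buf.data() + off, sizeof(v));
        out.push_back(v);
    }
    return out;
}
```

Merging two partial states corresponds to concatenating their buffers, which is why the order of the packed values depends on the order in which partial results are combined.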

PenMax Example (UDX Version 2)


/*
Copyright (c) 2007-2010 Netezza Corporation, an IBM Company
All rights reserved.
The aggregate PenMax displays the penultimate (second largest) value.
REGISTRATION:
CREATE AGGREGATE PENMAX(INT4) RETURNS INT4 STATE (INT4, INT4)
LANGUAGE CPP API VERSION 2 PARAMETER STYLE NPSGENERIC
EXTERNAL CLASS NAME 'CPenMax'
EXTERNAL HOST OBJECT '/home/nz/udx_files/penmax.o_x86'
EXTERNAL SPU OBJECT '/home/nz/udx_files/penmax.o_spu10'
Usage:
CREATE TABLE myints (a int, b int);
INSERT INTO myints VALUES (1,2);
INSERT INTO myints VALUES (1,4);
INSERT INTO myints VALUES (1,6);
INSERT INTO myints VALUES (2,8);
INSERT INTO myints VALUES (2,10);
INSERT INTO myints VALUES (2,12);
SELECT penmax(b) FROM myints;


*/
#include "udxinc.h"
using namespace nz::udx_ver2;
class CPenMax: public nz::udx_ver2::Uda
{
public:
CPenMax(UdxInit *pInit) : Uda(pInit)
{
}
static nz::udx_ver2::Uda* instantiate(UdxInit *pInit);
void accumulate()
{
int *pCurMax = int32State(0);
bool curMaxNull = isStateNull(0);
int *pCurPenMax = int32State(1);
bool curPenMaxNull = isStateNull(1);
int curVal = int32Arg(0);
bool curValNull = isArgNull(0);
if ( !curValNull ) { // do nothing if argument is null - can't
//affect max or penmax
if ( curMaxNull ) { // if current max is null, this arg
//becomes current max
setStateNull(0, false); // current max no longer null
*pCurMax = curVal;
} else
{ if ( curVal > *pCurMax ) { // if arg is new max
setStateNull(1, false); // then prior current max
// becomes current penmax
*pCurPenMax = *pCurMax;
*pCurMax = curVal; // and current max gets arg
} else if ( curPenMaxNull || curVal > *pCurPenMax ){
// arg might be greater than current penmax
setStateNull(1, false); // it is
*pCurPenMax = curVal;
}
}
}
}
void merge()
{
int *pCurMax = int32State(0);
bool curMaxNull = isStateNull(0);
int *pCurPenMax = int32State(1);
bool curPenMaxNull = isStateNull(1);
int nextMax = int32Arg(0);
bool nextMaxNull = isArgNull(0);
int nextPenMax = int32Arg(1);
bool nextPenMaxNull = isArgNull(1);
if ( !nextMaxNull ) { // if next max is null, then so is
//next penmax and we do nothing
if ( curMaxNull ) {
setStateNull(0, false); // current max was null,
// so save next max
*pCurMax = nextMax;
} else {
if ( nextMax > *pCurMax ) {
setStateNull(1, false);
// next max is greater than current, so save next
*pCurPenMax = *pCurMax;
// and make current penmax prior current max
*pCurMax = nextMax;
} else if ( curPenMaxNull || nextMax > *pCurPenMax ) {
// next max may be greater than current penmax
setStateNull(1, false); // it is
*pCurPenMax = nextMax;
}
}
if ( !nextPenMaxNull ) {
if ( isStateNull(1) ) {
// can't rely on curPenMaxNull here, might have
// change state var null flag above
setStateNull(1, false); // first non-null penmax,
// save it
*pCurPenMax = nextPenMax;
} else {
if ( nextPenMax > *pCurPenMax ) {
*pCurPenMax = nextPenMax;
// next penmax greater than current, save it
}
}
}
}
}
ReturnValue finalResult()
{
int curPenMax = int32Arg(1);
bool curPenMaxNull = isArgNull(1);
if ( curPenMaxNull )
NZ_UDX_RETURN_NULL();
setReturnNull(false);
NZ_UDX_RETURN_INT32(curPenMax);
}
};
nz::udx_ver2::Uda* CPenMax::instantiate(UdxInit *pInit)
{
return new CPenMax(pInit);
}
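
The accumulate and merge logic can be checked independently of the UDX framework by modeling the two nullable state variables as a small struct. The following is a sketch under that assumption: PenMaxState, penMaxAccumulate, and penMaxMerge are hypothetical names, and the merge step is expressed by feeding the other state's two values through the accumulate step, which yields the same result as the explicit comparisons in the sample.

```cpp
#include <cstdint>

// Hypothetical stand-in for the two nullable INT4 state variables.
struct PenMaxState
{
    bool hasMax;
    int32_t max;
    bool hasPenMax;
    int32_t penMax;
    PenMaxState() : hasMax(false), max(0), hasPenMax(false), penMax(0) {}
};

// Folds one non-null input value into the state.
void penMaxAccumulate(PenMaxState& s, int32_t v)
{
    if (!s.hasMax) {
        s.hasMax = true;            // first value becomes the max
        s.max = v;
    } else if (v > s.max) {
        s.hasPenMax = true;         // old max becomes the penultimate max
        s.penMax = s.max;
        s.max = v;
    } else if (!s.hasPenMax || v > s.penMax) {
        s.hasPenMax = true;         // value is the new penultimate max
        s.penMax = v;
    }
}

// Folds another partial state into s.
void penMaxMerge(PenMaxState& s, const PenMaxState& other)
{
    if (other.hasMax)
        penMaxAccumulate(s, other.max);
    if (other.hasPenMax)
        penMaxAccumulate(s, other.penMax);
}
```

Splitting the six sample values into two partial states and merging them gives the same penultimate maximum, 10, as processing them in a single pass.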

Sample User-Defined Table Function


A sample UDTF follows.
#include "udxinc.h"
#include <string.h>
using namespace nz::udx_ver2;
class SimpleUdtf : public nz::udx_ver2::Udtf
{
public:
SimpleUdtf(UdxInit *pInit) : Udtf(pInit) {
m_first = false;
}
static nz::udx_ver2::Udtf* instantiate(UdxInit *pInit);
bool m_new;
bool m_first;
virtual DataAvailable nextEoiOutputRow()
{
if (!m_first) {
m_first = true;
m_new = true;
}
return nextOutputRow();
}
virtual DataAvailable nextOutputRow()
{
//return Done;
if (!m_new)
return Done;
m_new = false;
int val;
for (int i=0; i < numReturnColumns(); i++) {
setReturnColumnNull(i, false);
val = i+1;
char temp[100];
switch (returnTypeColumn(i))
{
case UDX_FIXED:
case UDX_VARIABLE:
case UDX_NATIONAL_FIXED:
case UDX_NATIONAL_VARIABLE:
{
StringReturn *ret = stringReturnColumn(i);
if (ret->size)
memset(ret->data, ' ', ret->size);
sprintf(temp, "%d", val);
ret->size = strlen(temp);
memcpy(ret->data, temp, strlen(temp));
break;
}
case UDX_BOOL:
*boolReturnColumn(i) = true;
break;
case UDX_DATE:
*dateReturnColumn(i) = val;
break;
case UDX_TIME:
*timeReturnColumn(i) = val;
break;
case UDX_TIMETZ:
{
TimeTzADT *ret = timetzReturnColumn(i);
ret->time = val;
ret->zone = 0;
break;
}
case UDX_NUMERIC32:
{
Numeric32Val *ret = numeric32ReturnColumn(i);
*ret->value = val;
break;
}
case UDX_NUMERIC64:
{
Numeric64Val *ret = numeric64ReturnColumn(i);
*ret->value = val;
break;
}
case UDX_NUMERIC128:
{
Numeric128Val *ret = numeric128ReturnColumn(i);
*ret->value = val;
break;
}
case UDX_FLOAT:
*floatReturnColumn(i) = val * 1.0;
break;
case UDX_DOUBLE:
*doubleReturnColumn(i) = val * 1.0;
break;
case UDX_INTERVAL:
{
struct Interval *ret = intervalReturnColumn(i);
ret->time = val;
ret->month = 0;
break;
}
case UDX_INT8:
*int8ReturnColumn(i) = val % sizeof(char);
break;
case UDX_INT16:
*int16ReturnColumn(i) = val;
break;
case UDX_INT32:
*int32ReturnColumn(i) = val;
break;
case UDX_INT64:
*int64ReturnColumn(i) = val;
break;
case UDX_TIMESTAMP:
*timestampReturnColumn(i) = val;
break;
default:
throwError(NZ_ERROR_ILLVAL, "Unknown type",
returnTypeColumn(i));
}
}
return MoreData;
}
virtual void newInputRow()
{
m_new = true;
}
};
nz::udx_ver2::Udtf* SimpleUdtf::instantiate(UdxInit *pInit)
{
return new SimpleUdtf(pInit);
}

Sample UDTF with Generic Return Value


The following program creates a UDTF that is registered with a return type of TABLE(ANY). It uses the calculateShape method to define the shape of the output table based on the input.
/*
This UDTF takes in a string (CHAR or VARCHAR) and returns
three columns:
- UPPER_CASE: the upper case of the string
- lower_case: the lower case of the string
- Title_Case: the title case of the string
The function is VARARGS, so it will determine the types
of the columns based on the type of the input argument.
The output strings will have the same data type as the
input string (CHAR(x) or VARCHAR(x)).
Compile:
=======
nzudxcompile --sig "UcLcTc()" --varargs --return "TABLE(ANY)" --class
"UcLcTc" --version 2 --unfenced --db test UcLcTc.cpp
Example SQL:
===========
CREATE TABLE words (key VARCHAR(20), text VARCHAR(600));
INSERT INTO words VALUES ('1','Hello World');
INSERT INTO words VALUES ('2','Hello dave goodbye dave');
INSERT INTO words VALUES ('2','Hello Dave Hello World');
INSERT INTO words VALUES ('3','goodbye world');
CREATE TABLE words_uclctc AS SELECT * FROM words, TABLE ( UcLcTc(text)
);
*/
#include "udxinc.h"
using namespace nz::udx_ver2;
class UcLcTc : public Udtf {
private:
bool output;
public:
UcLcTc(UdxInit *pInit) : Udtf(pInit) { }
static Udtf* instantiate(UdxInit *pInit);
void newInputRow() {
output = true;
}
DataAvailable nextOutputRow() {
if (!output) {
output = true;
return Done;
}
StringArg *str = stringArg(0);
bool strNull = isArgNull(0);
if (strNull) {
return Done;
}
StringReturn *uc = stringReturnColumn(0);
StringReturn *lc = stringReturnColumn(1);
StringReturn *tc = stringReturnColumn(2);
if ((uc->size < str->length)
|| (lc->size < str->length)
|| (tc->size < str->length))
throwUdxException("Input too long for output");
memcpy(uc->data, str->data, str->length);
uc->size = str->length;
memcpy(lc->data, str->data, str->length);
lc->size = str->length;
memcpy(tc->data, str->data, str->length);
tc->size = str->length;
for (int i = 0; i < str->length; i++) {
if ((uc->data[i] >= 'a') && (uc->data[i] <= 'z'))
uc->data[i] = uc->data[i] - 'a' + 'A';
if ((lc->data[i] >= 'A') && (lc->data[i] <= 'Z'))
lc->data[i] = lc->data[i] - 'A' + 'a';
if (((0 == i) || (' ' == tc->data[i-1]))
&& ((tc->data[i] >= 'a') && (tc->data[i] <= 'z')))
tc->data[i] = tc->data[i] - 'a' + 'A';
}
output = false;
return MoreData;
}
void calculateShape(UdxOutputShaper *shaper) {
if (shaper->numArgs() != 1)
throwUdxException("Expecting only one argument");
int nType = shaper->argType(0);
if ((UDX_FIXED == nType) || (UDX_VARIABLE == nType)) {
int len = shaper->stringArgSize(0);
char ucstr[] = "UPPER_CASE"; // For column names on systems that
char lcstr[] = "lower_case"; // use lowercase naming
char tcstr[] = "Title_Case";
char ucstrU[] = "UPPER_CASE"; // For column names on systems
char lcstrU[] = "LOWER_CASE"; // that use uppercase naming
char tcstrU[] = "TITLE_CASE";
if (shaper->isSystemCaseUpper()) {
shaper->addOutputColumn(nType, ucstrU, len);
shaper->addOutputColumn(nType, lcstrU, len);
shaper->addOutputColumn(nType, tcstrU, len);
}
else {
shaper->addOutputColumn(nType, ucstr, len);
shaper->addOutputColumn(nType, lcstr, len);
shaper->addOutputColumn(nType, tcstr, len);
}
}
else {
throwUdxException("Only CHAR and VARCHAR types are supported");
}
}
};
Udtf* UcLcTc::instantiate(UdxInit *pInit) {
return new UcLcTc(pInit);
}
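The per-byte case logic in nextOutputRow() can be checked on a development workstation before the UDTF is compiled and registered. The following is a minimal standalone sketch under stated assumptions, not part of the Netezza API: the helper names toUpperAscii, toLowerAscii, toTitleAscii, and runSelfChecks are hypothetical, std::string stands in for the length-delimited StringArg and StringReturn buffers, and the input is assumed to be ASCII, as in the sample.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helpers (not part of udxinc.h) that mirror the loop in
// UcLcTc::nextOutputRow(). Only the ASCII letters a-z and A-Z are changed.

std::string toUpperAscii(const std::string &in) {
    std::string out = in;
    for (std::size_t i = 0; i < out.size(); i++) {
        if (out[i] >= 'a' && out[i] <= 'z')
            out[i] = static_cast<char>(out[i] - 'a' + 'A');
    }
    return out;
}

std::string toLowerAscii(const std::string &in) {
    std::string out = in;
    for (std::size_t i = 0; i < out.size(); i++) {
        if (out[i] >= 'A' && out[i] <= 'Z')
            out[i] = static_cast<char>(out[i] - 'A' + 'a');
    }
    return out;
}

// Capitalizes the first character and any lowercase letter that follows a
// space. Other characters are left as they are, as in the sample UDTF.
std::string toTitleAscii(const std::string &in) {
    std::string out = in;
    for (std::size_t i = 0; i < out.size(); i++) {
        bool wordStart = (i == 0) || (out[i - 1] == ' ');
        if (wordStart && out[i] >= 'a' && out[i] <= 'z')
            out[i] = static_cast<char>(out[i] - 'a' + 'A');
    }
    return out;
}

// Checks the helpers against two rows from the sample words table.
void runSelfChecks() {
    assert(toUpperAscii("Hello World") == "HELLO WORLD");
    assert(toLowerAscii("Hello World") == "hello world");
    assert(toTitleAscii("goodbye world") == "Goodbye World");
}
```

Because the title-case rule only raises letters at word starts, an input such as 'hELLO' becomes 'HELLO' rather than 'Hello'. If stricter title casing is needed, the remaining letters of each word would also have to be lowered.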


APPENDIX

Notices and Trademarks


What's in this appendix
Notices
Trademarks
Electronic Emission Notices
Regulatory and Compliance

This section describes some important notices, trademarks, and compliance information.

Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other
countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service
is not intended to state or imply that only that IBM product, program, or service may be
used. Any functionally equivalent product, program, or service that does not infringe any
IBM intellectual property right may be used instead. However, it is the user's responsibility
to evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in
this document. The furnishing of this document does not grant you any license to these
patents. You can send license inquiries, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785 U.S.A.
For license inquiries regarding double-byte (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to:
IBM World Trade Asia Corporation
Licensing 2-31 Roppongi 3-chome, Minato-ku
Tokyo 106-0032, Japan
The following paragraph does not apply to the United Kingdom or any other country where
such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES
CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY
KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR
A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are
periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s)
and/or the program(s) described in this publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only
and do not in any manner serve as an endorsement of those Web sites. The materials at
those Web sites are not part of the materials for this IBM product and use of those Web
sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.
Licensees of this program who wish to have information about it for the purpose of
enabling: (i) the exchange of information between independently created programs and
other programs (including this one) and (ii) the mutual use of the information which has
been exchanged, should contact:
IBM Corporation
Software Interoperability Coordinator, Department 49XA
3605 Highway 52 N
Rochester, MN 55901
U.S.A.
Such information may be available, subject to appropriate terms and conditions, including
in some cases, payment of a fee.
The licensed program described in this document and all licensed material available for it
are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or any equivalent agreement between us.
Any performance data contained herein was determined in a controlled environment.
Therefore, the results obtained in other operating environments may vary significantly.
Some measurements may have been made on development-level systems and there is no
guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual
results may vary. Users of this document should verify the applicable data for their specific
environment.
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not
tested those products and cannot confirm the accuracy of performance, compatibility or
any other claims related to non-IBM products. Questions on the capabilities of non-IBM
products should be addressed to the suppliers of those products.
All statements regarding IBM's future direction or intent are subject to change or withdrawal without notice, and represent goals and objectives only.
All IBM prices shown are IBM's suggested retail prices, are current and are subject to
change without notice. Dealer prices may vary.
This information contains examples of data and reports used in daily business operations.
To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to
the names and addresses used by an actual business enterprise is entirely coincidental.


COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate
programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of
developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are
written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.
Each copy or any portion of these sample programs or any derivative work, must include a
copyright notice as follows:
© (your company name) (year). Portions of this code are derived from IBM Corp. Sample
Programs.
© Copyright IBM Corp. _enter the year or years_.
If you are viewing this information softcopy, the photographs and color illustrations may not
appear.

Trademarks
IBM, the IBM logo, ibm.com and Netezza are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If
these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or
common law trademarks owned by IBM at the time this information was published. Such
trademarks may also be registered or common law trademarks in other countries. A current
list of IBM trademarks is available on the Web at "Copyright and trademark information" at
ibm.com/legal/copytrade.shtml.
Adobe is a registered trademark of Adobe Systems Incorporated in the United States, and/or
other countries.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or
both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
NEC is a registered trademark of NEC Corporation.
UNIX is a registered trademark of The Open Group in the United States and other
countries.
Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United
States, other countries, or both.
Red Hat is a trademark or registered trademark of Red Hat, Inc. in the United States and/or
other countries.
D-CC, D-C++, Diab+, FastJ, pSOS+, SingleStep, Tornado, VxWorks, Wind River, and the
Wind River logo are trademarks, registered trademarks, or service marks of Wind River Systems, Inc. Tornado patent pending.
APC and the APC logo are trademarks or registered trademarks of American Power Conversion Corporation.
Other company, product or service names may be trademarks or service marks of others.


Electronic Emission Notices


When you attach a monitor to the equipment, you must use the designated monitor cable
and any interference suppression devices that are supplied with the monitor.
Federal Communications Commission (FCC) Statement
Note: This equipment has been tested and found to comply with the limits for a Class A
digital device, pursuant to Part 15 of the FCC Rules. These limits are designed to provide
reasonable protection against harmful interference when the equipment is operated in a
commercial environment. This equipment generates, uses, and can radiate radio frequency
energy and, if not installed and used in accordance with the instruction manual, may cause
harmful interference to radio communications. Operation of this equipment in a residential
area is likely to cause harmful interference, in which case the user will be required to correct the interference at his own expense.
Properly shielded and grounded cables and connectors must be used in order to meet FCC
emission limits. IBM is not responsible for any radio or television interference caused by
using other than recommended cables and connectors or by unauthorized changes or modifications to this equipment. Unauthorized changes or modifications could void the user's
authority to operate the equipment.
This device complies with Part 15 of the FCC Rules. Operation is subject to the following
two conditions: (1) this device may not cause harmful interference, and (2) this device
must accept any interference received, including interference that might cause undesired
operation.
Industry Canada Class A Emission Compliance Statement
This Class A digital apparatus complies with Canadian ICES-003.
Avis de conformité à la réglementation d'Industrie Canada
Cet appareil numérique de la classe A est conforme à la norme NMB-003 du Canada.
Australia and New Zealand Class A Statement
Attention: This is a Class A product. In a domestic environment this product may cause
radio interference in which case the user may be required to take adequate measures.
European Union EMC Directive Conformance Statement
This product is in conformity with the protection requirements of EU Council Directive
2004/108/EC on the approximation of the laws of the Member States relating to electromagnetic compatibility. IBM cannot accept responsibility for any failure to satisfy the
protection requirements resulting from a nonrecommended modification of the product,
including the fitting of non-IBM option cards.
Attention: This is an EN 55022 Class A product. In a domestic environment this product
may cause radio interference in which case the user may be required to take adequate
measures.
Responsible manufacturer:
International Business Machines Corp.
New Orchard Road
Armonk, New York 10504
914-499-1900


European Community contact:


IBM Technical Regulations, Department M456
IBM-Allee 1, 71137 Ehningen, Germany
Telephone: +49 7032 15-2937
Email: [email protected]
Germany Class A Statement
Deutschsprachiger EU Hinweis: Hinweis für Geräte der Klasse A EU-Richtlinie zur Elektromagnetischen Verträglichkeit
Dieses Produkt entspricht den Schutzanforderungen der EU-Richtlinie 2004/108/EG zur
Angleichung der Rechtsvorschriften über die elektromagnetische Verträglichkeit in den EU-Mitgliedsstaaten und hält die Grenzwerte der EN 55022 Klasse A ein.
Um dieses sicherzustellen, sind die Geräte wie in den Handbüchern beschrieben zu installieren und zu betreiben. Des Weiteren dürfen auch nur von der IBM empfohlene Kabel
angeschlossen werden. IBM übernimmt keine Verantwortung für die Einhaltung der Schutzanforderungen, wenn das Produkt ohne Zustimmung der IBM verändert bzw. wenn
Erweiterungskomponenten von Fremdherstellern ohne Empfehlung der IBM gesteckt/eingebaut werden.
EN 55022 Klasse A Geräte müssen mit folgendem Warnhinweis versehen werden:
Warnung: Dieses ist eine Einrichtung der Klasse A. Diese Einrichtung kann im Wohnbereich Funk-Störungen verursachen; in diesem Fall kann vom Betreiber verlangt werden,
angemessene Maßnahmen zu ergreifen und dafür aufzukommen.
Deutschland: Einhaltung des Gesetzes über die elektromagnetische Verträglichkeit von Geräten
Dieses Produkt entspricht dem Gesetz über die elektromagnetische Verträglichkeit von
Geräten (EMVG). Dies ist die Umsetzung der EU-Richtlinie 2004/108/EG in der Bundesrepublik Deutschland.
Zulassungsbescheinigung laut dem Deutschen Gesetz über die elektromagnetische Verträglichkeit von Geräten
(EMVG) (bzw. der EMC EG Richtlinie 2004/108/EG) für Geräte der Klasse A
Dieses Gerät ist berechtigt, in Übereinstimmung mit dem Deutschen EMVG das EG-Konformitätszeichen - CE - zu führen.
Verantwortlich für die Einhaltung der EMV Vorschriften ist der Hersteller:
International Business Machines Corp.
New Orchard Road
Armonk, New York 10504
914-499-1900
Der verantwortliche Ansprechpartner des Herstellers in der EU ist:
IBM Deutschland
Technical Regulations, Department M456
IBM-Allee 1, 71137 Ehningen, Germany
Telephone: +49 7032 15-2937
Email: [email protected]
Generelle Informationen:
Das Gerät erfüllt die Schutzanforderungen nach EN 55024 und EN 55022 Klasse A.


Japan VCCI Class A Statement

This is a Class A product based on the standard of the Voluntary Control Council for Interference (VCCI). If this equipment is used in a domestic environment, radio interference
may occur, in which case the user may be required to take corrective actions.
Japan Electronics and Information Technology Industries Association (JEITA) Statement

Japan Electronics and Information Technology Industries Association (JEITA) Confirmed Harmonics Guidelines (products less than or equal to 20 A per phase)
Japan Electronics and Information Technology Industries Association (JEITA) Statement

Japan Electronics and Information Technology Industries Association (JEITA) Confirmed Harmonics Guidelines (products greater than 20 A per phase)
Korea Communications Commission (KCC) Statement

This is electromagnetic wave compatibility equipment for business (Type A). Sellers and
users need to pay attention to it. This is for any areas other than home.
Russia Electromagnetic Interference (EMI) Class A Statement

People's Republic of China Class A Electronic Emission Statement


Taiwan Class A Compliance Statement

Regulatory and Compliance


Regulatory Notices
Install the NPS system in a restricted-access location. Ensure that only those trained to
operate or service the equipment have physical access to it. Install each AC power outlet
near the NPS rack that plugs into it, and keep it freely accessible.
Provide approved circuit breakers on all power sources.
Product may be powered by redundant power sources. Disconnect ALL power sources
before servicing.
High leakage current. Earth connection essential before connecting supply. Courant de
fuite élevé. Raccordement à la terre indispensable avant le raccordement au réseau.
Homologation Statement
Attention: This product is not intended to be connected directly or indirectly by any means
whatsoever to interfaces of public telecommunications networks, neither to be used in a
Public Services Network.
WEEE
Netezza Corporation is committed to meeting the requirements of the European Union (EU)
Waste Electrical and Electronic Equipment (WEEE) Directive. This Directive requires producers of electrical and electronic equipment to finance the takeback, for reuse or
recycling, of their products placed on the EU market after August 13, 2005.


Index
Symbols
/nz/extensions directory, about 1-8
\da switch, showing UDAs 6-5
\df switch, showing UDFs 6-5
\dl command 6-5
_v_dual view 6-7
_v_dual_dslice view 6-7

A
access permissions 1-5
account permissions 1-5
about 1-5
managing 6-1
addOutputColumn Method 3-13
admin user, permissions 6-1
aggregate function. See user-defined aggregates.
aggregate objects A-2
aggregates
altering B-2
creating or replacing B-13
dropping B-25
allocate function A-7
ALTER AGGREGATE command B-2
ALTER FUNCTION command B-6
ALTER LIBRARY command B-11
ANY keyword 2-9
API versions, about 1-5
automatic load, user-defined shared libraries 5-2

B
backups, Netezza and UDX code 1-10
best practices
registering UDXs with SPUPads A-14
UDX development 6-6
bigint datatype D-6
bintohex UDF example F-3
boolean datatype D-4
built-in
aggregates 1-2
functions 1-1
functions, checking 1-6
byteint datatype D-6

C
C++
files with multiple functions 6-17
functions, support for 6-10
library header files, declaring 2-1, 3-1
objects, aggregates and non-aggregates A-2
calculateShape method 3-11
calculateSize method
definition 2-16
examples 2-10
casting, input values to match signature sizes 2-9

char datatype D-2


CheckPrecision38Limit function C-24
column-management methods D-11
COMMENT permissions 6-5
conditional compilation 6-18
control file for test harness 7-10
control file parameters 7-11
conventions, documenting UDXs 6-4
conversion functions C-12
convertNumeric128 function C-24
convertNumeric32 function C-22
convertNumeric64 function C-23
correlated subquery 3-8
correlated table function 3-8
CREATE AGGREGATE command B-13
CREATE FUNCTION command B-17
CREATE OR REPLACE AGGREGATE
command B-13
example 4-6
generic UDA example 2-11
CREATE OR REPLACE FUNCTION
command B-17
example 2-5, 2-6, 2-8, 3-5
generic UDF example 2-11
cross-database access 1-9, 6-6

D
datatype
helper API
about 1-7
functions C-1
Netezza C-2
supported D-1
UDX_BOOL D-4
UDX_DATE D-4
UDX_DOUBLE D-5
UDX_FIXED D-2
UDX_FLOAT D-5
UDX_INT16 D-6
UDX_INT32 D-6
UDX_INT64 D-6
UDX_INT8 D-6
UDX_INTERVAL D-5
UDX_NATIONAL_FIXED D-2
UDX_NATIONAL_VARIABLE D-3
UDX_NUMERIC128 D-4
UDX_NUMERIC32 D-4
UDX_NUMERIC64 D-4
UDX_TIME D-4
UDX_TIMESTAMP D-6
UDX_TIMETZ D-4
UDX_VARIABLE D-3
datatype conversion
functions C-12
ignoring values C-5
Date datatype C-2
date datatype D-4
deallocate function A-8


debugging
flags 7-1
hints 7-7
decoded range-checking functions C-9
decodeDate (m/d/y Output) function C-12
decodeDate (struct tm Output) function C-13
decodeDate (time_t Output) function C-12
decodeTime function C-13
decodeTimestamp (mdy h:m:s:m Output) function C-14
decodeTimestamp (struct timeval Output) function C-15
decodeTimestamp (struct tm Output) function C-15
decodeTimestamp (time_t Output) function C-14
decodeTimeTz function C-13
decoding functions C-12
Demo page, NDN web site F-1
dependencies
clearing 5-4
viewing and resolving 6-13
Developer-style data C-1
development test environment 1-4
double precision datatype D-5
downgrade cautions 1-11
DROP AGGREGATE command B-25
DROP FUNCTION command B-27
DROP LIBRARY command B-28
dynamic memory, best practices 6-8

E
encoded range-checking functions C-7
encodeDate (m/d/y Values) function C-15
encodeDate (struct tm Values) function C-16
encodeDate (time_t Values) function C-16
encodeTime function C-17
encodeTimestamp (m/d/y Input Format) function C-18
encodeTimestamp (struct tm Input Format) function C-19
encodeTimestamp (time_t Input Format) function C-18
encodeTimestamp (timeval Input Format) function C-19
encodeTimeTZ function C-17
encoding functions C-15
error checking, in UDXs 6-16
errors, record size exceeded 6-8
examples of UDXs F-1
execution locus, specifying for UDX 6-7
EXTERNAL CLASS NAME, requirements 2-5

F
fenced mode 1-3
fencing, impacts on query performance 1-3
findEntry method 2-8
fixed-point numeric datatypes, conversions C-22
flags, debug 7-1
FOR_SPU compiler code 6-18
fully-qualified names, UDX 6-6
fully-qualified object names, for stored procedures 1-9
functions
altering B-6
creating or replacing B-17
dropping B-27
in table column expressions or views 1-7
multiple in one C++ file 6-17


G
generic return value for UDTFs 3-11
generic UDTFs
ANY return value 3-11
calculateShape() method for UDTF return value 3-11
registering 3-12
generic UDXs
ANY keyword 2-9
calculateSize() method for UDF return value 2-10
input arguments 2-9
registering 2-10
return value for UDFs 2-10
See also user-defined functions, generic.
getCurrentDatasliceId function 6-9
getCurrentHardwareId function 6-9
getCurrentLocus function
description 6-8
using in language-extension UDFs E-1
getCurrentSessionId function 6-9
getCurrentTransaction function 6-9
getCurrentUsername function 6-9
getEntry method 2-8
getKey method 2-8
getName Method 3-15
getNumberDataslices function 6-9
getNumberSpus function 6-10
getNumEntries method 2-7
getOutputColumn Method 3-14
getPad function A-9
getpadcount example A-20
getPrecision Method 3-15
getRootObject function A-9
getScale Method 3-15
getSize Method 3-14
getTotalSize function A-9
getType Method 3-14
getValue method 2-8
global objects 1-4
GRANT ALL command, create permission 6-2
GRANT command
alter permission 6-2
create permission 6-2
drop permission 6-3
execute permission 6-3
unfence permission 6-3

H
helper functions C-1
helper routines
datatype C-1
numerics C-22
temporal C-1
UTF-8 datatypes C-25
hextobin UDF example F-3

I
identifier collisions, avoiding 6-7
IgnoreBuffer C-5
implicit castings, for UDX input values 2-9


inner correlation 3-8


installation location, UDX packages 1-8
integer datatype D-6
Interval datatype C-2
interval datatype D-5
isBusinessHours UDF example F-6
isLeapYear function C-19
isSizerArgConstant method 2-15
isSystemCaseUpper Method 3-14
isUserQuery function A-10
isValidDate function C-7, C-9
isValidEpoch function C-11
isValidEpochDate function C-7
isValidEpochTimestamp function C-8
isValidInterval function C-9
isValidSqlOffset function C-10
isValidTime function C-7, C-9
isValidTimestamp function C-8, C-10
isValidTimeStruct function C-11
isValidTimeTz function C-8, C-10
isValidTimeTzOffset function C-8
isValidTimeVal function C-11
isValidTimeValUsecs function C-11
isValidUTF8 function C-25

L
lateral subquery 3-8
laterally correlated table function, restrictions 3-9
left outer correlation 3-9
LIBC, support on SPUs 6-10
LIBRARY
command B-23
objects 1-5
linker errors, avoiding for common symbols 6-7
locale environment variables 6-10
locale-aware functions 6-10
locus of UDTFs 3-10
log mask, checking settings 7-2
logging
messages from UDXs 7-1
methods D-7
LOGMASK
attribute 7-1
using 7-3
logMsg
example 7-2
facility 7-1
function 7-1

M
macros, return values D-8
manual load, user-defined shared libraries 5-2
MAXIMUM MEMORY
determining 6-17
including SPUPad memory in A-14
memcmp use 2-3
memory
allocating with SPUPad A-1
calculating for SPUPad A-13
freeing SPUPad memory A-13

management method D-7


registration, checking 6-17
reservation routines 2-3
message logging
about 7-1
enabling or disabling 7-4
methods D-7
miscellaneous datatype helper functions C-19

N
names for UDXs, avoiding built-in function names 6-7
namespace, about version 1 and 2 2-2
nchar datatype D-2
Netezza
datatypes C-2
Developer Network (NDN) 1-3
SQL commands B-1
temporal values, converting 1-7
Web site 1-3
new and delete operators 6-8
newInputRow() method 3-3
nextEoiOutputRow() method 3-4
nextOutputRow() method 3-3
non-aggregate objects A-2
null checking 6-16
numeric datatype D-4
numerics, conversions C-22
numOutputColumns Method 3-13
numSizerArgs method 2-13
nvar_concat UDF example F-1
nvarchar datatype D-3
nz::udx::dthelpers namespace C-2
NZPLSQL language, extending with UDFs E-1
nzsql command
comments for UDXs 6-5
help on UDX 6-5
nzudxcompile
command 6-18
syntax 2-4
UDA example 4-5
UDF example 2-4
nzudxrunharness command 7-5, 7-7
nzudxvalidate command 6-14

O
object files
multiple, compiling 6-17
resolving problems with 6-15
objects, limiting in SPUPad A-4
offsetEpoch function C-21
offsetTime function C-20
offsetTimestamp function C-20
offsetTimeStruct function C-21
CREATE B-23
outputs
ALTER GROUP command B-4, B-9, B-12, B-24
CREATE DATABASE command B-15, B-21
DROP VIEW command B-25, B-27, B-29
SHOW PROCEDURE command B-30, B-32
overloading, functions and aggregates 2-6


owner, UDX 6-1

P
packChildren UDA example F-7
PAD_DELETE macro A-10
PAD_NEW macro A-9
padcounter example A-20
patches, and UDX code 1-10
PATH SQL session variable 1-9
PenMax example 4-1
permissions
account 1-5
granting
all 6-2
alter permission 6-2
create 6-2
drop permission 6-3
execute permission 6-3
unfence permission 6-3
managing 6-1
revoking
alter permission 6-2
create permission 6-2
drop permission 6-3
execute permission 6-3
unfence permission 6-3
plus switch 6-6
privileges, commands
ALTER AGGREGATE command B-5
ALTER FUNCTION command B-10
ALTER LIBRARY command B-12
CREATE AGGREGATE command B-17
CREATE FUNCTION command B-22
CREATE LIBRARY command B-24

Q
queries, using UDFs in 2-11
query cancellation, using throwUdxException() 6-16
query optimization, and UDXs 6-12

R
range specifier constants C-5
range-checking functions C-7
real datatype D-5
record function. See user-defined functions.
record size exceeded errors 6-8
repeating subquery 3-8
restores, Netezza and UDX code 1-10
return value macros D-8
return value sizer API 2-12
return value sizer methods
calculateSize 2-16
isSizerArgConstant 2-15
numSizerArgs 2-13
sizerArgType 2-13
sizerGetConstantArg 2-15
sizerNumericArgPrecision 2-14
sizerNumericArgScale 2-14


sizerNumericSizeValue 2-15
sizerReturnType 2-12
sizerStringArgSize 2-13
sizerStringSizeValue 2-14
REVOKE command
alter permission 6-3
create permission 6-2
drop permission 6-3
execute permission 6-3
unfence permission 6-3
root object, SPUPad A-3
Root structure A-3

S
setReturnNull, checking for null returns 6-16
setRootObject function A-8
shaper methods 3-11
shared libraries
altering B-11
creating B-23
showing B-33
shell window, writing messages to 7-3
SHOW AGGREGATE command B-30
SHOW FUNCTION command B-32
SHOW LIBRARY command B-33
signature
about 6-1
format 2-6
sizerArgType method 2-13
sizerGetConstantArg method 2-15
sizerNumericArgPrecision method 2-14
sizerNumericArgScale method 2-14
sizerNumericSizeValue method 2-15
sizerReturnType method 2-12
sizerStringArgSize method 2-13
sizerStringSizeValue method 2-14
smallint datatype D-6
SPUPad
about 1-3, A-1
accessing A-5
best practices for registering A-14
calculating memory use A-13
content restrictions A-2
creating A-2, A-4
creating on one SPU versus multiple SPUs A-11
creating on the Netezza host A-12
define content of A-3
examples A-14
freeing memory used by A-13
functions
allocate A-7
deallocate A-8
getPad A-9
getRootObject A-9
getTotalSize A-9
isUserQuery A-10
setRootObject A-8
getpadcount example A-20
limiting number of objects A-4


macros
PAD_DELETE A-10
PAD_NEW A-9
non-virtual destructors A-2
padcounter example A-20
process data in A-5
root object, about A-3
Root structure A-3
running functions A-6
string_pad_get example A-5
stringpad example A-4
transaction restarts A-14
understanding return values A-11
uses A-1
using NOT DETERMINISTIC A-14
Standard C Library (LIBC) support 6-10
standard log files, writing messages to 7-3
stored procedures
about 1-8
fully qualified name of 1-9
PATH session variable 1-9
string sizes, best practices 2-5
string_pad_create
code A-14
example A-4
string_pad_get
code A-17
example A-5
stringpad example A-4
struct tm
conversion restrictions C-4
implementation C-4
structure C-4
symbols, multiple definitions errors 6-7
synonyms, creating for UDXs 6-6
system prerequisites 1-4

T
table function. See user-defined table functions.
table shaper methods 3-13
TABLE WITH FINAL syntax 3-6
tables
resolving references to dropped UDFs 6-15
resolving references to UDFs 6-13
temporal
types C-2
values, about 1-7
test harness
about 7-5
control file 7-10
example 7-5
Time datatype C-2
time datatype D-4
time with time zone datatype D-4
time_t structure C-3
Timestamp datatype C-2
timestamp datatype D-6
TimeTZ datatype C-2
timeval support C-4

U
Uda base class 4-2
UDA. See user-defined aggregates.
Udf base class 2-2
UDF. See user-defined functions.
Udtf base class 3-2
UDTF. See user-defined table functions.
UDX
avoiding built-in names 6-7
commenting on 6-5
compiling 6-18
controlling access to 1-4
creation steps 1-7
cross-database access to 6-6
definition 1-3
environment
about 2-7
class 2-7
methods D-9
error checking 6-16
examples F-1
how to call 1-9
how to plan and create 1-6
installation location best practices 1-8
macros 2-4
migrating version 1 to version 2 6-23
planning steps 1-6
record size exceeded errors 6-8
resolving object file problems 6-15
signature 6-1
UDX_BOOL datatype D-4
UDX_DATE datatype D-4
UDX_DOUBLE datatype D-5
UDX_FIXED datatype D-2
UDX_FLOAT datatype D-5
UDX_INT16 datatype D-6
UDX_INT32 datatype D-6
UDX_INT64 datatype D-6
UDX_INT8 datatype D-6
UDX_INTERVAL datatype D-5
UDX_LOCUS_POSTGRES, example of E-1
UDX_NATIONAL_FIXED datatype D-2
UDX_NATIONAL_VARIABLE datatype D-3
UDX_NUMERIC128 datatypes D-4
UDX_NUMERIC32 datatypes D-4
UDX_NUMERIC64 datatypes D-4
UDX_TIME datatype D-4
UDX_TIMESTAMP datatype D-6
UDX_TIMETZ datatype D-4
UDX_VARIABLE datatype D-3
udx_ver2 namespace 2-2
UdxEnvironmentEntry values 2-7
udxinc.h header file
in user-defined aggregates 4-1
in user-defined functions 2-1
in user-defined table functions 3-1
udxLibraryName function 6-10
UDXs
cross-database access to 1-9
references to user-defined shared libraries 6-13
uncorrelated table function 3-7
Unfence privilege 1-3


upgrades, and UDX code 1-10


user-defined aggregates
about 1-2
account permissions 1-5
accumulate() method 4-3
cautions 1-4
compiling 4-5
creating C++ for 4-1
debugging 7-1
examples F-1
finalResult() method 4-5
generic, registering 2-11
initializeState() method 4-3
instantiate() method 4-2
merge() method 4-4
packChildren example F-7
PenMax example 4-1
registering 4-6
See also user-defined functions.
showing B-30
signature 2-6
using in a query 4-6
user-defined functions 1-5
avoiding strcmp, strcpy, strlen, atol 2-3
bintohex example F-3
cautions 1-4
constructors, implementing 2-3
controlling access to 1-4
creating C++ program for 2-1
cross-database access to 6-6
debugging 7-1
destructors, implementing 2-3
documenting 6-4
error checking 6-16
evaluate() method 2-3
examples F-1
extending NZPLSQL E-1
generic
registering 2-10
return value sizer API 2-12
hextobin example F-3
instantiate() method 2-2
isBusinessHours example F-6
limitations of size-specific input arguments 2-8
null checking 6-16
nvar_concat example F-1
program language support 1-4
programming best practices 1-4
query optimization impacts 6-12
registering 2-4
showing B-32
signature 2-6
sizer methods reference D-9
system prerequisites 1-4
updating after program file change 2-5
using in queries 2-11
var_concat example F-1
user-defined shared libraries
about 1-2
account permissions 1-5
altering 5-4, B-11
compiling 5-2
creating 5-1, B-23


cross-database access 6-6


dependency limits 5-1
dropping 5-4, B-28
linking 5-2
loading options 5-2
registering in the database 5-3
showing B-33
using in a UDX 5-3
user-defined table functions
about 1-2
altering 3-12
argument forms 3-7
arguments and return values 3-11
chaining 3-9
column methods D-11
compiling 3-4
correlated function 3-8
creating C++ program for 3-1
dropping 3-12
instantiate() method 3-2
invocation forms 3-6
locus 3-10
registering 3-4
sample F-11
sample of generic return UDTF F-13
shaper methods reference D-10
uncorrelated function 3-7
using in queries 3-5
visibility rules 3-9
UTF-8 datatype helpers C-25
UTF8CharCount function C-25

V
_v_depend view 6-13
var_concat UDF example F-1
varchar datatype D-3
views
resolving references to dropped UDAs 6-15
resolving references to UDAs 6-13
