Netezza Best Practices

Prepared By
Sivakumar Nair/India/IBM
1. Introduction

2. Distribution

3. Datatypes

4. ZoneMaps

5. Statistics

6. Groom / Reclaim

7. ETL/ELT Guidelines
Introduction
Netezza sells itself on simplicity, so best practice should not mean hundreds of
rules and regulations to follow. It is recommended to concentrate on a few basic
principles:
> Distribution
> Datatypes
> Statistics
> Zonemaps
> Reclaim
Alongside some basic standards for ETL, these general pointers will cover 99% of
cases. Best practice means minimal effort early on for maximum gain.

Distribution
Good distribution is the fundamental element of performance. A SPU is the
individual unit of parallelism, and if all SPUs have the same amount of work to do,
a query will complete faster than if one SPU were asked to do the whole job.
> Bad distribution is called data skew.
> Skew to a single SPU is the worst-case scenario.
> Skew affects not just the query in hand but other queries too, as the skewed
SPU has more work to do.
> Skew also means that the machine will fill up sooner.
> Simple rule: good distribution, good performance.
> Never create a table without a distribution key.
> If no distribution key is specified, NPS chooses one itself, and there is no
guarantee which key it chooses. This will eventually create data skew.
When choosing the distribution key, consider the following factors:
> The more distinct the distribution key values, the better.
> The same distribution key value always goes to the same SPU.
> Tables used together should use the same columns for their distribution key
where possible.
> If a particular key is used heavily in equijoin clauses, that key is a good
choice for the distribution key.
> Check that there is no accidental processing skew even when record
distribution is good.
> If in doubt, use random distribution, which gives a near-perfect spread of rows.
> For smaller tables, random distribution is usually a good choice.
Criteria for selecting distribution keys:
> Choose columns for the distribution key that distribute table rows evenly.
> Choose columns for the distribution key based on the selection set that you use
most frequently to retrieve rows from the table.
> Choose as few columns as possible for the distribution key (maximum four columns).
> Do not choose Boolean columns as the distribution key: with only two distinct
values, all rows hash to at most two SPUs.
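As a sketch, the distribution guidance above translates into DDL like the following (the `sales` and `region_lookup` tables and their columns are hypothetical examples, not part of this document):

```sql
-- Distribute a large fact table on a high-cardinality column that is
-- also a common equijoin key, so joined tables can be co-located.
CREATE TABLE sales
(
    customer_id INTEGER       NOT NULL,
    sale_date   DATE          NOT NULL,
    amount      NUMERIC(12,2)
)
DISTRIBUTE ON (customer_id);

-- For a small table with no obvious key, random distribution spreads
-- rows evenly across the SPUs.
CREATE TABLE region_lookup
(
    region_code SMALLINT,
    region_name VARCHAR(50)
)
DISTRIBUTE ON RANDOM;
```

Any dimension table joined to `sales` on `customer_id` would ideally be distributed on the same column, so the join can run locally on each SPU.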

Data types
Picking the right data types always gives better performance.
> Columns of uniform type produce consistent results.
> Columns of uniform type ensure that data is stored efficiently.
> Columns of uniform type allow the system to process queries efficiently.
> NUMERIC datatypes with a scale of 0 are similar to INTEGER datatypes; switching
to an INTEGER datatype makes the column eligible for zonemaps.
> The INTERVAL datatype is cumbersome and hard to work with. Consider storing the
original TIME and TIMESTAMP values and calculating the interval on the fly.
> Floating-point datatypes are, by definition, imprecise. There may be a
performance hit in using them.
> Inconsistent datatypes for the same column in different tables hurt
performance, because joins then require implicit casts.
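To illustrate the NUMERIC-versus-INTEGER and INTERVAL points above, a small sketch (the `orders` table and its columns are invented for the example):

```sql
-- order_qty is declared INTEGER rather than NUMERIC(10,0): the values are
-- the same, but the integer column remains eligible for zonemaps.
CREATE TABLE orders
(
    order_id   BIGINT    NOT NULL,
    order_qty  INTEGER   NOT NULL,
    created_ts TIMESTAMP NOT NULL,
    shipped_ts TIMESTAMP
);

-- Rather than persisting an INTERVAL column, keep the original timestamps
-- and compute the duration on the fly when it is needed:
SELECT order_id,
       shipped_ts - created_ts AS ship_delay
FROM   orders
WHERE  shipped_ts IS NOT NULL;
```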

ZoneMaps
> Zonemaps improve the throughput and response time of SQL against large,
grouped, or continually augmented, nearly ordered data.
> Zonemaps are automatically generated, persistent, internal tables.
> They work with large, grouped, or nearly ordered DATE, TIMESTAMP, BYTEINT,
SMALLINT, INTEGER, and BIGINT datatypes.
> Zonemaps take advantage of the inherent ordering or grouping of data to reduce
the disk scans required to retrieve data for restricted-scan queries.

Statistics
> Netezza uses a cost-based optimizer.
> The more up-to-date and accurate table statistics are, the better the plans the
query optimizer will generate.
> Statistics generation should be built into ETL or ELT processing wherever possible.
> Regular monitoring should be deployed to detect out-of-date statistics.
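Building statistics into the load stream can be as simple as ending each ETL job with a GENERATE STATISTICS step (the table and column names below are illustrative):

```sql
-- Refresh optimizer statistics on the whole table after a large load:
GENERATE STATISTICS ON sales;

-- Or limit the work to the columns that drive joins and restrictions:
GENERATE STATISTICS ON sales (customer_id, sale_date);
```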

Groom / Reclaim
Why is groom important?
> An update or delete of a table row does not remove the old tuple.
> Over time, outdated or deleted tuples are of no interest to any transaction
and must be removed to free up space.
When should you reclaim?
> Groom tables that receive frequent updates or deletes.
> Groom tables if you cancel or abort a large load operation.
Groom best practices
> If you have a table whose contents are deleted completely, consider using
TRUNCATE rather than DELETE, which eliminates the need to run the groom
command.
> Build groom into the ETL processing wherever possible.
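A minimal sketch of the groom practices above (table names are illustrative):

```sql
-- Reclaim the space held by outdated and deleted tuples after heavy
-- update/delete activity:
GROOM TABLE sales;

-- When a table is emptied completely, TRUNCATE removes the rows and
-- frees their space in one step, so no groom is needed afterwards:
TRUNCATE TABLE staging_sales;
```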

ETL / ELT Guidelines

> Avoid many small inserts/updates, especially single-row inserts.
> Use a bulk load method wherever possible.
> Avoid cursor-based processing.
> Order data by the primary key, a date, or a common join column to optimize
zonemaps.
> Look to establish standard load and ETL methods (best practices) for the ETL
and load tools that you use.
> Minimize I/O between the host and the ETL server wherever possible.
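As one example of bulk loading instead of row-by-row inserts, Netezza can ingest a delimited file through a transient external table (the file path, delimiter options, and target table here are assumptions for the sketch):

```sql
-- Bulk load a CSV extract in a single streamed operation rather than
-- issuing many small INSERT statements:
INSERT INTO sales
SELECT *
FROM   EXTERNAL '/data/sales_extract.csv'
       SAMEAS sales
USING  (DELIMITER ',' SKIPROWS 1);
```

The nzload command-line utility wraps the same external-table mechanism and is the usual choice inside scripted ETL.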
