
Migration to Azure Synapse Analytics

Section 1.2 – Design and Performance for Netezza Migrations

Table of Contents
Context
Overview
Design considerations
  Migration scope
    Preparation for migration
    Choosing the workload for the initial migration
    ‘Lift and shift as-is’ vs a phased approach incorporating changes
    Use Azure Data Factory to implement a metadata-driven migration
  Design differences between Netezza and Azure Synapse
    Multiple databases vs single database and schemas
    Table considerations
    Unsupported Netezza database object types
    Netezza data type mapping
    SQL DML syntax differences
    Functions, stored procedures and sequences
  Extracting metadata and data from a Netezza environment
    Data Definition Language (DDL) generation
    Data extraction from Netezza
Performance recommendations for Netezza migrations
  Similarities in performance tuning approach concepts
  Differences in performance tuning approach


Context
This paper is one of a series of documents which discuss aspects of migrating legacy
data warehouse implementations to Azure Synapse Analytics. The focus of this
paper is on the design and performance aspects of data migrated specifically from
existing Netezza environments – other topics such as ETL, recommended migration
approach and advanced analytics in the data warehouse are covered in separate
documents. This document should be read in conjunction with the ‘Section 1 –
Design and Performance’ document which discusses the general aspects of design
and performance for migrations to Azure Synapse.


Overview
‘More than just a database’ – the Azure environment includes a comprehensive set of capabilities and tools.

Given the end of support from IBM, many existing users of Netezza data warehouse systems are now looking to take advantage of the innovations provided by newer environments (e.g. cloud, IaaS, PaaS) and to delegate tasks such as infrastructure maintenance and platform development to the cloud provider.

While there are similarities between Netezza and Azure Synapse in that both are
SQL databases designed to use massively parallel processing (MPP) techniques to
achieve high query performance on very large data volumes, there are also some
basic differences in approach:
• Legacy Netezza systems are installed on-premises, using proprietary hardware, whereas Azure Synapse is cloud-based using Azure storage and compute resources.
• Upgrading a Netezza configuration is a major task involving additional physical
hardware and a potentially lengthy database reconfiguration or dump and
reload. Since storage and compute resources are separate in the Azure
environment these can easily be scaled (upwards and downwards) independently
leveraging the elastic scalability capability.
• Azure Synapse can be paused or resized as required to reduce resource
utilization and therefore cost.
Microsoft Azure is a globally available, highly secure, scalable cloud environment
which includes Azure Synapse within an eco-system of supporting tools and
capabilities.

Azure Synapse gives best performance and price-performance in an independent benchmark.

Azure Synapse provides best-of-breed relational database performance by using techniques such as massively parallel processing (MPP) and automatic in-memory caching – the results of this approach can be seen in independent benchmarks such as the one run recently by GigaOm – see https://gigaom.com/report/data-warehouse-cloud-benchmark/ which compares Azure Synapse to other popular cloud data warehouse offerings. Customers who have already migrated to this environment have seen many benefits including:
• Improved performance and price/performance


• Increased agility and shorter time to value


• Faster server deployment and application development
• Elastic scalability – only pay for actual usage
• Improved security/compliance
• Reduced storage and Disaster Recovery costs
• Lower overall TCO and better cost control (OPEX)
To maximize these benefits it is necessary to migrate existing (or new) data and
applications to the Azure Synapse platform, and in many organizations this will
include migration of an existing data warehouse from legacy on-premise platforms
such as Netezza. At a high level, the basic process will include the following steps:

This paper looks at schema migration with a view to obtaining equivalent or better
performance of your migrated Netezza data warehouse and data marts on Azure
Synapse. The topics included in this paper apply specifically to migrations from an
existing Netezza environment.


Design considerations
Migration scope

Preparation for migration


Build an inventory of objects to be migrated and document the process.

When migrating from a Netezza environment there are some specific topics which must be taken into account in addition to the more general subjects described in the ‘Section 1 – Design and Performance’ document.

Choosing the workload for the initial migration


Legacy Netezza environments have typically evolved over time to encompass
multiple subject areas and mixed workloads. When deciding where to start on an
initial migration project it makes sense to choose an area which will be able to:
• Prove the viability of migrating to Azure Synapse by quickly delivering the benefits of the new environment
• Allow the in-house technical staff to gain relevant experience of the processes
and tools involved which can be used in migrations of other areas
• Create a template for further migration exercises which is specific to the source
Netezza environment and the current tools and processes which are already in
place
A good candidate for an initial migration from a Netezza environment which would
enable the items above is typically one that implements a BI/Analytics workload (i.e.
not an OLTP workload) with a data model that can be migrated with minimal
modifications – normally a star or snowflake schema.

In terms of size, it is important that the data volume to be migrated in the initial
exercise is large enough to demonstrate the capabilities and benefits of the Azure
Synapse environment while keeping the time to demonstrate value short – typically
in the 1-10TB range.

One possible approach for the initial migration project which will minimize the risk
and reduce the implementation time is to confine the scope of
the migration to just the data marts. This approach by definition limits the scope of
the migration and can typically be achieved within short timescales and so can be a
good starting point – however this will not address the broader topics such as ETL
migration and historical data migration as part of the initial migration project. These
would have to be addressed in later phases of the project as the migrated data mart
layer is ‘back filled’ with the data and processes required to build them.

‘Lift and shift as-is’ vs a phased approach incorporating changes


‘Lift and shift’ is a good starting point even if subsequent phases will implement changes to the data model.

Whatever the drivers and scope of the intended migration, broadly speaking there are 2 types of migration:

‘Lift and shift’
In this case the existing data model (e.g. star schema) is migrated unchanged to the new Azure Synapse platform. The emphasis here is on minimising risk


and the time taken to migrate by reducing the work that has to be done to
achieve the benefits of moving to the Azure cloud environment.

This is a good fit for existing Netezza environments where a single data mart
is to be migrated, or the data is already in a well-designed star or snowflake
schema or there are time and cost pressures to move to a more modern
cloud environment.

Phased approach incorporating modifications


For cases where a legacy warehouse has evolved over a long time it may be necessary to re-engineer it to maintain the required performance levels or to support new data (e.g. IoT streams). Migration to Azure Synapse to obtain the well-accepted benefits of a scalable cloud environment might be considered as part of the re-engineering process. This could include a change of the underlying data model (e.g. a move from an Inmon model to Data Vault).

The recommended approach for this is to initially move the existing data
model ‘as-is’ into the Azure environment then to use the performance and
flexibility of the Azure environment to apply the re-engineering changes,
leveraging the Azure capabilities where appropriate to make the changes
without impacting the existing source system.

Use Azure Data Factory to implement a metadata-driven migration


It makes sense to automate and orchestrate the migration process by making use of
the capabilities in the Azure environment. This approach also minimizes the impact
on the existing Netezza environment (which may already be running close to full
capacity).

Azure Data Factory is a cloud-based data integration service that allows creation of
data-driven workflows in the cloud for orchestrating and automating data movement
and data transformation. Using Azure Data Factory, you can create and schedule
data-driven workflows (called pipelines) that can ingest data from disparate data
stores. It can process and transform the data by using compute services such as
Azure HDInsight Hadoop, Spark, Azure Data Lake Analytics, and Azure Machine
Learning.

By creating metadata to list the data tables to be migrated and their location it is
possible to use the ADF facilities to manage the migration process.
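
As an illustration, the migration metadata could be held in a simple control table which an Azure Data Factory pipeline iterates over using its Lookup and ForEach activities. The sketch below is an assumption-based example – the schema, table and column names are hypothetical:

CREATE TABLE migration_ctl.table_inventory
(
    source_database  VARCHAR(128) NOT NULL,   -- Netezza database name
    source_table     VARCHAR(128) NOT NULL,   -- Netezza table name
    target_schema    VARCHAR(128) NOT NULL,   -- Azure Synapse schema name
    target_table     VARCHAR(128) NOT NULL,   -- Azure Synapse table name
    extract_priority INT NOT NULL,            -- order in which the pipeline processes tables
    migrated_flag    CHAR(1) DEFAULT 'N'      -- set to 'Y' once the table has been copied
);

An ADF Lookup activity can read the rows from this table and a ForEach activity can then invoke a Copy activity for each source/target pair.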

Design differences between Netezza and Azure Synapse

Multiple databases vs single database and schemas


Combine multiple databases into a single database within Azure Synapse and use schema names to logically separate the tables.

In a Netezza environment there are sometimes multiple separate databases for individual parts of the overall environment – e.g. there may be a separate database for data ingestion and staging tables, a database for the core warehouse tables and another database for data marts (sometimes called a semantic layer). Processing such as ETL/ELT pipelines may implement cross-database joins and will move data between these separate databases.


In the Azure Synapse environment there is a single database, and schemas are used
to separate the tables into logically separate groups. Therefore, the recommendation
is to use a series of schemas within the target Azure Synapse to mimic any separate
databases that will be migrated from the Netezza environment. If schemas are
already being used within the Netezza environment then it may be necessary to use
a new naming convention to move the existing Netezza tables and views to the new
environment (e.g. concatenate the existing Netezza schema and table names into the
new Azure Synapse table name and use schema names in the new environment to
maintain the original separate database names). Another option is to use SQL views
over the underlying tables to maintain the logical structures – but there are some
potential downsides to this approach:
• Views in Azure Synapse are read-only – therefore any updates to the data must
take place on the underlying base tables
• There may already be a layer (or layers) of views in existence and adding an extra layer of views might impact performance
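
As an example, a minimal sketch of the schema-based approach is shown below – the schema and table names are hypothetical and assume three original Netezza databases (STAGING, EDW and MART):

-- Create one schema per original Netezza database within the single Azure Synapse database
CREATE SCHEMA staging;
CREATE SCHEMA edw;
CREATE SCHEMA mart;

-- A cross-database Netezza statement such as
--   INSERT INTO MART..FACT_SALES SELECT sale_id, sale_date, amount FROM EDW..SALES;
-- then becomes a cross-schema statement in Azure Synapse:
INSERT INTO mart.fact_sales (sale_id, sale_date, amount)
SELECT sale_id, sale_date, amount
FROM edw.sales;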

Table considerations
Use existing indexes to give an indication of candidates for indexing in the migrated warehouse.

When migrating tables between different technologies it is generally only the raw data (and the metadata that describes it) that gets physically moved between the 2 environments. Other database elements from the source system (e.g. indexes) are not migrated as these may not be needed, or may be implemented differently within the new target environment.

However, it is important to understand where performance optimizations such as indexes have been used in the source environment as this information can give a useful indication of where performance optimization might be added in the new target environment. For example, if zone maps are frequently used by queries within the source Netezza environment, it may indicate that a non-clustered index should be created within the migrated Azure Synapse environment – but also be aware that other native performance optimization techniques (such as table replication) may be more applicable than a straight ‘like for like’ creation of indexes.

Unsupported Netezza database object types


Netezza-specific features can be replaced by Azure Synapse features.

Netezza implements some database objects that are not directly supported in Azure Synapse, but there are generally methods to achieve the same functionality within the new environment:

• Zone Maps – In Netezza zone maps are automatically created and maintained for some column types and are used at query time to restrict the amount of data to be scanned. They are created on the following column types:
   - INTEGER columns of length 8 bytes or less
   - Temporal columns (i.e. DATE, TIME, TIMESTAMP)
   - CHAR columns, if these are part of a materialized view and mentioned in the ORDER BY clause


It is possible to find out which columns have zone maps by using the
nz_zonemap utility (part of the NZ Toolkit).

Azure Synapse does not include zone maps, but similar results can be
achieved by using other (user-defined) index types and/or partitioning.

• Clustered Base Tables (CBT) – In Netezza, CBTs are most commonly used for fact tables that have billions of records. Scanning such a huge table requires a lot of processing time, as a full table scan could be needed to get the relevant records. Organizing records in a CBT allows Netezza to group records in the same or nearby extents, and this process also creates zone maps that improve performance by reducing the amount of data to be scanned.
In Azure Synapse a similar effect can be achieved by use of partitioning and/or use of other indexes.

• Materialized views – Netezza supports materialized views and recommends that one (or more) of these is created over large tables that have many columns where only a few of those columns are regularly used in queries. Materialized views are automatically maintained by the system when data in the base table is updated.
As of May 2019, Microsoft has announced that Azure Synapse will support materialized views with the same functionality as Netezza – this feature is now available in preview (see the example below).
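
As an illustration, a minimal sketch of a materialized view in Azure Synapse (using hypothetical table and column names) might look as follows:

CREATE MATERIALIZED VIEW dbo.mv_sales_by_store
WITH (DISTRIBUTION = HASH(store_id))
AS
SELECT store_id,
       COUNT_BIG(*)     AS sales_count,   -- aggregate functions are required in a Synapse materialized view definition
       SUM(sale_amount) AS total_sales
FROM dbo.fact_sales
GROUP BY store_id;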

Netezza data type mapping


Assess the impact of unsupported data types as part of the preparation phase.

Most Netezza data types have a direct equivalent in Azure Synapse – below is a table which shows these data types together with the recommended approach for mapping them.

Netezza Data Type                  Azure Synapse Data Type
BIGINT                             BIGINT
BINARY VARYING(n)                  VARBINARY(n)
BOOLEAN                            BIT
BYTEINT                            TINYINT
CHARACTER VARYING(n)               VARCHAR(n)
CHARACTER(n)                       CHAR(n)
DATE                               DATE
DECIMAL(p,s)                       DECIMAL(p,s)
DOUBLE PRECISION                   FLOAT
FLOAT(n)                           FLOAT(n)
INTEGER                            INT
INTERVAL                           INTERVAL data types are not currently directly
                                   supported in Azure Synapse but can be calculated
                                   using temporal functions such as DATEDIFF
MONEY                              MONEY
NATIONAL CHARACTER VARYING(n)      NVARCHAR(n)
NATIONAL CHARACTER(n)              NCHAR(n)
NUMERIC(p,s)                       NUMERIC(p,s)
REAL                               REAL
SMALLINT                           SMALLINT
ST_GEOMETRY(n)                     Spatial data types such as ST_GEOMETRY are not
                                   currently supported in Azure Synapse, but the data
                                   could be stored as VARCHAR or VARBINARY
TIME                               TIME
TIME WITH TIME ZONE                DATETIMEOFFSET
TIMESTAMP                          DATETIME

There are 3rd party vendors who offer tools and services to automate migration
including the mapping of data types as described above. Also, if a 3rd party ETL tool
such as Informatica or Talend is already in use in the Netezza environment, these can
implement any required data transformations.
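
To illustrate how the mappings are applied, the sketch below shows a hypothetical Netezza table definition and an equivalent Azure Synapse definition (the table, columns and distribution choice are assumptions for the example):

-- Hypothetical Netezza definition:
--   CREATE TABLE sales_fact
--   (
--       sale_id    BIGINT,
--       sale_date  DATE,
--       store_code NATIONAL CHARACTER VARYING(10),
--       amount     DOUBLE PRECISION,
--       loaded_at  TIMESTAMP
--   )
--   DISTRIBUTE ON (sale_id);

-- Equivalent Azure Synapse definition using the data type mappings above:
CREATE TABLE dbo.sales_fact
(
    sale_id    BIGINT,
    sale_date  DATE,
    store_code NVARCHAR(10),
    amount     FLOAT,
    loaded_at  DATETIME
)
WITH (DISTRIBUTION = HASH(sale_id), CLUSTERED COLUMNSTORE INDEX);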

SQL DML syntax differences


There are a few differences in SQL Data Manipulation Language (DML) syntax
between Netezza SQL and Azure Synapse to be aware of when migrating:

• STRPOS – in Netezza the STRPOS function returns the position of a substring within a string – the equivalent in Azure Synapse is the CHARINDEX function. The order of the arguments is reversed, so that in Netezza
SELECT STRPOS('abcdef','def')…

would be replaced by:

SELECT CHARINDEX('def','abcdef')…

• AGE – Netezza supports the AGE operator to give the interval between 2 temporal values (timestamps, dates, etc.) – e.g.
SELECT AGE('23-03-1956','01-01-2019') FROM…

This can be achieved in Azure Synapse by using DATEDIFF (note also the date representation sequence):

SELECT DATEDIFF(day, '1956-03-23', '2019-01-01') FROM…

• NOW() – Netezza uses NOW() to return the current timestamp; the equivalent in Azure Synapse is CURRENT_TIMESTAMP.
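
For example, a Netezza query such as:

SELECT NOW();

would be rewritten in Azure Synapse as:

SELECT CURRENT_TIMESTAMP;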

Functions, stored procedures and sequences


Assess the number and type of non-data objects to be migrated as part of the preparation phase.

When migrating from a mature legacy data warehouse environment such as Netezza there are often elements other than simple tables and views which need to be migrated to the new target environment. Examples of this in Netezza are functions, stored procedures and sequences.

As part of the preparation phase, an inventory of these objects which are to be migrated should be created and the method of handling them defined, with an appropriate allocation of resources assigned in the project plan.


It may be that there are facilities in the Azure environment that replace the
functionality implemented as functions or stored procedures in the Netezza
environment – in which case it is generally more efficient to use the built-in Azure
facilities rather than re-coding the Netezza functions.

3rd party vendors offer tools and services that can automate the migration of these – see for example the Attunity or Wherescape migration products.

See below for more information on each of these elements:

Functions
In common with most database products, Netezza supports system functions
and also user-defined functions within the SQL implementation. When
migrating to another database platform such as Azure Synapse common
system functions are generally available and can be migrated without
change. Some system functions may have slightly different syntax but the
required changes can be automated in this case.

For system functions where there is no equivalent, or for arbitrary user-defined functions, these may need to be re-coded using the language(s) available in the target environment. Netezza user-defined functions are coded in nzlua or C++ whereas Azure Synapse uses the popular Transact-SQL language for implementation of user-defined functions.
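
As a simple illustration, a scalar function that might previously have been written in nzlua could be re-coded in T-SQL along the following lines (the function and its logic are hypothetical):

CREATE FUNCTION dbo.fn_full_name
(
    @first_name NVARCHAR(50),
    @last_name  NVARCHAR(50)
)
RETURNS NVARCHAR(101)
AS
BEGIN
    -- Concatenate the two name parts with a single space between them
    RETURN @first_name + N' ' + @last_name;
END;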

Stored procedures
Most modern database products allow for procedures to be stored within
the database – in Netezza’s case the NZPLSQL language is provided for this
purpose. NZPLSQL is based on Postgres PL/pgSQL. A stored procedure
typically contains SQL statements and some procedural logic and may return
data or a status.

Azure Synapse also supports stored procedures using T-SQL – so if there are stored procedures to be migrated they must be recoded accordingly.
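
A minimal sketch of a recoded stored procedure in T-SQL is shown below – the procedure, table and column names are hypothetical:

CREATE PROCEDURE dbo.usp_update_run_status
    @run_id INT,
    @status NVARCHAR(20)
AS
BEGIN
    -- Record the latest status of an ETL run in a control table
    UPDATE dbo.etl_run_control
    SET run_status   = @status,
        last_updated = CURRENT_TIMESTAMP
    WHERE run_id = @run_id;
END;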

Sequences
In Netezza a sequence is a named database object created via CREATE SEQUENCE that can provide unique values via the NEXT VALUE FOR method. These can be used to generate unique numbers for use as surrogate key values for primary keys.

Within Azure Synapse there is no CREATE SEQUENCE, so sequences are handled via the use of IDENTITY columns or by using SQL code to create the next sequence number in a series.
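
For example, a surrogate key that was previously populated from a Netezza sequence could instead be generated by an IDENTITY column – a minimal sketch with hypothetical names:

CREATE TABLE dbo.dim_customer
(
    customer_key  INT IDENTITY(1,1) NOT NULL,  -- surrogate key generated automatically
    customer_code VARCHAR(20) NOT NULL,
    customer_name NVARCHAR(100)
)
WITH (DISTRIBUTION = ROUND_ROBIN, CLUSTERED COLUMNSTORE INDEX);

Note that IDENTITY values in Azure Synapse are unique but are not guaranteed to be contiguous, so any logic that relies on gap-free numbering will need a different approach.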

Extracting metadata and data from a Netezza environment

Data Definition Language (DDL) generation


It is possible to edit existing Netezza CREATE TABLE and CREATE VIEW scripts to
create the equivalent definitions (with modified data types if necessary as described
above) – typically this involves removing or modifying any extra Netezza-specific
clauses (e.g. ORGANIZE ON).

However, all the information that specifies the current definitions of tables and views
within the existing Netezza environment is maintained within system catalog tables –
this is the best source of this information as it is bound to be up to date and
complete. (Be aware that user-maintained documentation may not be in sync with
the current table definitions).

This information can be accessed via utilities such as nz_ddl_table and can be used to
generate the CREATE TABLE DDL statements which can then be edited for the
equivalent tables in Azure Synapse.

3rd party migration and ETL tools also use the catalog information to achieve the
same result.
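
As an alternative to the nz_ddl_table utility, the table and column definitions can also be read directly from the Netezza catalog views – a rough sketch is shown below, assuming the standard _v_table and _v_relation_column views:

-- List every table with its columns, data types and column order
SELECT t.tablename,
       c.attname,        -- column name
       c.format_type,    -- data type as reported by the catalog
       c.attnum          -- column position within the table
FROM _v_table t
JOIN _v_relation_column c
  ON c.name = t.tablename
ORDER BY t.tablename, c.attnum;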

Data extraction from Netezza


Use Netezza external tables for the most efficient data extract.

The raw data to be migrated from existing Netezza tables can be extracted to flat delimited files using standard Netezza utilities such as nzsql, nzunload and via external tables. These files can be compressed using gzip and uploaded to Azure Blob Storage via AzCopy or by using Azure data transport facilities such as Azure Data Box.

Generally during a migration exercise it is important to extract the data as efficiently as possible, and the recommended approach for Netezza is to use external tables as this is the fastest method. Multiple extracts can be performed in parallel to maximize the throughput for data extraction.

A simple example of an external table extract is shown below:

CREATE EXTERNAL TABLE '/tmp/export_tab1.csv' USING (DELIM ',') AS
SELECT * FROM <TABLENAME>;

If sufficient network bandwidth exists, data can be extracted directly from an on-premises Netezza system into Azure Synapse tables or Azure Blob Storage by using Azure Data Factory processes or 3rd party data migration or ETL products.

Recommended data formats for the extracted data are delimited text files (also called Comma Separated Values or CSV), Optimized Row Columnar (ORC) files, or Parquet files.

For more detailed information on the process of migrating data and ETL from a
Netezza environment see the associated document ‘Section 2.1. Data Migration ETL
and Load from Netezza’.


Performance recommendations for Netezza migrations


The associated document ‘Section 1 – Design and Performance’ gives general
information and guidelines about use of performance optimization techniques for
Azure Synapse. This section adds specific recommendations for use when migrating from a Netezza environment.

Similarities in performance tuning approach concepts


Many Netezza tuning concepts hold true for Azure Synapse.

When moving from a Netezza environment, many of the performance tuning concepts for Azure Synapse will be very familiar. For example:

• Using data distribution to co-locate data to be joined onto the same processing
node
• Using the smallest data type for a given column will save storage space and
accelerate query processing
• Ensuring data types of columns to be joined are identical will optimize join
processing by reducing the need to transform data for matching
• Ensuring statistics are up to date will help the optimizer produce the best execution plan, as shown in the example below
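
As an example of the last point, statistics in Azure Synapse can be created and refreshed with standard T-SQL – a minimal sketch with hypothetical table and column names:

-- Create single-column statistics on a commonly filtered column
CREATE STATISTICS stat_fact_sales_sale_date
ON dbo.fact_sales (sale_date);

-- Refresh all statistics on the table after a large data load
UPDATE STATISTICS dbo.fact_sales;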

Differences in performance tuning approach


Familiarity with Azure Synapse tuning options is an early priority in a migration exercise.

This section highlights lower level implementation differences between Netezza and Azure Synapse for performance tuning.
Data distribution options
CREATE TABLE statements in both Netezza and Azure Synapse allow for
specification of a distribution definition – via DISTRIBUTE ON for Netezza and
DISTRIBUTION = in Azure Synapse.

Compared to Netezza, Azure Synapse provides an additional way to achieve ‘local joins’ for small table-large table joins (typically dimension table to fact table in a star schema model): replicating the smaller dimension table across all nodes, thereby ensuring that any value of the join key of the larger table will have a matching dimension row locally available. The overhead of replicating the dimension tables is relatively low provided the tables are not very large – in which case the hash distribution approach as described above is more appropriate.
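
A minimal sketch of both options is shown below – the table and column names are hypothetical:

-- Netezza: hash distribution of a fact table
--   CREATE TABLE fact_sales (...) DISTRIBUTE ON (customer_key);

-- Azure Synapse: equivalent hash distribution for the large fact table
CREATE TABLE dbo.fact_sales
(
    customer_key INT NOT NULL,
    sale_date    DATE,
    sale_amount  DECIMAL(18,2)
)
WITH (DISTRIBUTION = HASH(customer_key), CLUSTERED COLUMNSTORE INDEX);

-- Azure Synapse: replicate a small dimension table to every compute node
CREATE TABLE dbo.dim_store
(
    store_key  INT NOT NULL,
    store_name NVARCHAR(100)
)
WITH (DISTRIBUTION = REPLICATE, CLUSTERED COLUMNSTORE INDEX);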

Data indexing
Azure Synapse provides a number of user definable indexing options, but
these are different in operation and usage to the system managed zone
maps in Netezza. Understand the different indexing options as described in
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-index

Existing system managed zone maps within the source Netezza environment can however provide a useful indication of how the data is currently used and of candidate columns for indexing within the Azure Synapse environment.
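
As a sketch (with hypothetical names), a zone-mapped date column that Netezza queries frequently filter on might translate into an additional non-clustered index on the migrated table:

CREATE TABLE dbo.fact_orders
(
    order_id   BIGINT NOT NULL,
    order_date DATE,
    amount     DECIMAL(18,2)
)
WITH (DISTRIBUTION = HASH(order_id), CLUSTERED COLUMNSTORE INDEX);

-- Secondary non-clustered index on the frequently filtered date column
CREATE INDEX ix_fact_orders_order_date
ON dbo.fact_orders (order_date);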

Data partitioning
In an enterprise data warehouse fact tables can contain many billions of rows
and partitioning is a way to optimize the maintenance and querying of these
tables by splitting them into separate parts to reduce the amount of data
processed. The partitioning specification for a table is defined in the CREATE
TABLE statement.

Only 1 field per table can be used for partitioning, and this is frequently a date field as many queries will be filtered by date or a date range. Note that it is possible to change the partitioning of a table after initial load if necessary by recreating the table with the new partitioning using the CREATE TABLE AS (or CTAS) statement. See https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-partition for a detailed discussion of partitioning in Azure Synapse.
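
A minimal sketch of repartitioning an existing table with CTAS is shown below – the table, column and boundary values are hypothetical:

CREATE TABLE dbo.fact_sales_repartitioned
WITH
(
    DISTRIBUTION = HASH(customer_key),
    CLUSTERED COLUMNSTORE INDEX,
    PARTITION (sale_date RANGE RIGHT FOR VALUES
        ('2017-01-01', '2018-01-01', '2019-01-01'))
)
AS
SELECT * FROM dbo.fact_sales;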

PolyBase for data loading


PolyBase is the most efficient method for loading large amounts of data into the warehouse as it is able to leverage parallel loading streams.
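
A rough sketch of a PolyBase load from files extracted out of Netezza is shown below – the storage account, locations and table definitions are assumptions for the example, and in practice a database scoped credential would normally also be required for authentication:

-- External data source pointing at the Azure Blob Storage container holding the extracts
CREATE EXTERNAL DATA SOURCE netezza_extracts
WITH (TYPE = HADOOP,
      LOCATION = 'wasbs://extracts@mystorageaccount.blob.core.windows.net');

-- File format matching the delimited files produced by the Netezza export
CREATE EXTERNAL FILE FORMAT csv_format
WITH (FORMAT_TYPE = DELIMITEDTEXT,
      FORMAT_OPTIONS (FIELD_TERMINATOR = ','));

-- External table over the exported files
CREATE EXTERNAL TABLE dbo.ext_sales_fact
(
    sale_id   BIGINT,
    sale_date DATE,
    amount    DECIMAL(18,2)
)
WITH (LOCATION = '/sales_fact/',
      DATA_SOURCE = netezza_extracts,
      FILE_FORMAT = csv_format);

-- Load into an internal table with CTAS, which runs as a parallel PolyBase load
CREATE TABLE dbo.sales_fact_loaded
WITH (DISTRIBUTION = HASH(sale_id), CLUSTERED COLUMNSTORE INDEX)
AS
SELECT * FROM dbo.ext_sales_fact;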

Use resource classes for workload management


Azure Synapse uses resource classes to manage workloads – in general large
resource classes provide better individual query performance while smaller
resource classes enable higher levels of concurrency. Utilization can be
monitored via Dynamic Management Views (DMVs) to ensure that the
appropriate resources are being utilised efficiently.
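
For example, a user performing large loads could be assigned to a bigger static resource class, and activity can then be observed through the DMVs – a minimal sketch with a hypothetical user name:

-- Assign the load user to the 'largerc' static resource class (assignment is via role membership)
EXEC sp_addrolemember 'largerc', 'load_user';

-- Check running and queued requests, including the resource class used by each
SELECT request_id, status, resource_class, command
FROM sys.dm_pdw_exec_requests
WHERE status IN ('Running', 'Suspended');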

Copyright © Microsoft Corporation, 2019, All Rights Reserved