Top 10 Methods To Improve ETL Performance Using SSIS
Extraction, Transformation, and Load (ETL) is the backbone of any data warehouse. In the data
warehouse world, data is managed by the ETL process, which consists of three steps: Extraction,
pulling/acquiring data from the sources; Transformation, changing the data into the required format; and
Load, pushing the data to the destination, generally a data warehouse or a data mart.
We will discuss how you can easily improve ETL performance, or design a high-performing ETL
system, with the help of SSIS. For a better understanding, I will divide the ten methods into two
categories: first, SSIS package design-time considerations, and second, configuring the property
values of the components available in the SSIS package.
#2, Extract only the required data; pull only the required set of data from any table or file. Avoid the
tendency to pull everything available at the source on the assumption that you will use it in the future; it
eats up network bandwidth, consumes system resources (I/O and CPU), requires extra storage, and
degrades the overall performance of the ETL system.
If your ETL system is really dynamic in nature and your requirements change frequently, it would be
better to consider other design approaches, such as metadata-driven ETL, rather than designing to pull
everything in at one time.
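For example, with an OLE DB Source it is better to switch the access mode to a SQL command and trim both the columns and the rows. A minimal sketch, with hypothetical table and column names:

    -- Instead of "SELECT *" (or the "Table or view" access mode),
    -- select only the columns and rows the ETL actually needs:
    SELECT  OrderID,
            CustomerID,
            OrderDate,
            OrderAmount
    FROM    dbo.SalesOrder
    WHERE   OrderDate >= '2014-01-01'   -- e.g. an incremental load window
      AND   OrderDate <  '2014-02-01';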
#3, Avoid the use of asynchronous transformation components; SSIS is a rich tool with a set of
transformation components for achieving complex tasks during ETL execution, but at the same time
these components cost you a lot if they are not used properly.
Two categories of transformation components are available in SSIS: synchronous and asynchronous.
Synchronous transformations are those components which process each row and push it down to the
next component/destination; they use the already allocated buffer memory and don't require additional
memory, because there is a direct relation between the input and output rows and each row fits
completely into the allocated memory. Components like Lookup, Derived Column, and Data
Conversion fall into this category.
Asynchronous transformations are those components which first store data in buffer memory and then
process operations like Sort and Aggregate. Additional buffer memory is required to complete the task,
and until that buffer memory is available the component holds the entire data set in memory and
blocks the transaction; these are also known as blocking transformations. To complete the task, the
SSIS engine (the data flow pipeline engine) allocates extra buffer memory, which is additional
overhead for the ETL system. Components like Sort, Aggregate, Merge, and Merge Join fall into this
category.
Overall, you should avoid asynchronous transformations, but if you get into a situation where you have
no other choice then you must be aware of how to deal with the available property values of these
components. I'll discuss them later in this article.
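If you only need a Sort because a downstream Merge or Merge Join expects sorted input, a common alternative is to sort at the source and tell SSIS the data is already sorted. A minimal sketch, with hypothetical table and column names:

    -- Sort at the source instead of using the blocking Sort transformation;
    -- SQL Server (and its indexes) can usually do this far more cheaply.
    SELECT  CustomerID,
            CustomerName,
            Region
    FROM    dbo.Customer
    ORDER BY CustomerID;

After this, open the source's Advanced Editor and set IsSorted = True on the output and SortKeyPosition = 1 on CustomerID, so a downstream Merge Join accepts the input without an SSIS Sort.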
#4, Make optimum use of events in event handlers; to track package execution progress or take any
other appropriate action on a specific event, SSIS provides a set of events. Events are very useful, but
excessive use of events adds overhead to ETL execution. Here, you need to validate the need for
each event handler before enabling it in the SSIS package.
#5, Be aware of the destination table schema when working with a huge volume of data. You need to
think twice when you have to pull a huge volume of data from the source and push it into a data
warehouse or data mart. You may see performance issues when trying to push huge data volumes into
the destination with a combination of insert, update and delete (DML) operations, as the destination
table may have clustered or non-clustered indexes, which can cause a lot of data shuffling in memory
due to the DML operations.
If the ETL process has performance issues due to a huge number of DML operations on an indexed
table, you need to make appropriate changes in the ETL design, like dropping the existing indexes in
the pre-execution phase and re-creating them all in the post-execution phase. You may find other,
better alternatives to resolve the issue depending on your situation.
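As a sketch of that design, the pre- and post-execution phases can be implemented as Execute SQL Tasks around the data flow; the index and table names here are hypothetical:

    -- Pre-execution phase (Execute SQL Task): drop the index before the heavy DML.
    IF EXISTS (SELECT 1 FROM sys.indexes
               WHERE name = 'IX_FactSales_CustomerID'
                 AND object_id = OBJECT_ID('dbo.FactSales'))
        DROP INDEX IX_FactSales_CustomerID ON dbo.FactSales;

    -- ... the data flow task loads dbo.FactSales here ...

    -- Post-execution phase (Execute SQL Task): re-create the index.
    CREATE NONCLUSTERED INDEX IX_FactSales_CustomerID
        ON dbo.FactSales (CustomerID);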
#6, Control parallel execution by configuring two properties. MaxConcurrentExecutables is a
package-level property with a default value of -1, which means the number of logical processors plus
two; it specifies the maximum number of executables (tasks) that can run concurrently within the
package.

[Screenshot: Package properties]

EngineThreads is a data flow task level property and has a default value of 10, which specifies the
total number of threads that can be created for executing the data flow task.

[Screenshot: Data Flow Task properties]

You can change the default values of these properties to match your ETL needs and the available
resources.
#7, Configure the Data access mode option in the OLE DB Destination. In the SSIS data flow task we
can find the OLE DB Destination, which provides a couple of options for pushing data into the
destination table under Data access mode: first, the "Table or view" option, which inserts one row at a
time; second, the "Table or view - fast load" option, which internally uses a bulk insert statement to
send data into the destination table and generally provides much better performance. Once you
choose the "fast load" option, it gives you more control over the destination table's behavior during the
data push operation, via options like Keep identity, Keep nulls, Table lock and Check constraints.
[Screenshot: OLE DB Destination Editor]
It’s highly recommended that you use the fast load option to push data into the destination table to
improve ETL performance.
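For intuition, the fast load path behaves much like a hand-written T-SQL bulk load with the same options. A rough analogue (the file path and table name are hypothetical, and SSIS actually issues an INSERT BULK through the OLE DB provider rather than this exact statement):

    BULK INSERT dbo.FactSales
    FROM 'C:\etl\factsales.dat'
    WITH (
        TABLOCK,            -- corresponds to the "Table lock" option
        CHECK_CONSTRAINTS,  -- corresponds to the "Check constraints" option
        KEEPIDENTITY,       -- corresponds to the "Keep identity" option
        KEEPNULLS           -- corresponds to the "Keep nulls" option
    );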
#8, Configure Rows per batch and Maximum insert commit size in the OLE DB Destination. These two
settings are important for controlling the performance of tempdb and the transaction log, because with
their default values all the data is pushed into the destination table in a single batch and a single
transaction. This requires excessive use of tempdb and the transaction log, which turns into an ETL
performance issue because of excessive consumption of memory and disk storage.
[Screenshot: OLE DB Destination Editor]
To improve ETL performance you can put a positive integer value in both of these properties, based on
the anticipated data volume, which will help divide the whole set of data into multiple batches; the data
in each batch can then be committed to the destination table as the specified commit size is reached.
This avoids excessive use of tempdb and the transaction log, which helps to improve ETL
performance.
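While tuning these two values it helps to watch transaction log pressure directly. Two standard ways to do so, run in the destination database during a test load:

    -- Log size and percentage used, per database:
    DBCC SQLPERF (LOGSPACE);

    -- Log usage for the current database (SQL Server 2012 and later):
    SELECT  total_log_size_in_bytes / 1048576.0 AS log_size_mb,
            used_log_space_in_bytes / 1048576.0 AS used_log_mb,
            used_log_space_in_percent
    FROM    sys.dm_db_log_space_usage;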
#9, Use the SQL Server Destination in a data flow task. When you want to push data into a local SQL
Server database, it is highly recommended that you use the SQL Server Destination, as it provides
many benefits that overcome the limitations of other options and help you improve ETL performance.
For example, it uses the bulk insert feature that is built into SQL Server but still gives you the option to
apply transformations before loading data into the destination table. Apart from that, it gives you the
option to enable or disable the firing of triggers when loading data, which also helps to reduce ETL
overhead.
[Screenshot: SQL Server Destination data flow component]
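The SQL Server Destination exposes trigger firing as a simple option; with other destinations you can approximate the same effect using pre- and post-execution Execute SQL Tasks. A minimal sketch, with a hypothetical table name:

    -- Pre-execution phase: stop triggers from firing for each loaded row.
    DISABLE TRIGGER ALL ON dbo.FactSales;

    -- ... the data flow task loads dbo.FactSales here ...

    -- Post-execution phase: restore normal trigger behavior.
    ENABLE TRIGGER ALL ON dbo.FactSales;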
#10, Avoid implicit typecasting. When data comes from a flat file, the flat file connection manager treats
all columns as string (DT_STR) data, including numeric columns. As you know, SSIS uses buffer
memory to store the set of data and applies the required transformations before pushing the data into
the destination table. When all columns are string data types, each row requires more space in the
buffer, which reduces ETL performance.
To improve ETL performance you should convert all the numeric columns into the appropriate data
type and avoid implicit conversion, which will help the SSIS engine to accommodate more rows in a
single buffer.
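Inside the data flow, the fix is to set proper data types on the flat file connection manager's columns (or use a Data Conversion transformation early). Another common pattern is to land the file in an all-string staging table and cast explicitly on the way to the final table, so the conversion happens once, in the database. A minimal sketch with hypothetical table and column names:

    -- Move rows from an all-varchar staging table with explicit casts,
    -- rather than relying on implicit conversion at insert time:
    INSERT INTO dbo.FactSales (OrderID, OrderAmount, OrderDate)
    SELECT  CAST(OrderID     AS INT),
            CAST(OrderAmount AS DECIMAL(12, 2)),
            CAST(OrderDate   AS DATE)
    FROM    stg.FactSalesRaw;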
Summary
In this article we explored how ETL performance can be controlled and improved at various points in
the process. These are 10 common ways to improve ETL performance; there may be more methods,
depending on the scenario, through which performance can be improved.
Overall, the categorization helps you identify how to handle each situation. If you are in the design
phase of a data warehouse then you need to concentrate on both categories, but if you're supporting a
legacy system then first work closely on the second category.