SSIS Interview Questions: Control Flow
Q95. While running an SSIS package, it went into a hung state after 15 minutes of execution. How do you
troubleshoot?
There are three common reasons why SSIS execution hangs:
Resource Bottleneck: Memory / CPU / IO / Network
Blocking / Deadlock: Blocking happens at the database level, or while accessing a file or reading/writing
variables from a Script Task.
Poorly performing query: If SSIS stops at an Execute SQL Task, look at the query used inside the task and
tune it.
Looking through the above aspects one can identify the issue and, based on that, provide a
resolution. If everything looks good but SSIS is still hung, check that the latest service pack is
applied; if that also passes, collect a hung dump file using ADPlus and contact the Microsoft support
center.
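If blocking is suspected, a quick check on the target instance is to query the DMVs (a sketch; it requires the VIEW SERVER STATE permission):
-- Sessions that are currently blocked, with the statement they are running
SELECT r.session_id, r.blocking_session_id, r.wait_type, r.wait_time, t.text
FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) t
WHERE r.blocking_session_id <> 0;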
Q96. SSIS 2008 uses all available RAM, and after the package completes the memory is not released?
This is not actually a problem. You have allowed SQL Server to use x amount of memory, so it does.
SQL Server takes that memory as required, up to the limit set, but it does not release it. It can
respond to requests from the OS (read up on the fine details), but by default once it has got hold of
some memory it will keep it, even if it is not currently using it. The simple reason is that finding and
taking hold of memory is quite expensive, so once SQL Server has the memory it keeps it, and any
subsequent operations that need memory will have it available much faster. This makes perfect sense
when you remember that SQL Server is a service application and more often than not runs on a
dedicated machine.
Q97. What is the “RunInOptimizedMode” property? How do you set it?
If this property is set to True, the SSIS engine ignores unused/unmapped columns, meaning it
does not allocate memory to store data for those columns. During the compilation phase the SSIS
engine identifies which source columns are used across the package; any columns that are neither
used nor mapped to the destination are simply ignored.
We can set this property at two levels, “Project Level” and “Package Level”.
Project Level: From project properties → Debugging → RunInOptimizedMode. By default “FALSE”.
Package Level: Found in the Data Flow Task properties. By default “TRUE”.
Q98. Does using the “RowCount” transformation affect package performance?
The Row Count component is a synchronous component and it doesn't do anything particularly
resource intensive, meaning the performance degradation of your package should be negligible.
We use this component to capture the number of inserts, deletes and updates from each data
flow, and then, using the “OnPostExecute” event, this information is written to a SQL Server table
(see the example below).
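For example, the “OnPostExecute” Execute SQL Task can persist the counts with a parameterized statement like the one below; the table and variable names are illustrative:
-- SQLStatement of the OnPostExecute Execute SQL Task (OLE DB connection)
INSERT INTO dbo.PackageRowCounts (PackageName, RowsInserted, LoggedAt)
VALUES (?, ?, GETDATE());
-- Parameter Mapping: System::PackageName -> 0, User::InsertCount -> 1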
Q99. An SSIS 2008 package crashed due to low memory. How do you resolve low-memory
issues with an SSIS package?
1. Add more memory to the physical machine
2. Run the SSIS package on a computer that is not running an instance of SQL Server
3. When SSIS and the SQL instance are on the same machine, balance the memory allocated to the SQL
Server instance using the “max server memory” option (see the example below)
4. Run SSIS package components in series instead of in parallel
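For point 3, a sketch of capping the instance with sp_configure; the 12288 MB value is only an illustration, size it for your own server:
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 12288;
RECONFIGURE;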
Q100. How to identify the SSIS processes?
SSIS run-time processes include the DTExec.exe process and the DTSHost.exe process.
Q101. How do you let a container continue to run even when a task inside it fails? Suppose
you have an application where we need to loop through a log table based on IDs and load
data into the destination. In this scenario some of the tasks in the Foreach Loop container
may fail. But the requirement is that even though an inner task fails, we should process the
other sources that are available to us.
We can do this by setting the “Propagate” property of the task / container to “False”. It means that
the loop or sequence container ignores the failure of an internal task.
Assume we have designed a Foreach Loop container with a Data Flow Task. As per our requirement the
DFT is loading 100 files into the database; if the DFT fails to load the 47th file, it should skip the error and
continue loading from the 48th file.
Steps to accomplish this are:
Select the Data Flow Task and go to event handler
Enable the On Error Event handler.
In the Event Handler tab, click on the “Show System Variables”.
Now select the “Propagate” property and change its value to “False”.
This ensures that the parent container, i.e. the Foreach Loop, will not know about the error in the child
task.
If the Foreach Loop container has more than one task, instead of setting the property on each of
these tasks, add them all to a Sequence container and change the “Propagate” property of the
Sequence container.
Q102. What is the ForceExecution set of properties on SSIS components?
ForceExecution is a property group on control flow elements in SSIS. If it is enabled on an element,
the SSIS engine forces the execution result as per the given parameters. In other words, we can use
these properties to control the execution result of any control flow element.
ForceExecutionValue: True or False
ForcedExecutionValueType: <data type>
ForcedExecutionValue: <value>; we usually give 1 to make sure it's true.
Q103. How to improve the performance of an SSIS package?
1- Utilize parallelism: It is easy to utilize parallelism in SSIS. All you need to do is recognize which
Data Flow Tasks (DFTs) can be started at the same time and set the control flow constraints of your
package so that they can all run simultaneously.
2- Synchronous vs. Asynchronous components: A synchronous transformation of SSIS takes a buffer,
processes the buffer, and passes the result through without waiting for the next buffer to come in.
On the other hand, an asynchronous transformation needs to process all of its input data before it can
produce any output. This can cause serious performance issues when the size of the input data to
the asynchronous transformation is too big to fit into memory and has to be spilled to disk at
multiple stages.
3- Execution tree: An execution tree starts where a buffer starts and ends where the same buffer
ends. These execution trees specify how buffers and threads are allocated in the package. Each tree
creates a new buffer and may execute on a different thread. When a new buffer is created such as
when a partially blocking or blocking transformation is added to the pipeline, additional memory is
required to handle the data transformation; however, it is important to note that each new tree may
also give you an additional worker thread.
4- OLE DB Command transformation: OLE DB Command is a row-by-row transformation, meaning
that it runs its command once for each of its input rows. This makes it very slow when
the number of rows goes up. The solution for boosting performance is to stage the data into a
temporary table and use an Execute SQL Task outside the DFT.
5- SQL Server Destination vs. OLE DB Destination: There are multiple reasons to use OLE DB
Destination and not SQL Server Destination:
OLE DB Destination is mostly faster,
OLE DB Destination is a lot clearer when it fails (The error message is more helpful),
SQL Server Destination works only when SSIS is installed on the destination server.
6- Change Data Capture (CDC): Reduce the amount of data to be transferred as much as you can,
and do it as close to the source as you can. A “Modified On” column on the source
table(s) helps a lot in this case.
7- Slowly Changing Dimension (SCD) transformation: There is only one piece of advice about SSIS's
Slowly Changing Dimension transformation, and that is: get rid of it! The reasons are:
It doesn’t use any cached data, and goes to the data source every single time it is called,
It uses many OLE DB Command transformations,
Fast Data Load is off by default on its OLE DB Destination.
8. Choose the best split of work between SQL and SSIS when designing a data flow: Remember SSIS is
good at row-by-row operations whereas SQL is not. So, depending on the situation, design the data flow
using DFT components instead of executing a query using an “Execute SQL Task”.
9. Use queries for selecting data rather than selecting a table and checking off the columns you
want. This reduces the initial record set before SSIS gets it, rather than pulling fields that are then ignored.
10. Carefully deal with your connections. By default, your connection manager will connect to the
database as many times as it wants to. You can set the RetainSameConnection property so it will
only connect once. This also allows you to manage transactions using an Execute SQL Task and BEGIN
TRAN / COMMIT TRAN statements, avoiding the overhead of DTC (see the sketch after this list).
11. While running the package within BIDS, ensure the package is set to run in optimized mode.
12. While loading data into destination tables, it's helpful to use the “Fast Load” option.
13. Wherever possible, consider aggregating and (un)pivoting in SQL Server instead of doing it in the
SSIS package; SQL Server outperforms Integration Services in these tasks.
14. Avoid manipulating large datasets using T-SQL statements. All T-SQL statements cause changed
data to be written to the transaction log, even if you use the Simple Recovery Model.
15. For large datasets, do data sorts at the source if possible.
16. Use the SQL Server Destination if you know your package is going to run on the destination
server, since it offers roughly 15% performance increase over OLE DB because it shares memory with
SQL Server.
17. Increase the network packet size to 32767 on your database connection managers. This allows
large volumes of data to move faster from the source servers.
18. If using Lookup transforms, experiment with cache sizes – between using a Cache connection or
Full Cache mode for smaller lookup datasets, and Partial / No Cache for larger datasets. This can free
up much needed RAM.
19. Make sure the “Table Lock” option is used while loading very large datasets; a bulk insert happens
only when the conditions below are satisfied:
a) The destination table is empty
b) The destination database recovery model is either Simple or Bulk-logged
c) The Table Lock option is specified
20. Experiment with the DefaultBufferSize and DefaultBufferMaxRows properties. You'll need to
monitor your package's “Buffers Spooled” performance counter using Perfmon.exe, and adjust the
buffer sizes upwards until you see buffers being spooled (paged to disk), then back off a little.
21. Do all set-based operations, aggregations and sorts at the source or destination using T-SQL.
22. If possible, use “NOLOCK” at the source and a table lock at the destination.
23. When loading data warehouses, try disabling the indexes during the load.
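A minimal sketch of the transaction pattern from point 10, assuming RetainSameConnection = True on the connection manager so that all three tasks share one session (the task layout is illustrative):
-- Execute SQL Task placed before the data flow:
BEGIN TRANSACTION;
-- (the Data Flow Task then runs on the same connection)
-- Execute SQL Task on the success path:
COMMIT TRANSACTION;
-- Execute SQL Task on the failure path:
ROLLBACK TRANSACTION;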
Q104. Can you explain the settings “Rows Per Batch” and “Maximum Insert Commit Size”?
These options are available on the “OLE DB Destination” in a DFT.
Rows per batch – The default value for this setting is -1, which specifies that all incoming rows will be
treated as a single batch. You can change this default behaviour and break all incoming rows into
multiple batches. The only allowed values are positive integers, which specify the maximum number of
rows in each batch.
Maximum insert commit size – The default value for this setting is ‘2147483647’ (the largest value for a
4-byte integer type), which specifies that all incoming rows will be committed once on successful
completion. You can specify a positive value for this setting to indicate that a commit will be done for
that number of records. You might be wondering whether changing the default value for this setting puts
overhead on the data flow engine by committing several times. Yes, that is true, but at the same time
it relieves the pressure on the transaction log and tempdb, which otherwise grow tremendously
during high-volume data transfers.
Q105. Can you explain the DFT properties “DefaultBufferMaxRows” and “DefaultBufferMaxSize”?
The data flow task in SSIS (SQL Server Integration Services) sends data in series of buffers. How much
data does one buffer hold? This is bounded by DefaultBufferMaxRows and DefaultBufferMaxSize,
two Data Flow properties. They have default values of 10,000 and 10,485,760 (10 MB), respectively.
That means, one buffer will contain either 10,000 rows or 10 MB of data, whichever is less.
You can adjust these two properties based on your scenario. Setting them to a higher value can
boost performance, but only as long as all buffers fit in memory. In other words, no swapping please!
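As a rough illustration of how the two limits interact (the 200-byte row width is a made-up estimate):
DECLARE @RowBytes int = 200;  -- hypothetical estimated row width
SELECT 10485760 / @RowBytes AS RowsThatFitIn10MB,  -- 52428 rows
       CASE WHEN 10485760 / @RowBytes > 10000
            THEN 'DefaultBufferMaxRows (10,000) is the binding limit'
            ELSE 'DefaultBufferMaxSize is the binding limit'
       END AS BindingLimit;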
Q106. How can we connect to Oracle, DB2 and MySQL from SSIS?
Oracle:
Native OLEDB\Microsoft OLEDB Provider for Oracle
Native .Net providers\ or
.Net providers for OLEDB\
MySQL:
.Net Providers \ MySQL Data Provider Or
.Net Providers \ ODBC
DB2:
Native OLEDB\Microsoft OLEDB Provider for DB2
Native .Net providers\ ,
.Net providers\ ODBC OR
.Net providers for OLEDB\
Q107. Can't we do a fast load using the “ADO NET Destination”?
Yes, we can; there is an option called “Use Bulk insert when possible” that needs to be ticked at the
time of mapping.
Q108. How do you check whether SSIS transformations are staying in memory or spilling to disk due to
huge loads and asynchronous transformations?
A great way to check whether your packages are staying within memory is to review the SSIS
performance counter “Buffers spooled”, which has an initial value of 0; a value above 0 is an indication
that the engine has started swapping to disk.
Q109. How do you find how much memory is allocated to SSIS and SQL Server?
Below are the performance counters which can help us find the memory details.
Process / Private Bytes (DTEXEC.exe): The amount of memory currently in use by Integration
Services.
Process / Working Set (DTEXEC.exe): The total amount of allocated memory by Integration Services.
SQL Server: Memory Manager / Total Server Memory: The total amount of memory allocated by SQL
Server. Because SQL Server has another way to allocate memory using the AWE API, this counter is
the best indicator of total memory used by SQL Server.
Memory / Page Reads / sec: Represents the total memory pressure on the system. If this consistently
goes above 500, the system is under memory pressure.
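These counters can also be sampled from the command line; a sketch using typeperf (the process instance and counter names may differ on your machine):
typeperf "\Process(DTEXEC)\Private Bytes" "\Process(DTEXEC)\Working Set" -si 5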
Q110. Here is a scenario: we have a package which has to be opened using BIDS / SSDT so that
different elements can be modified. But the location where the package is opened and modified has
no permissions to access the databases, hence all connection managers and other location-bound
elements will fail in the validation phase, and it takes a lot of time to validate all of
these connections. Do you have any idea how to control this validation phase?
Below are the different methods to switch off package validation.
Work Offline: There is an option called Work Offline. It doesn't try to locate/validate connections. Once
the package is ready we have to uncheck Work Offline from the SSIS menu.
DelayValidation: Set the value to “True” to skip validation while opening the package. It only
applies to executables / control flow elements, including the package itself.
ValidateExternalMetadata: Set this property to “False” to disable validation for data flow
components.
Q111. What are the SSIS package protection levels?
There are 6 different types of protection levels.
Do not save sensitive – (When exporting using DTUTIL specify for protection- 0)
Encrypt sensitive with user key – 1
Encrypt sensitive with password – 2
Encrypt all with password -3
Encrypt all with user key – 4
Rely on server storage
Do not save sensitive: makes the sensitive data unavailable to other users. If a different user opens
the package, the sensitive information is replaced with blanks and the user must provide the
sensitive information.
Encrypt sensitive with user key: Uses a key that is based on the current user profile to encrypt only
the values of sensitive properties in the package. Only the same user who uses the same profile can
load the package. If a different user opens the package, the sensitive information is replaced with
blanks and the current user must provide new values for the sensitive data. If the user attempts to
execute the package, package execution fails.
Encrypt sensitive with password: Uses a password to encrypt only the values of sensitive properties
in the package. To open the package in SSIS Designer, the user must provide the package password.
If the password is not provided, the package opens without the sensitive data and the current user
must provide new values for sensitive data. If the user tries to execute the package without
providing the password, package execution fails.
Encrypt all with password: Uses a password to encrypt the whole package. The user must provide
the package password. Without the password the user cannot access or run the package.
Encrypt all with user key: Uses a key that is based on the current user profile to encrypt the whole
package. Only the user who created or exported the package can open the package in SSIS Designer
or run the package by using the dtexec command prompt utility.
Rely on server storage: Protects the whole package using SQL Server database roles. This option is
supported only when a package is saved to the SQL Server msdb database.
When it is time to deploy the packages, you have to change the protection level to one that does not
depend on the developer’s user key. Therefore you typically have to select
EncryptSensitiveWithPassword, or EncryptAllWithPassword. Encrypt the packages by assigning a
temporary strong password that is also known to the operations team in the production
environment.
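For example, dtutil can re-save a file system package with “Encrypt all with password” (level 3); the paths and password below are only illustrative:
dtutil /FILE C:\Packages\LoadSales.dtsx /ENCRYPT FILE;C:\Deploy\LoadSales.dtsx;3;StrongPassword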
Q112. What are the phases of execution of a package when running with DTEXEC?
Command sourcing phase: The command prompt reads the list of options and arguments
Package load phase: The package specified by the /SQL, /FILE, or /DTS option is loaded.
Configuration phase: Options are processed in this order:
Options that set package flags, variables, and properties.
Options that verify the package version and build.
Options that configure the run-time behavior of the utility, such as reporting.
Validation and execution phase: The package is run, or validated without running if
the /VALIDATE option is specified.
Q113. What are the exit codes from DTEXEC?
0: The package executed successfully.
1: The package failed.
3: The package was canceled by the user.
4: The utility was unable to locate the requested package.
5: The utility was unable to load the requested package.
6: The utility encountered syntactic or semantic errors in the command line.
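These codes can be acted on from a batch file or an Agent CmdExec step; a sketch (the package path is illustrative):
dtexec /f "C:\Packages\LoadSales.dtsx"
if %ERRORLEVEL% NEQ 0 (
    echo Package failed with DTEXEC exit code %ERRORLEVEL%
    exit /b %ERRORLEVEL%
)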
Q114. Can you demonstrate the DTEXEC?
Execute a package located on file system:
DECLARE @returncode int
EXEC @returncode = xp_cmdshell 'dtexec /f "C:\UpsertData.dtsx"'
To execute an SSIS package saved to SQL Server using Windows Authentication:
dtexec /sq pkgOne /ser productionServer
To execute an SSIS package saved to the File System folder in the SSIS Package Store:
dtexec /dts "\File System\MyPackage"
To validate a package that uses Windows Authentication and is saved in SQL Server without
executing the package:
dtexec /sq pkgOne /ser productionServer /va
To execute an SSIS package that is saved in the file system, and specify logging options:
dtexec /f "c:\pkgOne.dtsx" /l "DTS.LogProviderTextFile;c:\log.txt"
To execute an SSIS package that is saved in the file system and configured externally:
dtexec /f "c:\pkgOne.dtsx" /conf "c:\pkgOneConfig.cfg"
Q115. What is the process to upgrade DTS to SSIS?
1. Choosing a DTS to SSIS Migration Strategy (Reactive/Proactive)
2. Capturing SSUA DTS Package Alerts (all categories of notifications)
3. Building a dev/test environment
4. Migrating the packages using the selected DTS to SSIS Migration Strategy
5. Testing/Correcting the resulting SSIS 2008 Packages in the dev/test environment
6. Deploying and reconfirming the resulting SSIS 2008 Packages work in production as expected
7. Removing the old DTS Packages from production w/optional SQL Server Agent Jobs
Q116. Are all components converted automatically from DTS to SSIS?
Not all components can be upgraded. ActiveX transforms, for instance, present a challenge for the
upgrade wizard and may not be migrated. Typical post-migration work:
Delete and recreate ODBC connections after package migration.
Reconfigure transaction settings after package migration.
Replace the functionality of ActiveX scripts attached to package steps after package migration; use the
Script Task.
After migration, convert the Execute DTS 2000 Package Task that encapsulates the Analysis Services
task to an Integration Services Analysis Services Processing task.
After migration, re-create the functionality of the Dynamic Properties task by using Integration
Services features such as variables, property expressions, and package configurations.
Q117. Why do we need the Data Conversion transformation?
This transformation converts the data type of input columns to a different data type and then routes the
data to output columns. This transformation can be used to:
Change the data type
Set the column length, if the data type is string
Set the decimal precision, if the data type is numeric
The Data Conversion transformation is very useful when you want to merge data from different
sources into one, and it can remove abnormalities from the data. Example: a company's offices are
located in different parts of the world, and each office has a separate attendance tracking system in
place. Some offices store data in an Access database, some in Oracle and some in SQL Server. Now you
want to take data from all the offices and merge it into one system. Since the data types in all these
databases vary, it would be difficult to perform the merge directly. Using this transformation, we can
normalize the columns to a single data type and perform the merge.
Q118. Explain why variables are called the most powerful component of SSIS.
Variables allow us to dynamically control the package at runtime. Example: you have some custom
code or script that determines the query parameter's value. Now, we cannot have a fixed value for the
query parameter; in such scenarios, we can use a variable and refer the variable to the query parameter.
We can use variables for things like:
Updating properties at runtime,
Populating query parameter values at runtime,
Use in Script Tasks,
Error handling logic,
Various looping logic.
Q119. What are the Foreach Loop enumerators available in SSIS?
Below are the various types of enumerators provided by the SSIS Foreach Loop container:
Foreach File Enumerator: It enumerates files in a folder. The plus point here is it can traverse
through subfolders also.
Foreach Item Enumerator: It enumerates items in a collection. Like enumerating rows and columns
in an Excel sheet.
Foreach ADO Enumerator: Useful for enumerating rows in tables.
Foreach ADO.NET Schema Rowset Enumerator: To enumerate through schema information about a
data source. For example, to get list of tables in a database.
Foreach From Variable Enumerator: Used to enumerate through the object contained in a variable.
(if the object is enumerable)
Foreach NodeList Enumerator: Used to enumerate the result set of an XML Path Language (XPath)
expression.
Foreach SMO Enumerator: It enumerates through SQL Server Management Objects (SMO) objects.
Q120. We have a situation where we need to push data into a DB2 database from SQL Server. What
connection manager do you use to connect to DB2 running on AS/400?
The primary method to connect to DB2 is the “Microsoft OLE DB Provider for DB2”. There is one more
method, using the ADO.NET data providers \ ODBC data provider.
OLE DB is generally faster than ODBC, but there might be issues with OLE DB to DB2 while dealing with
parameters in queries.
Q121. What is the “ActiveX Script” task? Is it available in SQL Server 2012?
The ActiveX Script task provides a way to continue using custom code that was developed with
ActiveX script. The ActiveX Script task supports writing scripts in VBScript and JScript, and in other
languages installed on the local computer.
This task exists only to support backward compatibility with the deprecated DTS packages.
In SQL Server 2012 the “ActiveX Script” task has to be upgraded to the “Script Task”.
The “Script Task” supports VB.NET and C#.NET.
Q122. What is the use of either the “Script Task” or the “ActiveX Script” task?
Implementing customized business logic in SSIS packages. For example, using the Script Task we can
access table values, apply logic and add those values to SSIS variables.
Performing complex computations, for example modifying date formats using date functions.
Accessing data from sources for which there is no built-in connection support; for example, a script
can use the Active Directory Service Interfaces (ADSI) to access user names from AD.
Creating package-specific performance counters; for example, a script can create a performance
counter that is updated when a complex or poorly performing task executes.
Q123. What is the “Script Component”?
The Script Component is like a “Script Task”, but it is designed for the Data Flow. It can be useful in the
scenarios below:
Applying multiple transformations to data instead of using multiple transformations in the data flow.
For example, a script can add the values in two columns and then calculate the average of the sum.
Using custom formulas and functions, for example to validate passport numbers.
Validating incoming column data and skipping unmatched records.
Script Component support for inputs and outputs:
If used as a source: supports multiple outputs
If used as a transformation: supports one input and multiple outputs
If used as a destination: supports one input
Q124. Can we call a web service from SSIS? If yes, how?
Yes! We can call a web service using the “Web Service” task. We have to provide an HTTP connection
manager and a Web Services Description Language (WSDL) file. The output can be stored either in a
variable or on the file system (in XML, TXT, etc.).
Q125. What is the use of the Derived Column in SSIS?
The Derived Column transformation can process existing column data and apply some functionality.
For example, to change the case of a string column, we can replace the actual column by applying
the expression UPPER(Column) or LOWER(Column).
It is also useful when we need to calculate values, for example adding a new column “Gross Value” by
applying the expression Column1 + Column2.
Applying arithmetic operations like “Round” and calculating date and time differences, etc.
In addition to this, it can deal with “NULL” values, e.g. when NULL values need to be populated with
blanks.
If we can't perform one of these kinds of operations with the Derived Column, we have another
option called the “Script Transform”.
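A few such expressions in the SSIS expression language (the column names are illustrative):
UPPER(FirstName)   (change the case)
Column1 + Column2   (new “Gross Value” column)
ROUND(Amount, 2)   (arithmetic)
DATEDIFF("dd", OrderDate, ShipDate)   (days between two dates)
ISNULL(Notes) ? "" : Notes   (replace NULL with a blank)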
Q126. Which config file we should use for storing SSIS package configurations?
There are different ways to do this, but it all depends on the requirement and environment. I haven't
seen any problem that is resolved by one config type and can't be resolved by another.
If you are using a file system deployment, it probably makes more sense to use XML configuration
files.
If you are using a SQL Server deployment, it probably makes more sense to use SQL Server
configurations.
If your ETL solution is managed by the application owner or server administrator, it probably makes
more sense to use XML configuration files.
If your ETL solution is managed by the database administrator, it probably makes more sense to use
SQL Server configurations.
If project team members and/or administrators have past experience and success with a given
configuration type, it probably makes sense to use that type unless there is some compelling project-
specific reason to do otherwise.
Q127. What are the possible issues in handling SSIS packages?
Mostly data conversion errors due to datatype mismatches – truncation of strings, losing some
decimal points, etc.
Expression evaluation errors at run time: unable to evaluate expressions at run time due to wrong
comparisons, etc.
Package validation errors: when we configure a variable to locate a file on the file system which is
actually created at run time, the package fails during debugging because the file is initially not present
at the specified path. To avoid these issues set the property “DelayValidation” to “True”.
Package configuration issues: always make sure that we are using the right package in the right
environment. It always depends on the package configuration: a package will be used in dev, test and
prod environments with different config values, and if wrong config values are passed to an SSIS
package it may lead to loss of data or data corruption.
To avoid these issues we have to consider a few things:
1. Use a centralized database to store all SSIS package config values
2. Use a different account (either domain or SQL) for each environment
3. Tighten security by assigning only the required permissions to the SSIS user accounts.
That way, even though a dev package runs with the prod credentials, it fails to connect to the instance.
When a variable uses another variable:
We usually give a variable as a source for an “Execute SQL Task”, but sometimes the variable's value is
set by evaluating an expression which uses another variable.
For example, we have created a variable called “FileName” which is being used in an Execute SQL Task,
but the file name should be evaluated as "B2B_Reports_" + User::BatchID, where BatchID is another
variable.
By default the package fails to evaluate this expression; to fix this we have to change the variable
property “EvaluateAsExpression” to “True” (see the expression below).
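The expression itself looks like this, assuming BatchID is an integer and therefore needs a cast (the length 10 is arbitrary):
"B2B_Reports_" + (DT_WSTR, 10) @[User::BatchID]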
Running SSIS packages on 64-bit and dealing with Excel files:
Typically Excel files do not come with 64-bit drivers (Excel 2010 has 64-bit drivers, but earlier versions
do not), so dealing with Excel files from SSIS running on 64-bit is a bit of a difficult task.
There is an option in SSIS which allows an SSIS package to run with 32-bit support in a 64-bit
environment.
In the project properties, on the Debugging page, there is an option called “Run64BitRuntime”. By
default it is set to True for SSIS running on 64-bit. We have to set it to False to handle activities that
need 32-bit support. Below are more reasons to use this option when SSIS is running on 64-bit:
We can't call an Execute DTS 2000 Package Task, as it doesn't support 64-bit.
Errors may be raised while using a Script Task or Script Component that uses .NET assemblies or
COM objects for which no 64-bit support is available or for which the drivers are not installed.
Case sensitivity issues:
One of the most popular data transformations is the Lookup, which compares column values from two
tables. Unlike T-SQL, SSIS does case-sensitive comparisons, so we have to be careful when handling
these transformations and tasks.
Q128. What are event handlers in SSIS?
Event handlers allow MSBI developers to monitor and audit SSIS packages. Event handlers can be
associated with SSIS components and executables; any component that can be added to the control
flow is called an “executable”, plus the package itself. A child component is considered a child
executable and the parent is known as the parent executable.
Q129. What are the different types of event handlers?
OnPreValidate, OnPostValidate, OnProgress, OnPreExecute, OnPostExecute, OnError, OnWarning,
OnInformation, OnQueryCancel, OnTaskFailed, OnVariableValueChanged, OnExecStatusChanged
Q130. What are the general cases that event handlers can be helpful in?
Clean up stage tables after a bulk load completes
Send an email when a specific component fails
Load lookup tables after a task completes
Retrieve system / resource information before starting a task.
Q131. How to implement event handlers in SSIS?
Create log tables (as per the requirement) in a centralized logging database
In BIDS / SSDT, add the event handler
Add control flow elements; most of the time an “Execute SQL Task”
Store the required information (row counts, messages, time durations, system / resource
information).
We can use expressions and variables to capture this information.
Q132. What is the container hierarchy in attaching event handlers?
The container hierarchy plays a vital role in implementing event handlers in SSIS. If an event handler is
attached to a “Package” (a package is itself a container), then the event handler applies to all
associated components of that package; we need not attach the event handler to each of them
separately. But if we want to switch off event handling for a specific component in a container, simply
change its “DisableEventHandlers” property to “TRUE”.
Q133. How to implement SCD (Slowly changing dimension) type 2 using SSIS?
Type 2 means we have to keep historical data.
Assume that we have a table called “Employee_Stage” on the stage server and “Employee_Archieved”
on the archive server.
Now we have to read data from stage and insert it into the archive instance.
We have to implement SCD Type 2, meaning we have to keep information about changed records. For
example, if the “Designation” column has changed for an employee, a new row has to be inserted into
the archive.
While inserting, there are three columns that help us identify the old and current records for a
given employee:
StartDate – start date for the given designation
EndDate – end date for the given designation
IsCurrent – bit column: 1 – current; 0 – history
Let's start designing the SSIS package:
As usual, create an SSIS project
Create two connection managers: 1 – Staging, 2 – Archive
Drag and drop a Data Flow Task onto the control flow
Open the Data Flow Task, add an OLE DB source and map it to the stage connection manager
Drag and drop the “SCD transformation” into the data flow task
Double-click and configure the SCD as below.
Map the “Archive” connection manager and choose the “Business Key” in the archive table.
The business key is nothing but a column which can be used to compare / look up against the stage
table; here I have given “EmpID” as the business key.
We have to mention a “Change Type” for the SCD columns.
There are three change types available:
“Fixed attribute”, “Changing attribute” and “Historical attribute”.
I choose “Historical attribute” for the column Designation. Based on this, a new record will be
inserted into the archive if the column value is changed in the stage table.
Now give the historical attribute options. There are two options available to identify current and
historical records: based on a single column, or based on two date values.
Here I choose the first option and give “1” for current and “0” for expired.
Don't select “Inferred member support”, as it is not useful in this scenario.
Click Finish; it automatically creates some transformations and a destination, including a Derived
Column to add flag values to the “IsCurrent” column, an OLE DB Command to update the “IsCurrent”
column, and an OLE DB destination to insert new records into the archive table.
Note 1: To implement SCD Type 1 (meaning overwrite the values), follow the same steps as above, but
instead of choosing “Historical attribute” choose “Changing attribute”.
Note 2: “Fixed attribute” can be useful in situations where a domain rule must be applied, for example
when the column “NationalNumber” has to stay fixed. If the column is about to be overwritten, an error
is raised or the row is redirected, but the change is never allowed.
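As a point of comparison, the same Type 2 logic can be expressed as set-based T-SQL; this is not what the wizard generates, and the column names beyond those above are illustrative:
-- Close the current row for employees whose Designation changed
UPDATE a
SET a.EndDate = GETDATE(), a.IsCurrent = 0
FROM dbo.Employee_Archieved a
JOIN dbo.Employee_Stage s ON s.EmpID = a.EmpID
WHERE a.IsCurrent = 1 AND a.Designation <> s.Designation;
-- Insert a new current row for changed and brand-new employees
INSERT INTO dbo.Employee_Archieved (EmpID, Designation, StartDate, EndDate, IsCurrent)
SELECT s.EmpID, s.Designation, GETDATE(), NULL, 1
FROM dbo.Employee_Stage s
LEFT JOIN dbo.Employee_Archieved a ON a.EmpID = s.EmpID AND a.IsCurrent = 1
WHERE a.EmpID IS NULL;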
Q134. Can we use a temp table in the data flow that was created in the control flow?
Yes, we can. Assume we are executing a stored procedure from an “Execute SQL Task”. That stored
procedure creates a global temp table in the database, and the same temp table has to be used in the
data flow: while creating the OLE DB source, we can give a query like “SELECT * FROM ##TempTable”.
To use a temp table in SSIS from the same connection, some properties have to be set as below.
In the properties of the OLE DB connection manager, change the value of the property
“RetainSameConnection” to “TRUE”.
For the OLE DB source in the data flow, make sure the property “ValidateExternalMetadata” is “False”,
as validation would otherwise fail to locate the temp table (see the sketch below).
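A sketch of the pattern; the procedure, table and column names are illustrative:
-- 1) Procedure run by the Execute SQL Task; it stages rows in a global temp table
CREATE PROCEDURE dbo.StageOrders
AS
BEGIN
    IF OBJECT_ID('tempdb..##StagedOrders') IS NOT NULL
        DROP TABLE ##StagedOrders;
    SELECT OrderID, CustomerID, Amount
    INTO ##StagedOrders
    FROM dbo.Orders
    WHERE OrderStatus = 'Pending';
END;
-- 2) Query of the OLE DB source in the data flow (same connection manager,
--    RetainSameConnection = True, ValidateExternalMetadata = False):
-- SELECT * FROM ##StagedOrders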
Q135. Have you ever created templates in SSIS?
Yes! I have created templates for new SSIS package designs.
In environments where SSIS packages are utilized often, creating templates is very useful and saves
development time.
To create an SSIS package template, create an SSIS package with all the required defaults, environment
settings, connection managers and essential data flows, and save the package to disk.
Copy the package file (.dtsx file) to the location:
2012:
C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE\PrivateAssemblies\ProjectItems\
DataTransformationProject\DataTransformationItems
(This might be different based on OS, 32-bit / 64-bit, SQL Server version, etc.)
Once the package is copied, create a new SSIS project, right-click on the project name → Add
Item → from there you can see the template. Select it and add it to the project.
Q136. Which is the best method for storing package configurations?
Storing package configurations depends on the requirement and the operations team; it always
depends on the type of config values being stored and the team which controls the configuration.
There are two popular methods, XML and SQL Server. I suggest “SQL Server”, because the first aspect
we have to consider is security, and “SQL Server” is the best place to store package configurations from
a security perspective.
Best Approach:
1. Store all package configurations in SQL Server
2. Store the SQL Server connection (where the config table exists) in an XML file
3. Store the XML file location in an environment variable
That way, for every deployment we just need to change the table values and the environment variable
value.
For example, if we are using the same package for the development, test and stage servers, we need
not have a different package for each execution; we just need different configurations, pointing to the
proper config values through the XML file and then choosing the proper XML file by using the
environment variable.
Q137. Can we validate data in an SSIS package?
Yes, we can validate data in SSIS using Data Flow transformations.
But I suggest doing the validation on the database side. For example, instead of applying validation
rules at the package level, use a stored procedure at the source to apply / check all validations and
then, from that, select the data which can be directly loaded to the destination.
If the source is not a database but a flat file, get all the data from the flat file, stage it in SQL Server,
apply all validations, and load that data to the destination table.
By doing this there might be overhead on the database, but the operation is faster, as validations can
be applied to all rows in a bulk, set-based operation, whereas in SSIS the same validation has to be
applied row by row. And if any modifications to the validations are required, we can simply modify the
stored procedure without touching the SSIS package.
The Data Profiling task can also be used to validate data.
Q138. How do you store redirected error information in SQL Server?
We can use an OLE DB destination pointing to a SQL Server log table, and the error precedence arrow
can be mapped to this destination.
But to get a fuller error description we have to use a Script Component between them. To capture the
exact error message, use the code below in the Script Component:
// Runs once per redirected error row; ErrorDescription must be added
// as an output column of the Script Component.
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // Translate the numeric ErrorCode into a readable message
    Row.ErrorDescription = this.ComponentMetaData.GetErrorDescription(Row.ErrorCode);
}
Q139. Give some simple expressions that are used in SSIS.
We usually use a “Derived Column” to validate data and we use a “Data Conversion” to convert the
datatype of a column.
Remove leading and trailing spaces: LTRIM(RTRIM(<Column Name>)) (the SSIS expression language has
no TRIM function; combine LTRIM and RTRIM)
Check NULL existence:
This example returns “Unknown last name” if the value in the LastName column is null; otherwise it
returns the value in LastName.
ISNULL(LastName) ? "Unknown last name" : LastName
This example always returns TRUE if the DaysToManufacture column is null, regardless of the value
of the variable AddDays.
ISNULL(DaysToManufacture + @AddDays)
Q140. What is the Character Map transformation used for?
This transformation is used to apply string formatting to column data: changing characters from lower
to upper case, upper to lower case, half width, full width, byte reversal, etc.
When we are using a lookup on columns from the source and destination, remember that the SSIS
Lookup is case sensitive, unlike T-SQL. So before comparing two columns we can design the data flow to
pass those two columns through a “Character Map” transformation and convert the data into a
common format, either “Lower” or “Upper” case.
Q141. What are import and export column transformations?
Import Column transformation – reads data from files and adds the data to columns in a data flow.
Using this transformation, a package can add text and images stored in separate files to a data flow.
Export Column transformation – reads data in a data flow and inserts the data into a file.
Q142. How does matching happen inside the Lookup transformation?
The Lookup transformation tries to perform an equi-join between the transformation input and the
reference dataset. By default an unmatched row is considered an error; however, we can configure the
Lookup to redirect such rows to the “no match output” (2008 and above).
If the reference dataset has multiple matches, it returns only the first match. If the reference set is a
cache, multiple matches raise a warning or error.
Q143. What are all the inputs and outputs of a lookup transformation?
Input: the dataset from the data source
Match Output: all matched rows
No Match Output: all unmatched rows; if unmatched rows are not configured to redirect to the error
output, such rows are redirected to the no match output
Error Output: rows that failed to compare, or unmatched rows (when configured to redirect there)
Q144. How do you transfer logins using SSIS?
It can be done using the Transfer Logins task, but there are limitations:
Transferring Windows authentication logins across domains: drop and recreate the logins.
Transferring SQL logins: the password needs to be changed, as a random password is chosen while
moving from source to destination.
The best way to move logins is using scripts: login, user and role-mapping scripts.
Q145. How do you troubleshoot the connection error
“DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER”?
1. Incorrect provider for the connection:
A) Lack of a 64-bit provider: remember, if the package must run in 32-bit mode on the target server,
make sure that the execution option “Use 32 bit runtime” is selected while creating the job that
executes the SSIS package.
B) Missing client binaries: make sure the client binaries are installed on the target server.
2. Incorrect connection parameter settings:
A) Typo in the password
B) Password is not stored in the configuration
3. Failed to decrypt sensitive information: this usually happens when an SSIS package is executed from
a SQL Server Agent job, the package is saved with the option “EncryptSensitiveWithUserKey”, and the
SQL Agent service account differs from the package creator.
4. Oracle data provider limitation:
Another common scenario happens when you use the Microsoft OLE DB Provider for Oracle or the
Microsoft ODBC Driver for Oracle to connect to an Oracle9i or later database. The recommendation is
the Oracle OLE DB Provider for Oracle 9i or later versions.
Q146. What are the logs available to check when an SSIS package fails?
1. Windows Event Log & Job History: when the SSIS package is scheduled from a SQL Agent job
2. Logs from SSIS Logging Audit: when a log provider is configured for the package
3. Logs from SSIS Event Handlers: when an event handler is designed to capture the log
4. Logs from the SSIS components: when custom logging is configured using a Script Task
5. Logs from the underlying data sources: check the error log at the data source, for example SQL
Server, Oracle, etc.