What is the difference between OLTP and OLAP systems?
ETL: The process of Extracting data from a source, Transforming it
and Loading it into a destination.
SQL Server Integration Services (SSIS) provides transformations,
which are key components of the Data Flow. They convert the data
to the desired format as it moves from SOURCE to
DESTINATION (Target).
Data is altered in the data pipeline based on the
business requirement.
www.rudrasoft24.com
Contact: 9848486690
There are basically two types of transformations:
1. Synchronous Transformations (Non-Blocking Transformations)
2. Asynchronous Transformations (Blocking Transformations)
Synchronous Transformation: If the number of incoming records is
the same as the number of outgoing records, the transformation
is called a Synchronous Transformation.
No rows are held back while the data moves, so these
transformations perform well. Examples:
1. Audit
2. Character Map
3. Conditional Split
4. Copy Column
5. Data Conversion
6. Derived Column
7. Lookup
8. Multicast
9. Percent Sampling
10. Row Count
11. Script Component
12. Export Column
13. Import Column
14. Slowly Changing Dimension
15. OLE DB Command
Asynchronous Transformation: If the number of incoming records
is not the same as the number of outgoing records, the
transformation is called an Asynchronous Transformation.
There are two types of asynchronous transformations:
1. Fully blocking transformations
2. Partially blocking transformations
A partially blocking transformation holds some of the records
taken from the source, releases them once its action is
complete, and then takes the next set of records until all the
data has been processed. Examples:
1. Union All
2. Merge
A fully blocking transformation holds the entire data set in
memory, performs its action based on the requirement and then
releases the records. Examples:
1. Aggregate
2. Fuzzy Grouping
3. Fuzzy Lookup
4. Row Sampling
5. Sort
6. Term Extraction
Transformations: These are used to transform data
from one format to another.
The following are the transformation activities:
1. Data Cleansing
2. Data Conversion
3. Data Scrubbing
4. Data Sorting
5. Data Merging
6. Data Aggregation
Data Cleansing: The process of removing unwanted data, that is,
keeping only the data required by the business
requirement.
Data Conversion: The process of converting the incoming
data type to match the DESTINATION data type.
If the DESTINATION data types are not the same as the SOURCE
data types, the data must be converted to the DESTINATION
data types.
When loading data from Excel into a database, Data Conversion
is usually needed.
When loading data from a Tabular Model or a
Multidimensional Model, the data types must likewise be
converted to match the DESTINATION data types.
Data Scrubbing: The process of deriving a new data
definition, either from an existing definition or from
scratch, based on the business requirement.
There are two types of mappings in SSIS:
1. Direct Mapping: The data comes directly from a
SOURCE column to a DESTINATION column.
Example: Empno, Ename
2. Indirect Mapping: The data does not come directly from a
SOURCE column but is derived before it reaches the
DESTINATION column. Example: TAX
Data Sorting: The process of ordering data in either
ascending or descending order. The default order is
ascending. Sorting is a costly operation, so avoid it
where possible.
Data Merging: The process of joining data from two
tables based on the business requirement. There are two
types of merging:
1. Horizontal Merging
2. Vertical Merging
Horizontal Merging: Used when the two tables have different
structures and different kinds of data.
Note: There should be a common column between the two
tables, meaning both columns should hold the same kind of data.
Vertical Merging: Used when the two tables have the same
structure and the same kind of data.
There should be the same number of columns
The order of the columns should be the same
The data types of the columns should be the same
Data Aggregation: The process of summarizing data for a
specified column, either per GROUP or across all rows
(without a GROUP).
SSIS Activities
1. Create SSIS Project
2. Create SSIS Package
3. Create Connection Managers for SOURCE and
DESTINATION
4. Design Dataflow Task
5. Assign Connection Managers for SOURCE table and
DESTINATION table
6. Save and Run SSIS Package
A Connection Manager contains the name of the "Data
Source" (server name) and the name of the "Initial Catalog"
(database name), which are used to connect to various
databases based on the requirement.
Packages are saved with the extension .dtsx
DATA FLOW TASK: The task that extracts data from the
SOURCE and loads it into the DESTINATION is known as the
Data Flow Task
Create Connection Managers for SOURCE and DESTINATION
1. In the Connection Managers window, right-click,
select "New OLE DB Connection" and click New
2. Provide the following information
a. Server Name: Rudrasoft
b. Select or enter a database name: SOURCE. Click OK
3. Create the DESTINATION connection by following the
same steps
DESIGN DATA FLOW TASK
1. From the SSIS Toolbox, drag and drop a "Data Flow Task"
onto the Design Surface
2. Double-click the Data Flow Task. From Other Sources,
drag and drop an OLE DB Source onto the Design Surface to
pull data from the SOURCE database
3. From Other Destinations, drag and drop an OLE DB
Destination onto the Design Surface to load data into the
SalesBI destination
4. Rename the OLE DB Source to "EMP" and the OLE DB
Destination to "DimEmp"
Assign Connection Managers for SOURCE table and
DESTINATION table
1. Double-click the OLE DB Source and provide the
following information
i. OLE DB Connection Manager: Rudrasoft.SOURCE
ii. Name of the table or the view: EMP
iii. Click Columns to select all columns or only the
required columns, then click OK
2. Connect EMP to DimEmp with the blue data path
("Blue Pipeline") to pass the data
3. Double-click the Destination and provide the following
information
i. OLE DB Connection Manager: Rudrasoft.SalesBI
ii. Name of the table or the view: DimEmp
iii. Click Mappings to map "Input Columns"
to "Destination Columns"
4. Save and execute the package (right-click the package
and click Execute Package)
5. Validate the SOURCE and DESTINATION data
Conditional Split Transformation
It is a synchronous transformation. It sends the rows from a
single data path to different outputs (paths) based on
conditions written as SSIS expressions.
Rows that do not satisfy any condition are captured by the
Conditional Split Default Output.
It is similar to an IF condition, a CASE statement or a
WHERE clause in SQL.
The transformation evaluates the expressions and, based on
the results, directs each row to the specified output.
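As a rough T-SQL analogy (a sketch only, not what SSIS runs internally), a CASE expression can label each row with the output it would be routed to. The table variable @Emp, its sample rows and the output name Deptno_20 are assumptions made for illustration. In the Conditional Split editor the corresponding conditions would be SSIS expressions such as DEPTNO == 10.

```sql
-- Illustrative sample rows; @Emp stands in for the EMP source table.
DECLARE @Emp TABLE (EMPNO INT, ENAME VARCHAR(20), DEPTNO INT);
INSERT INTO @Emp (EMPNO, ENAME, DEPTNO)
VALUES (7369, 'SMITH', 20), (7499, 'ALLEN', 30), (7782, 'CLARK', 10);

-- Conditions are checked in order and the first match wins;
-- rows matching no condition fall through to the default output.
SELECT EMPNO, ENAME, DEPTNO,
       CASE
           WHEN DEPTNO = 10 THEN 'Deptno_10'
           WHEN DEPTNO = 20 THEN 'Deptno_20'   -- hypothetical second output
           ELSE 'Conditional Split Default Output'
       END AS RoutedTo
FROM @Emp;
```

Each row appears exactly once in the result, which mirrors the rule that the Conditional Split sends a row to a single output.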
1. Create SSIS Package and rename it as "Conditional Split"
2. Create/Add Connection Managers for SOURCE and
DESTINATION
3. Design Dataflow Task as per above diagram
4. Assign Connection Manager for SOURCE table
5. Drag and drop Conditional Split Transformation on
Design Surface and double click on Conditional split
Transformation and provide the following condition
6. Connect Blue Pipeline from Conditional Split to OLEDB
Destination and select Output as Deptno_10
7. Assign Connection Manager for DESTINATION table and
make sure the columns are mapped properly
8. Save and Run SSIS Package
9. Validate Destination data as per Business requirement
Exercise:
1. Create a package to load details of employees whose
name starts with "S"
2. Create a package to load details of employees whose
name ends with "T"
3. Create a package to load details of employees whose
name starts with "J" and ends with "S"
4. Create a package to filter rows with NULL values,
taking data from various sources
5. Create a package to load only non-NULL values into the
destination
6. Create a package to filter rows with blank values
7. Create a package to remove or redirect duplicate
records based on the business requirement
8. How do you write SSIS expressions? Explain.
9. Create a package to load data into multiple
destination tables based on a condition
10. Create a package to load data into multiple
destinations with the following conditions
a. Employees in Department 10 or Department 20 are
stored in DimEmp1020
b. Employees in Department 30 are stored in DimEmp30
c. Employees in Department 10 are stored in DimEmp10
Multicast Transformation
It is a synchronous transformation.
The Multicast Transformation sends its input data to
multiple destination paths without applying any conditions
or transformations. In other words, it takes one input,
makes a logical copy of the data and passes the same data
to multiple outputs.
This transformation is similar to the Conditional Split
transformation: both direct an input to multiple outputs.
The difference is that the Multicast transformation directs
every row to every output, whereas the Conditional Split
directs each row to a single output.
SSIS expressions cannot be written in the Multicast
Transformation.
Data Conversion Transformation:
It is a synchronous transformation.
It converts the data in an input column to a different
data type and copies the result to a new output column.
It is similar to the CONVERT and CAST functions in
T-SQL.
It is a very useful transformation when the same data is
pulled from multiple sources, because the data must be
converted to match the destination data types.
NOTE: Data type conversion is usually necessary when
transferring data from Excel or a Model to a SQL Server database.
Run the following SQL script to create a table in the SOURCE
database
CREATE TABLE Department
(DNO INT,
DNAME NVARCHAR(20),
LOC NVARCHAR(20)
)
INSERT INTO Department
(DNO,
DNAME,
LOC)
SELECT DEPTNO, DNAME, LOC
FROM DEPT
SELECT * FROM Department
Write the following SQL Script to create table in SalesBI
database
CREATE TABLE DimDept
(DEPTNO INT,
DNAME VARCHAR(20),
LOC VARCHAR(20)
)
Note: A Unicode string (DT_WSTR) column cannot be mapped
directly to a non-Unicode string (DT_STR) column.
To convert from Unicode to non-Unicode, use the
Data Conversion Transformation.
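The conversion can be pictured with the T-SQL CAST function, which the text compares the transformation to. This is a sketch under assumptions: table variables stand in for the Department and DimDept tables so the script runs on its own, and the sample rows are illustrative. In SSIS itself the equivalent cast expression would look like (DT_STR,20,1252)DNAME.

```sql
-- @Department mirrors the Unicode SOURCE table,
-- @DimDept mirrors the non-Unicode SalesBI table.
DECLARE @Department TABLE (DNO INT, DNAME NVARCHAR(20), LOC NVARCHAR(20));
DECLARE @DimDept    TABLE (DEPTNO INT, DNAME VARCHAR(20), LOC VARCHAR(20));

INSERT INTO @Department (DNO, DNAME, LOC)
VALUES (10, N'ACCOUNTING', N'NEW YORK'), (20, N'RESEARCH', N'DALLAS');

-- Explicit NVARCHAR -> VARCHAR conversion, column by column.
INSERT INTO @DimDept (DEPTNO, DNAME, LOC)
SELECT DNO,
       CAST(DNAME AS VARCHAR(20)),
       CAST(LOC   AS VARCHAR(20))
FROM @Department;

SELECT DEPTNO, DNAME, LOC FROM @DimDept;
```

T-SQL would also perform this conversion implicitly; the explicit CAST is written out only to mirror what the Data Conversion Transformation does in the data flow.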
Derived Column Transformation
It is a synchronous transformation. It creates a new column
whose value is derived by applying an expression to existing
columns and variables. The transformation provides two
options: either add the result as a new column, or replace
an existing column with the derived value.
The SSIS Expression Language has powerful built-in
functions for string manipulation, data type conversions,
mathematical functions, conditional expressions and
handling NULL values.
Typical uses of the Derived Column Transformation:
i. Concatenate data from different columns into a
derived column
ii. Extract characters from string data by using functions
such as LEFT, RIGHT and SUBSTRING
iii. Apply mathematical functions to numeric data and
store the result in a derived column
iv. Create expressions that compare input columns and
variables
v. Extract parts of a datetime value
vi. Convert strings to a specific format using an SSIS
expression
Derived Column Name: Provide any unique name, like a
column alias in T-SQL
Derived Column: Provides two options: add the result as a
new column, or replace an existing column with it
Expression: The custom expression, written by combining
built-in SSIS functions, variables and columns.
Precision: When a new column is added, the Derived Column
Transformation automatically sets the precision for numeric
data based on the data type. This value is read-only.
Scale: When a new column is added, the Derived Column
Transformation automatically sets the scale for numeric
data based on the data type. This value is read-only.
Code Page: When a new column is added, the Derived Column
Transformation automatically sets the code page for the
DT_STR data type.
Default Code Page: 1252
Configure error output: Specify how to handle errors.
Create a table in the SalesBI with following script
USE SalesBI
CREATE TABLE DimEmpDerived(
EMPNO int NULL,
ENAME varchar(20) NULL,
JOB varchar(20) NULL,
SAL money NULL,
DEPTNO int NULL,
Commission money NULL,
TAX money NULL
)
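The Commission and TAX columns of DimEmpDerived are the kind of values a Derived Column would compute. A T-SQL sketch of the same derivations follows. The COMM source column, the sample rows and the 10% tax rate are assumptions for illustration only. In SSIS the NULL handling could be written with the conditional operator, for example ISNULL(COMM) ? 0 : COMM.

```sql
-- Illustrative sample rows; @Emp stands in for the EMP source table.
DECLARE @Emp TABLE (EMPNO INT, ENAME VARCHAR(20), JOB VARCHAR(20),
                    SAL MONEY, COMM MONEY, DEPTNO INT);
INSERT INTO @Emp (EMPNO, ENAME, JOB, SAL, COMM, DEPTNO)
VALUES (7369, 'SMITH', 'CLERK',     800, NULL, 20),
       (7499, 'ALLEN', 'SALESMAN', 1600,  300, 30);

SELECT EMPNO, ENAME, JOB, SAL, DEPTNO,
       ISNULL(COMM, 0)            AS Commission,  -- replace NULL with 0
       CAST(SAL * 0.10 AS MONEY)  AS TAX          -- assumed 10% rate
FROM @Emp;
```

TAX here is an example of the indirect mapping described earlier: it has no source column of its own and exists only as a derived value.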
Create table in the SalesBI with following Script
USE SalesBI
CREATE TABLE DimCustomer(
CustID INT,
FullName VARCHAR(100)
)
Create a package to split the data from Full Name into
First Name and Last Name
Create table in the SalesBI with following Script
CREATE TABLE DimEmpDetails(
ENO INT,
[First Name] VARCHAR(30),
[Last Name] VARCHAR(30)
)
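One way to picture the name split is the following T-SQL sketch. It assumes names contain a single space between first and last name; the table variable @Customer and its rows are illustrative. In a Derived Column the same idea could be expressed with the SSIS functions FINDSTRING and SUBSTRING.

```sql
-- Illustrative sample rows; @Customer stands in for DimCustomer.
DECLARE @Customer TABLE (CustID INT, FullName VARCHAR(100));
INSERT INTO @Customer (CustID, FullName)
VALUES (1, 'Pawan Kalyan'), (2, 'Rama Rao');

-- Appending a space inside CHARINDEX guarantees a position is found,
-- so a name without a space is returned whole as the first name.
SELECT CustID,
       LEFT(FullName, CHARINDEX(' ', FullName + ' ') - 1)   AS [First Name],
       SUBSTRING(FullName,
                 CHARINDEX(' ', FullName + ' ') + 1,
                 LEN(FullName))                              AS [Last Name]
FROM @Customer;
```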
Create a package to get Year, Quarter and Month from
Hiredate ?
Exercise:
i. How can you write IF ELSE statements in the Derived
Column Transformation?
ii. Create a package to show "Not Eligible for Comm" if
the employee does not get a commission
iii. The business has a Date table with only a date column.
How can you get month names with the Derived Column
Transformation?
iv. Is it possible to write a CASE statement in SSIS?
Whether yes or no, explain in detail
Sort Transformation: It is an asynchronous, fully blocking
transformation.
The Sort Transformation sorts the source data in either
ascending or descending order, similar to the ORDER BY
clause in T-SQL.
Some transformations, such as the Merge Transformation and
the Merge Join Transformation, need sorted input. In these
situations the Sort Transformation can be used to sort the data.
If the Sort Order value is a positive number, the Sort
Transformation sorts the data in ascending order
If the Sort Order value is a negative number, the Sort
Transformation sorts the data in descending order
Note: The Sort transformation has one input and one output.
It does not support error outputs.
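The comparison with ORDER BY can be sketched as follows, sorting on two columns in different directions. The table variable @Emp and its rows are illustrative assumptions. The comments relate each column to the positive and negative Sort Order values described above.

```sql
-- Illustrative sample rows; @Emp stands in for the EMP source table.
DECLARE @Emp TABLE (ENAME VARCHAR(20), DEPTNO INT, SAL MONEY);
INSERT INTO @Emp (ENAME, DEPTNO, SAL)
VALUES ('SMITH', 20, 800), ('ALLEN', 30, 1600),
       ('FORD',  20, 3000), ('CLARK', 10, 2450);

SELECT ENAME, DEPTNO, SAL
FROM @Emp
ORDER BY DEPTNO ASC,   -- first sort key, ascending (positive Sort Order)
         SAL DESC;     -- second sort key, descending (negative Sort Order)
```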
Ignore case: Specify whether to differentiate
between uppercase and lowercase letters.
If this option is checked, XYZ is treated the same as xyz.
Ignore Kana Type: Specify whether to differentiate between
the Japanese hiragana and katakana character types.
If this option is checked, the kana type is ignored.
Ignore nonspacing characters: Check this option if you
don’t want to differentiate between normal characters
and characters with diacritics.
Ignore Character Width:
Specify whether to differentiate between the single-byte
and double-byte representations of the same character.
If this option is checked, the Sort Transformation ignores
the difference.
Ignore Symbols: Specify whether strings that differ only by
symbols (such as white space, currency symbols and
operators) are treated as the same. If this option is
checked, %xyz is treated the same as xyz.
Sort punctuation as symbols: If this option is checked,
all punctuation symbols except the hyphen and apostrophe
are sorted before the letters. For instance, the Sort
Transformation sorts ?xyz before x.
Remove rows with duplicate sort values: If this option is
checked, the Sort Transformation removes rows that have
duplicate sort values. If not, the transformation passes
all rows through, including the duplicates.
Exercise:
1. Create a package to sort Salary in ascending /
descending order
2. Create a package to sort Department in ascending order
and Salary in descending order
3. Create a package to remove duplicate records using the
Sort Transformation
4. Create a package to sort negative and positive values in
ascending order
5. Create a package to sort characters in ascending order
when the values contain both lower and upper case
Merge Join
It is an asynchronous, partially blocking transformation.
It joins data from two sorted datasets using a
FULL, LEFT or INNER join.
A Right Outer Join is supported indirectly, by swapping
the two inputs.
The Merge Join Transformation is very useful for loading
data into the dimension tables of a data warehouse.
It has two inputs and one output, and it does not
support an error output.
Note: This transformation is recommended when the data
comes from different data sources.
NOTE: The Merge Join Transformation only works with sorted
inputs, so the data must be sorted before the join, for
example with a Sort Transformation on each input.
Note: The columns that participate in the join condition
must be sorted; otherwise the join will not occur.
Note: A column with a numeric data type cannot be joined
to a column with a character data type.
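The three join types offered by Merge Join behave like their T-SQL counterparts. The sketch below uses illustrative table variables in place of the Emp and Dept tables; employee 7999 in department 50 is a made-up row added so that the join types give different results.

```sql
-- Illustrative sample rows; department 50 has no row in @Dept.
DECLARE @Emp  TABLE (EMPNO INT, ENAME VARCHAR(20), DEPTNO INT);
DECLARE @Dept TABLE (DEPTNO INT, DNAME VARCHAR(20));
INSERT INTO @Emp  VALUES (7369, 'SMITH', 20), (7499, 'ALLEN', 30), (7999, 'NEWHIRE', 50);
INSERT INTO @Dept VALUES (10, 'ACCOUNTING'), (20, 'RESEARCH'), (30, 'SALES');

-- INNER: only employees with a matching department.
SELECT e.EMPNO, e.ENAME, d.DNAME
FROM @Emp e INNER JOIN @Dept d ON e.DEPTNO = d.DEPTNO;

-- LEFT: every employee; DNAME is NULL where no department matches.
SELECT e.EMPNO, e.ENAME, d.DNAME
FROM @Emp e LEFT JOIN @Dept d ON e.DEPTNO = d.DEPTNO;

-- FULL: every employee and every department, matched where possible.
SELECT e.EMPNO, e.ENAME, d.DEPTNO, d.DNAME
FROM @Emp e FULL JOIN @Dept d ON e.DEPTNO = d.DEPTNO;
```

Swapping @Emp and @Dept in the LEFT join gives the result of a right outer join, which is the "swap the inputs" technique mentioned above.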
Exercise:
1. Create a package to join the Emp table and the Dept table
to get the complete details of every employee
2. Can you create a package to join two tables if the two
common columns have a string data type?
3. Can you create a package to join two tables if the two
common columns have a string data type but different
column sizes?
4. How many types of joins are supported by the Merge Join
Transformation? Explain in detail.
5. Can you create a package to implement a Right Outer Join
with the Merge Join Transformation?
6. How can you handle NULLs if the common column
(joining column) contains NULLs?
7. Can you create a package to join two tables on multiple
columns?
8. Can you improve performance while using the Merge
Join Transformation? If yes, how?
9. Create a package to load only the non-matching records
from the employee table / department table. How can you
do it in SQL and in SSIS?
Lookup Transformation
It is a synchronous transformation. It performs an equi-join
between values in the transformation input and values in
the reference dataset, similar to a join in T-SQL.
This transformation joins two datasets at a time. To join
more than two datasets, use multiple Lookup
transformations, similar to multiple join conditions in T-SQL.
It does not offer Left Outer Join, Right Outer Join or
Full Outer Join types.
If there is no matching entry in the reference dataset, no
join occurs. By default, the Lookup transformation treats
rows without matching entries as errors. However, you can
configure the Lookup transformation to redirect such
rows to a no match output.
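In T-SQL terms, the match output and the no match output correspond roughly to the two queries below. This is a sketch with illustrative table variables; @Dept plays the role of the reference dataset, and the row for department 50 is made up so that one row has no match.

```sql
-- Illustrative sample rows; @Dept is the reference (lookup) dataset.
DECLARE @Emp  TABLE (EMPNO INT, ENAME VARCHAR(20), DEPTNO INT);
DECLARE @Dept TABLE (DEPTNO INT, DNAME VARCHAR(20));
INSERT INTO @Emp  VALUES (7369, 'SMITH', 20), (7499, 'ALLEN', 30), (7999, 'NEWHIRE', 50);
INSERT INTO @Dept VALUES (10, 'ACCOUNTING'), (20, 'RESEARCH'), (30, 'SALES');

-- Match output: input rows enriched with columns from the reference dataset.
SELECT e.EMPNO, e.ENAME, e.DEPTNO, d.DNAME
FROM @Emp e
INNER JOIN @Dept d ON d.DEPTNO = e.DEPTNO;

-- No match output: input rows with no entry in the reference dataset.
SELECT e.EMPNO, e.ENAME, e.DEPTNO
FROM @Emp e
WHERE NOT EXISTS (SELECT 1 FROM @Dept d WHERE d.DEPTNO = e.DEPTNO);
```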
The join can be a composite join, which means that multiple
columns in the transformation input can be joined to
columns in the reference dataset.
The Lookup transformation provides several modes of
operation (Full cache, Partial cache and No cache) that
allow a trade-off between performance and resource
usage.
The Lookup Transformation uses an OLE DB Connection
Manager to access a reference table in SQL Server, Oracle
or DB2. It can also read the reference data from a
cache file.
Full Cache mode: This is the most commonly used
approach in the Lookup Transformation.
With this option, the entire lookup (or reference) table is
preloaded into the cache (memory) and the Lookup
Transformation performs lookups against memory instead of
the dataset. This works well when the lookup table has a
small number of rows.
Partial Cache mode: With this option, the Lookup
Transformation starts with an empty cache. When a new row
arrives from the data flow, the Lookup Transformation first
checks its cache for a matching value. If no match is found
in the cache, it queries the lookup table. If a match is
found in the lookup table, the value is cached (stored in
memory) for the next time. This approach can be used when
the lookup table is very big.
No Cache mode: With this option, the Lookup Transformation
does not cache the lookup table at any stage. When a new
row arrives from the data flow, the Lookup Transformation
checks the lookup table directly for matching values.
The final option on this page specifies how to handle
rows with no matching entries. The Lookup Transformation
provides four options:
Fail Component (Default): With this option, when a row
from the data flow has no matching row in the lookup table,
the Lookup Transformation fails the package.
Ignore Failure: With this option, the Lookup Transformation
continues processing even when a row from the data flow has
no matching row in the lookup table.
Redirect Rows to No Match Output: With this option, the
Lookup Transformation directs rows that have no matching
row in the lookup table to the No Match output. In
real-time projects this option is used most often.
Redirect Rows to Error Output: With this option, the Lookup
Transformation directs rows that have no matching row in
the lookup table to the standard error output.
Union All Transformation
It is an asynchronous, partially blocking transformation.
It combines multiple inputs (two or more) and
produces one output.
Note: If there are N tables, only 1 Union All
Transformation is needed
What are the differences between Merge and Union All
transformation?
Merge Transformation
It is an asynchronous, partially blocking transformation.
It merges two sorted data sets into a single data set. This
transformation is very useful when an ETL process needs to
merge data from two different data sources.
The Merge transformation can’t merge a column that has a
numeric data type with a column that has a character
data type.
Note: Both data sets should have the same number of
columns and the same kind of data
The Merge Transformation is very useful for merging the
error path data (after handling the errors) with the
normal data.
For example: the data is split with a Conditional Split
Transformation as per a condition. After performing a few
more operations on the split data, it has to be merged back.
In these situations the Merge Transformation can be used to
merge it back.
NOTE: The Merge Transformation produces only one
output and it does not support an error output.
Note: If there are N tables, N-1 Merge Transformations
are needed
Differences between Merge and Union All
transformation:
Merge Transformation in SSIS:
1. Data must be sorted before applying the Merge
Transformation
2. The output of the Merge Transformation is sorted
3. It accepts only 2 inputs
Union All Transformation in SSIS:
1. No need to sort the data
2. The output is unsorted
3. It can accept more than 2 inputs
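The difference in output order can be sketched in T-SQL. This is an analogy with illustrative table variables, not the SSIS implementation: the first query behaves like Union All, the second like Merge applied to inputs sorted on EMPNO.

```sql
-- Illustrative sample rows for two inputs with the same structure.
DECLARE @EmpA TABLE (EMPNO INT, ENAME VARCHAR(20));
DECLARE @EmpB TABLE (EMPNO INT, ENAME VARCHAR(20));
INSERT INTO @EmpA VALUES (7499, 'ALLEN'), (7782, 'CLARK');
INSERT INTO @EmpB VALUES (7369, 'SMITH'), (7839, 'KING');

-- Like Union All: rows are appended; no output order is guaranteed.
SELECT EMPNO, ENAME FROM @EmpA
UNION ALL
SELECT EMPNO, ENAME FROM @EmpB;

-- Like Merge: the combined output is ordered on the sort key.
SELECT EMPNO, ENAME FROM @EmpA
UNION ALL
SELECT EMPNO, ENAME FROM @EmpB
ORDER BY EMPNO;
```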
Cache Transform
The Cache Transform transformation is used to create a
reference dataset (in the form of a file). The Cache
Transform takes its data from the SOURCE.
This cache is used by the Lookup transformation to
join the reference data with the SOURCE data.
To perform a lookup on a cache file, use the Cache
Connection Manager and Full Cache mode.
The cache is saved with the extension .caw
Note: If the Lookup Transformation uses a cache file, the
cache file must be loaded before the Lookup Transformation
runs.
Note: Use a cache file if the lookup table is not
updated frequently.
Note: A cache file can be used by multiple Lookup
Transformations, in the same package or in different
packages, based on the business requirement
Copy Column Transformation
The Copy Column Transformation simply duplicates source
columns, like copying a column's data and pasting it into
a new column.
Later in the data flow, different transformations can be
applied to the column copies.
For example, you can use the Copy Column
transformation to create a copy of a column and then
convert the copied data to uppercase characters by using
the Character Map transformation, or apply aggregations
to the new column by using the Aggregate
transformation.
Execute the following SQL statements.
CREATE TABLE EmpDetails(ENO INT,
[Full Name] VARCHAR(20))
INSERT INTO EmpDetails VALUES(
1, 'Pawan Kalyan'),
(2, 'Rama Rao'),
(3, 'Nageswar Rao'),
(4, 'Chandra Babu')
SELECT ENO, [Full Name]
FROM EmpDetails
The Derived Column Transformation can also copy the data
from a column to a new column, but some users find the UI
of the Copy Column Transformation simpler.
By using the Derived Column Transformation:
What are the differences between the Derived Column and
Copy Column Transformations?
Character Map Transformation
It is a synchronous, non-blocking transformation.
The Character Map Transformation performs common
character translations in the data flow.
Byte Reversal: Reverses the order of the bytes. For
example, for the data 0x1234 0x9876, the result is
0x3412 0x7698.
This uses the same behavior as LCMapString with the
LCMAP_BYTEREV option
Full Width: Converts half-width characters to
full width
Half Width: Converts full-width characters to
half width
Hiragana: Converts the Katakana style of Japanese
characters to Hiragana
Katakana: Converts the Hiragana style of Japanese
characters to Katakana
Linguistic Casing: Applies the regional linguistic rules for
casing
Lowercase: Changes all letters in the input to lowercase
Traditional Chinese: Converts the simplified Chinese
characters to traditional Chinese
Simplified Chinese: Converts the traditional Chinese
characters to simplified Chinese
Uppercase: Changes all letters in the input to uppercase
Export Column
The Export Column transformation reads data in a data
flow and inserts the data into a file.
Unlike the other transformations, the Export Column
Transformation doesn’t need a destination to create the
file
For example, if the data flow contains product
information, such as a picture of each product, you could
use the Export Column transformation to save the
images to files.
FilePath: "D:\\SOURCE\\Exported\\" + (DT_WSTR,50)ProductKey + ".gif"
Import Column
The Import Column Transformation is a partner to the
Export Column Transformation.
The Import Column transformation reads data from files
and adds the data to columns in a data flow. Using this
transformation, a package can add text and images
stored in separate files to a data flow.
Create the following table in the DESTINATION
CREATE TABLE myImages (
[StoredFilePath] [varchar](50) NOT NULL,
[ProdPicture] image )
Data Flow Diagram
FileList.txt
D:\MSBI Training\SSIS\Source Images\228.Gif
D:\MSBI Training\SSIS\Source Images\354.Gif
D:\MSBI Training\SSIS\Source Images\368.Gif
D:\MSBI Training\SSIS\Source Images\445.Gif
D:\MSBI Training\SSIS\Source Images\451.Gif
D:\MSBI Training\SSIS\Source Images\540.Gif
D:\MSBI Training\SSIS\Source Images\542.Gif
D:\MSBI Training\SSIS\Source Images\Classes.PNG
D:\MSBI Training\SSIS\Source Images\Snip.PNG
Note: The ID of the "myImage" output column is assigned to
the input column, to link the input column to the output
column (the property name is "FileDataColumnID")
Aggregate Transformation
It is an asynchronous, fully blocking transformation that
aggregates data.
It supports Group By, Sum, Average, Count, Count Distinct,
Minimum and Maximum.
Group By: Breaks the data set into groups by the column
you specify
Average: Averages the selected column’s numeric data
Count: Counts the records in a group
Count Distinct: Counts the distinct non-NULL values in a
group
Minimum: Returns the minimum numeric value in the
group
Maximum: Returns the maximum numeric value in the
group
Sum: Returns sum of the selected column’s numeric data
in the group.
CREATE TABLE DimAggr(
Deptno INT,
Salary MONEY
)
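A Group By on the department number with a Sum on the salary would fill a table shaped like DimAggr. The T-SQL sketch below shows the equivalent query; the table variables and sample rows are assumptions made so the script runs on its own.

```sql
-- Illustrative sample rows; @Emp stands in for the EMP source table.
DECLARE @Emp     TABLE (ENAME VARCHAR(20), DEPTNO INT, SAL MONEY);
DECLARE @DimAggr TABLE (Deptno INT, Salary MONEY);
INSERT INTO @Emp (ENAME, DEPTNO, SAL)
VALUES ('SMITH', 20, 800), ('FORD', 20, 3000), ('ALLEN', 30, 1600);

-- Group By on DEPTNO, Sum on SAL: one output row per department.
INSERT INTO @DimAggr (Deptno, Salary)
SELECT DEPTNO, SUM(SAL)
FROM @Emp
GROUP BY DEPTNO;

SELECT Deptno, Salary FROM @DimAggr ORDER BY Deptno;
```

Because every input row must be read before any group total is final, the sketch also hints at why the Aggregate Transformation is fully blocking.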
Audit Transformation
It is a synchronous, non-blocking transformation that adds
auditing data to the Data Flow.
Use the Audit transformation to log information such as:
Which user executed the package?
What was the execution time of the package?
From which machine was it executed?
What are the task ID, package ID, package name and so on?
ExecutionInstanceGUID – The GUID that identifies the
execution instance of the package.
PackageID – This is the unique identifier of the package.
PackageName – Shows the name of the package.
VersionID – The unique version number of the package.
ExecutionStartTime – The time the package started to
run.
MachineName – The Name of the computer.
UserName – The login name of the person who started
the package.
TaskName – Name of the Data Flow task with which the
Audit transformation is associated.
TaskId – The unique identifier of the Data Flow task.
OLE DB Command Transformation
It is a synchronous, non-blocking transformation.
It runs SQL statements such as INSERT, UPDATE and DELETE
in the Data Flow, for example to delete or update data
in a table.
It is useful when implementing SCD Type 1 and SCD Type 2.
It executes the INSERT, UPDATE or DELETE statement once per
row, so it degrades performance.
Pivot Transformation
The Pivot transformation turns a normalized data set into
a less normalized one by pivoting the input data on a
column value.
Example:
An Orders data set that lists Product Name and Sales Amount
by Month Name typically has multiple rows for each
Product Name, one per month. By pivoting the data set on
the Month Name column, the Pivot transformation can output
a data set with a single row per product.
Copy the following SQL script.
-- Source query: one row per product per month
SELECT
    LEFT(d.EnglishMonthName, 3) AS EnglishMonthName,
    p.EnglishProductName,
    SUM(f.SalesAmount) AS SalesAmount
FROM FactInternetSales AS f
INNER JOIN DimProduct AS p ON f.ProductKey = p.ProductKey
INNER JOIN DimDate AS d ON f.OrderDateKey = d.DateKey
GROUP BY
    LEFT(d.EnglishMonthName, 3),
    p.EnglishProductName;

-- Table for the pivoted output: one row per product, one column per month
CREATE TABLE [DimPivot] (
    [EnglishProductName] nvarchar(50),
    [Jan_Sales] money,
    [Feb_Sales] money,
    [Mar_Sales] money,
    [Apr_Sales] money,
    [May_Sales] money,
    [Jun_Sales] money,
    [Jul_Sales] money,
    [Aug_Sales] money,
    [Sep_Sales] money,
    [Oct_Sales] money,
    [Nov_Sales] money,
    [Dec_Sales] money
);
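To see what pivoting does to the rows, here is a minimal Python sketch of the idea. It is not the SSIS implementation; the function name and the sample rows are made up, and only the column naming (Jan_Sales, Feb_Sales, and so on) follows the DimPivot table.

```python
def pivot_sales(rows):
    """Pivot (month, product, amount) rows into one record per product."""
    pivoted = {}
    for month, product, amount in rows:
        # Find or create the single output record for this product.
        record = pivoted.setdefault(product, {"EnglishProductName": product})
        column = f"{month}_Sales"
        record[column] = record.get(column, 0) + amount
    return list(pivoted.values())


# Hypothetical sample rows: (month, product, sales amount).
sample = [
    ("Jan", "Road Bike", 100.0),
    ("Feb", "Road Bike", 150.0),
    ("Jan", "Helmet", 20.0),
]
result = pivot_sales(sample)
```

Three input rows become two output records. A month with no sales for a product is simply missing from its record here; in a table such as DimPivot that column would hold NULL.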
Unpivot Transformation
The Unpivot transformation makes an unnormalized
dataset into a more normalized version by expanding
values from multiple columns in a single record into
multiple records with the same values in a single column.
Example:
The pivoted Orders data set has one row per Product Name, with the
Sales Amount for every month held in a separate column of that row.
By unpivoting the data set on the month columns, the Unpivot
transformation can output a data set with multiple rows for every
product, one row per month.
Execute the following script:
USE SOURCE;
CREATE TABLE [Unpivot]
(
    [Month Name] VARCHAR(20),
    [Product Name] VARCHAR(100),
    [Sales Amount] MONEY
);
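The reverse operation can be sketched the same way in Python. This is an illustration of the idea only; the function name and the sample record are hypothetical, while the output keys follow the columns of the Unpivot table above.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def unpivot_sales(pivot_rows):
    """Expand one record per product into one record per product and month."""
    output = []
    for row in pivot_rows:
        for month in MONTHS:
            amount = row.get(f"{month}_Sales")
            if amount is None:
                continue  # this sketch skips months that have no value
            output.append({
                "Month Name": month,
                "Product Name": row["EnglishProductName"],
                "Sales Amount": amount,
            })
    return output


# Hypothetical pivoted record with values for two months.
pivot_rows = [{"EnglishProductName": "Road Bike", "Jan_Sales": 100.0, "Feb_Sales": 150.0}]
unpivoted = unpivot_sales(pivot_rows)
```

One wide record turns into two narrow records, one for each month that has a value.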
Data Flow Diagram
OLE DB Source
The OLE DB source extracts data from a variety of OLE
DB-compliant relational databases by using a database
table, a view, or an SQL command.
The OLE DB source provides four different data access
modes for extracting data:
1. A table or view.
2. A table or view specified in a variable.
3. The results of an SQL statement. The query can be a
parameterized query.
4. The results of an SQL statement stored in a variable
OLE DB Destination
The OLE DB destination loads data into a variety of OLE
DB-compliant databases using a database table or view
or an SQL command.
The OLE DB destination provides five different data
access modes
1. A table or view
2. A table or view using fast-load options
3. A table or view specified in a variable
4. A table or view specified in a variable using fast-load
options
5. The results of an SQL statement
Note: The OLE DB destination does not support
parameters. If you need to execute a parameterized
INSERT statement, consider the OLE DB Command
transformation.
Excel Source
The Excel source extracts data from worksheets or
ranges in Microsoft Excel workbooks.
The Excel source provides four different data access
modes for extracting data:
1. A table or view.
2. A table or view specified in a variable.
3. The results of an SQL statement. The query can be a
parameterized query.
4. The results of an SQL statement stored in a variable.
SSIS translates Excel's general cell format as a Unicode string
data type. In SQL Server, Unicode translates into nvarchar.
If you have a Unicode data type in SSIS and you try to
insert it into a varchar column, the load can fail.
The solution is to place a Data Conversion transformation
between the source and the destination in order to change
the Excel data types.
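Why the conversion has to be explicit can be shown with a loose Python analogy. It is not SSIS code: Python's str plays the role of a Unicode string, and encoding to a single-byte code page plays the role of a varchar column. The code page cp1252 and the function name are assumptions for the example.

```python
def to_code_page(value: str, code_page: str = "cp1252") -> bytes:
    """Convert Unicode text to a single-byte code page, failing on data loss."""
    # encode() raises UnicodeEncodeError when a character has no mapping.
    return value.encode(code_page)


converted = to_code_page("Café")  # every character exists in cp1252

try:
    to_code_page("東京")  # these characters have no cp1252 mapping
    conversion_failed = False
except UnicodeEncodeError:
    conversion_failed = True
```

Text that fits the target code page converts cleanly, while text that does not fit raises an error. Making the conversion an explicit step lets you decide where such errors are handled.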
Excel Destination
The Excel destination loads data into worksheets or
ranges in Microsoft Excel workbooks.
Flat File Source
The Flat File source reads data from a text file. The text
file can be in delimited, fixed width, or mixed format.
Delimited format uses column and row delimiters to
define columns and rows
Fixed width format uses width to define columns and
rows. This format also includes a character for padding
fields to their maximum width.
Ragged right format uses width to define all columns,
except for the last column, which is delimited by the row
delimiter
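The three layouts can be compared with a small Python sketch. The parsers below are simplified illustrations with hypothetical names; they assume a comma as column delimiter, a line feed as row delimiter, and a space as the padding character.

```python
import csv
import io


def parse_delimited(text, delimiter=","):
    """Columns and rows are both marked by delimiters."""
    return [row for row in csv.reader(io.StringIO(text), delimiter=delimiter)]


def split_by_widths(line, widths):
    """Cut a line into fields of the given widths and strip the padding."""
    fields, pos = [], 0
    for width in widths:
        fields.append(line[pos:pos + width].rstrip())
        pos += width
    return fields, pos


def parse_fixed_width(text, widths):
    """Widths define the columns, and their total defines each row."""
    row_length = sum(widths)
    return [split_by_widths(text[start:start + row_length], widths)[0]
            for start in range(0, len(text), row_length)]


def parse_ragged_right(text, widths):
    """Widths define all columns except the last, which ends at the row delimiter."""
    rows = []
    for line in text.splitlines():
        fields, pos = split_by_widths(line, widths)
        rows.append(fields + [line[pos:]])
    return rows
```

For example, parse_ragged_right("1  Bike\n2  Helmet\n", [3]) reads a three-character first column and lets the second column run to the end of each line.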
Note: The Flat File source has an option to retain null values
from the source as null values in the data flow.
Flat File Destination
The Flat File destination writes data to a text file. The
text file can be in delimited, fixed width, fixed width with
row delimiter, or ragged right format.
Raw File Source
The Raw File source reads raw data from a file. Because
the representation of the data is native to the source,
the data requires no translation and almost no parsing.
The Raw File source is used to retrieve raw data that was
previously written by the Raw File destination
Raw File Destination
The Raw File destination writes raw data to a file.
Because the format of the data is native to the
destination, the data requires no translation and little
parsing. The Raw File destination can write data more
quickly than other destinations such as the Flat File and
the OLE DB destinations.
Note: The Raw File destination is frequently used to
write intermediary results of partly processed data
between package executions. Storing raw data means
that the data can be read quickly by a Raw File source
and then further transformed before it is loaded into its
final destination.
Note: The Raw File destination supports null data but not
binary large object (BLOB) data.
Note: The Raw File destination does not use a
connection manager.
XML Source
The XML source reads an XML data file and populates
the columns in the source output with the data.
The data in XML files frequently includes hierarchical
relationships.
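One way to picture how hierarchical XML ends up as flat rows is the Python sketch below. It is an illustration under assumptions: the Orders document, the element and attribute names, and the function name are invented, and the parent key is copied into each child row so that the two row sets stay related.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document with a parent/child hierarchy.
XML_TEXT = """
<Orders>
  <Order id="1">
    <Line product="Bike" qty="2"/>
    <Line product="Helmet" qty="1"/>
  </Order>
  <Order id="2">
    <Line product="Lock" qty="3"/>
  </Order>
</Orders>
"""


def flatten_orders(xml_text):
    """Split nested Order/Line elements into two related row sets."""
    root = ET.fromstring(xml_text)
    orders, lines = [], []
    for order in root.findall("Order"):
        order_id = order.get("id")
        orders.append({"OrderId": order_id})
        for line in order.findall("Line"):
            # Carry the parent key so each line can be joined to its order.
            lines.append({
                "OrderId": order_id,
                "Product": line.get("product"),
                "Qty": int(line.get("qty")),
            })
    return orders, lines


orders, lines = flatten_orders(XML_TEXT)
```

The nested document becomes one row set per level, and the shared OrderId column keeps the relationship between them.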
SQL Server Destination
The SQL Server destination connects to a local SQL
Server database and bulk loads data into SQL Server
tables and views. You cannot use the SQL Server
destination in packages that access a SQL Server
database on a remote server.
CONTROL FLOW TASKS
Tasks are control flow elements that define units of work
that are performed in a package control flow.
File System Task
The File System Task is a configurable GUI component
that performs the following file operations:
Copy Directory: Copies all files from one directory to
another. You must provide the source and destination
directories.
Copy File: Copies a specific file. You must provide the
source and destination filename.
Create Directory: Creates a directory. You must provide
the source directory name and indicate whether the task
should fail if the destination directory already exists.
Delete Directory: Deletes a directory. You must provide
the source directory to delete.
Delete Directory Content: Deletes all files in a source
directory.
Delete File: Deletes the specified source file.
Move Directory: Moves a provided source directory to a
destination directory. You must indicate whether the
task should fail if the destination directory already exists.
Move File: Moves a specific provided source file to a
destination. You must indicate whether the task should
fail if the destination file already exists.
Rename File: Moves a specific provided source file to a
destination by changing the name. You must indicate
whether the task should fail if the destination file already
exists.
Set Attributes: Sets Hidden, Read-Only, Archive, or
System attributes on a provided source file.
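For comparison, the same kinds of operations can be scripted with Python's standard library. This is a rough analogy rather than a description of how the task works internally; the folder and file names are hypothetical, and everything runs inside a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def run_file_operations(base: Path) -> list:
    """Perform several file operations and return what is left under base."""
    src = base / "incoming"
    src.mkdir()                                              # Create Directory
    (src / "sales.csv").write_text("id,amount\n1,10\n")
    shutil.copy(src / "sales.csv", base / "sales_copy.csv")  # Copy File
    shutil.copytree(src, base / "backup")                    # Copy Directory
    shutil.move(str(base / "sales_copy.csv"),
                str(base / "backup" / "sales_copy.csv"))     # Move File
    (base / "backup" / "sales_copy.csv").rename(
        base / "backup" / "sales_old.csv")                   # Rename File
    (src / "sales.csv").unlink()                             # Delete File
    shutil.rmtree(src)                                       # Delete Directory
    return sorted(p.relative_to(base).as_posix() for p in base.rglob("*"))


with tempfile.TemporaryDirectory() as tmp:
    remaining = run_file_operations(Path(tmp))
```

After the run, only the backup folder and its two files remain.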
FTP Task
The SSIS FTP Task enables the use of the File Transfer
Protocol (FTP) in your package development tasks. It exposes
FTP commands that let you create or remove local and remote
directories and files.
A package can use the FTP Task for purposes such as:
Copying directories and data files from one directory to another, before or after moving data and applying transformations to the data.
Logging in to a source FTP location and copying files or packages to a destination directory.
Downloading files from an FTP location and applying transformations to column data before loading the data into a database.
Bulk Insert Task
The Bulk Insert task provides an efficient way to copy
large amounts of data into a SQL Server table or view.
Note: To ensure high-speed data copying,
transformations cannot be performed on the data while
it is moving from the source file to the table or view.
Check Constraints: This option checks table and column
constraints before committing the record. It is the only
option enabled by default.
Keep Nulls: By selecting this option, the Bulk Insert Task
will replace any empty columns in the source file with
NULLs in SQL Server.
Enable Identity Insert: Enable this option if your
destination table has an identity column into which
you’re inserting. Otherwise, you will receive an error.
Table Lock: This option creates a SQL Server lock on the
target table, preventing inserts and updates other than
the records you are inserting. This option speeds up your
process but may cause a production outage, as others
are blocked from modifying the table. If you check this
option, SSIS will not have to compete for locks to insert
massive amounts of data into the target table. Set this
option only if you’re certain that no other process will be
competing with your task for table access.
Fire Triggers: By default, the Bulk Insert Task ignores
triggers for maximum speed. When you check this
option, the task will no longer ignore triggers and will
instead fire the insert triggers for the table into which
you’re inserting.
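The options above have close counterparts in the T-SQL BULK INSERT statement: CHECK_CONSTRAINTS, KEEPNULLS, KEEPIDENTITY, TABLOCK, and FIRE_TRIGGERS. The Python sketch below only builds the statement text to show that mapping. The function name, the table, and the file path are hypothetical, and the sketch does no quoting or validation, so it is not safe for untrusted input.

```python
def build_bulk_insert(table, data_file, *, check_constraints=True,
                      keep_nulls=False, keep_identity=False,
                      table_lock=False, fire_triggers=False,
                      field_terminator=",", row_terminator="\\n"):
    """Build a T-SQL BULK INSERT statement from task-style options."""
    options = [
        f"FIELDTERMINATOR = '{field_terminator}'",
        f"ROWTERMINATOR = '{row_terminator}'",
    ]
    flags = [
        (check_constraints, "CHECK_CONSTRAINTS"),  # Check Constraints
        (keep_nulls, "KEEPNULLS"),                 # Keep Nulls
        (keep_identity, "KEEPIDENTITY"),           # Enable Identity Insert
        (table_lock, "TABLOCK"),                   # Table Lock
        (fire_triggers, "FIRE_TRIGGERS"),          # Fire Triggers
    ]
    options += [name for enabled, name in flags if enabled]
    return f"BULK INSERT {table} FROM '{data_file}' WITH ({', '.join(options)});"


# Hypothetical table and file; only the table lock is switched on in addition
# to the default constraint check.
statement = build_bulk_insert("dbo.Sales", "C:\\data\\sales.csv", table_lock=True)
```

Options that are left off simply do not appear in the WITH clause.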
Execute SQL Task
The Execute SQL Task is used for all sorts of things,
including truncating a staging data table prior to
importing, retrieving row counts to determine the next
step in a workflow, or calling stored procedures to
perform business logic against sets of staged data.
You can use the Execute SQL Task for the following purposes:
Truncate a table or view in preparation for inserting data.
Create, alter, and drop database objects such as tables and views.
Re-create fact and dimension tables before loading data into them.
Save the rowset returned from a query into a variable.
Executing a Parameterized SQL Statement
The task can execute a SQL command in two basic ways:
by executing inline SQL statements or by executing
stored procedures.
Executing a Batch of SQL Statements
When one task runs a batch of SQL statements, the following rules apply:
Use GO statements between each distinct command. Note that some providers allow you to use the semicolon (;) as a command delimiter.
If there are multiple parameterized statements in the batch, all parameters must match in type and order.
Only one statement can return a result, and it must be the first statement.
Capturing Singleton Results
Multi-Row Results
Executing a Stored Procedure
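The difference between a singleton result and a multi-row result can be sketched in Python. Here sqlite3 stands in for SQL Server; the Staging table and the function names are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staging (Id INTEGER, Amount REAL);
    INSERT INTO Staging VALUES (1, 10.0), (2, 25.5), (3, 4.5);
""")


def single_row_result(conn):
    """Singleton result: one row whose value goes into a scalar variable."""
    (row_count,) = conn.execute("SELECT COUNT(*) FROM Staging").fetchone()
    return row_count


def full_result_set(conn, min_amount):
    """Multi-row result: the whole rowset is kept in a single variable."""
    cursor = conn.execute(
        "SELECT Id, Amount FROM Staging WHERE Amount >= ? ORDER BY Id",
        (min_amount,),
    )
    return cursor.fetchall()


row_count = single_row_result(conn)
big_rows = full_result_set(conn, 10)
```

A singleton result suits decisions such as "continue only if the row count is above zero", while a multi-row result is typically handed to a loop that processes the rows one at a time.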
Note: The Execute SQL task can be used in combination
with the Foreach Loop and For Loop containers to run
multiple SQL statements. These containers implement
repeating control flows in a package and they can run
the Execute SQL task repeatedly. For example, using the
Foreach Loop container, a package can enumerate files
in a folder and run an Execute SQL task repeatedly to
execute the SQL statement stored in each file.
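The loop described in the note can be sketched in Python. This is an analogy only: sqlite3 stands in for the database connection, and the folder, the file names, and the function name are hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path


def run_sql_files(conn, folder):
    """Run the SQL stored in every .sql file of a folder, in name order."""
    executed = []
    for sql_file in sorted(Path(folder).glob("*.sql")):  # enumerate the files
        conn.executescript(sql_file.read_text())         # run the file's SQL
        executed.append(sql_file.name)
    return executed


conn = sqlite3.connect(":memory:")
with tempfile.TemporaryDirectory() as folder:
    # Two hypothetical script files, written here so the sketch is runnable.
    Path(folder, "01_create.sql").write_text("CREATE TABLE Log (Msg TEXT);")
    Path(folder, "02_insert.sql").write_text("INSERT INTO Log VALUES ('loaded');")
    executed = run_sql_files(conn, folder)

logged = conn.execute("SELECT Msg FROM Log").fetchall()
```

Sorting the file names gives a predictable order, which matters when one script depends on objects created by an earlier one.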
Execute Package Task
The Execute Package task extends the enterprise
capabilities of Integration Services by letting packages
run other packages as part of a workflow.
You can use the Execute Package task for the following purposes:
Breaking down complex package workflow.
Reusing parts of packages.
Grouping work units.
Controlling package security.
Execute Process Task
The Execute Process task runs an application or batch file
as part of a SQL Server Integration Services package
workflow.
A common example is unzipping packed data files or decrypting
encrypted data files with a command-line tool.
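The pattern of launching a command-line tool and checking its exit code can be sketched in Python. As a stand-in for an unzip utility, the sketch uses the command-line interface of Python's own zipfile module; the file names and the function name are hypothetical.

```python
import subprocess
import sys
import tempfile
import zipfile
from pathlib import Path


def unzip_with_process(archive: Path, target: Path) -> int:
    """Start a separate process that extracts the archive; return its exit code."""
    completed = subprocess.run(
        [sys.executable, "-m", "zipfile", "-e", str(archive), str(target)],
        capture_output=True,
        text=True,
    )
    return completed.returncode


with tempfile.TemporaryDirectory() as tmp:
    # Build a small archive so the sketch is self-contained.
    archive = Path(tmp, "data.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("sales.csv", "id,amount\n1,10\n")
    exit_code = unzip_with_process(archive, Path(tmp, "out"))
    extracted = Path(tmp, "out", "sales.csv").read_text()
```

An exit code of 0 conventionally means success, so a package would normally continue to the next step only when the process returns 0.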
WMI Event Watcher Task
The WMI Event Watcher task watches for a Windows
Management Instrumentation (WMI) event using a
WMI Query Language (WQL) event query to specify
events of interest.
You can use the WMI Event Watcher task for the following purposes:
Watch a directory for a certain file to be written.
Wait for a given service to start.
Wait for the memory of a server to reach a certain level before executing the rest of the package or before transferring files to the server.
Watch for the CPU to be free.
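For example, the following WQL query checks every 10 seconds
for a new file in the directory d:\Event\One:
SELECT * FROM __InstanceCreationEvent WITHIN 10
WHERE TargetInstance ISA "CIM_DirectoryContainsFile"
AND TargetInstance.GroupComponent =
"Win32_Directory.Name=\"d:\\\\Event\\\\One\""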
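The backslash escaping in a WQL directory query is easy to get wrong, so it can help to generate the text. The Python sketch below only builds the query string and does not talk to WMI; the function name is hypothetical, and the escaping rule (four backslashes for each one in the path) is taken from the query pattern shown for d:\Event\One.

```python
def build_file_watch_query(directory: str, poll_seconds: int = 10) -> str:
    """Build a WQL event query that watches a directory for new files."""
    # Each backslash in the path is written as four backslashes: doubled once
    # for the inner object path and once more for the outer quoted string.
    escaped = directory.replace("\\", "\\" * 4)
    return (
        f"SELECT * FROM __InstanceCreationEvent WITHIN {poll_seconds} "
        'WHERE TargetInstance ISA "CIM_DirectoryContainsFile" '
        "AND TargetInstance.GroupComponent = "
        f'"Win32_Directory.Name=\\"{escaped}\\""'
    )


query = build_file_watch_query("d:\\Event\\One")
```

Changing poll_seconds changes the WITHIN value, which controls how often WMI checks for the event.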