SSIS Notes

This document discusses Extract, Transform, Load (ETL) operations and SSIS (SQL Server Integration Services). It defines ETL, describes the components of an SSIS package including control flow, data flow, parameters, event handlers and package explorer. It also discusses tasks, containers, precedence constraints, sources, destinations, transformations, error handling and logging in SSIS.

ETL operations

==========

E---Extract--->Getting data from the source.

T---Transform--->Performing intermediate operations by using transformations or
business rules.
L---Load--->Loading data into the destination.

DB<---->DB
File<---->File
File<---->DB
DB<---->File
csv--file
excel --file
xml--db
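
The three ETL steps can be sketched in a few lines of Python. This is only an illustration of the idea, not how SSIS works internally: the CSV text, the emp table and the upper-casing rule are made-up examples, and an in-memory SQLite database stands in for the destination.

```python
import csv
import io
import sqlite3

# Hypothetical CSV source; in practice this would be a file on disk.
SOURCE_CSV = "empno,ename,sal\n1001,shiva,3000\n1002,ramu,3000\n"


def extract(text):
    """Extract: read rows from the CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: apply a simple business rule (cast types, upper-case names)."""
    return [(int(r["empno"]), r["ename"].upper(), float(r["sal"])) for r in rows]


def load(conn, rows):
    """Load: write the transformed rows into the destination table."""
    conn.execute("CREATE TABLE IF NOT EXISTS emp (empno INTEGER, ename TEXT, sal REAL)")
    conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in destination (File ---> DB)
load(conn, transform(extract(SOURCE_CSV)))
loaded = conn.execute("SELECT empno, ename, sal FROM emp ORDER BY empno").fetchall()
print(loaded)
```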

1)What is Integration Services?

-SSIS is an ETL tool. It is used to extract data from different sources and load
it into different destinations using transformations or business rules.

An SSIS package contains 5 components

1.Control flow.
2.Data flow.
3.Parameters.
4.Event handlers.
5.Package explorer.

2)What is Control Flow?

It controls the execution of the package.
--Control flow consists of tasks, containers and precedence constraints.
--Two kinds of processing are possible: sequential processing and parallel processing.
The default is sequential processing.

3)What is Data Flow?

--Data flow represents the flow of data. It is used to transform the data from
source to destination by using business rules.
Data flow consists of sources, destinations, common transformations and
other transformations.

4)What is Event Handler?

--An event handler allows you to implement error handling and debugging of
control flow tasks in an SSIS package.

5)What is Package Explorer?

--Package explorer allows you to browse the package contents. Generally,
it contains tasks, precedence constraints, log providers, variables, event handlers
and connection managers.

6)What is Connection Manager?

--It manages all connections used by the different tasks and adapters in the package.

7)What are the tasks in SSIS?

1.Data preparation tasks (File System Task, FTP, Web Service, XML)
2.RDBMS tasks (Bulk Insert Task, Execute SQL Task)
3.Data Flow Task (sources, transformations, destinations)
4.Workflow tasks
5.Script Task
6.Maintenance tasks
7.SMO (SQL Server Management Objects) tasks
8.Custom tasks

8.What are the containers in SSIS?

Containers provide structure to the package and can repeat control flows within
the package.

SSIS provides 3 types of containers

1.For Loop container
2.Foreach Loop container
3.Sequence container

1.For Loop container: It repeats the control flow a specified number of
times, depending on a condition.

2.Foreach Loop container: It repeats the control flow by using an
enumerator.

1)Foreach ADO enumerator: To enumerate rows in tables.
2)Foreach File enumerator: To enumerate files in a folder.
3)Foreach From Variable enumerator: To enumerate the enumerable object held in a variable.
4)Foreach Item enumerator: To enumerate items that are in a collection.
5)Foreach Nodelist enumerator: To enumerate the result set of an XPath expression.
6)Foreach SMO enumerator: To enumerate SQL Server Management Objects (SMO).

3.Sequence container: It is used to group control flow items.
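
The difference between the two loop containers can be pictured with ordinary Python loops. This is a loose analogy, not SSIS code: process_file, the "*.csv" pattern and the file names are hypothetical, and a temporary folder stands in for the folder the Foreach File enumerator would scan.

```python
import tempfile
from pathlib import Path


def process_file(path):
    """Stand-in for the tasks that sit inside the loop container."""
    return path.name


def for_loop(count):
    """For Loop container: repeat while the condition (i < count) is true."""
    i = 0
    runs = []
    while i < count:
        runs.append(i)
        i += 1
    return runs


def foreach_file(folder, pattern="*.csv"):
    """Foreach File enumerator: run the body once per matching file."""
    return [process_file(p) for p in sorted(Path(folder).glob(pattern))]


with tempfile.TemporaryDirectory() as folder:
    for name in ("a.csv", "b.csv", "notes.txt"):
        (Path(folder) / name).write_text("x")
    processed = foreach_file(folder)

print(for_loop(3), processed)
```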

9.What are the precedence constraints in SSIS?

A.Precedence constraints are used to connect tasks and containers. There are 3 types:
1.On success
2.On failure
3.On completion
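
A small sketch of how the three constraint types decide which follow-up task runs. The function followers_to_run and the task names are hypothetical, used only to show the rule: a success constraint fires when the task succeeds, a failure constraint when it fails, and a completion constraint in both cases.

```python
def followers_to_run(task, constraints):
    """Run the task, then return the follow-up tasks whose constraint is met.

    constraints is a list of (constraint_type, follower_name) pairs.
    """
    try:
        task()
        outcome = "success"
    except Exception:
        outcome = "failure"
    # "completion" fires regardless of the outcome.
    return [name for kind, name in constraints if kind in (outcome, "completion")]


def ok_task():
    pass


def bad_task():
    raise ValueError("task failed")


constraints = [
    ("success", "load_table"),
    ("failure", "send_alert"),
    ("completion", "write_log"),
]

after_ok = followers_to_run(ok_task, constraints)
after_bad = followers_to_run(bad_task, constraints)
print(after_ok, after_bad)
```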

10.What are the sources in SSIS?

A.-OLE DB source (e.g. SQL Server, MySQL, Oracle, DB2)
-Excel source (e.g. Excel files)
-Flat file source (e.g. text files)
-Raw file source
-XML source (e.g. XML files)
-DataReader source (e.g. ADO.NET objects)

11.What are the destinations in SSIS?

A.-DataReader destination
-Excel destination
-Flat file destination
-OLE DB destination
-Raw file destination
-Recordset destination
-SQL Server destination and SQL Server Mobile destination

12.What are the transformations in SSIS?

A.Transformation: It is an intermediate operation between source and destination.
Types of transformations:

1.Aggregate: It is used to perform aggregate operations such as
sum, avg, max, min, count, group by and count distinct on source columns.

2.Audit: It is used to get tracing or tracking information (package ID, package name, task
name, user name, etc.).

3.Character map: It is used to convert character data between scripts (for example,
Simplified and Traditional Chinese) and to convert the case of strings (upper, lower).

4.Conditional split: It is used to split the data into multiple outputs
(destinations) depending on conditions taken from the client requirements.

5.Data conversion: It is used to convert the data into another data type.

6.Copy column: It creates a new column by copying an input column and adding the new column
to the transformation output.

7.Derived column: It is used to convert the data into different data types, and we
can do mathematical operations and string operations with the help of predefined
functions.
8.Union all: It is used to combine multiple inputs into one output (the inputs must have the same structure).

9.Merge: It is used to combine two sorted data sets into a single data set.

10.Merge join: It provides an output that is generated by joining two sorted inputs
using a full, left or inner join.

11.Sort: It is used to sort data into a particular order (ascending or descending).

12.Lookup: The lookup transformation performs lookups by joining data in input columns
with columns in a reference data set.
It is used to get the relevant information from a reference table based on the
key column.

13.OLE DB command: It is used to execute a SQL statement or stored procedure for each
row in the input.

14.Script component: This transformation is used to extract, transform or load data
by running custom script code.

15.Fuzzy grouping: It performs data cleaning tasks by identifying rows of data that
are likely to be duplicates and selecting a canonical row of data to use in
standardising the data. In other words, it groups rows on an approximate match.

16.Fuzzy lookup: It performs data cleaning tasks such as standardizing data,
correcting data and providing missing values. In other words, it performs an
approximate match of the given row values against the rows of a lookup table.

17.Slowly Changing Dimension: Used to synchronise the changes in the OLTP database
tables into data warehouse dimension tables.

SCD 1: When changes occur in the source, the existing content in the destination is
simply updated and overwritten.

SCD 2: Instead of overwriting the existing content, the entire row is kept as
history and one more row is created with the latest changes.

SCD 3: Historical data is maintained, but instead of keeping the entire record as
history, only the previous values of the columns that are updated are kept.
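
The three SCD types can be compared side by side with a tiny in-memory dimension. This is a sketch under assumptions: the key, city, current and previous_city column names are made up for illustration, and a real Type 2 dimension usually also carries start and end dates.

```python
def scd_type1(dimension, key, new_city):
    """Type 1: overwrite the existing value; no history is kept."""
    for row in dimension:
        if row["key"] == key:
            row["city"] = new_city
    return dimension


def scd_type2(dimension, key, new_city):
    """Type 2: keep the old row as history and add a new current row."""
    for row in dimension:
        if row["key"] == key and row["current"]:
            row["current"] = False
    dimension.append({"key": key, "city": new_city, "current": True})
    return dimension


def scd_type3(dimension, key, new_city):
    """Type 3: keep only the previous value, in a separate column."""
    for row in dimension:
        if row["key"] == key:
            row["previous_city"] = row["city"]
            row["city"] = new_city
    return dimension


t1 = scd_type1([{"key": 1, "city": "hyd"}], 1, "pune")
t2 = scd_type2([{"key": 1, "city": "hyd", "current": True}], 1, "pune")
t3 = scd_type3([{"key": 1, "city": "hyd", "previous_city": None}], 1, "pune")
print(t1, t2, t3, sep="\n")
```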

18.How do you do error handling in SSIS?

-- When a data flow component applies transformations to column data, extracts data
from sources, or loads data into destinations, errors can occur.
-- Errors frequently occur because of unexpected data values.
-- Errors generally fall into the following categories.
1. Data conversion
2. Expression evaluation
3. Lookup

1. Data conversion: Data types mismatch.
Ex: When we pass an int instead of a string.
2. Expression evaluation: Expression errors are raised because an invalid
operation is performed.
Ex: When a different operator is passed instead of the '+' operator.
3. Lookup: Occurs because the lookup operation fails to locate a match in the
lookup table.
Ex: A key value in the input has no matching row in the lookup table.
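
One common way to deal with such errors is to send the failing rows to a separate error output instead of stopping the whole load. The sketch below imitates that idea in plain Python for two of the categories (data conversion and lookup); the function name, the column names and the sample rows are assumptions made for the example.

```python
def convert_and_lookup(rows, departments):
    """Split rows into a good output and an error output with a reason."""
    good, errors = [], []
    for row in rows:
        try:
            deptno = int(row["deptno"])  # data conversion step
        except ValueError:
            errors.append((row, "data conversion"))
            continue
        if deptno not in departments:  # lookup step: no match found
            errors.append((row, "lookup"))
            continue
        good.append({**row, "deptno": deptno, "dname": departments[deptno]})
    return good, errors


departments = {10: "it", 20: "sales"}
rows = [
    {"empno": "1001", "deptno": "10"},
    {"empno": "1002", "deptno": "abc"},  # cannot be converted to int
    {"empno": "1003", "deptno": "30"},   # no such department
]

good, errors = convert_and_lookup(rows, departments)
print(good)
print([reason for _, reason in errors])
```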

Lookup Transformation

The lookup transformation finds exactly matching records by joining data in input
columns with columns in a reference data set.

Note: The lookup transformation supports the following database providers for the
OLE DB connection manager:
----SQL Server
----Oracle
----DB2
The lookups performed by the lookup transformation are case sensitive.

Note: In SQL Server 2005 Integration Services (SSIS), the lookup transformation
had only one output.

Fuzzy lookup: The fuzzy lookup transformation uses fuzzy matching to return one
or more close matches from the reference table.
The fuzzy lookup transformation includes three features for customizing the lookup
it performs:

1.Maximum number of matches to return per input row.
2.Token delimiters.
3.Similarity thresholds.
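
Two of these settings (maximum number of matches and similarity threshold) can be tried out with Python's standard difflib module. Note the assumption: difflib uses its own similarity measure, which is not the algorithm SSIS uses, so this only shows what the settings mean. Token delimiters are not modelled here, and the reference values are made up.

```python
from difflib import get_close_matches

# Hypothetical reference table column with the clean values.
REFERENCE = ["hyderabad", "delhi", "mumbai"]


def fuzzy_lookup(value, reference, max_matches=1, threshold=0.8):
    """Return up to max_matches reference values at least threshold similar."""
    return get_close_matches(value.lower(), reference, n=max_matches, cutoff=threshold)


close = fuzzy_lookup("Hydrabad", REFERENCE)  # misspelt input value
nothing = fuzzy_lookup("xyz", REFERENCE)     # nothing similar enough
print(close, nothing)
```

Raising the threshold makes the lookup stricter; raising max_matches lets one input row return several candidate rows.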

How do you do logging in SSIS?

-- SSIS includes logging features that write log entries when run-time events occur
and can also write custom messages.
-- SSIS log providers can write log entries to text files, SQL Server, SQL Server
Profiler and XML files.
-- To enable logging in a package, follow these 3 steps.
1. In BIDS, open the Integration Services project that contains the package.
2. On the SSIS menu, click Logging.
3. Select a log provider in the Provider type list and then click Add.

1)Conditional split:

1).It splits the data based on conditions.
2).Two types of output come from this transformation.
a)Condition-matched outputs
b)Condition-unmatched output (or) default output.
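
A minimal sketch of the same behaviour in Python. Each row goes to the first output whose condition it matches, and rows matching no condition go to the default output. The output names and conditions are hypothetical; the sample rows reuse the emp values from these notes.

```python
def conditional_split(rows, conditions):
    """conditions is an ordered list of (output_name, predicate) pairs."""
    outputs = {name: [] for name, _ in conditions}
    outputs["default"] = []
    for row in rows:
        for name, predicate in conditions:
            if predicate(row):
                outputs[name].append(row)
                break  # first matching condition wins
        else:
            outputs["default"].append(row)  # no condition matched
    return outputs


rows = [
    {"empno": 1001, "sal": 3000, "deptno": 10},
    {"empno": 1002, "sal": 3000, "deptno": 20},
    {"empno": 1003, "sal": 4000, "deptno": 30},
]
conditions = [
    ("high_salary", lambda r: r["sal"] >= 4000),
    ("dept_10", lambda r: r["deptno"] == 10),
]

outputs = conditional_split(rows, conditions)
print({name: [r["empno"] for r in out] for name, out in outputs.items()})
```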

1)Audit transformation:
It is used to populate audit information such as package name, machine name,
execution time, etc.
We can populate the same information using the derived column transformation (t/r).
--It adds the audit information to every row coming from the source.
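
In plain Python the effect is just extra columns added to every row. The column names and the package name "LoadEmp" below are assumptions for illustration; SSIS takes these values from its own system variables.

```python
import platform
from datetime import datetime, timezone


def add_audit_columns(rows, package_name):
    """Append the same audit values to every row coming from the source."""
    audit = {
        "package_name": package_name,
        "machine_name": platform.node(),
        "execution_time": datetime.now(timezone.utc).isoformat(),
    }
    return [{**row, **audit} for row in rows]


audited = add_audit_columns([{"empno": 1001}, {"empno": 1002}], "LoadEmp")
print(audited[0])
```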

2)Logging: Logging is a feature which creates records that trace the execution
of tasks and containers within the package.

3)Copy column: By using this t/r we can copy the data from an existing
column to a new column. In the same way, using the derived column t/r we can
copy the data from an existing column to a new column.

4)Row count: It is used to capture the number of rows passing through the data
flow into a variable.
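
A rough Python picture of row count: rows pass through unchanged and the total is stored in a variable once the last row has gone through. The variables dictionary and the name "User::RowCount" are stand-ins chosen for the example.

```python
def row_count(rows, variables, name):
    """Pass rows through unchanged and store the total in a variable."""
    count = 0
    for row in rows:
        count += 1
        yield row
    # The variable is written only after the last row has passed through.
    variables[name] = count


variables = {}
passed = list(row_count([{"empno": 1001}, {"empno": 1002}, {"empno": 1003}],
                        variables, "User::RowCount"))
print(variables)
```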

5)Script component: It can be used as a source, transformation or destination.
If you have any complicated business logic which is not supported
by the other t/r, then you can use the script component.

6)OLE DB command: It is used to execute a SQL statement dynamically for each and
every row in the input.
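
The row-by-row behaviour can be imitated with SQLite from Python's standard library: one parameterised statement is executed once per input row. The emp table, the raise column and the values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, sal REAL)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", [(1001, 3000), (1002, 3000)])

# Hypothetical input rows arriving from the data flow.
input_rows = [{"empno": 1001, "raise": 500}, {"empno": 1002, "raise": 250}]

# One parameterised statement is executed for each input row.
for row in input_rows:
    conn.execute(
        "UPDATE emp SET sal = sal + ? WHERE empno = ?",
        (row["raise"], row["empno"]),
    )
conn.commit()

result = conn.execute("SELECT empno, sal FROM emp ORDER BY empno").fetchall()
print(result)
```

Because a statement runs for every row, this pattern gets slow on large inputs.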

7)SCD:
To process the data from granularity tables to the main table, we follow a
mechanism called slowly changing dimension.

SCD determines the way in which changes from the source are
maintained in the target.

Type 1: Type 1 maintains only the most recent data in the target table.
If any record is updated in the source, that record has to be overwritten in the
target table.
If any new record comes from the source, that record has to be inserted
in the target table.

Type 2: Type 2 maintains historical data.
For each and every record inserted or changed in the source, a record also has to be
inserted in the target table, so the earlier versions are kept as history.

Type 3: Type 3 maintains partial history,
for example when you want to maintain only the current record and the previous record.
First, the older previous record has to be deleted from the target table.
If any new record comes from the source, that record has to be inserted in the
target table.
Note: The SSIS slowly changing dimension t/r doesn't support Type 3.
8)Execute SQL task:
The Execute SQL task is used to execute SQL statements such as DDL, DML and TCL.

9)Lookup: The lookup t/r is used to get the relevant information from the reference
table based on the key field.
The lookup t/r is also used in the slowly changing dimension to check
whether the incoming record already exists in the target table.

Example: emp (input) ---lookup---> dept (reference)

emp                              dept
empno  ename  sal   deptno       deptno  dname  loc
1001   shiva  3000  10           10      it     hyd
1002   ramu   3000  20           20      sales  hyd
1003   fuel   4000  30           40      it     hyd

Cache modes:

Full cache: The reference table is stored in cache memory (buffer) and the database
link is then dropped (disconnected mode). Memory use is high and performance is high.
The cache is refreshed each time the package runs.

Partial cache: Memory use is low.

No cache: Connected mode. The lookup table is locked and not cached, so performance is low.
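
The difference between full cache and no cache can be sketched with SQLite. The emp and dept rows are the sample values from these notes; the function names are made up, and partial cache is left out. Full cache reads the reference table once into memory, while no cache goes back to the database for every input row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (deptno INTEGER, dname TEXT, loc TEXT)")
conn.executemany(
    "INSERT INTO dept VALUES (?, ?, ?)",
    [(10, "it", "hyd"), (20, "sales", "hyd"), (40, "it", "hyd")],
)

emp_rows = [(1001, "shiva", 3000, 10), (1002, "ramu", 3000, 20), (1003, "fuel", 4000, 30)]


def lookup_full_cache(rows):
    """Full cache: load the whole reference table once, then match in memory."""
    cache = dict(conn.execute("SELECT deptno, dname FROM dept"))
    return [(empno, cache.get(deptno)) for empno, _, _, deptno in rows]


def lookup_no_cache(rows):
    """No cache: query the reference table once for every input row."""
    out = []
    for empno, _, _, deptno in rows:
        hit = conn.execute(
            "SELECT dname FROM dept WHERE deptno = ?", (deptno,)
        ).fetchone()
        out.append((empno, hit[0] if hit else None))
    return out


full = lookup_full_cache(emp_rows)
none = lookup_no_cache(emp_rows)
print(full)
```

Both modes give the same matches; employee 1003 has no match because deptno 30 is not in dept.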

To list the packages stored in the msdb database:
select * from msdb.dbo.sysssispackages

10)Row sampling:
It is used to select a specific number of random rows from the input data set.
It produces a sampling selected output as well as a sampling unselected
output.

11)Percentage sampling:
It is used to select a specific percentage of random rows from
the input data set.
It produces a sampling selected output as well as a sampling unselected
output.
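
Both sampling transformations can be imitated with Python's random module. This is a sketch under assumptions: the function names and the seed parameter are made up, and the percentage version picks each row independently, so the selected share is only approximately the requested percentage.

```python
import random


def row_sampling(rows, n, seed=None):
    """Select exactly n random rows; return (selected, unselected)."""
    rng = random.Random(seed)
    picked = set(rng.sample(range(len(rows)), min(n, len(rows))))
    selected = [r for i, r in enumerate(rows) if i in picked]
    unselected = [r for i, r in enumerate(rows) if i not in picked]
    return selected, unselected


def percentage_sampling(rows, percent, seed=None):
    """Select each row with the given probability; return (selected, unselected)."""
    rng = random.Random(seed)
    selected, unselected = [], []
    for r in rows:
        (selected if rng.random() * 100 < percent else unselected).append(r)
    return selected, unselected


rows = list(range(1, 11))
sel_n, unsel_n = row_sampling(rows, 3, seed=1)
sel_p, unsel_p = percentage_sampling(rows, 30, seed=1)
print(len(sel_n), len(unsel_n), len(sel_p) + len(unsel_p))
```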
