0% found this document useful (0 votes)
4 views11 pages

SQL Server Integration Services

SQL Server 2005 introduces SQL Server Integration Services (SSIS), a new enterprise ETL platform designed to enhance data extraction, transformation, and loading processes. SSIS offers a comprehensive set of features including data cleansing, extensibility, and integration with other SQL Server services, all within a robust development environment. The article provides an overview of SSIS capabilities and a step-by-step guide for creating a simple ETL package using Business Intelligence Development Studio.

Uploaded by

maulikshopping
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
4 views11 pages

SQL Server Integration Services

SQL Server 2005 introduces SQL Server Integration Services (SSIS), a new enterprise ETL platform designed to enhance data extraction, transformation, and loading processes. SSIS offers a comprehensive set of features including data cleansing, extensibility, and integration with other SQL Server services, all within a robust development environment. The article provides an overview of SSIS capabilities and a step-by-step guide for creating a simple ETL package using Business Intelligence Development Studio.

Uploaded by

maulikshopping
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
You are on page 1/ 11

SQL Server Integration Services: SQL Server 2005's New ETL Platform

Go to page: 1 2 3 Next

Microsoft SQL Server 2005 provides a completely new enterprise extraction,


transformation, and loading (ETL) platform called SQL Server Integration Services
(SSIS) that ships with the features, tools, and functionality to build both classic and
innovative kinds of ETL-based applications. This article examines some of the exciting
SSIS features that you can use to build ETL applications. Along the way, you will also
learn how to build a simple package using the new Business Intelligence
Development Studio, which is a key component of SSIS features.

A History Lesson
Back in the days when Microsoft SQL Server 6.5 was still a hot product, the bulk copy
program was the only way to back up or export databases from SQL Server to other
data sources, such as other database servers or text files. However, in subsequent
SQL Server versions, Microsoft added a large number of features that were aimed at
improving the productivity of SQL Server developers and DBAs. One such feature is
DTS (Data Transformation Services). By creating DTS packages, you can combine
several tasks into one process and use any programming language that supports
automation (such as Visual Basic.NET, Visual C#, or Managed C++) to execute these
packages and monitor their progress for errors.

With the upcoming SQL Server 2005 release, Microsoft has raised the bar by
introducing SSIS, the brand new ETL tool. Similar to DTS, it provides functions for
moving data from one place to another and manipulating that data at run time.
However, SSIS has been completely redesigned from scratch to be an enterprise ETL
platform. SSIS provides the breadth of features—and very high-level performance—
necessary to build enterprise-class ETL applications.

SSIS is fully programmable, embeddable, and extensible, which makes it an ideal ETL
platform. It has a great development environment hosted in a Visual Studio shell with
cool capabilities for building workflows and pipelines through a rich set of pre-built or
custom components. SQL Server 2005 ships with the following development and
management environments for designing and managing packages:

 The SQL Server Management Studio is an environment for managing the


storage and execution of deployed packages. It has special features for doing
this, including integration with the DTS Service, and the ability to enumerate
packages on remote servers. But it is not a design environment.
 The Business Intelligence Design Studio is an environment for designing
packages, organizing them in Solutions and Projects, debugging them, and
managing source and version control for multi-user projects.

These two environments offer powerful facilities for deploying, debugging, and
monitoring deployed packages. In addition, SQL Server 2005 provides rich workflow
capabilities that you can use for performing sophisticated data manipulations.

SSIS Features
There are too many SSIS features to cover in a single article, so this piece highlights
some of the important ones and then moves on to creating a simple package with
Business Intelligence Development Studio. The following are the most notable
features:

 You can use SSIS to transfer millions of rows of data to and from
heterogeneous data sources, but SSIS functionality doesn't stop there. The
tool leverages the end-to-end BI suite by offering complete data integration,
movement, and shaping, which means that SSIS provides data cleansing,
extensibility, and interoperability.
 SSIS comes with a lot of pre-built data-cleansing functionality, including
completely integrated functions, such as fuzzy matching and fuzzy grouping,
that use algorithms to match or group disparate data to a configurable degree
of accuracy.
 SSIS offers broad extensibility points for third-party component vendors,
meaning that if functionality is not available out of the box, SSIS lets you build
your own components or add a third-party component to solve your problem.
 SSIS can load data directly into Analysis Services cubes, and it also offers
robust data-mining features for including scalable data-mining model creation,
training, and predictions.
 SSIS is seamlessly integrated with SQL Server Reporting Services integration,
which lets you treat an SSIS package as the data source for reporting.
 SSIS also has greatly improved performance and scalability that allow you to
host complex, high-volume ETL applications on lightweight servers, enabling
you to scale down.
 SSIS can also help reduce ETL data staging areas and help minimize
performance costs associated with data staging (disk I/O and serial
processing). This is made possible by the ability to perform complex data
transformations, data cleansing, and high-volume lookups—all inline from
source to destination.
 SSIS also provides a new feature, the Slowly Changing Dimension (SCD)
wizard. Through the SCD interface, you can rapidly generate all the steps and
required code to add unique handling of history to multiple attributes in a
given dimension.
 The development environment for SSIS, known as Business Intelligence
Development Studio, is hosted in Visual Studio, enabling scripting and other
programming tasks that take advantage of that enterprise development
environment.
 SSIS now fully supports the Microsoft .NET Framework, allowing developers to
program SSIS in their choice of .NET-compliant languages, as well as native
code.
 The Data Transformation run-time engine is exposed both as a native COM
object model and as an entirely managed object model. Although the Data
Transformation engine is written in native code, it is available though a signed
Primary Interop Assembly (PIA) that enables full, managed access to it.

Now that you have a general understanding of the SSIS features, consider an
example that shows the steps involved in constructing and executing a package
using the Business Intelligence Development Studio environment.

Creating a Simple Package


Consider a simple example wherein you transfer data from one table to another table
by executing a stored procedure. For each row in the source table, you will invoke
the stored procedure to load the data into the destination table. (Click here to
download the accompanying source code for the demo.)
Before looking at the package design, create the required tables and the stored
procedure. You will create these database objects in the AdventureWorks database,
which is one of the sample databases that ships with SQL Server 2005.

To start, create the source table using the following definition:

CREATE TABLE [dbo].[EmpName](


[FirstName] [varchar](50) NOT NULL,
[LastName] [varchar](50) NOT NULL
) ON [PRIMARY]

The destination table definition is as follows:

CREATE TABLE [dbo].[Emp](


[EmpID] [int] IDENTITY(1,1) NOT NULL,
[EmpName] [varchar](100) NOT NULL
) ON [PRIMARY]

The stored procedure to load the data into the destination table is as follows:

CREATE PROCEDURE [dbo].[InsertEmp]


@FirstName [varchar](50),
@LastName [varchar](50)
WITH EXECUTE AS CALLER
AS
Insert into Emp(EmpName) Values (@LastName + ', ' + @FirstName)
GO

Using the package, you will retrieve the FirstName and LastName columns from the
EmpName table and load them into the Emp Table through the InsertEmp stored
procedure. Inside the stored procedure, you will concatenate the FirstName and
LastName columns and assign the concatenated value to the EmpName column of
the Emp table. Now that you understand the package functionality, take a look at the
design of the package, as shown in the next section.

Creating a Package Using Business Intelligence Studio


Before creating the package, you first need to create a project that can host the
package. Open Business Intelligence Development Studio from the Start->Programs
menu. Select File->New Project from the menu and you will see the dialog box in
Figure 1.
Figure 1: New Project Dialog Box

As Figure 1 shows, select Integration Services Project as the project template and
specify the name of the project as SSISDemo. Once the project is created, you will
see the package designer window of the default package Package.dtsx.

For the requirements you've outlined, the package you are going to create is very
simple and straightforward. It consists of one Data Flow task with two components
inside the data flow: an OLE DB Source adapter and an OLE DB Command
Transformation. To start, bring up the toolbox by selecting View->Toolbox from the
menu. You will see the dialog box in Figure 2.
Figure 2: Toolbox Dialog Box

As you can see, the toolbox offers a number of options that you can leverage to
construct your packages. Drag and drop a Data Flow Task from the toolbar onto the
package designer. Next, double-click on the Data Flow Task and you will see the
designer window in Figure 3, which you can use to build sophisticated data flows.
Figure 3: Data Flow Task Designer Window

Now, bring up the toolbox again by selecting View->Toolbox from the menu. From
the toolbox, drag and drop an OLE DB Source onto the designer and double-click on
the object to bring up the window in Figure 4. The window allows you to specify the
data source as well as the name of the source table to use for transformation.
Figure 4: Designer View of OLE DB Source

Click on OK in the Figure 4 dialog box.

Now, drag and drop an OLE DB Command object onto the designer and connect the
OLE DB Source object to the OLE DB Command object. Double-click on the OLE DB
Command object to bring up its properties window. Modify the Connection Managers
tab of the properties window to look like Figure 5.
Figure 5: Connection Managers Tab of the OLE DB Command Object's Properties
Window

Navigate to the Component Properties tab and specify the stored procedure details in
the SqlCommand properties text box as Figure 6 shows.
Figure 6: Stored Procedure Details in the SqlCommand Properties Text Box

As you can see from Figure 6, the InsertEmp procedure is used to load the data into
the Emp table and the parameters to that procedure will be supplied at run time. The
mapping between the source table columns and the stored procedure parameters is
what you will specify in the Column Mapping tab. Note that in the Component
Properties tab; once you specify the SqlCommand property value and hit the Refresh
button, the title of the Input Columns tab in the Figure 6 dialog box automatically
changes to Column Mappings.

In the Column Mappings tab, you can either drag a column (between the input
columns and destination columns) or you can use the dropdown list to select the
corresponding input and output columns. Once you perform the mappings, a window
like Figure 7 appears.
Figure 7: The Resulting Mappings from Column Mappings Tab

Click OK in the Column Mappings dialog box. That's all there is to creating a package
with the new package designer. Now, you are ready to execute the package. Just
press F5, and you will see the output in Figure 8.
Figure 8: Output of a Successful Package Execution

If the package executes successfully, you will see an output that is somewhat similar
to Figure 8, which shows the objects on a green background. The above screenshot
also shows the number of rows transferred between the source and destination
tables.

A Solid Foundation for Creating Packages

With the release of SQL Server Integration Services, Microsoft now has a powerful ETL
tool that is not only enterprise class but can also go a long way in increasing the
productivity of developers. Its feature set makes it extremely easy and seamless to
build sophisticated, high-performance ETL applications. This article has only
scratched the surface of what is possible with SSIS, but the simple example it gave
should provide a solid foundation for creating packages with SSIS. For more
information, check out the following Web sites:

 http://msdn.microsoft.com/SQL/sqlwarehouse/SSIS/default.aspx
 http://blogs.conchango.com/jamiethomson/category/71.aspx
 http://sqljunkies.com/WebLog/donald%20farmer
 hhttp://sqljunkies.com/WebLog/kamalh
 http://sqljunkies.com/WebLog/knight_reign
 http://sqlis.com

Download the Code

To download the accompanying source code for the demo, click here.

You might also like