An Introduction to OLAP in SQL Server 2005 Página 1 de 4

http://www.devx.com Printed from http://www.devx.com/dbzone/Article/21410

An Introduction to OLAP in SQL Server 2005


Get a preview of the upcoming SQL Server 2005's Business Intelligence suite and find out
about the major OLAP components of Analysis Services.
by Gail Tieh
If you're a database developer, you've no doubt heard of On-Line Analytical Processing (OLAP) and the
advantages of analysis using multi-dimensional, pre-aggregated data. Maybe you've even thought about
creating your own multidimensional cubes to give your end users true ad hoc capabilities, including the
creation of calculated measures/KPIs. If you've relegated that task to the back burner because it was too
complex, you'll be happy to know that SQL 2005 has made the process easier.

This article discusses the major OLAP components of Analysis Services, all of which can be implemented by
even a first-time cube builder. A follow up article by Mark Frawley will examine the differences between
Analysis Services in SQL 2000 and SQL 2005.

Why Use OLAP?


OLAP is useful because it provides fast and interactive access to aggregated data and the ability to drill down
to detail. OLAP lets users view and interrogate large volumes of data (often millions of rows) by pre-
aggregating the information. It puts the data needed to make strategic decisions directly into the hands of the
decision makers, not only through pre-defined queries and reports, but also because it gives end users the
ability to perform their own ad hoc queries, minimizing users' dependence on database developers.

What's the Secret?


OLAP leverages existing data from a relational schema or data warehouse (data source) by placing key
performance indicators (measures) into context (dimensions). Once processed into a multidimensional
database (cube), all of the measures are pre-aggregated, which makes data retrieval significantly faster. The
processed cube can then be made available to business users who can browse the data using a variety of
tools, making ad hoc analysis an interactive and analytical process rather than a development effort. SQL
Server 2005's BI Workbench substantially improves upon SQL Server 2000's BI capability.
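
To make the idea of pre-aggregation concrete, here is a minimal Python sketch. It is not how Analysis Services stores a cube; the sample rows, the `ALL` marker, and the `build_cube` helper are assumptions made purely for illustration. It computes every total once, so that later queries become simple lookups.

```python
from collections import defaultdict
from itertools import product

# Hypothetical detail rows: (year, region, sales). Sample data only.
detail_rows = [
    (2002, "East", 10.0),
    (2002, "West", 20.0),
    (2003, "East", 30.0),
]

ALL = "All"  # marker meaning "aggregated over this dimension"

def build_cube(rows):
    """Pre-aggregate sales for every combination of year/All and region/All."""
    cube = defaultdict(float)
    for year, region, sales in rows:
        for y, r in product((year, ALL), (region, ALL)):
            cube[(y, r)] += sales
    return dict(cube)

cube = build_cube(detail_rows)

# Queries are now dictionary lookups rather than scans of the detail rows.
print(cube[(2002, ALL)])    # 30.0
print(cube[(ALL, "East")])  # 40.0
print(cube[(ALL, ALL)])     # 60.0
```

The cost of the aggregation is paid once, at processing time, which is the trade-off that makes browsing a processed cube feel interactive.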

SQL Server 2005 BI Workbench Platform


The SQL Server 2005 BI Workbench suite consists of five basic tools:

• SQL Server Relational Database: used to create the relational database
• Analysis Services: used to create the multidimensional model (measures, dimensions, and schema)
• Data Transformation Services (DTS): used to extract, transform, and load data from the source(s) to
  the data warehouse or schema
• Reporting Services: used to build and manage enterprise reporting using the relational or
  multidimensional sources
• Data Mining: used to extract information based on predetermined algorithms

Figure 1. Analysis Services Architecture: The figure shows the relationship of the various technology
tiers involved in Analysis Services.

The remainder of this article focuses on multidimensional modeling using Analysis Services and briefly
touches upon DTS's role. Figure 1 shows the Analysis Services architecture.

Elements of Multidimensional Models


To fully leverage the SQL Server 2005 BI Workbench platform, one must first understand the basic elements
of multidimensional modeling. The basic elements of a multidimensional cube are measures, dimensions,
and schema.

Measures
Measures are the key performance indicators that you want to evaluate. To determine which of the numbers
in the data might be measures, a rule of thumb is: If a number makes sense when it is aggregated, then it is a
measure. For example, it makes sense to aggregate daily volume to month, quarter and year. On the other
hand, aggregating zip codes or telephone numbers would not make sense; therefore, zip codes and
telephone numbers are not measures. Typical measures include volume, sales, and cost.
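
As a rough illustration of this rule of thumb, the following Python sketch rolls daily volume up to month and year. The sample rows and the `roll_up` helper are hypothetical and are not part of Analysis Services.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily volume figures (date, units). Sample data only.
daily_volume = [
    (date(2003, 1, 15), 120),
    (date(2003, 1, 16), 80),
    (date(2003, 2, 1), 200),
    (date(2004, 1, 5), 50),
]

def roll_up(rows, key):
    """Sum the measure for every distinct value of key(day)."""
    totals = defaultdict(int)
    for day, units in rows:
        totals[key(day)] += units
    return dict(totals)

by_month = roll_up(daily_volume, lambda d: (d.year, d.month))
by_year = roll_up(daily_volume, lambda d: d.year)

print(by_month)  # {(2003, 1): 200, (2003, 2): 200, (2004, 1): 50}
print(by_year)   # {2003: 400, 2004: 50}
```

Summing volume gives a meaningful total at every level, which is what marks it as a measure. Summing zip codes the same way would produce a number with no meaning.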

Dimensions
Dimensions are the categories of data analysis. The rule of thumb is: When a report is requested "by"
something, that something is usually a dimension. For example, in a revenue report by month by sales region,
the two dimensions needed are time and sales region. For this reason, OLAP analysts often nickname
dimensions the "bys." Typical dimensions include product, time, and region.

Dimensions are arranged in hierarchical levels, with unique positions within each level. For example, a time
dimension may have four levels, such as Year, Quarter, Month, and Day. Or the dimension might have only
three levels, for example, Year, Week, and Day. The values within the levels are called members. For
example, the years 2002 and 2003 are members of the level Year in the Time dimension.
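
The relationship between a dimension, its levels, and its members can be sketched as a small nested structure. This is only an illustration in Python; the `time_dimension` data, the `LEVELS` list, and the `members` helper are assumptions and do not reflect how Analysis Services represents a dimension internally.

```python
# Hypothetical Time dimension with three levels: Year > Quarter > Month.
time_dimension = {
    "2002": {"Q1": ["Jan", "Feb", "Mar"], "Q2": ["Apr", "May", "Jun"]},
    "2003": {"Q1": ["Jan", "Feb", "Mar"]},
}

LEVELS = ["Year", "Quarter", "Month"]

def members(dimension, level):
    """Return the members at the given level, each qualified by its parents."""
    depth = LEVELS.index(level)
    result = []
    for year, quarters in dimension.items():
        if depth == 0:
            result.append((year,))
            continue
        for quarter, months in quarters.items():
            if depth == 1:
                result.append((year, quarter))
            else:
                result.extend((year, quarter, month) for month in months)
    return result

print(members(time_dimension, "Year"))     # [('2002',), ('2003',)]
print(members(time_dimension, "Quarter"))  # [('2002', 'Q1'), ('2002', 'Q2'), ('2003', 'Q1')]
```

Qualifying each member by its parents keeps positions unique within a level: Q1 of 2002 and Q1 of 2003 are distinct members.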

As a best practice, we believe a cube should have no more than twelve dimensions. A cube with more than
twelve dimensions becomes difficult to understand and browse. Too many dimensions can confuse end users,
and too many dimensions and aggregations can also lead to "data explosion": as the number of dimensions
and levels increases, the amount of data grows exponentially. As mentioned earlier, an OLAP application is
typically used to manipulate large volumes of data. To optimize response time, Analysis Services usually
pre-aggregates a multidimensional schema.
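
A back-of-the-envelope calculation shows why the data grows so quickly. The sketch below uses invented member counts and a deliberately crude upper bound (the product of the leaf-level member counts). Real cubes are sparse, so actual storage is normally far below this bound.

```python
from math import prod

# Hypothetical leaf-level member counts per dimension.
member_counts = {"Time": 365, "Product": 1000, "Region": 50}

def potential_cells(counts):
    """Upper bound on the number of cells: the product of the member counts."""
    return prod(counts.values())

print(potential_cells(member_counts))  # 18250000

# Adding one more dimension multiplies the bound by its member count.
member_counts["Customer"] = 10_000
print(potential_cells(member_counts))  # 182500000000
```

Each extra dimension multiplies the bound rather than adding to it, which is the growth pattern behind the term "data explosion."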

A dimension can be thought of as a tree structure. Many OLAP tools present it in a tree control (see
Figure 2). This familiar software control makes dimensions easier to use because it shows dimension
members and their relationships together, and it lets users view data at different levels simultaneously.

Figure 2. A Tree-Structured Multidimensional Schema: The figure shows the Excel Pivot Table interface,
consisting of a tree view for a multidimensional structure.

Schema
The dimensions and measures are physically represented by a star schema. The most basic star
schema arranges the dimension tables around a central fact table that contains the measures (see Figure 3).

A fact table contains a column for each measure as well as a column for
each dimension. Each dimension column has a foreign-key relationship to
the related dimension table, and the dimension columns taken together are
the key to the fact table.
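
A minimal star schema can be sketched with Python's built-in `sqlite3` module. The table and column names (`dim_time`, `dim_region`, `fact_sales`) and the sample rows are assumptions for illustration only; a real warehouse would live in SQL Server and be far larger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_time   (time_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, name TEXT);
-- Fact table: one foreign-key column per dimension plus one column per measure.
-- The dimension columns taken together form the primary key.
CREATE TABLE fact_sales (
    time_id   INTEGER REFERENCES dim_time(time_id),
    region_id INTEGER REFERENCES dim_region(region_id),
    revenue   REAL,
    PRIMARY KEY (time_id, region_id)
);
INSERT INTO dim_time VALUES (1, 2003, 1), (2, 2003, 2);
INSERT INTO dim_region VALUES (1, 'East'), (2, 'West');
INSERT INTO fact_sales VALUES (1, 1, 100.0), (1, 2, 150.0), (2, 1, 200.0);
""")

# "Revenue by month by region": join the fact table to its dimension tables.
rows = conn.execute("""
    SELECT t.year, t.month, r.name, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_time t   ON t.time_id = f.time_id
    JOIN dim_region r ON r.region_id = f.region_id
    GROUP BY t.year, t.month, r.name
    ORDER BY t.year, t.month, r.name
""").fetchall()
print(rows)
# [(2003, 1, 'East', 100.0), (2003, 1, 'West', 150.0), (2003, 2, 'East', 200.0)]
```

The query mirrors the "by" rule of thumb from the Dimensions section: each "by" in the report request becomes a join to a dimension table and a column in the GROUP BY clause.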

After determining the measures, dimensions, and schema using the BI Workbench, there is one more step:
you must decide where the data aggregation is to be stored. Historically, there were three basic storage
options: Multidimensional OLAP (MOLAP), Relational OLAP (ROLAP), or Hybrid OLAP (HOLAP). SQL Server
2005's introduction of what Microsoft calls the Unified Dimensional Model, which leverages the best of
relational and OLAP cube technologies, allows the designer many more storage options and, unlike SQL
Server 2000, allows combining them in the same solution.

Figure 3. Simple Star Schema: The figure shows a basic star schema with the dimension tables arranged
around a central fact table that contains the measures.

DTS
Microsoft's Data Transformation Services (DTS) is perhaps the most critical tool in an OLAP project. DTS is
used to pull data from various sources into the star schema. The data warehouse will, in turn, feed the
Analysis Services database. More often than not, you must transform data from the source (for example, you
may have to convert currency values, balance calculations, and the like) and remap it. Microsoft has
estimated that in most cases, organizations spend eighty percent of their data warehousing effort on the
extract, transform, and load (ETL) phase.
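
The three ETL stages can be sketched in a few lines of Python. This is a toy pipeline and not a DTS package: the source rows, the exchange rates in `RATES_TO_USD`, and the column names are all invented for illustration.

```python
# Hypothetical source rows: amounts arrive in mixed currencies.
source_rows = [
    {"order_id": 1, "amount": 100.0, "currency": "USD"},
    {"order_id": 2, "amount": 200.0, "currency": "EUR"},
]

# Assumed, illustrative exchange rates to USD (not real market data).
RATES_TO_USD = {"USD": 1.0, "EUR": 1.25}

def extract():
    """Extract: read rows from the source system (here, an in-memory list)."""
    return list(source_rows)

def transform(rows):
    """Transform: convert every amount to USD and remap the column names."""
    return [
        {
            "order_key": row["order_id"],
            "amount_usd": row["amount"] * RATES_TO_USD[row["currency"]],
        }
        for row in rows
    ]

def load(rows, warehouse):
    """Load: append the cleaned rows to the warehouse table (a list here)."""
    warehouse.extend(rows)
    return warehouse

warehouse_table = load(transform(extract()), [])
print(warehouse_table)
# [{'order_key': 1, 'amount_usd': 100.0}, {'order_key': 2, 'amount_usd': 250.0}]
```

Keeping each stage in its own function makes the currency conversion and the remapping easy to test separately from the reading and writing.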

Visual Studio 2005 hosts a new tool, BI Workbench, which is a replacement for DTS Designer. Chief among
the improvements found in BI Workbench is its separation of control flow (insertions, looping, sequencing,
scripting, etc.) from the data flow (source identification, aggregation, character mapping, and data conversion)
tasks (see Figure 4 and Figure 5). This separation makes DTS packages easier to read, develop, and
maintain. BI Workbench is reason enough to learn and use Visual Studio 2005.

Figure 4. Control Flow Task Diagram: The figure shows a typical control flow and an associated task
statement.

Figure 5. Data Flow Diagram: The figure shows a typical data flow from source to destination.

Because DTS has been completely reworked in SQL Server 2005, current SQL 2000 DTS users will need to
brush up on DTS and learn a few new tricks.

Working with Analysis Services


After identifying the dimensions and measures you wish to analyze, you can use Analysis Services to
construct an OLAP cube.

Analysis Services has built-in wizards that make the actual process of creating dimensions fairly easy,
especially if you're already familiar with SQL 2000's version, although SQL Server 2005's version does add
one additional step—you must create a Data Source View to import your database objects.

MDX
Just as you use SQL to query relational databases, you use MDX to query a multidimensional cube (see
Figure 6). For those who are eager to interrogate the cube without learning MDX, there is an Excel Pivot
Table add-in that provides a drag-and-drop query interface. This interface generates MDX and queries the
cube on behalf of the user and, as a special bonus, displays the results in Excel.

Figure 6. SQL vs. MDX: The figure compares data extraction using SQL vs. MDX.

You can also use MDX to create "calculated measures" that would be too complex or impossible to do in
SQL. For example, suppose the VP of Sales wants to know the average sales price of each product.
Unfortunately, average sales price is not a measure in the Sales cube; however, Store Sales and Sales
Count are available. Because you can calculate Average Sales Price by dividing Store Sales by Sales Count,
you can calculate the measure (hence the name "calculated measure") by using MDX. Here's the MDX code.

WITH
    MEMBER Measures.[Average Sale Price] AS
        'Measures.[Store Sales] / Measures.[Sales Count]'
SELECT
    { Measures.[Average Sale Price] } ON COLUMNS,
    { Product.CHILDREN } ON ROWS
FROM Sales

Luckily, some third party tools let users create calculated measures that may have been intentionally omitted
from the original cube design, such as commission or bonus calculations.
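
For readers who want to check the arithmetic behind the calculated measure without a cube at hand, the same division can be sketched in Python. The `sales_cube` figures and product names are hypothetical stand-ins for the stored Store Sales and Sales Count measures; the zero-count guard is an addition of this sketch and is not part of the MDX above.

```python
# Hypothetical per-product totals standing in for the cube's stored measures.
sales_cube = {
    "Drink": {"store_sales": 500.0, "sales_count": 200},
    "Food": {"store_sales": 1200.0, "sales_count": 300},
}

def average_sale_price(cell):
    """Calculated measure: Store Sales divided by Sales Count."""
    if cell["sales_count"] == 0:
        return None  # avoid dividing by zero for products with no sales
    return cell["store_sales"] / cell["sales_count"]

result = {product: average_sale_price(cell) for product, cell in sales_cube.items()}
print(result)  # {'Drink': 2.5, 'Food': 4.0}
```

As in the MDX version, the measure is derived at query time from two stored measures rather than being stored itself.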

Cube Browser
After creating the cube, you need a cube browser to connect to the cube and display the data. Cube browsers
usually provide user-friendly tree-structured dimension filters and/or drag and drop interfaces that allow end
users to interrogate the cube. You can set up pre-defined queries, or allow ad hoc querying by letting users
combine the various measures with dimensions.

For example, suppose you want to create a report that shows Revenue by Sales Territory by Product.
Because dimensions are hierarchical, you can obtain the details of a dimension by drilling down. This usually
involves clicking on the dimension (for example, clicking on Sales Territory may reveal each store's level in
that dimension).
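
Drilling down amounts to replacing one total with the totals of the level beneath it. The Python sketch below illustrates this with invented territories, stores, and revenue figures. The helper names are hypothetical, and a real cube browser would issue MDX rather than filter rows in memory.

```python
# Hypothetical revenue facts: (territory, store, revenue). Sample data only.
facts = [
    ("East", "Store 1", 100.0),
    ("East", "Store 2", 50.0),
    ("West", "Store 3", 75.0),
]

def revenue_by_territory(rows):
    """Top-level view: total revenue for each sales territory."""
    totals = {}
    for territory, _store, revenue in rows:
        totals[territory] = totals.get(territory, 0.0) + revenue
    return totals

def drill_down(rows, territory):
    """Next level down: revenue for each store inside one territory."""
    return {store: rev for terr, store, rev in rows if terr == territory}

print(revenue_by_territory(facts))  # {'East': 150.0, 'West': 75.0}
print(drill_down(facts, "East"))    # {'Store 1': 100.0, 'Store 2': 50.0}
```

The store-level figures for East add up to the East total shown in the first view, which is the consistency a user expects when expanding a node in a cube browser.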

Dimensions can have multiple levels (such as year, quarter, and month). Users can mix and match members
within the same dimension. Furthermore, some cube browsers enable developers to export cube browsers as
a Web part that they can then easily include in a portal site or digital dashboard.

There are three basic types of cube browsers:

• Office Web components such as the Excel Pivot Table
• Third-party applications such as ProClarity
• Custom-built applications

Some OLAP developers find debugging cube design and validating data using pivot tables much easier than
performing the same tasks using the native Analysis Services screen.

To sum up, here's the process in a nutshell.

• Determine the required dimensions and measures.
• Use Data Transformation Services to extract data from your source databases, transform the data
  as needed, and load the finished data into the cube.
• Use the BI Workbench's Analysis Services wizards to build the measures, dimensions, and schema.
• Provide cube browsers for your users so they can select and view reports. If necessary, write MDX
  queries or use automated tools, such as Excel Pivot Tables, to query the cube.

Hopefully, this primer has whetted your OLAP appetite and given you the confidence to start creating OLAP
cubes yourself. A good way to get started is to use the sample Foodmart or Adventure Works databases that
ship with SQL Server 2005.

Gail Tieh is a Project Leader with Citigate Hudson's Pervasive Business Intelligence team. Gail's expertise
includes process improvement through the use of technology. Gail holds an MBA in Information Systems and
BA in Economics from Baruch College of City University of New York. She currently serves on the Board of
New York Software Industry Association (NYSIA) and is also the Special Interest Group Leader of NYSIA's
Database Professionals Council.

DevX is a division of Jupitermedia Corporation


© Copyright 2005 Jupitermedia Corporation. All Rights Reserved. Legal Notices

http://www.devx.com/dbzone/Article/21410/1954?pf=true 29/12/2005
