SSAS ROLAP for SQL Server Data Warehouses
SQL Server Technical Case Study
Applies to: Microsoft SQL Server 2008 and SQL Server 2008 R2
Summary: This technical case study describes how the SQL Server Customer Advisory Team
(SQLCAT), in collaboration with SQL Server developers, tested and optimized a large Online
Analytical Processing (OLAP) solution based on SQL Server 2008 Analysis Services by using
the Relational OLAP (ROLAP) storage mode. The study examines ROLAP system requirements
and usage scenarios, highlights advantages and disadvantages of ROLAP in comparison with
Multidimensional OLAP (MOLAP), and evaluates various ROLAP-related data warehouse (DW)
optimization techniques regarding their effectiveness and limitations.
This case study is for data warehouse architects, database administrators, and storage
engineers, and assumes the audience is already familiar with the concepts of large-scale data
warehouse designs for servers, storage subsystems, and databases. A high-level understanding
of Analysis Services optimization techniques for cube processing and query performance is also
helpful. Detailed information is available in the SQL Server 2008 Analysis Services Performance
Guide at http://go.microsoft.com/fwlink/?LinkId=165486.
Note: For security reasons, the actual names of servers, cubes, tables, views, columns, and
other resources have been replaced with fictitious names. The sample names mentioned in this
paper do not represent real resource names and are for illustration purposes only.
Copyright
The information contained in this document represents the current view of Microsoft Corporation
on the issues discussed as of the date of publication. Because Microsoft must respond to
changing market conditions, it should not be interpreted to be a commitment on the part of
Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the
date of publication.
This case study is for informational purposes only. MICROSOFT MAKES NO WARRANTIES,
EXPRESS, IMPLIED, OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT.
Complying with all applicable copyright laws is the responsibility of the user. Without limiting the
rights under copyright, no part of this document may be reproduced, stored in, or introduced into
a retrieval system, or transmitted in any form or by any means (electronic, mechanical,
photocopying, recording, or otherwise), or for any purpose, without the express written
permission of Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual
property rights covering subject matter in this document. Except as expressly provided in any
written license agreement from Microsoft, the furnishing of this document does not give you any
license to these patents, trademarks, copyrights, or other intellectual property.
Unless otherwise noted, the example companies, organizations, products, domain names, e-
mail addresses, logos, people, places, and events depicted herein are fictitious, and no
association with any real company, organization, product, domain name, e-mail address, logo,
person, place, or event is intended or should be inferred.
Microsoft and SQL Server are trademarks of the Microsoft group of companies.
Contents
Introduction
Scoping the Project and Building the Test Lab
Performing Initial Data Warehouse Optimization
Deploying the Lab Cube and Defining the Test Cases
  Lab Cube Deployment
  Cube Test Scenarios
Selecting Test Tools and Methods
Establishing a Performance Baseline
Implementing Data Warehouse Aggregations
  Cube-Based ROLAP Aggregations
  Transparent ROLAP Aggregations
  Choosing an Aggregation Strategy
  Creating Transparent Aggregations
  Table Binding versus Query Binding
  Reorganizing the Lab Data Warehouse
    Replacing Table Partitions with Individual Base Tables
    Loading the Facts Data
    Realigning Cube Partitions
  Reviewing Measure Groups and Measures
    Zooming in on Measures
    Row Counting and Query Binding
Benchmarking the ROLAP Cube
  TopCount Customer Queries
  Additional SQLCAT MDX Queries
Understanding Query Concurrency and Processor Utilization
Usage-Based Optimization of Long-Running Queries
Dealing with Skewed Distribution of Data
Lessons Learned and Best Practices
Conclusion
Introduction
Data warehouse architects and storage engineers face exciting and challenging times as
emerging hardware and software make it possible to build large-scale data warehouses with
hundreds of terabytes of data at affordable prices. AMD Opteron and Intel Xeon Nehalem-EX
servers with up to 256 logical processors and hundreds of gigabytes of memory offer enormous
potential to achieve high performance and scalability. Such features let organizations keep
increasing amounts of analytical data online, enabling users to readily analyze massive
datasets, drill down into intricate details, and uncover actionable insights with a microscopic
level of granularity. The challenge is to stay on top of this mountain of data, which keeps
accumulating at a rapid pace in the enterprise.
SQL Server Analysis Services is a crucial technology for most Microsoft BI solutions, yet
building Analysis Services cubes that keep up with growing data sizes is difficult. Gigantic
MOLAP cubes with hundreds of terabytes of data are out of the question. Even the increased
cube processing performance available with the latest versions of Analysis Services can't help
manage this enormous volume. Full cube processing would take weeks, and there are query-
related performance issues to consider as well. A MOLAP cube of 5 terabytes is stretching the
limit. A more reasonable maximum ranges between 1.5 and 3 terabytes. The adCenter cubes,
documented in the white paper Accelerating Microsoft adCenter with Microsoft SQL Server
2008 Analysis Services (http://go.microsoft.com/fwlink/?LinkId=183798), can serve as a
reference.
With MOLAP (at this time) impractical for data volumes larger than 5 terabytes, ROLAP
becomes an interesting option. ROLAP lets data and aggregates remain in the relational data
warehouse. This is attractive because the SQL Server 2008 relational engine can be amazingly
fast on top-end server models with storage subsystems designed for fast serial input/output
(I/O), which means that DW architects can take full advantage of the latest hardware and
software trends. Microsoft SQL Server Fast Track Data Warehouse
(http://www.microsoft.com/sqlserver/2008/en/us/fasttrack.aspx) even supports the deployment
of such systems with reference architectures for HP, Dell, Bull, EMC, and IBM. In contrast to the
relational engine, Analysis Services is currently limited to 64 logical processors. Plus, a ROLAP
cube can, in some cases, fan out SQL queries across multiple database servers and combine
the results for analysis. ROLAP can also provide low-latency data delivery for real-time
solutions. In short, there are many good reasons to evaluate ROLAP.
Of course, there are also complicating factors, which make a careful evaluation of this storage
mode all the more important. In comparison with MOLAP, ROLAP handles attribute relationships less
efficiently, requires more physical scans to get the data, and does not benefit from Analysis
Services data compression. Hardware requirements are therefore substantially higher. ROLAP
is also more susceptible to data quality issues, which MOLAP can automatically detect during
cube processing. And there is a significant disadvantage relating to aggregation design and
management. DW architects have to build ROLAP aggregates directly into the relational data
warehouse. The relational engine of SQL Server includes all necessary capabilities to support
aggregations seamlessly, yet optimizing a ROLAP cube for query performance and scalability
remains complex. Repeated testing, tweaking, and aggregation building are essential.
Scoping the Project and Building the Test Lab
For all of these reasons, SQLCAT decided to investigate typical ROLAP design and optimization
tasks in a realistic lab setting representative of an enterprise environment. A large Microsoft
customer agreed to help by providing SQLCAT with a copy of a 1.35 terabyte production data
warehouse. During the project, SQLCAT converted an existing Analysis Services cube from
MOLAP to ROLAP, compared the ROLAP performance with the MOLAP baseline, analyzed
performance issues, optimized the ROLAP design to increase performance, and engaged SQL
Server developers to determine best solutions and workarounds for difficult issues. The result of
this work is the collection of ROLAP optimization techniques, lessons learned, and best
practices summarized in this paper.
On the one hand, SQLCAT wanted to test with the largest possible cubes, multiple servers, and
high query concurrency; on the other hand, it wanted to provide actionable information and
guidelines without getting lost in implementation complexity or troubleshooting minutiae. For this reason,
SQLCAT decided to rule out scalability testing and constrained the project scope by starting with
a single cube of low complexity. This approach enabled SQLCAT to tackle typical ROLAP issues
efficiently and to quickly develop effective optimization techniques. The ultimate goal was to
benchmark the original query performance of a given MOLAP cube and then try to achieve
similar or better results with ROLAP.
Figure 1 illustrates the lab environment that SQLCAT deployed according to the overall project
scope. This lab was based on a single quad-socket server with four dual-core processors and
64 GB of memory. With a total of eight processor cores and 64 GB of memory, the lab system
was able to sustain the ROLAP processing demand, which is generally higher in comparison to
MOLAP, as discussed in the section Understanding Query Concurrency and Processor
Utilization later in this case study.
Figure 1: Server Configuration for ROLAP Performance Tests and Benchmarking
Particularly noteworthy is the lab server's storage design. Because I/O performance greatly
influences query performance, SQLCAT optimized the storage subsystem by using three
separate SANs. In particular, SQLCAT connected the logical unit number (LUN) drives of each
SAN to different ports so that two SAN ports handled the workload for every pair of LUNs (that
is, O/P, S/T, and U/V). This configuration doubled the maximum I/O throughput and ensured that
the storage subsystem did not create a bottleneck during performance testing. With 64 KB block
sizes (except for the OLAP LUNs, which were set to 32 KB block sizes) and every port
delivering 600-900 MB per second, SQLCAT was able to achieve an I/O throughput of between
1.2 and 1.9 GB per second with minimal latencies in this configuration.
SQLCAT implemented the following storage design for the lab server:
- Fact tables for the ROLAP cube: Stored in a file group that was evenly split across the LUNs S, T, U, and V, formatted with a 64 KB cluster size according to SQL Server best practices (see the T-SQL sketch after this list).
- TempDB: Placed on its own two separate LUNs, O and P, so that TempDB usage did not generate write requests to the data LUNs during querying.
- SQL Server log folder: Placed on LUN H to ensure that the sequential writing of the transaction log would occur on a separate, dedicated drive.
- Windows page file: Placed on LUN M to isolate most of the paging activity on a separate, dedicated drive. Only a small page file of 800 MB remained on the C drive. Using multiple page files placed on separate physical disks is a best practice to increase system performance, and by leaving a small page file on the system drive, SQLCAT ensured that critical operating system processes, such as the Local Security Authority Service (Lsass.exe), are able to access a page file without requiring a SAN environment.
- OLAP folder: Placed on LUN N, which was optimized for random I/O requests according to the typical I/O pattern of MOLAP measures and was formatted with a 32 KB cluster size. SQLCAT performance tests show that a 32 KB cluster size can benefit Analysis Services performance more than a 64 KB cluster size.
- OLAP log folder: Placed on LUN H, which also hosted the SQL log folder. Storing the OLTP and OLAP logs on the same LUN did not impact test results because SQLCAT performed the MOLAP and ROLAP tests in separate cycles.
- OLAP temp folder: Placed on LUN O, which also hosted the TempDB database. Again, it was acceptable to share a LUN between the OLAP temp folder and TempDB because SQLCAT performed MOLAP and ROLAP tests separately and there was very little disk activity.
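As a sketch of how this layout can be expressed in T-SQL, the following statements create a filegroup with one equally sized data file per LUN so that SQL Server stripes the fact data evenly across the four drives (the database name, file names, and sizes are illustrative):

ALTER DATABASE LabDW ADD FILEGROUP FactData;
GO
-- One data file per LUN; equal sizes let proportional fill spread
-- the fact data evenly across the S, T, U, and V drives.
ALTER DATABASE LabDW ADD FILE
    (NAME = FactData_S, FILENAME = 'S:\Data\FactData_S.ndf', SIZE = 200GB),
    (NAME = FactData_T, FILENAME = 'T:\Data\FactData_T.ndf', SIZE = 200GB),
    (NAME = FactData_U, FILENAME = 'U:\Data\FactData_U.ndf', SIZE = 200GB),
    (NAME = FactData_V, FILENAME = 'V:\Data\FactData_V.ndf', SIZE = 200GB)
TO FILEGROUP FactData;
GO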
Note: SQLCAT hosted all resources on a single server to run the MOLAP and ROLAP
performance tests with exactly the same system configuration. However, the single-server
design depicted in Figure 1 is not a recommended configuration for production environments.
For large-scale production designs, refer to the SQL Server Best Practices article Scale-Out
Querying with Analysis Services, available on Microsoft TechNet at
http://go.microsoft.com/fwlink/?LinkId=193053, as well as the white paper Accelerating
Microsoft adCenter with Microsoft SQL Server 2008 Analysis Services at
http://go.microsoft.com/fwlink/?LinkId=183798.
Performing Initial Data Warehouse Optimization
Figure 2: Cube Design in Relationship to the Source Facts Table
Most importantly, SQLCAT streamlined the source facts table in the lab data warehouse to
reduce the size of the data rows. Smaller data rows lead to smaller indexes and lower I/O
overhead to retrieve the data, which ultimately benefits query performance. After all, ROLAP
performance depends on the capabilities of the relational data warehouse to return results
quickly.
SQLCAT carried out the following steps to optimize the source facts table:
- Removal of unnecessary columns: The original source facts table contained more columns than the lab cube needed. The columns in red in Figure 2 were not necessary. By removing these columns, SQLCAT was able to reduce the size of the facts table by more than 50 percent.
- Conversion of data types to save space: By converting the numeric columns in the original facts table to the bigint data type, SQLCAT was able to save 11 bytes per value. This optimization also potentially improved the performance of Multidimensional Expressions (MDX). For example, the DistinctCount function is optimized for integer values. (A sketch of this kind of conversion follows this list.)
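The following T-SQL sketches the kind of streamlining involved; the column names are illustrative, and the actual lab schema differs:

-- Remove columns that the cube does not need; smaller rows mean
-- smaller indexes and less I/O per scan.
ALTER TABLE dbo.FactLabData DROP COLUMN Comment_Text, Legacy_Flag;
GO
-- Replace a wide numeric type with bigint to save space per value
-- and to benefit integer-optimized MDX functions such as DistinctCount.
ALTER TABLE dbo.FactLabData ALTER COLUMN [Transmitted MB] bigint NOT NULL;
GO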
Table 1 compares the original and optimized lab data warehouse versions.
Note: As a best practice to achieve high ROLAP performance, SQLCAT recommends avoiding
complex data models and simplifying the relational data warehouse as much as possible, such
as by removing unnecessary columns and, if possible, by replacing wide data types with
narrower alternatives to increase storage efficiency. SQLCAT also recommends using a
straightforward star schema, avoiding snowflake designs and many-to-many attribute
relationships in the dimensions.
Deploying the Lab Cube and Defining the Test Cases
Lab Cube Deployment
The only extra optimization that SQLCAT applied to the MOLAP cube concerned the OLAP
drive N, which SQLCAT formatted with a 32 KB cluster size for best performance, as mentioned
earlier.
Cube Test Scenarios
For the test scenarios, SQLCAT planned to use a set of original customer queries. These
queries promised realistic test runs, but a quick inspection revealed that the queries all used
similar MDX expressions. Essentially, all of the queries the customer provided relied on the
TopCount() function and analyzed three days or seven days of data.
The TopCount() function can help to highlight storage mode differences because it requires a
full scan of the data across all relevant cube partitions, which also implies a full scan of the
underlying facts tables in ROLAP mode. Still, SQLCAT deemed it necessary to create additional
sets of three-day and seven-day queries, as well as advanced queries that had more variety
and did not rely on the TopCount() function. This expanded portfolio of test cases enabled
SQLCAT to simulate standard queries as well as complex ad-hoc queries that individual
analysts might generate in a production environment.
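In ROLAP mode, a TopCount() query of this kind ultimately translates into relational work of roughly the following shape. This T-SQL sketch only illustrates the scan-and-aggregate pattern involved, not the exact SQL that Analysis Services generates (table and column names are illustrative):

-- Top 10 customers by transmitted volume over a three-day range.
SELECT TOP (10)
    f.Customer_ID,
    SUM(f.[Transmitted MB]) AS [Transmitted MB]
FROM dbo.FactLabDataOptimized AS f
WHERE f.Date_ID BETWEEN 20100401 AND 20100403
GROUP BY f.Customer_ID
ORDER BY SUM(f.[Transmitted MB]) DESC;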
Table 2 provides an overview of the test scenarios and a breakdown of the corresponding MDX
queries that SQLCAT decided to use for the ROLAP investigation.
Selecting Test Tools and Methods
The standard tools to measure and benchmark Analysis Services performance are SQL Server
Profiler and Performance Monitor. Another useful tool, and a SQLCAT favorite, is the Analysis
Services command-line tool (ascmd.exe). ASCMD can run XML for Analysis (XMLA) scripts,
MDX queries, and Data Mining Extensions (DMX) statements while tracking results and trace
information in a log file.
Note: The ASCMD source code is available as part of the SQL Server 2008 Samples at
http://msftasprodsamples.codeplex.com. With SQL Server 2008 Samples installed, the path to
the source code is %ProgramFiles%\Microsoft SQL Server\100\Samples\Analysis
Services\Administrator\ASCMD. The source code can be compiled using Microsoft Visual Studio
2008, or Microsoft Visual C# 2008 Express Edition available as a free download at
http://go.microsoft.com/fwlink/?LinkId=184935.
Figure 3 shows how SQLCAT automated the test runs by using a batch file to start the ASCMD
command-line tool for each individual MDX query in a FOR...DO loop.
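A minimal sketch of such a batch file, assuming the MDX queries reside in individual .mdx files (server, database, and file names are illustrative):

@echo off
rem Run every MDX query on a cold cache and capture results and trace output.
for %%Q in (Queries\*.mdx) do (
    ascmd -S OLAPSERVER -d SQLCAT -i ClearCache.xmla
    ascmd -S OLAPSERVER -d SQLCAT -i "%%Q" -o "Results\%%~nQ.xml" -T "Results\%%~nQ.csv"
)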
For the MOLAP and ROLAP performance tests, SQLCAT used the following tools and methods:
- ASCMD command-line tool: SQLCAT fully automated the test runs for all MDX queries by using a batch file (see Figure 3). In each loop, SQLCAT first cleared the Analysis Services data cache by running the ClearCache.xmla script as documented in the ASCMD readme file, and then executed the currently selected MDX query, recording the start time, end time, and query duration in milliseconds.
- SQL Server Profiler: During the early stages of the ROLAP tests, SQLCAT double-checked the accuracy of the ASCMD results with Profiler traces. A Profiler trace can provide detailed information about where Analysis Services spends its time processing an MDX query, starting with the command parser and ending with the formula engine finishing query execution. SQLCAT also used Profiler traces to examine the SQL queries that the lab ROLAP cube generated.
- Performance Monitor: SQLCAT used Performance Monitor to keep an eye on CPU load, memory utilization, and I/O throughput to ensure that no unexpected factor or bottleneck distorted the query performance measurements.
Note: All performance results mentioned in this case study are based on a cold Analysis
Services cache. By clearing the data cache prior to running each MDX query, SQLCAT ensured
accurate tests. A warm cache would boost query performance but would also distort test results,
as Analysis Services could then answer queries directly from memory without going to disk or
querying the relational data warehouse.
Note: The figures in this paper use a logarithmic scale to help emphasize the difference
between MOLAP and ROLAP on a per-query basis. A linear scale would emphasize the
difference between short- and long-running queries in each storage mode, but would make it
very difficult to compare MOLAP and ROLAP queries of less than 1 second, as well as those
that take more than 20 minutes.
Establishing a Performance Baseline
Figure 4: MOLAP Performance Baseline on a Logarithmic Scale
As Figure 4 illustrates, the majority of the MDX queries finished in less than one minute, the
SQLCAT queries O\01, A\02, A\03, and A\04 required more than five minutes, and query A\03
took the longest with 20 minutes and 11 seconds. A\03 required this time to calculate the
average of the cross product of two large datasets over a particular measure for 30 days of data
(25 billion rows of facts).
Implementing Data Warehouse Aggregations
Cube-Based ROLAP Aggregations
In MOLAP and HOLAP modes, aggregations reside in the multidimensional database
maintained by Analysis Services. In ROLAP mode, on the other hand, aggregations are based
on indexed views in the relational data warehouse. Analysis Services supports ROLAP
aggregations insofar as it can create the indexed views to contain aggregations, and then uses
these indexed views instead of the source tables where possible to answer queries (see Figure
5). However, Analysis Services is not involved in materializing these views or storing their result
sets. Analysis Services delegates these tasks to the SQL Server relational engine by submitting
CREATE VIEW and CREATE UNIQUE CLUSTERED INDEX commands to the data warehouse
during cube partition processing. The relational engine then creates and maintains the indexed
views in the database in much the same way as tables with a clustered index.
To highlight how Analysis Services benefits from indexed views created by means of a cube-
based aggregation design, Figure 5 describes the following scenarios:
- SQL query without aggregations in place (red arrow, left side of figure): The SQL query goes with the lowest possible grain directly against the source facts table. Although SQL Server can rapidly scan large data sets on a fast storage subsystem, by aggressively performing synchronous and asynchronous I/O operations and using 512 KB block sizes, for example, performance suffers due to the time spent scanning across unnecessarily detailed data in the database (25 billion rows of facts in the SQLCAT lab environment).
- SQL query with aggregations in place (purple arrow, right side of figure): Analysis Services can generate a SQL query against the indexed view, and the relational engine can process the query much faster because the indexed view has already summarized the data and persisted the result set.
Figure 5: Analysis Services Querying with and without Cube-Based ROLAP Aggregations
Transparent ROLAP Aggregations
The key to leveraging transparent aggregations is the relational engine's ability to choose an
optimal execution plan for parsed SQL queries. If the relational engine's query optimizer finds a
match between the query elements (such as columns, search conditions, joins, and aggregate
functions) and the elements of an indexed view, it considers the view index as a possible option.
Query optimizer then estimates the cost of each possible option, chooses the query plan with
the lowest cost, and keeps track of the choice for subsequent queries. As a result, if an index of
a view has the lowest cost of any considered access option, query optimizer chooses this view
index even though the original SQL query references a table in the FROM clause.
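For example, a query of the following shape can be satisfied from a matching indexed view even though it references only the base table (names are illustrative). Note that automatic view matching requires SQL Server Enterprise Edition; other editions must reference the view directly with a NOEXPAND hint.

-- The optimizer can answer this base-table query from an indexed view
-- that already groups the facts by SW_ID and date attributes.
SELECT f.SW_ID, d.Year_ID, SUM(f.[Transmitted MB]) AS [Transmitted MB]
FROM dbo.FactLabDataOptimized AS f
INNER JOIN dbo.Dim_Date AS d ON f.Date_ID = d.Date_ID
GROUP BY f.SW_ID, d.Year_ID;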
The execution plan in Figure 6 shows query optimizer at work. The SQL query references the
FactLabDataOptimized table, but the relational engine transparently chooses the clustered
index on the Agg_Fact Lab Data_SW_ID view to answer the query. Microsoft introduced this
query optimization feature based on indexed views 10 years ago with SQL Server 2000 to
increase the performance of database applications without requiring T-SQL code modifications,
and has continued to improve the view-matching capabilities with every subsequent SQL Server
release.
Note: Query optimizer does not differentiate between client applications and is not aware of
Analysis Services, and Analysis Services is not aware of query optimizer.
Figure 6: Query Execution Plan Selecting a Transparent Aggregation
Choosing an Aggregation Strategy
Figure 7: Cube-Based and Transparent ROLAP Aggregations
Creating Transparent Aggregations
The dimension key was the only relevant variable, and with seven different dimension keys in
the source facts table SQLCAT was able to cover the most basic test cases with seven indexed
views that each joined the facts data to the date dimension on the date key and then
aggregated the values by an additional dimension key and by date dimension attributes. The
clustered indexes then physically ordered the results by dimension key, year, quarter, month,
and day so that the aggregated values for consecutive days were located in close physical
proximity in the database.
The following T-SQL listing outlines representative CREATE VIEW and CREATE UNIQUE
CLUSTERED INDEX statements with placeholders for dimension-specific information (for more
information about creating indexed views, refer to the topic Creating Indexed Views in SQL
Server 2008 R2 Books Online at http://msdn.microsoft.com/en-us/library/ms191432.aspx):
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
GO
-- Representative definition; SQLCAT created one such view per dimension key.
-- <Dimension> is a placeholder for the respective dimension, and table and
-- column names are illustrative.
CREATE VIEW dbo.[Agg_Fact Lab Data_<Dimension>_ID] WITH SCHEMABINDING
AS
SELECT
    f.<Dimension>_ID,
    d.Year_ID,
    d.Quarter_ID,
    d.Month_ID,
    d.Date_ID,
    SUM(f.[Transmitted MB]) AS [Transmitted MB],
    SUM(f.[Gathered MB]) AS [Gathered MB],
    SUM(f.Duration) AS Duration,
    SUM(f.[Incident Count]) AS [Incident Count],
    COUNT_BIG(*) AS [Row Count]
FROM dbo.FactLabDataOptimized AS f
INNER JOIN dbo.Dim_Date AS d ON f.Date_ID = d.Date_ID
GROUP BY f.<Dimension>_ID, d.Year_ID, d.Quarter_ID, d.Month_ID, d.Date_ID
GO
CREATE UNIQUE CLUSTERED INDEX [CIX_Agg_Fact Lab Data_<Dimension>_ID]
ON dbo.[Agg_Fact Lab Data_<Dimension>_ID]
    (<Dimension>_ID, Year_ID, Quarter_ID, Month_ID, Date_ID)
GO
Table Binding versus Query Binding
Much to SQLCAT's dismay, creating indexed views on top of a facts table with 25 billion rows
took far longer than expected. After more than 24 hours, SQLCAT cancelled this process as
soon as the relational engine completed the first view and index. This first indexed view now
had to suffice for a small ROLAP test with a subset of MDX queries. However, not a single query
showed performance improvements. It turned out that the facts table design and the
corresponding cube partition design did not permit the relational engine to utilize indexed views
for SQL queries originating from Analysis Services. Figure 8 shows the relevant design aspects.
According to a best practice in large-scale data warehousing, the design of the facts table relied
on partitions organized by calendar day to facilitate data pruning along a sliding time window. In
this particular case, the partition function used the Date_ID in the format yyyyMMdd. Every day,
the data warehouse would drop one expired table partition and add a new partition for the
current day. An XMLA script would then drop the expired cube partition as well and create a new
cube partition with a source query that referenced the new table partition by specifying the
current day's Date_ID in the WHERE clause. This design works well for MOLAP cubes, but it
creates issues for ROLAP.
The primary problem is that Analysis Services must use query binding to reference individual
facts table partitions, so the SQL queries do not reference the underlying source facts table
directly. Instead, Analysis Services uses the source query of the cube partition in the FROM
clause to access the relevant facts data, and because the FROM clause now references a
subquery instead of the base table, query optimizer can't choose a view index to optimize the
execution plan. Query optimizer only considers view indexes if the SQL query references a base
table or if the FROM clause specifies an indexed view directly. Therefore, ROLAP cube
partitions must use table binding to benefit from transparent aggregations in the data
warehouse.
Figure 9 confirms this requirement. As the query execution plans indicate, the first query uses
table binding and leverages the indexed view, while the second query uses query binding and
performs all joining, sorting, and aggregating dynamically. Both queries return the same results,
but the second is much less efficient because it does not use the aggregations that already exist
in the data warehouse.
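The two query shapes can be sketched as follows (names are illustrative); only the first form is eligible for view matching:

-- Table binding: the FROM clause references the base table, so the
-- optimizer can substitute a matching indexed view.
SELECT SW_ID, SUM([Transmitted MB]) AS [Transmitted MB]
FROM dbo.FactLabDataOptimized_20100401
GROUP BY SW_ID;

-- Query binding: the FROM clause references a subquery, which rules out
-- view matching, so all joining, sorting, and aggregating happens at run time.
SELECT SW_ID, SUM([Transmitted MB]) AS [Transmitted MB]
FROM (SELECT SW_ID, [Transmitted MB]
      FROM dbo.FactLabDataOptimized
      WHERE Date_ID = 20100401) AS p
GROUP BY SW_ID;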
Changing any aspect of the table or aggregation design throughout the cube's lifecycle leads to lengthy
processing time. For example, switch operations on partitioned tables require dropping and
recreating any associated indexed views.
Reorganizing the Lab Data Warehouse
Taking these dependencies into account, SQLCAT concluded that the lab data warehouse
required a substantial overhaul. It was necessary to replace the facts table partitions for
individual days of data with separate daily tables, load the facts data from the original source
table into the daily tables, create separate indexed views for each daily table, and then
reconfigure the cube partitions to access the daily tables instead of the original source table.
Replacing Table Partitions with Individual Base Tables
Note: Partitioning a large database by using separate daily tables used to be a best practice
for Microsoft SQL Server 2000 and earlier versions, but it is not a best practice for SQL Server
2005 Enterprise Edition and higher because these editions support table partitions based on a
partition function. SQLCAT reverted to the earlier method to achieve optimal ROLAP query
performance, but does not recommend this approach for other partitioning scenarios because of
the incurred complexity and administrative overhead.
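The daily base tables can be provisioned with statements along the following lines; this is a sketch with an abbreviated column list and illustrative names, whereas the real tables carried the full column set shown in Figure 2:

SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
GO
-- One base table per calendar day, named by Date_ID (yyyyMMdd).
CREATE TABLE dbo.FactLabDataOptimized_20100401
(
    Date_ID          int    NOT NULL,
    SW_ID            bigint NOT NULL,
    Customer_ID      bigint NOT NULL,
    [Transmitted MB] bigint NOT NULL,
    [Gathered MB]    bigint NOT NULL,
    Duration         bigint NOT NULL,
    [Incident Count] bigint NOT NULL,
    CONSTRAINT FK_FactLabDataOptimized_20100401_Date
        FOREIGN KEY (Date_ID) REFERENCES dbo.Dim_Date (Date_ID)
);
GO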
Note: An alternative approach is to switch out partitions to new tables. However, switch
operations on partitioned tables require dropping and recreating any associated indexed views,
which renders this approach impractical for large-scale data warehouses with ROLAP
aggregations based on indexed views.
Loading the Facts Data
Figure 10 displays the resulting distribution of facts data across the new collection of daily tables
in the lab data warehouse.
Figure 10: Bulk Loading Daily Facts using SSIS
Although the SQL Server Destination object is less flexible than other Extract, Transform, and
Load (ETL) objects, it lets you customize bulk-load operations for high performance.
Specifically, SQLCAT disabled the option to check constraints and ran five packages in parallel,
achieving a throughput of 32 MB per second. By increasing the network packet size to 32 KB,
SQLCAT increased the bulk-load performance even further to 42 MB per second. For detailed
information about SSIS-based high-performance bulk loading, refer to We Loaded 1TB in 30
Minutes with SSIS, and So Can You, available on Microsoft MSDN at
http://go.microsoft.com/fwlink/?LinkId=193054. See also Top 10 SQL Server Integration
Services Best Practices on the SQLCAT Web site at http://go.microsoft.com/fwlink/?
LinkId=147462.
Note: After loading the facts data, SQLCAT created the full set of indexed views for each daily
table with the options SORT_IN_TEMPDB = ON, ALLOW_PAGE_LOCKS = OFF,
ALLOW_ROW_LOCKS = OFF, and FILLFACTOR = 100. For more information about how
SQLCAT created indexed views for ROLAP aggregations, see the section Creating Transparent
Aggregations earlier in this case study.
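Applied to one of the daily indexed views, the index creation statement takes roughly this form (names are illustrative):

CREATE UNIQUE CLUSTERED INDEX [CIX_Agg_Fact Lab Data_20100401_SW_ID]
ON dbo.[Agg_Fact Lab Data_20100401_SW_ID]
    (SW_ID, Year_ID, Quarter_ID, Month_ID, Date_ID)
WITH (SORT_IN_TEMPDB = ON,
      ALLOW_PAGE_LOCKS = OFF,
      ALLOW_ROW_LOCKS = OFF,
      FILLFACTOR = 100);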
Realigning Cube Partitions
To bring the ROLAP cube back into alignment with the lab data warehouse, SQLCAT performed
the following steps:
1. Adding all daily facts tables to the DSV: By using the Add/Remove Tables wizard in BIDS, SQLCAT added all 31 daily facts tables to the Analysis Services database in a single step. The Add/Remove Tables wizard established the relationships between facts and dimension tables automatically because the daily tables included all necessary foreign key definitions. It is worth pointing out that SQLCAT did not remove the original facts table from the DSV because it served as the source table for the measures in the Fact Lab Data Optimized measure group.
2. Removing all existing cube partitions: By deleting the existing cube partitions associated with the original table partitions, SQLCAT excluded the original facts table from further querying and processing in the Fact Lab Data Optimized measure group.
3. Adding daily cube partitions: By using the following XMLA script, SQLCAT added all 31 daily partitions to the ROLAP cube with consistent configuration settings. Particularly important for query performance is the Slice property, which Analysis Services uses during query execution to determine the partitions that contain relevant data. SQLCAT configured each cube partition's Slice property according to each base table's calendar day (see also Figure 11).
<Create
xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
<ParentObject>
<DatabaseID>SQLCAT</DatabaseID>
<CubeID>SQLCAT</CubeID>
<MeasureGroupID>Fact Lab Data Optimized</MeasureGroupID>
</ParentObject>
<ObjectDefinition>
<Partition xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Name>Fact Lab Data Optimized_<yyyyMMdd></Name>
<ID>Fact Lab Data Optimized_<yyyyMMdd></ID>
<Source xsi:type="DsvTableBinding">
<DataSourceViewID>SQLCAT</DataSourceViewID>
<TableID>dbo_FactLabDataOptimized_<yyyyMMdd></TableID>
</Source>
<StorageMode>Rolap</StorageMode>
<ProcessingMode>Regular</ProcessingMode>
<Slice>
[Dim Date].&[<yyyy>]&[<qq>]&[<MM>]&[<d>]
</Slice>
<ProactiveCaching>
<SilenceInterval>-PT1S</SilenceInterval>
<Latency>P0D</Latency>
<SilenceOverrideInterval>-PT1S</SilenceOverrideInterval>
<ForceRebuildInterval>-PT1S</ForceRebuildInterval>
<Enabled>false</Enabled>
<AggregationStorage>MolapOnly</AggregationStorage>
<OnlineMode>Immediate</OnlineMode>
<Source xsi:type="ProactiveCachingInheritedBinding">
<NotificationTechnique>Server</NotificationTechnique>
</Source>
</ProactiveCaching>
</Partition>
</ObjectDefinition>
</Create>
4. Processing the measure group: To process all daily cube partitions and any unprocessed dimensions for the Fact Lab Data Optimized measure group in a single step, SQLCAT reprocessed the measure group programmatically by using the following XMLA script. This script enabled SQLCAT to process the data quickly through the ASCMD command-line tool.
<Process
xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
<Object>
<DatabaseID>SQLCAT</DatabaseID>
<CubeID>SQLCAT</CubeID>
<MeasureGroupID>Fact Lab Data Optimized</MeasureGroupID>
<PartitionID>$(Partition)</PartitionID>
</Object>
<Type>ProcessFull</Type>
<WriteBackTableCreation>UseExisting</WriteBackTableCreation>
</Process>
Note: The ASCMD command-line tool supports a -v command parameter for different
variables. In this case, the different variables corresponded to the different partitions (i.e.,
different dates) within the cube.
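For example, an invocation along these lines processes one partition per run (server and file names are illustrative):

ascmd -S OLAPSERVER -d SQLCAT -i ProcessPartition.xmla -v Partition="Fact Lab Data Optimized_20100401"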
Reviewing Measure Groups and Measures
Zooming in on Measures
Figure 12 shows the hierarchy of the most important Analysis Services objects that influence
how a ROLAP cube communicates with an underlying data warehouse.
Figure 12: Analysis Services Object Hierarchy
Based on this hierarchy and the following considerations, SQLCAT was able to narrow down the
root cause and focus attention on the measure group:
- Data source: The Data Source object of an Analysis Services database specifies the connection string, database isolation level, and maximum number of concurrent connections, none of which have a bearing on the binding type.
- Data source view: The DSV on top of the Data Source included no views or named queries and referenced only base tables, so the DSV was not the reason.
- Cube partitions: The cube partitions also showed the correct configuration. The XMLA script to create the partitions set the Type parameter of the Source property correctly to DsvTableBinding (see the XMLA listing under Realigning Cube Partitions).
- Cube-based aggregations: The cube partitions were not associated with an aggregation design, so cube-based aggregations did not yet exist at this stage.
- Measure groups: The next logical place to look was the Fact Lab Data Optimized measure group with its measures Transmitted MB, Gathered MB, Duration, Incident Count, and Fact Lab Data Optimized Count. The first four measures relied on the SUM aggregation function to calculate summary data over their corresponding source columns (that is, Transmitted MB, Gathered MB, Duration, and Incident Count in the daily facts tables), but the fifth measure was different. It used the Count of Rows function, which calculates the row count over the entire source table, not just a single column. This Count of Rows measure was the likely cause of the issue.
Row Counting and Query Binding
The Fact Lab Data Optimized Count measure was also different in the sense that it did not exist
in the original MOLAP cube. When SQLCAT created the ROLAP cube, the Cube Wizard in
BIDS automatically added this measure to the measure group. As soon as SQLCAT deleted this
measure and processed the cube, Analysis Services switched to table binding in the SQL
queries, as SQLCAT verified through Profiler traces.
Figure 13: Calculating the Row Count with the Help of a Subquery
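The SQL shape in question resembled the following sketch (names are illustrative): because the row count spans the entire source table, Analysis Services wrapped the source in a subquery, which in turn ruled out view matching for the whole statement.

SELECT COUNT_BIG(*) AS [Fact Lab Data Optimized Count],
       SUM([Transmitted MB]) AS [Transmitted MB]
FROM (SELECT [Transmitted MB]
      FROM dbo.FactLabDataOptimized_20100401) AS t;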
Benchmarking the ROLAP Cube
At last, SQLCAT's redesign and optimization efforts paid off. The ROLAP cube was finally ready
for performance testing, and thanks to the amazingly fast performance of the relational SQL
Server engine on top of a super-fast storage subsystem, the results looked better than
expected. To everybody's surprise, the ROLAP cube outpaced the MOLAP cube in 45 percent
of all queries right from the start (see Figure 14). Only 39 percent of the queries showed
substantially slower response times in ROLAP mode (more than twice the amount of MOLAP
time) and 16 percent showed moderate performance degradation (less than twice the amount of
MOLAP time). Admittedly, the lab cube featured a low number of attributes, low cardinality, and
the date dimension was relatively simple, but the performance results proved that ROLAP can
be a viable option for simple cubes with large data volumes but low dimensional complexity.
TopCount Customer Queries
The ROLAP performance also benefited from the daily cube partition design. As explained
earlier, the ROLAP cube included 31 partitions that mapped to 31 daily facts tables, each with
its own set of indexed views that satisfied the TopCount() queries. In this configuration, when
executing an MDX query for a given range of days, Analysis Services first checks the Slice
property of the cube partitions to determine the partitions that contain relevant data, then
submits SQL queries to the data warehouse for only those facts tables that correspond to the
relevant cube partitions. If the relational engine can process these SQL queries in parallel and
fully leverage multiple indexed views, and if the storage subsystem is optimized for fast serial
I/O, the data warehouse can return the results to Analysis Services in the shortest possible time.
Figure 15 shows a seven-day MDX query example. The seven selected days corresponded to
seven cube partitions, each owning a Slice within the specified day range. Accordingly, Analysis
Services submitted seven SQL queries to the SQLCAT lab data warehouse.
Additional SQLCAT MDX Queries
For the most part, the additional queries that SQLCAT defined for a greater variety of test cases
also performed well in ROLAP mode, although query performance suffered noticeably over
larger data sets. As Figure 16 reveals, query A\03 was the worst performer. It calculated
average values for 30 days of data, and ROLAP required no less than 2 hours, 34 minutes,
and 57 seconds to deliver the results. The next section, Understanding Query Concurrency and
Processor Utilization, deals with query A\03 in more detail.
Understanding Query Concurrency and Processor Utilization
By studying Performance Monitor traces, SQLCAT noticed the following query behavior:
- Seven-day MDX query A\02: Generated seven SQL queries, which the relational engine was able to execute quickly because the number of queries did not exceed the number of processor cores on the server. The server had eight cores, and thanks to the high I/O performance of the storage subsystem, the relational engine was able to reach a very high level of CPU utilization while executing all seven queries in parallel.
- 30-day MDX query A\03: Generated 30 SQL queries, which far exceeded the number of available cores, thus flooding the relational engine and causing a substantial backlog of queries attempting to execute. The relational engine parallelized each of these 30 SQL queries and utilized all available cores to distribute the query threads, which caused excessive thread thrashing and I/O congestion.
To increase the performance of query A\03 and to ensure that the cores would not become a
bottleneck for a wider range of ROLAP queries, SQLCAT concluded that it would be necessary
to limit parallel query plan generation and scale out the lab environment, as illustrated in Figure
17. By restricting the maximum degree of parallelism to 1 (MAXDOP = 1), SQL queries run
single-threaded, so 30 cores would be necessary to run all SQL queries of MDX query A\03 in
parallel. For more information about the MAXDOP setting, refer to the section Max Degree of
Parallelism Option in SQL Server 2008 Books Online at http://go.microsoft.com/fwlink/?
LinkId=193055. Note that the MAXDOP setting can also be enforced with Resource Governor.
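A sketch of both options (the workload group name is illustrative, and a classifier function would still be needed to route the ROLAP sessions to the group):

-- Server-wide setting:
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 1;
RECONFIGURE;
GO
-- Scoped alternative through Resource Governor:
CREATE WORKLOAD GROUP RolapQueries WITH (MAX_DOP = 1);
GO
ALTER RESOURCE GOVERNOR RECONFIGURE;
GO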
Note: The SQL query fan-out design depicted in Figure 17 primarily benefits ROLAP cubes
with a high percentage of MDX queries spanning all cube partitions, such as MDX queries on
attributes that are not associated with partition slices. This design does not apply to MOLAP.
MOLAP has substantially lower hardware requirements. For example, MOLAP was able to
execute query A\03 much faster than ROLAP on a single server with eight cores. It didn't require
as much CPU power to parse through 30 days of data (see the results in Figure 16).
Usage-Based Optimization of Long-Running Queries
The Usage-Based Optimization Wizard can create smart cube-based aggregations based on
statistical information about MDX queries tracked in the Analysis Services query log. To
generate suitable query log entries, SQLCAT enabled the query log as discussed in the TechNet
article Configuring the Analysis Services Query Log at http://go.microsoft.com/fwlink/?
LinkId=193056, then ran the selected queries via the ASCMD command-line tool.
Figure 18 shows a breakdown of the response times for queries 3\10 and 3\11 before and after
usage-based optimization (UBO). As the results reveal, the long-running queries benefited substantially from UBO.
Figure 18: Query Times Before and After UBO on a Logarithmic Scale
Particularly in ROLAP mode, response times dropped from almost 3 minutes to roughly 300
milliseconds. Prior to UBO, the ROLAP queries lagged far behind their MOLAP counterparts.
After UBO, MOLAP and ROLAP response times were on par, only fractions of a second
apart. This proves that the application of a smart aggregation design through UBO can benefit
both MOLAP and ROLAP cubes. However, UBO should not be applied indiscriminately to all
queries to avoid inadvertent effects on storage and processing overhead, as discussed in the
Technical Note Reintroducing Usage-Based Optimization in SQL Server 2008 Analysis
Services available on the SQLCAT web site at http://go.microsoft.com/fwlink/?LinkId=191147.
Dealing with Skewed Distribution of Data
In particular, SQLCAT wanted to examine the impact of data skew on query optimizer. Query
optimizer uses a complex optimization algorithm based on table and index statistics to formulate
the query execution plan. This algorithm works well in most cases, but it is not perfect. Skewed
data can cause query optimizer to generate an inefficient query execution plan, which would
affect ROLAP performance while the effect remains virtually non-existent for MOLAP.
To perform the tests, SQLCAT ran arbitrary ad-hoc queries against the MOLAP and ROLAP
cubes. These ad-hoc queries did not benefit from aggregations or indexed views, and the
performance results validated the assumption that data skew must be better compensated for
with ROLAP than with MOLAP. The MOLAP queries performed well with an average response
time of 49 seconds while the ROLAP queries, on average, took approximately two minutes
longer (see Figure 19).
One way to compensate for data skew in a relational data warehouse is to cover the most
relevant queries with indexed views. Another is to ensure that appropriate, up-to-date column
37
statistics exist. For information about concepts and guidelines for using query optimization
statistics, refer to the topic Using Statistics to Improve Query Performance in SQL Server 2008
Books Online at http://go.microsoft.com/fwlink/?LinkId=164535.
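For example (table and database names are illustrative):

-- Refresh statistics on a daily facts table after it has been loaded.
UPDATE STATISTICS dbo.FactLabDataOptimized_20100401 WITH FULLSCAN;
GO
-- Or let the database maintain column statistics automatically.
ALTER DATABASE LabDW SET AUTO_UPDATE_STATISTICS ON;
GO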
Lessons Learned and Best Practices
The following list summarizes the ROLAP characteristics that SQLCAT observed during the lab project, together with best practices to address them:

- Impacts the size of relational databases: ROLAP places a high demand on storage capacities in the relational data warehouse for indexed views and summary tables. For example, indexing on every attribute can lead to excessive storage consumption in ROLAP mode, while MOLAP is much more efficient with its automatic two-level bitmap indexes. Best practices: Plan aggregations to avoid inefficient use of storage resources, particularly in environments with multiple cubes sharing the same data warehouse. Avoid overlapping cube-based aggregations and cubes with different levels of granularity because they require different sets of tables for each level of granularity.

- Increases design complexity and management overhead: MOLAP automatically optimizes many design aspects that require manual attention in ROLAP mode. Queries that don't hit aggregations require a very clean, normalized schema, and other data warehouse optimization techniques also come into play, such as restricting the maximum degree of parallelism to 1 and compensating for data skew. Also, because the cube can't rely on partitioned facts tables, the data warehouse requires individual daily tables and separate sets of indexed views, which can substantially increase management overhead. Best practices: Create scripts or other custom solutions to automate the provisioning of facts tables, indexed views, and corresponding cube partitions. Directly map the facts tables in the relational data warehouse to cube partitions in Analysis Services. Also, use the ASCMD command-line tool and Profiler traces to test applicable queries with original aggregations, new UBO aggregations, handmade aggregations, and no aggregations to better understand the query characteristics of the cube.

- Can leverage cube-based aggregations even if the granularities do not exactly match: MOLAP and ROLAP can use aggregations even when the query granularity doesn't exactly match the aggregation granularity. However, ROLAP can only use cube-based aggregations on the leaf level of a dimension when querying higher levels. Best practices: Create cube-based aggregations at the leaf level to help known cross-join patterns. Note that Analysis Services can only leverage aggregations for different granularities if the aggregations are part of the ROLAP cube design, such as aggregations created by using UBO. Transparent aggregations do not fall into this category.

- Might be affected by data problems and referential integrity issues: MOLAP can catch referential integrity errors and other data problems during cube processing, but these issues can affect ROLAP during query execution. ROLAP cubes do not support Snapshot isolation with multiple active result sets on a SQL Server connection, so DW architects must use the ReadCommitted isolation level, which allows nonrepeatable reads (reading the same row twice delivers different results) and phantoms (re-executing the same query delivers a different set of rows). Best practices: Focus on data cleanup and data quality during the load process. Confine data changes to the most current facts table, such as the most recent daily table, in order to avoid any impact of referential integrity issues on past data.

- Relies on query optimizer's ability to make the best decisions for the query execution plan: ROLAP performance depends on the capabilities of the query optimizer to formulate an efficient query execution plan. An inefficient query execution plan affects ROLAP performance, while the effect remains virtually non-existent for MOLAP. Best practices: Add foreign key and other constraints to the facts tables. Keep the statistics up to date by running regular UPDATE STATISTICS jobs after updates, or turn on automatic statistics updates.
Conclusion
ROLAP can be a viable option for simple cubes with large data volumes but low dimensional
complexity if DW architects and database administrators have the knowledge to fine-tune the
data warehouse manually. It is important to realize that MOLAP optimizes many design aspects
automatically, while this must be done by hand in ROLAP mode. However, given sufficient CPU
power, memory capacity, a storage subsystem optimized for high I/O throughput, and useful
relational aggregations, ROLAP can reach very high performance levels thanks to the amazing
performance of the SQL Server 2008 relational engine, exceeding MOLAP performance in some
cases.
In comparison to the gold standard of MOLAP, ROLAP clearly places a higher demand on
hardware resources, impacts the size of relational databases, increases design complexity and
management overhead, comes with higher processing overhead, and might be affected by data
problems and referential integrity issues. For instance, to benefit from aggregations, a ROLAP
cube can't use query binding, which limits cube design flexibility and prevents the data
warehouse from using table partitioning. For partitioning purposes, the data warehouse must
rely on the old-style method of individual facts tables organized by attributes related to cube
partitioning, which noticeably increases management overhead. Furthermore, if ROLAP can't
determine which partitions contain relevant data, perhaps because the query did not specify any
of the slice attributes or because of complex attribute relationships that ROLAP can't analyze,
ROLAP includes all available cube partitions in the query execution, potentially flooding the
relational engine and causing a substantial backlog of queries as well as excessive thread
thrashing and I/O congestion. Limiting parallel query plan generation and implementing a SQL
query fan-out design can help to mitigate this issue, but the required resources to implement a
fan-out are substantial. In comparison, MOLAP can determine the relevant cube partitions even
with parent-child and many-to-many relationships in the dimensions and can execute queries
much more efficiently and with much less CPU power. MOLAP offers many performance
advantages out of the box, such as automatic two-level bitmap indexing on every attribute and
lower I/O overhead due to data compression.
On the other hand, ROLAP undeniably does have the advantage that it doesn't need to copy the
source data into a multidimensional database. With data volumes beginning to exceed the
maximum MOLAP capabilities, this advantage will trump all of the ROLAP disadvantages,
especially considering the powerful features of the SQL Server relational engine to manage the
data volumes. Smart aggregation designs that combine the advantages of transparent and
cube-based aggregations can mitigate some of the performance disadvantages of ROLAP. It is
also possible to fan out SQL queries across multiple data warehouse servers. The data
warehouse scale-out onto multiple servers would also benefit cube partition processing. And
finally, XMLA scripts and custom solutions can help to lower the administrative burden of
maintaining a large number of facts tables and cube partitions.
The keys to a successful ROLAP implementation are a fast underlying storage subsystem, a
streamlined relational data warehouse, and thorough testing, tweaking, and usage-based
optimization. It is imperative to study Profiler traces and analyze the SQL queries generated by
Analysis Services. If these SQL queries reference views or subqueries in the FROM clause, the
ROLAP cube will not be able to leverage relational aggregations. Query optimizer will not use
transparent aggregations created manually in the data warehouse and the creation of cube-
based aggregations will fail during partition processing if the cube uses query binding for any
reason. It is important to review the design of DSVs, cube partitions, and measure groups to
ensure that Analysis Services uses table binding.
Although SQLCAT deemed its first lab-based customer ROLAP investigation a great success, a
few questions still remain to be answered. For example, SQLCAT did not examine the ROLAP
behavior for multiple concurrent users and did not study how ROLAP would benefit from a warm
Analysis Services cache. In theory, the ROLAP storage mode exhibits the same caching as
MOLAP does and should therefore show similar performance improvements, but this still needs
to be verified in a realistic test lab. Furthermore, SQLCAT still needs to clarify the impact of
attribute cardinality, distinct count measures, and complex attribute relationships on ROLAP
performance. It remains to be seen how ROLAP performs over very wide dimensions and
attributes with one million or more members. For all of these reasons, customers are
encouraged to use ROLAP with caution, apply the optimization techniques discussed in this
case study, and perform additional acceptance tests with the full number of current users in their
environments.
Did this paper help you? Please give us your feedback. Tell us on a scale of 1 (poor) to 5
(excellent) how you would rate this paper and why you've given it this rating. For example:
- Are you rating it high because it has good examples, excellent screen shots, clear writing, or another reason?
- Are you rating it low because the examples are poor, the screen shots fuzzy, or the writing unclear?
This feedback will help us improve the quality of the case studies we release.
Send feedback.